Characterizing Migration Dynamics with Convolution-Based Movement Models
Migratory animals often exhibit changes in their behavior during their migration. Telemetry data provide a way to observe geographic animal positions over time, but not necessarily changes in the dynamics in the movement process. Continuous-time continuous-space animal movement models provide a statistical framework for learning about individual-based animal movement dynamics. Continuous-time models also allow for statistical predictions of the trajectory during periods when the telemetry device did not record the animal’s position and in the presence of location error. Predicted trajectories can be inferred using continuous-time models, but models capable of mimicking realistic trajectories with sufficient detail are computationally challenging to fit to large data sets. Furthermore, basic continuous-time model specifications (e.g., Brownian motion) lack realism in their ability to capture nonstationary dynamics. We present a unified class of animal movement models that are computationally efficient and provide a suite of approaches for accommodating nonstationarity in continuous animal movement trajectories. Our approach uses so-called process convolutions to allow for flexibility in the movement process while facilitating implementation and incorporating location uncertainty. We review convolution-based approaches to account for nonstationarity in continuous-time animal movement models including temporal deformation and nested process convolutions. We demonstrate these approaches in two case studies involving migratory birds. Specifically, we use temporal deformation to account for heterogeneity in individual greater white-fronted goose migrations in Europe and Iceland and we use nested process convolutions to model dynamic migratory networks in sandhill cranes in North America.
Keywords: basis function, Brownian motion, continuous-time model, network model, process convolution, spatial statistics
Rapid improvement in technology has led to high-quality animal tracking (i.e., telemetry; see Appendix A for a glossary of terms) data that are accumulating at an incredible rate (Cagnacci et al., 2010; Kays et al., 2015). There are not only more data being collected in more studies, but the variety of data is also increasing. Variation exists in telemetry device, fix rates and regularity, accuracy, types of measurement error, duration, and taxa studied. Behavioral variation also exists within individual and taxa. Many approaches have been developed to characterize the variation within individual animal trajectories (Hooten and Johnson, 2017b; Hooten et al., 2017). These approaches include the use of spatial and temporal covariates and clustering methods to understand the portions of animal trajectories that indicate distinctly different patterns. For example, potential function specifications in stochastic differential equations (SDEs; Brillinger, 2010) have facilitated the explicit inclusion of covariates in continuous-time models. By contrast, most phenomenological clustering approaches have appeared in discrete-time animal movement models (e.g., Morales et al., 2014; Langrock et al., 2012; McClintock et al., 2012).
The use of SDEs to infer relationships between animal movement and habitat based on telemetry data is increasing (e.g., Parton and Blackwell, 2017; Russell et al., 2017), but the associated computational challenges are also increasing as a function of data set size as well as data and model complexity (Scharf et al., 2017). Thus, several approaches have been developed to utilize continuous-time animal trajectories based on telemetry data in a second-stage modeling framework to infer relationships between animal movement and habitat (Hooten et al., 2010; Hanks et al., 2015; McClintock et al. 2017; Scharf et al., 2017). Some two-stage approaches enable inference on the effects of covariates on movement while accounting for uncertainty when predicting the location of an animal in continuous-time (Hooten et al., 2017). Thus, accurate continuous-time models are essential to represent the trajectory distribution in the first stage.
To serve as accurate representations of the marginal trajectory distribution, movement models must be continuous, but also allow for variation in the movement dynamics throughout the trajectory. Thus, the movement models need to capture nonstationarity, a term used in time series and spatial statistics, to account for changes in the dynamics of the animal as it moves. Nonstationarity could be caused by behavioral responses to the environment, diurnal cycles, or interactions with other individuals of the same or different species.
In what follows, we demonstrate a unified framework to account for nonstationarity in animal trajectories using statistically principled methods that are flexible and fast to implement. We review convolution-based approaches for modeling continuous-time continuous-space trajectories and show how to tailor them to incorporate nonstationarity and additional data sources. Convolutions have become popular in spatial statistics because they result in models that are easy to specify and fit to data (e.g., Barry and Ver Hoef, 1996).
We illustrate convolution-based approaches to characterize nonstationary animal trajectories through two examples involving migratory birds. First, we apply individual-based process convolution models to account for heterogeneity in the migration trajectories of Greenland white-fronted geese (Anser albifrons flavirostris; a subspecies of greater white-fronted goose; Dalgety and Scott, 1948) from Ireland to staging grounds in Iceland. For this example, we developed a temporal deformation for migratory animals that provides inference about the timing and duration of migration-induced nonstationarity in the movement dynamics. In the second example, we demonstrate a nested process convolution approach that utilizes telemetry data from multiple sandhill crane (Antigone canadensis) individuals migrating in North America. Our convolution-based approach provides substantial reductions in uncertainty for trajectory estimates by pooling across individuals using a dynamic migratory network specification.
In the method section that follows, we begin by explaining the convolution approach for modeling continuous-time animal trajectories. These process convolutions can be nested within each other and we explain how certain types of nesting can be used to accommodate additional structure and sources of data in movement models, including social structure among individuals. We show how convolution-based approaches can be tailored to account for nonstationarity in trajectories by incorporating knowledge of the movement process. We also review techniques that facilitate efficient computational implementation of the convolution models.
2.1 Convolution Models
Models for animal trajectories often rely on Brownian motion (i.e., Wiener process) concepts. A Wiener process is accumulated, scaled white noise (e.g., the cumulative infinite sum of Gaussian random vectors). Whereas white noise is not correlated over time, the temporal sum of white noise is because the next value in time always depends on the cumulative past values. Hence, Wiener processes served as a basis for simple continuous-time animal movement models. To make animal movement models more realistic, a generalized form of Brownian motion was used to include features like central place attractors (i.e., Ornstein-Uhlenbeck models) and external forcings (e.g., potential functions), typically in a SDE framework. The SDE framework is attractive because it allows the user to model the system with a dynamic forward process that aligns with their hypotheses of animal movement mechanisms. However, there can be advantages to modeling movement processes jointly rather than conditionally. Joint process specifications often involve explicit models for the dependence structure inherent to the process through covariance parameterizations. For example, in modern spatial statistics, second-order covariances are commonly parameterized using first-order representations of the dependence structure in the process in the form of convolutions (Sampson, 2010; Hefley et al., 2017).
A basic convolution representation for an animal trajectory is
where represents the true unobserved animal position at time (, where and bound the temporal period of interest), is a diagonal matrix with as the diagonal elements, and is a Wiener process based on cumulative white noise with variance . The function is a one-dimensional temporal kernel anchored at time (e.g., a Gaussian probability density function with mean ). The shape of the one-dimensional kernel can assume various forms and may be specific to the ecology of the species (e.g., Fig. 2 in Hooten and Johnson, 2017a). The expression in (Eqn. 1) is referred to as a process convolution because it defines the position of an individual at time as a convolution (i.e., an integral of a product) of a kernel function with a stochastic process (Higdon, 2002).
Hooten and Johnson (2017a) showed that (Eqn. 1) can be written as a convolution with white noise
where the matrix has diagonal elements equal to . Figure 1 depicts how convolutions work in one dimension.
The middle row of Figure 1 represents two possible basis functions at a particular time point . As time progresses, shifts to the right. To obtain the movement process at time (i.e., the location of the individual), we multiply by white noise and integrate over the time domain. We approximate the convolution in a computer algorithm as a sum of the product, i.e., . Convolutions of white noise are well-studied and have useful properties (Barry and Ver Hoef, 1996; Higdon, 2002). For example, convolutions can be used to construct complex covariance functions for dependent processes that are not easy to specify otherwise.
2.2 Convolution-Induced Covariance
The stochastic process in (Eqn. 2) provides a generalizable framework that can be used to account for a wide range of animal movement processes. In addition to its flexibility, convolutions such as (Eqn. 2) provide an intuitive way to specify dependence for continuous processes based on covariance (Hefley et al., 2017). For the convolution model in (Eqn. 2), we can write the covariance between time points and as .
In spatial statistics, it is common to express dependence in terms of covariance, at least in part, because it can yield computational advantages for fitting models to data (Hefley et al., 2017). For our convolution model, we can express a joint Gaussian process model for the observed telemetry data (where each is a vector representing the observed location) as
where, the matrix has element (also called a “basis function;” e.g., Hefley et al. 2017). The two-dimensional telemetry data set is stacked in a vector , and the observational covariance matrix is . The multivariate Gaussian form of the jointly specified movement model is attractive because fast numerical methods can be used to fit the model to high-resolution telemetry data sets.
2.3 Temporal Deformation
The statistical model in (Eqn. 3) is flexible because the shape and scale of the kernels can vary with time to accommodate realistic dynamics in animal movement. For example, in a Gaussian kernel, the temporal range parameter () may vary (i.e., ) to allow for heterogeneity in the smoothness of the individual’s track over time (Higdon, 2002). Larger values for imply animal behavior that results in smoother trajectories, such as migration periods. An alternative to letting the range parameter vary in the kernel function, is to deform (i.e., compress or expand) the temporal domain itself (Sampson and Guttorp, 1992). By compressing time in certain regions and expanding it in others, we can account for the same type of heterogeneous dynamics as the varying parameter approach previously described. By conditioning on a temporal deformation, we are able to use the same software we would use to fit the temporally homogeneous convolution model.
Temporal deformation can be induced using a warping function in place of time (e.g., ). Warping functions have traditionally been expressed as smooth stochastic functions in the time domain, such as Gaussian processes (e.g., Hooten and Johnson, 2017a). The derivative of the warping function (i.e., ) indicates the portions of the time domain that are compressed () and expanded (). Temporal compression leads to rough trajectories and temporal expansion leads to smooth trajectories. Gaussian process warping functions are quite general, but not mechanistically linked to known natural history or animal behavior. Furthermore, a critical characteristic of deformation approaches to account for nonstationarity is that the warping function does not fold (i.e., the resulting warped time field retains the same order as the original time domain). Previous implementations of deformation approaches have imposed a non-folding constraint by tuning the Gaussian process associated with the warping function so that it does not result in temporal expansions that induce folding. Such constraints may be computationally demanding to implement using conventional deformation approaches.
We propose a warping function for implementing temporal deformation that acknowledges the natural history of migratory animals and is guaranteed not to fold. We refer to this warping function as a “temporally deforming cumulative function” (TDCF) and define it as
where , , is any probability density function that integrates to one over the time domain, and and represent the beginning and end of the time domain (or first and last times at which data were collected). It can be shown that the derivative of the TDCF in (Eqn. 4) is
which is a linear function of .
The general form of TDCF in (Eqn. 4) will not fold the time domain and will retain the original temporal extent of the data (Figure 2). The latter characteristic can be helpful for specifying a prior distribution for the range parameter () in Bayesian implementations of the convolution model in (Eqn. 3). We return to specific forms of deformation and warping kernel functions in an example involving Greenland white-fronted geese.
2.4 Dynamic Movement Networks
Temporal heterogeneity in animal movement dynamics may also arise as a result of intraspecific interactions. While adding complexity to the statistical model, accounting for dependence among individuals as populations redistribute over space can be beneficial for inference (as we demonstrate in the sandhill crane example that follows). Many studies collect telemetry data for multiple individuals of a population simultaneously, thus we can make use of those data to improve our understanding of the trajectories of each individual by demonstrating an extension of the convolution approach. Scharf et al. (In Review) proposed a nested structure for multiple convolutions that we use to reconcile the individual-level movement model in (Eqn. 2) with dynamic social network models for movement (e.g., Russell et al., 2016; Scharf et al., 2016; Russell et al., 2017).
Extending (Eqn. 1), a general three-level nested process convolution structure can be expressed as
where and correspond to observed individuals and the kernel functions on the diagonals of each of the convolution matrices (, , ) are specified as
The first two kernel functions ( and ) are the same as in the previous example for individual-based movement (i.e., inducing Brownian and inertial smoothing). However, the third kernel () is a function of a weighted network describing the joint dynamics of a group of moving individuals. The network weights correspond to pairwise relationships among individuals that may vary over time.
Many approaches have been proposed for modeling network weights, including exponential random graph models and latent space models (Goldenberg et al., 2010). In what follows, we describe a latent space approach (Hoff et al., 2002) to model the weights in the nested process convolution movement model (Eqns. 6–8). We define weights based on distances among individuals in a latent Euclidean space . Thus, we denote the latent position at time of individual in latent network space as and define the network weights as . Thus, as and approach each other in latent space, the network weight between them increases. For inference in static networks where the social relationship of the individuals is homogeneous over time, all and the set of (for ) arise as a point process in . In essence, acts as a random effect in the model that accounts for dependence among individuals as they move. We can improve the inference for all individuals by borrowing strength across multiple connected individuals, as well as account for heterogeneity in the movement trajectories.
In the case where we expect the social structure of the observed individuals to change over time, a variety of dynamic models are available for . We describe one such specific dynamic model for in the application pertaining to sandhill crane migrations.
3.1 Individual Movement: Greenland White-Fronted Geese
The Greenland white-fronted goose (GWFG) is the most morphologically distinct subspecies of the circumpolar greater white-fronted goose (A. albifrons; Ely et al. 2005; Figure 3). GWFG are long-distance migrants that breed in west Greenland (Malecki et al., 2000), stage during autumn and spring in south and west Iceland, and winter at over 70 regularly used sites across Great Britain and Ireland (Ruttledge and Ogilvie, 1979). Thus, their annual migration spans 5,000 km and includes crossing the Greenland Ice Sheet (a 1.7 million km expanse of ice peaking at 3,000 m in elevation; Comiso and Parkinson 2004). The global population of GWFG has declined in recent years, from approximately 36,000 individuals in 1999 to 19,000 in 2016 (Fox et al., 2016). GWFG are listed as ‘Endangered’ under IUCN Red List criteria and as a priority species in the Biodiversity Action Plan in the United Kingdom (U.K.), and managed under a Species Action Plan through the African-Eurasian Migratory Waterbird Agreement (Stroud et al., 2012). GWFG have been protected from hunting since 1982 in Ireland and Scotland, 2006 in Iceland and 2009 in Greenland; a voluntary shooting ban on the birds remains in place in Wales, where they are still legal quarry, as in England.
GWFG occupy breeding areas from May to early September and feed on tubers and exposed plant matter, mainly common cottongrass (Eriophorum angustifolium; Madsen and Fox, 1981). They lay 4-6 eggs and incubation occurs over 25-27 days (Fox and Stroud, 1988), similar to other Arctic-nesting geese (Cooke et al., 1995). A four-week complete wing moult occurs during late summer. Autumn migration begins in September and birds stage in Iceland until October (now into early November; Fox et al., 1999), when they migrate to wintering areas in Great Britain and Ireland. Food sources on staging and wintering areas are mainly agricultural (e.g., cereal crops or managed grassland; Fox and Stroud, 2002). Although spring migration from Great Britain and Ireland began in April in the 1970s and 1980s, in recent years, birds have departed for Iceland successively earlier and now do so in late March (Fox et al., 2014), with greater fat stores than in previous years (Fox and Walsh, 2012). The spring staging period in Iceland has increased in duration over the same time period because, although GWFG arrive earlier, they depart within a few days of historical departure dates in early May (Fox et al., 2014).
During late winter 2016, GWFG were caught over intensively managed grassland at Wexford Slobs, Ireland using rocket-propelled nets (Wheeler and Lewis, 1972) under permission from the British Trust for Ornithology. We analyzed data from 4 female GWFG that were fitted with 28 g Global Positioning System (GPS) tracking devices (with internal GPS aerial; Cellular Tracking Technologies; Rio Grande, New Jersey, USA) attached to neck collars (i.e., total package weight 39 g). The GPS logger measured and recorded spatial position at each fix. Tags were programmed to log eight GPS fixes per day. Data were uploaded daily to an online user interface via the Global System for Mobile Communications technology.
We used the convolution model (Eqn. 3) to analyze the GWFG telemetry data from each individual separately. We used Gaussian kernel functions () with range parameter modeled with a discrete uniform distribution to facilitate computation (see Hooten and Johnson, 2017a, for details). To account for heterogeneity in time, we applied the temporal deformation approach using the warping function in (Eqn. 5) based on a mixture , where is a truncated Gaussian PDF with location , scale parameter , and support . The mixture model for the warping function allowed us to approximate nearly any temporal deformation indicated by the data and also facilitates the computational implementation because we can use Bayesian model averaging (BMA) across mixture components (in essence, estimating the mixture probabilities as posterior model probabilities). We used the two-stage BMA approach by Barker and Link (2013), distributing each model fit across processors, and then recombining the results in a second stage algorithm to obtain the optimal model averaged trajectory. See Appendix B for further model implementation details.
The results of fitting the model to 4 GWFG individuals are presented in Figure 4, where the left panel shows the estimated trajectories for the four GWFG individuals on their migration from Ireland to Iceland during March 20 - April 15, 2017.
In Figure 4, the trajectories are shown as posterior predictive realizations (the individual lines) of the trajectories for model averaged posterior distribution. Figure 4 also indicates the utility of the warp functions to account for heterogeneity in the migration of each individual. We can also glean inference from the derivatives of the model averaged warp functions because they indicate the time and duration of the migratory period for each individual, illustrating the variability in migration among individuals. The red and blue individuals both departed on their migration early (March 26-27) with a similar migratory duration, whereas the green and purple individuals both departed late (April 2) with the green individual taking a more circuitous route that lasted nearly twice as long.
Convolution models provide a statistically principled means to interpolate telemetry data while accommodating uncertainty in the data and heterogeneous dynamics. Convolution models based on temporal deformations that acknowledge the natural history of the species (e.g., the TDCFs we proposed herein) also provide a way to quantify differences in migration characteristics among individuals and over time. For example, among the 4 GWFG migration trajectories we analyzed, there were clearly two pulses in the timing of migration initiation (Figure 4, lower right panel). However, there was no clear indication that the differences in migration onset were related to individual fitness or age. In the second group of migrating individuals (purple and green individuals in Figure 4), the migration of the green individual led to nearly double the energetic demands as the purple individual because the migration distance and duration were substantially longer. Prior to our analyses, little evidence existed that GWFG stopover on the Faroe Islands between wintering and staging areas. While the green individual began northward almost immediately upon departing wintering areas, the red individual seemed to lose its ability to orient correctly approximately half way through the trip (perhaps due to weather, influence of other individuals, or other unknown causes). After a loop north of the U.K. however, the red individual corrected its orienting and reached Iceland. Remarkably, the total migration period from wintering to staging areas for the red individual (Figure 4, lower right panel) was not longer than those individuals that flew directly to Iceland from Ireland (purple and blue).
3.2 Group Movement: Sandhill Cranes
Sandhill cranes (SACR; Antigone canadensis; Figure 5) are a long-lived bird species found in wetland-rich landscapes across North America. SACR are divided into various migratory and nonmigratory management populations across North America. The midcontinent SACR population is the largest, comprising approximately 600,000 individuals (Kruse and Dubovsky 2015). They breed from western Quebec, across the Canadian Arctic and Alaska to northeastern Russia in a variety of ecoregions from Arctic tundra to temperate grasslands. Twice each year SACR migrate through the Great Plains and winter from southern Oklahoma to northern Mexico, using playa and coastal wetlands for roosting and foraging (Krapu et al., 2011, 2014).
The SACR is a species with a unique convergence of multiple user groups that share a common interest in the continued health of the species. Midcontinent SACR are a popular sport harvest species during fall and winter in Canada and the United States (U.S.) and are the most hunted population of cranes in the world. Furthermore, SACR attract a large and committed following of wildlife viewers. For example, spring staging and courtship displays along the Platte River in central Nebraska attracts tens of thousands of people each year (Stoll et al., 2006).
Four geographically distinct groups can be identified forming the midcontinent population. Each expresses differences in breeding, migration, and wintering space use and timing; groups also differ in potential exposure to hunting (Krapu et al., 2011). For this study, all individuals were from a single group that breeds in western Alaska and Siberia (lesser SACR; a subspecies distinction) and represent the smallest individuals found in the midcontinent population. The lesser SACR group also has the greatest abundance, comprising approximately 40% of the entire MCP (Krapu et al., 2011).
We analyzed data from 5 adult SACRs that were captured by rocket-propelled nets (Wheeler and Lewis, 1972) during March and April 2011 in the North Platte River Valley near North Platte, Nebraska. Captured birds were tagged with a solar powered GPS platform terminal transmitter (GPS-PTT; Geotrak, Inc., Apex, North Carolina) attached with two-piece leg bands and released in the same location. GPS-PTTs can remotely provide locations to within approximately ten meters of the true position of the transmitter; therefore, they are the most accurate non-invasive tracking method available for use on this wide-ranging species. Transmitters were programmed to record GPS locations every 6-8 hours, which provided daytime and nighttime locations, allowing for detailed information on roosting sites, diurnal use sites, and flight paths. We monitored and archived crane locations from data provided by ARGOS (www.argos-system.org).
We used the nested process convolution model (Eqns. 6–8) described in the Dynamic Movement Networks section, which results in a Gaussian process model of the form (Eqn. 3), to analyze the SACR telemetry data for all 5 individuals simultaneously. To account for heterogeneity in time, we specified the third kernel (Eqn. 11) as a latent space network weight , as previously described. However, to account for a network that smoothly varies over time, we placed additional structure on the latent processes . In the latent space network model, is a trajectory that we assume exists on a latent 2 dimensional real plane (). Thus, because is a trajectory, we used a convolution-based movement model for the latent processes themselves in the latent space. For the SACR migration network example, we specified
where has diagonal elements . In this movement model for the latent processes, controls the overall density of the network and controls the rate of change in the dynamics of the network. Larger values of imply a less connected network over time because the pushes the away from each other in latent space, whereas, large values of imply a more slowly varying network in time. The stochastic process is assumed to be white noise with variance . We fixed the parameter to match our understanding of the variation in network structure over time (so that network connections change less than 10 times per study period on average) and we estimated to account for unknown network density. See Appendix C for further model implementation details.
The results of fitting the migratory network model to 5 SACR individuals are presented in Figure 6, where the right panel shows the estimated trajectories for the 5 SACR individuals in geographic space. The left panels of Figure 6 correspond to the marginal trajectories in latitude and longitude, respectively, and the gray symbols along the x-axis are placed at the time points of the positions in geographic space on the right panel.
The trajectories in Figure 6 provide insight about the geographic positions and timing of the SACR individuals. However, because we modeled the network, we also gain inference on the network connectivity. Figure 7 illustrates the dynamic connectedness of each SACR individual during the migration via the derived quantity referred to as “individual degree” . Individual degree is a function of the migratory network weights , thus, we can assess its uncertainty using the Markov chain Monte Carlo (MCMC) output based on the model fit (using the equivariance property of MCMC, Hobbs and Hooten, 2015).
In the migratory group of SACR individuals we analyzed, Figure 7 indicates that all individuals are connected to approximately one other individual in the migratory network during early September. However, in early October, the individuals we analyzed reached the Prairie Pothole region of North America (near the symbol in the right panel of Figure 6). During the week long stopover in the Prairie Pothole region, the red and orange individuals mostly stayed within a few kilometers of each other while the blue, green, and purple SACR individuals remained farther away. SACR fly multiple kilometers daily between nocturnal roost wetlands and various diurnal foraging sites. Thus, these daily flights imply that the red and orange individuals were likely aware of each other during this portion of the migration, but less aware of the blue, green, and purple individuals.
In addition to providing insights into the movement ecology and behavior of animals, the migratory network model we described herein can be used to reduce the uncertainty of the individual trajectories when individuals in a migratory group are inferred to be connected. For example, Figure 8 illustrates the reduction in uncertainty that is gained by modeling all SACR individuals jointly.
In Figure 8, the uncertainty in the predicted location for each individual is small when sufficient telemetry data exist, but increases as the sparsity of data increases resulting in the “bumps” in the uncertainty lines. When the individuals lacking telemetry data are well connected to other individuals with more regular data, the potential for a reduction in uncertainty is greatest. An example of uncertainty reduction occurred during the short period of time near September 15 when data existed for all SACR individuals except the orange individual (bottom plot in Figure 8). In this case, we see a reduction in the uncertainty for the orange individual because it was well connected to at least one other individual at that time according to posterior individual network degree (Figure 7). Thus, a knowledge of all individuals in a migratory group helps account for changes in movement dynamics and can reduce uncertainty in the predicted locations of individuals.
Convolution specifications for continuous-space models have been popular in spatial statistics (Higdon, 2002; Ver Hoef and Peterson, 2010), but they have only recently been applied to model animal trajectories (e.g., Hooten and Johnson, 2017a; Scharf et al., In Review). We demonstrated how convolution-based statistical models for trajectories can be useful to model the trajectories of migratory birds. To account for heterogeneity in the dynamics of animal trajectories we introduced a flexible and mechanistically linked temporal warping function that can improve inference on individual trajectories as well as provide quantitative insight about the timing and duration of migration periods.
Following the convolution nesting approach of Scharf et al. (In Review), we used three stages of convolutions to account for time-varying dynamics in individual trajectories (without relying on the temporal warping approach described previously for individual trajectories). The nesting of convolutions is particularly useful for characterizing the migratory behavior of groups of birds because they may change their social network structure during different portions of the migration (e.g., the clustering of SACR individuals we observed during stopovers). Furthermore, the migratory network movement model may improve the inference pertaining to geographic position because it leverages the potential connectivity to borrow strength from individuals with more, or higher quality, data to assist the inference for individuals with missing data. This concept could be used to design optimal duty cycling for telemetry devices for groups of moving individuals to save battery power and extend the life of the device providing more data for movement ecology studies.
Our methods rely on well-known Gaussian process specifications and we leveraged common techniques in big data settings to implement the models and provide inference. The temporal deformation approach we described has ties to spatial statistics (Sampson and Guttorp, 1992; Schmidt and O’Hagan, 2003) and provides an accessible way to fit nonstationary Gaussian process models using Bayesian model averaging. To compute posterior model probabilities, we applied the two-stage procedure developed by Barker and Link (2013) that allowed us to fit individual movement models and then post-process model output to compare individual movement models in our GWFG example.
Overall, while discrete-time animal movement models are still commonly employed, continuous-time continuous-space models for animal movement are useful when data are collected irregularly in time and continuous-time inference is desired. By extending continuous-time movement models to accommodate heterogeneous dynamics, we showed that convolution specifications provide a valuable means to characterize complex trajectories of migratory animals.
This research was funded by NSF DMS 1614392. Any use of trade, firm, or product names is for descriptive purposes only and does not imply endorsement by the U.S. Government.
Barker, R., and W. Link. (2013), Bayesian multimodel inference by RJMCMC: A Gibbs sampling approach. The American Statistician, 67: 150-156.
Barry, R.P., and J. Ver Hoef. (1996). Blackbox kriging: spatial prediction without specifying variogram models. Journal of Agricultural, Biological, and Environmental Statistics 1: 297â322.
Buderman, F.E., M.B. Hooten, J.S. Ivan, and T.M. Shenk. (2016). A functional model for characterizing long distance movement behavior. Methods in Ecology and Evolution, 7: 264-273.
Brillinger, D.R. (2010). Modeling spatial trajectories. In Gelfand, A. E., P. J. Diggle, M. Fuentes, and P. Guttorp, editors, Handbook of Spatial Statistics, chapter 26, pages 463â-475. Chapman & Hall/CRC, Boca Raton, Florida, USA.
Cagnacci, F., L. Boitani, R.A. Powell, and M.S. Boyce. (2010). Animal ecology meetings GPS-based radiotelemetry: A perfect storm of opportunities and challenges. Philosophical Transactions of the Royal Society of London B: Biological Sciences, 365: 2157-2162.
Cooke, F., D.B. Lank, and R.F. Rockwell. (1995). The Snow Geese of La Perouse Bay: Natural Selection in the Wild. Oxford University Press, Oxford, UK.
Comiso, J.C. and C.L. Parkinson. (2004). Satellite-observed changes in the Arctic. Physics Today, 57: 38-44.
Dalgety, C.T. and P. Scott. (1948). A new race of the white-fronted goose. Bulletin of the British Ornithologistsâ Club, 68: 109-121.
Ely, C.R., A.D. Fox, R.T. Alisauskas, A. Andreev, R.G. Bromley, A.G. Degtyarev, B. Ebbinge, E.N. Gurtovaya, R. Kerbes, Alexander V. Kondratyev, I. Kostin, A.V. Krechmar, K.E. Litvin, Y. Miyabayashi, J.H. Moou, R.M. Oates, D.L. Orthmeyer, Y. Sabano, S.G. Simpson, D.V. Solovieva, Michael A. Spindler, Y.V. Syroechkovsky, J.Y. Takekawa, and A. Walsh. (2005). Circumpolar variation in morphological characteristics of greater white-fronted geese Anser albifrons. Bird Study, 52: 104-119.
Fox, A.D. and D.A. Stroud. (1988). The breeding biology of the Greenland white-fronted goose (Anser albifrons flavirostris). Meddelelser Om Gronland, Bioscience, 27: 1-14.
Fox, A.D. and D.A. Stroud. (2002). Greenland white-fronted goose Anser albifrons flavirostris. Birds of the Western Palearctic Update, 4: 65-88.
Fox, A.D., J.O. Hilmarsson, O. Einarsson, H. Boyd, J.N. Kristiansen, D.A. Stroud, A.J. Walsh, S.M. Warren, C. Mitchell, I.S. Francis, T. Nygard. (1999). Phenology and distribution of Greenland white-fronted geese Anser albifrons flavirostris staging in Iceland. Wildfowl, 50: 29-43.
Fox, A.D., I. Francis, D. Norriss, and A. Walsh. (2016). Report of the 2015/16 International census of Greenland white-fronted geese. Greenland White-fronted Goose Study, Ronde, Denmark.
Fox, A.D. and A.J. Walsh. (2012). Warming winter effects, fat store accumulation and timing of spring departure of Greenland white-fronted geese Anser albifrons flavirostris from their winter quarters. Hydrobiologia, 697: 97-102.
Fox, A.D., M.D. Weegman, S. Bearhop, G.M. Hilton, L. Griffin, D.A. Stroud, and A. Walsh. (2014). Climate change and contrasting plasticity in timing of a two-step migration episode of an Arctic-nesting avian herbivore. Current Zoology, 60: 233-242.
Goldenberg, A., A.X. Zheng, S.E. Fienberg, and E.M. Airoldi. (2010). A survey of statistical network models. Foundations and Trends in Machine Learning, 2: 129-233.
Hanks, E.M., M.B. Hooten, and M. Alldredge. (2015). Continuous-time discrete-space models for animal movement. Annals of Applied Statistics, 9: 145-165.
Hefley, T.J., K.M. Broms, B.M. Brost, F.E. Buderman, S.L. Kay, H.R. Scharf, J.R. Tipton, P.J. Williams, and M.B. Hooten. (2017). The basis function approach to modeling autocorrelation in ecological data. Ecology, 98: 632-646.
Higdon, D. (2002). Space and space-time modeling using process convolutions. Pages 38â56 in C. W. Anderson, V. Barnett, P. C. Chatwin, and A. H. El-Shaarawi, editors. Quantitative methods for current environmental issues. Springer, New York, New York, USA.
Hobbs, N.T. and M.B. Hooten. (2015). Bayesian Models: A Statistical Primer for Ecologists. Princeton University Press.
Hoff, P.D., A.E. Raftery, and M.S. Handcock. (2002). Latent space approaches to social network analysis. Journal of the American Statistical Association, 97: 1090-1098.
Hooten, M.B., Johnson, D.S., Hanks, E.M., and J.H. Lowry. (2010). Agent-based inference for animal movement and selection. Journal of Agricultural, Biological and Environmental Statistics, 15: 523-538.
Hooten, M.B., F.E. Buderman, B.M. Brost, E.M. Hanks, and J.S. Ivan. (2016). Hierarchical animal movement models for population-level inference. Environmetrics, 27: 322-333.
Hooten, M.B. and D.S. Johnson. (2017a). Basis function models for animal movement. Journal of the American Statistical Association, 112: 578-589.
Hooten, M.B. and D.S. Johnson. (2017b). Modeling Animal Movement. Gelfand, A.E., M. Fuentes, and J.A. Hoeting (eds). In Handbook of Environmental and Ecological Statistics. Chapman and Hall/CRC.
Hooten, M.B., D.S. Johnson, B.T. McClintock, and J.M. Morales. (2017). Animal Movement: Statistical Models for Telemetry Data. Chapman and Hall/CRC.
Hooten, M.B. and N.T. Hobbs. (2015). A guide to Bayesian model selection for ecologists. Ecological Monographs, 85: 3-28.
Kays, R., M. Crofoot, W. Jetz, and M. Wikelski. (2015). Terrestrial animal tracking as an eye on life and planet. Science, 384(6240): aaa2478.
Krapu, G. L., D. A. Brandt, K. L. Jones, and D. H. Johnson. 2011. Geographic distribution of the mid-continent population of sandhill cranes and related management applications. Wildlife Monographs, 175: 1-38.
Krapu, G. L., D. A. Brandt, P. J. Kinzel, and A. T. Pearse. 2014. Spring migration ecology in the mid-continent population of sandhill cranes with an emphasis on the central Platte River Valley, Nebraska. Wildlife Monographs, 189: 1-44.
Kruse, K. L., and J. A. Dubovsky. 2015. Status and harvests of sandhill cranes: Mid-continent, Rocky Mountain, Lower Colorado River Valley and Eastern populations. U.S. Fish and Wildlife Service, Lakewood, Colorado.
Langrock, R., R. King, J. Matthiopoulos, L. Thomas, D. Fortin, and J. Morales. (2012). Flexible and practical modeling of animal telemetry data: Hidden Markov models and extensions. Ecology, 93: 2336-2342.
Madsen, J. and A.D. Fox. (1981). The summer diet of the Greenland White-fronted Goose. Report of the 1979 Expedition to Eqalungmiut Nunat, West Greenland (ed A.D. Fox and D.A. Stroud), pp. 108-118. Greenland White-fronted Goose Study, Aberystwyth, UK.
Malecki, R.A., A.D. Fox, and B.D.J. Batt. (2000). An aerial survey of nesting Greenland white-fronted and Canada geese in west Greenland. Wildfowl, 51: 49-58.
McClintock, B.T., R. King, L. Thomas, J. Matthiopoulos, B. McConnell, and J. Morales. (2012). A general discrete-time modeling framework for animal movement using multistate random walks. Ecological Monographs, 82: 335-349.
McClintock, B.T., D.S. Johnson, M.B. Hooten, J.M. Ver Hoef, and J.M. Morales. (2014). When to be discrete: the importance of time formulation in understanding animal movement. Movement Ecology, 2: 21.
Morales, J.M., D. Haydon, J. Friar, K. Holsinger, and J. Fryxell. (2004). Extracting more out of relocation data: Building movement models as mixtures of random walks. Ecology, 85: 2436-2445.
Parton, A. and P.G. Blackwell. (2017). Bayesian inference for multistate ‘step and turn’ animal movement in continuous time. Journal of Agricultural, Biological, and Environmental Statistics, In Press.
Pearse, A.T., G.L. Krapu, D.A. Brandt, and G.A. Sargeant. 2015. Timing of spring surveys for midcontinent sandhill cranes. Wildife Society Bulletin, 39: 87-93.
Postlethwaite, C.M., P. Brown, and T.E. Dennis. 2013. A new multi-scale measure for analysing animal movement data. Journal of Theoretical Biology, 317: 175-185.
Russell, J.C., E.M. Hanks, and D. Hughes. (2017). Modeling collective animal movement through interactions in behavioral states. Journal of Agricultural, Biological, and Environmental Statistics, In Press.
Russell, J.C., E.M. Hanks, and M. Haran. Dynamic models of animal movement with spatial point process interactions. Journal of Agricultural, Biological, and Environmental Statistics, 21: 22-40.
Ruttledge, R.F. and M.A. Ogilvie. (1979). The past and current status of the Greenland white-fronted goose in Ireland and Britain. Irish Birds, 1: 293-363.
Sampson, P. (2010). Constructions for nonstationary spatial processes. Gelfand, A.E., P.J. Diggle, M. Fuentes, and P. Guttorp (eds). In Handbook of Spatial Statistics. Chapman and Hall/CRC.
Sampson, P. and P. Guttorp. (1992). Nonparametric estimation of nonstationary spatial covariance structure. Journal of the American Statistical Association, 87: 108-119.
Scharf, H.R., M.B. Hooten, B.K. Fosdick, D.S. Johnson, J.M. London, and J.W. Durban. (2016). Dynamic social networks based on movement. Annals of Applied Statistics, 10: 2182-2202.
Scharf, H.R., M.B. Hooten, D.S. Johnson, and J.W. Durban. (2017). Imputation approaches for animal movement modeling. Journal of Agricultural, Biological, and Environmental Statistics, In Press.
Scharf, H.R., M.B. Hooten, D.S. Johnson, and J.W. Durban. (In Review). Process convolution approaches for modeling interacting trajectories. arXiv:1703.02112.
Schmidt, A. and A. O’Hagan. (2003). Bayesian inference for nonstationary spatial covariance structure via spatial deformations. Journal of the Royal Statistical Society, Series B, 65: 743-758.
Stoll, J. R., R. B. Ditton, and T. L. Eubanks. 2006. Platte River birding and the spring migration: Humans, values, and unique ecological resources. Human Dimensions of Wildlife 11:241-254.
Stroud, D.A., A.D. Fox, C. Urquhart and I.S. Francis. (compilers) (2012). International Single Species Action Plan for the Conservation of the Greenland White-fronted Goose Anser albifrons flavirostris, 2012-2022. AEWA Technical Series No. 45, Bonn, Germany.
Ver Hoef, J.M., and E. Peterson. (2010). A moving average approach for spatial statistical models of stream networks. Journal of the American Statistical Association, 105: 6-18.
Wheeler, R.H., and J.C. Lewis. 1972. Trapping techniques for sandhill crane studies in the Platte River Valley. U.S. Department of Interior, U.S. Fish and Wildlife Service Publication 107. Washington D.C., USA.
Appendix A: Glossary
|Convolution||An integral of the product of two functions with respect to the same variable|
|Gaussian Process||A multivariate Normal random variable with dependence expressed using covariance|
|Kernel function||A positive function providing weights for averaging another process (e.g., a normal distribution)|
|Measurement error||Location uncertainty associated with telemetry data that are not equal to the true positions|
|Movement trajectory||The track associated with an individual animal’s movement (continuous set of positions over time)|
|Nested process convolution||Sequentially convolving a function with another convolution from the same or different stochastic process|
|Nonstationary||Heterogeneity of the statistical properties associated with a process change over time (or space)|
|Predicted trajectory||A statistical estimate of the true movement trajectory|
|Process convolution||A covolution of a kernel function and a stochastic process, usually serving as a model for a latent process|
|Stochastic process||An infinite number of random variables in a domain (e.g., time), possibly correlated|
|Telemetry data||Recorded locations of animal positions over time|
|Temporal deformation||A function that warps (expands or compresses) time|
Appendix B: GWFG Model Implementation
Project Data: We converted the telemetry data () to an equidistant projection centered on the geographical mean of the data. Center and scale projected telemetry data by subtracting mean and dividing by pooled geographic standard deviation. Scale time domain so that and ; this scaling makes it easy to specify reasonable priors for parameters controlling the range of .
Specify Convolution Type: We calculated the kernel functions at a large, but finite, number of time points to closely approximate the trajectory process. In the GWFG example, we used and a Gaussian probability density function for with scale parameter .
Specify Likelihood: We specified the integrated model for the data as in Eqn. 3 based on the matrices resulting from the choice of :
where corresponds to the measurement error variance and with representing the process variance. We set .
Specify Priors: We used the following priors for , , and , where are equally spaced real numbers between and . The prior for the measurement error variance was informative to represent the well-known GPS accuracy to within 10 meters of the true position (converted to our rescaled spatial domain for analysis). The discrete uniform prior for allowed us to pre-calculate the matrix and recall as necessary in the MCMC algorithm.
Implement Temporal Deformation: To account for heterogeneity in the movement dynamics, we approximated the mixture truncated Gaussian warping kernel using separate model fits based on equally spaced truncated Gaussian kernels throughout the time domain. We also explored the space of the kernel scale parameter as well as the warp magnitude parameter using a grid search over 100 permutations of values for these parameters (ranging from – for and – for ). Based on a preliminary model fit without temporal deformation, we scored the set of warping functions using the deviance score (Hooten and Hobbs, 2015) and fit the full model to the resulting top 20 using an MCMC algorithm. We used second-stage Bayesian model averaging (Barker and Link, 2013) to find the posterior model probabilities.
Prediction: We sampled realizations from the model averaged posterior predictive trajectory distribution to create Figure 4.
Appendix C: SACR Model Implementation
Project Data: We converted the telemetry data to an equidistant projection centered on -110 longitude, 51 latitude. We scaled the time domain so that and ; this scaling makes it easy to specify reasonable priors for parameters controlling the temporal range of the kernel functions and .
Specify Nested Convolution Types: We evaluated the process convolution over a dense, but finite, grid of equally-spaced time points to closely approximate the trajectory process. This grid corresponds to approximately 5 times points per day. In the SACR example, we used a Gaussian probability density function for for with scale parameter . We used the kernel given in (Eqn. 11), conditioned on the unobserved migratory network , to induce dependence across individuals.
Specify Latent Migratory Network Model: We modeled the latent paths, , that define the migratory network, , using another process convolution evaluated over a grid of time points, corresponding to approximately 2 time points per week. The sparsity of the time grid for the latent migratory network process relative to the observed movement process is appropriate because the latent process evolves much more slowly in time and does not required as dense a grid to adequately approximate the necessary integral. We used a Gaussian probability density function for with fixed scale parameter .
Specify Likelihood: We specified the integrated model for the data as in (Eqn. 3) based on the matrices , , and resulting from the choice of , , and . The vector contains the initial locations for all individuals. The resulting likelihood is:
where corresponds to the measurement error variance and with representing the process variance.
Specify Priors: We used the following priors for , , . Additionally, we used the following priors for , , and .
Prediction: We sampled realizations from the posterior predictive trajectory distribution to create Figure 5.