# Discrete and Continuous, Probabilistic Anticipation for Autonomous Robots in Urban Environments

###### Abstract

This paper develops a probabilistic anticipation algorithm for dynamic objects observed by an autonomous robot in an urban environment. Predictive Gaussian mixture models are used due to their ability to probabilistically capture continuous and discrete obstacle decisions and behaviors; the predictive system uses the probabilistic output (state estimate and covariance) of a tracking system, and map of the environment to compute the probability distribution over future obstacle states for a specified anticipation horizon. A Gaussian splitting method is proposed based on the sigma-point transform and the nonlinear dynamics function, which enables increased accuracy as the number of mixands grows. An approach to caching elements of this optimal splitting method is proposed, in order to enable real-time implementation. Simulation results and evaluations on data from the research community demonstrate that the proposed algorithm can accurately anticipate the probability distributions over future states of nonlinear systems.

## I Introduction

Autonomous urban driving is an important and maturing field in mobile robotics. Intelligent vehicles promise to improve both road safety, vehicle efficiency, and convenience[13, 4, 25]. The finals of the 2007 DARPA Urban Challenge (DUC) was an empirical evaluation of the state-of-the-art at the time, integrating 11 autonomous vehicles together with other robots and human drivers in an urban environment for long-duration operations ( 4 hours)[34, 45, 3]. Continued development in the field has led to autonomous cars beginning to drive in real urban environments alongside civilian traffic [33, 31, 30, 32].

Many autonomous cars use primarily reactionary planners that rely on rapid re-planning in order to respond to the dynamic environments in which they operate [24]. A collision between the MIT and Cornell entries was one of several examples in the 2007 DUC that raised safety concerns about reactionary planning for autonomous driving [12]. One approach proposed to more intelligently handle autonomous driving in dynamic environments is to incorporate ‘anticipation’ into path planning, or predicting the future motion of dynamic obstacles for use in planning. This area has been an active topic in mobile robotics in recent years [18, 51]. This paper presents a formal method for probabilistically anticipating the future behavior of tracked obstacles.

The problem of anticipation is inherently probabilistic, as it is impossible to know the true future behavior of dynamic obstacles that make their own independent decisions. In addition, the behavioral models used to anticipate obstacle behavior are often highly non-linear. In the literature, several algorithms have been proposed to simplify the problem, such as assuming no uncertainty in future obstacle behavior [37, 6, 36]. These algorithms are well suited for cooperative situations, where the obstacles can communicate their intentions to the robot, or for short anticipation horizons. However, they do not provide sufficient information for reasoning about an obstacle with unknown intentions over a significant anticipation horizon. Similarly, several proposed methods consider only a subset of obstacle uncertainty, such as along-track error [35]. These approaches reduce the complexity of the problem to a manageable level, while still considering the probabilistic aspects of obstacle anticipation, but are typically very simple and narrow in their application.

Another class of algorithms applies standard estimation filters (Kalman Filter, Extended Kalman Filter, etc.) to the problem of anticipation [15, 8]. Such approaches assume a model for the behavior of the obstacle, and provide mathematically rigorous, probabilistic estimates of that obstacle’s state over the anticipation horizon. These approaches are well-suited to obstacles that are accurately described by linear models because they maintain a single Gaussian to represent the uncertainty. For obstacles with more complex behaviors, such as those based on non-linear dynamics (e.g. cars, bicycles, etc.) and those that make discrete decisions (e.g. intersections, passing, etc.), the uncertainty of the anticipated obstacle state becomes inaccurate very quickly, thus severely limiting the prediction horizon. Du Toit and Burdick [8] has been extended by the authors to use Gaussian Mixture Models (GMMs) to capture multiple obstacle hypothesis, but is still focused primarily on linear obstacle models [9].

More complex uncertainties can be addressed, while avoiding the linearization problems of standard filters; for example, Monte-Carlo (MC) methods[11]. These approaches are attractive because they can consider complex, non-Gaussian uncertainties and allow for the use of non-linear obstacle models to capture complex obstacle behavior. However, the accuracy of prediction scales with the number of particles, and there are no guarantees that the particle set effectively covers the true distribution of possible future obstacle states. Because the assumed dynamics model for the obstacle has to be evaluated at every particle used in anticipation, increasing confidence in the estimate is strongly traded with computational resources.

A GMM based predictor is proposed in the paper to anticipate obstacle behaviors [2]. GMMs provide a well-suited representation to probabilistically anticipate non-Gaussian obstacle states. Here, the GMM is used to uniquely include discrete state elements that capture complex, high-level obstacle behaviors. Accurate anticipation of a wide variety of dynamic obstacles is ensured using a novel method for detecting linearization errors using sigma-point methods, and adjusting the mixture accordingly by optimally splitting inaccurately propagated mixands. This approach reduces the individual covariances of inaccurately propagated mixands, bringing them into a nearly linear regime. The presented algorithm provides accurate, probabilistic future obstacle state estimates and is shown to perform well even with highly non-linear obstacle motion models.

This paper is outlined as follows: Section II-A defines the representation of the obstacle state and Section II-B defines the mixture model. Sections II-C describes the discrete and continuous anticipation of the obstacle state. Section II-D provides an overview of the proposed algorithm. The details of temporal propagation are described in Section III, including the non-linearity detection and mixand splitting. Section IV-A provides an example implementation of the anticipation algorithm in simulation, and Section IV-B demonstrates the efficacy of the algorithm on a real data set. Section IV-C demonstrates the potential safety improvements by applying the proposed algorithm to the 2007 MIT-Cornell autonomous collision.

## Ii Hybrid Mixture Anticipation Algorithm

The hybrid mixture anticipation algorithm described in this paper is designed to predict the probability distribution over the state of a tracked obstacle forward in time. Hybrid, here, is used to denote jointly discrete and continuous components. The intended application of the presented algorithm is to make accurate, probabilistic information about future obstacle behaviors available for use in path planning. In order to provide the most general algorithm, obstacle models can include non-linear behaviors, as well as discrete variables to capture higher-level decisions. To meet these requirements, the distribution over the obstacle state is described using a hybrid Gaussian/discrete mixture model (hGMM).

### Ii-a Obstacle State Model

The obstacle state at time () is assumed to have continuous elements (, representing position, velocity, etc.) and discrete elements (, representing behavioral modes). The state vector is partitioned accordingly:

(1) |

A model for the evolution of the obstacle state is assumed to be available, and partitioned into discrete () and continuous () components:

(2) |

where is the process noise vector at time . and are both functions that take as inputs point values for the vehicle state () and process noise (), as this is typically how models are defined. The following sections generalize these to and , which operate over distributions over, rather than samples from, the vehicle state and process noise.

### Ii-B Hybrid Mixture Probability Distribution Representation

The probability distribution is approximated using an hGMM. The hGMM extends the GMM presented in Alspach and Sorenson [2] by including discrete variables. Discrete variables allow the hGMM to capture both continuous behavior of a system (position, velocity, etc.) as well as high-level, abstract behaviors (turning left, going straight, etc.). The hGMM inherits the capability of GMMs to represent many general probability distributions, with increasing accuracy in the limit of a large number of mixands, while still maintaining the convenient computational properties of Gaussian distributions. The hGMM is defined:

where is the number of mixands in the hGMM at time , and are the weights on each mixand at time , such that

(3) |

The mixand is defined as a Gaussian distribution over the continuous state elements and a hypothesis (using a delta function) over the discrete state elements:

(4) |

where is the mixand hypothesis of the discrete state, and and are the mean and covariance of the Gaussian distribution over the continuous state.

### Ii-C Hybrid Mixture Propagation

The hGMM formulation enables each mixand to be propagated forward in time independently using the dynamics model (Equation II-A), thereby reducing the complexity of the problem. The propagation of the mixands is complicated in two ways, however. First, more than one discrete state can be transitioned into (for example, a tracked obstacle approaching an intersection may turn left, turn right, or continue straight). Second, the variance on the continuous state elements may grow to the point where the mixand can no longer be accurately propagated through the dynamics model. Each of these are addressed in the proposed probabilistic anticipation algorithm.

Consider the discrete mixand propagation through the dynamics function from Equation II-A, is:

(5) |

where is the discrete state for the mixand at the next time step (). In cases where a mixand has multiple possible next discrete states (such as a choice of roads at an intersection), the mixand is split so that one copy of the mixand can transition to each possible discrete state. Although the continuous aspects of the mixand () are not affected by the discrete propagation, the time index is used to account for growth in the mixture size due to mixands transitioning to multiple discrete states.

Similarly, the continuous mixand propagation function , is defined as:

(6) |

where the right-hand side characterizes the propagated hGMM after one full step.

### Ii-D Algorithm Overview

The propagation steps of the hGMM are summarized in Figure 1. First, each mixand in the hGMM at time is propagated through the discrete dynamics. This step has the potential to increase the number of mixands in the hGMM due to the possibility of multiple discrete decisions being available to mixands. Next, each mixand in the hGMM after discrete propagation is propagated through the continuous dynamics. This step includes a linearity criteria (Section III-B) that ensures the accuracy of propagation by splitting mixands that propagate inaccurately (Section III-C). After the continuous propagation is completed, the probability distribution over the state at time can be written as:

(7) |

Note that the number of mixands () in the hGMM after the continuous propagation can be larger than the number of mixands after the discrete propagation () due to the possibility that some mixands were split to ensure accurate propagation through nonlinear dynamics.

## Iii Mixand Continuous Propagation

The first step in the mixture continuous propagation is to predict the continuous state distribution in each of the mixands forward one time step. The continuous dynamics function (Equation 6) uses the sigma-points (SP) transform (sometimes called the unscented transform) [27, 44, 26, 48] to propagate the mixand through the continuous dynamics function (Equation II-A). Sigma-points are well-explored for use in estimation problems involving nonlinear dynamics and measurement models[27, 26]. The Gaussian distributions and are approximated by deterministically choosing sets of points and , called sigma points:

(8) |

where is the dimension of and is the dimension of ; the process noise distribution is assumed to be time-invariant and Gaussian without loss of generality – time varying GMM process noise distributions can be used with minor modifications. To find the individual sigma points in Equation III, the matrix square-roots of the covariances and are first solved:

such that

(9) |

The individual sigma-points are then defined using these matrix square-roots and a parameter :

(10) |

where .

Each pair of points () is then individually propagated through , and the resulting set is used to evaluate the impact of nonlinearities (Section III-B) on the accuracy of prediction, and to find the predicted distribution at time , (Section III-A).

### Iii-a No Splitting Case

If the mixand passes the linearity criteria described in Section III-B, the propagated sigma points are used to find the predicted mean and covariance ( and ) for the mixand:

(11) |

Once the mean and covariance are computed, the propagation of the mixand is complete. If the mixand fails the linearity criteria, it is split into several mixands with reduced covariance, described in Section III-C.

### Iii-B Linearity Criteria

The strength of Gaussian mixture models is that they can accurately approximate non-Gaussian pdfs that arise either through non-Gaussian process or measurement noise or through non-linearities in the system model[2]. Traditional Gaussian mixture model propagation algorithms [47, 41] use standard non-linear filtering techniques, such as Extended Kalman Filtering or Unscented Kalman Filtering, to propagate the individual mixands without considering the impact of local non-linearities in the system model on the accuracy of propagation. Motivated by a careful examination of the sigma-points, this paper develops a natural measure for how accurately a given mixand is propagated through the system model, along with a splitting method to reduce covariances, which, in turn, improves propagation accuracy.

The SP transform propagates Gaussian mixands through a temporal model using sigma-points; intuitively, this process performs a statistical linearization of the model in the neighborhood of the Gaussian mixand. However, the SP transform does not provide a measure of error introduced by the statistical linearization. In order to evaluate the error introduced by the SP propagation, the linearization residuals are proposed here as a metric. The process is as follows. First, a linear model is defined:

(12) |

where the notation , describes the subset of sigma-points related to the state uncertainty, and not to process noise:

(13) |

For a linear system, Equation III-B can be solved exactly for and . For a non-linear system, the residuals can be calculate by casting the linearization as a least-squares problem.

(14) |

The residual linearization error (), defined as:

(15) |

The residual error is a direct metric of how well the optimal linear model explains the propagation of sigma points through the nonlinear dynamics. If the underlying model is linear, the residual error is zero. For nonlinear systems, the residual linearization error indicates how locally linear or non-linear the underlying model is in the neighborhood of the sigma-points.

An L-Q factorization can be used to compute :

where

such that is lower triangular, and therefore cheaply invertible, and is orthonormal:

(16) |

(17) |

Because of the structure of , the arguments only effect the first columns of the norm in Equation III-B. The matrix is partitioned according to what is explained by the linear model () and what is not explained by a linear model (). In block form, this is written:

(18) |

The optimal arguments can be extracted from the partitioned matrix in Equation 18 and :

(19) |

The desired linearization error , defined in Equation III-B, simplifies to:

(20) |

Note that the linearization error can be calculated without requiring the computed optimal arguments and in Equation 19. The scaler residual error in Equation 20 quantifies how well the temporal propagation of the sigma-points can be explained by a linear model, and therefore how well the local linearity assumptions made by the SP transform hold for the mixand being propagated. This metric is a refinement of the metric proposed by the authors in Havlak and Campbell [19], and similar to the metric proposed in Huber [22]. Other accuracy metrics for the sigma-point transform have been developed, for example Van Der Merwe [46] proposes a metric based on the Taylor series expansion of the dynamics function, which requires a differentiable dynamics model. If the error metric is above some defined threshold (), the mixand can be targeted for covariance reduction. However, this metric is scalar, and in order to effectively reduce the covariance of the mixand to a specified level of linearization error, it is desirable to identify which sigma-points are explained the least by the optimal linear model.

Defining as the difference between the sigma-points propagated through the system model and the optimal linear model,

(21) |

The column of , is the residual linearization error associated with the sigma-point. Similar to the linearization error , can be computed without computing the optimal linearization, given by:

(22) |

To effectively reduce the covariance of the mixand in order to reduce the linearization error, the direction along which the linearization error is greatest is identified as the optimal splitting axis (). To find , the sigma points before propagation () are weighted by the norm of their associated residual linearization errors (). The first eigenvector of the second moment of this set of weighted sigma points is the axis along which the residual linearization error is the greatest, and is used as the optimal splitting axis.

### Iii-C Gaussian Mixand Splitting

Given the sigma point residual error of the propagated mixand in Equation 22, the next step is to adapt the mixture to decrease this error (and increase the accuracy of the temporal propagation). This is accomplished by identifying mixands that are poorly propagated by the SP transform (using Equation 22) and replacing them with several mixands with smaller covariances.

The approximation of a GMM by a GMM with fewer mixands has been explored in the literature [23, 40, 16, 20, 39], as classical GMM filters experience geometric growth in the number of mixands and require a mixture reduction step in order to remain computationally tractable. The problem here, however, is the reverse – splitting a mixand in order to reduce the variance of mixands. This problem is only beginning to receive attention in the literature [22, 19, 1, 38], although there has been interest in the topic in the machine learning community [43, 10, 50]. Other adaptive GMM filters include Caputi and Moose [5], which relies on non-Gaussian process noise to increase the number of mixands, and Terejanu et al. [42], which adapts the GMM to nonlinearities by developing an update rule for the mixand weights rather than by increasing the number of mixands.

There are two competing goals in approximating a single Gaussian by a GMM. First, the GMM must closely approximate the original Gaussian. Second, the covariance of the mixands in the GMM must be significantly smaller than the covariance of the original Gaussian. A common metric used to evaluate how closely two distributions approximate each other (e.g. comparing the GMM to the original Gaussian) is the Kullback Leibler divergence (KLD), which measures the relative entropy between two distributions[29]. However, the KLD between a Gaussian and a GMM can not be evaluated analytically, and effective approximations can be resource intensive[17, 21]. An alternative metric is the integral-squared difference (ISD), which has a closed-form analytic solution when comparing a single Gaussian to a Gaussian mixture [49]. Because the ISD can be exactly and quickly evaluated, it is used here as the statistical error metric when comparing a Gaussian and its GMM [49].

(23) |

Formally, the Gaussian mixand splitting problem is defined as the approximation of a single Gaussian mixand by a GMM such that the covariances of the mixands () are smaller than the original (), and the ISD between the original Gaussian and its GMM (Equation 23) is minimized. The formal problem is to solve for for all such that:

(24) |

Because this optimization is computationally expensive, an off-line optimal solution is computed and stored for use as needed by the anticipation algorithm in real time. Pre-computing an optimal split for real time use requires that the pre-computed split can be applied to an arbitrary single Gaussian and splitting axis (i.e. the general problem). Any single Gaussian and splitting axis combination can be transformed to the problem of splitting a unit, zero-mean Gaussian along the -axis () using an affine transformation.

Given a Gaussian mixand and a splitting axis , the following transformations are applied in order to arrive at the pre-computed problem:

where is the matrix square-root of

and is a rotation matrix, computed such that the final splitting axis aligns with the -axis:

(25) |

Figure 2 illustrates the transformation in Equation 25 applied to a general two dimensional Gaussian. Figure 2(a) shows the original single Gaussian to be split, and the axis along which the Gaussian is to be split, . Figures 2(b) and 2(c) show the translation of the Gaussian to the origin () and the transformation that yields a unit covariance of the single Gaussian. Finally, Figure 2(d) shows the rotation () applied that aligns the splitting axis with the -axis.

The splitting problem has now been posed as splitting a zero-mean, unit Gaussian along the -axis into an -component mixture of Gaussians:

(26) |

The solution to this optimization problem is computed off-line, and can be applied at runtime to the general problem (Equation 24) using the transformation from Equation 25:

(27) |

The offline optimization (Equation 26) solves for a Gaussian mixture with an odd number () elements such that the ISD between the split Gaussian mixture and the Gaussian distribution is minimized. For an dimensional Gaussian, there are parameters associated with the means, parameters associated with the covariances, and parameters associated with the weights. Due to the large parameter space, the optimization problem is ill-posed, and computationally intractable even for off-line optimization. To address these problems, several constraints are imposed to reduce the parameter space. First, the means through are restricted to lie on the -axis and be evenly spaced, reducing the optimization parameters associated with the means to a single spread parameter :

(28) |

This assumption is reasonable, as it spreads the means of split mixture along the splitting axis and only this dimension of the covariance is being targeted for reduction. Furthermore, the derivative of with respect to the off--axis elements of the means evaluated at the proposed ’s is exactly zero.

The next parameter reduction constrains the covariance matrices through to be diagonal, equal, and of the form:

where

(30) |

This targets only the covariance along the splitting axis; all other elements of the split covariance matrices are optimally (the derivate of with respect to these elements evaluated at the proposed value is zero), so they are not considered. This step reduces the parameters associated with the variance matrices to a single parameter .

Finally, a further reduction to the parameter space is made by recognizing that the optimal weights ( through ) for a given set of means ( through ) and variances ( through ) can be found using a quadratic program, removing the weights from the parameter search space entirely. Expanding the ISD, defined in Equation 23, yields:

Williams provides a closed-form solution for each of the above terms:[49]

Re-arranging the terms yields a quadratic program with the weights as arguments:

where

such that

(31) |

Equation III-C can be solved with a constrained quadratic program to find the optimal weights for the split, . This allows the weights to be removed from the parameter space, as the optimal weights can be easily computed for each set of parameters considered.

The resulting parameter search space includes only the spacing of the means () and the reduction of the covariance (), making a high-resolution exhaustive parameter search possible. The optimization is an interesting trade between goals. First, should be small, as this represents the error introduced by approximating the original Gaussian by the split distribution. Second, should be small, as this reduces the effects of nonlinearities in the propagation. Finally, should be small in order to create manageable computation requirements.

Figure 3 plots the mean and standard deviation of the ISD as a function of , and for randomly generated examples of two-dimensional Gaussians distributions and splitting axes. For each value of , is varied between and and the optimal spread parameter and weights are computed. The optimized split is then applied to randomly generated two-dimensional Gaussians and splitting axes, and the ISD between each Gaussian and the resulting GMM after the split is computed. Note that the tight error bounds indicate that the optimized split consistently works well, even as Gaussian/splitting axis pairs vary, supporting the approximation in Equation III-C. Intuitively, minimizes the ISD, while most reduces the variance along the x-axis. It is proposed here to first choose a single value for , and then compute the optimal values for the spread parameter and the weights for different values of .

### Iii-D Benchmark Splitting Examples

To explore the impact of the parameter choices and on the accuracy of propagation of a single Gaussian through a nonlinear temporal model, two nonlinear models are used as benchmark examples. The first is the univariate non-stationary growth model (UNGM):[28, 7]

(32) |

and the second is a univariate cubic:

(33) |

The UNGM is commonly used to evaluate the performance of nonlinear filters[28, 7]. Also, for both the UNGM and the cubic model, the propagated distribution can be found analytically for comparison.

The accuracy of the GMM splitting routine is evaluated using 100 randomly generated samples; each is a normal distribution (means uniformly distributed between -2 and 2, and variances uniformly distributed between 0 and 2) and propagated through the UNGM (Equation III-D) and cubic (Equation III-D) models using cached splits. In order to explore the sensitivities in the proposed anticipation algorithm, the cached splits are generated using different variance reductions () and different numbers of mixands in the split (). The GMM approximation to the propagated distribution is compared to the exact propagated distribution . The performance metric used to evaluate the accuracy of propagation is the numerically evaluated KLD between the propagated mixture and the true distribution, or:

(34) |

Figure 4 plots the mean and standard deviation of the KLD between the true distribution and the GMM approximation as a function of and . The trends in each are similar. As decreases from , the accuracy of the split mixtures increases (smaller KLD compared to the true propagated distribution ). As becomes very small, approaching zero, the split mixture becomes a poor approximation of the prior, introducing error in the propagation (higher KLD values). As expected, larger values of allow smaller values of , and therefore more accurate mixture propagation. Even the least aggressive splits () result in KLDs of approximately one-half that of propagation with no splitting, while more aggressive splits show KLDs of approximately one-tenth that of the no-split solution.

These example problems can also be used to analyze how effective the linearity metric (Equation 20) is at predicting propagation errors. is stored for each single Gaussian propagated through the UNGM and the cubic function, and the Pearson correlation between the residual linearization error, a metric that predicts the error in propagation, and the KLD between the no-split solution and truth, which is the actual measured propagation error, is computed. The Pearson correlation coefficient is (UNGM) and (cubic model), indicating strong correlation between the residual linearization error metric and the actual KLD from the true mixture after propagation.

These two example problems indicate that the linearity criteria and the mixand splitting algorithms greatly improve the accuracy of propagation of a single Gaussian through a non-linear model. The choice of the linearity threshold () and of the split parameters and are problem dependent, but can be explored off-line prior to implementation.

## Iv Experimental Results

Two systems, one simulated and one experimental, are used to analyze and evaluate the hGMM anticipation algorithm. In both systems, the hGMM is used to predict the future behavior of a tracked car driving on a known roadmap. Additionally, the case of the 2007 Cornell-MIT autonomous collision is studied anecdotally. In order to manage mixture complexity, the GMM reduction method proposed in Runnalls [39] is used in the following examples.

### Iv-a Simulation

The simulation uses the hGMM to predict the future behavior of a simple simulated car driving in a Manhattan grid. Implementing in simulation allows the assumed dynamics model to exactly match the true dynamics model of the simulated obstacle vehicle. This allows the performance of the hGMM in predicting the distribution over future obstacle vehicle states to be studied independently of the assumed dynamics model.

The hGMM algorithm is implemented using a four-state obstacle vehicle model described in reference[19]. Obstacle vehicles are assumed to drive on a known road network and to obey specified traffic regulations. Obstacle vehicles are modeled as 4-state bicycle robots, with the continuous state:

(35) |

where and are the two-axis position of the center of the rear axle, is the speed of the obstacle vehicle, and is the heading of the obstacle vehicle, all at time . The continuous dynamics model for the obstacle vehicle is simple, though non-linear.

(36) |

Control inputs are throttle () and steering angle (), and zero-mean, Gaussian process noise () enters on the control inputs. The time step is seconds in this example.

The hybrid aspect of the obstacle vehicle anticipation problem enters through an assumed route planning controller, where the discrete states are road segments in the map. The discrete dynamics function assumes that the route planning controller takes all possible routes with equal probability. A path following controller is assumed, which generates control inputs and that enables the obstacle vehicle to follow the current route and maintain a speed of meters per second:

(37) |

This path following controller (Equation 37) is composed with the bicycle dynamics (Equation 36) to arrive at the continuous dynamics function (Equation II-A) required by the hGMM:

(38) |

The anticipation algorithm is used with a time horizon of seconds (approximately the time required for an obstacle to fully traverse an intersection) to predict the behavior of an obstacle car in several different scenarios. For comparison to the hGMM, a large particle set is used to approximate the true distribution of the obstacle state at future times. Three example scenarios (straight road, turn, and intersection) are used to compare the output of the hGMM algorithm to the approximate true distribution. The negative log-likelihood (NLL) is used to evaluate the difference between the true distribution, as represented by a particle set, and the hGMM approximation. The performance of the hGMM is evaluated for different values of , and compared to the baseline of the hGMM algorithm with no splitting, which is equivalent to a single Gaussian UKF anticipation algorithm.

Figure 5 shows the NLL metric for the three considered scenarios. The NLL is plotted as a function of lookahead time. Results are compared for four values of , for each scenario: the hGMM with , the hGMM with , the hGMM with , and finally the hGMM with ; the latter is equivalent to a single Gaussian UKF predictor. When the obstacle vehicle is traveling on a straight road (Figure 5(a)), all of the anticipation predictors show similar performance, because the dynamics model is very close to linear in this case. In Figures 5(b),5(c), the hGMM anticipation algorithm shows significant performance improvement over the no splitting predictor, as non-linearities have greater impact on the accuracy of propagation. The peaks in NLL that appear in the second and third scenario correspond to the future time when the vehicle is anticipated to be negotiating the bend in the road, when the model is exhibiting a high degree of non-linear behavior. Once the turn is negotiated, and the vehicle is anticipated to be on the straight road after the turn, the non-linearities fade and the NLL decreases again.

Figure 6 shows the hGMM anticipated distributions of the rear-axle position for the intersection scenario, for . The hGMM anticipation algorithm clearly captures the non-Gaussian nature of the probability masses that turn right and left. This example illustrates both the splitting due to the discrete dynamics (seen by separate probability masses taking each of the three options available at the intersection) and the splitting due to nonlinearities (seen by the non-Gaussian shape of the probability masses making the left and right turns). This figure also demonstrates why Figure 5(c) has a peak just before 2 seconds lookahead – the non-Gaussian shape of the individual probability masses is at its highest as the obstacle vehicle is anticipated to be negotiating the intersection, as in Figure 6(b).

### Iv-B Experimental Data

In order to validate the probabilistic anticipation approach on more realistic problems, the hGMM algorithm was evaluated on a set of tracked vehicle data near intersections. The data set used for this validation is the 2007 Columbus large image format (CLIF 2007) data set made available by the United States Air Force [14]. The data set consists of aerial imagery of an urban environment, at approximately two frames per second, collected by a large-format electronics observation platform. A large number of vehicles are observed driving in a variety of conditions. The validation focuses on vehicles observed traversing intersections, in relatively light traffic, that have the right-of-way. A total of 40 vehicle tracks at three different intersections are used.

In the proposed validation approach, an observed vehicle is tracked at time , giving a state estimate. This state is then predicted forward in time using the hGMM with various values for , and an anticipation horizon equal to the time over which the vehicle is tracked. The estimate of the observed vehicle state at time is used as the initial condition for the hGMM; subsequent measurements are used to evaluate how well the hGMM predicts the behavior of the vehicle. The same vehicle behavior model described in Section IV-A (Equation 36) is used in this experimental validation to describe the dynamics of observed vehicles. Successive measurements of the tracked obstacle at times after are compared to the anticipated distribution over the vehicle state to evaluate how well the hGMM algorithm anticipates the behavior of the tracked vehicle.

Figure 7 plots the log-likelihood (LL) of the hGMM anticipation agreeing with CLIF observations as a function of . At high values of , the LL is low with larger uncertainty bounds, indicating poorer predictive performance. The figure shows that smaller values for provide more accurate anticipation of observed cars, as well as much more consistent performance. At values for above approximately , the standard deviation becomes quite large. This indicates that the covariance of mixands in the hGMM becomes large enough that the sigma-point propagation becomes a poor and unpredictable approximation. For very small values of , performance also degrades due to errors introduced by repeated mixand splitting in the hGMM.

Figure 8 plots the p-values of t-test comparisons between the log-likelihood data for each linearity threshold to the log-likelihood data for the maximum linearity threshold (). This figure shows that the improved performance seen in Figure 7 for linearity thresholds is statistically significant, as the corresponding p-values are below . For linearity thresholds below this range, the accumulated errors introduced by repeated mixand splitting degrade performance, and the log-likelihood performance is not statistically different from the data taken with a linearity threshold high enough that no splitting occurs ().

The LL metric studied in Figures 7 and 8 considers only the likelihood that the observations agree with the anticipated obstacle state, and therefore only evaluates the hGMM at the observations. Because of this, unreasonable predictions (i.e. the car being very far outside the driving lane) are not penalized by this metric. To compliment this metric, a second metric is proposed: the expected off-track error (EOTE). The EOTE is defined as:

(39) |

where is the anticipation horizon, is the distance of the rear axle position from the lane center, and is the anticipated probability of the vehicle being at that position from the hGMM. This metric is equivalent to the expected value of the out-of-lane position of the observed vehicle. This metric directly measures how reasonable the output of the hGMM is, assuming the observed vehicle is not behaving anomalously.

Figure 9 plots the EOTE as a function of the linearity threshold used in the hGMM. At high values of , the EOTE is large and has a large standard deviation, indicating poor anticipation performance. For small values of , the EOTE and its standard deviation decrease, indicating that the hGMM predictions are more reasonable and more consistent. As with Figure 7, for above the standard deviation of this metric becomes large as the mixand covariances become too large for the sigma-point approximation.

Figure 10 plots the results of t-test comparisons between the EOTE data for linearity threshold to the EOTE data for the maximum linearity threshold. These results show that the decreased EOTE seen in Figure 9 for small linearity thresholds () is statistically significant.

Figure 11 shows the results of the hGMM and single Gaussian anticipation algorithms, respectively, for a vehicle making a turn at an intersection. The initial observation of the vehicle (green circle) is plotted along with the anticipated distribution of the rear-axle position at future anticipated times. The actual measurements are plotted as green crosses. Comparing the two approaches, the hGMM predicts the measurements about as well as the single Gaussian method, according to the LL metric in Figure 7. However, Figures 11(c) and 11(f) show that the hGMM distribution is much more reasonable than the single Gaussian method distribution in predicting locations that are actually on the road network, particularly as the look ahead time increases. This is demonstrated by the large amounts of probability mass outside of the actual expected driving corridor for the vehicle in Figure 11(f), and is quantified by the EOTE metric which shows clearly superior performance of the hGMM algorithm.

### Iv-C MIT-Cornell Collision Example

A motivating example for the importance of this work is the collision between the Cornell and MIT entries in the 2007 DARPA Urban Challenge (DUC). The collision occurred when the Cornell entry, Skynet, stopped due to a perceived blockage that was the result of a misplaced waypoint in the roadmap. MIT’s entry, Talos, observing that Skynet had stopped, initiated a pass. While Talos was executing the pass, Skynet recovered and began moving again. Talos, not recognizing that Skynet was no longer a static obstacle, moved back into the driving lane while Skynet, not able to anticipate the behavior of Talos, continued forward. The result was the low speed collision shown in Figure 12. Further details on the collision are available in Fletcher et al. [12].

To demonstrate the efficacy of the hGMM anticipation algorithm at improving safety, the scenario is re-visited using logged data from the 2007 collision. In this example, Skynet uses the hGMM to anticipate the behavior of Talos (using the simple vehicle model described in IV-A). At the same time, Skynet simulates it’s own trajectory for the proposed behavior of resuming motion. The probability of collision between Skynet’s proposed motion and the hGMM over Talos’ anticipated state is evaluated using the approximation developed in Hardy and Campbell [18], over a horizon of 3 seconds at 10 Hz.

Figure 13 shows the results of the anticipation as a function of the lookahead time. Figure 13(a) shows the initial condition – Talos is shown as the black car, Cornell is shown as the green car. Figures 13(b), 13(c), and 13(d) show the scenario as it evolves at three different lookahead times. The hGMM probability distribution prediction of Talos is plotted – in this case, the x-y position of the rear axle. Skynet’s proposed position at the given lookahead time is also shown, and the color corresponds to the probability of collision, with green being safe (probability of collision of zero) and red being dangerous (probability of collision of one). Figure 14 plots the anticipated probability of collision as a function of lookahead time. The anticipated probability of collision increases as the anticipation algorithm looks ahead in time, and a collision is almost guaranteed by 2.5 seconds. These figures clearly show that even with a basic obstacle model, the hGMM could have predicted, and therefore prevented, the MIT-Cornell collision.

Figure 15 plots the time required to predict an obstacle’s state forward 4.5 seconds for a range of values for both the linearity threshold and the maximum number of mixands allowed at the end of each prediction step. The hGMM is implemented in C# in Skynet’s planning algorithms on a modern, Intel Core i7 rackmount computer. The computation time grows with the number of allowed mixands and the inverse of the linearity threshold, as expected.

Figure 16 plots the prediction accuracy of the hGMM as a function of the linearity threshold and the maximum number of allowed mixands. Here the prediction accuracy is measured by computing the Log-Likelihood between the hGMM result and a large particle set propagated using the same vehicle dynamics model as the hGMM. Figure 16 shows that prediction accuracy improves as the linearity threshold becomes smaller, and degrades as the number of allowed mixands becomes small. Note that the performance is independent of the number of allowed mixands until the number of mixands becomes very small. Comparing Figures 15 and 16 provides insight on how to select the linearity threshold and the number of allowed mixands for the assumed dynamics model.

The hGMM is currently implemented in Skynet’s planning algorithms with a linearity threshold of and a maximum number of allowed mixands of 10. This implementation allows Skynet to anticipate the behavior of up to three obstacle vehicles to a horizon of 4.5 seconds at greater than 3 Hz. Because the anticipation result is never older than seconds, but extends to 4.5 seconds in the future, this is considered real-time performance.

## V Conclusion

An anticipation algorithm is developed which uniquely recognizes and mitigates the impact of non-linearities in the hybrid dynamics function on the accuracy of probability distribution propagation. A unique method for evaluating the accuracy of propagation of a Gaussian distribution through non-linear dynamics is proposed, using the propagated sigma-points from the distribution. In addition, a new method for splitting propagated Gaussian mixands due to nonlinearities is developed. By detecting, and reacting to propagation errors introduced by nonlinearities, the proposed algorithm is shown to have significant improvements in accuracy over standard propagation methods, such as the Unscented Transform.

The behavior of the non-linearity detection and the mixand splitting algorithms are explored using several simulated example problems, and compared to a single Gaussian sigma point method. In the case of benchmark nonlinear problems, the hGMM anticipation algorithm is shown to better approximate the true distributions that arise than the single Gaussian sigma point method. The second set of simulations use an obstacle vehicle driving on a known road network. The accuracy of the anticipated probability distributions is compared across choices for the linearity threshold parameter in the hGMM anticipation algorithm, and compared to a single Gaussian sigma point predictor. Using a large particle set as truth, the hGMM is shown to more accurately capture the behavior of the distribution over future obstacle vehicle states, defined as the negative log-likelihood between the particle set and the anticipated distribution.

The hGMM anticipation algorithm is validated on an experimental data set. Specifically, the hGMM algorithm is used to predict the behavior of vehicles in the CLIF 2007 data set provided by the USAF. The hGMM is compared to a single Gaussian sigma point method by comparing predictions of a tracked vehicle state to observations of the tracked vehicle. The hGMM is shown to provide increased accuracy over the single Gaussian sigma point method using the log-likelihood as a metric. The hGMM is also shown to provide predictions that are more reasonable (i.e. that do not include predictions of anomalous behavior for vehicles that are not behaving anomalously, by the expected off-track error metric) than propagation with no splitting.

Additionally, the hGMM is applied to the scenario leading to the MIT-Cornell collision in the 2007 DUC, and it is shown that the hGMM could have anticipated the collision in time to prevent it.

## References

- Ali-Löytty and Sirola [2007] S. Ali-Löytty and N. Sirola. Gaussian mixture filter in hybrid navigation. In Proceedings of The European Navigation Conference GNSS, volume 2007, 2007.
- Alspach and Sorenson [1972] D. Alspach and HW Sorenson. Nonlinear bayesian estimation using gaussian sum approximations. Automatic Control, IEEE Transactions on, 17(4):439–448, 1972.
- Bacha et al. [2008] Andrew Bacha, Cheryl Bauman, Ruel Faruque, Michael Fleming, Chris Terwelp, Charles F. Reinholtz, Dennis Hong, Al Wicks, Thomas Alberi, David Anderson, Stephen Cacciola, Patrick Currier, Aaron Dalton, Jesse Farmer, Jesse Hurdus, Shawn Kimmel, Peter King, Andrew Taylor, David Van Covern, and Mike Webster. Odin: Team victortango’s entry in the darpa urban challenge. J. Field Robotics, 25(8):467–492, 2008.
- Bishop [2000] R. Bishop. Intelligent vehicle applications worldwide. Intelligent Systems and Their Applications, IEEE, 15(1):78–81, 2000.
- Caputi and Moose [1993] Mauro J Caputi and Richard L Moose. A modified gaussian sum approach to estimation of non-gaussian signals. Aerospace and Electronic Systems, IEEE Transactions on, 29(2):446–451, 1993.
- [6] J.S. Choi, G. Eoh, J. Kim, Y. Yoon, J. Park, and B.H. Lee. Analytic collision anticipation technology considering agents’ future behavior. In Intelligent Robots and Systems (IROS), 2010 IEEE/RSJ International Conference on, pages 1656–1661. IEEE.
- Doucet et al. [2000] A. Doucet, S. Godsill, and C. Andrieu. On sequential monte carlo sampling methods for bayesian filtering. Statistics and computing, 10(3):197–208, 2000.
- Du Toit and Burdick [2010] N.E. Du Toit and J.W. Burdick. Robotic Motion Planning in Dynamic, Cluttered, Uncertain Environments. In 2010 IEEE Conference on Robotics and Automation, pages 966–973, 2010.
- Du Toit and Burdick [May] N.E. Du Toit and J.W. Burdick. Robotic motion planning in dynamic, cluttered, uncertain environments. In Robotics and Automation (ICRA), 2010 IEEE International Conference on, pages 966–973, May. doi: 10.1109/ROBOT.2010.5509278.
- Faubel et al. [2009] F. Faubel, J. McDonough, and D. Klakow. The split and merge unscented gaussian mixture filter. Signal Processing Letters, IEEE, 16(9):786–789, 2009.
- Ferguson et al. [2008] D. Ferguson, M. Darms, C. Urmson, and S. Kolski. Detection, prediction, and avoidance of dynamic obstacles in urban environments. In 2008 IEEE Intelligent Vehicles Symposium, pages 1149–1154, 2008.
- Fletcher et al. [2008] L. Fletcher, S. Teller, E. Olson, D. Moore, Y. Kuwata, J. How, J. Leonard, I. Miller, M. Campbell, D. Huttenlocher, et al. The mit-cornell collision and why it happened. Journal of Field Robotics, 25(10):775–807, 2008.
- Folsom [2011] T.C. Folsom. Social ramifications of autonomous urban land vehicles. In IEEE International Symposium on Technology and Society, May, 2011.
- Force [2007] United States Air Force. 2007 clif data set. 2007. URL https://www.sdms.afrl.af.mil/index.php?collection=clif2007.
- Fulgenzi et al. [2009] C. Fulgenzi, A. Spalanzani, and C. Laugier. Probabilistic rapidly-exploring random trees for autonomous navigation among moving obstacles. In Workshop on safe navigation, IEEE International Conference on Robotics and Automation (ICRA), 2009.
- Garcia et al. [2010] V. Garcia, F. Nielsen, and R. Nock. Levels of details for gaussian mixture models. Computer Vision–ACCV 2009, pages 514–525, 2010.
- Goldberger and Aronowitz [2005] J. Goldberger and H. Aronowitz. A distance measure between gmms based on the unscented transform and its application to speaker recognition. In Ninth European Conference on Speech Communication and Technology. Citeseer, 2005.
- Hardy and Campbell [In Press] J. Hardy and M. Campbell. Contingency Planning over Probabilistic Hybrid Obstacle Predictions for Autonomous Road Vehicles. In IEEE International Conference on Intelligent Robots and Systems, 2010, In Press.
- Havlak and Campbell [2010] F. Havlak and M. Campbell. Discrete and continuous, probabilistic anticipation for autonomous robots in urban environments. In Proceedings of SPIE, volume 7833, page 78330H, 2010.
- Hennig [2010] C. Hennig. Methods for merging gaussian mixture components. Advances in Data Analysis and Classification, 4(1):3–34, 2010.
- Hershey and Olsen [2007] J.R. Hershey and P.A. Olsen. Approximating the Kullback Leibler divergence between Gaussian mixture models. In IEEE International Conference on Acoustics, Speech and Signal Processing, 2007. ICASSP 2007, volume 4, 2007.
- Huber [2011] M.F. Huber. Adaptive gaussian mixture filter based on statistical linearization. 2011.
- Huber and Hanebeck [2008] M.F. Huber and U.D. Hanebeck. Progressive Gaussian mixture reduction. In Information Fusion, 2008 11th International Conference on, pages 1–8. IEEE, 2008.
- Huttenlocher et al. [2008] I.M.M.C.D. Huttenlocher, F.R.K.A.N. Sergei, L.J.C.B.S. Pete, M.N.Z.E.G. Mike, and K.H. Fujishima. Team Cornell’s Skynet: Robust Perception and Planning in an Urban Environment. Journal of Field Robotics, 25(8), 2008.
- Juan et al. [2006] Z. Juan, J. Wu, and M. McDonald. Socio-economic impact assessment of intelligent transport systems. Tsinghua Science & Technology, 11(3):339–350, 2006.
- Julier and Uhlmann [1996] S. Julier and J.K. Uhlmann. A general method for approximating nonlinear transformations of probability distributions. Robotics Research Group, Department of Engineering Science, University of Oxford, Oxford, OC1 3PJ United Kingdom, Tech. Rep, 1996.
- Julier and Uhlmann [1997] S.J. Julier and J.K. Uhlmann. A new extension of the kalman filter to nonlinear systems. In Int. Symp. Aerospace/Defense Sensing, Simul. and Controls, volume 3, page 26. Spie Bellingham, WA, 1997.
- Kotecha and Djuric [2003] J.H. Kotecha and P.M. Djuric. Gaussian sum particle filtering. Signal Processing, IEEE Transactions on, 51(10):2602–2612, 2003.
- Kullback [1968] S. Kullback. Information theory and statistics. Dover, 1968.
- [30] Autonomous Labs. Autonomous car navigates the streets of berlin.
- Levinson et al. [2011] J. Levinson, J. Askeland, J. Becker, J. Dolson, D. Held, S. Kammel, J.Z. Kolter, D. Langer, O. Pink, V. Pratt, et al. Towards fully autonomous driving: Systems and algorithms. In Intelligent Vehicles Symposium (IV), 2011 IEEE, pages 163–168. IEEE, 2011.
- Markoff [2010] John Markoff. Google cars drive themselves, in traffic. New York Times, Oct 2010.
- Markoff [2012] John Markoff. Collision in the making between self-driving cars and how the world works, Jan 2012.
- Miller et al. [2008] Isaac Miller, Mark E. Campbell, Dan Huttenlocher, Frank-Robert Kline, Aaron Nathan, Sergei Lupashin, Jason Catlin, Brian Schimpf, Pete Moran, Noah Zych, Ephrahim Garcia, Mike Kurdziel, and Hikaru Fujishima. Team cornell’s skynet: Robust perception and planning in an urban environment. J. Field Robotics, 25(8):493–527, 2008.
- Miura and Shirai [2002] J. Miura and Y. Shirai. Probabilistic uncertainty modeling of obstacle motion for robot motion planning. Journal of Robotics and Mechatronics, 14(4):349–356, 2002.
- Ohki et al. [2010] T. Ohki, K. Nagatani, and K. Yoshida. Collision avoidance method for mobile robot considering motion and personal spaces of evacuees. In Intelligent Robots and Systems (IROS), 2010 IEEE/RSJ International Conference on, pages 1819–1824. IEEE, 2010.
- Petti et al. [2005] Stephane Petti, Thierry Fraichard, Inria Rocquencourt, and Ecole Des Mines De Paris. Safe motion planning in dynamic environments. pages 3726–3731, 2005.
- Psiaki et al. [2010] Mark L Psiaki, Jonathan R Schoenberg, and Isaac T Miller. Gaussian mixture approximation by another gaussian mixture for” blob” filter re-sampling. AIAA Paper, (2010-7747):2–5, 2010.
- Runnalls [2007] Andrew R Runnalls. Kullback-leibler approach to gaussian mixture reduction. Aerospace and Electronic Systems, IEEE Transactions on, 43(3):989–999, 2007.
- Salmond [2009] D.J. Salmond. Mixture reduction algorithms for point and extended object tracking in clutter. Aerospace and Electronic Systems, IEEE Transactions on, 45(2):667–686, April 2009. ISSN 0018-9251. doi: 10.1109/TAES.2009.5089549.
- Sorenson and Alspach [1971] H.W. Sorenson and D.L. Alspach. Recursive bayesian estimation using gaussian sums. Automatica, 7(4):465–479, 1971.
- Terejanu et al. [2008] Gabriel Terejanu, Puneet Singla, Tarunraj Singh, and Peter D Scott. Uncertainty propagation for nonlinear dynamic systems using gaussian mixture models. Journal of guidance, control, and dynamics, 31(6):1623, 2008.
- Ueda et al. [1999] Naonori Ueda, Ryohei Nakano, Zoubin Ghahramani, and Geoffrey E. Hinton. Smem algorithm for mixture models. NEURAL COMPUTATION, pages 200–0, 1999.
- Uhlmann [1995] J.K. Uhlmann. Dynamic map building and localization: New theoretical foundations. PhD thesis, University of Oxford, 1995.
- Urmson et al. [2008] Chris Urmson, Joshua Anhalt, Drew Bagnell, Christopher R. Baker, Robert Bittner, M. N. Clark, John M. Dolan, Dave Duggins, Tugrul Galatali, Christopher Geyer, Michele Gittleman, Sam Harbaugh, Martial Hebert, Thomas M. Howard, Sascha Kolski, Alonzo Kelly, Maxim Likhachev, Matthew McNaughton, Nick Miller, Kevin Peterson, Brian Pilnick, Raj Rajkumar, Paul E. Rybski, Bryan Salesky, Young-Woo Seo, Sanjiv Singh, Jarrod Snider, Anthony Stentz, William Whittaker, Ziv Wolkowicki, Jason Ziglar, Hong Bae, Thomas Brown, Daniel Demitrish, Bakhtiar Litkouhi, Jim Nickolaou, Varsha Sadekar, Wende Zhang, Joshua Struble, Michael Taylor, Michael Darms, and Dave Ferguson. Autonomous driving in urban environments: Boss and the urban challenge. J. Field Robotics, 25(8):425–466, 2008.
- Van Der Merwe [2004] Rudolph Van Der Merwe. Sigma-point Kalman filters for probabilistic inference in dynamic state-space models. PhD thesis, University of Stellenbosch, 2004.
- Šimandl and Dunık [2005] M. Šimandl and J. Dunık. Sigma point gaussian sum filter design using square root unscented filters. In Preprints of the 16th IFAC World Congress. IFAC. Prague, 2005.
- Wan and Van Der Merwe [2000] Eric A Wan and Rudolph Van Der Merwe. The unscented kalman filter for nonlinear estimation. In Adaptive Systems for Signal Processing, Communications, and Control Symposium 2000. AS-SPCC. The IEEE 2000, pages 153–158. IEEE, 2000.
- Williams and Maybeck [2003] J.L. Williams and P.S. Maybeck. Cost-function-based gaussian mixture reduction for target tracking. In Proc. 6th Int. Conf. Inform. Fusion, volume 2, pages 1047–1054, 2003.
- Zhang et al. [2003] Zhihua Zhang, Chibiao Chen, Jian Sun, and Kap Luk Chan. Em algorithms for gaussian mixtures with split-and-merge operation, 2003.
- Ziebart et al. [2009] B.D. Ziebart, N. Ratliff, G. Gallagher, C. Mertz, K. Peterson, J.A. Bagnell, M. Hebert, A.K. Dey, and S. Srinivasa. Planning-based prediction for pedestrians. In Intelligent Robots and Systems, 2009. IROS 2009. IEEE/RSJ International Conference on, pages 3931–3936. IEEE, 2009.