Discrete and Continuous, Probabilistic Anticipation for Autonomous Robots in Urban Environments

Discrete and Continuous, Probabilistic Anticipation for Autonomous Robots in Urban Environments


This paper develops a probabilistic anticipation algorithm for dynamic objects observed by an autonomous robot in an urban environment. Predictive Gaussian mixture models are used due to their ability to probabilistically capture continuous and discrete obstacle decisions and behaviors; the predictive system uses the probabilistic output (state estimate and covariance) of a tracking system, and map of the environment to compute the probability distribution over future obstacle states for a specified anticipation horizon. A Gaussian splitting method is proposed based on the sigma-point transform and the nonlinear dynamics function, which enables increased accuracy as the number of mixands grows. An approach to caching elements of this optimal splitting method is proposed, in order to enable real-time implementation. Simulation results and evaluations on data from the research community demonstrate that the proposed algorithm can accurately anticipate the probability distributions over future states of nonlinear systems.


Autonomous urban driving is an important and maturing field in mobile robotics. Intelligent vehicles promise to improve both road safety, vehicle efficiency, and convenience[13]. The finals of the 2007 DARPA Urban Challenge (DUC) was an empirical evaluation of the state-of-the-art at the time, integrating 11 autonomous vehicles together with other robots and human drivers in an urban environment for long-duration operations ( 4 hours)[34]. Continued development in the field has led to autonomous cars beginning to drive in real urban environments alongside civilian traffic [33].

Many autonomous cars use primarily reactionary planners that rely on rapid re-planning in order to respond to the dynamic environments in which they operate [24]. A collision between the MIT and Cornell entries was one of several examples in the 2007 DUC that raised safety concerns about reactionary planning for autonomous driving [12]. One approach proposed to more intelligently handle autonomous driving in dynamic environments is to incorporate ‘anticipation’ into path planning, or predicting the future motion of dynamic obstacles for use in planning. This area has been an active topic in mobile robotics in recent years [18]. This paper presents a formal method for probabilistically anticipating the future behavior of tracked obstacles.

The problem of anticipation is inherently probabilistic, as it is impossible to know the true future behavior of dynamic obstacles that make their own independent decisions. In addition, the behavioral models used to anticipate obstacle behavior are often highly non-linear. In the literature, several algorithms have been proposed to simplify the problem, such as assuming no uncertainty in future obstacle behavior [37]. These algorithms are well suited for cooperative situations, where the obstacles can communicate their intentions to the robot, or for short anticipation horizons. However, they do not provide sufficient information for reasoning about an obstacle with unknown intentions over a significant anticipation horizon. Similarly, several proposed methods consider only a subset of obstacle uncertainty, such as along-track error [35]. These approaches reduce the complexity of the problem to a manageable level, while still considering the probabilistic aspects of obstacle anticipation, but are typically very simple and narrow in their application.

Another class of algorithms applies standard estimation filters (Kalman Filter, Extended Kalman Filter, etc.) to the problem of anticipation [15]. Such approaches assume a model for the behavior of the obstacle, and provide mathematically rigorous, probabilistic estimates of that obstacle’s state over the anticipation horizon. These approaches are well-suited to obstacles that are accurately described by linear models because they maintain a single Gaussian to represent the uncertainty. For obstacles with more complex behaviors, such as those based on non-linear dynamics (e.g. cars, bicycles, etc.) and those that make discrete decisions (e.g. intersections, passing, etc.), the uncertainty of the anticipated obstacle state becomes inaccurate very quickly, thus severely limiting the prediction horizon. [8] has been extended by the authors to use Gaussian Mixture Models (GMMs) to capture multiple obstacle hypothesis, but is still focused primarily on linear obstacle models [9].

More complex uncertainties can be addressed, while avoiding the linearization problems of standard filters; for example, Monte-Carlo (MC) methods[11]. These approaches are attractive because they can consider complex, non-Gaussian uncertainties and allow for the use of non-linear obstacle models to capture complex obstacle behavior. However, the accuracy of prediction scales with the number of particles, and there are no guarantees that the particle set effectively covers the true distribution of possible future obstacle states. Because the assumed dynamics model for the obstacle has to be evaluated at every particle used in anticipation, increasing confidence in the estimate is strongly traded with computational resources.

A GMM based predictor is proposed in the paper to anticipate obstacle behaviors [2]. GMMs provide a well-suited representation to probabilistically anticipate non-Gaussian obstacle states. Here, the GMM is used to uniquely include discrete state elements that capture complex, high-level obstacle behaviors. Accurate anticipation of a wide variety of dynamic obstacles is ensured using a novel method for detecting linearization errors using sigma-point methods, and adjusting the mixture accordingly by optimally splitting inaccurately propagated mixands. This approach reduces the individual covariances of inaccurately propagated mixands, bringing them into a nearly linear regime. The presented algorithm provides accurate, probabilistic future obstacle state estimates and is shown to perform well even with highly non-linear obstacle motion models.

This paper is outlined as follows: Section 2.1 defines the representation of the obstacle state and Section 2.2 defines the mixture model. Sections Section 2.3 describes the discrete and continuous anticipation of the obstacle state. Section 2.4 provides an overview of the proposed algorithm. The details of temporal propagation are described in Section 3, including the non-linearity detection and mixand splitting. Section 4.1 provides an example implementation of the anticipation algorithm in simulation, and Section 4.2 demonstrates the efficacy of the algorithm on a real data set. Section 4.3 demonstrates the potential safety improvements by applying the proposed algorithm to the 2007 MIT-Cornell autonomous collision.

2Hybrid Mixture Anticipation Algorithm

The hybrid mixture anticipation algorithm described in this paper is designed to predict the probability distribution over the state of a tracked obstacle forward in time. Hybrid, here, is used to denote jointly discrete and continuous components. The intended application of the presented algorithm is to make accurate, probabilistic information about future obstacle behaviors available for use in path planning. In order to provide the most general algorithm, obstacle models can include non-linear behaviors, as well as discrete variables to capture higher-level decisions. To meet these requirements, the distribution over the obstacle state is described using a hybrid Gaussian/discrete mixture model (hGMM).

2.1Obstacle State Model

The obstacle state at time () is assumed to have continuous elements (, representing position, velocity, etc.) and discrete elements (, representing behavioral modes). The state vector is partitioned accordingly:

A model for the evolution of the obstacle state is assumed to be available, and partitioned into discrete () and continuous () components:

where is the process noise vector at time . and are both functions that take as inputs point values for the vehicle state () and process noise (), as this is typically how models are defined. The following sections generalize these to and , which operate over distributions over, rather than samples from, the vehicle state and process noise.

2.2Hybrid Mixture Probability Distribution Representation

The probability distribution is approximated using an hGMM. The hGMM extends the GMM presented in [2] by including discrete variables. Discrete variables allow the hGMM to capture both continuous behavior of a system (position, velocity, etc.) as well as high-level, abstract behaviors (turning left, going straight, etc.). The hGMM inherits the capability of GMMs to represent many general probability distributions, with increasing accuracy in the limit of a large number of mixands, while still maintaining the convenient computational properties of Gaussian distributions. The hGMM is defined:

where is the number of mixands in the hGMM at time , and are the weights on each mixand at time , such that

The mixand is defined as a Gaussian distribution over the continuous state elements and a hypothesis (using a delta function) over the discrete state elements:

where is the mixand hypothesis of the discrete state, and and are the mean and covariance of the Gaussian distribution over the continuous state.

2.3Hybrid Mixture Propagation

The hGMM formulation enables each mixand to be propagated forward in time independently using the dynamics model (Equation Equation 1), thereby reducing the complexity of the problem. The propagation of the mixands is complicated in two ways, however. First, more than one discrete state can be transitioned into (for example, a tracked obstacle approaching an intersection may turn left, turn right, or continue straight). Second, the variance on the continuous state elements may grow to the point where the mixand can no longer be accurately propagated through the dynamics model. Each of these are addressed in the proposed probabilistic anticipation algorithm.

Consider the discrete mixand propagation through the dynamics function from Equation 1, is:

where is the discrete state for the mixand at the next time step (). In cases where a mixand has multiple possible next discrete states (such as a choice of roads at an intersection), the mixand is split so that one copy of the mixand can transition to each possible discrete state. Although the continuous aspects of the mixand () are not affected by the discrete propagation, the time index is used to account for growth in the mixture size due to mixands transitioning to multiple discrete states.

Similarly, the continuous mixand propagation function , is defined as:

where the right-hand side characterizes the propagated hGMM after one full step.

2.4Algorithm Overview

Figure 1: Block diagram illustrating propagation of hybrid Gaussian mixture through a given dynamics model
Figure 1: Block diagram illustrating propagation of hybrid Gaussian mixture through a given dynamics model

The propagation steps of the hGMM are summarized in Figure 1. First, each mixand in the hGMM at time is propagated through the discrete dynamics. This step has the potential to increase the number of mixands in the hGMM due to the possibility of multiple discrete decisions being available to mixands. Next, each mixand in the hGMM after discrete propagation is propagated through the continuous dynamics. This step includes a linearity criteria (Section Section 3.2) that ensures the accuracy of propagation by splitting mixands that propagate inaccurately (Section Section 3.3). After the continuous propagation is completed, the probability distribution over the state at time can be written as:

Note that the number of mixands () in the hGMM after the continuous propagation can be larger than the number of mixands after the discrete propagation () due to the possibility that some mixands were split to ensure accurate propagation through nonlinear dynamics.

3Mixand Continuous Propagation

The first step in the mixture continuous propagation is to predict the continuous state distribution in each of the mixands forward one time step. The continuous dynamics function (Equation Equation 4) uses the sigma-points (SP) transform (sometimes called the unscented transform) [27] to propagate the mixand through the continuous dynamics function (Equation Equation 1). Sigma-points are well-explored for use in estimation problems involving nonlinear dynamics and measurement models[27]. The Gaussian distributions and are approximated by deterministically choosing sets of points and , called sigma points:

where is the dimension of and is the dimension of ; the process noise distribution is assumed to be time-invariant and Gaussian without loss of generality – time varying GMM process noise distributions can be used with minor modifications. To find the individual sigma points in Equation 6, the matrix square-roots of the covariances and are first solved:

such that

The individual sigma-points are then defined using these matrix square-roots and a parameter :

where .

Each pair of points () is then individually propagated through , and the resulting set is used to evaluate the impact of nonlinearities (Section Section 3.2) on the accuracy of prediction, and to find the predicted distribution at time , (Section Section 3.1).

3.1No Splitting Case

If the mixand passes the linearity criteria described in Section 3.2, the propagated sigma points are used to find the predicted mean and covariance ( and ) for the mixand:

Once the mean and covariance are computed, the propagation of the mixand is complete. If the mixand fails the linearity criteria, it is split into several mixands with reduced covariance, described in Section 3.3.

3.2Linearity Criteria

The strength of Gaussian mixture models is that they can accurately approximate non-Gaussian pdfs that arise either through non-Gaussian process or measurement noise or through non-linearities in the system model[2]. Traditional Gaussian mixture model propagation algorithms [47] use standard non-linear filtering techniques, such as Extended Kalman Filtering or Unscented Kalman Filtering, to propagate the individual mixands without considering the impact of local non-linearities in the system model on the accuracy of propagation. Motivated by a careful examination of the sigma-points, this paper develops a natural measure for how accurately a given mixand is propagated through the system model, along with a splitting method to reduce covariances, which, in turn, improves propagation accuracy.

The SP transform propagates Gaussian mixands through a temporal model using sigma-points; intuitively, this process performs a statistical linearization of the model in the neighborhood of the Gaussian mixand. However, the SP transform does not provide a measure of error introduced by the statistical linearization. In order to evaluate the error introduced by the SP propagation, the linearization residuals are proposed here as a metric. The process is as follows. First, a linear model is defined:

where the notation , describes the subset of sigma-points related to the state uncertainty, and not to process noise:

For a linear system, Equation 8 can be solved exactly for and . For a non-linear system, the residuals can be calculate by casting the linearization as a least-squares problem.

The residual linearization error (), defined as:

The residual error is a direct metric of how well the optimal linear model explains the propagation of sigma points through the nonlinear dynamics. If the underlying model is linear, the residual error is zero. For nonlinear systems, the residual linearization error indicates how locally linear or non-linear the underlying model is in the neighborhood of the sigma-points.

An L-Q factorization can be used to compute :


such that is lower triangular, and therefore cheaply invertible, and is orthonormal:

Substituting the factorization from Equation 11 into Equation 9 yields:

Because of the structure of , the arguments only effect the first columns of the norm in Equation 12. The matrix is partitioned according to what is explained by the linear model () and what is not explained by a linear model (). In block form, this is written:

The optimal arguments can be extracted from the partitioned matrix in Equation 13 and :

The desired linearization error , defined in Equation 10, simplifies to:

Note that the linearization error can be calculated without requiring the computed optimal arguments and in Equation 14. The scaler residual error in Equation 15 quantifies how well the temporal propagation of the sigma-points can be explained by a linear model, and therefore how well the local linearity assumptions made by the SP transform hold for the mixand being propagated. This metric is a refinement of the metric proposed by the authors in [19], and similar to the metric proposed in [22]. Other accuracy metrics for the sigma-point transform have been developed, for example [46] proposes a metric based on the Taylor series expansion of the dynamics function, which requires a differentiable dynamics model. If the error metric is above some defined threshold (), the mixand can be targeted for covariance reduction. However, this metric is scalar, and in order to effectively reduce the covariance of the mixand to a specified level of linearization error, it is desirable to identify which sigma-points are explained the least by the optimal linear model.

Defining as the difference between the sigma-points propagated through the system model and the optimal linear model,

The column of , is the residual linearization error associated with the sigma-point. Similar to the linearization error , can be computed without computing the optimal linearization, given by:

To effectively reduce the covariance of the mixand in order to reduce the linearization error, the direction along which the linearization error is greatest is identified as the optimal splitting axis (). To find , the sigma points before propagation () are weighted by the norm of their associated residual linearization errors (). The first eigenvector of the second moment of this set of weighted sigma points is the axis along which the residual linearization error is the greatest, and is used as the optimal splitting axis.

3.3Gaussian Mixand Splitting

Given the sigma point residual error of the propagated mixand in Equation 17, the next step is to adapt the mixture to decrease this error (and increase the accuracy of the temporal propagation). This is accomplished by identifying mixands that are poorly propagated by the SP transform (using Equation 17) and replacing them with several mixands with smaller covariances.

The approximation of a GMM by a GMM with fewer mixands has been explored in the literature [23], as classical GMM filters experience geometric growth in the number of mixands and require a mixture reduction step in order to remain computationally tractable. The problem here, however, is the reverse – splitting a mixand in order to reduce the variance of mixands. This problem is only beginning to receive attention in the literature [22], although there has been interest in the topic in the machine learning community [43]. Other adaptive GMM filters include [5], which relies on non-Gaussian process noise to increase the number of mixands, and [42], which adapts the GMM to nonlinearities by developing an update rule for the mixand weights rather than by increasing the number of mixands.

There are two competing goals in approximating a single Gaussian by a GMM. First, the GMM must closely approximate the original Gaussian. Second, the covariance of the mixands in the GMM must be significantly smaller than the covariance of the original Gaussian. A common metric used to evaluate how closely two distributions approximate each other (e.g. comparing the GMM to the original Gaussian) is the Kullback Leibler divergence (KLD), which measures the relative entropy between two distributions[29]. However, the KLD between a Gaussian and a GMM can not be evaluated analytically, and effective approximations can be resource intensive[17]. An alternative metric is the integral-squared difference (ISD), which has a closed-form analytic solution when comparing a single Gaussian to a Gaussian mixture [49]. Because the ISD can be exactly and quickly evaluated, it is used here as the statistical error metric when comparing a Gaussian and its GMM [?].

Formally, the Gaussian mixand splitting problem is defined as the approximation of a single Gaussian mixand by a GMM such that the covariances of the mixands () are smaller than the original (), and the ISD between the original Gaussian and its GMM (Equation Equation 18) is minimized. The formal problem is to solve for for all such that:

Because this optimization is computationally expensive, an off-line optimal solution is computed and stored for use as needed by the anticipation algorithm in real time. Pre-computing an optimal split for real time use requires that the pre-computed split can be applied to an arbitrary single Gaussian and splitting axis (i.e. the general problem). Any single Gaussian and splitting axis combination can be transformed to the problem of splitting a unit, zero-mean Gaussian along the -axis () using an affine transformation.

Given a Gaussian mixand and a splitting axis , the following transformations are applied in order to arrive at the pre-computed problem:

where is the matrix square-root of

and is a rotation matrix, computed such that the final splitting axis aligns with the -axis:

Figure ? illustrates the transformation in Equation 20 applied to a general two dimensional Gaussian. Figure ? shows the original single Gaussian to be split, and the axis along which the Gaussian is to be split, . Figures ? and ? show the translation of the Gaussian to the origin () and the transformation that yields a unit covariance of the single Gaussian. Finally, Figure ? shows the rotation () applied that aligns the splitting axis with the -axis.

The splitting problem has now been posed as splitting a zero-mean, unit Gaussian along the -axis into an -component mixture of Gaussians:

The solution to this optimization problem is computed off-line, and can be applied at runtime to the general problem (Equation Equation 19) using the transformation from Equation 20:

The offline optimization (Equation Equation 21) solves for a Gaussian mixture with an odd number () elements such that the ISD between the split Gaussian mixture and the Gaussian distribution is minimized. For an dimensional Gaussian, there are parameters associated with the means, parameters associated with the covariances, and parameters associated with the weights. Due to the large parameter space, the optimization problem is ill-posed, and computationally intractable even for off-line optimization. To address these problems, several constraints are imposed to reduce the parameter space. First, the means through are restricted to lie on the -axis and be evenly spaced, reducing the optimization parameters associated with the means to a single spread parameter :

This assumption is reasonable, as it spreads the means of split mixture along the splitting axis and only this dimension of the covariance is being targeted for reduction. Furthermore, the derivative of with respect to the off--axis elements of the means evaluated at the proposed ’s is exactly zero.

The next parameter reduction constrains the covariance matrices through to be diagonal, equal, and of the form:


This targets only the covariance along the splitting axis; all other elements of the split covariance matrices are optimally (the derivate of with respect to these elements evaluated at the proposed value is zero), so they are not considered. This step reduces the parameters associated with the variance matrices to a single parameter .

Finally, a further reduction to the parameter space is made by recognizing that the optimal weights ( through ) for a given set of means ( through ) and variances ( through ) can be found using a quadratic program, removing the weights from the parameter search space entirely. Expanding the ISD, defined in Equation 18, yields:

Williams provides a closed-form solution for each of the above terms:[49]

Re-arranging the terms yields a quadratic program with the weights as arguments:


such that

Equation 24 can be solved with a constrained quadratic program to find the optimal weights for the split, . This allows the weights to be removed from the parameter space, as the optimal weights can be easily computed for each set of parameters considered.

The resulting parameter search space includes only the spacing of the means () and the reduction of the covariance (), making a high-resolution exhaustive parameter search possible. The optimization is an interesting trade between goals. First, should be small, as this represents the error introduced by approximating the original Gaussian by the split distribution. Second, should be small, as this reduces the effects of nonlinearities in the propagation. Finally, should be small in order to create manageable computation requirements.

Figure 2: J_{\text{ISD}} as a function of N and \sigma, collected for 5000 randomly generated two-dimensional Gaussian distributions and splitting axes. The one-sigma error bars are shown
Figure 2: as a function of and , collected for randomly generated two-dimensional Gaussian distributions and splitting axes. The one-sigma error bars are shown

Figure 2 plots the mean and standard deviation of the ISD as a function of , and for randomly generated examples of two-dimensional Gaussians distributions and splitting axes. For each value of , is varied between and and the optimal spread parameter and weights are computed. The optimized split is then applied to randomly generated two-dimensional Gaussians and splitting axes, and the ISD between each Gaussian and the resulting GMM after the split is computed. Note that the tight error bounds indicate that the optimized split consistently works well, even as Gaussian/splitting axis pairs vary, supporting the approximation in Equation 22. Intuitively, minimizes the ISD, while most reduces the variance along the x-axis. It is proposed here to first choose a single value for , and then compute the optimal values for the spread parameter and the weights for different values of .

3.4Benchmark Splitting Examples

To explore the impact of the parameter choices and on the accuracy of propagation of a single Gaussian through a nonlinear temporal model, two nonlinear models are used as benchmark examples. The first is the univariate non-stationary growth model (UNGM):[28]

and the second is a univariate cubic:

The UNGM is commonly used to evaluate the performance of nonlinear filters[28]. Also, for both the UNGM and the cubic model, the propagated distribution can be found analytically for comparison.

The accuracy of the GMM splitting routine is evaluated using 100 randomly generated samples; each is a normal distribution (means uniformly distributed between -2 and 2, and variances uniformly distributed between 0 and 2) and propagated through the UNGM (Equation Equation 25) and cubic (Equation Equation 26) models using cached splits. In order to explore the sensitivities in the proposed anticipation algorithm, the cached splits are generated using different variance reductions () and different numbers of mixands in the split (). The GMM approximation to the propagated distribution is compared to the exact propagated distribution . The performance metric used to evaluate the accuracy of propagation is the numerically evaluated KLD between the propagated mixture and the true distribution, or:

Figure ? plots the mean and standard deviation of the KLD between the true distribution and the GMM approximation as a function of and . The trends in each are similar. As decreases from , the accuracy of the split mixtures increases (smaller KLD compared to the true propagated distribution ). As becomes very small, approaching zero, the split mixture becomes a poor approximation of the prior, introducing error in the propagation (higher KLD values). As expected, larger values of allow smaller values of , and therefore more accurate mixture propagation. Even the least aggressive splits () result in KLDs of approximately one-half that of propagation with no splitting, while more aggressive splits show KLDs of approximately one-tenth that of the no-split solution.

These example problems can also be used to analyze how effective the linearity metric (Equation Equation 15) is at predicting propagation errors. is stored for each single Gaussian propagated through the UNGM and the cubic function, and the Pearson correlation between the residual linearization error, a metric that predicts the error in propagation, and the KLD between the no-split solution and truth, which is the actual measured propagation error, is computed. The Pearson correlation coefficient is (UNGM) and (cubic model), indicating strong correlation between the residual linearization error metric and the actual KLD from the true mixture after propagation.

These two example problems indicate that the linearity criteria and the mixand splitting algorithms greatly improve the accuracy of propagation of a single Gaussian through a non-linear model. The choice of the linearity threshold () and of the split parameters and are problem dependent, but can be explored off-line prior to implementation.

4Experimental Results

Two systems, one simulated and one experimental, are used to analyze and evaluate the hGMM anticipation algorithm. In both systems, the hGMM is used to predict the future behavior of a tracked car driving on a known roadmap. Additionally, the case of the 2007 Cornell-MIT autonomous collision is studied anecdotally. In order to manage mixture complexity, the GMM reduction method proposed in [39] is used in the following examples.


The simulation uses the hGMM to predict the future behavior of a simple simulated car driving in a Manhattan grid. Implementing in simulation allows the assumed dynamics model to exactly match the true dynamics model of the simulated obstacle vehicle. This allows the performance of the hGMM in predicting the distribution over future obstacle vehicle states to be studied independently of the assumed dynamics model.

The hGMM algorithm is implemented using a four-state obstacle vehicle model described in reference[19]. Obstacle vehicles are assumed to drive on a known road network and to obey specified traffic regulations. Obstacle vehicles are modeled as 4-state bicycle robots, with the continuous state:

where and are the two-axis position of the center of the rear axle, is the speed of the obstacle vehicle, and is the heading of the obstacle vehicle, all at time . The continuous dynamics model for the obstacle vehicle is simple, though non-linear.

Control inputs are throttle () and steering angle (), and zero-mean, Gaussian process noise () enters on the control inputs. The time step is seconds in this example.

The hybrid aspect of the obstacle vehicle anticipation problem enters through an assumed route planning controller, where the discrete states are road segments in the map. The discrete dynamics function assumes that the route planning controller takes all possible routes with equal probability. A path following controller is assumed, which generates control inputs and that enables the obstacle vehicle to follow the current route and maintain a speed of meters per second:

This path following controller (Equation Equation 30) is composed with the bicycle dynamics (Equation Equation 29) to arrive at the continuous dynamics function (Equation Equation 1) required by the hGMM:

The anticipation algorithm is used with a time horizon of seconds (approximately the time required for an obstacle to fully traverse an intersection) to predict the behavior of an obstacle car in several different scenarios. For comparison to the hGMM, a large particle set is used to approximate the true distribution of the obstacle state at future times. Three example scenarios (straight road, turn, and intersection) are used to compare the output of the hGMM algorithm to the approximate true distribution. The negative log-likelihood (NLL) is used to evaluate the difference between the true distribution, as represented by a particle set, and the hGMM approximation. The performance of the hGMM is evaluated for different values of , and compared to the baseline of the hGMM algorithm with no splitting, which is equivalent to a single Gaussian UKF anticipation algorithm.

Figure ? shows the NLL metric for the three considered scenarios. The NLL is plotted as a function of lookahead time. Results are compared for four values of , for each scenario: the hGMM with , the hGMM with , the hGMM with , and finally the hGMM with ; the latter is equivalent to a single Gaussian UKF predictor. When the obstacle vehicle is traveling on a straight road (Figure ?), all of the anticipation predictors show similar performance, because the dynamics model is very close to linear in this case. In Figures ?, ?, the hGMM anticipation algorithm shows significant performance improvement over the no splitting predictor, as non-linearities have greater impact on the accuracy of propagation. The peaks in NLL that appear in the second and third scenario correspond to the future time when the vehicle is anticipated to be negotiating the bend in the road, when the model is exhibiting a high degree of non-linear behavior. Once the turn is negotiated, and the vehicle is anticipated to be on the straight road after the turn, the non-linearities fade and the NLL decreases again.

Figure ? shows the hGMM anticipated distributions of the rear-axle position for the intersection scenario, for . The hGMM anticipation algorithm clearly captures the non-Gaussian nature of the probability masses that turn right and left. This example illustrates both the splitting due to the discrete dynamics (seen by separate probability masses taking each of the three options available at the intersection) and the splitting due to nonlinearities (seen by the non-Gaussian shape of the probability masses making the left and right turns). This figure also demonstrates why Figure ? has a peak just before 2 seconds lookahead – the non-Gaussian shape of the individual probability masses is at its highest as the obstacle vehicle is anticipated to be negotiating the intersection, as in Figure ?.

4.2Experimental Data

In order to validate the probabilistic anticipation approach on more realistic problems, the hGMM algorithm was evaluated on a set of tracked vehicle data near intersections. The data set used for this validation is the 2007 Columbus large image format (CLIF 2007) data set made available by the United States Air Force [14]. The data set consists of aerial imagery of an urban environment, at approximately two frames per second, collected by a large-format electronics observation platform. A large number of vehicles are observed driving in a variety of conditions. The validation focuses on vehicles observed traversing intersections, in relatively light traffic, that have the right-of-way. A total of 40 vehicle tracks at three different intersections are used.

In the proposed validation approach, an observed vehicle is tracked at time , giving a state estimate. This state is then predicted forward in time using the hGMM with various values for , and an anticipation horizon equal to the time over which the vehicle is tracked. The estimate of the observed vehicle state at time is used as the initial condition for the hGMM; subsequent measurements are used to evaluate how well the hGMM predicts the behavior of the vehicle. The same vehicle behavior model described in Section 4.1 (Equation Equation 29) is used in this experimental validation to describe the dynamics of observed vehicles. Successive measurements of the tracked obstacle at times after are compared to the anticipated distribution over the vehicle state to evaluate how well the hGMM algorithm anticipates the behavior of the tracked vehicle.

Figure 3: Log-likelihood and standard deviation of observed vehicles given by hGMM anticipation algorithm as a function of e_{\text{res},\text{max}}
Figure 3: Log-likelihood and standard deviation of observed vehicles given by hGMM anticipation algorithm as a function of

Figure 3 plots the log-likelihood (LL) of the hGMM anticipation agreeing with CLIF observations as a function of . At high values of , the LL is low with larger uncertainty bounds, indicating poorer predictive performance. The figure shows that smaller values for provide more accurate anticipation of observed cars, as well as much more consistent performance. At values for above approximately , the standard deviation becomes quite large. This indicates that the covariance of mixands in the hGMM becomes large enough that the sigma-point propagation becomes a poor and unpredictable approximation. For very small values of , performance also degrades due to errors introduced by repeated mixand splitting in the hGMM.

Figure 4: t-test results comparing Log-Likelihood data at different linearity thresholds e_{\text{res},\text{max}} to the data for e_{\text{res},\text{max}} = 1.
Figure 4: t-test results comparing Log-Likelihood data at different linearity thresholds to the data for .

Figure 4 plots the p-values of t-test comparisons between the log-likelihood data for each linearity threshold to the log-likelihood data for the maximum linearity threshold (). This figure shows that the improved performance seen in Figure 3 for linearity thresholds is statistically significant, as the corresponding p-values are below . For linearity thresholds below this range, the accumulated errors introduced by repeated mixand splitting degrade performance, and the log-likelihood performance is not statistically different from the data taken with a linearity threshold high enough that no splitting occurs ().

The LL metric studied in Figures Figure 3 and Figure 4 considers only the likelihood that the observations agree with the anticipated obstacle state, and therefore only evaluates the hGMM at the observations. Because of this, unreasonable predictions (i.e. the car being very far outside the driving lane) are not penalized by this metric. To compliment this metric, a second metric is proposed: the expected off-track error (EOTE). The EOTE is defined as:

where is the anticipation horizon, is the distance of the rear axle position from the lane center, and is the anticipated probability of the vehicle being at that position from the hGMM. This metric is equivalent to the expected value of the out-of-lane position of the observed vehicle. This metric directly measures how reasonable the output of the hGMM is, assuming the observed vehicle is not behaving anomalously.

Figure 5: Expected off-track error and standard deviation of observed vehicles given by hGMM anticipation algorithm as a function of the linearity threshold e_{\text{res},\text{max}}.
Figure 5: Expected off-track error and standard deviation of observed vehicles given by hGMM anticipation algorithm as a function of the linearity threshold .

Figure 5 plots the EOTE as a function of the linearity threshold used in the hGMM. At high values of , the EOTE is large and has a large standard deviation, indicating poor anticipation performance. For small values of , the EOTE and its standard deviation decrease, indicating that the hGMM predictions are more reasonable and more consistent. As with Figure 3, for above the standard deviation of this metric becomes large as the mixand covariances become too large for the sigma-point approximation.

Figure 6: t-test results comparing EOTE data at different linearity thresholds e_{\text{res},\text{max}} to the data for e_{\text{res},\text{max}} = 1.
Figure 6: t-test results comparing EOTE data at different linearity thresholds to the data for .

Figure 6 plots the results of t-test comparisons between the EOTE data for linearity threshold to the EOTE data for the maximum linearity threshold. These results show that the decreased EOTE seen in Figure 5 for small linearity thresholds () is statistically significant.

Figure ? shows the results of the hGMM and single Gaussian anticipation algorithms, respectively, for a vehicle making a turn at an intersection. The initial observation of the vehicle (green circle) is plotted along with the anticipated distribution of the rear-axle position at future anticipated times. The actual measurements are plotted as green crosses. Comparing the two approaches, the hGMM predicts the measurements about as well as the single Gaussian method, according to the LL metric in Figure 3. However, Figures ? and ? show that the hGMM distribution is much more reasonable than the single Gaussian method distribution in predicting locations that are actually on the road network, particularly as the look ahead time increases. This is demonstrated by the large amounts of probability mass outside of the actual expected driving corridor for the vehicle in Figure ?, and is quantified by the EOTE metric which shows clearly superior performance of the hGMM algorithm.

4.3MIT-Cornell Collision Example

Figure 7: MIT-Cornell collision in the 2007 DUC
Figure 7: MIT-Cornell collision in the 2007 DUC

A motivating example for the importance of this work is the collision between the Cornell and MIT entries in the 2007 DARPA Urban Challenge (DUC). The collision occurred when the Cornell entry, Skynet, stopped due to a perceived blockage that was the result of a misplaced waypoint in the roadmap. MIT’s entry, Talos, observing that Skynet had stopped, initiated a pass. While Talos was executing the pass, Skynet recovered and began moving again. Talos, not recognizing that Skynet was no longer a static obstacle, moved back into the driving lane while Skynet, not able to anticipate the behavior of Talos, continued forward. The result was the low speed collision shown in Figure 7. Further details on the collision are available in [12].

To demonstrate the efficacy of the hGMM anticipation algorithm at improving safety, the scenario is re-visited using logged data from the 2007 collision. In this example, Skynet uses the hGMM to anticipate the behavior of Talos (using the simple vehicle model described in Section 4.1). At the same time, Skynet simulates it’s own trajectory for the proposed behavior of resuming motion. The probability of collision between Skynet’s proposed motion and the hGMM over Talos’ anticipated state is evaluated using the approximation developed in [18], over a horizon of 3 seconds at 10 Hz.

Figure ? shows the results of the anticipation as a function of the lookahead time. Figure ? shows the initial condition – Talos is shown as the black car, Cornell is shown as the green car. Figures ?, ?, and ? show the scenario as it evolves at three different lookahead times. The hGMM probability distribution prediction of Talos is plotted – in this case, the x-y position of the rear axle. Skynet’s proposed position at the given lookahead time is also shown, and the color corresponds to the probability of collision, with green being safe (probability of collision of zero) and red being dangerous (probability of collision of one). Figure 8 plots the anticipated probability of collision as a function of lookahead time. The anticipated probability of collision increases as the anticipation algorithm looks ahead in time, and a collision is almost guaranteed by 2.5 seconds. These figures clearly show that even with a basic obstacle model, the hGMM could have predicted, and therefore prevented, the MIT-Cornell collision.

Figure 8: Anticipated probability of collision
Figure 8: Anticipated probability of collision
Figure 9: Computation time required to anticipate an obstacle to a 4.5 second horizon as a function of linearity threshold and maximum number of allowed mixands (enforced by Runnall’s reduction algorithm) .
Figure 9: Computation time required to anticipate an obstacle to a 4.5 second horizon as a function of linearity threshold and maximum number of allowed mixands (enforced by Runnall’s reduction algorithm) .

Figure 9 plots the time required to predict an obstacle’s state forward 4.5 seconds for a range of values for both the linearity threshold and the maximum number of mixands allowed at the end of each prediction step. The hGMM is implemented in C# in Skynet’s planning algorithms on a modern, Intel Core i7 rackmount computer. The computation time grows with the number of allowed mixands and the inverse of the linearity threshold, as expected.

Figure 10: hGMM prediction accuracy as a function of linearity threshold and maximum number of allowed mixands.
Figure 10: hGMM prediction accuracy as a function of linearity threshold and maximum number of allowed mixands.

Figure 10 plots the prediction accuracy of the hGMM as a function of the linearity threshold and the maximum number of allowed mixands. Here the prediction accuracy is measured by computing the Log-Likelihood between the hGMM result and a large particle set propagated using the same vehicle dynamics model as the hGMM. Figure 10 shows that prediction accuracy improves as the linearity threshold becomes smaller, and degrades as the number of allowed mixands becomes small. Note that the performance is independent of the number of allowed mixands until the number of mixands becomes very small. Comparing Figures Figure 9 and Figure 10 provides insight on how to select the linearity threshold and the number of allowed mixands for the assumed dynamics model.

The hGMM is currently implemented in Skynet’s planning algorithms with a linearity threshold of and a maximum number of allowed mixands of 10. This implementation allows Skynet to anticipate the behavior of up to three obstacle vehicles to a horizon of 4.5 seconds at greater than 3 Hz. Because the anticipation result is never older than seconds, but extends to 4.5 seconds in the future, this is considered real-time performance.


An anticipation algorithm is developed which uniquely recognizes and mitigates the impact of non-linearities in the hybrid dynamics function on the accuracy of probability distribution propagation. A unique method for evaluating the accuracy of propagation of a Gaussian distribution through non-linear dynamics is proposed, using the propagated sigma-points from the distribution. In addition, a new method for splitting propagated Gaussian mixands due to nonlinearities is developed. By detecting, and reacting to propagation errors introduced by nonlinearities, the proposed algorithm is shown to have significant improvements in accuracy over standard propagation methods, such as the Unscented Transform.

The behavior of the non-linearity detection and the mixand splitting algorithms are explored using several simulated example problems, and compared to a single Gaussian sigma point method. In the case of benchmark nonlinear problems, the hGMM anticipation algorithm is shown to better approximate the true distributions that arise than the single Gaussian sigma point method. The second set of simulations use an obstacle vehicle driving on a known road network. The accuracy of the anticipated probability distributions is compared across choices for the linearity threshold parameter in the hGMM anticipation algorithm, and compared to a single Gaussian sigma point predictor. Using a large particle set as truth, the hGMM is shown to more accurately capture the behavior of the distribution over future obstacle vehicle states, defined as the negative log-likelihood between the particle set and the anticipated distribution.

The hGMM anticipation algorithm is validated on an experimental data set. Specifically, the hGMM algorithm is used to predict the behavior of vehicles in the CLIF 2007 data set provided by the USAF. The hGMM is compared to a single Gaussian sigma point method by comparing predictions of a tracked vehicle state to observations of the tracked vehicle. The hGMM is shown to provide increased accuracy over the single Gaussian sigma point method using the log-likelihood as a metric. The hGMM is also shown to provide predictions that are more reasonable (i.e. that do not include predictions of anomalous behavior for vehicles that are not behaving anomalously, by the expected off-track error metric) than propagation with no splitting.

Additionally, the hGMM is applied to the scenario leading to the MIT-Cornell collision in the 2007 DUC, and it is shown that the hGMM could have anticipated the collision in time to prevent it.


  1. Gaussian mixture filter in hybrid navigation.
    S. Ali-Löytty and N. Sirola. In Proceedings of The European Navigation Conference GNSS, volume 2007, 2007.
  2. Nonlinear bayesian estimation using gaussian sum approximations.
    D. Alspach and HW Sorenson. Automatic Control, IEEE Transactions on
  3. Odin: Team victortango’s entry in the darpa urban challenge.
    Andrew Bacha, Cheryl Bauman, Ruel Faruque, Michael Fleming, Chris Terwelp, Charles F. Reinholtz, Dennis Hong, Al Wicks, Thomas Alberi, David Anderson, Stephen Cacciola, Patrick Currier, Aaron Dalton, Jesse Farmer, Jesse Hurdus, Shawn Kimmel, Peter King, Andrew Taylor, David Van Covern, and Mike Webster. J. Field Robotics
  4. Intelligent vehicle applications worldwide.
    R. Bishop. Intelligent Systems and Their Applications, IEEE
  5. A modified gaussian sum approach to estimation of non-gaussian signals.
    Mauro J Caputi and Richard L Moose. Aerospace and Electronic Systems, IEEE Transactions on
  6. Analytic collision anticipation technology considering agents’ future behavior.
    J.S. Choi, G. Eoh, J. Kim, Y. Yoon, J. Park, and B.H. Lee. In Intelligent Robots and Systems (IROS), 2010 IEEE/RSJ International Conference on, pages 1656–1661. IEEE.
  7. On sequential monte carlo sampling methods for bayesian filtering.
    A. Doucet, S. Godsill, and C. Andrieu. Statistics and computing
  8. Robotic Motion Planning in Dynamic, Cluttered, Uncertain Environments.
    N.E. Du Toit and J.W. Burdick. In 2010 IEEE Conference on Robotics and Automation, pages 966–973, 2010.
  9. Robotic motion planning in dynamic, cluttered, uncertain environments.
    N.E. Du Toit and J.W. Burdick. In Robotics and Automation (ICRA), 2010 IEEE International Conference on, pages 966–973, May.
  10. The split and merge unscented gaussian mixture filter.
    F. Faubel, J. McDonough, and D. Klakow. Signal Processing Letters, IEEE
  11. Detection, prediction, and avoidance of dynamic obstacles in urban environments.
    D. Ferguson, M. Darms, C. Urmson, and S. Kolski. In 2008 IEEE Intelligent Vehicles Symposium, pages 1149–1154, 2008.
  12. The mit-cornell collision and why it happened.
    L. Fletcher, S. Teller, E. Olson, D. Moore, Y. Kuwata, J. How, J. Leonard, I. Miller, M. Campbell, D. Huttenlocher, et al. Journal of Field Robotics
  13. Social ramifications of autonomous urban land vehicles.
    T.C. Folsom. In IEEE International Symposium on Technology and Society, May, 2011.
  14. 2007 clif data set.
    United States Air Force. 2007.
  15. Probabilistic rapidly-exploring random trees for autonomous navigation among moving obstacles.
    C. Fulgenzi, A. Spalanzani, and C. Laugier. In Workshop on safe navigation, IEEE International Conference on Robotics and Automation (ICRA), 2009.
  16. Levels of details for gaussian mixture models.
    V. Garcia, F. Nielsen, and R. Nock. Computer Vision–ACCV 2009
  17. A distance measure between gmms based on the unscented transform and its application to speaker recognition.
    J. Goldberger and H. Aronowitz. In Ninth European Conference on Speech Communication and Technology. Citeseer, 2005.
  18. Contingency Planning over Probabilistic Hybrid Obstacle Predictions for Autonomous Road Vehicles.
    J. Hardy and M. Campbell. In IEEE International Conference on Intelligent Robots and Systems, 2010, In Press.
  19. Discrete and continuous, probabilistic anticipation for autonomous robots in urban environments.
    F. Havlak and M. Campbell. In Proceedings of SPIE, volume 7833, page 78330H, 2010.
  20. Methods for merging gaussian mixture components.
    C. Hennig. Advances in Data Analysis and Classification
  21. Approximating the Kullback Leibler divergence between Gaussian mixture models.
    J.R. Hershey and P.A. Olsen. In IEEE International Conference on Acoustics, Speech and Signal Processing, 2007. ICASSP 2007, volume 4, 2007.
  22. Adaptive gaussian mixture filter based on statistical linearization.
    M.F. Huber. 2011.
  23. Progressive Gaussian mixture reduction.
    M.F. Huber and U.D. Hanebeck. In Information Fusion, 2008 11th International Conference on, pages 1–8. IEEE, 2008.
  24. Team Cornell’s Skynet: Robust Perception and Planning in an Urban Environment.
    I.M.M.C.D. Huttenlocher, F.R.K.A.N. Sergei, L.J.C.B.S. Pete, M.N.Z.E.G. Mike, and K.H. Fujishima. Journal of Field Robotics
  25. Socio-economic impact assessment of intelligent transport systems.
    Z. Juan, J. Wu, and M. McDonald. Tsinghua Science & Technology
  26. A general method for approximating nonlinear transformations of probability distributions.
    S. Julier and J.K. Uhlmann. Robotics Research Group, Department of Engineering Science, University of Oxford, Oxford, OC1 3PJ United Kingdom, Tech. Rep
  27. A new extension of the kalman filter to nonlinear systems.
    S.J. Julier and J.K. Uhlmann. In Int. Symp. Aerospace/Defense Sensing, Simul. and Controls, volume 3, page 26. Spie Bellingham, WA, 1997.
  28. Gaussian sum particle filtering.
    J.H. Kotecha and P.M. Djuric. Signal Processing, IEEE Transactions on
  29. Information theory and statistics

    S. Kullback. .
  30. Autonomous car navigates the streets of berlin.
    Autonomous Labs.
  31. Towards fully autonomous driving: Systems and algorithms.
    J. Levinson, J. Askeland, J. Becker, J. Dolson, D. Held, S. Kammel, J.Z. Kolter, D. Langer, O. Pink, V. Pratt, et al. In Intelligent Vehicles Symposium (IV), 2011 IEEE, pages 163–168. IEEE, 2011.
  32. Google cars drive themselves, in traffic.
    John Markoff. New York Times
  33. Collision in the making between self-driving cars and how the world works, Jan 2012.
    John Markoff.
  34. Team cornell’s skynet: Robust perception and planning in an urban environment.
    Isaac Miller, Mark E. Campbell, Dan Huttenlocher, Frank-Robert Kline, Aaron Nathan, Sergei Lupashin, Jason Catlin, Brian Schimpf, Pete Moran, Noah Zych, Ephrahim Garcia, Mike Kurdziel, and Hikaru Fujishima. J. Field Robotics
  35. Probabilistic uncertainty modeling of obstacle motion for robot motion planning.
    J. Miura and Y. Shirai. Journal of Robotics and Mechatronics
  36. Collision avoidance method for mobile robot considering motion and personal spaces of evacuees.
    T. Ohki, K. Nagatani, and K. Yoshida. In Intelligent Robots and Systems (IROS), 2010 IEEE/RSJ International Conference on, pages 1819–1824. IEEE, 2010.
  37. Safe motion planning in dynamic environments.
    Stephane Petti, Thierry Fraichard, Inria Rocquencourt, and Ecole Des Mines De Paris. pages 3726–3731, 2005.
  38. Gaussian mixture approximation by another gaussian mixture for“ blob” filter re-sampling.
    Mark L Psiaki, Jonathan R Schoenberg, and Isaac T Miller. AIAA Paper
  39. Kullback-leibler approach to gaussian mixture reduction.
    Andrew R Runnalls. Aerospace and Electronic Systems, IEEE Transactions on
  40. Mixture reduction algorithms for point and extended object tracking in clutter.
    D.J. Salmond. Aerospace and Electronic Systems, IEEE Transactions on
  41. Recursive bayesian estimation using gaussian sums.
    H.W. Sorenson and D.L. Alspach. Automatica
  42. Uncertainty propagation for nonlinear dynamic systems using gaussian mixture models.
    Gabriel Terejanu, Puneet Singla, Tarunraj Singh, and Peter D Scott. Journal of guidance, control, and dynamics
  43. Smem algorithm for mixture models.
    Naonori Ueda, Ryohei Nakano, Zoubin Ghahramani, and Geoffrey E. Hinton. NEURAL COMPUTATION
  44. Dynamic map building and localization: New theoretical foundations

    J.K. Uhlmann. .
  45. Autonomous driving in urban environments: Boss and the urban challenge.
    Chris Urmson, Joshua Anhalt, Drew Bagnell, Christopher R. Baker, Robert Bittner, M. N. Clark, John M. Dolan, Dave Duggins, Tugrul Galatali, Christopher Geyer, Michele Gittleman, Sam Harbaugh, Martial Hebert, Thomas M. Howard, Sascha Kolski, Alonzo Kelly, Maxim Likhachev, Matthew McNaughton, Nick Miller, Kevin Peterson, Brian Pilnick, Raj Rajkumar, Paul E. Rybski, Bryan Salesky, Young-Woo Seo, Sanjiv Singh, Jarrod Snider, Anthony Stentz, William Whittaker, Ziv Wolkowicki, Jason Ziglar, Hong Bae, Thomas Brown, Daniel Demitrish, Bakhtiar Litkouhi, Jim Nickolaou, Varsha Sadekar, Wende Zhang, Joshua Struble, Michael Taylor, Michael Darms, and Dave Ferguson. J. Field Robotics
  46. Sigma-point Kalman filters for probabilistic inference in dynamic state-space models

    Rudolph Van Der Merwe. .
  47. Sigma point gaussian sum filter design using square root unscented filters.
    M. Šimandl and J. Dunik. In Preprints of the 16th IFAC World Congress. IFAC. Prague, 2005.
  48. The unscented kalman filter for nonlinear estimation.
    Eric A Wan and Rudolph Van Der Merwe. In Adaptive Systems for Signal Processing, Communications, and Control Symposium 2000. AS-SPCC. The IEEE 2000, pages 153–158. IEEE, 2000.
  49. Cost-function-based gaussian mixture reduction for target tracking.
    J.L. Williams and P.S. Maybeck. In Proc. 6th Int. Conf. Inform. Fusion, volume 2, pages 1047–1054, 2003.
  50. Em algorithms for gaussian mixtures with split-and-merge operation, 2003.
    Zhihua Zhang, Chibiao Chen, Jian Sun, and Kap Luk Chan.
  51. Planning-based prediction for pedestrians.
    B.D. Ziebart, N. Ratliff, G. Gallagher, C. Mertz, K. Peterson, J.A. Bagnell, M. Hebert, A.K. Dey, and S. Srinivasa. In Intelligent Robots and Systems, 2009. IROS 2009. IEEE/RSJ International Conference on, pages 3931–3936. IEEE, 2009.
Comments 0
Request Comment
You are adding the first comment!
How to quickly get a good reply:
  • Give credit where it’s due by listing out the positive aspects of a paper before getting into which changes should be made.
  • Be specific in your critique, and provide supporting evidence with appropriate references to substantiate general statements.
  • Your comment should inspire ideas to flow and help the author improves the paper.

The better we are at sharing our knowledge with each other, the faster we move forward.
The feedback must be of minimum 40 characters and the title a minimum of 5 characters
Add comment
Loading ...
This is a comment super asjknd jkasnjk adsnkj
The feedback must be of minumum 40 characters
The feedback must be of minumum 40 characters

You are asking your first question!
How to quickly get a good answer:
  • Keep your question short and to the point
  • Check for grammar or spelling errors.
  • Phrase it like a question
Test description