Traffic state estimation using stochastic Lagrangian dynamics Fangfang Zheng{}^{\mathrm{a}}, Saif Eddin Jabari{}^{\mathrm{b,c,}}1footnote 11footnote 1Corresponding author, Email: sej7@nyu.edu , Henry X. Liu{}^{\mathrm{d}}, DianChao Lin{}^{\mathrm{b,c}}

Traffic state estimation using stochastic Lagrangian dynamics
Fangfang Zheng, Saif Eddin Jabari111Corresponding author, Email: sej7@nyu.edu , Henry X. Liu, DianChao Lin

School of Transportation and Logistics, Southwest Jiaotong Univeristy,
No. 111, Erhuanlu Beiyiduan, Chengdu 610031, P.R. China
Division of Engineering, New York University Abu Dhabi, Saadiyat Island,
P.O. Box 129188, Abu Dhabi, U.A.E.
Tandon School of Engineering, New York University, New York, U.S.A.
Department of Civil and Environmental Engineering,
University of Michigan Ann Arbor, 2350 Hayward,
2116 GG Brown, Ann Arbor, Michigan 48109-2125, U.S.A.
Abstract

This paper proposes a new stochastic model of traffic dynamics in Lagrangian coordinates. The source of uncertainty is heterogeneity in driving behavior, captured using driver-specific speed-spacing relations, i.e., parametric uncertainty. It also results in smooth vehicle trajectories in a stochastic context, which is in agreement with real-world traffic dynamics and, thereby, overcoming issues with aggressive oscillation typically observed in sample paths of stochastic traffic flow models. We utilize ensemble filtering techniques for data assimilation (traffic state estimation), but derive the mean and covariance dynamics as the ensemble sizes go to infinity, thereby bypassing the need to sample from the parameter distributions while estimating the traffic states. As a result, the estimation algorithm is just a standard Kalman-Bucy algorithm, which renders the proposed approach amenable to real-time applications using recursive data. Data assimilation examples are performed and our results indicate good agreement with out-of-sample data.

Keywords: Lagrangian coordinates; heterogeneous drivers; car following; mean dynamics; variability; hydrodynamic limits; uncertainty quantification; data assimilation; traffic state estimation; Kalman filtering

\mdfdefinestyle

myStyleroundcorner=5pt,backgroundcolor=brown!20,linecolor=red!40!black,linewidth=1.5pt

1 Introduction

Efficient traffic operation and optimization require knowledge of prevailing traffic conditions. The Lighthill, Whitham and Richards traffic flow model [25, 35] (the LWR model) has been widely applied in estimation and prediction of traffic states on both freeways and high-speed intersections. The model is formulated using traditional spatial-temporal (Eulerian) coordinates and is suitable for state estimation with point sensor measurements (macroscopic data, e.g., traffic volume, speeds). Data from probe vehicles or connected vehicles (microscopic data, e.g., vehicle trajectories) are becoming increasingly available. Traffic flow models that are able to effectively utilize such data are of greater interest in modern applications. A simple way of interfacing between the microscopic and the macroscopic worlds is via coordinate transformations. Indeed, this was done by Daganzo [7, 8] and later extended by [24]. The former proposes a variational formulation of the LWR model in Eulerian coordinates while the later proposes to formulate the model in Lagrangian coordinates. More recently, Hamilton-Jacobi based formulations of traffic flow have appeared in the literature [4, 14] and [23] applied the theory to formulate first-order models in three different coordinate systems, namely the traditional Eulerian coordinates and two variants of the Lagrangian coordinates. Proposed solutions schemes for the deterministic Lagrangian models include both variational techniques and the Godunov scheme using a triangular fundamental diagram. Specifically, the Godunov scheme in Lagrangian coordinates simplifies to an upwind scheme, enabling more efficient application of data assimilation methods [12, 23, 46].

Though deterministic traffic flow models and their solution methods have been extensively studied in the literature, stochastic models of traffic flow are still in a burgeoning stage of development and are primarily extensions of existing deterministic models. For example, stochastic extensions of the cell transmission model [5, 6] have been proposed [37, 18]; other approaches have extended the link transmission model [44], both at the individual link level and the network level [33, 32, 34, 26]. In general, there still remain issues related to the physical accuracy of the sample paths of existing stochastic traffic models, particularly those developed for purposes of traffic state estimation (see [36, 39] for recent reviews). The main culprit is the dominance of time-stochasticity (or noise) in the stochastic models, mostly developed in Eulerian coordinates [15, 38, 28, 16, 40, 2, 41, 43, 10, 37, 1, 19], but also in Lagrangian coordinates [46, 45, 3]. This results in sample paths prone to aggressive oscillation in the time dimension. The interpretation of these oscillations is (unreasonably) aggressive acceleration and deceleration dynamics.

This paper addresses the physical relevance issue of stochastic traffic dynamics via a new stochastic Lagrangian model of traffic flow. The source of uncertainty in the model is parametric in the same sense presented in [21]. The interpretation of this form of uncertainty is heterogeneity in the driving population. We utilize a stochastic version of Newell-Franklin speed spacing relation [29, 9]. Unlike Newell’s simplified relation [31], we can derive a unique inverse function, which can be used in data assimilation applications. Using parametric uncertainty, the sample paths of the stochastic process are smooth and do not contain the oscillatory behavior above. Our analysis substantially extends and expands our previous work [20].

The paper focuses on application of the proposed model for traffic state estimation (TSE), which is a precursor to a variety of traffic management applications. TSE is the fundamental tool providing situational awareness, particularly when data availability is limited. In this context, non-linearity of traffic models renders the state estimation problem particularly challenging. In theory, one utilizes sampling techniques (e.g., ensemble filters, particle filter, etc.). These approaches are time consuming and cannot be applied in real-time. To address this issue, we derive the mean and covariance dynamics in a way that preserves the dependencies (i.e., richness) in the model, while allowing for use of standard Kalman filtering techniques. The latter are known to be computationally tractable and amenable to real-time applications.

This paper is organized as follows: Sec. 2 discusses the motivation of this research. Sec. 3 presents the Newell’s speed-spacing relation with heterogeneous drivers along with the stochastic version of this relation. We interpret the stochasticity as uncertainty about the driver characteristics using driver-specific stochastic parameters. In Sec. 4, we derive the mean and covariance dynamics of the stochastic system by applying ensemble averaging and then derive the dynamics of a deviation process, which serves as a (second-order) Gaussian approximation. Sec. 5 demonstrates how the proposed stochastic model can be utilized in data assimilation, that is, to estimate missing information when only limited vehicle trajectory data is available. Sec. 6 presents numerical examples to show the estimation performance both on the individual level (spacing dynamics, position trajectories) and the aggregated level (queue length, speed dynamics and density dynamics). Sec. 7 concludes the paper.

2 Motivation

We interpret the stochasticity in traffic flow models as one that describes uncertainty about the vehicle/driver attributes. This type of uncertainty arises in situations where data is limited and/or noisy, e.g., when there are low probe vehicle penetration rates. In such situations, one combines data that is available with models of traffic flow to fill the gaps. The combination of data and models can be (heuristically) thought of as taking a weighted average of the two. More weight is assigned to the predictor with lower uncertainty and vice versa.

Example: When the dynamics involve linear mappings, the Kalman filter is known to produce optimal solutions (both in terms of producing posterior probabilities and least squares estimates). The Kalman gain matrix plays the role of the weight used to combine a prediction produced by the model with the measurements. The main ingredients used to calculate the Kalman gain are the state covariance matrix (representing model uncertainty) with measurement covariance (representing measurement error).

In the context of data assimilation, there are two sources of challenges:

  1. Uncertainty about traffic dynamics depends on the traffic state. The variance in a vehicle’s position depends not only on their own state, but also on the positions (and speeds) of adjacent vehicles, particularly the leader. These types of dependencies need to be considered when assessing the uncertainty about the dynamics to produce accurate estimates.

  2. Traffic flow models are non-linear. This dictates the use of estimation techniques that rely on sampling to produce the estimates (e.g., ensemble Kalman filtering). These techniques can be computationally cumbersome and preclude real-time applications.

We address the first challenge by identifying the source of uncertainty in the model. This is the role played by the proposed Lagrangian model with a stochastic speed-spacing relation. The majority of traffic state estimation (TSE) papers in the literature provide little or no guidance on how to model variability (namely, model covariance). In contrast, we provide an approach that bases the variability on uncertainty about simple driver/vehicle primitives. This endows stochasticity in the model with a physical interpretation that can be quantified.

To address the second challenge, we carefully derive a surrogate stochastic model that is amenable to standard estimation techniques. Essentially, we derive the mean and covariance dynamics corresponding to our Lagrangian model. We apply a (functional) law of large numbers to an ensemble of our Lagrangian dynamics to obtain a mean relation and a (functional) central limit theorem to obtain a model of the deviation of the ensemble from the mean, representing the covariance dynamics of the system. These can be thought of as generalizations of the standard law of large numbers and central limit theorem. While the latter are applied to ensembles (a.k.a. sequences) of independent and identically distributed (i.i.d.) scalar random variables, in our case the ensemble is a group of independent identical random processes, each describing the evolution of the trajectories of a group/platoon of vehicles. These derivations are carried out in order to preserve the dependence on state both in the mean and the covariance/deviation dynamics and in order to ensure the correct approximation is used.

Remark. It is notable that these types of approximations, which apply functional laws of large numbers and functional central limit theorems are widely applied in the queueing systems literature. However, they were originally pioneered in the mid-1960s for traffic operations problems by Gordon F. Newell [30]! To the best of our knowledge, it was in fact Gordon F. Newell who coined the terms fluid and diffusion approximations, which correspond, respectively, to the mean and covariance dynamics in this paper.

3 The traffic dynamics

3.1 Heterogeneous model

We assume a discrete system with vehicles numbered in descending order of position; that is vehicle is the leader, is the immediate follower, and so on. We assume a finite time horizon and that time is continuous (i.e., ). Let and denote the position and speed of vehicle at time , respectively. We denote the spacing between vehicle and their leader, , by {linenomath*}

(1)

Heterogeneity in the driver population is represented by driver-specific speed-spacing relations. Without loss of generality, we adopt the Newell-Franklin (stationary) speed-spacing relation [29, 9]: {linenomath*}

(2)

where the driver-specific parameters represent driver ’s desired (free-flow) speed, minimum safety distance, and the constant is the inverse of the reaction time of driver when their speed is restricted by the trajectory of their leader. In addition to the properties discussed in [9], this choice is inspired by the unique inverse function222Opposed, for instance, to Newell’s simplified relation [31]., which can be used in data assimilation applications. The inverse is given by: {linenomath*}

(3)

The position dynamics are given, for any , by {linenomath*}

(4)

and utilizing the speed spacing relation, we write {linenomath*}

(5)

Hence, the spacing dynamics evolve according to {linenomath*}

(6)

The spacing dynamics can be simulated using the following recursion: {linenomath*}

(7)

In settings with homogeneous drivers, in which for all , is chosen so as to ensure no violations of the Courant-Friedrichs-Lewy (CFL) condition, i.e., and to mitigating numerical diffusion, one chooses the largest such time discretization: .

3.2 Parametric uncertainty and stochastic dynamics

To introduce stochasticity, we let the parameters be random variables. We interpret this as uncertainty about the driver characteristics. To differentiate the stochastic case from the deterministic case, we write the (stochastic) parameters as functions of , where is the random space. We assume the random triples (the parameters) constitute independent draws from identically distributed joint distributions. That is, we define the parameter vector with joint distribution function and the parameter tuple for each driver , , is drawn independently from this common distribution: . The stochastic speed-spacing relation is given by333We will use to distinguish between the stochastic relation and the deterministic ones .: {linenomath*}

(8)

The stochastic dynamical model evolves according to {linenomath*}

(9)

To ensure that the speed-spacing relations are physically reasonable, the supports of the three distributions must be bounded from both above and below. That is, we assume the existence of constants, , , and , such that where is a rectangular box with extrema given by the constants. In this stochastic setting, time discretization is chosen as: . An algorithm for simulating the sample paths of the process is given in A.

4 Mean dynamics and variability

A consequence of the non-linearity in the stochastic model (9), via the non-linearity in , is that applications such as data assimilation and traffic control will require some form of sampling. In this section, we derive a surrogate stochastic model that approximates the Lagrangian model above.

We derive two deterministic models below: the first representing the dynamics of the mean of the stochastic Lagrangian model, the second representing the dynamics of the covariance of the system. These are achieved as mean and covariance dynamics of an infinitely sized ensemble of the proposed Lagrangian model. This is akin to using a sampling technique (e.g., ensemble Kalman filter) with an infinitely sized sample, while circumventing the computational costs that come with the need for sampling. Indeed, standard Kalman filters can be applied using these two ingredients.

4.1 Mean dynamics

In this section, we demonstrate that the mean of a large ensemble converges to a (particular) deterministic mean process. Due to the stochasticity in and (the resulting) stochasticity in the spacings, simply taking expectations will not deliver the desired result. Instead, consider the deterministic processes given, for each , by {linenomath*}

(10)

where is a deterministic speed-spacing process to be defined below. In this section, we formally establish that the average of a large number of stochastic processes given by (9), converges to this mean process (10).

Ensemble-averaged process. Let denote the ensemble size and the index of a stochastic process in the ensemble. We denote the th spacing process by for 444We use the notation “” to indicate that we are referring to the entire trajectory of the process. In other words, the difference between and is that the former is a scalar random variable while the latter is an entire random curve.. The ensemble averaged spacing is given by:

(11)

In essence, these are independent stochastic processes, with parameter tuples that are drawn from identical distributions. Below, we derive the ensemble-averaged state and the deviation processes.

For an ensemble of size , we denote the ensemble-averaged process by . which evolves according to {linenomath*}

(12)

for . Here, are random realizations of the stochastic relations, (8). Without loss of generality, we have assumed that the initial spacings are deterministic.

Mean speed-spacing relation. From the strong law of large numbers we have that {linenomath*}

(13)

where . Note that , where , , and . The right-hand side is a percentile speed-spacing relation (typically, a 0.5-percentile or equilibrium relation), while is a mean speed-spacing relation; see [21] for more details. An example comparison is shown in Fig. 1.

Fig. 1: Mean relation, vs. percentile relation.

We will not attempt to derive an expression for . Instead, we will use an empirical approximation: Let be a random sample of size of parameter 3-tuples. Then, for any and a sufficiently large , is very well approximated by {linenomath*}

(14)

This approximation can be carried out off-line as part of a preprocessing step using historical data. Computational efficiency can be further improved by means of sparse approximations (see [11, 17] for example).

The norm. To establish convergence we utilize the uniform norm, which for the sake of completeness we review next: For a process (with continuous sample paths) with components , i.e., , the uniform norm is defined as {linenomath*}

(15)

Whenever {linenomath*}

(16)

holds for a sequence of processes and a limit process , the sequence is said to converge uniformly on compact sets (u.o.c.), a form of strong convergence (almost sure convergence on ).

Vector notation. To establish the mean dynamics, we first write (12) in vector form: {linenomath*}

(17)

where , , and is an affine transformation defined by . We may alternatively append to the vector (as the first element), then is a linear operator. We will treat as a linear operator below.

Next, define and , where is the deterministic process defined by (10) and is the mean speed relation defined by (14) and (13). Then {linenomath*}

(18)

Convergence result. We are ready to state the main result of this section: that the ensemble-average process converges to the mean dynamic. This is stated as follows: {linenomath*}

(19)

Proof of (19). First, it can be easily demonstrated that is Lipschitz continuous (in ); let denote its Lipschitz constant: is the smallest constant such that for all and any {linenomath*}

(20)

for some appropriately chosen norm . Since the supports of the parameters are bounded from above and below, exists and is easy to determine.

Next, it follows immediately from (13) that {linenomath*}

(21)

for any (vector) process with continuous sample paths.

Using the triangle inequality and noting that, for any , , we have that {linenomath*}

(22)

Adding and subtracting to the right-hand side (inside the integral), we have that {linenomath*}

(23)

To simplify notation, define {linenomath*}

(24)

We have that {linenomath*}

(25)

for all from (21) and the continuous mapping theorem. Applying the triangle inequality, the first term on the right-hand side of (23) is bounded from above by {linenomath*}

(26)

Let denote the Lipschitz constant of , then (26) is bounded from above by {linenomath*}

(27)

Hence, {linenomath*}

(28)

Applying the Bellman-Grönwall inequality, we have that {linenomath*}

(29)

For all , (19) follows from (25) as . This completes the proof.

4.2 Hydrodynamic limit

The result above can be generalized to any vehicle size scaling. In essence, we have thus far assumed that . This can be easily generalized to any : The ensemble averaged process becomes {linenomath*}

(30)

In this case, we have ‘vehicles’ in the system ( is the floor function). The limit spacing process can be derived using the same procedure presented above. It is given, for , by {linenomath*}

(31)

In vector form (): {linenomath*}

(32)

This is a deterministic process that converges as to a conservation law in Lagrangian coordinates: {linenomath*}

(33)

where is a process in which is continuous and the speed relation is a mean relation and not the traditional equilibrium relation used in the literature.

4.3 Covariance dynamics

The mean dynamics, , given by (30) can be considered as a first-order approximation of the stochastic Lagrangian model. It essentially represents a first moment of the system. In this section, we derive the dynamics of a second moment of the system, the covariance dynamics, to achieve (i) a second-order approximation and (ii) facilitate the use of standard Kalman filtering techniques for traffic state estimation.

The deviation process. In this section, we derive the covariance dynamics of the stochastic spacing process . First, consider the (amplified) deviation process {linenomath*}

(34)

The scaling ensures that the covariance matrix pertaining to is the same as that pertaining to (for each ). This is demonstrated as follows: Since is a random vector, is the covariance matrix of at time . We have established in Sec. 4.1 that centers . Consequently,

(35)

We compare this to the covariance matrix of the deviation process:

(36)

where the last equality follows from being independent and identically distributed random vectors, each with the same distribution as .

Note that this is true for any . The remainder of this section demonstrates that when a tractable (deterministic) closed expression for the covariance is obtained.

Limiting Deviation process. The boundedness properties of the speed-spacing relations ensures the existence of a limiting process, , such that (weakly) as . We derive this limiting process next.

Expanding (34) we have {linenomath*}

(37)

Adding and subtracting to the right-hand side of (37), we get (with some rearrangement) {linenomath*}

(38)

We will derive the limiting processes for the two terms on the right-hand side of (38) separately. Take the first term. We have by definition, (34), that . Hence, the first term can be written as {linenomath*}

(39)

Upon dividing and multiplying the terms inside the integral by , this is equivalent to {linenomath*}

(40)

In accord with (19), almost surely as . Applying the (generalized) continuous mapping theorem [42, Theorem 3.4.4], (40) converges to {linenomath*}

(41)

where is the directional derivative of along the direction given by the vector . This simplifies to {linenomath*}

(42)

where and is a diagonal matrix with diagonal elements given by the vector . To simplify notation, define .

We now turn to derivation of the limiting process of the second term in (38), which we re-write as: {linenomath*}

(43)

Applying the central limit theorem, we have, for any , that {linenomath*}

(44)

converges weakly (in distribution) to a zero mean Normal random vector with a diagonal covariance matrix, the diagonal elements of which are given by . Denote this covariance matrix by . We have by the continuous mapping theorem that (43) converges to the stochastic integral {linenomath*}

(45)

where is an -dimensional Wiener process. The matrix valued function can be calculated (off-line) using an empirical approximation as in (14) (using the same pseudo-random parameter sample).

Putting these results obtained in (42) and (45) we have that the limiting deviation process, , is the solution of the following stochastic integral equation {linenomath*}

(46)

This can be written (symbolically) in differential form as {linenomath*}

(47)

The latter is a linear matrix stochastic differential equation and has a closed form solution given by [27, 13] {linenomath*}

(48)

where is the fundamental matrix, that is, it is the solution of {linenomath*}

(49)

with initial condition , which is an identity matrix.

Covariance dynamics. The above implies that is a Gaussian process with (deterministic) covariance process given by . To avoid the need to compute the fundamental matrix, , the evolution of the covariance matrix can be calculated by taking the time derivative of , which results in the following matrix differential equation describing the evolution of the covariance dynamics: {linenomath*}

(50)

where .

5 Data Assimilation

This section demonstrates how the model can be used to estimate missing information when (limited) vehicle trajectory data is available. In essence, we utilize an ensemble filter for this purpose. We first develop the system state dynamics (both mean and covariance), then the measurement dynamics, and finally the recursive estimation algorithm (Kalman-Bucy).

5.1 System state dynamics

Let denote the traffic state vector. Define the matrix {linenomath*}

(51)

and let given by so that ( is an matrix of zeros). Then, the mean state dynamics are given by {linenomath*}

(52)

where . Similarly, the ensemble averaged state evolution is given by {linenomath*}

(53)

The same derivations in Sec. 4 yield {linenomath*}

(54)

The covariance dynamics of a Gaussian approximation of the state process are given by: {linenomath*}

(55)

where and .

5.2 Measurements

Suppose position and speed measurements are available for a subset of the “probe” vehicles. We assume that is known. However, this assumption can be relaxed to one where is estimated from some sampled information, such as probe vehicle/connected vehicle trajectory data. A methodology has been proposed in the literature [47] for estimating traffic volumes from sample probe vehicle trajectory data for urban signalized roads. We refer interested readers to [47] for more details.

Let denote the indices of the vehicles with measurements, i.e., is the penetration rate. We will denote by the index of vehicle in the set . There are two types of measurements, depending on whether the probe vehicles are instrumented with sensors that can measure distances to/from surrounding vehicles or are equipped with vehicle communication systems.

Unequipped vehicles. Vehicles that are unequipped with communication systems are the more widely available sources of probe vehicle data today (e.g., taxis, ridesharing service providers, etc.). For such systems, spacings are not measured directly. Instead, they can be represented as noisy measurements from the measured speeds using a random version of (3): {linenomath*}

(56)

For , let denote a spacing measurement and let denote a position measurement. Since the first of these two quantities is not directly measured (in this scenario), it is assumed to be random. Specifically, we assume that a sufficiently large historical sample is available that is well approximated by a Normally distributed random variable, so that . Define the diagonal matrix with diagonal elements , which can also be estimated using historical data. The measurement vector is denoted by . The measurement equation is given by {linenomath*}

(57)

where is a measurement-state variable incidence matrix, is a -dimensional standard Normal random vector and {linenomath*}

(58)

When spacing measurements are available. With the vast advances in vehicle technologies, it is reasonable to expect that (probe) vehicles will be able to measure not only their own positions and speed, but also those pertaining to vehicles in their immediate surroundings. It is, therefore, reasonable to expect that such probe vehicles can measure spacings between them and both their leaders and followers. In this case, for , the spacing measurements and are available in addition to their corresponding position measurements: , , and . Note that, in this case, we have dropped the ‘’ notation to indicate determinism555This, of course, ignores the role of measurement errors. But in the case described here, these tend to small enough as to be negligible.. In this case (when spacing measurements are available), the measurement-state variable incidence matrix, , is denser than in the unequipped vehicle case. Specifically, we have five measurements per probe, i.e., and {linenomath*}

(59)

The right-hand side above is random while the left-hand side is deterministic. This should be interpreted as an assignment of deterministic measurements to (what would otherwise be) random quantities. Again, assuming (without loss of generality) that measurements are error-free, we have that for all when spacing measurements are available.

5.3 Kalman-Bucy filter

The mean and covariance dynamics represent the empirical mean and covariance dynamics of the ensemble when the ensemble size grows to infinity. In this way, we avoid having to sample from a distribution! With these elements in place, the estimation too used is simply a Kalman-Bucy filter (a.k.a. continuous linear filter); see [22] for more details.

Let and denote the (optimal) state mean and state covariance estimates. Their evolution is given by {linenomath*}