Bayesian inference on dynamic linear models of day-to-day origin-destination flows in transportation networks



Estimation of origin-destination (OD) demand plays a key role in successful transportation studies. In this paper, we consider the estimation of time-varying day-to-day OD flows given data on traffic volumes in a transportation network for a sequence of days. We propose a dynamic linear model (DLM) to represent the stochastic evolution of OD flows over time. DLMs are Bayesian state-space models which can capture non-stationarity. We take into account the hierarchical relationships between the distribution of OD flows among routes and the assignment of traffic volumes on links. Route choice probabilities are obtained through a utility model based on past route costs. We propose a Markov chain Monte Carlo algorithm, which integrates Gibbs sampling and a forward filtering backward sampling technique, to approximate the joint posterior distribution of mean OD flows and parameters of the route choice model. Our approach can be applied to congested networks and to the case in which data are available on only a subset of links. We illustrate the application of our approach through simulated experiments on a test network from the literature.

Keywords: Origin-destination flows; transportation networks; dynamic linear models; Bayesian inference.

1 Introduction

Given a geographic region, one of the main problems in planning and operating transportation systems is the estimation of origin-destination (OD) flows, i.e., the number of trips made by people or freight between points in the region over a defined time interval. This is also referred to in the literature as the OD matrix estimation problem, since many early models represented OD flows as a matrix.

OD flows are traditionally estimated through surveys, in which households or drivers are asked about their daily journeys [24]. However, direct surveys are expensive, and for this reason they are generally carried out only once every decade [8]. This low frequency means that planners may go many years without data on the evolution of OD flows over time.

In recent years, governments in urban regions have built traffic control systems in order to manage traffic congestion in transportation networks. These systems automatically gather large amounts of data on traffic volumes at low cost on a daily or even hourly basis. In principle, this allows the indirect estimation of OD flows by means of mathematical models: we can extract information on OD flows from data on traffic volumes if we have a suitable mathematical model which describes the relationship between them. These models are of an optimization or statistical nature.

The early models assumed that the transportation network from which one observes traffic volumes is not congested. We can trace back the first attempts to the work of [26], who applied linear regression to estimate OD flows from traffic volumes. [23] proposed a model based on Beckmann’s optimization model for traffic assignment [2], whose solution is an estimate of the OD flows.

[31] proposed a non-linear programming model based on entropy maximization in which the constraints are the linear conservation flow equations, whose solution corresponds to a maximum entropy OD flow configuration. [7] proposed a generalized least squares (GLS) model which can cope with errors in observed traffic volumes by minimizing the Mahalanobis distance between predicted volumes and observed volumes. [10] and [4] proposed general frameworks which generalize previous models.

[13, 14] proposed models which take into account traffic congestion, by seeking solutions which correspond to an equilibrium state as defined for example by Wardrop’s first principle [33, 29]. [37] proposed a framework based on bilevel optimization, in which the first level corresponds to an objective function (e.g., maxent or GLS) subject to constraints which are obtained as a solution of a traffic assignment model in the second level. Due to the non-convex nature of the model, the authors proposed a heuristic procedure to solve it [36].

[11] formulated the OD matrix estimation problem as a compound fixed point problem, in which the solution of an inner fixed point problem corresponds to an equilibrium traffic assignment of an OD matrix obtained as a solution of an outer fixed point problem. In order to solve the problem, they proposed a fixed point iteration based on the method of successive averages [28].

A common limitation of the abovementioned models is that they do not take the variability of OD flows into account. Assuming that OD flows follow independent Poisson probability distributions, [32] considered the estimation of mean OD flows given a sample of observed traffic volume vectors. Since the mean and variance of a Poisson random variable are equal, the variance of traffic volumes carries information on mean OD flows. The author also assumed the existence of a single route for each OD pair, and proposed maximum likelihood and moment-based estimators for the mean OD flows.

Following Vardi’s work, [30] proposed Bayesian estimators for the case when a sample of size one is available for the traffic volumes vector. [18, 20, 19] relaxed the assumption of a single route between OD pairs and proposed multivariate normal approximations to the likelihood function assuming that the variances of OD flows are functions of their means.

More recent approaches consider the development of dynamic models for the estimation of OD flows. As traffic volumes are observed over time, we cannot ignore their temporal nature. Here we must make a distinction between within-day and day-to-day dynamics. When considering within-day dynamics, one is concerned with the estimation of OD flows over the course of a single day. Here we can cite the noteworthy works by [12], [9] and [1]. In contrast, in day-to-day dynamics we consider the estimation of time-varying OD flows for a reference time period, e.g. the morning peak, given a time series of observed traffic volumes over many consecutive days. It is assumed that this reference time period is long enough so that most trips in the study region start and finish within the observation period.

[21] developed one of the first models for the estimation of day-to-day dynamic OD flows. The author assumes that mean OD flows are functions of parameters that do not change over time. Particular cases include constant demand, linear trend and weekday-weekend models. [25] proposed a general Bayesian framework for the estimation of parameters of day-to-day dynamic traffic models. They assumed that OD flows vary according to a Markovian transition kernel, and proposed a Markov chain Monte Carlo (MCMC) algorithm for the estimation of parameters. [22] proposed statistical methods to compare alternative day-to-day dynamic models.

A central aspect in the development of realistic day-to-day dynamic models is the modeling of users’ behavior through route choice models. In practice, users choose routes based on the travel times they have experienced over past days. Thus, a suitable route choice model should take into account the influence of past route costs on the decision of users in a given time. [34, 5] discuss models for the analysis of dynamic users’ behavior.

In this paper, we represent stochastic day-to-day OD flows as a dynamic linear model (DLM), which allows us to capture the time dependencies of OD flows as well as non-stationarity. To the best of our knowledge, this approach has not been applied before to the estimation of time-varying OD flows. In the formulation of the DLM, we take into account the variability originating from OD flows, users’ route choices and measurement of traffic volumes on links. We model route choices through a utility model based on past route costs. Our model can be applied to congested networks and to the case in which data are available on only a subset of links.

We also propose an MCMC algorithm, based on Gibbs sampling and forward filtering backward sampling, in order to approximate the joint posterior distributions of mean OD flows and parameters of the route choice model. We illustrate its application through numerical studies in a network from the literature.

This paper is organized as follows: in Section 2 we describe the proposed dynamic linear model; in Section 3 we describe a route choice model and an MCMC algorithm for the approximation of the joint posterior of mean OD flows and route choice parameters; in Section 4 we present numerical studies and discuss the results; finally, in Section 5 we draw some conclusions and propose further developments.

2 A dynamic linear model for day-to-day OD flows

In Section 2.1 we define the proposed dynamic linear model and in Section 2.2 we describe the procedure for estimation of mean OD flows.

2.1 Model definition

Dynamic linear models are defined by a state transition equation, which describes how a system evolves in time, and an observational equation, which describes how observed quantities relate to the system’s states. They have a Markovian structure and assume multivariate normal probability distributions for the random variables. The system’s states are regarded as unobserved parameters which are estimated through Bayesian updating. We refer to [35] for an excellent introduction to the corresponding theory.

Let be a transportation network in which is a set of nodes, a set of directed links and a set of OD pairs. For a sequence of subsequent time periods , we define as the mean OD flow vector, in which is the mean OD flow for OD pair at time and . We define as the vector of observed traffic volumes in a subset of links in a network at time , in which is the observed volume on link and .

We consider the vector of mean OD flows as the unobserved system state vector, which depends only on its previous state at time plus a stochastic error. The vector of observed link volumes at time depends only on the current unobserved vector of states, such that:


We refer to equation (1) as the dynamic model, while equation (2) is the observational model. is the system matrix and is a stochastic error, where the letter “N” stands for the multivariate normal density with the appropriate dimension and is a covariance matrix also referred to as an evolution matrix. It governs the stochastic evolution of the system through time. We notice that . is an assignment matrix which relates mean OD flows to observed traffic volumes, while is a zero-mean error term which corresponds to the deviation of observed volumes relative to the expected value . We have that .

In order to fully determine the model given by equations (1) and (2), we must specify the corresponding matrices. Regarding the system matrix, we assume that in the short term mean OD flows are locally constant. This implies that at time mean OD flows should be approximately equal to mean OD flows at time , so that and , the identity matrix with the appropriate dimension. The evolution matrix is in general supplied by the modeler.

Regarding matrices and of the observational model, they must represent correctly the relationship between mean OD flows and observed traffic volumes. We define as realized OD flows around mean OD flows at time , which distributes among flows on routes , which aggregate into observed traffic volumes . Figure 1 summarizes the hierarchical relationship between variables in our proposed DLM.

Figure 1: Hierarchical relationship between variables in the DLM. Mean OD flows, realized OD flows and route flows are unobserved variables.

Let be the vector of realized OD flows at time . We assume where is a covariance matrix, which can account for correlations between OD flows or simply be a diagonal matrix in case OD flows are independent. Given a realized vector , for each OD pair there is a vector of route flows , in which is the size of the route set of OD pair , and let be the probability of route being selected at time . We assume follows a multivariate normal distribution with a multinomial-like covariance structure, i.e.:

Where and denote the expected value and the covariance, respectively. Thus , in which is the vector of route choice probabilities of OD pair and the covariance matrix is given by:


Where denotes a matrix whose diagonal elements correspond to the vector and all other elements are zero. Notice also that as , the covariance matrix will be singular and the corresponding multivariate normal distribution will be degenerate. In order to avoid this, we assume there is a positive probability of none of the routes in a route set being chosen, so that assuring that is non-singular. This assumption will be realistic if the route sets include many routes, so that flows on the excluded routes are negligible.
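The singularity argument above can be checked numerically. The following is a minimal sketch, assuming the multinomial-like covariance has the usual form "flow times (diag(p) minus p p-transpose)"; the function name, the flow value and the probabilities are illustrative and not taken from the paper.

```python
import numpy as np

def route_flow_covariance(od_flow, probs):
    """Multinomial-like covariance: od_flow * (diag(p) - p p^T)."""
    p = np.asarray(probs, dtype=float)
    return od_flow * (np.diag(p) - np.outer(p, p))

# Route choice probabilities that sum to one: the covariance is singular,
# because the vector of ones lies in its null space.
p_full = np.array([0.5, 0.3, 0.2])
cov_full = route_flow_covariance(100.0, p_full)

# Reserve a small probability p0 for "none of the listed routes"
# (illustrative value), so the listed probabilities sum to 1 - p0.
p0 = 0.01
p_reduced = (1.0 - p0) * p_full
cov_reduced = route_flow_covariance(100.0, p_reduced)

null_direction = cov_full @ np.ones(3)              # numerically zero
min_eigenvalue = np.linalg.eigvalsh(cov_reduced).min()  # strictly positive
```

With the probabilities summing to one, the product with the vector of ones vanishes; once a positive probability is reserved for the excluded routes, the smallest eigenvalue becomes strictly positive.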

Let be the joint vector of route flows across all OD pairs. We also define as a block-diagonal matrix composed of route choice probability vectors and as a block-diagonal matrix composed of covariance matrices of route flows, i.e.:


Since we defined multivariate normal distributions for all subvectors , and assuming that route flow subvectors are conditionally independent given realized OD flow vector we have that . (For convenience, we omit the explicit dependence on in the following developments).

Now we are able to obtain the conditional by marginalizing . Since both the densities and are multivariate normal, and ignoring the dependence of on (we will take the dependence of on into account in the inference of in next section), from the properties of the multivariate normal we have [27]:

In order to complete the formulation of our model, we must obtain the conditional density of observed traffic volumes given mean OD flows. First we assume that , where is the covariance matrix of measurement errors when observing volumes on links. is the link-path incidence matrix for observed links, whose entries if link is part of route and otherwise. It is a deterministic parameter which is a function of the topology of the network and of the route choice sets.

Since both and are multivariate normal, by marginalizing we have:


From (6), we can finally specify and by defining:


as an assignment matrix and the covariance matrix by:


It is noteworthy in equation (8) that the variability in OD flows, in route choices and in volume measurement are all represented in the covariance matrix.
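As an illustration of how the three sources of variability combine, the sketch below composes an assignment matrix and an observation covariance by marginalizing the Gaussian hierarchy described in this section (realized OD flows, then route flows, then link volumes). The exact forms of equations (7) and (8) are not reproduced here, so the expressions in the code are an assumption based on standard properties of the multivariate normal; all names (`delta`, `P`, `sigma_x`, `sigma_y`, `sigma_eps`) and the toy network are hypothetical.

```python
import numpy as np

def multinomial_like_cov(flow, p):
    """Covariance flow * (diag(p) - p p^T) for the route flows of one OD pair."""
    p = np.asarray(p, dtype=float)
    return flow * (np.diag(p) - np.outer(p, p))

def observation_model(delta, P, sigma_x, sigma_y, sigma_eps):
    """Assumed forms: F = delta P, V = delta (sigma_y + P sigma_x P^T) delta^T + sigma_eps."""
    F = delta @ P
    V = delta @ (sigma_y + P @ sigma_x @ P.T) @ delta.T + sigma_eps
    return F, V

# Toy network (hypothetical): 2 OD pairs, 3 routes, 2 observed links.
mean_od = np.array([100.0, 80.0])
p1 = np.array([0.60, 0.39])   # routes of OD pair 1; 1% left for "no listed route"
p2 = np.array([0.99])         # single route of OD pair 2
P = np.zeros((3, 2))          # block-diagonal route choice matrix (routes x OD pairs)
P[:2, 0] = p1
P[2:, 1] = p2
sigma_y = np.zeros((3, 3))    # block-diagonal covariance of route flows
sigma_y[:2, :2] = multinomial_like_cov(mean_od[0], p1)
sigma_y[2:, 2:] = multinomial_like_cov(mean_od[1], p2)
delta = np.array([[1.0, 0.0, 1.0],    # link 1 carries routes 1 and 3
                  [0.0, 1.0, 0.0]])   # link 2 carries route 2
sigma_x = 4.0 * np.eye(2)     # variability of realized OD flows
sigma_eps = 1.0 * np.eye(2)   # measurement error on link volumes

F, V = observation_model(delta, P, sigma_x, sigma_y, sigma_eps)
```

In this sketch the route flow covariance is evaluated at the mean OD flows, which mirrors the treatment described in Section 2.2, where it is computed from the predicted mean OD flows.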

In the next section we describe the estimation equations for mean OD flows.

2.2 Estimation of mean OD flows

Estimation of mean OD flows from observed link volumes can be made by Bayesian updating. We assume that the covariance matrices , and , the link-path incidence matrix , the route choice matrices and the probability of none of the routes in a route set being chosen are known parameters and are part of the prior knowledge.

At time , let be the posterior distribution in previous time step , where is the set of observed volume vectors in past time periods. We have a prior distribution before the observation of . Since in our local constant model given by equation (1) the system matrix , then:


Notice that uncertainty on mean OD flows increases by addition of the evolution matrix to the covariance matrix . Next, we define as the one-step forecast distribution of the vector of observed link volumes, where:


It is worth noting that in computing by equation (8), the covariance matrix of route flows is computed based on the predicted mean OD flows , i.e., in which from equation (3):


Finally, the parameters of the posterior distribution are given by the Bayesian updating of the parameters, which in this case coincide with Kalman filter updating equations [35]:


Where is an adjustment matrix, which controls how the parameters from the posterior distribution are modified according to the new observation . In particular, the adjustment matrix is a function of the prior covariance matrix and of the inverse of the covariance matrix of the one-step forecast distribution of link volumes, so that the adjustment matrix gives more or less weight to observed link volumes according to their uncertainty relative to the uncertainty in OD flows.

At a time , is an estimator of the mean OD flows and is a measure of uncertainty of the estimate. At , we define where are not actually observed link volumes, but symbolically represents the modeler’s prior knowledge on OD flows. The estimation procedure can be summarized as follows:

  1. Starting from , set ;

  2. For to , do:

    1. Determine prior distribution parameters and by means of equations (9) and (10), respectively;

    2. Compute the assignment matrix by means of equation (7);

    3. Compute the covariance matrix by means of equations (8) and (13);

    4. Determine the parameters of the one-step forecast distribution and by using equations (11) and (12);

    5. Determine posterior distribution parameters and by means of equations (14), (15) and (16).
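The steps above can be sketched as a single filtering update. This is a minimal sketch of the standard Kalman recurrences for a locally constant DLM (identity system matrix), written in the notation of [35]; the variable names and the scalar example values are illustrative assumptions, and the observation covariance `V` is passed in rather than recomputed from the predicted mean OD flows as the procedure prescribes.

```python
import numpy as np

def kalman_step(m_prev, C_prev, W, F, V, z):
    """One forward-filtering update for a locally constant DLM.

    m_prev, C_prev : posterior mean and covariance at the previous time step
    W              : evolution covariance matrix
    F              : assignment matrix (observed links x OD pairs)
    V              : observation covariance matrix
    z              : observed link volumes at the current time step
    """
    a = m_prev                          # prior mean (system matrix is the identity)
    R = C_prev + W                      # prior covariance grows by the evolution matrix
    f = F @ a                           # one-step forecast mean of link volumes
    Q = F @ R @ F.T + V                 # one-step forecast covariance
    A = np.linalg.solve(Q, F @ R).T     # adjustment matrix R F^T Q^{-1} (Q is symmetric)
    m = a + A @ (z - f)                 # posterior mean
    C = R - A @ Q @ A.T                 # posterior covariance
    return m, C, a, R

# Scalar illustration with hypothetical values.
m1, C1, a1, R1 = kalman_step(
    m_prev=np.array([100.0]), C_prev=np.array([[100.0]]),
    W=np.array([[25.0]]), F=np.array([[1.0]]),
    V=np.array([[125.0]]), z=np.array([120.0]),
)
```

In the scalar illustration the prior variance (125) equals the observation variance (125), so the adjustment matrix is 0.5 and the posterior mean moves halfway from the prior mean to the observation.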

In the next section, we model route choice probabilities based on a utility model in order to jointly estimate mean OD flows and the route choice matrix.

3 Bayesian inference on route choice parameters

In Section 3.1, we treat route choice probabilities as unobserved quantities and represent them through a utility model. In Section 3.2 we propose an MCMC algorithm to approximate the joint posterior distribution of route choice parameters and mean OD flows.

3.1 Utility model and route choice probabilities

In congested networks, route choice probabilities greatly depend on past route costs. Users take into account their previous experiences in order to evaluate the utility associated with each route. Utilities will vary over time depending on users’ memory, their sensitivity to route costs and fluctuations in mean OD flows, among other factors. As route choice probabilities are functions of utilities, these will also change dynamically.

We assume that the utility of a route in an OD pair at a time as perceived by users is a linear function of its past route costs:


in which are parameters, are observed past route costs and is the length of users’ memory. The minus sign in equation (17) is used since costs are disutilities, in order that routes with lower costs are preferable. We may interpret the parameters as users’ sensitivities to past route costs. It is reasonable to assume that they have non-negative values, otherwise higher costs will contribute to higher utilities. We also expect users to be more sensitive to recent route costs than to older costs, which implies that .

We adopt a multinomial logit model for route choice probabilities [3] in order that the probability of a route in OD pair at time being chosen by a user is given by:


Where is the probability of a user not following a route in a route choice set and denotes Euler’s number.

Let be the vector of route costs for routes in OD pair at time and . We notice that, given route costs for , and parameters and , we uniquely determine the vectors and the route choice matrix for to by means of equations (4), (17) and (18).
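The mapping from past route costs to route choice probabilities can be sketched as follows. This is a minimal sketch, assuming that utilities are negative weighted sums of past costs and that the probability of not following a listed route enters the logit model by rescaling the probabilities so that they sum to one minus that probability; the function name, the parameter values and the costs are hypothetical.

```python
import numpy as np

def route_choice_probabilities(past_costs, theta, p0):
    """Logit route choice probabilities for one OD pair.

    past_costs : array (memory, n_routes); row 0 holds the most recent costs
    theta      : array (memory,), sensitivities to past route costs
    p0         : probability of not following any listed route (assumed scaling)
    """
    past_costs = np.asarray(past_costs, dtype=float)
    utilities = -(np.asarray(theta, dtype=float) @ past_costs)  # costs are disutilities
    weights = np.exp(utilities - utilities.max())               # stable softmax
    return (1.0 - p0) * weights / weights.sum()

# Two routes, memory of two days (illustrative costs and sensitivities).
costs = np.array([[10.0, 12.0],    # most recent day
                  [11.0, 11.0]])   # the day before
probs = route_choice_probabilities(costs, theta=[0.5, 0.3], p0=0.01)
```

Here the first route was cheaper on the most recent day, so it receives the larger probability; the utilities differ by exactly one unit, so the ratio of the two probabilities equals Euler's number.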

3.2 An MCMC algorithm

In this section, we develop an MCMC algorithm in order to sample from the joint posterior distribution of mean OD flows and the parameters of the route choice utility model given in (17).

From Bayes theorem, we have:


where is a shorthand referring to the vectors and a shorthand for . is the joint posterior, is the likelihood function of the observed link volumes and is the prior distribution. We include in prior knowledge set all other parameters, such as for , and route costs for .

We notice that Gibbs sampling [16] allows us to sample from the joint distribution by alternately sampling from conditional distributions and . We must then devise how to sample from each distribution.

Sampling from can be implemented through the forward filtering backward sampling (FFBS) method [6, 15, 35]:

  1. (forward filtering) For , apply Kalman filtering recurrences given by equations (14) and (15). Save , , and for ;

  2. (backward sampling) Starting from , for , sample each backwards from the conditionals , in which:


    and .
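The backward sampling pass can be sketched as below. This is a minimal sketch of the standard FFBS recursion for a DLM with identity system matrix, as described in [35]; it assumes the filtered and prior moments were saved during forward filtering, and the list layout, names and toy numbers are illustrative assumptions rather than the authors' implementation.

```python
import numpy as np

def backward_sample(m, C, a, R, rng):
    """Draw one trajectory of states given filtered moments.

    m[t], C[t] : filtered mean and covariance for t = 0..T
    a[t], R[t] : prior mean and covariance for t = 1..T (index 0 unused)
    """
    T = len(m) - 1
    theta = [None] * (T + 1)
    theta[T] = rng.multivariate_normal(m[T], C[T])
    for t in range(T - 1, -1, -1):
        B = C[t] @ np.linalg.inv(R[t + 1])        # smoothing gain (system matrix = I)
        h = m[t] + B @ (theta[t + 1] - a[t + 1])  # conditional mean
        H = C[t] - B @ R[t + 1] @ B.T             # conditional covariance
        H = 0.5 * (H + H.T)                       # enforce symmetry numerically
        theta[t] = rng.multivariate_normal(h, H)
    return np.array(theta)

# Toy filtered output for a scalar state over T = 2 steps (hypothetical values).
rng = np.random.default_rng(0)
W = 25.0 * np.eye(1)
m = [np.array([100.0]), np.array([105.0]), np.array([110.0])]
C = [50.0 * np.eye(1)] * 3
a = [None, m[0], m[1]]                 # prior mean equals previous posterior mean
R = [None, C[0] + W, C[1] + W]         # prior covariance adds the evolution matrix
draw = backward_sample(m, C, a, R, rng)
```

Each call returns one joint draw of the state trajectory; repeated calls inside a Gibbs sampler give draws from the conditional distribution of the states given the remaining parameters.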

We notice that step 1 (forward filtering) of the FFBS algorithm is possible since given , observed link volumes and prior knowledge , we have the route choice matrix and all other parameters determined for , so that we can apply recurrence equations (14) and (15). Then, step 2 generates a sample of mean OD flows .

Regarding the conditional , sampling can be performed through a Metropolis-Hastings (MH) step [17]. First, we notice that:


as does not depend on , and is the prior distribution of . From (6), the likelihood term is given by:


in which matrices and are dependent on since they are functions of the route choice matrices , which are dependent on through equations (17) and (18).

The MH step is then performed as follows: given a candidate vector , where is the current vector at iteration and is a proposal distribution, sample and accept as next sample if:


otherwise, make .
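A single MH update of this kind can be sketched as follows. This is a minimal sketch assuming a symmetric multivariate normal random-walk proposal, so that the proposal densities cancel in the acceptance ratio; `log_post` stands for a user-supplied function returning the log of the unnormalized conditional posterior, and the toy target used in the usage lines is purely illustrative and unrelated to the paper's likelihood.

```python
import numpy as np

def mh_step(theta, log_post, proposal_cov, rng):
    """One random-walk Metropolis-Hastings update; returns (new value, accepted?)."""
    candidate = rng.multivariate_normal(theta, proposal_cov)
    log_ratio = log_post(candidate) - log_post(theta)
    if np.log(rng.uniform()) < log_ratio:
        return candidate, True
    return theta, False

# Toy target (hypothetical): independent normals truncated to non-negative values.
target_mean = np.array([0.5, 0.3])

def toy_log_post(theta):
    if np.any(theta < 0):
        return -np.inf                 # zero posterior mass for negative parameters
    return -0.5 * np.sum((theta - target_mean) ** 2) / 0.1 ** 2

rng = np.random.default_rng(0)
theta = np.array([1.0, 1.0])
chain, n_accepted = [], 0
for _ in range(5000):
    theta, accepted = mh_step(theta, toy_log_post, 0.01 * np.eye(2), rng)
    chain.append(theta)
    n_accepted += accepted
chain = np.array(chain)
acceptance_rate = n_accepted / len(chain)
```

Working with log densities avoids numerical underflow when the likelihood is a product over many time periods, which is the situation in the model of this paper.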

Finally, the proposed MCMC algorithm is summarized below:

  1. Initialize vectors and .

  2. From iteration onwards, repeat until convergence:

    1. (FFBS step) Sample by applying updating equations (14), (15) and sampling backwards through equations (20) and (21);

    2. (MH step) Sample a candidate according to a proposal distribution and make according to acceptance test given in equation (24), otherwise make .

In the next section we illustrate the application of the proposed MCMC algorithm through some numerical studies.

4 Numerical studies

In the following subsections, we describe the generation of simulated data and present the results from the application of our approach to a test network.

4.1 Generation of simulated data

In order to illustrate the application of our proposed DLM and MCMC algorithm, we generated data for a transportation network from [22], given in Figure 2, which has 8 nodes and 10 links. Nodes 1 and 2 are the origins and nodes 7 and 8 are the destinations, so that we have 4 OD pairs: (1,7), (1,8), (2,7) and (2,8).

Figure 2: Network used in the numerical studies (adapted from [22]).

In order to adapt the network to our experiment, we did not use the original data on demand and link parameters. We assumed that travel times on links are given by BPR functions according to equation (25).


Where is the travel time on a link with traffic volume , denotes the travel time in “free flow” and is the capacity of the link. and are parameters of the function, for which we adopt the typical values from the literature 0.15 and 4, respectively. Note that the capacity of the link is not treated as a hard constraint, i.e., simulated volumes are allowed to be greater than . In the test network, we set and on all links.
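For reference, the standard BPR travel time function can be sketched as below, using the parameter values 0.15 and 4 adopted in the text. The free-flow time and capacity in the usage lines are illustrative placeholders, not the values used in the test network.

```python
import numpy as np

def bpr_travel_time(volume, free_flow_time, capacity, alpha=0.15, beta=4.0):
    """Standard BPR function: t = t0 * (1 + alpha * (v / c) ** beta)."""
    volume = np.asarray(volume, dtype=float)
    return free_flow_time * (1.0 + alpha * (volume / capacity) ** beta)

# Hypothetical link with free-flow time 10 and capacity 100.
t_empty = bpr_travel_time(0.0, 10.0, 100.0)       # free-flow time
t_at_cap = bpr_travel_time(100.0, 10.0, 100.0)    # 15% above free flow
t_over = bpr_travel_time(200.0, 10.0, 100.0)      # capacity is not a hard constraint
```

Volumes above capacity are allowed and simply produce rapidly increasing travel times, which is consistent with capacity not being treated as a hard constraint.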

The procedure used to generate the simulated data was the following:

  1. At , set values for , , , , , , for to and past route costs , where is the size of users’ memory and denotes route costs computed assuming free flow in the network.

  2. For to do:

    1. Sample mean OD flows ;

    2. Sample OD flows

    3. Compute and for all routes for all OD pairs from equations (17) and (18), respectively, and compute the route choice matrix ;

    4. Sample route flows , where is calculated from equation (3);

    5. Calculate route costs , where is a vector function which returns a vector of costs on links based on BPR functions given by equation (25);

    6. Sample observed traffic volumes .
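The generation procedure above can be sketched on a toy example. This is a minimal sketch for a hypothetical network with one OD pair and two single-link routes, not the test network of Figure 2; all numerical values, the clipping bounds on the mean OD flow, and the way the "no route" probability scales the logit probabilities are assumptions made for illustration.

```python
import numpy as np

rng = np.random.default_rng(42)

T, memory = 30, 2
theta_rc = np.array([0.5, 0.3])      # hypothetical route choice parameters
p0 = 0.01                            # probability of not following a listed route
t0 = np.array([10.0, 12.0])          # hypothetical free-flow link times
cap = np.array([80.0, 80.0])         # hypothetical link capacities
delta = np.eye(2)                    # link-path incidence (each route is one link)
W, sigma_x, sigma_eps = 25.0, 1.0, 1.0   # evolution, OD flow and measurement variances

def link_costs(link_volumes):
    """BPR link travel times with alpha = 0.15 and beta = 4."""
    return t0 * (1.0 + 0.15 * (link_volumes / cap) ** 4)

mean_flow = 100.0
# Step 1: before the first period, past route costs are the free-flow costs.
past_costs = [delta.T @ t0 for _ in range(memory)]
mean_flows, volumes = [], []
for t in range(T):
    # 2.1 Mean OD flow follows a random walk (clipped to an assumed interval).
    mean_flow = float(np.clip(mean_flow + rng.normal(0.0, np.sqrt(W)), 50.0, 150.0))
    # 2.2 Realized OD flow around the mean.
    od_flow = rng.normal(mean_flow, np.sqrt(sigma_x))
    # 2.3 Utilities and logit probabilities from past route costs.
    u = -(theta_rc @ np.array(past_costs))
    w = np.exp(u - u.max())
    p = (1.0 - p0) * w / w.sum()
    # 2.4 Route flows with multinomial-like covariance.
    cov = od_flow * (np.diag(p) - np.outer(p, p))
    y = rng.multivariate_normal(p * od_flow, cov)
    # 2.5 Route costs from link costs; newest costs go to the front of the memory.
    route_costs = delta.T @ link_costs(delta @ y)
    past_costs = [route_costs] + past_costs[:-1]
    # 2.6 Observed link volumes with measurement error.
    z = delta @ y + rng.normal(0.0, np.sqrt(sigma_eps), size=2)
    mean_flows.append(mean_flow)
    volumes.append(z)

mean_flows = np.array(mean_flows)
volumes = np.array(volumes)
```

Because route costs feed back into the next period's route choice probabilities, the simulated link volumes exhibit the day-to-day dependence that the DLM is designed to capture.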

OD flows vary within the interval and follow the locally constant model given in equation (1), starting at time from the OD flow vector . The OD pairs are ordered lexicographically in vector , so that mean OD flows in OD pair (1,7) corresponds to , (1,8) corresponds to and so on. The evolution covariance matrix is constant and given by , where is the appropriate identity matrix. We assume that variability around mean OD flows is negligible, so as to and for all , and we also assume that . These assumptions imply that variability in observed link volumes will be mostly due to variability in route choices. We simulate OD flows for time periods.

In addition, we set a users’ memory length of for the utility model, so that the utility of a route of OD pair at time is given by (equation (17)) and is the vector of route choice parameters. We enumerated exhaustively all 12 routes between origins and destinations, resulting in a link-path incidence matrix with 10 rows and 12 columns. Since we enumerated all routes, we assumed a small probability of a trip not following one of the routes. Table 1 shows the mean congestion level (CL) for a link , given by the ratio between mean simulated traffic volumes and the link capacities:


As can be seen in Table 1, links 2, 5, 6 and 9, which are in the central area of the network, have high congestion levels.

Link   1        2        3        4        5
CL     0.5500   0.8940   0.2482   0.2535   0.6378

Link   6        7        8        9        10
CL     0.6473   0.2481   0.2542   0.8914   0.5685

Table 1: Mean congestion level (CL) on links from simulated OD flows.

The simulations, the proposed DLM and the MCMC algorithm were implemented in Python 2.7 using the NumPy (Numerical Python) module, version 1.10.

4.2 Application of the MCMC algorithm

When applying the MCMC algorithm, we assume that we observe link volumes on all links and route costs . In practice, these data on traffic volumes are routinely collected through modern intelligent transportation systems (ITS), and route costs can be estimated from data on travel times on links or can be collected from a sample of cars following selected routes. Except for the mean OD flows and the parameters , we assume all other parameters are known. We use an uninformative multivariate normal prior for the initial mean OD flows vector , with where is the unit vector with the appropriate dimension and , and an uninformative improper prior for .

The MCMC algorithm was run for 10000 iterations with a multivariate normal proposal distribution for candidate vectors with covariance matrix equal to , which resulted in an acceptance rate of about 22%, and we discarded the initial 2000 samples as burn-in. We used as starting values for and the starting values for were sampled from a multivariate normal with mean and covariance matrix . It took about 10 min to run the MCMC algorithm on a Core i7 machine with 3.1 GHz and 8 GB RAM.

The resulting Markov chain of sampled values is given in Figure 3. Figure 4 shows kernel density estimation of the marginal posteriors of and obtained from the sampled values. It can be seen that the sampled marginal posteriors have high densities around the true simulated values for and . The sample mean values for and were 0.5250 and 0.3651, respectively, and highest posterior density regions with 95% probability were [0.3546, 0.7067] and [0.1979, 0.5437], respectively. Figure 5 shows the simulated and estimated mean OD flows, with mean squared error (MSE) between simulated and estimated mean OD flows of 15.83. From these results, we can conclude that the proposed MCMC algorithm was able to estimate with good accuracy both mean OD flows and route choice parameters.

Figure 3: Markov chain for and .
Figure 4: Kernel density estimation for the marginal posteriors of and . Vertical bars show true simulated values.
Figure 5: Simulated and estimated mean OD flows
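Highest posterior density regions such as those reported above can be approximated directly from MCMC samples. The following is a minimal sketch, assuming a unimodal marginal posterior so that the region is a single interval: it returns the shortest interval containing the requested fraction of sorted samples. The function name and the synthetic samples are illustrative; the paper does not state how its HPD regions were computed.

```python
import numpy as np

def hpd_interval(samples, prob=0.95):
    """Shortest interval containing at least `prob` of the samples."""
    x = np.sort(np.asarray(samples, dtype=float))
    n = len(x)
    k = int(np.ceil(prob * n))            # number of samples inside the interval
    widths = x[k - 1:] - x[:n - k + 1]    # width of every window of k consecutive samples
    i = int(np.argmin(widths))
    return x[i], x[i + k - 1]

# Synthetic skewed samples (hypothetical), to contrast with an equal-tailed interval.
rng = np.random.default_rng(0)
samples = rng.exponential(scale=1.0, size=10000)
lower, upper = hpd_interval(samples, prob=0.95)
equal_tailed = np.percentile(samples, [2.5, 97.5])
```

For skewed posteriors the HPD interval is shorter than the equal-tailed interval with the same coverage, which is why it is often preferred for summarizing parameters restricted to non-negative values.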

4.3 Unknown evolution matrix

Although in theory the evolution matrix in DLMs may be jointly estimated along with other parameters, it is often treated as a known parameter provided by the analyst. A convenient way of specifying the evolution matrix is through discount factors [35]. According to equation (9), the prior covariance matrix of the OD flows at time is given by , i.e., it is the posterior covariance matrix at the previous time period amplified by the evolution covariance matrix, which corresponds to the increase in uncertainty due to time. Then, we can write the prior covariance matrix as , where , so as to . The term is a discount factor. Notice that, when , we do not have an increase in uncertainty from time to , thus corresponding to a static model. Lower values correspond to higher increase in uncertainty from time to time .
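The discount factor construction can be illustrated with a short sketch. It assumes the usual single-discount specification of [35], in which the prior covariance is the previous posterior covariance divided by the discount factor; the function name and the numerical values are hypothetical.

```python
import numpy as np

def discounted_prior_cov(C_prev, discount):
    """Prior covariance and implied evolution matrix for a discount factor in (0, 1]."""
    if not 0.0 < discount <= 1.0:
        raise ValueError("discount factor must lie in (0, 1]")
    R = C_prev / discount      # prior covariance at the current time step
    W = R - C_prev             # implied evolution matrix: C_prev * (1 - discount) / discount
    return R, W

C_prev = 90.0 * np.eye(2)      # illustrative posterior covariance
R_09, W_09 = discounted_prior_cov(C_prev, 0.9)
R_10, W_10 = discounted_prior_cov(C_prev, 1.0)   # static model: no added uncertainty
```

With a discount factor of 0.9, the prior variance is about 11% larger than the previous posterior variance; with a discount factor of 1, the implied evolution matrix is zero.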

We ran our proposed MCMC algorithm with varying discount factors , starting from covariance matrix , in order to assess its impact on the quality of the estimation of the route choice parameters and mean OD flows. In Table 2 we show the corresponding sample means and , 95% highest posterior density regions for and , and MSE between simulated and estimated mean OD flows.

Discount factor   Mean (1st parameter)   HPD                Mean (2nd parameter)   HPD                MSE(OD flows)
0.7               0.4941                 [0.3265, 0.6659]   0.3340                 [0.1475, 0.5007]   137.27
0.8               0.4960                 [0.3178, 0.6600]   0.3380                 [0.1674, 0.5288]   82.84
0.9               0.4898                 [0.3121, 0.6606]   0.3313                 [0.1480, 0.5057]   33.07

Table 2: Estimation results given different discount factors (HPD - 95% highest posterior density region)

Figure 6 illustrates the effect of using on the estimation of mean OD flows. As can be seen from the results, there was little difference in estimation quality between the cases with known and unknown evolution matrix regarding the parameters . The difference is more pronounced with respect to the estimation quality of mean OD flows, which exhibited higher MSE when the evolution matrix was specified by means of a discount factor.

Figure 6: Simulated and estimated mean OD flows for

4.4 Observation of traffic volumes on partial links

In some real transportation networks, data on traffic volumes are available on only a few links. We consider the application of our MCMC algorithm in these settings. In this set of experiments, all parameters are equal to the simulated ones, except for the mean OD flows and the parameters of the utility model, which are estimated by the MCMC algorithm. We vary only the number of observed links and which links are observed.

We test three cases: observations on 1 link, on 2 links, and on 3 links in the network in Figure 2. Links were selected so that we have a representative subset of the network. In Table 3 we show the corresponding sample means and , 95% highest posterior density regions for and , and MSE between simulated and estimated mean OD flows.

Observed links   Mean (1st parameter)   HPD                Mean (2nd parameter)   HPD                MSE(OD flows)
1                0.6088                 [0.0065, 1.1342]   0.3904                 [0.0014, 0.8312]   471.67
2                0.6283                 [0.3897, 0.8247]   0.4707                 [0.2509, 0.6827]   256.53
9                0.5877                 [0.3697, 0.7972]   0.3644                 [0.1608, 0.5752]   233.34
2 and 5          0.5115                 [0.3194, 0.6817]   0.3681                 [0.1692, 0.5422]   246.70
1 and 9          0.5728                 [0.3608, 0.7456]   0.3640                 [0.1760, 0.5546]   175.43
2, 5 and 9       0.5023                 [0.3241, 0.6838]   0.3323                 [0.1383, 0.5039]   86.51
1, 7 and 9       0.5933                 [0.3947, 0.7936]   0.3676                 [0.1870, 0.5870]   59.82

Table 3: Estimation results with observation on partial links (HPD - 95% highest posterior density region)

In Table 3, we see that the MSE of mean OD flows decreases as we observe more links. This result is in agreement with similar results of experiments with static models in the literature. The best results were obtained with data on links 1, 7 and 9, for which MSE = 59.82. It is noteworthy that this result is not very far from the case studied in Section 4.2, where we have traffic volume data on all links and MSE = 15.83. Figure 7 shows simulated and estimated mean OD flows for this case with data on links 1, 7 and 9.

Figure 7: Simulated and estimated mean OD flows with observed traffic volumes on links 1, 7 and 9

Estimation quality of route choice parameters also increases with the observation of more links, as the HPD regions get tighter around the simulated values . Nevertheless, this effect is not so pronounced for OD flows. The best result among the tested cases was obtained when observing links 2, 5 and 9, which are links located in the central and more congested region of the network. It is also worth noting that we could estimate route choice parameters to a good accuracy by observing only link 9.

In addition, we notice that the estimation results are worst when we have traffic volume data only on link 1. This may be due to the fact that no route for OD pairs (2,7) and (2,8) includes link 1. Thus, traffic volumes on link 1 give no information on OD flows for these OD pairs. In Figure 8 we see that, in this case, estimated mean OD flows remain at 100, which is our prior mean.
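The argument above amounts to a simple check on the link-route structure: an OD pair can only be informed by the data if at least one of its routes uses an observed link. The sketch below illustrates this check on a toy network. The OD pairs, routes and link numbers are hypothetical and do not reproduce the network in Figure 2, and `informed_od_pairs` is a name introduced here for illustration.

```python
def informed_od_pairs(routes_by_od, observed_links):
    """Return the OD pairs with at least one route through an observed link.

    routes_by_od maps an OD pair to a list of routes, each route being a
    list of link identifiers.
    """
    observed = set(observed_links)
    return {
        od
        for od, routes in routes_by_od.items()
        if any(observed.intersection(route) for route in routes)
    }

# Hypothetical toy network (not the test network of the paper).
toy_routes = {
    ("A", "C"): [[1, 3], [2, 4]],
    ("B", "C"): [[5, 4]],
}

print(informed_od_pairs(toy_routes, [1]))  # only ("A", "C") uses link 1
print(informed_od_pairs(toy_routes, [4]))  # both OD pairs use link 4
```

For OD pairs outside the returned set, the posterior of the mean OD flows is driven by the prior and the evolution equation alone, which is consistent with the behavior described for the case of data on link 1 only.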

Figure 8: Simulated and estimated mean OD flows with observed traffic volumes on link 1

Finally, we should emphasize that we obtained good estimates when observing only a subset of links even though we used uninformative priors for both mean OD flows and route choice parameters. Static models often resort to prior knowledge, in the form of prior OD matrices, in order to “regularize” estimates when data are available only on a few links.

5 Conclusion

In this paper, we proposed a dynamic linear model for day-to-day OD flows in transportation networks. In our model, we took into account variability in OD flows, route choices and traffic volume measurements. We also modeled route choices through a utility model based on past route costs. We proposed a Markov chain Monte Carlo algorithm in order to sample from the joint posterior distribution of mean OD flows and route choice parameters by using data on traffic volumes on links and past route costs. Our model can be applied to congested networks and when data on only a subset of links are available.

We illustrated the application of the DLM and MCMC algorithm on a test network from the literature using simulated data. In our experiments, we were able to estimate with good accuracy both mean OD flows and route choice parameters using uninformative prior distributions and data on a subset of links in the test network. These are very promising results and indicate that dynamic linear modeling of day-to-day OD flows may provide a valuable tool in analyzing and estimating OD demand in transportation networks.

As further extensions to our work, we suggest the consideration of trend and seasonality in OD flows. This can be implemented through the definition of additional parameters such as trend and seasonal factors, whose relationship with mean OD flows can be easily established by means of the system matrix. Another promising research direction is the use of our proposed DLM to forecast future traffic volumes, after setting up the model with estimated route choice parameters.
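To illustrate how a trend could enter through the system matrix, consider the standard linear growth DLM described by West and Harrison [35], written here for a single OD pair. The symbols $\mu_t$ (mean OD flow), $\beta_t$ (trend), $G$ (system matrix) and $\omega_t$ (evolution error) are assumptions adopted for this sketch and may differ from the notation used in the earlier sections.

```latex
\begin{align*}
\begin{pmatrix} \mu_t \\ \beta_t \end{pmatrix}
  &= G \begin{pmatrix} \mu_{t-1} \\ \beta_{t-1} \end{pmatrix} + \omega_t,
  \qquad
  G = \begin{pmatrix} 1 & 1 \\ 0 & 1 \end{pmatrix},
  \qquad
  \omega_t \sim N(0, W_t), \\
\mu_t   &= \mu_{t-1} + \beta_{t-1} + \omega_{1,t}, \\
\beta_t &= \beta_{t-1} + \omega_{2,t}.
\end{align*}
```

Under this specification, the mean OD flow grows by the current trend value each day, while the trend itself evolves as a random walk. Seasonal factors could be appended to the state vector in the same way, with a corresponding block added to $G$.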




  1. K. Ashok and M. E. Ben-Akiva. Estimation and prediction of time-dependent origin-destination flows with a stochastic mapping to path flows and link flows. Transportation Science, 36:184–198, 2002.
  2. M. Beckmann, C. McGuire, and C. B. Winsten. Studies in the Economics of Transportation. Yale University Press, New Haven, 1956.
  3. M. Ben-Akiva and S. R. Lerman. Discrete choice analysis: Theory and application to travel demand. MIT Press, Cambridge, MA, 1985.
  4. M. Brenninger-Göthe and K. O. Jörnsten. Estimation of origin-destination matrices from traffic counts using multiobjective programming formulations. Transportation Research Part B, 23B:257–269, 1989.
  5. G. E. Cantarella and D. P. Watling. Modelling road traffic assignment as a day-to-day dynamic, deterministic process: a unified approach to discrete- and continuous-time models. EURO Journal on Transportation and Logistics, 5(1):69–98, 2015.
  6. C. K. Carter and R. Kohn. On Gibbs sampling for state space models. Biometrika, 81(3):541–553, 1994.
  7. E. Cascetta. Estimation of trip matrices from traffic counts and survey data: a generalized least squares estimator. Transportation Research Part B, 18B:289–299, 1984.
  8. E. Cascetta. Transportation Systems Analysis: Models and Applications. Springer, New York, NY, 2nd edition, 2009.
  9. E. Cascetta, D. Inaudi, and G. Marquis. Dynamic estimators of origin-destination matrices using traffic counts. Transportation Science, 27:363–373, 1993.
  10. E. Cascetta and S. Nguyen. A unified framework for estimating or updating origin/destination matrices from traffic counts. Transportation Research Part B, 22B:437–455, 1988.
  11. E. Cascetta and N. Postorino. Fixed point approaches to the estimation of O/D matrices using traffic counts on congested networks. Transportation Science, 35:134–147, 2001.
  12. M. Cremer and H. Keller. A new class of dynamic methods for the identification of origin-destination flows. Transportation Research Part B, 21:117–132, 1987.
  13. C. Fisk. On combining maximum entropy trip matrix estimation with user optimal assignment. Transportation Research Part B, 22B:69–79, 1988.
  14. C. Fisk. Trip matrix estimation from link counts: the congested network case. Transportation Research Part B, 23B:331–336, 1989.
  15. S. Frühwirth-Schnatter. Data augmentation and dynamic linear models. Journal of Time Series Analysis, 15(2):183–202, 1994.
  16. S. Geman and D. Geman. Stochastic relaxation, Gibbs distributions, and the Bayesian restoration of images. IEEE Transactions on Pattern Analysis and Machine Intelligence, PAMI-6(6):721–741, 1984.
  17. W. K. Hastings. Monte Carlo sampling methods using Markov chains and their applications. Biometrika, 57(1):97–109, 1970.
  18. M. L. Hazelton. Estimation of origin-destination matrices from link flows on uncongested networks. Transportation Research Part B, 34:549–566, 2000.
  19. M. L. Hazelton. Inference for origin-destination matrices: estimation, prediction and reconstruction. Transportation Research Part B, 35:667–676, 2001.
  20. M. L. Hazelton. Some comments on origin-destination matrix estimation. Transportation Research Part A, 37:811–822, 2003.
  21. M. L. Hazelton. Statistical inference for time varying origin-destination matrices. Transportation Research Part B, 42:542–552, 2008.
  22. M. L. Hazelton and K. Parry. Statistical methods for comparison of day-to-day traffic models. Transportation Research Part B: Methodological, in press, 2015.
  23. S. Nguyen. Estimating an OD matrix from network data: A network equilibrium approach. Technical report, Centre de Recherche sur les Transports, Université de Montréal, Montréal, 1977.
  24. J. d. D. Ortúzar and L. Willumsen. Modelling Transport. Wiley, Chichester, 4th edition, 2011.
  25. K. Parry and M. Hazelton. Bayesian inference for day-to-day dynamic traffic models. Transportation Research Part B, 50:104–115, 2013.
  26. P. Robillard. Estimating the OD matrix from observed link volumes. Transportation Research, pages 123–128, 1975.
  27. S. Särkkä. Bayesian Filtering and Smoothing. Cambridge University Press, New York, NY, USA, 2013.
  28. Y. Sheffi. Urban Transportation Networks: Equilibrium analysis with Mathematical Programming methods. Prentice-Hall, Englewood Cliffs, NJ, 1985.
  29. M. Smith. The existence, uniqueness and stability of traffic equilibria. Transportation Research Part B, 13B:295–304, 1979.
  30. C. Tebaldi and M. West. Bayesian inference on network traffic using link count data. Journal of the American Statistical Association, 93:557–573, 1998.
  31. H. J. Van Zuylen and L. G. Willumsen. The most likely trip matrix estimated from traffic counts. Transportation Research Part B, pages 281–293, 1980.
  32. Y. Vardi. Network tomography: Estimating source-destination traffic intensities from link data. Journal of the American Statistical Association, 91:365–377, 1996.
  33. J. G. Wardrop. Some theoretical aspects of traffic research. In Proceedings of the Institution of Civil Engineers part II, volume 1, pages 325–378. Institution of Civil Engineers, 1952.
  34. D. P. Watling and G. E. Cantarella. Modelling sources of variation in transportation systems: theoretical foundations of day-to-day dynamic models. Transportmetrica B: Transport Dynamics, 1(1):3–32, 2013.
  35. M. West and J. Harrison. Bayesian Forecasting and Dynamic Models. Springer-Verlag, New York, NY, USA, 2nd edition, 1997.
  36. H. Yang. Heuristic algorithms for the bilevel origin-destination matrix estimation problem. Transportation Research Part B, 29B:231–242, 1995.
  37. H. Yang, T. Sasaki, Y. Iida, and Y. Asakura. Estimation of origin-destination matrices from link counts on congested networks. Transportation Research Part B, 26B:417–434, 1992.