Optimizing city-scale traffic flows through modeling isolated observations of vehicle movements
Mobile phones and the Internet of Things provide unprecedented opportunities for transportation researchers and computational social scientists to observe city-scale human dynamics in terms of millions of vehicles or people moving around. They also enable policy researchers to identify the best strategies to influence individuals so that the complex system achieves the best utility. However, mobility data are sparse at the individual level, and it is non-trivial to stitch together the isolated observations with high-fidelity models to infer the macroscopic dynamics. In this paper, we introduce a discrete-event decision process to capture the high-fidelity dynamics of a complex system at the individual level in terms of a collection of microscopic events, each of which brings minimal change but which together induce complex behaviors. We further derive a particle filter algorithm to connect the dots of isolated observations by driving the discrete-event decision process in agreement with these observations. Finally, we solve a partially observable Markov decision process problem by reducing it to a learning and inference task. Evaluation with one synthesized dataset (SynthTown), one partly real and partly synthesized dataset (Berlin), and three real-world datasets (Santiago de Chile, Dakar, and NYC) shows that the discrete-event decision process gives an accurate estimation of complex system dynamics due to its better integration of high-fidelity dynamics and human mobility data.
Authors: Fan Yang (aff1), Alina Vereshchaka (aff1), Bruno Lepri (aff2), Wen Dong (aff1, corresponding author)
Keywords: discrete-event decision process; particle filter algorithm; vehicle trajectories; transportation dynamics; machine learning
With 80% of newly sold vehicles in the U.S. able to communicate vehicle state through a telematics unit, and 57% of the population connected to the Internet by smartphones, data sets that track vehicles and people are increasingly available for city science and transportation researchers, computational social scientists, and policy researchers to study human mobility at large scale. Such data contain rich information, from how drivers plan their daily activities and trips at the microscopic level to how road networks respond to transportation demand at the macroscopic level. At the same time, the large volume of trajectory data allows high-fidelity and complex models of human mobility [1, 2, 3, 4, 5, 6, 7]. Trajectory data permit individual-level modeling of how people plan daily activities and trips [8, 9] through monitoring where they are, where they travel to, how they travel there, and why they travel there. Moreover, this kind of data enables large-scale traffic predictions using deep neural networks [14, 15, 16], graphical models, state space models, vector ARIMA (AutoRegressive Integrated Moving Average) models, and so on [20, 21]. However, the large volume of trajectory data generally consists of sparsely observed locations of people from a large population, and we often need detailed observations about a large number of individuals to explain society-level phenomena with individual-level mechanics and to infer how interventions may affect the evolution of the complex system. Thus, the inability to stitch together the isolated dots of observed locations with high-fidelity models seriously limits the usefulness of such human mobility data.
For example, estimating the travel time between two locations at a time point for which no trajectory exists in the data set requires us to synthesize trajectories from existing data, and controlling a city-scale road-transportation network to achieve optimal utility in real time requires us to project existing data into the future.
Our solution is to introduce a discrete event simulation model to combine the expressiveness of simulation modeling in specifying high-fidelity models of complex system dynamics with the analytic power of machine learning in making accurate predictions from noisy observations. Moreover, we apply a particle filter algorithm to turn the observed vehicle trajectories into predictions of traffic-flow dynamics and optimized driving strategies in a road network. The premise of introducing a discrete event simulation model for specifying road network dynamics is that complex system dynamics can be described by a set of microscopic events that individually bring only minimal changes but in sequence induce complex behaviors. Using a discrete event model, we specify traffic-flow dynamics in a road network with a set of stochastic events (such as a driver starting a trip, moving to the next road, and ending a trip), and we introduce a set of control variables to influence drivers’ choices in response to the environment. To connect the isolated dots of observed locations into continuous estimations of how people move and how traffic flows in the road network, we adopt the particle filter algorithm to regularly predict the next state of traffic in the form of samples (particles) according to the defined discrete-event simulation dynamics, and we select the most likely samples according to their likelihoods with respect to the noisy observations. To make near-optimal plans for drivers, we exploit the equivalence between the estimation of a log-utility lower bound and the variational inference of a latent state, and thus we reduce the control problem to an inference problem [25, 26].
In sum, this work contributes to research on integrating simulation modeling and big data through machine learning in the study of complex social network dynamics. This approach has not been explored, with the exception of our preliminary works [27, 26], because the intersection of the machine learning community and the simulation modeling community is presently very small. Nonetheless, this intersection is very powerful because it affords an intuitive interpretation of the information extracted from massive, noisy, unstructured data streams. Moreover, here we extend our previous works [27, 26] by placing the proposed discrete-event decision process model in the context of transportation research and by providing a detailed evaluation of our approach in making traffic-flow predictions and control decisions. Indeed, for transportation researchers our approach can not only simulate traffic jams during rush hours but also predict, from the trajectories of probe vehicles, whether today’s traffic jams will form earlier or last longer than usual, and help drivers to decide and plan how to use the road network more efficiently.
The remainder of this paper is organized as follows. In Section 2, we discuss other research efforts on making predictions and identifying optimal controls in the road transportation network from noisy observations. In Section 3, we introduce our discrete-event decision process model to capture the traffic-flow dynamics in a large road network, and we derive a particle filter algorithm to predict and optimize traffic flows from noisy observations. Section 4 reports the results of evaluating the performance of a discrete-event decision process and particle filter against other model-based and model-free methods on synthesized and real-world data sets. In Section 5 we discuss the implications and limitations of our work and draw some conclusions.
2 Related Work
Being able to track the trajectories of millions of vehicles through mobile phones and vehicle telematics has led to increasing interactions between machine learning researchers, computational social scientists, civil engineers and transportation researchers to model the traffic dynamics in road transportation networks, in order to achieve real-time traffic forecasting and to enable a more efficient usage of the network at both system and individual levels.
Traditionally, sensors at fixed locations in the road network (e.g. traffic cameras and inductive loop traffic detectors) are deployed to monitor traffic-stream parameters (i.e. speed, flow and density), and census surveys are used to estimate trip distribution (i.e. how people travel to perform different activities) according to human mobility models for traffic forecasting. By contrast, mobile phones and the Internet of Things provide a new way to sense our cities by logging the trajectories of millions of vehicles. These trajectories provide an unprecedented opportunity to estimate the home and work locations of city-wide populations, identify the purposes of trips and special events, model driver behaviors at fine resolution in the real world, and track the dynamics of road transportation networks at city scale and road resolution [30, 16].
Data such as the trajectories of millions of taxicabs, the geo-tagged posts on social media (e.g. Twitter), the trajectories and aggregated pickups and drop-offs from public transportation vehicles, the publicly accessible recordings from road-traffic sensors, and the documents from open government databases together provide a holistic view of human mobility and interactions at a large scale. At the same time, various model-based and model-free machine learning algorithms have been applied to predict road transportation dynamics from noisy observations. In this context, a model is a mechanism specified by a few rules or a computer program to simulate how drivers move and traffic flows evolve over time in a road network. Transportation researchers prefer model-based algorithms because the model provides an explanation for how the predictions are made from the noisy observations in terms of the underlying mechanism, and a window for them to edit the model starting from their experience in order to improve performance and calibrate model accountability. Model-free (or data-driven) algorithms learn to make predictions from training data based on the mathematical guarantees of learnability, but they can potentially hide the details of how the predictions are made. Traditional model-free algorithms to predict traffic include nearest neighbors, Support Vector Machines, ARIMA-type models based on reducing a time series to a wide-sense stationary process, fuzzy logic, shallow neural networks, and Gaussian processes. We refer the readers to two surveys [20, 21] on the large volume of historical research based on small data and traditional machine learning approaches.
Probabilistic Bayesian networks have also shown their potential to capture the probability dependencies of traffic-stream properties at times and locations detected by mobile phones, and to make traffic predictions at city scale and road-level resolution in a data-driven and model-free way. More recently, Deep Neural Network algorithms have shown great potential in predicting city-scale traffic from big training data that are derived from vehicle trajectories. For example, Zhang et al. have hypothesized a connection between the population distribution in a city and an image, and applied ResNet, a state-of-the-art Deep Convolutional Neural Network for image processing, to predict population inflow and outflow at different grid indexes. Lv et al. have modeled future traffic flows on freeway-network links as a complex non-linear function of the flows on these links over the previous hour, and they have trained a Deep Neural Network of three hidden layers to learn this function from big data. Yu et al. and Li et al. have modeled the convolutional structure using a road network graph instead of a regular two-dimensional grid system, and they have developed a Graph-Convolutional Recurrent Neural Network to achieve faster training speed with fewer training data. Ma et al. have drawn traffic-flow speed at different locations on a road and at different times as an image, and have subsequently trained a Convolutional Neural Network to make short-term forecasts of how this image evolves over time. Polson and Sokolov have instead trained a vanilla backpropagation Deep Neural Network to make short-term forecasts of traffic-flow speed with data from 21 loop detectors. Zhao et al. have trained a Long Short-Term Memory (LSTM) network to predict how the traffic data evolve over time from 500 observation states.
Considering the impressive success of graphical models and Deep Neural Networks, which are model-free, it is worth considering what traffic-flow dynamics these neural networks can learn from big vehicle-trajectory data and the extent to which these dynamics are learnable.
Transportation researchers indeed use generative models to explain the patterns in road traffic dynamics using simple rules governing the behavior of drivers and the evolution of traffic flows over time, and to conduct system-wide traffic evaluations through simulations typical of operations research. In terms of scale and resolution, those models range from driver-behavior models for explaining traffic-stream properties at the road level [37, 38] to multi-agent transportation simulators for road usage analysis and policy research at city scale [39, 40]. In terms of complexity, these models range from state-space models for data-driven traffic forecasting using machine learning algorithms such as the particle filter and the Kalman filter to transportation simulators that are generally too complex to work with machine learning algorithms. At the scale of a road network, Wang et al. have applied an extended Kalman filter to make short-term traffic-state predictions in a freeway network starting from real-time traffic measurements at various places and a stochastic macroscopic freeway network traffic-flow model, while van Hinsbergen et al. have designed a localized extended Kalman filter to improve the speed of the filter by updating traffic-state estimations with only the observations from nearby detectors. At the scale of a stretch of road, Van Lint and Hoogendoorn have applied an extended generalized Treiber-Helbing filter to fuse heterogeneous data from traffic detectors into traffic-state updates based on traffic flow theory. Taking a different route, Sun and Ban and Montanino and Punzo have developed optimization methods to reconstruct the trajectories of unobserved vehicles from those of observed vehicles. More recently, Xie et al. have applied a particle filter algorithm to construct vehicle trajectories to match observations about traffic density and travel speed from traffic detectors.
Hence, the abundance of vehicle trajectory data makes it possible to capture city-wide transportation dynamics at road-level resolution using high-fidelity driver behavior models (that are as complex as simulators), because big data allow the usage of complex models and trajectory data contain behavior information ranging from individual drivers at the microscopic level to a whole city at the macroscopic level. This trend parallels machine learning research, where many sampling-based algorithms are developed to make inferences about and learn implicit models — models so complex that an analytical specification doesn’t exist.
The state-of-the-art simulation and control theory approaches in transportation policy research are generally small-data approaches. Transportation simulators, such as TRANSIMS and MATSIM, take data in terms of a road network and an Origin-Destination (O-D) matrix that can both be derived from census surveys, then calibrate model parameters from traffic data, and finally find an optimal transportation policy for open-loop control through enumeration. Due to model complexity, simulation generally takes a long time to converge and it is difficult to account for real-time traffic control and trip plans according to non-recurrent traffic situations. In particular, the traditional control theory approach starts with specifying complex system dynamics as macroscopic state transitions and formulating the decision-making problem as a constrained optimization problem, and finds optimizing strategies through either convex optimization [50, 51] or data-driven reinforcement learning algorithms [52, 53]. Analytical approaches can provide faster and more robust solutions, but due to modeling costs and errors they are applicable only to scenarios with small state spaces or low-resolution interventions. The volume and richness of information in trajectory data and the recent success in applying reinforcement learning and Markov decision processes to solve complex problems point to the possibility of implementing a fine-grained closed-loop control of a road transportation network by applying real-time vehicle trajectories.
In sum, the abundance of vehicle trajectories provides big opportunities to forecast and control city-wide road transportation using microscopic details such as how individual drivers plan their daily trips and drive on the road. In our paper, we propose to use a discrete event model to bridge the high fidelity of a simulation approach with the abundance of trajectory data, and we develop particle filter algorithms to implement forecasting and control. It is worth highlighting that a discrete event model captures the interactions among the components of a complex system with a set of microscopic events that individually induce only minimal changes but together generate complex dynamics.
In this section, we introduce the discrete event model and the control variables for modeling decision making in a transportation system. Based on this model, we develop a particle-based algorithm to estimate the current state distribution from the observation history, and we derive an online planning algorithm to continually search for a near-optimal policy from the current state distribution based on expectation maximization. A background on dynamic Bayesian networks, state-space models and Markov decision processes is provided in Appendix A.
3.1 Discrete Event Model for Inference and Decision Making
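We introduce a discrete event model called the stochastic kinetic model to capture the dynamics of a complex social system driven by a set of events. A stochastic kinetic model is a biochemist’s way of describing the temporal evolution of a biological network with $M$ species driven by $V$ mutually independent events [54, 55], where the stochastic effects are particularly prevalent (e.g. a transcription network or a signal transduction network). Let $X^{(1)}, \dots, X^{(M)}$ denote the $M$ species in the network. An event (chemical reaction) $v$ is specified by a production

$$\alpha_v^{(1)} X^{(1)} + \cdots + \alpha_v^{(M)} X^{(M)} \xrightarrow{\;c_v\;} \beta_v^{(1)} X^{(1)} + \cdots + \beta_v^{(M)} X^{(M)}. \tag{1}$$

The production is interpreted as having rate constant $c_v$, the probability per unit time as time goes to 0. $\alpha_v^{(1)}$ individuals of species $X^{(1)}$, $\alpha_v^{(2)}$ individuals of species $X^{(2)}$, … , $\alpha_v^{(M)}$ individuals of species $X^{(M)}$ interact according to event $v$, resulting in their removal from the system. Similarly, $\beta_v^{(1)}$ individuals of species $X^{(1)}$, $\beta_v^{(2)}$ individuals of species $X^{(2)}$, … , $\beta_v^{(M)}$ individuals of species $X^{(M)}$ are introduced into the system. As such, event $v$ changes the populations by $\Delta_v = (\beta_v^{(1)} - \alpha_v^{(1)}, \dots, \beta_v^{(M)} - \alpha_v^{(M)})$. The species on the left side of the production are reactants, the species on the right are products, and the species with $\alpha_v^{(m)} = \beta_v^{(m)} > 0$ are catalysts.

At the system level, let $x_t = (x_t^{(1)}, \dots, x_t^{(M)})$ be the populations of the species in the system at time $t$. A stochastic kinetic process initially in state $x_0$ at time $t = 0$ can be simulated through the Gillespie algorithm by iteratively (1) sampling the time $\tau$ to the next event according to the exponential distribution $\tau \sim \mathrm{Exponential}(h_0(x_t))$, where $h_0(x_t) = \sum_{v=1}^{V} h_v(x_t, c_v)$ is the rate of all events and $h_v(x_t, c_v)$ is the rate of event $v$, (2) simulating the event $v$ according to the categorical distribution $v \sim \mathrm{Categorical}(h_1/h_0, \dots, h_V/h_0)$ conditional on event time $\tau$, and (3) updating the system time $t \leftarrow t + \tau$ and populations $x \leftarrow x + \Delta_v$, until the termination condition is satisfied. In this algorithm, event rate $h_v(x_t, c_v) = c_v\, g_v(x_t)$ is the rate constant $c_v$ multiplying a total of $g_v(x_t)$ different ways for individuals to interact in the system, assuming homogeneous populations. The exponential distribution is the maximum entropy distribution given the rate constant, and consequently it is most likely to occur in natural reactions. The stochastic kinetic model thus assigns a probabilistic measure to a sample path induced by a sequence of events $v_1, \dots, v_n$ happening between times $0$ and $T$, $0 < t_1 < \cdots < t_n \le T$, which is

$$P(v_{1:n}, t_{1:n}) = \prod_{i=1}^{n} h_{v_i}\big(x_{t_{i-1}}, c_{v_i}\big)\, \exp\Big(-\sum_{i=1}^{n+1} h_0\big(x_{t_{i-1}}\big)\,(t_i - t_{i-1})\Big),$$

with the conventions $t_0 = 0$ and $t_{n+1} = T$.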
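The three simulation steps described above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the function name `gillespie`, the dictionary-based event encoding, and the use of a binomial count for the number of ways reactants can interact are all assumptions made for the example (for events that consume a single individual, as in the vehicle-movement case, other mass-action conventions give the same rate).

```python
import math
import random

def gillespie(x0, events, t_end, rng):
    """Simulate one sample path of a stochastic kinetic model.

    x0     -- dict mapping species name to initial population
    events -- list of (rate_constant, reactants, products); reactants and
              products are dicts mapping species name to a count
    t_end  -- simulation stops once the next event would occur after t_end
    rng    -- a random.Random instance (passed in for reproducibility)
    """
    t, x = 0.0, dict(x0)
    path = [(t, dict(x))]
    while True:
        # Event rate = rate constant * number of ways to pick the reactants
        # (assumed mass-action form with homogeneous populations).
        rates = [c * math.prod(math.comb(x.get(s, 0), n)
                               for s, n in reactants.items())
                 for c, reactants, _ in events]
        total = sum(rates)
        if total == 0.0:
            break  # no event can fire any more
        t += rng.expovariate(total)  # (1) exponential time to next event
        if t > t_end:
            break
        # (2) choose which event fires, with probability rate / total
        _, reactants, products = rng.choices(events, weights=rates)[0]
        # (3) update the populations
        for s, n in reactants.items():
            x[s] -= n
        for s, n in products.items():
            x[s] = x.get(s, 0) + n
        path.append((t, dict(x)))
    return path
```

For instance, a toy commute with ten vehicles could be encoded as `events = [(0.5, {"home": 1}, {"link": 1}), (1.0, {"link": 1}, {"work": 1})]` and simulated with `gillespie({"home": 10, "link": 0, "work": 0}, events, 100.0, random.Random(0))`; the species names and rate constants here are purely illustrative.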
The stochastic kinetic model is one way to define a discrete event process, and its equivalents in other fields include the stochastic Petri net, the system dynamics model, the multi-agent model specified through a flow chart or a state chart, the Markov jump process [59, 60], the continuous time Bayesian network, and the production rule system. These equivalent models have the same power in capturing the dynamics of a system, but are different in their representations. As such, the stochastic kinetic model can also be used in other fields where there are equivalent models.
Computational social scientists often specify the complex dynamics of a social system with discrete event simulator software. To identify such a discrete event simulator as a Markov process and use the simulator to track real-world social systems with continued observations, we exploit the fact that all the discrete event simulators, to the best of our knowledge, have a way to dump the events happening in a simulation run. As such, we can reconstruct simulation runs according to the event sequences and so reconstruct the stochastic discrete event model from simulation runs outside the simulator. For example, rather than hacking through the 140 thousand lines of code for MATSIM, which is a state-of-the-art multi-agent transportation simulator, to make real-time inferences with real-world data, we can dump four events: (i) vehicle leaving a building, (ii) vehicle entering a link, (iii) vehicle leaving a link, and (iv) vehicle entering a building. From these four events, we can construct a state transition matrix to represent vehicle dynamics.
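As a rough illustration of how a transition matrix might be assembled from such an event dump, the sketch below counts origin-to-destination moves and normalizes them into probabilities. The tuple layout and the event labels (`"leave_building"`, `"enter_link"`, `"leave_link"`, `"enter_building"`) are hypothetical and do not correspond to any simulator's actual output format; the result is a time-homogeneous jump matrix, which is a simplification of the rate-based model described in the text.

```python
from collections import Counter, defaultdict

def transition_probabilities(event_log):
    """Estimate P(next place | current place) from dumped events.

    event_log -- iterable of (time, vehicle_id, event_type, place) tuples,
                 with the hypothetical event types "leave_building",
                 "enter_link", "leave_link" and "enter_building".
    """
    last_place = {}                # vehicle_id -> place it most recently left
    counts = defaultdict(Counter)  # origin -> Counter of destinations
    # Stable sort on time only, so simultaneous events keep their log order.
    for _, vehicle, event_type, place in sorted(event_log, key=lambda e: e[0]):
        if event_type in ("leave_building", "leave_link"):
            last_place[vehicle] = place
        elif event_type in ("enter_link", "enter_building"):
            origin = last_place.pop(vehicle, None)
            if origin is not None:
                counts[origin][place] += 1
    # Normalize counts row by row into transition probabilities.
    return {origin: {dest: n / sum(dests.values()) for dest, n in dests.items()}
            for origin, dests in counts.items()}
```

Keeping the matrix as a nested dictionary rather than a dense array reflects the fact that each link connects to only a handful of downstream links, so most entries would be zero.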
To define the decision-making process from noisy observations in a complex system with the discrete event model, we maintain a belief state to track the observation-action history and to represent our belief of the current system state, . We then introduce an action/control variable to influence the event-rate constants at time , , where the action or its distribution is determined by the belief state, or . To track complex system dynamics from observations of populations or individuals at discrete time steps and accordingly adjust the control variable, we approximate the continuous time process with a discrete time process on equally spaced time points , with a time interval small enough that the probability of more than one event happening in the interval is negligible. The state transition kernel from time to time is , where is a uniformization rate  satisfying , is the identity matrix, and is the infinitesimal generator defined by .
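The discretization step can be illustrated with the textbook uniformization construction, in which a one-step transition matrix is obtained as the identity plus the generator divided by the uniformization rate. The sketch below assumes this standard form and a small dense generator stored as nested lists; the function name `uniformize` and the three-state example in the usage note are illustrative only.

```python
def uniformize(Q, gamma=None):
    """Turn a generator matrix Q (rows sum to 0) into a one-step kernel.

    Returns P = I + Q / gamma, which is a valid stochastic matrix as long
    as gamma is at least the largest exit rate max_i(-Q[i][i]).
    """
    n = len(Q)
    max_exit = max(-Q[i][i] for i in range(n))
    if gamma is None:
        gamma = max_exit  # smallest admissible uniformization rate
    if gamma <= 0 or gamma < max_exit:
        raise ValueError("gamma must be positive and >= the largest exit rate")
    return [[(1.0 if i == j else 0.0) + Q[i][j] / gamma for j in range(n)]
            for i in range(n)]
```

With `Q = [[-0.5, 0.5, 0.0], [0.0, -1.0, 1.0], [0.0, 0.0, 0.0]]` (home, link, work) and `gamma=2.0`, every row of the result sums to one and the diagonal entries give the probability that nothing happens within one step. A larger `gamma` corresponds to a shorter time interval and hence a smaller chance of more than one event per step.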
The result is a discrete time decision process. Let be an event sequence, a state sequence (populations of species or states of individuals), an observation sequence about the system, and a control sequence. Our goal is to maximize the expected future reward of the discrete time process defined by the probability measure (Eq. 5) by iteratively setting from a belief state that summarizes the observation-control history , where is the observation model, is the state transition model, is the immediate reward at time , and the indicator function is 1 if the current state is and 0 otherwise.
Figure 7 shows how a discrete event model can efficiently represent the complex interactions of many components in a large system, which are often specified with simulation software. Panel (a) is a coupled hidden Markov model to represent the complex interaction dynamics. Its state space and state transition kernel grow exponentially with the number of state variables, which makes it unsuitable for capturing the complex interactions in a system. In a system with 50 interacting components and two states per component, the total number of states will be $2^{50} \approx 1.1 \times 10^{15}$, which is prohibitive. Panel (b) is a discrete event model to represent the complex interactions through a set of microscopic events that individually induce only minimal changes to the states of the interacting components in the system but in sequence generate complex interaction dynamics. Panel (c) shows a schema to control the complex interactions specified by a discrete event model by influencing the probabilities for the microscopic events with a control variable, which is in turn determined by our belief of the system that summarizes all past control inputs to the system and partial observations about the system. Here, we use solid lines and circles to indicate the control problem defined by the discrete event decision process, and dashed lines and circles to represent a solution to the control problem. As an illustration, panel (d) shows simplified dynamics of how agents travel throughout a day in a road network. Panel (e) is a stochastic Petri net representation of traffic dynamics, where dots represent agents, circles represent species, and rectangular boxes represent events. Panel (f) is the production system representation of the discrete event model, where each event is represented by a production.
In order to model road transportation dynamics with a discrete event model, we represent each vehicle as an agent, vehicles at different buildings and road links as different species, agent movement from one building/link to another as an event, and controls such as traffic light schedule and information about traffic conditions as actions that change the agent selection of the alternative downstream links and the agent plans to perform different activities. Specifically, we model road traffic dynamics through a single type of event, . A vehicle moves from link/building to link/building with rate , changing the location of the vehicle from to . Event rate is defined as the probability for the event to happen per unit time, as time goes to 0 (Eq. 1). We use "" to represent a bond: person binds to location before the event and binds to location after the event.
Let be the total number of vehicles in the system and the total number of observed vehicles. The observation model of observing probe vehicles at location conditioned on having vehicles in total is .
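The exact form of the observation model is not reproduced in this text, so the following is only one plausible reading: if each vehicle at a location is independently a probe vehicle with probability equal to the overall penetration rate (observed vehicles divided by total vehicles), the number of probes seen at that location is binomial. The function name `probe_likelihood` and the binomial assumption itself are introduced here for illustration.

```python
import math

def probe_likelihood(y, x, penetration):
    """P(observing y probe vehicles | x vehicles present at a location).

    Assumes binomial thinning: each of the x vehicles is independently a
    probe with probability `penetration` (a hypothetical modeling choice).
    """
    if not 0 <= y <= x:
        return 0.0  # cannot observe more probes than vehicles present
    return math.comb(x, y) * penetration**y * (1 - penetration)**(x - y)
```

A likelihood of this kind is what a particle filter would use to weight a hypothesized vehicle count at a link against the number of probe vehicles actually reported there.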
We then implement a multi-objective reward function. Each vehicle receives a penalty for every minute spent on the road. Moreover, each vehicle receives a reward for arriving at work at the expected time, and a penalty for arriving late or leaving early. Specifically, the reward is computed as the sum of facility utilities plus the travel utilities , with as the number of facility events. The facility utility is computed as , where denotes the useful work done at the target facility, represents the waiting time before the facility opens, is the late arrival penalty, and is the early departure penalty. Travel utility is computed as , where indicates a penalty for every unit time spent on roads, and is the penalty for every unit distance driving on roads .
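To make the structure of this reward concrete, the sketch below adds up facility and travel utilities using a simple linear form. The linear shape and every weight value are assumptions chosen for readability (the original formulas are not reproduced in this text, and transportation simulators often use nonlinear activity utilities), so the numbers should not be read as the parameters used in the evaluation.

```python
def facility_utility(work, wait, late, early,
                     w_work=1.0, w_wait=-0.5, w_late=-2.0, w_early=-2.0):
    """Utility of one facility visit; all arguments are durations in minutes.

    The weights are hypothetical: positive for useful work, negative for
    waiting before opening, arriving late, and leaving early.
    """
    return w_work * work + w_wait * wait + w_late * late + w_early * early

def travel_utility(minutes, km, w_time=-0.5, w_dist=-0.25):
    """Penalty for time and distance spent on the road (hypothetical weights)."""
    return w_time * minutes + w_dist * km

def total_reward(facility_visits, trips):
    """Sum of facility utilities plus travel utilities for one vehicle."""
    return (sum(facility_utility(*visit) for visit in facility_visits)
            + sum(travel_utility(*trip) for trip in trips))
```

For example, `total_reward([(480, 10, 5, 0)], [(30, 12)])` scores a day with one eight-hour work activity (ten minutes of waiting, five minutes late) and one 30-minute, 12 km trip.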
3.2 Tracking with Particle Filter
In this subsection, we derive a particle filter algorithm to track the current state of a discrete event process using past observations and actions, and we also derive particle smoothing and parameter learning algorithms to calibrate the parameters of this process. In the problem of controlling road network dynamics specified as a discrete event process, the observations are either the locations of the probe vehicles in multi-agent modeling or the observed probe-vehicle populations at different links and buildings in road-level modeling. The estimation of the current state (the belief state) is the probability distribution of vehicle locations in the system or the probability distribution of vehicle populations at road links and buildings.
The particle filter is a sequential Monte Carlo algorithm that approximates the probability measure of a stochastic process. It maintains a collection of particles for and to represent the likelihood of the latent state of a stochastic process at different regions of the state space with each particle representing a system state, given noisy and partial observations . denotes the particle index and the time index. Let for be a collection of particles and be a collection of particle indices. Inference with the particle filter involves tracking the evolution of a stochastic process by alternating between particle mutation and selection. In the mutation step, the particles at the next time step are sampled according to the transition kernel . In the selection step, the particle indices are resampled according to the observation likelihood . Parameter learning with a particle filter involves estimating the posterior distribution of the parameters from the collected particle trajectories .
Our goal is to track the dynamics of a discrete event process initially at state at time from past actions and observations up to the current time , in order to establish optimal control of the process. That is, our goal is to maintain an estimation of . To this end, we initialize particle positions and indices as and , and alternate between a prediction step and an updating step.
In the mutation step, we sample particle positions at time from the particles at time . Specifically, we sample event according to how likely it is that different events will occur conditioned on system state for and action (Eq. 7), and update accordingly (Eq. 8). Because the resampled particles are distributed according to , the sampled particles are distributed according to .
The likelihood of particles are with respect to the observation . To avoid particle degeneracy, we perform a particle-selection step to eliminate particles with low likelihood and duplicate particles with high likelihood (Eq.9). After the particle-selection step, all particles are distributed according to , and all have the same likelihood.
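The alternation between mutation and selection can be sketched as a generic bootstrap particle filter step. This is a simplified stand-in rather than the algorithm derived in the paper: the transition sampler `mutate` and the observation model `likelihood` are supplied by the caller, multinomial resampling is assumed, and all names are hypothetical.

```python
import random

def particle_filter_step(particles, mutate, likelihood, observation, rng):
    """One mutation + selection step of a bootstrap particle filter.

    particles   -- list of states at the previous time step
    mutate      -- function (state, rng) -> new state, sampling the transition
                   kernel; it should return a new object, not modify in place
    likelihood  -- function (observation, state) -> non-negative weight
    Returns the resampled particles and the indices of their parents.
    """
    # Mutation: propagate every particle through the transition kernel.
    proposed = [mutate(x, rng) for x in particles]
    # Weight each proposed particle by how well it explains the observation.
    weights = [likelihood(observation, x) for x in proposed]
    if sum(weights) == 0.0:
        raise ValueError("all particles have zero likelihood")
    # Selection: resample indices in proportion to the weights, so unlikely
    # particles are dropped and likely ones are duplicated.
    indices = rng.choices(range(len(proposed)), weights=weights, k=len(proposed))
    return [proposed[i] for i in indices], indices
```

Returning the parent indices alongside the particles is what later allows particle trajectories to be traced back through time. In practice the weights would usually be handled in log space to avoid underflow when many observations are combined.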
To derive a particle trajectory from the posterior distribution of a stochastic kinetic process with respect to observations, we trace back the events that lead to the particles for :
The particles form an approximation of the forward probability and likelihood . The ancestral lines of the particles , where , form an approximation of the posterior distribution of the stochastic process conditioned on observations, where is an indicator function:
To calibrate the parameters in , we maximize the empirical estimation of log evidence with gradient ascent:
Overall, we develop a particle-based algorithm to update the belief state and calibrate the parameters of a discrete event decision process (Algorithm 1).
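The trace-back of ancestral lines used for smoothing can be illustrated as follows. The data layout is an assumption made for the example: `positions[t][k]` holds the state of particle `k` at step `t`, and `parents[t][k]` holds the index at step `t - 1` of the particle it descended from (with `parents[0]` unused).

```python
def ancestral_lines(positions, parents):
    """Reconstruct one full trajectory per final particle.

    positions -- list over time steps of lists of particle states
    parents   -- list over time steps of lists of parent indices;
                 parents[t][k] points into positions[t - 1]
    """
    last = len(positions) - 1
    lines = []
    for k in range(len(positions[last])):
        line, a = [], k
        # Follow the parent pointers from the final step back to step 1.
        for t in range(last, 0, -1):
            line.append(positions[t][a])
            a = parents[t][a]
        line.append(positions[0][a])  # the initial ancestor
        lines.append(line[::-1])      # reverse into chronological order
    return lines
```

Averaging an indicator of interest over the returned lines gives a Monte Carlo estimate of the corresponding posterior probability. Because resampling repeatedly discards lineages, early parts of the trajectories tend to collapse onto a few ancestors, which is a known limitation of this simple scheme.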
3.3 Optimal Control with Particle Filter
In this subsection, we derive a particle-based algorithm to identify the optimal control of a complex system from our estimation of the current system state (belief state), using the equivalence between the state-value function of a Markov decision process and the probability of receiving the reward from a mixture of finite-time Markov decision processes . This equivalence enables the translation of the policy-evaluation and policy-improvement steps in a policy iteration algorithm into the expectation and maximization steps in an expectation maximization (EM) algorithm, and the application of a large variety of approximate inference algorithms for dynamic Bayesian networks to solve intractable optimal control problems. In particular, it is based on the following derivation:
Eq. 12 connects the expected future reward of a Markov decision process and the probability of receiving a binary reward in a mixture of finite time Markov decision processes . This finite time Markov decision process executes the same plan as the original Markov decision process up to a terminal time , it generates a state-action trajectory , and it receives a binary reward with probability . In Eq. 12, is a discount factor. Corresponding to the expected discounted cumulative future reward with , we select and . Corresponding to the expected finite-horizon future reward with , we select and , where indicator function when and otherwise, and when and otherwise.
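For the discounted case, the connection can be checked numerically on a small example. The sketch below assumes the standard construction in which the terminal time of the finite-time process has a geometric prior, $P(T) = (1-\gamma)\gamma^{T}$, and the binary reward is received with probability equal to the (rescaled) reward in $[0, 1]$ at the terminal state; under that assumption the discounted value equals the mixture's reward probability divided by $1-\gamma$. The fixed-policy chain and all function names are illustrative.

```python
def expected_rewards(P, r, start, horizon):
    """E[r(x_t)] for t = 0..horizon under transition matrix P from `start`."""
    n = len(P)
    dist = [1.0 if s == start else 0.0 for s in range(n)]
    out = []
    for _ in range(horizon + 1):
        out.append(sum(d * ri for d, ri in zip(dist, r)))
        # Push the state distribution one step forward.
        dist = [sum(dist[i] * P[i][j] for i in range(n)) for j in range(n)]
    return out

def discounted_value(P, r, gamma, sweeps=2000):
    """Solve V = r + gamma * P V by fixed-point iteration."""
    n = len(P)
    V = [0.0] * n
    for _ in range(sweeps):
        V = [r[i] + gamma * sum(P[i][j] * V[j] for j in range(n))
             for i in range(n)]
    return V

def mixture_reward_probability(P, r, gamma, start, horizon=2000):
    """P(binary reward = 1) when T has prior (1 - gamma) * gamma**T and the
    reward is received with probability r[x_T] (rewards assumed in [0, 1])."""
    er = expected_rewards(P, r, start, horizon)
    return sum((1 - gamma) * gamma**T * er[T] for T in range(horizon + 1))
```

With `P = [[0.9, 0.1], [0.2, 0.8]]`, `r = [0.0, 1.0]` and `gamma = 0.9`, the value `discounted_value(P, r, 0.9)[0]` agrees with `mixture_reward_probability(P, r, 0.9, 0) / (1 - 0.9)` up to truncation and rounding error, which is the sense in which maximizing the likelihood of reward in the mixture maximizes the expected return.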
To identify optimal control with the EM algorithm in a discrete event decision process, we maximize the expected log likelihood by alternately identifying the typical state-action sequences generated by a policy that leads to reward () in the expectation step (E-step) and tuning the control parameters of the policy () so that these typical sequences lead to reward with higher probabilities in the maximization step (M-step). The EM algorithm is an iterative algorithm that searches for the parameters to maximize the expected log likelihood over the posterior probability distribution of the latent variables conditional on the observations. Here, the likelihood is proportional to the value function, the latent variables are a sequence of states and actions, and the observations are of whether a reward is received.
In E-step, we use importance sampling to approximate the proxy of future expected reward and the posterior probability induced in Eq. 12. Specifically, we sample and for , we approximate the prior distribution with sample distribution , and use importance weight to approximate the posterior distribution, where is an indicator function.
The posterior probability (Eq. 15) is the fraction of expected future discounted reward received from over the total expected future discounted reward received after , averaged over sample paths .
In the M-step, we iteratively maximize the expected log likelihood of receiving a reward. The optimal control is consequently set such that actions appear in proportion to the future rewards.
To summarize, we develop Algorithm 2 to control a complex system from a discrete event model and noisy observations.
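The alternation between the two steps can be illustrated on a tiny tabular problem. This is a minimal sketch under stated assumptions, not the paper's Algorithm 2: the three-state chain, the 0.9 success probability, and the use of plain rejection sampling (an indicator weight on whether the binary reward was received) are all hypothetical choices made for illustration.

```python
import random

N_STATES, N_ACTIONS, GOAL = 3, 2, 2
GAMMA = 0.8


def step(state, action, rng):
    """Toy dynamics: action 1 moves one state forward with probability 0.9."""
    if action == 1 and state < GOAL and rng.random() < 0.9:
        return state + 1
    return state


def reward_prob(state, action):
    """Probability of the binary reward; here it depends on the state only."""
    return 1.0 if state == GOAL else 0.0


def em_iteration(policy, n_samples, rng):
    """One EM step, with an indicator weight on rewarded trajectories."""
    counts = [[1e-3] * N_ACTIONS for _ in range(N_STATES)]  # light smoothing
    for _ in range(n_samples):
        # Sample the terminal time T with P(T) = (1 - GAMMA) * GAMMA**T.
        horizon = 0
        while rng.random() < GAMMA:
            horizon += 1
        state, visited = 0, []
        for _ in range(horizon + 1):
            action = rng.choices(range(N_ACTIONS), weights=policy[state])[0]
            visited.append((state, action))
            last = (state, action)
            state = step(state, action, rng)
        # E-step: keep the trajectory only if the binary reward is received.
        if rng.random() < reward_prob(*last):
            for s, a in visited:
                counts[s][a] += 1.0
    # M-step: each action's probability is its share of rewarded usage.
    return [[c / sum(row) for c in row] for row in counts]


rng = random.Random(0)
policy = [[0.5, 0.5] for _ in range(N_STATES)]
for _ in range(10):
    policy = em_iteration(policy, 2000, rng)
```

After a few iterations the probability of the forward action in states 0 and 1 approaches one, since trajectories that waste steps are less likely to reach the goal before the sampled terminal time.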
4 Tracking and Planning in City-Scale Transportation Networks
Here, we benchmark our framework with other state-of-the-art algorithms on the problem of tracking and optimizing the travel plans for self-interested drivers in a city-scale transportation network starting from noisy observations of network dynamics.
4.1 Data Description
We evaluate the performance of our framework on five datasets of human mobility: (1) SynthTown, (2) Berlin, (3) Santiago de Chile, (4) Dakar, and (5) NYC Taxicab (see Table 1).
The SynthTown dataset comprises a synthesized network of one home location, one work location, and 23 single-direction road links to characterize the trips of 2000 synthesized inhabitants going to work in the morning and returning home in the afternoon. The graphical illustration is shown in Fig. (a)a. More specifically, the prediction problem is to estimate the vehicle counts at home, at work, and at links 1-23 at the present time, 10 minutes later, and 60 minutes later from observations of the 200 “probe” inhabitants collected at link 1 and link 20. These 200 “probe” inhabitants volunteer to share their locations every minute. Simple as it seems, this problem requires a statistical inference algorithm to “understand” several concepts in order to achieve successful tracking and forecasting. For example, the algorithm should successively add the estimated vehicle count at link 1 to home and subtract the estimated vehicle count at link 20 from work. In addition, the estimated vehicle counts at links 21-23 should sequentially follow the estimated vehicle count at link 20 and be followed by the estimated vehicle count at link 1.
The Berlin dataset comprises a network of 11 thousand nodes and 24 thousand single-direction car-only links derived from OpenStreetMap, and the trips of 9 thousand synthesized vehicles representing the travel behaviors of three million inhabitants. To make the problem small enough that algorithms with higher time complexity can run and be compared with our algorithm, we aggregate the 24 thousand road links into 1539 clusters with a walk-trap algorithm. The daily trips in the Berlin dataset are synthesized from (1) the commuter data provided by the German Federal Employment Agency containing the home and workplace municipalities of the working population subject to social insurance contributions, (2) an activity-based demand model (Comprehensive Econometric Microsimulator for Daily Activity-Travel Patterns, or CEMDAP) to sample a sequence of activities (home, work, school, shop, restaurant, etc.) and the corresponding travels that each individual from the synthesized population takes throughout a day, (3) physical simulation (through Multi-Agent Transport Simulation, or MATSim) to repeatedly modify the sampled activity-travel sequences to match the capacity of the transportation network, and (4) Bayesian sampling (through Calibration of dynamic traffic simulations, or Cadyts) to match the daily activity-travel sequences from the previous step with over 8 thousand hourly traffic count values from over 300 count stations. The synthesized daily trips have been validated based on extensive, regularly-conducted travel surveys and constitute a quality representation of road transport demand. This data set is the outcome of a generalizable approach to synthesize individual-level behaviorally-sound trip diaries from easily accessible input data, since collecting the trip diaries of real-world people is plagued with privacy issues.
The Santiago de Chile dataset comprises a network of 23 thousand nodes and 38 thousand single-direction car-only links derived from OpenStreetMap, and the trips of 665 thousand synthesized vehicles representing the travel behaviors of six million inhabitants in car, walking and public transportation modes. The daily trips in the Santiago de Chile dataset were initialized by cloning the sequences of activities (starting time and duration of home, work, school, shopping, leisure, visit and health) and travel modes of 60 thousand individuals (from 18 thousand households) from publicly-accessible travel diaries, and modified through physical simulation and a co-evolutionary algorithm (with MATSim) to maximize the overall utility of the system. The resulting daily trips are compatible with the distributions of travel modes and the observed traffic counts at count stations. This data set represents the case where we can get travel diaries with fine temporal and spatial resolution for a significant and representative fraction of a population from publicly-accessible travel diaries.
The Dakar dataset comprises a network of 8 thousand single-direction road links derived from OpenStreetMap and 12 thousand real-world vehicle trips derived from the “Data for Development (D4D)” data sets based on the Call Detail Records (CDR) of over 9 million Sonatel customers in Senegal (out of 15 million total population) through year 2013. A Call Detail Record is a data record produced by a telecommunication device that details a telecommunication transaction that goes through the device, including the calling and called phone numbers, the identification of the telecom devices that contain information about the calling and called locations, the starting time and duration of the call, the type of the call (voice, SMS, Internet access), etc. The record is critical for telecom service providers to generate phone bills, and sees various applications in academic research. The D4D-Senegal data sets contain hourly site-to-site voice/SMS traffic among 1666 sites (data set 1), mobility of 300 thousand users randomly sampled every 2 weeks at site level (data set 2), and the mobility of 150 thousand randomly-sampled users for one year at the level of 123 arrondissements (data set 3), where a site is a fine-resolution geographic area designed to balance the utility for scientific research and privacy. From data set 2, we identify the home and work/school locations of each user as randomly picked locations from the most frequently visited sites during 7pm - 7am and 7am - 7pm, respectively. Then, we sample an activity-trip sequence for each user to match her/his sequence of mobility records (in data set 2) from a Markov chain model describing how s/he performed various activities (home, work, school, shopping, etc). This data set represents the case where we can get travel diaries with fine temporal and spatial resolution for a significant and representative fraction of a population through mobile phones.
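The home and work/school labeling step can be sketched as follows. This is a simplified illustration, assuming that home is the site where a user is seen most often at night (7pm - 7am) and work/school the site seen most often during the day (7am - 7pm); the record layout `(hour_of_day, site_id)` and the function name are hypothetical, and the sketch picks the single most frequent site rather than sampling among frequent sites.

```python
from collections import Counter


def infer_home_and_work(records):
    """Label one user's home and work/school sites from mobility records.

    records: iterable of (hour_of_day, site_id) pairs, hour_of_day in 0..23.
    Returns (home_site, work_site); an entry is None when there are no
    records in the corresponding time window.
    """
    day, night = Counter(), Counter()
    for hour, site in records:
        # 7am - 7pm counts as daytime, everything else as night.
        (day if 7 <= hour < 19 else night)[site] += 1
    home = night.most_common(1)[0][0] if night else None
    work = day.most_common(1)[0][0] if day else None
    return home, work


# Hypothetical records for one user.
records = [(8, "A"), (9, "A"), (14, "B"), (22, "C"), (23, "C"), (2, "D")]
home, work = infer_home_and_work(records)
```

With these records the user's night-time sightings concentrate at site "C" and the daytime sightings at site "A", which become the home and work/school labels.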
The NYC Taxicab dataset (http://nyc.gov/tlcopendata) comprises a network of 7 thousand nodes and 11 thousand single-direction road links derived from OpenStreetMap and an average of 1 million daily trips of taxicabs and for-hire vehicles (including Uber, Lyft, Via and Juno) throughout 2018. Each trip record contains pick-up and drop-off zones among the 236 zones in New York City, and pick-up and drop-off date and time, among other information. The trip records are made publicly accessible by the New York City Taxi and Limousine Commission (TLC, an agency responsible for licensing and regulating New York City’s taxi cabs, for-hire vehicles, commuter vans, and paratransit vehicles). Together with many other open data sets available through the City’s Open Data portal (https://opendata.cityofnewyork.us/), TLC’s trip data has a big impact in making the city street smart. Here, we use the data to predict the behavior of all taxicabs and for-hire vehicles from observing a small fraction of them.
4.2 Tracking Transportation Dynamics
Benchmark algorithms: We first benchmark our framework, a stochastic kinetic model with particle filter (PFSKM), against a Deep Neural Network (DNN), a Recurrent Neural Network (RNN), and an extended Kalman filter (EKF) in the task of continuously tracking the current and future traffic conditions. DNN represents the power of a general-purpose non-parametric model that does not involve a problem-specific structure. We build a five-layer Deep Neural Network: (i) an input layer accepting the observation history of “probe” vehicles at selected locations, (ii) three hidden layers, and (iii) one output layer generating the inferred distribution of all vehicles at all locations. RNN exploits the temporal structure: each cell recursively takes the inferred result from the previous cell as well as the current observations as input, and outputs the estimated vehicle distribution. Both DNN and RNN are trained with 30 days of synthesized mobility data from MATSim until their performance no longer improves. The EKF assumes a Gaussian distribution between the time-indexed latent states, and we implement a standard EKF procedure that alternates between prediction and update steps.
Evaluation metric: We use two metrics to evaluate the performance of our model: coefficient of determination ($R^2$) and mean squared error (MSE). We use $R^2$ to evaluate the goodness of fit between a time series of the estimated vehicle counts at a location and the ground truth. Let $\hat{y}_t$ be the estimated vehicle count at time $t$, $y_t$ the ground truth and $\bar{y}$ the average of $y_t$. We define $R^2 = 1 - \sum_t (y_t - \hat{y}_t)^2 / \sum_t (y_t - \bar{y})^2$. A higher $R^2$ indicates a better fit between the estimated time series and the ground truth, with $R^2 = 1$ indicating a perfect fit and $R^2 < 0$ a fit worse than using the average. We use MSE to measure the average squared difference between the estimated vehicle counts at all locations at a time and the ground truth. A lower MSE represents a more precise prediction. Let $\hat{y}_i$ be the estimated vehicle count at location $i$, $y_i$ the ground truth, and $N$ the number of locations. We define $\mathrm{MSE} = \frac{1}{N} \sum_{i=1}^{N} (\hat{y}_i - y_i)^2$.
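Both metrics are short to compute. The sketch below uses the textbook definitions of the coefficient of determination and the mean squared error; the function names and the example counts are illustrative assumptions.

```python
def r_squared(estimated, truth):
    """Coefficient of determination of an estimated time series.

    Equals 1 for a perfect fit, 0 when the estimate is no better than the
    mean of the ground truth, and is negative when it is worse.
    """
    mean = sum(truth) / len(truth)
    ss_res = sum((y - e) ** 2 for e, y in zip(estimated, truth))
    ss_tot = sum((y - mean) ** 2 for y in truth)
    return 1.0 - ss_res / ss_tot


def mean_squared_error(estimated, truth):
    """Average squared difference between estimates and ground truth."""
    return sum((e - y) ** 2 for e, y in zip(estimated, truth)) / len(truth)


# Hypothetical vehicle counts at one location over three time steps.
truth = [10.0, 20.0, 30.0]
perfect = r_squared(truth, truth)
mean_only = r_squared([20.0, 20.0, 20.0], truth)
reversed_fit = r_squared([30.0, 20.0, 10.0], truth)
```

The three example values show the range of the metric: a perfect estimate, an estimate equal to the mean, and an estimate that is worse than the mean.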
Result visualization: A visualization of the inferred results of applying the particle filter to the Dakar dataset is shown in Figure 16. We compare the distribution of vehicles in the ground truth, the “probe” vehicles and the estimation of our model on each road in the morning (6am), at noon (12pm), in the evening (6pm) and at night (12am) of a given day. The figures in the top panels show the ground truth with black dots, and the figures in the bottom panels indicate the posterior distribution of vehicles (estimation results) with red dots. The probe vehicles are represented with green dots. It can be observed from the figures that the vehicle densities of the ground truth and of the estimation agree with each other, and both are proportional to the density of “probe” vehicles. As such, we use the particle filter to simulate all the vehicle-moving events in the system and then we select the most likely events with respect to the observed “probe” vehicles. As a result, by correctly tracking the vehicles, we can not only predict the vehicle-moving events but also explain how simulated vehicles move in accordance with “probe” vehicles.
Evaluation results: Figure 23 summarizes the MSE and $R^2$ performance statistics of the four models for the vehicle tracking task, i.e. estimating the numbers of vehicles up to now, with short-term prediction (10 minutes) and with long-term prediction (1 hour) on all the datasets. The Dakar dataset is too large for DNN, RNN and EKF, which indicates the better scalability of PFSKM. PFSKM has the lowest MSE across different times of a day, followed by DNN, EKF, and RNN in order (top row, lower is better); PFSKM has the highest $R^2$ across different locations, followed by DNN, EKF, and RNN (bottom row, higher is better). Firstly, PFSKM outperforms RNN and DNN because it can explicitly leverage the problem-specific structure such as the road topology. Secondly, PFSKM outperforms EKF because it can work with arbitrary probability distributions, whereas a Gaussian assumption is sometimes a poor approximation in real-world applications. This comparison also points to new developments of neural network architectures that are either regularized by event-based structures of a complex system or can learn such structures explicitly.
Comparing detailed predictions using SynthTown data: Fig. 26 shows how PFSKM, DNN, RNN and EKF predict the numbers of vehicles at different locations of SynthTown one hour ahead of time throughout a day from observations of probe vehicles (10% of the total) at link 1 and link 20 only. The x-axis indicates the hours of a day, the y-axis shows the numbers of vehicles at different locations — home, work and road segments marked on the left, and the ground truth (GT) serves as the frame of reference.
All four algorithms perform well, indicating that they all capture the structure in the dynamics. In fact, there is little uncertainty about the traffic dynamics at SynthTown if the numbers of vehicles on links 1 and 20 can be monitored, albeit with noise. RNN underperforms the other three algorithms because learning the structure of a dynamical system requires a huge training data set. The PFSKM estimation agrees with GT and is better than the DNN and RNN estimations, because PFSKM explicitly leverages the problem-specific structure, i.e., road topology, while DNN and RNN need to learn it implicitly and gradually. The PFSKM estimation is better than the EKF estimation, because PFSKM can work with arbitrary probability distributions while EKF assumes Gaussianity. EKF and DNN agree well with GT at locations with a lot of people (home and work), and less well at locations with few people. This shows that PFSKM adapts better to dynamic changes.
4.3 Optimal Control in Transportation Dynamics
Benchmark algorithms: In the previous section we demonstrate the tracking capability of our framework with particle filter. Now, we benchmark our framework against (i) a baseline algorithm, (ii) a co-evolutionary algorithm, and (iii) a neural network policy gradient algorithm in the task of improving driving strategies to maximize the expected future reward. The baseline algorithm (Baseline) optimizes agents’ expected future rewards without considering the current traffic situation and the plans of the other agents. The co-evolutionary algorithm (CoEA) is the state of the art algorithm to generate equilibrium daily activities and trips in transportation theory . In CoEA, agents independently explore and exploit their plans through a genetic operator, then jointly execute and evaluate their plans in a simulator, and finally repeat this process until an equilibrium is reached . The neural network policy gradient algorithm (NNPG) is an approximate planning algorithm that maximizes where is the neural network output, and is a training example of input, action and value . The benchmark neural network has four layers. The input layer receives the current time and the minute-by-minute “probe” vehicle counts in selected locations at specific times in the past hour, and it feeds these values into the hidden layers.
Evaluation metric: We use three metrics to evaluate different planning algorithms. The first one is average trip time in minutes of all vehicles driving from home to work: a lower average trip time means better traffic. The second one is on-time arriving ratio that measures the percentage of people arriving to work on time. Finally, we use expected reward per vehicle per hour, where higher expected rewards mean better individual plans and a more efficient transportation network.
Comparing detailed behaviors on SynthTown data: Figure (b)b shows the vehicle counts at the different locations of SynthTown throughout a day after executing different planning algorithms from the observations of “probe” vehicles (10% of the total) at link 1 and link 20 only. The x-axis indicates the hours of a day, the y-axis shows the numbers of vehicles at different locations (i.e. home, work and road segments marked on the left), and the baseline (Baseline) serves as the frame of reference. It can be observed that vehicles applying the policy from our framework best satisfy the requirements of all individuals. Indeed, at 9 am our framework has the highest number of people arriving at work on time, followed by NNPG, and then CoEA and Baseline. Then, at 5 pm our framework has the highest number of people arriving back at home, while under the other three policies most of the people are either still at work or congested on roads. Finally, reading the figure horizontally, the people in our framework spend the least amount of time on roads (links 1, 6, 15, 20, 21, 22 and 23), and most of the time doing useful activities at facilities (home and work).
| Dataset | Models | Average trip time | On-time arriving ratio | Expected reward |
Comparing summary performance metrics: Table 2 compares the average trip time, the on-time arriving ratio and the average unit reward statistics of the four models using the SynthTown, the Berlin, the Dakar, and the Santiago de Chile datasets. The Berlin, the Dakar and the Santiago de Chile datasets are too large for the NNPG model to run, which indicates the better scalability of our framework and of CoEA. This comparison leads us to the same conclusion as the detailed comparison on the SynthTown data. Specifically, our framework has the lowest average trip time, the highest on-time arriving ratio and the highest expected reward in all datasets. Hence, our framework outperforms NNPG because it has better scalability. Moreover, our framework outperforms CoEA because CoEA uses an offline planning algorithm, while our framework achieves online planning, which can change the policy dynamically according to different observations at different times.
5 Discussion and Conclusions
Every minute, hundreds of millions of people are leaving behind digital breadcrumbs that mark their movements. Thus, connecting these dots of isolated observations may reveal the big picture of how real-world interactions among individuals generate society-level properties. This may enable traditional and computational social scientists to establish first principles, and let policy researchers experiment with individual-based interventions — essentially turning our real world into a living lab. However, there are at least the following challenges in stitching together these isolated observations with a high-fidelity model of complex interaction dynamics. First of all, existing machine learning approaches either demand significant expertise and time to model the diverse, complex and evolving dynamics in social and interaction networks, or demand a sufficient amount of training data at individual level with high spatial and temporal resolution to train a general-purpose model such as deep neural networks. Neither expert time nor sufficient amount of data is generally available. Second, we often need to make fine individual-level observations in order to answer many research and business questions, and such observations are generally not available in the digital breadcrumbs. For example, we often need to know how individuals belonging to a specific social or economic class travel and interact with one another. However, the dots of our isolated observations are often too noisy. Third, many existing machine learning and signal processing algorithms provide no insights on how predictions are made and optimal policies are selected. Hence, is there a modeling framework that can easily and flexibly specify the diverse and complex dynamics in social systems for machine learning? In this way, anyone who has insights into a social system can specify and tune its dynamics and subsequently apply machine learning algorithms.
In this paper, we have combined the discrete-event model and the particle filter algorithm to continuously track the traffic dynamics and to optimize the driving strategies. In order to make the inferences and the planning tractable, we have adopted the discrete-event simulation model to represent complex dynamics as a sequence of simple events at the individual level. It is worth noting that each of these events makes minimal changes to a few individuals, but together they induce complex dynamics at the system level. Based on the discrete event model, we have derived a particle filtering algorithm that makes individual-level inferences by alternately simulating the various ways the events change individuals’ states in infinitesimal time steps and selecting the ways that are most compatible with the noisy observations. Finally, to achieve optimal control in the complex system, we have formulated a partially observable Markov decision process problem and reduced it into a learning and inference problem. The large-scale experiments we conducted show that our proposed framework can accurately track city-scale traffic dynamics and can effectively improve the driving plans. Moreover, our method outperforms existing tracking and optimal control algorithms in the machine learning community.
Our paper points to a new way of integrating simulation modeling, machine learning, and transportation research by turning the real world dynamics into a living lab, where we can not only predict behaviors and optimize policies, but also tell why. While it has not been widely explored, this approach is nevertheless powerful because it affords an intuitive interpretation of the information extracted from massive, noisy, unstructured data streams. For example, our approach can not only simulate traffic jams during rush hour but also predict from the trajectories of probe vehicles whether today’s traffic jams will form earlier or last longer than usual, and help drivers to use the road network more efficiently. This integration parallels the trend of developing explainable and increasingly complex models fueled by larger amounts of data.
Appendix A Background
In this section, we provide a brief background of the models used to represent and capture the dynamics of complex systems. These models are the dynamic Bayesian network and the state-space model used to capture the dynamics of a complex system, and the Markov decision process and the partially observable Markov decision process used to specify the decision-making problem.
A.1 Dynamic Bayesian Network and State-Space Model
The dynamic Bayesian network (DBN) and the state-space model (SSM) are two generative models used to describe complex system dynamics. A DBN describes the dynamics of a system by specifying the probability dependency of the value of state variables at the current time step conditioned on the value at the previous step. Let denote the value of state variables, and the value of observation variables in a system with state variables at time step . The probability of a sampled trajectory can be factorized as , where is the state transition model, the observation model, and a collection of parameters. A SSM captures the evolution of system states through a set of first-order differences or differential equations. We use and to specify non-linear dynamics, and and to specify linear dynamics, where and define the state evolution dynamics, and define the observation structure, and and are system noise and observation noise, respectively.
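For readers who want the two model families in symbols, one common way to write them is given below. The notation is an assumption made for illustration ($x_t$ for states, $y_t$ for observations, $\theta$ for parameters, $w_t$ and $v_t$ for system and observation noise) and is not necessarily the notation of the original equations.

```latex
\begin{align*}
\text{DBN:} \quad
  & P(x_{1:T}, y_{1:T}; \theta)
    = \prod_{t=1}^{T} P(x_t \mid x_{t-1}; \theta)\, P(y_t \mid x_t; \theta), \\
\text{non-linear SSM:} \quad
  & x_t = f(x_{t-1}, w_t), \qquad y_t = g(x_t, v_t), \\
\text{linear SSM:} \quad
  & x_t = A\, x_{t-1} + w_t, \qquad y_t = C\, x_t + v_t .
\end{align*}
```

Here $P(x_1 \mid x_0; \theta)$ stands for the initial state distribution, $f$ and $A$ define the state evolution, and $g$ and $C$ define the observation structure.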
In order to make exact inferences with a DBN and SSM, the classic forward-backward algorithm sweeps a forward/filtering pass to compute the forward statistics and a backward/smoothing pass to estimate the backward statistics . Then, it can estimate the one-slice statistics and the two-slice statistics .
Both the DBN and SSM have difficulties in capturing the dynamics of a complex transportation system, as the dynamics of a transportation system do not obey simple first-order differential equations and cannot be expressed with a tractable joint transition kernel. The discrete event model we describe in this paper captures the intractable dynamics of complex state transitions through a set of tractable events, and therefore can be used to depict the dynamics of a transportation network.
A.2 Markov Decision Process
A Markov decision process (MDP) is a framework for modeling decision making in situations where outcomes are partly random and partly under the control of a decision maker. Formally, an MDP is defined as a tuple , where represents the state space and the state at time , the action space and the action taken at time , the transition kernel of states such as , the reward function such as that evaluates the immediate reward of each state-action pair, and the discount factor. Let us further define a policy as a mapping from a state to an action or a distribution of it parameterized by —that is, or . Solving an MDP involves finding the optimal policy or equivalently its associated parameter to maximize the expected future reward: , where is a state-action trajectory with probability or .
A partially observable Markov decision process (POMDP) models a decision making where the system dynamics evolves according to an MDP and the decision maker cannot directly observe the underlying state. A POMDP is defined as a tuple , where and have the same definitions as in an MDP, is the observation space, is the observation received at time t, and is the observation probability such as . Solving a POMDP involves maintaining a belief state as the estimated probability distribution of the hidden state conditioned on all past observations according to , then finding the optimal policy or equivalently its associated parameter to maximize the expected future reward (), where is a mapping from a belief state to an action or a distribution of it parameterized by , or .
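The belief-state recursion for a finite POMDP is a short Bayes update. The sketch below assumes discrete states, a transition table `transition[a][s][s2]`, and an observation table `obs_prob[s2][o]` that depends only on the new state; the two-state traffic example and all names are hypothetical.

```python
def belief_update(belief, action, observation, transition, obs_prob):
    """Bayes update of a discrete belief state.

    New belief b'(s2) is proportional to
    obs_prob[s2][o] * sum_s transition[a][s][s2] * belief[s].
    """
    n = len(belief)
    # Prediction: push the current belief through the transition kernel.
    predicted = [
        sum(transition[action][s][s2] * belief[s] for s in range(n))
        for s2 in range(n)
    ]
    # Correction: reweight by the likelihood of the received observation.
    unnormalized = [obs_prob[s2][observation] * predicted[s2] for s2 in range(n)]
    total = sum(unnormalized)
    if total == 0.0:
        raise ValueError("observation has zero probability under the belief")
    return [u / total for u in unnormalized]


# Hypothetical example: state 0 = free flow, state 1 = congested.
transition = [[[0.9, 0.1], [0.2, 0.8]]]   # a single action
obs_prob = [[0.8, 0.2], [0.3, 0.7]]       # observation 1 = "slow probe vehicle"
new_belief = belief_update([0.5, 0.5], 0, 1, transition, obs_prob)
```

Observing a slow probe vehicle shifts the belief toward the congested state. Exact updates of this kind become intractable when the state space is the joint configuration of many vehicles, which is where sampling-based approximations come in.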
The optimal control problem of an MDP or POMDP is to find the optimal policy parameterized by to maximize the expected future reward. The policy iteration method for solving this problem iterates between policy evaluation and policy improvement. Policy evaluation involves estimating the expected future reward of the current policy as by simulating trajectories or as through the inference of . Policy improvement involves parameter learning to improve the current policy according to the estimated expected future reward. Policy iteration suffers from the curse of dimensionality, because simulation has high variance in high-dimensional space, and exact inference of the state-action distribution is intractable. We resolve this issue by utilizing a discrete event model with control variables to model decision making in transportation systems.
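For a small, fully observed MDP the evaluation and improvement loop can be written directly. The following is a minimal sketch of classic tabular policy iteration, not the method proposed in this paper; the two-state route-choice example, its transition probabilities and its rewards are hypothetical.

```python
def policy_iteration(transition, reward, gamma, tol=1e-10):
    """Tabular policy iteration.

    transition[s][a][s2] is the transition probability and reward[s][a]
    the immediate reward. Returns a deterministic policy and its values.
    """
    n_states, n_actions = len(reward), len(reward[0])
    policy = [0] * n_states
    values = [0.0] * n_states

    def q(s, a):
        # One-step lookahead value of taking action a in state s.
        return reward[s][a] + gamma * sum(
            transition[s][a][s2] * values[s2] for s2 in range(n_states)
        )

    while True:
        # Policy evaluation: iterate the Bellman expectation backup.
        while True:
            delta = 0.0
            for s in range(n_states):
                new_value = q(s, policy[s])
                delta = max(delta, abs(new_value - values[s]))
                values[s] = new_value
            if delta < tol:
                break
        # Policy improvement: act greedily with respect to the values.
        improved = [
            max(range(n_actions), key=lambda a: q(s, a)) for s in range(n_states)
        ]
        if improved == policy:
            return policy, values
        policy = improved


# Hypothetical example: state 0 = congested route, state 1 = free-flow route;
# action 0 = stay on the route, action 1 = switch (small switching cost).
transition = [
    [[1.0, 0.0], [0.2, 0.8]],
    [[0.1, 0.9], [0.8, 0.2]],
]
reward = [[0.0, -0.1], [1.0, 0.9]]
policy, values = policy_iteration(transition, reward, gamma=0.9)
```

The loop converges to switching away from the congested route and staying on the free-flow one. The inner evaluation sweep visits every state, which is exactly what becomes infeasible when the state is the joint configuration of a city's vehicles.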
The authors declare that they have no competing interests.
Conceived the study: BL, WD. Designed and performed the experiments: FY, AV, WD. Wrote the paper: BL, WD. All authors read, reviewed and approved the final manuscript.