SafeRNet: Safe Transportation Routing in the era of Internet of Vehicles and Mobile Crowd Sensing
Worldwide, road traffic fatality and accident rates are high, even in technologically advanced countries such as the USA. Despite advances in Intelligent Transportation Systems, safe transportation routing, i.e., finding the safest routes, is a largely overlooked paradigm. In recent years, a large amount of traffic data has been produced by people, the Internet of Vehicles (IoV) and the Internet of Things (IoT). Thanks to advances in cloud computing and the proliferation of mobile communication technologies, it is now possible to analyze vast amounts of crowd-sourced data and deliver the results back to users in real time. This paper proposes SafeRNet, a safe route computation framework that takes advantage of these technologies to analyze streaming and historical traffic data, infer safe routes, and deliver them back to users in real time. SafeRNet uses a Bayesian network to formulate the safe route model. A case study using real traffic data demonstrates the effectiveness of our approach. SafeRNet is intended to improve drivers' safety in a modern, technology-rich transportation system.
I Introduction and Motivation
The World Health Organization’s 2015 Global Status Report indicates that road traffic deaths total 1.25 million per year worldwide. Poor countries that lack good infrastructure reported a traffic fatality rate of 24.1 per 100,000, whereas in technology-rich high-income countries the rate is 9.2 per 100,000. According to current U.S. Department of Transportation statistics, “There were 29,989 fatal motor vehicle crashes in the United States in 2014 in which 32,675 deaths occurred. This resulted in 10.2 deaths per 100,000 people and 1.08 deaths per 100 million vehicle miles traveled.” (Federal Highway Administration. 2015. Highway statistics, 2014. Washington, DC: U.S. Department of Transportation, http://www.iihs.org/iihs/topics/t/general-statistics/fatalityfacts/state-by-state-overview) In the USA, the same source shows that the fatality rate ranges from 3.5 to 25.7 per 100,000 people, whereas the death rate ranges from 0.57 to 1.65 per 100 million vehicle miles traveled. Although it is extremely difficult to bring fatality rates down to zero, these statistics are startling, especially for technologically advanced countries. We believe that real-time safe routing, i.e., computing the safest route and delivering it to end users, can address safety-related problems to a great extent and thus help reduce both fatality and accident rates. We argue that the safe routing problem can be addressed effectively and efficiently by using emerging technologies such as the Internet of Vehicles (IoV) and Mobile Crowd Sensing, as we explain in the following paragraphs.
Intelligent transportation has come a long way in the past couple of decades, and the Internet of Vehicles (IoV) is adding a new dimension to it. Because of its computing and communication capabilities, IoV has the potential to become the cornerstone for delivering and consuming rich applications in a safe and secure manner. IoV enables gathering and sharing information about the traffic, the road, and the vehicle itself through V2V (Vehicle-to-Vehicle), V2H (Vehicle-to-Human) and V2S (Vehicle-to-Sensor) communications and interactions. This brings us to the question: what role can IoV play in guiding and supervising vehicles to help improve safety in the transportation system? Our proposal is an attempt to answer this question.
Recently, Mobile Crowd Sensing (MCS), a new sensing paradigm, has been producing a large amount of useful traffic data, such as vehicle trajectories and lane changing behavior. The data produced and collected through mobile phones is delivered to the cloud for processing. Unfortunately, much of this valuable traffic data has not been effectively used to address safety issues on the road, and the role these data could play in safe route planning is not clear. Our proposed work places great emphasis on using dynamic traffic data in real time.
We propose SafeRNet (Figure 1), a framework that addresses this research gap by using the big data generated by the MCS paradigm, by vehicles, and by IoT devices deployed on the roadside to effectively infer safe routes. SafeRNet aims to show what a safe route solution would look like in an IoV-rich world where, because of advances in cloud computing, analyzing MCS data has become a reality.
To the best of our knowledge, this is the first work of its kind. We summarize our contributions below:
- A novel data flow framework integrating MCS, IoV and cloud computing is proposed, showing user-to-cloud and cloud-to-user interaction. The framework combines real-time MCS data, IoV data and historical data to deliver the safest route to end users.
- The notion of safety is conceptualized using a Bayesian network model.
- Safest route computation is formulated and a possible solution is presented.
- A case study is presented to demonstrate the feasibility and effectiveness of the proposed framework.
The paper is organized as follows. Section II briefly presents background and related work. Section III presents the SafeRNet architecture, the Bayesian network modeling approach for safety probability estimation, and the safe routing problem formulation. Section IV presents experimental results. Finally, Section V presents the conclusion and future work.
II Background and Related Work
Mobile Crowd Sensing is a new sensing paradigm that offers large-scale sensing compared with traditional approaches. It has become popular, and therefore a reality, as it enables high sensing accuracy with a very low error rate. MCS coupled with a large variety of devices has the potential to become a cutting-edge technology for the Internet of Things, providing seamless sensor data transfer via the Internet. Many applications that use the MCS data gathering paradigm have already appeared in the literature.
Internet of Vehicles is a rapidly developing communication paradigm that can perform accurate positioning even when the GPS signal is blocked. In the typical setting of vehicular ad hoc networks, it has gained attention from researchers worldwide for addressing the vehicle collision warning problem and traffic information dissemination issues. This sensing paradigm has the potential to draw considerable attention to related areas such as traffic prediction, which in turn can improve real-time performance. Machine learning has been heavily used in ITS: in mining mobile data streams, in constructing mathematical models, and for exchanging information among vehicles. We adopt a Bayesian network in our research because of its ability to concisely represent probabilistic relationships and its previous successful applications. However, other methods can also be used to analyze data and perform predictions in our framework.
III SafeRNet: Architecture, Design and Modeling
In this section, we first explain the notion of safety used in our work and then present an overview of the proposed system. We then describe the Bayesian network modeling approach used to compute the safety probability (defined below) and explain how the safe route formulation uses this information to compute the safest route.
III-A Notion of Safety
In our work, road traffic safety is measured through traffic fatality rates, accident rates and near-collision incidents. We believe that safety issues arise mainly from three factors: law compliance, road condition and hazardous behavior. Law compliance issues arise when drivers break the law, for example by running traffic lights or driving over or under the legal speed limit. Road condition relates to the light intensity on the road, the type of road (paved/non-paved), road lanes, weather conditions, time of day, day of week, etc. Hazardous behavior relates to reckless driving such as frequent lane changing, light flashing and honking. Other factors also contribute to accidents, such as the presence of pedestrians on the road and the driver’s mental state. Although many factors can impact traffic safety, we limit our study to data that can be gathered using IoV with the MCS data gathering paradigm. Nevertheless, the distinction among causes of accidents is important for taking preventive and cautionary action. For example, the extent of law compliance may trigger a proactive government intervention to bring order to a road.
The safety probability, $p$, of a road is defined as:
$$p = 1 - p_c \tag{1}$$
where $p_c$ is the collision probability. Note that, for simplicity, we focus on collisions only and not on the number of casualties or the number of vehicles involved in a collision.
III-B System Overview
The end-to-end data flow in the SafeRNet architecture is shown in Fig. 1. SafeRNet can be broadly viewed as the integration of three modules: (i) sensing and communication; (ii) databases for storing dynamic as well as historical data; and (iii) a compute engine to analyze the data. The functional details of these modules are presented next.
III-B1 Sensing and Communication
IoV, coupled with the sensing and communication technologies deployed on the road, acts as both a data source and a communication medium. In our IoV model, vehicles act as mobile sensor nodes equipped with On-Board Units (OBUs), which can communicate with Road Side Units (RSUs), with other OBUs, or directly with the cloud via a cellular network. The RSUs are fixed nodes that serve as infrastructure to facilitate data communication to and from the remote cloud. Sensors deployed along the road, such as cameras and speed detectors, act as a data acquisition system that obtains additional traffic data on the surrounding area and sends it to the cloud.
Collected data is associated with a particular road or road segment. Data is classified into three broad categories: law compliance, road condition and hazardous behavior. These three categories are related to the safety issues described in the previous section. The proposed classification could also be used to provide preference-based safe routes to end users. The proposed architecture uses two kinds of databases, one for storing dynamic data and the other for static data. Static data does not change within a short time frame; examples are road type, road zone and map data. Dynamic data is either streaming traffic data or data that changes within a short time frame. For example, weather and light conditions can change frequently and therefore belong to the dynamic data category. Streaming data such as vehicle density, lane changing behavior and vehicle speed is also dynamic and is stored in the dynamic database. We use short time frame as a generic term with time units of minute(s), hour(s) or a day.
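The classification described above can be illustrated with a small lookup table. This is only a sketch: the attribute names, the `Category` and `Store` enums and the `CATALOG` mapping are hypothetical and simply mirror the examples given in the text; they are not part of SafeRNet's implementation.

```python
from enum import Enum


class Category(Enum):
    LAW_COMPLIANCE = "law compliance"
    ROAD_CONDITION = "road condition"
    HAZARDOUS_BEHAVIOR = "hazardous behavior"


class Store(Enum):
    STATIC = "static"    # does not change within a short time frame
    DYNAMIC = "dynamic"  # streaming data or data that changes quickly


# Hypothetical catalog built from the examples mentioned in the text.
CATALOG = {
    "road_type": (Category.ROAD_CONDITION, Store.STATIC),
    "weather": (Category.ROAD_CONDITION, Store.DYNAMIC),
    "light_condition": (Category.ROAD_CONDITION, Store.DYNAMIC),
    "lane_changing": (Category.HAZARDOUS_BEHAVIOR, Store.DYNAMIC),
    "vehicle_speed": (Category.LAW_COMPLIANCE, Store.DYNAMIC),
}


def route_to_store(attribute):
    """Return the database (static or dynamic) that should hold an attribute."""
    return CATALOG[attribute][1]


def attributes_in(category):
    """List the attributes that belong to one safety category."""
    return sorted(name for name, (cat, _store) in CATALOG.items() if cat is category)
```

For instance, `route_to_store("weather")` selects the dynamic database, while `attributes_in(Category.ROAD_CONDITION)` lists the attributes that a road-condition preference would draw on.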
III-B3 Compute Engine
The compute engine uses a Bayesian network model to compute the safety probability of each road or road segment by referring to historical data (see Figure 2), and a route selection component to compute the safest route. Note that the compute engine has access to both types of data and can update its database when an event of interest is detected. In our framework, the Bayesian network periodically updates its structure to adapt to new data sources and attributes. Because of its ability to learn causal relationships, a Bayesian network is a clear choice to serve as the compute engine in our work. Desirable properties such as handling of incomplete data, prevention of overfitting and straightforward incorporation of prior knowledge also make a Bayesian network an excellent choice. Once the Bayesian network is constructed, the road segments in the road network, which is part of the map data, are assigned safety probabilities. The safest route is then computed on the graph obtained from the map data, with edges associated with safety probabilities (see Section III-D).
III-C Bayesian Network Modeling
A Bayesian network (BN), also known as a Bayesian belief network, is a probabilistic graphical model that represents conditional independence relationships among variables using a directed acyclic graph (DAG), which supports probabilistic representation and reasoning. The nodes of a BN are random variables and its directed edges are conditional dependence relationships among them. Each node is conditionally independent of its non-descendants given the values of its parents. Each node in the Bayesian network has a conditional probability table (CPT) that serves as the prior probabilities for the BN. These probabilities are used to compute the joint probability according to the equation below:
$$P(x_1, \ldots, x_n) = \prod_{i=1}^{n} P(x_i \mid pa(x_i)) \tag{2}$$
where $x_i$ is the $i$th node in the set of $n$ nodes of the network and $pa(x_i)$ denotes the parent nodes of node $x_i$. The aim of BN learning is to find the relationships among variables, together with their corresponding CPTs, that best support the training data. Next, algorithms for BN scoring metrics, structure learning and parameter learning are discussed.
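The factorization of the joint probability can be checked on a tiny example. The sketch below assumes a hypothetical three-node chain WC -> RC -> C (weather conditions, road conditions, collision) with binary states; the structure and all CPT values are invented for illustration and are not the network learned in the paper.

```python
from itertools import product

# Hypothetical network WC -> RC -> C with binary states (0 or 1).
PARENTS = {"WC": (), "RC": ("WC",), "C": ("RC",)}

# CPT[node][parent values][node value]; all numbers are made up.
CPT = {
    "WC": {(): {0: 0.8, 1: 0.2}},
    "RC": {(0,): {0: 0.9, 1: 0.1}, (1,): {0: 0.2, 1: 0.8}},
    "C": {(0,): {0: 0.99, 1: 0.01}, (1,): {0: 0.95, 1: 0.05}},
}


def joint_probability(assignment):
    """Multiply P(node | parents) over all nodes for one full assignment."""
    prob = 1.0
    for node, parents in PARENTS.items():
        parent_values = tuple(assignment[p] for p in parents)
        prob *= CPT[node][parent_values][assignment[node]]
    return prob


# P(WC=1, RC=1, C=1) = 0.2 * 0.8 * 0.05
p_example = joint_probability({"WC": 1, "RC": 1, "C": 1})

# Sanity check: the joint distribution sums to one over all assignments.
total = sum(
    joint_probability(dict(zip(PARENTS, values)))
    for values in product((0, 1), repeat=len(PARENTS))
)
```

Storing one small table per node instead of the full joint table is what keeps the representation compact as the number of attributes grows.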
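Scoring metrics measure the performance and quality of a network for a given set of data. The Bayesian metric for a specific Bayesian network structure $B_S$ and a database $D$ is defined as:
$$Q_{\mathrm{Bayes}}(B_S, D) = P(B_S) \prod_{i=1}^{n} \prod_{j=1}^{q_i} \frac{\Gamma(N'_{ij})}{\Gamma(N'_{ij} + N_{ij})} \prod_{k=1}^{r_i} \frac{\Gamma(N'_{ijk} + N_{ijk})}{\Gamma(N'_{ijk})} \tag{3}$$
where $P(B_S)$ is the prior on $B_S$, $r_i$ ($1 \le i \le n$) is the cardinality of variable $x_i$, $q_i$ is the number of value combinations of the parents $pa(x_i)$, $N_{ij}$ is the number of records in the database in which $pa(x_i)$ takes its $j$th value, and $N_{ijk}$ is the number of records in the database in which $pa(x_i)$ takes its $j$th value and $x_i$ takes its $k$th value. $N'_{ij}$ and $N'_{ijk}$ are the priors on $N_{ij}$ and $N_{ijk}$, $\Gamma$ is the gamma function, and the choices of priors on counts are restricted by $N'_{ij} = \sum_{k=1}^{r_i} N'_{ijk}$. When $N'_{ijk}$ is assigned 1, the K2 metric is obtained.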
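For reference, the K2 special case can be written out explicitly. Assuming every prior count $N'_{ijk}$ equals 1 (so that $N'_{ij} = r_i$, the number of states of variable $i$) and using $\Gamma(m) = (m-1)!$ for positive integers $m$, the Bayesian metric reduces to the form given by Cooper and Herskovits. The name $Q_{K2}$ and the symbol $q_i$ for the number of parent configurations are notational assumptions of this sketch.

```latex
Q_{K2}(B_S, D) = P(B_S) \prod_{i=1}^{n} \prod_{j=1}^{q_i}
  \frac{(r_i - 1)!}{(r_i - 1 + N_{ij})!} \prod_{k=1}^{r_i} N_{ijk}!
```

In practice the logarithm of this expression is evaluated, since the factorials overflow quickly for databases with many records.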
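III-C2 Structure Learning
To learn the best Bayesian network structure, we adopt the K2 algorithm, a greedy algorithm, in this study. For a given database $D$ and prior knowledge, structure learning aims to find the optimal structure $B_S^{*}$ with the best score:
$$B_S^{*} = \arg\max_{B_S} Q_{\mathrm{Bayes}}(B_S, D) \tag{4}$$
where $Q_{\mathrm{Bayes}}(B_S, D)$ is the Bayesian metric of structure $B_S$ on $D$.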
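The greedy search performed by K2 can be sketched in a few lines. This is a minimal illustration, assuming a fixed node ordering, prior counts of 1 (the K2 metric in log form) and a cap on the number of parents; the function names, the toy records and the `max_parents` default are hypothetical, and the code is not the implementation used in the paper.

```python
from collections import Counter
from math import lgamma


def log_k2_score(records, node, parents, r):
    """Log K2 score of one node given a parent set (prior counts of 1)."""
    n_ijk = Counter((tuple(rec[p] for p in parents), rec[node]) for rec in records)
    n_ij = Counter()
    for (j, _k), count in n_ijk.items():
        n_ij[j] += count
    # log[(r - 1)! / (r - 1 + N_ij)!] summed over observed parent configurations
    score = sum(lgamma(r) - lgamma(r + total) for total in n_ij.values())
    # log[N_ijk!] summed over observed (configuration, value) pairs
    score += sum(lgamma(count + 1) for count in n_ijk.values())
    return score


def k2_search(records, order, cardinality, max_parents=2):
    """Greedy K2: each node may only pick parents that precede it in `order`."""
    structure = {}
    for idx, node in enumerate(order):
        r = cardinality[node]
        parents = []
        best = log_k2_score(records, node, parents, r)
        candidates = list(order[:idx])
        while candidates and len(parents) < max_parents:
            # Try adding each remaining candidate and keep the best one.
            top_score, top = max(
                (log_k2_score(records, node, parents + [c], r), c) for c in candidates
            )
            if top_score <= best:
                break  # no candidate improves the score
            parents.append(top)
            candidates.remove(top)
            best = top_score
        structure[node] = parents
    return structure


# Toy data: B copies A, while C is unrelated to both.
records = [
    {"A": a, "B": a, "C": c}
    for a, c in [(0, 0), (0, 1), (0, 0), (0, 1), (1, 0), (1, 1), (1, 0), (1, 1)]
]
structure = k2_search(records, ["A", "B", "C"], {"A": 2, "B": 2, "C": 2})
```

On the toy data the search links B to A and leaves C without parents, because adding a parent to C lowers its score. The result depends on the chosen node ordering, which K2 takes as an input.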
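III-C3 Parameter Learning
Once the Bayesian network structure is constructed, one can build the conditional probability table (CPT) for each relationship among the nodes. Given the database $D$ and node $x_i$, we have:
$$P(x_i = k \mid pa(x_i) = j) = \frac{N_{ijk} + N'_{ijk}}{N_{ij} + N'_{ij}} \tag{5}$$
where $N_{ij}$ and $N_{ijk}$ are the record counts in $D$ for the $j$th value of $pa(x_i)$ and for that value together with the $k$th value of $x_i$, and $N'_{ij}$ and $N'_{ijk}$ are hyperparameters representing the Dirichlet priors, i.e., the probability distribution for prior knowledge of the relationships among variables in $D$.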
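A CPT entry can be estimated from counts smoothed by Dirichlet prior counts. The sketch below assumes a symmetric prior with the same pseudo-count for every state (1 by default); the function name, the toy records and the use of the RC and C attribute abbreviations are illustrative assumptions.

```python
from collections import Counter


def learn_cpt(records, node, parents, r, prior_count=1.0):
    """Estimate P(node = k | parents = j) with a symmetric Dirichlet prior."""
    n_ijk = Counter((tuple(rec[p] for p in parents), rec[node]) for rec in records)
    n_ij = Counter()
    for (j, _k), count in n_ijk.items():
        n_ij[j] += count

    def probability(k, j=()):
        # (N_ijk + N'_ijk) / (N_ij + N'_ij), with N'_ij = r * N'_ijk
        return (n_ijk[(j, k)] + prior_count) / (n_ij[j] + r * prior_count)

    return probability


# Toy records: RC = road condition (0 dry, 1 wet), C = collision (0 none, 1 collision).
records = [
    {"RC": 1, "C": 1},
    {"RC": 1, "C": 0},
    {"RC": 1, "C": 0},
    {"RC": 1, "C": 0},
    {"RC": 0, "C": 0},
    {"RC": 0, "C": 0},
]
p_collision = learn_cpt(records, "C", ["RC"], r=2)
```

Here `p_collision(1, (1,))` evaluates to (1 + 1) / (4 + 2), and `p_collision(1, (0,))` stays above zero even though no collision was recorded on a dry road, which is the practical effect of the prior.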
III-C4 BN Inference
For each road segment, the marginal probability is calculated from the observations using the above structure and the learned CPT of each node in the Bayesian network. We then assign those probabilities to the edges of the road network graph obtained from map data.
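Inference for a single road segment can be illustrated by brute-force enumeration. The sketch assumes the same hypothetical chain WC -> RC -> C with invented CPT values, observes only the weather, and takes the safety probability as the complement of the collision probability; real networks would normally use a dedicated inference algorithm rather than enumeration.

```python
from itertools import product

STATES = (0, 1)
PARENTS = {"WC": (), "RC": ("WC",), "C": ("RC",)}

# Hypothetical CPTs: CPT[node][parent values][node value].
CPT = {
    "WC": {(): {0: 0.8, 1: 0.2}},
    "RC": {(0,): {0: 0.9, 1: 0.1}, (1,): {0: 0.2, 1: 0.8}},
    "C": {(0,): {0: 0.99, 1: 0.01}, (1,): {0: 0.95, 1: 0.05}},
}


def joint(assignment):
    """Joint probability of a full assignment via the chain of CPTs."""
    prob = 1.0
    for node, parents in PARENTS.items():
        prob *= CPT[node][tuple(assignment[p] for p in parents)][assignment[node]]
    return prob


def posterior(query, value, evidence):
    """P(query = value | evidence), summing out all unobserved nodes."""
    hidden = [n for n in PARENTS if n not in evidence]
    numerator = denominator = 0.0
    for values in product(STATES, repeat=len(hidden)):
        assignment = {**evidence, **dict(zip(hidden, values))}
        p = joint(assignment)
        denominator += p
        if assignment[query] == value:
            numerator += p
    return numerator / denominator


# Observed rain (WC = 1) on this segment; road condition is unobserved.
p_collision = posterior("C", 1, {"WC": 1})
safety_probability = 1.0 - p_collision  # value attached to the graph edge
```

With these made-up numbers the collision probability under rain is 0.2 * 0.01 + 0.8 * 0.05 = 0.042, so the edge would carry a safety probability of 0.958.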
III-D Safe Route Problem Formulation
In this section, we present our safe route model, formulate the safest route problem, and explain its transformation into a shortest path problem.
III-D1 Safe Route Model
In our work, the safety probability of each road segment is computed using the Bayesian network as described in the previous subsection. For simplicity, we assume that the safety probabilities of roads are independent of each other; therefore, by the product rule for independent events, the safety probability of a route is defined as:
$$P(R) = \prod_{j \in R} p(j) \tag{6}$$
where $R$ is the set of roads forming a route, $j$ is a road in route $R$ and $p(j)$ is its safety probability.
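The safe route model amounts to multiplying segment probabilities along a route. The following sketch uses two hypothetical routes with invented segment probabilities to show the computation.

```python
from math import prod


def route_safety(segment_probabilities):
    """Safety probability of a route under the independence assumption."""
    return prod(segment_probabilities)


# Hypothetical routes, each given as the safety probabilities of its roads.
routes = {"R1": [0.99, 0.95, 0.90], "R2": [0.97, 0.96]}
safety = {name: route_safety(ps) for name, ps in routes.items()}
safest = max(safety, key=safety.get)
```

Because every factor is at most 1, adding a road to a route can never increase its safety probability, so the model implicitly penalizes routes with many segments.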
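III-D2 Safest Route
From a source to a destination, there exists a set of roads forming a set of routes $\mathcal{R}$. Following the safe route model described in the previous section, with $P(R)$ the safety probability of route $R$ and $p(j)$ the safety probability of road $j$, the safest route $R^{*}$ attains the maximum route safety probability:
$$P(R^{*}) = \max_{R \in \mathcal{R}} P(R) \tag{7}$$
The safest route is the set of roads that gives the maximum route safety probability; therefore, following Eq. (6), the problem of finding the safest route can be formulated as:
$$\max_{R \in \mathcal{R}} \prod_{j \in R} p(j) \tag{8}$$
Since we are interested in finding the safest route itself, the problem can be reformulated as:
$$R^{*} = \arg\max_{R \in \mathcal{R}} \prod_{j \in R} p(j) \tag{9}$$
The law of logarithms transforms the maximization of a product into the maximization of a sum:
$$R^{*} = \arg\max_{R \in \mathcal{R}} \sum_{j \in R} \log p(j) \tag{10}$$
Furthermore, since $\log p(j)$ is a non-positive quantity, applying the unary minus operator turns the problem into a minimization problem:
$$R^{*} = \arg\min_{R \in \mathcal{R}} \sum_{j \in R} -\log p(j) \tag{11}$$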
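The above equation indicates that, to find the safest route, we must find the shortest path in a graph where $-\log p(j)$ is the weight of road $j$ ($j \in R$). Such a problem can be solved with Dijkstra’s shortest path algorithm in $O(|E| + |V| \log |V|)$ time using a Fibonacci heap, where $|V|$ and $|E|$ are the numbers of nodes and edges in the graph. An example of the transformation is shown in Fig. 3.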
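The shortest path transformation can be sketched with Dijkstra's algorithm on negative log weights. The example below is an illustration under stated assumptions: it uses Python's `heapq` (a binary heap, not the Fibonacci heap mentioned above), a hypothetical directed graph with invented safety probabilities, and it assumes every probability lies in (0, 1] so that the logarithm is defined.

```python
import heapq
from math import exp, log


def safest_route(graph, source, target):
    """Return (path, route safety probability) using weights -log p."""
    dist = {source: 0.0}
    prev = {}
    heap = [(0.0, source)]
    visited = set()
    while heap:
        d, u = heapq.heappop(heap)
        if u in visited:
            continue  # stale heap entry
        visited.add(u)
        if u == target:
            break
        for v, p in graph.get(u, {}).items():
            nd = d - log(p)  # non-negative edge weight since 0 < p <= 1
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                prev[v] = u
                heapq.heappush(heap, (nd, v))
    if target not in dist:
        return None, 0.0
    path = [target]
    while path[-1] != source:
        path.append(prev[path[-1]])
    path.reverse()
    # exp(-sum of weights) recovers the product of segment probabilities.
    return path, exp(-dist[target])


# Hypothetical road network: graph[u][v] = safety probability of road u -> v.
graph = {
    "A": {"B": 0.90, "C": 0.99, "D": 0.70},
    "B": {"D": 0.90},
    "C": {"D": 0.80},
}
path, probability = safest_route(graph, "A", "D")
```

In this toy graph the direct road A -> D has safety 0.70 and A -> C -> D has 0.99 * 0.80 = 0.792, while A -> B -> D reaches 0.90 * 0.90 = 0.81 and is therefore returned as the safest route.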
Safest route score is defined as below:
Therefore, a higher safety score means a safer route.
TABLE I: Data attributes used in our study
| Attribute | Description | Values |
|---|---|---|
| TR | Type of road | 0 highway, 1 district or province road |
| TRL | Type of road lanes | 0 road with one road lane, 1 road with separated road lanes |
| RF | Road factors | 0 bad road surface, 1 faulty signals, 2 faulty lighting, 3 road works, 4 queue, 5 downhill, 6 curve, 7 bad visibility |
| WC | Weather conditions | 0 normal weather, 1 rain, 2 fog, 3 wind, 4 snow, 5 hail, 6 other weather |
| RC | Road conditions | 0 dry road surface, 1 wet road surface, 2 snow on road surface, 3 clean road surface, 4 dirty road surface |
| LC | Light conditions | 0 daylight, 1 twilight, 2 public lighting, 3 night |
| W | Week | 0 week, 1 weekend |
| PD | Part of the day | 0 morning rush hour (to 9h), 1 morning (10-12h), 2 noon (13-15h), 3 evening rush hour (16-18h), 4 evening (19-21h), 5 night (22-6h) |
| C | Collision | 0 none, 1 collision |
| V | Velocity | 0 low, 1 normal, 2 high |
| VD | Vehicle density | 0 low, 1 high |
| LCB | Lane changing behavior | 0 not frequent, 1 frequent |
| RZ | Road zone | 0 none, 1 commercial, 2 residential |
In this section, we demonstrate the working and effectiveness of SafeRNet. We use real traffic data to build the Bayesian network and present a case study.
IV-A Bayesian Network Structure Learning
We use a dataset obtained from the research community's Frequent Itemset Mining Dataset Repository (http://fimi.ua.ac.be/data/). Further details about the dataset can be found in the cited reference. Missing attribute values are generated using a Gaussian distribution function. A total of 160k records are used in our study. The data attributes used in our study are listed in Table I. The Bayesian network structure is trained on 80% of the sample data set and the rest of the data is used for testing. The resulting Bayesian network structure is shown in Figure 4.
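The 80/20 partition described above can be sketched as follows. Shuffling before the split and the fixed seed are assumptions of this example (the text does not say how records were assigned to the two sets), and the ten toy records merely stand in for the real data.

```python
import random


def train_test_split(records, train_fraction=0.8, seed=0):
    """Shuffle a copy of the records and cut it into training and test parts."""
    shuffled = list(records)
    random.Random(seed).shuffle(shuffled)
    cut = int(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]


# Toy stand-in for the accident records.
records = [{"id": i} for i in range(10)]
train, test = train_test_split(records)
```

The two parts are disjoint and together cover every record, so the test set measures how well the learned structure generalizes to data it has not seen.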
IV-B Case Study
To study the effectiveness of our proposed model and make the simulated scenarios as close to reality as possible, we consider a long-distance route scenario spanning a geography with visible and considerable variation in certain spatial data. For demonstration purposes, we focus on the geographical region between Dothan, Alabama and Atlanta, Georgia. The roads between cities are road segments. Figure 5(a) shows the map data and the extracted road network graph is shown in Figure 5(b). The presented scenario uses real weather data from the radar map provided by WunderMap (https://www.wunderground.com/wundermap/) on June 26th, 2016 for the geography shown in Figure 5(b). The weather map shows a moving storm in that geography on that day. The weather data is recorded at a fixed interval of 1 hour from 8AM to 7PM. Weather impacts the safety probability of the road segments shown in Figure 5(b).
Figure 6 shows the variation of the safest route score for different trip start times. A trip is defined as a set of routes between Dothan, Alabama and Atlanta, Georgia. The figure shows that the safest route varies with the weather conditions in the region. For example, one would take a different route for a trip started at 1PM than for a trip started at 2PM. Also, the safest route may not be desirable if its score is low (a lower score means lower safety, see Eq. (12)), indicating a high possibility of a crash or collision if the trip is started. For example, the safest route is the same for trip start times of 11AM and 4PM; however, this route is much safer at 11AM than at 4PM. This shows that our framework captures the impact of dynamic variables on safe route computation. The figure also shows that different routes can be deemed safest at different times and that even the safest route may not be desirable when it performs poorly in terms of safety score. Dynamic changes in variables affect the safest route and its score; therefore, the proposed framework enables users to make informed decisions on travel plans.
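The effect of the trip start time can be mimicked with a small table of time-dependent segment probabilities. Everything in this sketch is hypothetical: the two routes, the segment names, the hourly safety values, and the use of the product of segment probabilities as the route score, which may differ from the safety score defined in the paper.

```python
from math import prod

# Hypothetical routes between the same source and destination.
ROUTES = {"route_a": ["s1", "s2"], "route_b": ["s3", "s4"]}

# Hypothetical safety probability of each segment per trip start hour.
SEGMENT_SAFETY = {
    11: {"s1": 0.98, "s2": 0.97, "s3": 0.95, "s4": 0.94},
    13: {"s1": 0.80, "s2": 0.85, "s3": 0.93, "s4": 0.92},
    16: {"s1": 0.75, "s2": 0.78, "s3": 0.70, "s4": 0.72},
}


def safest_route_at(hour):
    """Return (route name, score) of the safest route for one start hour."""
    scores = {
        name: prod(SEGMENT_SAFETY[hour][s] for s in segments)
        for name, segments in ROUTES.items()
    }
    best = max(scores, key=scores.get)
    return best, scores[best]


plan = {hour: safest_route_at(hour) for hour in SEGMENT_SAFETY}
```

With these invented numbers the same route is safest at 11 and at 16, but with a much lower score at 16, while a different route wins at 13. This is the qualitative pattern discussed for Figure 6.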
V Conclusion & Future Work
We propose a safe routing framework called SafeRNet to address the safe transportation routing problem in the presence of Internet of Vehicles, cloud computing and Mobile Crowd Sensing technologies. The proposed framework addresses the need to compute the safest route and deliver it to interested users in real time and on demand. User-created data and real-time, device-generated dynamic data are used to minimize human errors. A Bayesian network modeling approach and an optimization framework are used in the cloud to analyze spatio-temporal road traffic data generated by IoV and MCS. Furthermore, through experimentation on a real data set, we demonstrate that SafeRNet is effective in improving transportation safety.
Our framework has limitations that we intend to work on in the future. A more structured approach is needed to convert map data into road network graph data for route computation. We also believe that a tradeoff between the safest route and travel time or distance could be important to many users. We believe that this work will bring more attention to this important problem, which holds the key to reducing fatality and accident rates.
We thank the anonymous reviewers for their constructive feedback, which helped us improve this paper. We would also like to thank the National Institute of Statistics, Flanders (Belgium) for making the accident dataset available to the public.
-  W. Ed, “Global status report on road safety 2015,” (official report). Geneva, Switzerland: World Health Organisation (WHO), pp. vii, 1–14, 75ff (countries), 264–271 (table A2), 316–332 (table A10).
-  B. Guo, Z. Yu, X. Zhou, and D. Zhang, “From participatory sensing to mobile crowd sensing,” in Pervasive Computing and Communications Workshops (PERCOM Workshops), 2014 IEEE International Conference on. IEEE, 2014, pp. 593–598.
-  S. Ruj, M. A. Cavenaghi, Z. Huang, A. Nayak, and I. Stojmenovic, “On data-centric misbehavior detection in vanets,” in Vehicular technology conference (VTC Fall), 2011 IEEE. IEEE, 2011, pp. 1–5.
-  W. Guo and S. Wang, “Mobile crowd-sensing wireless activity with measured interference power,” IEEE Wireless Communications Letters, vol. 2, no. 5, pp. 539–542, 2013.
-  R. K. Ganti, F. Ye, and H. Lei, “Mobile crowdsensing: current state and future challenges,” IEEE Communications Magazine, vol. 49, no. 11, pp. 32–39, 2011.
-  H. B. Kolls, “Communication interface device for managing wireless data transmission between a vehicle and the internet,” Feb. 2006, US Patent 7,003,289.
-  J. Prinsloo and R. Malekian, “Accurate vehicle location system using rfid, an internet of things approach,” Sensors, vol. 16, no. 6, p. 825, 2016.
-  E. C. Eze, S.-J. Zhang, E.-J. Liu, and J. C. Eze, “Advances in vehicular ad-hoc networks (vanets): Challenges and road-map for future development,” International Journal of Automation and Computing, vol. 13, no. 1, pp. 1–18, 2016.
-  J. Wan, J. Liu, Z. Shao, A. V. Vasilakos, M. Imran, and K. Zhou, “Mobile crowd sensing for traffic prediction in internet of vehicles,” Sensors, vol. 16, no. 1, p. 88, 2016.
-  J. Lepine, V. Rouillard, and M. Sek, “On the use of machine learning to detect shocks in road vehicle vibration signals,” Packaging Technology and Science, 2016.
-  S. Krishnaswamy, J. Gama, and M. M. Gaber, “Mobile data stream mining: from algorithms to applications,” in 2012 IEEE 13th International Conference on Mobile Data Management. IEEE, 2012, pp. 360–363.
-  M. Kotkowski, H. Nguyen, Y. Getahun, and V. K. Mago, “A novel agent based method for intelligent public transportation system,” in Proceedings of the 1st International ACM SIGSPATIAL Workshop on Smart Cities and Urban Analytics, ser. UrbanGIS’15. New York, NY, USA: ACM, 2015, pp. 85–93.
-  M. Fogue, J. A. Sanguesa, F. Naranjo, J. Gallardo, P. Garrido, and F. J. Martinez, “Non-emergency patient transport services planning through genetic algorithms,” Expert Systems with Applications, vol. 61, pp. 262–271, 2016.
-  G. F. Cooper, “The computational complexity of probabilistic inference using bayesian belief networks,” Artificial intelligence, vol. 42, no. 2-3, pp. 393–405, 1990.
-  E. Belyi, I. Patel, A. Reddy, and V. Mago, “A multi-agent based system for route planning,” in International Conference on Human Interface and the Management of Information. Springer, 2015, pp. 500–512.
-  A. Ouali, A. R. Cherif, and M.-O. Krebs, “Data mining based bayesian networks for best classification,” Computational statistics & data analysis, vol. 51, no. 2, pp. 1278–1292, 2006.
-  G. F. Cooper and E. Herskovits, “A bayesian method for the induction of probabilistic networks from data,” Machine learning, vol. 9, no. 4, pp. 309–347, 1992.
-  D. Heckerman, D. Geiger, and D. M. Chickering, “Learning bayesian networks: The combination of knowledge and statistical data,” Machine learning, vol. 20, no. 3, pp. 197–243, 1995.
-  K. Geurts, G. Wets, T. Brijs, and K. Vanhoof, “Profiling of high-frequency accident locations by use of association rules,” Transportation Research Record: Journal of the Transportation Research Board, no. 1840, pp. 123–130, 2003.