# Complex Network Theoretical Analysis on Information Dissemination over Vehicular Networks

## Abstract

Enhancing communication efficiency and quality in vehicular networks is a critically important issue. As vehicular networks in dense cities grow ever larger, real-world datasets show that they essentially behave as complex networks. Meanwhile, extensive research has shown that complex network theory can both provide an accurate model of network structure and contribute to network design, optimization and management. In this paper, we first analyze the characteristics of a taxi GPS dataset and then establish vehicle-to-infrastructure, vehicle-to-vehicle and hybrid communication models. Moreover, we propose a clustering algorithm for station selection, a traffic allocation optimization model and an information source selection model, all based on communication performance and complex network theory.

## 1 Introduction

With the emergence of intelligent transport systems, vehicular networks have received considerable attention. Although cellular networks provide convenient voice communication and simple entertainment services to drivers and passengers, they are not well suited to certain direct vehicle-to-vehicle (V2V) or vehicle-to-infrastructure (V2I) communications [1]. In particular, improving the performance of such communication systems is already an active area of development [2], and several key technologies [3], e.g., small cells, device-to-device (D2D) communication, mobile clouds and flexible spectrum management [4] [5], can be considered for use in vehicular networks.

In the vehicular networking literature, much research has focused on improving vehicle mobility models [6], communication channel models and routing strategies [7] [8] [9] [10], while the network properties and complex characteristics of vehicular networks have not been fully investigated. Vehicular networks are very large. Their diverse hierarchical structures and node types give rise to complex interactions, and they exhibit a complex space-time relationship: the mobility of vehicles on the road leads to a dynamically evolving topology. Regarding current communication technologies, ultra-dense cellular deployment leads to more interactions than ever among network units (vehicle to infrastructure and infrastructure to infrastructure), and D2D-based vehicle-to-vehicle communication likewise leads to a more complex hybrid communication network. It is therefore necessary to view vehicular networks from another perspective, i.e., to use complex network theory to uncover their complex characteristics, on the basis of which network performance can be improved.

Building on the random graph model, complex network theory emerged from the work in [11] and [12] [13] [14], which revealed the small-world property and the power-law node degree distribution of real-world complex networks. Drawing on complex network theory, this paper takes a complex network view of vehicular networks and makes the following original contributions. First, this is the first work to establish V2V and V2I vehicular network models with complex network theory. Second, we use the node degree, average path length, clustering coefficient and betweenness centrality to analyze the topology of a vehicular network based on the Beijing taxi GPS database [15], and we study the relationship between the network's topological properties and its communication parameters. Third, we propose a clustering algorithm, a traffic allocation model and an information source selection model that depend on the communication impedance.

The rest of this paper is organized as follows. Section 2 establishes a vehicular network system model based on complex network theory and introduces some key parameters and their characteristics. Section 3 describes three typical vehicular communication models and three optimization models. Section 4 gives the simulation results for the proposed models. Concluding remarks are given in Section 5.

## 2 Data-Driven Complex Network Model

### 2.1 Dataset Analysis

In vehicular networks, vehicles can communicate with each other (V2V) and can also communicate with roadside infrastructure (V2I). In this subsection, we construct the complex network model for vehicular networks from a real-world dataset, which contains taxi GPS data for Beijing (longitude from 116.25 to 116.55, latitude from 39.8 to 40.05) obtained from Microsoft Research Asia [15].

Based on this GPS dataset, Fig. 1 plots the positions of the vehicles at one moment. The position distribution clearly reflects the urban layout of Beijing and distinguishes its downtown and suburban areas. In the following subsection, we construct a weighted and undirected graph model for vehicular networks based on some key communication parameters.

### 2.2 Weighted and Undirected Graph Models for Vehicular Networks

Following the analysis above, we model the vehicular network as a weighted and undirected complex network in which the nodes represent the vehicles on the road segments and the undirected edges represent the interactions between nodes. In this paper, an interaction means communication between a pair of vehicles. The edge weights measure the communication performance of the vehicular network, which depends on the distance between the communicating pair, communication channel fading, environmental disturbance and the cell radius.
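As a minimal sketch of how such a graph might be assembled from GPS fixes, the snippet below links two vehicles whenever their great-circle distance is within a maximum communication range. The range threshold, the function names and the three sample positions are illustrative assumptions, not values taken from the dataset, and the edge weight here is plain distance rather than the communication impedance defined later.

```python
import math

def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres between two (lat, lon) points in degrees."""
    r = 6371000.0  # mean Earth radius in metres
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * r * math.asin(math.sqrt(a))

def build_proximity_graph(positions, max_range_m):
    """Return {vehicle: {neighbour: distance_m}} for all pairs within max_range_m.

    positions maps a vehicle id to a (lat, lon) tuple. The graph is undirected,
    so every edge is stored in both directions.
    """
    graph = {v: {} for v in positions}
    ids = list(positions)
    for a in range(len(ids)):
        for b in range(a + 1, len(ids)):
            u, v = ids[a], ids[b]
            d = haversine_m(*positions[u], *positions[v])
            if d <= max_range_m:
                graph[u][v] = d
                graph[v][u] = d
    return graph

# Hypothetical positions inside the Beijing bounding box used in the paper.
positions = {
    "taxi_a": (39.9000, 116.4000),
    "taxi_b": (39.9020, 116.4000),  # roughly 220 m north of taxi_a
    "taxi_c": (39.9500, 116.5000),  # several kilometres away
}
graph = build_proximity_graph(positions, max_range_m=500.0)
```

The pairwise loop is quadratic in the number of vehicles; for a city-scale snapshot a spatial index would normally replace it.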

To simplify modeling and calculation, we assume that every vehicle has the same communication capability and that the communication channel follows the COST 231-Bertoni-Ikegami model [16]. In addition, we neglect the gaps between cells and the cell shapes, which are assumed not to be affected by terrain. Accordingly, the vehicular network is represented as a weighted, undirected graph consisting of a set of vertices, which represent the vehicles, and a set of edges, which represent the interactions among the vertices. The edge weights reflect the communication performance of the vehicular ad hoc network and are given by the communication impedance, whose specific definition is based on the following key communication technologies:

*Channel Model*: Because the city is dotted with tall buildings and dense trees, signals may be severely attenuated on their way from source to destination. This paper uses the COST 231-Bertoni-Ikegami model to analyze the transmission path loss. We assume that a line-of-sight transmission path exists between any two vehicles that are able to communicate. The path loss in the urban area can therefore be calculated with relatively good accuracy as:

(1) |

where the two variables are the transmission distance and the signal carrier frequency.
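The body of (1) is not legible in this copy. As an illustration only, the line-of-sight expression commonly quoted for the COST 231 Walfisch-Ikegami family of models is 42.6 + 26 log10(d) + 20 log10(f), with d in kilometres and f in megahertz. Whether (1) uses exactly these constants is an assumption. The sketch below evaluates that expression and shows that the loss grows with both distance and carrier frequency.

```python
import math

def los_path_loss_db(distance_km, freq_mhz):
    """Line-of-sight path loss in dB (assumed form, see the note above).

    distance_km: transmitter-receiver distance in kilometres
    freq_mhz:    carrier frequency in megahertz
    """
    if distance_km <= 0 or freq_mhz <= 0:
        raise ValueError("distance and frequency must be positive")
    return 42.6 + 26.0 * math.log10(distance_km) + 20.0 * math.log10(freq_mhz)

# Illustrative operating points, not values from the paper.
loss_near = los_path_loss_db(0.2, 900.0)     # 200 m at 900 MHz
loss_far = los_path_loss_db(0.8, 900.0)      # 800 m at 900 MHz
loss_high_f = los_path_loss_db(0.2, 1800.0)  # 200 m at 1800 MHz
```

Doubling the carrier frequency adds 20 log10(2), about 6 dB, at any fixed distance under this form.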

*Ultra Dense Cellular Handover*: Communication systems tend toward a multi-layer heterogeneous network of base stations and low-power micro-stations. Spectrum efficiency and transmission capacity have been improved through sustained work on modulation and coding methods, while reducing the cell radius can also sharply increase system capacity. An appropriate cell radius therefore improves the spatial multiplexing ratio and reduces system power consumption.
Nonetheless, ultra-dense cellular handover implies frequency conversion, more shared-spectrum interference and greater difficulty in multi-point coordination. Consequently, both the time delay and the handoff dropping probability increase, which raises the communication impedance of each link. We therefore count the number of cell handovers on each communication link.

Based on the channel model and the ultra-dense cellular handover described above, and taking into account node degree and betweenness centrality from complex network theory, we define the weight of the edge connecting two nodes, called the link communication impedance, as:

(2) |

In (2), the impedance is expressed in terms of the degree and the betweenness centrality of the end-point vehicles, the signal energy-to-noise ratio, characteristic parameters that vary with the network topology, and nonlinear control parameters. According to this definition, the communication impedance depends on the node degree, link distance, carrier frequency, average signal energy-to-noise ratio and the number of cell handovers. First, a vehicle with a large degree or high betweenness centrality takes part in many communication tasks, which leads to a relatively long store-and-forward delay and a high blocking probability. Second, a long communication distance causes high path loss and consumes much more signal power. Moreover, a small cell radius leads to more cell handovers, which also increases the time delay and degrades communication performance. The communication impedance should therefore be positively correlated with these quantities. Third, a high average signal energy-to-noise ratio per unit distance makes communication more robust and is therefore negatively correlated with the impedance. This completes the complex network graph model for vehicular network communication.

### 2.3 Complex Network Verification

In this section, we quantitatively analyze and verify the small-world property and the scale-free property of vehicular networks. We first introduce some key parameters from complex network theory.

*Node Degree Distribution*: The node degree of a vehicle in the vehicular network is defined as the number of vehicles it can communicate with. The probability that a randomly chosen node has a given degree, taken over all degree values, defines the node degree distribution.

*Clustering Coefficients*: The property that the neighbors of a vehicle can also communicate with each other is called clustering, and it measures how tightly knit the network is. The clustering coefficient of a vehicle is defined as follows:

(3) |

where the two quantities are the node degree of the vehicle and the number of communication links among its neighbors. Furthermore, the clustering coefficient of the entire network is the average of the clustering coefficients of all vehicles.
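The body of (3) is not legible here. The sketch below assumes the standard local clustering coefficient, i.e. twice the number of links among a node's neighbors divided by k(k-1) for a node of degree k, which matches the quantities named in the text. The four-vehicle adjacency map is a made-up example.

```python
def clustering_coefficient(adj, node):
    """Local clustering coefficient of `node` in an undirected graph.

    adj maps each node to the set of its neighbours. Nodes with fewer than
    two neighbours are assigned a coefficient of 0.
    """
    neighbours = list(adj[node])
    k = len(neighbours)
    if k < 2:
        return 0.0
    links = 0
    for a in range(k):
        for b in range(a + 1, k):
            if neighbours[b] in adj[neighbours[a]]:
                links += 1  # the two neighbours are themselves connected
    return 2.0 * links / (k * (k - 1))

def average_clustering(adj):
    """Network clustering coefficient: mean of the local coefficients."""
    return sum(clustering_coefficient(adj, v) for v in adj) / len(adj)

# Hypothetical example: a triangle a-b-c with a pendant vehicle d attached to a.
adj = {
    "a": {"b", "c", "d"},
    "b": {"a", "c"},
    "c": {"a", "b"},
    "d": {"a"},
}
```

In this example vehicle a has three neighbors with one link among them, giving 1/3, while b and c each sit in a closed triangle and score 1.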

*Betweenness Centrality*: The normalized betweenness centrality measures the importance of a node from another perspective, i.e.,

(4) |

where the two terms are the number of shortest paths between a pair of vehicles and the number of those shortest paths that pass through the vehicle in question.
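One way to evaluate a shortest-path betweenness of this kind is Brandes' algorithm, sketched below for an unweighted, undirected graph. The normalization by (n-1)(n-2)/2 is an assumption, since the constant used in (4) is not legible in this copy, and the four-vehicle chain is a made-up example.

```python
from collections import deque

def betweenness_centrality(adj, normalized=True):
    """Shortest-path betweenness via Brandes' algorithm (unweighted, undirected).

    adj maps each node to an iterable of its neighbours.
    """
    bc = {v: 0.0 for v in adj}
    for s in adj:
        # Breadth-first search from s, counting shortest paths (sigma).
        sigma = {v: 0 for v in adj}
        dist = {v: -1 for v in adj}
        preds = {v: [] for v in adj}
        sigma[s], dist[s] = 1, 0
        order = []
        queue = deque([s])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in adj[v]:
                if dist[w] < 0:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        # Accumulate pair dependencies from the farthest nodes back to s.
        delta = {v: 0.0 for v in adj}
        for w in reversed(order):
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1.0 + delta[w])
            if w != s:
                bc[w] += delta[w]
    n = len(adj)
    for v in bc:
        bc[v] /= 2.0  # each undirected pair was counted from both ends
        if normalized and n > 2:
            bc[v] *= 2.0 / ((n - 1) * (n - 2))  # assumed normalization
    return bc

# Hypothetical chain of four vehicles: a - b - c - d.
chain = {"a": ["b"], "b": ["a", "c"], "c": ["b", "d"], "d": ["c"]}
raw = betweenness_centrality(chain, normalized=False)
norm = betweenness_centrality(chain)
```

In the chain, b lies on the shortest paths for the pairs (a, c) and (a, d), so its unnormalized score is 2, while the end vehicles score 0.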

A data-driven numerical simulation is conducted for the vehicular network, and we verify the complex network properties on the taxi GPS dataset. Fig. 2 shows the parameters defined above for the proposed network at a fixed communication distance. We also calculated the average network clustering coefficient and the average path length.

The simulation results conform to the small-world property (a high degree of clustering and a short average path length) and show a scale-free distribution of node degree and betweenness centrality. Consequently, we can quantitatively treat the vehicular network as a complex network, and complex network theory gives us a new perspective on network design, optimization and management for communication in vehicular networks. In the next section, we propose three optimization models under different communication models.
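The two remaining quantities used in this verification, the average path length and the empirical degree distribution, can be computed as in the following sketch. It counts hops with breadth-first search and averages over reachable pairs only, which is an assumption about how disconnected vehicles are handled. The six-node ring is a toy example.

```python
from collections import deque

def bfs_hops(adj, source):
    """Hop distance from `source` to every node reachable from it."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        v = queue.popleft()
        for w in adj[v]:
            if w not in dist:
                dist[w] = dist[v] + 1
                queue.append(w)
    return dist

def average_path_length(adj):
    """Mean hop count over all ordered pairs of distinct, connected nodes."""
    total, pairs = 0, 0
    for s in adj:
        for t, d in bfs_hops(adj, s).items():
            if t != s:
                total += d
                pairs += 1
    return total / pairs if pairs else 0.0

def degree_distribution(adj):
    """Fraction of nodes having each degree value, as {degree: fraction}."""
    counts = {}
    for v in adj:
        k = len(adj[v])
        counts[k] = counts.get(k, 0) + 1
    n = len(adj)
    return {k: c / n for k, c in sorted(counts.items())}

# Toy example: six vehicles arranged in a ring.
ring = {i: [(i - 1) % 6, (i + 1) % 6] for i in range(6)}
apl = average_path_length(ring)
dist_k = degree_distribution(ring)
```

For a scale-free check one would plot the degree fractions on log-log axes and look for an approximately straight line.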

## 3 Communication on the Vehicular Networks

In Section 2, we have discussed the network topology of vehicular networks. Based on the analysis above, we establish the V2I (Section 3.1), V2V (Section 3.2) and the hybrid communication model (Section 3.3), respectively, with the communication impedance. Moreover, we propose a clustering algorithm for station selection, a traffic allocation optimization model and an information source selection model.

### 3.1 Clustering Algorithm of the V2I Model

This subsection focuses on the V2I communication model. As before, the vehicle impedance in the V2I model is defined with reference to massive MIMO in the vehicular communication system, a technology that enhances overall network performance. With a large excess of service antennas over terminals and time-division duplex operation, the extra antennas focus energy into ever smaller regions of space and bring large improvements in throughput and energy efficiency. In [17], the authors use the throughput (the achievable rate of the uplink transmission from a user) to measure the behavior of massive MIMO systems:

(5) |

where the throughput depends on the signal-to-interference-plus-noise ratio (SINR), which is a function of the channel model parameters and antenna parameters, on the channel estimation (CE) time and on the wireless energy transfer (WET) time. In our model, we consider only the value of the throughput rather than the factors that influence it. We assume that base stations communicate directly with the vehicles within their control range, which means that the distance from a vehicle to its base station is less than the cell radius in the V2I model. We then define the communication impedance of a vehicle as follows:

(6) |

As in (2), the definition involves the degree and the betweenness centrality of the vehicle, together with the throughput of the corresponding vehicle-to-station communication link. It also contains characteristic parameters that vary with the network topology and nonlinear control parameters. On this basis, we present a clustering algorithm based on the following generalized distance:

(7) |

where the generalized distance combines the vehicle impedance and the physical distance between two vehicles through a weighting coefficient.

The clustering algorithm based on the generalized distance proceeds as follows.

Step 1: Select one sample point as the first clustering center.

Step 2: Calculate the generalized distances from the remaining samples to this center, and use them to select the second center.

Step 3: Calculate the generalized distances from the remaining samples to both centers, and use them to select the third center. Further centers are selected in the same manner.

Step 4: Assign every other sample to a center according to the nearest-neighbor rule.

Fig. 3 shows a clustering example based on the generalized distance, which offers guidance for base station selection and cell division.
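The steps of the clustering algorithm can be sketched as below. Two things are assumed because they are not legible in this copy: the generalized distance is taken to be a weighted mix of the impedance gap and the Euclidean distance, and each new center is taken to be the sample farthest from its nearest existing center (the max-min rule). The sample coordinates and impedance values are invented for illustration.

```python
import math

def generalized_distance(a, b, weight=0.5):
    """Hypothetical generalized distance between samples a and b.

    Each sample is (x, y, impedance). The form below, a weighted sum of the
    impedance gap and the Euclidean distance, is an assumption.
    """
    geo = math.hypot(a[0] - b[0], a[1] - b[1])
    return weight * abs(a[2] - b[2]) + (1.0 - weight) * geo

def max_min_clustering(samples, n_centers, weight=0.5):
    """Pick centers by the max-min rule, then label samples by nearest center."""
    if not 1 <= n_centers <= len(samples):
        raise ValueError("n_centers must be between 1 and len(samples)")
    centers = [0]  # Step 1: take the first sample as the initial center
    while len(centers) < n_centers:
        # Steps 2-3: next center is the sample farthest from its nearest center
        best, best_d = None, -1.0
        for i, s in enumerate(samples):
            if i in centers:
                continue
            d = min(generalized_distance(s, samples[c], weight) for c in centers)
            if d > best_d:
                best, best_d = i, d
        centers.append(best)
    # Step 4: nearest-neighbor rule
    labels = []
    for s in samples:
        dists = [generalized_distance(s, samples[c], weight) for c in centers]
        labels.append(dists.index(min(dists)))
    return centers, labels

# Invented samples: three spatial groups with different impedance levels.
samples = [(0, 0, 1.0), (1, 0, 1.2), (10, 0, 1.0),
           (11, 1, 0.8), (0, 10, 3.0), (1, 11, 3.1)]
centers, labels = max_min_clustering(samples, n_centers=3)
```

Because impedance and distance have different units, a practical implementation would likely normalize both terms before mixing them.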

### 3.2 Traffic Allocation on the V2V Model

Given the complex communication tasks in vehicular networks, a variety of services such as real-time voice, high-definition video and Internet access should be supported anytime and anywhere. How to allocate the communication traffic optimally is therefore worth discussing in detail. For simplicity, we assume that a given quantity of communication tasks is transmitted from a set of starting vehicles to a single destination vehicle, and that the total communication demand is known. The traffic allocation is described by the actual task quantity assigned to each communication link. We define the cost function as:

(8) |

where the cost is built from the communication impedance between two vehicles along the Dijkstra path, evaluated under the traffic carried on that path. We also introduce the communication capacity of each link, which denotes the maximum number of communication tasks it can carry, and the total number of communication tasks on the link between each pair of vehicles. We have the following optimization problem:

(9) | ||||

where an indicator variable marks whether the traffic goes through the link connecting a given pair of vehicles. The network traffic allocation problem can be cast as the convex optimization problem in (11) by defining the traffic-edge incidence matrix as follows:

(10) |

where the dimension is set by the total number of possible links. Then, we have

(11) | ||||

Furthermore, we can add an eigenfunction to this linear programming problem and rewrite it as follows:

(12) | ||||

where the constraints use the row vectors of the incidence matrix and an auxiliary variable controls the computational accuracy. The objective is the sum of the communication impedance over all allocated routes.

The solution of problem (12) satisfies the condition:

(13) |

It can be proved that the deviation between this solution and the optimal solution of the primal problem is bounded. Many numerical algorithms can solve the above optimization problem.
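The cost function relies on the impedance accumulated along a Dijkstra path. The sketch below shows that building block only, not the full allocation problem, and it treats each link impedance as a fixed non-negative number, whereas in the model the impedance also depends on the traffic carried. The small graph and its weights are invented.

```python
import heapq

def dijkstra_path(graph, source, target):
    """Least-impedance path from source to target.

    graph maps u -> {v: impedance} with non-negative impedances.
    Returns (total_impedance, path); (inf, []) if target is unreachable.
    """
    best = {source: 0.0}
    prev = {}
    done = set()
    heap = [(0.0, source)]
    while heap:
        cost, u = heapq.heappop(heap)
        if u in done:
            continue  # stale heap entry
        done.add(u)
        if u == target:
            break
        for v, w in graph[u].items():
            new_cost = cost + w
            if new_cost < best.get(v, float("inf")):
                best[v] = new_cost
                prev[v] = u
                heapq.heappush(heap, (new_cost, v))
    if target not in best:
        return float("inf"), []
    path = [target]
    while path[-1] != source:
        path.append(prev[path[-1]])
    path.reverse()
    return best[target], path

# Invented example: two starting vehicles s1, s2, a relay r and a destination d.
graph = {
    "s1": {"r": 1.0, "d": 5.0},
    "s2": {"d": 4.0},
    "r": {"s1": 1.0, "d": 2.0},
    "d": {"s1": 5.0, "r": 2.0, "s2": 4.0},
    "x": {},  # isolated vehicle
}
cost_s1, path_s1 = dijkstra_path(graph, "s1", "d")
cost_s2, path_s2 = dijkstra_path(graph, "s2", "d")
```

Here s1 reaches d more cheaply through the relay (total impedance 3.0) than over its direct link (5.0).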

### 3.3 Information Source Selection on the Hybrid Model

The criterion for selecting the information source location is to maximize the network capacity. In other words, the information broadcasting facilities should be located near the source vehicles that hold information replicas. In this subsection, we focus on the hybrid communication model and study the optimal strategy for selecting source vehicles. We consider the probability that an arbitrary packet passes through a given node, with the shortest-path counts defined as in (4):

(14) |

where the weighting term is the probability that a packet has a given source vehicle and a given destination vehicle. The source vehicles follow a non-uniform probability distribution, whereas the destination vehicles of packets are assumed to be uniformly distributed and independently selected. We have:

(15) |

Then, the probability that an arbitrary packet passes through a given vehicle can be calculated as follows:

(16) |

We define the conditional probability that a packet starting from a given source vehicle passes through a given vehicle:

(17) |

Then, the capacity of the network can be estimated as:

(18) |

where the estimated quantity is the upper bound on the number of packets generated per time step for which the network remains in a free-flow state. It serves as a measure of the overall capacity of the network system and is a function of the betweenness centrality and the communication impedance.

The base station selection model therefore reduces to a min-max problem:

(19) | ||||

After introducing an auxiliary variable :

(20) |

the optimization problem can be cast as a linear programming problem as follows:

(21) | ||||

where , and . is defined in (22)

(22) |

Thus, the minimum can be found with standard linear programming algorithms, and the numerical solution can be computed.
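For readers unfamiliar with the step from a min-max problem to a linear program, the generic transformation is shown below. The symbols x, f_i, t and the feasible set X are placeholders chosen for illustration and are not the notation of (19) to (21), which is not legible in this copy.

```latex
\min_{x \in \mathcal{X}} \; \max_{i} \; f_i(x)
\quad\Longleftrightarrow\quad
\begin{aligned}
  \min_{x \in \mathcal{X},\; t} \quad & t \\
  \text{s.t.} \quad & f_i(x) \le t \quad \text{for all } i .
\end{aligned}
```

When every f_i is affine in x and the feasible set is described by linear constraints, the right-hand problem is a linear program in the variables (x, t).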

## 4 Simulation Results

In this section, we present simulation studies of the network topology and the communication performance based on our models. First, we analyze the influence of the maximum communication distance and other key communication parameters on the network topology.

Section 3.2 proposed a V2V communication model based on complex network theory, for which we defined several complexity parameters to analyze network performance in terms of topology. In the following, we analyze the effect of the communication parameters on the communication impedance. For the time being, we concentrate on the topological properties of the vehicular network built from the Beijing taxi GPS data and offer suggestions for traffic management and communication design.

The carrier frequency mainly determines the transmission path loss. Subgraph (a) of Fig. 4 shows five curves for different maximum communication distances. The vertical axis represents the average communication impedance over all links, and here we neglect the effect of node importance by fixing the corresponding parameters in (2). In each scenario, the average communication impedance rises as the carrier frequency increases, which follows directly from the definition of the communication impedance. Likewise, a larger maximum communication range raises the communication impedance through greater power loss. Specifically, for a small maximum communication range (200 m to 500 m), the communication impedance stays relatively small but grows quickly. Once the maximum communication range exceeds a certain distance (above 800 m), the growth slows and the communication impedance tends to stabilize. A high carrier frequency may therefore be used, but the choice requires joint consideration of path loss and communication range.

Subgraph (b) of Fig. 4 shows the relationship between the cell radius and the number of handovers for different maximum communication ranges. In general, the number of handovers falls as the cell radius increases, and the rate of decline gradually flattens. Although narrowing the cell radius improves spectrum efficiency and transmission capacity, the resulting large number of handovers degrades communication performance. When the maximum communication range is extended, the number of handovers rises sharply for relatively small cell radii. Beyond a certain cell radius, the number of handovers no longer changes significantly. These simulation results are consistent with practice.

Taxis in the city are concentrated in crowded areas, which corresponds to the clustering feature of a small-world network, and the average path length is surprisingly short. Consequently, for an appropriate cell radius, the average number of handovers stays within a narrow range. To summarize, the communication parameters affect the communication impedance to some extent. In practical engineering, the carrier frequency, maximum communication distance, energy efficiency, cell radius, etc. should be considered jointly, and a trade-off among them may improve communication performance. This paper provides a method of performance analysis rather than specific parameter values.

For the information source selection model, subgraphs (c) and (d) of Fig. 4 show the simulation results at a fixed maximum communication distance. Subgraph (c) shows the communication impedance of each vehicle in descending order. Subgraph (d) shows the result of the information source selection. The vehicles clearly play highly asymmetrical roles in information spreading: as shown in subgraph (d), only a few vehicles should act as sources in a heterogeneous vehicular network. That is, the source vehicles should be confined to a small number of nodes, so that the entire vehicular network can be controlled by directing or managing only a few vehicles. In addition, an appropriate communication distance is a short one in the sense defined above, owing to the dispersed degree distribution and the low path loss. This contributes to green communication with low power dissipation. Because the vehicular network is a large-scale heterogeneous network, our work suggests that, to improve network capacity, information such as traffic accidents, congested roads or traffic control should be broadcast from certain source vehicles.

## 5 Conclusion

In this paper, we analyzed V2V and V2I communication performance in vehicular networks based on complex network theory. Furthermore, we proposed a clustering algorithm for station selection, a traffic allocation optimization model and an information source selection model, which serve as examples of concrete applications of the defined communication impedance.

## Acknowledgment

This research was supported by the NSFC China under projects 61371079, 61271267 and 91338203.

### References

1. H. Hartenstein and K. P. Laberteaux, “A tutorial survey on vehicular ad hoc networks,” Communications Magazine, IEEE, vol. 46, no. 6, pp. 164–171, June 2008.
2. F. Giust, L. Cominardi, and C. J. Bernardos, “Distributed mobility management for future 5G networks: overview and analysis of existing approaches,” Communications Magazine, IEEE, vol. 53, no. 1, pp. 142–149, Jan. 2015.
3. P. Demestichas, A. Georgakopoulos, D. Karvounas, K. Tsagkaris, V. Stavroulaki, J. Lu, C. Xiong, and J. Yao, “5G on the horizon: key challenges for the radio-access network,” Vehicular Technology Magazine, IEEE, vol. 8, no. 3, pp. 47–53, Sept. 2013.
4. C. Jiang, Y. Chen, K. J. R. Liu, and Y. Ren, “Renewal-theoretical dynamic spectrum access in cognitive radio network with unknown primary behavior,” IEEE J. Sel. Areas Commun., vol. 31, no. 3, pp. 406–416, 2013.
5. C. Jiang, Y. Chen, Y. Gao, and K. J. R. Liu, “Joint spectrum sensing and access evolutionary game in cognitive radio networks,” IEEE Trans. Wireless Commun., vol. 12, no. 5, pp. 2470–2483, 2013.
6. B. T. Sharef, R. A. Alsaqour, and M. Ismail, “Vehicular communication ad hoc routing protocols: A survey,” Journal of Network and Computer Applications, vol. 40, pp. 363–396, Apr. 2014.
7. H. Zhang and J. Li, “Modeling and dynamical topology properties of VANET based on complex networks theory,” AIP Advances, vol. 5, no. 1, p. 017150, Jan. 2015.
8. C. Jiang, Y. Chen, and K. R. Liu, “Data-driven optimal throughput analysis for route selection in cognitive vehicular networks,” Selected Areas in Communications, IEEE Journal on, vol. 32, no. 11, pp. 2149–2162, Nov. 2014.
9. C. Jiang, H. Zhang, Y. Ren, and H. Chen, “Energy-efficient non-cooperative cognitive radio networks: Micro, meso and macro views,” IEEE Commun. Mag., vol. 52, no. 7, pp. 14–20, 2014.
10. J. Wang, C. Jiang, Z. Han, Y. Ren, and L. Hanzo, “Network association strategies for an energy harvesting aided super-WiFi network relying on measured solar activity,” IEEE Commun. Mag., vol. 34, no. 12, pp. 3785–3797, 2016.
11. D. J. Watts and S. H. Strogatz, “Collective dynamics of ‘small-world’ networks,” Nature, vol. 393, no. 6684, pp. 440–442, June 1998.
12. A.-L. Barabási and R. Albert, “Emergence of scaling in random networks,” Science, vol. 286, no. 5439, pp. 509–512, Oct. 1999.
13. C. Jiang, Y. Chen, and K. J. R. Liu, “Graphical evolutionary game for information diffusion over social networks,” IEEE J. Sel. Topics Signal Process., vol. 8, no. 4, pp. 524–536, 2014.
14. C. Jiang, Y. Chen, and K. J. R. Liu, “Evolutionary dynamics of information diffusion over social networks,” IEEE Trans. Signal Process., vol. 62, no. 17, pp. 4573–4586, 2014.
15. J. Yuan, Y. Zheng, C. Zhang, W. Xie, X. Xie, G. Sun, and Y. Huang, “T-drive: driving directions based on taxi trajectories,” in Proceedings of the 18th SIGSPATIAL International Conference on Advances in Geographic Information Systems. New York: ACM, Nov. 2010, pp. 99–108.
16. L. M. Correia, “A view of the COST 231-Bertoni-Ikegami model,” in Antennas and Propagation, 2009. EuCAP 2009. 3rd European Conference on. Berlin, Germany: IEEE, Mar. 2009, pp. 1681–1685.
17. G.-M. Yang, C.-C. Ho, R. Zhang, and Y. Guan, “Throughput optimization for massive MIMO systems powered by wireless energy transfer,” Selected Areas in Communications, IEEE Journal on, vol. 30, no. 60, pp. 1–12, Jan.