Spatio-Temporal Motifs for Optimized Vehicle-to-Vehicle (V2V) Communications
Abstract
Caching popular content in vehicle-to-vehicle (V2V) communication networks is expected to play an important role in road traffic management, the realization of intelligent transportation systems (ITSs), and the delivery of multimedia content across vehicles. For effective caching, however, the network must dynamically choose the optimal set of cars that will cache popular content and disseminate it across the entire network. Most of the existing prior art on V2V caching is restricted to cache placement that is based solely on location and user demands and does not account for the large-scale spatio-temporal variations in V2V communication networks. In contrast, in this paper, a novel spatio-temporal caching strategy is proposed based on the notion of temporal graph motifs that can capture spatio-temporal communication patterns in V2V networks. It is shown that, by identifying such V2V motifs, the network can find suboptimal content placement strategies for effective content dissemination across a vehicular network. Simulation results using real traces from the city of Cologne show that the proposed approach can increase the average data rate by for different network scenarios.
1 Introduction
Vehicle-to-vehicle (V2V) communication is seen as one of the enabling technologies for intelligent transportation systems and a key enabler for many smart road and traffic management systems [1], as it allows critical information dissemination. Moreover, spurred by the availability of in-vehicle infotainment (IVI) systems that disseminate entertainment and information content to passengers [2], there is a strong need for high-speed and stable delivery of large multimedia files, such as videos, photos, and songs, to the various cars within a vehicular network. Disseminating such diverse content across vehicular networks requires effective content placement strategies that maximize the throughput of the system [3]. Moreover, to reap the benefits of V2V content dissemination, we must address different challenges, including optimal cache placement and resource allocation [4].
Cache placement in V2V networks has recently attracted significant attention, such as in [5]-[8]. In such works, the popular contents are offloaded to the storage of a number of well-chosen cars and devices at off-peak hours in order to serve requests during peak traffic hours. In these scenarios, cars and devices that do not have the cached content will not have to download the content from wireless base stations (BSs). Instead, they can request the content directly from other cars that have cached the data, which can eventually reduce the traffic load at the BSs. Meanwhile, caching using local storage can reduce latency due to a shorter communication distance. However, the benefits of caching are highly dependent on the set of cars and devices chosen for caching the popular contents [7].
In [5], the authors use vehicle mobility data for content dissemination and combine the ideas of opportunistic forwarding, trajectory-based forwarding, and geographical forwarding to develop a mobility-centric algorithm to place content in vehicular networks. Meanwhile, the work in [6] applies available users’ mobility patterns to develop a polynomial-time solution that maximizes the cost saved by caching contents in local storage. Moreover, in [7], the authors use location information and subscription-based information to divide vehicles into different groups and, then, design a spatio-temporal multicast routing protocol to construct an optimized dissemination mesh network. The work in [8] presents an optimal caching policy by using both mobility information and users’ demands and proposes a greedy caching algorithm with polynomial order complexity to obtain bounds of the caching policy whose complexity grows exponentially with the number of users. However, existing works, including [5]-[8], do not take into account the temporal dynamics of V2V communication networks, such as the frequency of occurrence of different V2V links, which can be a key metric for content placement. For example, using only location information to select the set of cars that will cache the content can lead to choosing cars that are unable to communicate with each other or with other cars, effectively limiting the benefits of caching.
The main contribution of this paper is a novel framework for spatio-temporal caching in vehicular networks that is cognizant of intrinsic spatial and temporal patterns in V2V communications. In particular, using the tools of temporal network analysis [9], we identify temporal motifs in the V2V network as key communication patterns observed among vehicles that appear more frequently than expected in a baseline, randomized reference system. After identifying the spatio-temporal motifs, the proposed approach then finds the best candidate cars for content placement. To the best of our knowledge, this is the first work that exploits spatio-temporal motifs for optimizing content dissemination in vehicular networks. Simulation results using real traces from the city of Cologne, Germany, show that the proposed motif-based approach yields significant performance gains in terms of the average data rate, compared to a conventional location-based scheme.
The rest of the paper is organized as follows. Section 2 presents the system model and problem formulation. In Section 3, we present the proposed motif-based approach. Section 4 provides the simulation results. Conclusions are drawn in Section 5.
2 System Model and Problem Formulation
Consider a vehicular network in an urban environment, composed of a set of vehicles. In particular, as shown in Fig. 1, we consider that a nearby BS provides wireless coverage to the cars in . The BS frequently uses beacon signals to keep track of the vehicles’ positions using received signal-strength measurements as in [10], and collects the V2V communication data when exchanging information with the communication facility within the vehicles, as proposed in [11].
In our system, the BS seeks to seed multimedia files and traffic data in the storage unit of a set of cars to provide the passengers in neighboring cars with streaming services or to disseminate delay-sensitive information, such as upcoming road incidents, to help drivers decide on their routes. Using this mechanism, the seeded cars, referred to hereinafter as serving cars, can disseminate the cached content to nearby cars via V2V links. Consequently, the BS can reduce its traffic load, as it no longer needs to transmit the same content to multiple cars. Fig. 1 shows a case in which the BS caches content at car . Here, an arbitrary car sends a request for content to the BS . If the content is already cached at car , the BS informs the requesting car to use V2V communication to obtain the content from . Otherwise, the BS has to directly transmit the requested content to , which increases its load. Hereinafter, we refer to cars that do not have the cached content as non-serving cars. To increase the spectrum efficiency, we assume that V2V links reuse the spectral resources of the cellular network as in [12].
2.1 Problem formulation
The signal-to-interference-plus-noise ratio (SINR) of a V2V communication link between a serving car and a non-serving car will be:
(1) 
where is the transmission power from to , and is the variance of the Gaussian noise at the receiver. In addition, represents the channel gain between and and can be expressed as , where is the fading gain, is the distance, and is the path loss exponent. We consider that the cellular links and the V2V links experience Rayleigh fading, as done in [13]. and capture, respectively, the interference generated by the links between the BS and other vehicles and by other V2V communication links. These interference terms are given by,
(2) 
where is a binary variable that captures the feasibility of the communication link between and . In fact, if the SINR at the receiving car from the transmitting car exceeds a target threshold for V2V communication; otherwise, .
Accordingly, the achievable data rate for the V2V link between the serving car and the non-serving car is
(3) 
where is the bandwidth. The corresponding SINR and the achievable rate between the BS and the non-serving car , are:
(4) 
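The SINR and rate definitions above can be illustrated with a minimal sketch. It assumes the standard forms: channel gain equal to the fading gain times the distance raised to the negative path loss exponent, SINR equal to received power over interference plus noise, and a Shannon-type rate. All function and parameter names are hypothetical and are not the paper's notation.

```python
import math


def channel_gain(fading_gain, distance_m, path_loss_exp):
    """Assumed form: fading gain times distance^(-path loss exponent)."""
    return fading_gain * distance_m ** (-path_loss_exp)


def sinr(tx_power_w, gain, interference_w, noise_w):
    """Received power divided by total interference plus noise power."""
    return tx_power_w * gain / (interference_w + noise_w)


def achievable_rate(bandwidth_hz, sinr_value):
    """Shannon-type rate in bit/s for a given bandwidth and SINR."""
    return bandwidth_hz * math.log2(1.0 + sinr_value)


def link_feasible(sinr_value, threshold):
    """Binary feasibility indicator: 1 if the SINR exceeds the threshold."""
    return 1 if sinr_value > threshold else 0
```

The same three functions apply to the BS link by passing the BS transmission power and the BS-to-car channel gain.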
We consider a set of popular contents with the same size, sorted from high to low popularity. Due to the limited storage, the serving nodes will choose to cache a limited number of contents. To increase the likelihood of meeting the requests from non-serving nodes, the serving cars will cache the most popular files, under the capacity constraints. Consequently, when a non-serving node requests one file out of the files, the node can acquire it via a V2V link as long as there exists one serving node within its communication range. To leverage V2V communications for disseminating cached content, the network must determine which vehicles act as serving nodes so as to maximize the average data rate achieved by non-serving cars. To this end, we define the binary variable for each car , where if car is selected as a serving node, otherwise , and set contains serving nodes. Therefore, we can formulate the problem as follows:
(5)  
(6)  
(7) 
where is the probability mass function (pmf) of the request for file by car . This distribution can be modeled by the Zipf distribution with pmf , where is the Zipf exponent that determines the skewness of the distribution [14]. The indicator variable is such that if the serving car is the nearest car to the requester in the serving set , and otherwise. ensures that a non-serving car will always choose the closest serving car with the requested file. Constraint (6) guarantees that the transmission power of vehicles will not surpass the maximum power level , and constraint (7) ensures that the SINR of the V2V links is above a threshold, .
2.2 V2V macroscopic communication graphs
Once we find the optimal set of serving vehicles, we can assign non-serving nodes to serving nodes according to the spatial distance, and solve the problem given by (5)-(7). However, finding the optimal set is a 0-1 integer programming problem that requires determining whether each individual node in the set should be a serving or a non-serving node. In fact, the problem is one of Karp’s 21 NP-complete problems [15], which is hard to solve directly.
Alternatively, we can find a suboptimal solution to the optimization problem by using the information within the vehicular network. In particular, in addition to the position and demand of each vehicle, we can also leverage the temporal domain information. This is because V2V networks are naturally dynamic and exhibit temporal features. For example, the number of communication links between two arbitrary cars may vary with time. Such temporal information on the frequency of communication is valuable to determine which cars are more likely to better disseminate the content. Therefore, time domain information is also important to choose the optimal set for cache placement and solve the problem (5)-(7).
To capture the dynamics in the time domain, we propose to use collected V2V communications data and model the system as a directed temporal graph , whose vertices are the cars in and whose temporal edges, in set , denote the timestamped communication events among vehicles. We represent a wireless communication link between two different cars as a 3-tuple edge labeled as , , where and denote the serving car and the non-serving car, respectively. is the time of initiating the transmission from to . Here, we assume that, at any time, a car cannot communicate with more than one car simultaneously. In this case, no two edges that share a vehicle are initiated at the same time.
To obtain the set of cars that are more active and more likely to participate in the V2V communication in a period of time, we introduce the time constraint and divide the graph into multiple macroscopic communication graphs, where we can change the value of to filter outdated V2V links. That is, for any edge in these macroscopic graphs, there always exists at least one other edge , in the same graph, that meets the following requirements:
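The temporal graph can be held as a plain list of timestamped edges. The following sketch uses a hypothetical `TemporalEdge` record (sender, receiver, start time) and checks the stated assumption that a car takes part in at most one link at any given time.

```python
from typing import NamedTuple


class TemporalEdge(NamedTuple):
    sender: str      # serving car that initiates the transmission
    receiver: str    # non-serving car that receives the content
    time: float      # time at which the transmission starts


def violates_single_link(edges):
    """Return True if some car appears in two edges with the same timestamp."""
    seen = set()
    for edge in edges:
        for car in (edge.sender, edge.receiver):
            if (car, edge.time) in seen:
                return True
            seen.add((car, edge.time))
    return False
```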

Two edges share at least one node, ;

If a wireless connection occurs before another connection , ;

If a wireless connection occurs before another connection , .
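One way to read the requirements above is sketched below, under the assumption that two edges belong to the same macroscopic graph when they share a car and their timestamps differ by at most the time constraint, with the relation applied transitively. This reading and the names `macroscopic_graphs` and `max_gap` are assumptions for illustration only.

```python
def macroscopic_graphs(edges, max_gap):
    """Group (sender, receiver, time) edges into time-respecting components."""
    parent = list(range(len(edges)))

    def find(i):
        # Union-find root lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i in range(len(edges)):
        for j in range(i + 1, len(edges)):
            shares_car = set(edges[i][:2]) & set(edges[j][:2])
            close_in_time = abs(edges[i][2] - edges[j][2]) <= max_gap
            if shares_car and close_in_time:
                parent[find(i)] = find(j)

    groups = {}
    for i, edge in enumerate(edges):
        groups.setdefault(find(i), []).append(edge)
    return list(groups.values())
```

Lowering `max_gap` splits the data into more, smaller graphs, which matches the idea of filtering outdated V2V links.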
3 Proposed Strategy Based on SpatioTemporal Motifs
To find a suboptimal solution to (5)-(7), we need to develop an algorithm to detect motifs in V2V communication macroscopic graphs. To this end, we follow three steps. The first step is searching microscopic subgraphs defined as basic communication units with typical sizes (number of edges) in macroscopic communication graphs. The second step is collecting microscopic subgraphs with similar isomorphic structure. The third step is determining the frequently occurring motifs after comparing each subgraph’s frequency of occurrence with its counterpart in a baseline randomized V2V communication network. In particular, we use the notion of Z-score, expressed as
(8) 
where captures the standard deviation of the frequency of the corresponding subgraph in the reference system. If , where is a given threshold, we can classify the subgraph as a motif [16]. Finally, given the detected motifs, our spatio-temporal caching strategy can select , , to solve the optimization problem in (5)-(7).
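A minimal sketch of the Z-score test follows, assuming the common definition: the observed count minus the mean count over randomized reference networks, divided by the standard deviation of the reference counts. The use of the population standard deviation and the function names are assumptions.

```python
import statistics


def motif_zscore(observed_count, reference_counts):
    """Z-score of a subgraph count against counts from randomized networks."""
    mean = statistics.mean(reference_counts)
    std = statistics.pstdev(reference_counts)  # population standard deviation
    if std == 0:
        raise ValueError("reference counts have zero variance")
    return (observed_count - mean) / std


def is_motif(observed_count, reference_counts, threshold):
    """Classify a subgraph as a motif when its Z-score exceeds the threshold."""
    return motif_zscore(observed_count, reference_counts) > threshold
```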
3.1 Searching for V2V microscopic subgraphs
To effectively detect existing motifs in V2V communication networks, we first decompose V2V macroscopic communication graphs into microscopic subgraphs with the same size.
The algorithm used for finding V2V microscopic subgraphs with target size in a macroscopic graph is shown in Algorithm 1. Given a set of labeled cars and labeled connection edges, there are two ways to obtain microscopic subgraphs from the macroscopic graph. One approach is to consider the nodes connected by edges, and another approach is to consider the set of edges in which an arbitrary edge can find another edge sharing a common node. In contrast to the work in [17], the proposed algorithm is based on the second approach. As shown in Fig. 2, from the macroscopic graph, we can obtain a pair of edge sets, and , where the first set contains only one edge , labeled as , and the second set contains all of ’s neighboring edges with greater labels. The mechanism first adds an arbitrary edge , labeled as , from the second set to the first set and updates the first set. By calling EdgeExtension(, , ), the second set is extended by first merging in the set of edges that neighbor the newly added edge and have greater labels in the macroscopic graph, and then removing edge and other edges with smaller labels compared with . Next, we repeat the aforementioned steps for the first set with the added edge and the new second set until we obtain the microscopic subgraphs meeting the size requirement, i.e., .
Example: When the two sets of edges from the macroscopic graph are {1} and {2,3,4,6}, as shown in Fig. 2, the algorithm first adds edge 2 to the first set {1} and obtains the first updated set {1,2}. Then, it updates the second set to {3,4,5,6} by first merging the neighboring set with greater labels, i.e., {3,5,6}, with {2,3,4,6} and then deleting edge 2. The algorithm then repeats the same process for the updated set {1,2} and the edge set {3,4,5,6}, and we can obtain the microscopic subgraphs {1,2,3} and {1,2,6}, if .
As shown in Fig. 2, by using this algorithm for any one edge set and the set of its surrounding edges with greater labels, we can finally collect all V2V microscopic subgraphs with the required size in a macroscopic graph. To obtain the whole set of microscopic subgraphs existing in the V2V network, we can use this algorithm for different macroscopic graphs. Although the complexity of the algorithm increases with the number of vehicles, we can apply the algorithm in scenarios with capacity limitations, such as intersections, parking lots, and parts of the highway, or to a subset of the network, thus reducing the complexity and the processing time.
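The edge-based search can be sketched as follows. This is not Algorithm 1 itself: it is an ESU-style enumeration in the spirit of [17], applied to edges instead of nodes, where two edges are neighbors if they share a car and edge labels are simply list indices. The function names and the toy data in the usage note are hypothetical.

```python
def line_graph(edges):
    """Map each edge index to the indices of edges that share a car with it."""
    adj = {i: set() for i in range(len(edges))}
    for i in range(len(edges)):
        for j in range(i + 1, len(edges)):
            if set(edges[i][:2]) & set(edges[j][:2]):
                adj[i].add(j)
                adj[j].add(i)
    return adj


def microscopic_subgraphs(edges, size):
    """Enumerate each connected set of `size` edges exactly once."""
    adj = line_graph(edges)
    found = []

    def extend(sub, ext, root):
        if len(sub) == size:
            found.append(frozenset(sub))
            return
        ext = set(ext)
        while ext:
            w = ext.pop()
            # Edges already in the subgraph or adjacent to it are excluded,
            # so only edges newly reachable through w extend the candidate set.
            covered = set(sub)
            for s in sub:
                covered |= adj[s]
            fresh = {u for u in adj[w] if u > root and u not in covered}
            extend(sub | {w}, ext | fresh, root)

    for root in adj:
        extend({root}, {u for u in adj[root] if u > root}, root)
    return found
```

For instance, calling `microscopic_subgraphs([("a", "b", 1.0), ("b", "c", 2.0), ("c", "d", 3.0), ("b", "d", 4.0)], 3)` returns the connected three-edge subsets of that toy graph as sets of edge indices.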
3.2 Classifying V2V microscopic subgraphs
To sort V2V microscopic subgraphs with the same isomorphic structure, i.e., to find microscopic subgraphs that contain the same number of cars connected in the same way, we exploit the notion of canonical labeling. By permuting the elements in the adjacency matrices obtained from a microscopic subgraph, we can construct many lists of integers. By viewing each list of integers as a string of 1s and 0s, we sort them based on lexicographic ordering to obtain a canonical labeling defined as the string with the minimum value. Because the canonical labelings of two subgraphs are identical as long as the subgraphs are isomorphic [18], the problem of determining isomorphic structures among subgraphs is equivalent to deciding whether given microscopic subgraphs have the same canonical labeling or not.
For example, as shown in Fig. 3, by concatenating rows or columns one after the other in the permuted adjacency matrices of the first microscopic subgraph extracted from Fig. 2, we can find two strings of 1s and 0s. According to lexicographic ordering, we notice the string corresponding to the second matrix is smaller than its counterpart in the first matrix, i.e., “000010101”<“0001011”, and “0001011” can be chosen as the canonical labeling for the first microscopic subgraph compared with other permutations. Similarly, we can observe that the second microscopic subgraph shares the same canonical labeling with the first subgraph, and, thus, these two subgraphs have the same isomorphic structure.
Accordingly, we can complete the structure classification for all the microscopic subgraphs in the V2V communication network, and then calculate the occurrence frequency of microscopic subgraphs having the same isomorphic structure. Moreover, we repeat the same steps for a randomized V2V communication network so as to obtain the mean frequency and the standard deviation of the corresponding subgraph. We can then use (8) to determine whether the subgraph is a motif or not.
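The canonical labeling step can be sketched by brute force for small subgraphs: try every ordering of the cars, flatten the resulting adjacency matrix row by row into a string of 1s and 0s, and keep the lexicographically smallest string. This sketch is an assumption-laden simplification: it ignores timestamps and treats each edge as a plain directed (sender, receiver) pair.

```python
from itertools import permutations


def canonical_label(edges):
    """Smallest row-major adjacency string over all orderings of the cars."""
    nodes = sorted({car for edge in edges for car in edge})
    best = None
    for order in permutations(nodes):
        index = {car: i for i, car in enumerate(order)}
        matrix = [["0"] * len(nodes) for _ in nodes]
        for sender, receiver in edges:
            matrix[index[sender]][index[receiver]] = "1"
        label = "".join("".join(row) for row in matrix)
        if best is None or label < best:
            best = label
    return best


def same_structure(edges_a, edges_b):
    """Two subgraphs are isomorphic iff their canonical labels match."""
    return canonical_label(edges_a) == canonical_label(edges_b)
```

The factorial cost of trying all orderings is tolerable only because microscopic subgraphs contain a handful of cars.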
3.3 Proposed spatiotemporal caching strategy
To obtain the suboptimal solution for the optimization problem (5)-(7), as shown in Fig. 4, the BS will first select the serving cars in the set based on the motifs detected from the V2V communication graph. In particular, we assume there are motifs and the Z-score of a motif is . Note that for any , . By observing the motif structure, we can determine the out-degree of each car, which is defined as the number of outgoing edges emanating from the car. For example, for the first microscopic subgraph in Fig. 3, the out-degrees of are 1, 2, and 0. Then, we choose the influential car as the node with the maximum out-degree in the corresponding motif, since the connections that originate from this car can reach more recipients compared with other cars, leading to a more effective content dissemination. Next, we can statistically acquire the frequency of car being the influential car in the th motif as . Based on that, as shown in Fig. 4, we can obtain the sum frequency of being the influential car in the motifs for car as,
(9) 
where captures the weight value of a motif , expressed as . After calculating the sum frequency of each car, we sort nodes from the car with the highest frequency to the one with the lowest frequency. Finally, the choice of best candidates to cache will be the first elements in the array.
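The ranking step can be sketched as below. Two points are assumptions: the motif weight is taken as the motif's Z-score divided by the sum of all motif Z-scores, and the frequency of a car is taken as the raw number of motif instances in which it has the largest out-degree (ties broken by car name). All names are hypothetical.

```python
from collections import Counter


def influential_car(instance):
    """Car with the largest out-degree among (sender, receiver) edges."""
    out_degree = Counter(sender for sender, _ in instance)
    return max(sorted(out_degree), key=lambda car: out_degree[car])


def select_serving_cars(instances_by_motif, zscores, num_serving):
    """Rank cars by Z-score-weighted influence counts and keep the top ones."""
    total_z = sum(zscores.values())
    score = Counter()
    for motif, instances in instances_by_motif.items():
        weight = zscores[motif] / total_z  # assumed normalized weight
        for instance in instances:
            score[influential_car(instance)] += weight
    ranked = sorted(score, key=lambda car: (-score[car], car))
    return ranked[:num_serving]
```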
After choosing the serving cars, the next step is to assign each receiving car to its optimal serving car. In particular, if the BS receives a request from a receiving car, the BS informs the car of the nearest serving car with the required content based on the collected location information. To solve (5)-(7), we decompose it into two problems. One is determining the optimal set of serving cars, which is solved by exploiting the temporal motifs. The other is completing the best assignments between the serving cars and the non-serving cars using the spatial knowledge.
4 Simulation Results
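The spatial assignment step reduces to a nearest-neighbor lookup. A minimal sketch, assuming planar coordinates in meters and that every serving car holds the requested file (both are simplifications), could look like this; `assign_nearest` is a hypothetical name.

```python
import math


def assign_nearest(positions, serving):
    """Map each non-serving car to its closest serving car."""
    serving = sorted(serving)  # fixed order makes tie-breaking deterministic
    assignment = {}
    for car, pos in positions.items():
        if car in serving:
            continue
        assignment[car] = min(
            serving, key=lambda s: math.dist(pos, positions[s])
        )
    return assignment
```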
Parameter  Meaning  Value 

Width of each lane  m  
Transmission power of base station  W  
Transmission power of V2V links  dBm  
Path loss exponent  
SINR threshold  dB  
Power of noise  dBm  
Bandwidth of the system  MHz  
Zipf exponent  
Total number of files in the network  
The maximum number of files the car can cache  
Approximate distance from BS to freeway  km  
Time constraint  100 s 
For our simulations, we use the vehicular mobility dataset within the city of Cologne, Germany, which is collected by the TAPASCologne project [19]. The dataset contains information about roads and vehicles as well as the trip information for each individual car in one day. In particular, we consider a km-length freeway (Autobahn 4 in Cologne) with three lanes in each direction, for which the nearest BS is the Colonius Fernsehturm. We collect the location information of a network of vehicles that coexist on the freeway from the data. All simulation parameters are summarized in Table 1.
Since the Poisson distribution can be used to capture the number of events occurring within a fixed interval [20], we assume that the number of wireless communication links between two arbitrary cars and that are in proximity of one another, in a given period of time, follows a Poisson distribution with parameter . To better simulate real-time data, we assume that is inversely proportional to the distance between and . This is because a closer distance leads to a better communication environment, resulting in a higher probability of building communication links. Then, we randomly assign a time stamp to each V2V communication link. In a baseline randomized V2V communication network, the communication instances are randomly generated and given a time stamp. Once the temporal graph data is generated, the motifs can be detected based on the approach in Section 3. In our analysis, a structure is identified as a motif when the Z-score of the structure is at least . For comparison, we use a location-based caching strategy. In this strategy, the basic principle is that the BS always selects the set of cars that minimizes the sum of distances between the remaining cars and their closest cars in the selected set.
We consider two simulation scenarios. In the first scenario, we consider total number of cars (serving and non-serving). As we change the number of serving cars, the number of non-serving cars will be modified accordingly. In the second scenario, we first randomly choose several car sets with different total numbers of cars. Then, based on the proposed strategy and the location-based strategy, we choose a fixed number of non-serving cars out of the selected sets. In particular, we select twelve car sets having a total number of cars ranging from to with a step of , and we fix the number of receiving cars to . According to the proposed method, we can detect different motifs in both scenarios, sorted from the highest to the lowest in terms of Z-score, from the real trace data and generated wireless communication data, as shown in Fig. 5 and Fig. 6. Based on these structures, we can observe the out-degree of each node. Then, using the proposed algorithm in Section 3, we can determine the set of cars used for caching.
Fig. 7 shows the average transmission rate achieved by non-serving cars under the location-based caching strategy and the proposed approach, with the total number of cars fixed at (first scenario). From Fig. 7, we observe that the spatio-temporal caching strategy yields a better performance compared with the location-based caching strategy in terms of average data rate per non-serving car. In particular, the performance advantage reaches up to when the number of serving cars is . Furthermore, as the number of serving cars increases, the number of non-serving cars will decrease. In particular, when the number of serving cars goes to , there are only cars requesting content, leading to reduced interference and an increase in the average data rate, as also seen in Fig. 7.
Fig. 8 shows the average data rate for non-serving cars as we vary the number of serving cars. We can observe that the proposed strategy outperforms the location-based caching strategy by up to when there are cars acting as serving nodes. Further, the average data rate for both strategies does not increase monotonically. This is due to the fact that an increase in the number of serving cars raises the interference over the V2V links. When the impact of interference cannot be compensated by the gain from V2V communication, the average data rate for V2V links will decrease, as seen in Fig. 8. Moreover, Fig. 9 shows the cumulative distribution functions (CDFs) of the data rate of non-serving cars for , , , and serving cars. Compared with the data rate resulting from the location-based strategy, the non-serving cars are more likely to achieve a higher data rate when employing the proposed caching strategy. Fig. 9 also shows that, under both caching strategies, the probability of achieving a higher data rate for serving cars is greater than the counterparts for , , and serving cars. In particular, when the number of serving cars is and the probability is , the proposed caching strategy improves the data rate by about compared to the location-based strategy.
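The synthetic link generation described above can be sketched as follows. The sketch assumes a Poisson rate equal to a constant divided by the inter-car distance, a uniformly random direction for each link, timestamps drawn uniformly over the observation window, and a hard proximity cutoff. The constant `rate_scale`, the cutoff `max_range_m`, and all function names are hypothetical; Poisson samples are drawn with Knuth's multiplication method to stay within the standard library.

```python
import math
import random


def poisson_sample(lam, rng):
    """Draw one Poisson(lam) sample using Knuth's multiplication method."""
    threshold = math.exp(-lam)
    count, product = 0, 1.0
    while True:
        product *= rng.random()
        if product <= threshold:
            return count
        count += 1


def generate_links(positions, rate_scale, horizon_s, max_range_m, seed=0):
    """Generate (sender, receiver, time) links for all car pairs in range."""
    rng = random.Random(seed)
    cars = sorted(positions)
    edges = []
    for i, a in enumerate(cars):
        for b in cars[i + 1:]:
            distance = math.dist(positions[a], positions[b])
            if distance == 0 or distance > max_range_m:
                continue  # skip co-located cars and cars out of proximity
            for _ in range(poisson_sample(rate_scale / distance, rng)):
                pair = (a, b) if rng.random() < 0.5 else (b, a)
                edges.append((pair[0], pair[1], rng.uniform(0.0, horizon_s)))
    return sorted(edges, key=lambda edge: edge[2])
```

Fixing the seed makes a generated trace reproducible, which helps when comparing motif counts against several randomized reference networks.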
5 Conclusions
In this paper, we have proposed a novel spatio-temporal caching policy in vehicular networks. In contrast to traditional location-based caching strategies, we have leveraged temporal graph motifs, which represent subgraphs with high frequency of occurrence in the V2V communication graph, to determine car candidates to cache popular content. We have developed an approach to detect the motifs and, then, have used the results to determine the preferred set of cars for popular content placement. Simulation results using real car location traces have shown that the proposed spatio-temporal caching strategy can yield significant gains in terms of the average data rate per car for two practical scenarios.
References
[1] K. Zheng, Q. Zheng, P. Chatzimisios, W. Xiang, and Y. Zhou, “Heterogeneous vehicular networking: A survey on architecture, challenges, and solutions,” IEEE Commun. Surveys Tuts., vol. 17, no. 4, pp. 2377–2396, Fourth Quarter 2015.
[2] Y. Wu, W. Putnam, J. Wang, and Z. Cheng, “A wireless peer-to-peer broadcast model for emergency vehicles using automotive networking,” in Proc. of IEEE Symposium Series on Computational Intelligence (SSCI), Athens, Greece, Dec. 2016.
[3] D. Raychaudhuri and N. B. Mandayam, “Frontiers of wireless and mobile communications,” Proc. IEEE, vol. 100, no. 4, pp. 824–840, Apr. 2012.
[4] IEEE, “IEEE trial-use standard for wireless access in vehicular environments (WAVE) - security services for applications and management messages,” IEEE Std 1609.2-2006, 2006.
[5] H. Wu, R. Fujimoto, R. Guensler, and M. Hunter, “MDDV: A mobility-centric data dissemination algorithm for vehicular networks,” in Proc. of ACM International Workshop on VehiculAr InterNETworking (VANET), Philadelphia, PA, USA, Oct. 2004.
[6] Y. Guan, Y. Xiao, H. Feng, C. Shen, and L. Cimini, “MobiCacher: Mobility-aware content caching in small-cell networks,” in Proc. of IEEE Global Communications Conference (GLOBECOM), Austin, TX, USA, Dec. 2014.
[7] S. Shivshankar and A. Jamalipour, “Spatio-temporal multicast grouping for content-based routing in vehicular networks: A distributed approach,” J. Netw. Comput. Appl., vol. 39, pp. 93–103, Mar. 2014.
[8] S. Hosny, A. Eryilmaz, A. Abouzeid, and H. Gamal, “Mobility-aware centralized D2D caching networks,” in Proc. of Annual Allerton Conference, Urbana-Champaign, IL, USA, Sept. 2016.
[9] U. Alon, “Network motifs: Theory and experimental approaches,” Nature Rev. Genet., vol. 8, pp. 450–461, Jun. 2007.
[10] M. Hellebrandt, R. Mathar, and M. Scheibenbogen, “Estimating position and velocity of mobiles in a cellular radio network,” IEEE Trans. Veh. Technol., vol. 46, no. 1, pp. 65–71, Feb. 1997.
[11] J. Hubaux, S. Capkun, and J. Luo, “The security and privacy of smart vehicles,” IEEE Security Privacy, vol. 2, no. 3, pp. 49–55, May-June 2004.
[12] C. Yu, K. Doppler, C. Robeiro, and O. Tirkkonen, “Resource sharing optimization for device-to-device communication underlaying cellular networks,” IEEE Trans. Wireless Commun., vol. 10, no. 8, pp. 2752–2763, Aug. 2011.
[13] F. Liu, Z. Chen, and B. Xia, “Data dissemination with network coding in two-way vehicle-to-vehicle networks,” IEEE Trans. Veh. Technol., vol. 65, no. 4, pp. 2445–2456, Apr. 2016.
[14] D. Malak, M. Al-Shalash, and J. Andrews, “Optimizing the spatial content caching distribution for device-to-device communications,” in Proc. of 2016 IEEE International Symposium on Information Theory (ISIT), Barcelona, Spain, July 2016.
[15] R. Karp, “Complexity of computer computations,” New York, USA: Springer Press, 1972.
[16] P. Holme and J. Saramäki, “Temporal networks,” Phys. Rep., vol. 519, no. 3, pp. 97–125, Oct. 2012.
[17] S. Wernicke, “Efficient detection of network motifs,” IEEE/ACM Trans. Comput. Biol. Bioinf., vol. 3, no. 4, pp. 347–359, Oct. 2006.
[18] R. Read and D. Corneil, “The graph isomorphism disease,” Journal of Graph Theory, vol. 1, pp. 339–363, Dec. 1977.
[19] Data/Scenarios/TAPASCologne, available at: http://sumo.dlr.de/wiki/Data/Scenarios/TAPASCologne.
[20] J. Kingman, “Poisson processes,” Oxford, UK: Oxford University Press, 1993.