Optimal Forwarding in Opportunistic Delay Tolerant Networks with Meeting Rate Estimations
Abstract
Data transfer in opportunistic Delay Tolerant Networks (DTNs) must rely on unscheduled, sporadic meetings between nodes. The main challenge in these networks is to develop a mechanism by which nodes can learn nearly optimal forwarding decision rules despite having no a priori knowledge of the network topology. The forwarding mechanism should ideally result in a high delivery probability, low average latency and efficient usage of the network resources. In this paper, we propose both centralized and decentralized single-copy message forwarding algorithms that, under relatively strong assumptions about the network’s behaviour, minimize the expected latencies from any node in the network to a particular destination. After proving the optimality of our proposed algorithms, we develop a decentralized algorithm that involves a recursive maximum likelihood procedure to estimate the meeting rates. We confirm the improvement that our proposed algorithms make in the system performance through numerical simulations on datasets from synthetic and real-world opportunistic networks.
1 Introduction
Delay or Disruption Tolerant Networks (DTNs) are a class of wireless mobile node networks in which the communication path between any pair of nodes is frequently unavailable. Nodes are thus only intermittently connected. DTNs were first studied in the 1990s when the research community considered how the Internet could be adapted for space communications [1]. Later, it was recognized that DTNs were a suitable model for several terrestrial networks.
DTNs can be categorized according to whether the node connections are scheduled (thus predictable) or random (hence unpredictable). Space communication networks fall into the first group. Networks belonging to the second category, which are the focus of this paper, are also referred to as opportunistic networks, because nodes seize the opportunity to transfer data when a communication channel becomes available. Opportunistic networks have been studied intensively in recent years (e.g., [2]) because they can fulfil a number of useful purposes, such as nonintrusive wildlife tracking (e.g., ZebraNet [5] and SWIM [6]), emergency response in disaster scenarios (e.g., ChaosFIRE [7]), provision of data communication to remote and rural areas (e.g., DakNet [8]), and traffic offloading in cellular networks (e.g., [9]).
In most opportunistic networks, the nodes are highly mobile and have a short radio range, and the density of nodes is low. In many cases, nodes have limited power and memory resources. These attributes combine with the intermittent connections to make routing traffic challenging. Routing is usually based on a store-carry-forward mechanism that exploits node mobility. In this mechanism, the source transmits its message to a node it meets. This intermediate node stores and then carries the received message until it meets another node to which it can forward the message. This process is repeated until the message reaches its destination. The key ingredients in designing an opportunistic network routing protocol are the forwarding decisions: Should a node forward a message to a neighbour it meets? Should it retain a copy for itself?
Although much research effort has been devoted to the development of opportunistic network routing algorithms [10], the existing algorithms are either centralized, have no performance guarantees, or ignore the need to estimate network parameters. Our work focuses on the mobile ad hoc network (MANET) setting, where node speed is much lower than in the vehicular ad hoc network (VANET) case, and we can assume that there are fewer restrictions on the amount of data that nodes can transfer when they meet. In this paper, we derive a decentralized routing algorithm that has performance guarantees (under simplifying assumptions about the network behaviour). When the meeting times between nodes are independent and exponentially distributed, the routing algorithm minimizes the expected latency in sending a packet from any source node to a specific destination. We examine the behaviour of the routing algorithm when the meeting rates are learned online using a recursive maximum likelihood procedure. We show that, for a stationary network, the decision rules and achieved expected latencies converge to those obtained when there is exact knowledge of the meeting rates. We present the results of simulations that compare the performance of the proposed algorithm to previous approaches, and examine how the algorithm is affected by practical network limitations (finite buffers, restrictions on data exchange, message expiry times).
Organization: The paper is organized as follows. In the following subsection, we discuss related work. In Section 2, we describe our system model and formulate the routing problem. In Section 3, we present the forwarding algorithms and discuss their optimality under the network modeling assumptions. We present numerical simulation results in Section 4 and make concluding remarks in Section 5.
1.1 Related Work
The first proposed approaches for routing in opportunistic networks were based on extending the concept of flooding to intermittently connected mobile networks. In these replication-based methods, a node forwards the messages stored in its buffer to all of (or to a fraction of) the nodes it encounters. There is no attempt to evaluate the capability of a given node to expedite the delivery. These routing algorithms have few parameters: they determine only how much replication can occur and which nodes can make copies of packets. One of the earliest algorithms was epidemic routing [10], in which a node forwards a message to any node it meets, provided that node has not previously received a copy of the message. Thus messages are quickly distributed through the connected portions of the network. Other replication-based approaches ([11]) reduce the transmission overhead of epidemic routing and improve its delivery performance through modification of the replication process and prioritization of messages. Models have been developed that allow an analytical characterization of the performance of the epidemic routing techniques [30]. Replication-based approaches result in a high probability of message delivery since more nodes have a copy of each message, but they can produce network congestion.
A step towards achieving more efficient routing approaches is to consider the history of node contacts in the network instead of blindly forwarding packets. History-based (also called utility-based) routing algorithms assume that nodes’ movement patterns are not completely random and that future contacts depend on the frequency and duration of past encounters. Based on these past observations, both the source and the intermediate nodes decide whether to forward a message to nodes they encounter or to store it and wait for a better opportunity. An early example is [13], which extends epidemic routing to situations with limited resources, incorporating a dropping strategy for the case when the buffer of a node is full. The dropping decisions are based on the meeting history of the node. PRoPHET [14] assigns a delivery probability metric to each node which indicates how likely it is that the message will be delivered to the destination by that particular node. This metric is updated each time two nodes meet, and thus takes into account the history of meetings in the network. MaxProp [15] and MEED [16] are other examples of history-based algorithms proposed for vehicular DTNs. In these networks, nodes move at higher speeds, reducing the amount of time they are in each other’s radio range. Hence, the two main limiting resources are the duration of time that nodes are able to transfer data and their storage capacities.
Other researchers have examined whether it is possible to exploit other characteristics of opportunistic networks to improve the performance of routing algorithms. Since social interactions often determine when connections between nodes occur, several algorithms strive to use social network concepts like betweenness centralities (e.g. [17]), or community formations (e.g. [18]). Other algorithms have attempted to take advantage of the strategic behaviour of nodes (e.g. [21]). Our work focuses on routing a message to a single destination, but there are connections to research that addresses the task of spreading information to multiple nodes in a network. Of particular interest is the gossip-based approach in [23], which greatly reduces the number of message copies in the network while achieving near-optimal dissemination.
The experiment-based studies demonstrate the efficiency of their proposed methods by running simulations on traces recorded from real-world opportunistic networks. Experimental analyses are valuable and take into account practical considerations, but they can leave us with an incomplete understanding of how an algorithm operates and how it will perform in other, untested network conditions. For example, the behaviour of PRoPHET has been shown to be very sensitive to parameter choice [32]. It is also useful to design an optimal algorithm under slightly less realistic modeling assumptions, and then consider how it can be adapted to address the practical limitations without completely losing its desirable features. More recent studies have focused on deriving a forwarding process whose optimality (in some sense) can be mathematically proved under assumptions about network behaviour. [24] extends the two-hop relay strategy of [11] by considering the expected delivery time to the destination as a metric to find the best set of candidate relays. By increasing the number of relaying steps recursively, a centralized single-copy multi-hop opportunistic routing scheme is proposed for sparse DTNs. The main defect of a centralized approach is that global knowledge of the network is required in order to make forwarding decisions.
There have been some efforts towards migrating to decentralized solutions that still provide performance guarantees. [25] proposes a decentralized time-sensitive algorithm called TOUR in which message priority is taken into account in addition to nodes’ expected latencies when making forwarding decisions. Although in TOUR each node only needs to be aware of the local information about the rates of contacts with its own set of neighbours, the algorithm assumes that the node knows the exact contact rates. In most practical scenarios, this assumption is not valid.
Some researchers have explored how imprecision in the measurement or estimation of network parameters can impact the performance of opportunistic network routing algorithms. In [26], Boldrini et al. discuss different sources of error that may exist in parameter estimation, such as missed encounters, incorrect combination of short contacts, and memory limitations. They model these errors as a random variable with a normal distribution and evaluate the performance of four different forwarding schemes under this model. Although this error analysis is useful, Boldrini et al. do not specify how parameters should be estimated in order to obtain a performance that approaches what can be achieved when perfect a priori knowledge of the network parameters is available.
Some of the results in this paper were presented in an earlier conference paper [33], but here we include more extensive experimental analysis and additional theoretical results.
2 System Model
We consider a network of mobile nodes which aim to send messages to a particular destination node . The set of nodes is denoted by . We assume that the random intermeeting times of nodes are independent and exponentially distributed with parameter for nodes and .
Although the aggregate intermeeting distributions of nodes in mobile ad hoc networks often follow a truncated power law [34], there is evidence that the intermeeting times of individual pairs of nodes can be adequately modeled by exponential distributions with heterogeneous coefficients [36]. In particular, Conan et al. [36] and Gao et al. [37] conduct statistical analyses of mobile social network data traces, including the Infocom data set [40] that we analyze in Section 4. They demonstrate that most pairs of nodes have intermeeting times that are approximately exponentially distributed. In [38], approximately exponential distributions of individual meeting times are detected through statistical analyses of car/taxi mobility traces.
We associate with the network a contact graph which is formed by adding a link between any two nodes that meet. We assume that the contact graph is connected and denote the set of neighbours of node in this graph by . Since the contacts between nodes are not pre-scheduled, we cannot identify end-to-end paths ahead of time. Hence, solving the routing task is equivalent to identifying the forwarding decisions that nodes should make when meeting each other. We assume that nodes’ buffer sizes are unlimited, that the message Time To Live (TTL) is infinite, and that nodes’ speed and message lengths are such that any number of messages can be forwarded during each meeting. We consider only algorithms that do not involve replication. In the class of algorithms we consider, each time node meets one of its neighbours , it forwards a message destined for with probability . Considering the matrix comprised of all pairs and , we set if nodes and never meet and are thus not neighbours in the contact graph. We denote the forwarding probabilities of node by the vector ; this is the th row of the matrix .
The expected latency from node to destination is a function of the probability decision matrix and we denote it by . Our goal is to find the matrix such that the sum of the expected latencies of all the nodes in the network to the specified destination is minimized. Let us call this utility function . We assume that the network topologies and meeting rates are such that the solution is unique. If not, our algorithms guarantee that we reach one of the optimal matrices, but the proofs are more complicated. The first step towards achieving this goal and finding matrix is to discover how the expected latency of an arbitrary node , , depends on the elements of the probability decision matrix in general. Lemma ? provides an expression for in terms of and . The proof is available in Appendix Section 6.
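As a hedged illustration of the kind of dependence such a lemma captures, the following sketch derives an expected-latency expression directly from the modeling assumptions stated above (independent exponential intermeeting times, an independent forwarding decision at each meeting). The symbols are notation assumed here for illustration and may differ from the paper's: $\lambda_{ij}$ is the meeting rate of nodes $i$ and $j$, $p_{ij}$ the forwarding probability, $\mathcal{N}_i$ the neighbour set of $i$, $L_i$ the expected latency of node $i$, and $d$ the destination with $L_d = 0$.

```latex
% Meetings with neighbour j at which i actually forwards form a thinned
% Poisson process of rate \lambda_{ij} p_{ij}. The waiting time until the
% first forwarding event is therefore exponential with rate
% \sum_{j} \lambda_{ij} p_{ij}, and the relay is j with probability
% \lambda_{ij} p_{ij} / \sum_{k} \lambda_{ik} p_{ik}.
\begin{align}
L_i &= \frac{1}{\sum_{j \in \mathcal{N}_i} \lambda_{ij} p_{ij}}
      + \sum_{j \in \mathcal{N}_i}
        \frac{\lambda_{ij} p_{ij}}{\sum_{k \in \mathcal{N}_i} \lambda_{ik} p_{ik}} \, L_j \\
    &= \frac{1 + \sum_{j \in \mathcal{N}_i} \lambda_{ij} p_{ij} L_j}
            {\sum_{j \in \mathcal{N}_i} \lambda_{ij} p_{ij}},
\qquad L_d = 0 .
\end{align}
```

In this form the latency of node $i$ is a weighted combination of its neighbours' latencies plus the expected waiting time, which is consistent with the statement that each node's expected latency depends on those of its neighbours.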
Based on the expression derived in Lemma ?, the expected latency of each node to the destination depends on the expected latencies of its neighbours. This result raises a substantial question: Does the probability decision matrix that minimizes the sum of expected latencies of all nodes of the network (), also minimize the expected latency of each individual node? Before continuing to propose algorithms for finding , we answer this question and make two points about the structure of through the following theorem. The proof is provided in Appendix Section 7.
Theorem ? shows that the minimization problem is actually a binary problem. Each time node meets one of its neighbours , it either forwards the message or keeps it. From now on, we change our notation and use the binary indicator matrix instead of to capture this binary decision. Therefore, the optimization takes the form:
Theorem ? also states that the optimum solution matrix can be equivalently achieved by minimizing the expected latency of each of the network nodes to the destination. This is the main idea of developing centralized and decentralized algorithms for finding . In the next section, we introduce the algorithms we have proposed for solving this optimization problem and prove that they find the optimal solution.
3 Algorithms
In the first part of this section, we find in a centralized fashion, where the whole topology and the meeting rates of the network are available at a central unit. This unit calculates a binary matrix and informs the nodes about the neighbours to which they should forward their buffered messages. We prove that the solution achieved upon completion of this algorithm is the same as the optimum solution . In the second part of the section, we introduce a decentralized algorithm and prove that it converges to the same global solution with probability . The advantage of the decentralized approach is that no node needs to have global knowledge of the network and each node can learn its own optimal forwarding decisions. The only piece of information a node needs to know is its meeting rates with its own neighbours. Finally, in the last part of the section, we make our model more realistic by assuming that nodes have no a priori knowledge of any meeting rates. In this more practical scenario, each node estimates the meeting rates with its neighbours, updating its estimates each time a contact occurs.
3.1 Centralized Approach with Global Knowledge
Suppose for each node , the set of neighbours and their meeting rates are known at a central calculation unit. Algorithm ? presents an iterative procedure to identify a binary decision matrix . In this algorithm, we first decide on the forwarding rules of the node that has the most frequent direct contacts with the destination. We refer to this node as node . In order to achieve the minimum expected latency to the destination, node should forward its generated or received messages only to the destination, ignoring its meetings with any other nodes. All other nodes that encounters meet the destination less frequently and, if they forward their messages to the destination through other nodes, these other nodes also meet the destination less frequently than . In subsequent steps of the algorithm, we consider all the nodes that have direct contacts with the nodes whose forwarding decision rules have already been made (the set ) and calculate the minimum latency that each of them can obtain by forwarding through nodes in to the destination. At the end of each iteration, we finalize the forwarding decision for the one node that can achieve the minimum latency and add it to . We repeat the procedure until the decision has been made for all the nodes of the network and the elements of the binary matrix have all been specified. The next theorem states that the binary matrix resulting from applying this procedure, as specified concretely in Algorithm ?, achieves the minimum sum of expected latencies to the destination. The proof can be found in Appendix Section 8.
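The iterative procedure described above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the paper's Algorithm: the function names, the dictionary-based `rates` structure, and the latency expression (1 + sum of rate times relay latency) / (sum of rates) are assumptions derived from the exponential meeting model. The sketch also relies on the observation that, for this ratio, the best relay set is a prefix of the candidates sorted by latency, so it simply evaluates every prefix.

```python
import math

def best_relay_set(rates_i, latency):
    """Return (minimum expected latency, relay list) for a single node.

    rates_i: dict neighbour -> meeting rate (assumed known).
    latency: dict node -> expected latency of nodes already decided.
    """
    # Only neighbours whose latency is already known are candidates.
    cands = sorted((j for j in rates_i if j in latency),
                   key=lambda j: latency[j])
    best, best_set = math.inf, []
    num, den = 1.0, 0.0
    for idx, j in enumerate(cands):
        num += rates_i[j] * latency[j]
        den += rates_i[j]
        if num / den < best:
            best, best_set = num / den, cands[:idx + 1]
    return best, best_set

def centralized_forwarding(rates, dest):
    """Finalize one node per iteration, in order of achievable latency.

    rates: dict node -> dict neighbour -> meeting rate (symmetric,
    every node appears as a key). Returns (latency, relays).
    """
    latency = {dest: 0.0}
    relays = {dest: []}
    remaining = set(rates) - {dest}
    while remaining:
        # Score each undecided node using only already-decided relays.
        scored = [best_relay_set(rates[i], latency) + (i,) for i in remaining]
        value, relay_set, node = min(scored, key=lambda t: (t[0], t[2]))
        if math.isinf(value):
            break  # remaining nodes cannot reach the destination
        latency[node], relays[node] = value, relay_set
        remaining.remove(node)
    return latency, relays
```

For a three-node example with destination 0, rates 1.0 between nodes 0 and 1, 0.1 between nodes 0 and 2, and 1.0 between nodes 1 and 2, the sketch finalizes node 1 first (forwarding only to the destination) and then lets node 2 forward to either node 0 or node 1, whichever it meets first.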
Theorem ? demonstrates that the iterative optimization procedure expressed in Algorithm ? finds the solution of the minimization problem in (Equation 1). If there is not a unique solution, then at some point in Algorithm ?, there will be multiple that solve the optimization in line 7. It is straightforward to show that choosing any one of these leads to a decision matrix that achieves the minimum expected latencies.
3.2 Decentralized Approach with Partial A Priori Knowledge
Suppose no central unit exists and each node is just aware of its own and the meeting rates . Algorithm ? demonstrates how nodes can make their binary forwarding decisions based on this local information. Since the expected latency of each node depends on the expected latency values of its neighbours, nodes need to have an estimation of their neighbours’ expected latencies to be able to make forwarding decisions. We denote by the estimate at node of the latency from node to the destination. In Algorithm ?, each time two nodes meet, they update these estimates and then recalculate their optimum forwarding rules. Theorem ? proves that this decentralized approach results in the same global optimum solution. The proof of Theorem ? is provided in Appendix Section 9.
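The meeting-driven update described above can be sketched as follows. This is an illustrative sketch under assumptions rather than the paper's Algorithm: the `Node` class, the `on_meeting` function, and the latency expression (the same ratio of 1 plus rate-weighted relay latencies over the summed rates, implied by the exponential model) are hypothetical names and choices. Each node stores only its own meeting rates and the latest latency reported by each neighbour.

```python
import math

class Node:
    def __init__(self, node_id, rates, is_dest=False):
        self.id = node_id
        self.rates = rates            # neighbour -> meeting rate (assumed known)
        self.is_dest = is_dest
        self.latency = 0.0 if is_dest else math.inf
        self.heard = {}               # neighbour -> latency it last reported
        self.relays = []              # neighbours this node forwards to

    def recompute(self):
        """Re-derive the forwarding rule from the latest neighbour reports."""
        if self.is_dest:
            return
        cands = sorted((j for j, l in self.heard.items() if l < math.inf),
                       key=lambda j: self.heard[j])
        best, best_set, num, den = math.inf, [], 1.0, 0.0
        for idx, j in enumerate(cands):
            num += self.rates[j] * self.heard[j]
            den += self.rates[j]
            if num / den < best:
                best, best_set = num / den, cands[:idx + 1]
        self.latency, self.relays = best, best_set

def on_meeting(a, b):
    """Exchange current latency estimates, then update both nodes."""
    la, lb = a.latency, b.latency
    a.heard[b.id] = lb
    b.heard[a.id] = la
    a.recompute()
    b.recompute()
```

The only value exchanged at a meeting is each node's current estimate of its own latency, which matches the information-sharing requirement described in the text.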
We refer to our proposed decentralized greedy latency minimization algorithm (Algorithm ?) as MinLat and evaluate its efficiency in different random and real-world networks based on certain performance metrics in Section 4. Regarding the computational complexity of finding the minimum expected latency in MinLat, the following lemma shows that the optimizations in lines 8 and 9 of this algorithm are linear fractional programs and can be solved quickly using variants of linear programming methods. Further details are available in Appendix Section 10.
Assuming that can be solved in polynomial order , the worst-case complexity order of Algorithm ? is because in the th round of this algorithm, should be solved for each of the nodes that are not in the set . In Algorithm ?, each time node meets one of its neighbours, it solves a problem of complexity . The only information that a node needs to share when it meets another node is its estimate of its own expected latency to the destination. In the general case where messages can be destined to any node in the network, this exchanged message could be a length vector of expected latencies to all nodes.
The following proposition provides a bound on the expected convergence time of Algorithm ?. The brief proof is provided in Appendix Section 9. The bound depends on the slowest meeting rate between each node and its candidate relay nodes. This is a conservative bound, since in practice, a node only needs to meet the relay nodes to which it actually forwards data under the optimum forwarding rule.
3.3 Decentralized Approach with No A Priori Knowledge
In Section 3.2, we assumed that as soon as a node meets another node, it has perfect knowledge of its meeting rate with that node. In practice, a node will need to estimate its meeting rates with its neighbours and periodically revise the estimates as meetings occur (or fail to occur). Consider an arbitrary pair of nodes that meet each other with rate . We denote the intermeeting time, which is the time between and meetings, by . For this specific pair of nodes, is an independent sample of an exponentially distributed random variable with parameter . Using the maximum likelihood (ML) approach we can estimate the parameter after samples. The likelihood function is maximized by
Hence, under the exponential model, a node only needs to remember the last time it met its neighbour and the number of times it has met that neighbour. With these two pieces of information, it can update its estimation of the meeting rate () from the previously estimated value () using the following equation.
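The recursive estimate can be sketched as follows, using the standard ML result for independent exponential samples: after n intermeeting times, the rate estimate is n divided by their sum, which can be rewritten as rate_n = n * rate_(n-1) / ((n - 1) + rate_(n-1) * tau_n). The class name `MeetingRateEstimator` and the choice to measure the first interval from a `start_time` are assumptions made for illustration.

```python
class MeetingRateEstimator:
    """Recursive ML estimate of an exponential meeting rate.

    Stores only the last meeting time, the meeting count, and the
    previous estimate (hypothetical helper, not from the paper).
    """

    def __init__(self, start_time=0.0):
        self.last_meeting = start_time  # assumed reference for the first interval
        self.count = 0
        self.rate = None                # undefined until the first meeting

    def update(self, now):
        tau = now - self.last_meeting   # newest intermeeting time, assumed > 0
        self.count += 1
        n = self.count
        if n == 1:
            self.rate = 1.0 / tau
        else:
            # Equivalent to n / (sum of all n intermeeting times).
            self.rate = n * self.rate / ((n - 1) + self.rate * tau)
        self.last_meeting = now
        return self.rate
```

A node would keep one such estimator per neighbour and call `update` with the current time at each contact; the result equals the batch estimate computed from all intermeeting times seen so far.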
Based on this argument, we develop a more practical version of MinLat in which an arbitrary pair of nodes and use their estimated meeting rates in their calculations and modify this estimation each time they meet. We refer to this version of MinLat as MinLatE.
Let denote the time since the network began operating, and denote by the decision matrix achieved by MinLatE at time . Further, denote by the estimate at node at time of the expected latency to the destination, when the forwarding decision matrix is . This estimate differs from that obtained in Algorithm 2, , because the distributed algorithm calculates them using estimated meeting rates . The following theorem states that the achieved expected latencies, and the estimated expected latencies, , converge in probability to the optimum expected latencies . The proof is provided in Appendix Section 11.
We check the claims of Theorem ? and investigate the convergence speed of MinLatE through simulations in Section 4.
4 Simulation Results
In this section, we investigate the efficiency of our proposed approach in modeling and solving the forwarding/routing problem in different opportunistic network scenarios. We first test our algorithms using three different networks to model the contacts between mobile nodes. The characteristics of the networks are derived from the Infocom05 dataset [40]. This data set is based on an experiment conducted during the IEEE Infocom 2005 conference in Miami where 41 Bluetooth-enabled devices (Intel iMote) were carried by attendees for 3 days. The start and end times of each contact between participants were recorded. The average time between node contacts in the Infocom05 dataset is seconds ( hours). In our processing, we only consider the contacts in which both devices recognized each other so that an acknowledged message could be transferred between them.
In the first network (Net I), we construct a contact graph using an evolving undirected network model based on the preferential attachment mechanism. We start with a small fully connected graph of vertices and add vertices to it one by one until the graph consists of nodes. At each step, the new vertex is connected to previously existing vertices. The probability that the new vertex is connected to vertex is where is the degree of up to this stage. After building the contact graph, we assign a parameter to each pair of nodes and which are connected in the contact graph and assume that they meet with exponentially distributed intermeeting times with parameter . We choose the parameters from a uniform distribution with the same expectation as the average of node meeting rates observed in the Infocom05 dataset.
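A construction of this kind can be sketched as follows. The parameter names `n`, `m0`, and `m` (final size, seed-graph size, links per new vertex) are hypothetical, and the sketch assumes the standard degree-proportional attachment rule; distinct targets are obtained by redrawing duplicates, which slightly perturbs the exact attachment probabilities.

```python
import random

def preferential_attachment_graph(n, m0, m, seed=None):
    """Grow an undirected contact graph by preferential attachment.

    Starts from a fully connected graph on m0 vertices (m0 >= 2) and adds
    vertices one at a time, each linked to m <= m0 distinct existing
    vertices chosen with probability proportional to current degree.
    Returns a set of edges (u, v) with u < v.
    """
    rng = random.Random(seed)
    edges = set()
    degree = [0] * n
    for u in range(m0):
        for v in range(u + 1, m0):
            edges.add((u, v))
            degree[u] += 1
            degree[v] += 1
    for new in range(m0, n):
        existing = list(range(new))
        weights = [degree[v] for v in existing]
        targets = set()
        while len(targets) < m:
            # Degree-proportional draw; duplicates are simply redrawn.
            targets.add(rng.choices(existing, weights=weights, k=1)[0])
        for v in targets:
            edges.add((v, new))
            degree[v] += 1
            degree[new] += 1
    return edges
```

Because every new vertex attaches to at least one existing vertex, the resulting contact graph is connected, as the system model requires.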
In Net II, we set to be equal to the inverse of the average intermeeting time between nodes and in the Infocom05 dataset. We are interested in the behaviour of the algorithms in relatively sparse networks, so we limit the number of neighbours of each node: node is only connected to node in the contact graph if the meeting rate is among the largest meeting rates of either node or node . In our simulations, the meeting times between nodes and for Net II are then chosen from an exponential distribution with parameter . In the third experimental network, Net III, we use the actual meeting times recorded in the Infocom05 dataset. The analysis in [37] indicates that the distribution of individual intermeeting times for most pairs of nodes can be approximated reasonably well by an exponential distribution; on the other hand, the aggregate distribution of contact times shows heavy-tailed behaviour and is better approximated using a truncated power-law distribution [35]. Table ? summarizes the properties of the test networks.
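The neighbour-limiting rule for Net II can be sketched as follows. The function name `sparsify_by_top_rates` and the parameter `k` (the number of largest rates retained per node) are hypothetical, and `rates` is assumed to be a symmetric dictionary in which every node appears as a key.

```python
def sparsify_by_top_rates(rates, k):
    """Keep link (i, j) if its rate is among the k largest of i or of j.

    rates: dict node -> dict neighbour -> meeting rate (symmetric).
    Returns a dict of the same shape containing only the kept links.
    """
    # For each node, the set of neighbours with its k largest rates.
    top = {i: set(sorted(nbrs, key=nbrs.get, reverse=True)[:k])
           for i, nbrs in rates.items()}
    kept = {i: {} for i in rates}
    for i, nbrs in rates.items():
        for j, lam in nbrs.items():
            if j in top[i] or i in top[j]:
                kept[i][j] = lam
    return kept
```

Because the test is symmetric in the two endpoints, the sparsified structure remains symmetric; a node can still end up with more than `k` neighbours when it ranks highly for many other nodes.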


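[Table: properties of the test networks (Net I, Net II, Net III). Only the column header "Parameters", the row label "I", and the entry "Dataset times" survive in the source.]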
As mentioned in Section 3, we call our proposed decentralized greedy latency minimization algorithm MinLat and refer to its more practical version with meeting rate estimations as MinLatE. In these two algorithms, the decisions that nodes make for future forwarding rules depend on the (estimated) meeting rates, which are derived from the frequency of past contacts between nodes. Thus, MinLat and MinLatE can be identified as history-based routing algorithms. In order to evaluate their performance, we compare them with existing history-based routing protocols that can be implemented in a distributed fashion and do not need a priori knowledge of the network topology. As mentioned in Section 1, the fixed point algorithm proposed in [24] is proved to result in the minimum expected latency (which is expected to be the same as the result of our proposed centralized Algorithm ?). However, the proposed algorithm in [24] is centralized and needs to be performed in a control unit where the whole topology of the network is known. The TOUR algorithm proposed in [25] is decentralized, but each node needs perfect a priori knowledge of its meeting rates with other nodes. Also, the main focus of TOUR is to find the optimum way to make forwarding decisions based on the priorities of messages. We have chosen PRoPHETv2 [32] and MaxProp [15] as the most appropriate candidates for comparison. We also compare to Epidemic routing [10], which is expected to result in a high delivery probability at the expense of high usage of network resources. The parameters of PRoPHETv2 are set to those suggested in [14] and [32], i.e., , , , and . In order to focus on the performance of the forwarding rules and eliminate the effect of the buffer management technique on this performance, we first test the algorithms on ideal network scenarios where the message lifetimes, buffer sizes and data exchanges have no restriction.
Therefore for these simulations, the dropping rules proposed for MaxProp in [15] are not applied. We then study the network behaviour when these practical challenges are added to the simulation setups.
We divide the Infocom05 dataset into slots of 12 hours. In each of these time slots, we build networks I to III using the nodes that are present in that period. The intermeeting time exponential parameters (s) are estimated based on the meetings that occurred in that specific time slot and networks I and II are constructed using these estimated parameters. For each of the first four 12-hour periods (the first two days of the conference), we send messages, spaced by second intervals, from randomly chosen source nodes to a particular destination. We terminate the simulation at the end of the 3-day period, and calculate the fraction of messages successfully delivered by each of the single-copy (PRoPHETv2, MinLat) and multi-copy (Epidemic, MaxProp) forwarding algorithms. For each algorithm, we also calculate the average latency of the messages that are delivered by all four algorithms; the average number of hops that messages pass to reach the destination; and the average buffer occupancy of the nodes. For each of the nodes of the network, we calculate the average of performance metrics over the time slots when that node has been chosen as a destination. Figure 1 shows the average and the confidence intervals of the four performance metrics for different destinations in the three test networks using Epidemic, PRoPHETv2, MaxProp, and MinLat forwarding algorithms.
There is no restriction on message lifetime or buffer size that can cause message dropping in these simulations. However, the delivery rates in some cases are less than because the simulations are terminated before all of the generated messages are successfully delivered. We observe that in all three test networks, MinLat performs better than the other existing history-based single-copy algorithm, PRoPHETv2, in terms of delivery rate and average latency. Its performance is also comparable to MaxProp, which is a history-based multi-copy algorithm. Also, noting that the scale of the vertical axis of Figure ? is logarithmic, we see that MinLat occupies much less of the nodes’ buffer memory on average. For networks I and II, where the assumption of exponentially distributed intermeeting times holds, delivering a higher rate of messages with lower average latencies than PRoPHETv2 is expected from Theorem ?. However, we observe that this result also holds for network III, where the actual meeting times are used. All algorithms display slightly poorer performance in network III; this is probably due to heavy-tailed and non-stationary intermeeting times.
In order to explore how the incorporation of meeting rate estimations in MinLatE affects the message delivery performance, we conduct further simulations with a different message generation scenario on network II. Figure 2 displays the average delivery latency as a function of time for the history-based routing algorithms PRoPHET, MaxProp, MinLat, and MinLatE. The delivery latency values are averaged over simulation runs while the destination node and message generation times are fixed in all the runs. Messages are generated with interarrival time of seconds at randomly chosen source nodes where is uniformly distributed in . Each point on the curves represents the average latency of the most recently sent messages. As Figure 2 shows, in all three single-copy routing algorithms (PRoPHET, MinLat, MinLatE), the average time it takes for a message to be delivered to the destination decreases as time goes by. This decreasing trend is due to the fact that the forwarding rules discovered by nodes improve as the nodes have more contacts and their information concerning their neighbours’ message delivery capabilities becomes more accurate. However, in the multi-copy routing algorithm, MaxProp, the average message delivery time increases in the beginning. In MaxProp, the weights assigned to the links are initialized to be equal, which means that nodes forward messages to more of their neighbours. There is thus a high level of message replication, which leads to messages reaching the destination sooner on average. As time passes, the level of replication decreases and the average delivery time increases. We also observe that MinLatE and MinLat eventually achieve the same average delivery latencies, as expected from Theorem ?. However, the convergence to the optimum point is slower in MinLatE due to the time it takes for nodes to obtain accurate estimates of their meeting rates.
We examine the performance of the forwarding algorithms in larger networks by extending network I to 100 nodes but with the same average parameter for exponential intermeeting times. We also make our model more realistic by adding some practical restrictions to the network model. First, we assume that messages have finite TTL, i.e., a message is discarded when its lifetime exceeds a certain threshold. Figure 3 displays the performance of routing algorithms for different values of TTL varying from to as large as the simulation time (almost seconds). Simulations are run times and in each round, a different destination is chosen uniformly at random from the network nodes. The average latency and average hop count are calculated only for the messages that reach the destination.
The simulation results in Figure ? show that decreasing the TTL has a similar overall effect on all of the algorithms. For larger TTLs, the delivery rate increases, but the buffer occupancy, average latency and hop count also increase. MinLat outperforms PRoPHET and MaxProp in terms of delivery rate, average latency, and buffer occupancy even in a scenario with a restricted message life time. Although intermeeting times are exponentially distributed and the contact graph is based on preferential attachment, in this larger network of 100 nodes, PRoPHET cannot reach percent delivery rate even without any restriction on TTL.
The next step towards a more realistic network model is to consider limits on buffer size. In the next set of simulations, we assume that TTL is seconds so that all algorithms reach their best possible delivery rate. We also assume that each node has a limited capacity for storing messages. When the buffer occupancy of a node reaches its limit, messages from other nodes are not forwarded to it. Moreover, any messages generated at a fully occupied node are immediately dropped. Figure 4 shows the performance of the algorithms for buffer sizes in the range of to messages. We see that increasing the buffer size improves the delivery rate for all algorithms. It also reduces the average latency because, as the buffer size increases, nodes can more frequently follow their optimum forwarding rules.
Finally, we assume that the contact duration is limited so that the number of messages exchanged during any meeting is restricted by an exchange limit. We set the buffer size to messages so that all the algorithms reach their best possible delivery rate. This buffer size implies MB of node memory if each message is KB. We check the effect of varying the exchange limit on the network performance. Figure 5 shows the four comparison metrics for exchange limits in the interval to messages. As we see in the figure, MinLat cannot achieve the optimum average latency for some values of exchange limit, but it still has the best performance in terms of buffer occupancy.
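The three practical restrictions introduced above (finite TTL, limited buffer size, and a per-meeting exchange limit) can be illustrated with a minimal Python sketch. The class and function names (`Message`, `Node`, `meet`) and the first-in-first-out forwarding order are assumptions made for illustration; they are not taken from the simulator used for the reported results, and the forwarding decision rule itself is omitted (the sender is assumed to be willing to forward to the receiver).

```python
from dataclasses import dataclass, field


@dataclass
class Message:
    msg_id: int
    created_at: float  # generation time in seconds


@dataclass
class Node:
    buffer_limit: int  # maximum number of messages held at once
    buffer: list = field(default_factory=list)

    def expire(self, now, ttl):
        """Discard every message whose lifetime exceeds the TTL."""
        self.buffer = [m for m in self.buffer if now - m.created_at <= ttl]

    def has_room(self):
        return len(self.buffer) < self.buffer_limit

    def generate(self, msg):
        """Store a locally generated message; drop it if the buffer is full."""
        if not self.has_room():
            return False
        self.buffer.append(msg)
        return True


def meet(sender, receiver, now, ttl, exchange_limit):
    """Move (single-copy) messages from sender to receiver during one meeting.

    Stops when the exchange limit is reached, the sender has nothing left,
    or the receiver's buffer is full. Returns the number of messages moved.
    """
    sender.expire(now, ttl)
    receiver.expire(now, ttl)
    sent = 0
    while sender.buffer and sent < exchange_limit and receiver.has_room():
        receiver.buffer.append(sender.buffer.pop(0))  # oldest message first
        sent += 1
    return sent
```

In this sketch a full receiver simply refuses further messages, which matches the policy described above; a different drop policy (for example, evicting the oldest message) would change the buffer occupancy results.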
The simulation results displayed in Figures 3 to 5 illustrate that although MinLat has been designed for an ideal network model, it has an acceptable performance when we impose realistic conditions such as finite TTL, buffer size, and exchange limit. Our final set of simulations explores the impact of including the recursive maximum likelihood estimation of meeting rates. We conduct experiments using both the centralized and decentralized algorithms operating on the extension of network I to 100 nodes. We select the parameters of intermeeting times from a uniform distribution . The destination node is chosen uniformly at random from the nodes of the network and is fixed throughout the simulation. We run the simulation for seconds. We examine the error terms in Theorem 4, averaging over all nodes. Figure ? shows the average absolute difference between the estimated expected latencies and the true minimum latencies. Figure ? shows the difference between the achieved average latencies and the true minimum latencies. Figure 6 indicates that the decentralized algorithm achieves almost the same estimation error as the centralized algorithm, suggesting that the limiting factor is the convergence of the meeting rate estimates rather than the dissemination of latency estimates through the network. As expected from Theorem ?, we see that the estimated and achieved errors both decay to zero over time (for achieved latencies, the error is almost zero for all nodes of the network after seconds).
Comparing Figures ? and ? shows that for both centralized and decentralized scenarios, the average difference between the achieved latency and the minimum latency is less than the average difference between the estimated latency and the minimum latency. This is expected, because the estimated latencies are based on incorrect decision matrices and estimated meeting rates, whereas the achieved latencies are derived from the actual meeting rates. The results suggest that even when there remains substantial inaccuracy in the expected latency estimates (e.g., at time seconds), the algorithm identifies a close-to-optimal forwarding decision matrix.
5 Conclusion
In this paper, we used an analytical framework to model the opportunistic data transfer among mobile devices in Delay Tolerant Networks. In our model, the random intermeeting times of nodes are assumed to be independent and exponentially distributed. We formulated the routing/forwarding problem as an optimization problem in which the goal is to minimize the sum of expected latencies from all nodes of the network to a particular destination. We proved that the solution of this problem is binary, i.e., when an arbitrary node meets any other node in the network, its optimum forwarding rule dictates either to always forward its messages to the encountered node or to never forward any messages to it. We also showed that the solution of this optimization problem minimizes the expected latency from each node of the network to the destination as well. Based on these results, we proposed centralized and distributed versions of an algorithm to find the optimum forwarding decision rules and proved that both algorithms result in the same solution. In order to evaluate the performance of the suggested algorithms in different synthetic and real-world networks, we chose four comparison metrics and compared our proposed decentralized algorithm (MinLat) with the most similar existing approaches. In order to evaluate the performance of MinLat in more realistic scenarios, we conducted simulations in larger networks with practical constraints such as limited message lifetime (TTL), buffer size, and message exchange.
One of the main contributions of this work is relaxing the condition of having complete knowledge of meeting rates at each node for making the forwarding decisions. We used a recursive maximum likelihood procedure (MinLatE) to learn the meeting rates online and proved its convergence in probability to the same optimal solution. The validity of this theoretical result was assessed through simulations. Moreover, we compared the convergence speed of the proposed centralized and decentralized algorithms when the meeting rates are estimated online. The simulation results show that the decentralized algorithm has almost the same convergence rate as the centralized algorithm, even though the network topology is not known at individual nodes.
In future work, we aim to explore the effect of time-varying meeting rates between nodes, which would motivate the use of filters to track the rates. We also hope to examine whether it is possible to derive similar expressions for expected latencies and optimal forwarding rules for cases when the intermeeting times are not exponentially distributed or are correlated.
6 Proof of Lemma
When node commences in its routing of a packet, it must first wait a time before it meets one of its neighbors. The amount of time before node meets a specific neighbor is an exponentially distributed random variable with parameter . The time is equal to the minimum of the exponentially distributed random variables corresponding to all neighbours and its expected value is
The probability that is the first node that meets is . Hence is
The last term in follows from the memoryless property of the exponential distributions. Substituting into leads to .
7 Proof of Theorem
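As a sketch of this argument in assumed notation, write $\lambda_{ij}$ for the meeting rate between nodes $i$ and $j$, $\mathcal{N}_i$ for the neighbours of $i$, $p_{ij}$ for the probability that $i$ forwards to $j$ when they meet, $W_i$ for the waiting time until the first meeting, and $D_i$ for the expected latency from $i$ to the destination. These symbols are chosen here for illustration and may differ from those used in the main text. The first line uses two standard facts about the minimum of independent exponential random variables; the second conditions on which neighbour is met first, with the $(1 - p_{ij}) D_i$ term reflecting the memoryless restart when the message is not forwarded.

```latex
\begin{align}
\mathbb{E}[W_i] &= \frac{1}{\sum_{k \in \mathcal{N}_i} \lambda_{ik}}, \qquad
\Pr\{ j \text{ is the first node to meet } i \} = \frac{\lambda_{ij}}{\sum_{k \in \mathcal{N}_i} \lambda_{ik}}, \\
D_i &= \mathbb{E}[W_i] + \sum_{j \in \mathcal{N}_i} \frac{\lambda_{ij}}{\sum_{k \in \mathcal{N}_i} \lambda_{ik}}
       \bigl[ p_{ij} D_j + (1 - p_{ij}) D_i \bigr], \\
\Longrightarrow \quad D_i &= \frac{1 + \sum_{j \in \mathcal{N}_i} \lambda_{ij}\, p_{ij}\, D_j}{\sum_{j \in \mathcal{N}_i} \lambda_{ij}\, p_{ij}} .
\end{align}
```

The last line follows by multiplying the second line by $\sum_{k} \lambda_{ik}$ and collecting the terms in $D_i$; it is well defined whenever node $i$ forwards to at least one neighbour with positive probability.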
Let us assume that the nodes, excluding , are labelled in ascending order of their expected latency under , i.e., . For a given matrix , we denote by all rows of except . If we fix and for for some , then is monotonically increasing with respect to (see ). This implies that if we commence with any and change only to decrease , then all other such that either decrease or remain the same. The matrix must therefore satisfy for all . Otherwise we could choose an alternative that reduces and hence achieves .
We can examine the partial derivative of with respect to at :
This derivative has the same sign as: , or equivalently . This expression for the derivative, together with the requirement that , implies that if and if . Our assumption that the solution is unique implies that . Otherwise, from , it is clear that we could choose any between 0 and 1 and achieve the same , without affecting any other . This establishes statement (1) of the theorem.
Although we have established that , we have not yet shown that globally minimizes . We establish this by contradiction. Suppose does not minimize the expected latency for some nonempty set of nodes . In other words, denoting the minimum expected latency achieved via the minimization in for node by , we have
Let node be the node in such that for all . Denote by the ranking of the node with greatest expected latency under such that . Based on the discussion above, for each node , for all and hence . Node must have at least one neighbour in the set . Otherwise, it could not achieve an expected latency under that is less than all nodes (observe from that ).
The matrix that achieves the minimum must satisfy for all , since for any matrix we have . We also have for if . For a fixed choice of the value decreases if we can reduce for any such that . The matrix minimizes for all , implying that for all . Since for all , it follows that . For node , the values of for have no impact on , so we have . This contradicts the original assumption that does not minimize the latency for all nodes , and thus establishes statement (2) of the theorem.
8 Proof of Theorem
We observe that for all , (since the optimizations are the same). Based on Theorem 1 and its proof, the equality holds only if for all such that . The statements in the theorem follow based on an induction argument.
Suppose, without loss of generality, that the nodes are labelled in ascending order of expected latency under . For node , the only neighbour with lower expected latency is the destination. In iteration 1, the destination is included in and must be in . Recall that . Node 1 has the minimum expected latency according to the chosen labelling and Theorem 1, except for the destination itself. The relationship thus implies that . We therefore have , and node is selected to be added to , with and for all . Statements 1a) to 1c) in the theorem clearly hold after one iteration, i.e., after the addition of node 1 to .
Assume the same statements hold after the addition of node to . Then, for node we must have for all such that . Again this implies that for all . Thus, node is correctly selected for addition to and the statements 1a) to 1c) hold at the end of iteration .
It follows that the statements hold for all iterations of the algorithm, and after completion, when , the second statement follows.
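The iterative construction used in this proof, in which nodes are finalized one at a time in ascending order of expected latency, can be sketched in Python as follows. This is a minimal illustration under the paper's assumptions (known meeting rates, binary forwarding decisions), not a transcription of the algorithm listing in the main text; the function names and the dictionary-of-dictionaries representation `rates[i][j]` for the meeting rate between nodes `i` and `j` are hypothetical.

```python
import math


def best_latency(i, rates, latency, finalized):
    """Minimum expected latency of node i when it may forward only to
    already-finalized neighbours. Returns (latency, chosen relay list)."""
    candidates = sorted(
        (latency[j], j) for j in rates[i] if j in finalized and rates[i][j] > 0
    )
    num, den, relays, best = 1.0, 0.0, [], math.inf
    for d_j, j in candidates:
        # Adding relay j helps only if its latency is below the current value,
        # because the new value is a weighted average of the two.
        if d_j >= best:
            break
        num += rates[i][j] * d_j
        den += rates[i][j]
        relays.append(j)
        best = num / den
    return best, relays


def min_latencies(rates, dest):
    """Greedy, Dijkstra-like computation of minimum expected latencies."""
    latency, relays = {dest: 0.0}, {dest: []}
    finalized = {dest}
    remaining = set(rates) - finalized
    while remaining:
        trial = {i: best_latency(i, rates, latency, finalized) for i in remaining}
        i_star = min(trial, key=lambda i: trial[i][0])
        if math.isinf(trial[i_star][0]):
            break  # the remaining nodes have no path to the destination
        latency[i_star], relays[i_star] = trial[i_star]
        finalized.add(i_star)
        remaining.remove(i_star)
    return latency, relays
```

For example, with rates 1.0 between `a` and the destination, 0.1 between `b` and the destination, and 1.0 between `a` and `b`, node `a` is finalized first with latency 1.0, after which `b` forwards to both the destination and `a` and attains latency 2/1.1.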
9 Proof of Theorem and Proposition
9.1 Proof of Theorem
Assume that the nodes are labelled in order of ascending expected latency under , i.e., . Denote by the moment of time at which node meets the destination node. For denote by the earliest time by which node has met all nodes in the set in the time period . Due to the assumption that the intermeeting times are exponentially distributed, is finite with probability .
At , node learns its meeting rate with the destination (). Since the estimated latencies are initialized to and due to the update equations in Algorithm ?, the estimates that node has at of the latencies of its neighbors are upper bounds, i.e., . As discussed in the proof of the previous theorems, the minimizer has and for all . At time , since the term involving in the update equation of Algorithm ? has its minimum value, the vector identifies the same minimum latency . Hence, immediately after time we are guaranteed that .
At , node is aware of the minimum expected latencies for the nodes in the set . All other expected latencies are upper bounds, i.e., for . The solution takes value only for nodes in . The minimizer at time is thus equal to and achieves . Therefore, immediately after we will have . This argument applies until just after , at which point we have . Since is finite with probability , the statement of the theorem follows.
9.2 Proof of Proposition
Algorithm ? has converged when all of the nodes have met all of their relay candidates and have identified their optimum forwarding rules. Thus, , where is defined above in the proof of Theorem ?, is the expected convergence time. Considering the worst case where an arbitrary node is connected to all the nodes in the set , we have
where denotes the intermeeting time of node with its neighbour and follows an exponential distribution with parameter . Using a standard result for the expected value of the maximum of nonidentical independent exponential random variables, we have
An upper bound on this value is . Therefore, an upper bound on the convergence time of Algorithm ? is .
10 Proof of Lemma
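The standard result referred to here, the expected maximum of independent exponential random variables with different rates, can be evaluated by inclusion-exclusion. The Python sketch below is an illustration under that assumption; the function names are hypothetical, and the loose bound it computes (the maximum never exceeds the sum) is a simple bound chosen for illustration, not necessarily the bound derived in this proof.

```python
from itertools import combinations


def expected_max_exponentials(rates):
    """E[max_j X_j] for independent X_j ~ Exp(rates[j]).

    Uses inclusion-exclusion over all non-empty subsets S:
    sum of (-1)^(|S|+1) / (sum of the rates in S).
    The cost grows exponentially with len(rates).
    """
    total = 0.0
    for k in range(1, len(rates) + 1):
        sign = 1.0 if k % 2 == 1 else -1.0
        for subset in combinations(rates, k):
            total += sign / sum(subset)
    return total


def simple_upper_bound(rates):
    """Loose bound: max_j X_j <= sum_j X_j, so E[max] <= sum_j 1/rate_j."""
    return sum(1.0 / r for r in rates)
```

For two neighbours with rates 1 and 2, the expected time until both have been met is 1 + 1/2 - 1/3, which is about 1.167, while the loose bound gives 1.5.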
We return to considering a vector of probabilistic decision variables instead of a vector of binary decision variables (we still expect the optimal solution to be binary). Without loss of generality, we relabel the nodes by labels . The optimization problem that node tries to solve in Algorithm ? has the following general form, which is known as a linear-fractional program:
where , , , , , ,and the elements of the matrix are
After applying the following change of variables (the Charnes-Cooper transformation), we have
and the optimization problem in Equation 6 becomes
which is a linear optimization problem and can be solved using Linear Programming (LP) solution methods.
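For reference, the Charnes-Cooper transformation in its generic textbook form is sketched below. The symbols $c$, $d$, $\alpha$, $\beta$, $A$, $b$, $x$, $y$ and $t$ are assumed here for illustration and need not coincide with the quantities defined in this proof; the sketch also assumes that the denominator is strictly positive on the feasible set.

```latex
\begin{align}
&\text{Linear-fractional program:} &
\min_{x}\ & \frac{c^{\top} x + \alpha}{d^{\top} x + \beta}
\quad \text{s.t.}\quad A x \le b,\ \ x \ge 0, \\
&\text{Change of variables:} &
y &= \frac{x}{d^{\top} x + \beta}, \qquad t = \frac{1}{d^{\top} x + \beta}, \\
&\text{Equivalent linear program:} &
\min_{y,\,t}\ & c^{\top} y + \alpha t
\quad \text{s.t.}\quad A y - b\, t \le 0,\ \ d^{\top} y + \beta t = 1,\ \ y \ge 0,\ \ t \ge 0 .
\end{align}
```

A solution of the original problem is recovered as $x = y / t$ whenever the optimal $t$ is positive.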
11 Proof of Theorem
We first prove the statement of the theorem for estimated latencies. Without loss of generality, we relabel the nodes such that . At node , MinLatE forms estimates of the meeting rates with the contact graph neighbours, for , and the expected latencies from each neighbour . The result is a sequence of random variables , with a variable being added to the sequence each time node meets another node. [41] shows that the maximum likelihood estimator of the meeting rates is consistent, i.e., . More precisely,
Equivalently, writing , we have for any and , there exists a such that for all :
where denotes the estimate of the meeting rate between nodes and at time .
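To make the estimator concrete: for exponentially distributed intermeeting times, the maximum likelihood estimate of the rate after a given number of observed gaps is the number of gaps divided by their total duration, and it can be updated recursively at each meeting. The Python sketch below illustrates this standard estimator; the class name `MeetingRateEstimator` is hypothetical, and the exact recursion used by MinLatE (for instance, how the time before the first meeting is treated) may differ.

```python
class MeetingRateEstimator:
    """Recursive maximum likelihood estimate of an exponential meeting rate."""

    def __init__(self):
        self.count = 0       # number of intermeeting gaps observed so far
        self.mean_gap = 0.0  # running mean of the observed gaps (seconds)

    def update(self, gap):
        """Fold one new intermeeting gap into the estimate and return the rate."""
        self.count += 1
        # Running-mean recursion: m_n = m_{n-1} + (x_n - m_{n-1}) / n
        self.mean_gap += (gap - self.mean_gap) / self.count
        return self.rate

    @property
    def rate(self):
        """MLE of the rate: the reciprocal of the mean gap (0.0 before any data)."""
        return 1.0 / self.mean_gap if self.mean_gap > 0 else 0.0
```

Each node would keep one such estimator per neighbour, so that the memory cost is constant per neighbour regardless of how many meetings have occurred.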
We show that for any set of meeting rates , there exists an for which the optimum forwarding decision matrix in MinLatE () is the same as the optimum forwarding decision matrix in MinLat () with desirably high probability. In order to do so, we find upper and lower bounds (that apply with high probability) on the estimated expected latencies for the optimal decision matrices identified by both MinLat and MinLatE. We first demonstrate a relationship between the estimated and true expected latencies that would hold if the nodes employed the optimal decision matrix . We show that there exists a such that for each node , with probability greater than , for any positive , we have for all :
We derive by induction. From the arguments made in Theorem 3, we know that under an arbitrary node will not forward any messages to a node that does not belong to the set . For , after time such that holds, we have, with probability greater than for :
Suppose holds for all the nodes . Denote by the last meeting between node and that occurs subsequent to but prior to a considered time . Then we can identify a so that the following relationship holds with probability greater than for :
Here we have chosen to be sufficiently large such that the inequality on the second line holds with probability exceeding . Similarly we can identify a so that the lower bound in holds with probability greater than for . By taking , we see that holds for node as well, and by induction, holds for all nodes .
We now turn our attention to the forwarding matrix determined by MinLatE (). We first consider a scenario where the estimates of the meeting rates are frozen after a certain time . The minimization procedure in MinLatE is the same as in MinLat, but operates on the estimates of the meeting rates. If these estimates are held constant, then the results in Theorems 1 to 3 apply, with the substitution of anywhere we make use of . With probability 1, the optimization algorithm will thus converge after a finite time , and the estimated expected latencies will be consistent across the network. Hence, there exists a labeling for which . We can now employ the same argument that was used above for to determine that there is a finite time such that the following bound holds for all with probability greater than for all .
In the actual MinLatE algorithm, the meeting rates are not frozen after , but continue to be updated as more meetings occur. This only results in the probabilistic bounds on being tighter, and hence can only tighten the bounds on .
To avoid having node-specific bounds on the accuracy of the estimates, we can rewrite the bounds as:
This bound holds for both and with probability at least after some time .
Our goal is to show that there exists a moment of time after which is true with desirably high probability. We can accomplish this by showing that there exists an (and thus an associated time ) for which the upper bound on is less than the lower bound on for any for all . If this is the case, then with probability exceeding , the optimization procedure that derives will set it to , because it minimizes the estimated latencies. Hence, should satisfy
which leads to
where .
Now that we have established that after a finite amount of time occurs with a probability desirably close to one, we can show that for any using the following properties:
If , then (Continuous Mapping Theorem);
If , then ;
If , then ;
If and , then (Slutsky’s Theorem),
where denotes the convergence in distribution and is an arbitrary continuous function.
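For reference, the four properties can be written in their usual textbook form as follows. The symbols $X_n$, $Y_n$, $X$, $c$ and $g$ are assumed here for illustration, with $\xrightarrow{p}$ and $\xrightarrow{d}$ denoting convergence in probability and in distribution, $c$ a constant, and $g$ a continuous function; the assignment of the second and third statements to properties 2 and 3 is an assumption.

```latex
\begin{enumerate}
  \item If $X_n \xrightarrow{p} X$, then $g(X_n) \xrightarrow{p} g(X)$ (Continuous Mapping Theorem);
  \item If $X_n \xrightarrow{p} X$, then $X_n \xrightarrow{d} X$;
  \item If $X_n \xrightarrow{d} c$ for a constant $c$, then $X_n \xrightarrow{p} c$;
  \item If $X_n \xrightarrow{d} X$ and $Y_n \xrightarrow{p} c$, then
        $X_n + Y_n \xrightarrow{d} X + c$ and $X_n Y_n \xrightarrow{d} cX$ (Slutsky's Theorem).
\end{enumerate}
```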
Again consider the node labelling such that . For node , which is obvious from property 1. For any node , if holds for any , then due to properties 2 to 4 we have,
and
Property 1 in combination with and results in the statement of the theorem.
For the achieved expected latencies, , as opposed to those estimated at the nodes, the proof is more straightforward. For a given decision matrix, , the expected latencies are functions of the true meeting rates and are thus not random variables. Thus, the sequence is only a function of the random sequences , via the optimization that determines . Since we have established that converges in probability to , it follows that converges in probability to due to property 1.
References
[1] A. Voyiatzis, “A survey of delay- and disruption-tolerant networking applications,” J. Internet Engineering, vol. 5, no. 1, 2012.
[2] M. J. Khabbaz, C. M. Assi, and W. F. Fawaz, “Disruption-tolerant networking: A comprehensive survey on recent developments and persisting challenges,” IEEE Commun. Surveys & Tutorials, vol. 14, no. 2, pp. 607–640, 2012.
[3] Y. Cao and Z. Sun, “Routing in delay/disruption tolerant networks: A taxonomy, survey and challenges,” IEEE Commun. Surveys & Tutorials, vol. 15, no. 2, pp. 654–677, 2013.
[4] K. Wei, X. Liang, and K. Xu, “A survey of social-aware routing protocols in delay tolerant networks: applications, taxonomy and design-related issues,” IEEE Commun. Surveys & Tutorials, vol. 16, no. 1, pp. 556–578, 2014.
[5] P. Juang, H. Oki, Y. Wang, M. Martonosi, L. S. Peh, and D. Rubenstein, “Energy-efficient computing for wildlife tracking: Design tradeoffs and early experiences with ZebraNet,” in ACM SIGPLAN Notices, vol. 37, no. 10, 2002, pp. 96–107.
[6] T. Small and Z. J. Haas, “The shared wireless infostation model: a new ad hoc networking paradigm (or where there is a whale, there is a way),” in Proc. ACM Int. Symp. Mobile Ad hoc Net. & Comput. (MobiHoc), Annapolis, MD, Jun. 2003, pp. 233–244.
[7] B. E. Pataki and L. Kovács, “Sensor data collection experiments with chaoster in the fed4fire federated testbeds,” in IEEE Wireless and Mobile Comput., Net. and Comm. (WiMob), Larnaca, Cyprus, Oct. 2014.
[8] A. Pentland, R. Fletcher, and A. Hasson, “DakNet: Rethinking connectivity in developing nations,” IEEE Computer, vol. 37, no. 1, pp. 78–83, Jan. 2004.
[9] B. Han, P. Hui, V. A. Kumar, M. V. Marathe, J. Shao, and A. Srinivasan, “Mobile data offloading through opportunistic communications and social participation,” IEEE Trans. Mobile Comput., vol. 11, no. 5, pp. 821–834, May 2012.
[10] A. Vahdat and D. Becker, “Epidemic routing for partially connected ad hoc networks,” Duke Univ., Durham, NC, USA, Tech. Rep., 2000.
[11] M. Grossglauser and D. Tse, “Mobility increases the capacity of ad-hoc wireless networks,” in Proc. IEEE Infocom, vol. 3, Anchorage, AK, USA, Apr. 2001, pp. 1360–1369.
[12] T. Spyropoulos, K. Psounis, and C. S. Raghavendra, “Spray and wait: an efficient routing scheme for intermittently connected mobile networks,” in Proc. ACM SIGCOMM Workshop on Delay-tolerant Networks, Philadelphia, PA, USA, Aug. 2005, pp. 252–259.
[13] J. A. Davis, A. H. Fagg, and B. N. Levine, “Wearable computers as packet transport mechanisms in highly-partitioned ad-hoc networks,” in Proc. IEEE Int. Symp. on Wearable Computers, Zurich, Switzerland, Oct. 2001, pp. 141–148.
[14] A. Lindgren, A. Doria, and O. Schelén, “Probabilistic routing in intermittently connected networks,” ACM SIGMOBILE Mobile Comput. and Commun. Rev., vol. 7, no. 3, pp. 19–20, Jul. 2003.
[15] J. Burgess, B. Gallagher, D. Jensen, and B. N. Levine, “MaxProp: Routing for vehicle-based disruption-tolerant networks,” in Proc. IEEE Infocom, vol. 6, Barcelona, Spain, Apr. 2006, pp. 1–11.
[16] E. P. Jones, L. Li, J. K. Schmidtke, and P. A. Ward, “Practical routing in delay-tolerant networks,” IEEE Trans. Mobile Comp., vol. 6, no. 8, pp. 943–959, 2007.
[17] E. M. Daly and M. Haahr, “Social network analysis for routing in disconnected delay-tolerant MANETs,” in Proc. ACM Int. Symp. Mobile Ad hoc Net. & Comput. (MobiHoc), Montreal, Canada, Sep. 2007, pp. 32–40.
[18] P. Hui, J. Crowcroft, and E. Yoneki, “Bubble rap: Social-based forwarding in delay-tolerant networks,” IEEE Trans. Mobile Comput., vol. 10, no. 11, pp. 1576–1589, 2011.
[19] D. A. Sharma and M. Coates, “Contact graph based routing in opportunistic networks,” in Proc. IEEE Global Conf. Sig. and Info. Proc. (GlobalSIP). Austin, TX, USA: IEEE, Dec. 2013.
[20] M. Xiao, J. Wu, and L. Huang, “Community-aware opportunistic routing in mobile social networks,” IEEE Trans. Computers, vol. 63, no. 7, pp. 1682–1695, 2013.
[21] Y. Li, G. Su, D. O. Wu, D. Jin, L. Su, and L. Zeng, “The impact of node selfishness on multicasting in delay tolerant networks,” IEEE Trans. Veh. Tech., vol. 60, no. 5, pp. 2224–2238, 2011.
[22] P. Sermpezis and T. Spyropoulos, “Understanding the effects of social selfishness on the performance of heterogeneous opportunistic networks,” Computer Commun., vol. 48, pp. 71–83, 2014.
[23] H. Zhang, Z. Zhang, and H. Dai, “Gossip-based information spreading in mobile networks,” IEEE Trans. Wireless Commun., vol. 12, no. 11, pp. 5918–5928, 2013.
[24] V. Conan, J. Leguay, and T. Friedman, “Fixed point opportunistic routing in delay tolerant networks,” IEEE J. Sel. Areas in Commun., vol. 26, no. 5, pp. 773–782, 2008.
[25] M. Xiao, J. Wu, C. Liu, and L. Huang, “Tour: Time-sensitive opportunistic utility-based routing in delay tolerant networks,” in Proc. IEEE Infocom, Turin, Italy, Apr. 2013, pp. 2085–2091.
[26] C. Boldrini, M. Conti, and A. Passarella, “Modelling social-aware forwarding in opportunistic networks,” in Proc. PERFORM (LNCS 6821). Vienna, Austria: Springer-Verlag, Oct. 2010, pp. 141–152.
[27] ——, “Performance modelling of opportunistic forwarding with imprecise knowledge,” in Proc. Int. Symp. Model. and Opt. in Mobile, Ad Hoc and Wireless Net. (WiOpt), 2012, pp. 216–223.
[28] R. Ramanathan, R. Hansen, P. Basu, R. Rosales-Hain, and R. Krishnan, “Prioritized epidemic routing for opportunistic networks,” in Proc. ACM Int. Workshop on Mobile Opportunistic Networks, San Juan, Puerto Rico, Jun. 2007, pp. 62–66.
[29] M. Khouzani, S. Eshghi, S. Sarkar, N. B. Shroff, and S. S. Venkatesh, “Optimal energy-aware epidemic routing in DTNs,” in Proc. ACM Int. Symp. on Mobile Ad hoc Net. & Comput. (MobiHoc), South Carolina, USA, Jun. 2012, pp. 175–182.
[30] D. J. Klein, J. Hespanha, and U. Madhow, “A reaction-diffusion model for epidemic routing in sparsely connected MANETs,” in Proc. IEEE Infocom, San Diego, USA, Mar. 2010, pp. 1–9.
[31] Q. Wang and Z. J. Haas, “Analytical model of epidemic routing for delay-tolerant networks,” in Proc. ACM Int. Workshop on High Perf. Mobile Opportunistic Sys., Paphos, Cyprus, Oct. 2012, pp. 1–8.
[32] S. Grasic, E. Davies, A. Lindgren, and A. Doria, “The evolution of a DTN routing protocol - PRoPHETv2,” in Proc. ACM Workshop on Challenged Networks, Las Vegas, NV, USA, Sept. 2011, pp. 27–30.
[33] S. Shaghaghian and M. Coates, “Opportunistic networks: Minimizing expected latency,” in IEEE Wireless and Mobile Comput., Net. and Comm. (WiMob), Larnaca, Cyprus, Oct. 2014.
[34] H. Cai and D. Y. Eun, “Crossing over the bounded domain: from exponential to power-law intermeeting time in mobile ad hoc networks,” IEEE/ACM Transactions on Networking, vol. 17, no. 5, pp. 1578–1591, Oct. 2009.
[35] A. Chaintreau, P. Hui, J. Crowcroft, C. Diot, R. Gass, and J. Scott, “Impact of human mobility on opportunistic forwarding algorithms,” IEEE Trans. Mob. Comput., vol. 6, no. 6, pp. 606–620, Jun. 2007.
[36] V. Conan, J. Leguay, and T. Friedman, “The heterogeneity of inter-contact time distributions: its importance for routing in delay tolerant networks,” 2007, arXiv:cs/0609068v2 [cs.NI], LIP6.
[37] W. Gao, Q. Li, B. Zhao, and G. Cao, “Multicasting in delay tolerant networks: A social network perspective,” in Proc. ACM MobiHoc, New Orleans, LA, USA, May 2009, pp. 299–308.
[38] K. Lee, Y. Yi, J. Jeong, H. Won, I. Rhee, and S. Chong, “Max-contribution: On optimal resource allocation in delay tolerant networks,” in Proc. IEEE Infocom, San Diego, CA, USA, Mar. 2010, pp. 1–9.
[39] H. Zhu, L. Fu, G. Xue, Y. Zhu, M. Li, and L. M. Ni, “Recognizing exponential inter-contact time in VANETs,” in Proc. IEEE Infocom, San Diego, CA, USA, Mar. 2010, pp. 1–5.
[40] J. Scott, R. Gass, J. Crowcroft, P. Hui, C. Diot, and A. Chaintreau, “CRAWDAD data set cambridge/haggle (v. 2006-01-31),” Downloaded from http://crawdad.org/cambridge/haggle/, Jan. 2006.
[41] R. H. Berk, “Consistency and asymptotic normality of MLE’s for exponential models,” The Annals of Math. Stat., vol. 43, no. 1, pp. 193–204, 1972.