Distributed Storage in Mobile Wireless Networks with Device-to-Device Communication
Abstract
We consider the use of distributed storage (DS) to reduce the communication cost of content delivery in wireless networks. Content is stored (cached) in a number of mobile devices using an erasure correcting code. Users retrieve content from other devices using device-to-device communication or from the base station (BS), at the expense of higher communication cost. We address the repair problem when a device storing data leaves the cell. We introduce a repair scheduling where repair is performed periodically and derive analytical expressions for the overall communication cost of content download and data repair as a function of the repair interval. The derived expressions are then used to evaluate the communication cost entailed by DS using several erasure correcting codes. Our results show that DS can reduce the communication cost with respect to the case where content is downloaded only from the BS, provided that repairs are performed frequently enough. If devices storing content arrive to the cell, the communication cost using DS is further reduced and, for large enough arrival rate, it is always beneficial. Interestingly, we show that MDS codes, which do not perform well for classical DS, can yield a low overall communication cost in wireless DS.
 BS
 base station
 cdf
 cumulative distribution function
 CDN
 content delivery network
 c.u.
 cost units
 D2D
 device-to-device
 DS
 distributed storage
 ECC
 erasure correcting code
 i.i.d.
 independent, identically distributed
 LRC
 locally repairable code
 MBR
 minimum bandwidth regenerating
 MDS
 maximum distance separable
 MIMO
 multiple input multiple output
 MSR
 minimum storage regenerating
 OFDM
 orthogonal frequency division multiplexing
 P2P
 peer-to-peer
 pdf
 probability density function
 pmf
 probability mass function
 RV
 random variable
 t.u.
 time unit
I Introduction
It is predicted that the global mobile data traffic will exceed 30 exabytes per month by 2020, nearly a tenfold increase compared to the traffic in 2015 [1]. This dramatic increase threatens to completely congest the already burdened wireless networks. One popular approach to reduce peak traffic is to store popular content closer to the end users, a technique known as caching. The idea is to deploy a number of access points (called helpers) with large storage capacity, but low-rate wireless backhaul, and store data across them [2, 3]. Users can then download content from the helpers, resulting in a higher throughput per user. In [4] it was suggested to store content directly in the mobile devices, taking advantage of the high storage capacity of modern smart phones and tablets. The requested content can then be directly retrieved from neighbouring mobile devices, using device-to-device (D2D) communication. This allows for a more efficient content delivery at no additional infrastructure cost. Caching in the mobile devices to alleviate the wireless bottleneck has attracted significant interest in the research community in recent years [5, 6, 7, 8]. In all these works, simple content caching and/or replication (i.e., a number of copies of a content are stored in the network) is considered. Additionally, the use of maximum distance separable (MDS) codes to facilitate decentralized random caching was investigated in [8].
A relevant problem in D2D-assisted mobile caching networks is the repair of lost data when a storage device is unavailable, e.g., when a storage device fails or leaves the network. The repair of lost data was considered in [9], where the communication cost incurred by data download and repair was analyzed for a caching scheme where data is stored in the mobile devices using replication and regenerating codes [10]. A strong assumption in [9] is that the repair of the lost content is performed instantaneously. As a result, content can always be downloaded from the mobile devices. Under the assumption of instantaneous repair, the caching strategy that minimizes the overall communication cost is replication.
In this paper, we consider content caching in a wireless network scenario using erasure correcting codes. When using erasure correcting codes to cache content, caching bears strong ties with the concept of distributed storage (DS) for reliable data storage. Indeed, the set of mobile devices storing content can be seen as a distributed storage network. The fundamental difference with respect to DS for reliable data storage is that data download can be done not only from the storage nodes, but the base station (BS) can also assist in delivering the data. Therefore, the strict guarantees on fault tolerance can be relaxed, which brings new and interesting degrees of freedom with respect to erasure correcting coding for DS for reliable data storage. Here, to avoid confusion with standard (uncoded) caching, we will use the term wireless distributed storage, highlighting the resemblance with DS using erasure correcting codes for reliable data storage in, e.g., data centers. Similar to the scenario in [9], we consider a cellular system where mobile devices roam in and out of a cell according to a Poisson random process and request content at random times. The cell is served by a BS, which always has access to the content. Content is also stored across a limited number of mobile devices using an erasure correcting code. Our main focus is on the repair problem when a device that stores data leaves the network. In particular, we introduce a more realistic repair scheduling than the one in [9], where lost content is repaired (from storage devices using D2D communication or from the BS) at periodic times.
We derive analytical, closed-form expressions for the overall communication cost of content download and data repair as a function of the repair interval. The derived expressions are general and can be used to analyze the overall communication cost incurred by any erasure correcting code for DS. As an example of the application of the proposed framework, we analyze the overall communication cost incurred by MDS codes, regenerating codes [10], and locally repairable codes [11]. We show that wireless DS can reduce the overall communication cost as compared to the basic scenario where content is only downloaded from the BS, provided that repairs can be performed frequently enough. Moreover, in the case when nodes storing content arrive to the cell, the communication cost using DS is further reduced and, for large enough arrival rate, it is always beneficial as compared to BS download. The repair interval that minimizes the overall communication cost depends on the network parameters and the underlying erasure correcting code. We show that, in general, instantaneous repair is not optimal. The derived expressions can also be used to find, for a given repair interval, the erasure correcting code yielding the lowest overall communication cost.
Non-instantaneous repairs, so-called “lazy” repairs, have already been proposed for DS in data centers [12, 13] to reduce the amount of data that has to be transmitted within the storage network during the repair process, known as the repair bandwidth. However, contrary to [12, 13], in the wireless scenario considered here non-instantaneous repairs impact both data repair and download. We show that, somewhat interestingly, erasure correcting codes achieving a low repair bandwidth do not always perform well in a wireless DS setting. On the other hand, MDS codes, which entail a high repair bandwidth, can yield a low overall communication cost for some repair intervals.
Notation: The probability density function (pdf) of a random variable $X$ is denoted by $f_X(x)$. Expectation and probability are denoted by $\mathbb{E}[\cdot]$ and $\Pr(\cdot)$, respectively. We use bold lowercase letters to denote vectors and bold uppercase letters for matrices.
II System Model
We consider a single cell in a cellular network, served by a BS, where mobile devices (referred to as nodes) arrive and depart according to a Poisson random process. The initial number of nodes in the network is $N$. Nodes wish to download content from the network. For simplicity, we assume that there is a single object (file), of size  bits, stored at the BS. We further assume that nodes can store data and communicate with each other using D2D communication. The considered scenario is depicted in Fig. 1.
Arrival-departure model. Nodes arrive according to a Poisson process with exponential independent, identically distributed (i.i.d.) random inter-arrival times with pdf

(1) $f(t) = \lambda \mathrm{e}^{-\lambda t}, \quad t \ge 0,$

where $\lambda$ is the expected arrival rate of a node and $t$ is time, measured in time units.
The nodes stay in the cell for an i.i.d. exponential random lifetime with pdf

(2) $f(t) = \mu \mathrm{e}^{-\mu t}, \quad t \ge 0,$

where $\mu$ is the expected departure rate of a node. The number of nodes in the cell can be described by an M/M/$\infty$ queuing model, where the probability that there are $j$ nodes in the cell is [14]

(3) $\pi_j = \frac{(\lambda/\mu)^{j}}{j!}\,\mathrm{e}^{-\lambda/\mu}.$
For simplicity, we assume that $\lambda = N\mu$, i.e., the flow into and out of the cell is the same and the expected number of nodes in the cell stays constant (equal to $N$).
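As a numerical sketch of the arrival-departure model above (not part of the paper), the following Python snippet evaluates the M/M/∞ stationary distribution, which is Poisson with mean equal to the ratio of arrival and departure rates; the symbol names `lam` and `mu` are our own choices.

```python
import math

def mm_inf_pmf(lam, mu, j):
    """Stationary probability of j nodes in an M/M/inf queue:
    Poisson distribution with mean lam/mu (computed in log-space
    to avoid overflow for large j)."""
    rho = lam / mu
    if j == 0:
        return math.exp(-rho)
    return math.exp(j * math.log(rho) - rho - math.lgamma(j + 1))

# With lam/mu = 30, the expected number of nodes in the cell is 30:
probs = [mm_inf_pmf(3.0, 0.1, j) for j in range(300)]
mean = sum(j * p for j, p in enumerate(probs))
assert abs(sum(probs) - 1.0) < 1e-9
assert abs(mean - 30.0) < 1e-6
```

The distribution concentrates around its mean, which is why (4) below can bound the probability of having too few nodes to store the file.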
Data storage. The file is partitioned into packets, called symbols, of size bits and is encoded into coded symbols, , using an erasure correcting code of rate . The encoded data is stored in nodes, , referred to as storage nodes. Note that implies that a storage node may store multiple coded symbols. For some of the considered erasure correcting codes, this is the case (see Section VI). To simplify the analysis in Sections III and IV, we set . This guarantees that the probability that the number of nodes in the cell is smaller than is negligibly small, i.e.,
(4) 
using (3). For example, for and , (4) is less than . Therefore, with high probability the file can be stored in the cell. In the results section we show that this simplification has negligible impact and that the analytical expressions match closely with the simulation results.
Each storage node stores exactly  bits, i.e., we consider a symmetric allocation [15]. Hence,^{1}

^{1} Without loss of generality, we assume .
(5) 
Incoming process. Nodes arriving to the cell may bring cached content. The expected arrival rate of nodes storing content is , . We also assume that the expected arrival rate of nodes not carrying content is , so that the expected arrival rate of a node (with or without content) is and the expected number of nodes in the cell is (see above). The incoming process is discussed in more detail in Section V.
Data delivery. Nodes request the file at random times with i.i.d. random inter-request times with pdf

(6) $f(t) = \omega \mathrm{e}^{-\omega t}, \quad t \ge 0,$
where $\omega$ is the expected request rate per node. Whenever possible, the file is downloaded from the storage nodes using D2D communication, referred to as D2D download. In particular, we assume that data can be downloaded from any subset of $k$ storage nodes, which we will refer to as the download locality. In other words, D2D download is possible if $k$ or more storage nodes remain in the cell. In this case, the amount of downloaded data is  bits.^{2} In the case where there are fewer than $k$ storage nodes in the cell, the file is downloaded from the BS, which we refer to as BS download. In this case,  bits are downloaded.

^{2} To simplify the analysis in Sections III and IV, we assume that the download bandwidth is the same irrespective of whether the request comes from a storage node itself or not, i.e., users do not have access to their own stored data. This is a reasonable approximation if . Furthermore, this may be a practical assumption. Due to concerns about security in systems that allow for D2D connectivity, it has been proposed to isolate part of the memory in the mobile devices to be used only for DS, so that devices cannot have access to their own cached data [16].
Communication cost. We assume that transmissions from the BS and from a storage node (in D2D communication) have different costs. We denote by $\rho_{\mathrm{BS}}$ and $\rho_{\mathrm{D2D}}$ the costs (in cost units (c.u.) per bit, [c.u./bit]) of transmitting one bit from the BS and from a storage node, respectively. Therefore, the cost of downloading a file from the BS and the storage nodes is  and , respectively. Furthermore, we define $\rho = \rho_{\mathrm{BS}}/\rho_{\mathrm{D2D}}$, where a large $\rho$ corresponds to a high traffic load in the BS-to-device link and a small $\rho$ reflects a scenario where the battery of the devices is the main constraint.
II-A Repair Process
When a storage node leaves the cell, its stored data is lost (see blue node with orange stripes in Fig. 1). Therefore, another node needs to be populated with data to maintain the initial state of reliability of the DS network, i.e.,  storage nodes. The restoration (repair) of the lost data onto another node, chosen uniformly at random from all nodes in the cell that do not store any content, will be referred to as the repair process. We introduce a scheduled repair scheme where the repair process is run periodically. We denote the interval between two repairs by $\Delta$ (in t.u.). Note that $\Delta \to 0$ corresponds to the case of instantaneous repair, considered in [9].
Similar to the download, repair can be accomplished from the storage nodes (D2D repair) or from the BS (BS repair), with cost per bit  and , respectively. The amount of data (in bits) that needs to be retrieved from the network to repair a single failed node is referred to as the repair bandwidth, denoted by . For simplicity, we assume that each repair is handled independently of the others. In particular, we assume that D2D repair can be performed from any subset of $r$ storage nodes, by retrieving  bits from each node. In other words, D2D repair is possible if there are at least $r$ storage nodes in the cell at the moment of repair. In this case, , and the corresponding communication cost is . The parameter $r$ is usually referred to as the repair locality in the DS literature. If there are fewer than $r$ storage nodes in the cell at the moment of repair, then the repair is carried out by the BS. In this case, , with communication cost . Note that . For both repair and download, we assume error-free transmission.
III Repair and Download Cost
In this section, we derive analytical expressions for the repair and download cost, and subsequently for the overall communication cost, as a function of the repair interval $\Delta$. For analysis purposes, we initially disregard the incoming process, i.e., set . The case  is then addressed in Section V, building upon the results in this section. We denote by  the average communication cost of repairing lost data, and refer to it as the repair cost. Also, we denote by  the average communication cost of downloading the file, and refer to it as the download cost. The (average) overall communication cost is denoted by , where . The costs are defined in cost units per bit and time unit, [c.u./(bit·t.u.)].
For later use, we denote by $b(i; n, p)$ the probability mass function (pmf) of the binomial distribution with parameters $n$ and $p$,

(7) $b(i; n, p) = \binom{n}{i} p^{i} (1-p)^{n-i}.$
III-A Repair Cost
The repair cost has two contributions, corresponding to the cases of BS repair and D2D repair. Denote by  and  the average number of nodes repaired from the storage nodes and from the BS, respectively, in one repair interval. Then,  (in [c.u./(bit·t.u.)]) is given by
(8) 
where and (in c.u.) are the cost of repairing a single storage node from the BS and from storage nodes, respectively (see Section IIA), and we normalize by such that does not depend on the file size.
The repair cost, , is given in the following theorem.
Theorem 1.
Consider the DS network in Section II with departure rate , communication costs and , BS repair bandwidth , file size , repair interval , and probability that a node has not left the network during a time . Furthermore, consider the use of an erasure correcting code with D2D repair bandwidth . The repair cost is given by
(9) 
Proof:
As the interdeparture times are exponentially distributed, the probability that a storage node has not left the network during a time and is available for repair is
Hence, the probability that storage nodes are available for repair is . If storage nodes remain in the cell, then repairs need to be performed. D2D repair is performed if , and BS repair is performed otherwise. Therefore,
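The structure of the proof above can be sketched numerically. The following minimal Python example follows that structure under our own symbol names (`n` storage nodes, repair locality `r`, repair interval `delta`, departure rate `mu`, repair bandwidths `gamma_d2d`/`gamma_bs`, per-bit costs `rho_d2d`/`rho_bs`, file size `f`); it is an illustration of the reasoning, not the paper's exact formula.

```python
import math
from math import comb

def binom_pmf(i, n, p):
    """b(i; n, p): pmf of the binomial distribution, as in (7)."""
    return comb(n, i) * p**i * (1 - p)**(n - i)

def repair_cost(n, r, delta, mu, gamma_d2d, gamma_bs, f, rho_bs, rho_d2d):
    """Sketch of the repair-cost computation: a storage node survives the
    repair interval with probability p = exp(-mu*delta); with j survivors,
    n - j nodes must be repaired, via D2D if j >= r and from the BS
    otherwise. The cost is normalized by file size and interval length."""
    p = math.exp(-mu * delta)
    total = 0.0
    for j in range(n + 1):
        per_repair = gamma_d2d * rho_d2d if j >= r else gamma_bs * rho_bs
        total += binom_pmf(j, n, p) * (n - j) * per_repair
    return total / (f * delta)

# Sanity check: with r = 0 every repair is D2D; the expected number of
# departures is n*(1 - p), so the cost reduces to a closed form.
n, mu, delta = 10, 0.5, 2.0
p = math.exp(-mu * delta)
c = repair_cost(n, 0, delta, mu, gamma_d2d=1.0, gamma_bs=1.0, f=1.0,
                rho_bs=1.0, rho_d2d=1.0)
assert abs(c - n * (1 - p) / delta) < 1e-9
```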
Remark 1.
We see from (8) that if , i.e., , D2D repair should never be performed, as repairing always from the BS yields a lower repair cost. In this case the repair cost would be
III-B Download Cost
Similar to , the download cost has two contributions, corresponding to the case where content is downloaded from the BS and from the storage nodes. Denote by and the probability that, for a request, the file is downloaded from the BS and from the storage nodes, respectively. Then, can be written as
(10) 
where  and  are the cost of downloading the file from the BS and from the storage nodes, respectively (see Section II), and  is the overall request rate per t.u. Again, we normalize by  so that the cost does not depend on the file size. The download cost is given in the following theorem.
Theorem 2.
Consider the DS network in Section II with expected number of nodes in the cell , departure rate , request rate , communication costs and , file size , and repair interval . Furthermore, consider the use of an erasure correcting code that stores bits per node. Let for , and . The download cost is given by
(11) 
The proof is given in Appendix A. Here, for ease of understanding, we give an outline of the proof. Since , it follows from (10) that, to derive , it is sufficient to derive . Let  be the number of storage nodes alive in the cell within a repair interval, i.e., for , with . It is important to observe that  is described by a Poisson death process [14], since storage nodes may leave the cell, and no repair is attempted before a time . This random process is illustrated in Fig. 2. At some point, too many storage nodes have left the network, such that the number of available storage nodes falls below  and D2D download is no longer possible. Denote the (random) time at which this occurs by , i.e.,  (see Fig. 2). Denote by  the arrival time of the th file request within a repair interval, . The probability  can then be derived in two steps.

Find the pdf of the arrival time of the file requests within a repair interval , .

Find the probability that a request arrives before , (i.e., D2D download is possible).
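The two steps above can be mimicked by a small Monte Carlo sketch (ours, not from the paper): conditioned on falling in a repair interval, a Poisson request arrival is uniformly located in it, and each storage node independently survives until the request time with an exponential-lifetime probability. All parameter names are our own.

```python
import math
import random

def prob_d2d_download(n, k, mu, delta, trials=50000, seed=1):
    """Monte Carlo estimate of the probability that a file request within
    a repair interval can be served by D2D download: the request time is
    uniform on [0, delta] (a property of Poisson arrivals), each of the n
    storage nodes is still in the cell at time t with probability
    exp(-mu*t), and D2D download needs at least k survivors."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        t = rng.uniform(0.0, delta)
        alive = sum(rng.random() < math.exp(-mu * t) for _ in range(n))
        if alive >= k:
            hits += 1
    return hits / trials

# With mu = 0 nodes never leave, so D2D download is always possible:
assert prob_d2d_download(10, 5, 0.0, 2.0, trials=2000) == 1.0
```

Such a simulation is useful for cross-checking the closed-form expression in (11).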
Remark 2.
If , i.e., , performing BS download only is optimal. The download cost is then
(12) 
We also have the following result about the behavior of in (11).
Corollary 1.
For , is monotonically increasing with if , monotonically decreasing with if , and constant otherwise.
Proof:
The proof follows directly from differentiating with respect to and is therefore omitted. ∎
III-C Overall Communication Cost
Combining Theorems 1 and 2, one obtains the expression for the overall communication cost,
(13) 
Note that, in general,  is not monotonic in . We can derive the following result for  (instantaneous repair) and  (no repair).
Corollary 2.
(14) 
Moreover, for ,
(15) 
Proof:
See Appendix B. ∎
For instantaneous repair (), both repair and download are always performed from the storage nodes. Thus, the two terms in (14) correspond to the D2D repair and D2D download, and we recover the result in [9]. For , data is never repaired (hence, ). For , the number of storage nodes in the cell will become smaller than at some point, and D2D download is no longer possible. Therefore, the overall communication cost in (15) is the BS download cost in (12).
IV Hybrid Repair and Download
In the system model in Section II and the analysis in Section III we assumed that if repair (resp. download) cannot be completed from storage nodes (because there are less than (resp. ) storage nodes available in the cell), BS repair (resp. download) is performed. Alternatively, for both repair and download, a node might retrieve data from the available storage nodes using D2D communication and retrieve the rest from the BS to complete the repair or the download. We will refer to this setup as partial D2D repair and partial D2D download, and the scheme that implements it as the hybrid repair and download scheme. In the following, we extend the analysis in Section III to the hybrid scheme.
IV-A Repair Cost
Assume that, at the time of repair,  storage nodes are available, i.e., repair cannot be accomplished exclusively from the storage nodes. However,  bits could be retrieved from the available storage nodes and the remaining bits obtained from the BS to complete the repair. The corresponding communication cost is . For the conventional scheme, D2D repair is not possible for , and the repair cost corresponds to that of BS repair, i.e., . This implies that, if , partial repair leads to a reduced repair cost if  or, equivalently, . For , the hybrid scheme performs partial D2D repair if  and BS repair otherwise. The repair cost is given in the following theorem.
Theorem 3.
Proof:
It follows the same lines as the proof of Theorem 1. ∎
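The partial-repair accounting of Section IV-A can be sketched as follows. This is our reading of the scheme, with our own symbol names: with `j` helpers available (fewer than the repair locality), `j*beta` bits come via D2D and the remainder of the BS repair bandwidth `gamma_bs` comes from the BS.

```python
def hybrid_repair_cost(j, beta, gamma_bs, rho_bs, rho_d2d):
    """Per-node repair cost under the hybrid scheme when only j helpers
    are available: j*beta bits are fetched via D2D at cost rho_d2d per
    bit, and the remaining gamma_bs - j*beta bits come from the BS at
    cost rho_bs per bit (a sketch, not the paper's exact expression)."""
    d2d_bits = j * beta
    return d2d_bits * rho_d2d + (gamma_bs - d2d_bits) * rho_bs

# Partial D2D repair beats pure BS repair whenever D2D bits are cheaper:
gamma_bs, rho_bs, rho_d2d = 100.0, 1.0, 0.1
assert hybrid_repair_cost(3, 10.0, gamma_bs, rho_bs, rho_d2d) < gamma_bs * rho_bs
```

With `j = 0` the cost collapses to pure BS repair, matching the conventional scheme.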
IV-B Download Cost
Similar to the repair case, if storage nodes are available at the time of a file request, the file cannot be downloaded solely from the storage nodes. However, bits could be downloaded from the available storage nodes and the remaining bits from the BS, with communication cost . For the conventional scheme, the download cost corresponds to that of BS download, i.e., . Hence, the hybrid scheme leads to a lower download cost if , or equivalently, . For , the hybrid scheme performs partial D2D download if and BS download otherwise. The download cost is given in the following theorem.
Theorem 4.
Consider the DS network in Section II using the hybrid scheme. Let and , for . The download cost is given by
(16) 
where , , and
Proof:
See Appendix C. ∎
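Analogously to the repair case, the partial D2D download of Section IV-B can be sketched numerically (again with our own symbol names): `j` available storage nodes contribute `j*alpha` bits and the remaining `f - j*alpha` bits of the file come from the BS.

```python
def hybrid_download_cost(j, alpha, f, rho_bs, rho_d2d):
    """File download cost under the hybrid scheme when only j storage
    nodes are available: j*alpha bits via D2D, the remaining f - j*alpha
    bits from the BS (a sketch of the scheme, not the paper's formula)."""
    d2d_bits = j * alpha
    return d2d_bits * rho_d2d + (f - d2d_bits) * rho_bs

# Cheaper than pure BS download (cost f*rho_bs) as soon as rho_d2d < rho_bs:
assert hybrid_download_cost(2, 10.0, 100.0, rho_bs=1.0, rho_d2d=0.1) < 100.0
```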
V Repair and Download Cost with an Incoming Process
The analysis in the preceding sections does not consider the possibility that nodes arriving to the cell may bring content. In a real scenario with neighboring cells, however, this may be the case. We will refer to the arrival of nodes with content as the incoming process. Considering an incoming process significantly complicates the analysis. This is due to the fact that arriving nodes may bring content that is not directly useful, in the sense that they may bring code symbols which are already available in another storage node. At a given time, it is likely that some symbols will be stored by more than one storage node, while other symbols will not be present in the storage network (due to node departures). As a result, the analysis needs to consider storage node classes, where a node class defines the set of storage nodes storing given code symbols. In general, for an erasure correcting code, there are  storage node classes, since all code symbols are different. The case of simply replicating the data (using a repetition code) is slightly different. Although all code symbols are equal, for the analysis of replication we still need to consider  storage node classes, i.e., we treat each of the  code symbols of the replication code as if they were different.
In this section, we extend the analysis in Sections III and IV to the scenario with an incoming process. In particular, we show that Theorem 1 and Theorem 2 can also be used to analyze the repair and download costs for this scenario by using different input parameters. More precisely, we consider the scenario where storage nodes of a given class arrive to the cell according to a Poisson process with expected arrival rate . An incoming storage node brings a single code symbol of a given class. Furthermore, nodes not storing content arrive according to a Poisson process with expected arrival rate . The departure rate for all nodes is , i.e., as before, the average number of nodes in the cell is . We assume the practical scenario where the BS maintains a list of the nodes storing content, which is communicated periodically to all nodes in the cell every  t.u. For simplicity, we assume that .
V-A Repair Cost
Denote by  the number of class  storage nodes in the cell at time . Also, denote by  the probability that class  is empty at time , i.e., . Since all storage node classes have the same arrival and departure rates, we can drop the subindex and write . Also, let  be the stationary distribution, where  is the probability that class  has  storage nodes. Equation (9) in Theorem 1 can then be used for the scenario with an incoming process by setting .
The difficulty here lies in computing . Without repairs, the evolution of  is given by a Poisson birth-death process, which can be modeled by a Markov chain. In this case, the stationary distribution exists and can be computed. However, the repairs performed every  t.u. interfere with the stationarity of the process. Indeed, in the presence of repairs, the evolution of  no longer corresponds to a Poisson birth-death process. In this case, the analysis appears formidable.
Here, we propose the following two-step procedure to compute . Consider a single repair interval of duration , where  is the number of storage nodes in class  at time . Within a repair interval,  is described by a Poisson birth-death process.^{3} Since storage node classes are independent of each other and have the same arrival and departure rates, we can focus on a single class. Hence, we will drop the subindex in  and simply write .

^{3} This is in contrast to the case with no incoming process, where the evolution of  for  is described by a Poisson death process.
Let  denote the transition probability function of the continuous-time Markov chain representing the Poisson birth-death process.  can be computed by deriving a set of differential equations, called Kolmogorov’s forward equations, whose solution can be computed as follows [17]. Let  be the matrix with th entry , where  is the maximum number of storage nodes of one class. Also, let  be the transition rates of the continuous-time Markov chain. Then  can be computed as [17]
(17) 
where is the generator of the Markov chain, with entries , and , given by
with
(18) 
The infinite power series in (17) converges for any square matrix , and can be efficiently computed using, e.g., the algorithm described in [18].
Note that, in our scenario,  is not finite. However, the probability of having  storage nodes of a given class at time  sharply decreases with the number of nodes. Therefore, we can limit  to a sufficiently large value and, by solving (17), obtain a very good approximation of .
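The matrix-exponential computation in (17) can be sketched in pure Python for a truncated birth-death chain (constant class arrival rate, departure rate proportional to the number of nodes present). The truncated power series is adequate here because the truncated generator is small; all names (`lam_c`, `mu`, `n_max`) are our own.

```python
def generator(lam_c, mu, n_max):
    """Generator Q of the truncated birth-death chain for one storage node
    class: a node of the class arrives at rate lam_c, and each of the i
    nodes present departs at rate mu."""
    Q = [[0.0] * (n_max + 1) for _ in range(n_max + 1)]
    for i in range(n_max + 1):
        if i < n_max:
            Q[i][i + 1] = lam_c      # arrival of a class node
        if i > 0:
            Q[i][i - 1] = i * mu     # one of the i nodes departs
        Q[i][i] = -sum(Q[i])         # generator rows sum to zero
    return Q

def transition_matrix(Q, t, terms=80):
    """P(t) = exp(Qt) via the truncated power series of (17); fine for the
    small truncated chains used here (no scaling-and-squaring)."""
    n = len(Q)
    P = [[float(i == j) for j in range(n)] for i in range(n)]   # identity
    term = [row[:] for row in P]
    for m in range(1, terms):
        # term_m = term_{m-1} * Q * t / m  =  (Qt)^m / m!
        term = [[sum(term[i][u] * Q[u][j] for u in range(n)) * t / m
                 for j in range(n)] for i in range(n)]
        for i in range(n):
            for j in range(n):
                P[i][j] += term[i][j]
    return P

Q = generator(lam_c=0.2, mu=0.5, n_max=8)
P = transition_matrix(Q, t=1.0)
assert all(abs(sum(row) - 1.0) < 1e-8 for row in P)   # P(t) is stochastic
```

In practice a library routine such as `scipy.linalg.expm` (which implements the algorithm of [18]-style scaling and squaring) would replace the naive series.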
Given , we can estimate the stationary distribution recursively. For a given distribution at time , , we can compute as
(19) 
where and , due to the repair, and for .
Equivalently, this recursion can be written in compact form as
(20)  
(21) 
where is an matrix with entries , for , and . Note that and are the stationary distributions before and after repair, respectively.
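The fixed-point recursion (20)-(21) can be sketched directly: evolve the class-size distribution over one repair interval with a transition matrix, then apply the repair step, which moves all probability mass from the empty state to the one-node state. The demo matrix below is a hypothetical stand-in for the true interval transition matrix.

```python
def stationary_with_repair(P, tol=1e-12, max_iter=10000):
    """Iterate the recursion (20)-(21): one repair interval of evolution
    with transition matrix P, then repair (mass in state 0 moves to
    state 1). Returns the distributions just before and just after
    repair, i.e., the two stationary distributions of the recursion."""
    n = len(P)
    pi = [1.0 / n] * n
    for _ in range(max_iter):
        pi_minus = [sum(pi[i] * P[i][j] for i in range(n)) for j in range(n)]
        pi_plus = pi_minus[:]
        pi_plus[1] += pi_plus[0]     # repaired node refills the empty class
        pi_plus[0] = 0.0
        if max(abs(a - b) for a, b in zip(pi_plus, pi)) < tol:
            break
        pi = pi_plus
    return pi_minus, pi_plus

# Toy 3-state stochastic matrix standing in for P(delta) (hypothetical values):
P = [[0.70, 0.25, 0.05],
     [0.30, 0.50, 0.20],
     [0.10, 0.40, 0.50]]
pi_minus, pi_plus = stationary_with_repair(P)
assert abs(sum(pi_plus) - 1.0) < 1e-9 and pi_plus[0] == 0.0
```

The first element of the pre-repair distribution is the empty-class probability needed for the repair cost in Theorem 5.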
Theorem 5.
Consider the DS network in Section II with departure rate , arrival rate of storage nodes of a given class , arrival rate of nodes not storing content , communication costs  and , BS repair bandwidth , file size , and repair interval . Furthermore, consider the use of an erasure correcting code with D2D repair bandwidth . The repair cost is given by (9) with , where  is given by the first element of  in (21).
Proof:
The proof follows from the discussion above. ∎
Theorem 6.
Proof:
The proof follows from the discussion above. ∎
Remark 3.
It is important to remark that the analysis for the scenario with an incoming process does not consider the departure of individual storage nodes, but rather the departure of whole classes, i.e., of all nodes of a given class. Thus,  and  in (9) should not be interpreted as  storage nodes and  storage nodes, respectively, but as  and  storage node classes.
V-B Download Cost
Assume that after repair there are storage nodes of a given class, say class . With some abuse of notation, let be the number of storage nodes of class at time , where parameter indicates that . The evolution of for is given by a Poisson death process. Denote by the time instant at which the last of the storage nodes in class leaves the cell. is hypoexponentially distributed with pdf given by (31), with and . The expected value of is [19, Sec. 1.3.1]
(22) 
Note that is exponentially distributed.
Let  be the time instant at which the last of the  storage nodes in class  leaves the cell or, in other words, the time instant at which the whole class leaves the cell. The pdf of  is a weighted sum of the pdfs , weighted by , i.e., it is a weighted sum of hypoexponential distributions. The expected value of  is
(23) 
Let  be the number of nonempty storage node classes in the cell at time . Computing  exactly requires computing the distribution of the time instant at which  changes from  to , denoted by , similar to the case with no incoming process (see Appendix A). Unfortunately, since the pdf of  is a weighted sum of hypoexponential distributions, computing the pdf of  seems infeasible. Here, we propose to approximate the pdf of  by an exponential pdf. Indeed,  is in general the largest element in , and therefore the distribution of  has a large exponential component. Assuming that  is well approximated by an exponential distribution with mean , the download cost for the scenario with an incoming process can then be computed using (11) in Theorem 2 by setting , where a storage node departure should now be interpreted as a storage node class departure. We have observed that, by approximating the pdf of  by an exponential distribution with mean , the analytical results match the simulations very well for the whole range of interesting values of  and , as shown in the results section. The download cost for the hybrid scheme is found by using (16) in Theorem 4 with .
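The expected class lifetime used above has a simple closed form under the model of this section: with i.i.d. exponential, memoryless lifetimes, the time until the last of m nodes leaves is a sum of independent exponentials (hypoexponential), whose mean is a harmonic sum. The sketch below uses our own symbol names.

```python
def expected_class_lifetime(m, mu):
    """Expected time until the last of m storage nodes of a class has left
    the cell. Lifetimes are i.i.d. Exp(mu) and memoryless, so while i
    nodes remain the next departure takes Exp(i*mu) time on average;
    summing gives (1/mu) * sum_{i=1..m} 1/i (cf. (22))."""
    return sum(1.0 / (i * mu) for i in range(1, m + 1))

# A single node lives 1/mu on average; extra nodes add diminishing time:
assert expected_class_lifetime(1, 0.5) == 2.0
assert expected_class_lifetime(2, 0.5) == 3.0   # 1/(2*0.5) + 1/0.5
```

The diminishing harmonic increments explain why the class-departure time is dominated by its last (purely exponential) stage, supporting the exponential approximation proposed above.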
VI Erasure Correcting Codes in Distributed Storage
From Sections III–V, it can be seen that the overall communication cost depends on the network parameters (), , , and , and on the parameters , , , , and  (and subsequently on  and ), which are determined by the erasure correcting code used for DS. An erasure correcting code for DS is typically described in terms of the number of nodes used for storage, the download locality, and the repair locality, and is defined using the notation . In this section, we briefly describe MDS codes [20], regenerating codes [10], and LRCs [11] in the context of DS. We also connect these parameters with the code parameters above. In Section VII, we then evaluate the overall communication cost of DS using these three code families.
We remark that the analysis in the previous sections applies directly to MDS and regenerating codes. However, due to the specificities of LRCs, Theorem 1 needs to be slightly modified, as shown in Section VI-C below.
VI-A Maximum Distance Separable Codes
Assume the use of an MDS code for DS. In this case, each storage node stores one coded symbol, hence  and . Due to the MDS property, D2D repair and D2D download require contacting  storage nodes. Therefore, an MDS code in a DS context is described by the triple . Moreover, , i.e., . The fact that an amount of information equal to the size of the entire file has to be retrieved to repair a single storage node is a known drawback of MDS codes [10]. The simplest MDS code is the replication scheme. In this case, each storage node stores the entire file, i.e.,  and .
VI-B Regenerating Codes
A lower repair bandwidth (as compared to MDS codes) can be achieved by using regenerating codes [10], at the expense of increasing [10]. Two main classes of regenerating codes are covered here, minimum storage regenerating (MSR) codes and minimum bandwidth regenerating (MBR) codes. MSR codes yield the minimum storage per node, i.e., is minimum, while MBR codes achieve minimum D2D repair bandwidth. Regenerating codes have two repair models, functional repair and exact repair [21]. In exact repair, the lost data is regenerated exactly [21]. In functional repair, the lost data is regenerated such that the initial state of reliability in the DS system is restored [21], but the regenerated data does not need to be a replica of the lost data [21]. Here, we consider only exact repair, since it is of more practical interest [22].
An exact-repair MSR code in a DS system has and , with [22] (the design of linear, exact-repair MSR codes with has been proven impossible [23]). Hence, using (5),
Furthermore [22],
with equality only when , which is only possible for and , due to the restriction on the values of the repair locality. The repair bandwidth,
is minimized for [10]. We remark that the storage per node (and hence the average download cost) for an MDS code and an MSR code are equal.
An MBR code further reduces the repair bandwidth at the expense of increased storage per node. An exact-repair MBR code has and for [22]. Using (5), we have
Furthermore [22],
As for MSR codes, the repair bandwidth of an MBR code,
is minimized for [10].
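The two operating points can be written out explicitly in the standard regenerating-code notation of [10]: file size `M`, reconstruction parameter `k`, and repair locality `d` (with `k <= d <= n-1`). These symbols are assumptions of the sketch, not the paper's notation:

```python
# Classical MSR and MBR operating points (Dimakis et al. [10]).
# M: file size, k: number of nodes contacted for download, d: repair locality.
def msr_point(M, k, d):
    alpha = M / k                       # minimum storage per node (same as MDS)
    gamma = d * M / (k * (d - k + 1))   # repair bandwidth, decreasing in d
    return alpha, gamma

def mbr_point(M, k, d):
    alpha = 2 * M * d / (2 * k * d - k * k + k)  # larger storage per node...
    gamma = alpha                                # ...but repair downloads only alpha bits
    return alpha, gamma
```

Setting `d = k` in `msr_point` recovers the MDS repair cost `gamma = M`, while for `d > k` the MSR repair bandwidth drops below `M`, and the MBR point trades extra per-node storage for a still lower repair bandwidth, consistent with the comparison in the text.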
Note that an regenerating code has exactly the same overall communication cost as an replication scheme.
VI-C Locally Repairable Codes
A lower repair locality (as compared to MDS codes) is achieved by using LRCs [11]. An LRC has and , where and . Each node stores
bits. The storage nodes are arranged in disjoint repair groups with nodes in each group. Any single storage node can be repaired locally by retrieving bits from nodes in the repair group [11]. A storage node involved in the repair process transmits all its stored data, i.e., , hence
If local D2D repair is not possible, repair can be carried out globally by retrieving bits from any subset of storage nodes. Since it is necessary to distinguish between local and global repairs (as opposed to MDS and regenerating codes), the expression of the repair cost in Theorem 1 does not apply to LRCs and needs to be modified. We denote by and the average number of nodes repaired from the storage nodes locally and globally, respectively, in one repair interval. We will also need the following definitions. Let be the random vector whose component is the random variable giving the number of repair groups with storage node departures in a repair interval . Note that takes values in and . The probability of storage node departures in a repair group is , where is the probability that a storage node has not left the network during a time . Let be a realization of and let . Then,
(24) 
where , is the multinomial coefficient, and .
The repair cost for LRCs is given in the following theorem.
Theorem 7.
Consider the DS network in Section II with departure rate , communication costs and , BS repair bandwidth , file size , and repair interval . Furthermore, consider the use of an LRC with disjoint repair groups and D2D repair bandwidth . The repair cost is given by
(25) 
where
and is an indicator function.
Proof:
See Appendix D. ∎
VI-D Lowest Overall Communication Cost for Instantaneous Repair
For instantaneous repair, the minimum overall communication cost is given in the following lemma.
Lemma 1.
Proof:
See Appendix E. ∎
This is in agreement with the result in [9], where replication was shown to be optimal.
VII Numerical Results
In this section, we evaluate the overall communication cost (computed using (1) and (11)) for the erasure correcting codes discussed in the previous section. For the results, we consider a network with nodes, where the number of storage nodes is . This gives a probability smaller than of having fewer than nodes in the cell (see (4)), which is considered negligible. Without loss of generality, we set the departure rate and , i.e., . Figs. 3–9 refer to a system with no incoming process, i.e., , while Figs. 10 and 11 consider the presence of an incoming process, .
Fig. 3 shows normalized to the cost of downloading from the BS, , i.e., , as a function of the normalized repair interval, , for a selection of MDS codes, regenerating codes, and LRCs with . The ratio between the request rate and the departure rate is , i.e., the average request rate in the cell is requests per t.u., and . This means that each node places on average requests per node lifetime. Also, in the figure means that the repair interval is equal to one average node lifetime. Simulation results are also included in the figure (markers); when simulating the wireless DS system, the repair process is not executed if the number of nodes in the cell is less than at the particular repair instant. Note that since we normalize to the BS download cost, values below ordinate correspond to the case where DS is beneficial. For relatively high repair frequencies, all codes yield lower than BS download. However, exceeds , i.e., BS download is less costly than the DS communication cost, for values of the repair interval larger than a threshold, which we define as
(26) 
For , retrieving the file from the BS is always less costly; therefore, storing data in the nodes is useless. depends on the network parameters , , and , as well as on the code parameters , , and .
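Once the overall cost is available as a function of the repair interval, the threshold in (26) can be located numerically. The cost function below is a stand-in with the qualitative shape described in the text (a repair term that decays with the interval plus a download term that grows with it), NOT the paper's closed-form expression:

```python
# Generic sketch: locate the threshold repair interval beyond which DS is no
# longer beneficial, by bisection on the normalized cost.  normalized_cost is
# a hypothetical stand-in, not the paper's expression.
def normalized_cost(delta):
    repair = 0.3 / delta            # frequent repairs are expensive per t.u.
    download = 0.1 + 0.3 * delta    # stale storage pushes downloads to the BS
    return repair + download

def threshold(lo=1.0, hi=100.0):
    # assumes normalized_cost(lo) <= 1 < normalized_cost(hi), i.e. the cost
    # crosses 1 exactly once on (lo, hi)
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if normalized_cost(mid) <= 1.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)
```

For this stand-in cost, the crossing point can be checked analytically (the quadratic `0.3*d**2 - 0.9*d + 0.3 = 0` gives the upper root `(3 + sqrt(5))/2`), which the bisection reproduces.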
We see from Fig. 3 that the value of that minimizes , denoted by , depends on the code used for storage. In particular, for the MSR code, i.e., instantaneous repair is optimal. Performing an exhaustive search for , it is readily verified that the same is true for any of the codes in Section VI with . It is reasonable to assume that this will be the case also for . On the other hand, for the MDS code. depends on the network and code parameters. In particular, the tolerance to storage node departures in a repair interval affects . In Section VIIA, we investigate how the network parameters affect and . In Section VIIB, we explore how the code parameters affect .
VII-A Effect of Varying Network Parameters
Fig. 4 shows how increases with for the same codes as in Fig. 3 and . For , approximately, for all considered codes, i.e., it is never beneficial to use the devices for storage and the file should always be downloaded from the BS. It is worth noticing that, for moderate-to-large , the MSR code requires on the order of 10 repairs per average node lifetime, while the MDS code requires only around 0.66 repairs per node lifetime, for DS to be beneficial over BS download. The main difference between the MDS code and the MSR code is the number of storage node departures in a repair interval that the code can tolerate such that D2D repair is still possible, i.e., . The MDS code can handle the departure of up to storage nodes, while the MSR code can tolerate a single departure only. This explains the higher repair frequency required by the MSR code.
For the LRC and , Fig. 5 shows how and are affected by the ratio . We see that increasing reduces for all and that increases with . The same behavior is observed using any of the codes in Section VI, which can be verified by the following manipulations of the equations in Section III. The case corresponds to , which can be readily seen by taking the limit in (13), using (1) and (11), for fixed and finite . This shows that the overall communication cost is essentially the download cost for a sufficiently high . Since is monotonically increasing in (Corollary 1) and as (Corollary 2), we also have that for . Hence, DS always leads to a lower overall communication cost, as compared to the BS download cost, for sufficiently large .
VII-B Results of Changing Code Parameters
We investigate how the repair locality affects . Fig. 6 shows versus for the MSR code for and . We observe that for the lowest is achieved for , i.e., the highest possible repair locality. This is because for regenerating codes is minimized for (see [10] and Section VI-B). However, increasing requires decreasing to yield the lowest . This is due to the improved tolerance to storage node departures as decreases. The result is interesting, because it means that in wireless DS, if repairs cannot be performed very frequently, repair locality is a more important parameter than repair bandwidth. On the other hand, if repairs can be performed very frequently, repair bandwidth becomes more important than repair locality, because tolerance to storage node departures is not critical. In general, there is a tradeoff between the repair bandwidth and the tolerance to storage node departures (directly related to the repair locality), which holds true for any of the codes in Section VI. How to set the parameter depends on how frequently the DS system can be repaired.
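The tradeoff above can be tabulated for exact-repair MSR codes. The parameter values `M`, `n`, `k` below are assumptions for the illustration; the bandwidth formula is the MSR point of [10], and the tolerance reflects that D2D repair needs `d` live helpers, so at most `n - d` departures per repair interval can be accommodated:

```python
# Illustration of the repair-locality / repair-bandwidth tradeoff for
# exact-repair MSR codes (hypothetical parameters M, n, k).
M, n, k = 1.0, 10, 5
tradeoff = []
for d in range(k, n):                       # repair locality d = k, ..., n - 1
    gamma = d * M / (k * (d - k + 1))       # MSR repair bandwidth, decreasing in d
    tolerance = n - d                       # departures still allowing D2D repair
    tradeoff.append((d, gamma, tolerance))
for d, gamma, tol in tradeoff:
    print(d, round(gamma, 3), tol)
```

The table shows both trends at once: raising `d` from 5 to 9 cuts the repair bandwidth from the full file size down to a fraction of it, but shrinks the tolerated number of departures from 5 to 1, which is why frequent repairs favor high locality and infrequent repairs favor low locality.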
VII-C Improved Communication Cost Using the Hybrid Scheme
We return to the hybrid repair and download scheme presented in Section IV to investigate the gains in overall communication cost as compared to the cost when using the conventional scheme. We remark that the hybrid scheme does not improve for all codes in Section VI. In particular, for finite , is only reduced if (Theorem 3) and is only improved if (Theorem 4). Fig. 7 shows versus for all codes in Fig. 3 that achieve lower when using the hybrid scheme. We set and and include simulation results in the figure (markers). Dashed curves correspond to the conventional scheme, and solid curves to the hybrid scheme.