# Optimizing Content Caching to Maximize the Density of Successful Receptions in Device-to-Device Networking

Derya Malak, Mazin Al-Shalash, and Jeffrey G. Andrews

Parts of the manuscript were presented at the 2014 IEEE Globecom Workshops [1] and at the 2015 IEEE ICC Workshops [2]. D. Malak and J. G. Andrews are with the Wireless and Networking Communications Group (WNCG), The University of Texas at Austin, Austin, TX 78701 USA (e-mail: deryamalak@utexas.edu; jandrews@ece.utexas.edu). M. Al-Shalash is with Huawei Technologies, Plano, TX 75075 USA (e-mail: mshalash@huawei.com). This research has been supported by Huawei. Manuscript last revised: September 15, 2019.

###### Abstract

Device-to-device (D2D) communication is a promising approach to optimize the utilization of air interface resources in 5G networks, since it allows decentralized opportunistic short-range communication. For D2D to be useful, mobile nodes must possess content that other mobiles want. Thus, intelligent caching techniques are essential for D2D. In this paper we use results from stochastic geometry to derive the probability of successful content delivery in the presence of interference and noise. We employ a general transmission strategy where multiple files are cached at the users and different files can be transmitted simultaneously throughout the network. We then formulate an optimization problem, and find the caching distribution that maximizes the density of successful receptions (DSR) under a simple transmission strategy where a single file is transmitted at a time throughout the network. We model file requests by a Zipf distribution with exponent $\gamma_r$, which results in an optimal caching distribution that is also a Zipf distribution with exponent $\gamma_c$, which is related to $\gamma_r$ through a simple expression involving the path loss exponent. We solve the optimal content placement problem for more general demand profiles under Rayleigh, Ricean and Nakagami small-scale fading distributions. Our results suggest that the request distribution must be flattened to optimize the caching performance. We also develop strategies to optimize content caching for the more general case with multiple files, and bound the DSR for that scenario.

## I Introduction

Wireless networks are experiencing a well-known, ever-rising demand for enhanced high-rate data services, in particular wireless video, which is forecast to consume around 70% of wireless bandwidth by 2019 [3]. Non-real-time video is expected to comprise half of this amount [4], and consists of large files that can be cached in the network. Meanwhile, preliminary D2D techniques have been standardized by 3GPP to allow decentralized file sharing and public safety applications [5]. D2D is intriguing since it allows increased spatial reuse and possibly very high rate communication without increased network infrastructure or new spectrum, but it is only viable when the mobile users have content that other nearby users want. Thus, it is clear that smart content caching is essential for D2D.

Caching popular content is a well-known technique to reduce resource usage and to increase content access speed and availability [6]. Infrastructure-based caching can reduce delay and, when done at the network edge, also reduce the impact on the backhaul network, which in many cases is the bottleneck in wireless networks [7]. However, this type of caching does not reduce the demand on spectral resources. To gain spectral reuse and increase the area spectral efficiency, the content must be cached on the wireless devices themselves, which allows short-range communication that is independent of the network infrastructure. D2D communication can enable proximity-based applications that involve discovering and communicating with nearby devices [8]. Synchronized distributed network architectures have been designed for D2D communications, e.g., FlashLinQ [9] and ITLinQ [10], and caching has been shown to provide increased spectral reuse in D2D-enabled networks [11]. Although order-optimal solutions for content placement are known under certain channel conditions [12, 13, 14], it is not known how to best cache content in a D2D network. Intuitively, popular content should be seeded into the users' limited storage resources in a way that maximizes the probability that a given D2D device can find a desired file within its radio range. Exploring this problem quantitatively is the goal of this paper.

### I-A Related Work

Different aspects of D2D content distribution have been studied. Scalability in ad hoc networks is considered in [15], where decentralized algorithms for message forwarding are proposed by considering a Zipf product form model for message preferences. Throughput scaling laws with caching have been widely studied [16, 17, 18]. The optimal collaboration distance, the Zipf distribution for content reuse, and the best achievable scaling for the expected number of active interference-free collaboration pairs for different Zipf exponents are studied in [19]. With a heuristic (Zipf) choice of caching distribution for Zipf distributed requests, the optimal collaboration distance [20] and the Zipf exponent that maximizes the number of D2D links are determined [17]. However, in general, the caching pmf is not necessarily the same as the request pmf. This brings us to one of the main objectives of this paper, which is to find the caching pmf that achieves the best density of successful receptions (DSR) in D2D networks.

Under the classical protocol model of ad hoc networks [21], for a grid network model with fixed cache size, as the number of users and the number of files become large, the order-optimal caching distribution is studied and the scaling of the per-node throughput is characterized [12, 22]. (The order optimality in [12, 22] is in the sense of a throughput-outage tradeoff, due to the simple model used.) The scaling of the network diameter is characterized for a multi-hop scenario [13], and the per-node throughput scaling under local multi-hop is derived in [14].

Spatial caching for a client requesting a large file that is stored at caches with limited storage is studied in [23]. Using a Poisson point process (PPP) to model the user locations, optimal geographic content placement and outage in wireless networks are studied in [24]. The probability that the typical user finds the content in one of its nearby base stations (BSs) is optimized using the distribution of the number of BSs simultaneously covering a user [25]. The performance of randomized caching in D2D networks has not been studied from a DSR maximization perspective, which is what we study in this paper.

Although the work in [17, 19] focused on the optimal caching distribution that maximizes the average number of connections, the system model was overly simplistic. They assumed a cellular network where each BS serves the users in a square cell. The cell is divided into small clusters, and D2D communications are allowed within each cluster. To avoid intra-cluster interference, only one transmitter-receiver pair per cluster is allowed, and it does not introduce interference for other clusters. In this paper, we aim to overcome these serious limitations using a more realistic D2D network model that captures simultaneous transmissions, with no restriction on the number of D2D pairs.

### I-B Contributions

This paper develops optimal content caching strategies that aim to maximize the average density of successful receptions so as to address the demands of receivers. The contributions are as follows.

Physical channel modeling using PPP. We introduce the network model in Sect. II, in which the locations of the users are modeled as a homogeneous PPP. Different from the grid-based model in [12, 22], we consider the actual physical channel model. PPP modeling makes our analysis tractable because, unlike the cluster-based model in [20], where only one pair of users is allowed to communicate in a square region, we impose no constraint on the link distance and allow a random number of simultaneous transmissions. All analysis is for a typical mobile node, which is permissible in a homogeneous PPP by Slivnyak's theorem [26]. The interference due to simultaneously active transmitters, noise and small-scale Rayleigh fading are incorporated into the analysis. Any transmission is successful as long as the Signal-to-Interference-plus-Noise Ratio (SINR) is above a threshold.

Density of successful receptions (DSR). We propose a new file caching strategy exploiting stochastic geometry and the results of [27], and we introduce the concept of the density of successful receptions (DSR). Although we do not investigate the throughput-outage tradeoff as in [12, 22], the DSR is closely related to the outage probability: it is obtained by scaling the coverage, i.e., the complement of the outage probability, with the number of receivers per unit area.

Optimal caching distribution to maximize the DSR for the sequential multi-file model. We study a randomized transmission model for users with a limited storage size in Sect. II. We propose techniques for randomized content caching based on the possible ways of prioritizing different files. In Sect. III, we start with a baseline single-file model to determine the optimal fractions of transmitters and receivers that maximize the DSR in the network model with PPP distributed user locations. In Sect. IV, we consider the more general sequential multi-file transmission scenario, where we investigate the maximum DSR in terms of the optimal fractions $\gamma_1^*$ and $\gamma_2^*$ derived in Sect. III, to determine the DSR, and optimize the caching pmf based on the randomized model.

Small-scale fading results. We formulate an optimization problem in Sect. IV-A to find the caching distribution that maximizes the DSR under a simple transmission strategy where a single file is transmitted at a time throughout the network, assuming user demands are modeled by a Zipf distribution with exponent $\gamma_r$. Under this scheme, a certain fraction of the users is active at a time, based on the distribution of the requests. In Sect. IV-B, we optimize the DSR of users for the multi-file setup, where the small-scale fading is Rayleigh distributed. We consider several special cases corresponding to 1) small but non-zero noise, 2) arbitrary noise and 3) an approximation for arbitrary noise allowing the path loss exponent $\alpha = 4$. For case 1), we show that the optimal caching strategy also has a Zipf distribution but with exponent $\gamma_c$, where $\gamma_c = \gamma_r/(\alpha/2+1)$. For case 2), we show that the same result holds based on an approximation of the coverage that is justified numerically in Sect. IV-B. This relation implies that $\gamma_c$ is smaller than $\gamma_r$, i.e., the caching distribution should be more uniform than the request distribution, yet more popular files should be cached at a higher number of users. For case 3), we obtain a distribution similar to Benford's law (detailed in Sect. IV-B) that optimizes the caching pmf. We also extend our results to general request distributions, and show that cases 1) and 2) are also valid for Ricean and Nakagami fading distributions in Sect. IV-B.

In general, the optimal DSR and the optimal caching distribution might not be tractable. Therefore, assuming the request and caching probabilities are known a priori, we weight the caching pmf to provide iterative techniques to optimize the DSR under different settings. We propose caching strategies that maximize the DSR of the least desired file and of all files, as detailed in Sect. V-B.

Optimal caching distribution to maximize the DSR for the simultaneous multi-file model. In Sect. VI, we extend our study to the simultaneous transmissions of different files and define popularity-based and global strategies. The popularity-based strategy favors the transmission of popular files and discards unpopular files. The global strategy, on the other hand, schedules all the files simultaneously, which leads to lower coverage than the sequential model. Optimization of the DSR in these cases is very intricate compared to the case of sequential modeling. Therefore, we numerically compare the proposed caching models in Sect. VI, and observe that the optimal solutions become skewed towards the most popular content in the network. Thus, we infer that under different models, the optimal caching distribution may not be a Zipf distribution, as also found in [12, 13, 14].

Insights. Our results show that the optimal caching strategy exhibits less locality of reference (abbreviated as locality) than the input stream of requests, i.e., the demand distribution (see Footnote 2). We also analyze the special case of $\alpha = 4$ using a tight approximation for the standard Gaussian $Q$-function. Using this approach we show that the optimal caching distribution can be approximated by Benford's law, which is a special bounded case of Zipf's law [30]. In Sect. VII, we validate that the Zipf distribution and Benford's law have very similar distributional characteristics, further supporting the generality of the results. For the multiple file case, we extend our results by finding lower and upper bounds for the DSR in Sect. V. Simulations show that the bounds are very accurate approximations for particular values.

Footnote 2: The performance of demand-driven caching depends on the locality exhibited by the stream of requests. The more skewed the popularity pmf, (i) the stronger the locality and the smaller the miss rate of the cache [28], and (ii) good cache replacement strategies are expected to produce an output stream of requests exhibiting less locality than the input stream of requests [29]. In [28], the authors showed that (i) and (ii) hold for caches operating under random on-demand replacement algorithms.

## II System Model

We consider a D2D mobile network model in which users are spatially distributed as a homogeneous PPP of density $\lambda$, where a randomly selected user can transmit or receive information.

In the multiple file scenario, the randomized caching model we propose is shown in Fig. 1. The model can be summarized as follows. At any time slot, only a fraction of the users is scheduled. Any user transmits with probability $\gamma_1$ and receives with probability $\gamma_2$, independently of other users. Each user has a cache with a limited storage size. If a user is selected as a receiver at a time slot, it draws a sample from the request distribution $p_r(\cdot)$, which is assumed to be Zipf distributed. If it is selected as a transmitter at a time slot, it draws a sample from the caching distribution $p_c(\cdot)$. The selection of the request distribution and the optimization of the caching distribution are detailed in Sect. IV. At any time slot, each receiver is scheduled based on closest transmitter association.

A system model for the content distribution network with multiple files is illustrated in Fig. 2. In the single file case, the content distribution network resembles a downlink cellular network, since the nearest transmitter has the content. In the multiple file case, by contrast, a farther transmitter is often the one with the file required by the receiver.

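General models for the multi-cell SINR using stochastic geometry were developed in [27], where the downlink coverage probability was derived as:

$$p_{\rm cov}(T,\lambda,\alpha) \triangleq \mathbb{P}[\mathsf{SINR} > T] = \pi\lambda \int_0^\infty e^{-\pi\lambda r\,\beta(T,\alpha) - \mu T \sigma^2 r^{\alpha/2}}\,\mathrm{d}r, \tag{1}$$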

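where $\beta(T,\alpha) = \frac{2(\mu T)^{2/\alpha}}{\alpha}\,\mathbb{E}\!\left[g^{2/\alpha}\left(\Gamma(-2/\alpha, \mu T g) - \Gamma(-2/\alpha)\right)\right]$. The expectation is with respect to the interference power distribution $g$, the transmit power is $1/\mu$, and the Signal-to-Noise Ratio (SNR) is defined at a distance of $r = 1$ and is $\mathsf{SNR} = 1/(\mu\sigma^2)$. A summary of the symbol definitions and important network parameters is given in Table I.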

###### Definition 1.

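Density of successful receptions (DSR). The performance of a randomly chosen receiver is determined by its coverage. For the homogeneous PPP with density $\lambda$, let a fraction $\gamma_1$ of all users form the transmitter process, and a fraction $\gamma_2$ of the users form the receiver process, where $\gamma_1 + \gamma_2 \le 1$. The coverage probability of a randomly chosen receiver is $p_{\rm cov}(T,\lambda\gamma_1,\alpha)$, which is the same for all receivers, and the total average number of receivers is proportional to the density $\lambda\gamma_2$. Hence, the DSR, which denotes the mean number of successful receptions per unit area, equals

$$\mathsf{DSR} = \lambda\gamma_2\, p_{\rm cov}(T,\lambda\gamma_1,\alpha) = \lambda\gamma_2 \left(\pi\lambda\gamma_1 \int_0^\infty e^{-\pi\lambda\gamma_1 r\,\beta(T,\alpha) - \mu T \sigma^2 r^{\alpha/2}}\,\mathrm{d}r\right), \tag{2}$$

where $p_{\rm cov}(T,\lambda\gamma_1,\alpha)$ is obtained by combining (1) with the thinning property of the PPP, i.e., the transmitter process, which is obtained through the thinning of the original PPP, is a homogeneous PPP with density $\lambda\gamma_1$ [31, Ch. 1].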
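The coverage integral in (1) and the DSR in (2) are straightforward to evaluate numerically. The following is a minimal sketch, not the authors' code: it assumes Rayleigh fading with $\alpha = 4$, for which $\beta(T,4) = 1 + \sqrt{T}\arctan(\sqrt{T})$, and uses a truncated composite Simpson rule. The function names, default values and the truncation constant are illustrative assumptions.

```python
import math

def beta_rayleigh_alpha4(T):
    # Assumed closed form of beta(T, 4) for Rayleigh fading.
    return 1.0 + math.sqrt(T) * math.atan(math.sqrt(T))

def p_cov(T, lam, alpha, beta, mu=1.0, sigma2=0.0, n=4000):
    """Coverage integral (1), evaluated with composite Simpson's rule (n even)."""
    a = math.pi * lam * beta      # coefficient of r in the exponent
    b = mu * T * sigma2           # coefficient of r^(alpha/2) in the exponent
    upper = 40.0 / a              # beyond this point exp(-a r) < e^-40
    if b > 0.0:
        upper = min(upper, (40.0 / b) ** (2.0 / alpha))
    h = upper / n

    def f(r):
        return math.exp(-a * r - b * r ** (alpha / 2.0))

    total = f(0.0) + f(upper)
    for k in range(1, n):
        total += (4.0 if k % 2 else 2.0) * f(k * h)
    return math.pi * lam * total * h / 3.0

def dsr(T, lam, gamma1, gamma2, alpha, beta, mu=1.0, sigma2=0.0):
    """DSR of (2): receiver density times coverage at transmitter density lam*gamma1."""
    return lam * gamma2 * p_cov(T, lam * gamma1, alpha, beta, mu, sigma2)

beta = beta_rayleigh_alpha4(1.0)
print(p_cov(1.0, 1.0, 4.0, beta))                        # noise-free coverage
print(dsr(1.0, 1.0, 0.25, 0.75, 4.0, beta, sigma2=0.1))  # DSR with noise
```

With `sigma2=0` the integral collapses to $1/\beta(T,\alpha)$ regardless of the density, which gives a quick sanity check on the quadrature.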

We consider the generalized file caching problem in D2D networks where every user randomly requests or caches some files based on the availabilities. Our goal is to maximize the DSR in (2) for a single file and for multiple files. We discuss the details of our optimization problem in Sects. III and IV.

## III DSR for a Single File

We first assume that there is a single file in the network. The single file case is the baseline model for the more general multi-file model presented in Sect. IV. Sampled uniformly at random from the PPP, a fraction $\gamma_1$ of the users form the process of the users possessing the file, and a fraction $\gamma_2$ of the users form the process of the users who want the same file. The receivers communicate with the nearest transmitter while all other transmitters act as interferers, and each transmitter can serve multiple receivers. A receiver is in coverage when its SINR from its nearest transmitter is larger than some threshold $T$. Given that the total density of receivers is $\lambda\gamma_2$, and each receiver is successfully covered with probability $p_{\rm cov}(T,\lambda\gamma_1,\alpha)$, the DSR, i.e., $\lambda\gamma_2\, p_{\rm cov}(T,\lambda\gamma_1,\alpha)$, is given by their product. In the single file scenario, since there is only one file being transmitted in the network, there is no caching pmf. Our objective in this section is to determine the optimal fractions of transmitters and receivers in the network that maximize the DSR. In Sect. IV, we consider the multiple file transmission scenario, where we use the optimal fractions of transmitters and receivers, $\gamma_1^*$ and $\gamma_2^*$, respectively, derived in this section, to determine the DSR, and optimize the caching pmf based on the randomized model outlined in Sect. II. We formulate the following optimization problem to determine $\gamma_1^*$ and $\gamma_2^*$:

$$\mathsf{DSR}^* = \max_{\gamma_1 > 0,\ \gamma_2 > 0}\ \lambda\gamma_2\, p_{\rm cov}(T,\lambda\gamma_1,\alpha) \quad \text{s.t.}\quad \gamma_1 + \gamma_2 = a,\ 0 < a \le 1, \tag{3}$$

where $p_{\rm cov}(T,\lambda\gamma_1,\alpha)$ is the coverage probability of a typical user, and $a$ is the total fraction of transmitting and receiving users in a network with density $\lambda$.

###### Lemma 1.

The fraction of transmitters should be less than that of receivers, i.e., the solution of (3) satisfies the following relation: $\gamma_1^* < \gamma_2^*$.

###### Proof.

See Appendix -A. ∎

###### Lemma 2.

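The maximum DSR for arbitrary noise and $\alpha = 4$ is given by

$$\mathsf{DSR}^* = \lambda(a-\gamma_1) \Bigg/ \left(\frac{1}{\gamma_1}\left[\frac{1}{\gamma_1} - \frac{1}{a-\gamma_1}\right]\frac{2\mu T\sigma^2}{(\pi\lambda)^2\,\beta(T,4)} + \beta(T,4)\right).$$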
###### Proof.

See Appendix -B. ∎
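Problem (3) can be explored numerically for $\alpha = 4$. The sketch below is an illustration under stated assumptions rather than the appendix derivation: it assumes Rayleigh fading with $\beta(T,4) = 1 + \sqrt{T}\arctan(\sqrt{T})$, searches a uniform grid over $\gamma_1 \in (0, a)$, and then compares the grid optimum with the closed-form expression of Lemma 2 evaluated at the same $\gamma_1$. The parameter values and helper names are hypothetical.

```python
import math

def beta4(T):
    # Assumed closed form of beta(T, 4) for Rayleigh fading.
    return 1.0 + math.sqrt(T) * math.atan(math.sqrt(T))

def p_cov_alpha4(T, lam, mu, sigma2, n=1000):
    """Coverage integral (1) for alpha = 4 via composite Simpson's rule."""
    a, b = math.pi * lam * beta4(T), mu * T * sigma2
    upper = min(40.0 / a, math.sqrt(40.0 / b)) if b > 0.0 else 40.0 / a
    h = upper / n
    f = lambda r: math.exp(-a * r - b * r * r)
    s = f(0.0) + f(upper) + sum((4.0 if k % 2 else 2.0) * f(k * h) for k in range(1, n))
    return math.pi * lam * s * h / 3.0

def best_split(T, lam, a_frac, mu, sigma2, grid=500):
    """Grid search for the transmitter fraction gamma_1 that maximizes the DSR in (3)."""
    best_g1, best_val = 0.0, 0.0
    for k in range(1, grid):
        g1 = a_frac * k / grid
        val = lam * (a_frac - g1) * p_cov_alpha4(T, lam * g1, mu, sigma2)
        if val > best_val:
            best_g1, best_val = g1, val
    return best_g1, best_val

def lemma2_value(T, lam, a_frac, mu, sigma2, g1):
    """Closed-form DSR expression of Lemma 2, evaluated at a given gamma_1."""
    b4 = beta4(T)
    bracket = (1.0 / g1) * (1.0 / g1 - 1.0 / (a_frac - g1))
    return lam * (a_frac - g1) / (bracket * 2.0 * mu * T * sigma2 / ((math.pi * lam) ** 2 * b4) + b4)

g1_star, dsr_star = best_split(T=1.0, lam=1.0, a_frac=1.0, mu=1.0, sigma2=0.1)
print(g1_star, dsr_star, lemma2_value(1.0, 1.0, 1.0, 1.0, 0.1, g1_star))
```

For these illustrative parameters the grid optimum falls below $a/2$, in line with Lemma 1, and the Lemma 2 expression agrees with the directly computed DSR up to the grid resolution.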

###### Corollary 1.

Low case, . As , the coverage can be approximated as . Hence, the maximum DSR is given as

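$$\mathsf{DSR}^* = \lambda(a-\gamma_1) \Bigg/ \left(\frac{1}{\gamma_1}\left[\frac{1}{\gamma_1} - \frac{1}{a-\gamma_1}\right]\frac{2\mu T\sigma^2}{(\pi\lambda)^2} + 1\right), \tag{4}$$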

where optimal satisfies .

###### Corollary 2.

No noise (degenerate) case. For no noise, . Maximum DSR for single file for , Rayleigh fading, no noise, and is , obtained for the optimal value of , i.e., so that there is one transmitter (see Footnote 3).

Footnote 3: In the no noise case the single file result is trivial. In the multiple file case, there will be interference due to the simultaneous transmissions of multiple files, which will be discussed in Sect. IV.

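Next, we consider the low noise approximation of the success probability, which is more easily computable than the constant noise power expression and more accurate than the no noise approximation when the noise power is non-zero. Using the expansion $e^{-x} = 1 - x + o(x)$ as $x \to 0$, the $p_{\rm cov}$ term for the small but non-zero noise case can be calculated after an integration by parts of (1) as follows

$$p_{\rm cov}(T,\lambda,\alpha) = \frac{1}{\beta(T,\alpha)} - \frac{\mu T\sigma^2 (\lambda\pi)^{-\alpha/2}}{\beta(T,\alpha)^{\alpha/2+1}}\,\Gamma\!\left(1+\frac{\alpha}{2}\right) + o(\sigma^2).$$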
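The accuracy of the low noise expansion can be checked against direct numerical integration of (1). The following is a small sketch under assumed parameter values ($\alpha = 4$, Rayleigh fading with $\beta(T,4) = 1 + \sqrt{T}\arctan(\sqrt{T})$, $\sigma^2 = 10^{-3}$), and the function names are illustrative.

```python
import math

def p_cov_exact(T, lam, alpha, beta, mu, sigma2, n=4000):
    """Direct numerical evaluation of (1) with composite Simpson's rule."""
    a, b = math.pi * lam * beta, mu * T * sigma2
    upper = 40.0 / a
    if b > 0.0:
        upper = min(upper, (40.0 / b) ** (2.0 / alpha))
    h = upper / n
    f = lambda r: math.exp(-a * r - b * r ** (alpha / 2.0))
    s = f(0.0) + f(upper) + sum((4.0 if k % 2 else 2.0) * f(k * h) for k in range(1, n))
    return math.pi * lam * s * h / 3.0

def p_cov_low_noise(T, lam, alpha, beta, mu, sigma2):
    """First-order low noise expansion of the coverage (o(sigma^2) term dropped)."""
    correction = (mu * T * sigma2 * (lam * math.pi) ** (-alpha / 2.0)
                  * beta ** (-(alpha / 2.0 + 1.0)) * math.gamma(1.0 + alpha / 2.0))
    return 1.0 / beta - correction

T, lam, alpha, mu, sigma2 = 1.0, 1.0, 4.0, 1.0, 1e-3
beta = 1.0 + math.sqrt(T) * math.atan(math.sqrt(T))  # assumed Rayleigh, alpha = 4
print(p_cov_exact(T, lam, alpha, beta, mu, sigma2))
print(p_cov_low_noise(T, lam, alpha, beta, mu, sigma2))
```

The gap between the two values shrinks quadratically in $\sigma^2$, so the expansion is only reliable when the noise term is small relative to the interference term at the transmitter density of interest.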
###### Lemma 3.

The maximum DSR for a single file for , Rayleigh fading, small noise is equal to

 DSR∗=λαβ(T,α)[1α−(γ∗1−1)α+γ∗1(2−α)o(σ2)].
###### Proof.

See Appendix -C. ∎

For $\alpha = 4$, there is a closed form expression for $\beta(T,\alpha)$ as follows: $\beta(T,4) = 1 + \sqrt{T}\arctan(\sqrt{T})$, which we use for the derivation of Lemma 4.

###### Lemma 4.

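The maximum DSR for small but non-zero noise and $\alpha = 4$ is

$$\mathsf{DSR}^* = \frac{2\lambda(a-\gamma_1)}{1+\sqrt{T}\arctan(\sqrt{T})}\left[1 - \frac{\mu T\sigma^2\, a}{\mu T\sigma^2 (2a-\gamma_1) + o(\sigma^2)}\right] + o(\sigma^2). \tag{5}$$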
###### Proof.

See Appendix -D. ∎
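As a reading aid, the leading-order form of (5) can be simplified by dropping the $o(\sigma^2)$ terms. This is a sketch of the algebra only, with $\beta(T,4) = 1+\sqrt{T}\arctan(\sqrt{T})$ denoting the denominator that appears in (5):

$$\mathsf{DSR}^* \approx \frac{2\lambda(a-\gamma_1)}{\beta(T,4)}\left[1 - \frac{a}{2a-\gamma_1}\right] = \frac{2\lambda(a-\gamma_1)^2}{(2a-\gamma_1)\,\beta(T,4)}.$$

The factor $\mu T\sigma^2$ cancels inside the bracket, so to leading order the noise power enters only through the value of the optimal $\gamma_1$. As $\gamma_1 \to 0$ the expression tends to $\lambda a/\beta(T,4)$, i.e., a receiver density $\lambda a$ multiplied by the noise-free coverage $1/\beta(T,4)$ that is the leading term of the low noise expansion.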

Discussion. In Fig. 3 (a), we illustrate the relation between and for , =0.1. To simplify the notation, we assume that and let and . As increases for , the decreases and decreases. Note that the solid lines denote the simulation results for the PPP model. In Fig. 3 (b), the variation of with respect to for , =0.1 is shown. The coverage is monotonically decreasing in and a concave increasing function of . For increasing , the value of becomes very small, and to maximize the DSR, a higher fraction of the users should be transmitters (i.e., higher $\gamma_1$) to compensate for the outage. For low , to maximize the DSR, the fraction of the receivers should be higher. Therefore, as decreases, the increases and becomes right-skewed, but decreases only slightly, which is negligible (see Footnote 4). Thus, we conclude that is largely invariant to and mainly determined by . In Fig. 3 (c), we show the variation of with . The increases with . On the other hand, decreases as the density of users increases and transmissions from an increased number of users cause high interference.

Footnote 4: This follows from the separability assumption of in and , thus insensitivity of the maximization problem to the value of , which is further detailed in Assumption 1 of Sect. IV-B, and verified in Appendix -F.

Although the single file case is trivial in the sense that it boils down to the optimization of the fractions of the transmitters and receivers that maximize the DSR, it is the baseline model for the multiple file case, where the main objective is to determine the optimal caching distribution over the set of files. We discuss the multiple file setup next.

## IV Optimizing the DSR of the Sequential Serving Model with Multiple Files

We determine the optimal caching distribution for the transmitters to maximize the DSR for the sequential serving-based strategy, in which one type of file is transmitted at a time. Later, in Sect. VI, we study the general case, where the transmissions of different files can take place simultaneously.

File Popularity Distribution. To model the file popularity in a general network, we use the Zipf distribution for $p_r(\cdot)$, which is commonly used in the literature [19]. Then, the popularity of file $i$ is given by $p_r(i) = \frac{1/i^{\gamma_r}}{\sum_{j=1}^{M} 1/j^{\gamma_r}}$, for $i = 1, \dots, M$, where $\gamma_r$ is the Zipf exponent and there are $M$ files in total. The demand distribution Zipf($\gamma_r$) is the same for all receivers of the model.
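The Zipf popularity model is simple to tabulate. A minimal sketch follows; the function name `zipf_pmf` and the example values of the number of files and the exponent are illustrative choices.

```python
def zipf_pmf(M, gamma_r):
    """Zipf popularity over M files: probability of file i is proportional to i^(-gamma_r)."""
    weights = [i ** (-gamma_r) for i in range(1, M + 1)]
    total = sum(weights)
    return [w / total for w in weights]

p_r = zipf_pmf(10, 1.0)
print(p_r[0], p_r[-1])   # most popular and least popular file
```

An exponent of zero recovers uniform demand, and larger exponents concentrate the requests on the lowest-indexed files.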

### IV-A Sequential Serving-based Model

In this model, only the set of transmitters having a specific file transmits simultaneously. Hence, this is the special case where only one file is transmitted at a time network-wide. This is illustrated in Fig. 1 in Sect. II. If a user is selected as a receiver at a time slot, it draws a sample from the request distribution $p_r(\cdot)$, which is known. If a user is randomly selected as a transmitter at a time slot, which happens with probability $\gamma_1$, it draws a sample from the caching distribution $p_c(\cdot)$, which is not known yet. At any time slot, each receiver is scheduled based on closest transmitter association. According to this model, since file $i$ is available at each transmitter with probability $p_c(i)$, using the thinning property of the PPP [31, Ch. 1], the probability of coverage for file $i$ is

$$p_{\rm cov}(T,\lambda_t p_c(i),\alpha) = \pi\lambda_t p_c(i) \int_0^\infty e^{-\pi\lambda_t p_c(i)\, r\,\beta(T,\alpha) - \mu T \sigma^2 r^{\alpha/2}}\,\mathrm{d}r, \tag{6}$$

where $\lambda_t = \lambda\gamma_1$ is the total density of the transmitting users.

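Given that the requests are modeled by the Zipf distribution, our objective is to maximize the DSR of users for the sequential serving-based model, denoted by $\mathsf{DSR}_S$, for a PPP model with density $\lambda$:

$$\max_{p_c}\ \mathsf{DSR}_S \quad \text{s.t.}\quad \sum_{i=1}^{M} p_c(i) = 1;\quad p_r(i) = \frac{1/i^{\gamma_r}}{\sum_{j=1}^{M} 1/j^{\gamma_r}},\ i = 1, \dots, M, \tag{7}$$

where $\mathsf{DSR}_S = \lambda\gamma_2 \sum_{i=1}^{M} p_r(i)\, p_{\rm cov}(T,\lambda_t p_c(i),\alpha)$, the first constraint is the total probability law for the caching distribution, the second constraint is the demand distribution modeled as Zipf with exponent $\gamma_r$, and $M$ is the number of files.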

Note that $p_{\rm cov}$ in (7) is obtained for a sequential transmission or scheduling model, and it is the same as the formulation given in (1), which follows from Theorem 1 of [27]. This model can be generalized to different scheduling schemes. For example, in Sect. VI, we introduce a more general model where multiple files are simultaneously transmitted, and obtain a coverage expression that is different from $p_{\rm cov}$ in (7), which is detailed in Theorem 2 of Sect. VI.

Similar to the optimal fractions of the transmitter and receiver processes calculated in Sect. III for the single file case, the optimal values of $\gamma_1$ and $\gamma_2$ for the multi-file case can be found by taking the derivative of (7) with respect to $\gamma_1$, which yields the following expression:

$$\sum_{i=1}^{M} \lambda\, p_r(i)\, p_c(i) \left\{\int_0^\infty \left[\frac{1}{\gamma_1} - \frac{1}{1-\gamma_1} - \pi\lambda p_c(i)\beta(T,\alpha)\, r\right] e^{-\pi\lambda\gamma_1 p_c(i)\, r\,\beta(T,\alpha) - \mu T\sigma^2 r^{\alpha/2}}\,\mathrm{d}r\right\} = 0, \tag{8}$$

where the optimal value of $\gamma_1$ and the pmf $p_c(\cdot)$ are coupled. Therefore, we first solve (7) by optimizing the pmf $p_c(\cdot)$ and then determine the value $\gamma_1^*$ that satisfies (8).

We now investigate different special network scenarios where significant simplification is possible.

### IV-B Rayleigh Fading DSR Results

We optimize the DSR of users for the multi-file setup, where the interference fading power follows an exponential distribution. We consider several special cases corresponding to 1) small but non-zero noise, 2) arbitrary noise and 3) an approximation for arbitrary noise allowing the path loss exponent $\alpha = 4$. We find the optimal caching distribution corresponding to each scenario.

###### Lemma 5.

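Small but non-zero noise. The optimal caching distribution is $p_c(i) = \frac{1/i^{\gamma_c}}{\sum_{j=1}^{M} 1/j^{\gamma_c}}$, $i = 1, \dots, M$, which is also Zipf distributed, where $\gamma_c = \gamma_r/(\alpha/2+1)$ is the Zipf exponent for the caching pmf.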

###### Proof.

See Appendix -E. ∎

Since $\alpha/2 + 1 > 1$ for any positive path loss exponent, the caching pmf exponent satisfies $\gamma_c < \gamma_r$, which implies that the optimal caching pmf that maximizes the DSR has a more uniform distribution, exhibiting less locality of reference than the request distribution, which is more skewed towards the most popular files.
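The flattening effect can be illustrated numerically. The sketch below rests on a simplification that is our reading of the low noise expansion, not necessarily the appendix proof: when the coverage of file $i$ behaves as a constant minus a term proportional to $p_c(i)^{-\alpha/2}$, maximizing the weighted coverage is equivalent to minimizing the penalty $\sum_i p_r(i)\, p_c(i)^{-\alpha/2}$. The helper names and parameter values are hypothetical.

```python
def zipf_pmf(M, gamma):
    """Zipf pmf over M files with exponent gamma."""
    w = [i ** (-gamma) for i in range(1, M + 1)]
    s = sum(w)
    return [x / s for x in w]

def flattened_pmf(p_r, alpha):
    """Caching pmf proportional to p_r(i)^(1/(alpha/2+1))."""
    expo = 1.0 / (alpha / 2.0 + 1.0)
    w = [p ** expo for p in p_r]
    s = sum(w)
    return [x / s for x in w]

def low_noise_penalty(p_r, p_c, alpha):
    """Assumed low noise penalty: sum_i p_r(i) * p_c(i)^(-alpha/2); smaller is better."""
    return sum(r * c ** (-alpha / 2.0) for r, c in zip(p_r, p_c))

M, gamma_r, alpha = 10, 1.2, 4.0
p_r = zipf_pmf(M, gamma_r)
p_c = flattened_pmf(p_r, alpha)
uniform = [1.0 / M] * M

print(low_noise_penalty(p_r, p_c, alpha))      # flattened Zipf caching
print(low_noise_penalty(p_r, p_r, alpha))      # caching pmf equal to the request pmf
print(low_noise_penalty(p_r, uniform, alpha))  # uniform caching
```

For Zipf requests the flattened pmf coincides with a Zipf pmf of exponent $\gamma_r/(\alpha/2+1)$, and in this example it yields a smaller penalty than either copying the request pmf or caching uniformly.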

###### Assumption 1.

Separability of coverage distribution. For Rayleigh, Ricean and Nakagami small-scale fading distributions, the function can be approximated as a linear function of as shown in Fig. 4. This relation greatly simplifies the analysis of the optimization problem given in (7) (see Footnote 5).

Footnote 5: Although the expression is not analytically tractable, we can approximate as a linear function of because the lower incomplete Gamma function has light-tailed characteristics. Since the channel power distribution, which is exponential due to Rayleigh fading, is also light-tailed, we can expect to observe such a linear approximation in our numerical results.

###### Lemma 6.

Arbitrary noise. For arbitrary noise, from Assumption 1, the optimal caching distribution can be approximated as a Zipf distribution given by

$$p_c(i) \approx \frac{1/i^{\gamma_c}}{\sum_{j=1}^{M} 1/j^{\gamma_c}},\quad i = 1, \dots, M, \tag{9}$$

where $\gamma_c = \gamma_r/(\alpha/2+1)$ is the Zipf exponent for the caching pmf.

###### Proof.

See Appendix -F. ∎

Interestingly, this result is the same as that of the Rayleigh fading model with small but non-zero noise developed in Sect. IV-B, which follows from the monotonic transformation [32] caused by increasing the noise power in (6). According to the pmf given in (9), the optimal caching strategy exhibits less locality of reference than the input stream of requests. Therefore, it is a good caching strategy, which will be further verified in Sect. VII. Lemma 6 suggests that a file with higher popularity should be cached less frequently than it is demanded, and an unpopular file should be cached more frequently than it is demanded. However, high popularity files should still be cached at more locations than low popularity files. The path loss evens out the file popularities, and the caching distribution should be more uniform than the request distribution. The sequential transmission model shows that for a Zipf request distribution with exponent $\gamma_r$, which is skewed towards the most popular files, the optimal caching pmf should also be Zipf distributed with the relation $\gamma_c = \gamma_r/(\alpha/2+1)$, implying that the caching pmf is more uniform than the request pmf.

The next result generalizes Lemma 6 to any request distribution rather than the Zipf distribution, and is derived solving (29) in Appendix -F using the separability of coverage from Assumption 1.

###### Theorem 1.

For arbitrary noise, if the small-scale fading is Rayleigh, Nakagami or Ricean distributed, from Assumption 1, for a general request pmf $p_r(\cdot)$, the optimal caching pmf is approximated as

$$p_c(i) \approx \frac{p_r(i)^{\frac{1}{\alpha/2+1}}}{\sum_{j=1}^{M} p_r(j)^{\frac{1}{\alpha/2+1}}},\quad i = 1, \dots, M. \tag{10}$$

From (10), the request pmf must be flattened to optimize the caching performance. Examples include the case of uniform demands, where the optimal caching distribution should also be uniform, and a Geometric($p$) request distribution, for which the caching distribution is Geometric($q$), where $q = 1 - (1-p)^{1/(\alpha/2+1)}$. In the case of Zipf demands, we recover the result of Lemma 6.
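The geometric example can be worked out directly from (10). The following is a sketch under the assumption that the request pmf is geometric on $i = 1, 2, \dots$ with success probability $p$ (a symbol introduced here for illustration) and that the number of files is large enough for truncation effects to be ignored:

$$p_r(i) = p\,(1-p)^{i-1} \;\Longrightarrow\; p_r(i)^{\frac{1}{\alpha/2+1}} \propto \left[(1-p)^{\frac{1}{\alpha/2+1}}\right]^{i-1} \;\Longrightarrow\; p_c(i) \approx q\,(1-q)^{i-1},\quad q = 1 - (1-p)^{\frac{1}{\alpha/2+1}}.$$

Because $0 < 1-p < 1$ and the exponent is less than one, $(1-p)^{1/(\alpha/2+1)} > 1-p$ and hence $q < p$, so the caching pmf decays more slowly than the request pmf. For instance, with $\alpha = 4$ and $p = 0.5$, $q = 1 - 0.5^{1/3} \approx 0.206$.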

###### Lemma 7.

An Approximation for Arbitrary Noise with $\alpha = 4$. For a total number of files $M$ and arbitrary noise with $\alpha = 4$, the optimal caching pmf is

$$p_c(i) = a_i + b\log\left(\frac{i+1}{i}\right),\quad i = 1, \dots, M, \tag{11}$$

where , , and the pmf is valid only if .

###### Proof.

See Appendix -G. ∎

The distribution in (11) of Lemma 7 is a variant of Benford's law [30], which is a special bounded case of Zipf's law. Benford's law refers to the frequency distribution of digits in many real-life sources of data and is characterized by the pmf $p(d) = \log_{10}\left(\frac{d+1}{d}\right)$, $d = 1, \dots, 9$. In distributed caching problems, the number of files, $M$, is generally much greater than 9. Therefore, we generalize the law as $p(i) = \log_{M+1}\left(\frac{i+1}{i}\right)$, $i = 1, \dots, M$. The result in (11) has a form very similar to Benford's law, with a shift parameter $a_i$ for file $i$ and a scaling parameter $b$, as determined in Lemma 7.
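Benford's law and its extension beyond nine outcomes can be tabulated in a few lines. In the sketch below, the classical first-digit pmf $\log_{10}(1 + 1/d)$ is standard, whereas the extension to $M$ files, obtained by normalizing $\log((i+1)/i)$ with $\log(M+1)$, is an assumption made here because it is the normalization under which the terms telescope to one. The function names are illustrative.

```python
import math

def benford_pmf():
    """Classical Benford first-digit law for d = 1..9."""
    return [math.log10(1.0 + 1.0 / d) for d in range(1, 10)]

def generalized_benford_pmf(M):
    """Assumed extension to M outcomes: log((i+1)/i) normalized by log(M+1)."""
    norm = math.log(M + 1)
    return [math.log((i + 1) / i) / norm for i in range(1, M + 1)]

print(sum(benford_pmf()))
print(generalized_benford_pmf(100)[:3])
```

Setting `M = 9` in the extension reproduces the classical law, and for large `M` the pmf decays roughly like $1/i$, which is why it resembles a Zipf law with a small exponent.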

## V A Lower and Upper Bound on the DSR and Different Caching Strategies

The analysis of the DSR becomes intractable for the multiple file case when the caching pmf does not have a simple form. Therefore, we derive a lower and an upper bound to characterize $\mathsf{DSR}_S$ for the sequential serving model, and provide two different caching strategies to maximize the DSR.

### V-A Bounds on $\mathsf{DSR}_S$

We provide a lower and an upper bound for $\mathsf{DSR}_S$, the DSR of the sequential serving-based transmission model with multiple files. We discussed the optimal file caching problem for multiple file scenarios in [1]. Here, we compare our solution to several bounds and other caching strategies.

#### V-A1 Upper Bound (UB)

Using the concavity of $p_{\rm cov}(T,\lambda_t p_c(i),\alpha)$ in $p_c(i)$, a UB is found as

 ∑Mi=1pr(i)pcov(T,λtpc(i),α)(a)

where (a) follows from Jensen's inequality, and (b) follows from the assumption for that yields , where .

#### V-A2 Lower Bound (LB)

Given that $p_r(\cdot)$ is Zipf distributed, the optimal $p_c(\cdot)$ also has a Zipf distribution, as proven in Lemma 6 as a solution of the maximization problem in (7). As a result, any distribution that is not skewed towards the most popular files will yield a suboptimal $\mathsf{DSR}_S$. Hence, a uniform caching distribution performs worse than the Zipf law, and a LB is found as

$$\sum_{i=1}^{M} p_r(i)\, p_{\rm cov}(T,\lambda_t p_c(i),\alpha) > \sum_{i=1}^{M} p_r(i)\, p_{\rm cov}\!\left(T,\frac{\lambda_t}{M},\alpha\right) = p_{\rm cov}\!\left(T,\frac{\lambda_t}{M},\alpha\right). \tag{13}$$
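The lower bound in (13) can be checked numerically for a specific parameter set. The following sketch evaluates the weighted coverage by direct integration for three caching pmfs: the flattened Zipf pmf with exponent $\gamma_r/(\alpha/2+1)$, the request pmf itself, and the uniform pmf that gives the bound. It assumes Rayleigh fading with $\alpha = 4$ and $\beta(T,4) = 1 + \sqrt{T}\arctan(\sqrt{T})$; all parameter values and helper names are illustrative, and a single example does not establish the inequality in general.

```python
import math

def p_cov_alpha4(T, lam, mu, sigma2, n=2000):
    """Coverage integral for alpha = 4 (assumed Rayleigh fading), Simpson's rule."""
    beta = 1.0 + math.sqrt(T) * math.atan(math.sqrt(T))
    a, b = math.pi * lam * beta, mu * T * sigma2
    upper = min(40.0 / a, math.sqrt(40.0 / b)) if b > 0.0 else 40.0 / a
    h = upper / n
    f = lambda r: math.exp(-a * r - b * r * r)
    s = f(0.0) + f(upper) + sum((4.0 if k % 2 else 2.0) * f(k * h) for k in range(1, n))
    return math.pi * lam * s * h / 3.0

def zipf_pmf(M, gamma):
    w = [i ** (-gamma) for i in range(1, M + 1)]
    s = sum(w)
    return [x / s for x in w]

def weighted_coverage(p_r, p_c, T, lam_t, mu, sigma2):
    """Left-hand side of (13): sum_i p_r(i) * p_cov(T, lam_t * p_c(i), 4)."""
    return sum(r * p_cov_alpha4(T, lam_t * c, mu, sigma2) for r, c in zip(p_r, p_c))

M, T, lam_t, mu, sigma2 = 10, 1.0, 2.5, 1.0, 0.01
p_r = zipf_pmf(M, 1.0)
p_flat = zipf_pmf(M, 1.0 / 3.0)   # gamma_c = gamma_r / (alpha/2 + 1) with alpha = 4
uniform = [1.0 / M] * M

print(weighted_coverage(p_r, p_flat, T, lam_t, mu, sigma2))
print(weighted_coverage(p_r, p_r, T, lam_t, mu, sigma2))
print(weighted_coverage(p_r, uniform, T, lam_t, mu, sigma2))  # lower bound of (13)
```

In this low noise example the flattened pmf gives the largest weighted coverage of the three, and the uniform pmf reproduces the right-hand side of (13).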

### V-B Caching Strategies for the Sequential Serving Model with Multiple Files

We propose two optimization formulations to maximize the DSR in the presence of multiple files, where the request and caching probabilities are known a priori, because in general the optimal DSR and the optimal caching distribution are not tractable. The first strategy, where we maximize the DSR for the least popular file, favors the least desired file, i.e., the file with the lowest popularity, to prevent it from fading away in the network. Therefore, we introduce the variables for files to weight the caching pmf $p_c(\cdot)$. The second strategy aims to maximize the DSR of all files by optimizing the fraction 's of the users for each file type. We assume the caching distribution is given. Then, we provide iterative techniques to solve the problems presented in this section.

#### V-B1 Maximum DSR of the Least Desired File

Our motivation behind maximizing the DSR of the least desired file is to prevent the files with low popularity from fading away in the network.

###### Lemma 8.

The caching probability of each file is weighted by so that the total fraction of transmissions for all files, denoted by satisfies . Given for some , the optimal solution is given by .

###### Proof.

See Appendix -H. ∎

#### V-B2 Maximum DSR of All Files

We maximize the DSR for all files without any prioritization.

###### Lemma 9.

The optimal solution to maximize the DSR for all files is given by for all .

###### Proof.

See Appendix I. ∎

In addition to maximizing the DSR for the sequential model, one might wish to select a file with a particular request probability and use D2D to distribute this file and all files with higher probability, or simultaneously cache all files using D2D as detailed in Sect. VI. In the next section, we describe the simultaneous transmission of multiple files, and derive expressions for the SINR distribution and the DSR.

## VI Simultaneous Transmissions of Different Files with Arbitrary Noise

We consider the multiple file case, where a typical receiver requires a specific set of files, and its transmitter candidates are the transmitters that contain any of the requested files. Each receiver gets the file from the closest transmitter candidate. The rest of the active transmitters, which do not have the requested files, are the interferers. We provide a detailed analysis of the coverage next.

Assume that each receiver has a state, determined by the set of files it requests. For a receiver in state , the set of requested files is . Let the tagged receiver be and in state , and be the set of transmitters that a receiver in state can get data from. Hence, the set of transmitter candidates for user in state is the superposition given by , where is the set of transmitters containing file . Let be the density of , where . The rest of the transmitters, i.e., , is an independent process with density .

The sum gives the probability that the user has at least one of the files requested by any receiver in state . Hence, the density of the transmitter candidates for a receiver in state is given by the product of and , i.e., . Therefore, using the nearest neighbor distribution of the typical receiver in state , the distance to its nearest transmitter is distributed as , for and .

We assume that all users experience Rayleigh fading with mean , and a constant transmit power of . Assume that user is at o, is in state , and is a receiver; that x is the tagged transmitter, denoted by ; and that the distance between them is . Then the SINR at user is , where is the channel gain parameter between and , is the white Gaussian noise, and is the total interference at node in state , given by the following expression: , where is the channel gain from the interferer to the receiver , and is the interferer-to-receiver distance on . The first term is the interference due to the set of transmitters that have the files requested by the receiver, and the second term is the interference due to the rest of the transmitters, which do not have any of the files desired by the receiver. The total interference depends on the transmission scheme. Compared to the nearest user association [27], it is hard to characterize the interference in dynamic caching models with different association techniques.

###### Theorem 2.

The probability of coverage of a typical user conditioned on being at state is given by the expression in (14). (Footnote 6: The definition of here is different from the definition of the classical downlink coverage probability given in (1), due to the possibility of simultaneous transmissions of different file types.)

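$$P_{\mathrm{cov}}(T, \lambda_j, \alpha) = \pi\lambda_j \int_{0}^{\infty} e^{-\pi\lambda_j v\,(1-\rho_2(T,\alpha)) - \pi\lambda_t v\,(\rho_1(T,\alpha)+\rho_2(T,\alpha)) - T\sigma^2 v^{\alpha/2}}\, \mathrm{d}v, \tag{14}$$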

where and .

###### Proof.

See Appendix J. ∎
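The integral in (14) has no closed form for general path loss exponents, so it can be checked numerically. The following is a minimal sketch, not the authors' code: it treats the values of ρ₁(T, α) and ρ₂(T, α) as given numbers (their definitions are not evaluated here), and the function name, the truncation point, and the use of composite Simpson's rule are all illustrative assumptions.

```python
import math

def coverage_probability(T, lam_j, lam_t, alpha, sigma2, rho1, rho2, n=20000):
    """Numerically evaluate the integral in (14) with composite Simpson's rule.

    rho1 and rho2 stand for rho_1(T, alpha) and rho_2(T, alpha). They are passed
    in as plain numbers (an assumption of this sketch); n must be even.
    """
    # Linear decay rate in v contributed by the two density terms of (14).
    a = math.pi * lam_j * (1.0 - rho2) + math.pi * lam_t * (rho1 + rho2)
    if a <= 0:
        raise ValueError("the exponent must decay in v for the truncation to be valid")
    b = T * sigma2  # noise coefficient multiplying v**(alpha/2)

    def integrand(v):
        return math.exp(-a * v - b * v ** (alpha / 2.0))

    v_max = 40.0 / a  # exp(-40) is about 4e-18, so the truncated tail is negligible
    h = v_max / n
    total = integrand(0.0) + integrand(v_max)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * integrand(k * h)
    return math.pi * lam_j * total * h / 3.0
```

With σ² = 0 the integrand is a pure exponential, so the result should reduce to πλ_j divided by the decay rate `a`, which gives a quick sanity check on the quadrature.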

We now consider the special case of the path loss exponent $\alpha = 4$, which is more tractable.

###### Corollary 3.

Letting , the probability of coverage of a typical user conditioned on being at state for the special case of and is given by

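$$P_{\mathrm{cov}}(T, \lambda_j, 4) = \pi\lambda_t p_j \sqrt{\frac{\pi}{T\sigma^2}}\; e^{\frac{H(T,\lambda_t,p_j)^2}{2}}\, Q\big(H(T,\lambda_t,p_j)\big). \tag{15}$$

###### Proof.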

See Appendix K. ∎
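As a consistency check between (14) and (15), the closed form can be compared with direct quadrature. This sketch rests on two inferences that are not quoted from the text: that λ_j = λ_t p_j (read off from the prefactors of (14) and (15)), and that H equals the linear exponent coefficient of (14) divided by √(2Tσ²), which is the value that makes the Gaussian integral identity match (15). As before, ρ₁ and ρ₂ are supplied as numbers, and all function names are hypothetical.

```python
import math

def q_function(x):
    """Gaussian tail probability Q(x) = 0.5 * erfc(x / sqrt(2))."""
    return 0.5 * math.erfc(x / math.sqrt(2.0))

def linear_coefficient(lam_t, p_j, rho1, rho2):
    """Coefficient of v in the exponent of (14), assuming lam_j = lam_t * p_j."""
    return math.pi * lam_t * p_j * (1.0 - rho2) + math.pi * lam_t * (rho1 + rho2)

def coverage_alpha4(T, lam_t, p_j, sigma2, rho1, rho2):
    """Closed form (15) for alpha = 4 and sigma2 > 0.

    H is inferred here, not copied from the paper. For large H the factor
    exp(H**2 / 2) can overflow, so this form suits moderate arguments only.
    """
    a = linear_coefficient(lam_t, p_j, rho1, rho2)
    H = a / math.sqrt(2.0 * T * sigma2)
    prefactor = math.pi * lam_t * p_j * math.sqrt(math.pi / (T * sigma2))
    return prefactor * math.exp(H * H / 2.0) * q_function(H)

def coverage_alpha4_numeric(T, lam_t, p_j, sigma2, rho1, rho2, n=20000):
    """Simpson's rule on (14) with alpha = 4, truncated where the integrand vanishes."""
    a = linear_coefficient(lam_t, p_j, rho1, rho2)
    b = T * sigma2
    v_max = 40.0 / a
    h = v_max / n  # n must be even
    f = lambda v: math.exp(-a * v - b * v * v)
    total = f(0.0) + f(v_max)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * f(k * h)
    return math.pi * lam_t * p_j * total * h / 3.0
```

Agreement between the two functions for a few parameter choices supports the reconstruction of H used here, but it does not replace the derivation in the appendix.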

Since the term is increasing in and converges to in the limit as goes to infinity, is increasing in , and positive. Furthermore, is monotonically increasing in . This observation is essential in the characterization of the DSR under different user criteria.

We consider two different strategies for the simultaneous transmission of multiple files, namely popularity-based and global models, which differ mainly in the set of files cached at the transmitters.

### VI-A Popularity-based DSR

In this approach, a set of files corresponding to the most popular ones in the network is cached simultaneously at all transmitters. We define $\mathrm{DSR}_P$, which stands for the DSR of the popularity-based approach, and is calculated over the set of most popular files as

$$\mathrm{DSR}_P = \lambda\gamma_2 \sum_{k \in K} p_r(k)\, P_{\mathrm{cov}}(T, \xi_l, \alpha), \tag{16}$$

where is the set of the most popular files, and , where is a set corresponding to the most popular files cached at the transmitters among the set of available files in the caches.

Consider the special case of (16), where only the most popular file in the network is cached at all the transmitters if available, i.e., , which modifies (16) as , where follows from the fact that for , the coverage probability becomes the same as that of the sequential serving-based model in Sect. IV. The most popular file index can be found from the demand distribution and is given by , and hence the corresponding density of the transmitters is , where for all .

### VI-B Global DSR

Global DSR is defined as the average performance of all users in the network, which is determined by the spatial characteristics of file distributions and the coverage of a typical user. The DSR function in our model is state dependent since the coverage probability of a user is determined according to the files requested by the user. The expected global DSR is given as follows:

$$\mathrm{DSR}_G = \lambda\gamma_2 \sum_{i=1}^{M} p_r(i)\, P_{\mathrm{cov}}(T, \gamma_1\lambda p_c(i), \alpha). \tag{17}$$
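The sum in (17) is straightforward to evaluate once a coverage function is available. The sketch below is illustrative only: `zipf_pmf`, `global_dsr`, and `toy_coverage` are hypothetical names, the reading of γ₁ and γ₂ as the transmitter and receiver fractions is an interpretation of (17), and `toy_coverage` is a simple increasing stand-in rather than the P_cov of (14).

```python
def zipf_pmf(M, exponent):
    """Zipf pmf over file indices 1..M: p(i) proportional to i**(-exponent)."""
    weights = [i ** (-exponent) for i in range(1, M + 1)]
    total = sum(weights)
    return [w / total for w in weights]

def global_dsr(lam, gamma1, gamma2, p_r, p_c, coverage):
    """Evaluate (17): lam * gamma2 * sum_i p_r(i) * coverage(gamma1 * lam * p_c(i)).

    coverage maps a transmitter density to a coverage probability, with T and
    alpha assumed to be fixed inside it.
    """
    return lam * gamma2 * sum(
        pr * coverage(gamma1 * lam * pc) for pr, pc in zip(p_r, p_c)
    )

def toy_coverage(density):
    """Hypothetical stand-in for P_cov: increasing in the density, bounded by 1."""
    return density / (density + 1.0)

# Example: 10 files, Zipf requests, and a flatter Zipf caching pmf.
requests = zipf_pmf(10, 1.0)
caching = zipf_pmf(10, 0.5)
dsr = global_dsr(lam=1.0, gamma1=0.5, gamma2=0.5, p_r=requests, p_c=caching,
                 coverage=toy_coverage)
```

Passing the coverage function as an argument keeps the weighted sum separate from the channel model, so a numerical evaluation of (14) could be substituted for `toy_coverage` without changing `global_dsr`.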

A Discussion on the Various Transmission Models. The popularity-based transmission model and the global model in this section do not depend on the cache states. Instead, they both depend on the global file popularity distributions, and have similar characteristics, as given in (16) and (17). Intuitively, the optimal caching distributions in both models follow trends similar to the request distribution. The sequential serving-based model in Sect. IV-A boils down to the scenario characterized in [27], where only a subset of transmitters and their candidate receivers are active simultaneously. Hence, this model mitigates interference and provides higher coverage than the other models. However, since the DSR is a weighted function of the file transmit pmf , the DSR of the model is reduced.

We now present numerical results for the general transmission models discussed, covering the popularity-based DSR, the global DSR, and the sequential DSR.

State dependent coverage probability. We illustrate the coverage probability for varying for a fixed fraction of transmitters () in Fig. 6. The coverage probability is state dependent (the receiver’s state refers to the collection of files it requests), and for the receiver in state , the density of transmitters is given by where . If the requested files are available in the set of transmitters, then the receiver has higher coverage. Therefore, for a higher fraction of transmitters , the coverage probability is higher.

Caching performance of the proposed transmission models. The optimal caching strategies for the caching problems of Sect. VI given in (16) and (17) are not necessarily Zipf distributed. However, without the Zipf distribution assumption, the optimization formulations become intractable since $P_{\mathrm{cov}}$ in (14) is nonlinear in the density of the users. Therefore, for simulation purposes, we find the optimal Zipf caching exponents that maximize the proposed DSR functions.
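Restricting the caching pmf to the Zipf family turns the search into a one-dimensional problem over the caching exponent. A minimal sketch of such a search is a grid scan, shown below. The grid, the function names, and in particular `toy_objective` are assumptions made for illustration: the objective is a concave stand-in, not the DSR functions of (16) or (17).

```python
def zipf_pmf(M, exponent):
    """Zipf pmf over file indices 1..M: p(i) proportional to i**(-exponent)."""
    weights = [i ** (-exponent) for i in range(1, M + 1)]
    total = sum(weights)
    return [w / total for w in weights]

def toy_objective(p_r, p_c, density=1.0):
    """Hypothetical stand-in for a DSR function: request-weighted toy coverage."""
    return sum(pr * (density * pc) / (density * pc + 1.0) for pr, pc in zip(p_r, p_c))

def best_zipf_exponent(p_r, objective, grid):
    """Scan candidate caching exponents and return the best (exponent, value) pair."""
    best_exponent, best_value = None, float("-inf")
    for exponent in grid:
        value = objective(p_r, zipf_pmf(len(p_r), exponent))
        if value > best_value:
            best_exponent, best_value = exponent, value
    return best_exponent, best_value

# Example: 20 files, request exponent 1.2, caching exponents scanned on [0, 3].
requests = zipf_pmf(20, 1.2)
grid = [k / 20 for k in range(61)]
exponent, value = best_zipf_exponent(requests, toy_objective, grid)
```

A grid scan is used here because it makes no smoothness assumptions about the objective; a finer grid or a scalar optimizer could refine the result if the objective is known to be unimodal in the exponent.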

DSR comparison. We investigate the variation of the DSR of the sequential model with respect to the caching parameter . From Fig. 6, we observe that the DSR increases with the request distribution parameter , assuming both distributions are Zipf. In Figs. 8 and 8, we illustrate the variation of the DSR of the popularity-based model and the global model with . In both figures, it is clearly seen that as the requests become more skewed (higher ), the DSR increases. It also increases with , which implies that the optimal caching distribution should also be skewed towards the highly popular files.

## VII Numerical Results and Discussion

We evaluate the optimal caching distributions that maximize the DSR. The simulation results are based on Sects. IV and V. We consider a general network model with Rayleigh fading distribution with and for small and general noise solutions. The requests are modeled by .

Benford versus Zipf distributions. In Figs. 10 and 10, we illustrate the trend of optimal Zipf caching distribution and the Benford law developed in Sect. IV for different numbers of total files. As seen from Fig. 10, these two distributions have similar characteristics. However, as increases, the range of for which Benford caching distribution in (11) and Zipf laws are comparable becomes narrower. For , it is not practical to approximate the Benford law with a Zipf distribution. In fact, as described in Sect. IV, as the noise level decreases, i.e., drops, the optimal caching strategy converges to Zipf distribution. As seen in Fig. 10, for small noise, i.e., for high , these laws behave similarly for relatively high values compared to the general noise case.

We now compare the DSR of the sequential serving model for various based on the optimal solutions that are also Zipf distributed, as derived in Sect. IV, and the lower and upper bounds obtained in Sect. V. The numerical solutions are obtained by calculating the DSR of various (random) caching distributions and picking the one that achieves the highest DSR.
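The random search described above can be sketched as follows. The sampling scheme is an assumption, since the text does not say how the random caching distributions are drawn: here candidates are sampled uniformly from the probability simplex by normalizing independent exponential draws, and the uniform pmf is used as the starting candidate. The objective is supplied by the caller, and all names are hypothetical.

```python
import random

def random_pmf(M, rng):
    """Draw a pmf uniformly from the simplex by normalizing exponential samples."""
    draws = [rng.expovariate(1.0) for _ in range(M)]
    total = sum(draws)
    return [d / total for d in draws]

def random_search(objective, M, trials, seed=0):
    """Evaluate random caching pmfs and keep the one with the highest objective."""
    rng = random.Random(seed)          # fixed seed keeps the search reproducible
    best_pmf = [1.0 / M] * M           # start from uniform caching
    best_value = objective(best_pmf)
    for _ in range(trials):
        candidate = random_pmf(M, rng)
        value = objective(candidate)
        if value > best_value:
            best_pmf, best_value = candidate, value
    return best_pmf, best_value

# Example with a hypothetical objective: request-weighted toy coverage.
requests = [0.5, 0.25, 0.15, 0.10]

def toy_objective(p_c):
    return sum(pr * pc / (pc + 1.0) for pr, pc in zip(requests, p_c))

pmf, value = random_search(toy_objective, M=4, trials=2000)
```

Random search scales poorly with the number of files, so for large catalogs it serves mainly as a check on structured solutions such as the Zipf family rather than as a practical optimizer.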

Zipf caching with is a good approximation to maximize the DSR. In Fig. 12, we compare the performance of different caching strategies for a Zipf request distribution with parameter and . The Zipf caching distribution with parameter is very close to the optimal solution evaluated numerically, which is in turn very close to the simple lower bound derived in (13). Furthermore, the Benford distribution has characteristics very similar to those of the optimal caching distribution. There is a large gap between the UB and the no-noise case in terms of the DSR, and the DSR for the no-noise case is the highest among all for all or values.