Optimization and Analysis of Probabilistic Caching in N-tier Heterogeneous Networks
Abstract
In this paper, we study probabilistic caching for an N-tier wireless heterogeneous network (HetNet) using stochastic geometry. A general and tractable expression of the successful delivery probability (SDP) is first derived. We then optimize the caching probabilities to maximize the SDP in the high signal-to-noise ratio (SNR) regime. The problem is proved to be convex and is solved efficiently. We next establish an interesting connection between N-tier HetNets and single-tier networks. Unlike the single-tier network, where the optimal performance depends only on the cache size, the optimal performance of N-tier HetNets also depends on the BS densities. The performance upper bound is, however, determined by an equivalent single-tier network. We further show that with uniform caching probabilities regardless of content popularities, to achieve a target SDP, the BS density of a tier can be reduced by increasing the cache size of the tier when the cache size is larger than a threshold; otherwise the BS density and BS cache size can be increased simultaneously. It is also found analytically that the BS density of a tier is inversely proportional to the BS cache size of the same tier and is linear in the BS cache sizes of other tiers.
I Introduction
Global mobile data traffic is estimated to increase to 30.6 exabytes per month by 2020, an eightfold growth over 2015, and the share contributed by video is foreseen to increase over the same period [2]. To address this mobile data tsunami and hence meet the capacity requirement for the future 5G network [3], an effective and promising candidate solution is to deploy a dense network with heterogeneous base stations (BSs), such as macro BSs, relays, femto BSs and pico BSs [4]. The heterogeneous network (HetNet) can provide higher throughput and spectral efficiency. At the same time, it faces two challenges. One is the tremendous burden on the backhaul link due to the explosive demand for video contents during peak time, and the other is the high CAPEX and OPEX due to the denser BSs.
Recently, caching popular contents at BSs has been introduced as a promising technique to offload mobile data traffic in cellular networks [5, 6]. Unlike the communication resources, the storage resources are abundant, economical, and sustainable. By exploiting the abundance of the storage resources in wireless networks, significant gains in network capacity through caching can be expected [7], which enables caching to be an essential functionality of emerging wireless networks [8].
The aim of this work is to study how caching can address the aforementioned challenges in a multi-tier HetNet. Specifically, we first would like to find the optimal cache placement strategy that minimizes the traffic burden on the backhaul links. Second, we would like to find out whether the deployment cost of dense BSs can be traded for BS cache storage and, if so, what the tradeoffs are and what conditions must be met for this to happen.
I-A Related Work
Caching has the potential to alleviate the heavy burden on the capacity-limited backhaul link and also improves user-perceived experience [9]. Utilizing the tool of stochastic geometry [10], the work [11] formulates the caching problem in a scenario where small BSs are distributed according to a homogeneous Poisson Point Process (HPPP). The authors in [12] consider a two-tier HetNet and derive a closed-form expression of the outage probability by jointly considering spectrum allocation and storage constraints. In [13], the authors consider a 3-tier HPPP-based HetNet with caching and theoretically elaborate the average ergodic rate, outage probability and delay. Considering an HPPP-based cache-enabled small cell network, a closed-form expression of the outage probability and the optimal BS density to achieve a target hit probability are derived in [14]. The work [15] proposes a cluster-centric small cell network and designs a cooperative transmission scheme to balance transmit diversity and content diversity. It is worth noting that these works mainly focus on the performance analysis of cache-enabled wireless networks for given caching strategies, such as caching the most popular contents.
Caching strategy is an important issue for cache-enabled wireless networks. Previous works on the optimal caching strategy design can be classified into two trends based on whether channel fading and interference are considered. The early trend focuses on the connection topology only while ignoring channel fading and interference. The authors in [16] formulate a cache placement problem in distributed helper stations to minimize the average download delay with both uncoded and coded caching. It is shown in [16] that the optimal caching problem in a wireless network with fixed connection topology is an NP-hard problem (without coding). In [17], a joint routing and caching design problem is studied to maximize the content requests served by small BSs. By reducing the NP-hard optimization problem to a variant of the facility location problem, algorithms with approximation guarantees are established. The second trend takes into account channel fading and interference for caching optimization, mostly by utilizing the tool of stochastic geometry. The work [18] proposes an optimal randomized caching policy to maximize the total hit probability and overviews different coverage models to evaluate the performance. The works [19, 20, 21] optimize the probabilistic caching strategy to maximize the successful download probability in small cell networks. Further, a closed-form expression for the optimal caching probabilities is obtained in the noise-limited scenario in [20]. In [22], a greedy algorithm is proposed to find the optimal caching strategy to minimize the average bit error rate. The work [23] studies the problem of joint caching, routing, and channel assignment for video delivery over coordinated multi-cell systems of the future Internet.
Recently, caching strategy optimization has been extended to wireless heterogeneous networks. The combination of optimal caching and network heterogeneity brings more gains in network capacity. Utilizing the tool of stochastic geometry, the works [24, 25] investigate the optimal probabilistic caching at helper stations while assuming deterministic caching at macro stations to maximize the successful transmission probability in a two-tier HetNet. Based on [18], the work [26] considers different types of BSs with different cache capacities. The cache optimization problem for the first type of BSs is solved by assuming that the placement strategy for other types of BSs is given. The joint probabilistic caching optimization problem for all types of BSs has not yet been considered, and little analytical insight on the cache design and system performance is available. In general, the joint optimization of probabilistic cache placements in different tiers of a HetNet is very challenging due to the different tier association probabilities brought by the content diversity as well as the complicated interference distribution caused by the network heterogeneity.
Furthermore, a tradeoff between the small BS density and total storage size is first presented in [11], where each small BS caches the most popular contents. Using the optimal caching scheme, [25] shows that the helper density can be traded for the cache size to achieve a target area spectral efficiency. Note that the tradeoff studies in [11, 25] are conducted numerically only, without theoretical analysis. Deriving and analyzing the tradeoff theoretically remains an open problem. In [27], the authors address the question of how much caching is needed to achieve linear capacity scaling in a dense wireless network, based on a scaling-law method.
I-B Contributions
In this work, we first investigate the optimal probabilistic caching to maximize the successful delivery probability (SDP) in a general N-tier wireless cache-enabled HetNet. We next establish an interesting connection between N-tier HetNets and single-tier networks. We then address the tradeoffs between the BS caching capability and the BS density analytically based on the uniform caching strategy. The main contributions are summarized as follows:

Analyzing and optimizing the SDP for the N-tier HetNet: We derive the tier association probability and the SDP by modeling the BS locations in the HetNet as N tiers of independent HPPPs. The optimal probabilistic caching problem for maximizing the SDP is then formulated. We prove that this problem is concave in the high signal-to-noise ratio (SNR) regime. The sufficient and necessary conditions for the optimal solution are derived.

Highlighting the connection between N-tier HetNets and single-tier networks: We further study the optimal caching problem in special cases, and find that the maximum SDP of single-tier networks depends only on the cache size, while that of N-tier HetNets is also determined by the BS densities and transmit powers. Moreover, in the high SNR regime, we prove that there exists a single-tier network such that the maximum SDP of the N-tier HetNet is upper bounded by that of the single-tier network. When all tiers of BSs have the same cache size, the N-tier HetNet performs the same as the single-tier network, regardless of the network heterogeneity.

Presenting insights on the impacts of the key network parameters: We first show that the optimal performance of single-tier networks is independent of the BS density and transmit power. Then, under the uniform caching strategy, we analytically present the impacts of the BS cache size, density and transmit power of each tier on the system performance. Numerical results also verify our analytical results.

Revealing the tradeoffs between the BS density and the BS cache size: With uniform caching strategy, our analysis reveals that, to maintain a target SDP, the network parameters are related as follows: increasing the BS caching capability can reduce the BS density when its cache size is larger than a threshold ; the BS density is inversely proportional to the cache size in the same tier, i.e., . Here, denotes the index of the tier and is independent of and . For the different tiers, we prove that the BS density is a linear function of the cache size , i.e., , for , where , and are independent of and . Likewise, we reveal the similar tradeoffs between the BS transmit power and the BS cache size.
The rest of this paper is organized as follows. Section II presents the system model. The performance metric is analyzed in Section III. In Section IV, we formulate and solve the optimal caching problem. Then, the impacts and tradeoffs of the network parameters are shown in Section V. The numerical and simulation results are presented in Section VI, and the conclusions are drawn in Section VII.
II System Model
II-A Network and Caching Model
We consider a general wireless cache-enabled HetNet consisting of N tiers of BSs, where the BSs in different tiers are distinguished by their transmit powers, spatial densities, biasing factors, and cache sizes. (Footnote 1: These model assumptions indicate that BSs in different tiers have different traffic loads to handle, and also reflect the demand heterogeneity in different BSs.) The locations of BSs in each tier are spatially distributed according to an independent HPPP, denoted as with density for . A three-tier HetNet including macro BSs, relays and pico BSs, is illustrated in Fig. 1. Consider the downlink transmission. Time is divided into discrete slots with equal duration and we study one slot of the system. For the wireless channel, both large-scale fading and small-scale fading are considered. The large-scale fading is modeled by a standard distance-dependent path loss attenuation with path loss exponent . The Rayleigh fading channel is considered as the small-scale fading, i.e., . Each user receiver experiences an additive noise that obeys zero-mean complex Gaussian distribution with variance .
Consider a database consisting of contents denoted by , and all the contents are assumed to have equal length. (Footnote 2: The extension to the general case where contents are of different lengths is quite straightforward, since the contents can be divided into chunks of equal size.) Each user requests only one content in each time slot. The content popularity distribution is identical among all users, represented by , where each user requests the th content with probability and . The content popularity is assumed to be known a priori for cache placement. Without loss of generality, we assume . Each BS is equipped with a cache storage. The cache capacities of tier BSs are denoted as , where each BS in the th tier can store at most () contents.
When a user submits a content request, the content will be delivered directly from the local cache of a BS that has cached it. If the content is not cached in any BS, it will be downloaded from the core network through backhaul links. Since the main purpose of this paper is to optimize the caching strategy to offload the backhaul traffic, we only consider the transmission of the cached contents at BSs, same as [19].
We adopt the probabilistic caching strategy and assume all the BSs in the same tier use the same caching probabilities. Each BS caches contents with the given probabilities independently of other BSs. Define as the caching probability matrix where denotes the probability that a BS in the th tier caches the th content. It must satisfy
(1)  
(2) 
Note that the conditions (1) and (2) are sufficient and necessary for the existence of a random content placement policy requiring no more than slots of storage at each BS in tier for [18]. Also note that if a BS realizes the caching strategy by caching each file at random with the given probability but independently of other files, the actual cache memory in the BS can be exceeded or wasted. To strictly meet the instantaneous cache size constraint (2) at each BS, a novel content placement approach is proposed in Section II-C of [18]. (Footnote 3: The file ordering in this content placement approach can be varied arbitrarily at each particular realization if different file combinations are desired.) This approach introduces dependency among the caching events of different files, but such cache dependency is irrelevant to the analysis in this work.
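To make the placement step concrete, the following is a minimal sketch of one way to draw a cache realization that never exceeds the cache size while keeping each file's marginal caching probability. It follows our reading of the interval-stacking idea attributed to [18]; the function name `sample_cache` and the example probabilities are illustrative assumptions, not taken from the paper.

```python
import random
from itertools import accumulate


def sample_cache(probs, cache_size, rng=random):
    """Draw one cache realization holding at most `cache_size` files.

    probs[i] is the caching probability of file i. Each value must lie in
    [0, 1] and the values must sum to at most cache_size, mirroring
    constraints (1) and (2).
    """
    assert all(0.0 <= p <= 1.0 for p in probs)
    assert sum(probs) <= cache_size + 1e-12
    # Lay the probabilities end to end on [0, cache_size): file i owns the
    # interval [edges[i], edges[i + 1]).
    edges = [0.0] + list(accumulate(probs))
    u = rng.random()  # a single uniform offset shared by all storage slots
    cached = set()
    for slot in range(cache_size):
        point = slot + u  # probe the same offset in every unit-length row
        for i in range(len(probs)):
            if edges[i] <= point < edges[i + 1]:
                cached.add(i)
                break
    return cached
```

Because every interval has length at most one and the probe points are spaced one unit apart, a file is selected at most once, and it is selected with probability equal to its interval length. The caching events of different files are dependent, as noted above, but each BS can still draw its own offset independently of the other BSs.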
II-B Probability of Tier Association
Without loss of generality, we carry out our analysis for a typical user, denoted as , located at the origin as in [10]. In the cache-enabled HetNet, the user association policy depends not only on the received signal strength but also on the requested and cached contents. Specifically, when requests content , it is associated with the strongest BS among those that have cached content from all the tiers based on the average received signal power. Denote the distance between and the nearest BS caching content in the th tier by . According to our tier association policy, the index of the tier that is associated with for content is:
(3) 
where and are the association bias factor and the transmit power of BSs in the th tier, respectively. For notational simplicity, we assume , , in the rest of the paper.
It is essential to determine each tier’s association probability when a user requests a content. Since each BS caches contents independently of other BSs, the locations of the BSs caching content in the th tier can be modeled as a thinned HPPP with density [28]. (Footnote 4: The application of the thinning property of HPPP is based on the assumption that each BS caches contents independently of other BSs; hence it is not affected by the realization method of [18].) Then, we have the following lemma.
Lemma 1.
The probability of being associated with the th tier for content is given by
(4) 
Proof.
Please refer to Appendix A. ∎
This lemma states that the association probability is determined directly by the density and the transmit power of the thinned HPPP .
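As a sanity check on the thinning argument, the sketch below estimates the tier association probabilities by Monte Carlo and compares them with a closed form. It assumes that (4) takes the standard form for nearest-BS association over independent HPPPs with unit bias, namely that the probability for tier k is proportional to (thinned density of tier k) × (transmit power of tier k)^(2/α). That form, the function names and the parameter values are assumptions made for illustration.

```python
import math
import random


def association_closed_form(densities, powers, cache_probs, alpha):
    """Tier association probabilities under the assumed form of (4)."""
    weights = [lam * p * pw ** (2.0 / alpha)
               for lam, pw, p in zip(densities, powers, cache_probs)]
    total = sum(weights)
    return [w / total for w in weights]


def association_monte_carlo(densities, powers, cache_probs, alpha,
                            trials=100_000, rng=random):
    """Estimate the same probabilities by sampling nearest-BS distances.

    At least one tier must cache the content with positive probability.
    """
    wins = [0] * len(densities)
    for _ in range(trials):
        best_tier, best_loss = None, math.inf
        for k, (lam, pw, p) in enumerate(zip(densities, powers, cache_probs)):
            thinned = lam * p  # density of tier-k BSs that cache the content
            if thinned <= 0.0:
                continue
            # Nearest-point distance R of a planar HPPP satisfies
            # P(R > r) = exp(-thinned * pi * r^2).
            r = math.sqrt(-math.log(1.0 - rng.random()) / (math.pi * thinned))
            # Smallest r^alpha / power is the largest average received power.
            loss = r ** alpha / pw
            if loss < best_loss:
                best_tier, best_loss = k, loss
        wins[best_tier] += 1
    return [w / trials for w in wins]
```

With, for example, densities `[1.0, 5.0, 10.0]`, powers `[20.0, 2.0, 0.5]`, caching probabilities `[0.8, 0.5, 0.2]` and path loss exponent 4, the two functions should agree up to sampling noise.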
III Performance Analysis
In this section, we analyze the SDP for a given probabilistic caching scheme . Consider that all the BSs operate in the fully loaded state and share the common bandwidth [10]. By using an orthogonal multiple access strategy within a cell, the intra-cell interference is not considered here, and only the interference introduced by inter-tier cells and other intra-tier cells is incorporated into the analysis. Given that sends a request for content and is associated with the th tier, the received instantaneous signal-to-interference-plus-noise ratio (SINR) of is given by
(5) 
where is the distance from to its serving BS in the associated tier, denotes the distance between and the th interfering BS in the th tier, is the small-scale fading channel gain between and the serving BS (the th interfering BS). The delivery of content from tier is successful when the received SINR of is larger than a threshold . Thus, the SDP of content from tier can be expressed as follows. (Footnote 5: The SDP in a cache-enabled network depends on both the average received signal strength and the caching distribution, which is different from the traditional coverage probability where is always associated with the strongest BS.)
(6) 
Recall that the locations of BSs caching content in the tier can be modeled as a thinned HPPP; then the probability density function (PDF) of is given below. (Footnote 6: and can also be obtained by applying the thinned HPPP to Lemma and Lemma of [29], respectively.)
Lemma 2.
The PDF of is
(7) 
Proof.
Please refer to Appendix B. ∎
Note that, unlike the conventional network without caching where each user is associated with the strongest BS and there exists only one type of interfering BSs, in the multi-tier cache-enabled HetNet considered in this work, the interference to when it is associated with the th tier for content can be divided into two groups. The first group comes from all the BSs (except the serving BS ) in each tier that have stored content , the locations of which can be modeled as a thinned HPPP, denoted as with density for . The second group comes from all the BSs in each tier that do not cache content , the locations of which can also be modeled as a thinned HPPP, denoted as with density for . For the first interference group, the distance from to each interfering BS in is at least times the distance from to its serving BS according to (30), as a consequence of our association policy. For the second group, the interfering BSs could be very close to . By carefully handling these two groups of interference in N-tier HetNets, we derive an analytical expression for in the following proposition.
Proposition 1.
The SDP is
(8) 
where , and . Furthermore, denotes the Gauss hypergeometric function, and is the Beta function defined as .
Proof.
Please refer to Appendix C. ∎
By the law of total probability, the average SDP for is given by
(9) 
Substituting (8) and (4) into (9), we obtain a tractable expression of as follows
(10) 
In the interference-limited scenario, where the noise power is very small compared with the interference power and hence can be neglected, the expression (10) can be simplified.
Corollary 1.
In the interference-limited scenario, i.e., , the SDP can be simplified as
(11) 
where .
Equations (10) and (11) give a tractable expression and a closed-form expression for the SDP in the general regime and the high SNR regime, respectively. The performance metric depends on four main factors: the number of tiers , the caching probabilities , the BS densities and transmit powers . In the rest of this paper, we shall focus on the interference-limited regime with high SNR.
IV Caching Optimization and Analysis
In this section, we formulate and solve the optimal caching problem for maximizing the SDP in the high SNR regime. Further, by considering the optimal caching problem in special cases, we establish an interesting connection between N-tier HetNets and single-tier networks.
IV-A Caching Optimization for General Case
The optimal caching problem of maximizing the SDP is formulated as
Proposition 2.
Problem is a concave optimization problem.
Proof.
Please refer to Appendix D. ∎
By Proposition 2, we can use the standard interior point method to solve [30]. Let denote the optimal solution of . By the Karush-Kuhn-Tucker (KKT) conditions, the sufficient and necessary conditions for can be stated in the following lemma.
Lemma 3.
The optimal solution of Problem satisfies the following sufficient and necessary conditions:
(12) 
for , where , , , , and is the Lagrangian multiplier that satisfies for . (Footnote 7: From (39), it can be shown by contradiction that the maximum SDP is achieved only when constraint (2) holds with equality. A similar proof is given in Lemma 2 of [18]. In the rest of this paper, we use (2) with equality as the constraint.)
Proof.
Please refer to Appendix E. ∎
Furthermore, according to (39), we have the following remark to state the impact of the BS cache size of each tier.
Remark 1.
For , the maximum SDP increases with the cache size ().
IV-B Caching Optimization for Special Cases
IV-B1 Optimization for N=1
When , the network degenerates to a single-tier network. Denote as the caching strategy and as the cache size for the single-tier network; then the optimal caching probabilities in (12) become:
(13) 
where satisfies and can be found by the bisection method. (Footnote 8: The optimal caching probability in this special case is consistent with the prior works on single-tier networks in [19, 21]. Our work extends the probabilistic caching strategy optimization for a single-tier network to that for a general N-tier HetNet and contributes to presenting the impacts and essential tradeoffs of the heterogeneous network parameters.)
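The multiplier search mentioned above can be sketched as follows. Since the constants of (13) are not reproduced here, the sketch assumes a generic concave per-content term q_i p_i / (a p_i + b) with placeholder constants a, b > 0; this functional form, the constants and the function name are assumptions for illustration only, not the paper's expression. What the sketch does show is the mechanism: the stationarity condition gives each probability as a clipped, decreasing function of the multiplier, and bisection finds the multiplier at which the cache-size constraint holds with equality.

```python
import math


def optimal_caching(popularity, cache_size, a=1.0, b=1.0, tol=1e-10):
    """Maximize sum_i q_i * p_i / (a * p_i + b) subject to
    0 <= p_i <= 1 and sum_i p_i = cache_size (assumed objective).

    Requires cache_size to be smaller than the number of contents with
    positive popularity.
    """
    def probs(eta):
        # Stationarity: q_i * b / (a * p + b)^2 = eta, then clip to [0, 1].
        return [min(1.0, max(0.0, (math.sqrt(q * b / eta) - b) / a))
                for q in popularity]

    lo = 1e-15                      # tiny multiplier: popular files fully cached
    hi = max(popularity) / b        # at this multiplier every p_i is zero
    while hi - lo > tol * hi:
        mid = 0.5 * (lo + hi)
        if sum(probs(mid)) > cache_size:
            lo = mid                # too much is cached: raise the multiplier
        else:
            hi = mid
    return probs(hi)
```

Under this assumed objective, the returned probabilities are non-increasing in the content index when the popularities are sorted in decreasing order, and they grow with the cache size, which matches the qualitative behaviour one would expect from a water-filling style solution.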
Based on (13), we have the following result.
Corollary 2.
The optimal caching probability decreases with the index , i.e., increases with the content popularity . Besides, increases with the cache size .
Proof.
Please refer to Appendix F. ∎
By (11) for , the maximum SDP for the single-tier network, denoted by , is
(14) 
thus we have the following remark.
Remark 2.
In the interference-limited regime, the maximum SDP of single-tier networks is independent of the BS density and transmit power, and depends only on the cache size. This is because the serving BS and interfering BSs have the same caching resource, and the increase in signal power is counterbalanced by the increase in interference power. A similar independence of the performance from the BS density and transmit power also exists for traditional networks without cache [10].
IV-B2 Connection between 1-tier and N-tier HetNets
We observe from (11) that the SDP in the high SNR regime depends on the caching probabilities through . Thus, by defining , we can formulate a new problem below,
(15)  
(16)  
(17) 
From (14) and (15), it is seen that is identical to the caching optimization problem in a single-tier network with being the caching probability vector and being the equivalent cache size. By further comparing and , we obtain the following proposition, which states a general relationship between the optimal performance of an N-tier HetNet and that of a single-tier network.
Proposition 3.
Let be the optimal objective of for the considered N-tier HetNet. For the single-tier network with cache size , we have
with equality if for .
Proof.
Please refer to Appendix G. ∎
Proposition 3 states that in the interference-limited regime, the optimal performance of an N-tier HetNet with BS cache sizes , BS densities , and BS transmit powers is upper bounded by that of a single-tier network with BS cache size , arbitrary BS density, and arbitrary BS transmit power. Further, their performances are the same when the cache sizes are the same for all tiers in the N-tier HetNet.
IV-B3 Optimization for ,
In this case, Proposition 3 shows that is equivalent to . Based on (13), the optimal solution of can be further given below.
Corollary 3.
When for , there exists an optimal solution of satisfying for , where the optimal solution of follows (13) with and .
Proof.
Problem is the newly constructed single-tier caching optimization problem. Thus, its optimal solution is the same as (13) with and . In the second part of the proof of Proposition 3, we show that when for , is equivalent to , and with is also an optimal solution of . Thus, Corollary 3 is proved. ∎
Remark 3.
In the interference-limited regime, when all the BSs in the N-tier HetNet have the same cache size, the maximum SDP of the HetNet is independent of the network heterogeneity in the BS density and transmit power.
Traditionally, without caching ability at BSs, the outage probability is independent of the number of tiers, the BS densities and transmit powers in interference-limited N-tier HetNets [29]. By introducing caching resource into the system, (11) states that the maximum SDP generally depends on the number of tiers , the BS cache size , the BS density and transmit power . The intuition behind this observation is that the caching resource changes the decision of a user to access a BS. In the multi-tier cache-enabled HetNet, the decision depends not only on the received SINR, but also on the contents cached at BSs. In order to further understand the impact of the cache size on , we theoretically illustrate the relationship between the cache size and the BS density and transmit power in the next section.
V Analysis on Network Parameters under Uniform Cache
In this section, with the uniform caching strategy where each content is cached with equal probability regardless of content popularities, the equivalence between the SDP of an N-tier HetNet and that of a single-tier network is obtained. Based on this property, we further investigate the impacts of the key network parameters, i.e., the BS cache size, density and transmit power, on the system performance. Finally, the tradeoffs of the BS density , transmit power and cache size are found.
V-A The Equivalence under Uniform Cache
Consider the uniform caching strategy where and for single-tier networks and N-tier HetNets, respectively. By substituting and into (11) and (15), respectively, the equivalence of the system performance of N-tier HetNets and single-tier networks can be established.
Proposition 4.
In the interference-limited regime, the SDP of an N-tier HetNet with the caching strategy equals that of the single-tier network with the caching strategy , and is given by
(18) 
where the cache size of the singletier network .
Based on (13) and , we have the following corollary.
Corollary 4.
In the interference-limited regime, consider the scenario in which the video content popularity distribution is a uniform distribution, i.e., , , then the uniform caching strategies and are the optimal probabilistic caching strategies for single-tier networks and N-tier HetNets, respectively.
Proof.
V-B Impacts of , , ,
Proposition 4 further states that, with uniform caching regardless of content popularity, the SDP performance of N-tier HetNets depends on all the system parameters through the equivalent cache size only. Then, the impacts of the BS density and cache size of tier can be obtained in the following two lemmas.
Lemma 4.
The SDP increases with and when . Otherwise, decreases with or .
Proof.
From (18), we can see that increases with . Then we have
Obviously, when , we have that and , which also means that increases with and . (Footnote 9: This does not contradict Remark 2 in Section IV, because the BS densities , transmit powers and cache sizes affect the cache size here based on .) Due to =, we thus have Lemma 4. ∎
Lemma 5.
The SDP increases with () and the rate of increase is monotonic in , .
Proof.
Due to , increases with for . Since increases with and , increases with . We thus have this lemma. ∎
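The behaviour described in Lemma 4 and Lemma 5 can be illustrated numerically. The sketch below assumes that the equivalent cache size is the average of the per-tier cache sizes weighted by (BS density) × (transmit power)^(2/α); this form is inferred from the statements of the lemmas and is an assumption, as are the function name and the parameter values.

```python
def equivalent_cache_size(densities, powers, cache_sizes, alpha):
    """Weighted average of the per-tier cache sizes.

    Weight of tier k: density_k * power_k ** (2 / alpha). This is the
    ASSUMED form of the equivalent single-tier cache size.
    """
    weights = [lam * pw ** (2.0 / alpha) for lam, pw in zip(densities, powers)]
    return sum(w * m for w, m in zip(weights, cache_sizes)) / sum(weights)


# Illustrative three-tier example (macro, relay, pico); values are made up.
densities = [1.0, 4.0, 8.0]
powers = [20.0, 2.0, 0.5]
cache_sizes = [40, 20, 10]
base = equivalent_cache_size(densities, powers, cache_sizes, alpha=4.0)

# Densifying the tier whose cache is larger than `base` raises the average ...
more_macro = equivalent_cache_size([2.0, 4.0, 8.0], powers, cache_sizes, 4.0)
# ... while densifying the tier with the smallest cache lowers it.
more_pico = equivalent_cache_size([1.0, 4.0, 16.0], powers, cache_sizes, 4.0)
```

Under this assumption, the derivative of the equivalent cache size with respect to a tier's density has the sign of that tier's cache size minus the current average, and the derivative with respect to a tier's cache size equals that tier's normalized weight, so the tier with the largest density and transmit power benefits most from extra cache.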
Remark 4.
From Lemma 4 and Lemma 5, we can observe two important features in the N-tier cache-enabled HetNet:

Increasing the density or transmit power of the BSs from the tier with small cache size decreases the system performance. This somewhat surprising result is actually intuitive, since such BSs (e.g., pico or femto) with small cache provide only little service but cause strong interference to other BSs (e.g., macro or relay).

If the BS transmit power and density of the th tier are both the largest among all the tiers, it is most effective to increase the performance of the network by increasing the BS cache size of the th tier.
V-C Tradeoffs of , , and ,
In the preceding subsection, we describe the impacts of , , and on . Now we will present the tradeoffs of these network parameters at a target SDP.
V-C1 Tradeoffs of one tier parameters
Lemmas 4 and 5 show that increasing the BS density, transmit power or cache size influences the SDP by changing . This suggests that, as long as does not change, one can interchange different types of system parameters to maintain the same system performance, and hence can obtain the tradeoffs of these network parameters at a target SDP. In the following, we elaborate the tradeoff between the BS density (transmit power ) and the cache size within each tier for .
Given a target SDP , the communication resource ( and ) and the caching resource () of all the tiers except the th tier, i.e., and , we can obtain based on (18), thereby obtaining the tradeoffs among the th tier’s cache size , BS density and transmit power , as illustrated in the following theorem.
Theorem 1.
With the uniform caching strategy, given a target determined by , and the fixed values , , for and , the network parameters , , satisfy the following tradeoffs
(19)  
(20) 
where
(21)  
(22) 
Proof.
Please refer to Appendix H. ∎
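A rough numerical illustration of the within-tier tradeoff is given below. It does not reproduce (19)-(22); instead it assumes that the target SDP fixes an equivalent cache size equal to the average of the per-tier cache sizes weighted by (BS density) × (transmit power)^(2/α), and solves that weighted-average condition for the density of one tier. The assumed form, the function name and the numbers are illustrative only.

```python
def density_for_target(k, target, densities, powers, cache_sizes, alpha):
    """Density of tier k that keeps the weighted-average cache size at `target`.

    Solves sum_j w_j * (M_j - target) = 0 for the density of tier k, where
    w_j = density_j * power_j ** (2 / alpha) (ASSUMED weighting).
    A non-positive result means the target cannot be met by tuning tier k.
    """
    delta = 2.0 / alpha
    gap = cache_sizes[k] - target
    if gap == 0:
        raise ValueError("tier k's density does not affect the target")
    # Contribution of all other tiers, moved to the right-hand side.
    others = sum(lam * pw ** delta * (target - m)
                 for j, (lam, pw, m) in enumerate(
                     zip(densities, powers, cache_sizes))
                 if j != k)
    return others / (powers[k] ** delta * gap)
```

Under this assumption, the required density is inversely proportional to the gap between the tier's cache size and the target equivalent cache size, and it is linear in the cache sizes of the other tiers, which is in line with the relations stated in the theorems of this section.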
Interestingly, we can observe from Theorem 1 that is inversely proportional to , while is a power function of with a negative exponent (). Accordingly, it is natural to ask whether the BS density can be reduced by increasing the BS caching capability and, if so, under what condition. The following corollary answers this question.
Corollary 5.
To maintain the same SDP, we have the following results for the th tier

The BS density and transmit power decrease with the cache size , i.e., and ,, when .

The BS density and transmit power increase with the cache size , i.e., and , when .
Proof.
Please refer to Appendix I. ∎
It is worth mentioning that increasing the BS caching capability of one tier does not always reduce the BS density or transmit power of this tier needed to achieve the same performance in a multi-tier HetNet, in contrast to single-tier caching networks [11]. Taking for example, the tradeoff between and is shown in Fig. 2(a). The target SDP determined by is . Note that if a tier is deployed with a larger cache size (), the more the BS cache size () increases, the lower the BS density () becomes, in order to maintain the same . However, if the tier is deployed with a small cache size (), increasing both the BS cache size and density can keep the same performance, as shown in Fig. 2(a). This is because the increase in BS density of the tier with small cache size only improves its own tier association probability, and also causes stronger interference to other tiers with larger cache sizes. Therefore, more users are associated with the tier with small cache size, and the SDP decreases because this tier can serve fewer content requests. The tradeoff between and shown in Fig. 2(b) is similar to that of and .
V-C2 Tradeoffs of different tier parameters
Similarly, we can find the tradeoffs of the parameters in different tiers, as specified in Theorem 2.
Theorem 2.
When , , and () satisfy the following tradeoffs for any two different tiers, remains constant.
(23)  
(24) 
for , where
(25)  
(26)  
(27)  
(28) 