Cache-Enabled Heterogeneous Cellular Networks:
Optimal Tier-Level Content Placement
Abstract
Caching popular contents at base stations (BSs) of a heterogeneous cellular network (HCN) avoids frequent information passage from content providers to the network edge, thereby reducing latency and alleviating traffic congestion in backhaul links. The potential of caching at the network edge for tackling 5G challenges has motivated recent studies of optimal content placement in large-scale HCNs. However, due to the complexity of network performance analysis, the existing strategies were designed mostly based on approximation, heuristics and intuition. In general, the optimal strategies for content placement in HCNs remain largely unknown, and deriving them forms the theme of this paper. To this end, we adopt the popular random HCN model where multiple tiers of BSs are modeled as independent Poisson point processes (PPPs) distributed in the plane with different densities. Further, the random caching scheme is considered, where each file in a given set, with a corresponding popularity measure, is placed at each BS of a particular tier with a corresponding probability, called the placement probability. The probabilities are identical for all BSs in the same tier but vary over tiers, giving the name tier-level content placement. We consider the network performance metric of hit probability, defined as the probability that a file requested by the typical user is delivered successfully to the user. Leveraging existing results on HCN performance, we maximize the hit probability over the content placement probabilities, which yields the optimal tier-level placement policies. For the case of uniform received signal-to-interference (SIR) thresholds for successful transmission at BSs in different tiers, the policy is in closed form, where the placement probability for a particular file is proportional to the square root of the corresponding popularity measure with an offset depending on the BS caching capacities.
For the general case of non-uniform SIR thresholds, the optimization problem is non-convex and a sub-optimal placement policy is designed by approximation; it has a structure similar to that for uniform SIR thresholds and is shown by simulation to be close to optimal.
I Introduction
The last decade has seen multimedia contents become dominant in mobile data traffic [1]. As a result, a vision for 5G wireless systems is to enable high-rate and low-latency content delivery, e.g., ultra-high-definition video streaming [2]. The key challenge in realizing this vision is that transporting large volumes of data from content providers to end users causes severe traffic congestion in backhaul links, resulting in rate loss and high latency [3]. On the other hand, the dramatic advancement of hard-disk technology makes it feasible to deploy large storage (several to dozens of TB) at the network edge (e.g., at base stations (BSs) and dedicated access points) at low cost [4]. In view of these, caching popular contents at the network edge has emerged as a promising solution, where highly skewed content popularity is exploited to alleviate the heavy burden on backhaul networks and reduce latency in content delivery [5, 6, 7]. Since popular contents vary at a time scale of several days [8], content placement can be performed every day during off-peak hours without placing an extra burden on the system. Compared with caching in wired networks, the broadcast and superposition natures of the wireless medium make optimal content placement in wireless networks a much more challenging problem, and solving it has been the main theme in designing cache-enabled wireless systems and networks [9]. Along the same theme, the current work considers caching for next-generation heterogeneous cellular networks (HCNs), adopting the classic multi-tier HCN model [10], and focuses on studying the optimal policy for placing contents in different BS tiers.
I-A Related Work
Extensive research has been conducted on the performance gain of joint content placement and wireless transmission as well as on designing relevant techniques. From the information-theoretic perspective, capacity scaling laws were derived for a large cache-enabled wireless network with a hierarchical tree structure [11]. In [12], the novel idea of integrating coding into user caching, called coded caching, was proposed to substantially improve the efficiency of content delivery over uncoded caching. Specifically, exploiting joint coding of multiple files and the broadcast nature of downlink channels, the content placement at BSs and the delivery were jointly optimized to minimize the communication overhead of content delivery. Coded caching in an erasure broadcast channel was then studied in [13], where the optimal capacity region was derived in some cases. In parallel, extensive research has also been carried out on the more practical uncoded caching, where the focus is on designing strategies for content placement at BSs (or access points) to optimize the network performance in terms of the expected time for file downloading. Since optimal designs are NP-hard in general [14, 15], most research has resorted to sub-optimal techniques with close-to-optimal performance. Specifically, practical algorithms have been designed for caching contents distributively at access points dedicated to content delivery, using greedy algorithms [14] and the theory of belief propagation [15]. Furthermore, joint transmission and caching can further improve the network performance [16, 17, 18]. Sub-optimal solutions were developed to maximize the quality of service for multi-relay networks [16] and two-hop relaying networks [17] by decomposing the original problem into several simpler sub-problems. Considering opportunistic cooperative MIMO, schemes were presented in [18] that leverage multi-timescale joint optimization of power and cache control to enable real-time video streaming.
Recent advancements in wireless caching techniques have been summarized in various journal special issues and survey articles (see e.g., [19]).
It is also crucial to understand the performance gain that caching brings to large-scale wireless networks. Presently, the common approach is to model and design cache-enabled wireless networks using stochastic geometry. The approach leverages a wide range of existing stochastic-geometry network models, ranging from device-to-device (D2D) networks to HCNs, and relevant results, by adding caching capacities to network nodes [20, 21, 22, 23, 24, 25, 26, 27]. In the resultant models, BSs and mobiles are typically distributed in the two-dimensional (2D) plane as Poisson point processes (PPPs). Despite the similarity in the spatial distributions of network nodes, cache-enabled networks differ from traditional networks without caching in their functions, with the former aiming at efficient content delivery and the latter at reliable communication. Correspondingly, the performance of a cache-enabled network is typically measured using a metric called the hit probability, defined as the probability that a file requested by a typical user is not only available in the network but can also be wirelessly delivered to the user [24]. Based on stochastic-geometry network models, the performance of cache-enabled D2D networks [20, 21] and HCNs [22, 23] was analyzed in terms of hit probability as well as average throughput. For small-cell networks, one design challenge is that the cache capacity limitation of BSs affects the availability of contents with low and moderate popularity. A solution was proposed in [26] based on multi-cell cooperative transmission/delivery to enhance content availability. Specifically, the proposed content-placement strategy partitions the cache of each BS into two halves, one storing the most popular files and the other storing fractions of the remaining files; multi-cell cooperation then effectively integrates the storage spaces at cooperating BSs into a larger cache, increasing content availability and thereby improving the network hit probability.
Being based on approximate performance analysis, the content-placement strategy derived in [26] is heuristic, and the optimal one remains unknown.
In the aforementioned work, the content placement at cache-enabled nodes is deterministic. An alternative strategy is probabilistic (content) placement, where a particular file is placed in the cache of a network node (BS or mobile) with a given probability [24, 25], called the placement probability. The strategy has also been considered in designing large-scale cache-enabled networks [24, 25]. The key characteristic of probabilistic placement is that all files with non-zero placement probabilities are available in a large-scale network, with their spatial densities proportional to the probabilities. Given its random nature, the strategy fits stochastic-geometry models better than its deterministic counterpart, as it allows tractable analysis for certain networks, as demonstrated in this work. The placement probabilities for different content files were optimized to maximize the hit probability for cellular networks in [24] and for D2D networks in [25]. It was found therein that the optimal placement probabilities are highly dependent on, but not identical to, the (content) popularity measures, defined as the content-demand distribution over files, as they are also functions of network parameters, e.g., wireless-link reliability and cache capacities. To improve content availability, a hybrid scheme combining deterministic and probabilistic content placement was proposed in [27] for HCNs with multicasting, where the most popular files are cached at every macro-cell BS and different combinations of other files are randomly cached at pico-cell BSs. Similar to the strategy in [26], the strategy proposed in [27] does not lead to tractable network-performance analysis and was optimized for an approximate hit probability.
I-B Motivation, Contributions and Organization
HCNs are expected to be deployed as next-generation wireless networks supporting content delivery besides communication and mobile computing [9]. In view of prior work, the existing strategies for content placement in large-scale HCNs are mostly heuristic, and the optimal policies in closed form remain largely unknown, even though existing results reveal various of their properties and their dependence on network parameters. This motivates the current work on analyzing the structure of the optimal content-placement policies for HCNs.
To this end, the cache-enabled HCN is modeled by adopting the classic multi-tier HCN model for the spatial distributions of BSs and mobiles [10]. To be specific, the locations of the different tiers of BSs and of the mobiles are modeled as independent homogeneous PPPs with non-uniform densities. Besides density, each tier is characterized by a set of additional parameters including BS transmission power, finite cache capacity and the minimum received signal-to-interference ratio (SIR) threshold required for successful content delivery. Note that the use of SIR is based on the implicit assumption that the network is interference limited. A user is associated with the nearest BS at which the requested file is available. It is assumed that there exists a content database comprising files characterized by corresponding popularity measures. Each user generates a random request for a particular file according to the discrete popularity distribution. In this paper, we propose a tractable approach of probabilistic tier-level content placement (TLCP) for the HCN, where the placement probabilities are identical for all BSs belonging to the same tier but differ across tiers. The goal of the current work is to analyze the structure of the optimal TLCP policies under the network-performance metric of hit probability. The main contributions are summarized as follows.

Hit Probability Analysis. By extending the results on outage probability for HCNs in [10], the hit probability for cache-enabled HCNs is derived in closed form. The results reveal that the metric is determined not only by physical-layer parameters, including BS density, transmission power, and path-loss exponent, but also by content-related parameters, including content-popularity measures and placement probabilities. With uniform SIR thresholds for all tiers, the hit probability is observed to be a monotone increasing function of the placement probabilities and converges to a constant independent of BS density and transmission power as the placement probabilities approach one.

Optimal Content Placement for Multi-Tier HCNs. For a multi-tier HCN, the placement probabilities form a matrix whose rows and columns correspond to the files and the tiers, respectively. First, consider a multi-tier HCN with uniform SIR thresholds for all tiers. Building on the results derived for single-tier HCNs, a weighted sum (over tiers) of the placement probabilities for a particular file has the structure of being proportional to the square root of the popularity measure with a fixed offset. Using this result, we derive expressions for the individual placement probabilities and reveal a useful structure allowing a simple sequential computation of the probabilities. An algorithm is proposed to realize the aforementioned procedure. Next, consider the general case of a multi-tier HCN with non-uniform SIR thresholds for different tiers. In this case, the content-placement optimization is non-convex and it is thus difficult to derive the optimal policy in closed form. However, a sub-optimal algorithm can be designed leveraging the insights from the optimal policy structures for the previous cases. Our numerical results show that the performance of the proposed scheme is close to optimal.
The remainder of the paper is organized as follows. The network model and metric are described in Section II. The hit probability and optimal content placement for cacheenabled HCNs are analyzed in Sections III and IV, respectively. Numerical results are provided in Section V followed by the conclusion in Section VI.
II Network Model and Metric
In this section, we describe the mathematical model for the cacheenabled HCN illustrated in Fig. 1 and define its performance metric. The symbols used therein and their meanings are tabulated in Table I.
Symbol  Meaning

  Total number of tiers in an HCN
  Total number of files in a database
  A given file in the database
  Point process of BSs in the k-th tier
  ,  Point processes of k-th tier BSs with, and without, a given file
  ,  Density and transmission power of BSs in the k-th tier
  SIR threshold of BSs in the k-th tier
  Rayleigh fading gain with unit mean
  Path-loss exponent
  Cache capacity of BSs in the k-th tier
  Popularity measure of a file
  Placement probability for a file at k-th tier BSs
II-A Network Topology
The spatial distributions of the BSs are modeled using the classic multi-tier stochastic-geometry model for the HCN described as follows [10]. The network comprises multiple tiers of BSs modeled as independent homogeneous PPPs distributed in the plane. The k-th tier is denoted by , with the BS density and transmission power represented by and , respectively. Assuming an interference-limited network, the transmission from a BS to an associated user is successful if the received SIR exceeds a given threshold, denoted by , identical for all links in the k-th tier.
We consider a particular frequency-flat channel, corresponding to a single frequency subchannel of a broadband system. Single antennas are deployed at all BSs and users. Furthermore, the BSs are assumed to transmit continuously in the unicast mode. The users are assumed to be Poisson distributed. As a result, by Slivnyak's theorem [28], it is sufficient for the network-performance analysis to consider a typical user located at the origin, whose experience is representative of all users. The channel is modeled such that the signal power received at the user from a k-th tier BS located at is given by , where the random variable models Rayleigh fading and is the path-loss exponent (in practice, the path-loss exponent may vary over the tiers; the corresponding conditional hit probability then lacks the closed-form expression of Lemma 3, resulting in an intractable optimization problem, though the current solution for the simpler case of a uniform path-loss exponent can provide useful insights into designing practical content placement schemes for that general case). Based on this channel model (the effect of shadowing on network performance is omitted for simplicity but can be captured by modifying the model following the method in [29], namely appropriately scaling the transmission power of BSs in each tier; the corresponding modifications of the analysis and algorithmic design are straightforward and do not change the key results and insights), the interference power measured at the typical user, denoted by , can be written as
(1) 
where the fading coefficients are assumed to be independent and identically distributed (i.i.d.).
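As a concrete illustration of the interference model in (1), the aggregate interference at the origin can be estimated by Monte Carlo simulation of the PPP tiers. The sketch below is not part of the paper's analysis: the density and power values, the path-loss exponent, the finite simulation disc, and the function name are all illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

def interference_sample(densities, powers, alpha=4.0, radius=50.0):
    """One Monte Carlo sample of the aggregate interference power at the
    origin from independent PPP tiers with Rayleigh fading and path loss
    r^(-alpha). A disc of the given radius approximates the infinite plane."""
    total = 0.0
    for lam, p in zip(densities, powers):
        n = rng.poisson(lam * np.pi * radius ** 2)  # BS count in the disc
        r = radius * np.sqrt(rng.random(n))         # distances, uniform over the disc
        h = rng.exponential(1.0, size=n)            # Rayleigh fading power gains
        total += np.sum(p * h * np.maximum(r, 1e-3) ** (-alpha))
    return total
```

Averaging many such samples approximates the mean interference seen by the typical user; in the paper this averaging is instead carried out analytically.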
II-B Probabilistic Content Placement
In this paper, we consider a content (e.g., video file) database containing files of equal normalized size, following the literature [24, 26, 27]. (In practice, a large content file to be cached is usually divided into units of equal size because different units have different popularity; for instance, the first minute of a YouTube video is much more popular than the remainder. Thus, to be precise, the equal-size files considered in this paper correspond to content units in practice.) As illustrated in Fig. 1, BSs from different tiers are assumed to have different cache capacities, denoted by for the k-th tier with . We make the practical assumption that not all BSs have sufficient capacity for storing the whole database, i.e., . We adopt a probabilistic content placement scheme similar to the one in [24] to randomly select files for caching at different tiers under their cache capacity constraints:
(2) 
Specifically, a given file, denoted by , is cached at a k-th tier BS with a fixed probability called the placement probability. The placement probabilities, , are identical for all BSs in the same tier , . They specify the tier-level content placement (TLCP). Grouping the placement probabilities yields the following placement probability matrix:
(3) 
The rows and columns of correspond to the different files and the different tiers, respectively. Given the placement probabilities in and the cache-capacity constraints in (2), there exist specific strategies of randomly placing contents at individual BSs such that a given file is available at a k-th tier BS with a probability exactly equal to its placement probability [24]. One such strategy is illustrated in [24, Fig. 1]. Given the existence of random-placement strategies achieving the content availability specified by , this paper focuses on optimizing for maximizing the hit probability.
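The interval construction of [24, Fig. 1] can be sketched in a few lines: given per-file probabilities summing to the integer cache capacity, it draws a cache of exactly that many distinct files such that each file is included with exactly its placement probability. The code below is a hedged illustration of that construction, not an implementation from the paper; the function name and interface are assumptions.

```python
import numpy as np

def random_placement(b, rng):
    """Sample a cache of exactly C = sum(b) distinct files such that file n
    is included with probability b[n] (sketch of the interval construction
    illustrated in [24, Fig. 1]). Requires each b[n] in [0,1] and integer sum."""
    b = np.asarray(b, dtype=float)
    C = int(round(b.sum()))
    # File n occupies the interval [edges[n], edges[n+1]) of length b[n].
    edges = np.concatenate(([0.0], np.cumsum(b)))
    # One selection point per unit interval; points are spaced exactly 1 apart,
    # so each file interval of length <= 1 is hit at most once.
    points = rng.random() + np.arange(C)
    cache = np.searchsorted(edges, points, side="right") - 1
    return set(cache.tolist())
```

For example, with probabilities (1, 0.5, 0.5) and capacity 2, file 0 is always cached and exactly one of files 1 and 2 is cached, each half of the time.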
The files in the content database differ in popularity, measured by a corresponding set of values [24, 25, 26, 27]. This set is a probability mass function: the typical user requests a given file with the corresponding probability. Without loss of generality, it is assumed that the files are ordered in decreasing popularity.
II-C Content-Centric Cell Association
Content-centric cell association accounts for both link reliability and content availability. We adopt a common scheme that associates a user with the BS that maximizes the received signal power among those having the requested file (see e.g., [27, 30]). (Coordinated multi-point (CoMP) transmission, defined in the LTE standard, can improve network performance by associating each user with multiple BSs. Adopting the technology in the current network model does not lead to tractable analysis; however, it is possible to develop practical content-delivery schemes for HetNets with CoMP by integrating the current optimal tier-level content placement with the design of cooperative content delivery in [26].) It is important to note that, due to limited BS storage, the database cached at BSs contains only the popular subset of all contents. Thus, it is possible that a file requested by a user is unavailable at the network edge and has to be retrieved from a data center across the backhaul network. In such cases, the classic cell association rule is applied to connect the user to the nearest BS. These cases occur infrequently and, moreover, lie outside the current scope of content placement at the network edge; they are therefore omitted from our analysis for simplicity, following the common approach in the literature (see e.g., [30]). For ease of exposition, we partition the HCN into effective tiers, called content-centric tiers, according to the file availability within each tier. A content-centric tier refers to the process of k-th tier BSs caching a particular file, denoted by , while the remaining k-th tier BSs are denoted by with . Due to the probabilistic content placement scheme, and are independent PPPs with densities and , respectively. A user is said to be associated with a content-centric tier if the user requests the corresponding file and is served by a BS of that tier. Then, conditioned on the typical user requesting the file, the serving BS is given by
(4) 
where denotes the transmission power of BS . In addition, conditioned on the typical user requesting the file, the interference power in (1) can be written in terms of the content-centric tiers as:
(5)  
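The association rule in (4) can be mimicked numerically: among the BSs whose caches hold the requested file, the user picks the one with the largest average received power. The snippet below is a sketch under assumed names and parameters; fading is averaged out, leaving the power-times-path-loss law.

```python
import numpy as np

def serving_bs(user_xy, bs_xy, bs_power, bs_caches, file_id, alpha=4.0):
    """Content-centric association sketch: among BSs caching `file_id`,
    return the index of the BS maximizing the average received power
    P * r^(-alpha); return None when no BS caches the file (the backhaul
    fallback case discussed above). Names and alpha are illustrative."""
    candidates = [i for i, cache in enumerate(bs_caches) if file_id in cache]
    if not candidates:
        return None
    pos = np.asarray(bs_xy, dtype=float)[candidates]
    d = np.linalg.norm(pos - np.asarray(user_xy, dtype=float), axis=1)
    rx = np.asarray(bs_power, dtype=float)[candidates] * np.maximum(d, 1e-6) ** (-alpha)
    return candidates[int(np.argmax(rx))]
```

With equal transmission powers this reduces to nearest-BS-with-file association, consistent with the model description in Section I-B.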
II-D Network Performance Metric
The network performance is measured by the hit probability, defined as the probability that a file requested by the typical user is not only cached at a BS but also successfully delivered by the BS over the wireless channel (see e.g., [24]). By definition, the hit probability quantifies the fraction by which the backhaul load is reduced. In addition, it also indicates the reduction in mean latency in the backhaul network (see Appendix A for details). Therefore, we use the hit probability as the main network-performance metric in this paper. For the purpose of analysis, let denote the (unconditional) hit probability, the conditional hit probability given that the typical user requests a particular file, and the content popularity of that file. Then
(6) 
Furthermore, define the association probability, denoted by , as the probability that the typical user is associated with a given content-centric tier. The hit probability conditioned on this event is represented by . It follows that
(7) 
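Equation (6) is simply a popularity-weighted average of the conditional hit probabilities; the one-line helper below makes the bookkeeping explicit (plain Python lists, illustrative names).

```python
def hit_probability(popularity, cond_hit):
    """Total hit probability as in (6): sum over files of the popularity
    measure times the conditional hit probability for that file."""
    return sum(a * p for a, p in zip(popularity, cond_hit))
```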
III Analysis of Hit Probability
In this section, the hit probability for the cache-enabled HCN is calculated. To this end, the association probabilities and the probability density function (PDF) of the serving distances are derived in the following two lemmas, obtained by directly modifying the corresponding lemmas in [31], enabled by the interpretation of the HCN as one comprising content-centric tiers (see Section II-C).
Lemma 1 (Association Probabilities).
The association probability that the typical user belongs to a given effective tier is given as
(8) 
where the constant .
Proof.
See Appendix B.
The result in Lemma 1 shows that a typical user requesting a particular file is more likely to be associated with a tier having not only a larger placement probability but also denser BSs or higher BS transmission power, which aligns with intuition. In addition, it is shown that when is small, the placement probability and BS density have more dominant effects on the association probability than the transmission power, since converges to one for all as .
Lemma 2 (Statistical Serving Distances).
The PDF of the serving distance between the typical user and the associated BS in a given effective tier is given as
(9) 
where is given in (8).
We are now ready to derive the hit probabilities using Lemmas 1 and 2. For ease of notation, we define the following two functions, and , which account for the interference from the BSs with and without the requested file, respectively:
(10)  
(11) 
where denotes the Gauss hypergeometric function and is the cosecant function. To further simplify the expression of the hit probability, we define the following function:
(12) 
Then the conditional hit probability can be written as shown in the following lemma.
Lemma 3 (Conditional Hit Probability).
Proof.
See Appendix C.
Using Lemma 3 and the definition of hit probability in (6), we obtain the first main result of this paper.
Theorem 1 (Hit Probability).
Theorem 1 shows that the hit probability is determined by two sets of network parameters: one set is related to the physical layer, including the BS density , transmit power , and path-loss parameter ; the other set contains content-related parameters, including the popularity measures and placement probabilities .
From Theorem 1, we can directly obtain the hit probabilities for two special cases, namely single-tier HCNs and multi-tier HCNs with uniform SIR thresholds, as shown in the following two corollaries.
Corollary 1 (Hit Probability for Single-Tier HCNs).
Corollary 1 shows that the hit probability for single-tier cache-enabled networks is independent of the BS density and transmit power, which is a well-known characteristic of interference-limited cellular networks. On the other hand, it is found to be monotone increasing in the placement probabilities, as they increase the spatial content density.
Corollary 2 (Hit Probability for Multi-Tier HCNs with Uniform SIR Thresholds).
Remark 1 (Effects of Large Cache Capacities).
Corollary 2 shows that the hit probability is a monotone increasing function of the placement probabilities and converges to a constant, independent of the BS densities and transmission powers, as all the placement probabilities approach one, corresponding to the case of large cache capacities. In this limit, the cache-enabled HCN is effectively the same as a traditional interference-limited HCN for general data services; the said independence is due to the uniform SIR threshold and is well known in the literature (see e.g., [10]).
IV Optimal Tier-Level Content Placement
In this section, we maximize the hit probability derived for cache-enabled HCNs in the preceding section over the placement probabilities.
IV-A Problem Formulation
The TLCP problem consists of finding the placement matrix in (3) that maximizes the hit probability for HCNs as given in Theorem 1. Mathematically, the optimization problem can be formulated as follows:
(P0)  
s.t.  
where the first constraint, from (2), is based on the BS cache capacity of each tier and the second constraint arises from the fact that each entry of is a probability.
It is numerically difficult to solve Problem P0 directly, since it has a “sum-of-ratios” structure, which is non-convex and has been proved to be NP-complete. To provide useful insights and results for tackling the problem, the optimal content placement policies are first analyzed for the special case of single-tier HCNs and then extended to multi-tier HCNs.
IV-B Single-Tier HCNs
For the current single-tier case, using Corollary 1, Problem P0 simplifies to:
(P1)  
s.t.  
where denotes the placement probability for a given file, denotes the vector of these probabilities, and denotes the cache capacity for single-tier HCNs.
It should be noted that Problem P1 has the same structure as that for the asymptotic (high SNR and high user density) case in [30], where the single-tier cache-enabled BSs are distributed as a PPP and random combinations of files are cached at each BS with given probabilities. Nevertheless, it is still valuable to discuss this special case, since it provides useful insights for solving the more complex problem for multi-tier HCNs; this paper therefore focuses on these insights.
Problem P1 is convex, since the objective function is convex and the constraints are linear, and can thus be solved using the Lagrange method. The Lagrangian function can be written as
(16) 
where denotes the Lagrange multiplier. Using the Karush-Kuhn-Tucker (KKT) conditions, setting the derivative of the Lagrangian in (16) to zero leads to the optimal placement probabilities shown in Theorem 2, where the optimal Lagrange multiplier is denoted by . Note that the capacity constraint is active at the optimal point, i.e., it holds with equality. This follows from the fact that the objective function of Problem P1 is a monotone increasing function of the placement probabilities.
Theorem 2 (Optimal TLCP for SingleTier HCNs).
For the single-tier cache-enabled HCN, given the optimal Lagrange multiplier , the optimal content placement probabilities, denoted by , that solve Problem P1 are given as
(17) 
where the thresholds and , and the optimal Lagrange multiplier satisfies the equality
(18) 
In addition, the optimal Lagrange multiplier in Theorem 2 can be found via a simple bisection search. Let denote the number of iterations needed to find the optimal Lagrange multiplier . The computational complexity of TLCP for single-tier HCNs is then . The corresponding procedure is shown in Algorithm 1.
initialize lower and upper bounds on the Lagrange multiplier
repeat
    set the multiplier to the midpoint of the two bounds
    if the resulting placement probabilities sum to more than the cache capacity, raise the lower bound to the midpoint
    else, lower the upper bound to the midpoint
until the multiplier converges
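A runnable sketch of Algorithm 1 is given below. Since the closed-form mapping from the multiplier to the placement probabilities in Theorem 2 is not reproduced here, the helper `placement_probs` uses a stand-in with the same offset square-root structure; the constants `c1` and `c2` are hypothetical placeholders for the SIR-dependent quantities. The bisection itself relies only on the sum of probabilities being non-increasing in the multiplier, so the active capacity constraint can be met by a standard bisection.

```python
import numpy as np

def placement_probs(nu, popularity, c1=1.0, c2=0.2):
    """Candidate placement probabilities for multiplier nu, mimicking the
    offset square-root structure of Theorem 2: clip(c1*sqrt(a/nu) - c2, 0, 1).
    c1 and c2 are hypothetical stand-ins for the SIR-dependent constants."""
    a = np.asarray(popularity, dtype=float)
    return np.clip(c1 * np.sqrt(a / nu) - c2, 0.0, 1.0)

def bisect_multiplier(popularity, cache_size, lo=1e-9, hi=1e9, iters=100):
    """Algorithm 1 sketch: bisection on the Lagrange multiplier so that the
    placement probabilities sum to the cache capacity (active constraint)."""
    for _ in range(iters):
        nu = 0.5 * (lo + hi)
        if placement_probs(nu, popularity).sum() > cache_size:
            lo = nu   # probabilities too large: increase the multiplier
        else:
            hi = nu   # probabilities too small: decrease the multiplier
    return 0.5 * (lo + hi)
```

For instance, with a cache of 2 units and popularity measures (0.5, 0.25, 0.15, 0.1), the returned probabilities sum to 2 and are non-increasing in the file index.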
Remark 2 (Offset-Popularity Proportional Caching Structure).
As illustrated in Fig. 2, the optimal content placement in Theorem 2 has the aforementioned offset-popularity proportional (OPP) structure, described as follows. If the popularity measure of a particular file lies within a middle range, the optimal placement probability monotonically increases with the square root of the popularity measure. Otherwise, the probability is either one or zero, depending on whether the measure is above or below that range. Furthermore, the probability is offset by a function of the SIR threshold and scaled by a function of both the threshold and the cache capacity.
Remark 3 (Effects of Content Popularity on Optimal Placement Probability).
The result in Theorem 2 shows that the optimal placement probability of a content file is determined by its popularity measure. In particular, content files can be separated into three ranges of popularity measure, corresponding to placement probabilities of zero, a fractional value, and one, as illustrated in Fig. 3, called the dispensability, diversity, and densification ranges, respectively. In the dispensability range, the files are highly unpopular and need not be cached in the network. In contrast, the files in the densification range are highly popular, so their spatial density should be maximized by caching them at every BS. Last, files in the diversity range have moderate popularity and it is desirable to have all of them available in the network, corresponding to enhanced spatial content diversity; as a result, they are cached at different fractions of BSs.
Remark 4 (Effects of SIR threshold).
The SIR threshold affects both popularity thresholds ( and ) in the optimal placement policy (see Theorem 2). It is observed from numerical results that both thresholds are monotone increasing functions of the SIR threshold.
Remark 5 (Effects of Lagrangian multiplier ).
The value of the Lagrange multiplier affects the popularity thresholds and , and is determined by the capacity constraint equality. If the total requested cache space exceeds the cache capacity, the multiplier should be increased to enlarge the popularity thresholds and thereby decrease the placement probabilities, and vice versa.
Problem P1 is considered deliberately, both to help solve the general version and for clarity of exposition. In particular, the insight from solving P1 is exploited to solve P2 in closed form given uniform SIR thresholds and to develop a sub-optimal scheme for the case with non-uniform thresholds.
IV-C Multi-Tier HCNs with Uniform SIR Thresholds
Consider a multi-tier HCN with uniform SIR thresholds for all tiers. Based on the hit probability in Corollary 2, Problem P0 for the current case is given as:
(P2)  
s.t.  
where and are constants defined as and . One can see that the problem is convex and can thus be solved numerically using a standard convex-optimization solver. However, the numerical approach may have high complexity if the content database is large and, further, yields little insight into the optimal policy structure. Thus, in the remainder of this section, a simple algorithm is developed for sequential computation of the optimal policy, which also reveals some properties of the policy structure.
To this end, define the tier-wise weighted sum of placement probabilities for each file as
(19) 
Using this definition, a relaxed version of Problem P2 can be formulated as follows:
(P3)  
s.t.  
Comparing Problem P3 with Problem P1 for single-tier HCNs, one can see that the two problems have identical forms. This allows Problem P3 to be solved following the same procedure as P1, yielding the following proposition.
Proposition 1 (Weighted Sum of Optimal Placement Probabilities).
The weighted sum of the optimal placement probabilities for multi-tier HCNs with uniform SIR thresholds, denoted by , is given as:
(20) 
where , , and the optimal Lagrange multiplier satisfies the following equality
The value of the optimal Lagrange multiplier can be found using the bisection search in Algorithm 1. Then the optimal values of the weighted sums can be computed using Proposition 1.
Problem P3 is a relaxed version of P2 since the feasible region of P3 is larger than that of P2. Consider the optimal placement probabilities solving Problem P2 and the optimal weighted sums solving Problem P3. The following proposition shows that the relaxation does not compromise the optimality of the solution.
Proposition 2.
The solution of Problem P3 solves P2 in the sense that .
Proof.
See Appendix D.
Next, based on the results in Propositions 1 and 2, the structure of the optimal placement policy is derived as shown in Theorem 3, which enables low-complexity sequential computation of the optimal placement probabilities.
Theorem 3 (Sequential Computation of Optimal Placement Probabilities).
One possible policy for optimal TLCP in HCNs with uniform SIR thresholds is given as follows:
(21) 
where
(22) 
and is as given in Proposition 1.
Proof.
See Appendix E.
A key observation on the policy structure in Theorem 3 is that each optimal placement probability depends only on those computed before it. This allows the optimal placement probabilities to be computed sequentially, as shown in Algorithm 2, whose computational complexity is thus , where is the number of iterations needed to find the optimal Lagrange multiplier.
One can observe from Proposition 1 that the optimal solution of Problem P2 is not unique. In other words, there may exist a set of placement probabilities different from that computed using Algorithm 2 but achieving the same hit probability.
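The sequential structure of Theorem 3 can be illustrated with a greedy split of a file's optimal weighted sum over tiers: each tier is filled as far as possible (its probability capped at one) before moving to the next. This is an illustrative sketch only; the paper's rule also respects per-tier cache capacities.

```python
def split_weighted_sum(weighted_sum, tier_weights):
    """Distribute a weighted sum s = sum_k w_k * b_k over tiers
    sequentially, capping each placement probability b_k at 1,
    mirroring the sequential computation suggested by Theorem 3.
    Illustrative only: per-tier cache constraints are omitted."""
    probs = []
    remaining = weighted_sum
    for w in tier_weights:
        b = min(remaining / w, 1.0)
        probs.append(b)
        remaining -= w * b
    return probs
```

Any split achieving the same weighted sum yields the same hit probability, which is consistent with the non-uniqueness of the optimal solution noted above.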
IV-D Multi-Tier HCNs with Non-Uniform SIR Thresholds
For the current case, the problem of optimal content placement is Problem P0. As the problem is non-convex, it is computationally complex to solve numerically, and it is also difficult to develop low-complexity algorithms by analyzing the optimal policy structure. Therefore, a low-complexity suboptimal content-placement algorithm is proposed for the current case. The algorithm is designed by approximating the hit probability in Theorem 1 so as to neglect the effects of the placement probabilities of other tiers on the hit probability of the th tier. Specifically, given as defined previously and by replacing the term with , the hit probability in Theorem 1 can be approximated by given as
(23) 
where . As a result, the summation terms are independent of each other, and maximizing the approximate hit probability is equivalent to maximizing the individual summation terms separately. Therefore, Problem P0 can be approximated by independent single-tier optimization problems, each of which is written as:
(P4)  
s.t.  
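The separable maximization behind this decomposition can be sketched as follows: when the objective is a sum of per-tier terms, each depending only on that tier's placement probability, the overall maximizer is obtained by maximizing every term independently. A grid search stands in for the closed-form per-tier solution of Problem P4.

```python
def maximize_separable(per_tier_objectives, grid):
    """Maximize a separable objective: each element of per_tier_objectives
    is a function of that tier's placement probability only, so each term
    can be maximized on its own. The grid is a stand-in for the per-tier
    closed-form solution."""
    return [max(grid, key=f) for f in per_tier_objectives]
```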
Using the results for single-tier HCNs in Theorem 2, we derive the suboptimal content-placement policy shown in the following proposition.
Proposition 3 (Suboptimal TLCP for Multi-Tier HCNs with Non-Uniform SIRs).
For the multi-tier cache-enabled HCNs with non-uniform SIR thresholds, the optimal TLCP placement probabilities, denoted by , that solve Problem P4 are given as
(24) 
where and . The optimal dual variable satisfies the equality
(25) 
The above suboptimal TLCP policy approximates Problem P0 by independent single-tier optimization problems. Thus, the corresponding computational complexity is , where denotes the number of iterations needed to find the optimal Lagrange multiplier for each tier. In addition, the numerical results in the next section show that the policy attains close-to-optimal performance.
V Simulation Results
In this section, simulations are conducted to validate the optimality of the content-placement policies derived in the preceding sections and to compare the performance of TLCP with conventional strategies. The benchmark strategies include the "most popular" content placement (MPCP), which caches the most popular contents in a greedy manner, and the hybrid content placement (HCP) proposed in [27]. Our simulation is based on the following settings unless specified otherwise. The number of BS tiers is two, with a fixed path-loss exponent. The BS transmission powers for the two tiers are dBm and dBm, respectively. The SIR threshold of one tier is fixed at dB while the other is varied.
V-A Conditional Hit Probability
The conditional hit probability for a typical file versus the caching probability is shown in Fig. 4. The analytical results are computed numerically using Lemma 3 and the simulated ones are obtained from Monte Carlo simulation in Matlab. First, it is observed that the simulated results match the analytical results well, which validates our analysis. In addition, the conditional hit probability increases with a growing placement probability if . However, this does not necessarily hold for the case , which shows that the effect of the placement probability on the hit probability depends on the SIR threshold. The reason is as follows. On one hand, increasing the placement probability increases the association probability of that tier (see Lemma 1), which decreases the conditional hit probability if that tier has a smaller hit probability due to its larger SIR threshold. On the other hand, it reduces the serving distance (see Lemma 2) and thus increases the conditional hit probability. The net effect of the placement probability on the hit probability is determined by which of the above increment and decrement dominates.
V-B Optimal Content Placement
Fig. 5 compares the performance of the optimal TLCP proposed in this paper (Theorem 3) with MPCP and HCP. For MPCP, each macrocell BS (or smallcell BS) caches the (or ) most popular files. For HCP, each macrocell BS caches the most popular files, while each smallcell BS caches the remaining files with the optimal probabilistic content placement given in Theorem 2. First, Fig. 5(a) shows that the hit probabilities under all three content placement policies increase as the content popularity becomes more skewed (a growing Zipf exponent), in line with intuition. Next, TLCP is observed to achieve a higher hit probability than MPCP and HCP due to content densification and diversity (see Remark 3). Further, we observe that the gain over MPCP decreases with a growing Zipf exponent since MPCP is a popularity-aware policy. In contrast, the gain over HCP increases with a growing Zipf exponent. This is because, in HCP, only the macrocell tier caches the most popular files. In addition, the optimality of TLCP is verified by comparison with the results given by the standard optimization tool CVX. Last, from Fig. 5(b), it is observed that the optimal hit probability increases as the per-link data rate decreases, owing to the decreasing SIR threshold. In particular, the maximum hit probability (as the data rate approaches zero) is less than one since the caching capacity is limited.
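The skewness effect observed in Fig. 5(a) stems from the Zipf popularity model: a larger exponent concentrates demand on the top-ranked files. A minimal sketch of the normalized Zipf law assumed in the simulations (the exponent values are whatever the figure specifies):

```python
def zipf_popularity(num_files, exponent):
    """Normalized Zipf popularity measures: the n-th most popular file has
    measure proportional to n**(-exponent). A larger exponent skews the
    demand toward the top-ranked files."""
    raw = [n ** (-exponent) for n in range(1, num_files + 1)]
    total = sum(raw)
    return [x / total for x in raw]
```

For example, raising the exponent from 0.8 to 1.5 noticeably increases the share of requests captured by the most popular file, which is why popularity-aware policies close the gap as the exponent grows.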
The hit probabilities under different CP policies, including the optimal CP, the suboptimal TLCP (see Proposition 3), MPCP, and HCP, versus the cache capacity and the number of contents are shown in Fig. 6(a) and Fig. 6(b), respectively. The optimal CP for this case (i.e., multi-tier HCNs with non-uniform SIR thresholds) is derived by adopting the dual method for non-convex optimization problems in [32], since Problem P0 has the same structure as the problem in [32] and satisfies the time-sharing condition (the proof is given in Appendix F). Compared with the optimal CP, the suboptimal TLCP provides close-to-optimal performance. It should be noted that the computational complexity of the optimal solution is linear in the number of files but exponential in the number of BS tiers, since it involves solving non-convex optimization problems, one per file, each with as many variables as tiers. In contrast, the computational complexity of the proposed suboptimal TLCP algorithm is linear in both the number of files and the number of tiers. In addition, we observe that the suboptimal TLCP outperforms both HCP and MPCP. Finally, the hit probability increases with a growing cache capacity and decreases with a growing number of contents, which coincides with intuition.
VI Conclusion
In this paper, we have studied the hit probability and the optimal content placement for cache-enabled HCNs where the BSs are distributed as multiple independent PPPs and files are probabilistically cached at BSs of different tiers with different BS densities, transmission powers, cache capacities, and SIR thresholds. Using stochastic geometry, we have analyzed the hit probability and shown that it is affected by both physical-layer and content-related parameters. Specifically, for the case where all tiers have uniform SIR thresholds, the hit probability increases with all the placement probabilities and, without the cache-capacity constraint, converges to its maximum (constant) value as all the probabilities reach one. Then, under the cache-capacity constraint, optimal content placement strategies have been proposed to maximize the hit probability for both single-tier and multi-tier HCNs. We have found that the placement probability for each file has the OPP caching structure, i.e., the optimal placement probability is linearly proportional to the square root of the offset popularity, with truncation to enforce the probability range. Interestingly, for multi-tier HCNs with uniform SIR thresholds, the weighted sum of the optimal placement probabilities also has the OPP caching structure. Further, an optimal or a suboptimal TLCP algorithm has been proposed to maximize the hit probability of HCNs with uniform or non-uniform SIR thresholds, respectively.
The fundamental structure of the optimal content placement strategies proposed in this paper provides useful guidelines and insights for designing cacheenabled wireless networks. As a promising future direction, it would be very helpful to take BS cooperation and multicast transmissions into account for practical networks. In addition, coded caching can be used to further enhance network performance.
A Analysis of Backhaul Latency
Based on (9) in [33], the mean packet delay of propagation through a wired backhaul network can be approximated as
(26) 
where denotes the BS density, is the gateway density, and are constants related to the processing capability of a backhaul node. In cache-enabled HCNs, the typical user has to retrieve a file that is not cached at any BS via the backhaul network. It follows from (26) that the resultant backhaul latency is given as
(27) 
where is the hit probability for cache-enabled HCNs. From (27), we can see that the backhaul latency is a monotone-decreasing function of the hit probability, as shown in Fig. 7. This shows that improving the hit probability of the radio access network reduces the burden on the backhaul network.
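The monotone relation in (27) can be illustrated with a one-line sketch, assuming the latency takes the miss-probability-times-delay form implied by (26) and (27); the per-miss delay below is a stand-in for the wired-network delay constants.

```python
def mean_backhaul_latency(hit_probability, per_miss_delay):
    """Only cache misses traverse the backhaul, so the mean backhaul
    latency scales with the miss probability (1 - hit probability).
    per_miss_delay is an illustrative stand-in for the wired delay
    approximated in (26)."""
    return (1.0 - hit_probability) * per_miss_delay
```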
B Proof of Lemma 1
Define , which represents the received signal power due to transmissions by the BSs caching file in the th tier, where is the distance from the typical user to the nearest BS in content-centric tier . According to the content-centric cell association, the association probability is the probability that . Therefore,
(28) 
To derive the association probability, the distribution of and its probability density function (PDF), denoted by , are calculated as follows.
(29) 
Further, the PDF is derived by taking the derivative of the distribution with respect to the distance,
(30) 
Last, the expression of the association probability is derived by substituting (29) and (30) into (28).
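The nearest-BS distance distribution underpinning this derivation can be checked by Monte Carlo for a single homogeneous PPP (without the tier weighting), using the known mean 1/(2√λ) of the nearest-neighbor distance. This is a sanity-check sketch, not part of the proof; the Poisson sampler and disk radius are implementation choices.

```python
import math
import random

def sample_poisson(mean):
    """Knuth's inverse-transform Poisson sampler (adequate for the
    moderate means used here)."""
    threshold = math.exp(-mean)
    k, p = 0, 1.0
    while True:
        p *= random.random()
        if p <= threshold:
            return k
        k += 1

def nearest_bs_distance(density, region_radius):
    """Distance from the origin to the nearest point of a homogeneous PPP
    of the given density, simulated in a disk large enough for edge
    effects to be negligible. Only the radial coordinate matters; for a
    uniform point in a disk of radius R it has CDF (r/R)**2."""
    num_points = sample_poisson(density * math.pi * region_radius ** 2)
    best = region_radius
    for _ in range(num_points):
        r = region_radius * math.sqrt(random.random())
        best = min(best, r)
    return best
```

Averaging many samples at density λ = 1 reproduces the theoretical mean 1/(2√λ) = 0.5 within Monte Carlo error.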
C Proof of Lemma 3
In order to calculate the conditional hit probability, we first derive the probability that the typical user successfully receives the requested file from its given serving BS in the th tier, denoted by , as follows.