Modeling and Performance of Uplink Cache-Enabled Massive MIMO Heterogeneous Networks
Abstract
The uploading of user-generated content to the Internet through applications such as social media places a significant burden on wireless networks. To cope with this mobile data tsunami, we develop a novel multiple-input multiple-output (MIMO) network architecture with randomly located base stations (BSs), each equipped with a large number of antennas and employing cache-enabled uplink transmission. In particular, we formulate a scenario where the users upload their content to their strongest BSs, which are distributed according to a Poisson point process (PPP). In addition, the BSs, exploiting the benefits of massive MIMO, upload their contents to the core network by means of a finite-rate backhaul. After proposing the caching policies, where we adopt the modified von Mises distribution as the popularity distribution function, we derive the outage probability and the average delivery rate by taking advantage of tools from deterministic equivalent (DE) and stochastic geometry analyses. Numerical results investigate the realistic performance gains of the proposed heterogeneous cache-enabled uplink in terms of cardinal operating parameters. For example, insights regarding the BS storage size are exposed. Moreover, the impacts of key parameters such as the file popularity distribution and the target bitrate are investigated. Specifically, the outage probability decreases and the average delivery rate increases as the storage size is increased. In addition, the concentration parameter, defining the number of files stored at the intermediate nodes (popularity), directly affects the proposed metrics. Furthermore, a higher target rate results in higher outage because fewer users satisfy this constraint. Also, we demonstrate that a denser network decreases the outage and increases the delivery rate. Hence, the introduction of caching at the uplink of the system design improves the network performance.
Index Terms
Caching, channel aging, heterogeneous networks, massive MIMO, stochastic geometry
I Introduction
A vast majority of new wireless services, such as social networks, web-browsing applications, and multimedia streaming, has fueled mobile data traffic, with an imminent fold boost over the next 10 years [1]. As a result, mobile operators need to redesign their current networks and delve into more sophisticated techniques to increase system capacity and expand coverage in fifth generation (5G) networks [2].
A promising solution in this direction relies on the deployment of low-power, short-range, and cost-efficient small cell networks (SCNs), also known as heterogeneous networks (HetNets). In fact, irregular cellular networks, deployed opportunistically and in hot spots, have been researched for a fairly long time now [3]. Specifically, downlink single-input single-output (SISO) HetNets have already seen sufficient progress [4, 5, 6, 7]. Indeed, relevant standardization activities started in 3GPP release a long time ago [8]. Since SISO studies are now considered outdated, efforts have been devoted to the challenge of modeling multi-antenna HetNets by assuming perfect channel state information (CSI) [9, 10, 11]. In addition, the practical consideration of imperfect CSI has been addressed in several works [12, 13]. Moreover, research has been devoted to the analysis of stochastic geometric uplink models [14, 15].
In a parallel avenue, massive multiple-input multiple-output (MIMO) has emerged as another technology supporting the backbone of 5G networks. Remarkably, massive MIMO promises increases in both spectral and energy efficiency [16, 17]. High cell throughput along with simple signal processing is achieved by deploying large-scale antenna arrays at the BS together with multi-user (MU) transmission. Starting from the strong assumption of perfect CSI, research has faced realistic impediments such as the presence of pilot contamination [18] and [16], the inevitable hardware impairments [19] and [20], as well as channel aging [21, 22, 23]. In particular, by applying the theory of deterministic equivalent (DE) analysis, massive MIMO systems were studied under the presence of channel aging [21]. The key assumption of the DE analysis is that the numbers of BS antennas and users both grow large with a given ratio. Channel aging refers to the channel variation between the time instance the channel is estimated and the time instance it is used for precoding or detection. The sources of channel aging are mainly the relative movement of the users with respect to the BS antennas, the phase noise, and any processing delays [23]. Despite its significant implications, few works have scrutinized its impact on massive MIMO systems [21, 22, 23, 13].
The growing trend of user-generated content, such as the sharing of real-time events by means of smartphones, inflicts a great uploading strain on wireless networks. Despite the importance of the uplink in mobile cellular networks, most efforts have focused on the downlink scenario [24, 25, 26]. In fact, few attempts have been dedicated to the expanding demands of users’ transmission (uploading). Notably, the procedures of uploading and downloading differ. Specifically, there is an asymmetry between the downlink and uplink bandwidths, since the downlink bandwidth can reach times the corresponding uplink bandwidth. As a consequence, the uploading time is longer and the throughput is lower. Hence, the quality of experience is lower. Another difference concerns the limited resources of mobile devices, such as the battery capacity and the transmit power. Clearly, moderating the uplink traffic pressure is critical. An efficient remedy that comes into play is the employment of caching in SCNs by exploiting the content popularity appearing as redundancy [27]. In fact, caching the users’ content at the edge of the network brings gains regarding the user satisfaction and the traffic load [24]. However, most works address the problem of caching in the downlink direction.
I-A Motivation-Central Idea
This paper is motivated by the following observations: 1) Content providers relocate their users’ contents from the core network to the intermediate nodes in the network (caching). Hence, a key question to answer is the design and investigation of the converse scenario, where the users move their content to the intermediate nodes to alleviate the upload traffic. 2) We employ a large number of antennas at each BS to take advantage of the benefits of massive MIMO such as the elimination of intra-cell interference. 3) Networks have the tendency to become denser resulting in their irregularity, i.e., it is required to introduce the concept of SCNs. 4) User mobility and its resultant channel aging is a common phenomenon in wireless communications. The main contributions are summarized as follows.

Contrary to existing works [24, 27], which have studied downlink caching, we focus on a novel strategy described as uplink caching. Hence, instead of the users requesting contents from their associated BS, the users upload their contents to their strongest BS. More concretely, we formulate the caching problem in the uplink.

In parallel with considering uplink caching, we take advantage of the performance benefits gained when massive MIMO technology is employed within a HetNet design. To the best of the authors’ knowledge, this work is unique in studying the notion of massive MIMO in caching.

We introduce the modified von Mises distribution instead of a power-law or the Zipf distribution to describe the content popularity distribution, in order to represent the locality and the concentration of the content popularities in a specific region.

It is the first work introducing channel aging in an architecture including caching. Although we do not show the degradation of the system performance in terms of plots by varying the user mobility due to users’ relative movement, the analytical results allow the observation of its dependence and its loss quantification.

We derive the outage probability and the average delivery rate of an uplink massive MIMO HetNet, where the intermediate nodes are enriched with caching resources. In particular, after having obtained the deterministic signal-to-interference-plus-noise ratio (SINR) by means of the theory of DEs, we obtain a statistical expression. Notably, the main benefit of the DEs is the provision of deterministic expressions, which avoid the need for Monte Carlo simulations.

Relying on the numerical results, we elaborate on the impact of various parameters such as the BS density and the storage size. For example, a large storage size improves the system, since the outage probability decreases and the average delivery rate increases. For the sake of comparison, we also present the results corresponding to the absence of caching, where applicable.
I-B Paper Outline
The remainder of this paper has the following structure. Section II develops the system model of the uplink of a massive MIMO HetNet with channel aging. Section III presents the caching model, while Section IV provides the estimated channel including the effects of pilot contamination and channel aging. Next, Section V presents the uplink transmission under the presence of channel aging and the introduction of the caching concept. Section VI provides the main results of this study. In particular, Subsection VI-A includes the derivation and investigation of the outage probability, while Subsection VI-B presents the average delivery rate of this general model. The numerical results are placed in Section VII, and Section VIII concludes the paper.
I-C Notation
Vectors and matrices are denoted by boldface lower and upper case symbols. The notations and refer to complex dimensional vectors and matrices, respectively. The symbols , , and express the transpose, Hermitian transpose, and trace operators, respectively. The expectation operator is denoted by , and the symbol declares definition. Moreover, is the zeroth-order Bessel function of the first kind, and denotes the Gamma distribution with shape and scale parameters and , respectively. Finally, represents a circularly symmetric complex Gaussian vector with zero mean and covariance matrix .
II System Model
This section considers the setup for the uplink of a massive MIMO cellular network consisting of BSs with locations drawn according to an independent PPP with density , i.e., this formulation corresponds to the generalized design of uplink massive MIMO HetNets. A graphical representation of the system layout is shown in Fig. 1. Specifically, let the BS of the th cell have a large number of antennas, denoted by . The users are assumed to be distributed as an independent PPP with a sufficiently high density . In any resource block, we assume that the th BS randomly schedules users according to a distance-based criterion. The users are connected with the nearest BS, whose coverage region constitutes a Voronoi cell, while the set of all these cells forms a Voronoi tessellation [28] (Footnote 1: In this work, we assume only non-line-of-sight (NLoS) transmission, while the consideration of an LoS component is left for future work. Its introduction in the analysis could be made by means of a multi-slope pathloss model [29].). In other words, these users are connected with the strongest BS (Footnote 2: The users are assumed to be distributed as an independent point process, but each cell is large enough (the density of the users’ PPP is sufficiently large) to shelter users. The users in each cell are independently and uniformly distributed.). Also, we assume that , as stated by the basic principle of massive MIMO technology (Footnote 3: Although in practice the number of BS antennas , and the number of associated users differ across cells, henceforth, we assume and for the sake of simplicity.). Moreover, we consider that each user is equipped with a single-antenna mobile terminal. Evidently, since the massive MIMO concept is employed, many degrees of freedom are shared across each cell.
A central scheduler provides a fixed broadband connection to these BSs by means of wired backhaul links. The capacity of the link between the backhaul and each BS is a decreasing function of , since the deployment of more BSs per given area results in less capacity per backhaul link. A capacity expression obeying this property will be introduced in Subsection VI-B.
Further to the network topology, by exploiting Slivnyak’s theorem, we are able to conduct the analysis just by focusing on a randomly chosen BS found at the origin [30]. Hereafter, we refer to this BS as the tagged BS. Hence, we assume that at time , the location of the associated scheduled user is at , while the location of the th user found in the th cell is denoted by . Similarly, is the channel vector from the associated th user in the cell located at to the tagged BS (located at the origin), while the interference term is the channel vector corresponding to the link from the th user of the th cell located at . The locations of the th scheduled users from all the cells are formed by a non-stationary point process , which is not a PPP because of the correlation of their locations with the BS process. The explanation relies on the prohibition of the presence of all other users in in the tagged cell [14, 15]. Although these correlations among the scheduled users’ locations make the exact analysis intractable, we adopt the uplink model, accounting for the pairwise correlations, proposed in [31] and followed in [15]. Moreover, we consider an exclusion ball approximation on the distribution of the scheduled user process , being a first-order approximation of the model in [31]. On top of the ball approximation, we assume that the random variable expressing the distance from the scheduled user to its tagged BS at the origin is a Rayleigh variable with a mean of [14]. We denote the radius of the ball, in order to let the area of the exclusion ball equal the average cell size, which is [32]. Furthermore, the scheduled user process , describing the locations of the other scheduled users, is formed as an inhomogeneous PPP of density where the users are found outside an exclusion ball centered at the tagged BS.
Especially, the locations of the th users of the other cells, belonging to , are modeled by means of an inhomogeneous PPP with a density function of
(1) 
where .
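The geometric assumptions above can be checked numerically. The following is a minimal sketch, not the authors' simulator: it samples a homogeneous PPP of BSs in a square window and measures the distance from the origin to the nearest BS. It relies on two standard PPP facts that are assumed here to be the values elided in the text: the nearest-point distance is Rayleigh distributed with mean 1/(2·sqrt(λ)), and the mean Voronoi cell area is 1/λ, so an exclusion disc of equal area has radius 1/sqrt(π·λ). The names `lam`, `half_side`, and all function names are hypothetical.

```python
import math
import random


def sample_poisson(mean, rng):
    """Draw a Poisson variate with Knuth's method (adequate for means ~100)."""
    limit = math.exp(-mean)
    count, prod = 0, rng.random()
    while prod > limit:
        count += 1
        prod *= rng.random()
    return count


def nearest_bs_distance(lam, half_side, rng):
    """Distance from the origin to the closest point of a PPP of density lam
    sampled on the square [-half_side, half_side]^2 (None if it is empty)."""
    n = sample_poisson(lam * (2.0 * half_side) ** 2, rng)
    if n == 0:
        return None
    return min(
        math.hypot(rng.uniform(-half_side, half_side),
                   rng.uniform(-half_side, half_side))
        for _ in range(n)
    )


def mean_nearest_distance(lam, trials=2000, half_side=5.0, seed=0):
    """Monte Carlo average of the nearest-BS distance over PPP realizations."""
    rng = random.Random(seed)
    draws = (nearest_bs_distance(lam, half_side, rng) for _ in range(trials))
    samples = [d for d in draws if d is not None]
    return sum(samples) / len(samples)


def exclusion_radius(lam):
    """Radius of a disc whose area equals the mean Voronoi cell area 1/lam
    (assumed form of the exclusion-ball radius)."""
    return 1.0 / math.sqrt(math.pi * lam)
```

For a unit BS density, `mean_nearest_distance(1.0)` should land close to the Rayleigh mean of 0.5, up to Monte Carlo error and negligible window-edge effects.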
Both the uplink training and data transmission phases necessitate the introduction of fractional power control in our analysis. Thus, the user in the th cell transmits with power
(2) 
where expresses the fraction of the compensation of the pathloss given by , while is the open loop transmit power assuming no power control.
As far as the channel model is concerned, let the point-to-point channels be characterized by independent and identically distributed (i.i.d.) Rayleigh fading with unit mean under a block fading model, where the channel is constant during one block but varies independently from block to block. Note that although both line-of-sight and non-line-of-sight signals appear in small cells, we consider only Rayleigh fading for the sake of simplicity. Relaxation of the Rayleigh fading assumption as well as the introduction and study of other fading models can be done with techniques found in [33], and is left for future work. Hence, in the proposed model, the channel vector from user in the th cell at the th time slot is modelled as
(3) 
where is the large-scale pathloss and is an uncorrelated fast fading Gaussian channel vector with elements having zero mean and unit variance, i.e., . Note that the incorporation of spatial correlation due to limited antenna spacing, and of different antenna patterns, is left for future work. The pathloss at the tagged cell is described by
(4) 
where is the pathloss exponent, and expresses a constant determined by the carrier frequency and the reference distance. Given that we have assumed an NLoS component, the distance-based criterion translates into a pathloss-based one, i.e., the minimum distance corresponds to the minimum pathloss. For the sake of exposition, we have assumed a simplistic single-slope pathloss model, in order not to distract the reader from the main contributions. The application of a more complex model, such as the multi-slope pathloss model presented in [34], is outside the scope of this paper and left to future work.
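To illustrate how fractional power control interacts with a single-slope pathloss model, the sketch below uses the common textbook forms, which are assumptions here because the expressions (2) and (4) are not legible in this text: a pathloss gain `c * r**(-alpha)` and a transmit power equal to the open-loop power `p0` times the pathloss gain raised to `-eta`, where `eta` in [0, 1] is the compensation fraction. All names and default values are hypothetical.

```python
def path_loss(r, alpha=4.0, c=1.0):
    """Single-slope pathloss gain at distance r (assumed form c * r^-alpha)."""
    return c * r ** (-alpha)


def transmit_power(r, eta, p0=1.0, alpha=4.0, c=1.0):
    """Fractional power control (assumed form): invert a fraction eta of the
    pathloss to the serving BS, starting from the open-loop power p0."""
    return p0 * path_loss(r, alpha, c) ** (-eta)


def received_power(r, eta, p0=1.0, alpha=4.0, c=1.0):
    """Average power received at the serving BS, ignoring small-scale fading."""
    return transmit_power(r, eta, p0, alpha, c) * path_loss(r, alpha, c)
```

Under these assumptions, `eta = 1` fully compensates the pathloss, so every scheduled user arrives at its serving BS with power `p0`, while `eta = 0` disables power control and all users transmit at `p0`.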
The transmission scheme includes an uplink channel estimation phase, and continues with an uplink data transmission phase, allowing the derivation of the outage probability and the average delivery rate, in order to shed light on their behavior. However, we need first to introduce the caching model.
III Caching Model
In certain cases, BSs need to cache the files of users in the uplink and backhaul for further cost savings, latency reduction, etc. For example, consider a crowded scenario where many users want to upload their video recordings to the Internet/network. If uploaded files could be proactively cached at the BSs, the network would benefit whenever nearby users later wish to download or watch those videos. Proactive caching at the uplink could thus alleviate the upload traffic. This can also serve as a criterion for selecting which files are of high interest. Specifically, assuming that the users upload files via the uplink, there is a chance that the file of the uploading user will attract many downloads (e.g., a video shared by Justin Bieber will most likely be viewed by his followers). Therefore, the file of interest could be inferred by proactive prediction methods that estimate how "influential" the uploading device is, relying on machine learning, social networks, etc. Moreover, although few studies address the role of caching in the uplink, uplink caching will be very beneficial when Internet of Things (IoT) devices are introduced in cellular networks [35, Fig. 37].
Let us consider that the network has a content catalog of contents represented by the set of . User in th cell at the th time slot (located at ) demands a content from a subcatalog according to a content popularity distribution . In particular, the BS at the tagged cell has a content popularity distribution and is modelled by a modified von Mises distribution [36], which is a symmetric circular distribution defined as
(5) 
where is a point in the support such as , the parameter is a measure of location such as , the parameter is a measure of concentration with , and the function is the zeroth-order Bessel function of the first kind. The distribution becomes uniform when and highly concentrated on the point when . The parameters and are analogous to the mean and variance in the Gaussian distribution. In fact, when , we obtain the Dirac delta function. The intuition behind this modelling choice is the observation that the content catalog is finite and the content popularities might be concentrated on a specific region, where parameters and are used for its description. Suppose that the th BS has a storage capacity of nats with ( bit = nats), and caches files according to the policy provided below. Henceforth, all the parameters concerning caching are the same across all cells, e.g., we assume that and . The length of each file in the catalog is nats, while expresses its bitrate in nats/s/Hz. Note that the uplink rate of each user has to be equal to or higher than the file bitrate , in order to avoid any interruption during its experience.
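The limiting behaviour described above (uniform for vanishing concentration, sharply peaked for large concentration) can be seen with the standard von Mises density on an interval of length 2π. This is a hedged sketch: the paper's modified variant in (5) may use a different support or scaling, and the standard density is normalized by the modified Bessel function of the first kind of order zero, computed here by its power series. The names `mu`, `kappa`, and the function names are illustrative.

```python
import math


def bessel_i0(x, terms=60):
    """Modified Bessel function of the first kind, order zero, via its series
    sum_k (x^2/4)^k / (k!)^2 (accurate for moderate x)."""
    total, term = 0.0, 1.0
    for k in range(terms):
        total += term
        term *= (x * x / 4.0) / ((k + 1) ** 2)
    return total


def von_mises_pdf(x, mu, kappa):
    """Standard von Mises density with location mu and concentration kappa."""
    return math.exp(kappa * math.cos(x - mu)) / (2.0 * math.pi * bessel_i0(kappa))


def integrate_pdf(mu, kappa, steps=2000):
    """Midpoint-rule integral of the density over [-pi, pi]; should be ~1."""
    width = 2.0 * math.pi / steps
    return sum(
        von_mises_pdf(-math.pi + (i + 0.5) * width, mu, kappa) * width
        for i in range(steps)
    )
```

With `kappa = 0` the density is flat at 1/(2π), and as `kappa` grows the mass concentrates around `mu`, mirroring the roles of mean and (inverse) variance mentioned in the text.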
We assume that the most popular files from the catalog are stored in advance offline. Storing the most popular files requires perfect knowledge of the content popularity, which might not be possible to construct locally. In order to make the policy local/geographical, an interesting caching policy is to store the th closest different files mentioned above.
If the file is of high interest for the BS (i.e., it will be a popular file in the future) and is not cached yet, then the BS should cache it, i.e., uplink transmission occurs. If the file is of high interest for the BS and is already cached, then the BS should do nothing (cache miss). In other words, in this setting, a cache hit means that the user is in coverage and the file is not stored at the BS. Hence, uploading to the BS is meaningful; otherwise, if the user is not in coverage or the file is already stored at the BS, the file will not be uploaded or it will be uploaded to the core network from the BS.
Notably, the uplink caching process is dynamic, since the users are likely to have popular content at any time, which should be uploaded for the better performance of the network. Hence, the proposed model remains relevant at all times. Even when all the users have uploaded their contents, they are able to exploit the model and focus on downloading content that has already been uploaded. The latter scenario is very rare, because it is unlikely that at some point all the users will have uploaded their contents. At any time, there will be at least one user with some popular content to upload.
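A most-popular-first policy of this kind can be sketched as follows, under the assumption (not stated explicitly in the text) that the popularity has been discretized into one weight per file and that the number of cached files is the storage size divided by the common file length, rounded down. The function and argument names are hypothetical.

```python
def cached_files(popularity, storage_nats, file_length_nats):
    """Return the indices of the files kept under a most-popular-first policy.

    popularity: one non-negative weight per catalog file (assumed discretized).
    storage_nats / file_length_nats: how many equal-length files fit.
    """
    capacity = int(storage_nats // file_length_nats)
    ranked = sorted(range(len(popularity)),
                    key=lambda f: popularity[f], reverse=True)
    return set(ranked[:capacity])


def cached_mass(popularity, cache):
    """Fraction of the total popularity covered by the cached files."""
    return sum(popularity[f] for f in cache) / sum(popularity)
```

For instance, with weights `[0.1, 0.4, 0.2, 0.3]`, a storage of 2.5 nats, and unit-length files, two files fit and the policy keeps files 1 and 3, covering about 70% of the popularity mass.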
IV Channel Estimation
We denote the channel coherence time by . We assume that the same time-frequency resources are shared by the users across all cells. Aiming at the characterization of realistic systems, we account for imperfect CSIT due to pilot contamination and channel aging. Let denote the length of the training period. Following a time-division duplex (TDD) design, during the uplink training phase, having duration symbols, the tagged BS obtains the estimated channel. The uplink data transmission phase consists of symbols.
Keeping in mind that the signal from each user is attenuated with distance because of the pathloss, we present the pilot contamination occurring due to the reuse of the pilot sequences during the training phase.
IV-A Pilot Contamination
According to TDD, estimation of the local CSI takes place during the uplink training phase, where the same band of frequencies is shared across all cells. Moreover, the th user in each cell is assigned the same pilot sequence. As a result, pilot contamination occurs and the degradation of the system performance is inevitable. Let the superscript describe the training stage. Furthermore, the scheduled user processes with different pilots, i.e., , , are assumed to be independent. The tagged BS receives a noisy observation of the channel vector from the associated scheduled user at time instance . The average power of each transmitted pilot symbol from the scheduled user is . Hence, the associated BS observes the channel as
(6) 
where the vector denotes the training sequence of the th user with =1, and is the spatially white additive Gaussian noise matrix with i.i.d. entries distributed as . Note that the channel vectors , being independent across cells and user distances, are Gaussian distributed as .
The tagged BS estimates by applying minimum mean square error (MMSE) estimation to (6), and by assuming that the tagged BS knows perfectly the largescale pathlosses for . Thus, the estimated channel is
(7) 
and it is distributed as with variance given as
(8) 
Based on the orthogonality principle of the MMSE estimation, the uncorrelated estimation error vector at time instance is , being distributed as with
(9) 
IV-B Channel Aging
The relative movement of the th associated user with respect to the tagged BS antennas results in the variation of the channel. Hence, this source of imperfection contributes further to the need for estimation of the channel. Mathematically, we are able to relate the current sample of the channel with its past samples by means of an autoregressive model of order [37]. Herein, for the sake of computational complexity and tractability, we choose an autoregressive model of order , which is a common approach in the literature [38]. Thus, the current channel at the tagged BS is modeled as
(10) 
where is the channel in the previous symbol duration, and , modelled as a stationary Gaussian random process with i.i.d. entries and distribution , is the uncorrelated channel error because of the channel variation [38]. Regarding , it is related to the second-order statistics. Specifically, an appropriate measure for modeling the variation of the channel is its second-order statistics, which can be described by means of the autocorrelation function of the channel. For this role, a widely accepted model is the Jakes model due to its generality and simplicity [37]. The Jakes model describes a propagation medium with two-dimensional isotropic scattering and a monopole antenna at the receiver [39]. In such case, the normalized discrete-time autocorrelation function of the fading channel is expressed by
(11) 
where and are the maximum Doppler shift and the channel sampling period. In particular, the maximum Doppler shift can be expressed by means of the relative velocity of the scheduled user , i.e., , where is the speed of light and is the carrier frequency. Also, denotes the delay. As the argument of the Bessel function increases, its magnitude decays towards zero with some ripples. We set , i.e., we consider a single symbol delay. To this end, we assume that the BS has perfect knowledge of .
Remarkably, following the procedure in [21], we are able to combine the effects of pilot contamination and time variation of the channel in a single expression. More concretely, the channel at time slot can be written as
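The aging model of this subsection can be sketched with the standard Jakes expressions, which are assumed here to match the elided formulas: a maximum Doppler shift equal to velocity times carrier frequency divided by the speed of light, a correlation coefficient J0(2π · Doppler · sampling period · |lag|), and a first-order autoregressive update whose innovation has variance 1 minus the squared correlation, so that a unit-variance channel keeps unit variance. The Bessel function is evaluated by its power series, which is adequate only for moderate arguments; all names and numeric values are illustrative.

```python
import math
import random

SPEED_OF_LIGHT = 299_792_458.0  # metres per second


def bessel_j0(x, terms=40):
    """Zeroth-order Bessel function of the first kind via its power series
    (suitable for moderate |x|, roughly below 10)."""
    total, term = 0.0, 1.0
    for k in range(terms):
        total += term
        term *= -(x * x / 4.0) / ((k + 1) ** 2)
    return total


def jakes_correlation(speed_mps, carrier_hz, sample_period_s, lag=1):
    """Normalized autocorrelation of the fading channel under the Jakes model."""
    doppler = speed_mps * carrier_hz / SPEED_OF_LIGHT
    return bessel_j0(2.0 * math.pi * doppler * sample_period_s * abs(lag))


def age_channel(h_prev, rho, rng):
    """One AR(1) step for a scalar unit-variance complex Gaussian channel:
    h_new = rho * h_prev + e, with e ~ CN(0, 1 - rho^2) (assumed scaling)."""
    sigma = math.sqrt((1.0 - rho * rho) / 2.0)
    innovation = complex(rng.gauss(0.0, sigma), rng.gauss(0.0, sigma))
    return rho * h_prev + innovation


def mean_power_after_aging(rho, steps=20000, seed=1):
    """Average |h|^2 along an aged trajectory; stays near 1 by construction."""
    rng = random.Random(seed)
    h = complex(rng.gauss(0.0, math.sqrt(0.5)), rng.gauss(0.0, math.sqrt(0.5)))
    total = 0.0
    for _ in range(steps):
        h = age_channel(h, rho, rng)
        total += abs(h) ** 2
    return total / steps
```

A static user (zero speed) gives a correlation of exactly one, i.e., no aging, while a user at 30 m/s with a 2 GHz carrier and a 1 ms sampling period gives a correlation strictly between zero and one.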
(12) 
where and with are mutually independent. Hence, the estimated channel of the scheduled user at time is provided by . Note that in the special case, where , we obtain a static environment with no user mobility.
V Uplink Transmission
In general, the physical representation of a link defines the probability distribution function (PDF) of this link. Specifically, we face different distributions depending on whether we model the desired or the interference part of the received signal. Another example affecting the PDF concerns the choices between multi-antenna and single-antenna BS architecture, and between single-user and multi-user transmission. Notably, herein, we employ the general setting of a large number of antennas deployed by the tagged BS serving multiple users simultaneously. The first step towards the statistical characterization of the powers of the received signal’s parts is to model the uplink transmission.
Thus, accounting for a quasi-static block fading model with frequency-flat fading channels varying from symbol to symbol, the received signal from the associated scheduled user at to the tagged BS during the th time slot after applying a general decoder can be expressed as
(13) 
where is the uplink data symbol of the th scheduled user with . The channel vector denotes the desired channel vector between the tagged BS and the associated th scheduled user located at at time instance . Similarly, expresses the interference channel vector from the other users found at far from the typical BS at time instance . Also, is the Gaussian thermal noise vector in the uplink data transmission.
Taking into account the realistic case of imperfect CSI due to pilot contamination and time variation of the channel (see (12)), the received signal by the tagged BS can be written as
(14) 
where we have replaced the current channel by means of (12) with its estimated version (Footnote 4: Note that the replacement concerns only the current desired channel because the interference part is not of direct interest and can be seen as additive noise.). In (14), the first term expresses the desired signal received by the tagged BS. The second term describes the estimation error effect. Furthermore, the third term represents the interference from other users, while the last term denotes the post-processed noise.
The achievable uplink SINR from the scheduled user to the tagged BS, denoted by , is shown in (16), where we have treated the unknown terms at the tagged BS as uncorrelated additive noise. The encoding of the message takes place over many realizations of certain sources of randomness in the model. Specifically, the expectation operators are taken over the channel estimation error, the small-scale fading in the interference links, and the thermal noise. Notably, the resultant SINR, provided by (17), is a random variable because of the randomness accompanying the large-scale pathlosses. Thus, we arrive at the uplink of a multi-user large MIMO HetNet, where the SINR expression will be investigated by using tools from stochastic geometry and large random matrix theory.
Herein, we present the derivation of an approximation of the SINR, when maximal-ratio combining (MRC) receivers are employed. As the number of BS antennas grows large, the approximation becomes tighter according to the theory of DEs [40]. Starting from the DE SINR distribution, we derive below the outage probability and the average delivery rate.
Let the tagged BS apply the decoder to the received signal, being a scaled version of the channel estimate . The mathematical expression of the MRC decoder is
(15) 
where the scaling of the decoder is applied for the sake of simplicity, but it will not affect the SINR distribution. Also, note that the decoder depends on the estimated channel, obtained during the training phase. Hereafter, we omit the time index from the expressions, while note that the DE expressions are calculated over the channel distributions, i.e., they are conditioned on the BSs positions.
Let be the deterministic SINR, obtained such that (Footnote 5: The notation denotes almost sure convergence, while the definition of the term “deterministic equivalent” is given by [40, Def. 6.1].).
Proposition 1.
The uplink achievable DE SINR with MRC decoding under the presence of pilot contamination and channel aging is given by (17) with .
(16) 
(17) 
Proof.
See Appendix B. ∎
Notably, the terms correspond to the interference terms from other cells.
Vi Performance Analysis
The proposed realistic system depends on several practical factors, e.g., the pilot contamination and the channel aging as well as caching parameters such as the storage size. As already known, the quality-of-experience constraints specify that the uplink rate of the scheduled user should be equal to or higher than the file bitrate so that the user does not observe any interruption during its experience. In addition, another impediment has not been taken into account in most cases. It concerns the rate of the backhaul, which becomes quite important when the cache misses.
The quantification and the assessment of the system necessitate the definition of certain metrics, namely the outage probability and the average delivery rate.
VI-A Outage Probability
In this section, we present the uplink outage probability of the associated user in a large antenna MU HetNet with imperfect CSIT due to pilot contamination and channel aging, while caching is employed. The technical derivation is given in Appendix C. As a performance metric, the outage probability is given as the complement of the success (coverage) probability, which expresses the joint probability of the uplink rate exceeding the file bitrate and the received file missing from the local cache. It is worthwhile to mention that only the BSs connect with the backhaul. In other words, the associated user is not able to upload its content if the BS already has it. Actually, there is no reason to do so, since the content is stored at the BS and it is the BS’s task to upload it through its wired link. Hence, we have
(18) 
where , is the file received by the typical BS, and is the local cache of the serving BS at the th cell. In contrast to [27], this definition follows a different line of reasoning. In particular, if the requested file is not in the cache of the serving BS and the uplink rate is higher than the file bitrate , then the user uploads its content and does not observe any interruption during its communication. In that case, we expect the outage probability to be close to zero. Formally, the outage probability is given by the following theorem.
Theorem 1.
The approximated uplink outage probability in a large MU-MIMO HetNet with caching attributes, accounting for imperfect CSIT due to pilot contamination and channel aging, is given by
(19) 
with given by (5) and the coverage probability given by (21) with . The variable represents the number of terms used in the calculation, , while , , , , , , and .
Proof.
See Appendix C. ∎
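The verbal definition of outage used in this section (the complement of the joint event "uplink rate exceeds the file bitrate and the file is missing from the local cache") can be illustrated with a small empirical estimator. This is only a sketch under stated assumptions: the function name `empirical_outage`, the toy SINR values, and the cache flags are hypothetical, and the rate is taken as log(1 + SINR) in nats/sec/Hz without the training-overhead pre-log factor. It does not reproduce the closed form of Theorem 1.

```python
import math

def empirical_outage(sinr_samples, cached_flags, target_rate):
    """Estimate outage as 1 minus the fraction of successful trials.

    A trial counts as a success only when BOTH conditions hold:
      * the uplink rate log(1 + SINR) exceeds the target bitrate, and
      * the uploaded file is NOT already in the local cache of the BS.
    Assumption: no pre-log factor for training overhead is applied.
    """
    if len(sinr_samples) != len(cached_flags) or not sinr_samples:
        raise ValueError("need equally long, non-empty sample lists")
    successes = 0
    for sinr, cached in zip(sinr_samples, cached_flags):
        rate = math.log1p(sinr)  # nats/sec/Hz
        if rate > target_rate and not cached:
            successes += 1
    return 1.0 - successes / len(sinr_samples)

# Hypothetical toy data: four trials with made-up SINR values and cache states.
toy_sinr = [10.0, 10.0, 0.1, 10.0]
toy_cached = [False, True, False, False]
toy_outage = empirical_outage(toy_sinr, toy_cached, target_rate=1.0)
```

In a full simulation, the SINR samples would come from the system model with pilot contamination and channel aging, and the cache flags from the chosen popularity distribution.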
VI-B Average Delivery Rate
This section presents the derivation of the average delivery rate, defined as
(20) 
where with and being arbitrary coefficients, under the constraint that the ceiling of the delivery rate is with . denotes the backhaul capacity available to the intermediate nodes. Also, is the fraction of time expressing the training overhead incurred during channel estimation.
(21) 
However, (20) refers to the opposite direction (uplink), and thus admits an interesting and insightful interpretation, especially because the conditions are different. When the uplink rate is higher than the target file rate (bitrate) and the file is not found in the BSs, the user uploads at full rate . On the contrary, if the rate is greater than and the file is already in the local cache, the associated user does not upload its content, but the tagged BS does. The latter constraint relies on the assumption that a high-speed backhaul is not cost-efficient in dense networks.
Theorem 2.
The approximated uplink average delivery rate of the typical BS in a large MU-MIMO HetNet with caching attributes, accounting for imperfect CSIT due to pilot contamination and channel aging, is given by
(22) 
where is given by (5), and the coverage probability is given by (21) with if we substitute with .
Proof.
See Appendix D. ∎
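The case distinction behind the delivery rate in this section can be sketched as a piecewise function. This is an illustrative reading, not the expression of Theorem 2: the parameter names `full_rate` and `backhaul_rate` are hypothetical, and capping the cached branch at `min(backhaul_rate, full_rate)` is an assumption about how the ceiling constraint on the delivery rate applies.

```python
def delivery_rate(uplink_rate, target_rate, in_cache, full_rate, backhaul_rate):
    """Per-trial delivery rate following the verbal case distinction.

    * uplink rate not above the target bitrate -> nothing is delivered;
    * rate above target, file NOT in cache     -> the user uploads at full rate;
    * rate above target, file already cached   -> the BS uploads over the
      finite-rate backhaul (assumed here to be capped by the full rate).
    """
    if uplink_rate <= target_rate:
        return 0.0
    if not in_cache:
        return full_rate
    return min(backhaul_rate, full_rate)

def average_delivery_rate(trials, target_rate, full_rate, backhaul_rate):
    """Average the per-trial rate over (uplink_rate, in_cache) pairs."""
    if not trials:
        raise ValueError("need at least one trial")
    total = sum(
        delivery_rate(rate, target_rate, cached, full_rate, backhaul_rate)
        for rate, cached in trials
    )
    return total / len(trials)

# Hypothetical toy trials: (uplink rate in nats/sec/Hz, file already cached?)
toy_trials = [(2.0, False), (2.0, True), (0.5, False), (0.5, True)]
toy_average = average_delivery_rate(
    toy_trials, target_rate=1.0, full_rate=1.0, backhaul_rate=0.4
)
```

Keeping the per-trial rule separate from the averaging makes it easy to swap in a different backhaul model without touching the Monte Carlo loop.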
VII Numerical Results
In this section, we illustrate the behavior of the analytical expressions for the outage probability and the average delivery rate , which are provided by (19) and (22). (Footnote 6: Remarkably, there is no known result in the literature studying caching in the uplink of a HetNet employing a large number of antennas (massive MIMO). In addition, there is no known reference investigating channel aging in the case of cache-enabled BSs.) In particular, we investigate the impact of various design parameters such as the BS density , the storage size of BSs in nats, and the target bitrate in nats/sec/Hz. Also, the analytical expressions are verified by Monte Carlo simulations. The simulated curves were obtained by averaging the corresponding expressions over random instances. The simulated results of the outage probability and the average delivery rate are depicted along with the proposed analytical expressions. Specifically, the bullets correspond to the simulation results, while the solid lines represent the proposed analytical results for varying parameters. The distinction between solid and dotted lines, where applicable, designates the results with “caching” and “no caching”, respectively. The “no caching” scenario is obtained by assuming that the content popularity distribution coincides with the Dirac delta function, i.e., .
The simulations follow a specific procedure. We choose a sufficiently large area of , where the locations of the BSs are simulated as a realization of a PPP with given density . Next, the users’ PPP density is considered to be . (Footnote 7: Although the analytical expressions rely on the assumption of an infinite plane, the simulation takes place over a finite window.) The association relies on the minimum path-loss (distance-based) rule, while users from each cell are randomly scheduled. Hence, we select the strongest user to the tagged BS, found at the origin, as the associated scheduled user at . It is worthwhile to mention that the users could employ other schemes to upload their contents to the BSs. For example, they could select the serving BSs rather than the closest BSs. The relevant comparison with other approaches regarding the selection of the appropriate BSs is interesting and is left for future work. Furthermore, the setup includes BSs with antennas each, while we pick users per BS. The system under study, with such a number of BS antennas, is considered to describe a massive MIMO model, since the simulations coincide with the DEs. In other words, the DEs are tight approximations even for this number of antennas. This is not a new observation: similar observations have been made in the literature even for an system [40, 41, 42, 43]. The average uplink transmit power for both training and transmission phases is , and the bandwidth allocated to each user is MHz. Also, regarding the remaining parameters, we set nats, , , , , , , , unless otherwise stated. Due to limited space, we do not focus on channel aging, which is studied in other works such as [13]; instead, the focus is on the impact of caching in the uplink.
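The geometric part of the simulation procedure described above (PPP realizations over a finite window, distance-based association, random scheduling) can be sketched as follows. This is a minimal illustration, not the authors' simulator: the densities, window side, and number of scheduled users are hypothetical values, the helper names are invented for this example, and channel effects (fading, pilot contamination, channel aging) are omitted.

```python
import math
import random

def sample_poisson(mean, rng):
    """Poisson sample: count unit-rate exponential arrivals within [0, mean]."""
    count = 0
    t = rng.expovariate(1.0)
    while t <= mean:
        count += 1
        t += rng.expovariate(1.0)
    return count

def sample_ppp(density, side, rng):
    """Homogeneous PPP of the given density on a square window centred at the origin."""
    n = sample_poisson(density * side * side, rng)
    half = side / 2.0
    return [(rng.uniform(-half, half), rng.uniform(-half, half)) for _ in range(n)]

def associate(users, base_stations):
    """Assign every user to its closest BS (minimum path loss for a distance-based model)."""
    cells = {i: [] for i in range(len(base_stations))}
    for user in users:
        nearest = min(
            range(len(base_stations)),
            key=lambda i: math.dist(user, base_stations[i]),
        )
        cells[nearest].append(user)
    return cells

def schedule(cells, users_per_bs, rng):
    """Randomly pick at most users_per_bs users in every cell."""
    return {i: rng.sample(us, min(users_per_bs, len(us))) for i, us in cells.items()}

# Hypothetical parameters, chosen only to keep the example small.
rng = random.Random(1)
bs = [(0.0, 0.0)] + sample_ppp(density=0.2, side=10.0, rng=rng)  # tagged BS at the origin
users = sample_ppp(density=2.0, side=10.0, rng=rng)
cells = associate(users, bs)
scheduled = schedule(cells, users_per_bs=3, rng=rng)
```

Placing the tagged BS at the origin and drawing the remaining BSs independently mirrors the usual typical-point construction; the finite window introduces edge effects that a larger side length would reduce.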
VII-A Impact of BS Density
In Fig. 2, we illustrate the behavior of the outage probability with respect to the BS density for different values of the storage size . We observe that the outage probability decreases as the BS density increases. In other words, a denser HetNet provides better coverage. At the same time, a larger storage size at the intermediate nodes reduces the outage, since the users do not have to upload their content to the core network because the BSs have plenty of space to save the received information.
Regarding the average delivery rate , it increases with the BS density , as can be seen in Fig. 3. However, it saturates soon due to the increasing intra-cell interference. Moreover, a larger storage size contributes to the increase of the rate because the traffic load towards the backhaul is alleviated.
VII-B Impact of Storage Size
Fig. 4 shows the relationship of the outage probability with the storage size of the BSs. Notably, the storage capability of networks with caching is one of the most crucial design parameters. As expected, the outage probability decreases with the storage size, but increases with the target bitrate. In other words, the larger the target bitrate is set, the larger the outage probability will be.
In the same direction, in Fig. 5, the average delivery rate becomes higher with increasing storage size, but after a value of , a further increase is not beneficial, since all users’ content will already be available at the corresponding BSs. In particular, a lower target rate allows better coverage.
VII-C Impact of Concentration Parameter
The variation of the concentration parameter, described by , is depicted in Fig. 6. Small means that a high quota of files is already at the intermediate nodes, i.e., many files are popular. Hence, the users do need to upload their files and significant outage is observed. Moreover, higher storage size allows more files to be uploaded in the BSs. As a result, it is likely that the contents of the users are already at the BSs and the users are inactive since they do not need to upload their contents.
The dependence of the average delivery rate on the concentration parameter is shown in Fig. 7. Specifically, a high concentration parameter means that many files will be uploaded, and thus, the average rate increases.
VII-D Impact of Target Bitrate
The target bitrate is another critical parameter that should be taken into account during the formation and study of the current architecture. In particular, Fig. 8 shows the outage probability versus the target for , and . As the target rate increases, the outage probability increases, since fewer users are served. In addition, the performance improves with increasing storage size because more content can be saved at the intermediate nodes without the need to upload it to the backhaul.
Similarly, Fig. 9 shows the increase in the performance with bigger storage capacities at the BSs, while it is apparent that a higher target rate results in a higher average delivery rate.
VIII Conclusion
In this paper, we introduced the concept of caching in the uplink of a system with stochastically distributed massive MIMO BSs, where users upload their contents to servers through the BSs by means of finite-rate backhaul links. In addition to significantly generalizing the state-of-the-art cache-enabled PPP models to the uplink scenario, we enriched the uplink of the HetNet with the massive MIMO concept. Remarkably, this is the first work where the caching nodes have a large number of antennas. Moreover, our approach considered imperfect CSI due to pilot contamination and channel aging. After deriving the DE of the SINR, we provided the outage probability and the average delivery rate. Our main purpose was to focus on fundamental parameters that are relevant to the caching design, such as the storage size of the serving BSs and the target file rate. In particular, we demonstrated that increasing the storage size improves the performance of the system, since the outage probability decreases and the average delivery rate increases. Furthermore, when the target file bitrate is increased, the majority of the users are not served; hence, the outage probability increases. Overall, it was shown that the introduction of caching in the uplink enhances the system performance.
Appendix A Useful Lemmas
Lemma 1 (Alzer’s inequality [44]).
Assuming that is a normalized gamma random variable with parameter and a constant , then the probability can be tightly upper bounded by
(23) 
where .
Lemma 2 ([43, Lem. B.26]).
Let with uniformly bounded spectral norm (with respect to ). Consider and , where , and , are mutually independent and independent of . Then, we have
(24)  
(25)  
(26)  
(27) 
Appendix B Proof of Proposition 1
First, we divide both the numerator and the denominator of (16) by . Then, we start with the numerator of the SINR. We insert in (16) the expression of the MRC decoder given by (15). We have (Footnote 8: Let and be two infinite sequences. denotes the equivalence relation .)
(28) 
where (a) follows after substituting the estimated channel given in (15), while (b) is obtained by means of Lemma 2, since the covariance of is . We continue with the first term in the denominator of the SINR including the estimation error. We have
because the covariance of the estimation error is . The next step is to derive the second term in the denominator, which is written as
(29) 
When and , the second term of the previous expression simplifies to
(30) 
where we have used Lemma 2. In the case that and or , we have
(31) 
Thus, (29) becomes
(32) 
Regarding the term that includes the thermal noise, after applying Lemma 2 we have
(33) 
Appendix C Proof of Theorem 1
The proof starts by finding first the conditional coverage probability on as
(34) 
Hence, we focus on the derivation of . Specifically, we propose an approximation for the out-of-cell interference, described by for all users in each cell, i.e., . This approximation allows the decoupling of the correlated terms and results in a tractable evaluation of . In particular, by approximating the out-of-cell interference with its mean, we have [45]
(35)  
(36) 
where in (35), we made the following substitution
Moreover, (36) is obtained by means of Campbell’s theorem [30] and the exclusion ball model, described in Sec. II as