# On the Capacity of Cloud Radio Access Networks with Oblivious Relaying

Iñaki Estella Aguerri    Abdellatif Zaidi   Giuseppe Caire   Shlomo Shamai (Shitz)
The results in this paper have been partially presented in [1]. Iñaki Estella Aguerri is with the Mathematical and Algorithmic Sciences Lab, Huawei France Research Center, 92100 Boulogne-Billancourt, France. Abdellatif Zaidi is with Université Paris-Est, France, and currently on leave at the Mathematical and Algorithmic Sciences Laboratory, Huawei Paris Research Center, 92100 Boulogne-Billancourt, France. Giuseppe Caire is with the Technische Universität Berlin, 10587 Berlin, Germany. Shlomo Shamai (Shitz) is with the Technion Institute of Technology, Technion City, Haifa 32000, Israel. The work of G. Caire is supported by an Alexander von Humboldt Professorship. The work of S. Shamai has been supported by the European Union’s Horizon 2020 Research and Innovation Programme, grant agreement no. 694630, and partly by the US-Israel Binational Science Foundation (BSF). Emails: {inaki.estella@huawei.com, abdellatif.zaidi@u-pem.fr, caire@tu-berlin.de, sshlomo@ee.technion.ac.il}.
###### Abstract

We study the transmission over a network in which users send information to a remote destination through relay nodes that are connected to the destination via finite-capacity error-free links, i.e., a cloud radio access network. The relays are constrained to operate without knowledge of the users’ codebooks, i.e., they perform oblivious processing. The destination, or central processor, however, is informed about the users’ codebooks. We establish a single-letter characterization of the capacity region of this model for a class of discrete memoryless channels in which the outputs at the relay nodes are independent given the users’ inputs. We show that both relaying à-la Cover-El Gamal, i.e., compress-and-forward with joint decompression and decoding, and “noisy network coding”, are optimal. The proof of the converse part establishes, and utilizes, connections with the Chief Executive Officer (CEO) source coding problem under logarithmic loss distortion measure. Extensions to general discrete memoryless channels are also investigated. In this case, we establish inner and outer bounds on the capacity region. For memoryless Gaussian channels within the studied class of channels, we characterize the capacity region when the users are constrained to time-share among Gaussian codebooks. We also discuss the suboptimality of separate decompression-decoding and the role of time-sharing. Furthermore, we study the related distributed information bottleneck problem and characterize optimal tradeoffs between rates (i.e., complexity) and information (i.e., accuracy) in the vector Gaussian case.

## I Introduction

Cloud radio access networks (CRAN) provide a new architecture for next-generation wireless cellular systems in which base stations (BSs) are connected to a cloud-computing central processor (CP) via error-free finite-rate fronthaul links. This architecture is generally seen as an efficient means to increase spectral efficiency in cellular networks by enabling joint processing of the signals received by multiple BSs at the CP and, so, possibly alleviating the effect of interference. Other advantages include low cost deployment and flexible network utilization [2].

In a CRAN, each BS acts essentially as a relay node; and so it can in principle implement any relaying strategy, e.g., decode-and-forward [3, Theorem 1], compress-and-forward [3, Theorem 6], or combinations of them. Relaying strategies in CRANs can be divided roughly into two classes: i) strategies that require the relay nodes to know the users’ codebooks (i.e., modulation, coding), such as decode-and-forward, compute-and-forward [4, 5, 6] or variants thereof, and ii) strategies in which the relay nodes operate without knowledge of the users’ codebooks, often referred to as oblivious relay processing (or nomadic transmission) [7, 8, 9]. This second class is composed essentially of strategies in which the relays implement forms of compress-and-forward [3], such as successive Wyner-Ziv compression [10, 11, 12], quantize-map-and-forward [13] or noisy network coding [14]. Schemes that combine the two approaches have been shown to possibly outperform the best of the two [15], especially in scenarios in which there are more users than relay nodes.

In essence, however, a CRAN architecture is usually envisioned as one in which BSs operate as simple radio units (RUs) that are constrained to implement only radio functionalities, such as analog-to-digital conversion and filtering, while the baseband functionalities are migrated to the CP. For this reason, while relaying schemes that involve partial or full decoding of the users’ codewords can sometimes offer rate gains, they do not seem to be suitable in practice. In fact, such schemes assume that all or a subset of the relay nodes are fully aware (at all times!) of the codebooks and encoding operations used by the users. The signaling required to enable such awareness is generally prohibitive, particularly as the network size gets large. Instead, schemes in which relay nodes perform oblivious processing are preferred in practice. Oblivious processing was first introduced in [7]. The basic idea is that of using randomized encoding to model lack of information about codebooks. For related works, the reader may refer to [8, 16] and [17]. In particular, [8] extends the original definition of oblivious processing of [7], which rules out time-sharing, to include settings in which transmitters are allowed to switch among different codebooks and the relay nodes, while unaware of the codebooks, are given, or can acquire, time- or frequency-schedule information (typically, this information is small, e.g., 1 bit that captures on/off activity; and, so, obtaining it is generally much less demanding than obtaining full information about the users’ codebooks). The framework is referred to therein as “oblivious processing with enabled time-sharing”.

In this work, we consider transmission over a CRAN in which the relay nodes are constrained to operate without knowledge of the users’ codebooks, i.e., are oblivious, and only know time- or frequency-sharing information. The model is shown in Figure 1. Focusing on a class of discrete memoryless channels in which the relay outputs are independent conditionally on the users’ inputs, we establish a single-letter characterization of the capacity region of this class of channels. We show that both relaying à-la Cover-El Gamal, i.e., compress-and-forward with joint decompression and decoding, and noisy network coding are optimal. For the proof of the converse part, we utilize useful connections with the Chief Executive Officer (CEO) source coding problem under logarithmic loss distortion measure [18]. Extensions to general discrete memoryless channels are also investigated. In this case, we establish inner and outer bounds on the capacity region. For memoryless Gaussian channels within the studied class, we provide a full characterization of the capacity region under Gaussian signaling, i.e., when the users’ channel inputs are restricted to be Gaussian. In doing so, we also investigate the role of time-sharing. Finally, leveraging the connection with the information bottleneck method (IB) [19] (see [20] for an earlier equivalent formulation of the IB problem), we study the problem of distributed information bottleneck (D-IB) in which multiple sensors compress separately their observations in a manner that, collectively, the compressed signals provide as much information as possible about a remote (or hidden) source. For this model, we characterize optimal tradeoffs among the minimum description lengths at which the observed signals are described (i.e., complexity) and the information that the produced descriptions collectively preserve about the remote source (i.e., accuracy or relevant information). 
This is captured through the model’s information-rate region, which we establish here for both discrete memoryless and memoryless vector Gaussian cases. The result in the Gaussian case generalizes that developed by Tishby et al. [21] for the single-encoder Gaussian information bottleneck method to the case of multiple encoders. Since the single-encoder IB method has found application in various contexts of learning and prediction [22], such as word clustering for text classification [23], community detection [24], neural code analysis [25], speech recognition [26] and others, distributed IB methods clearly find usefulness in the extensions of those applications to the distributed case. Blahut-Arimoto type algorithms that allow computing optimal tradeoffs between rate and information have recently been developed in [27]. The reader may refer to [28, 29, 30, 31, 32] for other related works.

### Outline and Notation

The rest of this paper is organized as follows. Section II provides a formal description of the model, as well as some definitions that are related to it. Section III contains the main result of this paper, which is a single-letter characterization of the capacity region of a class of discrete memoryless CRANs with oblivious processing at relays and enabled time-sharing in which the channel outputs at the relay nodes are independent conditionally on the users’ channel inputs. This section also provides inner and outer bounds on the capacity region of general discrete memoryless CRANs with constrained relays, as well as some discussions on the suboptimality of successive decompression and decoding and the role of time-sharing. In Section IV, we study a memoryless vector Gaussian CRAN model with oblivious processing at relays and enabled time-sharing, for which we characterize the capacity region under Gaussian signaling. Finally, in Section V, we characterize the rate-information region of the vector Gaussian distributed information bottleneck problem.

Throughout this paper, we use the following notation. Upper case letters denote random variables, e.g., $X$; lower case letters denote their realizations, e.g., $x$; and calligraphic letters denote sets, e.g., $\mathcal{X}$. The cardinality of a set $\mathcal{X}$ is denoted by $|\mathcal{X}|$. The length-$n$ sequence $(X_1,\ldots,X_n)$ is denoted as $X^n$; and, for integers $j$ and $k$ such that $1 \le k \le j \le n$, the sub-sequence $(X_k,\ldots,X_j)$ is denoted as $X_k^j$. Probability mass functions (pmfs) are denoted by $p_X(x) = \Pr\{X = x\}$; or, for short, as $p(x)$. Boldface upper case letters denote vectors or matrices, e.g., $\mathbf{X}$, where context should make the distinction clear. For an integer $L \ge 1$, we denote the set of integers smaller than or equal to $L$ as $\mathcal{L} = \{1,\ldots,L\}$. Sometimes, this set will also be denoted as $[1:L]$. For a set of integers $\mathcal{K} \subseteq \mathcal{L}$, the notation $X_{\mathcal{K}}$ designates the set of random variables with indices in the set $\mathcal{K}$, i.e., $X_{\mathcal{K}} = \{X_k : k \in \mathcal{K}\}$. We denote the covariance of a zero-mean vector $\mathbf{X}$ by $\boldsymbol{\Sigma}_{\mathbf{x}} = \mathrm{E}[\mathbf{X}\mathbf{X}^H]$; $\boldsymbol{\Sigma}_{\mathbf{x},\mathbf{y}}$ is the cross-correlation $\boldsymbol{\Sigma}_{\mathbf{x},\mathbf{y}} = \mathrm{E}[\mathbf{X}\mathbf{Y}^H]$; and the conditional correlation matrix of $\mathbf{X}$ given $\mathbf{Y}$ is $\boldsymbol{\Sigma}_{\mathbf{x}|\mathbf{y}} = \boldsymbol{\Sigma}_{\mathbf{x}} - \boldsymbol{\Sigma}_{\mathbf{x},\mathbf{y}} \boldsymbol{\Sigma}_{\mathbf{y}}^{-1} \boldsymbol{\Sigma}_{\mathbf{y},\mathbf{x}}$.
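The conditional correlation matrix defined above is the Schur complement of $\boldsymbol{\Sigma}_{\mathbf{y}}$ in the joint covariance; a minimal numpy sketch (with an arbitrary example covariance) illustrates the definition and the fact that the result is always positive semi-definite:

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((4, 4))
Sigma = A @ A.T          # joint covariance of the stacked vector (X, Y)
Sx = Sigma[:2, :2]       # Sigma_x
Sy = Sigma[2:, 2:]       # Sigma_y
Sxy = Sigma[:2, 2:]      # Sigma_{x,y}
# Conditional correlation matrix: Sigma_{x|y} = Sigma_x - Sigma_{x,y} Sigma_y^{-1} Sigma_{y,x}
Sx_given_y = Sx - Sxy @ np.linalg.inv(Sy) @ Sxy.T
# As a Schur complement of a PSD matrix, Sigma_{x|y} is itself PSD.
assert np.all(np.linalg.eigvalsh(Sx_given_y) >= -1e-9)
```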

## II System Model

Consider the discrete memoryless (DM) CRAN model shown in Figure 1. In this model, $L$ users communicate with a common destination, or central processor (CP), through $K$ relay nodes, where $L \ge 1$ and $K \ge 1$. Relay node $k$, $k \in \mathcal{K}$, is connected to the CP via an error-free finite-rate fronthaul link of capacity $C_k$. In what follows, we let $\mathcal{L} = \{1,\ldots,L\}$ and $\mathcal{K} = \{1,\ldots,K\}$ indicate the set of users and relays, respectively.

Similar to [8], the relay nodes are constrained to operate without knowledge of the users’ codebooks and only know a time-sharing sequence $Q^n$, i.e., a set of time instants at which users switch among different codebooks. The obliviousness of the relay nodes to the actual codebooks of the users is modeled via the notion of randomized encoding [7] (see also [33] for an earlier introduction of this notion in the context of coding for channels with unknown states). That is, users or transmitters select their codebooks at random and the relay nodes are not informed about the currently selected codebooks, while the CP is given such information. Specifically, in this setup, user $l$, $l \in \mathcal{L}$, sends codewords $x_l^n$ that depend not only on the message $M_l$ of rate $R_l$ that is to be transmitted to the CP by the user and the time-sharing sequence $q^n$, but also on the index $F_l$ of the codebook selected by this user. This codebook index runs over all possible codebooks of the given rate $R_l$, i.e., $F_l \in [1:|\mathcal{X}_l|^{n2^{nR_l}}]$, and is unknown to the relay nodes. The CP, however, knows all indices of the currently selected codebooks by the users. Also, it is assumed that all terminals know the time-sharing sequence.

### II-A Formal Definitions

The discrete memoryless CRAN model with oblivious relay processing and enabled time-sharing that we study in this paper is defined as follows.

1. Messages and Codebooks: Transmitter $l$, $l \in \mathcal{L}$, sends message $M_l \in [1:2^{nR_l}]$ to the CP using a codebook from a set of codebooks that is indexed by $F_l \in [1:|\mathcal{X}_l|^{n2^{nR_l}}]$. The index $F_l$ is picked at random and shared with the CP, but not the relays.

2. Time-sharing sequence: All terminals, including the relay nodes, are aware of a time-sharing sequence $Q^n$, distributed as $p_{Q^n}(q^n) = \prod_{i=1}^n p_Q(q_i)$ for a pmf $p_Q(q)$.

3. Encoding functions: The encoding function at user $l$, $l \in \mathcal{L}$, is defined by a pair $(p_{X_l|Q}, \phi_l)$, where $p_{X_l|Q}$ is a single-letter pmf and $\phi_l$ is a mapping that assigns the given codebook index $f_l$, message $m_l$ and time-sharing sequence $q^n$ to a channel input $x_l^n = \phi_l(f_l, m_l, q^n)$. Conditioned on a time-sharing sequence $Q^n = q^n$, the probability of selecting a codebook $f_l$ is given by

 p_{F_l|Q^n}(f_l|q^n) = \prod_{m_l \in [1:2^{nR_l}]} p_{X_l^n|Q^n}\big(\phi_l(f_l,m_l,q^n)\,\big|\,q^n\big), (1)

where $p_{X_l^n|Q^n}(x_l^n|q^n) = \prod_{i=1}^n p_{X_l|Q}(x_{l,i}|q_i)$ for some given conditional pmf $p_{X_l|Q}(x_l|q)$.

4. Relaying functions: The relay nodes receive the outputs of a memoryless interference channel defined by

 p_{Y_{\mathcal{K}}^n|X_{\mathcal{L}}^n}(y_{\mathcal{K}}^n|x_{\mathcal{L}}^n) = \prod_{i=1}^n p_{Y_{\mathcal{K}}|X_{\mathcal{L}}}(y_{\mathcal{K},i}|x_{\mathcal{L},i}). (2)

Relay node $k$, $k \in \mathcal{K}$, is unaware of the codebook indices $F_1,\ldots,F_L$, and maps its received channel output $y_k^n$ into an index $J_k \in [1:2^{nC_k}]$. The index $J_k$ is then sent to the CP over the error-free link of capacity $C_k$.

5. Decoding function: Upon receiving the indices $J_1,\ldots,J_K$, the CP estimates the users’ messages as

 (\hat{M}_1,\ldots,\hat{M}_L) = g(F_1,\ldots,F_L,J_1,\ldots,J_K,Q^n), (3)

where

 g: [1:|\mathcal{X}_1|^{n2^{nR_1}}] \times \cdots \times [1:|\mathcal{X}_L|^{n2^{nR_L}}] \times [1:2^{nC_1}] \times \cdots \times [1:2^{nC_K}] \times \mathcal{Q}^n \to [1:2^{nR_1}] \times \cdots \times [1:2^{nR_L}] (4)

is the decoding function at the CP.
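As an illustration of the relaying functions in item 4, the following sketch implements one hypothetical choice: a per-letter scalar quantizer that maps a received block $y^n$ to a single fronthaul index, without using any codebook information. The quantization levels and the rate are assumptions chosen for illustration only:

```python
import numpy as np

def relay_map(y, levels):
    """Oblivious relaying: quantize each received sample to the nearest
    level, then pack the per-sample level indices into one fronthaul index
    in [0 : len(levels)**n - 1]. No codebook knowledge is used."""
    q = np.argmin(np.abs(y[:, None] - levels[None, :]), axis=1)
    idx = 0
    for s in q:                      # base-|levels| expansion of the index
        idx = idx * len(levels) + int(s)
    return idx

levels = np.array([-1.5, -0.5, 0.5, 1.5])  # 2 bits per sample, i.e., C = 2
y = np.array([0.3, -1.2, 0.9])             # a length-3 received block
j = relay_map(y, levels)
assert 0 <= j < len(levels) ** len(y)
```

Note that this mapping treats samples independently; the Wyner-Ziv compression used in the achievability schemes below additionally exploits correlation with the other relays' outputs.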

###### Definition 1.

A code for the studied DM CRAN model with oblivious relay processing and enabled time-sharing consists of $L$ encoding functions $(p_{X_l|Q}, \phi_l)$, $l \in \mathcal{L}$, $K$ relaying functions mapping $y_k^n$ to $J_k$, $k \in \mathcal{K}$, and a decoding function $g$.

###### Definition 2.

A rate tuple $(R_1,\ldots,R_L)$ is said to be achievable if, for any $\epsilon > 0$, there exists a sequence of codes such that

 \Pr\{(M_1,\ldots,M_L) \neq (\hat{M}_1,\ldots,\hat{M}_L)\} \le \epsilon, (5)

where the probability is taken with respect to a uniform distribution of the messages $M_l$, $l \in \mathcal{L}$, and with respect to independent codebook indices $F_l$, $l \in \mathcal{L}$, whose joint distribution, conditioned on the time-sharing sequence, is given by the product of (1).

For given individual fronthaul constraints $C_1,\ldots,C_K$, the capacity region $\mathcal{C}(C_1,\ldots,C_K)$ is the closure of all achievable rate tuples $(R_1,\ldots,R_L)$.

In this work, we are interested in characterizing the capacity region $\mathcal{C}(C_1,\ldots,C_K)$.

### II-B Some Useful Implications

As shown in [8], the above constraint of oblivious relay processing with enabled time-sharing means that, in the absence of information regarding the indices $F_{\mathcal{L}}$ and the messages $M_{\mathcal{L}}$, a codeword taken from a codebook has independent but non-identically distributed entries.

###### Lemma 1.

Without knowledge of the selected codebook indices $F_{\mathcal{L}}$, the distribution of the transmitted codewords, conditioned on the time-sharing sequence, is given by

 \Pr\{X_l^n(F_l,M_l,Q^n) = x_l^n \,|\, Q^n = q^n\} = \prod_{i=1}^n p_{X_l|Q}(x_{l,i}|q_i). (6)

Thus, the channel output at relay $k \in \mathcal{K}$ is distributed as

 p_{Y_k^n|Q^n}(y_k^n|q^n) = \prod_{i=1}^n \sum_{x_{1,i},\ldots,x_{L,i}} p_{Y_k|X_{\mathcal{L}}}(y_{k,i}|x_{\mathcal{L},i}) \prod_{l=1}^L p_{X_l|Q}(x_{l,i}|q_i).
###### Proof.

The proof of this lemma, whose result was also used in [8], is along the lines of that of [7, Lemma 1] and is therefore omitted for brevity. ∎

###### Remark 1.

Equation (6) states that, when averaged over the probability of selecting a codebook and over the uniform distribution of the message set, but conditioned on the time-sharing sequence $Q^n$, the transmitted codeword $X_l^n$ has a pmf given by a product distribution of independent but non-identically distributed entries. That is, in the absence of codebook information, the codewords lack structure. When a node is informed of the codebook index $F_l$, the codebook structure is provided by the selected codebook.
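Lemma 1 can be checked empirically: drawing the codebook entries i.i.d. and the message uniformly, the transmitted symbols are marginally i.i.d. according to the input pmf. A Monte Carlo sketch with an assumed binary alphabet, no time-sharing, and small illustrative sizes:

```python
import numpy as np

rng = np.random.default_rng(1)
pX, n, M, trials = 0.3, 4, 8, 20000  # P(X=1), blocklength, codebook size, runs
counts = np.zeros(n)
for _ in range(trials):
    # Draw a fresh random codebook with entries i.i.d. Bernoulli(pX) ...
    codebook = rng.random((M, n)) < pX
    # ... and a uniformly distributed message; transmit that codeword.
    m = rng.integers(M)
    counts += codebook[m]
freq = counts / trials
# Each position of the transmitted codeword is approximately Bernoulli(pX),
# in agreement with the product form in (6).
assert np.all(np.abs(freq - pX) < 0.03)
```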

## III Discrete Memoryless Model

### III-A Capacity Region of a Class of CRANs

In this section, we establish a single-letter characterization of the capacity region of a class of discrete memoryless CRANs with oblivious relay processing and enabled time-sharing in which the channel outputs at the relay nodes are independent conditionally on the users’ inputs. Specifically, consider the following class of DM CRANs in which equation (2) factorizes as

 p_{Y_{\mathcal{K}}^n|X_{\mathcal{L}}^n}(y_{\mathcal{K}}^n|x_{\mathcal{L}}^n) = \prod_{i=1}^n \prod_{k=1}^K p_{Y_k|X_{\mathcal{L}}}(y_{k,i}|x_{\mathcal{L},i}). (7)

Equation (7) is equivalent to requiring that, for all $i \in [1:n]$ and all $k \in \mathcal{K}$,

 Y_{k,i} \minuso X_{\mathcal{L},i} \minuso Y_{\mathcal{K}\setminus k,i} (8)

forms a Markov chain in this order. The following theorem provides the capacity region of this class of channels.

###### Theorem 1.

For the class of DM CRANs with oblivious relay processing and enabled time-sharing for which (8) holds, the capacity region is given by the union of all rate tuples $(R_1,\ldots,R_L)$ which satisfy

 \sum_{t\in\mathcal{T}} R_t \le \sum_{s\in\mathcal{S}}\big[C_s - I(Y_s;U_s|X_{\mathcal{L}},Q)\big] + I(X_{\mathcal{T}};U_{\mathcal{S}^c}|X_{\mathcal{T}^c},Q), (9)

for all non-empty subsets $\mathcal{T} \subseteq \mathcal{L}$ and all $\mathcal{S} \subseteq \mathcal{K}$, for some joint measure of the form

 p(q)\prod_{l=1}^L p(x_l|q)\prod_{k=1}^K p(y_k|x_{\mathcal{L}})\prod_{k=1}^K p(u_k|y_k,q). (10)
###### Proof.

The proof of Theorem 1 appears in Appendix A. ∎
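To get a feel for the bound (9), the region can be evaluated numerically for a toy instance with one user, two relays, binary symmetric observation channels (so that (8) holds), and binary symmetric test channels $p(u_k|y_k)$; all parameter values below are assumptions chosen purely for illustration:

```python
from itertools import product
from math import log2

p = (0.1, 0.2)    # relay observation BSC crossover probabilities (assumed)
t = (0.05, 0.05)  # test-channel p(u_k|y_k) crossover probabilities (assumed)
C = (0.5, 0.5)    # fronthaul capacities in bits per channel use (assumed)

def bsc(a, b, eps):
    return 1 - eps if a == b else eps

# Joint pmf over (x, y1, y2, u1, u2); it factorizes as in (10), Q constant.
joint = {}
for x, y1, y2, u1, u2 in product((0, 1), repeat=5):
    joint[(x, y1, y2, u1, u2)] = (0.5 * bsc(x, y1, p[0]) * bsc(x, y2, p[1])
                                  * bsc(y1, u1, t[0]) * bsc(y2, u2, t[1]))

def H(idx):  # joint entropy of the variables at positions idx
    marg = {}
    for k, v in joint.items():
        key = tuple(k[i] for i in idx)
        marg[key] = marg.get(key, 0.0) + v
    return -sum(v * log2(v) for v in marg.values() if v > 0)

def I(a, b, c=()):  # I(A;B|C) = H(A,C) + H(B,C) - H(A,B,C) - H(C)
    return H(a + c) + H(b + c) - H(a + b + c) - H(c)

# Positions: 0 -> x, 1 -> y1, 2 -> y2, 3 -> u1, 4 -> u2.
# Bound (9) with L = 1: R <= sum_{s in S}[C_s - I(Y_s;U_s|X)] + I(X;U_{S^c}).
bounds = []
for S in [(), (0,), (1,), (0, 1)]:
    Sc = tuple(k for k in (0, 1) if k not in S)
    rhs = sum(C[s] - I((1 + s,), (3 + s,), (0,)) for s in S)
    rhs += I((0,), tuple(3 + k for k in Sc))
    bounds.append(rhs)
print("rate achievable with this choice of p(u|y):", min(bounds))
```

Optimizing over the test-channel parameters (and, in general, over time-sharing) would then trace out the capacity region of Theorem 1 for this instance.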

###### Remark 2.

Our main contribution in Theorem 1 is the proof of the converse part. As mentioned in Appendix A, the direct part of Theorem 1 can be obtained by a coding scheme in which each relay node compresses its channel output by using Wyner-Ziv binning [34] to exploit the correlation with the channel outputs at the other relays, and forwards the bin index to the CP over its rate-limited link. The CP jointly decodes the compression indices (within the corresponding bins) and the transmitted messages, i.e., Cover-El Gamal compress-and-forward [3, Theorem 3] with joint decompression and decoding (CF-JD). (The rate region achievable by this scheme for a general DM CRAN, i.e., without the Markov chain (8), is given by Theorem 2.) Alternatively, the rate region of Theorem 1 can also be obtained by a direct application of the noisy network coding (NNC) scheme of [14, Theorem 1]. The reader may find it useful to observe that performing the two operations of decompression and decoding jointly in the scheme CF-JD is critical to achieving the full rate region of Theorem 1: if the CP first jointly decodes the compression indices and then jointly decodes the users’ messages, i.e., the two operations are performed successively, the resulting region is generally strictly suboptimal. A similar observation can be found in [12].

###### Remark 3.

A key element in the proof of the converse part of Theorem 1 is the connection with the Chief Executive Officer (CEO) source coding problem. (Because the relay nodes are connected to the CP through error-free finite-rate links, the scenario, as seen by the relay nodes, is similar to one in which a remote vector source needs to be compressed distributively and conveyed to a single decoder. There are important differences, however, as the vector source is not i.i.d. here but given by a codebook that is subject to design.) While the characterization of the optimal rate-distortion region of this problem for general distortion measures has eluded information theorists for more than four decades, a characterization of the optimal region in the case of the logarithmic loss distortion measure has been provided recently in [18]. A key step in [18] is that the log-loss distortion measure admits a lower bound in the form of the entropy of the source conditioned on the decoder's input. Leveraging this result, in our converse proof of Theorem 1 we derive a single-letter upper bound on the entropy of the channel inputs conditioned on the indices that are sent by the relays, in the absence of knowledge of the codebook indices $F_{\mathcal{L}}$ (cf. the step (56) in Appendix A). The connection with the CEO problem is discussed further in Section V.

###### Remark 4.

In the special case in which each relay output depends on one user's input only, i.e., the memoryless channel (7) is such that each $Y_k$ is a noisy version of $X_k$ alone, the source coding counterpart of the problem treated in this section reduces to a distributed source coding setting with independent sources (recall that the users' input symbols are independent here) under the logarithmic loss distortion measure. Note that, for general, i.e., arbitrarily correlated, sources, the problem appears to be of remarkable complexity, and is still to be solved. In fact, the Berger-Tung coding scheme [35] can be suboptimal in this case, as it is known to be for Körner-Marton's modulo-two adder problem [36].

### III-B Inner and Outer Bounds for the General DM CRAN Model

In this section, we study the general DM CRAN model (2). That is, the Markov chains given by (8) are not necessarily assumed to hold. In this case, we establish inner and outer bounds on the capacity region that do not coincide in general. The bounds extend those of [7], which are established therein for a setup with a single transmitter and no time-sharing, to the case of multiple transmitters and enabled time-sharing.

The following theorem provides an inner bound on the capacity region of the general DM CRAN model (2) with oblivious relay processing and time-sharing.

###### Theorem 2.

For the general DM CRAN model (2) with oblivious relay processing and enabled time-sharing, the achievable rate region of the scheme CF-JD is given by the union of all rate tuples $(R_1,\ldots,R_L)$ that satisfy, for all non-empty subsets $\mathcal{T} \subseteq \mathcal{L}$ and all $\mathcal{S} \subseteq \mathcal{K}$,

 \sum_{t\in\mathcal{T}} R_t \le \sum_{s\in\mathcal{S}} C_s - I(Y_{\mathcal{S}};U_{\mathcal{S}}|X_{\mathcal{L}},U_{\mathcal{S}^c},Q) + I(X_{\mathcal{T}};U_{\mathcal{S}^c}|X_{\mathcal{T}^c},Q), (11)

for some joint measure of the form

 p(q)\prod_{l=1}^L p(x_l|q)\, p(y_{\mathcal{K}}|x_{\mathcal{L}})\prod_{k=1}^K p(u_k|y_k,q). (12)
###### Proof.

The proof of Theorem 2 appears in Appendix B. ∎

###### Remark 5.

The coding scheme that we employ for the proof of Theorem 2, which we denote by compress-and-forward with joint decompression and decoding (CF-JD), is one in which every relay node compresses its output à-la Cover-El Gamal compress-and-forward [3, Theorem 3]. The CP jointly decodes the compression indices and users’ messages. The scheme, as detailed in Appendix B, generalizes [7, Theorem 3] to the case of multiple users and enabled time-sharing.

We now provide an outer bound on the capacity region of the general DM CRAN model with oblivious relay processing and time-sharing. The following theorem states the result.

###### Theorem 3.

For the general DM CRAN model (2) with oblivious relay processing and enabled time-sharing, if a rate tuple $(R_1,\ldots,R_L)$ is achievable, then for all non-empty subsets $\mathcal{T} \subseteq \mathcal{L}$ and $\mathcal{S} \subseteq \mathcal{K}$ it holds that

 \sum_{t\in\mathcal{T}} R_t \le \sum_{s\in\mathcal{S}} C_s - I(Y_{\mathcal{S}};U_{\mathcal{S}}|X_{\mathcal{L}},U_{\mathcal{S}^c},Q) + I(X_{\mathcal{T}};U_{\mathcal{S}^c}|X_{\mathcal{T}^c},Q), (13)

for some $(Q, W, X_{\mathcal{L}}, Y_{\mathcal{K}})$ distributed according to

 p(q)\prod_{l=1}^L p(x_l|q)\, p(y_{\mathcal{K}}|x_{\mathcal{L}})\, p(w|q), (14)

where $U_k = f_k(Y_k, W)$ for $k \in \mathcal{K}$; for some random variable $W$ and deterministic functions $f_k$, for $k \in \mathcal{K}$.

###### Proof.

The proof of Theorem 3 appears in Appendix C. ∎

###### Remark 6.

The inner bound of Theorem 2 and the outer bound of Theorem 3 do not coincide in general. This is because in Theorem 2 the auxiliary random variables satisfy the Markov chains $U_k \minuso (Y_k, Q) \minuso (X_{\mathcal{L}}, Y_{\mathcal{K}\setminus k}, U_{\mathcal{K}\setminus k})$, while in Theorem 3 each $U_k$ is a function of $Y_k$ but also of a “common” random variable $W$. In particular, these Markov chains do not necessarily hold for the auxiliary random variables of the outer bound.

###### Remark 7.

As we already mentioned, the class of DM CRAN models satisfying (8) connects with the CEO problem under logarithmic loss distortion measure. The rate-distortion region of this problem is characterized in the excellent contribution [18] for an arbitrary number of (source) encoders (see [18, Theorem 3] therein). For general DM CRAN channels, i.e., without the Markov chain (8), the model connects with the distributed source coding problem under logarithmic loss distortion measure. While a solution of the latter problem for the case of two encoders has been found in [18, Theorem 6], generalizing the result to an arbitrary number of encoders poses a significant challenge. In fact, as also mentioned in [18], the Berger-Tung inner bound is known to be generally suboptimal (e.g., see the Körner-Marton lossless modulo-sum problem [36]). Characterizing the capacity region of the general DM CRAN model under the constraint of oblivious relay processing and enabled time-sharing poses a similar challenge, even for the case of two relays. Finally, we mention that in the context of multi-terminal distributed source coding with general distortion measures, an outer bound has been derived in [37]; and it is shown to be tight in certain cases. The proof technique therein is based on introducing a random source such that the observations at the encoders are conditionally independent given it, i.e., a Markov chain similar to that in (8) holds. Note, however, that the connection of the outer bound that we develop here for the uplink CRAN model with oblivious relay processing with that of [37] is only high-level in nature, as the proof techniques are different.

### III-C On the Suboptimality of Separate Decompression-Decoding and the Role of Time-Sharing

For the general DM CRAN model (2), the scheme CF-JD of Theorem 2 is based on a joint decoding of the compression indices and users’ messages. That is, the CP performs the operations of the decoding of the quantization codewords and the decoding of the users’ messages simultaneously. A more practical strategy, considered also in [7] and [12], consists in having the CP first decode the quantization codewords (jointly), and then decode the users’ messages (jointly). That is, compress-and-forward with separate decompression and decoding operations. In what follows, we refer to such a scheme as CF-SD. The following proposition provides the rate-region allowed by this scheme for the DM CRAN model (2).

###### Proposition 1.

([7, Theorem 1]) For the general DM CRAN model (2) with oblivious relay processing and enabled time-sharing, the achievable rate region of the scheme CF-SD is the union of all rate tuples $(R_1,\ldots,R_L)$ that satisfy, for all non-empty $\mathcal{T} \subseteq \mathcal{L}$ and $\mathcal{S} \subseteq \mathcal{K}$,

 \sum_{t\in\mathcal{T}} R_t \le I(X_{\mathcal{T}};U_{\mathcal{K}}|X_{\mathcal{T}^c},Q) (15a)
 \sum_{s\in\mathcal{S}} C_s \ge I(U_{\mathcal{S}};Y_{\mathcal{S}}|U_{\mathcal{S}^c},Q), (15b)

for some joint measure of the form (12).

It is clear that the rate region of Proposition 1 is contained in that of Theorem 2.

As a special instance of the scheme CF-SD, we consider compress-and-forward with successive separate decompression-decoding, which performs sequential decoding of the quantization codewords first, followed by sequential decoding of the users’ messages. More specifically, let $\pi_r$ and $\pi_u$ be two permutations defined on the set of quantization codewords and the set of user message codewords, respectively. An outline of this scheme, which we denote as CF-SSD, is as follows. The relay nodes compress their outputs sequentially, starting with relay node $\pi_r(1)$. In doing so, they utilize Wyner-Ziv binning [34], i.e., relay node $\pi_r(k)$, $k \in \mathcal{K}$, quantizes its channel output into a description $U_{\pi_r(k)}$ taking into account $(U_{\pi_r(1)},\ldots,U_{\pi_r(k-1)})$ as decoder side information. The CP first recovers the quantization codewords in the same order, and then decodes the users’ messages sequentially, in the order indicated by $\pi_u$, starting with user $\pi_u(1)$. That is, the codeword of user $\pi_u(l)$, $l \in \mathcal{L}$, is estimated using all compression codewords $U_{\mathcal{K}}$ as well as the previously decoded user codewords $(X_{\pi_u(1)},\ldots,X_{\pi_u(l-1)})$. The rate region obtained with a given decoding order $(\pi_r, \pi_u)$, as well as that of the scheme CF-SSD, obtained by considering all possible permutations, are given in the following proposition.

###### Proposition 2.

For the general DM CRAN model (2) with oblivious relay processing and enabled time-sharing, the achievable rate region $\mathcal{R}_{\text{CF-SSD}}(\pi_r,\pi_u)$ of the scheme CF-SSD with decoding order $(\pi_r,\pi_u)$ is the union of all rate tuples $(R_1,\ldots,R_L)$ that satisfy, for all $l \in \mathcal{L}$ and $k \in \mathcal{K}$,

 R_{\pi_u(l)} \le I(X_{\pi_u(l)};U_{\mathcal{K}}\,|\,X_{\pi_u(1)},\ldots,X_{\pi_u(l-1)},Q) (16a)
 C_{\pi_r(k)} \ge I(U_{\pi_r(k)};Y_{\pi_r(k)}\,|\,U_{\pi_r(1)},\ldots,U_{\pi_r(k-1)},Q), (16b)

for some joint measure of the form (12). The rate region achievable by the scheme CF-SSD is defined as the union of the regions $\mathcal{R}_{\text{CF-SSD}}(\pi_r,\pi_u)$ over all possible permutations $\pi_r$ and $\pi_u$, i.e.,

 \mathcal{R}_{\text{CF-SSD}} = \bigcup_{\pi_r,\pi_u} \mathcal{R}_{\text{CF-SSD}}(\pi_r,\pi_u). (17)

While successive separate decompression and decoding results in a rate region that is generally strictly smaller than that of joint decoding, i.e., of CF-JD, in what follows we show that the maximum sum-rate achievable by this specific separate decompression-decoding is the same as that achieved by joint decoding. That is, the schemes CF-SSD and CF-JD achieve the same sum-rate (and, therefore, so does the scheme CF-SD). Specifically, let the maximum sum-rate achieved by the scheme CF-JD be defined as

 R_{\text{sum, CF-JD}} = \max \sum_{l=1}^{L} R_l \quad \text{s.t.} \quad (R_1,\ldots,R_L) \in \mathcal{R}_{\text{CF-JD}}.

Similarly, let the maximum sum rate for the scheme CF-SD be defined as

 R_{\text{sum, CF-SD}} = \max \sum_{l=1}^{L} R_l \quad \text{s.t.} \quad (R_1,\ldots,R_L) \in \mathcal{R}_{\text{CF-SD}},

and that of the scheme CF-SSD defined as

 R_{\text{sum, CF-SSD}} = \max \sum_{l=1}^{L} R_l \quad \text{s.t.} \quad (R_1,\ldots,R_L) \in \mathcal{R}_{\text{CF-SSD}}.
###### Theorem 4.

For the general DM CRAN model (2) with oblivious relay processing and enabled time-sharing in Figure 1, we have

 R_{\text{sum, CF-JD}} = R_{\text{sum, CF-SD}} = R_{\text{sum, CF-SSD}}. (18)
###### Proof.

The proof of Theorem 4 appears in Appendix D. ∎

###### Remark 8.

The proof of Theorem 4 uses properties of submodular optimization; and is similar to that of [12, Theorem 2] which shows that CF-JD and CF-SD achieve the same sum-rate for the class of CRANs that satisfy (8). Thus, in a sense, Theorem 4 can be thought of as a generalization of [12, Theorem 2] to the case of general channels (2).

###### Remark 9.

Theorem 4 shows that the three schemes CF-JD, CF-SD and CF-SSD achieve the same sum-rate and that, in general, the use of time-sharing is required for the three schemes to achieve the maximum sum-rate. Note that the uplink CRAN is a multiple-source, multiple-relay, single-destination network. If all fronthaul capacities were infinite, then the model would reduce to a standard multiple access channel (MAC), and it follows from standard results that time-sharing is not needed to achieve the optimal sum-rate in this case [38]. The reader may wonder whether the same holds in the case of finite-rate fronthaul links, i.e., whether one can optimally set the time-sharing variable $Q$ to a constant for sum-rate maximization. The answer to this question is negative for finite fronthaul capacities, as shown in Section IV. This is reminiscent of the fact that time-sharing generally increases rates in relay channels, e.g., [39, 40]. In addition, when the three schemes CF-JD, CF-SD and CF-SSD are restricted to operate without time-sharing, i.e., with $Q$ set to a constant, CF-SSD might perform strictly worse than CF-JD and CF-SD. To see this, the reader may find it useful to observe that while time-sharing is not required for sum-rate maximization in a regular MAC, as successive decoding (in any order) is sum-rate optimal in this case, it is beneficial when the sum-rate maximization is subjected to constraints on the users’ message rates, such as when the users’ rates need to be symmetric [41], i.e., when the operating point is not a corner point of the MAC region. Similarly, standard successive Wyner-Ziv compression (in any order, without time-sharing) is known to achieve any corner point of the Berger-Tung region [42, 43], but time-sharing (or rate-splitting à-la [42]) is beneficial if the compression rates are subjected to constraints, such as when the compression rates are symmetric. An example which illustrates these aspects for a memoryless Gaussian CRAN is provided in Section IV.
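The MAC corner-point discussion above can be made concrete for a scalar Gaussian two-user MAC: both successive-decoding orders attain the sum capacity, yet the symmetric rate point lies strictly between the two corners and therefore requires time-sharing between the orders. The SNR values below are assumed for illustration:

```python
from math import log2

S1, S2 = 4.0, 9.0                 # per-user SNRs (hypothetical values)
c = lambda snr: log2(1 + snr)     # scalar Gaussian capacity in bits

# Order "1 then 2": user 1 is decoded treating user 2 as noise.
R1a, R2a = c(S1 / (1 + S2)), c(S2)
# Order "2 then 1": roles reversed.
R2b, R1b = c(S2 / (1 + S1)), c(S1)

sum_capacity = c(S1 + S2)
# Both corner points attain the sum capacity without time-sharing ...
assert abs((R1a + R2a) - sum_capacity) < 1e-12
assert abs((R1b + R2b) - sum_capacity) < 1e-12
# ... but the symmetric point (R1 = R2 = sum_capacity / 2) lies strictly
# between the two corners, so it needs time-sharing between the orders.
assert R1a < sum_capacity / 2 < R1b
```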

## IV Memoryless MIMO Gaussian CRAN

In this section, we consider a memoryless Gaussian MIMO CRAN with oblivious relay processing and enabled time-sharing. Relay node $k$, $k \in \mathcal{K}$, is equipped with multiple receive antennas and has channel output

 \mathbf{Y}_k = \mathbf{H}_{k,\mathcal{L}} \mathbf{X} + \mathbf{N}_k, (19)

where $\mathbf{X}_l$ is the channel input vector of user $l$, $\mathbf{X}$ is the vector obtained by stacking the inputs of all users, $\mathbf{H}_{k,\mathcal{L}}$ is the matrix obtained by concatenating horizontally the matrices $\mathbf{H}_{k,l}$, $l \in \mathcal{L}$, with $\mathbf{H}_{k,l}$ being the channel matrix connecting user $l$ to relay node $k$, and $\mathbf{N}_k$ is the noise vector at relay node $k$, assumed to be memoryless Gaussian with given covariance matrix and independent from the other noises and from the channel inputs $\mathbf{X}_{\mathcal{L}}$. The transmission from user $l$ is subject to the following covariance constraint,

 \mathrm{E}\big[\mathbf{X}_l \mathbf{X}_l^H\big] \preceq \mathbf{K}_l, (20)

where $\mathbf{K}_l$ is a given positive semi-definite matrix, and the notation $\mathbf{A} \preceq \mathbf{B}$ indicates that the matrix $\mathbf{B} - \mathbf{A}$ is positive semi-definite.

### IV-A Capacity Region under Time-Sharing of Gaussian Inputs

The memoryless MIMO Gaussian model with oblivious relay processing described by (19) and (20) clearly falls into the class of CRANs studied in Section III-A, since $\mathbf{Y}_{\mathcal{S}} \leftrightarrow \mathbf{X}_{\mathcal{L}} \leftrightarrow \mathbf{Y}_{\mathcal{S}^c}$ forms a Markov chain in this order for all $\mathcal{S} \subseteq \mathcal{K}$. Thus, Theorem 1, which can be extended to continuous channels using standard techniques, characterizes the capacity region of this model. The computation of the region of Theorem 1 for the model described by (19) and (20), however, is not easy, as it requires finding the optimal choices of the channel inputs $\{\mathbf{X}_l\}$ and the involved auxiliary random variables. In this section, we find an explicit characterization of the capacity region of the model described by (19) and (20) in the case in which the users are constrained to time-share only among Gaussian codebooks. That is, for all $l \in \mathcal{L}$ and all $q \in \mathcal{Q}$, the distribution of the input $\mathbf{X}_l$ conditionally on $Q = q$ is Gaussian (with a covariance matrix that can be optimized over so as to satisfy (20)). We denote that region by $\mathcal{C}_{\mathrm{G}}$. Although Gaussian input may generally be suboptimal for the uplink CRAN [7], i.e., $\mathcal{C}_{\mathrm{G}}$ is in general only an inner bound on the capacity region, restricting to Gaussian input for every $q \in \mathcal{Q}$ is appreciable because it leads to rate regions that are less difficult to evaluate. In doing so, we also show that time-sharing Gaussian compression at the relay nodes is optimal if the users' channel inputs are restricted to be Gaussian for all $q \in \mathcal{Q}$.

Let, for all $l \in \mathcal{L}$, the input $\mathbf{X}_l$ be restricted to be distributed such that, for all $q \in \mathcal{Q}$,

$$\mathbf{X}_l \,|\, Q = q \sim \mathcal{CN}(\mathbf{0}, \mathbf{K}_{l,q}), \qquad (21)$$

where the matrices $\{\mathbf{K}_{l,q}\}$ are chosen to satisfy

$$\sum_{q\in\mathcal{Q}} p_Q(q)\,\mathbf{K}_{l,q} \preceq \mathbf{K}_l. \qquad (22)$$

The following theorem characterizes the capacity region of the model with oblivious relay processing described by (19) and (20) under the constraint of such Gaussian inputs and given fronthaul capacities $(C_1, \ldots, C_K)$.

###### Theorem 5.

The capacity region $\mathcal{C}_{\mathrm{G}}$ of the memoryless Gaussian MIMO model with oblivious relay processing described by (19) and (20) under time-sharing of Gaussian inputs is given by the set of all rate tuples $(R_1, \ldots, R_L)$ that satisfy

$$\sum_{t\in\mathcal{T}} R_t \le \sum_{k\in\mathcal{S}}\left[C_k - \mathbb{E}_Q\!\left[\log\frac{\left|\mathbf{\Sigma}_k^{-1}\right|}{\left|\mathbf{\Sigma}_k^{-1} - \mathbf{B}_{k,Q}\right|}\right]\right] + \mathbb{E}_Q\!\left[\log\frac{\left|\sum_{k\in\mathcal{S}^c}\mathbf{H}_{k,\mathcal{T}}^H\mathbf{B}_{k,Q}\mathbf{H}_{k,\mathcal{T}} + \mathbf{K}_{\mathcal{T},Q}^{-1}\right|}{\left|\mathbf{K}_{\mathcal{T},Q}^{-1}\right|}\right] \qquad (23)$$

for all nonempty $\mathcal{T} \subseteq \mathcal{L}$ and all $\mathcal{S} \subseteq \mathcal{K}$, for some pmf $p_Q(q)$ and matrices $\{\mathbf{K}_{l,q}\}$ and $\{\mathbf{B}_{k,q}\}$ such that $\mathbf{0} \preceq \mathbf{B}_{k,q} \preceq \mathbf{\Sigma}_k^{-1}$ and (22) holds; and where, for $\mathcal{T} \subseteq \mathcal{L}$ and $q \in \mathcal{Q}$, the matrix $\mathbf{K}_{\mathcal{T},q}$ is defined as the block-diagonal matrix with blocks $\mathbf{K}_{t,q}$, $t \in \mathcal{T}$, and $\mathbf{H}_{k,\mathcal{T}}$ denotes the horizontal concatenation of the matrices $\mathbf{H}_{k,t}$, $t \in \mathcal{T}$.

###### Proof.

The proof of Theorem 5 appears in Appendix E. ∎
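Once the input covariances and quantization matrices are fixed, the region of Theorem 5 is straightforward to evaluate numerically. The sketch below (a toy instance with hypothetical dimensions and fronthaul values, channels drawn at random, and $|\mathcal{Q}| = 1$; it is an illustration, not code from the paper) computes the sum-rate bound in (23) over a simple one-parameter family of quantization matrices $\mathbf{B}_k = b\,\mathbf{\Sigma}_k^{-1}$:

```python
import numpy as np
from itertools import combinations

rng = np.random.default_rng(0)

def logdet2(M):
    """log-determinant in base 2 (rates in bits)."""
    sign, val = np.linalg.slogdet(M)
    assert sign > 0
    return val / np.log(2)

# Toy instance (hypothetical values): K = 2 relays, one user, |Q| = 1,
# two antennas everywhere, unit noise covariances, identity input covariance.
K_relays, n_tx, m_rx = 2, 2, 2
H = [rng.standard_normal((m_rx, n_tx)) for _ in range(K_relays)]
Sigma_inv = [np.eye(m_rx) for _ in range(K_relays)]
Kx_inv = np.eye(n_tx)
C = [2.0, 2.0]  # fronthaul capacities in bits

def sum_rate_bound(B):
    """min over all subsets S of the RHS of (23), for T = L and |Q| = 1."""
    vals = []
    for r in range(K_relays + 1):
        for S in combinations(range(K_relays), r):
            Sc = [k for k in range(K_relays) if k not in S]
            first = sum(C[k] - (logdet2(Sigma_inv[k])
                                - logdet2(Sigma_inv[k] - B[k])) for k in S)
            M = Kx_inv.copy()
            for k in Sc:
                M = M + H[k].T @ B[k] @ H[k]
            vals.append(first + logdet2(M) - logdet2(Kx_inv))
    return min(vals)

# Sweep the scaled quantization matrices B_k = b * Sigma_k^{-1}, 0 < b < 1:
# small b wastes fronthaul resolution, b near 1 exceeds the fronthaul budget.
rates = [sum_rate_bound([b * Si for Si in Sigma_inv])
         for b in np.linspace(0.01, 0.99, 99)]
print(f"best sum-rate over the sweep: {max(rates):.3f} bits")
```

The sweep exhibits the tradeoff encoded in (23): the first bracket decreases in $b$ (fronthaul cost of finer quantization) while the second term increases in $b$ (less compression noise at the decoder).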

###### Remark 10.

Theorem 5 extends the result with oblivious relay processing of [7, Theorem 5] to the MIMO setup with $L$ users and enabled time-sharing, and shows that, under the constraint of Gaussian signaling, the quantization codewords can be chosen optimally to be Gaussian. Recall that, as shown through an example in [7], restricting to Gaussian input signaling can be a severe constraint and is generally suboptimal.

### IV-B On the Role of Time-Sharing

In Remark 9 in Section III-C we commented on the utility of time-sharing for sum-rate maximization in the uplink of the DM CRAN with oblivious relay processing. In this section we investigate further the role of time-sharing. Specifically, we first provide an example in which time-sharing increases capacity, and then discuss some scenarios in which time-sharing does not enlarge the capacity region of the memoryless MIMO Gaussian CRAN model with oblivious relay processing described by (19) and (20).

For convenience, let us denote by $\mathcal{C}^{\text{no-ts}}_{\mathrm{G}}$ the rate region obtained by setting $|\mathcal{Q}| = 1$, i.e., without enabled time-sharing, in the region of Theorem 5. That is, $\mathcal{C}^{\text{no-ts}}_{\mathrm{G}}$ is given by the set of all rate tuples $(R_1, \ldots, R_L)$ that satisfy, for all nonempty $\mathcal{T} \subseteq \mathcal{L}$ and all $\mathcal{S} \subseteq \mathcal{K}$,

$$\sum_{t\in\mathcal{T}} R_t \le \sum_{k\in\mathcal{S}}\left[C_k - \log\frac{\left|\mathbf{\Sigma}_k^{-1}\right|}{\left|\mathbf{\Sigma}_k^{-1} - \mathbf{B}_k\right|}\right] + \log\frac{\left|\sum_{k\in\mathcal{S}^c}\mathbf{H}_{k,\mathcal{T}}^H\mathbf{B}_k\mathbf{H}_{k,\mathcal{T}} + \mathbf{K}_{\mathcal{T}}^{-1}\right|}{\left|\mathbf{K}_{\mathcal{T}}^{-1}\right|}, \qquad (24)$$

for some matrices $\mathbf{0} \preceq \mathbf{B}_k \preceq \mathbf{\Sigma}_k^{-1}$, $k \in \mathcal{K}$.

The following example shows that $\mathcal{C}^{\text{no-ts}}_{\mathrm{G}}$ may be strictly contained in $\mathcal{C}_{\mathrm{G}}$.

###### Example 1.

Consider an instance of the memoryless MIMO Gaussian CRAN described by (19) and (20) in which $L = 1$ and $K = 2$, all devices are equipped with a single antenna, the relay nodes have equal fronthaul capacities, i.e., $C_1 = C_2 = C$, and

$$Y_k = aX + N_k, \quad \text{for } k = 1,2, \qquad (25)$$

where $a \in \mathbb{R}$, $N_k \sim \mathcal{CN}(0,1)$ for $k = 1,2$, and the input is subjected to the average power constraint $\mathbb{E}[|X|^2] \le P$.

The capacity of this one-user Gaussian CRAN example can be obtained from Theorem 5 by solving the following optimization problem

$$C_{\mathrm{G}}(C) = \max_{\{\alpha_q, b_q, P_q\}}\, \min_{\mathcal{S}\subseteq\{1,2\}} \left\{ |\mathcal{S}|\left[C + \sum_{q=1}^{|\mathcal{Q}|}\alpha_q\log(1-b_q)\right] + \sum_{q=1}^{|\mathcal{Q}|}\alpha_q\log\!\left(|\mathcal{S}^c|\,P_q a^2 b_q + 1\right) \right\} \qquad (26)$$

where the maximization is over $\alpha_q \ge 0$, $b_q \in [0,1]$ and $P_q \ge 0$, such that $\sum_q \alpha_q = 1$ and $\sum_q \alpha_q P_q \le P$. Due to Theorem 4, $C_{\mathrm{G}}(C)$ is achievable with CF-JD, CF-SD and CF-SSD by using time-sharing. Without time-sharing, i.e., $|\mathcal{Q}| = 1$, the capacity of this one-user Gaussian CRAN example is achievable with the CF-JD scheme and can be obtained easily from (24), as

$$C^{\text{no-ts}}_{\mathrm{G}}(C) = \max_{0\le b\le 1}\, \min_{\mathcal{S}\subseteq\{1,2\}}\left\{|\mathcal{S}|\left[C + \log(1-b)\right] + \log\!\left(|\mathcal{S}^c|\,Pa^2b + 1\right)\right\} \qquad (27)$$

$$\hphantom{C^{\text{no-ts}}_{\mathrm{G}}(C)} = \log\!\left(1 + 2a^2P\,2^{-2C}\left(2^{2C} + a^2P - \sqrt{a^4P^2 + \left(1+2Pa^2\right)2^{2C}}\right)\right). \qquad (28)$$
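The closed form (28) can be checked against a direct grid maximization of (27); the sketch below (illustrative only, with arbitrary parameter values) does so numerically:

```python
import numpy as np

def rate_27(b, a, P, C):
    """RHS of (27): min over the subsets S of {1, 2} with |S| = 0, 1, 2."""
    g = a * a * P  # received SNR a^2 P
    return min(np.log2(1 + 2 * g * b),                   # S empty
               C + np.log2(1 - b) + np.log2(1 + g * b),  # |S| = 1
               2 * (C + np.log2(1 - b)))                 # S = {1, 2}

def rate_28(a, P, C):
    """Closed-form expression (28), with logs in base 2."""
    g = a * a * P
    return np.log2(1 + 2 * g * 2 ** (-2 * C)
                   * (2 ** (2 * C) + g
                      - np.sqrt(g * g + (1 + 2 * g) * 2 ** (2 * C))))

a, P, C = 1.0, 4.0, 2.0  # arbitrary example values
grid_max = max(rate_27(b, a, P, C)
               for b in np.linspace(1e-6, 1 - 1e-6, 20001))
print(f"grid: {grid_max:.4f} bits, closed form: {rate_28(a, P, C):.4f} bits")
```

The two values agree to within the grid resolution, which also illustrates that the inner minimum in (27) is attained where the $\mathcal{S} = \emptyset$ and $\mathcal{S} = \{1,2\}$ bounds intersect.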

With time-sharing with, say, $|\mathcal{Q}| = 2$, the user can communicate at larger rates with CF-JD, as follows. The transmission time is divided into two periods, or phases, of respective fractions $\alpha$ and $1-\alpha$ of the total duration, where $\alpha \in (0,1]$. The user transmits symbols only during the first phase, with power $P/\alpha$, and remains silent during the second phase. The two relay nodes operate as follows. During the first phase, relay node $k$, $k = 1,2$, compresses its output using its entire fronthaul budget, i.e., at an effective rate of $C/\alpha$ bits per sample of that phase; and it remains silent during the second phase. Observe that with such a transmission scheme the input constraint (22) and the fronthaul constraints are satisfied. Evaluating the rate region of Theorem 5 with the choice $\alpha_1 = \alpha$, $P_1 = P/\alpha$, $b_1 = b$, and $P_2 = 0$ yields in this case

$$R^{\text{ts}}_{\mathrm{G}}(C) = \max_{\alpha\in(0,1],\, b\in[0,1]}\, \min_{\mathcal{S}\subseteq\{1,2\}}\left\{|\mathcal{S}|\left[C + \alpha\log(1-b)\right] + \alpha\log\!\left(|\mathcal{S}^c|\,\frac{P}{\alpha}\, a^2 b + 1\right)\right\}. \qquad (29)$$

Figure 2 depicts the evolution of the capacity $C_{\mathrm{G}}(C)$ enabled with time-sharing, the capacity $C^{\text{no-ts}}_{\mathrm{G}}(C)$ without time-sharing, as well as the cut-set upper bound, as a function of the user transmit power $P$. Also shown for comparison is the achievable rate given by (29), which is a lower bound on $C_{\mathrm{G}}(C)$. Observe that, while restricting to CF-JD with two phases might be suboptimal, the rate (29) is very close to $C_{\mathrm{G}}(C)$. As can be seen from the figure, the utility of time-sharing (to increase rate) is visible mainly at small average transmit power. The intuition for this gain is that, for small $P$, the observations at the relay nodes become too noisy and the relays mostly forward noise. It is therefore more advantageous to increase the power to $P/\alpha$ for a fraction $\alpha$ of the transmission. Accordingly, the effective compression rate is increased to $C/\alpha$, therefore reducing the compression noise. This observation is reminiscent of similar ones in [39] in the context of relay channels with orthogonal components and in [40] in the context of primitive relay channels.
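The low-power gain of the two-phase scheme can be reproduced numerically. The sketch below (illustrative only; the grid resolutions and parameter values are arbitrary choices) maximizes (27) and (29) by brute-force search and compares them at a small transmit power:

```python
import numpy as np

B_GRID = np.linspace(1e-6, 1 - 1e-6, 2001)

def rate_no_ts(a, P, C):
    """Best no-time-sharing rate: grid maximization of (27) over b."""
    g = a * a * P
    return max(min(np.log2(1 + 2 * g * b),
                   C + np.log2(1 - b) + np.log2(1 + g * b),
                   2 * (C + np.log2(1 - b))) for b in B_GRID)

def rate_two_phase(a, P, C):
    """Two-phase scheme of (29): transmit with power P/alpha during a
    fraction alpha of the time; the relays spend the whole fronthaul
    budget on that phase (alpha = 1 recovers the no-time-sharing rate)."""
    best = 0.0
    for alpha in np.linspace(0.05, 1.0, 20):
        g = a * a * P / alpha  # boosted SNR during the active phase
        for b in B_GRID:
            r = min(alpha * np.log2(1 + 2 * g * b),
                    C + alpha * (np.log2(1 - b) + np.log2(1 + g * b)),
                    2 * (C + alpha * np.log2(1 - b)))
            best = max(best, r)
    return best

a, C = 1.0, 1.0
r_plain = rate_no_ts(a, 0.05, C)    # low transmit power
r_ts = rate_two_phase(a, 0.05, C)
print(f"no time-sharing: {r_plain:.4f} bits, two-phase: {r_ts:.4f} bits")
```

At this power level the two-phase rate is noticeably larger, while by construction it can never fall below the no-time-sharing rate.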

When the three schemes CF-JD, CF-SD and CF-SSD are restricted to operate without time-sharing, i.e., $|\mathcal{Q}| = 1$, and with Gaussian signaling, CF-SSD might perform strictly worse than CF-JD and CF-SD. The rate achievable by the CF-SD scheme without time-sharing follows from Proposition 1, and it is easy to show that it coincides with $C^{\text{no-ts}}_{\mathrm{G}}(C)$ in (28); i.e., in this example, CF-JD and CF-SD achieve the capacity without time-sharing. The rate achievable by CF-SSD without time-sharing and with Gaussian test channels at the relay nodes $k = 1,2$ can be obtained from Proposition 2, as

$$R^{\text{no-ts}}_{\mathrm{G,CF\text{-}SSD}}(C) := \log\!\left(1 + Pa^2\left(\left(1+\sigma_1^{-2}\right)^{-1} + \left(1+\sigma_2^{-2}\right)^{-1}\right)\right), \qquad (30)$$

where $\sigma_1^2$ and $\sigma_2^2$ denote the variances of the Gaussian compression noises at relay nodes 1 and 2, chosen so as to meet the fronthaul capacity $C$ at each relay.

Figure 3 shows the capacities $C_{\mathrm{G}}(C)$ and $C^{\text{no-ts}}_{\mathrm{G}}(C)$, as well as the achievable rates (29) and (30), as a function of the transmit power $P$. Note that CF-SSD, when restricted not to use time-sharing, performs strictly worse than CF-JD and CF-SD without time-sharing, i.e., $R^{\text{no-ts}}_{\mathrm{G,CF\text{-}SSD}}(C) < C^{\text{no-ts}}_{\mathrm{G}}(C)$. Observe that in this scenario the gains due to time-sharing are limited. This observation is in line with the fact that, for large fronthaul values, the CRAN model reduces to a MAC, for which time-sharing is not required to achieve the optimal sum-rate. ∎

The above shows that, in general, time-sharing increases rates for the memoryless MIMO Gaussian CRAN model described by (19) and (20), i.e., $\mathcal{C}^{\text{no-ts}}_{\mathrm{G}} \subsetneq \mathcal{C}_{\mathrm{G}}$. In what follows, we discuss two scenarios in which time-sharing does not enlarge the capacity region of the model given by (19) and (20), i.e., in which $\mathcal{C}^{\text{no-ts}}_{\mathrm{G}} = \mathcal{C}_{\mathrm{G}}$.

#### IV-B1 Case of Fixed Gaussian Codebook at User Side

Consider the scenario in which the users are not allowed to time-share among several Gaussian codebooks, but are each constrained to use a single, possibly different, Gaussian codebook. This may be relevant, e.g., in contexts in which reducing the signaling overhead among the users and relays is of prime interest. Conceptually, this corresponds to equalizing all the covariance matrices $\mathbf{K}_{l,q}$ for given $l \in \mathcal{L}$ and all $q \in \mathcal{Q}$. Let

$$\tilde{\mathbf{K}}_l := \mathbf{K}_{l,1} = \cdots = \mathbf{K}_{l,|\mathcal{Q}|} \preceq \mathbf{K}_l. \qquad (31)$$

The reader may wonder whether allowing the relay nodes to time-share among compression codebooks can be beneficial in this case. Note that the answer to this question is not clear a priori, because time-sharing in general enlarges the Berger-Tung rate region if constraints on the rates are imposed (see Remark 9). The following proposition shows that, for the model described by (19) and (20), time-sharing is not beneficial under the constraint (31).

###### Proposition 3.

For the model with oblivious relay processing described by (19) and (20), if (31) holds for all $l \in \mathcal{L}$, then $\mathcal{C}^{\text{no-ts}}_{\mathrm{G}} = \mathcal{C}_{\mathrm{G}}$.

###### Proof.

The proof of Proposition 3 appears in Appendix F. ∎

#### IV-B2 High SNR Regime

Consider again the model described by (19) and (20). Assume that, for all $k \in \mathcal{K}$, the vector Gaussian noise at relay node $k$ has covariance matrix

$$\mathbf{\Sigma}_k = \epsilon\,\tilde{\mathbf{\Sigma}}_k \qquad (32)$$

for some $\epsilon > 0$ and a positive definite matrix $\tilde{\mathbf{\Sigma}}_k$ that is independent of $\epsilon$.

The following proposition shows that, in this case, the benefit of time-sharing in terms of increasing rates vanishes for arbitrarily small $\epsilon$.

###### Proposition 4.

For the model with oblivious relay processing described by (19) and (20), if for all $k \in \mathcal{K}$ the vector Gaussian noise at relay node $k$ has covariance matrix of the form given by (32), for some $\epsilon > 0$ and $\tilde{\mathbf{\Sigma}}_k$ independent of $\epsilon$, then the following holds: if $(R_1, \ldots, R_L) \in \mathcal{C}_{\mathrm{G}}$, then $(R_1 - \Delta_\epsilon, \ldots, R_L - \Delta_\epsilon) \in \mathcal{C}^{\text{no-ts}}_{\mathrm{G}}$ for some $\Delta_\epsilon \ge 0$. In addition,

$$\lim_{\epsilon\to 0}\Delta_\epsilon = 0. \qquad (33)$$
###### Proof.

The proof of Proposition 4 appears in Appendix H. ∎

### IV-C Price of Non-Awareness: Bounded Rate Loss

In this section, we show that for the memoryless MIMO Gaussian model given by (19) and (20), allowing the relay nodes to be fully aware of the users' codebooks (i.e., the non-constrained, or non-oblivious, setting) increases rates by at most a bounded constant. In other words, restricting the relay nodes not to know or utilize the users' codebooks causes only a bounded rate loss in comparison with the maximum rate that would be achievable in the non-oblivious setting. The constant depends on the network size, but is independent of the channel gain matrices, powers and noise levels. The result is an easy combination of a recent improved constant-gap result of Ganguly and Lim [44] (which further tightens that of Zhou et al. [12]; see Remark 11 below) with our Theorem 5.

For simplicity, we focus on the case in which $\mathbf{\Sigma}_k = \mathbf{I}$ for all $k \in \mathcal{K}$ and $\mathbf{K}_l = P\,\mathbf{I}$ for all $l \in \mathcal{L}$. For the unconstrained case (i.e., with neither the obliviousness nor the Gaussian signaling constraint assumed), the capacity region of the model described by (19) and (20), which we denote hereafter as $\mathcal{C}$, is still to be found in general; an easy outer bound on it is given by the max-flow min-cut bound, i.e., the set of all rate tuples $(R_1, \ldots, R_L)$ for which, for all nonempty $\mathcal{T} \subseteq \mathcal{L}$ and all $\mathcal{S} \subseteq \mathcal{K}$,

$$\sum_{t\in\mathcal{T}} R_t \le \sum_{k\in\mathcal{S}} C_k + \log\frac{\left|\sum_{k\in\mathcal{S}^c}\mathbf{H}_{k,\mathcal{T}}^H\mathbf{\Sigma}_k^{-1}\mathbf{H}_{k,\mathcal{T}} + \mathbf{K}_{\mathcal{T}}^{-1}\right|}{\left|\mathbf{K}_{\mathcal{T}}^{-1}\right|}. \qquad (34)$$
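For the single-user example of Section IV-B, the cut-set bound (34) and the oblivious rate (28) can be compared directly. The sketch below (with arbitrary parameter values; an illustration, not the paper's proof) sweeps the transmit power over several decades and checks that the gap between the two stays bounded:

```python
import numpy as np

def cutset(a, P, C):
    """RHS of (34) specialized to the example: L = 1 user, K = 2
    single-antenna relays with unit noise, min over the cuts S."""
    g = a * a * P
    return min(2 * C,                   # S = {1, 2}
               C + np.log2(1 + g),      # |S| = 1
               np.log2(1 + 2 * g))      # S empty

def c_no_ts(a, P, C):
    """Oblivious no-time-sharing capacity, closed form (28)."""
    g = a * a * P
    return np.log2(1 + 2 * g * 2 ** (-2 * C)
                   * (2 ** (2 * C) + g
                      - np.sqrt(g * g + (1 + 2 * g) * 2 ** (2 * C))))

a, C = 1.0, 2.0  # arbitrary example values
Ps = np.logspace(-2, 4, 200)
gaps = [cutset(a, P, C) - c_no_ts(a, P, C) for P in Ps]
print(f"max gap over the sweep: {max(gaps):.3f} bits")
```

Consistent with the bounded-rate-loss result, the gap peaks at moderate power and shrinks again as $P$ grows, where the oblivious rate approaches the fronthaul-limited cut $2C$.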

The following theorem shows that the rate region of Theorem 5 is within a constant gap from the outer bound (34), and hence from the capacity region $\mathcal{C}$ of the unconstrained setting.

###### Theorem 6.

If $(R_1, \ldots, R_L)$ satisfies (34) for all nonempty $\mathcal{T} \subseteq \mathcal{L}$ and all $\mathcal{S} \subseteq \mathcal{K}$, then there exists a constant $\Delta$ such that $(R_1 - \Delta, \ldots, R_L - \Delta) \in \mathcal{C}_{\mathrm{G}}$, with