Linear-Feedback Sum-Capacity for Gaussian Multiple Access Channels with Feedback

Linear-Feedback Sum-Capacity for Gaussian Multiple Access Channels with Feedback

Ehsan Ardestanizadeh, Michèle A. Wigger, Young-Han Kim, and Tara Javidi E. Ardestanizadeh was with the Department of Electrical and Computer Engineering, University of California, San Diego. He is now with ASSIA Inc., 333 Twin Dolphin Drive, Redwood City, CA 94065, USA. M. A. Wigger was with the Department of Electrical and Computer Engineering, University of California, San Diego. She is now with the Department of Communications and Electronics, Telecom ParisTech, 46 Rue Barrault, Paris Cedex 13, France. Y.-H. Kim and T. Javidi are with the Department of Electrical and Computer Engineering, University of California, San Diego, La Jolla, CA, 92093-0407, USA. email: eardestani@assia-inc.com, michele.wigger@telecom-paristech.fr, yhk@ucsd.edu, tjavidi@ucsd.edu.
Abstract

The capacity region of the -sender Gaussian multiple access channel with feedback is not known in general. This paper studies the class of linear-feedback codes that includes (nonlinear) nonfeedback codes at one extreme and the linear-feedback codes by Schalkwijk and Kailath, Ozarow, and Kramer at the other extreme. The linear-feedback sum-capacity under symmetric power constraints is characterized, the maximum sum-rate achieved by linear-feedback codes when each sender has the equal block power constraint . In particular, it is shown that Kramer’s code achieves this linear-feedback sum-capacity. The proof involves the dependence balance condition introduced by Hekstra and Willems and extended by Kramer and Gastpar, and the analysis of the resulting nonconvex optimization problem via a Lagrange dual formulation. Finally, an observation is presented based on the properties of the conditional maximal correlation—an extension of the Hirschfeld–Gebelein–Rényi maximal correlation—which reinforces the conjecture that Kramer’s code achieves not only the linear-feedback sum-capacity, but also the sum-capacity itself (the maximum sum-rate achieved by arbitrary feedback codes).

Feedback, Gaussian multiple access channel, Kramer’s code, linear-feedback codes, maximal correlation, sum-capacity.

I Introduction

Feedback from the receivers to the senders can improve the performance of the communication systems in various ways. For example, as first shown by Gaarder and Wolf [1], feedback can enlarge the capacity region of memoryless multiple access channels by enabling the distributed senders to cooperate via coherent transmissions.

In this paper, we study the sum-capacity of the additive white Gaussian noise multiple access channel (Gaussian multiple access channel in short) with feedback depicted in Figure 1. For senders, Ozarow [2] established the capacity region which—unlike for the point-to-point channel—is strictly larger than the one without feedback. The capacity-achieving code proposed by Ozarow is an extension of the Schalkwijk–Kailath code [3, 4] for Gaussian point-to-point channels.

For , the capacity region is not known in general. Thomas [5] proved that feedback can at most double the sum capacity, and later Ordentlich [6] showed that the same bound holds for the entire capacity region even when the noise sequence is not white (cf. Pombra and Cover [7]). More recently, Kramer [8] extended Ozarow’s linear-feedback code to senders, and proved that this code achieves the sum-capacity under symmetric block power constraints on all the senders, when the power is above a certain threshold (see (4) in Section II) that depends on the number of senders .

Fig. 1: -sender Gaussian multiple access channel.

In this paper, we focus on the class of linear-feedback codes, where the feedback signals are incorporated linearly into the transmit signals (see Definition 1 in Section II). This class of codes includes the linear-feedback codes by Schalkwijk and Kailath [3], Ozarow [2], and Kramer [8] as well as arbitrary (nonlinear) nonfeedback codes.

We characterize the linear-feedback sum-capacity under symmetric block power constraints , which is the maximum sum-rate achieved by linear-feedback codes under equal block power constraints at all the senders. Our main contribution is the proof of the converse. We first prove an upper bound on , which is a multiletter optimization problem over Gaussian distributions satisfying a certain functional relationship (cf. Cover and Pombra [9]). Next, we relax the functional relationship by considering a dependence balance condition, introduced by Hekstra and Willems [10] and extended by Kramer and Gastpar [11], and derive an optimization problem over the set of positive semidefinite (covariance) matrices. Lastly, we carefully analyze this nonconvex optimization problem via a Lagrange dual formulation [12].

The linear-feedback sum-capacity is achieved by Kramer’s linear-feedback code. Hence, this rather simple code, which iteratively refines the receiver’s knowledge about the messages, is sum-rate optimal among the class of linear-feedback codes. For completeness, we briefly describe Kramer’s linear-feedback code and analyze it via properties of discrete algebraic Riccati recursions (cf. Wu et al. [13]). This analysis differs from the original approaches by Ozarow [2] and Kramer [8].

The complete characterization of the sum-capacity under symmetric block power constraints , i.e., the maximum sum-rate achieved by arbitrary feedback codes, still remains open. However, it has been commonly believed (cf. [11],[13]) that linear-feedback codes achieve the sum-capacity, i.e., . We offer an observation that further supports this conjecture. By introducing and analyzing the properties of conditional maximal correlation, which is an extension of the Hirschfeld–Gebelein–Rényi maximal correlation [14] to the case where an additional common random variable is shared, we show in Section V that the linear-feedback codes are greedy optimal for a multiletter optimization problem that upper bounds .

The rest of the paper is organized as follows. In Section II we formally state the problem and present our main result. Section III provides the proof of the converse and Section IV gives an alternative proof of achievability via Kramer’s linear-feedback code. Section V concludes the paper with a discussion on potential extensions of the main ideas to nonequal power constraints and arbitrary feedback codes, and with a proof that linear-feedback codes are greedy optimal for a multiletter optimization problem that upper bounds .

We closely follow the notation in [15]. In particular, a random variable is denoted by an upper case letter (e.g., ) and its realization is denoted by a lower case letter (e.g., ). The shorthand notation is used to denote the tuple (or the column vector) of random variables , and is used to denote their realizations. A random column vector and its realization are denoted by boldface letters (e.g. and ) as well. Uppercase letters (e.g., ) also denote deterministic matrices, which can be distinguished from random variables based on the context. The element of a matrix is denoted by . The conjugate transpose of a real or complex matrix is denoted by and the determinant of is denoted by . For the crosscovariance matrix of two random vectors and , we use the shorthand notation and for the covariance matrix of a random vector we use . Calligraphic letters (e.g., ) denote discrete sets. Let be a tuple of random variables and . The subtuple of random variables with indices from is denoted by . For every positive real number , the short-hand notation is used to denote the set of integers .

Ii Problem Setup and the Main Result

Consider the communication problem over a Gaussian multiple access channel with feedback depicted in Figure 1. Each sender wishes to transmit a message reliably to the common receiver. At each time , the output of the channel is

(1)

where is a discrete-time zero-mean white Gaussian noise process with unit average power, i.e., , and is independent of . We assume that the output symbols are causally fed back to each sender, and that the transmitted symbol from sender  at time  can thus depend on both the previous channel output sequence and the message .

We define a feedback code as

  1. message sets , where for ;

  2. a set of encoders, where encoder assigns a symbol to its message and the past channel output sequence for ; and

  3. a decoder that assigns message estimates , , to each received sequence .

We assume throughout that is uniformly distributed over . The probability of error is defined as

A rate tuple and its corresponding sum-rate are said to be achievable under the power constraints if there exists a sequence of feedback codes such that the expected block power constraints

are satisfied and . The supremum over all achievable sum-rates is referred to as the sum-capacity. In most of the paper, we will be interested in the case of symmetric power constraints . In this case we denote the sum-capacity by .

Our focus will be on the special class of linear-feedback codes defined as follows.

Definition 1

A feedback code is said to be a linear-feedback code if the encoder has the form

where

  1. the (potentially nonlinear) nonfeedback mapping is independent of and maps the message to a -dimensional real vector (message point) for some ; and

  2. the linear-feedback mapping maps the message point and the past feedback output sequence to the channel input symbol .

The class of linear-feedback codes includes as special cases the feedback codes by Schalkwijk and Kailath [3], Ozarow [2], and Kramer [8], and all nonfeedback codes. To recover the codes by Schalkwijk and Kailath [3] and Ozarow [2] it suffices to choose ; for Kramer’s code [8] we need ; and to recover all nonfeedback codes we have to choose and each message point equal to the codeword sent by encoder .

The linear-feedback sum-capacity is defined as the maximum achievable sum-rate using only linear-feedback codes. Under symmetric block power constraints , we denote the linear-feedback sum-capacity by .

We are ready to state the main result of this paper.

Theorem 1

For the Gaussian multiple access channel with symmetric block power constraints , the linear-feedback sum-capacity is

(2)

where is the unique solution to

(3)

in the interval .

The proof of Theorem 1 has several parts. The converse is proved in Section III. The proof of achievability follows by [8, Theorem 2] and can be proved based on Kramer’s linear-feedback code [8]. For completeness, we present a simple description and analysis of Kramer’s code in Section IV. Finally, the property that (3) has a unique solution in is proved in Appendix A.

Remark 1

Kramer showed [8] that when the power constraint exceeds the threshold , which is the unique positive solution to

(4)

then the sum-capacity is given by the right-hand side of (2). Thus, for this case Theorem 1 follows directly from Kramer’s more general result. Consequently, when , then the linear-feedback sum-capacity coincides with the sum-capacity, i.e., . It is not known whether this equality holds for all powers ; see also our discussion in Section V-B.

Remark 2

Since , we can define a parameter so that . Intuitively, measures the correlation between the transmitted signals. For example, when , the corresponding coincides with the optimal correlation coefficient in [2]. Thus, captures the amount of cooperation (coherent power gain) that can be established among the senders using linear-feedback codes, where corresponds to no cooperation and corresponds to full cooperation. For a fixed , is strictly increasing (see Appendix A); thus, more power allows for more cooperation. Moreover, as and as , which is seen as follows. We rewrite identity (3) as

(5)

and notice that the left-hand side (LHS) of (5) can be written as , where tends to 0 faster than . Thus, the LHS of (5) can equal its right-hand side (RHS) only if as , or equivalently, as . On the other hand, as , the LHS tends to a constant while the RHS tends to infinity unless tends to 0. Thus, by contradiction, as .

By the above observation, we have the following two corollaries to Theorem 1 for the low and high signal-to-noise ratio (SNR) regimes.

Corollary 1

In the low SNR regime, almost no cooperation is possible and the linear-feedback sum-capacity approaches the sum-capacity without feedback:

Corollary 2

In the high SNR regime, the linear-feedback sum-capacity approaches the sum-capacity with full cooperation where all the transmitted signals are coherently aligned with combined SNR equal to :

Iii Proof of the Converse

In this section we show that under the symmetric block power constraints , the linear-feedback sum-capacity is upper bounded as

(6)

where is defined in (3).

The proof involves five steps. First, we derive an upper bound on the linear-feedback sum-capacity based on Fano’s inequality and the maximum entropy property of Gaussian distributions (see Lemma 1). Second, we relax the problem by replacing the functional structure in the optimizing Gaussian input distributions (8) with a dependence balance condition [10, 11], and we rewrite the resulting nonconvex optimization problem as one over positive semidefinite matrices (see Lemma 2). Third, we consider the Lagrange dual function , which yields an upper bound on for every (see Lemma 3). Fourth, by exploiting the convexity and symmetry of the problem, we simplify the upper bound into an unconstrained optimization problem (which is still nonconvex) that involves only two optimization variables (see Lemma 4). Fifth and last, using brute-force calculus and strong duality, we show that there exist such that the corresponding upper bound coincides with the right hand side of (6) (see Lemma 5).

The details are as follows.

Lemma 1

The linear-feedback sum-capacity is upper bounded as

where111For simplicity of notation we do not include the parameter explicitly in most functions that we define in this section, e.g., .

(7)

and the maximum is over all inputs of the form

(8)

such that the function is linear, the vector is Gaussian, independent of the noise vector and the tuple , and the power constraint is satisfied.

Proof:

By Fano’s inequality [16],

for some that tends to zero along with as . Thus, for any achievable rate tuple , the sum-rate can be upper bounded as follows:

(9)
(10)
(11)

where (10) and (11) follow by the data processing inequality and the memoryless property of the channel, respectively. Therefore, the linear-feedback sum-capacity is upper bounded as

(12)

where the maximum is over all input distributions induced by a linear-feedback code satisfying the symmetric power constraints , i.e., over all choices of independent random vectors and linear functions such that the inputs satisfy the power constraints . Now let

be a Gaussian random vector with the same covariance matrix as , independent of . Using the same linear functions as in the given code, define

(13)

where is the channel output of a Gaussian MAC corresponding to the input tuple . It is not hard to see that is jointly Gaussian with zero mean and of the same covariance matrix as . Therefore, by the conditional maximum entropy theorem [5, Lemma 1] we have

(14)

Combining (12) and (14) and appropriately defining in (8) from in (13) completes the proof of Lemma 1. \qed

We define the following functions on -by- covariance matrices :

(15a)
(15b)

It can be readily checked that both functions are concave in  (see Appendix B).

Lemma 2

The linear-feedback sum-capacity is upper bounded as

(16)

where the maximum is over -by- covariance matrices such that

(17)
(18)
Proof:

Since is defined by the (causal) functional relationship in (8), by [10], [11, Theorem 1] we have the dependence balance condition

(19)

Furthermore, recall that is jointly Gaussian. Therefore, for every , conditioned on , the input (column) vector is zero-mean Gaussian with covariance matrix

irrespective of . Now consider

(20)

Also consider

which implies that

(21)

Hence, condition (19) reduces to (18). Rewriting (7) in terms of covariance matrices via (20) and relaxing the functional relationship (8) by the dependence balance condition (18) completes the proof of Lemma 2. \qed

Remark 3

Although both functions and are concave, their difference is neither concave nor convex. Hence, the optimization problem in (16) is nonconvex.

Lemma 3

Let and be defined as in (15a) and (15b). Then for every ,

(22)

where

(23)
Proof:

By the standard Lagrange duality [12], for any , the maximum in (16) is upper bounded as

where the maximum is over (without any other constraints). Here, are the Lagrange multipliers corresponding to the power constraints (17) and is the Lagrange multiplier corresponding to the dependence balance constraint (18). Finally, we choose , which yields

and completes the proof of Lemma 3. \qed

Lemma 4

For every ,

(24)

where

(25)

and

(26a)
(26b)
Proof:

Suppose that a covariance matrix attains the maximum in (23). For each permutation on , let be the covariance matrix obtained by permuting the rows and columns of according to , i.e., for . Let

be the arithmetic average of over all permutations. Clearly, is positive semidefinite and of the form

(27)

for some and . (The conditions on and assure that is positive semidefinite.) We now show that also attains the maximum in (23). First, notice that the function depends on the matrix  only via the sum of its entries and hence

Similarly,

Also, by symmetry we have . Hence, by the concavity of (see Appendix B) and Jensen’s inequality, . Therefore,

and the maximum of (23) is also attained by . Finally, defining and simplifying (15a) and (15b) yields

which completes the proof of Lemma 4. \qed

Remark 4

The symmetric in (27) was also considered in [5, 8] to evaluate the cutset upper bound, which corresponds to taking .

Lemma 5

There exist such that

where is defined in (3).

Proof:

Consider the optimization problem over , which defines in (24). Note that given by (25) is neither concave or convex in for . However, is concave in for fixed as shown in Appendix C.

Let be the unique nonnegative solution to

or equivalently to

(28)

(That such a unique solution exists is easily verified considering the equivalent quadratic equation; see (70) in Appendix D.) Then, by the concavity of in for fixed and ,

(29)

for any . (The inequality follows because might be larger than .)

Now let . Then, is nondecreasing and concave in  for fixed as shown in Appendix D. Thus,

(30)

where the first equality follows by Slater’s condition [12] and strong duality, and the last equality follows by the monotonicity of in . Alternatively, the equality in (30) can be viewed as the complementary slackness condition [12]. Indeed, since is not bounded from above, the optimal Lagrange multiplier must be positive. Therefore, the corresponding constraint is active at the optimum, i.e., .

Finally, we choose , where

which assures that coincides with (see (28)). Since is nonnegative by (57) in Appendix A and thus is a valid choice,

which, combined with (30), concludes the proof of Lemma 5 and of the converse. \qed

Iv Achievability via Kramer’s Code

We present (a slightly modified version of) Kramer’s linear-feedback code and analyze it based on the properties of discrete algebraic Riccati equations (DARE). In particular, we establish the following:

Theorem 2

Suppose that are real numbers and are distinct complex numbers on the unit circle. Let be a diagonal matrix, be the all-one column vector, and be the unique positive-definite solution to the discrete algebraic Riccati equation (DARE)

(31)

Then, a rate tuple is achievable under power constraints , provided that and , .

Achievability of Theorem 1 will be proved in Subsection IV-C as a corollary to Theorem 2.

Iv-a Kramer’s Linear-Feedback Code

Following [8], we represent a pair of consecutive uses of the given real Gaussian MAC as a single use of a complex Gaussian MAC. We represent the message point of sender by the complex scalar (corresponding to in the original real channel) and let be the (column) vector of message points.

The coding scheme has the following parameters: real coefficients and distinct complex numbers on the unit circle.

Nonfeedback mappings: For , we divide the square with corners at on the complex plane into equal subsquares. We then assign a different message to each subsquare and denote the complex number in the center of the subsquare by . The message point of sender is then .

Linear-feedback mappings: Let denote the (column) vector of channel inputs at time . We use the linear-feedback mappings

(32)

where

(33)

is a diagonal matrix with and

is the linear minimum mean squared error (MMSE) estimate of given .

Decoding: Upon receiving , the decoder forms a message estimate vector

(34)

and for each chooses such that is the center point of the subsquare containing .

Iv-B Analysis of the Probability of Error

Our analysis is based on the following auxiliary lemma. We use the short-hand notation .

Lemma 6
(35)

where is the unique positive-definite solution to the DARE (31).

Proof:

We rewrite the channel outputs in (1) as

(36)

From (IV-A) we have

(37)

where is the error covariance matrix of the linear MMSE estimate of given . Combining (36) and (37) we obtain the Riccati recursion [17]

(38)

for . Since has no unit-circle eigenvalue and the pair is detectable,222A pair is said to be detectable if there exists a column vector such that all the eigenvalues of lie inside the unit circle. For a diagonal matrix , the pair is detectable if and only if all the unstable eigenvalues , i.e., the ones on or outside the unit-circle, are distinct [18, Appendix C]. we use Lemma 2.5 in [19] to conclude (35). \qed

We now prove that Kramer’s code achieves any rate tuple such that

(39)

Define the difference vector . Since the minimum distance between message points is , by the union of events bound and the Chebyshev inequality, the probability of error of Kramer’s code is upper bounded as

(40)

Rewriting the encoding rule in (IV-A) as

and comparing it with the decoder’s estimation rule in (34) we have . Hence, with diagonal elements and (40) can be written as

(41)

But by Lemma 6, . Therefore, as .

Finally, by Lemma 6 and the Césaro mean lemma [20], the asymptotic power of sender satisfies

Hence, Kramer’s code satisfies the power constraints for sufficiently large , provided that

(42)

This completes the proof of Theorem 2.

Iv-C Achievability Proof of Theorem 1

Fix any such that