LinearFeedback SumCapacity for Gaussian Multiple Access Channels with Feedback
Abstract
The capacity region of the sender Gaussian multiple access channel with feedback is not known in general. This paper studies the class of linearfeedback codes that includes (nonlinear) nonfeedback codes at one extreme and the linearfeedback codes by Schalkwijk and Kailath, Ozarow, and Kramer at the other extreme. The linearfeedback sumcapacity under symmetric power constraints is characterized, the maximum sumrate achieved by linearfeedback codes when each sender has the equal block power constraint . In particular, it is shown that Kramer’s code achieves this linearfeedback sumcapacity. The proof involves the dependence balance condition introduced by Hekstra and Willems and extended by Kramer and Gastpar, and the analysis of the resulting nonconvex optimization problem via a Lagrange dual formulation. Finally, an observation is presented based on the properties of the conditional maximal correlation—an extension of the Hirschfeld–Gebelein–Rényi maximal correlation—which reinforces the conjecture that Kramer’s code achieves not only the linearfeedback sumcapacity, but also the sumcapacity itself (the maximum sumrate achieved by arbitrary feedback codes).
I Introduction
Feedback from the receivers to the senders can improve the performance of the communication systems in various ways. For example, as first shown by Gaarder and Wolf [1], feedback can enlarge the capacity region of memoryless multiple access channels by enabling the distributed senders to cooperate via coherent transmissions.
In this paper, we study the sumcapacity of the additive white Gaussian noise multiple access channel (Gaussian multiple access channel in short) with feedback depicted in Figure 1. For senders, Ozarow [2] established the capacity region which—unlike for the pointtopoint channel—is strictly larger than the one without feedback. The capacityachieving code proposed by Ozarow is an extension of the Schalkwijk–Kailath code [3, 4] for Gaussian pointtopoint channels.
For , the capacity region is not known in general. Thomas [5] proved that feedback can at most double the sum capacity, and later Ordentlich [6] showed that the same bound holds for the entire capacity region even when the noise sequence is not white (cf. Pombra and Cover [7]). More recently, Kramer [8] extended Ozarow’s linearfeedback code to senders, and proved that this code achieves the sumcapacity under symmetric block power constraints on all the senders, when the power is above a certain threshold (see (4) in Section II) that depends on the number of senders .
In this paper, we focus on the class of linearfeedback codes, where the feedback signals are incorporated linearly into the transmit signals (see Definition 1 in Section II). This class of codes includes the linearfeedback codes by Schalkwijk and Kailath [3], Ozarow [2], and Kramer [8] as well as arbitrary (nonlinear) nonfeedback codes.
We characterize the linearfeedback sumcapacity under symmetric block power constraints , which is the maximum sumrate achieved by linearfeedback codes under equal block power constraints at all the senders. Our main contribution is the proof of the converse. We first prove an upper bound on , which is a multiletter optimization problem over Gaussian distributions satisfying a certain functional relationship (cf. Cover and Pombra [9]). Next, we relax the functional relationship by considering a dependence balance condition, introduced by Hekstra and Willems [10] and extended by Kramer and Gastpar [11], and derive an optimization problem over the set of positive semidefinite (covariance) matrices. Lastly, we carefully analyze this nonconvex optimization problem via a Lagrange dual formulation [12].
The linearfeedback sumcapacity is achieved by Kramer’s linearfeedback code. Hence, this rather simple code, which iteratively refines the receiver’s knowledge about the messages, is sumrate optimal among the class of linearfeedback codes. For completeness, we briefly describe Kramer’s linearfeedback code and analyze it via properties of discrete algebraic Riccati recursions (cf. Wu et al. [13]). This analysis differs from the original approaches by Ozarow [2] and Kramer [8].
The complete characterization of the sumcapacity under symmetric block power constraints , i.e., the maximum sumrate achieved by arbitrary feedback codes, still remains open. However, it has been commonly believed (cf. [11],[13]) that linearfeedback codes achieve the sumcapacity, i.e., . We offer an observation that further supports this conjecture. By introducing and analyzing the properties of conditional maximal correlation, which is an extension of the Hirschfeld–Gebelein–Rényi maximal correlation [14] to the case where an additional common random variable is shared, we show in Section V that the linearfeedback codes are greedy optimal for a multiletter optimization problem that upper bounds .
The rest of the paper is organized as follows. In Section II we formally state the problem and present our main result. Section III provides the proof of the converse and Section IV gives an alternative proof of achievability via Kramer’s linearfeedback code. Section V concludes the paper with a discussion on potential extensions of the main ideas to nonequal power constraints and arbitrary feedback codes, and with a proof that linearfeedback codes are greedy optimal for a multiletter optimization problem that upper bounds .
We closely follow the notation in [15]. In particular, a random variable is denoted by an upper case letter (e.g., ) and its realization is denoted by a lower case letter (e.g., ). The shorthand notation is used to denote the tuple (or the column vector) of random variables , and is used to denote their realizations. A random column vector and its realization are denoted by boldface letters (e.g. and ) as well. Uppercase letters (e.g., ) also denote deterministic matrices, which can be distinguished from random variables based on the context. The element of a matrix is denoted by . The conjugate transpose of a real or complex matrix is denoted by and the determinant of is denoted by . For the crosscovariance matrix of two random vectors and , we use the shorthand notation and for the covariance matrix of a random vector we use . Calligraphic letters (e.g., ) denote discrete sets. Let be a tuple of random variables and . The subtuple of random variables with indices from is denoted by . For every positive real number , the shorthand notation is used to denote the set of integers .
Ii Problem Setup and the Main Result
Consider the communication problem over a Gaussian multiple access channel with feedback depicted in Figure 1. Each sender wishes to transmit a message reliably to the common receiver. At each time , the output of the channel is
(1) 
where is a discretetime zeromean white Gaussian noise process with unit average power, i.e., , and is independent of . We assume that the output symbols are causally fed back to each sender, and that the transmitted symbol from sender at time can thus depend on both the previous channel output sequence and the message .
We define a feedback code as

message sets , where for ;

a set of encoders, where encoder assigns a symbol to its message and the past channel output sequence for ; and

a decoder that assigns message estimates , , to each received sequence .
We assume throughout that is uniformly distributed over . The probability of error is defined as
A rate tuple and its corresponding sumrate are said to be achievable under the power constraints if there exists a sequence of feedback codes such that the expected block power constraints
are satisfied and . The supremum over all achievable sumrates is referred to as the sumcapacity. In most of the paper, we will be interested in the case of symmetric power constraints . In this case we denote the sumcapacity by .
Our focus will be on the special class of linearfeedback codes defined as follows.
Definition 1
A feedback code is said to be a linearfeedback code if the encoder has the form
where

the (potentially nonlinear) nonfeedback mapping is independent of and maps the message to a dimensional real vector (message point) for some ; and

the linearfeedback mapping maps the message point and the past feedback output sequence to the channel input symbol .
The class of linearfeedback codes includes as special cases the feedback codes by Schalkwijk and Kailath [3], Ozarow [2], and Kramer [8], and all nonfeedback codes. To recover the codes by Schalkwijk and Kailath [3] and Ozarow [2] it suffices to choose ; for Kramer’s code [8] we need ; and to recover all nonfeedback codes we have to choose and each message point equal to the codeword sent by encoder .
The linearfeedback sumcapacity is defined as the maximum achievable sumrate using only linearfeedback codes. Under symmetric block power constraints , we denote the linearfeedback sumcapacity by .
We are ready to state the main result of this paper.
Theorem 1
For the Gaussian multiple access channel with symmetric block power constraints , the linearfeedback sumcapacity is
(2) 
where is the unique solution to
(3) 
in the interval .
The proof of Theorem 1 has several parts. The converse is proved in Section III. The proof of achievability follows by [8, Theorem 2] and can be proved based on Kramer’s linearfeedback code [8]. For completeness, we present a simple description and analysis of Kramer’s code in Section IV. Finally, the property that (3) has a unique solution in is proved in Appendix A.
Remark 1
Kramer showed [8] that when the power constraint exceeds the threshold , which is the unique positive solution to
(4) 
then the sumcapacity is given by the righthand side of (2). Thus, for this case Theorem 1 follows directly from Kramer’s more general result. Consequently, when , then the linearfeedback sumcapacity coincides with the sumcapacity, i.e., . It is not known whether this equality holds for all powers ; see also our discussion in Section VB.
Remark 2
Since , we can define a parameter so that . Intuitively, measures the correlation between the transmitted signals. For example, when , the corresponding coincides with the optimal correlation coefficient in [2]. Thus, captures the amount of cooperation (coherent power gain) that can be established among the senders using linearfeedback codes, where corresponds to no cooperation and corresponds to full cooperation. For a fixed , is strictly increasing (see Appendix A); thus, more power allows for more cooperation. Moreover, as and as , which is seen as follows. We rewrite identity (3) as
(5) 
and notice that the lefthand side (LHS) of (5) can be written as , where tends to 0 faster than . Thus, the LHS of (5) can equal its righthand side (RHS) only if as , or equivalently, as . On the other hand, as , the LHS tends to a constant while the RHS tends to infinity unless tends to 0. Thus, by contradiction, as .
By the above observation, we have the following two corollaries to Theorem 1 for the low and high signaltonoise ratio (SNR) regimes.
Corollary 1
In the low SNR regime, almost no cooperation is possible and the linearfeedback sumcapacity approaches the sumcapacity without feedback:
Corollary 2
In the high SNR regime, the linearfeedback sumcapacity approaches the sumcapacity with full cooperation where all the transmitted signals are coherently aligned with combined SNR equal to :
Iii Proof of the Converse
In this section we show that under the symmetric block power constraints , the linearfeedback sumcapacity is upper bounded as
(6) 
where is defined in (3).
The proof involves five steps. First, we derive an upper bound on the linearfeedback sumcapacity based on Fano’s inequality and the maximum entropy property of Gaussian distributions (see Lemma 1). Second, we relax the problem by replacing the functional structure in the optimizing Gaussian input distributions (8) with a dependence balance condition [10, 11], and we rewrite the resulting nonconvex optimization problem as one over positive semidefinite matrices (see Lemma 2). Third, we consider the Lagrange dual function , which yields an upper bound on for every (see Lemma 3). Fourth, by exploiting the convexity and symmetry of the problem, we simplify the upper bound into an unconstrained optimization problem (which is still nonconvex) that involves only two optimization variables (see Lemma 4). Fifth and last, using bruteforce calculus and strong duality, we show that there exist such that the corresponding upper bound coincides with the right hand side of (6) (see Lemma 5).
The details are as follows.
Lemma 1
The linearfeedback sumcapacity is upper bounded as
where^{1}^{1}1For simplicity of notation we do not include the parameter explicitly in most functions that we define in this section, e.g., .
(7) 
and the maximum is over all inputs of the form
(8) 
such that the function is linear, the vector is Gaussian, independent of the noise vector and the tuple , and the power constraint is satisfied.
Proof:
By Fano’s inequality [16],
for some that tends to zero along with as . Thus, for any achievable rate tuple , the sumrate can be upper bounded as follows:
(9)  
(10)  
(11) 
where (10) and (11) follow by the data processing inequality and the memoryless property of the channel, respectively. Therefore, the linearfeedback sumcapacity is upper bounded as
(12) 
where the maximum is over all input distributions induced by a linearfeedback code satisfying the symmetric power constraints , i.e., over all choices of independent random vectors and linear functions such that the inputs satisfy the power constraints . Now let
be a Gaussian random vector with the same covariance matrix as , independent of . Using the same linear functions as in the given code, define
(13) 
where is the channel output of a Gaussian MAC corresponding to the input tuple . It is not hard to see that is jointly Gaussian with zero mean and of the same covariance matrix as . Therefore, by the conditional maximum entropy theorem [5, Lemma 1] we have
(14) 
Combining (12) and (14) and appropriately defining in (8) from in (13) completes the proof of Lemma 1. \qed
We define the following functions on by covariance matrices :
(15a)  
(15b) 
It can be readily checked that both functions are concave in (see Appendix B).
Lemma 2
The linearfeedback sumcapacity is upper bounded as
(16) 
where the maximum is over by covariance matrices such that
(17)  
(18) 
Proof:
Since is defined by the (causal) functional relationship in (8), by [10], [11, Theorem 1] we have the dependence balance condition
(19) 
Furthermore, recall that is jointly Gaussian. Therefore, for every , conditioned on , the input (column) vector is zeromean Gaussian with covariance matrix
irrespective of . Now consider
(20) 
Also consider
which implies that
(21) 
Hence, condition (19) reduces to (18). Rewriting (7) in terms of covariance matrices via (20) and relaxing the functional relationship (8) by the dependence balance condition (18) completes the proof of Lemma 2. \qed
Remark 3
Although both functions and are concave, their difference is neither concave nor convex. Hence, the optimization problem in (16) is nonconvex.
Proof:
By the standard Lagrange duality [12], for any , the maximum in (16) is upper bounded as
where the maximum is over (without any other constraints). Here, are the Lagrange multipliers corresponding to the power constraints (17) and is the Lagrange multiplier corresponding to the dependence balance constraint (18). Finally, we choose , which yields
and completes the proof of Lemma 3. \qed
Lemma 4
For every ,
(24) 
where
(25) 
and
(26a)  
(26b) 
Proof:
Suppose that a covariance matrix attains the maximum in (23). For each permutation on , let be the covariance matrix obtained by permuting the rows and columns of according to , i.e., for . Let
be the arithmetic average of over all permutations. Clearly, is positive semidefinite and of the form
(27) 
for some and . (The conditions on and assure that is positive semidefinite.) We now show that also attains the maximum in (23). First, notice that the function depends on the matrix only via the sum of its entries and hence
Similarly,
Also, by symmetry we have . Hence, by the concavity of (see Appendix B) and Jensen’s inequality, . Therefore,
and the maximum of (23) is also attained by . Finally, defining and simplifying (15a) and (15b) yields
which completes the proof of Lemma 4. \qed
Remark 4
Lemma 5
Proof:
Consider the optimization problem over , which defines in (24). Note that given by (25) is neither concave or convex in for . However, is concave in for fixed as shown in Appendix C.
Let be the unique nonnegative solution to
or equivalently to
(28) 
(That such a unique solution exists is easily verified considering the equivalent quadratic equation; see (70) in Appendix D.) Then, by the concavity of in for fixed and ,
(29) 
for any . (The inequality follows because might be larger than .)
Now let . Then, is nondecreasing and concave in for fixed as shown in Appendix D. Thus,
(30) 
where the first equality follows by Slater’s condition [12] and strong duality, and the last equality follows by the monotonicity of in . Alternatively, the equality in (30) can be viewed as the complementary slackness condition [12]. Indeed, since is not bounded from above, the optimal Lagrange multiplier must be positive. Therefore, the corresponding constraint is active at the optimum, i.e., .
Iv Achievability via Kramer’s Code
We present (a slightly modified version of) Kramer’s linearfeedback code and analyze it based on the properties of discrete algebraic Riccati equations (DARE). In particular, we establish the following:
Theorem 2
Suppose that are real numbers and are distinct complex numbers on the unit circle. Let be a diagonal matrix, be the allone column vector, and be the unique positivedefinite solution to the discrete algebraic Riccati equation (DARE)
(31) 
Then, a rate tuple is achievable under power constraints , provided that and , .
Iva Kramer’s LinearFeedback Code
Following [8], we represent a pair of consecutive uses of the given real Gaussian MAC as a single use of a complex Gaussian MAC. We represent the message point of sender by the complex scalar (corresponding to in the original real channel) and let be the (column) vector of message points.
The coding scheme has the following parameters: real coefficients and distinct complex numbers on the unit circle.
Nonfeedback mappings: For , we divide the square with corners at on the complex plane into equal subsquares. We then assign a different message to each subsquare and denote the complex number in the center of the subsquare by . The message point of sender is then .
Linearfeedback mappings: Let denote the (column) vector of channel inputs at time . We use the linearfeedback mappings
(32) 
where
(33) 
is a diagonal matrix with and
is the linear minimum mean squared error (MMSE) estimate of given .
Decoding: Upon receiving , the decoder forms a message estimate vector
(34) 
and for each chooses such that is the center point of the subsquare containing .
IvB Analysis of the Probability of Error
Our analysis is based on the following auxiliary lemma. We use the shorthand notation .
Lemma 6
(35) 
where is the unique positivedefinite solution to the DARE (31).
Proof:
We rewrite the channel outputs in (1) as
(36) 
From (IVA) we have
(37) 
where is the error covariance matrix of the linear MMSE estimate of given . Combining (36) and (37) we obtain the Riccati recursion [17]
(38) 
for . Since has no unitcircle eigenvalue and the pair is detectable,^{2}^{2}2A pair is said to be detectable if there exists a column vector such that all the eigenvalues of lie inside the unit circle. For a diagonal matrix , the pair is detectable if and only if all the unstable eigenvalues , i.e., the ones on or outside the unitcircle, are distinct [18, Appendix C]. we use Lemma 2.5 in [19] to conclude (35). \qed
We now prove that Kramer’s code achieves any rate tuple such that
(39) 
Define the difference vector . Since the minimum distance between message points is , by the union of events bound and the Chebyshev inequality, the probability of error of Kramer’s code is upper bounded as
(40) 
Rewriting the encoding rule in (IVA) as
and comparing it with the decoder’s estimation rule in (34) we have . Hence, with diagonal elements and (40) can be written as
(41) 
But by Lemma 6, . Therefore, as .
IvC Achievability Proof of Theorem 1
Fix any such that