Linear-Feedback Sum-Capacity for Gaussian Multiple Access Channels with Feedback
The capacity region of the $N$-sender Gaussian multiple access channel with feedback is not known in general. This paper studies the class of linear-feedback codes, which includes (nonlinear) nonfeedback codes at one extreme and the linear-feedback codes by Schalkwijk and Kailath, Ozarow, and Kramer at the other. The linear-feedback sum-capacity under symmetric power constraints is characterized, namely, the maximum sum-rate achieved by linear-feedback codes when each sender has the same block power constraint $P$. In particular, it is shown that Kramer's code achieves this linear-feedback sum-capacity. The proof involves the dependence balance condition introduced by Hekstra and Willems and extended by Kramer and Gastpar, and the analysis of the resulting nonconvex optimization problem via a Lagrange dual formulation. Finally, an observation is presented based on the properties of the conditional maximal correlation—an extension of the Hirschfeld–Gebelein–Rényi maximal correlation—which reinforces the conjecture that Kramer's code achieves not only the linear-feedback sum-capacity but also the sum-capacity itself (the maximum sum-rate achieved by arbitrary feedback codes).
Feedback from the receivers to the senders can improve the performance of communication systems in various ways. For example, as first shown by Gaarder and Wolf, feedback can enlarge the capacity region of memoryless multiple access channels by enabling the distributed senders to cooperate via coherent transmissions.
In this paper, we study the sum-capacity of the additive white Gaussian noise multiple access channel (Gaussian multiple access channel in short) with feedback depicted in Figure 1. For $N = 2$ senders, Ozarow established the capacity region, which—unlike for the point-to-point channel—is strictly larger than the one without feedback. The capacity-achieving code proposed by Ozarow is an extension of the Schalkwijk–Kailath code [3, 4] for Gaussian point-to-point channels.
For $N \ge 3$, the capacity region is not known in general. Thomas proved that feedback can at most double the sum-capacity, and later Ordentlich showed that the same bound holds for the entire capacity region even when the noise sequence is not white (cf. Pombra and Cover). More recently, Kramer extended Ozarow's linear-feedback code to $N$ senders and proved that this code achieves the sum-capacity under symmetric block power constraints on all the senders when the power $P$ is above a certain threshold (see (4) in Section II) that depends on the number of senders $N$.
In this paper, we focus on the class of linear-feedback codes, where the feedback signals are incorporated linearly into the transmit signals (see Definition 1 in Section II). This class of codes includes the linear-feedback codes by Schalkwijk and Kailath, Ozarow, and Kramer, as well as arbitrary (nonlinear) nonfeedback codes.
We characterize the linear-feedback sum-capacity $C_L(N;P)$ under symmetric block power constraints, which is the maximum sum-rate achieved by linear-feedback codes under an equal block power constraint $P$ at every sender. Our main contribution is the proof of the converse. We first prove an upper bound on $C_L(N;P)$ in the form of a multiletter optimization problem over Gaussian distributions satisfying a certain functional relationship (cf. Cover and Pombra). Next, we relax the functional relationship by considering a dependence balance condition, introduced by Hekstra and Willems and extended by Kramer and Gastpar, and derive an optimization problem over the set of positive semidefinite (covariance) matrices. Lastly, we carefully analyze this nonconvex optimization problem via a Lagrange dual formulation.
The linear-feedback sum-capacity is achieved by Kramer's linear-feedback code. Hence, this rather simple code, which iteratively refines the receiver's knowledge of the messages, is sum-rate optimal among the class of linear-feedback codes. For completeness, we briefly describe Kramer's linear-feedback code and analyze it via properties of discrete algebraic Riccati recursions (cf. Wu et al.). This analysis differs from the original approaches of Ozarow and Kramer.
The complete characterization of the sum-capacity under symmetric block power constraints, i.e., the maximum sum-rate achieved by arbitrary feedback codes, remains open. It has been commonly believed, however, that linear-feedback codes achieve the sum-capacity. We offer an observation that further supports this conjecture. By introducing and analyzing the properties of the conditional maximal correlation, an extension of the Hirschfeld–Gebelein–Rényi maximal correlation to the case where an additional common random variable is shared, we show in Section V that linear-feedback codes are greedy optimal for a multiletter optimization problem that upper bounds the sum-capacity.
The rest of the paper is organized as follows. In Section II we formally state the problem and present our main result. Section III provides the proof of the converse, and Section IV gives an alternative proof of achievability via Kramer's linear-feedback code. Section V concludes the paper with a discussion of potential extensions of the main ideas to nonequal power constraints and to arbitrary feedback codes, and with a proof that linear-feedback codes are greedy optimal for a multiletter optimization problem that upper bounds the sum-capacity.
We closely follow standard information-theoretic notation. In particular, a random variable is denoted by an upper case letter (e.g., $X$) and its realization by a lower case letter (e.g., $x$). The shorthand $X^n$ denotes the tuple (or the column vector) of random variables $(X_1, \dots, X_n)$, and $x^n$ denotes their realizations. A random column vector and its realization are also denoted by boldface letters (e.g., $\mathbf{X}$ and $\mathbf{x}$). Uppercase letters (e.g., $A$) also denote deterministic matrices, which can be distinguished from random variables by the context. The $(i,j)$-th element of a matrix $A$ is denoted by $A_{ij}$. The conjugate transpose of a real or complex matrix $A$ is denoted by $A^*$ and the determinant of $A$ by $|A|$. For the cross-covariance matrix of two random vectors $\mathbf{X}$ and $\mathbf{Y}$ we use the shorthand $K_{\mathbf{X}\mathbf{Y}}$, and for the covariance matrix of a random vector $\mathbf{X}$ we use $K_{\mathbf{X}}$. Calligraphic letters (e.g., $\mathcal{X}$) denote discrete sets. Let $X^n$ be a tuple of random variables and $S \subseteq [1:n]$. The subtuple of random variables with indices in $S$ is denoted by $X(S)$. For every positive real number $a$, the shorthand $[1:a]$ denotes the set of integers $\{1, 2, \dots, \lfloor a \rfloor\}$.
II Problem Setup and the Main Result
Consider the communication problem over a Gaussian multiple access channel with feedback depicted in Figure 1. Each sender $k \in [1:N]$ wishes to transmit a message $M_k$ reliably to the common receiver. At each time $i$, the output of the channel is
$$Y_i = X_{1,i} + X_{2,i} + \cdots + X_{N,i} + Z_i \qquad (1)$$
where $\{Z_i\}$ is a discrete-time zero-mean white Gaussian noise process with unit average power, i.e., $Z_i \sim \mathrm{N}(0,1)$, independent of the messages $M_1, \dots, M_N$. We assume that the output symbols are causally fed back to each sender, so that the transmitted symbol $X_{k,i}$ from sender $k$ at time $i$ can depend on both the previous channel output sequence $Y^{i-1}$ and the message $M_k$.
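As a sanity check, the channel (1) is easy to simulate numerically; the following sketch is our own illustration, with arbitrary i.i.d. inputs standing in for actual codewords, and draws unit-variance white Gaussian noise and superimposes the $N$ senders' signals.

```python
import numpy as np

def gaussian_mac(x, rng):
    """One block of the Gaussian MAC: Y_i = X_{1,i} + ... + X_{N,i} + Z_i.

    x   -- (N, n) array; row k holds sender k's n input symbols
    rng -- numpy Generator supplying the unit-variance white noise Z
    """
    n = x.shape[1]
    z = rng.standard_normal(n)      # Z_i ~ N(0, 1), i.i.d. across time
    return x.sum(axis=0) + z

rng = np.random.default_rng(0)
N, n, P = 3, 1000, 2.0
x = np.sqrt(P) * rng.standard_normal((N, n))   # inputs meeting power P on average
y = gaussian_mac(x, rng)
print(y.shape, round(y.var(), 2))  # empirical Var(Y) should be near N*P + 1 = 7
```

For independent inputs the output power is the sum of the individual powers plus the unit noise power; the coherent power gain discussed later arises precisely when feedback correlates the inputs.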
We define a $(2^{nR_1}, \dots, 2^{nR_N}, n)$ feedback code as
$N$ message sets $[1:2^{nR_k}]$, where $M_k \in [1:2^{nR_k}]$ for $k \in [1:N]$;
a set of $N$ encoders, where encoder $k$ assigns a symbol $x_{k,i}(m_k, y^{i-1})$ to its message $m_k$ and the past channel output sequence $y^{i-1}$ for $i \in [1:n]$; and
a decoder that assigns message estimates $\hat{m}_k$, $k \in [1:N]$, to each received sequence $y^n$.
We assume throughout that the message tuple $(M_1, \dots, M_N)$ is uniformly distributed over $[1:2^{nR_1}] \times \cdots \times [1:2^{nR_N}]$. The probability of error is defined as
$$P_e^{(n)} = \mathrm{P}\{(\hat{M}_1, \dots, \hat{M}_N) \ne (M_1, \dots, M_N)\}.$$
A rate tuple $(R_1, \dots, R_N)$ and its corresponding sum-rate $R_1 + \cdots + R_N$ are said to be achievable under the power constraints $(P_1, \dots, P_N)$ if there exists a sequence of $(2^{nR_1}, \dots, 2^{nR_N}, n)$ feedback codes such that the expected block power constraints
$$\frac{1}{n} \sum_{i=1}^{n} \mathrm{E}\bigl[X_{k,i}^2\bigr] \le P_k, \qquad k \in [1:N],$$
are satisfied and $P_e^{(n)} \to 0$ as $n \to \infty$. The supremum over all achievable sum-rates is referred to as the sum-capacity. In most of the paper we will be interested in the case of symmetric power constraints $P_1 = \cdots = P_N = P$. In this case we denote the sum-capacity by $C(N;P)$.
Our focus will be on the special class of linear-feedback codes defined as follows.
A feedback code is said to be a linear-feedback code if each encoder $k$ has the form
the (potentially nonlinear) nonfeedback mapping $\psi_k$ is independent of the feedback and maps the message $M_k$ to a $d$-dimensional real vector (message point) $\Theta_k = \psi_k(M_k)$ for some $d \ge 1$; and
the linear-feedback mapping maps the message point $\Theta_k$ and the past feedback output sequence $Y^{i-1}$ linearly to the channel input symbol $X_{k,i}$.
The class of linear-feedback codes includes as special cases the feedback codes by Schalkwijk and Kailath, Ozarow, and Kramer, and all nonfeedback codes. To recover the codes by Schalkwijk and Kailath and by Ozarow it suffices to choose $d = 1$; for Kramer's code we need $d = 2$ (a complex message point); and to recover all nonfeedback codes we choose $d = n$ and each message point $\Theta_k$ equal to the codeword $X_k^n$ sent by encoder $k$.
The linear-feedback sum-capacity is defined as the maximum achievable sum-rate using only linear-feedback codes. Under symmetric block power constraints $P_1 = \cdots = P_N = P$, we denote the linear-feedback sum-capacity by $C_L(N;P)$.
We are ready to state the main result of this paper.
For the Gaussian multiple access channel with symmetric block power constraints $P$, the linear-feedback sum-capacity is
where $\phi$ is the unique solution to
in the interval $(1, N)$.
The proof of Theorem 1 has several parts. The converse is proved in Section III. Achievability follows by [8, Theorem 2] and can also be proved based on Kramer's linear-feedback code. For completeness, we present a simple description and analysis of Kramer's code in Section IV. Finally, the fact that (3) has a unique solution in $(1, N)$ is proved in Appendix A.
Kramer showed that when the power constraint $P$ exceeds a certain threshold, namely the unique positive solution to
then the sum-capacity is given by the right-hand side of (2). Thus, in this case Theorem 1 follows directly from Kramer's more general result. Consequently, when $P$ exceeds this threshold, the linear-feedback sum-capacity coincides with the sum-capacity, i.e., $C_L(N;P) = C(N;P)$. It is not known whether this equality holds for all powers $P$; see also our discussion in Section V-B.
Since $\phi \in [1, N]$, we can define a parameter $\rho \in [0, 1]$ through $\phi = 1 + (N-1)\rho$. Intuitively, $\rho$ measures the correlation between the transmitted signals. For example, when $N = 2$, the corresponding $\rho$ coincides with the optimal correlation coefficient in Ozarow's code. Thus, $\rho$ captures the amount of cooperation (coherent power gain) that can be established among the senders using linear-feedback codes, where $\rho = 0$ corresponds to no cooperation and $\rho = 1$ corresponds to full cooperation. For fixed $N$, $\rho$ is strictly increasing in $P$ (see Appendix A); thus, more power allows for more cooperation. Moreover, $\rho \to 0$ as $P \to 0$ and $\rho \to 1$ as $P \to \infty$, which can be seen as follows. We rewrite identity (3) as
and notice that, as $P \to 0$, the left-hand side (LHS) of (5) can equal its right-hand side (RHS) only if $\rho \to 0$. On the other hand, as $P \to \infty$, the LHS tends to a constant while the RHS tends to infinity unless $1 - \rho$ tends to 0. Thus, by contradiction, $\rho \to 1$ as $P \to \infty$.
By the above observation, we have the following two corollaries to Theorem 1 for the low and high signal-to-noise ratio (SNR) regimes.
In the low SNR regime, almost no cooperation is possible and the linear-feedback sum-capacity approaches the sum-capacity without feedback:
In the high SNR regime, the linear-feedback sum-capacity approaches the sum-capacity with full cooperation, in which all the transmitted signals are coherently aligned, with combined SNR equal to $N^2 P$:
III Proof of the Converse
In this section we show that under the symmetric block power constraints $P_1 = \cdots = P_N = P$, the linear-feedback sum-capacity is upper bounded as
where $\phi$ is defined in (3).
The proof involves five steps. First, we derive an upper bound on the linear-feedback sum-capacity based on Fano's inequality and the maximum entropy property of Gaussian distributions (see Lemma 1). Second, we relax the problem by replacing the functional structure of the optimizing Gaussian input distributions in (8) with a dependence balance condition [10, 11], and we rewrite the resulting nonconvex optimization problem as one over positive semidefinite matrices (see Lemma 2). Third, we consider the Lagrange dual function, which yields an upper bound for every choice of the Lagrange multipliers (see Lemma 3). Fourth, by exploiting the convexity and symmetry of the problem, we simplify the upper bound into an unconstrained optimization problem (still nonconvex) that involves only two optimization variables (see Lemma 4). Fifth and last, using brute-force calculus and strong duality, we show that there exists a choice of multipliers for which the corresponding upper bound coincides with the right-hand side of (6) (see Lemma 5).
The details are as follows.
The linear-feedback sum-capacity is upper bounded as
where (for simplicity of notation, we do not include the parameter explicitly in most functions that we define in this section)
and the maximum is over all inputs of the form
such that the functions are linear, the message-point vector is Gaussian, independent of the noise vector, and the power constraints are satisfied.
By Fano's inequality,
for some $\epsilon_n$ that tends to zero along with $P_e^{(n)}$ as $n \to \infty$. Thus, for any achievable rate tuple $(R_1, \dots, R_N)$, the sum-rate can be upper bounded as follows:
where the maximum is over all input distributions induced by a linear-feedback code satisfying the symmetric power constraints $P$, i.e., over all choices of independent message-point random vectors and linear functions such that the resulting inputs satisfy the power constraints. Now let
be a Gaussian random vector with the same covariance matrix as the message-point vector, independent of the noise. Using the same linear functions as in the given code, define
where the output corresponds to a Gaussian MAC driven by this Gaussian input tuple. It is not hard to see that the resulting tuple is jointly Gaussian with zero mean and has the same covariance matrix as the original one. Therefore, by the conditional maximum entropy theorem [5, Lemma 1] we have
We define the following functions on $N$-by-$N$ covariance matrices $K$:
It can be readily checked that both functions are concave in $K$ (see Appendix B).
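For orientation, the standard Fano-based chain behind Lemma 1 can be sketched as follows (in our own generic notation; the paper's displayed equations use its specific symbols):

```latex
\begin{align*}
n \sum_{k=1}^{N} R_k
  &\le I(M_1,\dots,M_N; Y^n) + n\epsilon_n \\
  &= \sum_{i=1}^{n} \bigl[ h(Y_i \mid Y^{i-1}) - h(Y_i \mid Y^{i-1}, M_1,\dots,M_N) \bigr] + n\epsilon_n \\
  &= \sum_{i=1}^{n} \bigl[ h(Y_i \mid Y^{i-1}) - h(Z_i) \bigr] + n\epsilon_n,
\end{align*}
```

since, given the messages and the past outputs, the inputs are determined and the only remaining randomness in $Y_i$ is the noise $Z_i$. The conditional entropies $h(Y_i \mid Y^{i-1})$ are then bounded using the maximum entropy property of Gaussian distributions.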
The linear-feedback sum-capacity is upper bounded as
where the maximum is over all $N$-by-$N$ covariance matrices $K$ such that
Furthermore, recall that the tuple is jointly Gaussian. Therefore, for every $i$, conditioned on $Y^{i-1} = y^{i-1}$, the input (column) vector $X_i = (X_{1,i}, \dots, X_{N,i})^T$ is zero-mean Gaussian with covariance matrix
irrespective of the realization $y^{i-1}$. Now consider
which implies that
Hence, condition (19) reduces to (18). Rewriting (7) in terms of covariance matrices via (20) and relaxing the functional relationship (8) by the dependence balance condition (18) completes the proof of Lemma 2. \qed
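The key fact used here is that for jointly Gaussian vectors the conditional covariance is the Schur complement of the joint covariance matrix, and in particular does not depend on the realization being conditioned on. A minimal numerical sketch, with an arbitrary example covariance of our choosing:

```python
import numpy as np

def conditional_covariance(K, nx, ny):
    """Schur complement: Cov(X | Y) for jointly Gaussian (X, Y) whose
    joint covariance K is partitioned into nx + ny coordinates."""
    Kxx = K[:nx, :nx]
    Kxy = K[:nx, nx:]
    Kyy = K[nx:, nx:]
    return Kxx - Kxy @ np.linalg.solve(Kyy, Kxy.T)

# Example: (X1, X2, Y) with a simple joint covariance matrix.
K = np.array([[2.0, 1.0, 1.0],
              [1.0, 2.0, 1.0],
              [1.0, 1.0, 2.0]])
print(conditional_covariance(K, nx=2, ny=1))
# [[1.5 0.5]
#  [0.5 1.5]]
```

The same computation, applied per time step with $Y$ playing the role of the past outputs, yields the conditional input covariance used in (20).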
Although both functions are concave, their difference is neither concave nor convex. Hence, the optimization problem in (16) is nonconvex.
where the maximum is over all positive semidefinite matrices (without any other constraints). Here, $\lambda_1, \dots, \lambda_N \ge 0$ are the Lagrange multipliers corresponding to the power constraints (17) and $\nu \ge 0$ is the Lagrange multiplier corresponding to the dependence balance constraint (18). Finally, we choose symmetric multipliers $\lambda_1 = \cdots = \lambda_N = \lambda$, which yields
and completes the proof of Lemma 3. \qed
For every $\lambda, \nu \ge 0$,
Suppose that a covariance matrix $K$ attains the maximum in (23). For each permutation $\pi$ on $[1:N]$, let $K_\pi$ be the covariance matrix obtained by permuting the rows and columns of $K$ according to $\pi$, i.e., $(K_\pi)_{ij} = K_{\pi(i)\pi(j)}$ for $i, j \in [1:N]$. Let
be the arithmetic average of $K_\pi$ over all $N!$ permutations. Clearly, $\bar{K}$ is positive semidefinite and of the form
for some $a$ and $b$. (The conditions on $a$ and $b$ ensure that $\bar{K}$ is positive semidefinite.) We now show that $\bar{K}$ also attains the maximum in (23). First, notice that the function depends on the matrix only via the sum of its entries, and hence
Also, by symmetry, each $K_\pi$ attains the same value as $K$. Hence, by the concavity of the objective (see Appendix B) and Jensen's inequality, $\bar{K}$ attains a value at least as large. Therefore,
which completes the proof of Lemma 4. \qed
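The averaging argument can be verified numerically: for any symmetric matrix, averaging over all simultaneous row/column permutations produces a matrix whose diagonal entries all equal the mean of the original diagonal and whose off-diagonal entries all equal the mean of the original off-diagonals. A sketch with an arbitrary example matrix:

```python
import itertools
import numpy as np

def symmetrize(K):
    """Average K over all simultaneous row/column permutations."""
    n = K.shape[0]
    perms = list(itertools.permutations(range(n)))
    return sum(K[np.ix_(p, p)] for p in perms) / len(perms)

K = np.array([[4.0, 1.0, 2.0],
              [1.0, 5.0, 3.0],
              [2.0, 3.0, 6.0]])
Kbar = symmetrize(K)
print(Kbar)
# diagonal entries all equal (4+5+6)/3 = 5, off-diagonals all (1+2+3)/3 = 2
```

Because the averaged matrix is a convex combination of permuted copies, positive semidefiniteness is preserved, matching the claim in the proof.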
There exist $\lambda, \nu \ge 0$ such that
where $\phi$ is defined in (3).
Let be the unique nonnegative solution to
or equivalently to
for any . (The inequality follows because might be larger than .)
Now let . Then, is nondecreasing and concave in for fixed as shown in Appendix D. Thus,
where the first equality follows by Slater's condition and strong duality, and the last equality follows by the monotonicity noted above. Alternatively, the equality in (30) can be viewed as a complementary slackness condition. Indeed, since the objective is not bounded from above in the absence of the power constraint, the optimal Lagrange multiplier $\lambda$ must be positive. Therefore, the corresponding power constraint is active at the optimum, i.e., it holds with equality.
IV Achievability via Kramer's Code
We present (a slightly modified version of) Kramer’s linear-feedback code and analyze it based on the properties of discrete algebraic Riccati equations (DARE). In particular, we establish the following:
Suppose that $\lambda_1, \dots, \lambda_N$ are real numbers and $e^{j\theta_1}, \dots, e^{j\theta_N}$ are distinct complex numbers on the unit circle. Let $A = \operatorname{diag}(\lambda_1 e^{j\theta_1}, \dots, \lambda_N e^{j\theta_N})$ be a diagonal matrix, let $\mathbf{1}$ be the all-one column vector, and let $\Pi$ be the unique positive-definite solution to the discrete algebraic Riccati equation (DARE)
Then, the rate tuple $(R_1, \dots, R_N)$ is achievable under power constraints $(P_1, \dots, P_N)$, provided that $R_k < \log \lambda_k$ and $P_k \ge \Pi_{kk}$ for $k \in [1:N]$.
IV-A Kramer's Linear-Feedback Code
Following Kramer, we represent a pair of consecutive uses of the given real Gaussian MAC as a single use of a complex Gaussian MAC. We represent the message point of sender $k$ by the complex scalar $\Theta_k$ (corresponding to a pair of real symbols in the original real channel) and let $\Theta = (\Theta_1, \dots, \Theta_N)^T$ be the (column) vector of message points.
The coding scheme has the following parameters: real coefficients $\lambda_1, \dots, \lambda_N$ and distinct complex numbers $e^{j\theta_1}, \dots, e^{j\theta_N}$ on the unit circle.
Nonfeedback mappings: For each $k \in [1:N]$, we divide the square with corners at $\pm 1 \pm j$ on the complex plane into $2^{2nR_k}$ equal subsquares. We then assign a different message to each subsquare and denote the complex number at the center of the subsquare assigned to message $m_k$ by $\theta(m_k)$. The message point of sender $k$ is then $\Theta_k = \theta(M_k)$.
Linear-feedback mappings: Let $X_i = (X_{1,i}, \dots, X_{N,i})^T$ denote the (column) vector of channel inputs at time $i$. We use the linear-feedback mappings
where $A$ is a diagonal matrix with diagonal entries $\lambda_k e^{j\theta_k}$, $k \in [1:N]$, and
$\hat{\Theta}_i$ is the linear minimum mean squared error (MMSE) estimate of $\Theta$ given the channel outputs observed so far.
Decoding: Upon receiving $y^n$, the decoder forms a message estimate vector
and for each $k \in [1:N]$ chooses $\hat{m}_k$ such that $\theta(\hat{m}_k)$ is the center point of the subsquare containing the $k$-th component of the estimate vector.
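As an illustration of the nonfeedback mapping, the following sketch (our normalization: square with corners at $\pm 1 \pm j$) indexes the subsquares of a $\sqrt{M} \times \sqrt{M}$ grid and returns the center point assigned to each message:

```python
import numpy as np

def message_point(m, M, half_width=1.0):
    """Center of the m-th subsquare (m in 0..M-1) when the square with
    corners at +-half_width*(1 +- 1j) is split into M = s*s subsquares."""
    s = int(round(np.sqrt(M)))
    assert s * s == M, "M must be a perfect square"
    side = 2.0 * half_width / s          # side length of each subsquare
    row, col = divmod(m, s)
    real = -half_width + (col + 0.5) * side
    imag = -half_width + (row + 0.5) * side
    return complex(real, imag)

# Four messages -> centers at (+-0.5, +-0.5).
print([message_point(m, 4) for m in range(4)])
```

Decoding then amounts to quantizing the decoder's estimate back to the nearest subsquare center, inverting this map.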
IV-B Analysis of the Probability of Error
Our analysis is based on the following auxiliary lemma; we use the shorthand notation introduced above.
where $\Pi$ is the unique positive-definite solution to the DARE (31).
We rewrite the channel outputs in (1) as
From (IV-A) we have
for all $i$. Since $A$ has no eigenvalue on the unit circle and the pair $(A, \mathbf{1}^*)$ is detectable (a pair $(A, b^*)$ is said to be detectable if there exists a column vector $\ell$ such that all the eigenvalues of $A - \ell b^*$ lie inside the unit circle; for a diagonal matrix $A$, the pair $(A, \mathbf{1}^*)$ is detectable if and only if all the unstable eigenvalues of $A$, i.e., the ones on or outside the unit circle, are distinct [18, Appendix C]), we use Lemma 2.5 in Wu et al. to conclude (35). \qed
We now prove that Kramer's code achieves any rate tuple $(R_1, \dots, R_N)$ such that
Define the difference vector between the estimate and the true message-point vector. Since the minimum distance between message points is given by the subsquare side length, by the union of events bound and the Chebyshev inequality, the probability of error of Kramer's code is upper bounded as
Rewriting the encoding rule in (IV-A) as
But by Lemma 6, the estimation error covariance remains bounded. Therefore, $P_e^{(n)} \to 0$ as $n \to \infty$.
IV-C Achievability Proof of Theorem 1
Fix any such that