On the Capacity of Vector Gaussian Channels With Bounded Inputs
Abstract
The capacity of a deterministic multiple-input multiple-output (MIMO) channel under peak and average power constraints is investigated. For the identity channel matrix, the approach of Shamai et al. is generalized to higher-dimensional settings to derive the necessary and sufficient conditions for the optimal input probability density function. This approach avoids the use of the identity theorem for holomorphic functions of several complex variables, which seems to fail in multidimensional scenarios. It is proved that the support of the capacity-achieving distribution is a finite set of hyperspheres, with mutually independent phases and amplitude in the spherical domain. Subsequently, it is shown that when the average power constraint is relaxed and the number of antennas is large enough, the capacity has a closed-form solution and constant-amplitude signaling at the peak power achieves it. Moreover, it is observed that in a discrete-time memoryless Gaussian channel, the average-power-constrained capacity, which results from a Gaussian input distribution, can be closely approached by an input whose magnitude has a discrete, finite support. Finally, we investigate some upper and lower bounds for the capacity under a non-identity channel matrix and evaluate their performance as a function of the condition number of the channel.
I Introduction
The capacity of a point-to-point communication system subject to peak and average power constraints was investigated in [1] for the scalar Gaussian channel, where it was shown that the capacity-achieving distribution is unique and has a probability mass function with a finite number of mass points. In [2], Shamai and Bar-David gave a full account of the capacity of a quadrature Gaussian channel under the aforementioned constraints and proved that the optimal input distribution has a discrete amplitude and a uniform independent phase. This discreteness of the optimal input distribution was, surprisingly, shown in [3] to hold even without a peak power constraint for the Rayleigh-fading channel when no channel state information (CSI) is assumed at either the receiver or the transmitter. Following this work, the authors in [4] and [5] investigated the capacity of noncoherent AWGN and Rician-fading channels, respectively. In [6], a point-to-point real scalar channel is considered, and sufficient conditions on the additive noise are provided under which the support of the optimal bounded input has a finite number of mass points. These sufficient conditions are also useful in multi-user settings, as shown in [7] for the multiple-access channel (MAC) under bounded inputs.
The analysis of the MIMO channel under per-antenna peak power constraints is straightforward: the vector channel is converted into parallel AWGN channels and the results of [1] or [2] are applied. Recently, the vector Gaussian channel under peak and average power constraints has become more practically relevant because of the new scheme proposed in [8]. More specifically, this scheme enables multiple-antenna transmission using only one RF chain, and the peak power constraint (i.e., a peak constraint on the norm of the input vector rather than on each antenna separately) is a direct consequence of this single RF chain. The capacity of the vector Gaussian channel under peak and average power constraints has been explored in [9] and [10]. However, according to [11], the results in the higher-dimensional settings do not appear to be rigorous, because the identity theorem for holomorphic functions of several complex variables is used without its conditions being fulfilled. As shown by an example in Section IV of [11], a holomorphic function of several complex variables can be zero on $\mathbb{R}^n$ but not necessarily zero on $\mathbb{C}^n$. Since $\mathbb{R}^n$ is not an open subset of $\mathbb{C}^n$, the identity theorem cannot be applied. To address this problem, the contributions of this paper are as follows.

For the identity channel matrix, the approach of [2] is generalized to the vector Gaussian channel, with the complex extension carried out on a single variable only, namely the amplitude of the input in spherical coordinates. The necessary and sufficient conditions for the optimality of the input distribution are derived, and it is proved that the magnitude of the capacity-achieving distribution has a probability mass function over a finite number of mass points, which determines a finite number of hyperspheres in spherical coordinates. Further, the magnitude and the phases of the capacity-achieving distribution are mutually independent, and the phases are distributed such that the points are uniformly distributed on each of the hyperspheres.

It is shown that if the average power constraint is relaxed, when the ratio of peak power to the number of dimensions remains below a certain threshold (), the constant amplitude signaling at the peak power achieves the capacity.

It is also shown that for a fixed SNR, the gap between the Shannon capacity and the constant amplitude signaling decreases as for large values of , where denotes the number of dimensions.

Finally, the case of the non-identity channel matrix is considered, where we start from the MISO channel and show that the optimal input does not necessarily have a discrete amplitude. Afterwards, several upper and lower bounds are provided for the capacity of the general MIMO channel. The performance of these bounds is evaluated numerically as a function of the condition number of the channel.
The paper is organized as follows. The system model and some preliminaries are provided in section II. The main result of the paper is given in section III for the identity channel. The general case of the non-identity channel matrix is briefly investigated in section IV. Numerical results and the conclusion are given in sections V and VI, respectively. Some of the calculations are provided in the appendices at the end of the paper.
II System Model and Preliminaries
In a discrete-time memoryless vector Gaussian channel, the input-output relationship for the identity channel is given by
$$\mathbf{Y}(t) = \mathbf{X}(t) + \mathbf{W}(t), \tag{1}$$
where $\mathbf{X}(t), \mathbf{Y}(t) \in \mathbb{R}^n$ denote the input and output of the channel, respectively, $t$ denotes the channel use, and $\{\mathbf{W}(t)\}$ is an i.i.d. noise vector process with $\mathbf{W}(t) \sim \mathcal{N}(\mathbf{0}, \mathbf{I}_n)$, which is independent of $\mathbf{X}(t)$ for every transmission $t$. (Footnote 1: The $m$-dimensional complex AWGN channel can be mapped to the channel in (1) with $n = 2m$.)
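The capacity of the channel in (1) under the peak and average power constraints is
$$C(u_p, u_a) = \sup_{F_{\mathbf{X}}(\mathbf{x}):\ \|\mathbf{X}\|^2 \le u_p,\ E[\|\mathbf{X}\|^2] \le u_a} I(\mathbf{X}; \mathbf{Y}), \tag{2}$$
where $F_{\mathbf{X}}(\mathbf{x})$ denotes the cumulative distribution function (CDF) of the input vector, and $u_p$, $u_a$ are the upper bounds for the peak and the average power, respectively. Throughout the paper, any relation that involves a random variable is understood to hold almost surely (e.g., $\|\mathbf{X}\|^2 \le u_p$). (Footnote 2: More precisely, let $\Omega$ be the sample space of the probability model over which the random vector $\mathbf{X}$ is defined. Then $\|\mathbf{X}\|^2 \le u_p$ is equivalent to $\Pr\{\omega \in \Omega : \|\mathbf{X}(\omega)\|^2 > u_p\} = 0$.)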
It is obvious that
Therefore, a trivial upper bound for the capacity is given by
(3) 
where is achieved by a Gaussian input vector distributed as .
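As a quick numerical companion to the Gaussian-input bound mentioned above, the following sketch evaluates the classical capacity of n unit-variance real Gaussian dimensions under a total average power budget, (n/2) ln(1 + u_a/n) nats per channel use. The function name and the argument names `u_a` and `n` are illustrative assumptions, not notation confirmed by the paper.

```python
import math


def gaussian_capacity_nats(u_a: float, n: int) -> float:
    """Capacity in nats per channel use of n unit-variance real AWGN dimensions
    with total average power u_a, achieved by an N(0, (u_a/n) I) input.

    Names are hypothetical; this is the classical formula, not code from the paper.
    """
    if n <= 0 or u_a < 0:
        raise ValueError("need n >= 1 and u_a >= 0")
    return 0.5 * n * math.log1p(u_a / n)


if __name__ == "__main__":
    # For a fixed power budget the value grows with n and approaches u_a / 2.
    for n in (1, 2, 8, 64):
        print(n, gaussian_capacity_nats(4.0, n))
```

For a fixed budget the value increases with the number of dimensions and stays below u_a/2, which is the low-SNR limit.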
We formulate the optimization problem in the spherical domain. The rationale behind this change of coordinates is the spherical symmetry of the white Gaussian noise and of the constraints, which, as will become clear, enables us to perform the optimization only over the magnitude of the input. By writing the mutual information in terms of the differential entropies, we have
$$I(\mathbf{X}; \mathbf{Y}) = h(\mathbf{Y}) - h(\mathbf{Y} \mid \mathbf{X}) = h(\mathbf{Y}) - h(\mathbf{W}) = h(\mathbf{Y}) - \frac{n}{2}\ln(2\pi e),$$
where the entropies are in nats. Motivated by the spherical symmetry of the white Gaussian noise and the constraints, and can be written in spherical coordinates as
where and denote the magnitude of the output and the input, respectively. and are, respectively, the phase vectors of the output and the input, in which , and , is a unit vector in which
(4) 
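The spherical parametrization can be sketched in code as follows. Since the exact form of (4) is not recoverable here, the sketch assumes the standard hyperspherical convention (first n-2 angles in [0, π], last angle in [0, 2π)); the function names and the ordering of the coordinates are assumptions and may differ from the paper's.

```python
import math
from typing import List, Sequence


def unit_vector(angles: Sequence[float]) -> List[float]:
    """Map n-1 hyperspherical angles to a unit vector in R^n.

    Assumed convention (hypothetical, may differ from the paper's (4)):
      a_1 = cos(t_1)
      a_k = sin(t_1) ... sin(t_{k-1}) cos(t_k),  1 < k < n
      a_n = sin(t_1) ... sin(t_{n-1})
    """
    coords = []
    sin_prod = 1.0  # running product sin(t_1) ... sin(t_{k-1})
    for t in angles:
        coords.append(sin_prod * math.cos(t))
        sin_prod *= math.sin(t)
    coords.append(sin_prod)
    return coords


def to_cartesian(radius: float, angles: Sequence[float]) -> List[float]:
    """Vector with the given magnitude and phase angles: radius * a(angles)."""
    return [radius * c for c in unit_vector(angles)]


if __name__ == "__main__":
    x = to_cartesian(2.0, [0.3, 1.1, 4.0])  # a point in R^4 with norm 2
    print(x, math.sqrt(sum(c * c for c in x)))
```

Whatever the angles, the returned vector has unit Euclidean norm, so the magnitude and the phases separate cleanly.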
As will become clear later, this change of coordinates avoids the use of the identity theorem for holomorphic functions of several complex variables. The optimization problem in (2) is equivalent to
(5) 
The differential entropy of the output is given by
(6) 
where is the Jacobian of the transform. The conditional pdf of conditioned on is given by
(7) 
From (7), the joint pdf of the magnitude and phases of the output is
(8) 
in which denotes the joint CDF of By integrating (8) over the phase vector , we have
(9) 
where ^{3}^{3}3The reason that is not a function of the phase vector is due to the spherically symmetric distribution of the white Gaussian noise. In other words, is the integral of the Gaussian pdf over the surface of an nsphere with radius which is invariant to the position of as long as , i.e. which is constant on . (9) implies that in the AWGN channel in (1), is induced only by and not .
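The claim that the output magnitude depends on the input only through its magnitude can be checked empirically. The following Monte Carlo sketch, assuming unit-variance white Gaussian noise and using hypothetical helper names, sends two inputs with the same norm but different directions and compares the sample statistics of the output norm.

```python
import math
import random
from typing import Sequence, Tuple


def output_norm_stats(x: Sequence[float], num_samples: int = 20000,
                      seed: int = 0) -> Tuple[float, float]:
    """Sample mean of ||Y|| and of ||Y||^2 for Y = x + W, W ~ N(0, I)."""
    rng = random.Random(seed)
    sum_norm = 0.0
    sum_sq = 0.0
    for _ in range(num_samples):
        sq = sum((xi + rng.gauss(0.0, 1.0)) ** 2 for xi in x)
        sum_norm += math.sqrt(sq)
        sum_sq += sq
    return sum_norm / num_samples, sum_sq / num_samples


if __name__ == "__main__":
    x_axis = [2.0, 0.0, 0.0]              # norm 2, along the first axis
    x_diag = [2.0 / math.sqrt(3.0)] * 3   # norm 2, along the diagonal
    print(output_norm_stats(x_axis, seed=1))
    print(output_norm_stats(x_diag, seed=2))
```

Both inputs should give matching statistics up to sampling noise, with the mean of the squared norm close to n + ||x||² = 7.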
It is obvious that
(10) 
where the first inequality is tight iff the elements of are mutually independent, and the second inequality becomes tight iff is uniformly distributed over . From (6) and (10),
(11) 
For the sake of readability, the following change of variables is helpful
(12) 
Since and , it is easy to show that the two mappings and (defined in (12)) are invertible. Also, the support set of is where (the Gamma function is defined as .) From (9), the pdf of is ^{4}^{4}4The existence of is guaranteed by the Gaussian distribution of the additive noise.
(13) 
where the notation in is to emphasize that has been induced by . Note that the integral transform in (13) is invertible as shown in Appendix D. The kernel is given by
(14)  
(15) 
where is the modified Bessel function of the first kind and order . The calculations are provided in Appendix A. Note that is continuous on its domain. The differential entropy of is
(16) 
The differential entropy of is given by
(17) 
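To make the role of the Bessel function concrete, the sketch below evaluates the conditional density of the output magnitude given the input magnitude, written directly in the magnitude rather than in the transformed variable of (12). It uses the standard noncentral chi density for unit-variance noise; this is an assumption about what the kernel reduces to, and the function names are hypothetical. The Bessel function is computed from its power series, which is adequate only for moderate arguments.

```python
import math


def bessel_i(nu: float, x: float, terms: int = 80) -> float:
    """I_nu(x) for x > 0 via sum_k (x/2)^(2k+nu) / (k! * Gamma(k+nu+1)).

    Each term is formed in the log domain; suitable for moderate x only.
    """
    log_half = math.log(x / 2.0)
    return sum(
        math.exp((2 * k + nu) * log_half
                 - math.lgamma(k + 1) - math.lgamma(k + nu + 1))
        for k in range(terms)
    )


def magnitude_pdf(rho: float, r: float, n: int) -> float:
    """Density of ||Y|| at rho given ||X|| = r > 0, for Y = X + W,
    W ~ N(0, I_n), n >= 2 (noncentral chi density)."""
    if rho <= 0.0:
        return 0.0
    return (math.exp(-0.5 * (rho * rho + r * r))
            * rho ** (n / 2.0) * r ** (1.0 - n / 2.0)
            * bessel_i(n / 2.0 - 1.0, rho * r))


def integrate(f, a: float, b: float, steps: int = 2000) -> float:
    """Composite midpoint rule on [a, b]."""
    h = (b - a) / steps
    return h * sum(f(a + (i + 0.5) * h) for i in range(steps))


if __name__ == "__main__":
    r, n = 1.5, 4
    print(integrate(lambda p: magnitude_pdf(p, r, n), 0.0, r + 10.0))
    print(integrate(lambda p: p * p * magnitude_pdf(p, r, n), 0.0, r + 10.0))
```

The first integral should be close to 1 and the second close to n + r², which are two inexpensive sanity checks on the density.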
Rewriting (5), we have
(18)  
(19) 
where (18) results from (11), (16) and (17). (19) is due to the fact that since (the support of ) is bounded, is maximized when is uniformly distributed. It is easy to verify that if the magnitude and phases of the input are mutually independent with the phases having the distributions as
(20) 
the magnitude and phases of the output become mutually independent with the phases having the distributions as
(21) 
where . In other words, having the input distribution
(22) 
results in
(23) 
The above result can be easily checked either by solving for in (8) or by the fact that the sum of two independent spherically symmetric random vectors is still spherically symmetric. (Footnote 5: The magnitude and the unit vector of a spherically symmetric random vector are independent, and the unit vector is uniformly distributed on the unit sphere. It can be verified that this property is equivalent to the vector having the distribution of (23) in spherical coordinates. Also, note that having distributed as in (21) implies uniform on .) It can be observed that the input pdf in (22) makes the inequalities in (18) and (19) tight. Since the constraint is only on the magnitude of the input and is induced only by , it is concluded that the optimal input distribution must have mutually independent phases and magnitude with the phases being distributed as in (20). Therefore,
(24) 
Before proceeding further, it is interesting to check whether the problem in (24) boils down to the classical results when the peak power constraint is relaxed (i.e., ). From the definition of ,
This can be verified by a change of variable (i.e., ) and using the derivative of (112) (in Appendix D) with respect to . Therefore, when , the problem in (24) becomes the maximization of the differential entropy over all distributions having a bounded moment of order , which is addressed in Appendix B for an arbitrary moment. Substituting with and with in (92), the optimal distribution for is obtained and, from (13), the corresponding has the general Rayleigh distribution
which is the only solution, since (13) is an invertible transform (see Appendix D). Furthermore, it can be verified that the maximum is
(25) 
which coincides with the classical results for the identity channel matrix [12].
III Main Results
Let denote the set of points of increase of in the interval . (Footnote 6: A point is said to be a point of increase of a distribution if, for any open set containing , we have .) The main result of the paper is given in the following theorem.
Theorem. The supremization in (24), which is for the identity channel matrix, has a unique solution and the optimal input achieving the supremum (and therefore the maximum) has the following distribution in the spherical coordinates,
(27) 
where has a finite number of points of increase (i.e., has a finite cardinality). Further, the necessary and sufficient condition for to be optimal is the existence of a for which
(28)  
(29) 
Note that when the average power constraint is relaxed (i.e., ), .
Proof.
The phases of the optimal input distribution have already been shown to be mutually independent and have the distribution in (20) being independent of the magnitude. Therefore, it is sufficient to show the optimal distribution of the input magnitude. This is proved by reductio ad absurdum. In other words, it is shown that having an infinite number of points of increase results in a contradiction. The detailed proof is given in Appendix C. ∎
Remark 1. When the average power constraint is relaxed (i.e. ), the following input distribution is asymptotically () optimal
(30) 
where is the unit step function. Further, the resulting capacity is given by
Later, in the numerical results section, we observe that the density in (30) remains optimal for the nonvanishing ratio when it is below a certain threshold.
Proof.
Since the density in (30) has spherical symmetry, it is sufficient to show that is optimal when . From (3), we have
(31) 
The CDF induces the following output pdf
(32) 
(33)  
(34)  
(35) 
When is small, the entropy of is given by (35). In (33), we have approximated the modified Bessel function with the first two terms in its power series expansion as follows
In (34), we use the approximation and in (35), the higher order term is neglected. Given the input distribution , the achievable rate with small ratio is given by (see (24))
(36) 
where we have used the fact that
From (36) and (31), it is concluded that the pdf in (30) is asymptotically optimal for when . Note that the distribution in (30) is not the only asymptotically optimal distribution. There are many possible alternatives, one of which, for example, is the binary PAM in each dimension with the points and which can be verified to have an achievable rate of when . Specifically, in the low peak power regime (), a sufficient condition for the input distribution to be asymptotically optimal is as follows. First, it has a constant magnitude at . Second, its is independent of and has a zero first Fourier coefficient i.e.,
(37) 
The claim is justified by noting that fulfilling the second condition results in the spherical symmetric output distribution of (23) as follows. Using the approximation , at small values of , (7) can be approximated as
(38) 
If is independent of , substituting (38) in (8) results in
(39) 
where . If has a zero first Fourier coefficient, due to the structure of (see (4)), we have
Therefore, (39) simplifies as
which implies that when , having independent of all other spherical variables with a zero first Fourier coefficient results in the output distribution in (23) which makes the inequalities (18) and (19) tight. Finally, fulfilling the first condition (i.e., having a constant magnitude at ) validates the previous reasoning starting from (32).
The asymptotic optimality of the constantmagnitude signaling in (30) can alternatively be proved by inspecting the behavior of the marginal entropy density when is sufficiently small. From (13)
Therefore,
(40) 
It is obvious that (40) is a (strictly) convex (strictly) increasing function. Hence, the necessary and sufficient conditions in (28) and (29) are satisfied if and only if the input has only one point of increase at which proves the asymptotic optimality of (30) for and . ∎
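The behavior described in Remark 1 can be explored numerically. The sketch below computes the mutual information achieved by an input that is uniform on a sphere of radius r (constant-amplitude signaling) and compares it with the Gaussian-input bound (n/2) ln(1 + r²/n). It assumes unit-variance noise, uses the noncentral chi density of the output magnitude, and relies on a series-based Bessel evaluation and a fixed integration range that suit moderate r and n only. All function names are hypothetical.

```python
import math


def log_bessel_i(nu: float, x: float, terms: int = 120) -> float:
    """ln I_nu(x) for x > 0 from the power series, using log-sum-exp."""
    log_half = math.log(x / 2.0)
    logs = [(2 * k + nu) * log_half
            - math.lgamma(k + 1) - math.lgamma(k + nu + 1)
            for k in range(terms)]
    peak = max(logs)
    return peak + math.log(sum(math.exp(v - peak) for v in logs))


def log_magnitude_pdf(rho: float, r: float, n: int) -> float:
    """ln of the density of ||Y|| at rho > 0 given ||X|| = r > 0."""
    return (-0.5 * (rho * rho + r * r)
            + 0.5 * n * math.log(rho)
            + (1.0 - 0.5 * n) * math.log(r)
            + log_bessel_i(0.5 * n - 1.0, rho * r))


def constant_amplitude_rate(r: float, n: int, steps: int = 2000) -> float:
    """I(X;Y) in nats for X uniform on the radius-r sphere in R^n, n >= 2."""
    # ln of the surface area of the unit sphere in R^n
    log_area = math.log(2.0) + 0.5 * n * math.log(math.pi) - math.lgamma(0.5 * n)
    upper = r + 12.0  # the density is negligible beyond this point
    h = upper / steps
    h_y = 0.0
    for i in range(steps):
        rho = (i + 0.5) * h  # midpoint rule, avoids rho = 0
        log_f = log_magnitude_pdf(rho, r, n)
        # Y is spherically symmetric, so its density at radius rho is
        # f_P(rho) / (area * rho^(n-1)).
        log_density = log_f - log_area - (n - 1) * math.log(rho)
        h_y -= math.exp(log_f) * log_density * h
    h_w = 0.5 * n * math.log(2.0 * math.pi * math.e)
    return h_y - h_w


if __name__ == "__main__":
    n = 4
    for r in (0.3, 1.0, 2.0):
        bound = 0.5 * n * math.log1p(r * r / n)
        print(r, constant_amplitude_rate(r, n), bound)
```

For small r the computed rate should sit close to r²/2 and just below the Gaussian-input bound, in line with the asymptotic optimality discussed above; the gap widens as r grows for fixed n.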
Remark 2. For a fixed SNR, the gap between Shannon capacity and the constant amplitude signaling decreases as for large values of .
Proof.
By writing the first two terms of the Taylor series expansion of the logarithm (i.e., ), we have
From (34), the achievable rate obtained by the constant envelope signaling is
This shows that the gap between achievable rate and the Shannon capacity decreases as (), when goes to infinity. ∎
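As a worked instance of the two-term Taylor expansion used in this proof, assume that the Shannon capacity refers to the Gaussian-input value $\frac{n}{2}\ln(1+\frac{u_p}{n})$, with $u_p$ denoting the peak power (the symbol name is an assumption here). Applying $\ln(1+x) = x - \frac{x^2}{2} + O(x^3)$ with $x = u_p/n$ gives:

```latex
\frac{n}{2}\ln\!\left(1+\frac{u_p}{n}\right)
  = \frac{n}{2}\left(\frac{u_p}{n}-\frac{u_p^{2}}{2n^{2}}
      +O\!\left(\frac{1}{n^{3}}\right)\right)
  = \frac{u_p}{2}-\frac{u_p^{2}}{4n}+O\!\left(\frac{1}{n^{2}}\right)
```

So, for a fixed power, the Gaussian-input value approaches $u_p/2$ from below as $n$ grows, and the leading correction is of order $1/n$.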
While Remark 2 describes the asymptotic behavior of the gap, the following remark provides an analytical lower bound for any value of .
Remark 3. The following lower bound holds for the capacity of constant amplitude signaling.