A Deterministic Equivalent for the Analysis of Correlated MIMO Multiple Access Channels
In this article, novel deterministic equivalents for the Stieltjes transform and the Shannon transform of a class of large dimensional random matrices are provided. These results are used to characterise the ergodic rate region of multiple antenna multiple access channels, when each point-to-point propagation channel is modelled according to the Kronecker model. Specifically, an approximation of all rates achieved within the ergodic rate region is derived and an approximation of the linear precoders that achieve the boundary of the rate region as well as an iterative water-filling algorithm to obtain these precoders are provided. An original feature of this work is that the proposed deterministic equivalents are proved valid even for strong correlation patterns at both communication sides. The above results are validated by Monte Carlo simulations.
When mobile networks were expected to run out of power and frequency resources while being simultaneously subject to a demand for higher transmission rates, Foschini  introduced the idea of multiple input multiple output (MIMO) communication schemes. Telatar  then predicted a growth of the channel capacity by a factor for an MIMO system compared to the single-antenna case when the matrix-valued channel is modelled with independent and identically distributed (i.i.d.) standard Gaussian entries. In practical systems though, this linear gain can only be achieved for high signal-to-noise ratios (SNR) and for uncorrelated transmit and receive antenna arrays at both communication sides. Nevertheless, the current scarcity of available frequency resources has led to a widespread incentive for MIMO communications. Mobile terminal engineers now embed numerous antennas in small devices. Due to space limitations, this inevitably induces channel correlation and thus reduced transmission rates. An implication of the results introduced in this paper is the ability to study the performance of MIMO systems subject to strong correlation effects in multi-user and multi-cellular contexts, a question which is paramount to cellular service providers.
Although alternative communication models could be treated using similar mathematical expressions, such as cooperative and non-cooperative multi-cell communications with users equipped with multiple antennas, the present article investigates the MIMO multiple access channel (MIMO-MAC), where multi-antenna mobile terminals transmit information to a single receiver. Under perfect channel state information at the transmitters (CSIT), the boundaries of the achievable rate region of the MIMO-MAC have been characterised by Yu et al.  who provide an iterative water-filling algorithm to obtain the sum rate maximising precoders. However, to achieve perfect CSIT, the channel must be quasi-static during a sufficiently long period to allow feedback or pilot signalling from the receiver to the transmitters. For high mobility wireless services, this is often unacceptable. In this situation, the transmitters are often assumed to have statistical information about the random fast varying channels, such as first order moments of their distribution. The achievable rates are in this case the points lying in the ergodic rate region. It is however difficult to characterise the boundary of the ergodic rate region because it is difficult to compute the optimal precoders that reach the boundaries. In the single-user context, an algorithm was provided by Vu et al. in  to solve this problem. However, the technique of  is rather involved as it requires nested Monte Carlo simulations and does not provide any insight on the nature of the optimal precoders.
In the present article, we provide a parallel approach that consists in approximating the ergodic sum rate by deterministic equivalents. That is, for all finite system dimensions, we provide an approximation of the ergodic rates, which becomes increasingly accurate as the system dimensions grow large. Furthermore, we provide an efficient way to derive an asymptotically accurate approximation of the optimal precoders. The mathematical field of large dimensional random matrices is particularly suited for this purpose, as it can provide deterministic equivalents of the achievable rates depending only on the relevant channel parameters, e.g., the long-term transmit and receive channel covariance matrices in the present situation, the deterministic line of sight components in Rician models as in , etc. The earliest notable result in line with the present study is due to Tulino et al. , who provide an expression of the asymptotic ergodic capacity of point-to-point MIMO communications when the random channel matrix is composed of i.i.d. Gaussian entries. In , Peacock et al. extend the asymptotic result of  in the context of multi-user communications by considering a -user MAC with channels modelled as Gaussian with a separable variance profile. That is, the entries of are independent Gaussian with -th entry of zero mean and variance that can be written as a product of a term depending on and a term depending on . The asymptotic eigenvalue distribution of this matrix model is derived, but no explicit expression of the sum rate is provided as in , nor is any ergodic capacity maximising policy derived. In , Soysal et al. derive the sum rate maximising precoder policy in the case of a MAC channel with users whose channels are one-side correlated zero mean Gaussian, in the sense that all rows of have a common covariance matrix, different for each .
In this article, we concentrate on the more general Kronecker channel model. That is, we assume a -user MIMO-MAC, with channels , where each can be written in the form of a product where has i.i.d. zero mean Gaussian entries and the left and right correlation matrices and are deterministic nonnegative definite Hermitian matrices. This model clearly covers the aforementioned channel models of ,  and  as special cases. The Kronecker model is particularly suited to model communication channels that show transmit and receive correlations, different from one user to the next, in a rich scattering environment. Nonetheless, the Kronecker model is only valid in the absence of a line-of-sight component in the channel, when a sufficiently large number of scatterers is present in the communication medium to justify the i.i.d. aspect of the inner Gaussian matrix and when the channel is frequency flat over the transmission bandwidth. Using similar tools as those used in this article, many works have studied these channel models, mostly in a single-user context. We recall the main contributions, from which the present work borrows several ideas. In [5, 9, 10], Hachem et al. study the point-to-point multi-antenna Rician channel model, i.e., non-central Gaussian matrices with a variance profile, for which they provide a deterministic equivalent of the ergodic capacity , the corresponding ergodic capacity-achieving input covariance matrix  and a central limit theorem for the ergodic capacity . In , Moustakas et al. provide an expression of the mutual information in time varying frequency selective Kronecker channels, using the replica method . This result has been recently proved by Dupuy et al. in a yet unpublished work. Dupuy et al. then derived the expression of the capacity maximising precoding matrix for the frequency selective channel . A more general frequency selective channel model with non-separable variance profile is studied in  by Rashibi et al. using alternative tools from free probability theory. Of practical interest is also the theoretical work of Tse  on MIMO point-to-point capacity in both uncorrelated and correlated channels, which is validated by ray-tracing simulations.
The main contribution of this paper is summarized in two theorems contributing to the field of random matrix theory and enabling the evaluation of the ergodic rate region of the MIMO-MAC with Kronecker channels. We subsequently derive an iterative water-filling algorithm enabling the description of the boundaries of the rate region by providing an expression of the asymptotically optimal precoders. The remainder of this paper is structured as follows: in Section II, we provide a short summary of the main results and how they apply to multi-user wireless communications. In Section III, the two theorems are introduced, the complete proofs being left to the appendices. In Section IV, the ergodic rate region of the MIMO-MAC is studied. In this section, we introduce our third main result: an iterative water-filling algorithm to describe the boundary of the ergodic rate region of the MIMO-MAC. In Section V, we provide simulation results of the previously derived theoretical formulas. Finally, in Section VI, we give our conclusions.
Notation: In the following, boldface lower-case characters represent vectors, capital boldface characters denote matrices ( is the identity matrix). denotes the entry of . The Hermitian transpose is denoted . The operators , and represent the trace, determinant and spectral norm of matrix , respectively. The symbol denotes expectation. The notation stands for the (cumulative) distribution function of the eigenvalues of the Hermitian matrix . The function equals for real . For , two distribution functions, we denote the vague convergence of to . The notation denotes the almost sure convergence of the sequence to . The notation for the distribution function is the supremum norm defined as . The symbol for a square matrix means that is Hermitian nonnegative definite.
II Scope and Summary of Main Results
In this section, we summarise the main results of this article and explain their impact on the study of the effects of channel correlation on the achievable communication rates in the present multi-user framework.
II-A General Model
Consider a set of wireless terminals, equipped with antennas respectively, which we refer to as the transmitters, and another device equipped with antennas, which we call the receiver or the base station. We consider the uplink communication from the terminals to the base station. Denote the channel matrix between transmitter and the receiver. Let be defined as
where and are nonnegative Hermitian matrices and is a realisation of a random matrix with independent Gaussian entries of zero mean and variance . In this scenario, the matrices and model the correlation present in the channel at transmitter and at the receiver, respectively. This setup is depicted in Figure 1.
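To make the model concrete, the following is a minimal NumPy sketch of one channel draw under the Kronecker model. It assumes the usual product form, receive square root times Gaussian matrix times transmit square root, and a variance equal to the inverse of the number of transmit antennas for the Gaussian entries; both are assumptions, since the defining equation and the variance value are not reproduced in this text. The function names `hermitian_sqrt` and `kronecker_channel` are illustrative only.

```python
import numpy as np


def hermitian_sqrt(a):
    """Hermitian nonnegative square root of a Hermitian nonnegative definite matrix."""
    w, v = np.linalg.eigh(a)
    w = np.clip(w, 0.0, None)  # remove tiny negative eigenvalues caused by round-off
    return (v * np.sqrt(w)) @ v.conj().T


def kronecker_channel(r, t, rng):
    """Draw one realisation H = R^{1/2} X T^{1/2} (assumed product form).

    r: (N, N) receive correlation matrix, t: (n, n) transmit correlation matrix.
    X has i.i.d. circularly symmetric complex Gaussian entries with zero mean
    and variance 1/n (assumed normalisation).
    """
    n_rx, n_tx = r.shape[0], t.shape[0]
    x = rng.standard_normal((n_rx, n_tx)) + 1j * rng.standard_normal((n_rx, n_tx))
    x /= np.sqrt(2.0 * n_tx)  # real and imaginary parts each carry variance 1/(2n)
    return hermitian_sqrt(r) @ x @ hermitian_sqrt(t)
```

With identity correlation matrices the draw reduces to the i.i.d. Gaussian channel, which gives a simple sanity check of the normalisation.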
It is important to underline that the correlation patterns emerge both from the inter-antenna spacings on the volume-limited transmit and receive radio devices and from the solid angles of transmitted and received signal energy. Even though the transmit antennas emit signals in an isotropic manner, only a limited solid angle of emission is effectively received, and the same holds for the receiver which captures signal energy in a non-isotropic manner. Given this propagation factor, it is clear that the transmit covariance matrices are not equal for all users. We nonetheless assume physically identical and interchangeable antennas on each device. We therefore claim that the diagonal entries of and , i.e., the variance of the channel fading on every antenna, are identical and equal to one, which, along with the normalisation of the Gaussian matrix , allows for a consistent definition of the SNR. As a consequence, and . We will see that under these trace constraints the hypotheses made in Theorem 1 are always satisfied, therefore making Theorem 1 valid for all possible figures of correlation, including strongly correlated patterns. The hypotheses of Theorem 2, used to characterise the ergodic rate region of the MIMO-MAC, require additional mild assumptions, making Theorem 2 valid for most practical models of and . These statements are of major importance and rather new since in other contributions, e.g., , , it is usually assumed that the correlation matrices have uniformly bounded spectral norms across . Physically this means that only low correlation patterns are allowed, excluding short distances between antennas and small solid angles of energy propagation. The counterpart of this interesting property is a theoretical reduction of the convergence rates of the derived deterministic equivalents, compared to those proposed in  and .
The rate performance of multi-cell or multi-user communication schemes is connected to the so-called Stieltjes transform and Shannon transform of matrices of the type
We study these matrices using tools from the field of large dimensional random matrix theory . Among these tools, we define the Stieltjes transform of the Hermitian nonnegative definite matrix , for , as
where denotes the (cumulative) distribution function of the eigenvalues of . The Stieltjes transform was originally used to characterise the asymptotic distribution of the eigenvalues of large dimensional random matrices . From a wireless communications viewpoint, it can be used to characterise the signal-to-interference plus noise ratio of certain communication models, e.g., , . A second interest of the Stieltjes transform in wireless communications is its link to the so-called Shannon transform of , that we define for as
The Shannon transform is commonly used to provide approximations of capacity expressions in large dimensional systems, e.g., . In the present work, the Shannon transform of will be used to provide a deterministic approximation of the ergodic achievable rate of the MIMO-MAC.
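As an illustration, both transforms can be evaluated for a given Hermitian nonnegative definite matrix directly from its eigenvalues. The sketch below assumes the standard definitions, namely the average of 1/(λ - z) over the eigenvalues for the Stieltjes transform and the average of log(1 + λ/x) for the Shannon transform (natural logarithm, normalised by the matrix size); the function names are illustrative.

```python
import numpy as np


def stieltjes_transform(b, z):
    """Empirical Stieltjes transform (1/N) * sum_i 1 / (lambda_i - z) of a Hermitian matrix b."""
    eig = np.linalg.eigvalsh(b)
    return np.mean(1.0 / (eig - z))


def shannon_transform(b, x):
    """Empirical Shannon transform (1/N) * sum_i log(1 + lambda_i / x), for x > 0, in nats."""
    eig = np.linalg.eigvalsh(b)
    return np.mean(np.log1p(eig / x))
```

Under these definitions the two quantities are linked by differentiation: the derivative of the Shannon transform at x equals the Stieltjes transform at -x minus 1/x, which is the relation that allows a deterministic equivalent of one to be turned into a deterministic equivalent of the other.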
II-B Main results
The main results of this work unfold as follows:
Theorem 1 provides a deterministic equivalent for the Stieltjes transform of , under the assumption that and grow large with the same order of magnitude and the sequences of distribution functions and form tight sequences . That is, we provide an approximation of which can be expressed without reference to the random matrices and which is almost surely asymptotically exact when . The tightness hypothesis is the key assumption that allows degenerate and matrices to be valid in our framework, and that therefore allows us to study strongly correlated channel models;
Theorem 2 provides a deterministic equivalent for the Shannon transform of . In this theorem, the assumptions on the and matrices are only slightly more constraining and of marginal importance for practical purposes. In particular, Theorem 2 theoretically allows the largest eigenvalues of or to grow linearly with , as the number of antennas increases, as long as the number of these large eigenvalues is of order ;
based on Theorem 2, the precoders that maximise the deterministic equivalent of the ergodic sum rate of the MIMO-MAC are computed. Those precoders have the following properties:
the eigenspace of the precoder for user coincides with the eigenspace of the transmit channel correlation matrix at user ;
the eigenvalues of the precoder for user are the solution of a water-filling algorithm;
as the system dimensions grow large, the mutual information achieved using these precoders becomes asymptotically close to the channel capacity.
The major practical interest of Theorems 1 and 2 lies in the possibility to analyse mutual information expressions for multi-dimensional channels, not as averages of stochastic variables depending on the random matrices but as approximated deterministic expressions which no longer feature the matrices . The study of those quantities is in general simpler than the study of the averaged stochastic expressions, which leads here to a simple derivation of the approximated rate optimal precoders.
In the next section, we introduce our theoretical results, whose proofs are left to the appendices.
III Mathematical Preliminaries
In this section, we first introduce Theorem 1, which provides a deterministic equivalent for the Stieltjes transform of matrices defined in (2). A deterministic equivalent for the Shannon transform of is then provided in Theorem 2, before we discuss in detail how this last result can be used to characterise the performance of the MIMO-MAC with strong channel correlation patterns.
III-A Main results
Let be positive integers and let
be an matrix with the following hypothesis for all :
has i.i.d. entries , such that ;
is a Hermitian nonnegative square root of the nonnegative definite Hermitian matrix ;
with for all ;
the sequences and are tight (that is, for all , there exists such that and for all , ; see e.g.,  for more details);
is Hermitian nonnegative definite;
there exist for which
Also denote, for , , the Stieltjes transform of . Then, as all and grow large, with ratio ,
and the functions , , form the unique solution to the equations
such that when and such that when is real and negative.
Moreover, for any , the convergence of Equation (5) is uniform over any region of bounded by a contour interior to
For all , the function is the Stieltjes transform of a distribution function . Denoting the empirical eigenvalue distribution function of , we finally have
weakly and almost surely as .
A few technical remarks are in order at this point.
In her PhD dissertation , Zhang derives an expression of the limiting eigenvalue distribution for the simpler case where and but is not constrained to be diagonal. Her work also uses a method based on the Stieltjes transform. Based on , it seems to the authors that Theorem 1 could well be extended to non-diagonal . However, proving so requires involved calculus, which we did not perform. Also, in , using the same techniques as in the proof provided in Appendix A, Silverstein et al. do not assume that the matrices are nonnegative definite. Our result could be extended to this less stringent requirement on the central matrices, although in this case Theorem 1 does not hold for real negative. For application purposes though, it is fundamental that the Stieltjes transform of exist for .
We now claim that, under proper initialisation, for , a classical fixed-point algorithm converges surely to the solution of (6).
Define , the convergence threshold and , the iteration step. For all , set and .
while do
  for do
    Compute (9)
  end for
  Assign
end while
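The fixed-point procedure can be sketched numerically as follows. Since the fixed-point equations are not reproduced in this text, the sketch assumes they take the form commonly found for this matrix model: each unknown e_k equals (1/N) tr R_k A^{-1}, where A = S + sum_k tbar_k R_k + x I, tbar_k is the average of t / (1 + c_k t e_k) over the eigenvalues t of the k-th transmit-side matrix, c_k = N / n_k, and the evaluation point is z = -x with x > 0. This form, the initial point 1/x and all names below are assumptions made for illustration, not a transcription of the article's table.

```python
import numpy as np


def fixed_point_e(r_list, t_eigs, x, s=None, tol=1e-10, max_iter=10_000):
    """Iterate e_k <- (1/N) tr R_k A(e)^{-1} at z = -x, x > 0 (assumed form of the equations).

    r_list: list of (N, N) Hermitian nonnegative definite matrices R_k.
    t_eigs: list of 1-D arrays holding the eigenvalues of each T_k.
    s:      optional (N, N) Hermitian nonnegative definite matrix S (zero if omitted).
    """
    n = r_list[0].shape[0]
    s = np.zeros((n, n)) if s is None else s
    t_eigs = [np.asarray(t, dtype=float) for t in t_eigs]
    c = [n / len(t) for t in t_eigs]       # ratios c_k = N / n_k
    e = np.full(len(r_list), 1.0 / x)      # initial point e_k = -1/z
    for _ in range(max_iter):
        a = s + x * np.eye(n)
        for r, t, ck, ek in zip(r_list, t_eigs, c, e):
            a = a + np.mean(t / (1.0 + ck * t * ek)) * r
        a_inv = np.linalg.inv(a)
        e_new = np.array([np.trace(r @ a_inv).real / n for r in r_list])
        if np.max(np.abs(e_new - e)) < tol:  # stop once successive iterates agree
            return e_new
        e = e_new
    raise RuntimeError("fixed-point iteration did not converge")
```

For a single user with identity matrices on both sides and S = 0, the assumed equations collapse to the Marchenko-Pastur quadratic, which provides an independent check of the iteration.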
Different hypotheses will be used in the applications of Theorem 1 provided in Section IV. For practical reasons, we will in particular need the entries of to be Gaussian, the matrices to be non-diagonal and . This entails the following corollary:
Let be positive integers and let
be an matrix with the following hypothesis for all :
has i.i.d. Gaussian entries , with and ;
is a Hermitian nonnegative square root of the nonnegative definite Hermitian matrix ;
is a nonnegative definite Hermitian matrix;
and form tight sequences;
there exist for which
Also denote, for , . Then, as all and grow large (while is fixed)
and the set of functions , , form the unique solution to the equations
such that for all .
Since the are Gaussian, the joint distribution of the entries of coincides with that of , for any unitary matrix. Therefore, in Theorem 1 can be substituted by without compromising the final result. As a consequence, the can be taken nonnegative definite Hermitian and the result of Theorem 1 holds. It then suffices to replace in the expression of to fall back on the result of Theorem 1.
The deterministic equivalent of the Stieltjes transform of is then extended to a deterministic equivalent of the Shannon transform of in the following result:
Let and be a random Hermitian matrix as defined in Corollary 1 with the following additional assumptions:
there exists and a sequence , such that, for all ,
where denote the ordered eigenvalues of the matrix .
denoting an upper-bound on the spectral norm of the and , , and a constant such that , satisfies
Then, for large , , the Shannon transform of , satisfies
Note that this last result is consistent both with  when the transmission channels are i.i.d. Gaussian and with  when . This result is also similar in nature to the expressions obtained in  for the multi-antenna Rician channel model and with  in the case of frequency selective channels. We point out that the expressions obtained in ,  and , when the entries of the matrices are Gaussian distributed, suggest a faster convergence rate of the deterministic equivalent of the Stieltjes and Shannon transforms than the one obtained here. Indeed, while we show here a convergence of order (which is in fact refined to for any in Appendix A), in those works the convergence is proved to be of order .
However, contrary to the above contributions, we allow the and matrices to be more general than uniformly bounded in spectral norm. This is thoroughly discussed in the section below.
III-B Kronecker channel with strong correlation patterns
Theorem 1 and Corollary 1 require and to form tight sequences. Remark that, because of the trace constraint , all sequences are necessarily tight (the same reasoning naturally holds for ). Indeed, given , take ; is the number of eigenvalues in larger than , which is necessarily less than or equal to from the trace constraint, leading to and then . Observe now that Condition 2) in Theorem 2 requires a stronger assumption on the correlation matrices. Under the trace constraint, a sufficient assumption for Condition 2) is that there exists , such that the number of eigenvalues in greater than is of order . This is a mild assumption, which may not be verified for some very specific choices of . (Footnote: As a counter-example, take and the eigenvalues of to be The largest eigenvalue is of order so that is of order , and the number of eigenvalues larger than any for large is of order . Therefore here.) Nonetheless, most conventional models for the and , even when showing strong correlation properties, satisfy the assumptions of Theorem 2. We mention in particular the following examples:
if all and have uniformly bounded spectral norm, then there exists such that all eigenvalues of and are less than for all . This implies for all and therefore the condition is trivially satisfied. Our model is therefore compatible with loosely correlated antenna structures;
in contrast, when antennas are densely packed on a volume-limited device, the correlation matrices and tend to be asymptotically of finite rank, see e.g.  in the case of a dense circular array. That is, for any given , for all large , the number of eigenvalues greater than is finite, while defined in Theorem 2 is of order . This implies and therefore volume-limited devices with densely packed antennas are consistent with our framework;
for one, two or three dimensional antenna arrays with neighbors separated by half the wavelength as discussed by Moustakas et al. in , the correlation figures have a peculiar behaviour. In a linear array of antennas, eigenvalues are of order of magnitude , the remaining eigenvalues being small. In a two-dimensional grid of antennas, eigenvalues are of order , the remaining eigenvalues being close to zero. Finally, in a three-dimensional parallelepiped of antennas, eigenvalues are of order , the remaining eigenvalues being close to zero also. As such, in the -dimensional scenario, we can approximate by , by and we have
so that the multi-dimensional antenna arrays with close antennas separated by half the wavelength also satisfy the hypotheses of Theorem 2.
As a consequence, a wide scope of antenna correlation models enter our deterministic equivalent framework, which comes again at the price of a slower theoretical convergence of the difference .
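The tightness argument given at the start of this subsection is a counting bound: if the eigenvalues are nonnegative and sum to N, then fewer than N/M of them can exceed M, so the fraction of eigenvalues above M is below 1/M whatever the correlation strength. The short check below illustrates this on an exponential correlation matrix with unit diagonal; this particular model and the function names are chosen here purely for illustration and are not taken from the article.

```python
import numpy as np


def exponential_correlation(n, rho):
    """Illustrative correlation matrix with entries rho^{|i-j|} (unit diagonal, trace n)."""
    idx = np.arange(n)
    return rho ** np.abs(idx[:, None] - idx[None, :])


def fraction_above(r, m):
    """Fraction of eigenvalues of the Hermitian matrix r that are strictly larger than m."""
    eig = np.linalg.eigvalsh(r)
    return np.mean(eig > m)


# Strongly correlated example: the spectrum is dominated by a few large eigenvalues,
# yet the fraction above any level M stays below 1/M because the trace equals N.
r = exponential_correlation(64, 0.99)
for level in (2.0, 5.0, 10.0, 50.0):
    print(level, fraction_above(r, level), 1.0 / level)
```

The bound does not depend on the dimension, which is exactly what tightness of the sequence of eigenvalue distributions requires.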
We now move to practical applications of the above results and more specifically to the determination of the ergodic rate region of the MIMO-MAC.
IV Rate Region of the MIMO-MAC
In this section, we successively apply Theorem 2 to approximate the ergodic mutual information for all deterministic precoders, and then we determine the precoders that maximise this approximated mutual information. This gives an approximation of all points on the boundary of the MIMO-MAC rate region. We also introduce an iterative power allocation algorithm to obtain explicitly the optimal precoders.
IV-A Deterministic equivalent of the mutual information
Consider the wireless multiple access channel as described in Section II and depicted in Figure 1. We denote the ratio between the number of antennas at the receive base station and the number of transmit antennas of user . Denote the Gaussian signal transmitted by user , such that and , with where is the total power of transmitter , the signal received at the base station and the additive white Gaussian noise of variance . We recall that the Kronecker channel between user and the base station is denoted , with the entries of Gaussian independent of zero mean and variance and , deterministic. The received vector is therefore given by
Suppose that the channels are varying fast and that the transmitters in the MAC only have statistical channel state information about the in the sense that user only knows the long term statistics and . In this case, for a noise variance equal to , the per-antenna ergodic MIMO-MAC rate region is given by 
with the expectation taken over the joint random variable , , and where we introduced the notation
Now, assuming the , and satisfy the hypotheses of Theorem 2, we have
as grow large for some sequence , on a subset of measure of the probability space that engenders . Integrating this expression over therefore leads to
We can therefore apply Theorem 2 to determine the ergodic rate region of the MIMO-MAC. We specifically have
with and the unique positive solutions of
This provides a deterministic equivalent for all points in the MIMO-MAC rate region, i.e., for all precoders.
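For comparison with such a deterministic equivalent, the ergodic quantity itself can be estimated by Monte Carlo averaging over channel draws. The sketch below assumes the per-receive-antenna sum rate is the expectation of (1/N) log det(I + (1/σ²) Σ_k H_k P_k H_k^H) in nats, with Kronecker channels whose Gaussian entries have variance 1/n_k; these normalisations and all names are assumptions for illustration.

```python
import numpy as np


def psd_sqrt(a):
    """Hermitian square root of a Hermitian nonnegative definite matrix."""
    w, v = np.linalg.eigh(a)
    return (v * np.sqrt(np.clip(w, 0.0, None))) @ v.conj().T


def ergodic_sum_rate(r_list, t_list, p_list, sigma2, n_draws=200, seed=0):
    """Monte Carlo estimate of E[(1/N) log det(I + (1/sigma2) sum_k H_k P_k H_k^H)] in nats.

    r_list, t_list: receive and transmit correlation matrices of each user.
    p_list:         precoder (transmit covariance) matrix of each user.
    """
    rng = np.random.default_rng(seed)
    n = r_list[0].shape[0]
    r_half = [psd_sqrt(r) for r in r_list]
    t_half = [psd_sqrt(t) for t in t_list]
    total = 0.0
    for _ in range(n_draws):
        b = np.zeros((n, n), dtype=complex)
        for rh, th, p in zip(r_half, t_half, p_list):
            nk = th.shape[0]
            x = rng.standard_normal((n, nk)) + 1j * rng.standard_normal((n, nk))
            h = rh @ (x / np.sqrt(2.0 * nk)) @ th   # Kronecker channel draw
            b += h @ p @ h.conj().T
        _, logdet = np.linalg.slogdet(np.eye(n) + b / sigma2)
        total += logdet / n
    return total / n_draws
```

Fixing the seed makes two calls with different noise variances use the same channel draws, so the estimates can be compared without Monte Carlo noise between them.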
IV-B Rate maximisation
Now we wish to determine for which precoders the boundary of the MIMO-MAC rate region is reached. This requires here to determine the rate optimal precoding matrices , for all . To this end, we first need the following result:
If at least one of the correlation matrices , , is invertible, then is a strictly concave function in .
Without loss of generality, for any , since the matrices are standard Gaussian, and therefore of unitarily invariant joint distribution, can be assumed diagonal. If is not of full rank then it can be reduced into a matrix of smaller size, such that the resulting matrix is invertible, without changing the problem at hand. We therefore assume all matrices to be of full rank from now on. From Proposition 2, we then immediately prove that the -ary set of matrices which maximises the deterministic equivalent of the ergodic sum rate over the set is unique. In a very similar way as in , we then show that the matrices , , have the following properties:
For every , denote the spectral decomposition of with unitary and . Then the precoders which maximise the right-hand side of (IV-A) satisfy:
, with diagonal, i.e., the eigenspace of is the same as the eigenspace of ;
denoting, for all , as in (14) for , the diagonal entry of satisfies:
where the are evaluated such that .
In Table II, we provide an iterative water-filling algorithm to obtain the .
Define the convergence threshold and the iteration step. At step , for all , , set .
while do
  For , define as the solution of (6) for and with eigenvalues , obtained from the fixed-point algorithm of Table I.
  for do
    for do
      Set , with such that .
    end for
  end for
  Assign
end while
In , it is proved that the convergence of this algorithm implies its convergence towards the correct limit. The line of reasoning in  can be directly adapted to the current situation so that, if the iterative water-filling algorithm of Table II converges, then converge to the matrices . However, similar to , it is difficult to prove the sure convergence of the water-filling algorithm. Nonetheless, extensive simulations suggest that convergence is always attained.
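The inner power-allocation step of such an algorithm is a classical water-filling problem. As a minimal sketch, assume each eigen-direction i has an effective gain g_i > 0 and the allocation is p_i = max(μ - 1/g_i, 0), with the water level μ fixed by a total power budget. The exact gains used in the article depend on the fixed-point solution and are not reproduced here, so `gains`, `total_power` and the un-normalised sum constraint are hypothetical choices for illustration.

```python
import numpy as np


def water_filling(gains, total_power):
    """Return (p, mu) with p_i = max(mu - 1/g_i, 0) and sum(p) == total_power.

    gains:       strictly positive effective gains g_i (hypothetical inputs).
    total_power: strictly positive power budget.
    """
    gains = np.asarray(gains, dtype=float)
    inv = np.sort(1.0 / gains)            # ascending: strongest directions first
    for m in range(len(inv), 0, -1):      # try to keep the m strongest directions active
        mu = (total_power + inv[:m].sum()) / m
        if mu > inv[m - 1]:               # the weakest active direction gets positive power
            break
    return np.maximum(mu - 1.0 / gains, 0.0), mu
```

For example, with gains (1, 0.1) and a budget of 1 the weak direction is switched off and all power goes to the strong one, whereas two equal gains split the budget evenly. In an outer loop, the gains would be recomputed from the fixed-point solution after each allocation until the powers stop changing.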
For the set under consideration, denote now the true sum rate maximising precoders. Then, if and are such that Condition 1) of Theorem 2 is satisfied with the sets and replaced by (or ) and , respectively, we have from Theorem 2
where both right-hand side differences of the type tend to zero, while the left-hand side term is positive by definition of and the remaining right-hand side term is negative by definition of the . This finally ensures that
as grow large with uniformly bounded ratios. Therefore, the mutual information obtained based on the precoders is asymptotically close to the capacity achieved with the ideal precoders . Finally, if, for all sets , the matrices , and the resulting , satisfy the mild conditions of Theorem 2, then all points of the boundary of the MIMO-MAC rate region can be given a deterministic equivalent.
This concludes this application section. In the following section, we provide simulation results that confirm the accuracy of the deterministic equivalents as well as the validity of the hypotheses made on the and matrices.
V Simulations and Results
In the following, we apply the results obtained in Section IV to provide comparative simulation results between ergodic rate regions, sum rates and their respective deterministic equivalents, for non negligible channel correlations on both communication sides. We provide simulation results in the context of a two-user MIMO-MAC, with antennas at the base station and antennas at the user terminals. The antennas are placed on a possibly multi-dimensional array, antenna being located at . We further assume that both terminals are physically identical. To model the transmit and receive correlation matrices, we consider both the effect of the distances between adjacent antennas at the user terminals and at the base station, and the effect of the solid angles of effective energy transmission and reception. We assume a channel model where signals are transmitted and received isotropically in the vertical direction, but transmitted and received under an angle in the horizontal direction. We then model the entries of the correlation matrices from a natural extension of Jakes model  with privileged direction of signal departure and arrival. Denoting the transmit signal wavelength, , the entry of the matrix , is
with the effective horizontal directions of signal propagation. With similar notations for the other correlation matrices, we choose ,