Statistical mechanical analysis of the linear vector channel in digital communication

Statistical mechanical analysis of the linear vector channel in digital communication

Koujin Takeda†, Atsushi Hatabu†‡ and Yoshiyuki Kabashima† † Department of Computational Intelligence and Systems Science, Tokyo Institute of Technology, Yokohama 226-8502, Japan
‡ System IP Core Research Laboratories, NEC Corporation, 1753 Shimonumabe, Nakahara, Kawasaki 211-8666, Japan
Abstract

A statistical mechanical framework to analyze linear vector channel models in digital wireless communication is proposed for a large system. The framework is a generalization of that proposed for code-division multiple-access systems in Europhys. Lett., 76 (2006) 1193 and enables the analysis of the system in which the elements of the channel transfer matrix are statistically correlated with each other. The significance of the proposed scheme is demonstrated by assessing the performance of an existing model of multi-input multi-output communication systems.

pacs:
84.40.Ua, 75.10.Nr, 89.70.+c

1 Introduction

In recent years, the number of objects to which statistical mechanical analysis can be applied has increased rapidly. The digital wireless communication system is one such example, and many intriguing studies in this field have revealed a strong relationship between telecommunication systems and statistical mechanics [1, 2].

The linear vector channel is one of the basic categories of wireless communication system. Code division multiple access (CDMA) [3], which is employed in third-generation cellar phone systems and wireless LANs, is a type of linear vector channel. In the general CDMA scenario, many users transmit discrete symbols that are modulated by random signature sequences using a single channel, and mixtures of user signals and noises are received at the other end of the channel. This indicates that the problem of simultaneously demodulating user signals from received signals can be mapped to a virtual spin system governed by random interactions. This problem has been successfully solved by techniques developed in statistical mechanics for disordered systems, and in particular by the replica method [4, 5, 6, 7, 8].

The multi-input multi-output (MIMO) system is another well-known example of a linear vector channel to which the statistical mechanical approach is applicable [9, 10, 11]. A MIMO system is composed of many transmit and receive antennas. In a general scenario, multiple input signals transmitted from transmit antennas are received at the receive antennas, being linearly transformed to multiple output signals by a channel transfer matrix. In several preceding studies, channel transfer matrices are regarded as deterministic. However, the elements of such matrices vary with time in actual cases, which implies that handling the matrices as random is more realistic. In the simplest model, each element of the matrix could be regarded as an independent Gaussian variable with zero mean. Unfortunately, modeling of this type is inadequate for describing realistic MIMO systems in which correlations among the matrix elements are, in general, not negligible due to spatial proximity among transmit or receive antennas. For continuous inputs that are modeled as Gaussian variables, simple expressions of performance evaluation can still be obtained by using knowledge of random matrix theory [12]. However, such expressions cannot be applied directly to discrete inputs, which are usually used in digital communication. Therefore, developing a framework to analyze MIMO systems, and, more generally, linear vector channels of discrete inputs are demanded.

The purpose of the present article is to meet such a demand. Recently, the authors proposed a scheme to analyze CDMA systems under the assumption that a cross-correlation matrix of signature sequences can be regarded as a sample generated from a certain type of random matrix ensemble, which is characterized by an eigenvalue spectrum [13]. We herein generalize this scheme so as to be applicable to a wider class of linear vector channels.

The present paper is organized as follows. In Section 2, the linear vector channel models investigated herein are introduced. Section 3, in which a framework to analyze a given system is developed based on the replica method, is the main part of this article. The assumption of uniformity of transmitted signals generally guarantees that the average of the replicated Boltzmann weight depends on replicated vectors only through overlaps among the replicated and original vectors. This makes it possible to evaluate typical properties of the target system using a single function, which is referred to as [14, 15]. In general, the analysis of typical property requires assessment of quenched averages, which implies that should be evaluated as a quenched average utilizing the replica method. However, we will show that the assessment of this function can generally be reduced to the calculation of an annealed average due to a distinctive property underlying the evaluation of the average eigenvalue spectrum of the cross-correlation matrix for a given channel transfer matrix ensemble using the replica method if the eigenvalue spectrum of the ensemble is self-averaging. In Section 4, the significance of the framework is demonstrated by application to one of the typical MIMO models called the Kronecker model. Finally, Section 5 presents a summary of the present study.

2 Model definition

A linear vector channel is defined as a system in which an input vector composed of components, (boldface denotes vector or matrix), is linearly transformed by an channel transfer matrix and is additively degraded by noise. For generality and simplicity, we assume that and are defined over the complex number field, and the channel noise is given as circularly symmetric complex additive white Gaussian noise, the variance of which is . We denote and as real and complex parts of a complex number , respectively. denotes the absolute value of . Under these assumptions and notations, the output vector

 \boldmath{r}=\boldmath{H}\boldmath{b}0+√N0\boldmath{η}, (1)

is received by a receiver, where denotes the input vector that is actually transmitted, and the components of the noise vector , (), independently obey circularly symmetric complex normal distributions .

In the performance analysis shown below, is regarded as a sample from a certain random matrix ensemble, typical samples of which are dense. Namely, we assume that most elements of typical do not vanish. Since an elegant framework has been already established for Gaussian inputs [12], we focus on cases of discrete inputs in which input symbols are expressed as , where . This expression corresponds to standard digital communication schemes of binary phase shift-keying (BPSK) and quadratic phase shift-keying (QPSK) for and , respectively. We further assume that is encoded so as to be uniformly generated as for optimizing communication performance.

After receiving the remaining task for the receiver is to infer the original vector . The optimal inference scheme to minimize the component-wise probability of incorrect estimation, which is referred to as , is constructed from the posterior distribution

 P(\boldmath{b}|\boldmath{r})=Z−1exp[−1Nr|\boldmath{r}−\boldmath{H}\boldmath{b% }|2], (2)

as in the hope that a model parameter of noise variance is in agreement with the correct value . Here,

 Z≡∑\scriptsize\boldmath{b}∈AKSexp[−1Nr|\boldmath{r}−\boldmath{H}\boldmath{b% }|2] (3)

serves as the partition function, and denotes the estimate of , where denotes the -th extension of symbol set .

3 Analytical scheme

3.1 Gauge transformation and Haar measure

Since the posterior distribution (2) depends on predetermined random variables , , and , we resort to the replica method for assessing typical property of the linear vector channel. Thus, we first substitute equation (1) into equation (3) and perform gauge transformation , where denotes the complex conjugate. This yields an expression of the replicate partition function (3) for as

 Zn=∑\scriptsize\boldmath{τ}1,% \scriptsize\boldmath{τ}2,…,\scriptsize\boldmath{τ}n∈AKSexp[−1Nrn∑a=1|\boldmath% {H}diag(\boldmath{b}0)(\boldmath{1}−% \boldmath{τ}a)+√N0\boldmath{η}|2], (4)

where is a -dimensional vector, all elements of which are unity and . The diagonal matrix is defined as . Next, we average this expression with respect to over the correct prior , fixing gauged vectors . Here, we consider a property whereby vectors are sampled isotropically in -dimensional vector space under constraints of relative configuration

 (\boldmath{1}−\boldmath{τ}a)⋅(% \boldmath{1}−\boldmath{τ}b)=(\boldmath{b}0−%\boldmath$b$a)⋅(\boldmath{b}0−\boldmath{b}b)=K(1−m∗a−mb+qab), (5)

for , and

 (\boldmath{1}−\boldmath{τ}a)⋅(% \boldmath{1}−\boldmath{τ}a)=(\boldmath{b}0−%\boldmath$b$a)⋅(\boldmath{b}0−\boldmath{b}a)=2K(1−ma), (6)

for , where is defined as for complex vectors and . and (). This implies that for any fixed set of , the average of the replicated Boltzmann weight with respect to is assessed as

 1SK∑\scriptsize\boldmath{b}0∈AKSexp[−1Nrn∑a=1|\boldmath{H}diag(\boldmath{b}0)(\boldmath{1}−\boldmath{τ}a)+√N0\boldmath{η}|2] (7) ≃∫D\boldmath{U}exp[−1Nrn∑a=1|\boldmath{H}\boldmath{U}(\boldmath{1}% −\boldmath{τ}a)+√N0\boldmath{η}|2], (8)

where denotes the Haar measure of unitary matrix , which is normalized as .

Equation (8) is justified by the following arguments. For a fixed typical pair of and , which is dense, components of , denoted as , are composed of many independent random variables and, therefore, can be dealt with as complex Gaussian variables as a consequence of the central limit theorem when is sampled from the uniform distribution . This means that statistical properties of are fully characterized by only the first and second moments, which are determined by those of matrix components of . More precisely, the relevant moments are evaluated as and for and , where represents average with respect to . The final replacement for the second moments is allowed as and are statistically uncorrelated a priori. Here, it is noteworthy that the identical moments are reproduced by substituting unitary matrix for in conjunction with replacement of the Haar measure with . This validates equation (8), which will be supported by numerical experiments shown later herein as well.

3.2 G-functions and free energy

Next, averaging with respect to , we obtain

 ∫CLd\boldmath{η}πLexp[−|\boldmath{η}|2]×exp[−1Nrn∑a=1|\boldmath{H}\boldmath{U}(\boldmath{1}−\boldmath{τ}a)+√N0\boldmath{η}|2] (9) =exp[Tr\boldmath{R}\boldmath{L}(n)], (10)

where , and

 \boldmath{L}(n)= − 1Nrn∑a=1(\boldmath{1}−% \boldmath{τ}a)(\boldmath{1}−\boldmath{τ}a)† (11) + N0Nr(Nr+nN0)(n∑a=1(% \boldmath{1}−\boldmath{τ}a))(n∑b=1(%\boldmath$1$−\boldmath{τ}b))†, (12)

where denotes integration over a certain support set . Next, integrating equation (10) over the Haar measure in conjunction with taking average with respect to yields an expression of the averaged Boltzmann factor:

 ¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯exp[−1Nrn∑a=1|% \boldmath{H}diag(\boldmath{b}0)(\boldmath{1}−\boldmath{τ}a)+√N0\boldmath{η}|2] (13) ≃[∫D\boldmath{U}exp[Tr\boldmath{R}\boldmath{L}(n)]]\scriptsize% \boldmath{H} (14) ≃[exp[KTrG\scriptsize\boldmath{H}% (\boldmath{L}(n)K)]]% \scriptsize\boldmath{H}≃exp⎡⎣KTr[G\scriptsize% \boldmath{H}(\boldmath{L}(n)K)]% \scriptsize\boldmath{H}⎤⎦ (15) ≡exp[KTrG(\boldmath{L}(n)K)], (16)

for large , where and denote the averages with respect to , , and and to only , respectively. Transformation of equation (15) is valid if the eigenvalue spectrum of is self-averaging, i.e., if the discrepancy between the eigenvalue spectrum of typical samples of and its average vanishes as with keeping the load finite, as assumed hereinafter. A practical method to evaluate functions and and the validation of equation (15) are discussed later herein.

Intrinsic permutation symmetry among replicas naturally leads to the replica symmetric (RS) ansatz. This implies that configurations characterized by and provide the most dominant contribution to the evaluation of . Note that the RS ansatz requires obvious symmetry , which enforces to be real at the saddle point for . Under this ansatz, the matrix has three types of eigenvalues: , and , the numbers of degeneracy of which are , , and , respectively. This indicates that equation (16) is evaluated as

 (17)

In addition, the RS ansatz offers the number of microscopic configurations that satisfy constraints (5) and (6) as

 Tr{\scriptsize\boldmath{τ}a}n∏a=1δ(\boldmath{1}⋅\boldmath{τ}a−Km)∏a>bδ(\boldmath{τ}a⋅\boldmath{τ}b−Kq)≃exp[KSn(q,m)], (18)

where

 (19) Sn(q,m)=Extr^q,^m{aaaaaaa−n^mm−n2^q−n(n−1)2^qq}, (20)

where indicates the extremization of with respect to and denotes the complex Gaussian measure, the variance of which is normalized to unity in each direction of the real and complex axes. Analytically continuing equations (17) and (18) from to yields an expression for assessing the configurational average of free energy as

 1K¯¯¯¯¯¯¯¯¯lnZ=limn→01nKln¯Zn (21) =Extrm,q,^m,^q{G(−1−qNr)+(−1−2m+qNr+N0(1−q)N2r)G′(−1−qNr) (22) −^mm−^q(1−q)2+∫CDζln⎛⎝∑τ∈ASexp[Re((√^qζ∗+^m)τ)]⎞⎠⎫⎬⎭. (23)

The saddle-point solution yields the bit error rate for the demodulation strategy as , where vanishes if is maximized by among , and unity, otherwise.

Note that both the RS assessment presented here and that of the replica symmetry breaking (RSB) ansatz are generally possible following the current framework. For instance, as shown in Appendix A, one can evaluate the free energy under the ansatz of one step replica symmetry breaking (1RSB) by dividing replicated vectors into groups of identical size and assuming that the correct saddle point is characterized by the following relative configuration: , and for , for the case in which but and belong to an identical group and otherwise, respectively. In the limit , the 1RSB saddle-point equation is obtained as

 ^Δ=2xNr(G′0−G′1),^q=2N0N2rG′1−2ANrG′′1,^m=2NrG′1,Δ=∫CDζ(∫CDηΞx|⟨τ⟩1|2∫CDηΞx−∣∣∣∫CDηΞx⟨τ⟩1∫CDηΞx∣∣∣2),q=∫CDζ∣∣∣∫CDηΞx⟨τ⟩1∫CDηΞx∣∣∣2,m=∫CDζ∫CDηΞxRe(⟨τ⟩1)∫CDηΞx. (24)

Here, , , , , , and .

For , a useful relation

 ∫CDηΞ⟨τ⟩1∫CDηΞ=∑τ∈ASτexp[Re((√^qζ∗+^m)τ)]∑τ∈ASexp[Re((√^qζ∗+^m)τ)]≡⟨τ⟩0, (25)

implies that the saddle-point condition of equation (24) is governed by only four out of six variables, i.e., and . Actually, the condition for determining the four variables in this case corresponds to the saddle-point equation of RS free energy (23), which implies that RSB analysis is generally reduced to that of RS. Nevertheless, a nontrivial result can still be obtained by investigating the behaviors of and , which are subserviently determined from equation (24). This equation guarantees that always satisfies the saddle-point condition. However, for stability analysis indicates that a nontrivial solution of and emerges if

 (26)

which is in accordance with the de Almeida-Thouless (AT) condition signaling the local instability of the RS solution [16].

3.3 Equivalence between quenched and annealed averages in the assessment of G(x)

can be assessed by several formulae [17, 18, 19]. One formula uses the Stieltjes (or Cauchy) transformation of , which is the eigenvalue spectrum of cross-correlation matrix ,

 x=∫dλρ\scriptsize\boldmath{H}(λ)Λ(x)−λ (27)

where are eigenvalues of and are guaranteed to be real because is Hermitian. For given , this relationship determines implicitly, which is termed the Stieltjes inversion formula. Using , is assessed as

 G\scriptsize\boldmath{H}(x)=∫x0dt(Λ(t)−t−1), (28)

which is equivalent to the -transformation known in free probability theory [20, 21].

Here, we describe a rather primitive approach. For this objective, we substitute for in equation (10), where is a certain complex vector, the length of which is fixed as . Note that the eigenvalues of this operator are and zero, the degeneracies of which are and , respectively. Integrating over yields the following expression:

 exp[KG\scriptsize\boldmath{H}(x)]=exp[KTrG\scriptsize\boldmath{H}(x% \boldmath{e}\boldmath{e}†K)]=∫D\boldmath{U}exp[Tr\boldmath{R}(x\boldmath{% e}\boldmath{e}†)] (29) =∫CKd\boldmath{u}δ(|% \boldmath{u}|2−Kx)exp[\boldmath{u}†(% \boldmath{H}†\boldmath{H})\boldmath{u}]∫CKd\boldmath{u}δ(|\boldmath{u}|2−Kx), (30)

where we set . Inserting and employing the saddle-point method with respect to the integration of yields the following expression:

 G\scriptsize\boldmath{H}(x) = ExtrΛ{−1Klndet|Λ−\boldmath{H}†\boldmath{H}|+Λx}−lnx−1, (31) = ExtrΛ{−1KK∑k=1ln|Λ−λk|+Λx}−lnx−1, (32) = ExtrΛ{−∫dλρ% \scriptsize\boldmath{H}(λ)ln|Λ−λ|+Λx}−lnx−1. (33)

In conjunction with equation (16), this indicates that is offered as

 G(x)=ExtrΛ{−∫dλρ(λ)ln|Λ−λ|+Λx}−lnx−1, (34)

using average spectrum if a self-averaging property as holds for typical samples of .

can be formally assessed as follows [22]. For this, we introduce a partition function of complex Gaussian spins as

 ZGauss\scriptsize\boldmath{H}(λ) ≡ ∫CKd\boldmath{u}exp[−% \boldmath{u}†(Λ\boldmath{I}K−\boldmath{H% }†\boldmath{H})\boldmath{u}] (35) = πK[det(Λ\boldmath{I}K−% \boldmath{H}†\boldmath{H})]−1, (36)

where is a -dimensional complex vector and denotes a identity matrix. The dispersion formula

 δ(Λ−λ)=limϵ→+01πIm(1Λ−λ+iϵ)=−1πIm(∂∂Λln(Λ−λ)), (37)

indicates that can be assessed as

 ρ(Λ)=1πIm[∂∂Λ1K[lnZGauss\scriptsize\boldmath{H}(Λ)]\scriptsize\boldmath{H}], (38)

where can be evaluated by the replica method.

For this evaluation, we assess the moments of for with use of the saddle-point method, assuming that the saddle point is characterized by the Hermitian matrix . This yields the following expression:

 1Kln[(ZGauss\scriptsize% \boldmath{H}(Λ))n]\scriptsize\boldmath{H} (39) =Extr^\scriptsize\boldmath{Q},% \scriptsize\boldmath{Q}{1K[ln∫CnKn∏a=1d\boldmath{u}aexp[−n∑a=1(% \boldmath{u}a)†((Λ−^Qaa)\boldmath{I}% K−\boldmath{H}†\boldmath{H})% \boldmath{u}a (40) aaaaaaaaaaaaaaaaaaaaaaaaaaa+∑a≠b^Qab\boldmath{u}a⋅\boldmath{u}b⎤⎦⎤⎦\scriptsize\boldmath{H}−Tr^\boldmath% {Q}\boldmath{Q}⎫⎪⎬⎪⎭, (41)

where denotes the conjugate matrix of (Hermitian: does not indicate the Hermitian conjugate of ) to perform the saddle-point assessment. A distinctive property of equation (41) is that always satisfies the saddle-point condition for . This means that

 1nKln[(ZGauss\scriptsize% \boldmath{H}(Λ))n]\scriptsize\boldmath{H}=1K[−lndet(Λ\boldmath{I}K−% \boldmath{H}†\boldmath{H})]% \scriptsize\boldmath{H}+lnπ (42) =1Kln[ZGauss\scriptsize\boldmath{H}(Λ)]\scriptsize\boldmath{H}, (43)

generally holds for the partition function and . Extending this expression analytically from to and taking yield a formula by which to evaluate using the annealed average of the partition function as

 ρ(Λ)=1πIm[∂∂Λ1Kln[ZGauss\scriptsize\boldmath{H}(Λ)]\scriptsize\boldmath{H}]. (44)

Let us insert this expression to equation (34) and perform partial integral. This operation and the dispersion formula (37) yield another expression of as follows:

 G(x)=ExtrΛ{1Kln[ZGauss\scriptsize\boldmath{H}(Λ)]% \scriptsize\boldmath{H}+Λx−lnπ}−lnx−1 (45) =ExtrΛ{1Kln∫CKd\boldmath{u}πK[exp[−\boldmath{u}†(Λ\boldmath{I}K−\boldmath{H}†\boldmath{H})\boldmath{u}]]\scriptsize% \boldmath{H}+Λx}−lnx−1 (46) =1Kln⎡⎢ ⎢⎣∫CKd\boldmath{% u}δ(|\boldmath{u}|2−Kx)exp[\boldmath{u}†(\boldmath{H}†\boldmath{H})\boldmath{% u}]∫CKd\boldmath{u}δ(|% \boldmath{u}|2−Kx)⎤⎥ ⎥⎦\scriptsize\boldmath{H} (47) =1Kln[exp[KG\scriptsize\boldmath{H% }(x)]]\scriptsize\boldmath{H}. (48)

Equations (16), (33), and (48) indicate that equivalence between annealed and quenched averages holds in the assessment of , which validates replacement in equation (15).

The current argument may be useful in assessing the typical performance of an ensemble of channels. Equations (27) and (28) can be used for evaluating the performance of a single sample of or for evaluating the performance of a channel ensemble, in which the eigenvalue spectrum is fixed [13, 21]. However, naive extension to the analysis of the typical performance of a channel ensemble along this direction generally requires taking configurational averages with respect to a certain distribution of after evaluating the sample-by-sample performance of using equations (27) and (28). From a practical viewpoint, this is not possible. However, equation (48) indicates that the typical performance of a channel ensemble can be evaluated using a single function that characterizes the ensemble property if the eigenvalue spectra are self-averaging. This function can be assessed by the calculation of an annealed average, which, practically speaking, is in most cases much simpler than the calculation of the quenched averages. Formula (44) is useful as well because this equation can be used to numerically evaluate for a certain class of ensemble, for which analytical evaluation of is difficult. This is demonstrated in the next section.

4 Application

4.1 Kronecker model

Let us demonstrate the significance of the proposed framework by analyzing a certain channel ensemble. The ensemble that we focus on is termed the Kronecker model, which is a standard MIMO model [12, 23, 24].

In this model, the channel transfer matrix is represented as

 \boldmath{H}=√\boldmath{R}r% \boldmath{Z}√\boldmath{R}t, (49)

where is an random matrix, each component of which is independently sampled from an identical circularly symmetric complex Gaussian distribution