Asymptotic properties of eigenmatrices of a large sample covariance matrix

Z. D. Bai, H. X. Liu and W. K. Wong

Northeast Normal University, National University of Singapore and Hong Kong Baptist University

Z. D. Bai
KLASMOE and School of Mathematics and Statistics
Northeast Normal University
Changchun 130024
China

H. X. Liu
Department of Statistics and Applied Probability
National University of Singapore
Science Drive 2, 117546
Singapore

W. K. Wong
Department of Economics
Hong Kong Baptist University
Hong Kong
China

Received October 2009; revised February 2010.
Abstract

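Let $\mathbf{S}_n=\frac{1}{n}\mathbf{X}_n\mathbf{X}_n^*$, where $\mathbf{X}_n=\{X_{ij}\}$ is a $p\times n$ matrix with i.i.d. complex standardized entries having finite fourth moments. Let $Y_n(\mathbf{t}_1,\mathbf{t}_2,\sigma)=\sqrt{p}\,\bigl(\mathbf{x}_n(\mathbf{t}_1)^*(\mathbf{S}_n+\sigma\mathbf{I})^{-1}\mathbf{x}_n(\mathbf{t}_2)-\mathbf{x}_n(\mathbf{t}_1)^*\mathbf{x}_n(\mathbf{t}_2)m_n(\sigma)\bigr)$, in which $\sigma>0$ and $m_n(\sigma)=\int\frac{dF_{y_n}(x)}{x+\sigma}$, where $F_{y_n}(x)$ is the Marčenko–Pastur law with parameter $y_n=p/n$, which converges to a positive constant as $n\to\infty$. Here $\mathbf{x}_n(\mathbf{t}_1)$ and $\mathbf{x}_n(\mathbf{t}_2)$ are unit vectors in $\mathbb{C}^p$ whose indices $\mathbf{t}_1$ and $\mathbf{t}_2$ range over a compact subset of a finite-dimensional Euclidean space. In this paper, we prove that the sequence $Y_n(\mathbf{t}_1,\mathbf{t}_2,\sigma)$ converges weakly to a $(2m+1)$-dimensional Gaussian process. This result provides further evidence in support of the conjecture that the distribution of the eigenmatrix of $\mathbf{S}_n$ is asymptotically close to that of a Haar-distributed unitary matrix.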

Ann. Appl. Probab. 21 (2011), no. 5, 1994–2015. DOI: 10.1214/10-AAP748

Running title: Eigenmatrices of a sample covariance matrix

E-mail: baizd@nenu.edu.cn (Z. D. Bai, corresponding author); huixialiu@dbs.com.sg (H. X. Liu); ecswwk@nus.edu.sg (W. K. Wong)

Z. D. Bai was supported by the NSFC 10871036 and NUS Grant R-155-000-079-112. W. K. Wong was supported by grants from Hong Kong Baptist University.

AMS subject classifications: Primary 15A52; secondary 60F05, 15A18.

Keywords: Random matrix, central limit theorems, linear spectral statistics, sample covariance matrix, Haar distribution, Marčenko–Pastur law, semicircular law.

1 Introduction

Suppose that $\{x_{jk},\ j,k=1,2,\ldots\}$ is a double array of complex random variables that are independent and identically distributed (i.i.d.) with mean zero and variance 1. Let $\mathbf{x}_k=(x_{1k},\ldots,x_{pk})'$ and $\mathbf{X}_n=(\mathbf{x}_1,\ldots,\mathbf{x}_n)$. We define

$$\mathbf{S}_n=\frac{1}{n}\sum_{k=1}^{n}\mathbf{x}_k\mathbf{x}_k^*=\frac{1}{n}\mathbf{X}_n\mathbf{X}_n^*, \tag{1}$$

where $\mathbf{x}_k^*$ and $\mathbf{X}_n^*$ are the transposes of the complex conjugates of $\mathbf{x}_k$ and $\mathbf{X}_n$, respectively. The matrix $\mathbf{S}_n$ defined in (1) can be viewed as the sample covariance matrix of a $p$-dimensional random sample of size $n$. When the dimension $p$ is fixed and the sample size $n$ is large, the spectral behavior of $\mathbf{S}_n$ has been extensively investigated in the literature due to its importance in multivariate statistical inference [see, e.g., Anderson (1951, 1989)]. However, when the dimension $p$ is proportional to the sample size $n$ in the limit, that is, $p/n\to y\in(0,\infty)$ as $n\to\infty$, the classical asymptotic theory becomes seriously inaccurate. This phenomenon can be easily explained from the viewpoint of random matrix theory (RMT).

Before introducing our advancement of the theory, we first give a brief review of some well-known properties of $\mathbf{S}_n$ in RMT. We define the empirical spectral distribution (ESD) of $\mathbf{S}_n$ by

$$F^{\mathbf{S}_n}(x)=\frac{1}{p}\sum_{j=1}^{p}I(\lambda_j\le x),$$

where the $\lambda_j$'s are the eigenvalues of $\mathbf{S}_n$. First, it has long been known that $F^{\mathbf{S}_n}(x)$ converges almost surely to the standard Marčenko–Pastur law [MPL; see, e.g., Marčenko and Pastur (1967), Wachter (1978) and Yin (1986)] $F_y(x)$, which has a density function $(2\pi xy)^{-1}\sqrt{(x-a)(b-x)}$, supported on $[a,b]=[(1-\sqrt{y})^2,(1+\sqrt{y})^2]$. For the case $y>1$, $F_y(x)$ has a point mass $1-1/y$ at 0. If the fourth moment of the entries is finite, then, as $n\to\infty$, the largest eigenvalue of $\mathbf{S}_n$ converges to $b$ while the smallest eigenvalue (when $y\le 1$) or the $(p-n+1)$st smallest eigenvalue (when $y>1$) converges to $a$ [see Bai (1999) for a review]. The central limit theorem (CLT) for linear spectral statistics (LSS) of $\mathbf{S}_n$ has been established in Bai and Silverstein (2004).
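The convergence of the extreme eigenvalues to the edges of the MPL support can be checked numerically. The following is a minimal sketch, assuming real Gaussian entries with variance 1 and the normalization $\mathbf{S}_n=\mathbf{X}\mathbf{X}'/n$; the function names and the choice $p=200$, $n=800$ are illustrative assumptions and are not taken from the paper.

```python
import numpy as np


def sample_covariance_eigenvalues(p, n, seed=0):
    """Eigenvalues of S = X X' / n for a p x n matrix of i.i.d. N(0, 1) entries."""
    rng = np.random.default_rng(seed)
    X = rng.standard_normal((p, n))
    S = X @ X.T / n
    return np.linalg.eigvalsh(S)


def mp_support(y):
    """Edges of the standard Marcenko-Pastur support for ratio y = p / n."""
    return (1.0 - np.sqrt(y)) ** 2, (1.0 + np.sqrt(y)) ** 2


p, n = 200, 800  # illustrative sizes (assumption), so that y = 0.25
eigs = sample_covariance_eigenvalues(p, n)
a, b = mp_support(p / n)
print("smallest eigenvalue:", eigs.min(), "lower edge:", a)
print("largest eigenvalue: ", eigs.max(), "upper edge:", b)
```

The agreement is asymptotic only; for moderate dimensions the extreme eigenvalues should lie close to, but not exactly at, the edges.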

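While results on the eigenvalues of $\mathbf{S}_n$ are abundant in the literature, not much work has been done on the behavior of the eigenvectors of $\mathbf{S}_n$. It has been conjectured that the eigenmatrix, that is, the matrix of orthonormal eigenvectors of $\mathbf{S}_n$, is asymptotically Haar-distributed. This conjecture has yet to be formally proven due to the difficulty of describing the “asymptotically Haar-distributed” properties when the dimension $p$ increases to infinity. Silverstein (1981) was the first to propose an approach to characterize the eigenvector properties. We describe his approach as follows: denoting the spectral decomposition of $\mathbf{S}_n$ by $\mathbf{U}_n\boldsymbol{\Lambda}_n\mathbf{U}_n^*$, if the entries $x_{jk}$ are normally distributed, $\mathbf{U}_n$ has a Haar measure on the orthogonal matrices and is independent of the eigenvalues in $\boldsymbol{\Lambda}_n$. For any unit vector $\mathbf{x}_n$, the vector $\mathbf{y}_n=(y_1,\ldots,y_p)'=\mathbf{U}_n^*\mathbf{x}_n$ is then uniformly distributed over the unit sphere in $\mathbb{R}^p$. As such, for $t\in[0,1]$, a stochastic process

$$X_n(t)=\sqrt{p/2}\,\sum_{i=1}^{[pt]}\Bigl(|y_i|^2-\frac{1}{p}\Bigr)$$

is defined. If $\mathbf{z}=(z_1,\ldots,z_p)'\sim N(0,\mathbf{I}_p)$, then $\mathbf{y}_n$ has the same distribution as $\mathbf{z}/\|\mathbf{z}\|$ and $X_n(t)$ is identically distributed with

$$\sqrt{p/2}\,\|\mathbf{z}\|^{-2}\sum_{i=1}^{[pt]}\Bigl(|z_i|^2-\frac{\|\mathbf{z}\|^2}{p}\Bigr).$$

Applying Donsker’s theorem [Donsker (1951)], $X_n(t)$ tends to a standard Brownian bridge.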
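The Brownian-bridge limit can be illustrated by simulation. The sketch below assumes the real (orthogonal) case and the process form $X_n(t)=\sqrt{p/2}\sum_{i\le[pt]}(y_i^2-1/p)$ with $\mathbf{y}=\mathbf{z}/\|\mathbf{z}\|$ for a standard normal $\mathbf{z}$; the function name, the grid and the Monte Carlo sizes are assumptions made for illustration. It compares the empirical variance of $X_n(t)$ with the Brownian-bridge variance $t(1-t)$.

```python
import numpy as np


def silverstein_process(y, grid):
    """Evaluate X_n(t) = sqrt(p/2) * sum_{i <= [pt]} (y_i^2 - 1/p) on a grid of t."""
    p = y.size
    # Prepend 0 so that index k holds the partial sum over the first k terms.
    partial = np.concatenate(([0.0], np.cumsum(y ** 2 - 1.0 / p)))
    idx = np.floor(p * grid).astype(int)  # integer part [pt]
    return np.sqrt(p / 2.0) * partial[idx]


rng = np.random.default_rng(1)
p, reps = 200, 2000                  # illustrative sizes (assumption)
grid = np.array([0.25, 0.5, 1.0])
paths = np.empty((reps, grid.size))
for r in range(reps):
    z = rng.standard_normal(p)
    # z / ||z|| is uniform on the unit sphere of R^p.
    paths[r] = silverstein_process(z / np.linalg.norm(z), grid)

empirical_var = paths.var(axis=0)
bridge_var = grid * (1.0 - grid)     # variance of a standard Brownian bridge
print(empirical_var, bridge_var)
```

The process vanishes at $t=1$ because the squared coordinates of a unit vector sum to one, which mirrors the pinned endpoint of the bridge.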

For any general large sample covariance matrix, it is important to examine the behavior of the process $X_n(t)$. Silverstein (1981, 1984, 1989) proves that the integral of polynomial functions with respect to $X_n(F^{\mathbf{S}_n}(x))$ tends to a normal distribution. To overcome the difficulty of tightness, Silverstein (1990) takes $\mathbf{x}_n=(\pm1,\ldots,\pm1)'/\sqrt{p}$ so that the process $X_n(F^{\mathbf{S}_n}(x))$ tends to the standard Brownian bridge instead. In addition, Bai, Miao and Pan (2007) investigate the process $X_n(F^{\mathbf{A}_n}(x))$, defined for $\mathbf{A}_n=\mathbf{T}_n^{1/2}\mathbf{S}_n\mathbf{T}_n^{1/2}$ with $\mathbf{T}_n$ a nonnegative definite matrix.

However, so far, the process $X_n(t)$ is assumed to be generated by only one unit vector $\mathbf{x}_n$ in $\mathbb{C}^p$. This restricts its use in many practical situations. For example, in the derivation of the limiting properties of the bootstrap-corrected Markowitz portfolio estimates, we need to consider two nonparallel vectors simultaneously [see Bai, Liu and Wong (2009) and Markowitz (1952, 1959, 1991)]. In this paper, we go beyond these studies to investigate the asymptotics of the eigenmatrix for any general large sample covariance matrix $\mathbf{S}_n$ when $\mathbf{x}_n$ runs over a subset of the $p$-dimensional unit sphere $\mathbb{C}_1^p=\{\mathbf{x}\in\mathbb{C}^p:\|\mathbf{x}\|=1\}$.

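We describe the approach introduced in this paper as follows: if $\mathbf{U}_n$ is Haar-distributed, for any pair of $p$-vectors $\mathbf{x}$ and $\mathbf{y}$ satisfying $\mathbf{x}\perp\mathbf{y}$, $(\mathbf{U}_n^*\mathbf{x},\mathbf{U}_n^*\mathbf{y})$ possesses the same joint distribution as

$$(\mathbf{z}_1,\mathbf{z}_2)\begin{pmatrix}\mathbf{z}_1^*\mathbf{z}_1&\mathbf{z}_1^*\mathbf{z}_2\\ \mathbf{z}_2^*\mathbf{z}_1&\mathbf{z}_2^*\mathbf{z}_2\end{pmatrix}^{-1/2}, \tag{2}$$

where $\mathbf{z}_1$ and $\mathbf{z}_2$ are two independent $p$-vectors whose components are i.i.d. standard normal variables. As $n$ tends to infinity, we have

$$\frac{1}{p}\begin{pmatrix}\mathbf{z}_1^*\mathbf{z}_1&\mathbf{z}_1^*\mathbf{z}_2\\ \mathbf{z}_2^*\mathbf{z}_1&\mathbf{z}_2^*\mathbf{z}_2\end{pmatrix}\longrightarrow\mathbf{I}_2. \tag{3}$$

Therefore, any group of functionals defined by these two random vectors should be asymptotically independent of each other. We shall adopt this setup to explore the conjecture that $\mathbf{U}_n$ is asymptotically Haar-distributed.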
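The law-of-large-numbers step behind this argument can be sketched numerically: the Gram matrix of two independent standard normal $p$-vectors, divided by $p$, is close to the $2\times2$ identity, so orthonormalizing the pair changes each vector only slightly. The code below uses real Gaussian vectors and a symmetric orthonormalization (multiplying by the inverse square root of the Gram matrix); treating this as the construction in (2) is an assumption, as are all variable names.

```python
import numpy as np

rng = np.random.default_rng(2)
p = 20000                                # illustrative dimension (assumption)
Z = rng.standard_normal((p, 2))          # columns play the role of z_1 and z_2

gram = Z.T @ Z / p                       # normalized 2 x 2 Gram matrix
print(gram)                              # close to the identity for large p

# Symmetric orthonormalization: Q = Z (Z'Z)^{-1/2} has orthonormal columns.
w, V = np.linalg.eigh(Z.T @ Z)
inv_sqrt = V @ np.diag(w ** -0.5) @ V.T
Q = Z @ inv_sqrt
print(Q.T @ Q)                           # identity up to rounding error
```

Because the Gram matrix is nearly $p\,\mathbf{I}_2$, the orthonormal pair is nearly $(\mathbf{z}_1,\mathbf{z}_2)/\sqrt{p}$, which is what makes functionals of the two vectors asymptotically independent.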

We consider and to be two -vectors with an angle . Thereafter, we find two orthonormal vectors and such that

By (2) and (3), we have

(4)

Let $\sigma>0$ be a positive constant. We now consider the following three quantities:

(5)

We hypothesize that if is asymptotically Haar-distributed and is asymptotically independent of , then the above three quantities should be asymptotically equivalent to

(6)

respectively. We then proceed to investigate the stochastic processes related to these functionals. By using the Stieltjes transform of the sample covariance matrix, we have

where is a solution to the quadratic equation

(7)

Here, the selection of is due to the fact that as . By using the same argument, we conclude that

Applying the results in Bai, Miao and Pan (2007), it can be easily shown that, for the complex case,

(8)

and for the real case, the limiting variance is , where , is with replaced by such that

and

Here, the definitions of “real case” and “complex case” are given in Theorem 1 as stated in the next section. By the same argument, one could obtain a similar result such that

(9)

We normalize the second term in (5) and, thereafter, derive the CLT for the joint distribution of all three terms stated in (5) after normalization. More notably, we establish some limiting behaviors of the processes defined by these normalized quantities.
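The first-order limits discussed above can be illustrated numerically. For the standard MPL with ratio $y$, the quantity $m(\sigma)=\int dF_y(x)/(x+\sigma)$ is the root of $y\sigma m^2+(\sigma+1-y)m-1=0$ that vanishes as $\sigma\to\infty$; this is the usual Marčenko–Pastur identity evaluated at $z=-\sigma$, and whether it coincides term by term with (7) is an assumption. The sketch below, with hypothetical names and real Gaussian data, compares $m(\sigma)$ with quadratic forms of the resolvent for two orthonormal vectors.

```python
import numpy as np


def mp_stieltjes_at_minus_sigma(sigma, y):
    """Root of y*sigma*m^2 + (sigma + 1 - y)*m - 1 = 0 that tends to 0 as sigma grows."""
    b = sigma + 1.0 - y
    return (-b + np.sqrt(b * b + 4.0 * y * sigma)) / (2.0 * y * sigma)


rng = np.random.default_rng(3)
p, n, sigma = 300, 600, 1.0              # illustrative values (assumption)
X = rng.standard_normal((p, n))
S = X @ X.T / n
R = np.linalg.inv(S + sigma * np.eye(p))  # (S_n + sigma I)^{-1}

x = np.zeros(p)
x[0] = 1.0                                # first unit vector
v = np.zeros(p)
v[1] = 1.0                                # second unit vector, orthogonal to x

m = mp_stieltjes_at_minus_sigma(sigma, p / n)
quad_xx = x @ R @ x                       # should be close to m
quad_xv = x @ R @ v                       # should be close to 0
print(m, quad_xx, quad_xv, np.trace(R) / p)
```

The diagonal form concentrates around $m(\sigma)$ and the cross form around zero; the fluctuations of order $p^{-1/2}$ around these limits are the object of the CLT.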

2 Main results

Let be a subset of the unit -sphere indexed by an -dimensional hyper-cube . For any arbitrarily chosen orthogonal unit -vectors , we define

(10)

If is chosen in the form of (10), then the inner product is a function of and only (i.e., independent of ). Also, the norm of the difference (we call it norm difference in this paper) satisfies the Lipschitz condition. If the time index set is chosen arbitrarily, we could assume that the angle, , between and tends to a function of and whose norm difference satisfies the Lipschitz condition.

Thereafter, we define a stochastic process mapping from the time index set to with () such that

where .

Remark. If the sample covariance matrix is real, the vectors and will be real, and thus, the set has to be defined as a subset of the unit sphere . The time index can be similarly described for the complex case. In what follows, we shall implicitly use the convention for the real case.

We have the following theorem.

Theorem 1

Assume that the entries of $\mathbf{X}_n$ are i.i.d. with mean 0, variance 1, and finite fourth moments. If the variables are complex, we further assume $EX_{11}^2=0$ and $E|X_{11}|^4=2$, and refer to this case as the complex case. If the variables are real, we assume $EX_{11}^4=3$ and refer to it as the real case. Then, as $n\to\infty$, the process $Y_n(\mathbf{t}_1,\mathbf{t}_2,\sigma)$ converges weakly to a multivariate Gaussian process with mean zero and variance–covariance function satisfying

for the complex case and satisfying

for the real case where

and

We will provide the proof of this theorem in the next section. We note that Bai, Miao and Pan (2007) have proved that

for the complex case and proved that the asymptotic variance is for the real case.

More generally, if and are two orthonormal vectors, applying Theorem 1, we obtain the limiting distribution of the three quantities stated in (5) with normalization such that

(11)

for the complex case while the asymptotic covariance matrix is

for the real case.

Remark. This theorem shows that the three quantities stated in (5) are asymptotically independent of one another. It provides stronger support for the conjecture that the eigenmatrix is asymptotically Haar-distributed than the results established in the previous literature.

In many practical applications, such as wireless communications and electrical engineering [see, e.g., Evans and Tse (2000)], we are interested in extending the process defined on a region where is a compact subset of the complex plane and is disjoint from the interval , the support of the MPL. We can define a complex measure by putting complex mass at , the th eigenvalue of , where is the -vector with 1 in its th entry and 0 otherwise. In this situation, the Stieltjes transform of this complex measure is

where with . When considering the CLT of LSS associated with the complex measure defined above, we need to examine the limiting properties of the Stieltjes transforms, which lead to the extension of the process to , where is an index number in .

If is a constant (or has a limit, we still denote it as for simplicity), it follows from Lemma 6 that

where

is the Stieltjes transform of MPL, in which, by convention, the square root takes the one with the positive imaginary part. When is real, is defined as the limit from the upper complex plane. By definition, . In calculating the limit, we follow the conventional sign of the square root of a complex number that the real part of should have the opposite sign of , and thus

Now, we are ready to extend the process to

where is the Stieltjes transform of the LSD of in which is replaced by . Here, with or . Thereby, we obtain the following theorem.

Theorem 2

Under the conditions of Theorem 1, the process tends to a multivariate Gaussian process with mean 0 and covariance function satisfying

(12)

for the complex case and satisfying

(13)

for the real case where

Theorem 2 follows from Theorem 1 and the Vitali lemma [see Lemma 2.3 of Bai and Silverstein (2004)] since both and are analytic functions when is away from , the support of the MPL.
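For reference, the Stieltjes transform of the standard MPL has a closed form off the real axis, and the branch convention mentioned above (positive imaginary part) can be made concrete. The sketch below is an illustration under the assumption that $m_y(z)$ solves $yzm^2+(z+y-1)m+1=0$; for $\Im z>0$ it selects the root with positive imaginary part and compares it with the empirical average of $1/(\lambda_j-z)$. The function and variable names are hypothetical.

```python
import numpy as np


def mp_stieltjes(z, y):
    """Stieltjes transform of the standard MP law at a point z with Im z > 0."""
    z = complex(z)
    disc = np.sqrt((1.0 + y - z) ** 2 - 4.0 * y)
    roots = [(1.0 - y - z + s * disc) / (2.0 * y * z) for s in (1.0, -1.0)]
    # For Im z > 0 exactly one root lies in the upper half-plane; keep that one.
    return max(roots, key=lambda m: m.imag)


rng = np.random.default_rng(5)
p, n = 400, 800                          # illustrative sizes (assumption)
y = p / n
X = rng.standard_normal((p, n))
eigs = np.linalg.eigvalsh(X @ X.T / n)

z0 = 1.0 + 1.0j
m0 = mp_stieltjes(z0, y)
empirical = np.mean(1.0 / (eigs - z0))   # Stieltjes transform of the ESD at z0
residual = y * z0 * m0 ** 2 + (z0 + y - 1.0) * m0 + 1.0
print(m0, empirical, abs(residual))
```

Selecting the root by the sign of its imaginary part avoids having to track the branch of the complex square root by hand; values at real points outside the support are then obtained as limits from the upper half-plane.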

Suppose that is analytic on an open region containing the interval . We construct an LSS with respect to the complex measure as defined earlier; that is,

We then consider the normalized quantity

where is the standardized MPL. By applying the Cauchy formula

where is a contour enclosing , we obtain

where is a contour enclosing the interval , , and

Thereafter, we obtain the following two corollaries.

Corollary 1

Under the conditions of Theorem 1, for any functions analytic on an open region containing the interval , the -dimensional process

tends to the -dimensional stochastic multivariate Gaussian process with mean zero and covariance function satisfying

where for the complex case and for the real case. Here, and are two disjoint contours that enclose the interval such that the functions are analytic inside and on them.

Corollary 2

The covariance function in Corollary 1 can also be written as

where has been defined in Corollary 1.
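The contour representation underlying the two corollaries can be sanity-checked on a finite example. For a discrete (possibly signed or complex) measure with atoms $\lambda_j$ and masses $w_j$, the Cauchy formula gives $\sum_j w_j g(\lambda_j)=-\frac{1}{2\pi i}\oint g(z)s(z)\,dz$ with $s(z)=\sum_j w_j/(\lambda_j-z)$ and a counterclockwise contour enclosing all atoms. The sketch below uses a circular contour, the trapezoidal rule and real Gaussian data; the function name, the choice $g=\exp$ and the vectors are assumptions made for illustration.

```python
import numpy as np


def lss_via_contour(g, atoms, weights, center, radius, num_nodes=400):
    """Evaluate sum_j w_j g(lambda_j) as a contour integral of g times the Stieltjes transform."""
    theta = 2.0 * np.pi * np.arange(num_nodes) / num_nodes
    z = center + radius * np.exp(1j * theta)           # nodes on the circle
    dz = 1j * radius * np.exp(1j * theta) * (2.0 * np.pi / num_nodes)
    # Stieltjes transform of the discrete measure at every contour node.
    s = (weights[None, :] / (atoms[None, :] - z[:, None])).sum(axis=1)
    return -(g(z) * s * dz).sum() / (2j * np.pi)


rng = np.random.default_rng(4)
p, n = 50, 100                                         # illustrative sizes (assumption)
X = rng.standard_normal((p, n))
lam, U = np.linalg.eigh(X @ X.T / n)

x = np.zeros(p)
x[0] = 1.0
v = np.ones(p) / np.sqrt(p)
weights = (U.T @ x) * (U.T @ v)                        # mass placed at each eigenvalue

direct = np.sum(weights * np.exp(lam))                 # x' exp(S) v computed directly
center = 0.5 * (lam.min() + lam.max())
radius = 0.5 * (lam.max() - lam.min()) + 1.0           # circle encloses all eigenvalues
contour = lss_via_contour(np.exp, lam, weights, center, radius)
print(direct, contour)
```

The trapezoidal rule converges geometrically on a circle for analytic integrands, so a few hundred nodes already reproduce the direct evaluation to rounding accuracy.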

3 The proof of Theorem 1

To prove Theorem 1, by Lemma 7, it is sufficient to show that tends to the limit process . We first prove finite-dimensional convergence in Section 3.1 and then prove tightness in Section 3.3. Throughout the paper, the limit is taken as $n\to\infty$.

3.1 Finite-dimensional convergence

Under the assumption of a finite fourth moment, we follow Bai, Miao and Pan (2007) to truncate the random variables at for all and in which before renormalizing the random variables to have mean 0 and variance 1. Therefore, it is reasonable to impose an additional assumption that for all and .

Suppose denotes the th column of . Let and . Let and be any two vectors in . We define

and

We also define the $\sigma$-field $\mathcal{F}_j$. We denote by $E_j(\cdot)$ the conditional expectation when $\mathcal{F}_j$ is given. By convention, $E_0$ denotes the unconditional expectation.

Using the martingale decomposition, we have

Therefore,

Consider the -dimensional distribution of where . Invoking Lemma 3, we will have

for any constants , , where

and

for the complex case and

for the real case.

To this end, we will verify the Liapounov condition and calculate the asymptotic covariance matrix (see Lemma 3) in the next subsections.
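For readers less familiar with the technique, the martingale decomposition invoked above has the following generic form. Here $Q_n$ stands for any integrable functional of the $n$ columns and $E_j$ for the conditional expectation given the first $j$ columns, with $E_0=E$ and $E_nQ_n=Q_n$; these names and the choice of conditioning are assumptions made for illustration and need not match the paper's notation exactly.

```latex
Q_n - E Q_n = \sum_{j=1}^{n} \left( E_j - E_{j-1} \right) Q_n ,
\qquad
E_{j-1}\bigl[ \left( E_j - E_{j-1} \right) Q_n \bigr] = 0 .
```

The sum telescopes to $E_nQ_n-E_0Q_n$, and the second identity says that the summands form a martingale difference sequence, which is what allows a martingale CLT to be applied.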

3.1.1 Verification of Liapounov’s condition

By (3.1), we have

(17)

The Liapounov condition with power index 4 follows by verifying that

(18)

The limit (18) holds if one can prove that, for any ,

(19)

To do this, applying Lemma 2.7 of Bai and Silverstein (1998), for any , we get

(20)

When , the can be replaced by in the first two inequalities. The assertion in (19) will then easily follow from the estimations in (20) and the observation that .
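The reason a fourth-moment bound suffices can be written in one line. For a generic martingale difference array $Y_j$ (a placeholder name, not the paper's notation) and any $\varepsilon>0$, the Liapounov condition with power index 4 implies the Lindeberg condition, because $|Y_j|^2\le|Y_j|^4/\varepsilon^2$ on the event $\{|Y_j|\ge\varepsilon\}$:

```latex
\sum_{j=1}^{n} E\bigl[ |Y_j|^2 \, I(|Y_j| \ge \varepsilon) \bigr]
\;\le\; \varepsilon^{-2} \sum_{j=1}^{n} E|Y_j|^4
\;\longrightarrow\; 0 .
```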

3.1.2 Simplification of

For any , from (3.1), we have

(21)

For the third term on the right-hand side of (21), applying (20), we have

For the second term on the right-hand side of (21), we have