On the outlying eigenvalues of a polynomial in large independent random matrices

Serban T. Belinschi, CNRS - Institut de Mathématiques de Toulouse, 118 Route de Narbonne, 31062 Toulouse, France. serban.belinschi@math.univ-toulouse.fr

Hari Bercovici, Department of Mathematics and Statistics, Indiana University, Bloomington, IN 47405, U.S.A. bercovic@indiana.edu

Mireille Capitaine, CNRS - Institut de Mathématiques de Toulouse, 118 Route de Narbonne, 31062 Toulouse, France. mireille.capitaine@math.univ-toulouse.fr
Abstract.

Given a selfadjoint polynomial P(X, Y) in two noncommuting selfadjoint indeterminates, we investigate the asymptotic eigenvalue behavior of the random matrix P(A_N, B_N), where A_N and B_N are independent random matrices and the distribution of B_N is invariant under conjugation by unitary operators. We assume that the empirical eigenvalue distributions of A_N and B_N converge almost surely to deterministic probability measures μ and ν, respectively. In addition, the eigenvalues of A_N and B_N are assumed to converge uniformly almost surely to the supports of μ and ν, respectively, except for a fixed finite number of fixed eigenvalues (spikes) of A_N. It is known that the empirical distribution of the eigenvalues of P(A_N, B_N) converges to a certain deterministic probability measure, and, when there are no spikes, the eigenvalues of P(A_N, B_N) converge uniformly almost surely to the support of this measure. When spikes are present, we show that the eigenvalues of P(A_N, B_N) still converge uniformly to the support of this limiting measure, with the possible exception of certain isolated outliers whose location can be determined in terms of the limiting measure and the spikes of A_N. We establish a similar result when B_N is a Wigner matrix. The relation between outliers and spikes is described using the operator-valued subordination functions of free probability theory. These results extend known facts from the special case P(X, Y) = X + Y.

HB was supported by a grant from the National Science Foundation.

1. Introduction

Suppose given, for each positive integer N, independent selfadjoint random matrices A_N and B_N in M_N(ℂ), with the following properties:

  1. the distribution of B_N is invariant under conjugation by unitary matrices;

  2. there exist compactly supported deterministic Borel probability measures μ and ν on ℝ such that the empirical eigenvalue distributions of A_N and B_N converge almost surely to μ and ν, respectively;

  3. the eigenvalues of A_N and B_N converge uniformly almost surely to the supports of μ and ν, respectively, with the exception of a fixed number of spikes, that is, fixed eigenvalues of A_N that lie outside the support of μ.

It was shown in [22] that, under the assumption that there are no spikes, the eigenvalues of A_N + B_N converge uniformly almost surely to the support of the free additive convolution μ ⊞ ν. When spikes are present, the eigenvalues of A_N + B_N also converge uniformly almost surely to a compact set K ⊇ supp(μ ⊞ ν) such that K \ supp(μ ⊞ ν) has no accumulation points in ℝ \ supp(μ ⊞ ν). Moreover, if ρ ∈ K \ supp(μ ⊞ ν), then ω(ρ) is one of the spikes of A_N, where ω is a certain subordination function arising in free probability. The relative position of the eigenvectors corresponding to spikes and outliers is also given in terms of subordination functions. We refer to [11] for this result and for a description of earlier work in this area.
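For orientation, we recall the scalar subordination relation behind this description (the notation here is ours): there exists an analytic map ω : ℂ⁺ → ℂ⁺ such that

G_{\mu\boxplus\nu}(z) = G_\mu(\omega(z)), \qquad \text{where } G_\lambda(z) := \int_{\mathbb R} \frac{d\lambda(t)}{z - t},

and, roughly speaking, the outliers are the points ρ outside the support of μ ⊞ ν at which the (extended) function ω takes a spike value, ω(ρ) = θ_j.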

The first purpose of this paper is to show that analogous results hold when the sum A_N + B_N is replaced by an arbitrary selfadjoint polynomial P(A_N, B_N). Then, by a comparison procedure with the particular case in which B_N is a G.U.E. (Gaussian Unitary Ensemble) matrix, we are also able to identify the outliers of an arbitrary selfadjoint polynomial P(A_N, X_N) when X_N is a Wigner matrix. This extends the previous result of [21], which considers additive deformations of Wigner matrices. More precisely, we consider an N×N Hermitian matrix X_N = \frac{1}{\sqrt N}[W_{ij}]_{i,j=1}^N, where (W_{ij})_{i,j ≥ 1} is an infinite array of random variables such that

  1. the variables W_{ii}, √2 ℜ(W_{ij}) (i < j), and √2 ℑ(W_{ij}) (i < j) are independent, centered, with variance 1,

  2. there exist a constant K > 0 and a random variable Z with finite fourth moment for which there exist x₀ > 0 and an integer n₀ such that, for any x > x₀ and any integer n ≥ n₀, we have

\sup_{1 \le i, j \le n} \mathbb P(|W_{ij}| > x) \le K\, \mathbb P(|Z| > x).

Remark 1.1.

Note that the previous assumptions (X2) and (X3) obviously hold if the variables W_{ii}, √2 ℜ(W_{ij}), √2 ℑ(W_{ij}), i < j, are identically distributed with finite fourth moment. When these random variables are standard Gaussian variables, X_N is a so-called G.U.E. matrix.
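For reference, here is the Gaussian normalization we have in mind (a standard definition, consistent with Remark 1.1; the display is ours):

X_N = \frac{1}{\sqrt N}\,[W_{ij}]_{i,j=1}^N, \qquad W_{ii},\ \sqrt2\,\Re W_{ij},\ \sqrt2\,\Im W_{ij}\ (i < j)\ \text{ independent } \mathcal N(0,1),

and, by Wigner's theorem, the empirical eigenvalue distribution of X_N then converges almost surely to the standard semicircular distribution σ of (2.3) below.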

Our result lies in the lineage of recent, and not so recent, works [6, 8, 9, 14, 17, 18, 20, 21, 25, 26, 30, 32, 34, 38, 39, 40] studying the influence of additive or multiplicative perturbations on the extremal eigenvalues of classical random matrix models, the seminal paper being [8], where the so-called BBP phase transition was observed.

We note that Shlyakhtenko [45] considered a framework which makes it possible to understand this kind of result as a manifestation of infinitesimal freeness. In fact, the results of [45] also allow one to detect the presence of spikes from the behaviour of the bulk of the eigenvalues of the polynomial in A_N and B_N, even when it has no outlying eigenvalues. In a related result, Collins, Hasebe and Sakuma [23] study the case in which P(X, Y) = X + Y and the eigenvalues of A_N and B_N accumulate to given sequences of real numbers converging to zero.

2. Notation and preliminaries on strong asymptotic freeness

We recall that a C*-probability space is a pair (A, τ), where A is a unital C*-algebra and τ is a state on A. It is often useful to assume that τ is faithful, and we shall do that. The elements of A are referred to as random variables.

If (Ω, F, ℙ) is a classical probability space, then (L^∞(Ω, ℙ), 𝔼) is a C*-probability space, where 𝔼 is the usual expected value. Given N ∈ ℕ, (M_N(ℂ), tr_N) is a C*-probability space, where tr_N denotes the normalized trace. More generally, if (A, τ) is an arbitrary C*-probability space and N ∈ ℕ, then M_N(A) = M_N(ℂ) ⊗ A becomes a C*-probability space with the state tr_N ⊗ τ.

The distribution of a selfadjoint element a in a C*-probability space (A, τ) is the compactly supported probability measure μ_a on ℝ uniquely determined by the requirement that

\tau(a^n) = \int_{\mathbb R} t^n \, d\mu_a(t), \qquad n \in \mathbb N.

The spectrum of an element a ∈ A is

\sigma(a) = \{\lambda \in \mathbb C : \lambda 1 - a \text{ is not invertible in } A\}.

For instance, if X ∈ M_N(ℂ) is a selfadjoint matrix, then the distribution of X relative to tr_N is the measure

\mu_X = \frac{1}{N} \sum_{j=1}^{N} \delta_{\lambda_j},

where λ₁, …, λ_N is the set of eigenvalues of X, repeated according to multiplicity. As usual, the support of a probability measure μ on ℝ, denoted supp(μ), is the smallest closed set F with the property that μ(F) = 1. It is known that if τ is faithful, then

supp(\mu_a) = \sigma(a).

Suppose that we are given C*-probability spaces (A_n, τ_n), n ≥ 1, and (A, τ), and selfadjoint elements a_n ∈ A_n, a ∈ A. We say that a_n converges in distribution to a if

(2.1) \lim_{n\to\infty} \tau_n(a_n^k) = \tau(a^k), \qquad k \in \mathbb N.

We say that a_n converges strongly in distribution to a (or to μ_a) if, in addition to (2.1), the sequence σ(a_n) converges to σ(a) in the Hausdorff metric. This condition is easily seen to be equivalent to the following assertion: for every ε > 0, there exists n_ε such that

\sigma(a_n) \subset \sigma(a) + (-\varepsilon, \varepsilon), \qquad n \ge n_\varepsilon.

If all the traces are faithful, this condition can be reformulated as follows:

\lim_{n\to\infty} \|p(a_n)\| = \|p(a)\|

for every polynomial p with complex coefficients. This observation allows us to extend the concept of (strong) convergence in distribution to k-tuples of random variables. For every k ∈ ℕ, we denote by ℂ⟨X₁, …, X_k⟩ the algebra of polynomials with complex coefficients in k noncommuting indeterminates X₁, …, X_k. This is a *-algebra with the adjoint operation determined by

(z X_{i_1} X_{i_2} \cdots X_{i_n})^* = \bar z\, X_{i_n} \cdots X_{i_2} X_{i_1}, \qquad z \in \mathbb C,\ i_1, \dots, i_n \in \{1, \dots, k\}.

Suppose that (A_n, τ_n), n ≥ 1, is a sequence of C*-probability spaces, (A, τ) is a C*-probability space, and (a_n^{(1)}, …, a_n^{(k)}) ∈ A_n^k is a sequence of k-tuples of selfadjoint elements. We say that (a_n^{(1)}, …, a_n^{(k)}) converges in distribution to a k-tuple (a^{(1)}, …, a^{(k)}) ∈ A^k if

(2.2) \lim_{n\to\infty} \tau_n\big(p(a_n^{(1)}, \dots, a_n^{(k)})\big) = \tau\big(p(a^{(1)}, \dots, a^{(k)})\big), \qquad p \in \mathbb C\langle X_1, \dots, X_k\rangle.

We say that (a_n^{(1)}, …, a_n^{(k)}) converges strongly in distribution to (a^{(1)}, …, a^{(k)}) if, in addition to (2.2), we have

\lim_{n\to\infty} \big\|p(a_n^{(1)}, \dots, a_n^{(k)})\big\| = \big\|p(a^{(1)}, \dots, a^{(k)})\big\|, \qquad p \in \mathbb C\langle X_1, \dots, X_k\rangle.

The above concepts extend to k-tuples which do not necessarily consist of selfadjoint elements. The only change is that one must use polynomials in the variables X₁, …, X_k and their adjoints X₁*, …, X_k*.

Remark 2.1.

Suppose that all the states τ_n, n ≥ 1, and τ are faithful. It was observed in [22, Proposition 2.1] that (a_n^{(1)}, …, a_n^{(k)}) converges strongly in distribution to (a^{(1)}, …, a^{(k)}) if and only if p(a_n^{(1)}, …, a_n^{(k)}) converges strongly in distribution to p(a^{(1)}, …, a^{(k)}) for every selfadjoint polynomial p ∈ ℂ⟨X₁, …, X_k⟩.

Moreover, strong convergence in distribution also implies strong convergence at the matricial level. The following result is from [35, Proposition 7.3].

Proposition 2.2.

Let (A_n, τ_n), n ≥ 1, and (A, τ) be C*-probability spaces with faithful states, let k ∈ ℕ, and let (a_n^{(1)}, …, a_n^{(k)}) be a sequence of k-tuples of selfadjoint elements of A_n. Suppose that (a_n^{(1)}, …, a_n^{(k)}) converges strongly in distribution to a k-tuple (a^{(1)}, …, a^{(k)}) of selfadjoint elements of A. Then

\lim_{n\to\infty} \big\|P(a_n^{(1)}, \dots, a_n^{(k)})\big\| = \big\|P(a^{(1)}, \dots, a^{(k)})\big\|

for every m ∈ ℕ and every matrix polynomial P ∈ M_m(ℂ) ⊗ ℂ⟨X₁, …, X_k⟩.

A special case of strong convergence in distribution arises from the consideration of random matrices in (M_N(ℂ), tr_N). The following result follows from [22, Theorem 1.4] and [12, Theorem 1.2].

Theorem 2.3.

Let M_N denote the space M_N(ℂ) of N×N complex matrices, N ∈ ℕ. Suppose that p, q, r ∈ ℕ are fixed, and that U^{(N)} = (U₁^{(N)}, …, U_p^{(N)}), X^{(N)} = (X₁^{(N)}, …, X_q^{(N)}), and Y^{(N)} = (Y₁^{(N)}, …, Y_r^{(N)}) are mutually independent random tuples in some classical probability space such that:

  1. U₁^{(N)}, …, U_p^{(N)} are independent unitaries distributed according to the Haar measure on the unitary group U(N).

  2. X₁^{(N)}, …, X_q^{(N)} are independent Hermitian matrices, each satisfying the assumptions defined in the introduction.

  3. Y^{(N)} is a vector of selfadjoint matrices such that the sequence (Y^{(N)})_N converges strongly almost surely in distribution to some deterministic r-tuple in a C*-probability space.

Then there exist a C*-probability space (A, τ), a free family u = (u₁, …, u_p) of Haar unitaries, a semicircular system s = (s₁, …, s_q), and an r-tuple y in A such that u, s, and y are free and (U^{(N)}, X^{(N)}, Y^{(N)}) converges strongly almost surely in distribution to (u, s, y).

We recall that a tuple s = (s₁, …, s_q) of elements in a C*-probability space (A, τ) is a semicircular system if it is a free family of selfadjoint random variables and, for all i ∈ {1, …, q} and k ∈ ℕ,

\tau(s_i^k) = \int_{\mathbb R} t^k \, d\sigma(t),

where

(2.3) d\sigma(t) = \frac{1}{2\pi}\sqrt{4 - t^2}\,\mathbf 1_{[-2,2]}(t)\,dt

is the standard semicircular distribution. An element u of a C*-probability space is called a Haar unitary if u is unitary and τ(uᵏ) = 0 for all integers k ≠ 0. Note that Theorem 1.2 in [12] deals with deterministic Y^{(N)}, but the random case readily follows, as pointed out by assertion 2 in [35, Section 3]. The point of Theorem 2.3 is, of course, that the resulting convergence is strong. Earlier results (see [49], [24], [3, Theorem 5.4.5]) exist on convergence in distribution.
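As a consistency check on (2.3) (a standard computation, not part of the argument here), the moments of σ are given by the Catalan numbers:

\int_{-2}^{2} t^{2k}\, d\sigma(t) = \frac{1}{k+1}\binom{2k}{k}, \qquad \int_{-2}^{2} t^{2k+1}\, d\sigma(t) = 0, \qquad k \in \mathbb N.

In particular, σ has mean 0 and variance 1, matching the normalization of the entries in the Wigner model.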

We also need a simple coupling result from [22, Lemma 5.1].

Lemma 2.4.

Suppose given selfadjoint matrices X_N, Y_N ∈ M_N(ℂ), N ∈ ℕ, such that the sequences (X_N)_N and (Y_N)_N converge strongly in distribution. Then there exist diagonal matrices X̃_N, Ỹ_N ∈ M_N(ℂ), N ∈ ℕ, such that X̃_N has the same eigenvalues as X_N, Ỹ_N has the same eigenvalues as Y_N, and the sequence of pairs (X̃_N, Ỹ_N)_N converges strongly in distribution.

3. Description of the models

In order to describe our matrix models in detail, we need two compactly supported probability measures μ and ν on ℝ, a positive integer r, and a sequence θ₁, …, θ_r of fixed real numbers in ℝ \ supp(μ). The matrix A_N ∈ M_N(ℂ) is random and selfadjoint for all N and satisfies the following conditions:

  1. almost surely, the sequence (A_N)_N converges in distribution to μ,

  2. θ₁, …, θ_r are eigenvalues of A_N, and

  3. the other eigenvalues of A_N, which may be random, converge uniformly almost surely to supp(μ): almost surely, for every ε > 0 there exists N_ε such that

    \sigma(A_N) \setminus \{\theta_1, \dots, \theta_r\} \subset \operatorname{supp}(\mu) + (-\varepsilon, \varepsilon), \qquad N \ge N_\varepsilon.

    In other words, only the eigenvalues θ₁, …, θ_r prevent A_N from converging strongly in distribution to μ (a minimal example of such a model is sketched below).
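A minimal instance satisfying conditions 1–3 (our illustration; the results do not require this special form) is the diagonal matrix

A_N = \operatorname{diag}\big(\theta_1, \dots, \theta_r,\; a^{(N)}_{r+1}, \dots, a^{(N)}_N\big), \qquad a^{(N)}_i \ \text{i.i.d. with distribution } \mu.

Almost surely each a_i^{(N)} lies in supp(μ) and the empirical eigenvalue distribution converges weakly to μ, so the only eigenvalues of A_N away from supp(μ) are the spikes θ₁, …, θ_r.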

We will investigate two polynomial matricial models, both involving A_N.

  • Our first model involves a sequence (B_N)_N of random Hermitian matrices such that

    1. almost surely, (B_N)_N converges strongly in distribution to the compactly supported probability measure ν on ℝ,

    2. for each N, the distribution of B_N is invariant under conjugation by any unitary matrix.

    We consider the matricial model

    (3.1) M_N = P(A_N, B_N)

    for any selfadjoint polynomial P ∈ ℂ⟨X, Y⟩.

  • Our second model deals with a random Hermitian Wigner matrix X_N = \frac{1}{\sqrt N}[W_{ij}]_{i,j=1}^N, where (W_{ij})_{i,j ≥ 1} is an infinite array of random variables satisfying the assumptions defined in the Introduction. We consider the matricial model

    (3.2) M_N = P(A_N, X_N)

    for any selfadjoint polynomial P ∈ ℂ⟨X, Y⟩.

According to results of Voiculescu [49] (see also [54]), there exist selfadjoint elements a and b in a II₁ factor (M, τ) such that, almost surely, the sequence (A_N, B_N)_N converges in distribution to (a, b). More specifically, a and b are freely independent, μ_a = μ, and μ_b = ν. In particular, if P is a selfadjoint polynomial in ℂ⟨X, Y⟩, the sequence (P(A_N, B_N))_N converges in distribution to P(a, b). More precisely,

\mu_{P(A_N, B_N)} \to \mu_{P(a, b)}

almost surely in the weak topology. When r = 0, Lemma 2.4, Theorem 2.3 and Remark 2.1 show that, almost surely, the sequence (P(A_N, B_N))_N converges strongly in distribution to P(a, b) (see the proof of Corollary 2.2 in [22]).

According to (2.10) in [12] and [3, Theorem 5.4.5], if P is a selfadjoint polynomial in ℂ⟨X, Y⟩, then

\mu_{P(A_N, X_N)} \to \mu_{P(a, s)}

almost surely in the weak topology, where a and s are freely independent selfadjoint noncommutative random variables, μ_a = μ, and s is a standard semicircular variable (i.e., μ_s = σ). As in the first model, when r = 0, Theorem 2.3 and Remark 2.1 show that, almost surely, the sequence (P(A_N, X_N))_N converges strongly in distribution to P(a, s).
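The simplest case to keep in mind is classical (it goes back to the BBP-type results cited in the introduction; the normalization is ours): for

P(X, Y) = X + Y, \qquad \mu = \delta_0, \qquad \nu = \sigma,

each spike θ_j with |θ_j| > 1 produces exactly one outlier, located at θ_j + 1/θ_j outside the bulk [−2, 2], while spikes with |θ_j| ≤ 1 produce no outliers.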

Our main results apply to the case when r > 0. Let M_N be either P(A_N, B_N) or P(A_N, X_N). The set of outliers of M_N is calculated from the spikes θ₁, …, θ_r using Voiculescu's matrix subordination function [52]. When M_N = P(A_N, B_N), we also show that the eigenvectors associated to these outlying eigenvalues have projections of computable size onto the eigenspaces of A_N. The results are stated in Theorem 5.1 and Theorem 5.3. In Section 4, we develop the necessary tools from operator-valued noncommutative probability theory; our main results are then stated in Section 5.

4. Linearization and subordination

We use two main tools: the analytic theory of operator-valued free additive convolution and the theory of (random and non-random) analytic maps on matrix spaces. For background on freeness, freeness with amalgamation and random matrices we refer to [3, 54, 51, 52]. We briefly describe the necessary terminology and results.

4.1. Operator-valued distributions and freeness with amalgamation

The concept of freeness with amalgamation and some of the relevant analytic transforms were introduced by Voiculescu in [51]. An important result in this context is the analytic subordination property for free additive convolution of operator-valued distributions [52]. In order to describe it, we need some notation. If A is a unital C*-algebra and a ∈ A, we denote by ℜa = (a + a*)/2 and ℑa = (a − a*)/(2i) the real and imaginary parts of a, so that a = ℜa + iℑa. For a selfadjoint operator b ∈ A, we write b ≥ 0 if the spectrum of b is contained in [0, +∞) and b > 0 if the spectrum of b is contained in (0, +∞). The operator upper half-plane of A is the set H⁺(A) = {a ∈ A : ℑa > 0}. We denote by H⁻(A) = −H⁺(A).

Let M be a von Neumann algebra endowed with a normal faithful tracial state τ. If B ⊆ M is a von Neumann subalgebra, then there exists a unique trace- and unit-preserving conditional expectation E_B : M → B (see [46, Proposition 2.36]). In the following we denote by W*(B, x) the von Neumann algebra generated by B and an element x in M.

Theorem 4.1.

Let M be a von Neumann algebra endowed with a normal, faithful tracial state τ, let B ⊆ M be a unital von Neumann subalgebra, let E_B : M → B be the trace- and unit-preserving conditional expectation, and let X, Y ∈ M be selfadjoint. Suppose that X and Y are free over B. Then there exists an analytic map ω : H⁺(B) → H⁺(B) such that

  1. ℑω(b) ≥ ℑb for every b ∈ H⁺(B), and

  2. E_B[(ω(b) − X)⁻¹] = E_B[(b − X − Y)⁻¹], for every b ∈ H⁺(B).

Maps of the form b ↦ E[(b − X)⁻¹] for some selfadjoint X and (conditional) expectation E are also known as operator-valued Cauchy-Stieltjes transforms. Assertion (1) is proved in [52]. For (2) see [10, Remark 2.5].

In our applications, the algebra B is M_m(ℂ) for some m ∈ ℕ. The following result from [36] explains why this case is relevant in our work.

Proposition 4.2.

Let (M, τ) be a W*-probability space, let m be a positive integer, and let x, y ∈ M be freely independent. Then the map id_m ⊗ τ : M_m(ℂ) ⊗ M → M_m(ℂ) is a unit-preserving conditional expectation, and α ⊗ 1 + β ⊗ x and γ ⊗ 1 + δ ⊗ y are free over M_m(ℂ) for any α, β, γ, δ ∈ M_m(ℂ).

4.2. Linearization

As in [4, 13], we use linearization to reduce a problem about a polynomial in freely independent, or asymptotically freely independent, random variables to a problem about the addition of matrices having these random variables as entries. Then Proposition 4.2 allows us to apply Theorem 4.1 to the algebra B = M_m(ℂ) and thereby produce the relevant subordination function. More specifically, suppose that p ∈ ℂ⟨X₁, …, X_k⟩. For our purposes, a linearization of p is a linear polynomial of the form

L = a_0 \otimes 1 + a_1 \otimes X_1 + \dots + a_k \otimes X_k,

where

a_0, a_1, \dots, a_k \in M_m(\mathbb C)

for some m ∈ ℕ, and the following property is satisfied: given z ∈ ℂ and elements x₁, …, x_k in a unital C*-algebra A, z − p(x₁, …, x_k) is invertible in A if and only if Λ(z) − L(x₁, …, x_k) is invertible in M_m(A), with Λ(z) as defined below. Usually, this is achieved by ensuring that (z − p(x₁, …, x_k))⁻¹ exists as an element of A whenever Λ(z) − L(x₁, …, x_k) is invertible, and is one of the entries of the matrix (Λ(z) − L(x₁, …, x_k))⁻¹. It is known (see, for instance, [42]) that every polynomial has a linearization, and linearizations have been used in free probability earlier (see [28]).
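Schematically, the plan just announced can be summarized as follows (a sketch in our notation; the boundary-behavior issues are dealt with in Section 4.3). For a linearization L of a selfadjoint p ∈ ℂ⟨X₁, X₂⟩ evaluated at free selfadjoint variables x and y, the pencil splits into two summands that are free over M_m(ℂ) by Proposition 4.2, and Theorem 4.1, applied with B = M_m(ℂ) and E_B = id_m ⊗ τ, produces ω with

(\mathrm{id}_m \otimes \tau)\big[(\omega(b) - a_0 \otimes 1 - a_1 \otimes x)^{-1}\big] = (\mathrm{id}_m \otimes \tau)\big[(b - L(x, y))^{-1}\big], \qquad b \in H^+(M_m(\mathbb C)),

where L(x, y) = a₀ ⊗ 1 + a₁ ⊗ x + a₂ ⊗ y.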

We describe in some detail a linearization procedure from [4] (see also [33]) that has several advantages. In this procedure, we always have Λ(z) = z e₁₁, where e₁₁ denotes the matrix whose only nonzero entry equals 1 and occurs in the first row and first column. Given p ∈ ℂ⟨X₁, …, X_k⟩, we produce an integer m and a linear polynomial of the form

L = \begin{pmatrix} 0 & u \\ v & Q \end{pmatrix} \in M_m(\mathbb C) \otimes \mathbb C\langle X_1, \dots, X_k\rangle

such that u is a row of length m − 1, v is a column of length m − 1, Q is an invertible matrix in M_{m−1}(ℂ) ⊗ ℂ⟨X₁, …, X_k⟩ whose inverse is a polynomial of degree less than or equal to the degree of p, and p = −uQ⁻¹v. Moreover, if p = p*, the coefficients of L can be chosen to be selfadjoint matrices in M_m(ℂ).

The construction proceeds by induction on the number of monomials in the given polynomial. We start with a single monomial q = c X_{i₁} X_{i₂} ⋯ X_{iₙ}, where c ∈ ℂ and i₁, …, iₙ ∈ {1, …, k}. In this case, we use the polynomial

L_q = \begin{pmatrix} 0 & \cdots & 0 & c\,X_{i_1} \\ \vdots & & X_{i_2} & -1 \\ 0 & \iddots & \iddots & \\ X_{i_n} & -1 & & 0 \end{pmatrix},

that is, the n×n matrix whose antidiagonal entries, read from the upper right corner, are cX_{i₁}, X_{i₂}, …, X_{iₙ}, whose entries immediately to the right of the antidiagonal equal −1, and whose remaining entries are 0. As noted in [33], the lower right corner of this matrix is invertible in the algebra M_{n−1}(ℂ) ⊗ ℂ⟨X₁, …, X_k⟩ and its inverse has degree n − 2. (The constant term in this inverse is a selfadjoint matrix and its spectrum is contained in {−1, 1}.) Suppose now that p = q′ + q″ and that linear polynomials

L' = \begin{pmatrix} 0 & u' \\ v' & Q' \end{pmatrix}, \qquad L'' = \begin{pmatrix} 0 & u'' \\ v'' & Q'' \end{pmatrix}

with the desired properties have been found for q′ and q″. Then the matrix

\begin{pmatrix} 0 & u' & u'' \\ v' & Q' & 0 \\ v'' & 0 & Q'' \end{pmatrix}

is a linearization of p = q′ + q″ with the same properties. The construction of a linearization is now easily completed for an arbitrary polynomial. Suppose now that p is a selfadjoint polynomial, so p = q + q* for some other polynomial q of the same degree. Let

L_q = \begin{pmatrix} 0 & u \\ v & Q \end{pmatrix}

provide a linearization of q. Then the selfadjoint linear polynomial

L_p = \begin{pmatrix} 0 & u & v^* \\ u^* & 0 & Q^* \\ v & Q & 0 \end{pmatrix}

linearizes p. It is easy to verify, following the inductive steps, that this construction produces a matrix L such that the constant term of Q⁻¹ has spectrum contained in {−1, 1}. These properties of the construction in [33] are useful in our analysis.
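For concreteness, here is the procedure carried out on the selfadjoint polynomial p = X₁X₂ + X₂X₁ (a worked example of ours, not taken from the sources above). Write p = q + q* with q = X₁X₂; the monomial step gives

L_q = \begin{pmatrix} 0 & X_1 \\ X_2 & -1 \end{pmatrix}, \qquad u = X_1,\ v = X_2,\ Q = -1,\ -uQ^{-1}v = X_1X_2,

and the selfadjoint step then yields the 3×3 selfadjoint linearization

L_p = \begin{pmatrix} 0 & X_1 & X_2 \\ X_1 & 0 & -1 \\ X_2 & -1 & 0 \end{pmatrix}.

One checks directly that the Schur complement of the (invertible) lower right 2×2 corner of Λ(z) − L_p(x₁, x₂) equals z − (x₁x₂ + x₂x₁), so z − p(x₁, x₂) is invertible exactly when Λ(z) − L_p(x₁, x₂) is, in accordance with Lemma 4.3 below.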

Lemma 4.3.

Suppose that p ∈ ℂ⟨X₁, …, X_k⟩, and let

L = \begin{pmatrix} 0 & u \\ v & Q \end{pmatrix}

be a linearization of p with the properties outlined above. Then for every z ∈ ℂ and for every k-tuple x = (x₁, …, x_k) of selfadjoint operators on a Hilbert space we have

(4.1) \big[(\Lambda(z) - L(x))^{-1}\big]_{1,1} = (z - p(x))^{-1} \quad \text{whenever } z - p(x) \text{ is invertible,}

and

\dim \ker(\Lambda(z) - L(x)) = \dim \ker(z - p(x)).

Proof.

Suppressing the variables x, we have

\Lambda(z) - L = \begin{pmatrix} 1 & uQ^{-1} \\ 0 & 1 \end{pmatrix} \begin{pmatrix} z - p & 0 \\ 0 & -Q \end{pmatrix} \begin{pmatrix} 1 & 0 \\ Q^{-1}v & 1 \end{pmatrix},

and (4.1) readily follows. The dimension of the kernel of a square matrix does not change if the matrix is multiplied by some other invertible matrices. Also, since Q is invertible, the kernel of the middle matrix on the right-hand side of the last equality is easily identified with ker(z − p) ⊕ {0}. The lemma follows from these observations. ∎

In the case of selfadjoint polynomials, applied to selfadjoint matrices, we can estimate how far Λ(z) − L(x) is from being invertible.

Lemma 4.4.

Suppose that p = p* ∈ ℂ⟨X₁, …, X_k⟩, and let

L = \begin{pmatrix} 0 & u \\ v & Q \end{pmatrix}

be a linearization of p with the properties outlined above. Suppose that A is a unital C*-algebra and x = (x₁, …, x_k) is a k-tuple of selfadjoint elements of A. Let z ∈ ℂ be such that z − p(x) is invertible. There exist two polynomials P₁ and P₂ with nonnegative coefficients, depending only on p, such that

\big\|(\Lambda(z) - L(x))^{-1}\big\| \le P_1\big(\max_j \|x_j\|\big)\, \big\|(z - p(x))^{-1}\big\| + P_2\big(\max_j \|x_j\|\big).

In particular, if max_j ‖x_j‖ ≤ C and ‖(z − p(x))⁻¹‖ ≤ C′, for some positive constants C and C′, then there exists a constant D > 0, depending only on p, C and C′, such that the distance from 0 to σ(Λ(z) − L(x)) is at least D.

Proof.

For every invertible element a of a C*-algebra, we have ‖a⁻¹‖ ≥ dist(0, σ(a))⁻¹. Equality is achieved, for instance, if a is normal. A matrix calculation (in which we suppress the variables x) shows that

(\Lambda(z) - L)^{-1} = \begin{pmatrix} 1 & 0 \\ -Q^{-1}v & 1 \end{pmatrix} \begin{pmatrix} (z - p)^{-1} & 0 \\ 0 & -Q^{-1} \end{pmatrix} \begin{pmatrix} 1 & -uQ^{-1} \\ 0 & 1 \end{pmatrix}.

The lemma follows now because the entries of u, v, and Q⁻¹ are polynomials in X₁, …, X_k, and

\big\|(z - p(x))^{-1}\big\| = \operatorname{dist}(z, \sigma(p(x)))^{-1}

because p(x) is selfadjoint. ∎

The dependence on p in the above lemma enters via the norms of u, v, and Q⁻¹. It can clearly be worsened artificially, for example by adding and subtracting cX₁ for some large constant c > 0.

4.3. Domain of the subordination function

Theorem 4.1 informs us that the subordination function ω is defined on all elements with strictly positive imaginary part. In the following, we discuss the behavior of ω on certain subsets of the boundary of its natural domain. Thus, consider a tracial W*-probability space (M, τ) and two selfadjoint random variables x, y ∈ M which are free with respect to τ. Let p = p* ∈ ℂ⟨X₁, X₂⟩ and consider a linearization L = a₀ ⊗ 1 + a₁ ⊗ X₁ + a₂ ⊗ X₂ of p which satisfies the properties outlined in Section 4.2. In particular, this means that a₀, a₁, a₂ are selfadjoint complex m×m matrices for some m ∈ ℕ. According to Proposition 4.2, a₀ ⊗ 1 + a₁ ⊗ x and a₂ ⊗ y are free over M_m(ℂ) with respect to id_m ⊗ τ. Theorem 4.1 provides a subordination function ω such that

(\mathrm{id}_m \otimes \tau)\big[(\omega(b) - a_0 \otimes 1 - a_1 \otimes x)^{-1}\big] = (\mathrm{id}_m \otimes \tau)\big[(b - L(x, y))^{-1}\big]

for all b ∈ H⁺(M_m(ℂ)). However, in order to exploit the properties of the subordination function in the context of linearization, we need to prove that ω is defined on a larger set than H⁺(M_m(ℂ)). In this context, we will encounter meromorphic functions with values in M_m(ℂ). By this we mean the obvious thing: if D ⊆ ℂ is a domain, then a function f on D is meromorphic if for any z₀ ∈ D, there exists an n ∈ ℕ such that (z − z₀)ⁿ f(z) is analytic on a small enough neighbourhood of z₀.

Lemma 4.5.

The limit exists for any . The correspondence is analytic from to , extends meromorphically to , and the extension satisfies . In particular, is analytic on the complement of a discrete set , and is selfadjoint for any . Moreover, if belongs to the connected component of the domain of that contains , then is invertible in if and only if is invertible in for all .

The result of the above lemma cannot in general be improved to an analytic extension to . One can find counterexamples even when . However, if y is a semicircular random variable, then it follows easily from the results of [29] that ω extends analytically to .

Given the occurrence of meromorphic matrix-valued functions in our lemma, we feel it is justified to use the following convention: if f is meromorphic on a domain D and z₀ ∈ D, then we say that f(z₀) is invertible if z ↦ f(z)⁻¹ is analytic on a neighbourhood of z₀. Thus, it may be that z₀ is a pole of f and f is nevertheless invertible at z₀ in this sense. It is in this sense that the last statement of Lemma 4.5 should be understood.

Proof.

Recall that ℑω(b) ≥ ℑb for all b ∈ H⁺(M_m(ℂ)), so that for all . Thus, the family is normal on . Let f be a cluster point of this family. It follows from Lemma 4.4 that the correspondence

is analytic on and thus equal to the limit . In particular, it is an M_m(ℂ)-valued analytic function of z. We claim that for sufficiently large, is invertible. This is equivalent to