# Delocalization at small energy for heavy-tailed random matrices

Charles Bordenave (CNRS & Université Toulouse III, France; e-mail: charles.bordenave@math.univ-toulouse.fr) and Alice Guionnet (CNRS & École Normale Supérieure de Lyon, France, and MIT, Cambridge, USA; partially supported by the Simons Foundation and by NSF Grant DMS-1307704; e-mail: aguionne@ens-lyon.fr)

###### Abstract

We prove that the eigenvectors associated to small enough eigenvalues of a heavy-tailed symmetric random matrix are delocalized with probability tending to one as the size of the matrix grows to infinity. The delocalization is measured thanks to a simple criterion related to the inverse participation ratio, which computes an average ratio of $L^4$ and $L^2$-norms of vectors. In contrast, as a consequence of a previous result, for random matrices with sufficiently heavy tails, the eigenvectors associated to large enough eigenvalues are localized according to the same criterion. The proof is based on a new analysis of the fixed point equation satisfied asymptotically by the law of a diagonal entry of the resolvent of this matrix.

## 1 Introduction

Anderson localization has attracted a lot of interest in both the mathematical and physical communities recently. One of the most tractable models to study this phenomenon is given by random Schrödinger operators on trees, see notably [1, 2, 22, 3]. It was shown that at small energy the system displays delocalized waves whereas at large energy waves are localized. This phenomenon is related to the transition between a continuous spectrum and a discrete spectrum. The existence of a transition between these two phases at a given energy, the so-called mobility edge, could even be proved [3]. Such a transition is expected to happen in much more general settings, see e.g. [20]. In this article we shall prove the existence of a similar phenomenon for random matrices with heavy tails, as conjectured in [13, 25, 27]. This is in contrast with the full delocalization observed for light-tailed Wigner matrices [18, 17, 19]. Indeed, we shall prove that for Lévy matrices with heavy enough tail, eigenvectors are delocalized for small enough energy whereas they are localized for large enough energy. We are not able to prove a sharp transition, but the mobility edge value is predicted in [27] based on the replica trick. In fact, we already proved in [12] that eigenvectors are delocalized provided the entries have, roughly speaking, finite $L^1$-norm, whereas a localization phenomenon appears for sufficiently heavy tail and large energy. However, we left open the question of proving delocalization at small energy and very heavy tails, or at least of exhibiting a single criterion which would allow one to distinguish these two phases. In this article we remedy this point.

Let us first describe roughly our main results. Consider a symmetric matrix $A$ of size $n \times n$ with independent equidistributed real entries above the diagonal. Assume that the tail of $A_{ij}$ is such that, for some $0 < \alpha < 2$,

$$ n\,\mathbb{P}(|A_{ij}| \geq t) \simeq_{t \to \infty} t^{-\alpha}, $$

(in a sense which will be made precise later on). Then, for $z$ in the upper half-plane and $\beta > 0$, consider the following fractional moment of the resolvent:

$$ y_z^n(\beta) = \frac{1}{n} \sum_{k=1}^n \left( \Im (A - zI)^{-1}_{kk} \right)^{\beta}. $$

For $\beta = 1$, as $n$ goes to infinity and then $z$ goes to a point $E$ on the real line, $y_z^n(1)$ converges towards the spectral density, which turns out to be positive [7, 6, 11]. However, we proved in [12] that for $\beta = \alpha/2$, and for sufficiently heavy tails ($0 < \alpha < 2/3$), as $n$ goes to infinity and then $z$ goes to $E$ large enough, $y_z^n(\alpha/2)$ goes to zero. This can be shown to imply a localization of eigenvectors with large enough eigenvalue or energy. On the other hand, we prove in the present article that for $0 < \alpha < 2$ (outside a countable subset of $(0,2)$), as $n$ goes to infinity and then $z$ goes to $E$ small enough, $y_z^n(\alpha/2)$ is bounded below by a positive constant. Back to eigenvectors, this in turn allows us to prove delocalization at small energies versus localization at high energies according to the following criterion. Consider an orthonormal basis of $\mathbb{R}^n$ of eigenvectors of $A$. Let $I$ be an interval of the real line so that $|I|$ goes to zero as $n$ goes to infinity. Let $\Lambda_I$ denote the set of eigenvectors of $A$ with eigenvalues in $I$ and set, if $\Lambda_I$ is not empty,

$$ Q_I = n \sum_{k=1}^n \left( \frac{1}{|\Lambda_I|} \sum_{u \in \Lambda_I} \langle u, e_k \rangle^2 \right)^2. $$

We will explain below why $Q_I$ is related to the nature of eigenvectors in $\Lambda_I$. Then, the main result of this article is that for $|I|$ going to zero more slowly than $n^{-\rho}$ for some $\rho > 0$, for intervals $I$ located at large enough energy, $Q_I$ goes to infinity (Theorem 1.1), whereas for intervals located at small enough energy, it remains finite (Theorem 1.2).

Let us now describe our results more precisely. For integer $n \geq 1$, we consider an array $(A_{ij})_{1 \leq i \leq j \leq n}$ of i.i.d. real random variables and set, for $i > j$, $A_{ij} = A_{ji}$. Then, we define the random symmetric matrix:

$$ A = (A_{ij})_{1 \leq i,j \leq n}. $$

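The eigenvalues of the matrix $A$ are real and are denoted by $\lambda_1, \ldots, \lambda_n$. We also consider an orthonormal basis $(u_1, \ldots, u_n)$ of $\mathbb{R}^n$ of eigenvectors of $A$: for $1 \leq k \leq n$, $A u_k = \lambda_k u_k$.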

If $X_{11} = \sqrt{n} A_{11}$ is a random variable independent of $n$ and with variance equal to $1$, then $A$ is a normalized Wigner matrix. In the large $n$ limit, the spectral properties of this matrix are now well understood, see e.g. [5, 9, 4, 16, 26]. The starting point of this analysis is Wigner's semicircle law, which asserts that for any interval $I \subset \mathbb{R}$, the proportion of eigenvalues in $I$ is asymptotically close to $\mu_{sc}(I)$, where $\mu_{sc}$ is the distribution with support $[-2,2]$ and density $\frac{1}{2\pi}\sqrt{4 - x^2}$. Many more properties of the spectrum are known. For example, if $X_{11}$ is centered and has a subexponential tail, then, from [18, 19], for any $p \in [2, \infty]$ and $\varepsilon > 0$, with high probability,

$$ \max_{1 \leq k \leq n} \|u_k\|_p \leq n^{1/p - 1/2 + \varepsilon}, \tag{1} $$

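where for $u \in \mathbb{R}^n$, $\|u\|_p = \left( \sum_{i=1}^n |u_i|^p \right)^{1/p}$ and $\|u\|_\infty = \max_{1 \leq i \leq n} |u_i|$. This implies that the eigenvectors are strongly delocalized with respect to the canonical basis.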

In this paper, we are interested in heavy-tailed matrices. This corresponds to the assumption that the measure $L_n$ defined on $(0,\infty)$ by

$$ L_n(\cdot) = n\,\mathbb{P}(|A_{11}|^2 \in \cdot) \tag{2} $$

converges vaguely as $n$ goes to infinity to a (non trivial) Radon measure $L$ on $(0,\infty)$. For example, if $A_{11}$ is a Bernoulli $0$-$1$ variable with mean $c/n$, then $L$ is equal to $c\,\delta_1$. In this case, (up to the irrelevant diagonal terms) $A$ is the adjacency matrix of an Erdős-Rényi graph, where each edge is present independently with probability $c/n$. In this paper, we will focus on Lévy matrices introduced by Bouchaud and Cizeau [13]. They can be defined as follows. We fix $0 < \alpha < 2$ and assume that $X_{11} = n^{1/\alpha} A_{11}$ is a random variable independent of $n$ such that

$$ \mathbb{P}(|X_{11}| \geq t) \sim_{t \to \infty} t^{-\alpha}. \tag{3} $$

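In the above setting, they correspond to the case $L = \frac{\alpha}{2} x^{-\alpha/2 - 1}\,dx$, where $dx$ is the Lebesgue measure on $(0,\infty)$. For technical simplicity, we will further restrict to the case where $X_{11}$ is a symmetric $\alpha$-stable random variable such that for all $t \in \mathbb{R}$,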

$$ \mathbb{E} \exp(i t X_{11}) = \exp(-\sigma^{\alpha} |t|^{\alpha}), $$

with $\sigma^{\alpha} = \pi / (2 \sin(\pi\alpha/2)\,\Gamma(\alpha))$. With this choice, the random variable is normalized in the sense that (3) holds.

The spectrum of heavy-tailed matrices is far from being perfectly understood. It differs significantly from classical Wigner matrices. In the Lévy case (3), for any interval , the proportion of eigenvalues in is asymptotically close to where the probability measure depends on , it is symmetric, has support , a bounded density which is analytic outside a finite set of points. Moreover, has an explicit expression and as goes to , , see [15, 8, 6, 11].

The eigenvectors of Lévy matrices have been rigorously studied in [12]. It is shown there that if , there is a finite set such if is a compact set, for any , with high probability,

$$ \max\{ \|u_k\|_\infty : \lambda_k \in K \} \leq n^{-\rho + \varepsilon}, \tag{4} $$

where Since , it implies that the -norm of most eigenvectors is . Notice that when , then and it does not match with (1). It is expected that the upper bound (4) is pessimistic.

When $0 < \alpha < 1$, the situation turns out to be very different. In [13], Bouchaud and Cizeau have conjectured the existence of a mobility edge, $E_\alpha > 0$, such that all eigenvectors $u_k$ with $|\lambda_k| < E_\alpha$ are delocalized in a sense similar to (4) while eigenvectors $u_k$ with $|\lambda_k| > E_\alpha$ are localized, that is, they have a sparse representation in the canonical basis. In [12], the existence of this localized phase was established when $0 < \alpha < 2/3$. More precisely, for an interval $I$ of $\mathbb{R}$, as above, $\Lambda_I$ is the set of eigenvectors whose eigenvalues are in $I$. Then, if $\Lambda_I$ is not empty, for $1 \leq k \leq n$, we set

$$ P_I(k) = \frac{1}{|\Lambda_I|} \sum_{u \in \Lambda_I} \langle u, e_k \rangle^2, $$

where $|\Lambda_I|$ is the cardinality of $\Lambda_I$. In words, $P_I(k)$ is the average amplitude of the $k$-th coordinate of eigenvectors in $\Lambda_I$. Theorem 1.1 in [12] asserts that $|\Lambda_I|$ is of order $n|I|$ for intervals of length $|I| \geq n^{-\rho}$ for some $\rho > 0$ (depending on $\alpha$). By construction, $P_I$ is a probability vector:

$$ \sum_{k=1}^n P_I(k) = 1. $$

Observe also that . If the eigenvectors in are localized and contains few eigenvalues, then we might expect that for some , , while for most of the others . Alternatively, if the eigenvectors in are well delocalized, then for all . More quantitatively, we will measure the (de)localization of eigenvectors through

$$ Q_I = n \sum_{k=1}^n P_I(k)^2 \in [1, n]. \tag{5} $$

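The scalar $\log Q_I$ is proportional to the Rényi divergence of order $2$ of $P_I$ with respect to the uniform measure $(1/n, \ldots, 1/n)$. If $Q_I \leq C$ then, by Markov's inequality, for any $t > 0$, the number of $k$ such that $P_I(k) \geq t/n$ is at most $C n / t^2$. The scalar $Q_I$ is also closely related to the inverse participation ratio, which can be defined as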

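$$ \Pi_I = \frac{n}{|\Lambda_I|} \sum_{u \in \Lambda_I} \sum_{k=1}^n \langle u, e_k \rangle^4 = \frac{n}{|\Lambda_I|} \sum_{u \in \Lambda_I} \|u\|_4^4 \in [1, n]. $$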

Using , we find

$$ Q_I \leq \Pi_I \leq Q_I\,|\Lambda_I|. $$
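To make the criterion concrete, here is a minimal sketch (not taken from the paper) that evaluates $Q_I$ and $\Pi_I$ on two hand-made families of orthonormal vectors standing in for $\Lambda_I$: canonical basis vectors (fully localized) and normalized Hadamard rows (fully delocalized). The helper names `q_and_pi` and `hadamard_rows` and the sizes `n = 16`, `m = 4` are illustrative assumptions.

```python
import math

def q_and_pi(vectors):
    """Return (Q_I, Pi_I) for a list of orthonormal vectors playing the role of Lambda_I."""
    m = len(vectors)
    n = len(vectors[0])
    # P_I(k): average squared k-th coordinate over the selected vectors.
    p = [sum(u[k] ** 2 for u in vectors) / m for k in range(n)]
    q = n * sum(x ** 2 for x in p)
    pi = n / m * sum(u[k] ** 4 for u in vectors for k in range(n))
    return q, pi

def hadamard_rows(n):
    """Rows of the Sylvester-Hadamard matrix scaled by 1/sqrt(n); n must be a power of 2."""
    rows = [[1.0]]
    while len(rows) < n:
        rows = [r + r for r in rows] + [r + [-x for x in r] for r in rows]
    s = 1.0 / math.sqrt(n)
    return [[s * x for x in r] for r in rows]

n, m = 16, 4
# Localized family: m canonical basis vectors.
localized = [[1.0 if j == k else 0.0 for j in range(n)] for k in range(m)]
# Delocalized family: m vectors with all coordinates equal to +-1/sqrt(n).
delocalized = hadamard_rows(n)[:m]

q_loc, pi_loc = q_and_pi(localized)
q_del, pi_del = q_and_pi(delocalized)
print("localized:   Q_I =", q_loc, " Pi_I =", pi_loc)
print("delocalized: Q_I =", q_del, " Pi_I =", pi_del)
```

In the localized family $Q_I = n/m$ grows with $n$, while in the delocalized family $Q_I = 1$; both cases satisfy the sandwich $Q_I \leq \Pi_I \leq Q_I |\Lambda_I|$.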

We will write that a sequence of events $(E_n)$ defined on our underlying probability space holds with overwhelming probability if, for any $c > 0$, $n^c\,\mathbb{P}(E_n^c)$ goes to $0$ as $n$ goes to infinity. As we shall check, [12, Theorem 1.3] implies the following localization statement.

###### Theorem 1.1 (Localization of eigenvectors of large eigenvalues [12]).

Let , and . There exists such that for any compact , there is a constant and, if is an interval of length , then

$$ Q_I \geq c_1 |I|^{-\frac{2\kappa}{2-\alpha}}, $$

with overwhelming probability.

In this work, we shall prove the converse of this statement and prove notably that there exists a neighborhood of $0$ where eigenvectors are delocalized.

###### Theorem 1.2 (Delocalization of eigenvectors of small eigenvalues).

There exists a countable set with no accumulation point on such that the following holds. Let and . There is and constants such that, if is an interval of length , then

$$ Q_I \leq c_1, $$

with overwhelming probability.

As we shall see in the course of the proof, is finite for with going to zero iff the fractional moment of the resolvent is bounded below by a positive constant as goes to infinity. Our point will therefore be to provide such a bound.

The parameter $Q_I$ could be replaced for any $p > 1$ by

$$ n^{p-1} \sum_{k=1}^n P_I(k)^p. $$

Then, the statements of Theorem 1.1 and Theorem 1.2 are essentially unchanged (up to modifying the constants in Theorem 1.2 and the exponent in the lower bound in Theorem 1.1). We have chosen to treat the case $p = 2$ for its connection with the inverse participation ratio.

There are still many open problems concerning Lévy matrices. In the forthcoming Corollary 5.13, we prove a local law (i.e. a sharp quantitative estimate of $|\Lambda_I|$ for intervals $I$ of vanishing size) which improves for small values of $\alpha$ on Theorem 1.1 in [12]. We conjecture that such a local law holds for all $\alpha \in (0,2)$ and for all intervals of length much larger than $1/n$.

There are no rigorous results on the local eigenvalue statistics for Lévy matrices, see [27] for a recent account of the predictions in the physics literature. It is expected that for the local eigenvalue statistics are similar to those of the Gaussian Orthogonal Ensemble (GOE), asymptotically described by a sine determinantal point process. For and energies larger than some , we expect that the local eigenvalue statistics are asymptotically described by a Poisson point process. In the regime and energies smaller than , [27] also predicts GOE statistics. Finally, [13] conjectured the existence of yet another regime when at energies larger than some .

For , proving the existence of such mobility edge is already a very interesting open problem. The core of the difficulty is to better understand a fixed point equation described in its probabilistic form by (22) which is satisfied by the weak limit of as goes to infinity. More generally, the Lévy matrix is an example of a broader class of random matrices with heavy tails. The qualitative behavior of the spectrum depends on the Radon measure in (2) and its vague limit which we denoted by . It is a challenging question to understand how influences the nature of the spectrum around a given energy (regularity of the limiting spectral measure, localization of eigenvectors, local eigenvalue statistics).

The paper is organized as follows. In Section 2, we prove that Theorem 1.1 is a direct consequence of a result in [12]. Section 3 gives an outline of the proof of Theorem 1.2. The actual proof is contained in Section 4 and Section 5.

## 2 Proof of Theorem 1.1

For , and , it follows from [12, Theorem 1.3] that there exists such that for any compact , there are constants and for all integers , if is an interval of length , then

$$ n^{\alpha/2 - 1} \sum_{k=1}^n P_I(k)^{\alpha/2} \leq c_1 |I|^{\kappa}, \tag{6} $$

with overwhelming probability. We may notice that the logarithm of the left hand side in the above expression is proportional to the Rényi divergence of order $\alpha/2$ of $P_I$ with respect to the uniform measure. The smaller it is, the more localized $P_I$ is (for explicit bounds see [12]).

We may use duality to obtain from (6) a lower bound on $Q_I$. From Hölder's inequality, we write for $0 < \varepsilon < 1$ and $1/p + 1/q = 1$,

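$$ \begin{aligned} 1 = \sum_{k=1}^n P_I(k) &= \frac{1}{n} \sum_{k=1}^n (n P_I(k))^{\varepsilon} (n P_I(k))^{1-\varepsilon} \\ &\leq \left( n^{\varepsilon p - 1} \sum_{k=1}^n P_I(k)^{\varepsilon p} \right)^{1/p} \left( n^{(1-\varepsilon) q - 1} \sum_{k=1}^n P_I(k)^{(1-\varepsilon) q} \right)^{1/q}. \end{aligned} $$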

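We choose $\varepsilon = \alpha/(4-\alpha)$ and $p = (4-\alpha)/2$, so that $q = (4-\alpha)/(2-\alpha)$. We have $\varepsilon p = \alpha/2$, $(1-\varepsilon) q = 2$ and $q/p = 2/(2-\alpha)$. Hence, if the event (6) holds, we deduce that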

$$ (c_1 |I|^{\kappa})^{-q/p} = c_1' |I|^{-\frac{2\kappa}{2-\alpha}} \leq Q_I. $$

This completes the proof of Theorem 1.1.
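The duality step can be checked numerically. The sketch below is only an illustration: it takes the exponents $p = (4-\alpha)/2$ and $q = (4-\alpha)/(2-\alpha)$, which satisfy $1/p + 1/q = 1$ and $q/p = 2/(2-\alpha)$, draws an arbitrary probability vector in place of $P_I$, and confirms that the left hand side of (6), raised to the power $-q/p$, is a lower bound for $Q_I$. The function name `holder_sides`, the random test vector and the value `alpha = 0.5` are assumptions made for the example.

```python
import random

def holder_sides(p_vec, alpha):
    """Return (left side of (6), Q_I, lower bound on Q_I implied by Hölder)."""
    n = len(p_vec)
    p = (4 - alpha) / 2
    q = (4 - alpha) / (2 - alpha)
    # n^{alpha/2 - 1} * sum_k P(k)^{alpha/2}
    lhs6 = n ** (alpha / 2 - 1) * sum(x ** (alpha / 2) for x in p_vec)
    # Q_I = n * sum_k P(k)^2
    q_i = n * sum(x ** 2 for x in p_vec)
    return lhs6, q_i, lhs6 ** (-q / p)

random.seed(0)
# A fairly concentrated probability vector (hypothetical stand-in for P_I).
weights = [random.random() ** 8 for _ in range(200)]
total = sum(weights)
P = [w / total for w in weights]

lhs6, q_i, lower = holder_sides(P, alpha=0.5)
print("left side of (6):", lhs6)
print("Hölder lower bound:", lower, "<= Q_I =", q_i)

# For the uniform vector all three quantities are equal to 1.
uniform = holder_sides([1 / 50] * 50, alpha=0.5)
```

The smaller the left hand side of (6), the larger the resulting lower bound on $Q_I$, which is the mechanism behind Theorem 1.1.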

## 3 Outline of proof of Theorem 1.2

### 3.1 Connection with the resolvent

For $z \in \mathbb{C}_+ = \{ z \in \mathbb{C} : \Im z > 0 \}$, the resolvent matrix of $A$ is defined as

$$ R(z) = (A - zI)^{-1}. $$

The next lemma shows that the quadratic mean of the diagonal coefficients of the resolvent upper bounds $Q_I$.

###### Lemma 3.1.

Let $I = [\lambda - \eta, \lambda + \eta]$ and $z = \lambda + i\eta$. If $\Lambda_I \neq \emptyset$, we have

$$ Q_I \leq \left( \frac{n |I|}{|\Lambda_I|} \right)^2 \left( \frac{1}{n} \sum_{k=1}^n (\Im R_{kk}(z))^2 \right). $$

###### Proof.

We use the classical bound, for $1 \leq k \leq n$,

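$$ \sum_{u \in \Lambda_I} \langle u, e_k \rangle^2 \leq \sum_{j=1}^n \frac{2\eta^2 \langle u_j, e_k \rangle^2}{\eta^2 + (\lambda_j - \lambda)^2} = 2\eta\,\Im R_{kk}(z). $$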

We get,

$$ Q_I \leq \frac{4 n \eta^2}{|\Lambda_I|^2} \sum_{k=1}^n (\Im R_{kk}(z))^2, $$

as requested. ∎
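As an illustration of Lemma 3.1, the following sketch builds a small symmetric matrix from a prescribed spectral decomposition and compares $Q_I$ with the resolvent bound. It assumes, as the proof suggests, that $I = [\lambda - \eta, \lambda + \eta]$ and $z = \lambda + i\eta$, so that $|I| = 2\eta$. The Householder reflection used to produce an orthonormal eigenbasis, the chosen eigenvalues, and all variable names are hypothetical choices for the example.

```python
def householder(v):
    """Orthogonal symmetric matrix I - 2 v v^T / |v|^2; its columns form an orthonormal basis."""
    n = len(v)
    s = sum(x * x for x in v)
    return [[(1.0 if i == j else 0.0) - 2 * v[i] * v[j] / s for j in range(n)]
            for i in range(n)]

n = 8
# U[k][j] = <u_j, e_k>: k-th coordinate of the j-th eigenvector.
U = householder([1.0, 2.0, -1.0, 0.5, 3.0, -2.0, 1.5, 0.25])
eigs = [-1.3, -0.6, -0.15, -0.05, 0.02, 0.1, 0.7, 1.9]

lam, eta = 0.0, 0.2
z = complex(lam, eta)
in_I = [j for j in range(n) if abs(eigs[j] - lam) <= eta]  # indices of Lambda_I

# P_I(k) and Q_I as in (5).
P = [sum(U[k][j] ** 2 for j in in_I) / len(in_I) for k in range(n)]
Q = n * sum(x * x for x in P)

# Diagonal of the resolvent from the spectral decomposition:
# R_kk(z) = sum_j <u_j, e_k>^2 / (lambda_j - z).
R_diag = [sum(U[k][j] ** 2 / (eigs[j] - z) for j in range(n)) for k in range(n)]

# Right hand side of Lemma 3.1 with |I| = 2 * eta.
bound = (n * 2 * eta / len(in_I)) ** 2 * sum(r.imag ** 2 for r in R_diag) / n
print("Q_I =", Q, "<= bound =", bound)
```

The bound is only useful when $|\Lambda_I|$ is of order $n|I|$, which is why a lower bound on the number of eigenvalues in $I$ enters the strategy.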

Incidentally, we remark from [12, Lemma 5.9] and the above proof of Theorem 1.1 that there is a converse lower bound of involving the average of for any .

We may now briefly describe the strategy behind Theorem 1.2. Take an interval $I = [\lambda - \eta, \lambda + \eta]$ and $z = \lambda + i\eta$. First, from [12, Theorem 1.1], there exists a constant $c > 0$ such that, with overwhelming probability, $|\Lambda_I| \geq c\,n|I|$. Thanks to Lemma 3.1, it is thus sufficient to prove that

$$ \frac{1}{n} \sum_{k=1}^n (\Im R_{kk}(z))^2 = O(1). $$

From general concentration inequalities, it turns out that the above quantity is self averaging for . Using the exchangeability of the coordinates, it remains to prove that

$$ \mathbb{E} (\Im R_{11}(z))^2 = O(1). $$

Now, the law of converges as goes to infinity to a limit random variable, say , whose law satisfies a fixed point equation. In subsection 4.1, in the spirit of [22], we will study this fixed point and prove, by an implicit function theorem, that . It will remain to establish an explicit convergence rate of to to conclude the proof of Theorem 1.2. A careful choice of the norm for this convergence will be very important. We outline the content of these two sections in the next subsection.

### 3.2 The fixed point equations

The starting point in our approach is Schur’s complement formula,

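$$ \begin{aligned} R_{11}(z) &= -\left( z - n^{-\frac{1}{\alpha}} X_{11} + n^{-\frac{2}{\alpha}} \sum_{2 \leq k,\ell \leq n} X_{1k} X_{1\ell} R^{(1)}_{k\ell}(z) \right)^{-1} \\ &= -\left( z + n^{-\frac{2}{\alpha}} \sum_{2 \leq k \leq n} X_{1k}^2 R^{(1)}_{kk}(z) + T_z \right)^{-1}, \end{aligned} \tag{7} $$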

where $R^{(1)}$ is the resolvent of the matrix $(A_{k\ell})_{2 \leq k,\ell \leq n}$ and we have set

$$ T_z = -n^{-\frac{1}{\alpha}} X_{11} + n^{-\frac{2}{\alpha}} \sum_{2 \leq k \neq \ell \leq n} X_{1k} X_{1\ell} R^{(1)}_{k\ell}(z). $$
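The first line of (7) is a deterministic identity and can be verified on any small symmetric matrix. The sketch below, written in terms of the entries $A_{k\ell} = n^{-1/\alpha} X_{k\ell}$, compares $R_{11}(z)$ computed by direct inversion with the Schur complement expression involving the resolvent of the matrix with first row and column removed. The Gauss-Jordan routine `inverse`, the helper `resolvent`, and the $4 \times 4$ test matrix are illustrative assumptions, not part of the paper.

```python
def inverse(mat):
    """Gauss-Jordan inverse of a square complex matrix with partial pivoting."""
    n = len(mat)
    aug = [[complex(x) for x in row] + [1 + 0j if i == j else 0j for j in range(n)]
           for i, row in enumerate(mat)]
    for c in range(n):
        piv = max(range(c, n), key=lambda r: abs(aug[r][c]))
        aug[c], aug[piv] = aug[piv], aug[c]
        d = aug[c][c]
        aug[c] = [x / d for x in aug[c]]
        for r in range(n):
            if r != c:
                f = aug[r][c]
                aug[r] = [x - f * y for x, y in zip(aug[r], aug[c])]
    return [row[n:] for row in aug]

def resolvent(a, z):
    """R(z) = (A - z I)^{-1}."""
    n = len(a)
    return inverse([[a[i][j] - (z if i == j else 0) for j in range(n)] for i in range(n)])

# Arbitrary symmetric test matrix and spectral parameter in the upper half-plane.
A = [[0.5, -1.2, 0.3, 2.0],
     [-1.2, 0.0, 0.7, -0.4],
     [0.3, 0.7, -1.1, 0.9],
     [2.0, -0.4, 0.9, 0.2]]
z = 0.3 + 0.4j

R = resolvent(A, z)
# Resolvent of the minor obtained by removing the first row and column.
R1 = resolvent([row[1:] for row in A[1:]], z)

n = len(A)
quad = sum(A[0][k] * A[0][l] * R1[k - 1][l - 1] for k in range(1, n) for l in range(1, n))
schur = -1 / (z - A[0][0] + quad)
print("R_11 direct:", R[0][0])
print("R_11 Schur: ", schur)
```

Splitting the double sum into its diagonal part and the remainder gives the second line of (7), with the off-diagonal terms and the diagonal entry collected in $T_z$.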

It turns out that is negligible, at least for large enough. Assuming that, we observe that the moments of are governed by the order parameter

$$ y_z = \frac{1}{n} \sum_{k=1}^n (\Im R_{kk}(z))^{\frac{\alpha}{2}}. $$

Indeed,

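$$ \Im R_{11}(z) \simeq -\Im \left( z + n^{-\frac{2}{\alpha}} \sum_{2 \leq k \leq n} X_{1k}^2 R^{(1)}_{kk}(z) \right)^{-1} \leq \left( n^{-\frac{2}{\alpha}} \sum_{2 \leq k \leq n} X_{1k}^2\, \Im R^{(1)}_{kk}(z) \right)^{-1}. $$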

The resolvent and being close, we can justify that . Then, taking moments and using the formula, for , ,

$$ \frac{1}{x^p} = \frac{1}{\Gamma(p)} \int_0^\infty t^{p-1} e^{-xt}\,dt, \tag{8} $$

we deduce that

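$$ \begin{aligned} \mathbb{E}\left[ (\Im R_{11}(z))^p \right] &\leq \frac{1}{\Gamma(p)} \int_0^\infty t^{p-1}\, \mathbb{E}\left[ e^{-t (\Im R_{11}(z))^{-1}} \right] dt \\ &\lesssim \frac{1}{\Gamma(p)} \int_0^\infty t^{p-1} e^{-\Gamma(1-\alpha/2)\, t^{\alpha/2}\, n^{-1} \sum_{k=2}^n (\Im R^{(1)}_{kk}(z))^{\frac{\alpha}{2}}}\, dt \\ &\lesssim \frac{1}{\Gamma(p)} \int_0^\infty t^{p-1} e^{-\Gamma(1-\alpha/2)\, t^{\alpha/2}\, y_z}\, dt, \end{aligned} \tag{9} $$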

where, in the second step, we used that the variables $X_{1k}^2$ are in the domain of attraction of the non-negative $\alpha/2$-stable law (this approximation will be made more precise notably thanks to Lemma 5.7). The main point becomes to lower bound $y_z$. To this end, we shall extend it as a function of a complex variable $u$ and set

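$$ \gamma_z(u) = \Gamma\left(1 - \frac{\alpha}{2}\right) \times \frac{1}{n} \sum_{k=1}^n \left( -i R_{kk}(z) . u \right)^{\frac{\alpha}{2}} $$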

where

$$ h.u = \Re(u)\, h + \Im(u)\, \bar{h} = (\Re(u) + \Im(u))\, \Re(h) + i\, (\Re(u) - \Im(u))\, \Im(h). $$

Observe that we wish to lower bound

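$$ \gamma_z(e^{i\pi/4}) = 2^{\frac{\alpha}{4}}\, \Gamma\left(1 - \frac{\alpha}{2}\right) \times \frac{1}{n} \sum_{k=1}^n (\Im R_{kk})^{\frac{\alpha}{2}} = 2^{\frac{\alpha}{4}}\, \Gamma\left(1 - \frac{\alpha}{2}\right) y_z. $$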
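The identity above follows from $h.e^{i\pi/4} = \sqrt{2}\,\Re(h)$ applied to $h = -iR_{kk}(z)$, whose real part is $\Im R_{kk}(z)$. The short sketch below checks this numerically. The function names `dot` and `gamma_z`, the value `alpha = 0.8`, and the sample diagonal resolvent entries (taken of the form $1/(x - z)$ with $\Im z > 0$) are assumptions made for the illustration.

```python
import cmath
import math

def dot(h, u):
    """The operation h.u = Re(u) h + Im(u) conj(h)."""
    return u.real * h + u.imag * h.conjugate()

def gamma_z(diag_resolvent, u, alpha):
    """Gamma(1 - alpha/2) * (1/n) * sum_k (-i R_kk . u)^(alpha/2)."""
    n = len(diag_resolvent)
    total = sum(dot(-1j * r, u) ** (alpha / 2) for r in diag_resolvent)
    return math.gamma(1 - alpha / 2) * total / n

alpha = 0.8
z = 0.1 + 0.5j
# Hypothetical diagonal resolvent entries with positive imaginary part.
R_diag = [1 / (x - z) for x in (-2.0, -0.3, 0.0, 0.4, 1.7)]

# Order parameter y_z = (1/n) sum_k (Im R_kk)^(alpha/2).
y_z = sum(r.imag ** (alpha / 2) for r in R_diag) / len(R_diag)

lhs = gamma_z(R_diag, cmath.exp(1j * math.pi / 4), alpha)
rhs = 2 ** (alpha / 4) * math.gamma(1 - alpha / 2) * y_z
print("gamma_z(e^{i pi/4}) =", lhs)
print("2^{alpha/4} Gamma(1 - alpha/2) y_z =", rhs)
```

A lower bound on $\gamma_z$ at the single point $e^{i\pi/4}$ is therefore equivalent to a lower bound on $y_z$.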

We shall study the function thanks to a fixed point argument. We shall use that is homogeneous. Also, we shall restrict ourselves to , or even in the first quadrant . Here and after, for , we take the argument in . We can see that is approximately solution of a fixed point equation by using (7). To state this result, let us first define the space , , in which we will consider . For any , we let denote the space of functions from to such that , for all . For , we endow with the norms

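$$ \|g\|_\infty = \sup_{u \in S_1^+} |g(u)|, \qquad \|g\|_\kappa = \|g\|_\infty + \sup_{u \in S_1^+} \sqrt{ |(i.u)^{\kappa} \partial_1 g(u)|^2 + |(i.u)^{\kappa} \partial_i g(u)|^2 }, \tag{10} $$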

where and is the partial derivative of at with respect to the real () or imaginary part () of . We denote the closure of for . The space is a Banach space. Notice also that and coincide. The norm will turn out to be useful with to obtain concentration estimates for as well as to establish existence and good properties for its limit (there is sufficient).

We define formally the function given for , and by

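$$ F_h(g)(u) = \int_0^{\frac{\pi}{2}} d\theta\, (\sin 2\theta)^{\frac{\alpha}{2} - 1} \int_0^\infty dy\, y^{-\frac{\alpha}{2} - 1} \int_0^\infty dr\, r^{\frac{\alpha}{2} - 1} e^{-r h.e^{i\theta}} \left( e^{-r^{\frac{\alpha}{2}} g(e^{i\theta})} - e^{-y r h.u}\, e^{-r^{\frac{\alpha}{2}} g(e^{i\theta} + y u)} \right). \tag{11} $$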

The function $F_h$ is related to a fixed point equation satisfied by $\gamma_z$. Namely, let

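$$ c_\alpha = \frac{\alpha}{2^{\frac{\alpha}{2}}\, \Gamma(\alpha/2)^2} \quad \text{and} \quad \check{u} = i \bar{u} = \Im(u) + i\, \Re(u). \tag{12} $$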

For , we introduce the map on given by

$$ G_z(f)(u) = c_\alpha F_{-iz}(f)(\check{u}). \tag{13} $$

Finally we let

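$$ \bar{\gamma}_z(u) = \mathbb{E} \gamma_z(u) = \Gamma\left(1 - \frac{\alpha}{2}\right) \mathbb{E} \left( -i R_{11}(z) . u \right)^{\frac{\alpha}{2}}. $$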

Then, and we shall prove that

###### Proposition 3.2.

Let and . There exists such that if for and some , we have , , and then for all and all large enough (depending on ),

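$$ \begin{aligned} \| \bar{\gamma}_z - G_z(\bar{\gamma}_z) \|_{1 - \alpha/2 + \delta} &\leq (\log n)^c \left( \eta^{-\alpha/2} \bar{M}_z^{\alpha/2} + \eta^{-\alpha/2} n^{-\delta/2} + \bar{M}_z^{1 - \alpha/2} \mathbf{1}_{\alpha > 1} \right), \\ \left| \mathbb{E} |R_{11}|^p - r_{p,z}(\bar{\gamma}_z) \right| &\leq (\log n)^c \left( \eta^{-p} \bar{M}_z^{\alpha/2} + \eta^{-\alpha/2} n^{-\delta/2} \right), \\ \left| \mathbb{E} (-i R_{11})^p - s_{p,z}(\bar{\gamma}_z(1)) \right| &\leq (\log n)^c \left( \eta^{-p} \bar{M}_z^{\alpha/2} + \eta^{-\alpha/2} n^{-\delta/2} \right), \end{aligned} $$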

where we have set , and, for

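$$ r_{p,z}(f) = \frac{2^{1 - p/2}}{\Gamma(p/2)^2} \int_0^{\frac{\pi}{2}} d\theta\, \sin(2\theta)^{p/2 - 1} \int_0^\infty dr\, r^{p-1} e^{i r z . e^{i\theta} - r^{\alpha/2} f(e^{i\theta})}, $$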

and for ,

$$ s_{p,z}(x) = \frac{1}{\Gamma(p)} \int_0^\infty dr\, r^{p-1} e^{-i r z - r^{\alpha/2} x}. $$

Using that is close to , we can upper bound if we can lower bound by (9). Similarly, we can upper bound if we can lower bound by (9). Assuming for a moment that we can obtain such bounds (by using a bootstrap argument) the above proposition shows that is approximately a fixed point for .

It turns out that, for any , converges to as goes to infinity, where is a solution of the equation (even we cannot prove that there is a unique solution of this equation). We will check that this last equation has a unique explicit solution of interest for , with also in . We will study the solutions of the equation for close to and close to thanks to the Implicit Function Theorem on Banach space. We will show that for most in , if is small enough, has a unique solution in the neighborhood of . Moreover, the real part of this solution is lower bounded by a positive constant. Let us summarize these results in the following statement:

###### Proposition 3.3.

There exists a countable subset with no accumulation point on such that the following holds. Let and . There exists such that if , then is the unique such that and . Moreover, uniformly in , is bounded from below and, for any , is bounded from above.

The possible existence of the set should be purely technical. Our proof requires that contains , but it could be larger as our argument is based on the fact that some function, analytic in , does not vanish except possibly on a set with no accumulation points.

We will also deduce the following result.

###### Proposition 3.4.

Let be as in Proposition 3.3, , and be as in Proposition 3.3 for small enough. There exist and such that if and then

$$ \| \gamma - \gamma_z^\star \|_\kappa \leq c\, \| \gamma - G_z(\gamma) \|_\kappa. $$

As a corollary of the above three propositions, we will prove that is lower bounded for sufficiently large. In the next section, we study the fixed point equation for and establish Proposition 3.3 and Proposition 3.4. In Section 5, we will prove Proposition 3.2 and complete the details of the proof of Theorem 1.2.

## 4 Analysis of the limiting fixed point equation

### 4.1 Analysis of the function $F_h$

In this first part, we show that the function is well defined as a map from into with the set of functions in such that for all in . We also check that has good regularity properties. Notice that is an open subset of for our norm. We set . Finally, the closure of is the set of functions in whose real part is non-negative on .

###### Lemma 4.1.

Let and . Let . defines a map from to . Moreover, if , defines a map from to and, for some constant ,

$$ \| F_h(g) \|_\kappa \leq \frac{c}{\Re(h)^{\alpha/2}} + \frac{c}{\Re(h)^{\alpha}} \| g \|_\kappa. \tag{14} $$

Finally, if , and for some constant .

We shall also prove that is Fréchet differentiable and more precisely

###### Lemma 4.2.

Let , , , and . The Fréchet derivative of at is the bounded operator given, for any , by

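$$ \begin{aligned} DF_h(g)(f)(u) = {} & \int_0^{\frac{\pi}{2}} d\theta\, (\sin 2\theta)^{\frac{\alpha}{2} - 1} \int_0^\infty dy\, y^{-\frac{\alpha}{2} - 1} \\ & \times \int_0^\infty dr\, r^{\alpha - 1} e^{-r h.e^{i\theta}} \left( f(e^{i\theta})\, e^{-r^{\frac{\alpha}{2}} g(e^{i\theta})} - f(e^{i\theta} + y u)\, e^{-y r h.u}\, e^{-r^{\frac{\alpha}{2}} g(e^{i\theta} + y u)} \right). \end{aligned} $$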

Moreover, is continuously differentiable on and is continuous in .

As a corollary we shall see that all the functions defined in Proposition 3.2 are Lipschitz in some appropriate norm.

###### Lemma 4.3.

For any , and , is Lipschitz on : there exists such that for any and ,

$$ \| G_z(f) - G_z(g) \|_\kappa \leq c\, \| f - g \|_\kappa + (\| f \|_\kappa + \| g \|_\kappa)\, \| f - g \|_\infty. $$

Similarly, for any and