Detecting localized eigenstates of linear operators


Jianfeng Lu, Department of Mathematics, Department of Physics, and Department of Chemistry, Duke University, Box 90320, Durham NC 27708, USA, jianfeng@math.duke.edu, and Stefan Steinerberger, Department of Mathematics, Yale University, New Haven CT 06511, USA, stefan.steinerberger@yale.edu
Abstract.

We describe a way of detecting the location of localized eigenvectors of a linear system $Ax = \lambda x$ for eigenvalues $\lambda$ with $|\lambda|$ comparatively large. We define the family of functions

$$\phi_k(i) = \frac{1}{k} \log \| A^k e_i \|_{\ell^2}, \qquad 1 \leq i \leq n,$$

where $k \in \mathbb{N}$ is a parameter and $e_i$ is the $i$-th standard basis vector. We prove that eigenvectors associated to eigenvalues with large absolute value localize around local maxima of $\phi_k$: the metastable states in the power iteration method (slowing down its convergence) can be used to predict localization. We present a fast randomized algorithm and discuss different examples: a random band matrix, and discretizations of the local operator $-\Delta + V$ and the nonlocal operator $(-\Delta)^s + V$.

Key words and phrases:
Eigenvectors; localization; power iteration; randomized numerical linear algebra; Anderson localization.
2010 Mathematics Subject Classification:
35P20 (primary), 82B44 (secondary)
The research of J.L. was supported in part by the National Science Foundation under award DMS-1454939. He would also like to thank Yingzhou Li for helpful discussions.

1. Introduction and Main Idea

1.1. Introduction

We are interested in spatially localized eigenvectors of matrices $A \in \mathbb{R}^{n \times n}$. These objects are of paramount importance in many fields of mathematics: the ground state and low-frequency behavior of quantum systems [1, 7, 10, 11, 15], the behavior of metastable random dynamical systems [3, 4, 5], the detection of central points in graph clusters [6], principal component analysis for sample covariance matrices [19], and many more.
The purpose of this paper is to introduce a simple idea, which provably detects localized eigenstates associated to eigenvalues with large absolute value at low computational cost. We introduce the entire relevant theory for matrices $A \in \mathbb{R}^{n \times n}$; however, a crucial ingredient is the following: when numerically computing solutions for many infinite-dimensional linear operators of interest (linear/nonlocal/fractional partial differential equations, integral equations, …), these are usually discretized and the discretization respects the spatial ordering of the underlying domain. In particular, if the original continuous object has localized eigenstates and the discretization is sufficiently accurate, then the discretized linear operator will have localized eigenstates on the associated graph. We will completely ignore the question of how operators are discretized and restrict ourselves to the question of how to find localized eigenvectors.

1.2. Main idea

We are given a matrix $A \in \mathbb{R}^{n \times n}$ (not necessarily symmetric) and are interested in finding, if they exist, the location of localized eigenvectors $v$ concentrating their mass on relatively few coordinates,

$$A v = \lambda v,$$

for $\lambda$ in the spectral edge (meaning that $|\lambda|$ is comparatively large relative to the rest of the spectrum; the low-lying eigenvalues, close to 0, can be obtained via the very same method after a transformation of $A$, see below). Since strongly localized eigenstates are essentially created by localized structure, they should also be detectable by completely local operations.

Definition 1.

We define $\phi_k : \{1, 2, \dots, n\} \to \mathbb{R}$ given by

$$\phi_k(i) = \frac{1}{k} \log \| A^k e_i \|_{\ell^2},$$

where $k \in \mathbb{N}$ is a parameter and $e_i$ is the $i$-th standard basis vector.

The main idea is rather simple: if highly localized eigenvectors exist, then they have a nontrivial inner product with at least one of the standard basis vectors (whose size can be bounded from below depending only on the scale of localization and not on $n$). An iterated application of the matrix $A$ will then lead to larger growth than it would in other regions. The idea is vaguely related to the stochastic interpretation [16] of the Filoche-Mayboroda landscape function [11]. The logarithm in the definition of $\phi_k$ counteracts the exponential growth purely for the purpose of visualization.
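In matrix terms, $\phi_k$ collects the (scaled logarithms of the) column norms of $A^k$. The following is a minimal dense-matrix sketch of Definition 1 (the function name and NumPy implementation are ours, not part of the paper); for very large $k$ one would rescale intermediate products to avoid overflow.

```python
import numpy as np

def landscape(A, k):
    """phi_k(i) = log(||A^k e_i||_2) / k; the columns of A^k are the vectors A^k e_i."""
    P = np.eye(A.shape[0])
    for _ in range(k):
        P = A @ P                                    # after the loop, P = A^k
    return np.log(np.linalg.norm(P, axis=0)) / k     # column norms, then log and rescale
```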

Example. We start by considering a numerical example (see Fig. 1): here $A \in \mathbb{R}^{n \times n}$ is given by a random band matrix with bandwidth $w$ around the diagonal (i.e., $a_{ij} = 0$ whenever $|i - j| > w$, so that $w = 0$ corresponds to a diagonal matrix) and every non-zero entry is chosen independently at random from a fixed interval. A typical outcome can be seen in Figure 1: the function $\phi_k$ has a series of local maxima and the first few eigenvectors localize around these maxima; higher peaks in the landscape correspond to eigenvalues with larger absolute value.

Figure 1. $\phi_k$ (black) and plots of the absolute values of the first few eigenvectors associated to the largest eigenvalues.
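A band matrix of this type is easy to generate; in the sketch below the size $n = 1000$, the bandwidth $w = 10$, the uniform distribution on $[0, 1]$, and $k = 20$ are illustrative choices of ours, not necessarily the values used for Figure 1.

```python
import numpy as np

def random_band_matrix(n, w, rng):
    """n x n matrix vanishing outside the band |i - j| <= w, random entries inside."""
    A = np.zeros((n, n))
    for offset in range(-w, w + 1):
        A += np.diag(rng.uniform(0.0, 1.0, n - abs(offset)), k=offset)
    return A

rng = np.random.default_rng(0)
A = random_band_matrix(1000, 10, rng)
phi = landscape(A, 20)              # landscape() from the sketch above
candidates = np.argsort(phi)[-5:]   # highest peaks: likely localization sites
```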

The value of $k$ depends on the precise circumstances; larger values can lead to higher accuracy but also increase the computational cost. It is worth pointing out that it is not interesting to take $k$ very large: whenever the largest eigenvalue is simple and the associated eigenvector does not vanish anywhere, this approach becomes less effective since

$$\lim_{k \to \infty} \phi_k(i) = \log |\lambda_1| \qquad \text{for every } 1 \leq i \leq n.$$

It is not difficult to see that the speed of convergence is going to depend on the spectral gap between the largest eigenvalue and the rest of the spectrum (in terms of the absolute values of the eigenvalues); it is commonly desirable to have a spectral gap. Here, we are bound to encounter a delicate interplay between spectral gaps and the scale of localization.

2. Statement of Main Result

We give one of the many possible formulations of a rigorous guarantee for the approach. Indeed, the underlying principle, as outlined in the previous section, is so simple that there are many ways of turning it into a precise statement; we give a fairly canonical one, but it is by no means unique and different circumstances may call for different versions.

We start by clarifying our setup and introducing some parameters below. We first phrase everything in a way that is most natural in the setting of band matrices or matrices with rapid decay off the diagonal (which covers $-\Delta + V$ on subsets of $\mathbb{R}^d$, discretized by finite difference or finite volume methods); the general case follows in a rather straightforward manner by replacing the notion of 'interval' by 'subset', as we briefly discuss below. The only restrictive assumption is the orthogonality of eigenvectors (1), which is usually given in the setting that we are interested in (localization of self-adjoint operators). (2) and (3) introduce various parameters that are always defined; however, in the non-localized regime they may result in a vacuous conclusion (see Figure 2 for an illustration).

  1. The eigenvectors $v_1, v_2, \dots, v_n$ of $A$ form an orthonormal basis of $\mathbb{R}^n$ and we order the eigenvalues via

     $$|\lambda_1| \geq |\lambda_2| \geq \dots \geq |\lambda_n|.$$

  2. Every one of the first $m$ eigenvectors $v_1, \dots, v_m$ has half of its mass supported on an interval $I_j \subset \{1, \dots, n\}$, i.e.

     $$\sum_{i \in I_j} v_j(i)^2 \geq \frac{1}{2},$$

     and we define $\ell$ as the length of the longest such interval.

  3. We assume that, for all $1 \leq j \leq m$, the eigenvector $v_j$ has exponential decay away from the interval $I_j$, i.e. for all $1 \leq i \leq n$

     $$|v_j(i)| \leq \exp\big( -c\, d(i, I_j) \big)$$

     for some universal constant $c > 0$.

Figure 2. The modulus squared of four eigenvectors localized in four intervals, bounded uniformly in length by $\ell$. The eigenvectors have exponential decay away from the intervals.

Our main result states that, depending on the size of the spectral gap and the quality of localization, there exists a critical value $c_k$ such that the superlevel set $\{ i : \phi_k(i) \geq c_k \}$ intersects all localized intervals $I_j$ and such that $\phi_k$ can only be that large in a small neighborhood of the intervals $I_j$. In particular, this allows for a detection of localization of the first $m$ eigenfunctions by looking at $\phi_k$ alone.

Theorem.

If $k$ is chosen such that

$$|\lambda_{m+1}|^{2k} \leq \frac{|\lambda_m|^{2k}}{4\ell},$$

then there exists a critical value

$$c_k = \log |\lambda_m| - \frac{\log(2\ell)}{2k}$$

such that $\phi_k$ is large on all $I_1, \dots, I_m$,

$$\max_{i \in I_j} \phi_k(i) \geq c_k \qquad \text{for all } 1 \leq j \leq m,$$

and only large in their neighborhood: if $\phi_k(i) \geq c_k$, then

$$d\Big( i, \bigcup_{j=1}^{m} I_j \Big) \leq \frac{k}{c} \log \frac{|\lambda_1|}{|\lambda_m|} + \frac{\log(4 m \ell)}{2c}.$$
The condition on $k$ depends on the spectral gap and the localization properties. If the spectral gap is large, then $k \sim \log \ell$ is sufficient. If the matrix satisfies that each row has a constant number of non-zero entries (for example, if it is a local discretization of a differential operator in $d$ dimensions), this implies that the computation of $\phi_k$ may only require a near-linear (in $n$) number of operations. It is clear from the proof that there are many other possible conditions under which similar results could be obtained. Natural variations include the following:

  • The statement guarantees a gap of a definite size between the values of $\phi_k$ attained on the intervals $I_j$ and far away from their supports; in most instances a much smaller gap would suffice (especially if one identifies regions of localization via a notion of local maximum of $\phi_k$, i.e., attaining a larger value than in a neighborhood of a certain size). We observe that our approach easily implies the inequalities

    $$\max_{i \in I_j} \phi_k(i) \geq \log |\lambda_j| - \frac{\log(2\ell)}{2k}$$

    and, for any $1 \leq i \leq n$,

    $$\phi_k(i) \leq \log |\lambda_1|.$$

    In combination, they suggest that the gap size could in various situations be replaced by something much smaller, which would improve the error bounds.

  • Orthogonality is not crucial: if the eigenvectors of $A$ form a basis and the angles between different eigenvectors are not too small, our argument easily implies similar results.

  • If we know in advance that a generic localized eigenfunction is going to be roughly localized on an interval of a certain length, then it is clear that it suffices to compute $\phi_k$ on a net of points at that scale, which further speeds up the computation.

  • $\phi_k$ is defined via purely local operations and thus, by definition, its value at a certain location is stable under perturbations of the matrix entries far away. On the other hand, it is entirely possible that a large perturbation will destroy the spectral structure of the matrix. With additional assumptions on the perturbation that guarantee spectral stability, it is then possible to use $\phi_k$ of the unperturbed matrix to predict localization after the perturbation.

We also remark that the assumptions of the theorem consist of both the existence of a gap in the spectrum and localization of the first few eigenvectors. In some situations, the localization of the eigenvectors follows from the gap assumption alone, for instance when the matrix comes from a local discretization of a differential operator, as established in [2, 13].

As $k$ increases, the neighborhoods in which the result guarantees localization grow linearly in size (though only very slowly if $|\lambda_m|$ is close to $|\lambda_1|$) and $\phi_k$ becomes less informative. This is not an artifact but necessary: whenever the largest eigenvalue is simple and the associated eigenvector never vanishes, then

$$\lim_{k \to \infty} \phi_k(i) = \log |\lambda_1| \qquad \text{for every } 1 \leq i \leq n.$$

This is similar in spirit to the classical power method for computing the first eigenvector. It is noteworthy that we exploit exactly the clustering of eigenvalues in the edge of the spectrum that makes the power method a slow method in practice. Or, put differently, our method exploits that highly localized eigenvectors associated to eigenvalues in the spectral edge correspond to metastable states for the power iteration! When $k$ is large, the first eigenvector (and potentially other high-lying ones) will then interfere with the performance of the procedure. In such situations, we may revise the procedure by first identifying those dominant eigenvectors and then applying the procedure while iteratively projecting onto the orthogonal complement of the subspace spanned by the dominant eigenvectors, as in the sketch following this paragraph. The details are standard and left to the interested reader.
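A minimal sketch of this deflated variant (our own illustrative implementation; for simplicity it computes the dominant eigenvectors exactly, which one would replace by a few power iterations in practice):

```python
import numpy as np

def deflated_landscape(A, k, num_deflate=1):
    """phi_k computed while projecting out the num_deflate dominant eigenvectors."""
    n = A.shape[0]
    vals, vecs = np.linalg.eig(A)
    idx = np.argsort(-np.abs(vals))[:num_deflate]
    Q, _ = np.linalg.qr(vecs[:, idx].real)   # for symmetric A the eigenvectors are real
    P = np.eye(n) - Q @ Q.T                  # projector onto the orthogonal complement
    X = P.copy()                             # projected standard basis vectors
    for _ in range(k):
        X = P @ (A @ X)                      # apply A, then project out dominant modes
    return np.log(np.linalg.norm(X, axis=0)) / k
```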

In many applications, we have the eigenvalue problem

$$A x = \lambda x$$

with $A$ being positive definite and the dominant characteristics of the physical system being determined by the low-lying eigenvalues. A straightforward application of our localization technique is only going to yield the largest eigenvalues. The obvious modification is to consider the matrix

$$B = c\, I - A \qquad \text{for some constant } c \geq \lambda_{\max}(A).$$

This operation preserves sparsity and flips the spectrum, so that the low-lying eigenvalues are now in the spectral edge. This is used in our numerical examples in Section 4. Alternatively, if we are given a self-adjoint, positive definite linear map $A$ on a Hilbert space with compact resolvent, then a natural way of recovering the bottom of the spectrum is via consideration of the semigroup $e^{-tA}$, $t > 0$.

An application of the spectral theorem allows us to write the semigroup as

$$e^{-tA} = \sum_{j} e^{-t \lambda_j} \langle \cdot\,, v_j \rangle\, v_j,$$

which has slow decay for the small eigenvalues and fast decay for the larger eigenvalues. Note that if we Taylor expand $e^{-tA} = I - tA + O(t^2)$ and keep only the leading-order terms, we get $I - tA$, which connects with the previous trick. We refer to Appendix A for an application.
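As a minimal illustration of the spectral flip (a sketch under the same conventions as before; `landscape` is the routine sketched in Section 1.2):

```python
import numpy as np

def flipped_landscape(A, k):
    """Landscape of B = c*I - A: the low-lying spectrum of A moves to the edge of B."""
    c = np.linalg.norm(A, 2)                 # any c >= lambda_max(A) works
    B = c * np.eye(A.shape[0]) - A           # preserves the sparsity pattern (plus diagonal)
    return np.log(np.linalg.norm(np.linalg.matrix_power(B, k), axis=0)) / k
```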

The result, as given above, is easiest to understand in the setting of banded matrices. Banded matrices correspond naturally to localized interactions on the lattice $\{1, 2, \dots, n\}$. Neighborhoods of points correspond to intervals, and this is how the Theorem was phrased. At a greater level of generality, there need not be such an underlying structure and we replace intervals by general subsets $I_j \subset \{1, \dots, n\}$. The notion of distance $d(i, I_j)$ between an element $i$ and a subset $I_j$ is then implicitly defined via assuming the inequality

$$|v_j(i)| \leq \exp\big( -c\, d(i, I_j) \big)$$

to hold. We emphasize that in all the interesting applications, where $A$ is the discretization of a differential operator (or a somewhat localized integral operator), these notions can be made rather precise and we recover the classical notion of distance in Euclidean space.

3. Fast Randomized Algorithms

Computationally, $\phi_k$ can be obtained by calculating the $\ell^2$-norms of the rows of the matrix $A^k$ (which, for symmetric $A$, coincide with the norms of the columns $A^k e_i$). Thus the algorithm is particularly efficient if $A$ has structure enabling fast multiplication, such as being sparse, of low rank, etc. To further accelerate the computation, we exploit ideas from randomized numerical linear algebra (see [8] for a review) and use the following randomized version of the landscape function. For simplicity, we assume that the matrix $A$ is symmetric in this section.

Definition 2.

We define $\tilde{\phi}_k : \{1, 2, \dots, n\} \to \mathbb{R}$ given by

$$\tilde{\phi}_k(i) = \frac{1}{k} \log \| e_i^{T} A^k G \|_{\ell^2},$$

where $k \in \mathbb{N}$ is a parameter, $e_i$ is the $i$-th standard basis vector, and $G \in \mathbb{R}^{n \times r}$ is a random matrix with i.i.d. entries.
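A minimal sketch of this randomized version (names ours; standard Gaussian entries are an illustrative choice for $G$): instead of pushing all $n$ basis vectors through $k$ applications of $A$, we push a single tall skinny random matrix.

```python
import numpy as np

def randomized_landscape(A, k, r, rng):
    """Approximate landscape from the row norms of A^k G, with G an n x r Gaussian matrix."""
    X = rng.standard_normal((A.shape[0], r))
    for _ in range(k):
        X = A @ X                                    # k matrix-times-tall-skinny products
    return np.log(np.linalg.norm(X, axis=1)) / k     # one value per index i (row norms)
```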

In terms of computation, if $r \ll n$, the randomized version only requires applying the matrix $A$ to a tall skinny $n \times r$ matrix $k$ times, and thus the randomized version is particularly advantageous for dense $A$, as it brings down the cost from $O(k n^3)$ to $O(k r n^2)$. As it turns out, the efficiency of this method is intimately coupled to very well studied concepts centered around the stability of random projections onto subspaces. To see this, we denote the columns of $G$ by $g_1, \dots, g_r$ and observe that

$$\| e_i^{T} A^k G \|_{\ell^2}^2 = \sum_{j=1}^{r} \langle A^k e_i, g_j \rangle^2,$$

while for $\phi_k$, we have

$$\| A^k e_i \|_{\ell^2}^2 = \sum_{j=1}^{n} \langle A^k e_i, e_j \rangle^2.$$

We start by quickly discussing a very strong sufficient condition that allows one to transfer results almost verbatim from our Theorem to the random case and is equivalent to classical questions in dimensionality reduction; stronger results are discussed below. We observe that certainly all the results transfer in a pointwise manner if we knew that

$$\| e_i^{T} A^k G \|_{\ell^2} \sim \| A^k e_i \|_{\ell^2} \qquad \text{(up to a fixed multiplicative factor)}$$

for a typical realization of such random vectors. However, since the vectors $A^k e_i$ span the space, we are really asking that the map $\mathbb{R}^n \to \mathbb{R}^r$ given by

$$x \mapsto \big( \langle x, g_1 \rangle, \langle x, g_2 \rangle, \dots, \langle x, g_r \rangle \big)$$

roughly preserve norms. This is, in fact, exactly the question that underlies the study of random projections, and it has been dealt with extensively (see, e.g., [17]). The main conclusion is that $r \sim \log n$ is in many cases sufficient; however, the implicit constant may be large.

We now explain why in practice a much smaller number of random vectors suffices. The idea is rather simple and most easily explained by considering the example of one random vector $g$ in the case of a large spectral gap $|\lambda_1| \gg |\lambda_2|$. Clearly, we have

$$A^k g = \sum_{j=1}^{n} \lambda_j^k \langle g, v_j \rangle\, v_j.$$

The outcome now depends on the random vector $g$; however, for suitably large values of $k$ it is clear that in order for the random landscape to profoundly differ from the profile of the leading eigenvector, it is required that $\langle g, v_1 \rangle$ be very small: even if it were only moderately small, it would get drastically amplified by the exponential growth and still dominate the expression. The following widely used Lemma shows that this is not overly likely.

Lemma.

Let $x \in \mathbb{R}^n$ satisfy $\|x\|_{\ell^2} = 1$ and let $g$ be randomly chosen w.r.t. the uniform surface measure on the sphere $\mathbb{S}^{n-1}$. Then, for $\epsilon > 0$,

$$\mathbb{P}\big( |\langle x, g \rangle| \leq \epsilon \big) \lesssim \epsilon,$$

where the implicit constant depends only on the dimension.

This simple Lemma quantifies the concentration of measure phenomenon and is standard (see, e.g., [9]). It explains why, in the case of highly localized eigenstates, it is completely sufficient to work with only one randomly chosen vector: the inner products are not likely to be extremely small and get amplified by exponential growth, while the strong exponential localization preserves the structure. This is easy to make precise in a variety of ways; the simplest case is a spectral gap $|\lambda_1| > |\lambda_2|$. The trivial estimate

$$\Big\| A^k g - \lambda_1^k \langle g, v_1 \rangle v_1 \Big\|_{\ell^2} \leq |\lambda_2|^k \, \|g\|_{\ell^2}$$

implies, together with the Lemma above, that with high probability the leading term $\lambda_1^k \langle g, v_1 \rangle v_1$ dominates as soon as $(|\lambda_1|/|\lambda_2|)^k$ is large compared to $1/|\langle g, v_1 \rangle|$. We emphasize that this simple argument did not even use localization of the eigenvectors; the trivial estimate is clearly quite weak if the spectrum is spread out, and in that case much stronger results should hold.
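The Lemma is easy to check by simulation (a quick Monte Carlo sketch; the parameters are illustrative):

```python
import numpy as np

rng = np.random.default_rng(0)
n, trials, eps = 1000, 200_000, 1e-3
x = np.zeros(n); x[0] = 1.0                       # any fixed unit vector
g = rng.standard_normal((trials, n))
g /= np.linalg.norm(g, axis=1, keepdims=True)     # uniform samples on the sphere
prob = np.mean(np.abs(g @ x) <= eps)
print(prob / eps)   # bounded ratio; here of size ~sqrt(n), i.e. dimension-dependent
```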

Example. Let us revisit the example in Section 1.2 using random sampling.

Figure 3. The landscape function without random sampling (A), compared with its randomized versions computed with one random vector (B) and with increasing numbers of random vectors (C), (D), all plotted in black. The absolute values of the first few eigenvectors associated to the largest eigenvalues are also plotted. (A) replicates Figure 1.

The same random band matrix as in Figure 1 is used. The randomized versions with an increasing number of random vectors are plotted in the panels of Figure 3. We observe that the randomized landscape function, even with only one random vector, still captures the important features, in particular the local maxima, of the landscape function.

4. Numerical Examples

4.1. Schrödinger Operator with Potential.

The case of finding a way of numerically detecting localized eigenstates of operators of the form $-\Delta + V$ for a potential $V$ and Dirichlet conditions on the boundary has recently received a lot of attention in the mathematics and physics literature (see, e.g., [10, 11, 12, 14, 16]). We explain how our method can be applied to this case (without any restrictions on $V$). We consider the operator $-\Delta + V$ defined on $[0,1]^2$ with periodic boundary conditions; $V$ is a smooth periodic potential generated randomly in the unit square. A pseudo-spectral discretization is used.
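A sketch of such a pseudo-spectral setup (entirely our own minimal construction: grid size, smoothing scale, potential strength, and the iteration parameters are illustrative assumptions, not the values used for Figure 4):

```python
import numpy as np

m = 64                                             # grid points per dimension
freq = 2 * np.pi * np.fft.fftfreq(m, d=1.0 / m)    # angular wavenumbers on [0,1]
KX, KY = np.meshgrid(freq, freq, indexing="ij")
lap = KX**2 + KY**2                                # Fourier symbol of -Laplacian

rng = np.random.default_rng(0)
V = np.fft.ifft2(np.fft.fft2(rng.standard_normal((m, m))) * np.exp(-0.005 * lap)).real
V = 1e4 * (V - V.min()) / (V.max() - V.min())      # smooth, nonnegative random potential

def apply_H(u):
    """Apply H = -Delta + V pseudo-spectrally to a field u of shape (m, m)."""
    return np.fft.ifft2(lap * np.fft.fft2(u)).real + V * u

# Randomized landscape of the flipped operator c*I - H (low-lying states -> spectral edge).
c = lap.max() + V.max()                            # crude upper bound for the spectrum of H
r, k = 5, 40
X = rng.standard_normal((m, m, r))
for _ in range(k):
    X = c * X - np.stack([apply_H(X[..., j]) for j in range(r)], axis=-1)
phi = np.log(np.linalg.norm(X, axis=-1)) / k       # landscape on the grid
```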

Figure 4. The landscape function (A), compared with the sum of the squared moduli of the low-lying eigenfunctions (B), for the operator $-\Delta + V$. The squared moduli of the first two low-lying eigenfunctions are also plotted in (C) and (D), respectively.

While the resulting matrix is dense due to the Fourier differentiation of the pseudo-spectral method, the proposed method still applies. In Figure 4, we show the landscape function, computed with a small number of random vectors, together with the sum of the squared moduli of the low-lying eigenvectors. Good agreement is observed. For visualization, we only plot the part of the landscape function exceeding its maximum value minus a fixed offset.

Figure 5. The landscape function (A), compared with the sum of the squared moduli of the low-lying eigenfunctions (B), for the fractional operator $(-\Delta)^s + V$. The same potential as in the example of Figure 4 is used. The squared moduli of the first two low-lying eigenfunctions are also plotted in (C) and (D), respectively.

4.2. Fractional Laplacian

The method extends without much difficulty to the fractional Laplacian $(-\Delta)^s$, $0 < s < 1$. We still consider the computational domain $[0,1]^2$ with periodic boundary conditions and a smooth, randomly generated potential. Thanks to the periodic boundary conditions, the fractional Laplacian can be defined through the spectral decomposition. A pseudo-spectral discretization is again used. Note that the fractional Laplacian is non-local, regardless of the discretization. In Figure 5, we show the result for $(-\Delta)^s + V$ with the same potential as in the example of Figure 4. We again observe excellent agreement between the landscape function and the localization of the eigenfunctions.
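On the torus, the only change needed in the earlier pseudo-spectral sketch is the Fourier symbol (a minimal sketch; the exponent `s` is a free parameter):

```python
import numpy as np

def apply_H_frac(u, s, V, lap):
    """Apply (-Delta)^s + V pseudo-spectrally on the torus; lap is the symbol of -Delta."""
    return np.fft.ifft2((lap**s) * np.fft.fft2(u)).real + V * u
```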

5. Proofs

Proof of the Theorem.

We first show that $\phi_k$ exceeds a certain value on all the intervals $I_j$ ($1 \leq j \leq m$). We start by showing there is an $i \in I_j$ with

$$|v_j(i)| \geq \frac{1}{\sqrt{2 |I_j|}}.$$

If this were false, then

$$\frac{1}{2} \leq \sum_{i \in I_j} v_j(i)^2 < |I_j| \cdot \frac{1}{2 |I_j|} = \frac{1}{2},$$

which is a contradiction. Here the first inequality comes from the assumption that $I_j$ contains half the $\ell^2$-mass of $v_j$. A simple application of the spectral theorem implies that, for this index $i$,

$$\| A^k e_i \|_{\ell^2}^2 = \sum_{l=1}^{n} \lambda_l^{2k} \langle e_i, v_l \rangle^2 \geq \lambda_j^{2k}\, v_j(i)^2 \geq \frac{|\lambda_j|^{2k}}{2\ell}, \qquad \text{and thus} \qquad \phi_k(i) \geq \log |\lambda_j| - \frac{\log(2\ell)}{2k}.$$

It remains to show that this value cannot be attained unless one is close to one of the localized eigenvectors. If $i$ has at least distance $d$ from $\bigcup_{j=1}^{m} I_j$, then

$$|v_j(i)| \leq e^{-c d} \qquad \text{for all } 1 \leq j \leq m,$$

and thus

$$\| A^k e_i \|_{\ell^2}^2 \leq m\, |\lambda_1|^{2k} e^{-2 c d} + |\lambda_{m+1}|^{2k}.$$

We want to relate this to the inequality

$$m\, |\lambda_1|^{2k} e^{-2 c d} + |\lambda_{m+1}|^{2k} < \frac{|\lambda_m|^{2k}}{2\ell},$$

which will provide an upper bound on the distance $d$ (meaning that if the distance to the sets $I_j$ is larger than this bound, then we obtain the desired conclusion). If, as we assume,

$$|\lambda_{m+1}|^{2k} \leq \frac{|\lambda_m|^{2k}}{4\ell},$$

the inequality

$$m\, |\lambda_1|^{2k} e^{-2 c d} < \frac{|\lambda_m|^{2k}}{4\ell}$$

is equivalent to

$$d > \frac{k}{c} \log \frac{|\lambda_1|}{|\lambda_m|} + \frac{\log(4 m \ell)}{2c},$$

which is the desired statement. ∎

Appendix A Localized Wannier Bases

In this appendix, we connect the landscape function to localized Wannier bases. Localized Wannier bases [18] are maximally localized bases of spaces spanned by low-lying eigenfunctions. Given an operator with mutually orthogonal low-lying eigenfunctions, it is often helpful to work with a basis of their span in which each basis element is as localized as possible (for example, to obtain sparser matrices). A classical approach, see, e.g., the review article [15], is to simply project the Dirac measure $\delta_{x_0}$ onto the finite-dimensional subspace to obtain the best possible representation of this localized point in the basis. We illustrate this with a simple example: let us consider $-\Delta$ on the torus $\mathbb{T}$. The projection of $\delta_{x_0}$ onto the span of the low-frequency Fourier modes $\{ e^{inx} : |n| \leq N \}$ is given by

$$\sum_{|n| \leq N} e^{in(x - x_0)} = D_N(x - x_0),$$

which is merely the classical Dirichlet kernel. The function is indeed highly localized around $x_0$, roughly constant for $|x - x_0| \lesssim N^{-1}$, and then exhibits some oscillatory behavior around 0 further away from $x_0$. Motivated by our main idea, a slightly different idea suggests itself: instead of merely projecting, it makes sense to first run the dynamical system (here, the heat flow $e^{t\Delta}$) with $\delta_{x_0}$ as initial datum and then project onto the subspace. More precisely, this yields a projection that depends on the diffusion time $t$.

Figure 6. The direct projection of $\delta_{x_0}$ onto the span of the low-frequency Fourier modes and the parametrized projection $p_t$. We observe a tradeoff between localization and decay.

Basic intuition tells us that, depending on the speed of propagation within the dynamical system, at least for small values of $t$ the projection will still look pretty much like the direct projection. Our main observation is that for sufficiently small values of $t$, the projection will be pretty much as localized as $D_N$ but will have much better decay properties. We illustrate this again on the torus, where

$$p_t(x) = \sum_{|n| \leq N} e^{-n^2 t}\, e^{in(x - x_0)}.$$

We observe that the limit $N \to \infty$ is given by the Jacobi theta function

$$\theta(x - x_0, t) = \sum_{n \in \mathbb{Z}} e^{-n^2 t}\, e^{in(x - x_0)}.$$

The Jacobi theta function has exponential decay in the region $|x - x_0| \gtrsim \sqrt{t}$ and, what is especially useful, the exponential decay in the coefficients implies that a cutoff at frequency $N$ is only going to introduce a small error. In particular, in this setting, the threshold value for $t$ (below which localization properties are not significantly worse) is $t \sim N^{-2}$. This example generalizes immediately to higher dimensions, where the situation is more or less identical: the direct projection creates oscillations with slow decay throughout the entire space, while the diffused projection leads to a more localized representation. Returning to our original problem for matrices $A$, the relationship to $\phi_k$ is easily explained: in the case of localized eigenvectors, we see that if $v$ is the eigenvector associated to the largest eigenvalue, localized around coordinate $i$, then $A^k e_i / \|A^k e_i\|$ is a very good local approximation of $v$. The proof of this simple statement proceeds along the very same lines as the proof of our main theorem.
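The tradeoff between the two projections is easy to see numerically (a sketch with illustrative parameters $N = 20$ and $t$ a small multiple of the threshold scale $N^{-2}$ discussed above):

```python
import numpy as np

N = 20
t = 4.0 / N**2                          # a small multiple of the threshold scale N^{-2}
x = np.linspace(-np.pi, np.pi, 2001)
n = np.arange(-N, N + 1)

dirichlet = np.sum(np.cos(np.outer(x, n)), axis=1)                      # direct projection
damped = np.sum(np.exp(-(n**2) * t) * np.cos(np.outer(x, n)), axis=1)   # diffused projection
# 'damped' has a somewhat wider peak at 0 but decays much faster away from it.
```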

References

  • [1] P. W. Anderson, Absence of Diffusion in Certain Random Lattices, Phys. Rev. 109 (1958), 1492–1505.
  • [2] M. Benzi, P. Boito, and N. Razouk, Decay properties of spectral projectors with applications to electronic structure, SIAM Rev. 55 (2013), 3–64.
  • [3] A. Bovier, M. Eckhoff, V. Gayrard and M. Klein, Metastability and low lying spectra in reversible Markov chains. Comm. Math. Phys. 228 (2002), no. 2, 219–255.
  • [4] A. Bovier, M. Eckhoff, V. Gayrard, and M. Klein, Metastability in reversible diffusion processes. I. Sharp asymptotics for capacities and exit times. J. Eur. Math. Soc. 6 (2004), no. 4, 399–424.
  • [5] A. Bovier, V. Gayrard and M. Klein, Metastability in reversible diffusion processes. II. Precise asymptotics for small eigenvalues. J. Eur. Math. Soc. 7 (2005), no. 1, 69–99.
  • [6] X. Cheng, M. Rachh and S. Steinerberger, On the Diffusion Geometry of Graph Laplacians and Applications, preprint, arXiv:1611.03033.
  • [7] Weinan E, Tiejun Li, and Jianfeng Lu, Localized bases of eigensubspaces and operator compression, Proc. Natl. Acad. Sci. USA, 107 (2010), 1273–1278.
  • [8] N. Halko, P. G. Martinsson, and J. A. Tropp, Finding Structure with Randomness: Probabilistic Algorithms for Constructing Approximate Matrix Decompositions, SIAM Rev. 53 (2011), no. 2, 217–288.
  • [9] C. Kenney and A. Laub, Small-sample statistical condition estimates for general matrix functions. SIAM J. Sci. Comput. 15 (1994), no. 1, 36–61.
  • [10] G. Lefebvre, A. Gondel, M. Dubois, M. Atlan, F. Feppon, A. Labbe, C. Gillot, A. Garelli, M. Ernoult, S. Mayboroda, M. Filoche, P. Sebbah, One single static measurement predicts wave localization in complex structures, Phys. Rev. Lett. 117 (2016), 074301.
  • [11] M. Filoche and S. Mayboroda, Universal mechanism for Anderson and weak localization. Proc. Natl. Acad. Sci. USA 109 (2012), no. 37, 14761-14766.
  • [12] M. Filoche and S. Mayboroda, The landscape of Anderson localization in a disordered medium, Contemporary Mathematics, 601 (2013), 113–121.
  • [13] L. Lin and J. Lu, Decay estimates of discretized Green’s functions for Schrödinger type operators, Sci. China Math. 59 (2016) 1561–1578.
  • [14] M. Lyra, S. Mayboroda and M. Filoche, Dual hidden landscapes in Anderson localization on discrete lattices, Euro. Phys. Lett. 109 (2015), 4700.
  • [15] N. Marzari, A. A. Mostofi, J. R. Yates, I. Souza, D. Vanderbilt, Maximally localized Wannier functions: Theory and applications, Rev. Mod. Phys. 84 (2012), 1419–1475.
  • [16] S. Steinerberger, Localization of Quantum States and Landscape Functions, Proc. Amer. Math. Soc., accepted.
  • [17] S. Vempala, The random projection method. With a foreword by Christos H. Papadimitriou. DIMACS Series in Discrete Mathematics and Theoretical Computer Science, 65. American Mathematical Society, Providence, RI, 2004.
  • [18] G. Wannier, The Structure of Electronic Excitation Levels in Insulating Crystals, Phys. Rev. 52 (1937), 191.
  • [19] H. Zou, T. Hastie, R. Tibshirani, Sparse principal component analysis, J. Comput. Graph. Stat. 15 (2006), 262–286.