How many eigenvalues of a Gaussian random matrix are positive?

How many eigenvalues of a Gaussian random matrix are positive?

Abstract

We study the probability distribution of the index , i.e., the number of positive eigenvalues of an Gaussian random matrix. We show analytically that, for large and large with the fraction of positive eigenvalues fixed, the index distribution where is the Dyson index characterizing the Gaussian ensemble. The associated large deviation rate function is computed explicitly for all . It is independent of and displays a quadratic form modulated by a logarithmic singularity around . As a consequence, the distribution of the index has a Gaussian form near the peak, but with a variance of index fluctuations growing as for large . For , this result is independently confirmed against an exact finite formula, yielding for large , where the constant has the nontrivial value and is the Euler constant. We also determine for large the probability that the interval is free of eigenvalues. Part of these results have been announced in a recent letter [Phys. Rev. Lett. 103, 220603 (2009)].

Gaussian random matrices, large deviations, Coulomb gas method, index
pacs:
02.50.-r; 02.10.Yn; 24.60.-k

I Introduction

Statistical properties of eigenvalues of random matrices have been extensively studied for decades, stemming from the seminal work of Wigner [[1]]. Random Matrix Theory (RMT) has successfully provided tools and methods to disparate areas of physics and mathematics [[2]], with countless applications so far. Statistics of several random variables associated with random eigenvalues have been studied extensively. This includes the length of a gap in the eigenvalue spectra, number of eigenvalues in a given interval, the largest eigenvalue, the trace etc. [[2]]. Most studies concerned with the probability of typical fluctuation of such a random variable around its mean.

However, various recent applications of random matrix theory have posed questions regarding atypical large fluctuations of such random variables associated with the eigenvalues, thus triggering a number of recent studies on the large deviation probabilities of such random variables. This includes, for instance, the large deviation probability of the extreme (maximum and minimum) eigenvalues of Gaussian [[3], [4], [5], [6], [7]] and Wishart random matrices [[8], [4], [9]], of the number of stationary points of random Gaussian landscapes [[10], [11]], of the distribution of free energies in mean-field spin glass models [[12], [13]], of the conductance and shot noise power in chaotic mesoscopic cavities [[14], [15]], of the entanglement entropy of a pure random state of a bipartite quantum system [[16], [17], [18], [19]] and of the mutual information in multiple input multiple output (MIMO) channels [[20]]. In addition, random matrix theory has been used to understand large deviation properties of various observables in the so called vicious walker (or nonintersecting Brownian motion) problem [[21], [22], [23], [24]]. The purpose of the present paper is to provide a detailed analysis of the large deviation properties of another natural random variable for large Gaussian matrices, namely the fraction of positive eigenvalues of an Gaussian matrix. Part of the main results presented here were announced in a recent Letter [[25]]. We will explain shortly why this fraction is a natural observable that arises in a number of physical situations. But before we do that, it is useful to recall some well-known facts about Gaussian matrices.

There are three families of Gaussian random matrices with real spectrum: orthogonal (GOE), unitary (GUE) and symplectic (GSE). The matrices belonging to these families are real symmetric, complex hermitian and quaternion self-dual respectively, whose entries are independent Gaussian variables (real, complex or quaternions) labeled by the Dyson index respectively. The probability distribution of the entries of a matrix is then given by the Gaussian weight:

(1)

where stands for the inner product on the space of matrices invariant under orthogonal, unitary and symplectic transformation respectively. Explicitly, one has:

(2)
(3)
(4)

where denotes hermitian conjugation and the quaternion self-dual. The celebrated result by Wigner states that for large matrix size , the average density of eigenvalues (all real) for such ensembles has a -independent semicircular form [[1], [2]]

(5)

which vanishes identically at the two edges and is normalized to unity. Clearly, the mean spacing between eigenvalues in the bulk, i.e., close to the origin, behaves for large as .

A natural and much studied question that goes back to Dyson [[26]] is: how many eigenvalues are there in a given interval on the real line? Clearly this number is a random variable that fluctuates from one sample to another. Its mean value, for large , is easy to compute by integrating the semi-circular average density in (5) over the interval : . But how does this number fluctuate from one sample to another? Dyson studied this number fluctuation in the so called bulk limit, i.e., he focused on a small symmetric interval around the origin where is the mean bulk spacing and is kept fixed while one takes the limit. Let denote the number of eigenvalues in this interval. Clearly, the mean number of eigenvalues . But Dyson also computed the variance of in the large limit (with fixed) and showed that for large the variance grows logarithmically with

(6)

and the constant was computed by Dyson and Mehta [[27]]. Thus the typical fluctuations of grow as for large . More recently, even the higher moments of (in the limit with fixed) were computed which proved that on a scale of around the mean , the random variable has a Gaussian distribution [[28], [29]].

Here our focus will be on a different limit, namely we study the statistics of the number of eigenvalues, not on a small symmetric interval around the origin (i.e, the bulk limit), but rather on the full unbounded interval . In other words, we are interested simply in the distribution of the number of positive eigenvalues (called the index) of a Gaussian random matrix . Since the average density of states is symmetric in , it is clear that on average there are positive eigenvalues. Clearly the index fluctuates from one realization of the matrix to another and in this paper, we are precisely interested in the fluctuation properties of the random variable , i.e., in the full probability distribution . Evidently, . Also, the number of negative eigenvalues is distributed identically as the number of positive eigenvalues by virtue of the Gaussian symmetry, indicating . Hence the distribution of is clearly symmetric around its mean value . It thus suffices to study the range .

So, why are we interested in this index distribution? This question naturally arises in the study of the stability patterns associated with a multidimensional potential landscape  [[30]]. For instance, in the context of glassy systems, the point represents a configuration of the system and is just the energy of the configuration [[31]]. Similarly, in the context of disordered systems or spin glasses, may represent the free energy landscape. In the context of string theory, may represent the potential associated with a moduli space [[32]]. Typically such an -dimensional landscape has many stationary points (minima, maxima and saddles) with complex stability patterns that play an important role both in statics and dynamics of such systems [[30]]. The stability of a stationary point of this -dimensional landscape is decided by the real eigenvalues of the Hessian matrix which is symmetric. If all the eigenvalues are positive (negative), the stationary point is a local minimum (local maximum). If some, but not all, are positive then the stationary point is a saddle. The number of positive eigenvalues (the index), , is then a key object that determines in how many directions the stationary point is stable. Given a random potential , the entries of the Hessian matrix at a stationary point are usually correlated. However, in many situations, important insights can be obtained by ignoring these correlations and just assuming the entries of the Hessian matrix are just independent Gaussian variables. This then leads to the study of the statistics of index for a GOE matrix. This toy model, called the random Hessian model (RHM), has been studied extensively in the context of disordered systems [[31]], landscape based string theory [[33]] and also in quantum cosmology [[34]]. Although in RHM , it is quite natural to study the index distribution for other Gaussian ensembles, namely for GUE () and GSE ().

For the GOE (), the statistics of was studied by Cavagna et al. [[31]] using supersymmetric replica methods and some additional approximations. They argued that around its mean value , the random variable has typical fluctuations of for large . Moreover, the distribution of these typical fluctuations is Gaussian. In other words, over a region of width , the distribution for large is given by [[31]]

(7)

implying that for , for large .

On the other hand, this Gaussian form does not describe the atypically large fluctuations of . For example, in the extreme limit when , the probability that all eigenvalues are positive was computed recently for large and for all  [[3]],

(8)

This question of the probability of extreme large fluctuation of (fluctuation on a scale around its mean ) naturally came up in several recent contexts such as in landscape based string theory [[33]], quantum cosmology [[34]] and in the distribution of the number of minima of a random polynomial [[35]].

These two rather different forms of the distribution in the two limits, namely in the vicinity of (over a scale of ) (as in (7)) and when (as in (8)) raise an interesting question: what is the form of the distribution for intermediate values of ? In other words, how does one interpolate between the limits of typically small and atypically large fluctuations? To answer this question, it is natural to set where the intensive variable denotes the fraction of positive eigenvalues and study the large limit of the distribution with fixed. Again, due to the Gaussian symmetry, and it is sufficient to restrict in the range .

In a recent Letter [[25]], we computed the large limit of the distribution in the full range for all and showed that

(9)

where the rate function , independent of , was computed explicitly for all 1. The fact that the logarithm of the probability is for fixed is quite natural, as it represents the free energy of an associated Coulomb fluid of charges (eigenvalues) (to be discussed in detail later). The Coulomb energy of charges clearly scales as . In the limit , we get in agreement with (8). The distribution is thus highly non-Gaussian near its tails. In the opposite limit , we find a marginally quadratic behavior, modulated by a logarithmic singularity

(10)

Setting and substituting this form in (9), we find that in the vicinity of and over a scale of , indeed one recovers the Gaussian distribution

(11)

thus proving that the variance for large and for all . For , this perfectly agrees with the results of Cavagna et al. [[31]].

In addition to obtaining the full distribution of the fraction of positive eigenvalues , our Coulomb gas approach also provides a new method of finding solutions to singular integral equation with two disconnected supports, as discussed in detail later. This method is rather general and can be fruitfully applied to other related problems in RMT, an example is later provided in the paper in calculating the probability that an interval is free of eigenvalues, i.e., there is a gap in the spectrum. The details of these calculations are somewhat involved and were not presented in our previous Letter [[25]]. The purpose of this paper is to provide these details which we believe will be important for other problems as well.

The paper is organized as follows. In Section II.A we set up the problem and show that the rate function can be computed via the solution of a singular integral equation on a disconnected support. In subsections II.B and II.C, we provide two different strategies to find such a solution, the first based on a scalar Riemann-Hilbert ansatz and the second based on an iterated application of a theorem by Tricomi. In subsection II.D we derive the free energy of the associated Coulomb gas and the large deviation function associated with the index distribution. In subsection II.E we provide an asymptotic analysis of near and determine the variance of the index for large matrix size . In section III we provide details of numerical simulations. As an application of the general method for solving two-support integral equation, we compute in section IV, the probability that a Gaussian random matrix has a gap in the spectrum. In section V we offer a derivation of a determinantal formula for the variance of the index at finite for . Finally, we conclude with a summary in section VI.

Ii The probability distribution of the index

ii.1 Setting and Notation

We consider the standard Gaussian ensembles of random matrices with Dyson index , corresponding to real, complex and quaternion entries respectively. The probability distribution of the entries is given in (1) and consequently the joint probability density of eigenvalues reads [[2]]

(12)

where is the normalization constant which can be explicitly computed via a Selberg-like integral [[2]] and to leading order for large , where  [[3]].

We wish to compute the probability distribution of the index , defined as the number of positive eigenvalues of the matrix :

(13)

By definition:

(14)

We will set where is the fraction of positive eigenvalues. As mentioned in the introduction, due to the Gaussian symmetry, the number of positive eigenvalues will have the same distribution as the number of negative eigenvalues . Hence, (the distribution is symmetric around ). Thus, it is sufficient to focus only on the range .

The evaluation of the -fold integral (14) in the large limit consists of the following steps: first, we write the integrand (ignoring the delta function) as, with . Written in this form, the integral has a natural interpretation as the partition function of a Coulomb gas in equilibrium at inverse temperature . We can identify ’s as the coordinates of the charges of a -d fluid confined on the real axis. The charges repel each other via the -d logarithmic Coulomb potential and are confined by a quadratic external potential. Then is the energy of this Coulomb gas. Furthermore, the Coulomb energy scales, for large , as (since it involves pairwise interaction between charges). In contrast, the external potential energy scales as where is a typical eigenvalue. Balancing the two energy scales, one finds that a typical eigenvalue scales as for large .

The next step is to evaluate this partition function of the Coulomb gas in the large limit via the saddle point method. In the large N limit, the eigenvalues become rather dense and one can then take a continuum limit where one replaces the integration over the discrete eigenvalues by a functional integral over the density of these eigenvalues. Originally introduced by Dyson [[26]], this procedure (see also  [[36]]) has recently been successfully used in a number of different contexts. These include the computation of the extreme eigenvalue distribution of Gaussian [[3], [4]] and Wishart random matrices [[8], [4], [9]], counting the number of stationary points of random Gaussian landscapes [[10], [11]], and computing the distribution of the bipartite quantum entanglement  [[16], [17], [18]]. In addition, this method has also been used recently in systems such as nonintersecting fluctuating interfaces in presence of a substrate [[22]], in computing the distribution of conductance and shot noise power in mesoscopic cavities [[14], [15]] and in the study of multiple input multiple output (MIMO) channels [[20]].

Dyson’s prescription requires first a coarse-graining procedure, where one sums over (partial tracing) all microscopic configurations of ’s compatible with a fixed charge density function . Secondly, one performs a functional integral over all possible positive charge densities normalized to unity. Finally the functional integral is carried out in the large limit by the saddle point method.

Following this prescription, we introduce a continuum fluid representation for the Coulomb cloud of eigenvalues with density . Since , it follows that the normalized density should have the scaling form for large . The scaled density satisfies the obvious normalization conditions:

(15)
(16)

where we have set with being the fraction of positive eigenvalues. The probability density (14) can then be rewritten as a functional integral over as:

(17)

where the numerator reads:

(18)

where are Lagrange multipliers enforcing the normalization conditions (15) and (16).

We define the action as:

(19)

Evaluating (18) by the method of steepest descent and using the large asymptotics of the denominator in (17) gives, to leading order for large ,

(20)
(21)

where [[3]] and is the solution of the saddle point equation

(22)

The function can be interpreted as the equilibrium (or optimal) charge density of the eigenvalue fluid, given a fixed fraction of positive charges. Once we obtain the solution of the integral equation (22), we can evaluate the saddle point action in (20), and together with (21) one then gets the index distribution

(23)

where is the large deviation function.

Thus all we have to do is to solve the saddle point equation (22) for a fixed . To avoid the Lagrange multipliers, it is convenient to differentiate (22) with respect to and for (), one gets the integral equation

(24)

(where denotes Cauchy’s principal value), supplemented with the constraints:

(25)
(26)

Singular integral equations of this type have been studied by Tricomi [[37]], who derived an explicit formula for the solution in the case when the solution is nonzero over a single finite connected support where and are respectively the lower and the upper end of the support. Tricomi’s theorem states that the general solution to singular integral equations of the form

(27)

over the interval with (where the source function is given and arbitrary) is [[37]]:

(28)

where is a constant.

Let us then first assume that indeed the solution of (24), with the source function , has a single support over . Substituting , one can evaluate the integral in (28) explicitly to obtain

(29)

where we have used the normalization condition to set the constant . There are two unknown constants , which are to be fixed from the constraint (26) and the consistency condition that the solution (which represents a density) must be non-negative over . At the two endpoints and , the solution either vanishes or has an inverse square root divergence (which is integrable). If we try to evaluate these constants, it is easy to check that a non-negative consistent solution is possible only for two limiting values of , namely and . Let us discuss these two cases first.

The case : In this case, the solution must be symmetric which indicates . In addition, it is clear physically that the solution must vanish at the endpoints and . This fixes and the solution in (29) reduces to the Wigner semicircle law, namely

(30)

This is reassuring and is expected for the following reason: if there was no constraint at all on the fraction of positive eigenvalues, the system would naturally choose to have half the eigenvalues positive and half negative on average, implying , and the equilibrium charge density would be the standard Wigner’s semicircle law.

The case : In the other extreme limit where all the eigenvalues are forced to be positive, one can again find a consistent solution from (29) that satisfies all the constraints and is given by

(31)

where . In this case, the support is over with , . Note that this solution vanishes at the upper edge and diverges as at the lower edge . This explicit solution was first obtained in  [[3]].

It turns out that for other values of , there is no single support solution (29) that satisfies the constraint (26) and is non-negative for all . To see what is going wrong, it was instructive to perform numerical simulation (the details of which will be described later) for . For example, for , the optimal density is given in Fig. (1). It is evident from the figure that for , indeed there are two disconnected supports of the optimal charge density .

Figure 1: Analytical density in (32) for (solid black) together with results from i) (red) numerical diagonalization of matrices of size , where only samples having positive eigenvalues were retained for the statistics (), and ii) (blue) Montecarlo simulations of the Coulomb fluid with particles.

A similar feature actually holds for all . As from below, the area under the left support vanishes and we are left with a single support over as in (31). On the other hand, as decreases continuously, the area under the left support grows and the upper edge of the left support (always on the negative side) also increases. Finally when hits , the two supports merge into a single support, symmetric about the origin, and reduces to the Wigner semicircle law (see Fig. (2)).

Figure 2: The optimal density of eigenvalues (Eq. (32)) for (red), (green) and (blue).

Hence, it is not surprising that we cannot obtain any consistent single support solution using Tricomi’s result in (29) for , as the optimal density does not have a single support but rather two disconnected supports. The technical reason for the two-support solution can indeed be traced back to the jump discontinuity at due to the Heaviside theta function in the saddle point equation (22). So, the main technical challenge is how to obtain analytically an explicit two-support solution of the integral equation (24) for all , given that we cannot use the Tricomi solution any more. This is an interesting mathematical challenge since such two-support solutions appear in other problems as well and a general method would be very useful. This is what we achieve here as detailed in the next two subsections. In fact we will present two different approaches producing the same results. But before we get into the technical details of the two methods, it may be useful to summarize here the main result.

We show that the solution of (24) satisfying the constraints (25) and (26) and the condition of non-negativity, for all is given by

(32)

where

(33)

and the parameter is determined implicitly as a function of from (26) by the condition:

(34)

For general , the equilibrium density (32) has support on the union of two disconnected intervals

(35)

One can easily check that in the two limiting cases and , our general solution reduces respectively to (30) and (31).

  • : this corresponds to having no negative eigenvalues at all, thus the equilibrium density must match the solution in [[3]] at . This is achieved as long as and thus as expected (compare to [[3]]). Then the blob of negative eigenvalues in (32) (see (35)) collapses to a single point and vanishes.

  • : this case represents the usual Wigner’s semicircle and is recovered from (32) when and consequently . In this case, the support (35) becomes compact as it should.

In the next two subsections, we provide two alternative derivations of (32), the first one based on a scalar Riemann-Hilbert ansatz and the second one based on an iterated application of Tricomi’s single support solution.

ii.2 Method I: proof of (32) via Riemann-Hilbert ansatz

In the context of counting of planar diagrams, Brezin et. al. [[38]] encountered singular integral equation of the type (27) with a single support solution. They did not use the explicit Tricomi solution, but instead developed an alternative method using a scalar Riemann-Hilbert ansatz. This method makes use of properties of analytic functions in the complex plane. Even though the method requires making a guess or ansatz (verified a posteriori), it turns out to be rather useful. This method can be generalized in a straightforward manner to the case when the solution has multiple disconnected supports and has been used before in other contexts [an example in a specific case can be found in the appendix of  [[15]], see also [[39]]]. Let us illustrate below the main idea behind this method.

Let us consider the singular integral equation

(36)

where the solution has support on the union of a finite number of intervals on the real line and is normalized to unity: . The next step is to define a complex function (without the principal part)

(37)

in the complex plane. The function has the following properties:

  1. it is analytic everywhere in the complex plane outside the cuts on the real line

  2. it behaves as when since due to the normalization,

  3. it is real for real outside the cuts

  4. as one approaches to any point on the cuts on the real axis, . This is a consequence of (36). Thus, .

The general theory of analytic functions in the complex plane tells us that there is a unique function which satisfies all the four properties mentioned above. Thus, if one can make a good guess or ansatz for the function and verifies that it satisfies all the above properties, then this is unique. Knowing , one can then read off the solution using the -th property mentioned above.

In our case, , and from the simulation results we already know that there are only two supports for , one on the positive side and one on the negative side. To make a good guess for , let us reexamine the precise form of the solution in the two limiting cases and

(38)
(39)

with . For intermediate values of , we then seek a sensible two-support ansatz that interpolates between (38) and (39). A suitable ansatz, that is verified a posteriori, is

(40)

which has support over . The unknown parameters depend on in such a way that for , and for , . We can then make the following guess for the function , valid everywhere in the complex plane , except on the cuts on the real axis

(41)

It is easy to check that the definition (41) indeed satisfies all the four properties mentioned above and hence is unique. From the 4-th property mentioned above, namely taking the limit with , it follows that that is indeed given by (40).

To fix the parameters , and , we will use the 2nd property of mentioned above, namely that as , . Expanding in (41) for large we get

(42)

where

(43)

Imposing the exact asymptotic decay for large , we immediately get the two conditions

(44)
(45)

which leads to

(46)
(47)

as stated in (32). Thus, we are left with only one unknown parameter . This is fixed from the normalization condition leading to (34) which determines implicitly as a function .

ii.3 Method II: proof of (32) via double iteration of the Tricomi solution

While the method (I) presented in the previous subsection, for finding the solution with two disconnected supports of the integral equation (36) with , is rather elegant it has the drawback that one has to make a judicious guess for the function . It is thus desirable to find a method where one does not need to guess. We show in this subsection that indeed it is possible to obtain an explicit two-support solution to (36) without making an a priori guess. The main idea behind this new method (II) is to actually use the Tricomi single support solution twice. Let us first outline below the basic principle behind this idea which turns out to be rather general and works for arbitrary source function in (36).

We consider again the integral equation

(48)

where is assumed to have nonzero solution over two connected components , with . Note that the equation (48) holds for and also for . Let us write the solution as

(49)

Then (48) can be divided into two parts (respectively for the left and the right supports) and rewritten as

(50)
(51)

Note that for , the integral over becomes an ordinary integral (as there is no pole and we can drop the ) and similarly for the other side.

The main idea then is to eliminate say from these two equations and obtain a single integral equation for . This is carried out in the following way. For , (50) can be rewritten as

(52)

The solution has a single support over . Hence we can now use the explicit Tricomi solution (28) (replacing in (28) by the new effective source function ) to express (for ) as a functional of where . Next, we use this explicit solution for in the second equation (51) and thus obtain a single integral equation involving . It turns out that for arbitrary , this integral equation for can be recast, with a suitable multiplicative factor, in the same form as (27) and since has only a single support over , one can again use the Tricomi solution (28) to explicitly obtain . This is the general programme. Below we show how the steps actually work out. Even though the method is quite general and works for arbitrary , let us focus below on our specific case just for simplicity.

Our basic saddle point equation reads

(53)

where the density must also satisfy the two constraints (25) and (26):

(54)

The solution is expected to have support over two disconnected components , with . For consistency, we expect . We also expect if (or otherwise with no constraint on ), and similarly if . We divide into two parts as in (49). The constraints thus become:

(55)

We then apply Tricomi’s theorem (28) to (52) with to determine on the interval and obtain