Empirical spectral distributions of sparse random graphs

Amir Dembo
Department of Mathematics
Stanford University
Sloan Hall
Stanford, CA 94305, USA.
amir@math.stanford.edu

Eyal Lubetzky
Courant Institute
New York University
251 Mercer Street
New York, NY 10012, USA.
eyal@courant.nyu.edu

Abstract.

We study the spectrum of a random multigraph with a degree sequence and average degree , generated by the configuration model. We show that, when the empirical spectral distribution (esd) of converges weakly to a limit , under mild moment assumptions (e.g., are i.i.d. with a finite second moment), the esd of the normalized adjacency matrix converges in probability to , the free multiplicative convolution of with the semicircle law. Relating this limit with a variant of the Marchenko–Pastur law yields the continuity of its density (away from zero), and an effective procedure for determining its support.

Our proof of convergence is based on a coupling of the graph to an inhomogeneous Erdős–Rényi graph with the target esd, using three intermediate random graphs, with a negligible number of edges modified in each step.

Key words and phrases:
random matrices, empirical spectral distribution, random graphs
2010 Mathematics Subject Classification:
05C80, 60B20

1. Introduction

We study the spectrum of a random multigraph with degrees , constructed by the configuration model (associating vertex with half-edges and drawing a uniform matching of all half-edges), where has

(1.1)

Letting denote the adjacency matrix of , it is well-known (see, e.g., [1]) that, for random regular graphs, i.e., the case of for all with , the empirical spectral distribution (esd, defined for a symmetric matrix with eigenvalues as ) of the normalized matrix converges weakly, in probability, to , the standard semicircle law (with support ).
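As a rough numerical illustration of the construction and of the semicircle limit just recalled, the following Python sketch samples a regular multigraph from the configuration model and compares the spectrum of its normalized adjacency matrix with the semicircle law on [-2, 2]. The function names, the use of NumPy, the convention that a loop adds 2 to the diagonal, and the normalization by the square root of the common degree are assumptions of this sketch rather than notation taken from the text.

```python
import numpy as np


def configuration_model(degrees, rng):
    """Adjacency matrix of a multigraph drawn from the configuration model."""
    degrees = np.asarray(degrees, dtype=int)
    if degrees.sum() % 2 != 0:
        raise ValueError("the degree sum must be even")
    n = len(degrees)
    # Vertex i owns degrees[i] half-edges; stubs[k] is the owner of half-edge k.
    stubs = np.repeat(np.arange(n), degrees)
    # Shuffling and pairing consecutive entries yields a uniform perfect matching.
    rng.shuffle(stubs)
    u, v = stubs[0::2], stubs[1::2]
    A = np.zeros((n, n))
    np.add.at(A, (u, v), 1)
    np.add.at(A, (v, u), 1)  # assumption: a loop at i contributes 2 to A[i, i]
    return A


def normalized_spectrum(A, scale):
    """Eigenvalues of A / sqrt(scale), in ascending order."""
    return np.linalg.eigvalsh(A / np.sqrt(scale))


def semicircle_mass(a):
    """Mass that the standard semicircle law (support [-2, 2]) gives to [-a, a]."""
    a = min(abs(a), 2.0)
    return (a * np.sqrt(4.0 - a * a) / 2.0 + 2.0 * np.arcsin(a / 2.0)) / np.pi


rng = np.random.default_rng(0)
n, d = 400, 20  # illustrative sizes only
A = configuration_model([d] * n, rng)
eigs = normalized_spectrum(A, d)
print("empirical mass of [-1, 1]: ", np.mean(np.abs(eigs) <= 1.0))
print("semicircle mass of [-1, 1]:", semicircle_mass(1.0))
```

With this loop convention every row sum of the matrix equals the prescribed degree. The largest eigenvalue of a regular graph sits at the degree itself, so after normalization it lies outside [-2, 2]; this single outlier does not affect weak convergence of the esd.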

The non-regular case with has been studied by Bordenave and Lelarge [4] when the graphs converge in the Benjamini–Schramm sense, translating in the above setup to having that are i.i.d. in and uniformly integrable in . The existence and uniqueness of the limiting esd was obtained in [4] by relating the Stieltjes transform of the esd to a recursive distributional equation (arising from the resolvent of the Galton–Watson trees corresponding to the local neighborhoods in ). Note that (a) this approach relies on the locally-tree-like structure of the graphs, and is thus tailored for low (at most logarithmic) degrees; and (b) very little is known on this limit, even in seemingly simple settings such as when all degrees are either 3 or 4.

At the other extreme, when diverges polynomially with (whence the tree approximations are invalid), the trace method—the standard tool for establishing the convergence of the esd of an Erdős–Rényi random graph to —faces the obstacle of nonnegligible dependencies between the edges in the configuration model.

In this work, we study the spectrum of via a sequence of approximation steps, each of which couples the multigraph with one that forgoes some of the dependencies, until finally arriving at a tractable Erdős–Rényi (inhomogeneous) random graph.

Our assumptions on the triangular sequence are that they correspond to a sparse multigraph, that is (1.1) holds, and, in addition, there exists some such that

(1.2)

w.r.t. which the normalized degrees satisfy that

(1.3)

where is uniformly chosen in . Let

Theorem 1.1.

Let be the random multigraph with degrees such that (1.1)–(1.3) hold, and suppose that the esd converges weakly to a limit . Then the esd converges weakly, in probability, to , the free multiplicative convolution of with the standard semicircle law .

Remark 1.2.

The free multiplicative convolution was defined for probability measures of non-zero mean, in terms of their -transform, first ([10]) for measures with bounded support, and then ([3]) for measures supported on . Following the extension in [7] of the -transform to measures of zero mean and finite moments of all orders, [2, Theorem 6] provides the -transform for symmetric measures and [2, Theorem 7] correspondingly defines the free multiplicative convolution of such with supported on , a special case of which appears in Theorem 1.1.

Remark 1.3.

The standard goe random matrix (or any Wigner matrix whose i.i.d. entries have finite moments of all order), is asymptotically free of any uniformly bounded diagonal (see, e.g., [1, Theorem 5.4.5]). With the spectral radius of the goe bounded by up to an exponentially small probability, a truncation argument extends the validity of [1, Corollary 5.4.11] to show that is then also the weak limit of the esd for the random matrices .
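The remark can be explored numerically with the sketch below. It assumes, purely for illustration, that the random matrices in question are obtained by conjugating a goe matrix X with the square root of a nonnegative diagonal matrix D, i.e., they have the form D^{1/2} X D^{1/2}; the names `goe` and `sandwich`, the two-atom choice of the diagonal, and the matrix size are hypothetical. The check uses only a standard fact from free probability: if a and s are free, with s centered and of unit variance, then the second moment of a^{1/2} s a^{1/2} equals the square of the mean of a.

```python
import numpy as np


def goe(n, rng):
    """Symmetric Gaussian matrix with off-diagonal variance 1/n (spectrum near [-2, 2])."""
    G = rng.standard_normal((n, n))
    return (G + G.T) / np.sqrt(2 * n)


def sandwich(diag, X):
    """Return D^{1/2} X D^{1/2} for the diagonal matrix D with nonnegative entries diag."""
    root = np.sqrt(np.asarray(diag, dtype=float))
    return root[:, None] * X * root[None, :]


rng = np.random.default_rng(0)
n = 400
diag = rng.choice([0.5, 1.5], size=n)  # assumed two-atom law for the diagonal entries
M = sandwich(diag, goe(n, rng))
eigs = np.linalg.eigvalsh(M)

# Second moment of the esd versus the squared mean of the diagonal entries.
print("second moment of the esd:  ", np.mean(eigs ** 2))
print("squared mean of the diagonal:", diag.mean() ** 2)
```

Matching a single moment is of course only a sanity check, not evidence of convergence of the full distribution.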

Corollary 1.4.

Let be i.i.d. for each , such that , , and the law of converges weakly to some . For every sequence such that and , if is the random multigraph with degrees (modifying by if needed for an even sum of the degrees), then the esd converges weakly, in probability, to .

Our convergence results, Theorem 1.1 and Corollary 1.4, are proved in §2. We note that, using the same approach, analogs of these results can be derived for the case of uniformly chosen simple graphs under an extra assumption on the maximal degree, e.g., , whereby the effect of loops and multiple edges is negligible.

The next two propositions, proved in §3, relate the limiting measure with a Marchenko–Pastur law, and thereby, via [9], yield its support and density regularity.

Figure 1. Spectra of two random multigraphs on vertices with different degree sequences . In red, for all , and in blue, for and for , with i.i.d. (bottom plot). The limiting law for the esd, shown by Theorem 1.1 to be , is plotted in black (top plot).

Proposition 1.5.

Let be the law of a nonnegative random variable with . The free multiplicative convolution has the Cauchy–Stieltjes transform

(1.4)

where the symmetric probability measure is the push forward under of the Marchenko–Pastur limit (on ) of the esd of , in which the non-symmetric has standard i.i.d. complex Gaussian entries and for non-negative diagonal matrices and the size-biased such that on .

Remark 1.6.

With denoting the push-forward of by the map (that is, the weak limit of ), we have similarly to Remark 1.3 that , where the push-forward (of density on ), is the limiting empirical distribution of singular values of .

Figure 2. Recovering the support of the limiting esd. Left: esd of the random multigraph on vertices with degrees , , in frequencies , , , resp. Right: from Remark 1.8.

Recall [9, Lemma 3.1, Lemma 3.2] that is uniformly bounded on away from the imaginary axis, and [9, Theorem 1.1] that whenever converges to . Further, the -valued function is continuous on with the corresponding continuous density

(1.5)

being real analytic at any where it is positive. The density of inherits these regularity properties. Bounding uniformly and analyzing the effect of (1.4) we next make similar conclusions about the density of , now also at .

Proposition 1.7.

In the setting of Proposition 1.5, for there is density

(1.6)

which is continuous, symmetric, and moreover real analytic where positive. The support of is , which up to the mapping further matches the support of . In addition , and if then is absolutely continuous (i.e., ).

Remark 1.8.

Recall the unique inverse of on given by

(1.7)

namely throughout (see [9, Eqn. (1.4)]). This inverse extends analytically to a neighborhood of for and [9, Theorems 4.1 & 4.2] show that iff for , where and (thus validating the characterization of which has been proposed in [6]). We show in Lemma 3.1 that everywhere, hence the behavior of at the soft-edges of can be read from the soft-edges of (as in [5, Prop. 2.3]).

Figure 3. Phase diagram for the existence of holes in the limiting esd when is supported on two atoms as given by Corollary 1.9. Left: the region (1.8) (where is not connected) highlighted in blue. Right: zooming in on the emergence of a hole as varies at .

Corollary 1.9.

Suppose of mean one is supported on two atoms . The support of is then disconnected iff

(1.8)

Moreover, when (1.8) holds, consists of exactly two disjoint intervals.

2. Convergence of the ESD’s

The proof of Theorem 1.1 will use the following standard lemma.

Lemma 2.1.

Let be a family of matrices of order , define and . Let denote a family of measures such that

(2.1)
(2.2)
(2.3)

Then the weak limit of as exists and equals .

Proof.

Let be a limit point of , the existence of which is guaranteed by the tightness assumption (2.2). A standard consequence of the Hoffman–Wielandt bound (cf. [1, Lemma 2.1.19]) and Cauchy–Schwarz is that for matrices and of order ,

where is the bounded-Lipschitz metric on the space of probability measures on (see the proof of [1, Theorem 2.1.21]). Thus, by (2.1) and the triangle-inequality for , it follows that

Consequently, as , from which the uniqueness of also follows. ∎
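The Hoffman–Wielandt bound invoked in the proof states that, for symmetric matrices A and B of the same order with eigenvalues listed in increasing order, the sum of squared differences of matching eigenvalues is at most tr((A - B)^2). The sketch below checks this numerically on a sparse symmetric 0/1 matrix and a perturbation of it in a few entries; the function name and the test matrices are illustrative assumptions. Dividing both sides by the order n gives the normalized quantity that, via Cauchy–Schwarz, controls the distance between the two esd's.

```python
import numpy as np


def hoffman_wielandt_gap(A, B):
    """Return (lhs, rhs) of sum_i (lambda_i(A) - lambda_i(B))^2 <= tr((A - B)^2)."""
    eig_a = np.linalg.eigvalsh(A)  # eigenvalues in ascending order
    eig_b = np.linalg.eigvalsh(B)
    lhs = float(np.sum((eig_a - eig_b) ** 2))
    # For symmetric matrices, tr((A - B)^2) is the squared Frobenius norm of A - B.
    rhs = float(np.sum((A - B) ** 2))
    return lhs, rhs


rng = np.random.default_rng(0)
n = 200
A = np.triu((rng.random((n, n)) < 0.05).astype(float), 1)
A = A + A.T  # symmetric 0/1 matrix with zero diagonal

# Flip at most ten symmetric pairs of entries, mimicking a coupling that
# modifies only a few edges.
B = A.copy()
for i, j in rng.integers(0, n, size=(10, 2)):
    if i != j:
        B[i, j] = B[j, i] = 1.0 - B[i, j]

lhs, rhs = hoffman_wielandt_gap(A, B)
print(lhs / n, "<=", rhs / n)
```

Since only a bounded number of entries change, the right-hand side divided by n vanishes as n grows, which is the mechanism behind condition (2.3) in the coupling steps.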

Proof of Theorem 1.1.

In Step I we reduce the proof to dealing with the single-adjacency matrix of , where multiple copies of an edge/loop are replaced by a single one (that is, entry-wise), and further the collection is a fixed finite set . Scaling we rely in Step II on Proposition 2.3 to replace the limit points of by those of for symmetric matrices of independent Bernoulli entries, using the moment method in Step III to relate the latter to the limit of for the matrices of Remark 1.3.

Step I. We claim that if in probability, then the same applies for . This will follow from Lemma 2.1 with and upon verifying that

(2.4)

Indeed, condition (2.1) has been assumed; condition (2.2) follows from the fact that

so in particular , yielding tightness; and condition (2.3) holds in probability by (2.4) and Markov’s inequality. To establish (2.4), observe that, for every and we have for and , whereas for every and such that . Thus,

Since uniformly over , we take wlog , yielding for large

by our assumption that . Considering hereafter only single-adjacency matrices, we proceed to reduce the problem to the case where the variables are all supported on a finite set. To this end, let for and

where are continuity points of of interdistances in , which are furthermore in for some irrational . Let

possibly deleting one half-edge from if needed to make even.

Observation 2.2.

Let be two degree sequences with for all , and let be a random multigraph with degrees generated by the configuration model. Construct by (a) marking a uniformly chosen subset of half-edges of vertex blue, independently; (b) retaining every edge that has two blue endpoints; and (c) adding an independent uniform matching on all other blue half-edges. Then has the law of the random multigraph with degrees generated by the configuration model.

(Indeed, since the configuration model matches the half-edges in via a uniformly chosen perfect matching, and the coloring step (a) is performed independently of this matching, it follows that the induced matching on the subset of blue half-edges that are matched to blue counterparts, namely the edges retained in step (b), is uniform.) Using this, and noting that for all , let be the following random multigraph with degrees , coupled to the already-constructed :

  1. For each , mark a uniformly chosen subset of half-edges incident to vertex  as blue in .

  2. Retain in every edge of where both parts are blue.

  3. Complete the construction of via a uniformly chosen matching of all unmatched half-edges.
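The three steps above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the names `uniform_matching` and `coupled_multigraphs` are hypothetical, half-edges are labelled by integers, the matching in the last step is taken over the blue half-edges that were not matched in step 2 (as in Observation 2.2), and both degree sums are assumed to be even.

```python
import numpy as np


def uniform_matching(items, rng):
    """Uniform perfect matching of an even-sized collection, as a list of pairs."""
    items = np.array(items, dtype=int)
    if len(items) % 2 != 0:
        raise ValueError("need an even number of half-edges")
    rng.shuffle(items)
    return list(zip(items[0::2].tolist(), items[1::2].tolist()))


def coupled_multigraphs(degrees, sub_degrees, rng):
    """Couple a configuration-model multigraph with one of smaller degrees."""
    degrees = [int(d) for d in degrees]
    sub_degrees = [int(s) for s in sub_degrees]
    if any(s > d for s, d in zip(sub_degrees, degrees)):
        raise ValueError("sub_degrees must be entrywise at most degrees")
    m = sum(degrees)
    owner = np.repeat(np.arange(len(degrees)), degrees)  # vertex of each half-edge
    big = uniform_matching(range(m), rng)  # the larger multigraph

    # Step 1: mark a uniform subset of sub_degrees[i] half-edges of vertex i blue.
    blue = np.zeros(m, dtype=bool)
    start = 0
    for d, s in zip(degrees, sub_degrees):
        if s > 0:
            blue[start + rng.choice(d, size=s, replace=False)] = True
        start += d

    # Step 2: retain every edge whose two half-edges are both blue.
    kept = [(a, b) for a, b in big if blue[a] and blue[b]]
    matched = {h for pair in kept for h in pair}

    # Step 3: match the remaining blue half-edges uniformly at random.
    free = [k for k in range(m) if blue[k] and k not in matched]
    small = kept + uniform_matching(free, rng)

    def to_vertices(pairs):
        return [(int(owner[a]), int(owner[b])) for a, b in pairs]

    return to_vertices(big), to_vertices(small), len(kept)


rng = np.random.default_rng(0)
degrees = [6] * 50
sub_degrees = [5] * 50
edges_G, edges_H, kept = coupled_multigraphs(degrees, sub_degrees, rng)
print(kept, "of", len(edges_H), "edges of the smaller multigraph are inherited")
```

When the two degree sequences are close, most edges of the smaller multigraph are inherited from the larger one, so the two edge sets differ in few places, which is what the bound on the symmetric difference below exploits.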

Let for , the single-adjacency matrix of . We next control the difference between and . Indeed, by the definition of the coupling of and , the cardinality of the symmetric is at most twice the number of unmarked half-edges in . Thus,

(2.5)

where the first term in accounts for the discrepancy between and , the term accounts for the degree quantization, while the last term accounts for degree truncation (since ). Thanks to the assumed uniform integrability of we have that satisfies as . Furthermore,

by the choice of in (1.2), yielding the tightness of . Altogether, we conclude from Lemma 2.1 that, if , then .

Next, let (as in (1.2) but for the multigraph ). Since (see (2.5)),

wlog we replace by in the definition of , i.e., starting with

Further, note that the hypothesis as , together with our choice of , implies that (corresponding to ) converges weakly for each to some , supported on , and further, , as .

Let denote hereafter the push-forward of the measure by the mapping . Recall that provided , all of which are supported on (see, e.g., [2, Prop. 3]). Applying this twice, we deduce that

(2.6)

Recall [2, Lemma 8] that the lhs of (2.6) equals , while likewise its rhs equals . For any , the function is in . Thus, the weak convergence , implies for the symmetric source measures, that . In conclusion, it suffices hereafter to prove the theorem for the case where , a fixed finite set, for all .

Step II. Turning to this task, for , let where is the set of vertices of degree in . By assumption, for . (Observe that our choice of dictates that .) For all , set

Let for the edge-disjoint multigraphs that are generated by the configuration model in the following way.

  • For , let be the random -regular multigraph on , where is even and converges to as .

  • For , let be the random bipartite multigraph with sides and degrees in and in , such that the detailed balance

    holds, and tends to as (hence, ).

Finally, setting

(2.7)

let denote the single-adjacency matrix of the multigraph , where the edge-disjoint multigraphs are defined as follows.

  • For , mutually independently set the multiplicity of the edge between distinct and in to be a random variable.

  • For , mutually independently set the number of loops incident to to be a random variable.

Our next proposition shows that , in probability, whenever

(2.8)

Proposition 2.3.

The empirical spectral measures of and , the respective single-adjacency matrices of and , satisfy

in probability, as .

Proof.

Setting

associate with each multigraph its sub-degrees (accounting for edge multiplicities),

so in particular where is such that . Of course, for ,

(2.9)

Claim 2.4.

Conditional on a given sequence of sub-degrees , the adjacency matrices for all have the same conditional law.

Proof.

Observe that gives the same weight to each perfect matching of its half-edges, thus conditioning on amounts to specifying a subset of permissible matchings, on which the conditional distribution would be uniform. The same applies to the graphs for all , each being an independently drawn uniform multigraph, and hence to their union , thus establishing the claim for . To treat , notice that the probability that the multigraph , , given the sub-degrees , features the adjacency matrix (, ), is

by the definition of the configuration model. As the distribution of a vector of i.i.d. Poisson variables with mean , conditional on their sum being , is multinomial with parameters , the analogous conditional probability under is

Lastly, the probability that , conditional on , assigns to is

whereas the analogous conditional probability under (now involving a vector that is multinomial with parameters for , recalling the factor of 2 in the definition of the rate of loops under ), is