Interlacing Families IV: Bipartite Ramanujan Graphs of All Sizes
We prove that there exist bipartite Ramanujan graphs of every degree and every number of vertices. The proof is based on analyzing the expected characteristic polynomial of a union of random perfect matchings, and involves three ingredients: (1) a formula for the expected characteristic polynomial of the sum of a regular graph with a random permutation of another regular graph, (2) a proof that this expected polynomial is real rooted and that the family of polynomials considered in this sum is an interlacing family, and (3) strong bounds on the roots of the expected characteristic polynomial of a union of random perfect matchings, established using the framework of finite free convolutions introduced recently in [MSS15a].
Ramanujan graphs are undirected regular graphs whose nontrivial adjacency matrix eigenvalues are asymptotically as small as possible; in other words, they are the optimal spectral expander graphs. In this paper, we prove the existence of bipartite Ramanujan graphs of every degree and every size. We do this by showing that a random regular bipartite graph, obtained as a union of random perfect matchings across a bipartition of an even number of vertices, is Ramanujan with nonzero probability. Specifically, we prove that the expected characteristic polynomial of such a random graph has roots concentrated in the appropriate range, and use the method of interlacing families introduced in [MSS15b] to deduce that there must be an actual graph whose eigenvalues are no worse than the roots of this polynomial. Infinite families of bipartite Ramanujan graphs were shown in that paper to exist for every degree $d$, but it was not known whether they exist for every number of vertices.
The main conceptual and technical contributions of this work and the companion paper [MSS15a] are the following. First, we identify a new class of real-rooted expected characteristic polynomials related to random graphs, and develop new tools for establishing their interlacing properties and analyzing the locations of their roots. These methods are different from those used to study the mixed characteristic polynomials of [MSS15c], and the bounds we obtain are strictly stronger than those produced by the original “barrier method” argument introduced in [BSS12] (which is off by a factor of two in this setting). Notably, the expected characteristic polynomials we consider are computable in polynomial time, unlike most other known expected characteristic polynomials. Second, in contrast to previous work, we derive the Ramanujan bound from completely generic considerations involving random orthogonal matrices, in particular making no use of results from algebraic graph theory or number theory.
1.1 Summary of Results
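Recall that the adjacency matrix of a $d$-regular graph on $n$ vertices always has $d$ as its largest eigenvalue, and has $-d$ as an eigenvalue when the graph is bipartite; these are the trivial eigenvalues. A $d$-regular graph is Ramanujan if all of its nontrivial eigenvalues have absolute value at most $2\sqrt{d-1}$.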
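Our main theorem is that a union of $d$ random perfect matchings across a bipartition of $2n$ vertices is Ramanujan with nonzero probability.
Theorem 1.1. Let $P_1, \ldots, P_d$ be independent uniformly random $n \times n$ permutation matrices, $d \geq 3$. Then, with nonzero probability the nontrivial eigenvalues of
$$A = \sum_{i=1}^{d} \begin{pmatrix} 0 & P_i \\ P_i^T & 0 \end{pmatrix}$$
are all less than $2\sqrt{d-1}$ in absolute value.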
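The random model in the main theorem is easy to sample. The following is a minimal numerical sketch, not part of the proof: the helper names `random_bipartite_multigraph` and `nontrivial_eigenvalues`, the seed, and the sizes are our own illustrative choices. It builds the adjacency matrix of a union of $d$ random perfect matchings across a bipartition and compares its nontrivial eigenvalues with $2\sqrt{d-1}$. A single sample may or may not be Ramanujan, since the theorem only asserts nonzero probability.

```python
import numpy as np

def random_bipartite_multigraph(n, d, rng):
    """Adjacency matrix of a union of d uniformly random perfect matchings
    across a bipartition of 2n vertices (multi-edges are allowed)."""
    B = np.zeros((n, n))
    for _ in range(d):
        perm = rng.permutation(n)
        B[np.arange(n), perm] += 1.0  # add one random permutation matrix
    top = np.hstack([np.zeros((n, n)), B])
    bottom = np.hstack([B.T, np.zeros((n, n))])
    return np.vstack([top, bottom])

def nontrivial_eigenvalues(A):
    """All eigenvalues except the smallest (-d) and the largest (+d)."""
    eigs = np.sort(np.linalg.eigvalsh(A))
    return eigs[1:-1]

rng = np.random.default_rng(0)  # illustrative seed
n, d = 50, 3                    # illustrative sizes
A = random_bipartite_multigraph(n, d, rng)
all_eigs = np.sort(np.linalg.eigvalsh(A))
bound = 2.0 * np.sqrt(d - 1)
largest_nontrivial = float(np.max(np.abs(nontrivial_eigenvalues(A))))
is_ramanujan = largest_nontrivial <= bound
print(f"largest nontrivial |eigenvalue| = {largest_nontrivial:.4f}, bound = {bound:.4f}")
```

The spectrum is symmetric about zero because the matrix is bipartite, so only the upper half of the eigenvalues needs to be inspected in practice.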
We also prove the following non-bipartite version of this theorem, regarding a union of $d$ random perfect matchings on $n$ vertices (not bipartite), with $n$ even.
Theorem 1.2. Let $P_1, \ldots, P_d$ be independent uniformly random $n \times n$ permutation matrices, $n$ even, $d \geq 3$. Let $M$ be the adjacency matrix of any fixed perfect matching on $n$ vertices. Then with nonzero probability:
$$\lambda_2\left(\sum_{i=1}^{d} P_i M P_i^T\right) < 2\sqrt{d-1}.$$
Since we only prove nonzero bounds on the probabilities, the nonbipartite theorem is a logical consequence of the bipartite one. We describe it here because its proof is substantially easier and contains most of the main ideas. Note that Theorem 1.2 does not produce Ramanujan graphs because it does not guarantee any control of the least eigenvalue $\lambda_n$.
We remark that as they are unions of independent matchings, the graphs we produce may have multiple edges between two vertices. Thus, they are strictly speaking multigraphs, and do not subsume the previous results if one insists on simple graphs. However, it seems that it should be more difficult to construct Ramanujan graphs with multiedges than without. Like [MSS15b], this paper establishes existence but does not give a polynomial time construction of Ramanujan graphs.
1.2 Related Work
Infinite families of Ramanujan graphs were first shown to exist for $d = p + 1$, $p$ a prime, in the seminal work of Margulis and Lubotzky-Phillips-Sarnak [Mar88, LPS88]. The graphs they produce are Cayley graphs and can be constructed very efficiently, and their analysis relies on deep results from number theory, which is responsible for the “Ramanujan” nomenclature. Friedman [Fri08] showed that a random $d$-regular graph is almost Ramanujan: specifically, that a union of $d$ perfect matchings has non-trivial eigenvalues bounded by $2\sqrt{d-1} + \epsilon$ with high probability, for every $\epsilon > 0$. More recently, in [MSS15b], we proved the existence of infinite families of $d$-regular bipartite Ramanujan graphs for every $d \geq 3$ by proving (part of) a conjecture of Bilu and Linial [BL06] regarding the existence of good lifts of regular graphs. Prior to the present paper, it was unknown if there exist Ramanujan graphs of every number of vertices. We refer the reader to [HLW06] and [MSS15b] for a more detailed discussion of Ramanujan graphs and lifts.
1.3 Outline of the Paper
The proofs of both of our theorems follow the same strategy and consist of three steps. In each step we present the simpler non-bipartite case first, and then indicate the modifications required for the bipartite case.
First, we show that the expected characteristic polynomials of the random graphs we are interested in are real rooted and come from interlacing families (reviewed in Section 2.1), which reduces our existence theorems to analyzing the roots of these polynomials. This is achieved in Section 3 by decomposing the random permutations used to generate these expected polynomials into swaps acting on two vertices at a time, and showing that such swaps correspond to linear transformations which preserve real-rootedness properties of certain multivariate polynomials. Theorem 3.3 implies that if $A$ and $B$ are symmetric matrices, then the expected characteristic polynomial of $A + PBP^T$ is real rooted for a random permutation matrix $P$. We remark that this argument is completely elementary and self-contained, and unlike [MSS15b, MSS15c] does not appeal to any results from the theory of real stable or hyperbolic polynomials. In the process, we introduce a class of “determinant-like” polynomials which may be of independent interest.
Next, in Section 4 we derive a closed-form formula for the expected characteristic polynomial of a sum of randomly permuted regular graphs. We begin by proving that the expected characteristic polynomials over random permutations can be replaced by expected characteristic polynomials over random orthogonal matrices. This may be seen as a quadrature or derandomization statement, which says that these characteristic polynomials are not able to distinguish between the set of permutation matrices and the set of orthogonal matrices; essentially this happens because determinants are multilinear, which causes certain restrictions of them to have very low degree Fourier coefficients. This component of the proof may also be of independent interest.
Finally, we appeal to machinery developed in our companion paper [MSS15a], which studies the structure of expected characteristic polynomials over random orthogonal matrices. In particular, such polynomials may be expressed crisply in terms of a simple (and explicitly computable) convolution operation on characteristic polynomials, which we call the finite free additive convolution. In this framework, the characteristic polynomial of a union of $d$ random matchings is simply the $d$-wise convolution of the characteristic polynomial of a single matching. By applying strong bounds on the roots of these convolutions derived in [MSS15a], we obtain the desired Ramanujan bound of $2\sqrt{d-1}$. The requisite material regarding free convolutions is introduced in Sections 2.2 and 2.3.
2.1 Interlacing Families
We recall the following theorem from [MSS15c], stated here in the slightly different language of product distributions.
Theorem 2.1 (Interlacing Families).
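Suppose $\{f_{s_1, \ldots, s_m}(x)\}_{s_1 \in S_1, \ldots, s_m \in S_m}$ is a family of real-rooted polynomials of the same degree with positive leading coefficient, such that
$$\mathbb{E}_{(s_1, \ldots, s_m) \sim \mu}\, f_{s_1, \ldots, s_m}(x)$$
is real-rooted for every product distribution $\mu$ on $S_1 \times \cdots \times S_m$. Then for every $k$ and every such $\mu$, there is some $(s_1, \ldots, s_m) \in S_1 \times \cdots \times S_m$ such that
$$\lambda_k\left(f_{s_1, \ldots, s_m}\right) \leq \lambda_k\left(\mathbb{E}_{(s_1, \ldots, s_m) \sim \mu}\, f_{s_1, \ldots, s_m}\right),$$
where $\lambda_k$ denotes the $k$th largest root of a real-rooted polynomial.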
For real rooted polynomials $f$ and $g$, we write $f \longrightarrow g$ if the roots of $f$ and $g$ interlace and the largest root of $g$ is at least as big as the largest root of $f$. We will use the following elementary facts about interlacing and real-rootedness, which may be found in [Fis08].
If has degree one less than and both are real-rooted, then
if and only if is real-rooted for all
implies that .
If $f$ and $g$ are monic and real-rooted of the same degree, then they have a common interlacing if and only if $tf + (1-t)g$ is real-rooted for all $t \in [0, 1]$.
2.2 Finite Free Convolutions of Polynomials
To analyze the expected characteristic polynomials of the random graphs we consider, we will need the notion of a finite free convolution of two polynomials, developed in our companion paper [MSS15a]. We denote the characteristic polynomial of a matrix $M$ by:
$$\chi_x(M) := \det(xI - M).$$
Definition 2.3 (Symmetric Additive Convolution).
Let $p(x) = \chi_x(A)$ and $q(x) = \chi_x(B)$ be two real-rooted polynomials, for some symmetric $d \times d$ matrices $A$ and $B$. The symmetric additive convolution of $p$ and $q$ is defined as:
$$p(x) \boxplus_d q(x) := \mathbb{E}_{Q}\, \chi_x\left(A + QBQ^T\right),$$
where the expectation is taken over random orthogonal matrices $Q$ sampled according to the Haar measure on $O(d)$, the group of $d$-dimensional orthogonal matrices.
Note that this is a well-defined operation on polynomials because the distribution of the eigenvalues of $A + QBQ^T$ depends only on the eigenvalues of $A$ and the eigenvalues of $B$, which are the roots of $p$ and $q$.
Definition 2.4 (Asymmetric Additive Convolution).
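Let $p(x) = \chi_x(AA^T)$ and $q(x) = \chi_x(BB^T)$ be two real-rooted polynomials with nonnegative roots, for some arbitrary (not necessarily symmetric) $d \times d$ matrices $A$ and $B$. The asymmetric additive convolution of $p$ and $q$ is defined as
$$p(x) \boxplus\!\boxplus_d\, q(x) := \mathbb{E}_{Q, R}\, \chi_x\left((A + QBR^T)(A + QBR^T)^T\right),$$
where $Q$ and $R$ are independent random orthogonal matrices sampled uniformly from $O(d)$.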
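When dealing with a possibly asymmetric matrix $A$, we will frequently consider the dilation
$$\begin{pmatrix} 0 & A \\ A^T & 0 \end{pmatrix},$$
which is by construction a symmetric matrix. We will refer to a matrix of this type as a bipartite matrix. It is easy to see that its eigenvalues are symmetric about $0$ and are equal to $\pm\sigma_i(A)$, i.e., equal in absolute value to the singular values $\sigma_i(A)$ of $A$. This correspondence also gives the useful identity
$$\chi_x\begin{pmatrix} 0 & A \\ A^T & 0 \end{pmatrix} = \mathbb{S}\,\chi_x\left(AA^T\right),$$
where the operator $\mathbb{S}$ is defined by
$$(\mathbb{S}p)(x) := p(x^2).$$
With this notation in hand, we can alternately express the asymmetric additive convolution of $p(x) = \chi_x(AA^T)$ and $q(x) = \chi_x(BB^T)$ as
$$\mathbb{S}\left(p \boxplus\!\boxplus_d\, q\right) = \mathbb{E}_{Q, R}\, \chi_x\left(\begin{pmatrix} 0 & A \\ A^T & 0 \end{pmatrix} + \begin{pmatrix} Q & 0 \\ 0 & R \end{pmatrix} \begin{pmatrix} 0 & B \\ B^T & 0 \end{pmatrix} \begin{pmatrix} Q & 0 \\ 0 & R \end{pmatrix}^T\right). \tag{2}$$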
Explicit, polynomial time computable formulas for the additive convolutions in terms of the coefficients of $p$ and $q$ may be found in Theorems 1.1 and 1.3 of [MSS15a]. For this work, we only require the following important consequences of these formulas, also established in [MSS15a]. We will occasionally drop the subscripts in $\boxplus_d$ and $\boxplus\!\boxplus_d$ when it is clear from the context.
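To make the "explicitly computable" claim concrete, here is a minimal sketch of the symmetric additive convolution on coefficient vectors. It assumes the coefficient formula has the form we recall from [MSS15a]: writing $p(x) = \sum_i x^{d-i}(-1)^i a_i$ and $q(x) = \sum_j x^{d-j}(-1)^j b_j$, the coefficient of $x^{d-k}(-1)^k$ in the convolution is $\sum_{i+j=k} \frac{(d-i)!\,(d-j)!}{d!\,(d-k)!}\, a_i b_j$. The function name `finite_free_add` is hypothetical, and the formula should be checked against that paper before being relied upon.

```python
import math
import numpy as np

def finite_free_add(p, q):
    """Symmetric additive convolution of two monic degree-d polynomials.

    p and q are coefficient sequences, highest degree first (NumPy convention).
    Assumes the coefficient formula quoted in the lead-in.
    """
    p = np.asarray(p, dtype=float)
    q = np.asarray(q, dtype=float)
    d = len(p) - 1
    assert len(q) - 1 == d, "both polynomials must have the same degree"
    # Signed coefficients: p(x) = sum_i x^(d-i) (-1)^i a_i.
    a = [(-1) ** i * p[i] for i in range(d + 1)]
    b = [(-1) ** j * q[j] for j in range(d + 1)]
    out = np.zeros(d + 1)
    for k in range(d + 1):
        total = 0.0
        for i in range(k + 1):
            j = k - i
            weight = (math.factorial(d - i) * math.factorial(d - j)
                      / (math.factorial(d) * math.factorial(d - k)))
            total += weight * a[i] * b[j]
        out[k] = (-1) ** k * total
    return out

# Example: convolving with x^(d-1) (x - d) should give p - p'.
p = np.poly([3.0, 1.0, -2.0])                      # (x-3)(x-1)(x+2)
d = len(p) - 1
rank_one = np.poly([float(d)] + [0.0] * (d - 1))   # x^(d-1) (x - d)
shifted = finite_free_add(p, rank_one)
```

The double loop costs $O(d^2)$ arithmetic operations, which is one way to see that these expected characteristic polynomials are computable in polynomial time.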
Lemma 2.5 (Properties of $\boxplus_d$ and $\boxplus\!\boxplus_d$).
If $p$ and $q$ are real-rooted then $p \boxplus_d q$ is also real-rooted.
If $p$ and $q$ are real-rooted with all roots nonnegative, then $p \boxplus\!\boxplus_d\, q$ is also real-rooted with all roots nonnegative.
The operations $\boxplus_d$ and $\boxplus\!\boxplus_d$ are bilinear (in the coefficients of the polynomials on which they operate) and associative.
for random orthogonal matrices . The same argument shows that this is also equal to .
An analogous argument using the formula (2) shows that is also associative. ∎
A consequence of the above lemma is that for matrices , identities such as
2.3 Cauchy Transforms
The device that we use to analyze the roots of finite free convolutions of polynomials is the Cauchy Transform. This is the same (up to normalization) as the Stieltjes Transform and the “Barrier Function” of [BSS12, MSS15b, MSS15c].
Definition 2.6 (Cauchy Transform).
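The Cauchy Transform of a polynomial $p$ with roots $\lambda_1, \ldots, \lambda_d$ is defined to be the function
$$\mathcal{G}_p(x) := \frac{1}{d} \sum_{i=1}^{d} \frac{1}{x - \lambda_i} = \frac{1}{d} \frac{p'(x)}{p(x)}.$$
We define the inverse Cauchy Transform of $p$ to be
$$\mathcal{K}_p(w) := \max\left\{x : \mathcal{G}_p(x) = w\right\}.$$
Note that the Cauchy transform has poles at the roots of $p$, and when all the roots of $p$ are real, $\mathcal{G}_p(x)$ is monotone decreasing for $x$ greater than the largest root. Thus, for $w > 0$, $\mathcal{K}_p(w)$ is the unique value of $x$ that is larger than all the $\lambda_i$ for which $\mathcal{G}_p(x) = w$. In particular, it is an upper bound on the largest root of $p$, and approaches the largest root as $w \to \infty$.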
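The monotonicity remark above suggests a simple way to evaluate the inverse Cauchy transform numerically. The sketch below is illustrative only; the function names and the bisection tolerance are our own choices. It uses the fact that for $x$ above the largest root $\lambda_{\max}$ each term of the transform is at most $1/(x - \lambda_{\max})$, so the solution lies in the interval $(\lambda_{\max}, \lambda_{\max} + 1/w]$.

```python
import numpy as np

def cauchy_transform(roots, x):
    """G_p(x) = (1/d) * sum_i 1 / (x - lambda_i) for p with the given real roots."""
    roots = np.asarray(roots, dtype=float)
    return float(np.mean(1.0 / (x - roots)))

def inverse_cauchy_transform(roots, w, tol=1e-12):
    """K_p(w) for w > 0: the unique x above the largest root with G_p(x) = w."""
    roots = np.asarray(roots, dtype=float)
    top = float(roots.max())
    lo, hi = top, top + 1.0 / w   # G_p(hi) <= w, and G_p -> +infinity near top
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if cauchy_transform(roots, mid) > w:
            lo = mid              # still to the left of the solution
        else:
            hi = mid
    return hi

example_roots = [-1.0, 0.0, 2.0]
for w in (0.5, 1.0, 10.0, 1000.0):
    print(w, inverse_cauchy_transform(example_roots, w))
```

As $w$ grows the printed values decrease toward the largest root, which matches the statement that the inverse transform is an upper bound that becomes tight in the limit.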
Our bounds on the expected characteristic polynomials of random graphs are a consequence of the following two theorems, which are proved in [MSS15a].
Theorem 2.7 (Theorem 1.7 of [MSS15a]).
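For real-rooted degree $d$ polynomials $p$ and $q$ and $w > 0$,
$$\mathcal{K}_{p \boxplus_d q}(w) \leq \mathcal{K}_p(w) + \mathcal{K}_q(w) - \frac{1}{w}.$$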
The above theorem is a strengthening of the univariate barrier function argument for characteristic polynomials introduced in [BSS12]. This may be seen by taking $q(x) = x^{d-1}(x - d)$, which corresponds to a rank one matrix with trace equal to $d$. It is easy to check that in this case $p \boxplus_d q = p - p'$.
We remark that Theorem 2.7 is inspired by an equality regarding inverse Cauchy transforms of limiting spectral distributions of certain random matrix models arising in Free Probability theory; we refer the interested reader to [MSS15a] for a more detailed discussion. To analyze the case of bipartite random graphs, we will need the corresponding inequality for the asymmetric convolution.
Theorem 2.8 (Theorem 1.8 of [MSS15a]).
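For degree $d$ polynomials $p$ and $q$ having only nonnegative real roots, and $w > 0$,
$$\mathcal{K}_{\mathbb{S}(p \boxplus\!\boxplus_d\, q)}(w) \leq \mathcal{K}_{\mathbb{S}p}(w) + \mathcal{K}_{\mathbb{S}q}(w) - \frac{1}{w}.$$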
3 Interlacing for Permutations
In this section, we show that the expected characteristic polynomials obtained by averaging over certain random permutation matrices form interlacing families. The class of random permutations which have this property are those that are products of independent random swaps, which we now formally define.
Definition 3.1 (Random Swap).
A random swap is a matrix-valued random variable which is equal to a transposition of two (fixed) indices $i, j$ with probability $\alpha$ and equal to the identity with probability $1 - \alpha$, for some $\alpha \in [0, 1]$.
Definition 3.2 (Realizability by Swaps).
A matrix-valued random variable $P$ supported on permutation matrices is realizable by swaps if there are random swaps $S_1, \ldots, S_k$ such that the distribution of $P$ is the same as the distribution of the product $S_k \cdots S_1$.
For example, we show in Lemma 3.5 below that a uniformly random permutation matrix is realizable by swaps.
The main result of this section is that expected characteristic polynomials over products of random swaps are always real-rooted. These polynomials play a role analogous to that of mixed characteristic polynomials in [MSS15b, MSS15c].
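Theorem 3.3. Let $A$ and $B$ be symmetric matrices and let $S_1, \ldots, S_k$ be independent (not necessarily identical) random swaps. Then the expected characteristic polynomial
$$\mathbb{E}\, \chi_x\left(A + (S_k \cdots S_1)\, B\, (S_k \cdots S_1)^T\right) \tag{4}$$
is real-rooted.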
Theorem 3.4 (Interlacing Families for Permutations).
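Suppose $A_1, \ldots, A_m$ are symmetric matrices, and $P_1, \ldots, P_m$ are independent random permutations realizable by swaps. Then, for every $k$:
$$\lambda_k\left(\chi_x\left(\sum_{i=1}^{m} P_i A_i P_i^T\right)\right) \leq \lambda_k\left(\mathbb{E}\, \chi_x\left(\sum_{i=1}^{m} P_i A_i P_i^T\right)\right)$$
with nonzero probability.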
Theorem 3.4 is useful because the uniform distribution on permutations and its bipartite version, which we use to generate our random graphs, are realizable by swaps.
Lemma 3.5. Let $P$ and $R$ be uniformly random permutation matrices. Both $P$ and $P \oplus R$ are realizable by swaps, where $P \oplus R$ is the direct sum of $P$ and $R$.
We will establish the claim for first. We proceed inductively. Let be a random swap which swaps and with probability , and for let
where swaps and with probability .
Let . By induction, assume that the first coordinates of are in uniformly random order; in particular, that is a random element of This means that:
With probability : and the remaining indices contain a random permutation of .
With probability : is a uniformly random element and the remaining indices contain a random permutation of
Thus, is uniformly random on , and by induction .
For , we use the above argument to realize and separately and then multiply them. ∎
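Realizability by swaps can be checked exactly for small sizes. The sketch below composes independent random swaps and tracks the resulting distribution on permutations with rational arithmetic. The schedule it uses (adjacent positions $j-1$ and $j$ swapped with probability $(j-1)/j$, bubbling each new element downward) is a standard insertion-style decomposition chosen by us for illustration; it is not claimed to be the construction used in the proof above, and all function names are hypothetical.

```python
from fractions import Fraction

def apply_random_swap(dist, i, j, prob):
    """Compose a distribution on permutations with an independent random swap
    that exchanges positions i and j with probability prob."""
    new_dist = {}
    for perm, weight in dist.items():
        # Identity branch.
        new_dist[perm] = new_dist.get(perm, Fraction(0)) + weight * (1 - prob)
        # Transposition branch.
        swapped = list(perm)
        swapped[i], swapped[j] = swapped[j], swapped[i]
        swapped = tuple(swapped)
        new_dist[swapped] = new_dist.get(swapped, Fraction(0)) + weight * prob
    return new_dist

def swap_schedule(n):
    """Illustrative schedule of (i, j, probability) triples, 0-indexed positions."""
    schedule = []
    for k in range(2, n + 1):        # insert element k into a uniform order of k-1
        for j in range(k, 1, -1):    # bubble it down through positions j, j-1
            schedule.append((j - 2, j - 1, Fraction(j - 1, j)))
    return schedule

def distribution_of_product(n):
    """Exact distribution of the product of the scheduled independent swaps."""
    dist = {tuple(range(n)): Fraction(1)}
    for i, j, prob in swap_schedule(n):
        dist = apply_random_swap(dist, i, j, prob)
    return dist

dist4 = distribution_of_product(4)
print(len(dist4), set(dist4.values()))
```

Each swap in the schedule involves two fixed positions and a fixed probability, so every factor is a random swap in the sense of Definition 3.1, and the factors are independent.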
The rest of this section is devoted to proving Theorem 3.3. This is achieved by showing that the polynomials in (4) are univariate restrictions of certain nice multivariate polynomials. The relevant notion is the following.
Definition 3.6 (Determinant-like Polynomials).
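A homogeneous polynomial $P(X_1, \ldots, X_m)$ of degree $d$ in the entries of $d \times d$ symmetric matrices $X_1, \ldots, X_m$ is called determinant-like if it has the following two properties.
Hyperbolicity. The univariate restrictions
$$q(x) = P(xI - A_1, \ldots, xI - A_m)$$
are real-rooted for all symmetric $A_1, \ldots, A_m$.
This condition is known as hyperbolicity of the polynomial $P$ with respect to the point $(I, \ldots, I)$. We do not discuss the notion of hyperbolicity further, since the self-contained definition above suffices for this paper. We point the interested reader to [Pem12] for a detailed discussion of the theory.
Rank-1 Linearity. For every vector $v$, index $j \leq m$, and real number $s$, we have
$$P(X_1, \ldots, X_j + s\,vv^T, \ldots, X_m) = P(X_1, \ldots, X_m) + s\, D_{j, vv^T} P(X_1, \ldots, X_m),$$
where $D_{j, vv^T} P$ is the directional derivative of $P$ in direction $(0, \ldots, vv^T, \ldots, 0)$, where $vv^T$ appears in the $j$th position. Note that $D_{j, vv^T} P$ is homogeneous of degree $d - 1$.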
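An important example of a determinant-like polynomial is the determinant of a sum of matrices:
$$P(X_1, \ldots, X_m) = \det\left(\sum_{i=1}^{m} X_i\right).$$
Hyperbolicity follows from the fact that
$$\det\left(\sum_{i=1}^{m} (xI - A_i)\right) = m^d \det\left(xI - \frac{1}{m}\sum_{i=1}^{m} A_i\right)$$
is a positive multiple of the characteristic polynomial of a symmetric matrix. Rank-1 linearity can be seen to follow from the invariance of the determinant under change of basis and its linearity with respect to matrix entries. Alternatively, one can prove it by using the matrix determinant lemma, which tells us that for invertible $A$,
$$\det\left(A + s\,vv^T\right) = \det(A)\left(1 + s\, v^T A^{-1} v\right).$$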
The crux of the proof of Theorem 3.3 lies in the fact that random swaps define linear operators which preserve the property of being determinant-like.
Lemma 3.7 (Random swaps preserve determinant-likeness).
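If $P(X_1, \ldots, X_m)$ is determinant-like, then for any $j \leq m$ and random swap $S$, the polynomial
$$\mathbb{E}_S\, P\left(X_1, \ldots, S X_j S^T, \ldots, X_m\right)$$
is determinant-like.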
Before proving this lemma, we record some preliminary facts about determinant-like polynomials.
Lemma 3.8 (Rank-1 updates interlace).
Suppose is determinant-like. Then for every vector and symmetric matrices we have
where denotes interlacing, pointing to the polynomial with the largest root.
Assume without loss of generality that . By rank-1 linearity,
By the hyperbolicity of , we know that this is real rooted when viewed as a univariate polynomial in . Since is of degree one less than , the first part of Lemma 2.2 implies that
which in turn by the second part of Lemma 2.2 gives
as desired. ∎
Lemma 3.9 (Permutations preserve rank-1 linearity).
(1) If is a permutation matrix and is rank-1 linear then is also rank-1 linear. (2) If and are rank-1 linear then so is .
(1) is true because the set of rank one matrices is invariant under conjugation by permutations. (2) holds because is a linear operator. ∎
We will also need the following elementary observation, which says that random swaps correspond to trace zero rank two updates. This is the structural property which causes interlacing to occur.
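If $S$ is a transposition and $A$ is symmetric then $A - SAS^T$ has rank at most $2$ and trace $0$.
Assume without loss of generality that $S$ swaps the first two coordinates. Then by symmetry the difference $A - SAS^T$ has entries
$$\begin{pmatrix} a & 0 & v^T \\ 0 & -a & -v^T \\ v & -v & 0 \end{pmatrix}$$
for some number $a$ and some column vector $v$ of length two less than the dimension of $A$. The trace is $a - a = 0$. If $a \neq 0$ then the sum of the first two rows is equal to $(a, -a, 0, \ldots, 0)$, and every other row is a scalar multiple of this. On the other hand, if $a = 0$ then the first two rows are linearly dependent, and all of the other rows are multiples of $(1, -1, 0, \ldots, 0)$. In both cases the row space is spanned by at most two vectors. ∎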
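The rank and trace claim is easy to confirm numerically. The following sketch uses a random symmetric test matrix; the size, seed, and the helper name `swap_difference` are illustrative assumptions of ours.

```python
import numpy as np

def swap_difference(A, i, j):
    """Return A - S A S^T, where S is the transposition of coordinates i and j."""
    d = A.shape[0]
    S = np.eye(d)
    S[[i, j]] = S[[j, i]]   # permutation matrix of the transposition
    return A - S @ A @ S.T

rng = np.random.default_rng(1)  # illustrative seed
d = 6
G = rng.standard_normal((d, d))
A = G + G.T                     # a generic symmetric matrix
D = swap_difference(A, 0, 1)
rank = int(np.linalg.matrix_rank(D))
trace = float(np.trace(D))
print(rank, trace)
```

For a generic symmetric matrix the rank is exactly two; it drops to zero when the matrix commutes with the transposition, for example for the identity.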
We can now complete the proof of Lemma 3.7.
Proof of Lemma 3.7.
Assume is determinant-like, and let be a random swap, equal to some transposition with probability and the identity with probability . We will show that
is hyperbolic and rank-1 linear. It is clear that is homogeneous since swaps and convex combinations preserve homogeneity. Lemma 3.9 implies that rank-1 linearity is also preserved, so all that remains is hyperbolicity. Assume without loss of generality that and consider any univariate restriction along :
We need to show that this has all real roots. Observe that the second polynomial may be written as
In this section, we show that the expected characteristic polynomials we are interested in are free convolutions of the characteristic polynomials of perfect matchings, after the trivial eigenvalues corresponding to the all ones vector are removed. This gives us explicit formulas for these polynomials, and more importantly (since we understand the behavior of roots under free convolutions) a way of bounding their roots. We begin by showing how to do this for the symmetric case, which is more transparent and contains all the main ideas. In Section 4.2 we derive the result for the bipartite case as a corollary of the result for the symmetric case.
4.1 Quadrature for Symmetric Matrices
The following theorem gives an explicit formula for the expected characteristic polynomial of the sum of two symmetric matrices with constant row sums when the rows and columns of one of the matrices are randomly permuted. This can be used to compute the expected characteristic polynomial of the Laplacian matrix of the sum of two graphs when one is randomly permuted. In this paper, we use the result to compute the expected characteristic polynomial of the adjacency matrix when both graphs are regular.
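Theorem 4.1. Suppose $A$ and $B$ are symmetric $d \times d$ matrices with $A\mathbf{1} = a\mathbf{1}$ and $B\mathbf{1} = b\mathbf{1}$. Let $\chi_x(A) = (x - a)\,p(x)$ and $\chi_x(B) = (x - b)\,q(x)$. Then,
$$\mathbb{E}_P\, \chi_x\left(A + PBP^T\right) = \left(x - (a + b)\right)\, p(x) \boxplus_{d-1} q(x), \tag{6}$$
where $P$ is a uniformly random permutation.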
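The formula in this theorem can be tested by brute force on a small example. The sketch below averages the characteristic polynomial over all permutations for a 6-cycle and a perfect matching on six vertices, and compares the result with the right-hand side. The right-hand side uses the coefficient formula for the symmetric additive convolution as we recall it from [MSS15a], which is an assumption that should be checked against that paper; the helper names and the choice of graphs are ours.

```python
import itertools
import math
import numpy as np

def charpoly(M):
    """Coefficients of det(xI - M) for symmetric M, highest degree first."""
    return np.poly(np.linalg.eigvalsh(M))

def finite_free_add(p, q):
    """Symmetric additive convolution via the assumed coefficient formula."""
    d = len(p) - 1
    a = [(-1) ** i * p[i] for i in range(d + 1)]
    b = [(-1) ** j * q[j] for j in range(d + 1)]
    out = np.zeros(d + 1)
    for k in range(d + 1):
        total = 0.0
        for i in range(k + 1):
            j = k - i
            weight = (math.factorial(d - i) * math.factorial(d - j)
                      / (math.factorial(d) * math.factorial(d - k)))
            total += weight * a[i] * b[j]
        out[k] = (-1) ** k * total
    return out

def expected_charpoly_over_permutations(A, B):
    """Average of det(xI - A - P B P^T) over all permutation matrices P."""
    d = A.shape[0]
    total = np.zeros(d + 1)
    count = 0
    for perm in itertools.permutations(range(d)):
        P = np.eye(d)[list(perm)]
        total += charpoly(A + P @ B @ P.T)
        count += 1
    return total / count

d = 6
A = np.zeros((d, d))
for v in range(d):                       # 6-cycle, row sums a = 2
    A[v, (v + 1) % d] = A[(v + 1) % d, v] = 1.0
B = np.zeros((d, d))
for v in range(0, d, 2):                 # perfect matching, row sums b = 1
    B[v, v + 1] = B[v + 1, v] = 1.0
a, b = 2.0, 1.0

p, _ = np.polydiv(charpoly(A), [1.0, -a])   # remove the trivial root a
q, _ = np.polydiv(charpoly(B), [1.0, -b])   # remove the trivial root b
lhs = expected_charpoly_over_permutations(A, B)
rhs = np.polymul([1.0, -(a + b)], finite_free_add(p, q))
print(np.max(np.abs(lhs - rhs)))
```

The enumeration over all $d!$ permutations is only feasible for very small $d$; the point of the theorem is that the right-hand side avoids it entirely.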
We begin by writing (6) in a more concrete form. Observe that all of the matrices have as a left and right eigenvector, which means that there is an orthogonal change of basis (for concreteness, mapping to the standard basis vector ) that simultaneously block diagonalizes all of them:
where denotes the direct sum
Since the determinant is invariant under change of basis, we may write
Notice also that and , so
where is a (Haar) random orthogonal matrix. Thus, (6) is equivalent to showing that
for all symmetric matrices . Note that for any permutation , the orthogonal transformation correspondingly permutes , the projections orthogonal to of the standard basis vectors , embedded in . Since these are the vertices of a regular simplex with vertices in centered at the origin, we interpret the as elements of the symmetry group of this simplex. We denote this subgroup of by .
Since there is no longer any assumption on other than symmetry, we may absorb the term into in (9), and we see that it is sufficient to establish the following.
Theorem 4.2 (Quadrature Theorem).
For symmetric matrices and ,
It is easy to see that the theorem will follow if we can show that the left hand side of (10) is invariant under right multiplication of by orthogonal matrices.
Lemma 4.3 (Invariance Implies Quadrature).
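Let $f$ be a function from $O(d)$ to $\mathbb{R}$ and let $H$ be a finite subgroup of $O(d)$. If
$$\mathbb{E}_{P \in H}\, f(P) = \mathbb{E}_{P \in H}\, f(P Q_0) \tag{11}$$
for all $Q_0 \in O(d)$, then
$$\mathbb{E}_{P \in H}\, f(P) = \mathbb{E}_{Q}\, f(Q),$$
where $Q$ is chosen according to Haar measure and $P$ is uniform on $H$.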
as desired. ∎
We will prove Theorem 4.2 by showing that satisfies (11). We will achieve this by demonstrating that is invariant under certain elementary orthogonal transformations acting on 3-faces of the regular simplex, which generate all orthogonal transformations. Let us fix some notation to precisely describe these elementary transformations.
Given three vertices of the regular simplex, let denote the subgroup of consisting of permutations of which leave all of the other vertices fixed. Let denote the subgroup of acting on the two dimensional linear subspace parallel to the affine subspace through the three vertices, and leaving the orthogonal subspace fixed. Note that is a subgroup of , and that these groups are isomorphic to and , respectively.
The heart of the proof lies in the following lemma, which implies by Lemma 4.3 that the polynomials we are interested in are not able to distinguish between the uniform distributions on and . The reason for this is that these polynomials have very low degree (at most two) in the entries of any orthogonal matrix acting on a two-dimensional subspace, a fact which is essentially a consequence of the multilinearity of the determinant. The argument below is similar to the proof of Lemma 2.7 in [MSS15a].
Lemma 4.4 (Invariance for ).
If and are symmetric matrices, then for every ,
Let be the subgroup of consisting of rotation matrices
and let be the subgroup of consisting of the three rotations We begin by showing that
for every , where is the )-dimensional identity. Since the elements of are themselves rotations, we can rewrite thrice the right hand side of (14) as
As this quantity is independent of , we can assume , which gives the left hand side of (14).
To finish the proof, we observe that
where consists of the identity and the reflection across the horizontal axis:
and is chosen uniformly from .
Thus, the left hand side of (13) is invariant under conjugation of with the matrices . Since every can be written a for some , and we have already established invariance under in (14), the lemma is proved.
Lemma 4.5 (Determinants are Low Degree in Rank 2 Rotations).
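Let $A$ and $B$ be $d \times d$ matrices and let
$$f(\theta) := \det\left(A + (R_\theta \oplus I_{d-2})\, B\, (R_\theta \oplus I_{d-2})^T\right),$$
where $R_\theta$ is the $2 \times 2$ rotation by angle $\theta$.
Then $f(\theta) = \sum_{k=-2}^{2} c_k e^{ik\theta}$ for some coefficients $c_k \in \mathbb{C}$.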
Recall that all rotations may be diagonalized as
is independent of . This implies that for diagonal containing and in the upper right block and ones elsewhere, with independent of . Thus, we see that
Notice that the matrix depends linearly on , and that the (resp. ) terms appear only in the first (resp. second) row and column of , respectively. Since each monomial in the expansion of the determinant contains at most one entry from each row and each column and , this implies that no terms of degree higher than two in or appear. ∎
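The low-degree statement can be observed numerically, together with the consequence used for quadrature: a trigonometric polynomial of degree at most two has the same average over three equally spaced angles as over the whole circle. In the sketch below the rotation acts on the first two coordinates; the matrix size, seed, starting angle, and function names are illustrative assumptions of ours.

```python
import numpy as np

def rotation_block(theta, d):
    """Rotation by theta in the first two coordinates, identity elsewhere."""
    R = np.eye(d)
    c, s = np.cos(theta), np.sin(theta)
    R[:2, :2] = [[c, -s], [s, c]]
    return R

def det_along_rotation(theta, A, B):
    """f(theta) = det(A + R B R^T) with R the rotation above."""
    R = rotation_block(theta, A.shape[0])
    return float(np.linalg.det(A + R @ B @ R.T))

rng = np.random.default_rng(2)  # illustrative seed
d = 5
A = rng.standard_normal((d, d)); A = A + A.T
B = rng.standard_normal((d, d)); B = B + B.T

# Fourier coefficients of f from N equally spaced samples.
N = 64
samples = np.array([det_along_rotation(2 * np.pi * k / N, A, B) for k in range(N)])
coeffs = np.fft.fft(samples) / N
high_frequency = np.max(np.abs(coeffs[3:N - 2]))   # frequencies with |k| >= 3

# Average over three rotations (from an arbitrary starting angle) vs. the circle.
offset = 0.37
three_point = np.mean([det_along_rotation(offset + 2 * np.pi * k / 3, A, B)
                       for k in range(3)])
circle = float(np.mean(samples))
print(high_frequency, three_point, circle)
```

Because the three-point average does not depend on the starting angle, it is invariant under multiplying by any further rotation in the same plane, which is the kind of invariance that the quadrature argument requires.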
Corollary 4.6 (Invariance for ).
For every , and ,
Let be the orthogonal transformation that maps the affine subspace spanned by the vertices to the first two coordinates of