
# Arithmetic expanders and deviation bounds for sums of random tensors

Jop Briët (CWI, Science Park 123, 1098 XG Amsterdam, The Netherlands) and Shravas Rao (Courant Institute, New York University, 251 Mercer Street, New York NY 10012, USA)

###### Abstract.

We prove hypergraph variants of the celebrated Alon–Roichman theorem on spectral expansion of sparse random Cayley graphs. One of these variants implies that for every prime and any , there exists a set of directions of size such that for every set  of density , the fraction of lines in  with direction in  is within of the fraction of all lines in . Our proof uses new deviation bounds for sums of independent random multi-linear forms taking values in a generalization of the Birkhoff polytope. The proof of our deviation bound is based on Dudley’s integral inequality and a probabilistic construction of -nets. Using the polynomial method we prove that a Cayley hypergraph with edges generated by a set  as above requires  for (our notion of) spectral expansion for hypergraphs.

J. B. was supported by a VENI grant from the Netherlands Organisation for Scientific Research (NWO).
This material is based upon work supported by the National Science Foundation Graduate Research Fellowship Program under Grant No. DGE-1342536.

## 1. Introduction

In the following all graphs are undirected and may have loops and parallel edges. For an -vertex graph  and denote by the number of edges connecting  and . If  is -regular then its normalized adjacency matrix  is given by . Let  be the eigenvalues of  arranged in decreasing order and denote .

### 1.1. Spectral expanders.

Spectral expanders are infinite families of graphs of size increasing with  such that the spectral gap is at least some that is independent of . A single graph is said to be an expander if it is tacitly understood to belong to such a family. Spectral expansion, the property of having large spectral gap, is a property that random graphs have with high probability. Seminal work on quasirandomness of Thomason [Tho87a, Tho87b], and Chung, Graham, and Wilson [CGW89] showed that for dense graphs, this property is equivalent to a number of other likely features of random graphs. One of these is expansion, a measure of connectedness showing that no large set of vertices can be disconnected from its complement by cutting only a few edges. Another is discrepancy, which refers to the property that the edge density of any sufficiently large induced subgraph is close to the overall edge density.

A long line of research extending the results of [CGW89] to dense hypergraphs was initiated by Chung and Graham [CG90], culminating in recent work of Lenz and Mubayi [LM15b, LM15a] (which we refer to for a more detailed account). Partially motivated by an application in Theoretical Computer Science concerning special types of error-correcting codes (locally decodable codes) [BDG17], we study the extent to which some known results on sparse expanders generalize to hypergraphs. Along the way we establish a new deviation inequality for sums of independent random multi-linear forms (Theorem 2.4) that we hope will find applications elsewhere.

### 1.2. Cayley graphs and the Alon–Roichman Theorem.

Most known examples of sparse expanders are Cayley graphs, which are defined as follows. For a finite group  and an element , the Cayley graph is the 2-regular graph with vertex set  and edge set , where in case , all edges are doubled. For a multiset  (we use curly brackets to delimit multisets: unordered lists that may contain repeated elements), the Cayley graph is the -regular graph formed by the union of the graphs .
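To make the definition concrete, here is a minimal Python sketch for the cyclic group Z_n written additively. The helper names `cayley_edge_counts` and `cayley_lambda` are hypothetical. The sketch assumes the usual reading of the second-eigenvalue parameter as the largest absolute value of a nontrivial eigenvalue, and it uses the standard fact that the eigenvalues of an Abelian Cayley graph are character sums.

```python
import math

def cayley_edge_counts(n, gens):
    """Edge multiplicities of Cay(Z_n, gens); gens is a list, i.e. a multiset."""
    E = [[0] * n for _ in range(n)]
    for g in gens:
        for x in range(n):
            y = (x + g) % n
            # Each x contributes the edge {x, x+g}; when g = -g every edge is
            # visited twice, which reproduces the doubling in the definition.
            E[x][y] += 1
            E[y][x] += 1
    return E

def cayley_lambda(n, gens):
    """Largest |eigenvalue| of the normalized adjacency matrix over nontrivial characters."""
    k = len(gens)
    return max(
        abs(sum(math.cos(2 * math.pi * a * g / n) for g in gens)) / k
        for a in range(1, n)
    )
```

For instance, the 6-cycle generated by the single element 1 is bipartite, so `cayley_lambda(6, [1])` evaluates to 1 up to rounding, while taking every group element once as a generator gives the complete graph with loops and a value of 0 up to rounding.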

The group over which Cayley graphs are defined strongly influences the minimal degree required for spectral expansion. The famous examples of constant-degree expanders of Margulis [Mar73, Mar88] and Lubotzky, Phillips, and Sarnak [LPS88] are Cayley graphs which, crucially, are defined over non-Abelian groups. It is easy to see that a Cayley graph over the Abelian group , for example, requires degree at least  to be an expander [AR94].

###### Proposition 1.1.

Let  be such that . Then, .

###### Proof.

Let . Let  be an -dimensional subspace containing  and let . Since are connected if and only if and every pair satisfies , the sets  and  are disconnected. It follows that is an eigenvector of and has eigenvalue . Hence, . ∎
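The obstruction in this proof can be observed numerically. The sketch below assumes the group is the vector space F_2^n, with elements encoded as bitmasks, and uses the standard character formula for the eigenvalues. The function name `f2_cayley_lambda` is hypothetical.

```python
def f2_cayley_lambda(n, gens):
    """Largest |eigenvalue| over nontrivial characters of Cay(F_2^n, gens).

    gens is a list of bitmasks in range(2**n). The eigenvalue attached to the
    character a is (1/k) * sum over g of (-1)^<a, g>.
    """
    k = len(gens)

    def character_sum(a):
        # (-1)^<a, g> is -1 exactly when a & g has an odd number of set bits.
        return sum(-1 if bin(a & g).count("1") % 2 else 1 for g in gens)

    return max(abs(character_sum(a)) for a in range(1, 2 ** n)) / k
```

With fewer than n generators some nonzero a is orthogonal to all of them, so the value is exactly 1. With all seven nonzero elements of F_2^3 as generators the value drops to 1/7.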

Similarly, because expanders must be connected, it follows that spectral expansion requires degree in any Cayley graph over any -element Abelian group [HLW06, Proposition 11.5]. A celebrated result of Alon and Roichman [AR94], however, shows that Abelian groups are extreme in this sense.

###### Theorem 1.2 (Alon–Roichman Theorem).

For any  there exists a such that the following holds. Let  be a finite group of cardinality . Let  be an integer and let  be independent uniformly distributed elements from . Then, with probability at least , the Cayley graph  satisfies

Our main results are hypergraph versions of Proposition 1.1 and Theorem 1.2.

### 1.3. Hypergraphs

A -uniform hypergraph with vertex set  has as edge set  a family of unordered -element multisets with possible parallel edges. For let denote the number of edges equal to . The adjacency form of  is the -linear form defined by . The degree of a vertex  is defined by and is -regular if every vertex has degree exactly , in which case its normalized adjacency form is . Of particular importance here are hypergraphs whose edge set is given by a multiset of the form , , where are permutations on . In this case we set

$$e_H(u_1,\dots,u_t)=\sum_{\sigma}\sum_{v\in V}1_{\{u_1\}}(\pi_{\sigma(1)}(v))\cdots 1_{\{u_t\}}(\pi_{\sigma(t)}(v)), \tag{1}$$

where  runs over all permutations of , giving a -regular hypergraph.
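Formula (1) can be evaluated directly by enumeration. The following Python sketch does so for permutations given as lists on the vertex set `range(n)`. The function name `edge_counts` and the list encoding are illustrative assumptions.

```python
from collections import Counter
from itertools import permutations

def edge_counts(perms):
    """Tabulate e_H from (1) for the permutation hypergraph defined by perms.

    perms is a list of t permutations of range(n), each given as a list.
    The result maps ordered t-tuples (u_1, ..., u_t) to e_H(u_1, ..., u_t).
    """
    t = len(perms)
    n = len(perms[0])
    e = Counter()
    for sigma in permutations(range(t)):
        for v in range(n):
            e[tuple(perms[sigma[i]][v] for i in range(t))] += 1
    return e
```

Because the outer sum runs over all orderings of the permutations, the resulting table is symmetric in its arguments and its entries add up to t! times n.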

### 1.4. Hypergraph spectral expansion.

To define spectral expansion for hypergraphs we build on the following characterisation of . Recall that the Schatten- norm (or spectral norm) of a matrix  is given by . If  is symmetric, then this norm is precisely the maximum absolute value of the eigenvalues of . Since for an -vertex graph , the eigenvector associated with the first eigenvalue  is the normalized all-ones vector , we have , where is the all-ones matrix. Our definition of spectral expansion for hypergraphs is based on the following norm on multilinear forms. For a -linear form  on and define

$$\|A\|_{\ell_p,\dots,\ell_p}=\sup\left\{\frac{A(x[1],\dots,x[t])}{\|x[1]\|_{\ell_p}\cdots\|x[t]\|_{\ell_p}} : x[1],\dots,x[t]\in\mathbb{R}^n\setminus\{0\}\right\}.$$
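Computing this norm exactly is hard in general, but the quotient inside the supremum is easy to evaluate, and any choice of vectors gives a lower bound. The Python sketch below represents a form as a dictionary from index tuples to coefficients and searches over random vectors. This is a naive heuristic for illustration only, and all names in it are hypothetical.

```python
import random

def evaluate(form, vectors):
    """Evaluate a t-linear form given as {(i_1, ..., i_t): coefficient}."""
    total = 0.0
    for idx, coeff in form.items():
        term = coeff
        for i, x in zip(idx, vectors):
            term *= x[i]
        total += term
    return total

def lp_norm(x, p):
    return sum(abs(v) ** p for v in x) ** (1.0 / p)

def ratio(form, vectors, p):
    """The quotient whose supremum defines the norm."""
    denom = 1.0
    for x in vectors:
        denom *= lp_norm(x, p)
    return evaluate(form, vectors) / denom

def random_lower_bound(form, n, t, p, trials=2000, seed=0):
    """Best quotient found over random vectors; a lower bound on the norm."""
    rng = random.Random(seed)
    best = 0.0
    for _ in range(trials):
        vectors = [[rng.uniform(-1, 1) for _ in range(n)] for _ in range(t)]
        best = max(best, ratio(form, vectors, p))
    return best
```

As a check, for the diagonal trilinear form with coefficient 1 at every index (i, i, i) and p = 3, Hölder's inequality bounds every quotient by 1, and the first standard basis vector in all three slots attains it.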

The notion of spectral expansion we shall use is relative to a fixed regular -uniform hypergraph . In particular, for a regular -uniform hypergraph , we define

$$\lambda_K(H)=\|A_H-A_K\|_{\ell_t,\dots,\ell_t}. \tag{2}$$

For graphs, this parameter coincides with  if  is the complete graph with all loops.

### 1.5. Cayley hypergraphs.

A Cayley hypergraph over a finite group  is a disjoint union of particular permutation hypergraphs as mentioned in Section 1.3. Let be an integer vector such that no element of  has order  for every . This ensures that for every , the maps are permutations. For , we define to be the hypergraph as in Section 1.3 based on the permutations . For a multiset , we let be the -regular hypergraph given by the union of for .

To connect the above definitions, consider a Cayley hypergraph . For a subset , let be a sub-hypergraph of  and let . Then, for every set of density , we have . Dividing by  shows that the fraction of edges that  induces in  is within  of the fraction of edges it induces in .

### 1.6. Translation invariant equations

To motivate the above definitions we focus on a special class of Cayley hypergraphs that arises from systems of translation invariant equations. Such a system can be given in terms of a matrix and a vector such that . For an Abelian group  without elements of order for every , we then consider the set of solutions in to the linear equations defined by ,

$$\mathrm{Sol}(C)=\{h=(h[1],\dots,h[t])\in\Gamma^t : Ch=0\}.$$

There is a large body of literature on the problem of bounding the maximum size of a set such that contains only trivial solutions. Well-studied examples involving a single equation (where ) include Sidon sets [O’B04], where , and sets without 3-term arithmetic progressions (APs), where (sometimes referred to as cap sets) [O’B11, San11, EG16]. Sets avoiding a general -variate translation invariant equation were studied in [Ruz93, Blo12, SS14]. Probably the most-studied examples involving more than one equation are -term APs [Sze90, Gre07, Tao07, O’B11], where

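$$C=\begin{bmatrix} 1 & -2 & 1 & 0 & \cdots & 0 & 0 & 0\\ 0 & 1 & -2 & 1 & \cdots & 0 & 0 & 0\\ \vdots & \vdots & \vdots & \vdots & \ddots & \vdots & \vdots & \vdots\\ 0 & 0 & 0 & 0 & \cdots & 1 & -2 & 1 \end{bmatrix}\in\mathbb{Z}^{(t-2)\times t}. \tag{3}$$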
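As a sanity check on the matrix in (3), the Python sketch below builds it for a given t and enumerates its solution set over the cyclic group Z_N by brute force. The helper names `ap_matrix` and `solutions` are hypothetical, and exhaustive search is only meant for very small N and t.

```python
from itertools import product

def ap_matrix(t):
    """The (t-2) x t matrix from (3): row i carries 1, -2, 1 in columns i, i+1, i+2."""
    C = [[0] * t for _ in range(t - 2)]
    for i in range(t - 2):
        C[i][i], C[i][i + 1], C[i][i + 2] = 1, -2, 1
    return C

def solutions(C, N):
    """All h in (Z_N)^t with C h = 0 modulo N, found by exhaustive search."""
    t = len(C[0])
    return [
        h for h in product(range(N), repeat=t)
        if all(sum(c * x for c, x in zip(row, h)) % N == 0 for row in C)
    ]
```

For N = 5 and t = 4 the solutions are exactly the 25 tuples (x, x+y, x+2y, x+3y), and the set is closed under adding the same element to every coordinate.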

Translation invariance refers to the fact that for every in  and every , the tuple belongs to  as well. As such, is a union of cosets of the subgroup . If is a set of representatives of these cosets, then the edge set of the hypergraph is furnished precisely by the (unordered) tuples in , which leads to the following definition.

###### Definition 1.3 (Arithmetic expander).

Let be the Cayley hypergraph as above. A multiset is a -arithmetic expander if

$$\lambda_K\big(\mathrm{Cay}^{(t)}(\Gamma,q,S')\big)\le\varepsilon.$$

The preceding discussion shows that an arithmetic expander has the property that for every set  of density , the fraction of solutions in  among the cosets represented by  is within of the fraction of all solutions in . For APs, this means the following. The matrix  as in (3) satisfies for , from which it follows that consists of cosets represented by APs through zero, , which correspond to the possible steps that an AP can take. In this case, an arithmetic expander is thus characterized by a small set of steps  such that the fraction of APs in any set  taking steps from  gives an accurate estimate of the fraction of all APs in . The AP matrix  also satisfies for , from which it follows that consists of the cosets with representatives given by the points through which -term APs travel, . In this case, an arithmetic expander thus estimates the fraction of all APs by the fraction of APs travelling through a small fixed set of points.

## 2. Our results

### 2.1. Spectral expansion of Cayley hypergraphs.

Our first result is an extension of Proposition 1.1 concerning arithmetic expanders for -APs where is a prime.

###### Theorem 2.1.

For every prime  there exist such that the following holds. Let be an integer, let and let be as in (3) with . Then, for any , any -arithmetic expander has size at least .

Our second result is a version of Theorem 1.2, showing for instance that in the AP case, for  as in (3), there exist -arithmetic expanders of size for both options of , where depends on and only.

###### Theorem 2.2.

For every integer and there exists a such that the following holds. Let be a finite group of cardinality , let be such that has no elements of order for every , let be a multiset and . For , let be a multi-set of  independent uniformly distributed tuples from . Then, with probability at least , .

### 2.2. A deviation bound for sums of random tensors

Our proof of Theorem 2.2 follows lines similar to those of a slick proof of Theorem 1.2 due to Landau and Russell [LR04]. Their proof is based on a matrix-valued deviation inequality called the matrix-Chernoff bound. One can also use the following matrix version of the Hoeffding bound, which follows from a non-commutative Khintchine inequality of Tomczak-Jaegermann [TJ74] (see Appendix A) and which is more in line with the tools we shall use below.

###### Theorem 2.3 (Matrix Hoeffding bound).

There exist absolute constants  such that the following holds. Let  be independent random matrices such that  for each . Then, for any , we have

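$$\Pr\left[\Big\|\frac{1}{k}\sum_{i=1}^{k}\big(A_i-\mathbb{E}[A_i]\big)\Big\|_{S_\infty}>\varepsilon\right]\le C\exp\left(-\frac{ck\varepsilon^2}{\log n}\right).$$

###### Proof of Theorem 1.2.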

For  let  denote the adjacency matrix of the Cayley graph . Since  is the average of two permutation matrices, . Observe that if  is a uniformly distributed element, then . Moreover, since , the result now follows from Theorem 2.3. ∎
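The two facts used in this proof can be checked on a small example. The Python sketch below works over the cyclic group Z_n, which is an assumption made only for illustration. It builds the normalized adjacency matrix of a single-generator Cayley graph as an average of two shift matrices, averages over all generators, and samples the second-eigenvalue parameter of a random Cayley graph via the character formula. All function names are hypothetical.

```python
from fractions import Fraction
import math
import random

def generator_matrix(n, g):
    """Normalized adjacency matrix of Cay(Z_n, {g}): average of the shifts by g and -g."""
    A = [[Fraction(0)] * n for _ in range(n)]
    for x in range(n):
        A[x][(x + g) % n] += Fraction(1, 2)
        A[x][(x - g) % n] += Fraction(1, 2)
    return A

def average(mats):
    """Entrywise average of a list of square matrices."""
    n = len(mats[0])
    return [[sum(M[i][j] for M in mats) / len(mats) for j in range(n)] for i in range(n)]

def sampled_lambda(n, k, seed=0):
    """Second-eigenvalue parameter of Cay(Z_n, S) for k uniform random generators."""
    rng = random.Random(seed)
    gens = [rng.randrange(n) for _ in range(k)]
    return max(
        abs(sum(math.cos(2 * math.pi * a * g / n) for g in gens)) / k
        for a in range(1, n)
    )
```

Averaging `generator_matrix(n, g)` over all g in Z_n returns the matrix with every entry equal to 1/n, which is the expectation used in the proof. Calling `sampled_lambda` with increasing k gives a rough feel for how quickly the deviation shrinks.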

The proof of Theorem 2.2 is similarly based on a new deviation bound for multi-linear forms that belong to a generalization of the Birkhoff polytope (of doubly-stochastic matrices). To define this polytope, we first consider the following generalization of a doubly-stochastic matrix. Let be the standard basis vectors and let denote the all-ones vector. A -linear form on is plane sub-stochastic if is nonnegative on the standard basis vectors and if for every , we have

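$$\begin{aligned} A(e_s,\mathbf{1},\mathbf{1},\dots,\mathbf{1}) &\le 1\\ A(\mathbf{1},e_s,\mathbf{1},\dots,\mathbf{1}) &\le 1\\ &\;\;\vdots\\ A(\mathbf{1},\mathbf{1},\dots,\mathbf{1},e_s) &\le 1. \end{aligned} \tag{4}$$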
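The conditions in (4) say that, for each slot of the form, summing the coefficients over all other slots gives at most 1. A minimal Python sketch of this test follows. It represents a form as a dictionary from index tuples to its values on standard basis vectors. The name `is_plane_substochastic` and the tolerance parameter are hypothetical.

```python
from collections import defaultdict

def is_plane_substochastic(form, t, tol=1e-12):
    """Check nonnegativity on basis vectors and the plane-sum conditions (4).

    form maps index tuples (i_1, ..., i_t) to A(e_{i_1}, ..., e_{i_t}).
    """
    if any(c < 0 for c in form.values()):
        return False
    for axis in range(t):
        plane_sums = defaultdict(float)
        for idx, c in form.items():
            # Accumulates A(1, ..., e_s, ..., 1) with e_s placed in slot `axis`.
            plane_sums[idx[axis]] += c
        if any(s > 1 + tol for s in plane_sums.values()):
            return False
    return True
```

For example, the form that puts weight 1 on each tuple (v, v+1, v+2) modulo 5 passes the test with every plane sum equal to 1, and doubling its coefficients makes it fail.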

Let  be the polytope of -linear forms  on such that the form  defined by for , is plane sub-stochastic. Observe that the set is the set of matrices such that is doubly sub-stochastic. (Recall that the Birkhoff–von Neumann Theorem states that the Birkhoff polytope is the convex hull of the set of permutation matrices. In [LL14] it is shown that for , the natural analogue of this fails for the set of forms in  that attain equalities in (4) and are nonnegative on standard basis vectors.) Our deviation bound is then as follows.

###### Theorem 2.4.

For every integer  there exist absolute constants such that the following holds. Let  be independent random elements over . Then, for any and ,

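$$\Pr\left[\Big\|\frac{1}{k}\sum_{i=1}^{k}\big(A_i-\mathbb{E}[A_i]\big)\Big\|_{\ell_p,\dots,\ell_p}>\varepsilon\right]\le C\exp\left(-\frac{ck\varepsilon^2}{\sigma_{p,t}(n)^2}\right), \tag{5}$$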

where

$$\sigma_{p,t}(n)=n^{\frac{1}{2}-\frac{1}{p}}\max\Big\{1,\,n^{1-\frac{1}{2t}-\frac{t-1}{p}}\Big\}(\log n)^{\frac{t+1}{2}}.$$

For example, for , we have , and . The proof of Theorem 2.2 is now nearly identical to the proof of the Alon–Roichman theorem shown above.

###### Proof of Theorem 2.2.

Let . For  let  be the adjacency form of and recall from Section 1.5 that is plane sub-stochastic. Observe that if  is uniform over , then . Finally, since , the result follows from Theorem 2.4 (with ) and the definition of . ∎
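To see what Theorem 2.4 yields in this proof, it helps to isolate the polynomial growth of the parameter from Theorem 2.4 and ignore its logarithmic factor. The Python sketch below assumes the polynomial part has exponent 1/2 − 1/p + max{0, 1 − 1/(2t) − (t−1)/p}, which is one reading of the displayed formula. The function name `sigma_poly_exponent` is hypothetical.

```python
from fractions import Fraction

def sigma_poly_exponent(p, t):
    """Exponent e with sigma_{p,t}(n) = n^e times a power of log n (assumed reading)."""
    p, t = Fraction(p), Fraction(t)
    return Fraction(1, 2) - 1 / p + max(Fraction(0), 1 - 1 / (2 * t) - (t - 1) / p)
```

Under this reading, setting p = t gives the exponent (t−1)/(2t), so the squared parameter grows like n to the power 1 − 1/t up to logarithmic factors. That is the order of the number of random tuples for which the tail bound (5) becomes nontrivial at fixed deviation.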

###### Remark 2.5 (Sub-optimality of Theorem 2.4?).

We conjecture that when , the dependence of (5) on  is sub-optimal in the sense that can be replaced with some function . However, due to a result of Naor, Regev, and the first author [BNR12] (see also [Bri15]), it must be the case that for every . Their result implies that for every there exist , such that the following holds. For infinitely many , there exists a collection of forms such that for independent Rademacher random variables (satisfying ), we have

$$\mathbb{E}\left[\Big\|\frac{1}{k}\sum_{i=1}^{k}\epsilon_iB_i\Big\|_{\ell_t,\dots,\ell_t}\right]\ge\varepsilon(t).$$

Setting , a standard calculation shows that the above expectation is at most

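$$\int_0^\infty\Pr\left[\Big\|\frac{1}{k}\sum_{i=1}^{k}\big(A_i-\mathbb{E}[A_i]\big)\Big\|_{\ell_t,\dots,\ell_t}>\varepsilon\right]d\varepsilon\le C\sqrt{\frac{f(n)}{k}}$$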

for some absolute constant , showing that  cannot be poly-logarithmic in .

### Open problems

Our results leave open the problem of determining the minimal degree required for spectral expansion of random Cayley hypergraphs. Remark 2.5 could be interpreted as suggesting the intriguing possibility that, in stark contrast with the Alon–Roichman Theorem, this degree must be quasi-polynomial in the size of the group. Another problem is to determine the optimal form of Theorem 2.4. Finally, it is open if the straightforward generalization of the Expander Mixing Lemma given in Proposition 3.1 below admits a converse for Cayley hypergraphs. A converse was shown to hold for Cayley graphs by Kohayakawa, Rödl, and Schacht [KRS16] and Conlon and Zhao [CZ16].

### Acknowledgements

J. B. would like to thank Jozef Skokan for pointing him to [LM15b] and Zeev Dvir and Sivakanth Gopi for helpful discussions.

## 3. Proof of Theorem 2.1

In this section we prove Theorem 2.1. To rephrase this result, consider for a set the Cayley hypergraph

$$L_D=\mathrm{Cay}^{(p)}\big(\mathbb{F}_p^n,\mathbf{1},\{\{0,y,2y,\dots,(p-1)y\} : y\in D\}\big).$$

Then, by Definition 1.3, Theorem 2.1 says that for every of size , the hypergraphs and satisfy . The first ingredient of the proof is the following straightforward generalization of the Expander Mixing Lemma [AC88], which follows directly from the above definitions.

###### Proposition 3.1 (Generalized Expander Mixing Lemma).

Let be a Cayley hypergraph, be a multiset and . Then, for every ,

To prove the theorem it thus suffices to show that for every of size , there exist such that on the one hand, , while on the other hand, , which is precisely what we shall do with sets satisfying . We achieve this by constructing a combinatorial rectangle that contains many lines, but no lines with direction in , by which we mean the following. Define the line through in direction , denoted , to be the sequence . Say that contains if for every . Denote by the number of lines contained in  that have direction . The following proposition shows why considering lines through rectangles suffices.

###### Proposition 3.2.

Let so that and are disjoint, and let be the -dimensional combinatorial rectangle . Then,

$$A_{L_D}(1_{T_1},1_{T_2},\dots,1_{T_2})=L_D(R)/|D|.$$

###### Proof.

Recall from Section 1.5 and multi-linearity, that

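$$\begin{aligned} A_{L_D}(1_{T_1},1_{T_2},\dots,1_{T_2}) &= \sum_{z_1\in T_1}\cdots\sum_{z_p\in T_p}A_{L_D}(1_{\{z_1\}},\dots,1_{\{z_p\}})\\ &= \frac{1}{|D|\,p!}\sum_{y\in D}\sum_{x\in\mathbb{F}_p^n}\sum_{\sigma\in S_p}1_{T_1}(x+(\sigma(1)-1)y)\cdots 1_{T_2}(x+(\sigma(p)-1)y). \end{aligned} \tag{6}$$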

Consider a pair such that the corresponding sum over  in (6) is nonzero. We claim that in this case, the sum equals . Indeed, if  is a permutation such that the corresponding term in the sum equals 1, then since  and  are disjoint, a term corresponding to another permutation  is nonzero if and only if . Let be the set of such pairs for which the sum over  is nonzero. It follows that (6) is equal to and the proposition follows if .

We compute the size of . Let be the function that maps a pair in to a line in where is an arbitrary permutation that contributes to the corresponding sum in (6). To see that the image of contains only lines in , observe that for every pair , and for , we have and for every . Moreover, is surjective, since for each line in , we have because the term corresponding to the identity permutation in (6) is nonzero.

Next we show that for each , its pre-image under  has size exactly , which implies the proposition. Let be a pair in . Then is fixed, and for some . We claim that all such choices of are in . Indeed, for every , and such that , we have that and for all other . Therefore . ∎
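Proposition 3.2 can be verified by brute force on small instances. The Python sketch below evaluates the right-hand side of (6) term by term and compares it with a direct count of lines in the rectangle. The function names are hypothetical, points of the vector space are encoded as tuples of residues, and the exhaustive loops are only practical for very small p and n.

```python
from fractions import Fraction
from itertools import permutations, product
from math import factorial

def shift(x, y, c, p):
    """The point x + c*y in F_p^n."""
    return tuple((a + c * b) % p for a, b in zip(x, y))

def lines_in_rectangle(T1, T2, D, p, n):
    """Count pairs (x, y), y in D, with x in T1 and x + i*y in T2 for i = 1..p-1."""
    count = 0
    for y in D:
        for x in product(range(p), repeat=n):
            if x in T1 and all(shift(x, y, i, p) in T2 for i in range(1, p)):
                count += 1
    return count

def form_value(T1, T2, D, p, n):
    """Evaluate the last expression in (6) by summing over y, x and sigma."""
    total = 0
    for y in D:
        for x in product(range(p), repeat=n):
            for sigma in permutations(range(p)):
                # sigma[i] plays the role of sigma(i+1) - 1 in (6).
                if shift(x, y, sigma[0], p) in T1 and all(
                    shift(x, y, sigma[i], p) in T2 for i in range(1, p)
                ):
                    total += 1
    return Fraction(total, len(D) * factorial(p))
```

With p = 3, n = 2, disjoint sets T1 and T2 and a handful of directions, the two quantities agree after dividing the line count by the number of directions, as the proposition predicts.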

Theorem 2.1 will thus follow from the following result.

###### Theorem 3.3.

Let and let be a set of size . Then, there exist disjoint sets such that the -dimensional rectangle contains at least lines, but no lines with direction in .

###### Proof of Theorem 2.1.

Let be as in Theorem 3.3. Then, by Proposition 3.2, we have , but

$$A_{L_{\mathbb{F}_p^n}}(1_{T_1},1_{T_2},\dots,1_{T_2})\ge\frac{1}{p^n}\,p^{2n+p-p^2}\ge\frac{1}{p^{p^2-p}}\,|T_1|^{1/p}|T_2|^{(p-1)/p}.$$

The result now follows from Proposition 3.1. ∎

The proof of Theorem 3.3 uses the polynomial method. For the remainder of this section let . For an -variate polynomial denote .

###### Lemma 3.4.

Let be a homogeneous polynomial of degree at most . Let and let be such that the set is nonempty. Then, the -dimensional rectangle contains no lines with directions .

###### Proof.

Recall that a Vandermonde matrix is a square matrix of the form

We record the well-known and easy fact that if the  are distinct, then the above matrix has a nonzero determinant and therefore full rank.

For a contradiction, suppose there does exist such a line with . Consider the polynomial defined by . Since  has degree at most , so does . Moreover, since and since  is homogeneous, the constant term and the coefficient of  of  are zero. Our assumption that for every then implies

$$g(\lambda)=\sum_{i=1}^{p-2}c_i\lambda^i=a^{-1}f(x+\lambda d)=1,\qquad \lambda\in[p-1].$$

Hence, the all-ones vector lies in the linear span of the vectors for , since . But the matrix is a full-rank Vandermonde matrix, which is a contradiction. ∎
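The linear-algebra step at the end of this proof can be checked numerically for small primes. The Python sketch below computes ranks modulo p by Gaussian elimination and builds the vectors of i-th powers of 1, ..., p−1. The names `rank_mod_p` and `power_rows` are hypothetical, and the modular inverse via `pow(a, -1, p)` needs Python 3.8 or later.

```python
def rank_mod_p(rows, p):
    """Rank over F_p of a matrix given as a list of rows, by Gaussian elimination."""
    rows = [[v % p for v in r] for r in rows]
    rank = 0
    for col in range(len(rows[0])):
        piv = next((i for i in range(rank, len(rows)) if rows[i][col]), None)
        if piv is None:
            continue
        rows[rank], rows[piv] = rows[piv], rows[rank]
        inv = pow(rows[rank][col], -1, p)
        rows[rank] = [(v * inv) % p for v in rows[rank]]
        for i in range(len(rows)):
            if i != rank and rows[i][col]:
                factor = rows[i][col]
                rows[i] = [(a - factor * b) % p for a, b in zip(rows[i], rows[rank])]
        rank += 1
    return rank

def power_rows(p, exponents):
    """One row (lam^i for lam = 1..p-1) per exponent i."""
    return [[pow(lam, i, p) for lam in range(1, p)] for i in exponents]
```

The rows for exponents 0 through p−2 form a square Vandermonde matrix of full rank p−1. Its exponent-0 row is the all-ones vector, so that vector cannot lie in the span of the rows for exponents 1 through p−2, which is the contradiction used above.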

The following basic and standard result (see for example [Tao14]) shows that for any small set , we can always find a low-degree homogeneous polynomial  such that .

###### Lemma 3.5 (Homogeneous Interpolation).

For every of size there exists a nonzero homogeneous polynomial of degree  such that .

###### Proof.

Let  be the vector space of homogeneous degree- polynomials in . Note that . Let . Let be the linear map given by . Since , it follows from the Rank Nullity Theorem that . Hence, there exists a nonzero such that for all . ∎
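The dimension-counting argument is constructive: a vanishing polynomial is any nonzero vector in the kernel of the evaluation map. The Python sketch below carries this out over a prime field for homogeneous polynomials of a chosen degree d. All function names are hypothetical, and `pow(a, -1, p)` needs Python 3.8 or later.

```python
from itertools import combinations_with_replacement

def monomials(n, d):
    """Exponent vectors of all monomials of degree exactly d in n variables."""
    result = []
    for combo in combinations_with_replacement(range(n), d):
        exps = [0] * n
        for i in combo:
            exps[i] += 1
        result.append(tuple(exps))
    return result

def evaluate_monomial(exps, x, p):
    value = 1
    for e, xi in zip(exps, x):
        value = value * pow(xi, e, p) % p
    return value

def kernel_vector(rows, ncols, p):
    """A nonzero c with rows * c = 0 over F_p, or None if the kernel is trivial."""
    rows = [[v % p for v in r] for r in rows]
    pivot_row_of = {}
    r = 0
    for col in range(ncols):
        piv = next((i for i in range(r, len(rows)) if rows[i][col]), None)
        if piv is None:
            continue
        rows[r], rows[piv] = rows[piv], rows[r]
        inv = pow(rows[r][col], -1, p)
        rows[r] = [(v * inv) % p for v in rows[r]]
        for i in range(len(rows)):
            if i != r and rows[i][col]:
                factor = rows[i][col]
                rows[i] = [(a - factor * b) % p for a, b in zip(rows[i], rows[r])]
        pivot_row_of[col] = r
        r += 1
    free = [c for c in range(ncols) if c not in pivot_row_of]
    if not free:
        return None
    vec = [0] * ncols
    vec[free[0]] = 1  # set one free variable to 1 and solve for the pivot variables
    for col, row in pivot_row_of.items():
        vec[col] = (-rows[row][free[0]]) % p
    return vec

def vanishing_polynomial(D, n, d, p):
    """Monomials and coefficients of a homogeneous degree-d polynomial vanishing on D."""
    monos = monomials(n, d)
    rows = [[evaluate_monomial(m, x, p) for m in monos] for x in D]
    return monos, kernel_vector(rows, len(monos), p)
```

Whenever D has fewer points than there are monomials, the kernel is nontrivial and `vanishing_polynomial` returns nonzero coefficients.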

We also use the following standard result bounding the zero-set of a polynomial in terms of its degree; the specific form quoted below is from  [CT14, Lemma 2.2].

###### Lemma 3.6 (DeMillo–Lipton–Schwartz–Zippel).

Let  be a finite field with  elements and let  be a nonzero polynomial of degree . Then,

$$|Z(f)|\le\left(1-\frac{1}{q^{d/(q-1)}}\right)q^n.$$

###### Proof.

For write . Using induction on  we shall prove that , which establishes the result. First observe that, by Lagrange’s Theorem, we may assume that each variable in  has degree at most . The base case follows from the Factor Theorem, since then .

Assume the result holds for -variable polynomials. We can decompose  as

$$f(t,y_1,\dots,y_{n-1})=\sum_{i=1}^{\min\{d,q-1\}}t^i g_i(y_1,\dots,y_{n-1}), \tag{7}$$

where has degree at most . Let  be the maximum  for which  is nonzero. By the induction hypothesis, the polynomial  satisfies .

For each  let be the univariate polynomial defined by . The decomposition (7) shows that each  is nonzero and has degree , and thus . We conclude that

$$|\overline{Z(f)}|\ge\sum_{y\in\overline{Z(g_k)}}|\overline{Z(h_y)}|\ge q^n/q^{d/(q-1)}.$$
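The bound of Lemma 3.6 is easy to test by exhaustive enumeration over a small prime field. In the Python sketch below, polynomials are passed as ordinary Python functions and the degree d is supplied by the caller. Both function names are hypothetical.

```python
from itertools import product

def count_zeros(f, q, n):
    """Number of points x in F_q^n (q prime) with f(x) = 0."""
    return sum(1 for x in product(range(q), repeat=n) if f(*x) % q == 0)

def dlsz_bound(q, n, d):
    """The upper bound (1 - q^(-d/(q-1))) * q^n from Lemma 3.6."""
    return (1 - q ** (-d / (q - 1))) * q ** n
```

Over the field with three elements, the polynomial xy has 5 zeros in the plane against a bound of 6, and the linear polynomial x has 3 zeros against a bound of roughly 3.8.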

Finally, we use the Chevalley–Warning Theorem to lower bound the number of common zeros of a system of polynomials [LN83, Chapter 6].

###### Theorem 3.7 (Chevalley–Warning).

Let be a finite field and let be nonzero polynomials such that . If there is at least one solution to the system in , then there are at least  solutions.

We include a quick proof we learned from Dion Gijswijt, which is based on Lemma 3.6.

###### Proof.

Define the polynomial by . Observe that and that if and otherwise. By Lemma 3.6, . Hence, . ∎
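The construction in this proof can be illustrated by brute force. The Python sketch below lists the common zeros of a small system over a prime field and evaluates the product of the terms 1 − f_i^{q−1}, which is the standard indicator of the common zero set and is assumed here to be the polynomial used in the proof. The comparison value q^{n−d}, with d the sum of the degrees, is Warning's classical bound. The function names are hypothetical.

```python
from itertools import product

def common_zeros(polys, q, n):
    """All x in F_q^n (q prime) on which every polynomial in polys vanishes."""
    return [
        x for x in product(range(q), repeat=n)
        if all(f(*x) % q == 0 for f in polys)
    ]

def indicator(polys, q, x):
    """Product of (1 - f_i(x)^(q-1)) mod q: 1 on common zeros and 0 elsewhere."""
    value = 1
    for f in polys:
        value = value * (1 - pow(f(*x) % q, q - 1, q)) % q
    return value
```

For the system x1 + x2 + x3 + x4 = 0 and x1 x2 − x3 x4 = 0 over the field with five elements, the degrees sum to 3, so the classical bound guarantees at least 5 common zeros given that the origin is one.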

With this, we are set up to prove Theorem 3.3.

###### Proof of Theorem 3.3.

By Lemma 3.5, there exists a nonzero degree- homogeneous polynomial such that . Set . By Lemma 3.6, there exists an  such that the set  is nonempty. For each set . It then follows from Lemma 3.4 that the combinatorial rectangle contains no lines with direction in .

We show that  contains many lines. To this end, define degree- polynomials by setting and for each . Then, a solution in  to the set of equations is a line through . There is at least one such solution. Indeed, if we let and , then since  is homogeneous of degree , we have and by Fermat’s Little Theorem. By Theorem 3.7, the above system has at least solutions in  and  has at least that many lines. ∎

## 4. Proof of Theorem 2.4

In this section we prove Theorem 2.4. Throughout this section, let be independent uniformly distributed -valued random variables and let .

### 4.1. Reduction to Bernoulli processes

The main new ingredient needed for the proof of Theorem 2.4 is a bound showing that for fixed , the expected norm of the Rademacher sum is at most a constant times . From this, we derive the result using standard techniques based on combining a symmetrization trick, the Kahane–Khintchine inequality and an exponential Markov inequality. The details follow below. Recall that a real-valued random variable is centered if it has expectation zero.

The following standard symmetrization lemma allows us to bound the moments of the random variable whose tail we aim to bound in (5) in terms of the moments of the norm of a Rademacher sum of fixed plane sub-stochastic forms.

###### Lemma 4.1 (Symmetrization).

Let  be a real finite-dimensional normed vector space and let  be subsets. For let