A Berry-Esseen type inequality for convex bodies with an unconditional basis

Bo’az Klartag

Department of Mathematics
Princeton University
Princeton, NJ 08544, USA
(e-mail: bklartag@princeton.edu)
The author is a Clay Research Fellow, and is also supported by NSF grant .
Abstract

Suppose $X = (X_1,\ldots,X_n)$ is a random vector, distributed uniformly in a convex body $K \subset \mathbb{R}^n$. We assume the normalization $\mathbb{E} X_i^2 = 1$ for $i = 1,\ldots,n$. The body $K$ is further required to be invariant under coordinate reflections, that is, we assume that $(\pm X_1,\ldots,\pm X_n)$ has the same distribution as $(X_1,\ldots,X_n)$ for any choice of signs. Then, we show that

$$\mathbb{E}\left(|X| - \sqrt{n}\right)^2 \leq C^2,$$

where $C$ is a positive universal constant, and $|\cdot|$ is the standard Euclidean norm in $\mathbb{R}^n$. The estimate is tight, up to the value of the constant. It leads to a Berry-Esseen type bound in the central limit theorem for unconditional convex bodies.

1 Introduction

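Let $X_1,\ldots,X_n$ be random variables. We assume that the random vector $X = (X_1,\ldots,X_n)$ is distributed according to a density $f: \mathbb{R}^n \to [0,\infty)$, and that the following hold:

(A) The joint density $f$ is log-concave. That is, the function $f$ has the form $f = e^{-H}$ with $H: \mathbb{R}^n \to (-\infty,\infty]$ being a convex function.

(B) The joint density $f$ is “unconditional”. That is, for any point $(x_1,\ldots,x_n) \in \mathbb{R}^n$ and a sign vector $(\delta_1,\ldots,\delta_n) \in \{\pm 1\}^n$,

$$f(x_1,\ldots,x_n) = f(\delta_1 x_1,\ldots,\delta_n x_n).$$

Equivalently, the random vector $(\pm X_1,\ldots,\pm X_n)$ has the same distribution as $(X_1,\ldots,X_n)$ for any choice of signs.

(C) The isotropic normalization $\mathbb{E} X_i^2 = 1$ holds for $i = 1,\ldots,n$.

A particular case is when $X$ is distributed uniformly in a convex set $K \subset \mathbb{R}^n$, which is normalized so that $\mathbb{E} X_i^2 = 1$ for all $i$, and is also “unconditional”, i.e., for any $(x_1,\ldots,x_n) \in \mathbb{R}^n$ and for any choice of signs,

$$(x_1,\ldots,x_n) \in K \quad\Rightarrow\quad (\pm x_1,\ldots,\pm x_n) \in K.$$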

We prove the following Berry-Esseen type theorem:

Theorem 1

Under assumptions (A), (B) and (C),

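$$\sup_{\alpha \leq \beta}\left|\,\mathbb{P}\left(\alpha \leq \frac{1}{\sqrt{n}}\sum_{i=1}^n X_i \leq \beta\right) - \frac{1}{\sqrt{2\pi}}\int_\alpha^\beta e^{-t^2/2}\,dt\,\right| \leq \frac{C}{n}, \tag{1}$$

where $C > 0$ is a universal constant. Moreover, for any $(\theta_1,\ldots,\theta_n) \in \mathbb{R}^n$ with $\sum_{i=1}^n \theta_i^2 = 1$,

$$\sup_{\alpha \leq \beta}\left|\,\mathbb{P}\left(\alpha \leq \sum_{i=1}^n \theta_i X_i \leq \beta\right) - \frac{1}{\sqrt{2\pi}}\int_\alpha^\beta e^{-t^2/2}\,dt\,\right| \leq C\sum_{i=1}^n \theta_i^4. \tag{2}$$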

The log-concavity requirement (A) is crucial. A simple example may be described as follows: Denote by $e_1,\ldots,e_n$ the standard orthonormal basis in $\mathbb{R}^n$. Let $J$ be a random variable, distributed uniformly in the set $\{1,\ldots,n\}$. Let $Z$ be a random variable, independent of $J$, distributed uniformly in the interval $[-\sqrt{3n},\sqrt{3n}]$. Consider the random vector $X = Z e_J$. Then $(\pm X_1,\ldots,\pm X_n)$ has the same distribution as $(X_1,\ldots,X_n)$ for any choice of signs, and also $\mathbb{E} X_i^2 = 1$ for all $i$. However, $\frac{1}{\sqrt{n}}\sum_{i=1}^n X_i$ is distributed uniformly in an interval, and hence its distribution is far from normal. This demonstrates that assumptions (B) and (C) alone cannot guarantee gaussian approximation.
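The example above can be checked numerically. The following is a minimal sketch, not part of the original argument. It assumes a concrete form of the construction: a single coordinate, chosen uniformly at random, carries a value that is uniform in $[-\sqrt{3n},\sqrt{3n}]$, and all other coordinates vanish. The function names `sample_x` and `summarize` are illustrative.

```python
import math
import random

def sample_x(n, rng):
    """One draw of X = Z * e_J with J uniform in {0, ..., n-1} and
    Z uniform in [-sqrt(3n), sqrt(3n)], independent of J."""
    a = math.sqrt(3 * n)
    x = [0.0] * n
    x[rng.randrange(n)] = rng.uniform(-a, a)
    return x

def summarize(n=10, trials=50_000, seed=0):
    """Return (empirical E X_i^2 for each coordinate,
               largest |sum_i X_i / sqrt(n)| observed,
               empirical P(|sum_i X_i / sqrt(n)| <= 1))."""
    rng = random.Random(seed)
    second_moments = [0.0] * n
    largest = 0.0
    inside = 0
    for _ in range(trials):
        x = sample_x(n, rng)
        for i, xi in enumerate(x):
            second_moments[i] += xi * xi / trials
        s = sum(x) / math.sqrt(n)
        largest = max(largest, abs(s))
        inside += abs(s) <= 1.0
    return second_moments, largest, inside / trials
```

Under these assumptions the normalized sum never leaves $[-\sqrt{3},\sqrt{3}]$, and the probability of $[-1,1]$ is $1/\sqrt{3} \approx 0.577$, whereas a standard gaussian assigns this interval probability about $0.683$.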

The bound in (1) is optimal, up to the precise value of the constant, as shown by the example of $X_1,\ldots,X_n$ being independent random variables, with each $X_i$ distributed, say, uniformly in a symmetric interval (see, e.g., [14, Vol. II, Section XVI.4]). A central element in the proof of Theorem 1 is the sharp estimate

$$\mathrm{Var}\left(\frac{|X|^2}{n}\right) = \mathbb{E}\left(\frac{|X|^2}{n} - 1\right)^2 \leq \frac{C}{n}, \tag{3}$$

for a positive universal constant $C$. Inequality (3) implies that most of the mass of the random vector $X$ is concentrated in a thin spherical shell of radius $\sqrt{n}$, centered at the origin in $\mathbb{R}^n$, whose width has the order of magnitude of a universal constant. The bound (3) was established by Wojtaszczyk [41] in the case of Orlicz balls following a result of Anttila, Ball and Perissinaki [1] regarding $\ell_p$-balls. We say that a random vector $X = (X_1,\ldots,X_n)$ in $\mathbb{R}^n$ is isotropically-normalized if $\mathbb{E} X_i = 0$ and $\mathbb{E} X_i X_j = \delta_{i,j}$ for all $i,j = 1,\ldots,n$, where $\delta_{i,j}$ is Kronecker’s delta. A conjecture going back to Anttila, Ball and Perissinaki (see [1, 5]) is that the thin spherical shell inequality (3) actually holds whenever $X$ is an isotropically-normalized random vector in $\mathbb{R}^n$ with a log-concave density. We were able to verify this conjecture under the additional assumption that the density of $X$ is unconditional.
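As an illustration of the thin shell estimate in the simplest unconditional case, consider the cube $[-\sqrt{3},\sqrt{3}]^n$. The sketch below is our own sanity check, not taken from the paper. For independent coordinates uniform in $[-\sqrt{3},\sqrt{3}]$ one has $\mathbb{E} X_i^2 = 1$ and $\mathbb{E} X_i^4 = 9/5$, so $\mathrm{Var}(|X|^2/n) = 4/(5n)$ exactly. The code compares this closed form with a seeded Monte Carlo estimate; the function names are hypothetical.

```python
import math
import random
from fractions import Fraction

def shell_variance_exact(n):
    """Var(|X|^2 / n) for X uniform in the cube [-sqrt(3), sqrt(3)]^n.
    The coordinates are independent with E X_i^2 = 1 and E X_i^4 = 9/5."""
    return (Fraction(9, 5) - 1) / n

def shell_variance_mc(n, trials=20_000, seed=0):
    """Monte Carlo estimate of Var(|X|^2 / n) for the same cube."""
    rng = random.Random(seed)
    a = math.sqrt(3.0)
    values = [sum(rng.uniform(-a, a) ** 2 for _ in range(n)) / n
              for _ in range(trials)]
    mean = sum(values) / trials
    return sum((v - mean) ** 2 for v in values) / trials
```

For instance, `shell_variance_exact(10)` is the fraction 2/25, and `shell_variance_mc(20)` should land close to `float(shell_variance_exact(20))`, in line with the $C/n$ decay.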

Theorem 1 ought to be understood in the context of the central limit theorem for convex bodies. The central limit theorem for convex bodies is the following high-dimensional effect, suggested in the works of Brehm and Voigt [8] and Anttila, Ball and Perissinaki [1], and proven in [22, 23]: Whenever $X = (X_1,\ldots,X_n)$ is an isotropically-normalized random vector in $\mathbb{R}^n$, for large $n$, with a log-concave density, then for “most” choices of coefficients $\theta_1,\ldots,\theta_n \in \mathbb{R}$, the random variable $\sum_{i=1}^n \theta_i X_i$ is approximately gaussian. (In the context of Theorem 1, note that if the vector of coefficients $(\theta_1,\ldots,\theta_n)$ is distributed uniformly on the unit sphere in $\mathbb{R}^n$, then the right-hand side of (2) is at most $C/n$ with probability greater than . Here are universal constants.) There is an intimate relation between the central limit theorem for convex bodies and thin spherical shell estimates like (3). This connection is well-known, beginning with the work of Sudakov [39]. The reader is referred to, e.g., [22] for more background on the central limit theorem for convex bodies and to, e.g., [1, 4, 5] for the relation to thin shell estimates.
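To get a feeling for the size of the right-hand side of (2) for a random direction, one may sample $\theta$ uniformly on the unit sphere and look at $n\sum_i \theta_i^4$. The sketch below is an illustration under our own choices (normalized gaussian vectors as the sampling method, hypothetical function names). It relies on the standard fact that $\mathbb{E}\sum_i \theta_i^4 = 3/(n+2)$ for such $\theta$, together with the deterministic lower bound $\sum_i \theta_i^4 \geq 1/n$ that follows from the Cauchy-Schwarz inequality.

```python
import random

def scaled_fourth_power_sum(n, rng):
    """Return n * sum_i theta_i^4 for theta uniform on the unit sphere of R^n,
    obtained by normalizing a standard gaussian vector."""
    g = [rng.gauss(0.0, 1.0) for _ in range(n)]
    norm_sq = sum(v * v for v in g)
    return n * sum(v ** 4 for v in g) / norm_sq ** 2

def sample_scaled_sums(n=1000, trials=200, seed=0):
    """Draw several independent values of n * sum_i theta_i^4."""
    rng = random.Random(seed)
    return [scaled_fourth_power_sum(n, rng) for _ in range(trials)]
```

The sampled values cluster around $3$, so that $\sum_i \theta_i^4$ is of order $1/n$ for a typical direction.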

Previous techniques for obtaining thin spherical shell estimates under convexity assumptions relied almost entirely on concentration of measure ideas, either on the sphere (see [15, 22]), or on the orthogonal group (see [23]). The quantitative estimates that these techniques have yielded so far are sub-optimal. Inequality (3) was previously known to hold with the bound in place of , where the exponent is slightly smaller than , see [22, 23]. The latter result is applicable for all isotropically-normalized random vectors with a log-concave density.

In this article we suggest a different approach. Rather than employing concentration of measure inequalities, our proof of the optimal inequality (3) is based on analysis of the Neumann Laplacian on convex domains, the so-called $L^2$-method in convexity, going back to Hörmander [18] and to Helffer and Sjöstrand [17]. The argument is further simplified by using the theory of optimal transportation of measures. We expect this technique to be useful also in the study of other problems in convex geometry, such as central limit theorems for convex bodies with various types of symmetries. The argument leading to the thin shell estimate occupies Section 2, Section 3 and Section 5. In Section 6 we apply these estimates and complete the proof of Theorem 1.

Readers who are interested only in the proof of inequality (3) and Theorem 1 may skip Section 4. This section is devoted to several results, obtained as by-products, regarding the first non-zero eigenvalue and the corresponding eigenfunctions of the Neumann Laplacian on $n$-dimensional convex bodies. In particular, we show that the eigenfunctions are all “biased” towards some direction in space. This rules out, for instance, the possibility of an even eigenfunction.

As the reader has probably figured out by now, we denote expectation by $\mathbb{E}$ and probability by $\mathbb{P}$. We write $\mathrm{Var}$ for variance, and $\mathrm{Vol}_n(A)$ for the Lebesgue measure of a measurable set $A \subset \mathbb{R}^n$. The scalar product of $x, y \in \mathbb{R}^n$ is denoted by $x \cdot y$. The letters $c, C, c', \tilde{C}$ etc. stand for various positive universal constants, whose value may change from one line to the next.

Acknowledgement. We would like to express our gratitude to Sasha Sodin for his kind help with the analysis related to the classical central limit theorem, to Tom Spencer for illuminating explanations regarding the work of Helffer and Sjöstrand, and to Dario Cordero-Erausquin, Leonid Friedlander, Robert McCann, Emanuel Milman, Vitali Milman and Elias Stein for valuable discussions on related topics. Thanks also to the referee for useful comments and suggestions.

2 Convexity and the Neumann Laplacian

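In this section we analyze some convexity related properties of the Neumann Laplacian, most of which are standard. A convex body in $\mathbb{R}^n$ is a compact, convex set with a non-empty interior. Let $K \subset \mathbb{R}^n$ be a convex body with a $C^\infty$-smooth boundary, to be fixed throughout this section. We say that a function $\varphi: K \to \mathbb{R}$ belongs to $C^\infty(K)$ if all of its derivatives of all orders exist and are bounded in the interior of $K$. When $\varphi$ is a $C^\infty(K)$-smooth function, the boundary values of $\varphi$ and its derivatives are well defined, and are $C^\infty$-smooth on the boundary $\partial K$. For $u \in C^\infty(K)$ define

$$\|u\|_{H^{-1}(K)} = \sup\left\{\int_K \varphi u \;;\; \varphi \in C^\infty(K),\ \int_K |\nabla\varphi|^2 \leq 1\right\}.$$

Note that necessarily $\int_K u = 0$ when $\|u\|_{H^{-1}(K)} < \infty$. For a function $f$ in $n$ variables and for $i = 1,\ldots,n$ we write $\partial_i f$ for the derivative of $f$ with respect to the $i$-th coordinate. When $f: K \to \mathbb{R}$ is a square-integrable function, set

$$\mathrm{Var}_K(f) = \int_K \left(f(x) - E\right)^2 dx$$

with $E = \frac{1}{\mathrm{Vol}_n(K)}\int_K f(x)\,dx$. The main result of this section reads as follows: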

Lemma 1

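Let $K \subset \mathbb{R}^n$ be a convex body with a $C^\infty$-smooth boundary. Let $f: K \to \mathbb{R}$ be a $C^\infty(K)$-smooth function. Then,

$$\mathrm{Var}_K(f) \leq \sum_{i=1}^n \|\partial_i f\|^2_{H^{-1}(K)}. \tag{4}$$

One may verify that the right-hand side of (4) does not depend on the choice of orthogonal coordinates in $\mathbb{R}^n$. See [13] for an analog of Lemma 1 for non-convex domains. Let $\rho: \mathbb{R}^n \to \mathbb{R}$ be a convex function which is $C^\infty$-smooth with bounded derivatives of all orders in a neighborhood of $K$, such that

$$\rho(x) = 0, \quad |\nabla\rho(x)| = 1 \qquad \text{for } x \in \partial K$$

and for . For instance, we may select . Note that for any $x \in \partial K$, the vector $\nabla\rho(x)$ is the outer unit normal to $\partial K$ at $x$.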

Denote by $\mathcal{D}$ the space of all $C^\infty(K)$-smooth functions $u: K \to \mathbb{R}$ that satisfy the following Neumann boundary condition:

$$\nabla u(x) \cdot \nabla\rho(x) = 0 \qquad \text{for } x \in \partial K.$$

The following lemma is a standard Bochner-Weitzenböck type integration by parts formula, going back at least to Lichnerowicz [25], to Hörmander [18] and to Kadlec [21]. We write $\nabla^2 u$ for the hessian matrix of the function $u$.

Lemma 2

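Let $u \in \mathcal{D}$ and denote $f = -\triangle u$. Then,

$$\int_K f^2 = \int_K \sum_{i=1}^n |\nabla\partial_i u|^2 + \int_{\partial K} (\nabla^2\rho)(\nabla u) \cdot \nabla u. \tag{5}$$

Proof: The function $x \mapsto \nabla u(x) \cdot \nabla\rho(x)$ vanishes on $\partial K$. Since $\nabla u$ is tangential to $\partial K$, the derivative of the function $\nabla u \cdot \nabla\rho$ in the direction of $\nabla u$ vanishes on $\partial K$. That is,

$$\nabla u(x) \cdot \nabla\left(\nabla u(x) \cdot \nabla\rho(x)\right) = 0 \qquad \text{for } x \in \partial K.$$

Equivalently,

$$(\nabla^2 u)(\nabla\rho) \cdot \nabla u + (\nabla^2\rho)(\nabla u) \cdot \nabla u = 0 \qquad \text{on } \partial K. \tag{6}$$

By Stokes theorem,

$$\int_K f^2 = \int_K (\triangle u)^2 = -\int_K \nabla(\triangle u) \cdot \nabla u + \int_{\partial K} (\triangle u\,\nabla u) \cdot \nabla\rho. \tag{7}$$

The boundary term vanishes, since $\nabla u \cdot \nabla\rho = 0$ on $\partial K$. We conclude from (7) and from an additional application of Stokes theorem that

$$\int_K f^2 = -\sum_{i=1}^n \int_K \partial_i u\,\triangle(\partial_i u) = \sum_{i=1}^n \int_K |\nabla\partial_i u|^2 - \int_{\partial K} \sum_{i=1}^n (\partial_i u\,\nabla\partial_i u) \cdot \nabla\rho.$$

Note that the integrand in the integral over $\partial K$ is exactly $(\nabla^2 u)(\nabla\rho) \cdot \nabla u$. Hence, from (6),

$$\int_K f^2 = \sum_{i=1}^n \int_K |\nabla\partial_i u|^2 + \int_{\partial K} (\nabla^2\rho)(\nabla u) \cdot \nabla u,$$

and the lemma is proven.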

The convexity of $K$ will be used next. Recall that $\rho$ is a convex function, and hence its hessian $\nabla^2\rho(x)$ is a positive semi-definite matrix for any $x \in \partial K$. Therefore, Lemma 2 implies that for any $u \in \mathcal{D}$,

$$\sum_{i=1}^n \int_K |\nabla\partial_i u|^2 \leq \int_K f^2 \tag{8}$$

where $f = -\triangle u$. Lemma 1 will be proven by dualizing inequality (8), in a way which is very much related to the approach taken by Hörmander [18] and by Helffer and Sjöstrand [17].

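Proof of Lemma 1: We are given $f \in C^\infty(K)$ and we would like to prove (4). We may assume that $\int_K f = 0$ (otherwise, subtract $\frac{1}{\mathrm{Vol}_n(K)}\int_K f$ from the function $f$).

Since $f \in C^\infty(K)$ and $\int_K f = 0$, there exists $u \in \mathcal{D}$ with

$$-\triangle u = f.$$

The existence of such $u \in \mathcal{D}$ is a consequence of the classical existence and regularity theory of the Neumann problem for the Laplacian on domains with a $C^\infty$-smooth boundary (see, e.g., Folland’s book [16, chapter 7]). Stokes theorem yields

$$\int_K f^2 = -\int_K f\,\triangle u = \int_K \nabla f \cdot \nabla u - \int_{\partial K} f\,\nabla u \cdot \nabla\rho = \sum_{i=1}^n \int_K \partial_i f\,\partial_i u,$$

where the boundary term vanishes since $u \in \mathcal{D}$. From the definition of the $H^{-1}(K)$-norm and the Cauchy-Schwarz inequality,

$$\int_K f^2 = \sum_{i=1}^n \int_K \partial_i f\,\partial_i u \leq \sum_{i=1}^n \|\partial_i f\|_{H^{-1}(K)} \sqrt{\int_K |\nabla\partial_i u|^2} \leq \sqrt{\sum_{i=1}^n \|\partial_i f\|^2_{H^{-1}(K)}} \cdot \sqrt{\sum_{i=1}^n \int_K |\nabla\partial_i u|^2}. \tag{9}$$

Combine (9) and (8) to conclude that

$$\int_K f^2 \leq \sum_{i=1}^n \|\partial_i f\|^2_{H^{-1}(K)}.$$

Since $\int_K f = 0$, the left-hand side equals $\mathrm{Var}_K(f)$, and (4) is proven.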

3 Transportation of Measure

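Suppose $\mu_1$ and $\mu_2$ are finite Borel measures on $\mathbb{R}^n$ and $\mathbb{R}^m$ respectively, and $T: \mathbb{R}^n \to \mathbb{R}^m$ is a measurable map. We say that $T$ pushes forward, or transports, $\mu_1$ to $\mu_2$ if

$$\mu_1(T^{-1}(A)) = \mu_2(A)$$

for all Borel sets $A \subset \mathbb{R}^m$. In this case we write , and we call $T$ the transportation map. Note that $\int \varphi(T(x))\,d\mu_1(x) = \int \varphi\,d\mu_2$ for any bounded, measurable function $\varphi$.

For example, let $\gamma$ be a Borel measure on $\mathbb{R}^n \times \mathbb{R}^n$. For $(x,y) \in \mathbb{R}^n \times \mathbb{R}^n$ we write $P_1(x,y) = x$ and $P_2(x,y) = y$. We say that the push-forward of $\gamma$ under $P_1$ is the marginal of $\gamma$ on the first coordinate, and the push-forward of $\gamma$ under $P_2$ is the marginal of $\gamma$ on the second coordinate. A measure $\gamma$ on $\mathbb{R}^n \times \mathbb{R}^n$ whose marginals on the first and second coordinates are $\mu_1$ and $\mu_2$, respectively, is called a “coupling” of $\mu_1$ and $\mu_2$.

Suppose $\mu_1$ and $\mu_2$ are two finite Borel measures on $\mathbb{R}^n$. If $T$ pushes forward $\mu_1$ to $\mu_2$, then the map

$$x \mapsto (x, Tx)$$

transports the measure $\mu_1$ to a measure $\gamma$ on $\mathbb{R}^n \times \mathbb{R}^n$ which is a coupling of $\mu_1$ and $\mu_2$. The $L^2$-Wasserstein distance between $\mu_1$ and $\mu_2$ is defined as

$$W_2(\mu_1,\mu_2) = \inf_\gamma \left(\int_{\mathbb{R}^n \times \mathbb{R}^n} |x - y|^2\,d\gamma(x,y)\right)^{1/2},$$

where the infimum runs over all couplings $\gamma$ of $\mu_1$ and $\mu_2$. If there is no coupling, then $W_2(\mu_1,\mu_2) = \infty$. Let $\mu$ be a finite, compactly-supported Borel measure on $\mathbb{R}^n$. For a $C^\infty$-smooth function $u: \mathbb{R}^n \to \mathbb{R}$, set

$$\|u\|_{H^{-1}(\mu)} = \sup\left\{\int_{\mathbb{R}^n} u\varphi\,d\mu \;;\; \varphi \in C^\infty(\mathbb{R}^n),\ \int_{\mathbb{R}^n} |\nabla\varphi|^2\,d\mu \leq 1\right\}.$$

This definition fits with the one given in Section 2: the norm $\|u\|_{H^{-1}(K)}$ is the $H^{-1}(\mu)$-norm for $\mu$ the restriction of the Lebesgue measure to $K$.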

The next theorem is an extension of a remark by Yann Brenier [9] that we learned from Robert McCann. For the convenience of the reader, we provide in the appendix a detailed exposition of the elegant proof from Villani [40, Section 7.6].

Theorem 2

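Let $\mu$ be a finite, compactly-supported Borel measure on $\mathbb{R}^n$. Let $h: \mathbb{R}^n \to \mathbb{R}$ be a bounded, measurable function with

$$\int h\,d\mu = 0.$$

For a sufficiently small $\varepsilon > 0$, let $\mu_\varepsilon$ be the measure whose density with respect to $\mu$ is the non-negative function $1 + \varepsilon h$. Then,

$$\|h\|_{H^{-1}(\mu)} \leq \liminf_{\varepsilon \to 0^+} \frac{W_2(\mu,\mu_\varepsilon)}{\varepsilon}.$$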

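See [9] and [40] for the intuition behind Theorem 2. We write $e_1,\ldots,e_n$ for the standard orthonormal basis in $\mathbb{R}^n$. Let $K \subset \mathbb{R}^n$ be a convex body. Fix a point $x \in K$ and $i = 1,\ldots,n$. Consider the line $x + \mathbb{R}e_i$, that is, the line in the direction of $e_i$ that passes through $x$. This line meets $K$ in a closed segment (or a single point). The two endpoints of this segment in $\partial K$ will be denoted by $B_i^-(x)$ and $B_i^+(x)$, where $B_i^-(x) \cdot e_i \leq B_i^+(x) \cdot e_i$. Thus,

$$K \cap (x + \mathbb{R}e_i) = [B_i^-(x), B_i^+(x)],$$

the line segment from $B_i^-(x)$ to $B_i^+(x)$. See Figure 1.

For $i = 1,\ldots,n$ consider the projection

$$\pi_i(x_1,\ldots,x_n) = (x_1,\ldots,x_{i-1},x_{i+1},\ldots,x_n),$$

defined for $(x_1,\ldots,x_n) \in \mathbb{R}^n$. Then $\pi_i(K)$ is a convex body in $\mathbb{R}^{n-1}$. For $y \in \pi_i(K)$, we define $q_i^-(y)$ to be the minimal $i$-th coordinate among all points $x \in K$ with $\pi_i(x) = y$. Similarly, we define $q_i^+(y)$ to be the maximal $i$-th coordinate.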

Figure 1

Lemma 3

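Let $K \subset \mathbb{R}^n$ be a convex body with a $C^\infty$-smooth boundary. Fix $i = 1,\ldots,n$. Let $\Psi: K \to \mathbb{R}$ be a $C^\infty(K)$-smooth function such that for any $x \in K$,

$$\Psi(B_i^-(x)) = \Psi(B_i^+(x)). \tag{10}$$

Write $\mu$ for the restriction of the Lebesgue measure to $K$. For a sufficiently small $\varepsilon > 0$ denote by $\mu_\varepsilon$ the measure whose density with respect to $\mu$ is $1 + \varepsilon\partial_i\Psi$. Then,

$$\liminf_{\varepsilon \to 0^+} \frac{W_2(\mu,\mu_\varepsilon)}{\varepsilon} \leq \sqrt{\int_K \left[\Psi(x) - \Psi(B_i^-(x))\right]^2 dx}.$$

Proof: Without loss of generality, assume that $i = 1$. For a sufficiently small $\varepsilon > 0$, the function $1 + \varepsilon\partial_1\Psi$ is positive on $K$, and hence $\mu_\varepsilon$ is a non-negative measure. Fix such a sufficiently small $\varepsilon > 0$.

For $x \in K$ we will use the coordinates $x = (x_1, y)$ where $x_1 \in \mathbb{R}$ and $y \in \mathbb{R}^{n-1}$. Fix $y \in \pi_1(K)$ and denote $p = q_1^-(y)$ and $q = q_1^+(y)$. According to our assumption (10),

$$\int_p^q \left(1 + \varepsilon\partial_1\Psi(t,y)\right)dt = (q - p) + \varepsilon\Psi(t,y)\Big|_{t=p}^{q} = q - p.$$

Consequently, the densities $t \mapsto 1 + \varepsilon\partial_1\Psi(t,y)$ and $t \mapsto 1$ have an equal amount of mass on the interval $[p,q]$. We consider the monotone transportation between these two densities. That is, we define a map $T = T_y: [p,q] \to [p,q]$ by requiring that for any $x_1 \in [p,q]$,

$$\int_p^{x_1} \left(1 + \varepsilon\partial_1\Psi(t,y)\right)dt = \int_p^{T(x_1)} dt. \tag{11}$$

The unique map $T$ that satisfies (11) transports the measure whose density is $1 + \varepsilon\partial_1\Psi(t,y)$ on $[p,q]$ to the Lebesgue measure on $[p,q]$. We deduce from (11) that for $x_1 \in [p,q]$,

$$T(x_1) = x_1 + \varepsilon\left[\Psi(x_1,y) - \Psi(p,y)\right].$$

Therefore,

$$\int_p^q |T(t) - t|^2 \cdot \left(1 + \varepsilon\partial_1\Psi(t,y)\right)dt = \varepsilon^2\int_p^q \left[\Psi(t,y) - \Psi(p,y)\right]^2 dt + \varepsilon^3 R, \tag{12}$$

with $|R|$ bounded by a constant depending only on $K$ and $\Psi$ (and in particular, independent of $y$ or $\varepsilon$). We now let $y$ vary, and we write

$$S(x_1,y) = (T_y(x_1), y) \qquad \text{for } (x_1,y) \in K.$$

Note that $S$ is well-defined (since $x_1$ belongs to the domain of definition of $T_y$ when $(x_1,y) \in K$), one-to-one, continuous, and maps $K$ onto $K$. Moreover, by Fubini, for any continuous function $\varphi: K \to \mathbb{R}$,

$$\int_K \varphi(S(x))\,d\mu_\varepsilon(x) = \int_{\pi_1(K)}\left[\int_{q_1^-(y)}^{q_1^+(y)} \varphi(T_y(x_1),y) \cdot \left(1 + \varepsilon\partial_1\Psi(x_1,y)\right)dx_1\right]dy = \int_{\pi_1(K)}\left[\int_{q_1^-(y)}^{q_1^+(y)} \varphi(x_1,y)\,dx_1\right]dy = \int_K \varphi(x)\,d\mu(x).$$

Therefore the map $S$ transports $\mu_\varepsilon$ to $\mu$. According to (12),

$$W_2(\mu,\mu_\varepsilon)^2 \leq \int_K |S(x) - x|^2\,d\mu_\varepsilon(x) = \varepsilon^2\int_K \left[\Psi(x) - \Psi(B_1^-(x))\right]^2 dx + \varepsilon^3 R',$$

with $|R'|$ smaller than a constant depending only on $K$ and $\Psi$, and in particular independent of $\varepsilon$. To complete the proof, divide by $\varepsilon^2$ and let $\varepsilon$ tend to zero.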
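The one-dimensional monotone transport used in the proof can be checked numerically on a single fiber. The sketch below is an illustration under assumptions of our own: the fiber is $[p,q] = [0,1]$, the function is $\Psi(t) = \cos(2\pi t)$ (so that $\Psi(p) = \Psi(q)$, as in (10)), and $\varepsilon = 0.05$, which keeps the density $1 + \varepsilon\Psi'$ positive. It verifies the defining relation (11) for the map $T(x) = x + \varepsilon[\Psi(x) - \Psi(p)]$ and evaluates the transport cost. Function names are hypothetical.

```python
import math

def psi(t):
    """Assumed test function; psi(0) = psi(1), matching condition (10)."""
    return math.cos(2 * math.pi * t)

def dpsi(t):
    return -2 * math.pi * math.sin(2 * math.pi * t)

def integrate(g, a, b, steps=20_000):
    """Midpoint rule for the integral of g over [a, b]."""
    h = (b - a) / steps
    return h * sum(g(a + (k + 0.5) * h) for k in range(steps))

def transport_map(x, eps, p=0.0):
    """Monotone map T(x) = x + eps * (psi(x) - psi(p))."""
    return x + eps * (psi(x) - psi(p))

def mass_left_of(x, eps, p=0.0):
    """Mass of the density 1 + eps * psi' on the interval [p, x]."""
    return integrate(lambda t: 1 + eps * dpsi(t), p, x)

def transport_cost(eps, p=0.0, q=1.0):
    """Integral of |T(t) - t|^2 against the density 1 + eps * psi'."""
    return integrate(
        lambda t: (transport_map(t, eps, p) - t) ** 2 * (1 + eps * dpsi(t)),
        p, q)
```

In this example `mass_left_of(x, eps)` agrees with `transport_map(x, eps) - p`, and `transport_cost(eps) / eps**2` is close to $\int_0^1 (\cos(2\pi t) - 1)^2\,dt = 3/2$.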

4 A digression: Neumann eigenvalues and eigenfunctions

This section presents some additional relations between convexity and the Neumann Laplacian. We retain the setup and notation of Section 2. We write $L^2(K)$ for the Hilbert space that is the completion of $C^\infty(K)$ with respect to the norm

$$\|u\|_{L^2(K)} = \sqrt{\int_K u^2}.$$

The operator $-\triangle$, acting on the subspace $\mathcal{D} \subset L^2(K)$, is a symmetric, positive semi-definite operator. The classical theory implies that $-\triangle$ has a complete system of orthonormal Neumann eigenfunctions and Neumann eigenvalues (see, e.g., [16, Chapter 7]). The first eigenvalue is $0$, with the eigenfunction being constant. It is well-known that the next eigenvalue $\lambda_1$ is positive when $K$ is convex (see, e.g., [34]; it is actually enough to assume that $K$ is connected, see, e.g., [11, Theorem 1]). We refer to $\lambda_1$ as the first non-zero Neumann eigenvalue of $K$. It is well-known that for any $C^\infty(K)$-smooth function $u$ with $\int_K u = 0$,

$$\lambda_1\int_K u^2 \leq \int_K |\nabla u|^2. \tag{13}$$

Equality in (13) holds if and only if $u$ is an eigenfunction corresponding to the eigenvalue $\lambda_1$.

We say that the boundary of $K$ is uniformly strictly convex if $\nabla^2\rho(x)$ is a positive definite matrix for any $x \in \partial K$. Equivalently, $\partial K$ is uniformly strictly convex if the principal curvatures are all positive, and not merely non-negative, everywhere on the boundary. Our next corollary claims, loosely speaking, that any non-trivial eigenfunction corresponding to $\lambda_1$ cannot be “spatially isotropic”, but must have “preference” for a certain direction in space.

Corollary 1

Suppose $K \subset \mathbb{R}^n$ is a convex body whose boundary is $C^\infty$-smooth and uniformly strictly convex. Let $\varphi \not\equiv 0$ be an eigenfunction corresponding to the first non-zero Neumann eigenvalue. Then,

$$\int_K \nabla\varphi \neq 0. \tag{14}$$

Consequently, the multiplicity of the first non-zero Neumann eigenvalue is at most $n$.

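Proof: Assume the opposite. Then,

$$\int_K \partial_i\varphi = 0 \qquad \text{for } i = 1,\ldots,n. \tag{15}$$

We write $\lambda_1$ for the first non-zero eigenvalue, i.e., $-\triangle\varphi = \lambda_1\varphi$. Since $\varphi \in \mathcal{D}$, inequality (8) gives

$$\lambda_1^2\int_K \varphi^2 = \int_K |\triangle\varphi|^2 \geq \sum_{i=1}^n \int_K |\nabla\partial_i\varphi|^2. \tag{16}$$

From (15) we know that $\int_K \partial_i\varphi = 0$ for all $i$. Thus (16) and (13) yield

$$\lambda_1^2\int_K \varphi^2 \geq \sum_{i=1}^n \int_K |\nabla\partial_i\varphi|^2 \geq \lambda_1\sum_{i=1}^n \int_K (\partial_i\varphi)^2 = \lambda_1\int_K |\nabla\varphi|^2 = \lambda_1^2\int_K \varphi^2.$$

Therefore, there must be equality in all steps and hence $\partial_1\varphi,\ldots,\partial_n\varphi$ are all Neumann eigenfunctions with eigenvalue $\lambda_1$. We necessarily have equality also in (16). According to Lemma 2 this means that

$$\int_{\partial K} (\nabla^2\rho)(\nabla\varphi) \cdot \nabla\varphi = 0.$$

Since the integrand is non-negative and continuous, necessarily

$$(\nabla^2\rho)(\nabla\varphi) \cdot \nabla\varphi = 0 \qquad \text{on } \partial K. \tag{17}$$

So far we have only used the convexity of $K$. The uniform strict convexity of $\partial K$ means that $\nabla^2\rho > 0$ on $\partial K$. Equation (17) has the consequence that $\nabla\varphi = 0$ on $\partial K$, and therefore

$$\varphi \equiv \mathrm{Const} \qquad \text{on } \partial K. \tag{18}$$

This is well-known to be impossible for a Neumann eigenfunction corresponding to the first non-zero eigenvalue. We sketch the standard argument, see, e.g., [11] for more information. Denote

$$N = \{x \in K \;;\; \varphi(x) > 0\}.$$

The set $N$ is non-empty since $\int_K \varphi = 0$. Moreover, $\varphi$ vanishes on $\partial N$ because of (18). Since $-\triangle\varphi = \lambda_1\varphi$ in $N$, then $\varphi$ is a Dirichlet eigenfunction of the domain $N$ corresponding to the Dirichlet eigenvalue $\lambda_1$. For a domain $\Omega \subset \mathbb{R}^n$, denote by $\lambda^D(\Omega)$ the minimal eigenvalue of $-\triangle$ with Dirichlet boundary conditions on $\Omega$. Then $\lambda^D(N) \leq \lambda_1$, as is witnessed by $\varphi$. Furthermore, $\lambda^D(K) \leq \lambda^D(N)$ by domain monotonicity (see, e.g., [11]), hence $\lambda^D(K) \leq \lambda_1$. However, we have the strict inequality $\lambda_1 < \lambda^D(K)$ (see, e.g., [24] for a much more accurate result). We thus arrive at a contradiction. Consequently our assumption that $\int_K \nabla\varphi = 0$ was absurd. The proof of (14) is complete.

The linear map $\varphi \mapsto \int_K \nabla\varphi$ from the eigenspace of $\lambda_1$ to $\mathbb{R}^n$ is therefore injective, so the multiplicity of the eigenvalue $\lambda_1$ cannot exceed $n$.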

Remark. Leonid Friedlander explained to us how to eliminate the uniform strict convexity requirement from Corollary 1. His idea is to observe that since $\partial_1\varphi,\ldots,\partial_n\varphi$ are all eigenfunctions, then the restriction of $\varphi$ to the boundary $\partial K$ is actually an eigenfunction of the Laplacian associated with the Riemannian manifold $\partial K$. However, (17) entails that $\varphi$ is constant in some open set in $\partial K$, which is known to be impossible for an eigenfunction. We omit the details.

For $x = (x_1,\ldots,x_n) \in \mathbb{R}^n$ and $i = 1,\ldots,n$ write

$$\sigma_i(x) = (x_1,\ldots,x_{i-1},-x_i,x_{i+1},\ldots,x_n),$$

i.e., we flip the sign of the $i$-th coordinate. For a function $\varphi$, we write $\sigma_i(\varphi)(x) = \varphi(\sigma_i(x))$. Our next corollary exploits the well-known relationship between the eigenfunctions and symmetry. Similar arguments appear, e.g., in [2].

Corollary 2

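Suppose $K \subset \mathbb{R}^n$ is a convex body with a $C^\infty$-smooth boundary. Denote by $E_{\lambda_1} \subset \mathcal{D}$ the eigenspace corresponding to the first non-zero Neumann eigenvalue of $K$.

(i) If $K$ is unconditional, then there exist $i = 1,\ldots,n$ and an eigenfunction $0 \not\equiv \varphi \in E_{\lambda_1}$, such that

$$\sigma_i(\varphi) = -\varphi.$$

(ii) If $K$ is centrally-symmetric (i.e., $K = -K$), then there exists an eigenfunction $0 \not\equiv \varphi \in E_{\lambda_1}$, such that

$$\varphi(-x) = -\varphi(x) \qquad \text{for } x \in K.$$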

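Proof: Begin with the proof of (i). We are given the unconditional convex body $K$. Since $K$ is unconditional, then $\varphi \in E_{\lambda_1}$ implies $\sigma_i(\varphi) \in E_{\lambda_1}$ for $i = 1,\ldots,n$. Begin with any non-zero eigenfunction $f_0 \in E_{\lambda_1}$, and recursively define

$$f_i = f_{i-1} + \sigma_i(f_{i-1}).$$

Then $f_0,\ldots,f_n \in E_{\lambda_1}$. If there exists $i$ such that $f_i \equiv 0$ then we are done: Suppose $i$ is the minimal such index. Then $\sigma_i(f_{i-1}) = -f_{i-1}$ with $f_{i-1} \not\equiv 0$, and we found our desired eigenfunction.

It remains to deal with the case where $\psi := f_n$ is a non-zero eigenfunction. Note that $\sigma_i(\psi) = \psi$ and hence

$$\sigma_i(\partial_i\psi) = -\partial_i\psi \tag{19}$$

for $i = 1,\ldots,n$. Therefore,

$$\int_K \nabla\psi = 0. \tag{20}$$

In the proof of Corollary 1 (the first part, which did not use the uniform strict convexity) we observed that (20) implies that $\partial_1\psi,\ldots,\partial_n\psi \in E_{\lambda_1}$. Since $\psi \not\equiv \mathrm{Const}$, there exists $i$ with $\partial_i\psi \not\equiv 0$. We see from (19) that $\partial_i\psi$ is the eigenfunction we are looking for. This completes the proof of the first part of the corollary.

The proof of the second part is similar. Begin with any $0 \not\equiv \varphi \in E_{\lambda_1}$ and set $\psi(x) = \varphi(x) + \varphi(-x)$. If $\psi \equiv 0$, then $\varphi$ is an odd function and we are done. Otherwise, $\psi$ is an even function, hence $\int_K \nabla\psi = 0$. As before, this implies that $\partial_1\psi,\ldots,\partial_n\psi$ are all odd eigenfunctions corresponding to the same eigenvalue $\lambda_1$.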

Corollary 1 and Corollary 2 seem very much expected. Notably, Nadirashvili [29] has proved that in two dimensions, the multiplicity of the first non-zero Neumann eigenvalue is at most $2$ for any simply-connected domain. Our simple proof of Corollary 1 is not applicable in such generality. Corollary 1 is related to the “hot spots” problem, see, e.g., Burdzy [10], Jerison and Nadirashvili [19] and references therein. A proof of Corollary 2 for the two-dimensional case, under much more general assumptions than convexity, can be found in [2, Theorem 4.3]. However, the proofs of the two-dimensional results mentioned do not seem to admit easy generalization to higher dimensions. As observed by Payne and Weinberger [33], Corollary 2 leads to the following comparison principle:

Corollary 3

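Let $K \subset \mathbb{R}^n$ be an unconditional convex body with a $C^\infty$-smooth boundary. Assume that $R > 0$ is such that

$$K \subseteq [-R,R]^n = \left\{(x_1,\ldots,x_n) \in \mathbb{R}^n \;;\; |x_i| \leq R \ \text{ for } i = 1,\ldots,n\right\}.$$

Denote by $\lambda_1$ the first non-zero Neumann eigenvalue of $K$. Then,

$$\lambda_1 \geq \frac{\pi^2}{4R^2}.$$

Equality holds when $K = [-R,R]^n$, an $n$-dimensional cube.

Proof: A well-known, elementary calculation shows that for any $0 < r \leq R$ and a smooth odd function $\psi: [-r,r] \to \mathbb{R}$,

$$\frac{\pi^2}{4R^2}\int_{-r}^{r} \psi^2(x)\,dx \leq \frac{\pi^2}{4r^2}\int_{-r}^{r} \psi^2(x)\,dx \leq \int_{-r}^{r} \left(\frac{d\psi}{dx}\right)^2 dx. \tag{21}$$

According to Corollary 2(i), there exists an index $i$ and a non-zero eigenfunction $\varphi$ corresponding to $\lambda_1$ such that $\sigma_i(\varphi) = -\varphi$. By Fubini’s theorem and (21),

$$\frac{\pi^2}{4R^2}\int_K \varphi^2 \leq \int_K |\partial_i\varphi|^2 \leq \int_K |\nabla\varphi|^2 = \lambda_1\int_K \varphi^2,$$

hence $\lambda_1 \geq \pi^2/(4R^2)$.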
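The one-dimensional inequality for odd functions that drives this proof can be probed numerically. The sketch below is a sanity check of our own, with hypothetical function names. It computes the Rayleigh quotient $\int_{-r}^{r} (\psi')^2 / \int_{-r}^{r} \psi^2$ for two odd test functions on $[-r,r]$: the function $\sin(\pi x/(2r))$, which is the Neumann eigenfunction of the interval for its first non-zero eigenvalue $\pi^2/(4r^2)$, and the linear function $\psi(x) = x$, whose quotient is $3/r^2$.

```python
import math

def integrate(g, a, b, steps=20_000):
    """Midpoint rule for the integral of g over [a, b]."""
    h = (b - a) / steps
    return h * sum(g(a + (k + 0.5) * h) for k in range(steps))

def rayleigh_quotient(psi, dpsi, r):
    """Integral of (psi')^2 divided by the integral of psi^2 over [-r, r]."""
    numerator = integrate(lambda x: dpsi(x) ** 2, -r, r)
    denominator = integrate(lambda x: psi(x) ** 2, -r, r)
    return numerator / denominator

def sine_quotient(r):
    """Quotient of the odd function sin(pi * x / (2r))."""
    w = math.pi / (2 * r)
    return rayleigh_quotient(lambda x: math.sin(w * x),
                             lambda x: w * math.cos(w * x), r)

def linear_quotient(r):
    """Quotient of the odd function psi(x) = x."""
    return rayleigh_quotient(lambda x: x, lambda x: 1.0, r)
```

For $r = 2$ the sine gives $\pi^2/16 \approx 0.617$ and the linear function gives $0.75$, so both quotients are at least $\pi^2/(4r^2)$, with equality for the sine.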

Remarks.

1. Corollary 3 shows that the cube satisfies a certain domain monotonicity principle for the Neumann Laplacian, at least in the category of unconditional, convex bodies. The Euclidean ball, for instance, does not satisfy a corresponding principle.

2. Suppose $K \subset \mathbb{R}^n$ is an unconditional convex body. Assume that $K$ is isotropically normalized, i.e., the random vector which is distributed uniformly in $K$ is isotropically normalized. Corollary 3 implies the probably non-optimal bound

$$\lambda_1(K) \geq \frac{c}{\log^2(n+1)}, \tag{22}$$

where $\lambda_1(K)$ is the first non-zero Neumann eigenvalue of $K$, and $c > 0$ is a universal constant. To establish (22), consider

$$K' = K \cap [-R,R]^n, \qquad \text{for } R = 50\log(n+1).$$

Use Corollary 3 to deduce the bound $\lambda_1(K') \geq c/\log^2(n+1)$. The body $K'$ is a good approximation to the body $K$: It is easily proven that

$$\mathrm{Vol}_n(K') \geq \left(1 - \frac{1}{n}\right)\mathrm{Vol}_n(K).$$

We may thus apply E. Milman’s result [27, Theorem 1.7], which builds upon the Sternberg-Zumbrun concavity principle [38], to conclude that $\lambda_1(K) \geq c\,\lambda_1(K')$, and the bound (22) follows. See [20] for a conjectural better bound, without the logarithmic factor.

5 Unconditional convex bodies

We begin this section with a corollary to the theorems of Section 2 and Section 3.

Corollary 4

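Let $K \subset \mathbb{R}^n$ be an unconditional convex body.

(i) Let $\Psi: K \to \mathbb{R}$ be an unconditional, continuous function. Then,

$$\mathrm{Var}_K(\Psi) \leq \sum_{i=1}^n \int_K \left(\Psi(x) - \Psi(B_i^+(x))\right)^2 dx.$$

(ii) In particular, suppose $f_1,\ldots,f_n: \mathbb{R} \to \mathbb{R}$ are even, continuous functions. Denote $\Psi(x) = \sum_{i=1}^n f_i(x_i)$. Then,

$$\mathrm{Var}_K(\Psi) \leq \sum_{i=1}^n \int_K \sup_{s,t \in J_i(x)} \left(f_i(s) - f_i(t)\right)^2 dx,$$

where $J_i(x) = \left[-\tfrac{1}{2}|B_i^+(x) - B_i^-(x)|,\ \tfrac{1}{2}|B_i^+(x) - B_i^-(x)|\right] \subset \mathbb{R}$. That is, $J_i(x)$ is a symmetric interval about the origin with the same length as the segment $[B_i^-(x), B_i^+(x)]$.

Proof: Begin with (i). By approximation, we may assume that $K$ has a $C^\infty$-smooth boundary, and that $\Psi$ is a $C^\infty(K)$-smooth function. Lemma 1 states that

$$\mathrm{Var}_K(\Psi) \leq \sum_{i=1}^n \|\partial_i\Psi\|^2_{H^{-1}(K)}.$$

Fix $i = 1,\ldots,n$. We may apply Theorem 2 for $h = \partial_i\Psi$ since $\int_K \partial_i\Psi = 0$, as implied by the symmetries of $\Psi$. We may apply Lemma 3, since clearly $\Psi(B_i^-(x)) = \Psi(B_i^+(x))$ for any $x \in K$. Theorem 2 and Lemma 3 entail the inequality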

 ∥∂iΨ∥2H