Uniform logarithmic Sobolev inequalities for conservative spin systems with super-quadratic single-site potential


Abstract

We consider a noninteracting unbounded spin system with conservation of the mean spin. We derive a uniform logarithmic Sobolev inequality (LSI) provided the single-site potential is a bounded perturbation of a strictly convex function. The scaling of the LSI constant is optimal in the system size. The argument adapts the two-scale approach of Grunewald, Villani, Westdickenberg and the second author from the quadratic to the general case. Using an asymmetric Brascamp–Lieb-type inequality for covariances, we reduce the task of deriving a uniform LSI to the convexification of the coarse-grained Hamiltonian, which follows from a general local Cramér theorem.

Georg Menz¹ (Georg.Menz@mis.mpg.de) and Felix Otto (Felix.Otto@mis.mpg.de)

http://www.mis.mpg.de/

¹ Supported through the Gottfried Wilhelm Leibniz program, the Bonn International Graduate School in Mathematics and the Max Planck Institute for Mathematics in the Sciences in Leipzig.

DOI: 10.1214/11-AOP715. Volume 41, Issue 3B (2013), pages 2182-2224.

Running title: LSI for conservative spin systems

AMS subject classifications: Primary 60K35; secondary 60J25, 82B21.

Keywords: Logarithmic Sobolev inequality, spin system, Kawasaki dynamics, canonical ensemble, coarse-graining.

1 Introduction and main result

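The grand canonical ensemble $\mu$ is a probability measure on $\mathbb{R}^N$ given by

$$\mu(dx) := \frac{1}{Z} \exp\bigl(-H(x)\bigr)\,dx.$$

Throughout the article, $Z$ denotes a generic normalization constant. The value of $Z$ may change from line to line or even within a line. The noninteracting Hamiltonian $H: \mathbb{R}^N \to \mathbb{R}$ is given by a sum of single-site potentials $\psi: \mathbb{R} \to \mathbb{R}$ that are specified later, that is,

$$H(x) := \sum_{i=1}^N \psi(x_i). \tag{1}$$

For a real number $m$, we consider the $(N-1)$-dimensional hyperplane $X_{N,m}$ given by

$$X_{N,m} := \Bigl\{ x \in \mathbb{R}^N : \frac{1}{N} \sum_{i=1}^N x_i = m \Bigr\}.$$

We equip $X_{N,m}$ with the standard scalar product induced by $\mathbb{R}^N$, namely

$$\langle x, \tilde x \rangle := \sum_{i=1}^N x_i \tilde x_i.$$

The restriction of $\mu$ to $X_{N,m}$ is called the canonical ensemble $\mu_{N,m}$, that is,

$$\mu_{N,m}(dx) := \frac{1}{Z}\, \mathbb{1}_{\{\frac{1}{N}\sum_{i=1}^N x_i = m\}}(x)\, \exp\bigl(-H(x)\bigr)\, \mathcal{H}^{N-1}(dx). \tag{2}$$

Here, $\mathcal{H}^{N-1}$ denotes the $(N-1)$-dimensional Hausdorff measure restricted to the hyperplane $X_{N,m}$. For convenience, we introduce the notation: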

In 1993, Varadhan ([23], Lemma 5.3 ff.) posed the question of which kinds of single-site potential $\psi$ lead to a canonical ensemble $\mu_{N,m}$ that satisfies a spectral gap inequality (SG) uniformly in the system size $N$ and the mean spin $m$. A partial answer was given by Caputo [5]:

Theorem 1.1 (Caputo)

Assume that for the single-site potential there exist a splitting and constants , such that for all

(3)

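Then the canonical ensemble $\mu_{N,m}$ satisfies the SG with constant $\varrho > 0$ uniformly in the system size $N$ and the mean spin $m$. More precisely, for any function $f$,

$$\operatorname{var}_{\mu_{N,m}}(f) := \int \Bigl( f - \int f \,d\mu_{N,m} \Bigr)^2 d\mu_{N,m} \leq \frac{1}{\varrho} \int |\nabla f|^2 \,d\mu_{N,m}.$$

Here, $\nabla$ denotes the gradient determined by the Euclidean structure of $X_{N,m}$.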

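In this article, we give a full answer to the question by Varadhan [23] and also show that the last theorem can be strengthened to the logarithmic Sobolev inequality (LSI).

Definition 1.1 (LSI)

Let $X$ be a Euclidean space. A Borel probability measure $\mu$ on $X$ satisfies the LSI with constant $\varrho > 0$ if for all functions $f \geq 0$

$$\int f \log f \,d\mu - \int f \,d\mu \, \log\Bigl( \int f \,d\mu \Bigr) \leq \frac{1}{2\varrho} \int \frac{|\nabla f|^2}{f} \,d\mu. \tag{4}$$

Here, $\nabla$ denotes the gradient determined by the Euclidean structure of $X$.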

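Remark (Gradient on $X_{N,m}$)

If we choose $X = X_{N,m}$ in Definition 1.1, we can calculate $|\nabla f|^2$ in the following way: Extend $f: X_{N,m} \to \mathbb{R}$ to be constant in the direction normal to $X_{N,m}$. Then

$$|\nabla f|^2 = \sum_{i=1}^N \Bigl| \frac{d}{dx_i} f \Bigr|^2.$$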

The LSI was originally introduced by Gross [10]. It yields the SG and can be used as a powerful tool for studying spin systems. Like the SG, the LSI implies exponential convergence to equilibrium of the naturally associated conservative diffusion process. The rate of convergence is given by the LSI constant $\varrho$; cf. [22], Chapter 3.2, and Remark 1.4. Therefore, an appropriate scaling of the LSI constant in the system size indicates the absence of phase transitions. The SG yields convergence in the sense of variances, in contrast to the LSI, which yields convergence in the sense of relative entropies. The SG and the LSI are also useful for deducing the hydrodynamic limit; see [23] for the SG and [11] for the LSI.
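As a hedged numerical illustration of what an LSI asserts, the following sketch checks the inequality for the one-dimensional standard Gaussian measure. It assumes the normalization "entropy of $f$ is at most $\frac{1}{2\varrho}\int |f'|^2/f\,d\mu$" and uses the classical fact that the standard Gaussian satisfies this with $\varrho = 1$. The helper names (`integrate`, `entropy`, `fisher`) and the test functions are illustrative choices, not taken from the article.

```python
import math

def integrate(func, a=-12.0, b=12.0, n=24000):
    """Composite trapezoidal rule on [a, b] with n subintervals."""
    h = (b - a) / n
    total = 0.5 * (func(a) + func(b))
    for i in range(1, n):
        total += func(a + i * h)
    return total * h

def gauss(x):
    """Lebesgue density of the standard Gaussian measure."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)

def entropy(f):
    """int f log f dmu - (int f dmu) * log(int f dmu) for positive f."""
    mean = integrate(lambda x: f(x) * gauss(x))
    f_log_f = integrate(lambda x: f(x) * math.log(f(x)) * gauss(x))
    return f_log_f - mean * math.log(mean)

def fisher(f, df):
    """int |f'|^2 / f dmu, with the derivative df supplied by hand."""
    return integrate(lambda x: df(x) ** 2 / f(x) * gauss(x))

RHO = 1.0  # assumed LSI constant of the standard Gaussian in this normalization

# A generic positive test function: the inequality is expected to be strict.
ent_poly = entropy(lambda x: 1.0 + x * x)
rhs_poly = fisher(lambda x: 1.0 + x * x, lambda x: 2.0 * x) / (2.0 * RHO)

# Exponential functions are the classical equality case.
a = 0.5
ent_exp = entropy(lambda x: math.exp(a * x))
rhs_exp = fisher(lambda x: math.exp(a * x),
                 lambda x: a * math.exp(a * x)) / (2.0 * RHO)
```

For $f(x) = e^{ax}$ both sides can be computed by hand and equal $\frac{a^2}{2} e^{a^2/2}$, which gives a convenient sanity check of the quadrature.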

We consider three cases of different potentials: sub-quadratic, quadratic and super-quadratic single-site potentials. In the case of sub-quadratic single-site potentials, Barthe and Wolff [2] gave a counterexample in which the SG constant and the LSI constant of the canonical ensemble scale differently in the system size. More precisely, they showed:

Theorem 1.2 (Barthe and Wolff)

Assume that the single-site potential is given by

Then the SG constant and the LSI constant of the canonical ensemble satisfy

In the case of perturbed quadratic single-site potentials it is known that Theorem 1.1 can be improved to the LSI. More precisely, several authors (cf. [17, 6, 11]) deduced the following statement by different methods:

Theorem 1.3 (Landim, Panizo and Yau)

Assume that the single-site potential is perturbed quadratic in the following sense: There exists a splitting such that

(5)

Then the canonical ensemble $\mu_{N,m}$ satisfies the LSI with constant $\varrho > 0$ uniformly in the system size $N$ and the mean spin $m$.

It only remains to consider the super-quadratic case. It is conjectured that the optimal scaling LSI also holds if the single-site potential is a bounded perturbation of a strictly convex function; cf. [17], page 741, [6], Theorem 0.3 f., and [5], page 226. Heuristically, this conjecture seems reasonable: Because the LSI is closely linked to convexity (consider, e.g., the Bakry–Émery criterion), a perturbed strictly convex potential should behave no worse than a perturbed quadratic one. Technically, however, the methods for the quadratic case cannot handle the perturbed strictly convex case because they require an upper bound on the second derivative of the Hamiltonian. In the main result of this article we show that the above conjecture is true:

Theorem 1.4

Assume that the single-site potential is perturbed strictly convex in the sense that there is a splitting such that

(6)

Then the canonical ensemble $\mu_{N,m}$ satisfies the LSI with constant $\varrho > 0$ uniformly in the system size $N$ and the mean spin $m$.

Remark (From Glauber to Kawasaki)

The bound on the r.h.s. of (4) is given in terms of the Glauber dynamics in the sense that we have endowed $X_{N,m}$ with the standard Euclidean structure inherited from $\mathbb{R}^N$. By the discrete Poincaré inequality, one can recover the bound for the Kawasaki dynamics (cf. [11], Remark 15, or [5]) in the sense that one endows $X_{N,m}$ with the Euclidean structure coming from the discrete $H^{-1}$-norm. More precisely, if $\Lambda$ is a cubic lattice in any dimension of width $L$, then Theorem 1.4 yields the LSI for Kawasaki dynamics with constant $L^{-2}$, which is the optimal scaling in $L$; cf. [24].

Note that the standard criteria for the SG and the LSI (cf. Appendix) fail for the canonical ensemble $\mu_{N,m}$:

  • The tensorization principle for the SG and the LSI does not apply because of the restriction to the hyperplane $X_{N,m}$; cf. [12], Theorem 4.4, or Theorem .1.

  • The Bakry–Émery criterion does not apply because the Hamiltonian $H$ is not strictly convex; cf. [1], Proposition 3 and Corollary 2, or Theorem .3.

  • The Holley–Stroock criterion does not help because the LSI constant has to be independent of the system size $N$; cf. [14], page 1184, or Theorem .2.

Therefore, more elaborate machinery was needed for the proof of Theorems 1.1 and 1.3. The approach of Caputo to Theorem 1.1 seems to be restricted to the SG because it relies on the spectral nature of the SG. For the proof of Theorem 1.3, Landim, Panizo and Yau [17] and Chafaï [6] used the Lu–Yau martingale method that was originally introduced in [19] to deduce an analogous version of Theorem 1.3 in the case of discrete spin values. Recently, Grunewald, Villani, Westdickenberg and the second author [11] provided a new technique for deducing Theorem 1.3, called the two-scale approach. We follow this approach in the proof of Theorem 1.4.

The limiting factor for extending Theorem 1.3 to more general single-site potentials is almost the same for the Lu–Yau martingale method and for the two-scale approach: It is the estimation of a covariance term w.r.t. the measure conditioned on a special event; cf. [17], (4.6), and [11], (42). In the two-scale approach one has to estimate for some large but fixed and any nonnegative function the covariance

In [11], this term was estimated by using a standard estimate (cf. Lemma 2.6 and [11], Lemma 22) that can only be applied for perturbed quadratic single-site potentials $\psi$. We get around this difficulty by making the following adaptations: Instead of one-time coarse-graining of big blocks, we consider iterative coarse-graining of pairs. As a consequence we only have to estimate the covariance term from above in the case . Because is a one-dimensional measure, we are able to apply the more robust asymmetric Brascamp–Lieb inequality (cf. Lemma 2.7) that can also be applied for perturbed strictly convex single-site potentials $\psi$.

Recently, the optimal scaling LSI was established in [20] by the first author for a weakly interacting Hamiltonian with perturbed quadratic single-site potentials , that is,

Because the original two-scale approach was used, it is an interesting question if one could extend this result to perturbed strictly convex single-site potentials. A direct transfer of the argument of [20] fails because of the iterative structure of the proof of Theorem 1.4.

The remaining part of this article is organized as follows. In Section 2.1 we prove the main result. The auxiliary results of Section 2.1 are proved in Section 2.2. There is one exception: The convexification of the single-site potential by iterated renormalization (see Theorem 2.4) is proved in Section 3. In the short Appendix we state the standard criteria for the SG and the LSI.

2 Adapted two-scale approach

2.1 Proof of the main result

The proof of Theorem 1.4 is based on an adaptation of the two-scale approach of [11]. We start by introducing the concept of coarse-graining of pairs. We recommend reading [11], Chapter 2.1, as a guideline.

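We assume that the number of sites is given by $N = 2^K$ for some large number $K \in \mathbb{N}$. The step to arbitrary $N$ is not difficult; cf. Remark 2.1, below. We decompose the spin system into $N/2$ blocks, each containing two spins. The coarse-graining operator $P: X_{N,m} \to X_{N/2,m}$ assigns to each block the mean spin of the block. More precisely, $P$ is given by

$$P(x_1, \ldots, x_N) := \bigl( \tfrac{1}{2}(x_1 + x_2), \tfrac{1}{2}(x_3 + x_4), \ldots, \tfrac{1}{2}(x_{N-1} + x_N) \bigr). \tag{7}$$

Due to the coarse-graining operator $P$, we can decompose the canonical ensemble $\mu_{N,m}$ into

$$\mu_{N,m}(dx) = \mu(dx \mid y)\, \bar\mu(dy), \tag{8}$$

where $\bar\mu := P_{\#}\mu_{N,m}$ denotes the push forward of the Gibbs measure $\mu_{N,m}$ under $P$ and $\mu(dx \mid y)$ is the conditional measure of $x$ given $Px = y$. The last equation has to be understood in a weak sense; that is, for any test function $\xi$

$$\int \xi \,d\mu_{N,m} = \int_{X_{N/2,m}} \Bigl( \int_{\{Px = y\}} \xi \,\mu(dx \mid y) \Bigr) \bar\mu(dy).$$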
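The coarse-graining of pairs can be sketched in a few lines. The following is a minimal illustration, assuming spins are stored in a plain list and consecutive entries form a block; the function names and the sample configuration are hypothetical. It shows that the block means keep the mean spin unchanged, so the coarse-grained configuration again lies on a hyperplane of the same type and the procedure can be iterated.

```python
def coarse_grain(x):
    """Map a configuration with an even number of spins to its block means."""
    if len(x) % 2 != 0:
        raise ValueError("pairwise coarse-graining needs an even number of spins")
    return [0.5 * (x[i] + x[i + 1]) for i in range(0, len(x), 2)]

def mean_spin(x):
    """Arithmetic mean of all spins of a configuration."""
    return sum(x) / len(x)

# Sample configuration with N = 8 = 2**3 spins (illustrative values).
x = [1.0, 3.0, -2.0, 0.0, 4.0, 4.0, 0.5, 1.5]
y = coarse_grain(x)  # four block means
z = coarse_grain(y)  # two block means after a second iteration
```

Each application halves the number of sites, so a system of $2^K$ spins is reduced to a single macroscopic variable after $K$ iterations.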

Now, we are able to state the first ingredient of the proof of Theorem 1.4.

Proposition 2.1 (Hierarchic criterion for the LSI)

Assume that the single-site potential $\psi$ is perturbed strictly convex in the sense of (6). If the marginal $\bar\mu$ satisfies the LSI with a constant that is uniform in the system size $N$ and the mean spin $m$, then the canonical ensemble $\mu_{N,m}$ also satisfies the LSI with a constant that is uniform in the system size $N$ and the mean spin $m$.

The proof of this statement is given in Section 2.2. Due to the last proposition it suffices to deduce the LSI for the marginal $\bar\mu$. Hence, let us have a closer look at the structure of $\bar\mu$. We will characterize the Hamiltonian of the marginal $\bar\mu$ with the help of the renormalization operator $\mathcal{R}$, which is introduced as follows.

Definition

Let $\psi: \mathbb{R} \to \mathbb{R}$ be a single-site potential. Then the renormalized single-site potential $\mathcal{R}\psi: \mathbb{R} \to \mathbb{R}$ is defined by

$$\mathcal{R}\psi(x) := -\log \int \exp\bigl( -\psi(x + y) - \psi(x - y) \bigr)\,dy. \tag{9}$$
Remark

The renormalized single-site potential $\mathcal{R}\psi$ can be interpreted in the following way: A change of variables (cf. [8], Section 3.3.3) and the invariance of the Hausdorff measure under translation yield the identity

Therefore, the renormalized single-site potential $\mathcal{R}\psi$ describes the free energy of two independent, identically distributed spins conditioned on a fixed mean value.
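To make the renormalization step concrete, here is a minimal numerical sketch. It assumes that the renormalized potential is $-\log \int \exp(-\psi(x+y) - \psi(x-y))\,dy$, which matches the free-energy interpretation above but is stated here as an assumption; the function name `renormalize`, the truncation width and the grid size are illustrative. For the quadratic potential $\psi(x) = x^2/2$ the integral is Gaussian, so the result can be compared with the closed form $x^2 - \frac{1}{2}\log\pi$.

```python
import math

def renormalize(psi, half_width=10.0, n=4000):
    """Return x -> -log int exp(-psi(x+y) - psi(x-y)) dy (trapezoidal rule).

    The integral is truncated to |y| <= half_width, which is adequate only
    for potentials that grow fast enough (an assumption of this sketch).
    """
    h = 2.0 * half_width / n

    def r_psi(x):
        total = 0.0
        for i in range(n + 1):
            y = -half_width + i * h
            weight = 0.5 if i in (0, n) else 1.0
            total += weight * math.exp(-psi(x + y) - psi(x - y))
        return -math.log(total * h)

    return r_psi

def quadratic(x):
    """Quadratic single-site potential psi(x) = x^2 / 2."""
    return 0.5 * x * x

r_quad = renormalize(quadratic)
```

In this quadratic example one renormalization step doubles the second derivative (from 1 to 2) and only shifts the potential by a constant, which is harmless because additive constants are absorbed into the normalization.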

Lemma 2.2 (Invariance under renormalization)

Assume that the single-site potential is perturbed strictly convex in the sense of (6). Then the renormalized Hamiltonian is also perturbed strictly convex in the sense of (6).

Direct calculation using the coarea formula (cf. [8], Section 3.4.2) reveals the following structure of the marginal .

Lemma 2.3

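The marginal $\bar\mu$ is given by

$$\bar\mu(dy) = \frac{1}{Z}\, \mathbb{1}_{\{\frac{2}{N}\sum_{i=1}^{N/2} y_i = m\}}(y)\, \exp\Bigl( -\sum_{i=1}^{N/2} \mathcal{R}\psi(y_i) \Bigr)\, \mathcal{H}^{N/2-1}(dy).$$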

It follows from the last two lemmas that the marginal $\bar\mu$ has the same structure as the canonical ensemble $\mu_{N,m}$. The single-site potential of $\bar\mu$ is given by the renormalized single-site potential $\mathcal{R}\psi$. Hence, one can iterate the coarse-graining of pairs. The next statement shows that after finitely many iterations the renormalized single-site potential becomes uniformly strictly convex. Therefore, the Bakry–Émery criterion (cf. Theorem .3) yields that the corresponding marginal satisfies the LSI with a constant that is uniform in the system size $N$ and the mean spin $m$. Then, an iterated application of the hierarchic criterion for the LSI (cf. Proposition 2.1) yields Theorem 1.4 in the case $N = 2^K$.

Theorem 2.4 (Convexification by renormalization)

Let $\psi$ be a perturbed strictly convex single-site potential in the sense of (6). Then there is an integer $K_0$ such that for all $K \geq K_0$ the $K$-times renormalized single-site potential $\mathcal{R}^K\psi$ is uniformly strictly convex independently of the system size $N$ and the mean spin $m$.
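The convexifying effect of renormalization can be observed numerically at a single point. The sketch below is only an illustration under assumptions: it takes the double-well potential $\psi(x) = x^4/4 - x^2/2$ as an example of a nonconvex potential, assumes the renormalization formula $-\log \int \exp(-\psi(x+y) - \psi(x-y))\,dy$, and compares finite-difference curvatures at the origin. It says nothing about uniform convexity, which in general needs several iterations and is the content of the theorem.

```python
import math

def renormalize(psi, half_width=6.0, n=2400):
    """Return x -> -log int exp(-psi(x+y) - psi(x-y)) dy (trapezoidal rule)."""
    h = 2.0 * half_width / n

    def r_psi(x):
        total = 0.0
        for i in range(n + 1):
            y = -half_width + i * h
            weight = 0.5 if i in (0, n) else 1.0
            total += weight * math.exp(-psi(x + y) - psi(x - y))
        return -math.log(total * h)

    return r_psi

def double_well(x):
    """Illustrative nonconvex potential with psi''(0) = -1."""
    return 0.25 * x ** 4 - 0.5 * x * x

def second_difference(func, x, h=0.05):
    """Central finite-difference approximation of the second derivative."""
    return (func(x + h) - 2.0 * func(x) + func(x - h)) / (h * h)

r1 = renormalize(double_well)
curvature_before = second_difference(double_well, 0.0)  # negative
curvature_after = second_difference(r1, 0.0)            # positive
```

The origin is where the double well is most concave, and already one pairwise renormalization turns the curvature there positive in this example.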

We conclude this section by giving some remarks and pointing out the central tools needed for the proof of the auxiliary results. The next remark shows how Theorem 1.4 is verified in the case of an arbitrary number of sites.

Remark

Note that an arbitrary number of sites $N$ can be written as

for some number , a large but fixed number and a bounded number . Hence, one can decompose the spin system into blocks of spins and one block of spins. The big blocks of spins are coarse-grained by pairs, whereas the small block of spins is not coarse-grained at all. After iterating this procedure sufficiently often, the renormalized single-site potentials of the big blocks are uniformly strictly convex. On the remaining block of spins, the corresponding single-site potentials are unchanged. Because is a bounded perturbation of a strictly convex function, it follows from a combination of the Bakry–Émery criterion (cf. Theorem .3) and the Holley–Stroock criterion (cf. Theorem .2) that the marginal of the whole system satisfies the LSI with constant

which is independent of $N$ and $m$. Therefore, an iterated application of the hierarchic criterion for the LSI (cf. Proposition 2.1) yields Theorem 1.4.

Remark (Inhomogeneous single-site potentials)

It is a natural question whether this approach can be applied to the case of inhomogeneous single-site potentials. In this case, the single-site potentials are allowed to depend on the sites; that is, the Hamiltonian has the form $H(x) = \sum_{i=1}^N \psi_i(x_i)$, where each $\psi_i$ is a perturbed strictly convex potential. In principle, we believe that our approach can be adapted to this situation, albeit not in a straightforward way. The reason is that only one step of the proof of Theorem 1.4 has to be adapted: It is the convexification of the single-site potentials by iterated renormalization (see Theorem 2.4).

Let us comment on the proof of Theorem 2.4, which is given in Section 3. The starting point of the proof is the observation that the $K$-times renormalized single-site potential $\mathcal{R}^K\psi$ corresponds to the coarse-grained Hamiltonian related to coarse-graining with block size $2^K$; cf. [11].

Lemma 2.5

For let the coarse-grained Hamiltonian be defined by

(10)

Let . Then there is a constant depending only on such that

Because the last statement is verified by a straightforward application of the area and coarea formula, we omit the proof. In Lemma 2.5 one could easily determine the exact value of the constant . However, the exact value is not important because we are only interested in the convexity of . In [11], the convexification of was deduced from a local Cramér theorem; cf. [11], Proposition 31. For the proof of Theorem 2.4 we follow the same strategy generalizing the argument to perturbed strictly convex single-site potentials .

Now, we make some comments on the proofs of Proposition 2.1 and Lemma 2.2, which are given in Section 2.2. One of the limiting factors in the proof of Theorem 1.3 is the application of a classical covariance estimate; cf. [11], Lemma 22. In our framework this estimate can be formulated as:

Lemma 2.6

Assume that the single-site potential is perturbed strictly convex in the sense of (6). Let be a probability measure on given by

Then for any function and

In [11], the last estimate was applied to the function . Note that the function is only bounded in the case of a perturbed quadratic single-site potential . The main new ingredient for the proof of the hierarchic criterion for the LSI (cf. Proposition 2.1) and the invariance principle (cf. Lemma 2.2) is an asymmetric Brascamp–Lieb inequality, which does not exhibit this restriction.

Lemma 2.7

Assume that the single-site potential is perturbed strictly convex in the sense of (6). Let be a probability measure on given by

Then for any function and

where .

We call the last inequality asymmetric because, compared to the original Brascamp–Lieb inequality [4], the space is replaced by , and the factor is not evenly distributed. It is an interesting question whether an analogous statement also holds in higher dimensions. The proof of Lemma 2.7 is based on a kernel representation of the covariance. All steps are elementary.

Proof of Lemma 2.7. Let us consider a Gibbs measure associated to the Hamiltonian . More precisely, is given by

We start by deriving the following integral representation of the covariance of :

(11)

where the nonnegative kernel is given by

and so that . Indeed, we start by noting that

(12)

where we do not distinguish between the measure and its Lebesgue density in our notation. Using , we can use integration by parts to rewrite each factor in terms of the derivative

where assumes the value if and zero otherwise. Inserting this and the corresponding identity for into (12), we obtain

(13)

with kernel as desired, given by

We now establish the following identity for the above kernel:

(14)

Indeed, we have by integration by parts

Let us now consider the Gibbs measures and , given by

By the integral representation (11) of the covariance we have the estimate

By a straightforward calculation, we can estimate

Together with a similar estimate for this yields the kernel estimate

Applying this to the covariance estimate from above yields

Using the identity (14) for , we may easily conclude

□
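The kernel representation of the covariance used in the proof can be checked numerically. The sketch below assumes the classical Hoeffding form of the kernel, $K(x,y) = M(\min(x,y))\,(1 - M(\max(x,y)))$ with $M$ the distribution function, and takes the standard Gaussian as a stand-in for the Gibbs measure; both are assumptions made for illustration, as are the function names and grid parameters. It verifies $\operatorname{cov}(f,g) = \iint f'(x) K(x,y) g'(y)\,dx\,dy$ for $f(x) = x$ with $g(y) = y$ and $g(y) = y^3$.

```python
import math

def cdf(x):
    """Distribution function M(x) of the standard Gaussian measure."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def kernel(x, y):
    """Hoeffding-type kernel M(min(x, y)) * (1 - M(max(x, y))); nonnegative."""
    return cdf(min(x, y)) * (1.0 - cdf(max(x, y)))

def covariance_via_kernel(df, dg, half_width=8.0, n=400):
    """Midpoint-rule approximation of int int f'(x) K(x, y) g'(y) dx dy."""
    h = 2.0 * half_width / n
    pts = [-half_width + (i + 0.5) * h for i in range(n)]
    m = [cdf(p) for p in pts]  # pts is increasing, so m is increasing too
    total = 0.0
    for i, x in enumerate(pts):
        for j, y in enumerate(pts):
            lo, hi = (i, j) if i <= j else (j, i)
            total += df(x) * m[lo] * (1.0 - m[hi]) * dg(y)
    return total * h * h

# cov(x, x) = 1 and cov(x, x^3) = E[x^4] = 3 under the standard Gaussian.
cov_x_x = covariance_via_kernel(lambda x: 1.0, lambda y: 1.0)
cov_x_x3 = covariance_via_kernel(lambda x: 1.0, lambda y: 3.0 * y * y)
```

The kernel has a kink along the diagonal, so the midpoint rule converges only at second order; the tolerances used when comparing with the exact values should reflect that.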

For the entertainment of the reader, let us argue how the identity (14) also yields the traditional Brascamp–Lieb inequality in the case . Indeed, by the symmetry of the kernel , identity (14) yields, for all and ,

(15)

The integral representation of the covariance (11) yields

Then a combination of Hölder’s inequality and the identity (15) for the kernel yields the Brascamp–Lieb inequality,

(16)
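The traditional Brascamp–Lieb inequality in one dimension, $\operatorname{var}_\mu(f) \leq \int |f'|^2 / H''\,d\mu$ for $\mu \propto e^{-H}$ with $H'' > 0$, can be tested numerically. The following is a minimal sketch under assumptions: the strictly convex Hamiltonian $H(x) = x^4/4 + x^2/2$ and the test functions are illustrative choices, and the quadrature is a plain trapezoidal rule on a truncated interval.

```python
import math

def integrate(func, a=-8.0, b=8.0, n=16000):
    """Composite trapezoidal rule on [a, b] with n subintervals."""
    h = (b - a) / n
    total = 0.5 * (func(a) + func(b))
    for i in range(1, n):
        total += func(a + i * h)
    return total * h

def hamiltonian(x):
    """Illustrative strictly convex Hamiltonian H(x) = x^4/4 + x^2/2."""
    return 0.25 * x ** 4 + 0.5 * x * x

def d_hamiltonian(x):
    return x ** 3 + x

def d2_hamiltonian(x):
    return 3.0 * x * x + 1.0

Z = integrate(lambda x: math.exp(-hamiltonian(x)))

def expect(func):
    """Expectation with respect to the Gibbs measure exp(-H) / Z."""
    return integrate(lambda x: func(x) * math.exp(-hamiltonian(x))) / Z

def variance(f):
    mean = expect(f)
    return expect(lambda x: (f(x) - mean) ** 2)

def brascamp_lieb_bound(df):
    """Right-hand side int |f'|^2 / H'' dmu."""
    return expect(lambda x: df(x) ** 2 / d2_hamiltonian(x))

# Generic test function f = sin.
var_sin = variance(math.sin)
bound_sin = brascamp_lieb_bound(math.cos)

# f = H' is an equality case (integration by parts on both sides).
var_dh = variance(d_hamiltonian)
bound_dh = brascamp_lieb_bound(d2_hamiltonian)
```

The equality for $f = H'$ follows because both sides reduce to $\int H''\,d\mu$, which makes it a useful check that the weight $1/H''$ is placed correctly.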

2.2 Proof of auxiliary results

In this section we outline the proof of Proposition 2.1 and Lemma 2.2. We start with Proposition 2.1, which is the hierarchic criterion for the LSI. Unfortunately, we cannot directly apply the two-scale criterion of [11], Theorem 3. The reason is that the number

(17)

which measures the interaction between the microscopic and macroscopic scales, can be infinite for a perturbed strictly convex single-site potential . However, we follow the proof of [11], Theorem 3, with only one major difference: Instead of applying the classical covariance estimate (cf. Lemma 2.6), we apply the asymmetric Brascamp–Lieb inequality; cf. Lemma 2.7. Let us assume for the rest of this section that the single-site potential is perturbed strictly convex in the sense of (6).

For convenience, we set and . We choose on and the standard Euclidean structure given by

The coarse-graining operator given by (7) satisfies the identity

where is the adjoint operator of . Note that our differs from the of [11], because the Euclidean structure on Y differs from the Euclidean structure used in [11] by a factor. The last identity yields that is the orthogonal projection of to . Hence, one can decompose into the orthogonal sum of microscopic fluctuations and macroscopic variables according to

and

We apply this decomposition to the gradient of a smooth function on . The gradient is decomposed into a macroscopic gradient and a fluctuation gradient satisfying

(18)
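The orthogonal splitting of a gradient into a macroscopic part and a fluctuation part can be illustrated for coarse-graining of pairs. The sketch below assumes the standard Euclidean structure and consecutive pairs as blocks; the function names and the sample vector are hypothetical. The macroscopic part replaces each pair by its mean, the fluctuation part is the remainder, and the two squared norms add up to the squared norm of the full vector.

```python
def macroscopic_part(v):
    """Orthogonal projection onto vectors that are constant on each pair."""
    out = []
    for i in range(0, len(v), 2):
        mean = 0.5 * (v[i] + v[i + 1])
        out.extend([mean, mean])
    return out

def fluctuation_part(v):
    """Remainder after removing the macroscopic part; has zero block means."""
    return [a - b for a, b in zip(v, macroscopic_part(v))]

def dot(u, v):
    """Standard Euclidean scalar product."""
    return sum(a * b for a, b in zip(u, v))

# Sample gradient vector with entries summing to zero (illustrative values).
grad = [2.0, -1.0, 0.5, 3.5, -4.0, -1.0]
macro = macroscopic_part(grad)
fluct = fluctuation_part(grad)
```

Because the fluctuation part has zero mean on every block, it is tangent to the set of configurations with prescribed block means, which is why it plays the role of the gradient on the fibers.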

Note that is the tangent space of the fiber . Hence the gradient of on is given by . The first main ingredient of the proof of Proposition 2.1 is the following statement.

Lemma 2.8

The conditional measure given by (8) satisfies the LSI with constant uniformly in the system size , the macroscopic profile and the mean spin . More precisely, for any nonnegative function 

Proof of Lemma 2.8. Observe that the conditional measure has a product structure: We decompose into a product of Euclidean spaces. Namely for

we have

It follows from the coarea formula (cf. [8], Section 3.4.2) that

Hence is the product measure

(19)

where we make use of the notation introduced in (2). Because the single-site potential is perturbed strictly convex in the sense of (6), a combination of the Bakry–Émery criterion (cf. Theorem .3) and the Holley–Stroock criterion (cf. Theorem .2) yields that the measure satisfies the LSI with constant uniformly in . Then the tensorization principle (cf. Theorem .1) implies the desired statement.

For convenience, let us introduce the following notation: Let be an arbitrary function. Then its conditional expectation is defined by

The second main ingredient of the proof of Proposition 2.1 is the following proposition, which is the analogue of [11], Proposition 20.

Proposition 2.9

Assume that the marginal given by (8) satisfies the LSI uniformly in the system size and the mean spin . Then for any nonnegative function ,

uniformly in the macroscopic profile and the system size .

Before we verify Proposition 2.9, let us show how it can be used in the proof of Proposition 2.1.

Proof of Proposition 2.1. Using Lemma 2.8 and Proposition 2.9 from above, the argument is exactly the same as in the proof of [11], Theorem 3:

Let denote the function . The additive property of the entropy implies

An application of Lemma 2.8 yields the estimate

By assumption the marginal satisfies the LSI with constant . Together with Proposition 2.9 this yields the estimate

A combination of the last three formulas and the observations (8) and (18) yields