Strict convexity of the free energy of the canonical ensemble under decay of correlations
Abstract.
We consider a one-dimensional lattice system of unbounded, real-valued spins. We allow arbitrarily strong, attractive, nearest-neighbor interaction. We show that the free energy of the canonical ensemble converges uniformly in $C^2$ to the free energy of the grand canonical ensemble. The error estimates are quantitative. A direct consequence is that the free energy of the canonical ensemble is uniformly strictly convex for large systems. Another consequence is a quantitative local Cramér theorem which yields the strict convexity of the coarse-grained Hamiltonian. With small adaptations, the argument could be generalized to systems with finite-range interaction on a graph, as long as the degree of the graph is uniformly bounded and the associated grand canonical ensemble has uniform decay of correlations.
Key words and phrases:
Canonical ensemble, equivalence of ensembles, local Cramér theorem, strong interaction, phase transition
2010 Mathematics Subject Classification:
Primary: 82B05, Secondary: 60F05, 82B26.
1. Introduction
The broader scope of this article is the study of phase transitions. A phase transition occurs if a microscopic change in a parameter leads to a fundamental change in one or more properties of the underlying physical system. The best-known phase transition is when water becomes ice. Many physical and nonphysical systems and mathematical models have phase transitions. For example, liquid-to-gas phase transitions are known as vaporization. Solid-to-liquid phase transitions are known as melting. Solid-to-gas phase transitions are known as sublimation. More examples are the phase transition in the 2d Ising model (see for example [Sel16]), the Erdős–Rényi phase transition in random graphs (see for example [ErRe60], [ErRe61] or [KrSu13]) or phase transitions in social networks (see for example [Fro07]).
We are interested in studying a one-dimensional lattice system of unbounded real-valued spins. The system consists of a finite number of sites $\Lambda$ on the lattice $\mathbb{Z}$. For convenience, we assume that the set $\Lambda$ is given by $\{1, \ldots, N\}$. At each site $i \in \Lambda$ there is a spin $x_i$. In the Ising model each spin can take only one of two values. In this article, we consider real-valued spins $x_i \in \mathbb{R}$. A configuration of the lattice system is given by a vector $x \in \mathbb{R}^N$. The energy of a configuration is given by the Hamiltonian of the system. For the detailed definition of the Hamiltonian we refer to Section 2. We consider arbitrarily strong, attractive, nearest-neighbor interaction.
We consider two ensembles of the lattice system. The first ensemble is the grand canonical ensemble, which is given by the Gibbs measure
(1) 
Here, $Z$ is a generic normalization constant making the measure a probability measure. The constant $\sigma$ is interpreted as an external field. The second ensemble is the canonical ensemble. It emerges from the grand canonical ensemble by conditioning on the mean spin
(2) 
The canonical ensemble is given by the probability measure
(3) 
where denotes the dimensional Hausdorff measure.
The grand canonical ensemble has a phase transition on the two-dimensional lattice (see for example [Pei36]). However, on the one-dimensional lattice the grand canonical ensemble does not have a phase transition if the interaction decays fast enough (see for example [Isi25, Dob68, Dob74, Rue68, MeNi14]). More precisely, in this work a system has no phase transition if the infinite-volume Gibbs measure of the system is unique. It is a natural question whether the canonical ensemble also has no phase transition on the one-dimensional lattice. This is a nontrivial question since there are known examples where the grand canonical ensemble has no phase transition but the canonical ensemble does (see for example [ScSh96, BiChKo02, BiChKo03]).
If the spins are $\{0,1\}$-valued, there is no phase transition for the canonical ensemble on a one-dimensional lattice with nearest-neighbor interaction. The authors could not find a proof of that statement in the literature, but it follows from a result by Cancrini, Martinelli and Roberto [CMR02]. There, a logarithmic Sobolev inequality is deduced for the canonical ensemble on lattices of arbitrary dimension, provided the grand canonical ensemble satisfies a mixing condition. The mixing condition used in [CMR02] is that the grand canonical ensemble has an exponential decay of correlations that is uniform in the external field. This hypothesis is satisfied if the underlying lattice is one-dimensional. In our article we will use a similar mixing condition.
To the authors' knowledge, this question is still open if the spins are real-valued and unbounded. We conjecture that there is no phase transition in this case either, i.e. the infinite-volume Gibbs measure of the canonical ensemble should be unique. A first step toward verifying this conjecture is to study the equivalence of the grand canonical and canonical ensembles.
In equivalent ensembles, properties usually transfer from one ensemble to the other. The equivalence of ensembles in one-dimensional lattice systems was deduced by Dobrushin [DoTi77] for discrete (or bounded) spin values and by Georgii [Ge95] for quadratic Hamiltonians. However, our case, where the spin values are unbounded and the Hamiltonian is not quadratic, is still open.
There are many different notions of equivalence of ensembles. We only consider the simplest type, namely the equivalence of thermodynamic quantities (see for example [Ad06]). This means that as the system size goes to infinity the free energy of the grand canonical ensemble converges to the free energy of the canonical ensemble (for more details see Section 2 below).
In the main result of this article, i.e. in Theorem 2.3 below, we show that the grand canonical and canonical ensembles are equivalent. In fact, we show that the free energies converge uniformly in $C^2$ as the system size goes to infinity. The rate of convergence in Theorem 2.3 is explicit. We therefore extend and refine the results of Dobrushin [DoTi77] and Georgii [Ge95].
Our argument is quite general and should apply to more general situations. The argument does not use that the lattice is one-dimensional. Instead, it only uses that the grand canonical ensemble on a one-dimensional lattice has a uniform exponential decay of correlations (see for example [MeNi14] and [Zeg96]). Under the assumption of a uniform exponential decay of correlations, one should be able to use similar calculations to deduce the local Cramér theorem for spin systems on arbitrary graphs, as long as the degree is uniformly bounded and the interaction has finite range. However, we only consider the one-dimensional lattice with nearest-neighbor interaction in order to keep the notational burden low and the presentation of the ideas and calculations clear.
A consequence of Theorem 2.3 is that the free energy of the canonical ensemble is uniformly strictly convex and quadratic for large enough systems (see Corollary 2.4). Strict convexity of the free energy rules out phase coexistence, which corresponds to flat parts in the free energy. The most prominent example of phase coexistence is that under ordinary pressure water and ice can coexist at 0 degrees Celsius. We want to point out that our result already applies to large but finite systems. In the infinite-volume limit, ordinary equivalence of ensembles (and not equivalence in $C^2$) would suffice to conclude that the free energy of the canonical ensemble is strictly convex.
Closely related to the free energy of the canonical ensemble is the notion of the coarse-grained Hamiltonian (cf. (28) and [GrOtViWe09]). As in [GrOtViWe09], we derive from Theorem 2.3 a local Cramér theorem (see Theorem 2.6). The local Cramér theorem shows that the coarse-grained Hamiltonian converges in $C^2$ to the Legendre transform of the free energy of the grand canonical ensemble. It is a direct consequence of the local Cramér theorem that the coarse-grained Hamiltonian is also uniformly strictly convex for large enough system size (cf. Corollary 2.7).
The coarse-grained Hamiltonian plays an important role when studying the Kawasaki dynamics. The Kawasaki dynamics is a natural drift-diffusion process on our lattice system that conserves the mean spin of the system. The canonical ensemble is the stationary and ergodic distribution of the Kawasaki dynamics. The strict convexity of the coarse-grained Hamiltonian is a central ingredient for deducing a uniform logarithmic Sobolev inequality (LSI) for the canonical ensemble via the two-scale approach [GrOtViWe09].
The LSI characterizes the speed of convergence of the Kawasaki dynamics to the canonical ensemble. With the equivalence of dynamic and static phase transitions (see [MeNi14] or [Yos03]), a uniform LSI would also yield the absence of a phase transition and verify our conjecture (i.e. that the infinite-volume Gibbs measure is unique). Additionally, a uniform LSI is one of the main ingredients when deducing the hydrodynamic limit of the Kawasaki dynamics via the two-scale approach (see next paragraph). The uniform LSI for the canonical ensemble with no interaction is a well-known result (see for example [Cha03, LaPaYa02, GrOtViWe09]). For weak interaction the uniform LSI was deduced in [Me11]. The question whether the canonical ensemble satisfies a uniform LSI for strong nearest-neighbor interaction is still open. For $\{0,1\}$-valued spins the answer is yes (see [CMR02]). The authors believe that this should also be the case for unbounded real-valued spins.
The strict convexity of the coarse-grained Hamiltonian also plays a crucial role when deducing the hydrodynamic limit of the Kawasaki dynamics. The hydrodynamic limit is a law of large numbers for processes. It states that under the correct scaling the Kawasaki dynamics (which is a stochastic process) converges to the solution of a nonlinear heat equation (which is deterministic). It is conjectured by H.T. Yau that the hydrodynamic limit also holds for strong finite-range interactions on a one-dimensional lattice. So far, this conjecture is still wide open. The strict convexity of the coarse-grained Hamiltonian, which is deduced in this article, is an important cornerstone for tackling this problem with the help of the two-scale approach (see [GrOtViWe09]).
Let us now comment on how the equivalence of ensembles is deduced. The motivation for our approach comes from the proof of the local Cramér theorem in [GrOtViWe09] and [Me11]. By using Cramér’s trick of an exponential shift it suffices to show bounds on the density of a sum of random variables (see also Proposition 3.6 below). Those desired bounds were derived in [GrOtViWe09] and [Me11] via a local central limit theorem (clt) for independent random variables. Our situation is a lot more subtle: instead of deducing a local clt for independent random variables we would have to deduce a local clt for dependent random variables. At this point one could hope to use existing methods to deduce the local clt. Let us mention for example the approach of Dobrushin [Dob74], the approach of Bender [Ben73] or the approach of Wang and Woodroofe [WaWo90]. Unfortunately, this does not help. All methods, at least the ones that are known to the authors, use the following principle (see also [DeSaMe16]):
(4) 
The first ingredient, namely the integral clt for the dependent random variables is relatively easy to deduce. There are a lot of methods available. Let us mention for example Stein’s method (see for example [ChGoSh11]), methods that are based on mixing, or methods that are based on Donsker’s theorem (see for example [Dur2010]). Deducing the second ingredient is tricky, not to mention that Dobrushin [Dob74] carried out that step only for discrete or bounded random variables.
All in all, this approach has two fundamental problems. The first one is that we need not only to control the density itself but also the first and second derivative. As a consequence, one would need very detailed information about the regularity of the density. We also believe that showing this regularity is as hard as directly deducing the local central limit theorem. Let us turn to the second problem. In order to deduce Theorem 2.3 the local central limit theorem must be quantitative. Using the principle from above yields suboptimal rates of convergence. For deducing Theorem 2.3 one has to iteratively apply the principle three times; and in each iteration the convergence rate gets worse. One would have to hope that in the end the convergence rate is still good enough for deducing Theorem 2.3.
Instead of using the principle from above, we generalize a well-known method for proving the local clt for independent random variables to dependent ones. We generalize the method that is based on characteristic functions and Fourier inversion (see [Fel71] and [GrOtViWe09]). The calculations get quite involved and lengthy. We do not deduce a local clt for dependent random variables in this work. Instead, we only deduce the bounds that are needed to deduce Theorem 2.3 (cf. Proposition 3.6 below). However, one could use our calculations as a guideline for deducing a quantitative, local clt for dependent random variables. When doing so, one would have to substitute some of our arguments that use the specific structure of our lattice model. We use the following special structure:

Exponential decay of correlations (see Lemma 3.5).

The interaction has finite range . More precisely, we use that two spins and become independent if and one conditions on the spin values between them (see Section 2).

The Hamiltonian is quadratic. More precisely, we use the following consequence. For all the conditional variances are bounded from above and below uniformly in the values (see Section 2).
As mentioned before, the local Cramér theorem (see Theorem 2.6) is deduced by generalizing the argument of [GrOtViWe09] for independent random variables to dependent random variables. This adds a lot more complexity to the task. We overcome the technical challenges of considering dependent random variables by using two strategies. The first strategy is to induce artificial independence by conditioning on even or odd random variables. The second strategy is to handle dependencies as a perturbation. We morally treat large blocks as single sites of a coarse-grained system. Because there is a big distance between the blocks, the blocks are only weakly dependent. Then, the error term can be controlled by using the decay of correlations. For more details we refer to the comments after Proposition 3.6 and at the beginning of Section 4.
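The first strategy can be made concrete in the Gaussian special case. The following Python sketch is only an illustration under assumptions that are not taken from this article: the bounded perturbation of the single-site potential is dropped, the Hamiltonian is normalized as one half of x^T Q x with a hypothetical tridiagonal matrix Q, and sites are indexed from zero. For a Gaussian vector, the conditional law of a block of coordinates given all other coordinates has as its precision matrix the corresponding sub-block of Q. With nearest-neighbor coupling, the sub-block on every other site is diagonal, so these spins become independent once the sites in between are fixed.

```python
def precision_matrix(n, coupling):
    # Precision matrix Q of the Gaussian measure with density proportional to
    # exp(-0.5 * x^T Q x): ones on the diagonal, -coupling between nearest
    # neighbors. Hypothetical normalization; |coupling| < 0.5 keeps Q
    # positive definite.
    q = [[0.0] * n for _ in range(n)]
    for i in range(n):
        q[i][i] = 1.0
        if i + 1 < n:
            q[i][i + 1] = q[i + 1][i] = -coupling
    return q


def conditional_precision(q, sites):
    # For a Gaussian vector, the law of the spins at `sites` given all other
    # spins is Gaussian with precision matrix equal to the sub-block of Q.
    return [[q[i][j] for j in sites] for i in sites]


def is_diagonal(m):
    # A diagonal precision matrix means the Gaussian coordinates are independent.
    size = len(m)
    return all(m[i][j] == 0.0 for i in range(size) for j in range(size) if i != j)


N, J = 9, 0.4
Q = precision_matrix(N, J)
every_other_site = list(range(0, N, 2))

# Conditioned on the sites in between, the selected spins decouple.
print(is_diagonal(conditional_precision(Q, every_other_site)))
# Without conditioning, neighboring spins are coupled.
print(is_diagonal(Q))
```

In the non-Gaussian setting of this article the same decoupling comes from the nearest-neighbor structure of the Hamiltonian rather than from a precision matrix, so the sketch should be read as intuition only.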
Let us briefly discuss possible generalizations of our main result. We expect that one can generalize our method with only slight modifications to the following situations:

finite-range interaction instead of nearest-neighbor interaction.

sufficiently fast algebraic decay of correlations instead of exponential decay.

any lattice or graph with bounded degree instead of a 1d lattice, as long as the grand canonical ensemble has sufficient decay of correlations, uniformly in the system size and the external field.

repulsive and mixed interactions instead of attractive interaction, as long as the estimate
(5) is satisfied. For attractive interaction this estimate is deduced in Lemma 3.2.
More challenging, it would be very interesting to study the local Cramér theorem for the following changes:

Instead of finite-range interaction one could consider infinite-range interaction. In the case of the grand canonical ensemble on a one-dimensional lattice, there is no phase transition if the interaction decays algebraically faster than . The decay condition is sharp (see for example [MeNi14] and references therein). It would be very interesting to know whether the local Cramér theorem (see Theorem 2.6) also holds for this decay or whether a stronger algebraic decay is needed.

In our model we need a quadratic single-site potential. Inspired by [FaMe14], it is natural to ask whether the local Cramér theorem also holds for super-quadratic or polynomially increasing single-site potentials.

Inspired by [Dob74] or [Ge95], it would be interesting to study interactions more general than pairwise quadratic interaction.
We conclude the introduction by giving a short overview of the remainder of the article. In Section 2 we introduce the precise setting and formulate the main results. In Section 3 we deduce the main results of this article up to Proposition 3.6. The main computations are done in Section 4, where we give the proof of Proposition 3.6.
Conventions and Notation

The symbol denotes the term that is given by the line .

By uniform we mean that a statement holds uniformly in the system size $N$, the mean spin $m$ and the external field $\sigma$.

We denote by $C$ a generic uniform constant. This means that the actual value of $C$ might change from line to line or even within a line.

denotes that there is a uniform constant such that .

means that and .

denotes the dimensional Hausdorff measure.

is a generic normalization constant. It denotes the partition function of a measure.
2. Setting and main results
We start by explaining the details of our model. We consider the sublattice $\Lambda = \{1, \ldots, N\}$. The Hamiltonian of the system is defined as
(6) 
We make the following assumptions:

The single-site potential can be written as
(7) where the function satisfies
(8) 
The numbers are arbitrary. They model the interaction of the system with an external field or the boundary.

The number is arbitrary. It models the strength of the interaction. The interaction is attractive if . The interaction is repulsive if .
Now, let us turn to the first main result of this article, namely the equivalence of ensembles (see Theorem 2.3 below). The grand canonical ensemble (gce) is a probability measure on $\mathbb{R}^N$ given by the Lebesgue density
(9) 
The free energy of the gce is given by (cf. (1) and [GrOtViWe09])
(10) 
We observe that is uniformly strictly convex. More precisely, it holds:
Lemma 2.1.
Let be a real-valued random variable distributed according to the gce . Assume that
(11) 
Then the free energy of the gce is uniformly strictly convex in the sense that there exists a constant such that for all
(12) 
The proof of Lemma 2.1 is given in Section 3. The core ingredient of the argument is a uniform Poincaré inequality. The additional assumption (11) is not very restrictive. For example, it is automatically satisfied if the interaction is attractive (see Lemma 3.2 below).
Let us turn to the canonical ensemble (ce) . It emerges from the gce by conditioning (i.e. fixing) on the mean spin
(13) 
The ce is given by the probability measure
(14) 
where denotes the dimensional Hausdorff measure. The free energy of the ce is given by
(15) 
Equivalence of ensembles only holds if the external field of the gce and the mean spin of the ce are related in the following way.
Assumption 2.2.
Now, let us formulate our first main result, namely the equivalence of the free energies in $C^2$.
Theorem 2.3 (Equivalence of ensembles).
Let be a real-valued random vector distributed according to
(17) 
Assume that
(18) 
Then it holds that
(19) 
where the convergence is uniform in the mean spin and the external field . More precisely, given a constant , there is an integer such that for all
(20)  
(21)  
(22) 
We would like to emphasize that Theorem 2.3 contains explicit rates of convergence. We give the proof of Theorem 2.3 in Section 3.
A direct consequence of Lemma 2.1 and Theorem 2.3 is that the free energy is uniformly strictly convex for large enough systems.
Corollary 2.4.
There is a uniform constant and an integer such that for all and all
(23) 
Let us turn to the second main result of this article, the local Cramér theorem. For that purpose let us introduce which denotes the Legendre transform of the free energy (also denoted by ) i.e.
(24) 
It follows from elementary observations that is uniformly strictly convex.
Lemma 2.5.
For any
(25) 
Additionally, under the same assumptions as in Theorem 2.3, it holds that is uniformly strictly convex in the sense that there is a uniform constant such that for all
(26) 
The coarsegrained Hamiltonian is defined as
(27) 
Hence, we can rewrite the free energy of the ce as
(28) 
It follows that the difference of the free energies and can be expressed as
(29)  
(30) 
From Theorem 2.3 we deduce the following local Cramér theorem.
Theorem 2.6 ($C^2$ local Cramér theorem).
Let be a real-valued random vector distributed according to
(31) 
Assume that
(32) 
Then it holds that
(33) 
where the convergence is uniform in the mean spin and the external field . More precisely, given a constant , there is an integer such that for all
(34)  
(35)  
(36) 
Theorem 2.6 is an extension of the local Cramér theorems that were deduced in Proposition 31 in [GrOtViWe09], Theorem 4 in [Me11] and [MeOt13]. The proof of Theorem 2.6 is given in Section 3. The main ingredient is Theorem 2.3.
An important consequence of Lemma 2.5 and of Theorem 2.6 is that for large enough systems the coarse-grained Hamiltonian is uniformly strictly convex.
Corollary 2.7.
Under the assumptions of Theorem 2.6 there is a positive integer such that for all the coarse-grained Hamiltonian is uniformly strictly convex. More precisely, there is a uniform constant such that for all
(37) 
3. Proof of the main results
In this section we prove the main results of this article.
Assumption 3.1.
From now on we assume that is a real-valued random vector distributed according to
(38) 
We begin with a simple auxiliary lemma.
Lemma 3.2.
The last lemma shows that the variance of the mean spin of the gce is well behaved.
Proof of Lemma 3.2. For the proof of the upper bound, one can apply a result of [MeNi14], which provides the logarithmic Sobolev inequality (LSI) for . It is well known that the logarithmic Sobolev inequality implies the spectral gap inequality (SG). This implies
(40) 
where is a constant in spectral gap inequality independent of .
The proof of the lower bound relies on [Me11, Lemma 9]. Let be a random variable distributed according to the distribution
(41) 
In [Me11, Lemma 9], it was shown that there is a constant such that for all ,
(42) 
Observe that the conditional expectation has the Lebesgue density
(43) 
Let be the measure defined by the following property
(44) 
Then it follows that
(45)  
(46)  
(47) 
Moreover, Menz and Nittka (cf. [MeNi14]) proved that for we have
(48) 
Then it follows using (47) and (48) that
(49) 
∎
Proof of Lemma 2.1. It is a direct consequence of Lemma 3.2. Indeed, we have
(50)  
(51) 
Taking the derivative with respect to again, we obtain
(52)  
(53) 
Therefore, we conclude from Lemma 3.2 that there is a constant with
(54) 
∎
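The identity behind this proof, namely that the second derivative of a log-partition function with respect to the external field is a variance, can be checked numerically in the simplest case. The following Python sketch is only an illustration under assumptions that do not come from this article: it takes a single site, a hypothetical bounded perturbation cos(x) of the quadratic potential, and plain trapezoidal quadrature on a truncated interval.

```python
import math


def psi(x):
    # Hypothetical single-site potential: quadratic part plus a bounded perturbation.
    return 0.5 * x * x + math.cos(x)


# Uniform quadrature grid on [-12, 12]; the Gaussian tails beyond are negligible.
STEP = 0.005
GRID = [-12.0 + STEP * k for k in range(4801)]


def integrate(f):
    # Composite trapezoidal rule on GRID.
    vals = [f(x) for x in GRID]
    return STEP * (sum(vals) - 0.5 * (vals[0] + vals[-1]))


def log_partition(sigma):
    # Logarithm of the integral of exp(sigma * x - psi(x)) over the real line.
    return math.log(integrate(lambda x: math.exp(sigma * x - psi(x))))


def variance(sigma):
    # Variance of X under the density proportional to exp(sigma * x - psi(x)).
    z = integrate(lambda x: math.exp(sigma * x - psi(x)))
    m1 = integrate(lambda x: x * math.exp(sigma * x - psi(x))) / z
    m2 = integrate(lambda x: x * x * math.exp(sigma * x - psi(x))) / z
    return m2 - m1 * m1


def second_derivative(f, s, h=1e-2):
    # Central finite difference approximation of f''(s).
    return (f(s + h) - 2.0 * f(s) + f(s - h)) / (h * h)


for sigma in (-1.0, 0.0, 2.0):
    # The two printed numbers should agree up to discretization error.
    print(sigma, second_derivative(log_partition, sigma), variance(sigma))
```

For a system of several interacting sites the same computation applies with the sum of all spins in place of the single spin, which is why a two-sided bound on the variance of the mean spin translates into uniform strict convexity.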
Proof of Lemma 2.5. Let us denote . Since is the Legendre transform of the strictly convex function , there exists a unique such that
(55) 
Moreover, for each , satisfies
(56) 
which is equivalent to
(57)  
(58)  
(59) 
Then it follows that
(60)  
(61)  
(62)  
(63) 
In Lemma 2.1 we proved
(64) 
Thus Lemma 3.2 implies that there exists a constant with
(65) 
∎
Let us now turn to the proof of Theorem 2.3. We need some more auxiliary results. The first one is Cramér’s trick of exponential shift of measures.
Lemma 3.3.
It holds that
(66)  
(67) 
Here, denotes the distribution of
(68) 
Proof of Lemma 3.3. The lemma follows from a direct computation:
(69)  
(70)  
(71)  
(72)  
(73)  
(74)  
(75) 
Taking the exponential function and using (30) yields the lemma as desired. ∎
Next, we need the following direct consequence of Lemma 3.2.
Lemma 3.4.
Proof of Lemma 3.4. Using the arithmetic-geometric mean inequality we get
(77)  
(78)  
(79) 
The lemma now follows from the observation that because the gce satisfies a uniform Poincaré inequality it holds that for all and
(80) 
where the constant only depends on . ∎
The next auxiliary lemma states that on a one-dimensional lattice with nearest-neighbor interaction, the gce has uniform exponential decay of correlations.
Lemma 3.5.
For all functions we have
(81)  
(82) 
where and denotes the support of the function and respectively.
Proof of Lemma 3.5. See [MeNi14]. ∎
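In the Gaussian special case the decay of correlations can be observed directly, since the covariance matrix is the inverse of the precision matrix. The following Python sketch is an illustration only and is not the argument of [MeNi14]. It assumes a hypothetical normalization (no bounded perturbation, density proportional to exp of minus one half of x^T Q x, tridiagonal Q with coupling 0.4) and uses a small hand-written Gauss-Jordan inversion.

```python
def chain_precision(n, coupling):
    # Hypothetical Gaussian chain: ones on the diagonal, -coupling between
    # nearest neighbors; |coupling| < 0.5 keeps the matrix positive definite.
    q = [[0.0] * n for _ in range(n)]
    for i in range(n):
        q[i][i] = 1.0
        if i + 1 < n:
            q[i][i + 1] = q[i + 1][i] = -coupling
    return q


def invert(matrix):
    # Gauss-Jordan elimination with partial pivoting; adequate for small matrices.
    n = len(matrix)
    aug = [row[:] + [1.0 if i == j else 0.0 for j in range(n)]
           for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [a - factor * b for a, b in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


N, J = 12, 0.4
cov = invert(chain_precision(N, J))

# Ratio of covariances between the first spin and two consecutive spins.
# A ratio bounded away from 1 means geometric (exponential) decay in the distance.
ratios = [cov[0][k + 1] / cov[0][k] for k in range(N - 1)]
print(ratios)
```

For this particular matrix each step away from the first site reduces the covariance by at least a factor of two, which is the Gaussian analogue of the exponential decay in the distance between the supports.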
Proposition 3.6.
For each and , there exists a uniform constant and an integer such that for all and all
(83)  
(84)  
(85) 
The statement of Proposition 3.6 should be compared to Proposition 31 in [GrOtViWe09] or Proposition 3.1 in [MeOt13]. The main difference is that in our situation the random variables are dependent. This also makes the proof of Proposition 3.6 a lot harder.
The estimates of Proposition 3.6 are motivated by the task of deducing a quantitative local central limit theorem for the properly normalized sum of the random variables . For example, if the random variables are i.i.d., the estimate (83) is a weaker version of the quantitative local clt estimate
(86) 
The last inequality states that the density of the normalized sum at point converges to the density of the normal distribution. As we mentioned in the introduction, we believe that one could strengthen the estimates of Proposition 3.6 to get a local central limit theorem for dependent random variables. However, we choose to derive weaker bounds instead because they are sufficiently strong for deducing our main results (see Theorem 2.3 and Theorem 2.6). Deducing those weaker estimates is already quite subtle and challenging.
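The Fourier-inversion route to such an estimate can be tried out numerically in the i.i.d. case. The following Python sketch is an illustration under assumptions of our own choosing: the random variables are uniform on an interval with mean zero and variance one (so their characteristic function is explicit), the density of the normalized sum is evaluated only at the origin, and the inversion integral is truncated and discretized by the trapezoidal rule.

```python
import math


def char_fn(t):
    # Characteristic function of the uniform law on [-sqrt(3), sqrt(3)],
    # which has mean 0 and variance 1.
    a = math.sqrt(3.0) * t
    return 1.0 if abs(a) < 1e-12 else math.sin(a) / a


def density_at_zero(n, cutoff=60.0, step=0.005):
    # Fourier inversion for the density g_n of (X_1 + ... + X_n) / sqrt(n):
    #   g_n(0) = (1 / (2 pi)) * integral over xi of char_fn(xi / sqrt(n)) ** n.
    # The integrand is even, so we integrate over [0, cutoff] and double.
    steps = int(round(cutoff / step))
    total = 0.0
    for k in range(steps + 1):
        weight = 0.5 if k in (0, steps) else 1.0
        total += weight * char_fn(k * step / math.sqrt(n)) ** n
    return 2.0 * step * total / (2.0 * math.pi)


# Value of the standard normal density at the origin.
GAUSS_AT_ZERO = 1.0 / math.sqrt(2.0 * math.pi)

for n in (5, 10, 20, 40):
    err = abs(density_at_zero(n) - GAUSS_AT_ZERO)
    # err shrinks with n, while n * err stays bounded for this symmetric law.
    print(n, err, n * err)
```

For dependent random variables the characteristic function of the sum no longer factorizes into a power of a single-site characteristic function, which is the point where the additional work of Section 4 enters.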
We deduce Proposition 3.6 in Section 4. There, we also comment on how to overcome the problem of considering dependent random variables and not independent ones. Now, we are prepared for the proof of Theorem 2.3.
Proof of Theorem 2.3. Let us begin with the estimate (20). From (66), we have
(87) 
Then a combination of (83) and (87) yields, as desired,
(88) 
Let us turn to the estimate (21). Taking the derivative with respect to in (87) yields
(89) 
Let us choose . Then a combination of (83), (84) and (89) implies
(90) 
Let us turn to the estimate (22). Differentiating (89) again, we get
(91) 
Then after choosing , a combination of (83), (84) and (85) yields
(92) 
∎
Let us proceed to the proof of Theorem 2.6.
Proof of Theorem 2.6. Recall the difference of the free energies (30) of and
(93) 
Then the first desired estimate (34) follows from a combination of (30) and (20) in Theorem 2.3. Let us turn to the estimate (35). A direct computation yields
(94)  
(95)  
(96) 
Then (21), (64) and Lemma 3.2 imply, as desired,
(97) 