# Strong Spatial Mixing and Approximating Partition Functions of Two-State Spin Systems without Hard Constraints

Jinshan Zhang
Department of Mathematical Sciences
Tsinghua University
Beijing, China 100084

Email: zjs02@mails.tsinghua.edu.cn

Abstract: We prove that the Gibbs distribution of two-state spin systems (also known as binary Markov random fields) without hard constraints on a tree exhibits strong spatial mixing (also known as strong correlation decay), under the assumption that, for arbitrary ‘external field’, the absolute value of the ‘inverse temperature’ is small, or the ‘external field’ is uniformly large or small. The first condition on the ‘inverse temperature’ is tight if the distribution is restricted to ferromagnetic or antiferromagnetic Ising models.

Thanks to Weitz’s self-avoiding tree, we extend the result to sparse on average graphs, which generalizes part of the recent work of Mossel and Sly [15], who proved the strong spatial mixing property for the ferromagnetic Ising model. Our proof yields a different approach, carefully exploiting the monotonicity of the local recursion. To the best of our knowledge, the second condition, on the ‘external field’, is considered here for the first time for strong spatial mixing, and it is stated in terms of the ‘maximum average degree’ and the ‘interaction energy’. As an application, we present an FPTAS for partition functions of two-state spin models without hard constraints under the above assumptions in a general family of graphs including interesting bounded degree graphs.

Keywords: Strong Spatial Mixing; Self-Avoiding Trees; Two-State Spin Systems; Ising Models; FPTAS; Partition Function

## I Introduction

Counting problems have played an important role in theoretical computer science since Valiant [19] introduced the notion of #P-completeness and proved that many enumeration problems are computationally intractable. The most successful and powerful existing method for counting problems is the Markov chain method, which has been used to provide fully polynomial randomized approximation schemes (FPRAS) (which approximate the real value within a factor of $1\pm\epsilon$ in time polynomial in the input size and $\epsilon^{-1}$, with probability at least 3/4) for the volume of convex bodies [3] and the number of perfect matchings in bipartite graphs [9]. Many counting problems, such as the number of matchings, independent sets, circuits [14] etc., can be viewed as special cases of computing partition functions associated with Gibbs measures in statistical physics. Hence studying the computation of partition functions is a natural extension of counting problems.

Self-reducibility [10], or the conditional probability method, is a well-known way to compute partition functions when the marginal probability of a vertex can be efficiently approximated. Gibbs sampling, also known as Glauber dynamics, is a popular method for approximating marginal probabilities. It is a Markov chain approach that locally updates the chain according to the conditional Gibbs measure. Hence studying the convergence rate (also known as the mixing time) of Glauber dynamics has become a major research direction. Recently the question of whether the Glauber dynamics converges ‘fast’ (in time polynomial in the input size and the logarithm of the reciprocal of the sampling error), which is deeply related to whether a phase transition takes place in the statistical model, has been extensively studied; see [16] for the hard-core model (also known as the independent set model) and [15][6] for the ferromagnetic Ising model. Another approach to approximating marginal probabilities comes from the structure of Gibbs measures on various graphs. This method uses local recursion and leads to deterministic approximation schemes rather than the randomized approximation schemes of the Markov chain method. Our paper focuses on this recursive approach.
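The conditional probability method mentioned above can be illustrated on a toy instance. The sketch below is only an illustration under stated assumptions: it uses a small Ising model with made-up couplings `J` and fields `B`, and it replaces the approximate marginal oracle by exact brute-force enumeration, so the function names (`marginal_plus`, `z_by_self_reduction`) are hypothetical and not taken from the paper. It recovers $Z$ from the identity $P(\text{all } +)=\prod_v P(X_v=+\mid X_0=\dots=X_{v-1}=+)$.

```python
import itertools
import math

def weight(sigma, J, B):
    """Unnormalized Ising weight exp(sum J_ij s_i s_j + sum B_i s_i)."""
    energy = sum(Jij * sigma[i] * sigma[j] for (i, j), Jij in J.items())
    energy += sum(Bi * si for Bi, si in zip(B, sigma))
    return math.exp(energy)

def partition_function(n, J, B, fixed=None):
    """Brute-force Z restricted to configurations that agree with `fixed`."""
    fixed = fixed or {}
    total = 0.0
    for sigma in itertools.product((1, -1), repeat=n):
        if all(sigma[i] == s for i, s in fixed.items()):
            total += weight(sigma, J, B)
    return total

def marginal_plus(n, J, B, v, fixed):
    """P(X_v = + | fixed); stands in for an approximate marginal oracle."""
    plus = partition_function(n, J, B, {**fixed, v: 1})
    return plus / partition_function(n, J, B, fixed)

def z_by_self_reduction(n, J, B):
    """Z = weight(all plus) / prod_v P(X_v = + | earlier vertices fixed to +)."""
    prob_all_plus = 1.0
    fixed = {}
    for v in range(n):
        prob_all_plus *= marginal_plus(n, J, B, v, fixed)
        fixed[v] = 1
    return weight((1,) * n, J, B) / prob_all_plus

# Toy instance (assumed values): 4 vertices, mixed-sign couplings, varying fields.
J = {(0, 1): 0.3, (1, 2): -0.2, (0, 2): 0.5, (2, 3): 0.1}
B = [0.2, -0.4, 0.0, 0.7]
z_direct = partition_function(4, J, B)
z_reduced = z_by_self_reduction(4, J, B)
```

If each marginal is only known up to a small relative error, the product of $n$ of them inherits roughly $n$ times that error, which is why accurate marginals are the crux of the approach.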

The recursive approach to counting problems was introduced by Weitz [21] and by Bandyopadhyay and Gamarnik [1] for counting independent sets and colorings. The key to this method is to establish the property known as correlation decay (also called spatial mixing) on certain rooted trees, which means that the marginal probability of the root is asymptotically independent of the configuration on the leaves far below. Usually exponential decay with the distance implies a deterministic polynomial time approximation algorithm for the marginal probability of the root. In [21], Weitz establishes the equivalence between the marginal probability of a vertex in a general graph $G$ and that of the root of a tree, named the self-avoiding tree, associated with $G$ for two-state spin systems, and shows that the correlations on any graph decay at least as fast as on its corresponding self-avoiding tree. He also proves strong correlation decay for the hard-core model on bounded degree trees. Later Gamarnik et al. [5] and Bayati et al. [2] bypass the construction of a self-avoiding tree by instead creating a certain computation tree and establishing strong correlation decay on the corresponding computation tree for list coloring and matching problems. An interesting relation between the self-avoiding tree and the computation tree is that they share the same recursive formula for the hard-core model. Considering the motivation behind the construction of the self-avoiding tree, Jung and Shah [8] and Nair and Tetali [17] generalize Weitz’s work to certain Markov random field models, and Lu et al. [12] to the TP decoding problem. Mossel and Sly [15] show that the ferromagnetic Ising model exhibits strong correlation decay on ‘sparse on average’ graphs under the tight assumption that the ‘inverse temperature’, in terms of the ‘maximum average degree’, is small.

In this paper, based on the self-avoiding tree, we establish strong spatial mixing for general two-state spin systems (also known as binary Markov random fields) without hard constraints on graphs that are sparse on average, under certain assumptions. Our first condition is on the ‘inverse temperature’. We show that there exists a threshold value, expressed in terms of the ‘maximum average degree’ $d$, such that if the absolute value of the ‘inverse temperature’ is smaller than this threshold, then for arbitrary ‘external field’ the Gibbs measure exhibits strong spatial mixing on a sparse on average graph. For the (anti)ferromagnetic Ising model, strong spatial mixing on a finite regular tree implies uniqueness of Gibbs measures on the infinite regular tree [4][20]. The threshold in our setting is the critical point for uniqueness of Gibbs measures on the infinite regular tree in which every vertex has degree $d$, implying that our condition is also necessary on trees. The condition is the same as that of Mossel and Sly [15] when the ferromagnetic Ising model is the only focus. This places part of their work within our framework. Our proof yields a different approach, and also avoids the argument relating weak spatial mixing to strong spatial mixing employed in [21]. In fact our proof is based on an inequality similar to the one in [13] and carefully exploits the monotonicity of the recursive formula. The recursive formula on trees is well known. Recently Pemantle and Peres [18] used it to present the exact capacity criteria that govern the behavior at the critical point of the ferromagnetic Ising model on trees under various boundary conditions. Our second condition is on the ‘external field’. We prove that, for any ‘inverse temperature’ on a graph that is sparse on average, the Gibbs distribution exhibits strong spatial mixing when the ‘external field’ is uniformly larger than $B(d,\alpha_{\max},\gamma)$ or smaller than $-B(d,-\alpha_{\min},\gamma)$, where $d$ is the ‘maximum average degree’ and $\alpha_{\max}$, $\alpha_{\min}$, $\gamma$ are parameters of the system. To the best of our knowledge, this condition on the ‘external field’ is considered here for the first time for strong spatial mixing. The technique employed in the proof is the Lipschitz approach, which has been used in [1][2][5]. The novelty here is that we propose a ‘path’ characterization of this method, allowing us to give the ‘external field’ condition in terms of the ‘maximum average degree’ rather than the maximum degree. Some notions of ‘sparse on average’ graphs have appeared in [15]. These are graphs where the sum of the degrees along each self-avoiding path (a path with distinct vertices) of length $O(\log n)$ is $O(\log n)$.

As an application of our results, we present a fully polynomial time approximation scheme (FPTAS) (which approximates the real value within a factor of $1\pm\epsilon$ in time polynomial in the input size and $\epsilon^{-1}$) for partition functions of two-state spin systems without hard constraints under our assumptions on graphs $G$ where, for each vertex of $G$, the total number of vertices of its associated self-avoiding tree of height $O(\log n)$ is polynomial in $n$. This includes bounded degree graphs and, in particular, lattices, which are of special interest in statistical physics. Jerrum and Sinclair [7] provided an FPRAS for the partition function of the ferromagnetic Ising model for any graph with any positive ‘inverse temperature’ and identical external field for all the vertices. Their results cover neither the case where different vertices have different external fields nor the antiferromagnetic Ising model, where the ‘inverse temperature’ is negative.

The remainder of the paper is structured as follows. In Section II, we present some preliminary definitions and the main results. We go on to prove the theorems in Section III. Section IV is devoted to an FPTAS for the partition functions under our conditions. Further work and conclusions are given in Section V.

## II Preliminaries and Main Results

### II-A Two-State Spin Systems

In a two-state spin system on a finite graph $G=(V,E)$ with vertex set $V=\{1,2,\dots,n\}$ and edge set $E$, a configuration $\sigma$ consists of an assignment of values, or “spins”, to each vertex (or “site”) of $V$. Each vertex $i\in V$ is associated with a random variable $X_i$ taking two values. We often refer to the spin values as $+$ and $-$. The probability of finding the system in configuration $\sigma$ is given by the joint distribution of the $n$-dimensional random vector $X=(X_1,\dots,X_n)$ (also known as the Gibbs distribution with nearest neighbor interaction)

$$P_G(X=\sigma)=\frac{1}{Z(G)}\exp\Big(\sum_{(i,j)\in E}\beta_{ij}(\sigma_i,\sigma_j)+\sum_{i\in V}h_i(\sigma_i)\Big).$$

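Here $Z(G)$ is called the partition function of the system and is a normalizing factor such that the probabilities sum to one, and $\beta_{ij}$ and $h_i$ are functions of the pair of spins $(\sigma_i,\sigma_j)$ and of the single spin $\sigma_i$ respectively. We say the system has hard constraints if there exist an edge or a vertex, and an assignment of spins to it, on which $\beta_{ij}$ or $h_i$ takes the value $-\infty$, so that the corresponding configurations have probability zero (e.g. the hard-core model is one of the systems with hard constraints). In this paper we focus on systems without hard constraints. We call the function $\beta_{ij}$ the ‘interaction energy’ and $h_i$ the ‘applied field’. If $\beta_{ij}(\sigma_i,\sigma_j)=J_{ij}\sigma_i\sigma_j$ and $h_i(\sigma_i)=B_i\sigma_i$ for all edges $(i,j)$ and vertices $i$, where $J_{ij}$ and $B_i$ are constants varying with edges or vertices, the system is called the Ising model. Further, if $J_{ij}$ is uniformly (negative) positive for all $(i,j)\in E$, the system is called the (anti)ferromagnetic Ising model. $J_{ij}$ and $B_i$ are called the ‘inverse temperature’ and the ‘external field’ respectively. To match the notation of the Ising model, set $J_{ij}=\frac{\beta_{ij}(+,+)+\beta_{ij}(-,-)-\beta_{ij}(+,-)-\beta_{ij}(-,+)}{4}$ and $B_i=\frac{h_i(+)-h_i(-)}{2}$ for all edges $(i,j)$ and vertices $i$; in this paper we call $J_{ij}$ and $B_i$ the ‘inverse temperature’ and ‘external field’ of general two-state spin systems without hard constraints (abbreviated TSSHC) respectively. For any $\Lambda\subseteq V$, $\sigma_\Lambda$ denotes the set $\{\sigma_i, i\in\Lambda\}$. With a little abuse of notation, $\sigma_\Lambda$ also denotes the condition that each vertex $i\in\Lambda$ is fixed to $\sigma_i$. Let $Z(G,\Phi)$ denote the partition function under the condition $\Phi$, e.g. $Z(G,X_i=+)$ represents the partition function under the condition that the vertex $i$ is fixed to $+$.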

### II-B Definitions and Notations

Definition 2.1 (Self-Avoiding Tree) Consider a graph $G=(V,E)$ and a vertex $v$ in $G$. Fix any order on the vertices of $G$. It induces a partial order on $E$ under which two edges are comparable iff they share a common vertex. The self-avoiding tree $T_{saw}(v)$ (for simplicity denoted by $T_{saw}$) corresponding to the vertex $v$ is the tree of self-avoiding walks originating at $v$, except that the vertices closing a cycle are also included in the tree and are fixed to be either $+$ or $-$. Specifically, the vertex of the tree closing a cycle is fixed to $+$ if the edge ending the cycle is larger than the edge starting the cycle, and to $-$ otherwise. Given any configuration $\sigma_\Lambda$ of $G$, $\Lambda\subset V$, the self-avoiding tree is constructed by the same procedure, except that each vertex which is a copy of a vertex $i\in\Lambda$ is fixed to the same spin $\sigma_i$ as $i$ and the subtree below it is not constructed (see Figure 1). Hence, for any configuration $\sigma_\Lambda$ of $G$, $\Lambda\subset V$, we also use $\sigma_\Lambda$ to denote the configuration of $T_{saw}(v)$ obtained by imposing the condition corresponding to $\sigma_\Lambda$ as above. For any $(i,j)\in E$ and $i\in V$ of $G$, the ‘interaction energy’ function and ‘applied field’ function on all their copies in the system induced on $T_{saw}(v)$ by $G$ are the same as $\beta_{ij}$ and $h_i$ respectively.

We now provide the remarkable property of the self-avoiding tree, one of two main results of [21], which is one of the essential techniques of our proofs.

Proposition 2.1 For two-state spin systems on $G=(V,E)$, for any configuration $\sigma_\Lambda$, $\Lambda\subset V$, and any vertex $v\in V$,

$$P_G(X_v=+\mid\sigma_\Lambda)=P_{T_{saw}(v)}(X_v=+\mid\sigma_\Lambda).$$
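Proposition 2.1 can be checked numerically on a small graph. The following is a minimal sketch, not the construction used in the paper's proofs: it specializes to the Ising form $\beta_{ij}=J_{ij}\sigma_i\sigma_j$, $h_i=B_i\sigma_i$, compares edges at a common vertex by the label of their other endpoint (one admissible choice of order, assumed here), and evaluates the self-avoiding tree implicitly by recursion on walks rather than building it. All function names and the numerical values are hypothetical.

```python
import itertools
import math

def brute_force_plus(n, J, B, v, fixed):
    """P_G(X_v = + | fixed) by summing over all 2^n configurations."""
    plus = total = 0.0
    for sigma in itertools.product((1, -1), repeat=n):
        if any(sigma[i] != s for i, s in fixed.items()):
            continue
        energy = sum(Jij * sigma[i] * sigma[j] for (i, j), Jij in J.items())
        energy += sum(B[i] * sigma[i] for i in range(n))
        w = math.exp(energy)
        total += w
        if sigma[v] == 1:
            plus += w
    return plus / total

def saw_ratio(adj, J, B, path, fixed):
    """Ratio P(+)/P(-) at the tree vertex reached by the self-avoiding walk `path`."""
    u = path[-1]
    parent = path[-2] if len(path) > 1 else None
    ratio = math.exp(2.0 * B[u])
    for w in adj[u]:
        if w == parent:
            continue
        Juw = J[(min(u, w), max(u, w))]
        if w in fixed:
            # Copy of a conditioned vertex: fixed leaf, nothing built below it.
            ratio *= math.exp(2.0 * Juw * fixed[w])
        elif w in path:
            # The walk closes a cycle at w: fix the copy to + if the edge ending
            # the cycle (w, u) is larger than the edge starting it, else to -.
            first_step = path[path.index(w) + 1]
            spin = 1 if u > first_step else -1
            ratio *= math.exp(2.0 * Juw * spin)
        else:
            r = saw_ratio(adj, J, B, path + [w], fixed)
            ratio *= (math.exp(Juw) * r + math.exp(-Juw)) / (math.exp(-Juw) * r + math.exp(Juw))
    return ratio

def saw_plus(n, J, B, v, fixed):
    """P_{T_saw(v)}(X_v = + | fixed) computed on the implicit self-avoiding tree."""
    adj = {i: [] for i in range(n)}
    for (i, j) in J:
        adj[i].append(j)
        adj[j].append(i)
    r = saw_ratio(adj, J, B, [v], fixed)
    return r / (1.0 + r)

# Toy graph with two triangles sharing an edge (keys are (smaller, larger) labels).
J = {(0, 1): 0.4, (0, 2): -0.3, (1, 2): 0.5, (1, 3): -0.2, (2, 3): 0.6}
B = [0.1, -0.3, 0.2, 0.5]
gap = abs(brute_force_plus(4, J, B, 0, {3: -1}) - saw_plus(4, J, B, 0, {3: -1}))
```

The recursion visits one tree vertex per self-avoiding walk, so its cost grows with the size of $T_{saw}(v)$; this is why the later results restrict attention to trees truncated at logarithmic height.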

In order to extend our results to more general families of graphs that are sparse on average, we need some definitions and notation for these graphs.

Definition 2.2 Let $|A|$ denote the cardinality of the set $A$. The length of a path is the number of edges it contains. The distance between two vertices in a graph is the length of the shortest path connecting them. A path $v_1,v_2,\dots,v_k$ is called a self-avoiding path if $v_i\neq v_j$ for all $i\neq j$. In a graph $G=(V,E)$, let $d(u,v)$ denote the distance between $u$ and $v$, $u,v\in V$. The distance between a vertex $v$ and a subset $\Lambda\subset V$ is defined by $d(v,\Lambda)=\min\{d(v,u):u\in\Lambda\}$. The set of vertices within distance $l$ of $v$ is denoted by $V(G,v,l)$. Similarly, the set of vertices at distance exactly $l$ from $v$ is denoted by $S(G,v,l)$. We say a vertex is at height $t$ of a rooted tree if the distance between it and the root is $t$. Let $\delta_v$ denote the degree of $v$ in $G$. The is defined by , where the maximum is taken over all self-avoiding paths starting at $v$ with length at most $l$. The is defined by . The of $G$ is defined by . Roughly speaking, in this paper, a family of graphs is sparse on average if there exist constant numbers and such that for any .

Some properties of the above definitions are useful in our proofs, so we present them here. Most of the proofs follow by simple induction and can be found in [15].

Proposition 2.2 Let $j$, $l$ denote positive natural numbers, then

$$m(G,v,jl)\le j\max_{u\in G}\{m(G,u,l)-\delta_u\}+\delta_v.$$

Proposition 2.3 Let $l$ be a natural number, then

$$|S(T_{saw}(v),v,l+1)|\le\delta_v\,(\delta(G,v,l)-1)^{l}.$$

Proposition 2.4 Let $j$, $l$ be natural numbers, then

$$|V(T_{saw}(v),v,jl)|\le\Big(\max_{u\in V}|V(T_{saw}(u),u,l)|\Big)^{j}.$$
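Proposition 2.4 can be explored numerically on small graphs. The sketch below is an illustration only and makes two assumptions that should be checked against [15]: $V(T_{saw}(v),v,l)$ is taken to contain the root itself, and the fixed copies that close a cycle are counted as vertices of the tree. The helper names and the example graphs are hypothetical.

```python
def saw_tree_size(adj, root, height):
    """Number of vertices of T_saw(root) within distance `height` of the root."""
    def count(path, remaining):
        total = 1                      # the tree vertex reached by `path`
        if remaining == 0:
            return total
        u = path[-1]
        parent = path[-2] if len(path) > 1 else None
        for w in adj[u]:
            if w == parent:
                continue
            if w in path:
                total += 1             # cycle-closing copy: a fixed leaf
            else:
                total += count(path + [w], remaining - 1)
        return total
    return count([root], height)

def proposition_2_4_holds(adj, j, l):
    """Check |V(T_saw(v), v, j*l)| <= (max_u |V(T_saw(u), u, l)|)^j for every v."""
    cap = max(saw_tree_size(adj, u, l) for u in adj) ** j
    return all(saw_tree_size(adj, v, j * l) <= cap for v in adj)

# Example graphs (assumed): the complete graph K4 and a triangle with a pendant path.
K4 = {0: [1, 2, 3], 1: [0, 2, 3], 2: [0, 1, 3], 3: [0, 1, 2]}
KITE = {0: [1, 2], 1: [0, 2], 2: [0, 1, 3], 3: [2, 4], 4: [3]}
sizes_k4 = [saw_tree_size(K4, 0, h) for h in range(4)]
checks = [proposition_2_4_holds(g, j, l) for g in (K4, KITE) for j in (1, 2, 3) for l in (1, 2)]
```

The intuition behind the bound is that a walk of length at most $jl$ splits into $j$ segments of length at most $l$, each of which is itself a vertex of some $T_{saw}(u)$ of height $l$.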

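Definition 2.3 ((Exponential) Strong Spatial Mixing) Let $G=(V,E)$ be a graph with $n$ vertices. The Gibbs distribution of two-state spin systems on $G$ exhibits strong spatial mixing iff for any vertex $v\in V$, subset $\Lambda\subset V$, and any two configurations $\sigma_\Lambda$ and $\eta_\Lambda$ on $\Lambda$, with $t$ denoting the distance between $v$ and the set of vertices of $\Lambda$ on which $\sigma_\Lambda$ and $\eta_\Lambda$ differ,

$$|P_G(X_v=+\mid\sigma_\Lambda)-P_G(X_v=+\mid\eta_\Lambda)|\le f(t),$$

where $f(t)$ goes to zero as $t$ goes to infinity and is called the decay function.
For the purpose of our settings, we present a weak form of exponential strong spatial mixing. We say the distribution exhibits exponential strong spatial mixing if there exist positive constants, independent of $n$, such that $f(t)$ is exponentially small in $t$ whenever $t$ is at least a constant multiple of $\log n$.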

Remark: In the above definition of (exponential) strong spatial mixing, and can be replaced by and respectively if is larger than a constant number, due to the inequality log when , and we call it the logarithmic form of exponential strong spatial mixing.

Definition 2.4 (FPTAS) An approximation algorithm is called a fully polynomial time approximation scheme (FPTAS) iff for any $\epsilon>0$, it takes time polynomial in the input size and $\epsilon^{-1}$ to output a value $\bar M$ satisfying

$$1-\epsilon\le\frac{\bar M}{M}\le 1+\epsilon,$$

where $M$ is the real value.

Remark: In the above definition and can be replaced by and .

### II-C Main Results

For simplicity, we use the following notation. Consider a two-state spin system without hard constraints (TSSHC) on a graph $G=(V,E)$ with vertex set $V$ and edge set $E$. Let , , , , , , , where $J_{ij}$, $B_i$ are the ‘inverse temperature’ and ‘external field’ respectively, and , , , .

Theorem 2.1 Let $G=(V,E)$ be a graph with vertex set $V$, edge set $E$ and a TSSHC on it. If there exist two positive numbers $a$ and $d$ such that , and when

$$(d-1)\tanh J<1$$

or equivalently $J<\frac{1}{2}\log\frac{d}{d-2}$ (for $d>2$), then the Gibbs distribution of the TSSHC exhibits the logarithmic form of exponential strong spatial mixing for arbitrary ‘external field’; specifically, for any $i\in V$, $\Lambda\subset V$, and any two configurations $\sigma_\Lambda$ and $\eta_\Lambda$ on $\Lambda$, denote and ,

$$|\log P_G(X_i=+\mid\sigma_\Lambda)-\log P_G(X_i=+\mid\eta_\Lambda)|\le f(t),$$

where .
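The condition $(d-1)\tanh J<1$ is easy to evaluate numerically. As a small worked example (the function name is hypothetical, and $d>2$ is assumed so that the threshold is finite), solving $\tanh J=1/(d-1)$ gives $J=\operatorname{artanh}\frac{1}{d-1}=\frac{1}{2}\log\frac{d}{d-2}$:

```python
import math

def critical_inverse_temperature(d):
    """Supremum of J with (d - 1) * tanh(J) < 1, assuming d > 2."""
    return math.atanh(1.0 / (d - 1))

# Thresholds for a few values of the 'maximum average degree' d.
thresholds = {d: critical_inverse_temperature(d) for d in (3, 4, 5, 10)}
# The same values via the closed form (1/2) * log(d / (d - 2)).
closed_forms = {d: 0.5 * math.log(d / (d - 2)) for d in (3, 4, 5, 10)}
```

The threshold shrinks as $d$ grows, so denser graphs require a weaker interaction for the theorem to apply.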

Remark: If the graph has bounded degree with maximum degree , then can be replaced by while for any , and is the ‘inverse temperature’ in the (anti)ferromagnetic Ising model, then Theorem 2.1 still holds and is the critical point for uniqueness of Gibbs measures on an infinite tree with maximum degree [13]. Note that the decay function is slightly different from the definition since may be ; however, in this case we can choose large enough, independent of , such that when is large, where is a positive number independent of , and then replace by as required. In fact, for the application to the algorithm, this is not important.

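Theorem 2.2 Let $G=(V,E)$ be a graph with vertex set $V$, edge set $E$ and a TSSHC on it. If there exist two positive numbers $a$ and $d$ such that , and , and when

$$B_{\min}>B(d,\alpha_{\max},\gamma)\quad\text{or}\quad B_{\max}<-B(d,-\alpha_{\min},\gamma)$$

where , the Gibbs distribution of the TSSHC exhibits exponential strong spatial mixing; specifically, for any $i\in V$, $\Lambda\subset V$, and any two configurations $\sigma_\Lambda$ and $\eta_\Lambda$ on $\Lambda$, denote and ,

$$|P_G(X_i=+\mid\sigma_\Lambda)-P_G(X_i=+\mid\eta_\Lambda)|\le f(t),$$

where or respectively.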

Remark: It’s easy to check , hence in Theorem 2.2, if , then . As a corollary of Theorem 2.2, from its proof in Section III, we know that if the graph has bounded degree with maximum degree , the condition on the ‘external field’ can be relaxed to or for any , which does not require that the ‘external field’ be uniformly large or uniformly small.

Theorem 2.3 Let $G=(V,E)$ be a graph with vertex set $V$, edge set $E$ and a TSSHC on it. If there exist two positive numbers $a$ and $d$ such that for any $i\in V$

$$|V(T_{saw}(i),i,a\log n)|\le(d-1)^{a\log n},$$

where , then when or , or , there exists an FPTAS for the partition function of the TSSHC on $G$.

## III Proofs

We now proceed to prove the theorems. One of the technical lemmas for Theorem 2.1 is an inequality similar to the one in [13]. We present it now.

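Lemma 3.1 Let $a$, $b$, $c$, $d$, $x$, $y$ be positive numbers and $g(x)=\frac{ax+b}{cx+d}$ and $t=\left|\frac{\sqrt{ad}-\sqrt{bc}}{\sqrt{ad}+\sqrt{bc}}\right|$, then

$$\max\Big(\frac{g(x)}{g(y)},\frac{g(y)}{g(x)}\Big)\le\Big(\max\Big(\frac{x}{y},\frac{y}{x}\Big)\Big)^{t}.$$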
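Before reading the proof, the inequality can be tested on random inputs. This is a sanity check under stated assumptions, not a proof: $g$ and $t$ are taken in the form $g(x)=\frac{ax+b}{cx+d}$ and $t=\left|\frac{\sqrt{ad}-\sqrt{bc}}{\sqrt{ad}+\sqrt{bc}}\right|$ in which they are used later in the induction, and the sampling range, tolerance and function names are arbitrary choices.

```python
import math
import random

def g(x, a, b, c, d):
    """The fractional linear map g(x) = (a x + b) / (c x + d)."""
    return (a * x + b) / (c * x + d)

def exponent(a, b, c, d):
    """t = |sqrt(ad) - sqrt(bc)| / (sqrt(ad) + sqrt(bc))."""
    return abs(math.sqrt(a * d) - math.sqrt(b * c)) / (math.sqrt(a * d) + math.sqrt(b * c))

def lemma_holds(a, b, c, d, x, y, rel_tol=1e-9):
    """Check max(g(x)/g(y), g(y)/g(x)) <= max(x/y, y/x)^t up to rounding."""
    gx, gy = g(x, a, b, c, d), g(y, a, b, c, d)
    lhs = max(gx / gy, gy / gx)
    rhs = max(x / y, y / x) ** exponent(a, b, c, d)
    return lhs <= rhs * (1.0 + rel_tol)

rng = random.Random(0)

def sample():
    """Positive number drawn log-uniformly from [e^-4, e^4]."""
    return math.exp(rng.uniform(-4.0, 4.0))

violations = sum(not lemma_holds(*(sample() for _ in range(6))) for _ in range(20000))

# In the Ising case a = d = e^J and b = c = e^{-J}, the exponent reduces to tanh(J).
ising_gap = abs(exponent(math.exp(0.7), math.exp(-0.7), math.exp(-0.7), math.exp(0.7)) - math.tanh(0.7))
```

The last line shows where the factor $\tanh J$ in the main theorems comes from: it is exactly the contraction exponent of one step of the tree recursion.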

Proof: Case 1. . Since is an increasing function, w.l.o.g. suppose and let , where , then

Hence,

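$$\max\Big(\frac{g(x)}{g(y)},\frac{g(y)}{g(x)}\Big)=\frac{g(x)}{g(y)}\le\Big(\frac{x}{y}\Big)^{t}=\Big(\max\Big(\frac{x}{y},\frac{y}{x}\Big)\Big)^{t},$$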

where .
Case 2. . Similar to the first case, is a decreasing function, let , then is an increasing function, w.l.o.g. suppose , repeat the process of Case 1 for , then

Hence,

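$$\max\Big(\frac{g(x)}{g(y)},\frac{g(y)}{g(x)}\Big)=\frac{g(y)}{g(x)}=\frac{h(x)}{h(y)}\le\Big(\frac{x}{y}\Big)^{t}=\Big(\max\Big(\frac{x}{y},\frac{y}{x}\Big)\Big)^{t},$$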

where .

Lemma 3.2 Let $T$ be a tree rooted at $0$ with vertex set $V$, edge set $E$ and a TSSHC on it. Suppose some vertices are fixed. Let $T_k$ and $T_l$ be the two subtrees of $T$ containing the vertices $k$ and $l$ respectively, obtained by removing an edge $(k,l)\in E$, where $0\in T_k$. The fixed vertices remain fixed on $T_k$ and $T_l$. Then the probability of $X_0$ on $T$ equals the probability of $X_0$ on the subtree $T_k$, except that the ‘external field’ at $k$ is changed to a certain value $h'_k$.

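Proof: Let $\Omega_{T_l}$ denote the configuration space, and $E_l$ and $V_l$ the edge set and vertices, of $T_l$. Setting

$$h'_k(\sigma_k)=h_k(\sigma_k)+\log\Big(\sum_{\sigma\in\Omega_{T_l}}e^{\beta_{kl}(\sigma_k,\sigma_l)+\sum_{(i,j)\in E_l}\beta_{ij}(\sigma_i,\sigma_j)+\sum_{i\in V_l}h_i(\sigma_i)}\Big)$$

completes the proof.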
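The field update in Lemma 3.2 can be verified by direct enumeration on a small tree. The sketch below is a hedged illustration: the tree is the path $0-1-2-3$, the removed edge is $(k,l)=(1,2)$, no vertex is fixed, and all interaction and field values, as well as the helper names (`table`, `absorbed_field`, and so on), are invented for the example.

```python
import itertools
import math

SPINS = (1, -1)

def table(pp, pm, mp, mm):
    """Interaction energy beta(s_i, s_j) given as a lookup table on spin pairs."""
    return {(1, 1): pp, (1, -1): pm, (-1, 1): mp, (-1, -1): mm}

def log_weight(sigma, beta, h):
    """Sum of beta over the listed edges plus h over the vertices in `sigma`."""
    energy = sum(t[(sigma[i], sigma[j])] for (i, j), t in beta.items())
    return energy + sum(h[i][s] for i, s in sigma.items())

def marginal_plus(vertices, beta, h, v):
    """P(X_v = +) by brute force over all configurations of `vertices`."""
    plus = total = 0.0
    for values in itertools.product(SPINS, repeat=len(vertices)):
        sigma = dict(zip(vertices, values))
        w = math.exp(log_weight(sigma, beta, h))
        total += w
        if sigma[v] == 1:
            plus += w
    return plus / total

def absorbed_field(k, l, sub_vertices, beta, h):
    """h'_k(s_k) = h_k(s_k) + log of the sum over configurations of T_l."""
    sub_beta = {e: t for e, t in beta.items() if e[0] in sub_vertices and e[1] in sub_vertices}
    new_field = {}
    for sk in SPINS:
        total = 0.0
        for values in itertools.product(SPINS, repeat=len(sub_vertices)):
            sigma = dict(zip(sub_vertices, values))
            total += math.exp(beta[(k, l)][(sk, sigma[l])] + log_weight(sigma, sub_beta, h))
        new_field[sk] = h[k][sk] + math.log(total)
    return new_field

beta = {(0, 1): table(0.3, -0.1, 0.2, 0.4),
        (1, 2): table(-0.5, 0.6, 0.1, 0.0),
        (2, 3): table(0.7, -0.2, -0.3, 0.2)}
h = {0: {1: 0.1, -1: -0.2}, 1: {1: -0.4, -1: 0.3},
     2: {1: 0.2, -1: 0.0}, 3: {1: -0.6, -1: 0.5}}

full = marginal_plus([0, 1, 2, 3], beta, h, 0)
h_pruned = {0: h[0], 1: absorbed_field(1, 2, [2, 3], beta, h)}
pruned = marginal_plus([0, 1], {(0, 1): beta[(0, 1)]}, h_pruned, 0)
```

Only the difference $h'_k(+)-h'_k(-)$ matters for the marginal, which is why the lemma can be read as a change of the ‘external field’ at $k$.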

With Lemma 3.1 and Lemma 3.2, we now proceed to prove (exponential) strong spatial mixing property on trees.

Theorem 3.1 Let be a tree rooted at with vertices , edge set and TSSHC on it. Let , and be any two configurations on . Let , and . Then

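$$\max\Big(\frac{P_T(X_0=+\mid\zeta_\Lambda)}{P_T(X_0=+\mid\eta_\Lambda)},\frac{P_T(X_0=+\mid\eta_\Lambda)}{P_T(X_0=+\mid\zeta_\Lambda)}\Big)\le\exp\big(4Js(\tanh J)^{t-1}\big)$$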
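The bound can be compared with exact values on complete trees. The sketch below is a numerical illustration only, and several things in it are assumptions rather than statements from the source: the model is specialized to an Ising model with uniform coupling $J>0$ and field $B$; the conditioned set is the whole level at height $t$; $s$ is read as the number of vertices at that level, which is how the induction appears to use it; and the root ratio is evaluated with the standard tree recursion for the ratio $P(+)/P(-)$. Function names are hypothetical.

```python
import math

def root_ratio(t, J, B, boundary_spin, branching=2):
    """P(X_0=+)/P(X_0=-) on a complete tree whose height-t vertices are all fixed."""
    # A vertex at height t-1 sees `branching` fixed children.
    r = math.exp(2.0 * B) * math.exp(2.0 * J * boundary_spin) ** branching
    # Move up one level at a time until the root is reached.
    for _ in range(t - 1):
        step = (math.exp(J) * r + math.exp(-J)) / (math.exp(-J) * r + math.exp(J))
        r = math.exp(2.0 * B) * step ** branching
    return r

def compare_with_bound(J, B, max_height, branching=2):
    """Rows (t, exact ratio gap, bound exp(4 J s tanh(J)^(t-1))) with s = branching^t."""
    rows = []
    for t in range(1, max_height + 1):
        r_plus = root_ratio(t, J, B, +1, branching)
        r_minus = root_ratio(t, J, B, -1, branching)
        gap = max(r_plus / r_minus, r_minus / r_plus)
        bound = math.exp(4.0 * J * branching ** t * math.tanh(J) ** (t - 1))
        rows.append((t, gap, bound))
    return rows

# With J = 0.3 and branching 2, 2 * tanh(J) < 1, so the bound tends to 1.
rows = compare_with_bound(J=0.3, B=0.2, max_height=8)
```

Since $P=R/(1+R)$ is increasing in the ratio $R$, a bound on the ratio of the two values of $R$ also bounds the ratio of the two probabilities.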

Proof: For any , let denote the subtree with as its root and be the TSSHC induced on by . Noting is . To prove the theorem, it’s convenient to deal with the ratio rather than itself. Denote where is the condition by imposing the configuration on , and note a simple relation if , then iff , further . Hence replace and by and , we need only to show

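$$\max\Big(\frac{R_0^{\zeta_\Lambda}}{R_0^{\eta_\Lambda}},\frac{R_0^{\eta_\Lambda}}{R_0^{\zeta_\Lambda}}\Big)\le\exp\big(4Js(\tanh J)^{t-1}\big).\tag{1}$$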

Theorem 3.1 follows by .
We go on to prove (1) by induction on $t$. Before doing this, some trivial cases need to be clarified. We are interested in the case and is unfixed. Let denote the unique self-avoiding path from to on . If is a leaf on and , where . Define . Note since . Let such that . By Lemma 3.2, we can remove the subtree below and change the external field at to without changing the probability of . More importantly, this process removes at least one leaf at the height , and does not remove any vertex at the height . Thus, w.l.o.g. suppose is a tree rooted at $0$ where any leaf on it is at the height . Let $0_1,\dots,0_q$ be the neighbors connected to $0$. A trivial calculation then gives that

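$$
\begin{aligned}
R_0^{\zeta_\Lambda}&=\frac{Z(T_0,X_0=+,\zeta_\Lambda)}{Z(T_0,X_0=-,\zeta_\Lambda)}
=\frac{e^{h_0(+)}\sum_{\sigma\in\Omega_0}\exp\Big(\sum_{i=1}^{q}\big(\beta_{00_i}(+,\sigma_{0_i})+\sum_{(k,l)\in T_i}\beta_{kl}(\sigma_k,\sigma_l)+\sum_{k\in T_i}h_k(\sigma_k)\big)\Big)}{e^{h_0(-)}\sum_{\sigma\in\Omega_0}\exp\Big(\sum_{i=1}^{q}\big(\beta_{00_i}(-,\sigma_{0_i})+\sum_{(k,l)\in T_i}\beta_{kl}(\sigma_k,\sigma_l)+\sum_{k\in T_i}h_k(\sigma_k)\big)\Big)}\\
&=e^{2B_0}\prod_{i=1}^{q}\frac{\sum_{\sigma\in\Omega_{T_i}}e^{\beta_{00_i}(+,\sigma_{0_i})+\sum_{(k,l)\in T_i}\beta_{kl}(\sigma_k,\sigma_l)+\sum_{k\in T_i}h_k(\sigma_k)}}{\sum_{\sigma\in\Omega_{T_i}}e^{\beta_{00_i}(-,\sigma_{0_i})+\sum_{(k,l)\in T_i}\beta_{kl}(\sigma_k,\sigma_l)+\sum_{k\in T_i}h_k(\sigma_k)}}\\
&=e^{2B_0}\prod_{i=1}^{q}\frac{a_iZ(T_{0_i},X_i=+,\zeta_{\Lambda_i})+b_iZ(T_{0_i},X_i=-,\zeta_{\Lambda_i})}{c_iZ(T_{0_i},X_i=+,\zeta_{\Lambda_i})+d_iZ(T_{0_i},X_i=-,\zeta_{\Lambda_i})}
=e^{2B_0}\prod_{i=1}^{q}\frac{a_iR_{0_i}^{\zeta_\Lambda}+b_i}{c_iR_{0_i}^{\zeta_\Lambda}+d_i}.
\end{aligned}
\tag{2}
$$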

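where $a_i=e^{\beta_{00_i}(+,+)}$, $b_i=e^{\beta_{00_i}(+,-)}$, $c_i=e^{\beta_{00_i}(-,+)}$, $d_i=e^{\beta_{00_i}(-,-)}$, $B_0=\frac{h_0(+)-h_0(-)}{2}$. Now we check the base case where , by the monotonicity of and ,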

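$$\max\Big(\frac{R_0^{\zeta_\Lambda}}{R_0^{\eta_\Lambda}},\frac{R_0^{\eta_\Lambda}}{R_0^{\zeta_\Lambda}}\Big)\le\prod_{i=1}^{q}\max\Big(\frac{a_id_i}{b_ic_i},\frac{b_ic_i}{a_id_i}\Big)\le e^{4qJ}$$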

Hence , (1) holds. Assume by induction that (1) holds for , and we will show it holds for . Let , , still using the above recursive procedure, then

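$$
\begin{aligned}
\max\Big(\frac{R_0^{\zeta_\Lambda}}{R_0^{\eta_\Lambda}},\frac{R_0^{\eta_\Lambda}}{R_0^{\zeta_\Lambda}}\Big)
&\le\prod_{i=1}^{q}\max\left(\frac{\dfrac{a_iR_{0_i}^{\zeta_\Lambda}+b_i}{c_iR_{0_i}^{\zeta_\Lambda}+d_i}}{\dfrac{a_iR_{0_i}^{\eta_\Lambda}+b_i}{c_iR_{0_i}^{\eta_\Lambda}+d_i}},\frac{\dfrac{a_iR_{0_i}^{\eta_\Lambda}+b_i}{c_iR_{0_i}^{\eta_\Lambda}+d_i}}{\dfrac{a_iR_{0_i}^{\zeta_\Lambda}+b_i}{c_iR_{0_i}^{\zeta_\Lambda}+d_i}}\right)\\
&\le\prod_{i=1}^{q}\max\Big(\frac{R_{0_i}^{\zeta_\Lambda}}{R_{0_i}^{\eta_\Lambda}},\frac{R_{0_i}^{\eta_\Lambda}}{R_{0_i}^{\zeta_\Lambda}}\Big)^{\left|\frac{\sqrt{a_id_i}-\sqrt{b_ic_i}}{\sqrt{a_id_i}+\sqrt{b_ic_i}}\right|}\\
&\le\prod_{i=1}^{q}\max\Big(\frac{R_{0_i}^{\zeta_\Lambda}}{R_{0_i}^{\eta_\Lambda}},\frac{R_{0_i}^{\eta_\Lambda}}{R_{0_i}^{\zeta_\Lambda}}\Big)^{\tanh J}
\end{aligned}
$$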

where the second inequality comes from Lemma 3.1. According to the hypothesis of induction