Improved Mixing Condition on the Grid for Counting and Sampling Independent Sets
The hard-core model has received much attention in the past couple of decades as a lattice gas model with hard constraints in statistical physics, a multicast model of calls in communication networks, and as a weighted independent set problem in combinatorics, probability and theoretical computer science.
In this model, each independent set $I$ in a graph $G$ is weighted proportionally to $\lambda^{|I|}$, for a positive real parameter $\lambda$. For large $\lambda$, computing the partition function (namely, the normalizing constant which makes the weighting a probability distribution on a finite graph) on graphs of maximum degree $\Delta \ge 3$ is a well-known computationally challenging problem. More concretely, let $\lambda_c(\mathbb{T}_\Delta)$ denote the critical value for the so-called uniqueness threshold of the hard-core model on the infinite $\Delta$-regular tree; recent breakthrough results of Dror Weitz (2006) and Allan Sly (2010) have identified $\lambda_c(\mathbb{T}_\Delta)$ as a threshold where the hardness of estimating the above partition function undergoes a computational transition.
We focus on the well-studied particular case of the square lattice $\mathbb{Z}^2$, and provide a new lower bound for the uniqueness threshold, in particular taking it well above $\lambda_c(\mathbb{T}_4)$. Our technique refines and builds on the tree of self-avoiding walks approach of Weitz, resulting in a new technical sufficient criterion (of wider applicability) for establishing strong spatial mixing (and hence uniqueness) for the hard-core model. Our new criterion achieves better bounds on strong spatial mixing when the graph has extra structure, improving upon what can be achieved by just using the maximum degree. Applying our technique to $\mathbb{Z}^2$ we prove that strong spatial mixing holds for all $\lambda < 2.3882$, improving upon the work of Weitz that held for $\lambda < 27/16 = 1.6875$. Our results imply a fully-polynomial deterministic approximation algorithm for estimating the partition function, as well as rapid mixing of the associated Glauber dynamics to sample from the hard-core distribution.
In this paper we study phase transitions for sampling weighted independent sets (weighted by an activity $\lambda > 0$) of the 2-dimensional integer lattice $\mathbb{Z}^2$. In statistical physics terminology, we study the hard-core lattice gas model ([6, 13]), which is a simple model of a gas whose particles have non-negligible size (thus preventing them from occupying neighboring sites), with activity $\lambda$ corresponding to the so-called fugacity of the gas. More formally, for a finite graph $G=(V,E)$, let $\Omega = \Omega(G)$ denote the set of independent sets of $G$. Given an independent set $\sigma \in \Omega$, its weight is defined as $w(\sigma) = \lambda^{|\sigma|}$, and a vertex $v \in V$ is said to be occupied under $\sigma$ if $v \in \sigma$. The associated Gibbs (or Boltzmann) distribution $\mu$ is defined on $\Omega$ as $\mu(\sigma) = w(\sigma)/Z$, where $Z = Z(G,\lambda) = \sum_{\eta \in \Omega} w(\eta)$ is commonly referred to as the partition function.
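To make the definitions concrete, the following minimal sketch enumerates the independent sets of a small graph by brute force and evaluates the partition function and a Gibbs probability. The function names and the vertex labelling 0..n-1 are illustrative assumptions, and the enumeration is exponential in the number of vertices, so it is only meant for toy instances.

```python
from itertools import combinations

def independent_sets(n, edges):
    """Yield every independent set of the graph on vertices 0..n-1."""
    edge_set = {frozenset(e) for e in edges}
    for k in range(n + 1):
        for subset in combinations(range(n), k):
            # A subset is independent if no pair of its vertices is an edge.
            if all(frozenset(p) not in edge_set for p in combinations(subset, 2)):
                yield frozenset(subset)

def partition_function(n, edges, lam):
    """Z(G, lam): sum of lam**|I| over all independent sets I."""
    return sum(lam ** len(s) for s in independent_sets(n, edges))

def gibbs_probability(n, edges, lam, ind_set):
    """Probability of one independent set under the hard-core Gibbs measure."""
    return lam ** len(ind_set) / partition_function(n, edges, lam)

# The 4-cycle has 7 independent sets: the empty set, 4 singletons, 2 pairs.
cycle4 = [(0, 1), (1, 2), (2, 3), (3, 0)]
print(partition_function(4, cycle4, 1))  # -> 7
```

For the 4-cycle the partition function is the polynomial $1 + 4\lambda + 2\lambda^2$, which the sketch reproduces for any chosen $\lambda$.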
Recall that Valiant showed that exactly computing the number of independent sets is #P-complete, even when restricted to 3-regular graphs (see Greenhill). Hence, we focus our attention on approximation algorithms for estimating the number, or more generally, the partition function. It is well known that the problem of approximating the partition function and that of sampling from a distribution that is close to the Gibbs distribution are polynomial-time reducible to each other.
The fundamental notion of a phase transition for a statistical mechanics model on an infinite graph addresses the critical point at which the model starts to exhibit a certain long-range dependence, as a system parameter is varied. In particular, the so-called critical inverse temperature for the Ising or the Potts model, and the critical activity for the hard-core lattice gas model, are prime examples where the system undergoes a transition from uniqueness to multiplicity of the infinite-volume Gibbs measures.
The phase transition in the hard-core model is also intimately related to the computational complexity of estimating the partition function $Z$. Recently, a remarkable connection was established between the computational complexity of approximating the partition function for graphs of maximum degree $\Delta$ and the phase transition $\lambda_c(\mathbb{T}_\Delta)$ for the infinite regular tree $\mathbb{T}_\Delta$ of degree $\Delta$. On the positive side, Weitz showed a deterministic fully-polynomial time approximation algorithm (FPAS) for approximating the partition function for any graph with maximum degree $\Delta$, when $\lambda < \lambda_c(\mathbb{T}_\Delta)$ and $\Delta$ is constant. On the other side, Sly recently showed that for every $\Delta \ge 6$, it is NP-hard (unless NP=RP) to approximate the partition function for graphs of maximum degree $\Delta$, when $\lambda_c(\mathbb{T}_\Delta) < \lambda < \lambda_c(\mathbb{T}_\Delta) + \varepsilon(\Delta)$, for some function $\varepsilon(\Delta) > 0$. More recently, Galanis et al. improved the range of $\lambda$ in Sly’s inapproximability result, extending it to all $\lambda > \lambda_c(\mathbb{T}_\Delta)$ for the cases $\Delta = 3$ and $\Delta \ge 6$.
1.1 Prior history and current work
Our work builds upon Weitz’s work to get improved results for specific graphs of interest. We focus our attention on what is arguably the simplest case of interest that is not yet well understood, namely the square grid, or the 2-dimensional integer lattice $\mathbb{Z}^2$. Empirical evidence suggests that the critical point is $\lambda_c(\mathbb{Z}^2) \approx 3.796$ [13, 3, 26], but rigorous results are significantly far from this conjectured point. The possibility of there being multiple such $\lambda_c$ is not ruled out, although no one believes that this is the case.
From below, van den Berg and Steif used a disagreement percolation argument to prove $\lambda_c(\mathbb{Z}^2) > p_c/(1-p_c)$, where $p_c$ is the critical probability for site percolation on $\mathbb{Z}^2$. Applying the best known lower bound $p_c > 0.556$ for $\mathbb{Z}^2$ by van den Berg and Ermakov implies $\lambda_c(\mathbb{Z}^2) > 1.255$. Prior to that work, an alternative approach aimed at establishing the Dobrushin-Shlosman criterion yielded, via computer-assisted proofs, $\lambda_c(\mathbb{Z}^2) > 1.185$ by Radulescu and Styer, and $\lambda_c(\mathbb{Z}^2) > 1.508$ by Radulescu.
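These results were improved upon by Weitz, who showed that $\lambda_c(\mathbb{Z}^2) \ge \lambda_c(\mathbb{T}_4) = 27/16 = 1.6875$, where $\mathbb{T}_\Delta$ is the infinite, complete, regular tree of degree $\Delta$. For the upper bound, a classical Peierls-type argument implies $\lambda_c(\mathbb{Z}^2) = O(1)$. (A related result of Randall showing slow mixing of the Glauber dynamics for $\lambda > 8.066$ gives hope for a better upper bound on $\lambda_c(\mathbb{Z}^2)$.) The regular tree is one of the only examples (that we know of) where the critical point is known exactly, and in this case, Kelly showed that $\lambda_c(\mathbb{T}_\Delta) = (\Delta-1)^{\Delta-1}/(\Delta-2)^{\Delta}$.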
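As a quick check of the numbers involved, the tree threshold has the well-known closed form $\lambda_c(\mathbb{T}_\Delta) = (\Delta-1)^{\Delta-1}/(\Delta-2)^{\Delta}$, which can be evaluated exactly. The sketch below is only illustrative; the function name is an assumption and exact rational arithmetic is used to avoid rounding.

```python
from fractions import Fraction

def tree_uniqueness_threshold(delta):
    """Critical activity of the hard-core model on the infinite delta-regular tree."""
    if delta < 3:
        raise ValueError("the formula is stated for delta >= 3")
    return Fraction((delta - 1) ** (delta - 1), (delta - 2) ** delta)

for delta in (3, 4, 5, 6):
    value = tree_uniqueness_threshold(delta)
    print(delta, value, float(value))
```

For $\Delta = 4$, the degree of $\mathbb{Z}^2$, this gives $27/16 = 1.6875$, the bound obtained from the maximum degree alone.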
In this work we present a new general approach which, for the case of the hard-core model on $\mathbb{Z}^2$, improves the lower bound to $\lambda_c(\mathbb{Z}^2) > 2.3882$. There are various algorithmic implications for finite subgraphs of $\mathbb{Z}^2$ when $\lambda < 2.3882$. Our results imply that Weitz’s deterministic FPAS is also valid on subgraphs of $\mathbb{Z}^2$ for the same range of $\lambda$. Thanks to the existing literature on general spin systems ([22, 23, 8, 11]), our results also imply that the Glauber dynamics has $O(n \log n)$ mixing time for any finite subregion $G=(V,E)$ of $\mathbb{Z}^2$ when $\lambda < 2.3882$, where $n = |V|$. Recall that the Glauber dynamics is a simple Markov chain that updates the configuration at a randomly chosen vertex in each step. The stationary distribution of this chain is the Gibbs distribution. Hence, it is of interest as an algorithmic technique to randomly sample from the Gibbs distribution, and also as a model of how physical systems reach equilibrium. The mixing time is the number of steps (from the worst initial configuration) until the distribution is guaranteed to be within variation distance $\le 1/4$ of the stationary distribution.
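For readers unfamiliar with the chain, here is a minimal sketch of the standard heat-bath Glauber dynamics for the hard-core model on a finite box of the grid. The helper names, the free boundary, and the vertex indexing are assumptions made for illustration; the sketch says nothing about how many steps are needed, which is exactly what the mixing time bounds address.

```python
import random

def grid_neighbors(side):
    """Adjacency lists of the side x side grid; vertex (r, c) has index r*side + c."""
    nbrs = []
    for r in range(side):
        for c in range(side):
            adj = []
            if r > 0:
                adj.append((r - 1) * side + c)
            if r < side - 1:
                adj.append((r + 1) * side + c)
            if c > 0:
                adj.append(r * side + c - 1)
            if c < side - 1:
                adj.append(r * side + c + 1)
            nbrs.append(adj)
    return nbrs

def glauber_step(occupied, neighbors, lam, rng):
    """Resample one uniformly random vertex from its conditional distribution."""
    v = rng.randrange(len(neighbors))
    occupied.discard(v)
    blocked = any(u in occupied for u in neighbors[v])
    # If no neighbor is occupied, v is occupied with probability lam / (1 + lam).
    if not blocked and rng.random() < lam / (1.0 + lam):
        occupied.add(v)

def run_glauber(side, lam, steps, seed=0):
    """Run the chain from the empty configuration and return the final state."""
    rng = random.Random(seed)
    neighbors = grid_neighbors(side)
    occupied = set()
    for _ in range(steps):
        glauber_step(occupied, neighbors, lam, rng)
    return occupied, neighbors
```

Every state visited is an independent set by construction, since a vertex is only added when none of its neighbors is occupied.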
As in Weitz’s work, our approach can be used for other 2-spin systems, such as the Ising model. This is discussed in Section 6. As will be evident from the following high-level idea of our approach, it can be applied to other graphs of interest. Our work also provides an arguably simpler way to derive the main technical result of Weitz showing that any graph with maximum degree $\Delta$ has strong spatial mixing (SSM) when $\lambda < \lambda_c(\mathbb{T}_\Delta)$.
To underline the difficulty in estimating bounds on $\lambda_c(\mathbb{Z}^2)$, we remark that the existence of a (unique) critical activity $\lambda_c$ remains conjectural and an open problem for $\mathbb{Z}^d$, for $d \ge 2$. In contrast, for the Ising model, the critical inverse temperature $\beta_c(\mathbb{Z}^2)$ has been known since 1944; interestingly, the corresponding critical point for the $q$-state Potts model (for $q \ge 2$) has only recently been established (by Beffara and Duminil-Copin) to be $\beta_c(q) = \log(1+\sqrt{q})$, settling a long-standing open problem. The lack of monotonicity in $\lambda$ in the hard-core model poses a serious challenge in establishing such a sharp result for this model. In fact, Brightwell et al. showed that in general such a monotonicity need not hold, by providing an example with a non-regular tree.
2 Technical Preliminaries and Proof Approach
Before presenting our approach, it is useful to review briefly the uniqueness/non-uniqueness phase transition, and to introduce the associated notions of decay of spatial correlation, known as the weak and strong spatial mixing properties. Much of the discussion below is simplified for the case of the hard-core model on $\mathbb{Z}^2$, wherein one utilizes certain induced monotonicity (given by the bipartite property) in the model and the amenability of the graph.
2.1 Uniqueness, Weak and Strong Spatial Mixing
Let $G_L$ denote the finite graph corresponding to a box of side-length $2L+1$ centered around the origin in $\mathbb{Z}^2$. Thus, $G_L = (V_L, E_L)$, where $V_L = \{(v_1,v_2) \in \mathbb{Z}^2 : |v_1|, |v_2| \le L\}$ with edges between pairs of vertices at $\ell_1$ distance (or Manhattan distance) equal to one. Since this is a bipartite graph, we may fix one such partition; for example, it is standard to consider the set of vertices at an even distance from the origin as the even set. The boundary of $G_L$ consists of those vertices $(v_1,v_2)$ where $|v_i| = L$ for $i=1$ or $i=2$. The hard-core model on bipartite graphs is a monotone system, which for the current discussion implies that we only have to consider two assignments to the boundary: all even vertices or all odd vertices on the boundary are occupied. Let $p_L^{\mathrm{even}}$ ($p_L^{\mathrm{odd}}$) denote the marginal probability that the origin is unoccupied given the even (odd, respectively) boundary. Then to establish uniqueness of the Gibbs measures, we need that:
$$\lim_{L \to \infty} \left| p_L^{\mathrm{even}} - p_L^{\mathrm{odd}} \right| = 0.$$
We are interested in the critical point for the transition between uniqueness and non-uniqueness. A standard way to establish uniqueness is by proving one of the spatial mixing properties introduced next.
Let $G=(V,E)$ be a (finite) graph. For $S \subset V$, a configuration $\sigma$ on $S$ specifies a subset of $S$ as occupied and the remainder as unoccupied. Let $\mu^{\sigma}$ denote the Gibbs distribution conditional on the configuration $\sigma$ on $S$. For $v \in V$, let $p_v^{\sigma}$ denote the marginal probability that $v$ is unoccupied in $\mu^{\sigma}$.
The first spatial mixing property is Weak Spatial Mixing (WSM). Here we consider a pair of boundary configurations on a subset $S$ and consider the “influence” on the marginal probability that a vertex $v$ is unoccupied. WSM says that the influence on $v$ decays exponentially in the distance of $v$ from $S$.
Definition 1 (Weak Spatial Mixing).
For the hard-core model at activity $\lambda$, for a finite graph $G=(V,E)$, WSM holds with rate $\gamma \in (0,1)$ if for every $v \in V$, every $S \subset V$, and every two configurations $\sigma_1, \sigma_2$ on $S$,
$$\left| p_v^{\sigma_1} - p_v^{\sigma_2} \right| \le \gamma^{\mathrm{dist}(v,S)},$$
where $\mathrm{dist}(v,S)$ is the graph distance (i.e., length of the shortest path) between $v$ and (the nearest point in) the subset $S$.
The second property of interest is Strong Spatial Mixing (SSM). The intuition is that if a pair of boundary configurations on a subset $S$ agree at some vertices in $S$ then those vertices “encourage” $v$ to agree. Therefore, SSM says that the influence on $v$ decays exponentially in the distance of $v$ from the subset of vertices where the pair of configurations differ.
Definition 2 (Strong Spatial Mixing).
For the hard-core model at activity $\lambda$, for a finite graph $G=(V,E)$, SSM holds with rate $\gamma \in (0,1)$ if for every $v \in V$, every $S \subset V$, every $S' \subset S$, and every two configurations $\sigma_1, \sigma_2$ on $S$ where $\sigma_1(S \setminus S') = \sigma_2(S \setminus S')$,
$$\left| p_v^{\sigma_1} - p_v^{\sigma_2} \right| \le \gamma^{\mathrm{dist}(v,S')}.$$
Note that since $\mathrm{dist}(v,S') \ge \mathrm{dist}(v,S)$, SSM implies WSM for the same rate. Moreover, it is a standard fact that such an exponential decay in finite boxes (say) in $\mathbb{Z}^2$ implies uniqueness of the corresponding infinite-volume Gibbs measure on $\mathbb{Z}^2$; see Georgii for an introduction to the theory of infinite-volume Gibbs measures. We can specialize the above notions of WSM and SSM to a particular vertex $v$, in which case we say that WSM or SSM holds at $v$. If the graph is a rooted tree, we will always assume that the notions of WSM and SSM are considered at the root.
For the hard-core model on a graph $G=(V,E)$, for a subset of vertices $S \subset V$ and a fixed configuration $\sigma$ on $S$, it is equivalent to consider the subgraph obtained as follows: for each $v \in S$ that is fixed to be unoccupied we remove $v$ from $G$, and for each $v \in S$ that is fixed to be occupied we remove $v$ and its neighbors from $G$. In this way we obtain the following observation, which will be useful for proving that SSM holds.
For a graph $G=(V,E)$ and $v \in V$, SSM holds in $G$ at vertex $v$ iff WSM holds for all subgraphs (of $G$) at vertex $v$. To be precise, by subgraphs we mean graphs obtained by considering all subgraphs of $G$ and taking the component containing $v$.
2.2 Self-Avoiding Walk Tree Representation
Since our work builds on that of Weitz, we first describe the self-avoiding walk (SAW) tree representation that he introduced. Given $G=(V,E)$, we first fix an arbitrary ordering on the neighbors of each vertex in $G$. For each $v \in V$, the tree $T_{\mathrm{saw}}(G,v)$ is constructed as follows. Consider the tree $T$ of self-avoiding walks originating from $v$, additionally including the vertices closing a cycle as leaves of the tree. We then fix such leaves of $T$ to be occupied or unoccupied in the following manner. If a leaf vertex closes a cycle in $G$, say $w \to v_1 \to \dots \to v_\ell \to w$, then if $v_1 > v_\ell$ (in the ordering on the neighbors of $w$) we fix this leaf to be unoccupied, otherwise if $v_1 < v_\ell$ we fix the leaf to be occupied. Note that if the leaf is fixed to be unoccupied we simply remove that vertex from the tree. If the leaf is fixed to be occupied, we remove that leaf and all of its neighbors, i.e., we remove the parent of that leaf from the tree. The resulting tree is denoted as $T_{\mathrm{saw}}(G,v)$. See Figure 1 for an illustration of $T_{\mathrm{saw}}(G,v)$ for a particular example.
Weitz  proves the following theorem for the hard-core model, which shows that the marginal distribution at the root in is identical to the marginal distribution for in . For a graph , a subset and configuration on , for , let in denote the configuration on in where for every occurrence of in is assigned according to .
Theorem 1 (SAW Tree Representation, Theorem 3.1 in ).
For any graph , , , and configuration on , for the following holds:
Note that the tree $T_{\mathrm{saw}}(G,v)$ preserves the distance of vertices from $v$ in $G$, which implies the following corollary.
If SSM holds with rate $\gamma$ for $T_{\mathrm{saw}}(G,v)$ for all $v \in V$, then SSM holds for $G$ with rate $\gamma$.
The reverse implication of Corollary 2 does not hold, since there are configurations in $T_{\mathrm{saw}}(G,v)$ which are not necessarily realizable in $G$. Observe that if $G$ has maximum degree $\Delta$, any SAW tree of $G$ is a subtree of the regular tree of degree $\Delta$.
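The SAW tree identity can be checked numerically on a small graph. The sketch below evaluates the root marginal of the SAW tree recursively, without materializing the tree, and compares it with a brute-force marginal in the original graph. It assumes one particular convention for cycle-closing leaves (occupied when the closing edge is larger than the starting edge in the neighbor ordering; the reversed convention is equally valid, since it amounts to reversing the ordering), and the function names and adjacency-list encoding are illustrative.

```python
from itertools import combinations

def saw_root_unoccupied(adj, lam, path):
    """Unoccupied probability of path[-1] as the root of its SAW subtree.

    adj[v] is the ordered neighbor list of v; path is the walk from the root.
    """
    v = path[-1]
    parent = path[-2] if len(path) > 1 else None
    product = 1.0
    for w in adj[v]:
        if w == parent:
            continue  # a self-avoiding walk never steps straight back
        if w in path:
            # w closes a cycle that started with the edge (w, start).
            start = path[path.index(w) + 1]
            closing_is_larger = adj[w].index(v) > adj[w].index(start)
            # Assumed convention: the leaf is occupied (unoccupied probability 0)
            # if the closing edge is larger, and unoccupied (probability 1) otherwise.
            child = 0.0 if closing_is_larger else 1.0
        else:
            child = saw_root_unoccupied(adj, lam, path + [w])
        product *= child
    return 1.0 / (1.0 + lam * product)

def exact_unoccupied(adj, lam, v):
    """Brute-force probability that v is unoccupied in the graph itself."""
    n = len(adj)
    total = unocc = 0.0
    for mask in range(1 << n):
        members = [u for u in range(n) if mask >> u & 1]
        if any(b in adj[a] for a, b in combinations(members, 2)):
            continue  # not an independent set
        weight = lam ** len(members)
        total += weight
        if v not in members:
            unocc += weight
    return unocc / total

# A 4-cycle 0-1-2-3 with the chord 0-2.
chord_graph = [[1, 2, 3], [0, 2], [0, 1, 3], [0, 2]]
print(saw_root_unoccupied(chord_graph, 1.0, [0]), exact_unoccupied(chord_graph, 1.0, 0))
```

Both printed values agree up to floating-point error, as the theorem predicts. The recursion visits every self-avoiding walk, so its cost grows exponentially with the size of the graph.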
2.3 Our Proof Approach
In summary, Weitz first shows (via Theorem 1) that to prove SSM holds on a graph $G=(V,E)$, it suffices to prove SSM holds on the trees $T_{\mathrm{saw}}(G,v)$, for all $v \in V$. Weitz then proves that the regular tree $\mathbb{T}_\Delta$ “dominates” every tree of maximum degree $\Delta$ in the sense that, for all trees of maximum degree $\Delta$, SSM holds when $\lambda < \lambda_c(\mathbb{T}_\Delta)$. We refine this second part of Weitz’s approach. In particular, for graphs with extra structure, such as $\mathbb{Z}^2$, we bound $T_{\mathrm{saw}}(\mathbb{Z}^2)$ by a tree $T_M$ that is much closer to it than the regular tree $\mathbb{T}_\Delta$. We then establish a criterion that achieves better bounds on SSM for trees when the trees have extra structure.
The tree $T_M$ will be constructed in a regular manner so that we can prove properties about it: the construction of $T_M$ is governed by a (progeny) matrix $M$, whose rows correspond to types of vertices, with the entry $M_{ij}$ specifying the number of children of type $j$ that a vertex of type $i$ begets. We will then show a sufficient condition using entries of $M$ which implies that SSM holds for $T_M$ and for any subgraph of $T_M$, including $T_{\mathrm{saw}}(\mathbb{Z}^2)$. The construction of $T_M$ is reminiscent of the strategy employed in [1, 25] to upper bound the connectivity constant of several lattice graphs, including $\mathbb{Z}^2$. The derivation of our sufficient condition has some inspiration from belief propagation algorithms.
As a byproduct of our proof that our new criterion implies SSM for $T_M$, we get a new (and simpler) proof of the second part of Weitz’s approach, namely, that for all trees of maximum degree $\Delta$, SSM holds when $\lambda < \lambda_c(\mathbb{T}_\Delta)$.
3 Branching Matrices and Strong Spatial Mixing
As alluded to above, we will utilize more structural properties of self-avoiding walk trees. To this end, we consider families of trees which can be recursively generated by certain rules; we then show that such a general family is also analytically tractable.
3.1 Definition of Branching Matrices
We say that the $t \times t$ matrix $M$ is a branching matrix if every entry $M_{ij}$ is a non-negative integer. We say the maximum degree of $M$ is $\Delta = \max_{i} \sum_{j} M_{ij}$, the maximum row sum. Given a branching matrix $M$, we define the following family of graphs. In essence, it includes a graph $G$ if the self-avoiding walk trees of $G$ can be generated by $M$.
Definition 3 (Branching Family).
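Given a $t \times t$ branching matrix $M$, the family $\mathcal{F}_{\le M}$ includes trees $T$ which can be generated under the following restrictions:
Each vertex $v$ in tree $T$ has its type $t(v) \in \{1, \dots, t\}$.
Each vertex of type $i$ has at most $M_{ij}$ children of type $j$.
In addition, we use the notation $G \in \mathcal{F}_M$ if $T_{\mathrm{saw}}(G,v) \in \mathcal{F}_{\le M}$ for all $v \in V$.
For example, the family $\mathcal{F}_{\le M}$ with $M = [\Delta]$ includes the family of trees with maximum branching $\Delta$. On the other hand, $\mathcal{F}_M$ with $M = \begin{pmatrix} 0 & \Delta \\ 0 & \Delta-1 \end{pmatrix}$ describes the family of graphs of maximum degree $\Delta$, by assigning the root of each tree $T_{\mathrm{saw}}(G,v)$ to be of type 1 and the other vertices of the tree to be of type 2. Note that if $M$ has maximum degree $\Delta$, then every $G \in \mathcal{F}_M$ also has maximum degree $\Delta$.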
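A branching matrix is easy to manipulate directly. The sketch below computes the maximum degree (the maximum row sum) and the number of vertices on each level of the largest tree the matrix can generate, by propagating a vector of per-type counts. The two-type matrix used in the example, with rows $(0, \Delta)$ and $(0, \Delta-1)$ for $\Delta = 4$, is an assumed encoding of "root of type 1, all other vertices of type 2"; the function names and the zero-based type indices are illustrative.

```python
def max_degree(matrix):
    """Maximum row sum of a branching matrix."""
    return max(sum(row) for row in matrix)

def level_sizes(matrix, root_type, depth):
    """Number of vertices on levels 0..depth of the maximal tree generated by matrix.

    Types are zero-based here, so root_type=0 corresponds to 'type 1' in the text.
    """
    t = len(matrix)
    counts = [0] * t
    counts[root_type] = 1
    sizes = [1]
    for _ in range(depth):
        # Each vertex of type i contributes matrix[i][j] children of type j.
        counts = [sum(counts[i] * matrix[i][j] for i in range(t)) for j in range(t)]
        sizes.append(sum(counts))
    return sizes

# Assumed two-type encoding of graphs of maximum degree 4.
degree4 = [[0, 4], [0, 3]]
print(max_degree(degree4))         # -> 4
print(level_sizes(degree4, 0, 3))  # -> [1, 4, 12, 36]
```

The level sizes 1, 4, 12, 36 are those of the 4-regular tree, which is the largest SAW tree a graph of maximum degree 4 can have. A matrix with more types can exclude some walks and therefore generates a smaller tree.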
In this framework, Weitz’s result establishing SSM for all graphs of maximum degree when can be stated as establishing SSM with uniform rate for all with ; and we are interested in establishing its analogy for general . To this end, we will use the following notion of SSM for .
To establish SSM for , it suffices to prove that SSM holds with uniform rate for all trees in due to Corollary 2. In addition, note that SSM holds for if and only if it holds for since the root of a tree is the only possible vertex of type 1 in .
Finally, we define SSM for a branching matrix .
Given a branching matrix , we say SSM holds for if SSM holds with uniform rate for all .
3.2 Implications of SSM
We present a new approach for proving SSM for a branching matrix . There are multiple consequences of SSM for as summarized in the following theorem. We first state some definitions needed for stating the theorem.
Following Goldberg et al.  we use the following variant of amenability for infinite graphs. Here we consider an infinite graph . For and a non-negative integer , let denote the set of vertices within distance from , where distance is the length of the shortest path. For a set of vertices , the (outer) boundary and neighborhood amenability are defined, respectively, as:
The infinite graph is said to be neighborhood-amenable if .
Now we can state the following theorem detailing the implications of SSM of interest to us.
For a branching matrix , if SSM holds for then the following hold:
For every , SSM holds on .
For every infinite graph , there is a unique infinite-volume Gibbs measure on .
If has maximum degree , if and , then for every (finite) , Weitz’s algorithm  gives an FPAS for approximating the partition function .
For every infinite graph $G$ in the branching family of $M$ which is neighborhood-amenable, for every finite subgraph $G'$ of $G$, the Glauber dynamics has $O(n^2)$ mixing time. Moreover, if $G = \mathbb{Z}^d$ for constant $d$, then for every finite subgraph $G'$ of $G$, the Glauber dynamics has $O(n \log n)$ mixing time.
Part 1 is by the definition of SSM for . The uniqueness result follows from the fact that the infinite-volume extremal Gibbs measures on the infinite graph can be obtained by taking limits of finite measures, see Georgii  for an introduction to infinite-volume Gibbs measures, and see Martinelli  for Part 2. Part 3 immediately follows from the work of Weitz . Finally, for Part 4, there is a long line of work showing that for the integer lattice in fixed dimensions, for the Ising model SSM on implies mixing time of the Glauber dynamics on finite subregions of , e.g., see Cesi  and Martinelli  (and the references therein) for recent results on this problem. These results for the Ising model are typically stated for a general class of models, but that class does not include models with hard constraints, such as the hard-core model studied here. Dyer et al.  showed a simpler proof for the hard-core model that utilizes the monotonicity of the model. We use this result of  in Theorem 8 to get mixing time for subregions of . Goldberg et al. [15, Theorem 8] showed that for -colorings, if SSM holds for an infinite graph that is neighborhood-amenable, the Glauber dynamics has mixing time for all finite subgraphs of . Their proof holds for the hard-core model which implies Part 4. ∎
4 Establishing SSM for Branching Matrices
In this section we present a sufficient condition implying SSM for the family of trees generated by a branching matrix. As a consequence of the approach presented in this section we get a simpler proof of Weitz’s result implying SSM for all graphs with maximum degree $\Delta$ when $\lambda < \lambda_c(\mathbb{T}_\Delta)$. We then apply the condition presented in this section to $\mathbb{Z}^2$ in Section 5.
To show the decay of influence of a boundary condition , a common strategy is to prove some form of contraction for the ‘one-step’ iteration given in (1) below. More generally, we will prove such a contraction for an appropriate set of ‘statistics’ of the unoccupied marginal probability.
A statistic of the univariate parameter $x \in [0,1]$ is a monotone (i.e., strictly increasing or decreasing) function $\varphi : [0,1] \to \mathbb{R}$. For a $t \times t$ branching matrix $M$ we consider a set of statistics $\varphi_1, \dots, \varphi_t$, one for each type. For the simpler case when $t=1$ and hence $M = [\Delta]$, we have a single statistic $\varphi$. Our aim is proving contraction for an appropriate set of statistics of the probability that the root of a tree is unoccupied.
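We first focus on the case of a single type. Consider a tree $T$ with root $r$. For a vertex $v$ of $T$, let $w_1, \dots, w_{d(v)}$ denote the children of $v$, where $d(v)$ is the number of children. Let $T_v$ denote the subtree rooted at $v$. We will analyze the unoccupied probability for a vertex $v$, but $v$ will always be the root of its subtree. Hence, to simplify the notation, for a boundary condition $\sigma$ on $T$, let $\alpha_v = \alpha_v(\sigma)$ denote the probability that $v$ is unoccupied in $T_v$ under $\sigma$.
A straightforward recursive calculation with the partition function leads to the following relation:
$$\alpha_v = \frac{1}{1 + \lambda \prod_{i=1}^{d(v)} \alpha_{w_i}}. \tag{1}$$
Note that the unoccupied probability always lies in the interval $\left[\frac{1}{1+\lambda}, 1\right]$, i.e., for all $T$, all $\sigma$, and all $v$, $\alpha_v \in \left[\frac{1}{1+\lambda}, 1\right]$.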
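The one-step recursion for the unoccupied probability is straightforward to evaluate on an explicit tree. In the minimal sketch below a tree is encoded as a nested list of its children's subtrees (a leaf is the empty list); this encoding and the function name are assumptions made for illustration, and no boundary condition is imposed.

```python
def unoccupied_probability(children, lam):
    """Probability that the root is unoccupied, via alpha = 1 / (1 + lam * prod(alpha_child)).

    `children` is the list of subtrees hanging off the root; a leaf is [].
    """
    product = 1.0
    for child in children:
        product *= unoccupied_probability(child, lam)
    return 1.0 / (1.0 + lam * product)

leaf = []
edge = [leaf]         # a root with a single child
star = [leaf, leaf]   # a root with two leaf children

print(unoccupied_probability(leaf, 1.0))  # -> 0.5
print(unoccupied_probability(star, 1.0))  # -> 0.8
```

At $\lambda = 1$ the star has five independent sets, four of which leave the root unoccupied, in agreement with the value $0.8$. Every value returned lies in $[1/(1+\lambda), 1]$ because the product of the children's probabilities lies in $[0,1]$.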
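For a vertex $v$ with unoccupied probability $\alpha_v$, let $m_v = \varphi(\alpha_v)$ be the ‘message’ at vertex $v$. The messages satisfy the following recurrence:
$$m_v = \varphi\!\left( \frac{1}{1 + \lambda \prod_{i=1}^{d(v)} \varphi^{-1}(m_{w_i})} \right).$$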
Our aim is to prove uniform contraction of the messages on all trees . To this end, we will consider a more general set of messages. Namely, we consider messages where for every , and . This set of tuples contains all of the tuples obtainable on a tree.
For , let , and let
Ideally, we would like to establish the following contraction: there exists a such that for all ,
where and . We will instead show that the following weaker condition suffices. Namely, that the desired contraction holds for all for some . This is equivalent to the following condition.
Let . For the branching matrix , we say that Condition () is satisfied if for all , by setting for , the following holds:
Let us now consider a natural generalization of the above notion for a branching matrix with multiple types. Let be a branching matrix. For , let denote the maximum number of children of a vertex of type . Once again, consider a tree with root . For , let denote its type. As before, are the children of , is the number of children of , and for a boundary condition on , is the unoccupied probability for in the tree under .
The recursive calculation in (1) for in terms of , still holds. For the case of multiple types, for , let be the message at vertex . The messages satisfy the following recurrence:
For each type , we consider contraction of messages derived from all . We need to identify the type of each these quantities in order to determine the appropriate statistic to apply. The assignment of types needs to be consistent with the branching matrix . Hence, let be the following assignment. Let and for , let . For , for , let .
For type , for , set , and let
We generalize Condition () to branching matrices with multiple types by allowing a weighting of the types by parameters .
Let . For a branching matrix , we say that Condition () is satisfied if there exist , such that for all , for all , by setting for , the following holds:
The following lemma establishes a sufficient condition so that SSM holds for .
For a branching matrix , if for every , is continuously differentiable on the interval and , and if Condition () is satisfied for or Condition () is satisfied for then SSM holds for , and hence the conclusions of Theorem 3 follow.
For a tree with root , let and denote the marginal probabilities that the root of is unoccupied conditional on the vertices at level (i.e., distance from the root) being occupied and unoccupied, respectively.
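On a complete tree the two extreme boundary conditions reduce to iterating a single scalar map, which makes the decay (or persistence) of the gap between them easy to observe. The sketch below assumes a complete tree in which every vertex has the same number of children; the function name is illustrative. For branching $b$ the relevant tree threshold is $b^b/(b-1)^{b+1}$, which equals $27/16$ for $b = 3$.

```python
def boundary_gap(branching, lam, level):
    """Gap between the root's unoccupied probabilities under the two extreme
    boundary conditions at the given level of a complete tree."""
    # Unoccupied probability of a level-`level` vertex: 0 if fixed occupied,
    # 1 if fixed unoccupied. Each loop pass moves one level toward the root.
    from_occupied, from_unoccupied = 0.0, 1.0
    for _ in range(level):
        from_occupied = 1.0 / (1.0 + lam * from_occupied ** branching)
        from_unoccupied = 1.0 / (1.0 + lam * from_unoccupied ** branching)
    return abs(from_occupied - from_unoccupied)

for lam in (1.0, 3.0):
    print(lam, [boundary_gap(3, lam, level) for level in (5, 20, 80)])
```

With three children per vertex, the gap shrinks geometrically with the level at $\lambda = 1$ (below $27/16$), while at $\lambda = 3$ (above it) the two boundary conditions keep the root marginals apart at every depth.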
The main result for proving Lemma 4 is that there exist and such that for every tree and every integer ,
We first explain why (1) implies Lemma 4 and then we prove (1). Consider a tree with root , and a boundary condition on . Set as the distance of to the root of . The hard-core model on bipartite graphs has a monotonicity of boundary conditions (c.f., ) which implies that for odd , , and for even , . Hence, for any pair of boundary conditions and on ,
We now turn our attention to proving (1). Fix a branching matrix and consider a tree with root . Given , let denote the marginal probability that the root of is unoccupied given all of the vertices at level (in ) are assigned marginal probability of being unoccupied (conditional on its parent being unoccupied). Intuitively, can be thought as the marginal probability conditioned on a ‘fractional’ boundary configuration at level . As in (1), satisfies the following recurrence for :
Observe that for all , all , all , , and hence we can use Condition () to analyze .
Using the fact that and are continuously differentiable for , we have that for ,
Note that and should be independent of and , but may depend on and . The constant will be the following:
and the constant will be the constant implicit in Condition ().
We will show (3) by induction on . First we verify the base case . In this case,
where the first step is by the chain rule and the second is by the definition of the constant.
This completes the analysis of the base case.
Now we proceed toward establishing the necessary induction step using the inductive hypothesis. We have that
where the last step, which we refer to as (4), is by Hölder’s inequality.
From (), there exists a universal constant such that
Therefore, it follows that
where the first step is by (4) and the definition of the constant, and the second is by the inductive hypothesis.
4.1 Reproving Weitz’s Result of SSM for Trees
In this section, we aim at finding a good choice of statistics. First we find such a statistic for the case , i.e., the case of a single type, which enables us to reprove Weitz’s result  that when SSM holds for every tree of maximum degree .
Using Lemma 4 (and the simpler condition () for the case of a single type) we obtain a simpler proof of Weitz’s result  that for every tree with maximum degree (hence, for every graph of maximum degree ) and for all , SSM holds on (and on ).
Let where . Then, Condition () holds for and . Consequently, SSM and the conclusions of Theorem 3 hold for and .
First, a straightforward calculation implies that
where and .
Hence, we have
where the last step, which we refer to as (5), is by the arithmetic-geometric mean inequality.
We now use the following technical lemma.
where is a positive integer and is the unique solution to .