Rapid mixing of Swendsen-Wang and single-bond dynamics in two dimensions
We prove that the spectral gap of the Swendsen-Wang dynamics for
the random-cluster model on arbitrary graphs with edges is
bounded above by times the spectral gap of the
single-bond (or heat-bath) dynamics.
This and the corresponding lower bound (from [U12]) imply
that rapid mixing of these two dynamics is equivalent.
Using the known lower bound on the spectral gap of the Swendsen-Wang dynamics for the two-dimensional square lattice of side length at high temperatures and a result for the single-bond dynamics on dual graphs, we obtain rapid mixing of both dynamics on at all non-critical temperatures. In particular, this gives, as far as we know, the first proof of rapid mixing of a classical Markov chain for the Ising model on at all temperatures.
Markov chains for the random-cluster model and the closely related -state Potts model are the topic of many research articles from various areas of mathematics and statistical physics. Probably the most studied model is the Ising model (or 2-state Potts model) on the two-dimensional square lattice. While there is almost complete knowledge about the mixing properties of single-spin dynamics, such as the heat-bath dynamics, there are only a few results on cluster algorithms, such as Swendsen-Wang, or dynamics on the corresponding random-cluster model. The single-spin dynamics on , i.e. the two-dimensional square lattice of side length , is known to mix rapidly above the critical temperature [MO94], while below the critical temperature the mixing time is exponential in the side length , see [CGMS96]. See also [Mar99] for an excellent survey of the results known at that time. Only recently it was proven by Lubetzky and Sly [LS10] that the single-spin dynamics is also rapidly mixing at the critical temperature. One approach to overcome the torpid (or slow) mixing at low temperatures was to consider cluster algorithms that change the spin of a large portion of vertices at once. The most successful approach (so far) is the Swendsen-Wang dynamics (SW), which is based on the close relation between Potts and random-cluster models, see [SW87] and [ES88]. But although it is conjectured that this dynamics is rapidly mixing at high and at low temperatures, again most of the results concern high temperatures. Known results for general graphs include rapid mixing on trees and complete graphs at all temperatures, see e.g. [CF98], [CDFR00] and [LNP07], and on graphs with small maximum degree at high temperatures, see [CF98] and [Hub03]. Additionally, it is proven that, for bounded degree graphs, rapid mixing of single-spin dynamics implies rapid mixing of SW [U11]. At low temperatures there are only a few results on the mixing time of SW.
Besides the results for trees and complete graphs, we are only aware of two articles concerning the low temperature case, [Mar92] and [Hub03]. While Huber [Hub03] states rapid mixing for temperatures below some constant that depends on the size of the graph, Martinelli [Mar92] gave a result for hypercubic subsets of at sufficiently low temperatures that do not depend on the side length. In addition to the rapid mixing results for SW there are some results on torpid mixing. These include torpid mixing for the -state Potts model at the critical temperature on the complete graph for all [GJ97] and on hypercubic subsets of for sufficiently large [BCT10].
In this article we study the mixing properties of the Swendsen-Wang and the heat-bath dynamics for the random-cluster model (in fact, SW can be seen as a Markov chain for both random-cluster and Potts models), and we prove that the spectral gap of SW is bounded above by some polynomial in the size of the graph times the spectral gap of the heat-bath dynamics. In particular, this implies rapid mixing of SW for the Potts model on the two-dimensional square lattice at all temperatures below the critical one.
To state our results in detail, we first have to define the models and the algorithms. Let be a graph with finite vertex set and edge set . The random-cluster model (also known as the FK-model) with parameters and , see Fortuin and Kasteleyn [FK72], is defined on the graph by its state space and the RC measure
where is the number of connected components in the graph , counting isolated vertices as a component, and is the normalization constant that makes a probability measure. Note that this model is well-defined also for non-integer values of , but we do not need this generalization here. See [Gri06] for further details and related topics.
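For very small graphs the RC measure can be tabulated by brute force. The following is a minimal sketch, assuming the standard form of the measure from the literature, in which a subset A of the edge set E has weight p^|A| (1-p)^|E \ A| q^c(A). The symbols `p` and `q`, the function names and the triangle example are illustrative assumptions and are not taken from the text above.

```python
from itertools import combinations


def num_components(n_vertices, edge_list):
    """Number of connected components of ({0..n-1}, edge_list), isolated vertices included."""
    parent = list(range(n_vertices))

    def find(v):
        while parent[v] != v:
            parent[v] = parent[parent[v]]  # path halving
            v = parent[v]
        return v

    count = n_vertices
    for u, v in edge_list:
        ru, rv = find(u), find(v)
        if ru != rv:
            parent[ru] = rv
            count -= 1
    return count


def rc_measure(n_vertices, edges, p, q):
    """Return {frozenset of edge indices: probability} (assumed standard RC weights)."""
    m = len(edges)
    weights = {}
    for k in range(m + 1):
        for subset in combinations(range(m), k):
            sub_edges = [edges[i] for i in subset]
            c = num_components(n_vertices, sub_edges)
            weights[frozenset(subset)] = p**k * (1 - p) ** (m - k) * q**c
    z = sum(weights.values())  # normalization constant
    return {a: w / z for a, w in weights.items()}


triangle = [(0, 1), (1, 2), (0, 2)]  # hypothetical example graph
mu = rc_measure(3, triangle, p=0.4, q=2)
```

The enumeration is exponential in the number of edges, so this is only meant as a sanity check on toy graphs. For `q=1` the weights reduce to independent bond percolation.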
A closely related model is the -state Potts model on at inverse temperature , that is defined as the set of possible configurations , where is the set of colors (or spins), together with the probability measure
for , where is the same normalization constant as for the RC model (see [Gri06, Th. 1.10]). For this model is called the Ising model.
The connection of these models is given by a coupling of the RC and the Potts measure in the case . Let and . Then the joint measure of is defined by
The Swendsen-Wang dynamics (SW) uses this coupling implicitly in the following way. Suppose the SW dynamics at time is in the state . We choose with respect to the measure , i.e. every connected component of is colored independently and uniformly at random with a color from . Then take and delete each edge independently with probability to obtain , which can be seen as sampling from . Denote by the transition matrix of this Markov chain. Of course, we can perform these two steps in reverse order to obtain a Markov chain for the -state Potts model with transition matrix .
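The two steps just described can be sketched in a few lines. This is an illustration under stated assumptions, not the author's implementation: edges are indexed by position in a list, `p` denotes the probability of keeping a monochromatic edge (so each such edge is deleted with probability `1 - p`), `q` is the number of colors, and all names are hypothetical.

```python
import random


def sw_step(n_vertices, edges, current, p, q, rng=random):
    """One Swendsen-Wang step on the random-cluster state `current` (a set of edge indices)."""
    parent = list(range(n_vertices))

    def find(v):
        while parent[v] != v:
            parent[v] = parent[parent[v]]
            v = parent[v]
        return v

    # Build the connected components of the subgraph given by the current edges.
    for i in current:
        u, v = edges[i]
        parent[find(u)] = find(v)

    # Step 1: color every component independently and uniformly with one of q colors.
    color_of_root = {}
    sigma = []
    for v in range(n_vertices):
        root = find(v)
        if root not in color_of_root:
            color_of_root[root] = rng.randrange(q)
        sigma.append(color_of_root[root])

    # Step 2: take all monochromatic edges of the graph and keep each with probability p.
    return {
        i
        for i, (u, v) in enumerate(edges)
        if sigma[u] == sigma[v] and rng.random() < p
    }


square = [(0, 1), (1, 2), (2, 3), (3, 0)]  # hypothetical 4-cycle
state = {0, 2}
for _ in range(5):
    state = sw_step(4, square, state, p=0.6, q=2)
```

Note that the second step looks at all edges of the graph, not only those in the current state, which is what allows the chain to add edges as well as remove them.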
The heat-bath dynamics (HB) for the random-cluster model is a local Markov chain that, given the current state , sets with probability and otherwise chooses an edge uniformly at random and changes the state at most at the edge with respect to the conditional measure given all the other edges, that is, it samples from the conditional measure . The transition matrix of this chain is denoted by .
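A single heat-bath update can be sketched as follows. The sketch assumes a holding probability of 1/2 (the value is not shown in the text above) and the standard conditional probabilities of the random-cluster model: the chosen edge is present with probability `p` if its endpoints are connected by the remaining edges, and with probability `p / (p + q * (1 - p))` otherwise. Function and parameter names are hypothetical.

```python
import random


def connected(n_vertices, edge_list, u, v):
    """True if u and v are joined by a path that uses only edges from edge_list."""
    adjacency = {w: [] for w in range(n_vertices)}
    for a, b in edge_list:
        adjacency[a].append(b)
        adjacency[b].append(a)
    seen, stack = {u}, [u]
    while stack:
        w = stack.pop()
        if w == v:
            return True
        for x in adjacency[w]:
            if x not in seen:
                seen.add(x)
                stack.append(x)
    return False


def hb_step(n_vertices, edges, current, p, q, rng=random):
    """One lazy heat-bath step on the random-cluster state `current` (set of edge indices)."""
    if rng.random() < 0.5:  # assumed laziness: stay put with probability 1/2
        return set(current)
    i = rng.randrange(len(edges))  # uniformly chosen edge
    u, v = edges[i]
    others = [edges[j] for j in current if j != i]
    # Conditional probability that edge i is present given all other edges.
    if connected(n_vertices, others, u, v):
        prob_open = p
    else:
        prob_open = p / (p + q * (1 - p))
    new_state = set(current) - {i}
    if rng.random() < prob_open:
        new_state.add(i)
    return new_state


triangle = [(0, 1), (1, 2), (0, 2)]  # hypothetical example graph
state = set()
for _ in range(10):
    state = hb_step(3, triangle, state, p=0.4, q=2)
```

Each update needs one connectivity query, which is the only non-local part of this otherwise local chain.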
The spectral gap of a Markov chain with transition matrix is defined by
Let (resp. ) be the transition matrix of the Swendsen-Wang (resp. heat-bath) dynamics for the random-cluster model on a graph with edges. Then
Using the corresponding lower bound, which was proven in [U12], we obtain that SW is rapidly mixing if and only if HB is rapidly mixing, since the spectral gaps can differ only by a polynomial in the number of edges of the graph. Furthermore, we prove that the heat-bath dynamics for the RC model on a planar graph with parameters and has the same spectral gap as the heat-bath dynamics for the dual model, which is the random-cluster model on the dual graph (see Section 5.1 for definitions) with parameters and , where satisfies . We denote the dynamics for the dual model by (resp. ). This was probably known before, but we could not find a reference. It follows
Let be the transition matrix of the Swendsen-Wang dynamics for the random-cluster model on a planar graph with edges and let be the SW dynamics for the dual model. Then there exists a constant , such that
If we consider the two-dimensional square lattice of side length , i.e. the graph with and , we can deduce the following from the results of [U11].
Let be the transition matrix of the Swendsen-Wang dynamics for the random-cluster model on with parameters and . Let . Then there exist constants such that
for and ,
An immediate consequence is the following corollary.
Let be the transition matrix of the Swendsen-Wang dynamics for the -state Potts model on at inverse temperature . Let . Then there exist constants such that
for and ,
This seems to be the first proof of rapid mixing of a classical Markov chain for the Ising model at all temperatures. In fact, in [U11] a somewhat artificial Markov chain, which makes an additional step on the dual graph, is proven to be rapidly mixing.
We also obtain for the heat-bath dynamics
Let be the transition matrix of the heat-bath dynamics for the random-cluster model on with parameters and . Let . Then there exist constants such that
for and ,
The results of [U11], and hence the proofs of Theorems 3 and 5, rely ultimately on the rapid mixing results for the heat-bath dynamics for the Potts (resp. Ising) model that were proven over the last decades. To name only some of them, see e.g. [LS10], [MO94], [MOS94] and [Ale98] together with the proof of exponential decay of connectivities up to the critical temperature from [BD10]. These articles give an almost complete picture of what is known so far about mixing of single-spin dynamics in .
The plan of this article is as follows. In Section 2, we introduce the necessary notation related to the spectral gap of Markov chains. Section 3 contains a more detailed description of the algorithms and the definition of the “building blocks” that are necessary to represent the dynamics on the FKES model. In Section 4 we will prove Theorem 1, and in Section 5 we introduce the notion of dual graphs and prove the remaining results from above.
2 Spectral gap and mixing time
As stated in the introduction, we want to estimate the efficiency of Markov chains. For an introduction to Markov chains and techniques to bound the convergence rate to the stationary distribution, see e.g. [LPW09]. In this article we consider the spectral gap as a measure of efficiency. Let be the transition matrix of a Markov chain with state space that is ergodic, i.e. irreducible and aperiodic, and has unique stationary measure . Additionally, let the Markov chain be reversible with respect to , i.e.
Then we know that the spectral gap of the Markov chain can be expressed in terms of norms of the (Markov) operator that maps from to , where inner product and norm are given by and , respectively. The operator is defined by
and represents the expected value of the function after one step of the Markov chain starting in . The operator norm of is
and we use interchangeably for functions and operators, because it will be clear from the context which norm is used. It is well known that for reversible , where , and that reversibility of is equivalent to self-adjointness of the corresponding Markov operator, i.e. , where is the (adjoint) operator that satisfies for all . The transition matrix that corresponds to the adjoint operator satisfies
If we are considering a family of state spaces with a corresponding family of Markov chains , we say that the chain is rapidly mixing for the given family if for all and some .
In several (probably most) articles on mixing properties of Markov chains the authors prefer to use the mixing time as a measure of efficiency, which is defined by
The mixing time and spectral gap of a Markov chain (on finite state spaces) are closely related by the following inequality, see e.g. [LPW09, Theorems 12.3 and 12.4].
Let be the transition matrix of a reversible, ergodic Markov chain with state space and stationary distribution . Then
In particular, we obtain the following for the random-cluster model.
Let be the transition matrix of a reversible, ergodic Markov chain for the random-cluster model on with parameters and . Then
Therefore, all results of this article can also be written in terms of the mixing time, losing the same factor as in Corollary 7.
3 Joint representation of the algorithms
In order to make our description of the considered Markov chains complete, we state in this section formulas for their transition matrices. Additionally, we introduce another local Markov chain that will be necessary for the further analysis and introduce a representation of the dynamics on the (joint) FKES model.
The Swendsen-Wang dynamics (on the RC model), as stated in the introduction, is based on the given connection of the random cluster and Potts models and has the transition matrix
Recall that we denote by the Swendsen-Wang dynamics for the Potts model and note that both dynamics have the same spectral gap, see [U11].
The second algorithm we want to analyze is the (lazy) heat-bath dynamics. Let be given and denote by (resp. ) connected (resp. not connected) in the subgraph . Additionally we use throughout this article instead of (respectively for ) and denote the endpoints of by and , i.e. . For , , the transition probabilities of the HB dynamics are given by
where denotes the symmetric difference and is chosen such that is stochastic. Hence, satisfies
for . The heat-bath dynamics has the advantage that the corresponding HB dynamics for the dual model has the same spectral gap, see Section 5.1. Unfortunately, this Markov chain does not admit a representation on the joint model like the SW dynamics. Therefore we introduce the following (non-lazy) local dynamics with transition probabilities
We call this Markov chain the single-bond dynamics (SB). This chain is inspired by the Swendsen-Wang dynamics since for a graph that consists of two vertices connected by a single edge. Note that , and are reversible with respect to .
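Reversibility can be checked numerically on a toy example. The sketch below does this only for the lazy heat-bath chain, built directly from its description as resampling one edge from the conditional measure. It assumes the standard RC weights, a holding probability of 1/2 and parameters with 0 < p < 1; all names are hypothetical, and the single-bond chain is not covered because its transition probabilities are not shown above.

```python
from itertools import combinations


def num_components(n_vertices, edge_list):
    """Number of connected components, isolated vertices included."""
    parent = list(range(n_vertices))

    def find(v):
        while parent[v] != v:
            parent[v] = parent[parent[v]]
            v = parent[v]
        return v

    count = n_vertices
    for u, v in edge_list:
        ru, rv = find(u), find(v)
        if ru != rv:
            parent[ru] = rv
            count -= 1
    return count


def rc_measure(n_vertices, edges, p, q):
    """Brute-force RC measure as {frozenset of edge indices: probability}."""
    m = len(edges)
    weights = {}
    for k in range(m + 1):
        for subset in combinations(range(m), k):
            c = num_components(n_vertices, [edges[i] for i in subset])
            weights[frozenset(subset)] = p**k * (1 - p) ** (m - k) * q**c
    z = sum(weights.values())
    return {a: w / z for a, w in weights.items()}


def open_probability(mu, state, i):
    """Conditional probability that edge i is present, given the other edges of `state`."""
    with_e, without_e = state | {i}, state - {i}
    return mu[with_e] / (mu[with_e] + mu[without_e])


def hb_matrix(n_vertices, edges, p, q, lazy=0.5):
    """Transition probabilities P[a][b] of the lazy heat-bath chain (assumed laziness 1/2)."""
    mu = rc_measure(n_vertices, edges, p, q)
    m = len(edges)
    P = {a: {b: 0.0 for b in mu} for a in mu}
    for a in mu:
        P[a][a] += lazy
        for i in range(m):
            r = open_probability(mu, a, i)
            P[a][a | {i}] += (1 - lazy) / m * r
            P[a][a - {i}] += (1 - lazy) / m * (1 - r)
    return mu, P


def detailed_balance_error(mu, P):
    """Largest violation of mu(a) P(a, b) = mu(b) P(b, a) over all pairs of states."""
    return max(abs(mu[a] * P[a][b] - mu[b] * P[b][a]) for a in mu for b in mu)


triangle = [(0, 1), (1, 2), (0, 2)]  # hypothetical example graph
mu, P = hb_matrix(3, triangle, p=0.4, q=2)
error = detailed_balance_error(mu, P)
```

On this example `error` should vanish up to floating-point rounding, and `open_probability` reproduces the two familiar values `p` and `p / (p + q * (1 - p))` depending on whether the endpoints of the edge are already connected.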
Before we state the representation of and on the FKES model, we show that the spectral gaps of and are closely related. For this let and note that is the transition matrix of the lazy single-bond dynamics.
For and for the random-cluster model with parameters and we have
Using standard comparison ideas, e.g. from [DSC93, Section 2.A],
we obtain that for two transition matrices and ,
for all implies
, where for lazy Markov chains the inequality
for all is sufficient. Additionally we have in general
Therefore it is enough to prove
for all ,
which is easy to check.
We want to represent the Swendsen-Wang and the single-bond dynamics on the FKES model, which consists of the product state space and the FKES measure . This was done first in [U12] and we follow the steps from this article. First we introduce the stochastic matrix that defines the mapping (by matrix multiplication) from the RC to the FKES model
Note that defines an operator (like in (1)) that maps from to and its adjoint operator can be given by the (stochastic) matrix
The following matrix represents the updates of the RC “coordinate” in the FKES model. For and let
Let , and be the matrices from above. Then
and are self-adjoint in .
and for all .
Now we can state the desired Markov chains with the matrices from above.
Let , and be the matrices from above. Then
4 Proof of Theorem 1
In this section we will prove Theorem 1. This is done in two subsections. In the first one we prove some general norm estimates for operators on (resp. between) Hilbert spaces. In the second subsection we will apply these estimates to the setting from above to obtain the result.
4.1 Technical lemmas
In this section we provide some technical lemmas that will be necessary for the analysis. We state them in a general form, because we believe that they could also be useful in other settings. First let us introduce the notation. Throughout this section consider two Hilbert spaces and with the corresponding inner products and . The norms in and are defined as usual as the square root of the inner product of a function with itself. Additionally, we denote by (resp. ) the operator norms of operators mapping from to (resp. to ). We consider two bounded, linear operators, and . The operator maps from to and has the adjoint , i.e. with for all and . The operator is self-adjoint and acts on . Obviously, is then self-adjoint on .
In the setting from above let be also positive, i.e. , then
In the special case this lemma was used in [U12] to prove a lower bound on the spectral gap of SW. We will recall this result later.
By the assumptions, has a unique positive square root , i.e. , which is again self-adjoint, see e.g. [Kre78, Th. 9.4-2]. We obtain
In particular, if this proves monotonicity in .
In the setting from above let additionally , then
for all .
The case is obvious. Now suppose the statement is correct for , then
which proves the statement for . ∎
The next corollary combines the statements of the last two lemmas to give a result similar to Lemma 12 for arbitrary exponents.
In addition to the general assumptions of this section let be positive, and . Then
for all .
Let such that . Since by assumption, we obtain
In this section we apply the estimates from the last one. Recall that we consider the dynamics on a graph with edges, i.e. . Fix an arbitrary ordering of the edges . We set the Hilbert spaces from the last section to and and define the operators
for . By Lemma 9(ii) we obtain for that if and only if . Furthermore, for every with for all . We prove the following theorem.
Let and be a bounded, linear operator with . Then
Define the index sets and . Let and denote by , for , the multinomial coefficient. Obviously (by the multinomial theorem),
Note that we use for the second equality that the ’s are commuting by Lemma 9(ii). Since we know that for every (note that for ) and for every , we obtain
Using for and , and it follows
Setting yields the result. ∎
Now we are able to prove the comparison result for SW and SB dynamics. For this let for all and , which defines an operator (by (1)) that maps from to . The adjoint operator is then given by and thus, for all . For the proof we set
It follows that , since , but and thus . This implies . Additionally, and .
Let (resp. ) be the transition matrix of the Swendsen-Wang (resp. single-bond) dynamics for the random-cluster model on a graph with edges. Then
Let . Then
where the last inequality comes from for .
Setting , we obtain
. This proves the statement.
In this section we introduce the notion of (planar) dual graphs and prove that the heat-bath dynamics on a planar graph has the same spectral gap as the heat-bath dynamics for the dual model on , which is the dual graph of . This immediately implies Corollary 2 and hence that rapid mixing of the Swendsen-Wang dynamics for the random-cluster model and its dual model is equivalent. Finally, we use the known lower bounds on the spectral gap of SW on the two-dimensional square lattice at high temperatures to prove Theorem 3 and Theorem 5.
5.1 Dual graphs
Let be a planar graph, i.e. a graph that can be embedded into a sphere such that two edges of intersect only at a common endvertex. We fix such an embedding for . Then we define the dual graph of as follows. Place a dual vertex in each face, i.e. in each region of whose boundary consists of edges in the embedding of , and connect two vertices by the dual edge if and only if the corresponding faces of share the boundary edge (see e.g. [Gri10, Section 8.5]). Note that the dual graph certainly depends on the chosen embedding. It is clear that the number of vertices can differ in the dual graph, but we have the same number of edges.
Additionally we define a dual RC configuration in to an RC state in by
where is the edge in that intersects in our (fixed) embedding. (By construction, this edge is unique.)
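The dual configuration and the resulting relation between the two random-cluster models can be checked numerically on a tiny planar graph. The sketch below assumes the usual conventions from the literature, which are not visible in the text above: a dual edge is present exactly when the primal edge it crosses is absent, and the dual parameter `p_star` satisfies p*/(1-p*) = q(1-p)/p. The example is a triangle, whose dual has two vertices (inner and outer face) joined by three parallel edges; all names are hypothetical.

```python
from itertools import combinations


def num_components(n_vertices, edge_list):
    """Number of connected components; parallel edges are handled naturally."""
    parent = list(range(n_vertices))

    def find(v):
        while parent[v] != v:
            parent[v] = parent[parent[v]]
            v = parent[v]
        return v

    count = n_vertices
    for u, v in edge_list:
        ru, rv = find(u), find(v)
        if ru != rv:
            parent[ru] = rv
            count -= 1
    return count


def rc_measure(n_vertices, edges, p, q):
    """Brute-force RC measure as {frozenset of edge indices: probability}."""
    m = len(edges)
    weights = {}
    for k in range(m + 1):
        for subset in combinations(range(m), k):
            c = num_components(n_vertices, [edges[i] for i in subset])
            weights[frozenset(subset)] = p**k * (1 - p) ** (m - k) * q**c
    z = sum(weights.values())
    return {a: w / z for a, w in weights.items()}


def dual_configuration(n_edges, state):
    """Dual edge i is present exactly when primal edge i is absent (assumed convention)."""
    return frozenset(range(n_edges)) - state


def dual_parameter(p, q):
    """Solve p*/(1-p*) = q(1-p)/p for p* (assumed standard duality relation)."""
    return q * (1 - p) / (p + q * (1 - p))


triangle = [(0, 1), (1, 2), (0, 2)]
dual_triangle = [(0, 1), (0, 1), (0, 1)]  # dual edge i crosses primal edge i

p, q = 0.4, 2
mu = rc_measure(3, triangle, p, q)
mu_dual = rc_measure(2, dual_triangle, dual_parameter(p, q), q)
max_gap = max(abs(mu[a] - mu_dual[dual_configuration(3, a)]) for a in mu)
```

Here `max_gap` measures how far the primal probability of a configuration is from the dual probability of its dual configuration. For this example it should be zero up to rounding, in line with the duality of the two models.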
It is easy to obtain (see [Gri06, p. 134]) that the random cluster models on the (finite) graphs and are related by the equality