Inference and Sampling of K_{33}-free Ising Models

Inference and Sampling of -free Ising Models

Valerii Likhosherstov, Yury Maximov and Michael Chertkov
Skolkovo Institute of Science and Technology, Moscow, Russia
Theoretical Division and Center for Nonlinear Studies,
Los Alamos National Laboratory, Los Alamos, NM, USA
Graduate Program in Applied Mathematics,
University of Arizona, Tucson, AZ, USA

We call an Ising model tractable when it is possible to compute its partition function value (statistical inference) in polynomial time. The tractability also implies an ability to sample configurations of this model in polynomial time. The notion of tractability extends the basic case of planar zero-field Ising models. Our starting point is to describe algorithms for the basic case, computing partition function and sampling efficiently. Then, we extend our tractable inference and sampling algorithms to models whose triconnected components are either planar or graphs of size. In particular, it results in a polynomial-time inference and sampling algorithms for (minor)-free topologies of zero-field Ising models—a generalization of planar graphs with a potentially unbounded genus. 111The paper to appear at the Proceedings of the 36-th International Conference on Machine Learning, Long Beach, California, PMLR 97, 2019. Implementation of the algorithms is available at



1 Introduction

Computing the partition function of the Ising model is generally intractable, even an approximate solution in the special anti-ferromagnetic case of arbitrary topology would have colossal consequences in the complexity theory [jerrum-sinclair]. Therefore, a question of interest—rather than addressing the general case—is to look after tractable families of Ising models. In the following, we briefly review tractability related to planar graphs and graphs embedded in surfaces of small genus.

Related work. Onsager [onsager] gave a closed-form solution for the partition function in the case of a homogeneous interaction Ising model over an infinite two-dimensional square grid without a magnetic field. This result has opened an exciting era of phase transition discoveries, which is arguably one of the most significant contributions in theoretical and mathematical physics of the 20th century. Then, Kac and Ward [kac-ward] showed in the case of a finite square lattice that the problem of the partition function computation is reducible to a determinant. Kasteleyn [kasteleyn] generalized the results to the case of an arbitrary inhomogeneous interaction Ising model over an arbitrary planar graph. Kasteleyn’s construction was based on mapping of the Ising model to a perfect matching (PM) model with specially defined weights over a modified graph. Kasteleyn’s construction was also based on the so-called Pfaffian orientation, which allows counting of PMs by finding a single Pfaffian (or determinant) of a matrix. Fisher [fisher] simplified Kasteleyn’s construction such that the modified graph remained planar. Transition to PM is fruitful because it extends planar zero-field Ising model inference to models embedded on a torus [kasteleyn] and, in fact, on any surface of small (orientable) genus , but with a price of the additional, multiplicative, and exponential in genus, , factor in the algorithm’s run time [gallucio].

A parallel way of reducing the planar zero-field Ising model to a PM problem consists of constructing a so-called expanded dual graph [bieche, barahona, schraudolph-kamenetsky]. This approach is more natural and interpretable because there is a one-to-one correspondence between spin configurations and PMs on the expanded dual graph. An extra advantage of this approach is that the reduction allows one to develop an exact efficient sampling. Based on linear algebra and planar separator theory [lipton-tarjan], Wilson introduced an algorithm [wilson] that allows one to sample PMs over planar graphs in time. The algorithms were implemented in [thomas-middleton1, thomas-middleton2] for the Ising model sampling, however, the implementation was limited to only the special case of a square lattice. In [thomas-middleton1] a simple extension of the Wilson’s algorithm to the case of bounded genus graphs was also suggested, again with the factor in complexity. Notice that imposing zero field condition is critical, as otherwise, the Ising model over a planar graph is NP-hard [barahona]. On the other hand, even in the case of zero magnetic field Ising models over general graphs are difficult [barahona].

Contribution. In this manuscript, we discuss tractability related to the Ising model with zero magnetic fields over graphs more general than planar. Our construction is related to graphs characterized in terms of their excluded minor property. Planar graphs are characterized by excluded minor and minor (Wagner’s theorem [diestel], Chapter 4.4). Therefore, instead of attempting to generalize from planar to graphs embedded into surfaces of higher genus, it is natural to consider generalizations associated with a family of graphs excluding minor or minor.

In this manuscript, we show that -free zero-field Ising models are tractable in terms of inference and sampling and give a tight asymptotic bound, , for both operations. For that purpose, we use graph decomposition into triconnected components—the result of recursive splitting by pairs of vertices, disconnecting the graph. Indeed, the -free graphs are simple to work with because their triconnected components are either planar or graphs [hall]. Therefore, the essence of our construction is to decompose the inference task in Ising over a -free graph into a sequential dynamic programming evaluation over planar or graphs in the spirit of [straub]. Notice that the triconnected classification of the tractable zero-field Ising models is complementary to the aforementioned small genus classification. We illustrate the difference between the two classifications with an explicit example of a tractable problem over a graph with genus growing linearly with graph size.

Structure. The manuscript is organized as follows. Sections 2 and 3, respectively, establish notations and pose problems of inference and sampling. Section 4 presents transition from the zero-field Ising model to an equivalent tractable perfect matching (PM) model. This provides a description of a inference and sampling method in planar models, which is new (to the best of our knowledge), and it sets the stage for what follows. Section 5 discusses a scheme for polynomial inference and sampling in zero-field models over graphs with triconnected components that are either planar or of size. Section 6 applies this scheme to -free zero-field Ising models, resulting in tight asymptotic bounds, which appear to be equivalent to those in the planar case. Section 7 describes benchmarks justifying correctness and efficiency of our algorithm. Technical proofs of statements given throughout the manuscript can be found in the supplementary material.

2 Definitions and Notations

Let be a finite set of vertices, a multiset consisting of , be edges, then we call a graph. We call normal, if is a set (i.e., there are no multiple edges in ).

A tree is a connected graph without cycles. For , let denote a graph . Let be a graph. Then is a subgraph of , if . Vertex is an articulation point of , if is disconnected. is biconnected if there are no articulation points in . Biconnected component is a maximal subgraph of without an articulation point.

The graph is planar if it can be drawn on a plane without edge intersections. The corresponding drawing is referred to as planar embedding of . When no ambiguity arises, we do not distinguish planar graph from its embedding.

A set is called a perfect matching (PM) of , if edges of are disjoint and their union equals . denotes the set of all PMs of . denotes a complete (normal) graph on vertices, and denotes a utility graph. Triple bond is a graph of two vertices and three edges between them. Multiple bond is a graph of two vertices and at least three edges between them.

3 Problem Setup

Let be a normal graph, . For each , define a random binary variable (a spin) , . Subscript will be used as shorthand for , for brevity, thus . For each , define a pairwise interaction . We associate assignment to vector with probability as follows:



The probability distribution (1) defines the so-called zero-field (or pairwise) Ising model, and is called the partition function (PF) of the zero-field Ising (ZFI) model. Notice that .

Given a ZFI model, our goal is to find (inference) and draw samples from the model efficiently.

4 Reducing Planar ZFI Model to PM Model

In this section, we consider a special case of planar graph and introduce a transition from the ZFI model to the perfect matching (PM) model on a different planar graph.

We assume that the planar embedding of is given (and if not, it can be found in time [boyer]). We follow [schraudolph-kamenetsky] in constructions discussed in this section.

4.1 Expanded Dual Graph

First, triangulate by adding new edges to such that . (The triangulation does not change probabilities of the spin assignments.) Graph is generated (use the same notation as for the original graph for convenience) and is biconnected with every face, including lying on the boundary, forming a triangle. Complexity of the triangulation procedure is , see [schraudolph-kamenetsky] for an example.

Second, construct a new graph, , where each vertex of is a face of , and there is an edge in if and only if and share an edge in . By construction, is planar, and it is embedded in the same plane as , so that each new edge intersects the respective old edge. Call a dual graph of . Since is triangulated, each has degree 3 in .

Third, obtain a planar graph and its embedding from by substituting each by a triangle so that each vertex of the triangle is incident to one edge, going outside the triangle (see Figure 1 for an illustration). Call an expanded dual graph of .

Newly introduced triangles of , substituting ’s vertices, are called Fisher cities [fisher]. We refer to edges outside triangles as intercity edges and denote their set as . The set of Fisher city edges is denoted as . Notice that intersects exactly one and vice versa, which defines a bijection between and ; denote it by . Observe also that , where is the size of . Moreover, is a PM of , and thus . Since is planar, one also finds that . Constructing takes efforts of complexity.

Figure 1: (a) A fragment of ’s embedding after triangulation (black), expanded dual graph (red). (b) Possible configurations and corresponding (wavy lines) on a single face of . Rotation symmetric and reverse sign configurations are omitted.

4.2 Perfect Matching (PM) Model

For , let be a set . Each Fisher city is incident to an odd number of edges in . Thus, can be uniquely completed to a PM by edges from . Denote the resulting PM by (see Figure 1 for an illustration). Let .

Lemma 1.

is a bijection between and .

Define weights on according to

Lemma 2.

For holds




is the PF of the PM distribution (PM model) defined by (2).

Second transition of (3) reduces the computation to solve for . Furthermore, only two equiprobable spin configurations and (one of which is in ) correspond to , and they can be recovered from in steps, thus resulting in the statement that one samples from (1) if sampling from (2) is known.

The PM model can be defined for an arbitrary graph with positive weights , as a probability distribution over : .

Our subsequent derivations are based on the following:

Theorem 1.

Given the PM model defined on planar graph of size with positive edge weights , one can find its partition function and sample from it in time.

Algorithms, constructively proving the theorem, are directly inferred from [wilson, thomas-middleton1], with minor changes/generalizations. Hence, we outline them in the supplementary material.


Inference and sampling of the PM model on (and, hence, the ZFI model on ) take time.

5 Dynamic Programming within Triconnected Components

Starting with this section, we present new results. We describe a general algorithm that allows us to perform inference and sampling from the ZFI model in the case where the triconnected components of the underlying graph are either planar or of size.

5.1 Decomposition into Biconnected Components

Consider a ZFI model (1) over a normal graph , . If is disconnected, then distribution (1) is decomposed into a product of terms associated with independent ZFI models over the connected components of . Hence, we assume below, without loss of generality, that is connected.

Let be biconnected components of . They form a tree if an edge is drawn between and whenever and share an articulation point. A simple reduction (see supplementary material) shows that inference and sampling on are reduced to a series of inference and sampling on ZFI models induced by subgraphs .

Lemma 3.

Let be partition functions of ZFI models induced by . Then,


Sampling from is reduced to a series of sampling on and post-processing.

Observe also that all the articulation points and the biconnected components of can be found in steps [hopcroft1]. Therefore later on, we assume without loss of generality that is biconnected.

5.2 Biconnected Graph as a Tree of Triconnected Components

In this subsection we follow [hopcroft2, gutwenger], see also [08Mader] to define the tree of triconnected components. Following discussions of the previous subsection, one considers here a biconnected .

Let . Divide into equivalence classes so that are in the same class if they lie on a common simple path that has as endpoints. are referred to as separation classes. If , then is a separation pair of , unless (a) and one of the classes is a single edge or (b) and each class is a single edge. Graph is called triconnected if it has no separation pairs.

Figure 2: (I) An example biconnected graph . (II) A separation pair of and separation classes associated with . (III) Result of split operation with . Hereafter, dashed lines indicate virtual edges and dotted lines connect equivalent virtual edges in split graphs. (IV) Split components of (non-unique). (V) Triconnected components of . (VI) Triconnected component tree of ; spacial alignment of V is preserved. “G," “B," and “C" are examples of the “triconnected graph," “multiple bond," and “cycle," respectively.

Let be a separation pair in with equivalence classes . Let be such that , . Then, graphs are called split graphs of with respect to , and is a virtual edge, which is a new edge between and , identifying the split operation. Due to the addition of , and are not normal in general.

Split into and . Continue splitting , and so on, recursively, until no further split operation is possible. The resulting graphs are split components of . They can either be (triangles), triple bonds, or triconnected normal graphs.

Let be a virtual edge. There are exactly two split components containing : and . Replacing and with is called merging and . Do all possible mergings of the cycle graphs (starting from triangles), and then do all possible mergings of multiple bonds starting from triple bonds. Components of the resulting set are referred to as the triconnected components of . We emphasize again that some graphs (i.e., cycles and bonds) in the set of triconnected components are not necessarily triconnected.

Lemma 4.

[hopcroft2] Triconnected components are unique for . Total number of edges within the triconnected components is at most .

Consider a graph , where vertices (further referred to as nodes for disambiguation) are triconnected components, and there is an edge between and in , when and share a (copied) virtual edge.

Lemma 5.

[hopcroft2] is a tree.


Figure 2 illustrates triconnected decomposition of a binconnected graph and intermediate steps towards it.

All triconnected components, and thus , can be found in steps [hopcroft2, gutwenger, vo]. Merging of two triconnected components is equivalent to contracting an edge in (VI on Figure 2). After all possible mergings, is recovered.

5.3 Inference via Dynamic Programming

Assume that there is a (small) number bounding the size of each nonplanar triconnected component. In the following, we present a polynomial time algorithm that computes for a given (fixed) .

First, one finds triconnected components of and in steps. Choose a root node in . For any node in , let the next node (on a unique path from to ) be a parent of , and be a child of . Nodes, which do not have any children, are called leaves. For node , let a subtree denote a subgraph constructed from , its children, grandchildren, and so on.

Our algorithm processes each node once. The node is only processed when all its children have been already processed, so a leaf is processed first and the root is processed last. Let be a currently processed node. Let be a graph obtained by merging all nodes in . If is a root, then . Since the root is processed last, it outputs the desired PF, Z. Figure 3 provides a visualization of a node processing routine which is to be explained.

If is not a root, let be a virtual edge shared between and its parent. The only virtual edge in is , and without is a subgraph of . Hence, pairwise interactions are defined for . The result of node ’s processing is a quantity.

where . Notice that , , and hence .

Processing nodes one by one we notice that the following cases are possible:

  1. is a leaf. Therefore, there is nothing to merge, and . If is nonplanar, find by brute force enumeration, completed in steps. If is a multiple bond, is found in steps.

    Assume now that node is (or corresponds to) a planar, normal graph. Define and consider a ZFI model with the probability defined over graph with as pairwise interactions. Let be the PF of the ZFI model. In the remaining part of this case we will only work with this induced ZFI model, so that one can assume that nodes in are ordered, , such that . Then, one utilizes the notations and and derives


    Next, one triangulates by adding enough edges with zero pairwise-interactions, similar to how it is done in Subsection 4.1. Assume that is triangulated, and observe that the right-hand side of Eq. (5) is not affected. Construct , which is an expanded dual graph of with , and defined as in Subsection 4.1. Then, define mapping , weights , and the PF as in 4.2. Denote .

    According to the definition of ,


    Denote . We continue the chain of relations/equalities (6) observing that

    Then one arrives at

    where is a PF of the PM model over . Compute and in steps, as described in Section 4. Since is planar of size , can also be computed in steps, as Theorem 1 states. The following relations finalize computation of in steps:

  2. is not a leaf, not a root. Let be ’s children, and be a virtual edge shared between and , . At this point, we already computed all . Each is a separation pair in that splits it into and the rest of , containing all , . Denote all virtual edges in as , and then the following relation holds:


    If is (or corresponds to) a multiple bond, (7) is computed trivially in steps. Hence, one assumes next that is a normal graph.

    Each is positive, and it essentially only depends on the product , that is, there exist such that . Using this relation, one rewrites (7) as


    Denote , for each . Then rewrite (8) as


    We compute (9) by brute force in steps, if is nonplanar. If is normal planar, we once again consider a ZFI model with the probability , defined over , where the pairwise weights are , and is the respective PF. Then applying machinery from Case 1, one derives

    in steps.

  3. is a root. Once again, let be children of , be a virtual edge shared between and , and , be the set of virtual edges in (which shares only with its children). Using considerations similar to those described while deriving Eq. (7), one arrives at

    Finally, one computes similarly to how the values were derived in Case 2. It takes steps if is a multiple bond. Otherwise, one constructs a ZFI model and finds the PF over the respective graphs in either steps, if the graph is nonplanar, or in steps, if is normal planar.

Figure 3: Inference. Illustration of a node processing. Arrow indicates a direction to the root. (I) Exemplary node (subgraph in the center with one solid side edge, one solid diagonal edge, and solid dashed edges, marked according to the rules explained in the captions to Fig. 2), its (two) children and a parent. (II) Topology of the ZFI model defined on . (III) Triangulated ZFI model. (IV) Expanded dual graph of ZFI model (red). Computing PF of ’s PMs is a part of the inference processing of the node . (V) graph for (red). Computing PF of PMs is a part of the inference processing of node , unless is a root.
Figure 4: Sampling. Illustration of a node processing. General notations (arrows, children, parents, dashed and dotted lines) are consistent with the captions of Figs. 2,3. Assume that spin values at ’s parent are already drawn (and consequently, spin values at are drawn, too). The examples in the top line are for the case of equal spin values at , and the examples in the bottom line are for unequal spin values at . (I) Start with the triangulated ZFI model defined during inference (see Fig. 3). (II) Find either (top, red) or (bottom, red) depending on spin values at . (III) Sample PM on or . (IV) Set spin values according to PM. (V) Propagate the spin values drawn along the virtual edges towards the child nodes.
Figure 5: KL-distance of the model probability distribution compared with the empirical probability distribution. are the model’s size and the number of samples, respectively.
Figure 6: Execution time of inference (red dots) and sampling (blue dots) depending on , shown on a logarithmic scale. Black line corresponds to .

5.4 Sampling via Dynamic Programming

The sampling algorithm, detailed below, follows naturally from the inference routine. Compute triconnected components of in steps. If all the triconnected components of are multiple bonds, should be a multiple bond itself, but is normal. Therefore, there exists a component that is not a multiple bond; choose it as a root of .

Use the inference routine (described in the previous Section) to compute . Now, do a backward pass through the tree, processing the root first, and then processing the node only when its parent has already been processed (Figure 4 visualizes the sampling algorithm).

Suppose is a root and it is processed by now. Since is not a multiple bond, it results in an Ising model, . Draw a spin configuration from this model. It will take steps if is nonplanar or steps if is planar.

Suppose is not a root. If is a multiple bond, spin values were already assigned to its vertices (contained within the node/graph ). Otherwise, there exists a ZFI model already constructed at the inference stage. Following the notation of Subsection 5.3, one has to sample from , since spins and are shared with the parent model and have already been drawn as and , respectively. If , all valid are such that , and the task is reduced to sampling PMs on . Otherwise, all valid are such that . Denote and notice that

Therefore, the task is reduced to sampling PM over .

6 -free Topology

6.1 ZFI Model over -free Graphs

Consider the ZFI model (1) over a normal connected graph . Let be some graph. Then, is a minor of , if it is isomorphic to ’s subgraph, in which some edges are contracted. (See [diestel], Chapter 1.7, for a formal definition.)

is -free, if is not a minor of , that is, it cannot be derived from ’s subgraph by contraction of some edges.

Let a biconnected be decomposed into the tree of triconnected components. Then, the following lemma holds:

Lemma 6.

[hall] Graph is -free if and only if its nonplanar triconnected components are exactly .

Therefore, if is -free, it satisfies all the conditions needed for efficient inference and sampling, described in Section 5. According to the lemma, the graph in Fig. 2 is -free. The next statement expresses the main contribution of this manuscript.

Theorem 2.

If is -free, inference or sampling of (1) takes steps.

We point out that the family of models for which the algorithm from Section 5 applies is broader than just -free models. However, we focus on -free graphs because they have a fortunate characterization in terms of a missing minor.

6.2 Discussion: Genus of -free Graphs

A remarkable feature of -free models is related to considerations addressing the graph’s genus. Genus of a graph is a minimal genus (number of handles) of the orientable surface that the graph can be embedded into. Kasteleyn [kasteleyn] has conjected that the complexity of evaluating the PF of a ZFI model embedded in a graph of genus is exponential in . The result was proven and detailed in [regge, gallucio, cimasoni1, cimasoni2]. One naturally asks what are genera of graphs over which the ZFI models are tractable. The following statement relates biconnectivity and graph topology (genus):

Theorem 3.

[battle] A graph’s genus is a sum of its biconnected component genera.

If a graph is not biconnected, its genus can be arbitrarily large, while inference and sampling may still be tractable in relation to the decomposition technique discussed in Subsection 5.1. Therefore, it becomes principally interesting to construct tractable biconnected models with large genus.

Lemma 7.

A biconnected -free graph of size can be of genus as big as .

From this we conclude that -free graphs can’t be tackled via the bounded-genus approach of [regge, gallucio, cimasoni1, cimasoni2]. This justifies the novelty of our contribution.

7 Implementation and Tests

To test the correctness of inference, we generate random -free models of a given size and then compare the value of PF computed in a brute force way (tractable for sufficiently small graphs) and by our algorithm. We simulate samples of sizes from ( samples per size) and verify that respective expressions coincide.

When testing sampling implementation, we take for granted that the produced samples do not correlate given that the sampling procedure (Section 5.4) accepts the Ising model as input and uses independent random number generation inside. The construction does not have any memory, therefore, it generates statistically independent samples. To test that the empirical distribution is approaching a theoretical one (in the limit of the infinite number of samples), we draw different numbers, , of samples from a model of size . Then we find Kullback-Leibler divergence between the probability distribution of the model (here we use our inference algorithm to compute the normalization, ) and the empirical probability, obtained from samples. Fig. 5 shows that KL-divergence converges to zero as the sample size increases. Zero KL-divergence corresponds to equal distributions.

Finally, we simulate inference and sampling for random models of different size and observe that the computational time (efforts) scales as (Fig. 6)222Implementation of the algorithms is available at

8 Conclusion

In this manuscript, we compiled results that were scattered over the literature on sampling and inference in the Ising model over planar graphs. To the best of our knowledge, we are the first to present a complete and mathematically accurate description of the tight asymptotic bounds.

We generalized the planar results to a new class of zero-field Ising models over graphs not containing as a minor. In this case, which is strictly more general than the planar case, we have shown that the complexity bounds for sampling and inference are the same as in the planar case. Along with the formal proof, we provided evidence of our algorithm’s correctness and complexity through simulations.


This work was supported by the U.S. Department of Energy through the Los Alamos National Laboratory as part of LDRD and the DOE Grid Modernization Laboratory Consortium (GMLC). Los Alamos National Laboratory is operated by Triad National Security, LLC, for the National Nuclear Security Administration of U.S. Department of Energy (Contract No. 89233218CNA000001).


Appendix A Technical Proofs

Lemma 1 proof. Let . Call saturated, if it intersects an edge from . Each Fisher city is incident to an odd number of edges in . Thus, each face of has an even number of unsaturated edges. This property is preserved, when two faces/cycles are merged into one by evaluating respective symmetric difference. Therefore, one gets that any cycle in has an even number of unsaturated edges.

For each define , where is the number of unsaturated edges on the path connecting and . The definition is consistent due to aforementioned cycle property. Now for each , if and only if is saturated. To conclude, we constructed such that . Such is unique, because parity of unsaturated edges on a path between and uniquely determines relationship between and , and is always . ∎

Lemma 2 proof. Let , . The statement is justified by the following chain of transitions:

1:Input: A tree of .
2:Draw from ZFI models on .
3:ProcessComponent(1, -1).
4:Combine into .
5:Output: .
7:Procedure ProcessComponent
8:Input: index , parent index .
9: = articulation point of and .
10:if unequal spins of and at  then
11:      :=
12:for  in ’s neighbors do
13:     if  then
14:         ProcessComponent(, ).      
15:end Procedure
Algorithm 1 Sampling from

Lemma 3 proof. The Algorithm 1 reduces sampling on to a series of samplings on .

Given the algorithm and inference formula in Lemma 3, the statement is obvious for . Let . Let be an articulation point shared by and . Denote , . Without loss of generality assume that has index in and . Let . Then one derives:

where is the PF of the ZFI model induced by . As far as sampling is concerned, denote by a probability distribution induced by the -th ZFI model. Then, since :

Assume that a method for sampling from is available. Then, draw by sampling from . To sample conditional on from , draw from . If , then , otherwise . This is consistent with Algorithm 1.

For graphs of the statement of lemma follows naturally by induction.

Theorem 2 proof. Since is normal and minor-free, it holds that [thomason]. Find all biconnected components and for each construct a triconnected component tree in .

As described above, the time (number of steps) of inference or sampling is a sum of inference or sampling times of each triconnected component of . Let the set of all ’s triconnected components (that is, a union over all biconnected components) to consist of planar triconnected components of size with edges respectively, multiple bonds of edges and graphs. Then the complexity of inference or sampling is