A complexity dichotomy for partition functions with mixed signs
Partition functions, also known as homomorphism functions, form a rich family of graph invariants that contain combinatorial invariants such as the number of $k$-colourings or the number of independent sets of a graph and also the partition functions of certain “spin glass” models of statistical physics such as the Ising model.
Building on earlier work by Dyer and Greenhill and Bulatov and Grohe, we completely classify the computational complexity of partition functions. Our main result is a dichotomy theorem stating that every partition function is either computable in polynomial time or #P-complete. Partition functions are described by symmetric matrices with real entries, and we prove that it is decidable in polynomial time in terms of the matrix whether a given partition function is in polynomial time or #P-complete.
While in general it is very complicated to give an explicit algebraic or combinatorial description of the tractable cases, for partition functions described by Hadamard matrices (these turn out to be central in our proofs) we obtain a simple algebraic tractability criterion, which says that the tractable cases are those “representable” by a quadratic polynomial over the field $\mathbb{F}_2$.
We study the complexity of a family of graph invariants known as partition functions or homomorphism functions (see, for example, [14, 21, 22]). Many natural graph invariants can be expressed as homomorphism functions, among them the number of $k$-colourings, the number of independent sets, and the number of nowhere-zero $k$-flows of a graph. The functions also appear as the partition functions of certain “spin-glass” models of statistical physics such as the Ising model or the $q$-state Potts model.
Let $A \in \mathbb{R}^{m \times m}$ be a symmetric real matrix with entries $A_{i,j}$. The partition function $Z_A$ associates with every graph $G = (V, E)$ the real number
$$Z_A(G) = \sum_{\xi : V \to [m]} \; \prod_{\{u,v\} \in E} A_{\xi(u),\xi(v)}.$$
We refer to the row and column indices of the matrix, which are elements of $[m] := \{1, \ldots, m\}$, as spins. We use the term configuration to refer to a mapping $\xi : V \to [m]$ assigning a spin to each vertex of the graph. To avoid difficulties with models of real number computation, throughout this paper we restrict our attention to algebraic numbers. Let $\mathbb{R}_{\mathbb{A}}$ denote the set of algebraic real numbers.
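The definition can be made concrete with a small brute-force evaluator. The following is only a sketch: it assumes the standard form of the partition function, namely the sum over all configurations of the product, over the edges {u, v}, of the matrix entry indexed by the spins of u and v. The function name `partition_function`, the 0-based vertex and spin numbering, and the edge-list encoding are choices made for this illustration and are not taken from the paper.

```python
from itertools import product


def partition_function(A, num_vertices, edges):
    """Brute-force Z_A(G) for a graph on vertices 0..num_vertices-1.

    A is a symmetric matrix given as a list of rows; edges is a list of
    pairs (u, v), where loops and repeated pairs are allowed.
    The running time is m**num_vertices, so this is for tiny graphs only.
    """
    m = len(A)
    total = 0
    # A configuration assigns one spin (a row/column index of A) to each vertex.
    for xi in product(range(m), repeat=num_vertices):
        weight = 1
        for u, v in edges:
            weight *= A[xi[u]][xi[v]]
        total += weight
    return total


A = [[2, 3], [3, 5]]
z_edge = partition_function(A, 2, [(0, 1)])  # a single edge: sum of all entries of A
z_empty = partition_function(A, 3, [])       # no edges: every configuration has weight 1
```

For a single edge the value is the sum of all matrix entries, and for an edgeless graph on three vertices it is the number of configurations. The exponential running time is exactly what the dichotomy is about: for tractable matrices there is a polynomial-time alternative, for hard ones presumably not.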
Our main result is a dichotomy theorem stating that for every symmetric matrix the partition function is either computable in polynomial time or #P-hard. This extends earlier results by Dyer and Greenhill, who proved the dichotomy for 0-1-matrices, and Bulatov and Grohe, who proved it for nonnegative matrices. Therefore, in this paper we are mainly interested in matrices with negative entries.
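In the following, let $G$ be a graph with $n$ vertices. Consider the matrices
$$S = \begin{pmatrix} 1 & 1 \\ 1 & 0 \end{pmatrix} \quad\text{and}\quad C_3 = \begin{pmatrix} 0 & 1 & 1 \\ 1 & 0 & 1 \\ 1 & 1 & 0 \end{pmatrix}.$$
It is not hard to see that $Z_S(G)$ is the number of independent sets of a graph $G$ and $Z_{C_3}(G)$ is the number of 3-colourings of $G$. More generally, if $A$ is the adjacency matrix of a graph $H$ then $Z_A(G)$ is the number of homomorphisms from $G$ to $H$. Here we allow $H$ to have loops and parallel edges; the entry $A_{i,j}$ in the adjacency matrix is the number of edges from vertex $i$ to vertex $j$.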
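These two counting interpretations are easy to check on small graphs. The sketch below assumes the usual matrices for the two problems: a 2×2 matrix whose only zero entry forbids two adjacent vertices from both being "in the set", and the 3×3 matrix with zeros on the diagonal and ones elsewhere. The helper names and the 0-based encoding are illustrative only.

```python
from itertools import combinations, product


def Z(A, n, edges):
    """Brute-force partition function on vertices 0..n-1."""
    total = 0
    for xi in product(range(len(A)), repeat=n):
        w = 1
        for u, v in edges:
            w *= A[xi[u]][xi[v]]
        total += w
    return total


# Spin 1 means "vertex is in the set"; the zero entry forbids adjacent members.
S = [[1, 1], [1, 0]]
# Adjacent vertices must receive different colours.
C3 = [[0, 1, 1], [1, 0, 1], [1, 1, 0]]


def count_independent_sets(n, edges):
    """Direct count, used only as an independent cross-check."""
    count = 0
    for r in range(n + 1):
        for subset in combinations(range(n), r):
            chosen = set(subset)
            if not any(u in chosen and v in chosen for u, v in edges):
                count += 1
    return count


path = (3, [(0, 1), (1, 2)])
triangle = (3, [(0, 1), (1, 2), (0, 2)])

independent_sets_path = Z(S, *path)
colourings_triangle = Z(C3, *triangle)
```

On the path with three vertices the partition function agrees with the direct count of independent sets, and on the triangle it returns the six proper 3-colourings.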
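Let us turn to matrices with negative entries. Consider
$$H_2 = \begin{pmatrix} 1 & 1 \\ 1 & -1 \end{pmatrix}. \tag{1.1}$$
Then $\tfrac{1}{2} Z_{H_2}(G) + 2^{n-1}$, with $n$ the number of vertices of $G$, is the number of induced subgraphs of $G$ with an even number of edges. Hence up to a simple transformation, $Z_{H_2}$ counts induced subgraphs with an even number of edges. To see this, observe that for every configuration $\xi : V \to [2]$ the term $\prod_{\{u,v\} \in E} (H_2)_{\xi(u),\xi(v)}$ is $1$ if the subgraph of $G$ induced by $\xi^{-1}(2)$ has an even number of edges and $-1$ otherwise. Note that $H_2$ is the simplest nontrivial Hadamard matrix. Hadamard matrices will play a central role in this paper. Another simple example is the matrix
$$U = \begin{pmatrix} 1 & -1 \\ -1 & 1 \end{pmatrix}.$$
It is a nice exercise to verify that for connected $G$ the number $Z_U(G)$ is $2^n$ if $G$ is Eulerian and $0$ otherwise.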
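Both examples can be verified numerically. The sketch below assumes that the first matrix is the 2×2 Hadamard matrix with a single −1 in the lower right corner, and that the second has +1 on the diagonal and −1 off the diagonal; under these assumptions the even-edge count equals (Z + 2^n)/2, and the second partition function is 2^n exactly when all degrees are even. All names are hypothetical.

```python
from itertools import product


def Z(A, n, edges):
    """Brute-force partition function on vertices 0..n-1."""
    total = 0
    for xi in product(range(len(A)), repeat=n):
        w = 1
        for u, v in edges:
            w *= A[xi[u]][xi[v]]
        total += w
    return total


H2 = [[1, 1], [1, -1]]   # assumed form of the simplest nontrivial Hadamard matrix
U = [[1, -1], [-1, 1]]   # assumed form of the second example matrix


def even_induced_subgraphs(n, edges):
    """Number of vertex subsets whose induced subgraph has an even number of edges."""
    count = 0
    for mask in product((0, 1), repeat=n):
        inside = sum(1 for u, v in edges if mask[u] and mask[v])
        if inside % 2 == 0:
            count += 1
    return count


def all_degrees_even_value(n, edges):
    """2**n if every vertex has even degree, else 0 (a loop adds 2 to a degree)."""
    deg = [0] * n
    for u, v in edges:
        deg[u] += 1
        deg[v] += 1
    return 2 ** n if all(d % 2 == 0 for d in deg) else 0


path = (3, [(0, 1), (1, 2)])
triangle = (3, [(0, 1), (1, 2), (0, 2)])

even_path = (Z(H2, *path) + 2 ** 3) // 2
euler_triangle = Z(U, *triangle)
```

The second identity follows because the weight of a configuration under this matrix is (−1) raised to the sum, over the vertices with the second spin, of their degrees, so the sum factorises over the vertices.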
A less obvious example of a counting function that can be expressed in terms of a partition function is the number of nowhere-zero -flows of a graph. It can be shown that the number of nowhere-zero -flows of a graph with vertices is , where is the matrix with s on the diagonal and s everywhere else. This is a special case of a more general connection between partition functions for matrices with diagonal entries and off diagonal entries and certain values of the Tutte polynomial. This well-known connection can be derived by establishing certain contraction-deletion identities for the partition functions. For example, it follows from [24, Equations (3.5.4)] and [23, Equation (2.26) and (2.9)]
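The flow-counting connection can be tested on small graphs. The sketch below rests on an assumption that the extracted text no longer states explicitly: that the matrix in question is the k×k matrix with k−1 on the diagonal and −1 elsewhere, and that the normalising factor is k^(−n). Flows are taken with values in the integers modulo k, with each edge oriented from its first to its second endpoint; the function names are illustrative.

```python
from itertools import product


def Z(A, n, edges):
    """Brute-force partition function on vertices 0..n-1."""
    total = 0
    for xi in product(range(len(A)), repeat=n):
        w = 1
        for u, v in edges:
            w *= A[xi[u]][xi[v]]
        total += w
    return total


def flow_matrix(k):
    """Assumed matrix: k-1 on the diagonal, -1 off the diagonal."""
    return [[k - 1 if i == j else -1 for j in range(k)] for i in range(k)]


def flows_via_partition_function(k, n, edges):
    z = Z(flow_matrix(k), n, edges)
    return z // k ** n


def flows_direct(k, n, edges):
    """Count nowhere-zero flows with values in Z_k by direct enumeration."""
    count = 0
    for values in product(range(1, k), repeat=len(edges)):
        net = [0] * n
        for (u, v), val in zip(edges, values):
            net[u] -= val  # flow leaves u
            net[v] += val  # flow enters v
        if all(x % k == 0 for x in net):
            count += 1
    return count


triangle = (3, [(0, 1), (1, 2), (0, 2)])
single_edge = (2, [(0, 1)])
k4 = (4, [(0, 1), (0, 2), (0, 3), (1, 2), (1, 3), (2, 3)])

triangle_3_flows = flows_via_partition_function(3, *triangle)
```

On the triangle with k = 3 both counts give 2, and on a single edge (a bridge) both give 0, as expected for a graph that admits no nowhere-zero flow.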
Like the complexity of graph polynomials [2, 16, 18, 20] and constraint satisfaction problems [1, 3, 4, 5, 12, 15, 17], which are both closely related to our partition functions, the complexity of partition functions has already received quite a bit of attention. Dyer and Greenhill studied the complexity of counting homomorphisms from a given graph $G$ to a fixed graph $H$ without parallel edges. (Homomorphisms from $G$ to $H$ are also known as $H$-colourings of $G$.) They proved that the problem is in polynomial time if every connected component of $H$ is either a complete graph with a loop at every vertex or a complete bipartite graph, and the problem is #P-hard otherwise. Note that, in particular, this gives a complete classification of the complexity of computing $Z_A$ for symmetric 0-1-matrices $A$. Bulatov and Grohe extended this to symmetric nonnegative matrices. To state the result, it is convenient to introduce the notion of a block of a matrix $A$. To define the blocks of $A$, it is best to view $A$ as the adjacency matrix of a graph with weighted edges; then each non-bipartite connected component of this graph corresponds to one block and each bipartite connected component corresponds to two blocks. A formal definition will be given below. Bulatov and Grohe proved that computing the function $Z_A$ is in polynomial time if the row rank of every block of $A$ is 1 and #P-hard otherwise. The problem for matrices with negative entries was left open. In particular, Bulatov and Grohe asked for the complexity of the partition function $Z_{H_2}$ for the matrix $H_2$ introduced in (1.1). Note that $H_2$ is a matrix with one block of row rank 2. As we shall see, $Z_{H_2}$ is computable in polynomial time. Hence the complexity classification of Bulatov and Grohe does not extend to matrices with negative entries. Nevertheless, we obtain a dichotomy, and this is our main result.
Results and outline of the proofs
Our main theorem is the following.
Theorem 1.1 (Dichotomy Theorem)
Let be a symmetric matrix. Then the function either can be computed in polynomial time or is #P-hard.
Furthermore, there is a polynomial time algorithm that, given the matrix , decides whether is in polynomial time or #P-hard.
Let us call a matrix tractable if can be computed in polynomial time and hard if computing is #P-hard. Then the Dichotomy Theorem states that every symmetric matrix with entries in is either tractable or hard. The classification of matrices into tractable and hard ones can be made explicit, but is very complicated and does not give any real insights. Very roughly, a matrix is tractable if each of its blocks can be written as a tensor product of a positive matrix of row rank 1 and a tractable Hadamard matrix. Unfortunately, the real classification is not that simple, but for now let us focus on tractable Hadamard matrices. Recall that a Hadamard matrix is a square matrix with entries from such that is a diagonal matrix. Let be a symmetric Hadamard matrix with . Let be a bijective mapping, which we call an index mapping. We say that a multivariate polynomial over symmetrically represents with respect to if, for all , it holds that
For example, the -polynomial symmetrically represents the matrix with respect to the index mapping . The -polynomial symmetrically represents the matrix
with respect to the index mapping . The qualifier “symmetrically” in “symmetrically represents” indicates that the same index mapping is applied to both and . We will need to consider asymmetric representations later. Note that we can only represent a matrix by an -polynomial in this way if is a power of . In this case, for every index mapping there is a unique -polynomial symmetrically representing with respect to . We say that has a quadratic representation if there is an index mapping and an -polynomial of degree at most 2 that symmetrically represents with respect to . Our dichotomy theorem for Hadamard matrices is as follows.
Theorem 1.2 (Complexity Classification for Hadamard Matrices)
A symmetric Hadamard matrix is tractable if it has a quadratic representation and hard otherwise.
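The notion of a quadratic representation can be illustrated as follows. This is a sketch under the assumption that "represents" means: the polynomial evaluates to 1 at the bit vectors of row i and column j exactly when the matrix entry is −1. The index mapping used here (position of a bit tuple in lexicographic order) and the two example polynomials are choices made for this illustration.

```python
from itertools import product


def matrix_from_polynomial(h, k):
    """2^k x 2^k matrix with entry (-1)^h(x, y); rows and columns are bit tuples."""
    points = list(product((0, 1), repeat=k))  # hypothetical index mapping
    return [[-1 if h(x, y) % 2 else 1 for y in points] for x in points]


def is_hadamard(H):
    """Check that distinct rows are orthogonal and all entries are +1 or -1."""
    n = len(H)
    if any(e not in (1, -1) for row in H for e in row):
        return False
    return all(
        sum(H[i][t] * H[j][t] for t in range(n)) == (n if i == j else 0)
        for i in range(n) for j in range(n)
    )


def Z(A, n, edges):
    """Brute-force partition function on vertices 0..n-1."""
    total = 0
    for xi in product(range(len(A)), repeat=n):
        w = 1
        for u, v in edges:
            w *= A[xi[u]][xi[v]]
        total += w
    return total


def zeros_minus_ones(h, k, n, edges):
    """#{assignments with q = 0} - #{assignments with q = 1}, q = sum of h over edges."""
    points = list(product((0, 1), repeat=k))
    diff = 0
    for X in product(points, repeat=n):
        q = sum(h(X[u], X[v]) for u, v in edges) % 2
        diff += 1 if q == 0 else -1
    return diff


h_small = lambda x, y: x[0] * y[0]                 # degree 2, one bit per index
h_large = lambda x, y: x[0] * y[1] + x[1] * y[0]   # degree 2, two bits per index

M2 = matrix_from_polynomial(h_small, 1)
M4 = matrix_from_polynomial(h_large, 2)
path = (3, [(0, 1), (1, 2)])
```

When the representing polynomial has degree at most 2, the sum of its copies over the edges of the input graph is again a quadratic polynomial, so the partition function becomes a difference of two solution counts of a quadratic equation. The enumeration in `zeros_minus_ones` is exponential and only demonstrates the identity; it is not the polynomial-time counting procedure.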
Hence, in particular, the matrices and are tractable. The tractability part of Theorem 1.2 is an easy consequence of the fact that counting the number of solutions of a quadratic equation over $\mathbb{F}_2$ (or any other finite field) is in polynomial time (see [13, 19]). The difficulty in proving the hardness part is that the degree of a polynomial representing a Hadamard matrix is not invariant under the choice of the index mapping . However, for normalised Hadamard matrices, that is, Hadamard matrices whose first row and column consist entirely of $+1$s, we can show that either they are hard or they can be written as an iterated tensor product of the two simple Hadamard matrices and . This gives us a canonical index mapping and hence a canonical representation by a quadratic $\mathbb{F}_2$-polynomial. Unfortunately, we could not find a direct reduction from arbitrary to normalised Hadamard matrices. To get a reduction, we first need to work with a generalisation of partition functions. If we view the matrix defining a partition function as an edge-weighted graph, then this is the natural generalisation to graphs with edge and vertex weights. Let be a symmetric matrix and a diagonal matrix, which may be viewed as assigning the weight to each vertex . We define the partition function by
for every graph . As a matter of fact, we need a further generalisation that takes into account that vertices of even and odd degree behave differently when it comes to negative edge weights. For a symmetric matrix and two diagonal matrices we let
for every graph . We call the parity-distinguishing partition function (pdpf) defined by . We show that the problem of computing is always either polynomial-time solvable or #P-hard, and we call a triple tractable or hard accordingly. Obviously, if are identity matrices, then we have .
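A brute-force version of the parity-distinguishing partition function may help to fix ideas. The sketch assumes the following form of the definition, which the extracted text no longer shows: each configuration contributes the product of the edge weights, times the diagonal entry of the first diagonal matrix for every vertex of even degree, times the diagonal entry of the second diagonal matrix for every vertex of odd degree. The names `pdpf`, `D` and `O` are illustrative.

```python
from itertools import product


def pdpf(A, D, O, n, edges):
    """Brute-force parity-distinguishing partition function on vertices 0..n-1.

    D weights the spins of even-degree vertices, O those of odd-degree vertices.
    """
    deg = [0] * n
    for u, v in edges:
        deg[u] += 1
        deg[v] += 1  # a loop (u, u) contributes 2 to the degree of u
    total = 0
    for xi in product(range(len(A)), repeat=n):
        w = 1
        for u, v in edges:
            w *= A[xi[u]][xi[v]]
        for v in range(n):
            weights = D if deg[v] % 2 == 0 else O
            w *= weights[xi[v]][xi[v]]
        total += w
    return total


I2 = [[1, 0], [0, 1]]
ZERO = [[0, 0], [0, 0]]
H2 = [[1, 1], [1, -1]]
U = [[1, -1], [-1, 1]]

path = (3, [(0, 1), (1, 2)])              # degrees 1, 2, 1
triangle = (3, [(0, 1), (1, 2), (0, 2)])  # all degrees even

plain_value = pdpf(H2, I2, I2, *path)      # identity weights: the plain partition function
killed_value = pdpf(H2, I2, ZERO, *path)   # zero weight on odd-degree vertices
```

With identity matrices for both weight matrices the function reduces to the plain partition function. Setting the odd-degree weights to zero makes every graph with an odd-degree vertex evaluate to 0, which shows how the two weight matrices separate the parities.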
Returning to the outline of the proof of Theorem 1.2, we can show that, for every Hadamard matrix , either is hard or there is a normalised Hadamard matrix and diagonal matrices such that computing is polynomial time equivalent to computing . Actually, it turns out that we may assume to be an identity matrix and to be a diagonal matrix with entries only. For the normalised matrix we have a canonical index mapping, and we can use this to represent the matrices and over . Then we obtain a tractability criterion that essentially says that is tractable if the representation of is quadratic and that of is linear (remember that is an identity matrix, which we do not have to worry about).
For the proof of the Dichotomy Theorem 1.1, we actually need an extension of Theorem 1.2 that states a dichotomy for parity-distinguishing partition functions , where is a “bipartisation” of a Hadamard matrix (this notion will be defined later). The proof sketched above can be generalised to give this extension. Then to prove the Dichotomy Theorem, we first reduce the problem of computing to the problem of computing for the connected components of . The next step is to eliminate duplicate rows and columns in the matrix, which can be done at the price of introducing vertex weights. Using the classification theorem for nonnegative matrices and some gadgetry, from there we get the desired reduction to parity-distinguishing partition functions for bipartisations of Hadamard matrices.
Let us finally mention that our proof shows that the Dichotomy Theorem not only holds for simple partition functions , but also for vertex-weighted and parity-distinguishing partition functions.
Let be an -matrix. The entries of are denoted by . The th row of is denoted by , and the th column by . By we denote the matrix obtained from by taking the absolute value of each entry in .
Let be the identity matrix and let be the matrix that is all zero except that for .
The Hadamard product of two matrices and , written , is the component-wise product in which . denotes the Hadamard product of and the matrix in which every entry is .
We write to denote the inner product (or dot product) of two vectors in .
Recall that the tensor product (or Kronecker product) of an matrix and an matrix is an matrix . For , , and , we have . It is sometimes useful to think of the product in terms of “blocks” or “tiles” of size .
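As a small illustration of the tensor product and its tile structure, the following sketch builds the Kronecker product of two matrices given as lists of rows. The function name `kron` is a choice made here; indices are 0-based, so the entry in row i·(rows of B) + k and column j·(columns of B) + l is the product of entry (i, j) of the first matrix and entry (k, l) of the second.

```python
def kron(A, B):
    """Kronecker product of two matrices given as lists of rows."""
    # For each row of A and each row of B, lay out the products tile by tile.
    return [
        [a * b for a in row_a for b in row_b]
        for row_a in A
        for row_b in B
    ]


H2 = [[1, 1], [1, -1]]
H2_squared = kron(H2, H2)
rectangular = kron([[1, 2]], [[3], [4]])  # a 1x2 matrix times a 2x1 matrix
```

Each tile of the result is a copy of the second matrix scaled by one entry of the first, which is the picture used when Hadamard matrices are decomposed as iterated tensor products.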
For index sets , we let be the -submatrix with entries for , . The matrix is indecomposable if there are no index sets such that , and for all . Note that, in particular, an indecomposable matrix has at least one nonzero entry. The blocks of a matrix are the maximal indecomposable submatrices. For every symmetric matrix we can define a graph with vertex set and edge set . We call the matrix bipartite if the graph is bipartite. We call connected if the graph is connected. The connected components of are the maximal submatrices such that , the subgraph of induced by , is a connected component. If the connected component is not bipartite then is a block of . If the connected component is bipartite and contains an edge then has the form , where is a block of . Furthermore, all blocks of arise from connected components in this way.
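The correspondence between connected components and blocks translates directly into a procedure. The sketch below follows the description above: a non-bipartite component yields one block, a bipartite component with at least one edge yields two. It also includes a rank-one test, which is the criterion attributed to Bulatov and Grohe for nonnegative matrices. All function names are hypothetical, indices are 0-based, and entries are assumed to be exact (for example integers).

```python
def components(A):
    """Connected components of the graph with an edge {i, j} whenever A[i][j] != 0."""
    m, seen, comps = len(A), set(), []
    for s in range(m):
        if s in seen:
            continue
        seen.add(s)
        comp, stack = [], [s]
        while stack:
            i = stack.pop()
            comp.append(i)
            for j in range(m):
                if A[i][j] != 0 and j not in seen:
                    seen.add(j)
                    stack.append(j)
        comps.append(sorted(comp))
    return comps


def two_colouring(A, comp):
    """Return the two sides (I, J) if the component is bipartite, else None."""
    colour, stack = {comp[0]: 0}, [comp[0]]
    while stack:
        i = stack.pop()
        for j in comp:
            if A[i][j] != 0:
                if j not in colour:
                    colour[j] = 1 - colour[i]
                    stack.append(j)
                elif colour[j] == colour[i]:
                    return None  # odd cycle or loop
    return ([i for i in comp if colour[i] == 0], [i for i in comp if colour[i] == 1])


def submatrix(A, rows, cols):
    return [[A[i][j] for j in cols] for i in rows]


def blocks(A):
    result = []
    for comp in components(A):
        if not any(A[i][j] != 0 for i in comp for j in comp):
            continue  # isolated index without a loop: no block
        parts = two_colouring(A, comp)
        if parts is None:
            result.append(submatrix(A, comp, comp))
        else:
            I, J = parts
            result.append(submatrix(A, I, J))
            result.append(submatrix(A, J, I))
    return result


def rank_at_most_one(B):
    """All 2x2 minors vanish."""
    rows, cols = range(len(B)), range(len(B[0]))
    return all(B[i][j] * B[k][l] == B[i][l] * B[k][j]
               for i in rows for k in rows for j in cols for l in cols)


def all_blocks_rank_one(A):
    return all(rank_at_most_one(B) for B in blocks(A))
```

For instance, the matrix with rows (0, 1) and (1, 0) has two 1×1 blocks, while the matrix with rows (1, 1) and (1, 0) is a single block of rank 2.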
For two counting problems and , we write if there is a polynomial time Turing reduction from to . If and holds, we write . For a symmetric matrix and diagonal matrices of the same size, (, ) denotes the problem of computing (, , respectively) for an input graph (which need not be a simple graph; it may have loops and/or multi-edges).
2 Hadamard matrices
The main focus of this section is to prove Theorem 2.2 below which is a strengthened version of Theorem 1.2. Suppose that is an Hadamard matrix and that and are subsets of . It will be useful to work with the bipartisation of , and which we define as follows. Let and let be the matrix defined by the following equations for : , , , and . The matrix can be broken into four “tiles” as follows.
Let . Note that the matrix can be decomposed naturally in terms of the tiles and .
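The tile structure can be sketched in a few lines. The construction below is an assumption reconstructed from the description of four tiles: the two diagonal tiles are zero, the upper right tile is the Hadamard matrix and the lower left tile is its transpose. The combined index set is assumed to consist of the selected row indices together with the selected column indices shifted by n. Names and 0-based indexing are illustrative.

```python
def bipartisation(H, row_subset, col_subset):
    """Return (M, Lam): the 2n x 2n bipartised matrix and the combined index set."""
    n = len(H)
    M = [[0] * (2 * n) for _ in range(2 * n)]
    for i in range(n):
        for j in range(n):
            M[i][n + j] = H[i][j]  # upper right tile: H
            M[n + j][i] = H[i][j]  # lower left tile: transpose of H
    Lam = set(row_subset) | {n + j for j in col_subset}
    return M, Lam


def selector_matrix(m, Lam):
    """Diagonal 0/1 matrix with a 1 in position (j, j) exactly for j in Lam."""
    return [[1 if i == j and i in Lam else 0 for j in range(m)] for i in range(m)]


H2 = [[1, 1], [1, -1]]
M, Lam = bipartisation(H2, {0}, {1})
I_Lam = selector_matrix(len(M), Lam)
```

The resulting matrix is symmetric even when the original matrix is not, and its underlying graph is bipartite with the row indices on one side and the column indices on the other.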
We identify a set of conditions on , and that determine whether or not the problem can be computed in polynomial time. We will see how this implies Theorem 1.2.
The Group Condition. For an matrix and a row index , let
The group condition for is:
For all , both and .
The group condition gets its name from the fact that the condition implies that is an Abelian group (see Lemma 7.1). As all elements of this group have order 2, the group condition gives us some information about the order of such matrices, as the following lemma (which we prove later in Section 7) shows:
Lemma 2.1
Let $H$ be an $n \times n$ Hadamard matrix. If $H$ satisfies (GC) then $n = 2^k$ for some integer $k$.
The Representability Conditions. We describe Hadamard matrices satisfying (GC) by -polynomials. By Lemma 2.1 these matrices have order . We extend our notion of “symmetric representation”: Let and be index mappings (i.e. bijective mappings) and and . A polynomial over represents with respect to and if for all it holds that
So a symmetric representation is just a representation with . We say that the set is linear with respect to if there is a linear subspace such that . Note that, if is linear, then for some . We may therefore define a coordinatisation of (with respect to ) as a linear map such that , that is, is just the image of the concatenated mapping . We define the notion of linearity of with respect to and the coordinatisation of with respect to similarly. For a permutation we use the shorthand .
The following conditions stipulate the representability (R) of by -polynomials, the linearity (L) of the sets and , and the appropriate degree restrictions on the associated polynomials (D).
(R) There are index mappings and and a permutation such that (w.r.t. and ) the matrix is represented by a polynomial of the form
Moreover, if is non-empty, then . Similarly, if is non-empty, then . Finally, if is symmetric and , then and .
(L) and are linear with respect to and respectively.
(D) Either is empty or there is a coordinatisation of w.r.t. such that the polynomial has degree at most . Similarly, either is empty or there is a coordinatisation of w.r.t. such that the polynomial has degree at most . Finally, if is symmetric and is nonempty then .
Actually, it turns out that condition (D) is invariant under the choice of the coordinatisations . However, the conditions are not invariant under the choice of the representation , and this is a major source of technical problems.
Before we can apply the conditions (R), (L) and (D) we deal with one technical issue. Let be an Hadamard matrix and let be subsets of indices. Let be the bipartisation of , and . We say that is positive for and if there is an entry such that (1) or , (2) or , and (3) if is symmetric and then . Otherwise, note that $-H$ is positive for and . Since , the problems and have equivalent complexity, so we lose no generality by restricting attention to the positive case, which is helpful for a technical reason.
We can now state the theorem which is proved in this section.
Theorem 2.2
Let be an Hadamard matrix and let be subsets of indices. Let be the bipartisation of , and and let . If is positive for and then is polynomial-time computable if, and only if, and satisfy the group condition (GC) and conditions (R), (L), and (D). Otherwise is #P-hard. If is not positive for and then is polynomial-time computable if, and only if, $-H$, and satisfy the group condition (GC) and conditions (R), (L), and (D). Otherwise is #P-hard. There is a polynomial-time algorithm that takes input , and and decides whether is polynomial-time computable or #P-hard.
The theorem is proved using a sequence of lemmas. Proof sketches of these lemmas will be given in this section and full proofs will be given later in Section 7.
Lemma 2.3 (Group Condition Lemma)
Let be an Hadamard matrix and let be subsets of indices. Let be the bipartisation of , and and let . If does not satisfy (GC) then is #P-hard. There is a polynomial-time algorithm that determines whether satisfies (GC).
Proof sketch. For any integer and a symmetric non-negative matrix , which depends upon , the proof uses gadgetry to transform an input to into an input to . The fact that does not satisfy (GC) is used to show that, as long as is sufficiently large with respect to , then has a block of rank greater than one. By a result of Bulatov and Grohe, is #P-hard, so is #P-hard.
Lemma 2.4 (Polynomial Representation Lemma)
Let be an Hadamard matrix and subsets of indices. Suppose that satisfies (GC) and that is positive for and . Then the Representability Condition (R) is satisfied. There is a polynomial-time algorithm that computes the representation.
Proof sketch. The representation is constructed inductively. First, permutations are used to transform into a normalised matrix , that is, a Hadamard matrix whose first row and column consist entirely of $+1$s, which still satisfies (GC). We then show that there is a permutation of which can be expressed as the tensor product of a simple Hadamard matrix (either or ) and a smaller normalised symmetric Hadamard matrix . By induction, we construct a representation for and use this to construct a representation for the normalised matrix of the form for a permutation . We use this to construct a representation for .
Lemma 2.5 (Linearity Lemma)
Let be an Hadamard matrix and subsets of indices. Let be the bipartisation of , and and let . Suppose that (GC) and (R) are satisfied. Then the problem is #P-hard unless the Linearity Condition (L) holds. There is a polynomial-time algorithm that determines whether (L) holds.
Proof sketch. For a symmetric non-negative matrix , which depends upon , the proof uses gadgetry to transform an input to to an input of . By (R), there are bijective index mappings and and a permutation such that (w.r.t. and ) the matrix is represented by a polynomial of the appropriate form. Let be the inverse of and be the inverse of . Let and . We show that either is #P-hard or (L) is satisfied. In particular, the assumption that is not #P-hard means that its blocks all have rank 1 by the result of Bulatov and Grohe. We use this fact to show that is a linear subspace of and that is a linear subspace of . To show that is a linear subspace of , we use to construct an appropriate linear subspace and compare Fourier coefficients to see that it is in fact itself.
Lemma 2.6 (Degree Lemma)
Let be an Hadamard matrix and subsets of indices. Let be the bipartisation of , and and let . Suppose that (GC), (R) and (L) are satisfied. Then is #P-hard unless the Degree Condition (D) holds. There is a polynomial-time algorithm that determines whether (D) holds.
Proof sketch. For any (even) integer and a symmetric non-negative matrix , which depends upon , the proof uses gadgetry to transform an input to into an input to . Using the representation of , a coordinatisation with respect to , and a coordinatisation with respect to , some of the entries of the matrix may be expressed as sums, over elements in , for some , of appropriate powers of . We study properties of polynomials , discovering that the number of roots of a certain polynomial , which is derived from , depends upon the degree of . From this we can show that if (D) does not hold then there is an even such that is #P-hard.
Proof of Theorem 2.2.
By the equivalence of the problems and we can assume that is positive for and . The hardness part follows directly from the lemmas above. We shall give the proof for the tractability part. Given , and satisfying (GC), (R), (L) and (D), we shall show how to compute for an input graph in polynomial time.
Note first that unless is bipartite. If has connected components , then
Therefore, it suffices to give the proof for connected bipartite graphs. Let be such a graph with vertex bipartition . Let be the set of odd-degree vertices in and let and be the corresponding subsets of and . Let and . We have
As is bipartite and connected this sum splits into for values
We will show how to compute . The computation of the value is similar.
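The two reductions used at the start of this proof (the value vanishes on non-bipartite graphs, and it is multiplicative over connected components) can be observed on a toy instance. The sketch below simplifies the setting: it uses a plain partition function, which corresponds to the special case where both vertex-weight matrices are identities, and a hard-coded 4×4 matrix that is assumed to be the bipartisation of the 2×2 Hadamard matrix with a single −1 entry. Names are illustrative.

```python
from itertools import product


def Z(A, n, edges):
    """Brute-force partition function on vertices 0..n-1."""
    total = 0
    for xi in product(range(len(A)), repeat=n):
        w = 1
        for u, v in edges:
            w *= A[xi[u]][xi[v]]
        total += w
    return total


# Assumed bipartisation: zero diagonal tiles, Hadamard tiles off the diagonal.
M = [
    [0, 0, 1, 1],
    [0, 0, 1, -1],
    [1, 1, 0, 0],
    [1, -1, 0, 0],
]

edge = (2, [(0, 1)])
two_disjoint_edges = (4, [(0, 1), (2, 3)])
triangle = (3, [(0, 1), (1, 2), (0, 2)])

z_edge = Z(M, *edge)
z_two_edges = Z(M, *two_disjoint_edges)
z_triangle = Z(M, *triangle)
```

Any configuration on a graph with an odd cycle must place two adjacent vertices in the same tile, where every entry is zero, so the triangle evaluates to 0. The product over the edges factorises over the components, which gives the square of the single-edge value for two disjoint edges.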
Fix configurations and and let be the index mappings and the -polynomial representing as given in condition (R). Let be the inverse of and let be the inverse of . Let and . Then and induce a configuration defined by
which implies, for all that iff . Let and be coordinatisations of and w.r.t. and satisfying (L) and (D). We can simplify
Define, for , sets
Then . Therefore, it remains to show how to compute the values . Define, for each , a tuple and let be the -polynomial
Here the second equality follows from the definition of the polynomial given in condition (R) and the fact that the terms and in the definition of appear exactly and many times in . Therefore, these terms cancel for all even degree vertices.
Let denote the set of variables in and for mappings we use the expression as a shorthand and define the -sum . We find that can be expressed by
By equation (2.4) we are interested only in those assignments of the variables of which satisfy and for all and . With and for some appropriate , we introduce variable vectors and for all and . If or then we can express the term in in terms of these new variables. In particular, let