Contraction of matchgate tensor networks on non-planar graphs

Sergey Bravyi, IBM T.J. Watson Research Center, Yorktown Heights, NY 10598

Abstract

A tensor network is a product of tensors associated with vertices of some graph $G$ such that every edge of $G$ represents a summation (contraction) over a matching pair of indexes. It was shown recently by Valiant, Cai, and Choudhary that tensor networks can be efficiently contracted on planar graphs if components of every tensor obey a system of quadratic equations known as matchgate identities. Such tensors are referred to as matchgate tensors. The present paper provides an alternative approach to contraction of matchgate tensor networks that easily extends to non-planar graphs. Specifically, it is shown that a matchgate tensor network on a graph $G$ of genus $g$ with $n$ vertices can be contracted in time $T = \mathrm{poly}(n) + 2^{2g} O(m^3)$, where $m$ is the minimum number of edges one has to remove from $G$ in order to make it planar. Our approach makes use of anticommuting (Grassmann) variables and Gaussian integrals.

1 Introduction and summary of results

Contraction of tensor networks is a computational problem having a variety of applications ranging from simulation of classical and quantum spin systems [1, 2, 3, 4, 5] to computing the capacity of data storage devices [6]. Given this wide range of applications, it is important to identify special classes of tensor networks that can be contracted efficiently. For example, Markov and Shi found a linear time algorithm for contraction of tensor networks on trees and graphs with a bounded treewidth [1]. An important class of graphs that do not fall into this category are planar graphs. Although contraction of an arbitrary tensor network on a planar graph is a hard problem, it has been known for a long time that the generating function of perfect matchings known as the matching sum can be computed efficiently on planar graphs for arbitrary (complex) weights using the Fisher-Kasteleyn-Temperley (FKT) method, see [7, 8, 9]. It is based on the observation that the matching sum can be related to the Pfaffian of a weighted adjacency matrix (known as the Tutte matrix). The FKT method also yields an efficient algorithm for computing the partition function of spin models reducible to the matching sum, most notably, the Ising model on a planar graph [10]. Recently the FKT method has been generalized to the matching sum of non-planar graphs with a bounded genus [11, 12, 13].

Computing the matching sum can be regarded as a special case of a tensor network contraction. It is therefore desirable to characterize precisely the class of tensor networks that can be contracted efficiently using the FKT method. This problem has been solved by Valiant [14, 15] and in the subsequent works by Cai and Choudhary [16, 17, 18]. Unfortunately, it turned out that the matching sum of planar graphs essentially provides the most general tensor network in this class, see [16, 18]. Following [16] we shall call such networks matchgate tensor networks, or simply matchgate networks. A surprising discovery made in [17] is that matchgate tensors can be characterized by a simple system of quadratic equations known as matchgate identities which does not make references to any graph theoretical concepts. Specifically, given a tensor $T$ of rank $n$ with complex-valued components $T(x) = T_{x_1 x_2 \ldots x_n}$ labeled by $n$-bit strings $x \in \{0,1\}^n$ one calls $T$ a matchgate tensor, or simply a matchgate, if

$$\sum_{a \,:\, x_a \ne y_a} T(x \oplus e^a)\, T(y \oplus e^a)\, (-1)^{x_1 + \ldots + x_{a-1} + y_1 + \ldots + y_{a-1}} = 0 \quad \text{for all } x, y \in \{0,1\}^n. \tag{1}$$

Here $e^a$ denotes a string in which the $a$-th bit is $1$ and all other bits are $0$. The symbol $\oplus$ stands for a bit-wise XOR of binary strings. For example, a simple algebra shows that a tensor of rank $n = 1, 2, 3$ is a matchgate iff it is either even or odd (a tensor $T$ is called even (odd) if $T(x) = 0$ for all strings $x$ with odd (even) Hamming weight). Furthermore, an even tensor of rank $n = 4$ is a matchgate iff

$$T(0000)\,T(1111) - T(1100)\,T(0011) + T(1010)\,T(0101) - T(1001)\,T(0110) = 0. \tag{2}$$

A matchgate network is a tensor network in which every tensor is a matchgate.
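The matchgate identities can be checked numerically on small tensors. The sketch below is illustrative only: it assumes the identities take the standard form $\sum_{a: x_a \ne y_a} T(x \oplus e^a) T(y \oplus e^a) (-1)^{x_1+\ldots+x_{a-1}+y_1+\ldots+y_{a-1}} = 0$ for all pairs $x, y$, and the helper names `matchgate_violation` and `bits` as well as the numerical values are hypothetical.

```python
from itertools import product

def matchgate_violation(T, n):
    """Largest |left-hand side| of the assumed matchgate identity over all x, y.

    T maps n-bit tuples to numbers; missing keys are treated as zero.
    """
    worst = 0
    for x in product((0, 1), repeat=n):
        for y in product((0, 1), repeat=n):
            total = 0
            for a in range(n):
                if x[a] == y[a]:
                    continue  # the sum runs only over positions where x and y differ
                xa = x[:a] + (1 - x[a],) + x[a + 1:]  # x XOR e^a
                ya = y[:a] + (1 - y[a],) + y[a + 1:]  # y XOR e^a
                sign = (-1) ** (sum(x[:a]) + sum(y[:a]))
                total += sign * T.get(xa, 0) * T.get(ya, 0)
            worst = max(worst, abs(total))
    return worst

def bits(s):
    """Convert a string such as '1100' into a tuple of integers."""
    return tuple(int(ch) for ch in s)

# Hypothetical even tensor of rank 4 with arbitrary weight-2 components.
T = {bits("0000"): 1, bits("1100"): 2, bits("1010"): 3, bits("1001"): 5,
     bits("0110"): 7, bits("0101"): 11, bits("0011"): 13}
# Choose T(1111) so that the rank-4 condition holds: 2*13 - 3*11 + 5*7 = 28.
T[bits("1111")] = 2 * 13 - 3 * 11 + 5 * 7

good = matchgate_violation(T, 4)        # all identities are satisfied
broken = dict(T)
broken[bits("1111")] += 1               # perturb a single component
bad = matchgate_violation(broken, 4)    # some identity is now violated
```

The check enumerates all $4^n$ pairs of strings, so it is only practical for small ranks; it is meant as a sanity test, not as part of a contraction algorithm.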

The purpose of the present paper is two-fold. Firstly, we develop a formalism that allows one to perform partial contractions of matchgate networks, for example, contraction of a single edge combining its endpoints into a single vertex. More generally, the formalism allows one to contract any connected planar subgraph of the network into a single vertex by "integrating out" all internal edges of that subgraph. The number of parameters describing the contracted tensor assigned to the new vertex is independent of the size of the subgraph. It depends only on the number of "external" edges connecting the subgraph to the rest of the network. This is the main distinction of our formalism compared to the original matchgate formalism of Valiant [14]. The ability to implement partial contractions may be useful for designing efficient parallel contraction algorithms. More importantly, we show that it yields a faster contraction algorithm for matchgate networks on non-planar graphs.

Our formalism makes use of anticommuting (Grassmann) variables such that a tensor of rank $n$ is represented by a generating function of $n$ Grassmann variables. A matchgate tensor is shown to have a Gaussian generating function that depends on $O(n^2)$ parameters. The matchgate identities Eq. (1) can be described by a first-order differential equation making manifest their underlying symmetry. Contraction of tensors is equivalent to convolution of their generating functions. Contraction of matchgate tensors can be performed efficiently using the standard Gaussian integration technique. We use the formalism to prove that a tensor satisfies matchgate identities if and only if it can be represented by the matching sum on some planar graph. It reproduces the result obtained earlier by Cai and Choudhary [17, 18]. Our approach also reveals that the notion of a matchgate tensor is equivalent to the one of a Gaussian operator introduced in [19] in the context of quantum computation.

Secondly, we describe an improved algorithm for contraction of matchgate networks on non-planar graphs. Let $\Sigma$ be a standard oriented closed surface of genus $g$, i.e., a sphere with $g$ handles.

Definition 1.

Given a graph $G = (V, E)$ embedded into a surface $\Sigma$ we shall say that $G$ is contractible if there exists a region $D \subset \Sigma$ with topology of a disk containing all vertices and all edges of $G$. A subset of edges $M \subseteq E$ is called a planar cut of $G$ if the graph $G_M = (V, E \setminus M)$ is contractible.

A contraction value of a tensor network $\mathcal{T}$ is a complex number $c(\mathcal{T})$ obtained by contracting all tensors of $\mathcal{T}$. Our main result is as follows.

Theorem 1.

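Let $\mathcal{T}$ be a matchgate tensor network on a graph $G = (V, E)$ with $n$ vertices embedded into a surface $\Sigma$ of genus $g$. Assume we are given a planar cut of $G$ with $m$ edges. Then the contraction value $c(\mathcal{T})$ can be computed in time $T = O((n+m)^6) + 2^{2g} O(m^3)$. If $G$ has a bounded vertex degree, one can compute $c(\mathcal{T})$ in time $T = O((n+m)^3) + 2^{2g} O(m^3)$.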

If a network has a small planar cut, $m \ll n$, the theorem provides a speedup for computing the matching sum and the partition function of the Ising model compared to the FKT method. For example, computing the matching sum of a graph $G$ as above by the FKT method would require time $T = 2^{2g} O(n^3)$ since the matching sum is expressed as a linear combination of $2^{2g}$ Pfaffians where each Pfaffian involves a matrix of size $n \times n$, see [11, 12, 13], and since the Pfaffian of an $n \times n$ matrix can be computed in time $O(n^3)$, see Remark 2 below. In contrast to the FKT method, our algorithm is divided into two stages. At the first stage, which requires time $O((n+m)^6)$, one performs a partial contraction of the planar subgraph determined by the given planar cut $M$, see Def. 1. The contraction reduces the number of edges in a network down to $m$ without changing the genus (if the initial network represents a matching sum, the first stage of the algorithm would require only time $O((n+m)^3)$). The first stage of the algorithm yields a new network $\mathcal{T}'$ with a single vertex and $m$ self-loops such that $c(\mathcal{T}') = c(\mathcal{T})$. At the second stage one contracts the network $\mathcal{T}'$ by expressing the contraction value $c(\mathcal{T}')$ as a linear combination of $2^{2g}$ Pfaffians similar to the FKT method. However each Pfaffian involves a matrix of size only $O(m) \times O(m)$.

Remark 1: The statement of the theorem assumes that all tensors are specified by their generating functions. Thus a matchgate tensor of rank $d$ can be specified by $O(d^2)$ parameters, see Section 3 for details. The ordering of indexes in any tensor must be consistent with the orientation of a surface. See Section 2.1 for a formal definition of tensor networks.

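Remark 2: Recall that the Pfaffian of an $n \times n$ antisymmetric matrix $A$ is defined as

$$\mathrm{Pf}(A) = \begin{cases} 0 & \text{if } n \text{ is odd}, \\ \dfrac{1}{2^k k!} \sum_{\sigma \in S_{2k}} \mathrm{sgn}(\sigma)\, A_{\sigma_1 \sigma_2} A_{\sigma_3 \sigma_4} \cdots A_{\sigma_{2k-1} \sigma_{2k}} & \text{if } n = 2k, \end{cases}$$

where $S_{2k}$ is the symmetric group and $\mathrm{sgn}(\sigma) = \pm 1$ is the parity of a permutation $\sigma$. One can efficiently compute the Pfaffian up to a sign using the identity $\mathrm{Pf}(A)^2 = \det(A)$. However, in order to compute a linear combination of several Pfaffians one needs to know the sign exactly. One can directly compute $\mathrm{Pf}(A)$ using the combinatorial algorithm by Mahajan et al [20] in time $O(n^4)$. Alternatively, one can use Gaussian elimination to find an invertible matrix $U$ such that $U^T A U$ is block-diagonal with all blocks of size $2 \times 2$. It requires time $O(n^3)$. Then $\mathrm{Pf}(A)$ can be computed using the identity $\mathrm{Pf}(U^T A U) = \det(U)\, \mathrm{Pf}(A)$. This method yields an $O(n^3)$ algorithm, although it is less computationally stable than the combinatorial algorithm of [20].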
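For small matrices the Pfaffian, including its sign, can be obtained from the standard expansion along the first row, $\mathrm{Pf}(A) = \sum_{j \ge 2} (-1)^{j} A_{1j}\, \mathrm{Pf}(A_{\hat{1}\hat{j}})$. The sketch below uses this expansion purely for illustration; it is neither the combinatorial algorithm of [20] nor the Gaussian elimination method mentioned in Remark 2, and it takes exponential time. The function name `pfaffian` and the sample matrix are hypothetical.

```python
def pfaffian(A):
    """Pfaffian of an antisymmetric matrix given as a list of lists.

    Uses recursive expansion along the first row (exponential time),
    so it is suitable only for small illustrative examples.
    """
    n = len(A)
    if n == 0:
        return 1          # Pfaffian of the empty matrix
    if n % 2 == 1:
        return 0          # odd-sized antisymmetric matrices have zero Pfaffian
    total = 0
    for j in range(1, n):
        if A[0][j] == 0:
            continue
        keep = [k for k in range(n) if k not in (0, j)]
        minor = [[A[r][c] for c in keep] for r in keep]
        # With 0-based indices the sign of the j-th term is (-1)^(j+1).
        total += (-1) ** (j + 1) * A[0][j] * pfaffian(minor)
    return total

# Sample 4x4 antisymmetric matrix (values chosen arbitrarily).
A = [[0, 2, 3, 5],
     [-2, 0, 7, 11],
     [-3, -7, 0, 13],
     [-5, -11, -13, 0]]
value = pfaffian(A)   # A12*A34 - A13*A24 + A14*A23 = 26 - 33 + 35
```

Integer or `fractions.Fraction` entries keep the computation exact, which matters when the sign of the Pfaffian is needed.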

2 Some definitions and notations

2.1 Tensor networks

Throughout this paper a tensor of rank $d$ is a $d$-dimensional complex array $T$ in which the indexes take values $0$ and $1$. Given a binary string of indexes $x = (x_1 x_2 \ldots x_d)$ we shall denote the corresponding component $T_{x_1 x_2 \ldots x_d}$ as $T(x)$.

A tensor network is a product of tensors whose indexes are pairwise contracted. More specifically, each tensor is represented by a vertex of some graph $G = (V, E)$, where $V$ is a set of vertices and $E$ is a set of edges. The graph may have self-loops and multiple edges. For every edge $e \in E$ one defines a variable $x(e)$ taking values $0$ and $1$. A bit string $x$ that assigns a particular value to every variable $x(e)$ is called an index string. A set of all possible index strings will be denoted $\mathcal{X}(E)$. In order to define a tensor network on $G$ one has to order edges incident to every vertex. We shall assume that $G$ is specified by its incidence list, i.e., for every vertex $u$ one specifies an ordered list of edges incident to $u$ which will be denoted $E(u)$. Thus $E(u) = \{e_1^u, \ldots, e_{d(u)}^u\}$ where $e_j^u \in E$ for all $j$. Here $d(u) = |E(u)|$ is the degree of $u$. If a vertex has one or several self-loops, we assume that every self-loop appears in the list $E(u)$ twice (because it will represent contraction of two indexes). For example, a vertex with one self-loop and no other incident edges has degree $2$. A tensor network on $G$ is a collection of tensors $\mathcal{T} = \{T_u\}_{u \in V}$ labeled by vertices of $G$ such that a tensor $T_u$ has rank $d(u)$. A contraction value of a network $\mathcal{T}$ is defined as

$$c(\mathcal{T}) = \sum_{x \in \mathcal{X}(E)} \; \prod_{u \in V} T_u\big(x(e_1^u), \ldots, x(e_{d(u)}^u)\big). \tag{3}$$

Thus the contraction value can be computed by taking a tensor product of all tensors $\{T_u\}$ and then contracting those pairs of indexes that correspond to the same edge of the graph. By definition, $c(\mathcal{T})$ is a complex number (tensor of rank $0$).
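The definition of the contraction value can be turned directly into a brute-force procedure: enumerate all index strings (one bit per edge) and, for each of them, multiply the components that the tensors take on the bits of their incident edges. The sketch below is a minimal illustration of this definition, not the algorithm of this paper; it takes time exponential in the number of edges, and the function name `contract` and the data layout (dictionaries keyed by hypothetical vertex and edge labels) are assumptions made for the example.

```python
from itertools import product

def contract(edges, incidence, tensors):
    """Brute-force contraction value of a tensor network.

    edges:     list of edge labels
    incidence: dict vertex -> ordered list of incident edge labels
               (a self-loop appears twice in the list of its vertex)
    tensors:   dict vertex -> dict mapping bit tuples to components
               (missing components are treated as zero)
    """
    total = 0
    for assignment in product((0, 1), repeat=len(edges)):
        x = dict(zip(edges, assignment))      # one index string
        term = 1
        for u, edge_list in incidence.items():
            term *= tensors[u].get(tuple(x[e] for e in edge_list), 0)
            if term == 0:
                break
        total += term
    return total

# A single vertex with one self-loop: the contraction is T(00) + T(11).
trace_value = contract(
    edges=["e"],
    incidence={"u": ["e", "e"]},
    tensors={"u": {(0, 0): 2, (0, 1): 3, (1, 0): 5, (1, 1): 7}},
)

# Two vertices joined by two parallel edges: sum of T_u(x1, x2) * T_v(x1, x2).
pair_value = contract(
    edges=["e1", "e2"],
    incidence={"u": ["e1", "e2"], "v": ["e1", "e2"]},
    tensors={
        "u": {(0, 0): 1, (0, 1): 2, (1, 0): 3, (1, 1): 4},
        "v": {(0, 0): 2, (0, 1): 3, (1, 0): 5, (1, 1): 7},
    },
)
```

The self-loop example shows why a self-loop must appear twice in the incidence list: the same edge variable is fed into two index slots of the tensor.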

It will be implicitly assumed throughout this paper that a tensor network is defined on a graph $G$ embedded into a closed oriented surface $\Sigma$. We require that the order of edges incident to any vertex $u$ must agree with the order in which the edges appear if one circumnavigates $u$ counterclockwise. Thus the order on any set $E(u)$ is completely specified by the choice of the first edge $e_1^u \in E(u)$. If the surface $\Sigma$ has genus $g$ we shall say that $G$ has genus $g$ (it may or may not be the minimal genus for which the embedding of $G$ into $\Sigma$ is possible).

2.2 Anticommuting variables

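In this section we introduce notations pertaining to the Grassmann algebra and anticommuting variables (see the textbook [21] for more details). Consider a set of formal variables $\theta = (\theta_1, \ldots, \theta_n)$ subject to multiplication rules

$$\theta_a^2 = 0, \qquad \theta_a \theta_b + \theta_b \theta_a = 0 \quad \text{for all } a, b. \tag{4}$$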

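The Grassmann algebra $\mathcal{G}(\theta)$ is the algebra of complex polynomials in variables $\theta_1, \ldots, \theta_n$ factorized over the ideal generated by Eq. (4). Equivalently, $\mathcal{G}(\theta)$ is the exterior algebra of the vector space $\mathbb{C}^n$, where each variable $\theta_a$ is regarded as a basis vector of $\mathbb{C}^n$. More generally, the variables $\theta_a$ may be labeled by elements of an arbitrary finite set $X$ (in our case the variables will be associated with edges or vertices of a graph). A linear basis of $\mathcal{G}(\theta)$ is spanned by $2^n$ monomials in variables $\theta_1, \ldots, \theta_n$. Namely, for any subset $M \subseteq \{1, \ldots, n\}$ define a normally ordered monomial

$$\theta(M) = \prod_{a \in M} \theta_a, \tag{5}$$

where the indexes increase from the left to the right. If the variables are labeled by elements of some set $X$, one can define the normally ordered monomials $\theta(M)$, $M \subseteq X$, by choosing some order on $X$. Let us agree that $\theta(\emptyset) = I$. Then an arbitrary element $f \in \mathcal{G}(\theta)$ can be written as

$$f = \sum_{M \subseteq \{1, \ldots, n\}} f(M)\, \theta(M), \qquad f(M) \in \mathbb{C}. \tag{6}$$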

We shall use notations $f$ and $f(\theta)$ interchangeably meaning that $f$ can be regarded as a function of anticommuting variables $\theta = (\theta_1, \ldots, \theta_n)$. Accordingly, elements of the Grassmann algebra will be referred to as functions. In particular, $I$ is regarded as a constant function. A function $f(\theta)$ is called even (odd) if it is a linear combination of monomials with even (odd) degree. Even functions span the central subalgebra of $\mathcal{G}(\theta)$.

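We shall often consider several species of Grassmann variables, for example, $\theta = (\theta_1, \ldots, \theta_n)$ and $\eta = (\eta_1, \ldots, \eta_k)$. It is always understood that different variables anticommute. For example, a function $f(\theta, \eta)$ must be regarded as an element of the Grassmann algebra $\mathcal{G}(\theta, \eta)$, that is, a linear combination of monomials in $\theta_1, \ldots, \theta_n$ and $\eta_1, \ldots, \eta_k$.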

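A partial derivative over a variable $\theta_a$ is a linear map $\partial_a : \mathcal{G}(\theta) \to \mathcal{G}(\theta)$ defined by the requirement $\partial_a \cdot I = 0$ and the Leibniz rule

$$\partial_a \cdot (\theta_b f) = \delta_{a,b}\, f - \theta_b\, (\partial_a \cdot f).$$

More explicitly, given any function $f \in \mathcal{G}(\theta)$, represent it as $f = f_0 + \theta_a f_1$, where $f_0, f_1 \in \mathcal{G}(\theta)$ do not depend on $\theta_a$. Then $\partial_a \cdot f = f_1$. It follows that $\partial_a \theta_a + \theta_a \partial_a = I$, $\partial_a^2 = 0$, $\partial_a \theta_b = -\theta_b \partial_a$ for $a \ne b$ and $\partial_a \partial_b = -\partial_b \partial_a$.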
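The multiplication rules and the partial derivative can be made concrete with a few lines of code. In the sketch below an element of the Grassmann algebra is stored as a dictionary mapping normally ordered monomials (sorted tuples of variable labels) to coefficients. The representation and the helper names `mono_mul`, `mul`, `deriv` and `theta` are illustrative assumptions, not notation from the paper.

```python
def mono_mul(m1, m2):
    """Multiply two normally ordered monomials (sorted tuples of labels).

    Returns (sign, monomial); the sign is 0 if a variable is repeated,
    since the square of any Grassmann variable vanishes.
    """
    if set(m1) & set(m2):
        return 0, ()
    seq = m1 + m2
    # Each inversion corresponds to one transposition of anticommuting variables.
    inversions = sum(1 for i in range(len(seq))
                     for j in range(i + 1, len(seq)) if seq[i] > seq[j])
    return (-1) ** inversions, tuple(sorted(seq))

def mul(f, g):
    """Product of two Grassmann functions given as {monomial: coefficient}."""
    out = {}
    for m1, c1 in f.items():
        for m2, c2 in g.items():
            sign, m = mono_mul(m1, m2)
            if sign:
                out[m] = out.get(m, 0) + sign * c1 * c2
    return {m: c for m, c in out.items() if c != 0}

def deriv(a, f):
    """Partial derivative over theta_a: writing f = f0 + theta_a * f1, return f1."""
    out = {}
    for m, c in f.items():
        if a in m:
            p = m.index(a)   # moving theta_a to the front costs (-1)^p
            rest = m[:p] + m[p + 1:]
            out[rest] = out.get(rest, 0) + (-1) ** p * c
    return {m: c for m, c in out.items() if c != 0}

def theta(a):
    """The single variable theta_a as a Grassmann function."""
    return {(a,): 1}

square = mul(theta(1), theta(1))           # {} : theta_1^2 = 0
swapped = mul(theta(2), theta(1))          # {(1, 2): -1} : theta_2 theta_1 = -theta_1 theta_2
d2 = deriv(2, mul(theta(1), theta(2)))     # {(1,): -1}
d1 = deriv(1, mul(theta(1), theta(2)))     # {(2,): 1}
```

The value `d2` illustrates the sign in the Leibniz rule: differentiating $\theta_1 \theta_2$ over $\theta_2$ requires moving the derivative past $\theta_1$, which produces a minus sign.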

A linear change of variables with invertible matrix induces an automorphism of the algebra such that . The corresponding transformation of partial derivatives is

(7)

2.3 Gaussian integrals

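Let $\theta = (\theta_1, \ldots, \theta_n)$ be a set of Grassmann variables. An integral over a variable $\theta_a$ denoted by $\int d\theta_a$ is a linear map from $\mathcal{G}(\theta_1, \ldots, \theta_n)$ to $\mathcal{G}(\theta_1, \ldots, \hat{\theta}_a, \ldots, \theta_n)$, where $\hat{\theta}_a$ means that the variable $\theta_a$ is omitted. To define an integral $\int d\theta_a\, f(\theta)$, represent the function $f$ as $f = f_0 + \theta_a f_1$, where $f_0, f_1 \in \mathcal{G}(\theta_1, \ldots, \hat{\theta}_a, \ldots, \theta_n)$. Then $\int d\theta_a\, f(\theta) = f_1$. Thus one can compute the integral $\int d\theta_a\, f(\theta)$ by first computing the derivative $\partial_a \cdot f(\theta)$ and then excluding the variable $\theta_a$ from the list of variables of $f$.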

Given an ordered set of Grassmann variables we shall use a shorthand notation

Thus can be regarded as a linear functional on , or as a linear map from to , and so on. The action of on the normally ordered monomials is as follows

(8)

Similarly, if one regards as a linear map from to then

Although this definition assumes that both variables , have a normal ordering, the integral depends only on the ordering of .

One can easily check that integrals over different variables anticommute, for . More generally, if and then

(9)

Under a linear change of variables the integral transforms as

(10)

In the rest of the section we consider two species of Grassmann variables and . Given an antisymmetric matrix and any matrix , define quadratic forms

Gaussian integrals over Grassmann variables are defined as follows.

(11)

Thus is just a complex number while is an element of . Below we present the standard formulas for the Gaussian integrals. Firstly,

(12)

Secondly, if is an invertible matrix then

(13)

Assume now that has rank for some even integer (note that antisymmetric matrices always have even rank). Choose any invertible matrix such that has zero columns . (This is equivalent to finding a basis of such that the last basis vectors belong to the zero subspace of .) Then

for some invertible matrix . Introduce also matrices , of size and respectively such that

Performing a change of variables in Eq. (11) and introducing variables and such that one gets

Here we have taken into account Eqs. (9,10). Applying Eq. (13) to the first integral one gets

(14)

One can easily check that if the rank of is smaller than the number of variables in , that is, . Since has only columns we conclude that

Therefore in the non-trivial case the matrices and specifying have size and for some . It means that can be specified by bits. One can compute in time . Indeed, one can use Gaussian elimination to find , compute and in time . The matrix can be computed in time . Computing the matrices requires time .

The formula Eq. (14) will be our main tool for contraction of matchgate tensor networks.
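As a sanity check of the simplest Gaussian formula, one can expand the exponential of a quadratic form explicitly and read off the coefficient of the top monomial. The sketch below assumes the common conventions that the integral over all variables picks out the coefficient of $\theta_1 \theta_2 \cdots \theta_n$ and that the Gaussian integral of $\exp(\tfrac{1}{2}\theta^T A \theta)$ equals $\mathrm{Pf}(A)$; the sign conventions of this paper may differ in details. All function names are hypothetical.

```python
from fractions import Fraction

def mono_mul(m1, m2):
    """Product of two sorted-tuple monomials: returns (sign, monomial)."""
    if set(m1) & set(m2):
        return 0, ()
    seq = m1 + m2
    inversions = sum(1 for i in range(len(seq))
                     for j in range(i + 1, len(seq)) if seq[i] > seq[j])
    return (-1) ** inversions, tuple(sorted(seq))

def mul(f, g):
    """Product of two Grassmann functions given as {monomial: coefficient}."""
    out = {}
    for m1, c1 in f.items():
        for m2, c2 in g.items():
            sign, m = mono_mul(m1, m2)
            if sign:
                out[m] = out.get(m, 0) + sign * c1 * c2
    return out

def gaussian_integral(A):
    """Coefficient of theta_0 ... theta_{n-1} in exp((1/2) theta^T A theta).

    A is an antisymmetric matrix with integer entries (list of lists).
    """
    n = len(A)
    # (1/2) theta^T A theta = sum over a < b of A[a][b] theta_a theta_b
    q = {(a, b): A[a][b] for a in range(n) for b in range(a + 1, n) if A[a][b] != 0}
    result = {(): Fraction(1)}
    power = {(): 1}
    factorial = 1
    # The series terminates because q^k = 0 as soon as 2k > n.
    for k in range(1, n // 2 + 1):
        power = mul(power, q)
        factorial *= k
        for m, c in power.items():
            result[m] = result.get(m, 0) + Fraction(c, factorial)
    return result.get(tuple(range(n)), 0)

A = [[0, 2, 3, 5],
     [-2, 0, 7, 11],
     [-3, -7, 0, 13],
     [-5, -11, -13, 0]]
integral = gaussian_integral(A)   # equals Pf(A) = 2*13 - 3*11 + 5*7 under the stated conventions
```

Expanding the exponential term by term takes exponential time in general, which is exactly why closed-form Gaussian integration formulas are the practical tool for contraction.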

3 Matchgate tensors

3.1 Basic properties of matchgate tensors

Although the definition of a matchgate tensor in terms of the matchgate identities Eq. (1) is very simple, it is neither very insightful nor very useful. Two equivalent but more operational definitions will be given in Sections 3.3, 3.4. Here we list some basic properties of matchgate tensors that can be derived directly from Eq. (1). In particular, following the approach of [17], we prove that a matchgate tensor of rank $n$ can be specified by a mean vector $z \in \{0,1\}^n$ and a covariance matrix $A$ of size $n \times n$.

Proposition 1.

Let $T$ be a matchgate tensor of rank $n$. For any $z \in \{0,1\}^n$ a tensor $T'$ with components $T'(x) = T(x \oplus z)$ is a matchgate tensor.

Proof.

Indeed, make a change of variables $x \to x \oplus z$, $y \to y \oplus z$ in the matchgate identities. ∎

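Let $T$ be a non-zero matchgate tensor of rank $n$. Choose any string $z$ such that $T(z) \ne 0$ and define a new tensor $T'$ with components

$$T'(x) = \frac{T(x \oplus z)}{T(z)}, \qquad x \in \{0,1\}^n,$$

such that $T'$ is a matchgate and $T'(0^n) = 1$. Introduce an antisymmetric $n \times n$ matrix $A$ such that

$$A_{a,b} = \begin{cases} T'(e^a \oplus e^b) & \text{if } a < b, \\ -T'(e^a \oplus e^b) & \text{if } a > b, \\ 0 & \text{if } a = b. \end{cases}$$

Proposition 2.

For any $x \in \{0,1\}^n$

$$T'(x) = \begin{cases} \mathrm{Pf}(A(x)) & \text{if } x \text{ has even weight}, \\ 0 & \text{if } x \text{ has odd weight}, \end{cases}$$

where $A(x)$ is a matrix obtained from $A$ by removing all rows and columns $a$ such that $x_a = 0$.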

Proof.

Let us prove the proposition by induction in the weight of . Choosing and in the matchgate identities Eq. (1) one gets for all . Similarly, choosing and with one gets . Thus the proposition is true for . Assume it is true for all strings of weight . For any string of weight and any such that apply the matchgate identities Eq. (1) with and . After simple algebra one gets

Noting that has weight and applying the induction hypothesis one gets

for even and for odd . Thus for all odd strings of weight . Furthermore, let non-zero bits of be located at positions . Note that the sign of coincides with the parity of a permutation that orders elements in a set . Therefore, by definition of Pfaffian one gets . ∎
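The description of a matchgate by Pfaffians can be tested in the converse direction: build a tensor whose components are Pfaffians of principal submatrices of an antisymmetric matrix and check the matchgate identities numerically, together with the shift property of Proposition 1. The sketch below assumes that $A(x)$ keeps the rows and columns $a$ with $x_a = 1$ and that the identities have the form $\sum_{a: x_a \ne y_a} T(x \oplus e^a) T(y \oplus e^a) (-1)^{x_1+\ldots+x_{a-1}+y_1+\ldots+y_{a-1}} = 0$. The function names and the sample matrix are hypothetical.

```python
from itertools import product

def pfaffian(A):
    """Pfaffian by expansion along the first row (small matrices only)."""
    n = len(A)
    if n == 0:
        return 1
    if n % 2 == 1:
        return 0
    total = 0
    for j in range(1, n):
        keep = [k for k in range(n) if k not in (0, j)]
        minor = [[A[r][c] for c in keep] for r in keep]
        total += (-1) ** (j + 1) * A[0][j] * pfaffian(minor)
    return total

def pfaffian_tensor(A):
    """Tensor with components T(x) = Pf(A(x)); A(x) keeps rows/columns with x_a = 1."""
    n = len(A)
    T = {}
    for x in product((0, 1), repeat=n):
        keep = [a for a in range(n) if x[a] == 1]
        T[x] = pfaffian([[A[r][c] for c in keep] for r in keep])
    return T

def matchgate_violation(T, n):
    """Largest |left-hand side| of the assumed matchgate identity over all x, y."""
    worst = 0
    for x in product((0, 1), repeat=n):
        for y in product((0, 1), repeat=n):
            total = 0
            for a in range(n):
                if x[a] != y[a]:
                    xa = x[:a] + (1 - x[a],) + x[a + 1:]
                    ya = y[:a] + (1 - y[a],) + y[a + 1:]
                    total += (-1) ** (sum(x[:a]) + sum(y[:a])) * T[xa] * T[ya]
            worst = max(worst, abs(total))
    return worst

# Arbitrary 5x5 antisymmetric matrix with integer entries.
n = 5
A = [[0] * n for _ in range(n)]
for a in range(n):
    for b in range(a + 1, n):
        A[a][b] = 3 * a + b * b + 1
        A[b][a] = -A[a][b]

T = pfaffian_tensor(A)
z = (1, 0, 1, 1, 0)   # arbitrary shift string
T_shifted = {x: T[tuple(xi ^ zi for xi, zi in zip(x, z))] for x in T}

violation = matchgate_violation(T, n)
violation_shifted = matchgate_violation(T_shifted, n)
```

Components with odd Hamming weight vanish automatically because the Pfaffian of an odd-sized antisymmetric matrix is zero, so the constructed tensor is even, and the shifted tensor is even or odd depending on the weight of `z`.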

Thus one can regard the vector and the matrix above as analogues of a mean vector and a covariance matrix for Gaussian states of fermionic modes, see for instance [19]. Although Proposition 2 provides a concise description of a matchgate tensor, it is not very convenient for contracting matchgate networks because the mean vector and the covariance matrix are not uniquely defined.

Corollary 1.

Any matchgate tensor is either even or odd.

Proof.

Indeed, the proposition above implies that if a matchgate tensor has even (odd) mean vector it is an even (odd) tensor. ∎

3.2 Describing a tensor by a generating function

Let be an ordered set of Grassmann variables. For any tensor of rank define a generating function according to

Here is the normally ordered monomial corresponding to the subset of indexes . Let us introduce a linear differential operator acting on the tensor product of two Grassmann algebras such that

(15)

Lemma 1.

A tensor of rank is a matchgate iff

(16)

Proof.

For any strings one has the following identity:

Expanding both factors in Eq. (16) in the monomials , , using the above identity, and performing a change of variable and for every one gets a linear combination of monomials with the coefficients given by the right hand side of Eq. (1). Therefore Eq. (16) is equivalent to Eq. (1). ∎

Lemma 1 provides an alternative definition of a matchgate tensor which is much more useful than the original definition Eq. (1). For example, it is shown below that the operator has a lot of symmetries which can be translated into a group of transformations preserving the subset of matchgate tensors.

Lemma 2.

The operator is invariant under linear reversible changes of variables .

Proof.

Indeed, let be the partial derivative over . Using Eq. (7) one gets

Lemmas 1,2 imply that linear reversible change of variables , where map matchgates to matchgates.

Corollary 2.

Let be a matchgate tensor of rank . Then a tensor defined by any of the following transformations is also matchgate.
(Cyclic shift): ,
(Reflection): ,
(Phase shift): , where .

Proof.

Let if is an even tensor and if is an odd tensor, see Corollary 1. The transformations listed above are generated by the following linear changes of variables:

Phase shift :
Cyclic shift :
Reflection :

Indeed, let be the normally ordered monomial where . Let for the cyclic shift and for the reflection. Then the linear changes of variables stated above map to for the phase shift, to for the cyclic shift, and to for the reflection. Therefore, in all three cases is a matchgate tensor. ∎

3.3 Matchgate tensors have Gaussian generating function

A memory size required to store a tensor of rank typically grows exponentially with . However the following theorem shows that for matchgate tensors the situation is much better.

Theorem 2.

A tensor of rank is a matchgate iff there exist an integer , complex matrices , of size and respectively, and a complex number such that has generating function

(17)

where is a set of Grassmann variables. Furthermore, one can always choose the matrices and such that and .

Thus the triple provides a concise description of a matchgate tensor that requires a memory size only . In addition, it will be shown that contraction of matchgate tensors can be efficiently implemented using the representation Eq. (17) and the Gaussian integral formulas of Section 2.3. We shall refer to the generating function Eq. (17) as a canonical generating function for a matchgate tensor .

Corollary 3.

For any matrices and the Gaussian integral defined in Eq. (11) is a matchgate.

Proof.

Indeed, use Eq. (14) and Theorem 2. ∎

In the rest of the section we shall prove Theorem 2.

Proof of Theorem 2.

Let us first verify that the tensor defined in Eq. (17) is a matchgate, i.e., , see Lemma 1. Without loss of generality is an antisymmetric matrix and . Write as

Noting that is an even function and for any even string one concludes that

(18)

Therefore it suffices to prove that and . The first identity follows from and which implies

To prove the second identity consider the singular value decomposition , where and are unitary operators, while is a matrix with all non-zero elements located on the main diagonal, . Introducing new variables and one gets

Here we have used identity , see Eq. (10). Since is invariant under linear reversible changes of variables, see Lemma 2, and since for any monomial one gets . We proved that , that is, is a matchgate tensor.

Let us now show that any matchgate tensor of rank can be written as in Eq. (17). Define a linear subspace such that

Let . Make a change of variables where is any invertible matrix such that the last rows of span . Then for all . It follows that can be represented as

(19)

for some function that depends only on variables . Equivalently,

where the partial derivatives are taken with respect to the variables . Since is invariant under reversible linear changes of variables, see Lemma 2, and since , we get

(20)

By definition of the subspace the functions are linearly independent. Therefore there exist linear functionals , , such that . Applying to the first factor in Eq. (20) we get

(21)

for all . Let the lowest degree of monomials in . Let us show that , that is, contains with a non-zero coefficient. Indeed, let be a function obtained from by retaining only monomials of degree . Since any monomial in the r.h.s. of Eq. (21) has degree at least , we conclude that for all . It means that for some complex number and thus .

Applying the partial derivative to Eq. (21) we get , where the substitution means that the term proportional to the identity is taken. Since the partial derivatives over different variables anticommute, is an antisymmetric matrix.

Using Gaussian elimination any antisymmetric matrix can be brought into a block-diagonal form with blocks on the diagonal by a transformation , where is an invertible matrix (in fact, one can always choose unitary , see [23]). Since our change of variables allows arbitrary transformations in the subspace of we can assume that is already block-diagonal,

where only non-zero blocks are represented, so that .

Applying Eq. (21) for we get

(22)

Note that can be written as

(23)

where the sums over and run over all odd and even monomials in respectively. Substituting Eq. (23) into Eq. (22) one gets and , that is

where depends only on variables . Repeating this argument inductively, we arrive to the representation

Here we extended the matrix such that its last columns and rows are zero. Combining it with Eq. (19) one gets

where is a vector of Grassmann variables and is a matrix with , entries such that

Recalling that , we conclude that has a representation Eq. (17) with and . As a byproduct we also proved that the matrices , in Eq. (17) can always be chosen such that since and all non-zero entries of are in the last rows. ∎

3.4 Graph theoretic definition of matchgate tensors

Let $G = (V, E, W)$ be an arbitrary weighted graph with a set of vertices $V$, set of edges $E$ and a weight function $W$ that assigns a complex weight $W(e)$ to every edge $e \in E$.

Definition 2.

Let $G = (V, E, W)$ be a graph and $S \subseteq V$ be a subset of vertices. A subset of edges $M \subseteq E$ is called an $S$-imperfect matching iff every vertex from $S$ has no incident edges from $M$ while every vertex from $V \setminus S$ has exactly one incident edge from $M$. A set of all $S$-imperfect matchings in a graph $G$ will be denoted $\mathcal{M}(G, S)$.

Note that a perfect matching corresponds to an $\emptyset$-imperfect matching. Occasionally we shall denote the set of perfect matchings by $\mathcal{M}(G)$. For any subset of vertices $S \subseteq V$ define a matching sum

$$\mathrm{PerfMatch}(G, S) = \sum_{M \in \mathcal{M}(G, S)} \; \prod_{e \in M} W(e). \tag{24}$$

(A matching sum can be identified with a planar matchgate of [15].) In this section we outline an isomorphism between matchgate tensors and matching sums of planar graphs discovered earlier in [18]. For the sake of completeness we provide a proof of this result below. Although the main idea of the proof is the same as in [18], some technical details are different. In particular, we use a much simpler crossing gadget.
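For small graphs the matching sum can be evaluated by direct enumeration, which is handy for testing. The sketch below assumes the matching sum is the sum, over all matchings that leave exactly the vertices of the chosen subset unmatched, of the product of the weights of the matched edges. It is a brute-force illustration rather than the Pfaffian-based FKT method, and the function name `matching_sum` and the sample weights are hypothetical.

```python
def matching_sum(vertices, weights, unmatched=frozenset()):
    """Sum of weight products over matchings covering every vertex
    except those in `unmatched`, each exactly once.

    vertices: list of vertex labels
    weights:  dict {(u, v): weight} with one entry per edge
    """
    active = [v for v in vertices if v not in unmatched]
    edges = [(e, w) for e, w in weights.items()
             if e[0] not in unmatched and e[1] not in unmatched]

    def rec(remaining):
        if not remaining:
            return 1          # empty matching contributes the empty product
        u = remaining[0]      # always match the first uncovered vertex
        total = 0
        for (a, b), w in edges:
            if u not in (a, b) or a == b:
                continue
            other = b if a == u else a
            if other in remaining:
                rest = [v for v in remaining if v not in (u, other)]
                total += w * rec(rest)
        return total

    return rec(active)

# Complete graph on four vertices with arbitrary weights.
vertices = [1, 2, 3, 4]
weights = {(1, 2): 2, (1, 3): 3, (1, 4): 5, (2, 3): 7, (2, 4): 11, (3, 4): 13}

perfect = matching_sum(vertices, weights)                      # 2*13 + 3*11 + 5*7
two_free = matching_sum(vertices, weights, frozenset({1, 2}))  # only edge (3, 4) remains
one_free = matching_sum(vertices, weights, frozenset({1}))     # three vertices cannot be matched
```

Note that all three perfect matchings enter with a plus sign here, in contrast to the Pfaffian of the corresponding antisymmetric matrix, where the matching $\{(1,3),(2,4)\}$ enters with a minus sign; reconciling the two on planar graphs is what the FKT orientation accomplishes.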

Specifically, we shall consider planar weighted graphs embedded into a disk such that some subset of external vertices belongs to the boundary of disk while all other internal vertices belong to the interior of . Let be an ordered list of external vertices corresponding to circumnavigating anticlockwise the boundary of the disk. Then any binary string can be identified with a subset that includes all external vertices such that . Now we are ready to state the main result of this section.

Theorem 3.

For any matchgate tensor of rank there exists a planar weighted graph with vertices, edges and a subset of vertices such that

(25)

Furthermore, suppose is specified by its generating function,