Virtual Parallel Computing and a Search Algorithm using Matrix Product States
Abstract
We propose a form of parallel computing on classical computers that is based on matrix product states. The virtual parallelization is accomplished by representing bits with matrices and by evolving these matrices from an initial product state that encodes multiple inputs. Matrix evolution follows from the sequential application of gates, as in a logical circuit. The action by classical probabilistic onebit and deterministic twobit gates such as NAND are implemented in terms of matrix operations and, as opposed to quantum computing, it is possible to copy bits. We present a way to explore this method of computation to solve search problems and count the number of solutions. We argue that if the classical computational cost of testing solutions (witnesses) requires less than local twobit gates acting on bits, the search problem can be fully solved in subexponential time. Therefore, for this restricted type of search problem, the virtual parallelization scheme is faster than Grover’s quantum algorithm.
Interference and the ability to follow many history paths simultaneously make quantum systems attractive for implementing computations (1). Efficient algorithms exploring these properties have been proposed to solve practical problems such as number factoring (2) and unsorted database search (3). However, we still do not have a sufficiently large and resilient quantum computer to take advantage of these algorithms. It is, thus, very desirable to try to find better and more efficient ways to compute with classical systems. In this regard, recent advances in our understanding of quantum manybody systems provide some guidance. It is well understood now that the time evolution of a large class of onedimensional interacting systems can be efficiently simulated by expressing their wave functions in a matrix product state form and by using a timeevolving block decimation (TEBD) (4). A key aspect of this success is data compression: even though manybody interactions tend to increase the rank of the matrices over time, it is possible to use truncation along the evolution to keep the matrices relatively small, such that the resulting wave function approximates quite accurately the exact one without an exponential computation cost (5). In quantum systems, it is well understood that local interactions do not quickly entangle onedimensional manybody state, justifying the matrix truncation (6); (7).
In this Letter, we describe a method of classical computation that utilizes matrix product states (MPS) to implement search and other similar tasks. Compression, when possible, provides additional speedup. Formally, instead of working with wave functions and quantum amplitudes, we describe the state of the computer in terms of a stochastic probability distribution written as traces of matrix product states associated to bit configurations. The idea of expressing classical probability distributions in the form of MPS is not new (8), but the focus so far has been on using it to study nonequilibrium phenomena of physical systems (see for instance Ref. (9); (10)). As we show below, a MPS formulation of classical probability distributions can also be employed to create a virtual parallel machine where all possible outcomes of an algorithm are obtained for all inputs of an bit register. Information about these outcomes is encoded and compressed in the matrices forming the MPS. By itself this “parallelism” is not obviously useful; it is, however, if a certain problem can use the probability of a single outcome at a time. This is the case of a search problem that seeks, for a given , the value of such that for an algorithmically computable function . Then, the focus is not on all values of the output, but on only one given . We shall show below that in this case matrix computing can be useful. In particular, from the probability of , the method directly provides the number of input values satisfying the functional constraint .
In our matrix computing, insertion and removal of bits are allowed and onebit and twobit gates can be implemented much like in a conventional computer. Our onebit gates are probabilistic while our twobit gates are deterministic. Twobit gates rely on a singular value decomposition (SVD) to maintain the MPS form of the probability distribution. All these operations preserve the positivity and the overall normalization of the probability even though we work with nonpositive matrices.
Matrix computing formulation – Consider a set of binary variables describing a set of bits, with denoting a particular configuration of this system. In analogy to quantum mechanics, we define the vector
(1) 
where
(2) 
Here, each is a real matrix of dimensions . The trace can be dropped if we consider the first and last matrices to be row and column vectors, i.e., . The state vector is normalized in the following sense: define , then since .
Starting from an initial probability distribution , the vector evolves as a sequence of onebit and nearestneighbor twobit gates is applied to the bit matrices. These bit operations form a logical circuit, which is tailored according to a particular computational problem, for instance, the algorithmic computation of a function . Below, we describe how bit operations are implemented.
Onebit gates: We will use probabilistic onebit gates, which take states to states with probabilities and :
The probabilities can be encoded into a transfer function that takes a logic input into a logic output . Explicitly: , , , . A onebit gate acting on bit yields a new matrix
(3) 
The transfer function satisfies the sum rule , which ensures that the normalization is maintained as the system evolves. Examples of onebit gates are: (a) Deterministic NOT, with and , (b) RAND, with and , which randomizes the bit, (c) RST, with and , which resets the bit to 0.
Twobit gates: We will consider only deterministic twobit gates. Given two logical functions and , we construct the transfer function , taking bits with states and to bits with states and , respectively:
(4) 
Similarly to onebit gates, the normalization after twobit gates is preserved by the sum rule . The evolved matrices must satisfy
(5) 
and we use the SVD to decompose the result of the gate operation on the righthand side of Eq. (5) as a product of two matrices, as in the lefthand side of the equation, for all the four cases .
Let us demonstrate this construction with a concrete example. Consider the following logical operation on bits and : and . The first bit is unaffected, while the second one evolves into the NAND operation between the two bits. In this case, , with all other elements set to zero. We use the transfer function to determine the four blocks (for ) of a matrix of dimension :
(6) 
To factor the matrix back into a product, we employ an SVD,
(7) 
In this process, the common dimension may change and likely increase. This is an issue of fundamental important, which we shall return when we discuss a search algorithm.
Bit insertions and removals: For computational tasks such addition and multiplication, it is important to be able to insert and remove bits. These operations are straightforward for MPS. Insertion of a new bit (say, initially set to 0) in between bits and amounts to replacing with , where and are null and identity matrices, respectively, and the total sum over bit configurations in the vector [see Eq. (1)] has now to include the binary variable . Removal of a bit is done by absorbing its matrix into the one of an adjacent bit, namely, by tracing it out; for instance, we use to remove bit .
How can matrix computing be used to solve certain computational problems? – Here we shall present computational algorithms that explore the virtual parallelism encoded in matrix product states. To be concrete, consider the following search problem as an example:

Given a function that can be computed algorithmically with gates and a certain value for , we would like to search for an input that yields as output .
The reason why matrix computation is useful for this search problem can be argued as follows. Matrix product states can express the probability values of all possible bit outputs if one starts with a product state encoding all possible bit inputs , namely, for all . Of course, if we were interested in all the probabilities, we would have to compute an exponentially large () number of traces of products of matrices. But this is not what is needed to perform the search above: we are interested in just one output for this problem. We, thus, proceed in the following steps.

Starting with all bits , , randomized with equal probabilities for being or , we compute the final output matrices , , resulting from the action of the circuit that evaluates .

We compute the probability for the given we are interested in. If , then there is at least one value of such that .

We then fix one of the input bits, say , to be , instead of randomizing it. We recompute the output matrices , , and the new probability . Again we test if the probability is above the threshold, in this case. If the probability fell below the threshold, we must reset to . (Notice that since there may be more than one for a given , that stays above threshold does not mean that switching to is necessarily forbidden, but we shall stick instead to in this case to avoid unnecessary iterations.)

We repeat step 3 fixing now input bit , then repeat it again fixing input bit , and so on until we finally fix input bit . At the end of steps, having fixed all the bits of the input, we have arrived at one value for such that .
Let us discuss the computational cost of such algorithm. To simplify the discussion, let us present it in terms of the largest matrix dimension in the computations, which we shall relate to the number of gates involved in the computation of the function . All SVD steps involve matrices with rank smaller or equal to ; therefore, the cost associate to gate operations is no more than . One has also to compute the trace of the matrix products for a fixed to yield the probability , and this takes time no more than . We then have to repeat the procedure fixing bitbybit the , . Therefore, in the worst case it takes a time to find .
The largest computational cost comes from the SVD and trace steps, which depend on the rank of the matrices. The crucial issue is how scales with either the number of bits or the number of gates for a given algorithm to compute . We shall prove below the following result: the maximum dimension of any matrix in a computation using nearestneighbor gates in a system with bits is bounded by . The consequence of this result on the computational time is as follows. As we argued above, the search algorithm takes a time . For a function that can be computed with gates, the time to search for an that gives a fixed has two different behaviors depending on whether or . If , , and thus the search takes, in the worst possible case, a time using matrix computing algorithms. If instead , saturates to and in the worst possible case the computation (without discarding singular values) takes exponential time. In other words, there is a transition between subexponential and exponential behavior at . It, thus, follows that for any function that can be computed with gates, the full search problem can be solved faster using matrix computing than using Grover’s quantum algorithm, which scales as .
Proof of the bound on the largest bond dimension – Upon application of a twobit gate on bits and , the dimension will increase as follows. Starting with matrices and matrices , one assembles a matrix [see the example of the NAND gate in Eq. (6)]. The SVD step will lead to matrices and matrices , where the new bond dimension . It is useful to work on a logarithmic scale and define . Thus, we can write .
Let us next prove that at any step in the algorithmic evolution the “entanglement heights” satisfy the condition , which we shall refer to as the height difference constraint (hdc). The proof is done by induction. At the initial state of the calculation, one starts with the product state of all possible equally weighted inputs , which correspond to matrices or, equivalently, all , so that , thus satisfying the condition. Now suppose that the condition is satisfied at step ; we can show that it is then also satisfied at step , when a twobit gate is applied between two adjacent bits and . None of the heights other than are changed, therefore, the hdc condition remains satisfied for all and , and it just remains to be shown that it is satisfied for and . Consider the case where (the other case is analogous). In this case , satisfying the condition . Now , and using that and , as well as that , we have that . It, thus, follows that the hdc condition is satisfied at all steps in the calculation. An example of a configuration of entanglement heights satisfying the hdc is show in Fig. 1.
If all we do to evolve the state is to apply twobit gates, we have shown that . It is easy to see that after a bit insertion the condition is still satisfied, because the change in height is zero on the two sides of the inserted bit (corresponding to a square matrix), with all other relative height differences unchanged. The removal (tracing out) of bits is slightly more subtle. Right after the removal, there are large jumps across the region where the bits were removed, but these can be brought up to satisfy the hdc by applying a series of twobit identity gates [ and ] sweeping from lefttoright followed by another from righttoleft. These sweeps remove the height “faults” (and actually tend to decrease the overall height). Therefore we arrive at the result that the hdc condition is satisfied after all operations, twobit gates, bit insertions, and bit deletions (after the identity sweeps).
Let us now show that the maximum height resulting from the application of twobit gates is bounded by . The application of a single twobit gate on bits and changes the height . Because the relative heights of neighboring bonds cannot differ by more than 1 unit due to the hdc, the maximum amount that the height can increase with respect to is by 2 (which occurs when ). Therefore one can write that . Now, suppose that the maximum height is at some bond labelled by (located to the right of bit ); because the heights to the left of the 1st bit and to the right of the th bit are both equal to 0 at all times, and because of the hdc condition, there are constraints on how quickly the heights can grow from 0 to at and then decrease down to 0 again. The climb and descent that minimize the area can be trivially seen to be a triangle where increases linearly from to , and then decreases linearly until . The area of this triangle is , and any other height profile that reaches the same maximum height has equal or larger area. Therefore, , and thus we arrive at the conclusion that , i.e., the bound on the maximum entanglement height for a given number of gates. Furthermore, because of the hdc and the fact that , the entanglement height for a fixed is bounded by , and the overall maximum is reached at the center of the chain, and (which coincide when is even).
Putting all the conditions together, we arrive at , or equivalently, the bound which we used to obtain the absolute maximum running time of the search algorithm.
Conclusions – We have shown that it is possible to achieve virtual parallelization in singleprocessor classical computers using onebit and twobit local gates acting on matrix product states over bits. Based on this method, we propose a search algorithm that runs in subexponential time when the cost to check a witness requires less than twobit gates. This critical bound in the circuit size was obtained assuming a worstcase scenario for the matrix dimension growth as a function of the number of twobit gates. However, for particular circuits, the actual rank of the matrices may grow slower than this estimate, in which case some speedup is possible. In addition, during gate operations and matrix decompositions, if the singular values decay sufficiently fast, it may be possible to reduce matrix rank growth through truncation, similarly to the standard procedure used in quantum methods such as the TEBD (4) and its classical version for stochastic evolution, the cTEBD (9); (10). This question is open to future investigation.
The method is not limited to onedimensional bit arrays and could, in principle, be extended to higher dimension tensor products. Finally, we point out that the method also naturally counts the number of satisfying assignments of a given Boolean formula, which is a problem of much importance in computer science.
This work was supported in part by the NSF grants CCF1116590 and CCF1117241. The authors thank P. Wocjan for useful discussions.
References
 D. Deutsch, Proc. R. Soc. Lond. A 400, 97 (1985).
 P. W. Shor, SIAM J. Sci. Stat. Comput. 26, 1484 (1997).
 L. K. Grover, Phys. Rev. Lett. 79, 325 (1997).
 G. Vidal, Phys. Rev. Lett. 91, 147902 (2003); G. Vidal, ibid 93, 040502 (2004).
 F. Verstraete, V. Murg, and J. I. Cirac, Adv. Phys. 57, 143 (2008).
 J. I. Cirac and F. Verstraete, J. Phys. A 42, 504004 (2009).
 A. Hamma, S. Santra, and P. Zanardi, arXiv:1109.4391.
 B. Derrida, M. R. Evans, H. Hakim, and V. Pasquier, J. Phys. A 26, 1493 (1993).
 K. Temme and F. Verstraete, Phys. Rev. Lett. 104, 210502 (2010).
 T. H. Johnson, S. R. Clark, and D. Jaksch, Phys. Rev. E 82, 036702 (2010).