From local to global ground states in Ising spin glasses

From local to global ground states in Ising spin glasses

Ilia Zintchenko Theoretische Physik, ETH Zurich, 8093 Zurich, Switzerland    Matthew B. Hastings Station Q, Microsoft Research, Santa Barbara, CA 93106-6105, USA Quantum Architectures and Computation Group, Microsoft Research, Redmond, WA 98052, USA    Matthias Troyer Theoretische Physik, ETH Zurich, 8093 Zurich, Switzerland

We consider whether it is possible to find ground states of frustrated spin systems by solving them locally. Using spin glass physics and Imry-Ma arguments in addition to numerical benchmarks we quantify the power of such local solution methods and show that for the average low-dimensional spin glass problem outside the spin-glass phase the exact ground state can be found in polynomial time. In the second part we present a heuristic, general-purpose hierarchical approach which for spin glasses on chimera graphs and lattices in two and three dimensions outperforms, to our knowledge, any other solver currently around, with significantly better scaling performance than simulated annealing.

05.10.-a, 75.10.Nr, 75.50.Lk

I Introduction

The combination of disorder and frustration in spin glasses Binder and Young (1986) creates a complex energy landscape with many local minima that makes finding their ground states a formidable challenge. In particular finding the assignments of spins which minimises the total energy of an Ising spin glass with Hamiltonian


where and , is non-deterministic polynomial (NP) hard Barahona (1982) and no polynomial time algorithm is known for the hardest instances. NP-hardness also means that any problem in the complexity class NP can be mapped to an Ising spin glass with only polynomial overhead. This includes the travelling salesman problem, satisfiability of logical formulas, and many other hard optimization problems. Explicit mapping for a number of these problems have recently been given in Ref. Lucas, 2014. Efficient solvers for Ising spin glass problems hence can have an impact far beyond spin glass physics.

This broad spectrum of applications has also motivated the development of the devices by the Canadian company D-Wave Systems Harris et al. (2010); Johnson et al. (2010); Berkley et al. (2010); Johnson et al. (2011). These devices have been designed to employ quantum annealing Kadowaki and Nishimori (1998) for Ising spin glass problems using superconducting flux qubits. However, it has not yet been shown that they can outperform classical devices Shin et al. (2014); Rønnow et al. (2014). Determining the complexity of solving the spin glass problems on the so-called “chimera graph”, which is implemented by the hardware of the D-Wave devices, and finding the best classical algorithms for them is important in the search for quantum speedup on these devices Rønnow et al. (2014).

Motivated by these comparisons and the importance of efficiently solving Ising spin glass problems, here we consider the complexity of solving such problems for random spin glass instances on finite-dimensional lattices, including the chimera graph. In Sec. II we discuss the effects of non-zero temperature and magnetic field on Ising spin glasses and argue that the absence of correlations outside the spin glass phase allows for polynomial time algorithms. Section III presents an exact solver based on this idea which solves the system quasi-locally by considering finite patches of the lattice. Finally, in Sec. IV we present a hierarchical heuristic approach, which recursively solves groups of spins by splitting each group into smaller sub-groups. For our benchmark problems on two and three dimensional periodic lattices and chimera graphs with random disorder this approach outperforms, to our knowledge, any other solver currently available and scales significantly better than simulated annealing. While we give a qualitative explanation of the advantage of the hierarchical solver, it remains an open theoretical question to give a quantitative argument for its improved scaling. The interested reader can skip directly to this section as it can be understood independently of the scaling analysis earlier in the paper.

Ii Boundary Condition Dependence in Frustrated Spin Systems

It is evident that if the fields in Eq. (1) are very large, the problem can be solved by simply aligning each spin relative to the field. The problem becomes more difficult at smaller , and the meaningful question is whether a phase transition intervenes at some non-zero value of the field strength, where the difficulty increases greatly. In this section, we argue that the relevant transition indeed is already known in the literature, where it is referred to as the de Almeida-Thouless line. We argue that above this transition (which happens for any non-zero random choice of in two dimensions), the problem can be solved by considering larger patches of spins, with the patch size diverging as the field strength goes to zero; the spins in the middle of these large patches become independent of those outside the patch and can be fixed using a local algorithm. We first review the relevant literature at and the scaling theory at small .

ii.1 Review

In the particular ensemble where fields vanish (), the behaviour of the model depends strongly upon both the dimensionality of the system and upon the choice of the ensemble for the couplings between spins. In this discussion, we will focus on the case of a continuous distribution, e.g. a Gaussian one, with vanishing mean.

We will also refer to results in the literature that study nearest-neighbour couplings on a square or cubic lattice, rather than the chimera graph. One important distinction between the two-dimensional square lattice and the chimera graph is that for the square lattice, as for any planar graph, if the magnetic fields vanish there are efficient polynomial time matching algorithms for finding exact ground states Bieche et al. (1980), while on non-planar graphs, such as the chimera, it is NP-hard. We discuss this further below.

In two dimensions it is accepted that there is no spin glass phase at temperature Hartmann and Young (2001). To quantify this, consider a pair of sites . Let denote the thermal average of an operator at temperature and let denote the disorder average over Hamiltonians . Since the couplings are chosen with zero mean, we have that exactly. However, generically the ground state is unique and hence at , and this average is expected to be positive at ; however, the average vanishes in the limit of large distances between .

The reason for the absence of a spin glass phase is that it costs very little energy to flip a domain of spins. Consider flipping a cluster of spins of linear size . In a ferromagnetic state, this costs energy proportional to . In a spin glass ground state, it is possible, however, that a cluster can be found which costs very low energy to flip. Using various methods of generating flipped patches (by for example boundary condition changes), it is found that the energy of the domain wall scales proportional to with Hartmann and Young (2001). Thus, it costs less energy to flip larger clusters, and no matter how small is, for there eventually will be some such that flipping clusters at that scale costs energy smaller than . Hence there will be many thermally excited domain walls. On the other hand, for three dimensions and higher, there is believed to be a phase transition temperature with a domain wall exponent for excitations above the ground state McMillan (1984).

Similarly, we can consider random models with nonzero fields Leuzzi et al. (2009); Baños et al. (2012); Baity-Jesi et al. (2014) and denote standard deviation of the field magnitude by . In this case, we consider the quantity

. If this quantity tends to a non-zero limit at large distance between , then we term this a spin glass phase. It has been shown that such a spin-glass phase can exist in a mean-field model at Young and Katzgraber (2004); the line in the plane separating the spin glass from the paramagnetic phase is termed the de Almeida-Thouless line. However, it is unclear whether such a spin glass phase at can persist in a local finite-dimensional model. Numerical work Katzgraber et al. (2009); Larson et al. (2013) suggests that it exists for dimension . However, it is accepted that the spin glass phase at does not persist in dimension and in the next subsection we will explain why this is expected given the exponent discussed above.

It should be emphasised that it is not necessarily difficult to find ground states in a spin glass phase, as exemplified by the matching algorithm for the planar case in at . Conversely, even if a random ensemble is not in the spin glass phase, particular instances may be difficult, as exemplified by the fact that in at the model is not in a spin glass phase, but finding the ground state of arbitrary instances is still NP-hard.

ii.2 Weak Field Scaling in

We now consider the effect of a weak magnetic field in . Our general goal is to show that in this case, we expect that the value of a given spin in the ground state can often be fixed using a purely local calculation. The argument is a version of the Imry-Ma argument applied to disordered systems Imry and Ma (1975) and in the specific application to spin glasses is an example of the droplet picture Fisher and Huse (1986). We conjecture that a similar argument (with different exponents) will work if there is no de Almeida-Thouless line (i.e., whenever there is no spin glass phase at non-zero magnetic field).

Consider a spin at the center of a patch of size inside a larger system of linear size . Suppose that we have found some configuration of spins which is a ground state. At , it is impossible to know whether or without knowing the value of the boundary spins because there is a symmetry. However, at , it may be possible to determine the value of the spin independent of the value of the boundary spins. That is, there may be some choice (either or ) that minimises the energy inside the patch for all choices of boundary spins. In this case, we know that in the global ground state the spin will take the given value.

Figure 1: If central spin, is forced to be opposite to its optimal orientation while keeping the spins on the boundary of the patch fixed, a cluster of spins around it will also flip. Central spin marked in yellow, flipped cluster marked in red and the boundary in black

To analyse the ability to fix the spin independently of boundary conditions, we again begin with the case to develop a scaling argument that will apply at small . Consider a given configuration of boundary spins, which we write as , where we write this as a vector to emphasise that there are many boundary sites. At , we can minimise the energy inside the patch for this choice of boundary spins, uniquely fixing all spins inside the patch. Suppose that this minimisation gives . Now consider the case in which we force , defining a new configuration of spins inside the patch which minimises the energy subject to the given boundary conditions and given that . Forcing to take the opposite value will flip also a cluster of spins around the central spin, creating a domain wall around that cluster of spins, as shown in Fig. 1. The energy of this domain wall will be proportional to which therefore decreases with increasing .

Figure 2: Number of spins in the cluster averaged over different boundary configurations and Hamiltonians vs. linear dimension of patch . The data is fitted with to a power law (dashed line). Inset show log-log scale

The number of spins in the cluster scales also as a power of , with the power slightly less than Kawashima (2000) ; in that reference, the exponent was found for one specific method of constructing droplets. Our numerical studies, shown in Fig. 2, indicate that the number scales as , with a fractal dimension ; while this dimension might revert to for larger system sizes, we use the fractal dimension extracted at these system sizes to facilitate comparison with our complexity analysis below.

This cluster then defines a larger effective spin. The cost to flip this effective spin relative to the rest of the patch is proportional to . We now consider the case that , and analyse the effect of the non-zero on this effective spin. Given that the magnetic fields acting on the spins in the cluster are chosen randomly, we expect that the cluster will experience an effective magnetic field


Balancing these energy scales, we find that

Figure 3: Defect energy averaged over different boundary configurations and Hamiltonians vs. linear dimension of patch . The data is fitted with to a power law (dashed line). Inset show log-log scale

In Fig. 3, we show our estimate for obtained from the defect energy gained when the central spin is forced in opposite direction to its optimal with a fixed boundary configuration around a patch. While this differs slightly from the numbers quoted above, we remark that many different ways of forcing domain walls in have been considered in the literature, such as flipping a central spin as here or changing global boundary conditions and these may give rise to different values, especially for finite sizes; see Refs. Kawashima and Aoki, 2000; Kawashima, 2000; Hartmann and Moore, 2004 for various possibilities. Thus, we get that


For larger than this number, the coupling of the cluster to the effective field exceeds its coupling to the rest of the patch, so that the value of the central spin can be fixed independently of boundary conditions. Note that this analysis focuses on one possible way to fix in which the central spin can become independent of boundary conditions; others may be possible.

ii.3 Boundary Condition Dependence

The above scaling analysis gives an estimate of the length scale at which we can fix the central spin in a patch. The total number of spins which can be fixed in the system depends on the local fields and patch size and can be estimated from the probability of fixing a single spin. To quantify this probability we define


where is the central spin and where denotes the average over boundary conditions. We term this quantity as it measures the response of the central spin to change in boundary conditions. If this quantity is equal to , then the spin can be fixed independently of boundary conditions as it assumes the same value for all choice of boundary.

Figure 4: Correlation of central spin with the boundary for different fields and patch sizes. . The fit of is to , where and (dashed line)

For this averaged quantity, we find a scaling collapse as shown in Fig. 4. The scaling collapse onto a single curve is implement by defining the scaling variable . This implies a scaling

. This should be compared with the estimate in Eq. (4); the agreement of exponents is reasonable, and if we use instead of our measured the agreement becomes more accurate. We find that the scaling collapse can be approximately fit by the form


To obtain statistical information about whether we can fix a spin independently of boundary, it suffices to determine the behaviour of in the tail, see Fig. 4, where we fit with the constants and . We cannot be completely confident about the tail behaviour of at large from these simulations, but let us use this estimate to try to determine the complexity of a simple solver which tries to solve each spin by taking a sufficiently large patch that . The complexity of the solver will depend upon the scaling of , but we will estimate that it takes a polynomial time (in ) for any non-zero . We will find in the next section that we can improve on this, by using the fact that once a single spin is fixed it simplifies the fixing of other spins.

Since there are only possible boundary conditions, the minimum non-zero value of is of order . Considering the possible choices of central spin, only spins correlate with the boundary if . Equivalently, this holds if ; since , the scaling of the left-hand side of this equation is dominated by the second term. Hence, the equation will hold when


Thus, we expect that for larger than this, it will be possible to fix all spins.

Since each patch can be solved exactly with complexity using a dynamic programming method Bertele and Brioschi (1972), at a fixed the whole system can be solved with complexity


Since and , the exponential term is sub-linear and the total complexity is therefore polynomial; however, it diverges as . The exact estimate may depend sensitively upon the tail of the curve which we cannot determine with full confidence.

It should, however, be emphasised that the data in Fig. 4 arises only from an average over a finite number (in this case, ) of boundary conditions. This finite number was chosen to enable rapid sampling of the curve. To exactly solve a specific sample, we need to consider all possible boundary conditions, as discussed in the next section.

Iii Finding the exact global ground state

Following the argument above, correlations in a typical finite-dimensional lattice decay exponentially if and the ground state for such a system can therefore be found in polynomial time as the optimal orientation of single spins can be determined with high probability by considering only finite regions of the system. Furthermore, even for zero fields we present strong numerical evidence that the typical two-dimensional case can be solved in polynomial time with a more general approach which we describe below.

iii.1 Single spin reduction

Let us consider a spin in the center of a patch in our system. If for all boundary configurations of the patch the optimal orientation of the central spin is the same, then it is independent of the boundary and can thus be fixed to that value. Based on this idea, the simplest way to find the ground state is by determining the optimal orientation of each spin independently by building a patch around it and checking if the optimal orientation of the central spin is independent of the boundary. If this is not the case, we increase the patch size and check again until the spin becomes independent of the boundary. When all spins are fixed the system is solved.

This approach can be further improved by solving the system similar to a crossword puzzle rather than considering each spin independently. If a spin gets fixed, this will reduce the number of possible configurations for patches containing that spin, which in return may allow more spins to get fixed without increasing the patch sizes.

Fixing single spins is a simple algorithm which can be very efficient for systems with large fields. In the limit of very large fields the complexity approaches as each spin becomes independent of its neighbours. However, for small fields the computational effort increases as the correlation length diverges when the field approaches zero requiring patches comparable to the total system size. A more general approach, discussed next, remains effective in that limit.

iii.2 Patch reduction

Figure 5: Illustration of a patch. Central spins are marked in red, boundary spins in green and the rest of the spins in the system are blue.

Instead of only attempting to fix the central spin, correlations between spins inside a patch can be captured by considering all possible configurations of a patch that minimise the energy for a given choice of boundary conditions (see Fig. 5). These configurations are then constrained by requiring consistency between overlapping patches. We find numerically that this approach is significantly more efficient than the single spin algorithm.

The algorithm starts with a small patch size (e.g. a single spin in the center) and sequentially builds patches around each spin. For each boundary configuration of a given patch we store the configuration of the boundary together with the corresponding optimal configuration of the center spins. If the local ground state of a patch turns out to be degenerate for a given boundary condition, we arbitrarily pick any of these configurations if our aim is to obtain just one of the potentially degenerate global ground states. Note that if instead we are interested in finding all ground states, then for each boundary configuration all degenerate interior configurations need to be stored.

The number of potential ground state configurations within a patch (boundary and interior) is then further reduced by removing those configurations which are inconsistent with the constraints imposed by overlapping patches.

After a pass through all spins we increase the patch size and repeat the above steps with larger patches until only a single configuration remains or all remaining configurations have the same energy. As the patch size increases, the set of configurations which satisfy all constraints is strongly reduced and typically scales much better than the exponential worst case.

iii.3 Improved patch reduction

One way to significantly reduce the cost of storing configurations is by removing some spins from the system. If for a pair of neighbouring spins and , their product is constant in all configurations, they can be replaced by a single spin. If only one ground state is targeted, this procedure will finally eliminate all, but one spin. More generally, any arbitrary spin can be removed by replacing it with multi-spin interactions such that for each configuration of the neighbouring spins the local energy is conserved given that the spin to be removed aligns optimally with respect to its neighbours.

iii.4 Empirical scaling

Figure 6: Median wall clock time (in seconds) for different system sizes and various fields
Figure 7: Scaling exponent for the runtime shown in Fig. 6, obtained from fitting the data to a power law, for different system sizes and various fields

As shown in Figs. 6 and 7 the median time to solution appears to scale polynomially in the number of spins for all values of the field , including zero field111The median was calculated from random instances for each system size and field strength. In our implementation we use a library Lind-Nielsen () for binary decision diagrams to store these configurations which we found to be significantly more efficient than using simple boolean tables.. While faster specialised exact solvers are available De Simone et al. (1995), this algorithm is not necessarily intended as a general purpose optimiser, but rather to demonstrate polynomial scaling in the number of spins at all values of for typical low-dimensional spin glass instances.

Iv Hierarchical search

iv.1 Motivation

In this Section we present a general purpose heuristic hierarchical algorithm for finding the ground state of Ising spin glasses based on recursively optimising groups of variables. Before describing the algorithm we motivate why solving groups of variables is significantly more efficient than solving the whole system at once.

The arguably simplest heuristic algorithm for finding the ground state is by generating random spin configurations and recording the energy, in other words random guessing. The probability to find the global ground state of spins this way is trivially per guess, assuming for simplicity a non-degenerate ground state in the discussion here and below. A more sophisticated way to guess the solution is to generate random configurations of only spins and for each configuration find the lowest energy of the remaining spins by some other algorithm, e.g. by enumerating all possible combinations or any other more optimized algorithm. This improves the probability of guessing the correct solution, but as the cost of finding the optimal orientation of the remaining variables may be as much as , we might not have gained much. This idea can, however, be extended to solving multiple groups. Let’s consider two groups with and spins respectively, chosen such that spins in one group do not couple to any of the spins in the other group. For each random guess of the remaining spins, the complexity of finding the optimal configuration of both of them with respect to the rest of the system is , thus reducing the total complexity by an exponential amount from to . In our algorithm, described below, we find a significant reduction in complexity even if spins in subgroups are coupled and overlap with each other.

iv.2 Optimization of groups

The above argument provides a basis for a simple algorithm to find the global ground state by iteratively optimising groups of spins. We start with a random global state, sequentially pick groups with spins each and optimise their configurations by calling some – as yet unspecified – solver as follows

procedure Solve()
     initialise random spin configuration
     for  do
         pick a random spin
         build group of size around spin
          Solve group()
         Update configuration()
     end for
end procedure

Here, Solve group() is a solver that solves the group (taking into account the interaction with spins outside to produce an effective field) and returns an optimized configuration for the spins in that group. The procedure Update configuration() updates the spins inside to configuration ; if the solver Solve group is a heuristic solver, then Update configuration() only makes this change if the energy is lowered. Alternatively one may also consider an algorithm which replaces the group configuration probabilistically with a Metropolis-type criterion or similar.

If we pick trivial groups of size , consisting of a single spin, the group solver just returns the spin direction which minimises its energy with respect to its neighbours. For larger groups – as will usually be the case – we can use any arbitrary exact or heuristic solver, including potentially special purpose classical or quantum hardware. We note in passing that in the case of , if the new configuration is accepted probabilistically depending on its energy this algorithm reduces to simulated annealing.

iv.3 Hierarchical recursive algorithm

If solving a given system in groups is more efficient than solving the whole system at once, performance can be increased even further by solving each group by subdividing it recursively into sub-groups, thus giving a hierarchical version of the algorithm. That is, in the pseudo code written above, we could use the function ), restricted to the spins in a group, as the solver ). The recursion terminates at some (small) group size, which is solved by another algorithm.

Note that the hierarchical scheme randomises the configuration of each group before solving it by optimising subgroups, thus implementing random local restarts without affecting the global spin configuration. This randomisation also implies that it makes no sense to solve a particular group more than once in a row, but rather a new group should be chosen after one group has been optimized. It should be emphasised that random restarting is just one possible way to initialise the state of a group and the one we used here. Other ways are possible and could be more efficient.

The total complexity of the hierarchical algorithm is dominated by the number of calls to the solver for the bottom level group rather than by the group size at each level. This is because for a given group of size , the effort to calculate the local energy and randomise spins is at most for dense graphs, which is typically negligible relative to the effort of finding a lower energy configuration of that group.

iv.4 Selecting groups

Figure 8: Optimal group, marked in red, of spins in complete graph of spins with Gaussian disorder.

Up to now, we ignored the hard problem of how to best pick groups. Here we provide a simple strategy that turned out to work well. Intuitively, in a well chosen group spins are strongly coupled to each other and more weakly coupled to the rest of the system, see Fig. 8. We thus build a group by starting from one spin and greedily adding spins until the group has reached the desired size. We add the spin that maximises , if this maximum is positive and a random neighbour of one of the spins in otherwise.

Other ways of building a group may be more effective. For example, single spins could be added probabilistically, or instead of single spins we could consider sets of spins which can be added to the group. Such improvements will be discussed in follow-up work.

iv.5 Results

To test the performance of our algorithm we compare it to simulated annealing, which is currently one of the most versatile and efficient solver for finding ground states of spin glasses. As mentioned above, simulated annealing is a special case of our algorithm. For our benchmarks we perform hierarchical search with two levels, using simulated annealing to solve groups of size with the optimised configuration being accepted if its energy is lower or the same as the current configuration.

As a measure of complexity we use the median total number of spin updates required to find the ground state with a target probability . Since a heuristic algorithm will find the ground state with some probability we may have to repeat the optimization multiple times if . Assuming independent repetitions, the required number of repetitions is . For each set of parameters the probability was estimated by performing repetitions from random initial states.

32 78 9 1 8
72 80 37 5 24
128 100 60 7 64
200 349 41 4 192
288 408 68 6 400
392 500 105 13 1024
512 642 129 14 2048
Table 1: Optimal parameters for chimera graphs with random bimodal disorder. is the system size, is the number of groups, is the group size, is the number of simulated annealing sweeps per group, is the number of sweeps for plain simulated annealing.
32 28 8 1 4
72 237 9 1 4
128 388 9 1 4
200 197 26 3 1281
288 454 29 3 4800
392 470 27 3 24576
512 718 28 3 131072
Table 2: Optimal parameters for chimera graphs with cluster bimodal disorder. The parameters have the same meaning as in Tab. 1.
16 72 8 1 4
64 314 9 1 48
144 273 33 21 891
256 573 47 43 30189
Table 3: Optimal parameters for two dimensional lattices with Gaussian disorder. The parameters have the same meaning as in Tab. 1.
27 39 10 1 5
64 118 22 4 45
125 232 30 5 512
216 271 58 23 2700
343 562 86 42 13056
512 614 113 101 61440
Table 4: Optimal parameters for three dimensional lattices with Gaussian disorder. The parameters have the same meaning as in Tab. 1.
256 235 56 10 64
400 224 119 23 227
576 384 103 17 768
784 686 208 31 3506
1024 656 151 29 7680
Table 5: Optimal parameters for two dimensional lattices with bimodal disorder. The parameters have the same meaning as in Tab. 1.

For both algorithms and each class and size of problems we optimise the simulation parameters to minimise the median effort in terms of single spin updates. For simulated annealing the total effort for a single repetition is , there is the number of sweeps and is the system size. We used a linear schedule in where the initial and final values of inverse temperature, and respectively, as well as the number of sweeps are chosen to minimise the total effort. We list the parameters used in Tables 15.

For the hierarchical approach a single repetitions requires a total effort , where is the number of groups, is the number of simulated annealing sweeps per group and is the group size. The same annealing schedule is used for each group. The values of , and are chosen to minimise the total effort and are listed in Tables 15

Figure 9: Chimera graph with spins composed of an grid of unit cell. Each unit cells is a complete bipartite graph with spins.

As benchmark problems we used typical spin glass problems on two and three-dimensional lattices: two-dimensional square lattices, three dimensional simple cubic lattices222We used the spin glass server SGS () to computed the exact ground states for three dimensional lattices, and so-called two-dimensional chimera graphs. The unit cell of the chimera graph Denil and Freitas (), shown in Fig. 9, is a complete bipartite graph with eight vertices and is coupled to the neighbouring unit cells with four edges each. Hence, each vertex has either five or six edges corresponding to four edges to spins within the unit cell and one or two edges to neighbouring unit cells depending on if it is on the edges of the graph or in the interior respectively.

One choice of benchmark problems are spin glasses with bimodal disorder i.e., couplings and another choice will be Gaussian disorder with couplings drawn from a normal distribution with zero mean and unit variance. In all benchmarks we choose zero local fields .

A special benchmark problem is chimera graphs with cluster structure, which has recently been proposed as a class of problems to explore an advantage of quantum annealing over simulated annealing Google (). In these problems the spins within each unit cell are coupled ferromagnetically with . Of the four edges connecting neighbouring pairs of unit cells one randomly chosen edge is assigned a random coupling and the rest is set to zero.

Figure 10: Speedup (ratio of the median number of spin updates) of hierarchical search relative to simulated annealing for chimera graphs with random bimodal disorder. The inset shows the total number of spin updates for simulated annealing (SA), parallel tempering (PT) and hierarchical search (HS). For SA, PT and each group of HS, , . Optimal parameters for SA and HS are listed in Tab. 1.
Figure 11: Chimera graphs with cluster bimodal disorder. Speedup and the inset are defined the same as in Fig. 10. For both plain simulated annealing and for each group, , . Optimal parameters for both algorithms are listed in Tab. 2.
Figure 12: Two dimensional square lattices with Gaussian disorder. Speedup and the inset are defined the same as in Fig. 10. As the energy gap between the ground state and first excited state decreases linearly with system size, the final temperature is also reduced with the number of spins. , for plain simulated annealing (SA) and parallel tempering (PT) and for each group of the hierarchical algorithm (HS). Optimal parameters for SA and HS are listed in Tab. 3.
Figure 13: Two dimensional square lattices with bimodal disorder. Speedup and the inset are defined the same as in Fig. 10. For both plain simulated annealing and for each group, , . Optimal parameters for both algorithms are listed in Tab. 5.
Figure 14: Three dimensional cubic lattices with Gaussian disorder. Speedup and the inset are defined the same as in Fig. 10. , for plain simulated annealing and for each group of the hierarchical algorithm. Optimal parameters for both algorithms are listed in Tab. 4.
Figure 15: Speedup (ratio of the median number of spin updates) of hierarchical search (HS) with parallel tempering as a group solver to plain parallel tempering (PT) for two dimensional square lattices with Gaussian disorder. , for plain parallel tempering and for each group of the hierarchical algorithm.

In all benchmarks we find that hierarchical search performs significantly better than simulated annealing. The gain is evidently more significant for problems that are harder for simulated annealing, such as cluster chimera graphs and systems with Gaussian disorder, see Fig. 11, 12 and 14 respectively. Random bimodal disorder is significantly easier for simulated annealing and hence the speedup on those problems is smaller, although still substantial, see Fig. 10 and 13 respectively.

As a further comparison was also made with parallel tempering Swendsen and Wang (1986), another state the art method for finding ground states of Ising spin glasses. For each class and problem size, the total number of replicas and sweeps per replica was optimised minimising the median total number of spin updates. For a single repetition the total effort is , where is the number of replicas, is the number of sweeps per replica and is the system size. On chimera graphs its performance is very similar to simulated annealing, see Fig. 10. On two dimensional lattices with Gaussian disorder it performs slightly better, see Fig. 12. However, analogous to simulated annealing its performance can be significantly improved by optimising groups of spins rather than the whole system at once, see Fig. 15. Note that although in all cases the advantage of hierarchical search over plain simulated annealing and parallel tempering grows with problem size, a spin update is effectively more costly due to the additional overhead of randomising the spins and computing the energy of a group. However, the difference is typically insignificant. For example, the wall clock time per spin update is only about 7% higher than plain simulated annealing for 3D lattices with Gaussian disorder ran with optimal parameters.

V Conclusion

It has long been established that the complexity of finding grounds states of spin glasses is strongly dependent on the ensemble of couplings and is in the worst case NP-hard. However, whilst the the most trivial cases, like the ferromagnetic Ising model, are relatively evident, the hardest problems are far more elusive Katzgraber et al. (2014).

One way to look for hard problems is by sampling randomly distributed couplings. Although this approach certainly includes such problems, in this work we presented strong numerical evidence that the average complexity of low-dimensional spin glasses with randomly distributed couplings is actually polynomial in the number of spins and looking for hard problems in such a large ensemble might be next to futile. Another way to generate hard cases is to map 3-SAT problems at the critical clause to variable ratio Mitchell et al. (1992), where previous studies have shown evidence of a universal peak in complexity, to the Ising model. Further studies are to be done in this direction.

Our most significant result reported here is a hierarchical approach as a way to potentially improve the performance of a given algorithm for finding ground states of Ising spin glasses. With simulated annealing as a reference solver, on all our benchmark instances we find that optimising groups of spins is significantly more efficient than solving the whole system at once.

It should be noted that approaches other than simulated annealing can be used at the bottom level of the hierarchical solver. Suppose for some class of problems, another algorithm (or special purpose classical or quantum device) outperforms simulated annealing. In that case, we can use that algorithm or device at the lowest level. Let denote the time annealing takes to optimise the bottom level. If it is now replaced by a device which takes time, including communication overhead, , we expect the potential speedup to the whole algorithm to be . As the complexity of finding the ground state scales exponentially with the number of spins, this can be significant even for small groups.

Although we limited our investigation to spin glasses, similar ideas can be applied directly to other problems such as machine learning, protein folding, travelling salesman etc. by constraining groups of variables independently relative to the rest of the system. We will address such applications in a follow-up work.

Vi Acknowledgements

This work was supported by Microsoft Research, the Swiss National Competence Center in Research NCCR QSIT and the ERC Advanced Grant SIMCOFE. We thank Bruichladdich Dist. for inspiration and M. Freedman, L. Gamper, I. Hen, J. Imriska, S. Isakov, H.G. Katzgraber, A. Kosenkov, J. Osorio, I. Pizorn, D. Poulin, T. Rønnow, A. Soluyanov, and D. Steiger for fruitful discussions. Simulations were performed on the Monch and Brutus clusters of ETH Zurich.


Comments 0
Request Comment
You are adding the first comment!
How to quickly get a good reply:
  • Give credit where it’s due by listing out the positive aspects of a paper before getting into which changes should be made.
  • Be specific in your critique, and provide supporting evidence with appropriate references to substantiate general statements.
  • Your comment should inspire ideas to flow and help the author improves the paper.

The better we are at sharing our knowledge with each other, the faster we move forward.
The feedback must be of minimum 40 characters and the title a minimum of 5 characters
Add comment
Loading ...
This is a comment super asjknd jkasnjk adsnkj
The feedback must be of minumum 40 characters
The feedback must be of minumum 40 characters

You are asking your first question!
How to quickly get a good answer:
  • Keep your question short and to the point
  • Check for grammar or spelling errors.
  • Phrase it like a question
Test description