A deceptive step towards quantum speedup detection
Abstract
There have been multiple attempts to design synthetic benchmark problems with the goal of detecting quantum speedup in current quantum annealing machines. To date, classical heuristics have consistently outperformed quantum-annealing-based approaches. Here we introduce a class of problems based on frustrated cluster loops, called deceptive cluster loops, for which all currently known state-of-the-art classical heuristics are outperformed by the DW2000Q quantum annealing machine. While there is a sizable constant speedup over all known classical heuristics, a noticeable improvement in the scaling remains elusive. These results represent the first steps towards a detection of potential quantum speedup, albeit without a scaling improvement and for synthetic benchmark problems.
pacs: 75.50.Lk, 75.40.Mg, 05.50.+q, 03.67.Lx
I Introduction
Quantum annealing (QA) Kadowaki and Nishimori (1998); Farhi et al. (2001); Finnila et al. (1994); Martoňák et al. (2002); Santoro et al. (2002); Das and Chakrabarti (2008) has been proposed as a potentially efficient heuristic to optimize hard constraint satisfaction problems. In principle, the approach can overcome tall energy barriers commonly found in this class of optimization problems by exploiting quantum effects, thereby potentially outperforming commonly used heuristics that use thermal kicks to overcome the barriers. However, despite a significant effort by the scientific community towards an optimization technique that, in principle, relies on quantum effects, it is still unclear whether quantum speedup is actually achievable using analog transverse-field quantum annealing approaches.
There have been multiple attempts to define quantum speedup Rønnow et al. (2014a); Mandrà et al. (2016), as well as to quantify any “quantumness” and problem-solving efficacy of current commercially available quantum annealers Johnson et al. (2011); Dickson et al. (2013); Boixo et al. (2014); Katzgraber et al. (2014); Rønnow et al. (2014a); Katzgraber et al. (2015); Heim et al. (2015); Hen et al. (2015); Albash et al. (2015); Martin-Mayor and Hen (2015); Marshall et al. (2016); Denchev et al. (2016); King et al. (2017); Albash and Lidar (2017). However, to date, any convincing detection of an improved scaling of quantum annealing with a transverse field over state-of-the-art classical optimization algorithms remains elusive. The increase in performance of quantum annealing machines in the last few years has resulted in an “arms race” with classical optimization algorithms implemented on CMOS hardware. The goalpost for detecting quantum speedup keeps moving, which has resulted in a renaissance in classical algorithm design to optimize hard constraint-satisfaction problems.
A key ingredient in the detection of quantum speedup is the selection of the optimization problems to be used as benchmarks. Ideally, one would want a real-world industrial application where the time to solution of the quantum device scales better than any known algorithm with the size of the input. However, such application problems are not suitable for present-day quantum annealers, either because they require more variables than currently available or because precision requirements cannot be met by current technologies. Random spin-glass problems have been shown to be too easy to detect any scaling improvements Katzgraber et al. (2014, 2015). As such, efforts have shifted to carefully designed synthetic problems. While some studies focus on post-selection techniques Katzgraber et al. (2015), others focus on the use of planted solutions Hen et al. (2015); King et al. (2017), or the use of gadgets Albash and Lidar (2017). Unfortunately, in all planted problems Hen et al. (2015); King et al. (2017) used to date, as well as in problems that use gadgets Denchev et al. (2016), the underlying logical structure is easily decoded and the underlying problem trivially solved, sometimes even with exact polynomial methods Mandrà et al. (2016). Therefore, in the quest for quantum speedup, an important step is to design problems where no variable reduction or algorithmic trick can be exploited to reduce the complexity of the problem. Ideally, the benchmark problem should be hard for a small number of variables and “break” all known optimization heuristics.
In this work we introduce a class of benchmark problems designed for DW2000Q quantum annealers whose logical structure is not directly recognizable and whose typical computational complexity can be tuned via a control parameter that sets the relative strength of inter- vs intra-cell couplers in the DW2000Q Chimera Bunyk et al. (2014) topology. Note that this approach can be easily generalized to other topologies. We demonstrate that for a particular setting of the control parameter where the ground state of the virtual problem cannot be decoded, the D-Wave Systems Inc. DW2000Q quantum annealer outperforms all known classical optimization algorithms by approximately two to three orders of magnitude. More precisely, we compare against the two best heuristics to solve Ising-like problems on the DW2000Q Chimera topology, the Hamze-de Freitas-Selby (HFS) Hamze and de Freitas (2004); Selby (2014) and parallel tempering Monte Carlo with isoenergetic cluster moves (PT+ICM) Zhu et al. (2015) heuristics. Although we were not able to identify the optimal annealing time given the hard lower limit on the annealing time of the DW2000Q device, the scaling is comparable and the speedup persists for increasing system sizes. Therefore, we present the first steps towards the detection of potential quantum speedup, for now, however, without a noticeable scaling improvement. Problems with tunable complexity such as the ones shown here, combined with a careful statistical analysis, bulletproof definitions of quantum speedup, the inclusion of power consumption in the analysis, as well as the use of the currently best available heuristics are key in the assessment of the performance of quantum-enhanced optimization techniques.
II Technical Details
The DW2000Q quantum annealer is designed to optimize classical problem Hamiltonians of the quadratic form
(1) $H_P = \sum_{(i,j)\in\mathcal{G}} J_{ij}\,\sigma^z_i \sigma^z_j + \sum_{i\in\mathcal{G}} h_i\,\sigma^z_i,$
where $\mathcal{G}$ is known as the Chimera graph Bunyk et al. (2014), constructed of a two-dimensional lattice of fully connected cells. The couplers $J_{ij}$ and biases $h_i$ are programmable parameters that define the optimization problem to be studied. Although the DW2000Q Chimera architecture graph has been kept fixed since the first commercial generation of the machine, the number of qubits has doubled almost every two years. At the moment, the latest DW2000Q chip counts working flux qubits and working couplers. To minimize the cost function $H_P$, the DW2000Q quantum chip anneals quantum fluctuations driven by a transverse field of the form
(2) $H_D = -\sum_{i\in\mathcal{G}} \sigma^x_i.$
More precisely, the annealing protocol starts with the system initialized to a quantum paramagnetic state. Then, the amplitude of the transverse-field term $H_D$ is slowly reduced while the amplitude of the problem Hamiltonian $H_P$ is gradually increased. If the annealing is slow enough, the adiabatic theorem Morita and Nishimori (2008) ensures that the quantum system remains in its instantaneous lowest energy state for the entire annealing protocol. Therefore, (close-to) optimal configurations for $H_P$ can be retrieved by measuring the state of the qubits along the $z$ basis at the end of the anneal.
Given its intrinsic analog nature, combined with the heuristic properties of quantum annealing, the DW2000Q device is only able to find the optimum of a cost function up to a probability $p$. Indeed, fast annealing in proximity of level crossings Kadowaki and Nishimori (1998); Farhi et al. (2001); Santoro and Tosatti (2006), as well as quantum dephasing effects Amin et al. (2009); Dickson et al. (2013); Albash and Lidar (2015), thermal excitations Wang et al. (2016); Nishimura et al. (2016); Marshall et al. (2017) and programming errors Mandrà et al. (2015); Katzgraber et al. (2015), can lead to higher energy states of $H_P$ at the end of the anneal. A commonly accepted metric is the time-to-solution (TTS). The TTS is defined as the time needed for a heuristic, either classical or quantum, to find the lowest energy state with 99% success probability, that is:
(3) $\mathrm{TTS} = T_{\rm run}\,R = T_{\rm run}\,\frac{\log(1-0.99)}{\log(1-p)},$
where $T_{\rm run}$ is either the running time (for a classical heuristic) or the annealing time (for the DW2000Q quantum chip) and $R$ is the number of repetitions needed to reach the desired success probability Rønnow et al. (2014b). In this work we analyze the TTS as a function of the number of input variables in the problem.
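The TTS metric described above can be illustrated with a minimal Python sketch. It assumes the common 99% target probability and clamps the number of repetitions to one when a single run already succeeds with at least the target probability; both the function names and the clamping are illustrative choices, not taken from the original analysis code.

```python
import math


def repetitions(p_success: float, target: float = 0.99) -> float:
    """Runs needed to see the ground state at least once with probability `target`.

    Assumption: `target` = 0.99 is the conventional choice; the value is a parameter here.
    """
    if not 0.0 < p_success <= 1.0:
        raise ValueError("p_success must lie in (0, 1]")
    if p_success >= target:
        # A single run already reaches the target probability (illustrative clamp).
        return 1.0
    return math.log(1.0 - target) / math.log(1.0 - p_success)


def time_to_solution(t_run: float, p_success: float, target: float = 0.99) -> float:
    """TTS = (time of a single run) x (number of repetitions)."""
    return t_run * repetitions(p_success, target)


if __name__ == "__main__":
    # Hypothetical numbers: a run of 5 time units that succeeds half of the time.
    print(time_to_solution(5.0, 0.5))
```

The same function applies to a classical heuristic (with `t_run` the wall-clock time of one run) and to the annealer (with `t_run` the annealing time), which is what makes the metric convenient for a side-by-side comparison.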
III Synthetic Benchmark Problems
In this Section we outline and discuss a new synthetic benchmark we call “deceptive cluster loop” (DCL) problems based on traditional frustrated cluster loop problems. However, DCL problems have a tunable parameter that for particular values hides the underlying logical structure of the planted problem, thus “deceptive.”
III.1 Traditional frustrated cluster loop problems
Based on the fact that it is typically hard for agnostic optimization algorithms to find the lowest energy state of very long frustrated chains, the frustrated cluster loop à la Hen (HFCL) is a random model that has been shown to be hard for many classical heuristics Hen et al. (2015). The idea is simple: Given $\mathcal{G}$, an arbitrary connectivity graph for the problem Hamiltonian $H_P$, and two parameters $\alpha$ and $R$, HFCL instances are constructed as follows:

(i) Generate $M = \alpha N$ loops on the graph $\mathcal{G}$, where $N$ is the number of nodes in $\mathcal{G}$. Loops are constructed by placing random walkers on random nodes (tails are eliminated once random walkers cross their own path).

(ii) For each loop $\ell$, assign $J^{\ell}_{ij} = -1$ to all the corresponding couplings but one randomly chosen one, for which the value $J^{\ell}_{ij} = +1$ is assigned instead.

(iii) The final Hamiltonian is then constructed by adding up all the loop couplings, i.e.,
(4) $H_P = \sum_{\ell=1}^{M} \sum_{(i,j)\in\ell} J^{\ell}_{ij}\,\sigma^z_i \sigma^z_j.$
The instance is discarded if there is a coupling such that
(5) $|J_{ij}| = \left|\sum_{\ell} J^{\ell}_{ij}\right| > R.$
The parameters $\alpha$ and $R$ correspond to the density of “constraints” and to the “ruggedness” of the HFCL problem, respectively.
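The loop construction above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the graph is an open-boundary square lattice (not the Chimera graph), walkers never step straight back to the node they came from, a coupling of -1 is taken to be ferromagnetic, and the names `alpha`, `R`, `lattice_neighbors`, `random_loop`, and `fcl_instance` are hypothetical.

```python
import random
from collections import defaultdict


def lattice_neighbors(L):
    """Neighbor lists of an L x L square lattice with open boundaries."""
    nbrs = {}
    for x in range(L):
        for y in range(L):
            cand = [(x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)]
            nbrs[(x, y)] = [(a, b) for a, b in cand if 0 <= a < L and 0 <= b < L]
    return nbrs


def random_loop(nbrs, nodes, rng):
    """Random walk from a random node until it crosses its own path; the tail is dropped."""
    start = rng.choice(nodes)
    path, index = [start], {start: 0}
    while True:
        prev = path[-2] if len(path) > 1 else None
        # Assumption: no immediate backtracking, so every loop has at least four nodes.
        nxt = rng.choice([v for v in nbrs[path[-1]] if v != prev])
        if nxt in index:
            return path[index[nxt]:]
        index[nxt] = len(path)
        path.append(nxt)


def fcl_instance(L, alpha, R, seed=None, max_tries=1000):
    """Sum of frustrated loops; the instance is rejected if any |J_ij| exceeds R."""
    rng = random.Random(seed)
    nbrs = lattice_neighbors(L)
    nodes = sorted(nbrs)
    n_loops = round(alpha * L * L)
    for _ in range(max_tries):
        J = defaultdict(int)
        for _ in range(n_loops):
            loop = random_loop(nbrs, nodes, rng)
            edges = [tuple(sorted((loop[k], loop[(k + 1) % len(loop)])))
                     for k in range(len(loop))]
            flipped = rng.randrange(len(edges))  # the single frustrating coupling
            for k, edge in enumerate(edges):
                J[edge] += 1 if k == flipped else -1
        if all(abs(v) <= R for v in J.values()):
            return {e: v for e, v in J.items() if v != 0}
    raise RuntimeError("no instance satisfied the ruggedness bound")
```

Because every loop is satisfied by the all-up (or all-down) configuration except for its single flipped coupling, the ferromagnetic state is a planted ground state of the summed Hamiltonian in this sketch.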
Although the HFCL problems can be, in principle, directly generated for the Chimera graph Hen et al. (2015), in a recent paper King et al. (2017), King et al. have chosen a different approach (called here KFCL) that can be divided into two steps:

(i) All couplings inside a unit cell of the Chimera structure are set to be ferromagnetic, i.e., $J_{ij} = -1$, $\forall\, i, j$ in the same unit cell. Because the unit cells are fully connected, all the physical qubits within a single cell are forced to behave as a single virtual qubit. This process generates a two-dimensional lattice with open boundary conditions of these virtual variables. Here, is the number of cells on the Chimera graph with physical variables (qubits) and virtual variables.

(ii) The embedded instances are then generated on the virtual lattice with a given $\alpha$ and $R$.
These KFCL problems (see note (35)) have considerably fewer (virtual) variables than other benchmarks, but have proven to be computationally difficult for many heuristics, in particular the HFS and PT+ICM solvers King et al. (2017). We emphasize, however, that the virtual problem is planar and can therefore be solved in polynomial time using minimum-weight-perfect-matching techniques Kolmogorov (2009); Mandrà and Katzgraber (2017). As such, any speedup claims based on these problems have to be taken with a grain of salt.
III.2 Deceptive cluster loop benchmark problems
Inspired by the KFCL benchmark problems, we have developed a new class of problems we call deceptive cluster loops. Although the ground state of the problem cannot be planted and therefore has to be computed with other efficient heuristics, we show that while the DW2000Q device maintains its performance for this class of problems, all other known heuristics struggle with solving these instances. In addition, the virtual problem cannot be easily decoded, i.e., the problems cannot be solved in polynomial time or with other clever approximations that exploit the logical structure Mandrà et al. (2016).
The structure of the DCL problems can be summarized as follows: Starting from an embedded KFCL instance, all the inter-cell couplers of a cell are multiplied by a scaling factor $\lambda$, whereas all intra-cell couplers have magnitude $1$.
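As a rough illustration of this rescaling, the following Python sketch builds the physical couplers of a DCL-like instance from a set of virtual (cell-to-cell) couplings. Several details are assumptions made only for the example: qubits are labeled `(row, col, side, k)` with `side` 0 coupling vertically and `side` 1 horizontally, every cell is a complete bipartite block of 4 + 4 qubits, intra-cell couplers are set to -1, and each of the four physical inter-cell couplers simply carries the virtual coupling multiplied by `lam`. How the virtual coupling is distributed over the physical couplers in the actual instances is not specified here.

```python
from itertools import product


def dcl_couplers(virtual_J, L, lam, shore=4):
    """Physical couplers of a DCL-like instance on an L x L lattice of unit cells.

    virtual_J maps pairs of nearest-neighbor cells ((r, c), (r2, c2)) to a virtual coupling.
    """
    J = {}
    # Intra-cell couplers: ferromagnetic with magnitude 1 (assumed sign convention).
    for r, c in product(range(L), repeat=2):
        for a, b in product(range(shore), repeat=2):
            J[((r, c, 0, a), (r, c, 1, b))] = -1.0
    # Inter-cell couplers: virtual coupling rescaled by the factor lam.
    for (cell_a, cell_b), value in virtual_J.items():
        (r1, c1), (r2, c2) = sorted((cell_a, cell_b))
        if (r2 - r1, c2 - c1) == (1, 0):
            side = 0  # vertical neighbor
        elif (r2 - r1, c2 - c1) == (0, 1):
            side = 1  # horizontal neighbor
        else:
            raise ValueError("virtual couplings must join nearest-neighbor cells")
        for k in range(shore):
            J[((r1, c1, side, k), (r2, c2, side, k))] = lam * value
    return J
```

With `lam = 1` the sketch returns the embedded instance unchanged, and increasing `lam` strengthens the chains that run across cells relative to the intra-cell blocks.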
One of the main features of the proposed DCL problems that distinguishes them from other FCL-like models King et al. (2017); Albash and Lidar (2017) is the presence of two distinct limits for small and large values of the scaling factor $\lambda$. For small $\lambda$, i.e., in the limit of weak inter-cell couplings, each unit cell is strongly connected and therefore behaves like a single virtual variable. In particular, when $\lambda = 1$, the DCL problems are equivalent to KFCL problems. The corresponding Ising model has a two-dimensional planar square lattice as the underlying graph and therefore can be solved in polynomial time Mandrà and Katzgraber (2017). On the other hand, in the limit of large $\lambda$, i.e., in the limit of strong inter-cell couplings, either horizontal or vertical chains that go across different unit cells become strongly coupled. By observing that there always exists a gauge transformation for Chimera graphs such that all the inter-cell couplings can be fixed to be ferromagnetic, it is straightforward to see that the corresponding virtual model for large $\lambda$ is the virtual fully connected bipartite model Venturelli et al. (2015). For intermediate values of $\lambda$, the DCL problems become a nontrivial combination of the two limits and therefore, optimal states cannot be mapped onto either virtual model. The effect becomes most pronounced when $\lambda$ for the inter-cell couplers is comparable to the connectivity of the intra-cell variables, i.e., – for the current D-Wave Chimera architecture, where the local intra-cell environment felt by a variable in the cell competes with the strength of the inter-cell couplers.
From a physical point of view, the DCL problems have another important property which makes them interesting in their own right: By continuously changing the scaling parameter $\lambda$, it is possible to modify the critical spin-glass temperature from () Katzgraber et al. (2014), to () Venturelli et al. (2015). Therefore, it would be interesting to understand the nature of the spin-glass phase for intermediate $\lambda$ where the system is neither planar nor fully connected Mandrà and Katzgraber (2018).
IV Results
In this Section, we compare the DW2000Q quantum chip against two of the fastest classical heuristics for Chimera Hamiltonians, namely the Hamze-de Freitas-Selby (HFS) heuristic and the parallel tempering isoenergetic cluster method (PT+ICM). Both HFS and PT+ICM have been modified to correctly compute the TTS as described in Eq. (3). Moreover, PT+ICM has been further optimized to exploit the knowledge of the virtual ground states in both limits of small () and large () scaling (referred to as PT+ICM+L). In particular, the TTS is computed by running PT+ICM from either an initial random state or from one of the two virtual ground states and then taking the minimum value. For each linear size $L$, we generated DCL instances with parameters and (instances at different values of the scaling factor $\lambda$ have been obtained by properly rescaling the inter-cell couplings). In all plots, points represent the median of the distribution while the error bars correspond to the – percentiles. If not otherwise indicated, the DW2000Q annealing time has been fixed to the minimum allowed, namely . Simulation parameters for the classical heuristics are listed in the Appendix (see also note (40)).
Figure (1) summarizes our results where DW2000Q is compared to both HFS and PT+ICM. Interestingly, excluding the region of small scaling factor $\lambda$ where PT+ICM+L is designed to be the fastest, DW2000Q always performs better than the two classical heuristics for the considered values of $\lambda$, being approximately times faster for . To better appreciate the different computational scalings among the classical and quantum heuristics we analyzed, Fig. (2), top panels, reports the scaling exponent of an exponential fit of the form:
(6) 
In the plots, boxes represent the confidence interval – for computed using only the percentile of TTS while whisker bars represent the confidence interval – for computed using the – percentile of the TTS. As one can see, while HFS is statistically indistinguishable from the DW2000Q data, PT+ICM performs slightly better for large values of the scaling factor $\lambda$. However, the better performance of PT+ICM for large $\lambda$ can be explained by noticing that PT+ICM has been optimized for each $\lambda$, while both DW2000Q and HFS use the same setup regardless of the value of $\lambda$. Figure (2), bottom panels, shows that DW2000Q is consistently faster than both HFS and PT+ICM by, on average, a factor of . Unfortunately, we cannot “certify” the DW2000Q computational scaling because we are not able to find the optimal annealing time for the allowed minimum annealing time in the device. Still, as shown in Fig. (3), we have strong indication that the computational scaling we have found is reliable because of its stability for a large variation of annealing times (see note (41)).
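To make the scaling analysis concrete, the following Python sketch fits an exponential of the assumed form log10(TTS) = a + b L by ordinary least squares. The base-10 form and the names `a` and `b` are assumptions for illustration; the sketch only shows how a scaling exponent can be extracted from median TTS values at several linear sizes.

```python
import math


def fit_exponential(sizes, tts):
    """Least-squares fit of log10(TTS) = a + b * L; returns (a, b)."""
    if len(sizes) != len(tts) or len(sizes) < 2:
        raise ValueError("need at least two (size, TTS) pairs")
    ys = [math.log10(t) for t in tts]
    n = len(sizes)
    mean_x = sum(sizes) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in sizes)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(sizes, ys))
    b = sxy / sxx          # scaling exponent (slope in log space)
    a = mean_y - b * mean_x  # constant prefactor (offset in log space)
    return a, b


if __name__ == "__main__":
    # Hypothetical data following TTS = 10**(1 + 0.25 * L) exactly.
    sizes = [8, 10, 12, 14, 16]
    tts = [10 ** (1 + 0.25 * L) for L in sizes]
    print(fit_exponential(sizes, tts))
```

In this picture, two heuristics with statistically indistinguishable slopes `b` but different offsets `a` differ only by a constant speedup, which is the situation described in the text.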
To further analyze the effects of the rescaling factor $\lambda$, in Fig. (4) we show the performance of DW2000Q compared to PT+ICM, PT+ICM+L, and HFS at fixed linear size of the system . As expected, PT+ICM+L performs the best for both small and large $\lambda$, while its performance quickly degrades for , i.e., in the region where the true ground state is a nontrivial overlap of the virtual ground states at either small or large $\lambda$ [see Fig. (6) for the number of “broken” virtual variables for either the virtual planar model or the virtual fully connected bipartite model]. In contrast, the DW2000Q performance gradually decreases with increasing scaling factor (precision issues Venturelli et al. (2015); Katzgraber et al. (2015) may be one of the dominant factors in the loss of performance).
As a final remark, it is important to stress that the advantage is not limited to the typical instance. Indeed, as shown in Fig. (5), DW2000Q performs best even in an instance-by-instance comparison against HFS and PT+ICM.
V Inclusion of Power Consumption
While most benchmark studies have focused on pure computational speed, the inclusion of power consumption has been largely neglected in the literature pow (a). With ever-growing data centers, power consumption has become an important issue and “greener” computational solutions are highly sought after.
A large data center like the one hosted at NASA Ames top (2016); (44) has a typical energy consumption of approximately MW, with a ratio between power usage and cooling. With more than cores for the current NASA high-performance computing cluster, the typical energy consumption is approximately W/core. In contrast, the energy consumption of the DW2000Q quantum processing unit is approximately pW. Keeping the quantum processing unit cooled to mK requires approximately – kW. In our analysis, we found that the DW2000Q device was approximately times faster than the PT+ICM and HFS heuristics used. Therefore, to compete against DW2000Q, – compute cores are needed running in parallel with a total energy consumption between and kW. Hence, power consumption is, overall, comparable. However, there is a remarkable difference: The data center uses of the total consumed energy to run the computers, while the DW2000Q device requires only of the power to run the quantum processing unit. Therefore, while an improvement of the power usage effectiveness (PUE) Brady et al. (2013); (46) for the classical data center would eventually reduce the total cooling power of , far more efficient cooling alternatives are needed to reduce the quantum PUE (qPUE). It is unclear how the qPUE can be reduced due to the cryogenic requirements for quantum processing units. However, dry dilution refrigerators with more efficient pumping systems might improve this metric pow (b).
VI Conclusions
In conclusion, we present the first class of tunable benchmark problems, Deceptive Cluster Loops (DCL), for which the D-Wave quantum chip (DW2000Q) shows an advantage over the currently best classical heuristics, namely the parallel tempering isoenergetic cluster method (PT+ICM) and the Hamze-de Freitas-Selby (HFS) algorithm. The benchmarks are characterized by a control parameter $\lambda$, the scaling factor of the inter-cell couplings, that allows one to continuously transform the model from a virtual planar model () to a virtual fully connected bipartite problem (). While classical heuristics are faster in the small and large $\lambda$ limits where the logical structure can be exploited, DW2000Q is the fastest in the crossover region , where the DCL problems are neither virtual planar nor virtual fully connected bipartite. Indeed, while the computational scaling is comparable among classical and quantum heuristics, the DW2000Q device is approximately two orders of magnitude faster than the currently best known heuristics (PT+ICM and HFS). This result represents the first of its kind since the inception of the D-Wave quantum chip.
VII Acknowledgments
H. G. K. acknowledges support from the National Science Foundation (Grant No. DMR-1151387) and thanks M. Thom for multiple discussions on power consumption of the DW2000Q devices. He also thanks N. Artner for support. S. M. acknowledges E. G. Rieffel for the careful reading of the manuscript and useful discussion, and the NASA Ames Research Center for support and computational resources. This research is based upon work supported by the Office of the Director of National Intelligence (ODNI), Intelligence Advanced Research Projects Activity (IARPA), via Interagency Umbrella Agreement IA1-1198. The views and conclusions contained herein are those of the authors and should not be interpreted as necessarily representing the official policies or endorsements, either expressed or implied, of the ODNI, IARPA, or the U.S. Government. The U.S. Government is authorized to reproduce and distribute reprints for Governmental purposes notwithstanding any copyright annotation thereon.
Appendix A Number of Broken Virtual Variables in the DCL Model
Our numerical simulations for suggest that the DCL model reduces to the virtual planar model for , while it is a virtual fully connected bipartite model for [see Fig. (4)]. Figure (6) shows the number of “broken” virtual variables for either the virtual planar model or the virtual fully connected bipartite model. The interesting regime is obtained when , i.e., when the DCL model is neither a virtual planar model nor a virtual fully connected bipartite model. In this regime, the ground state of the DCL model cannot be found by solving a corresponding virtual problem and therefore, logical structures cannot be exploited as in Refs. King et al. (2017); Mandrà and Katzgraber (2017).
Appendix B Simulation parameters
In this Section, we briefly report the main parameters we used
for our experiments and numerical simulations.
DCL Random Instances: We randomly generate instances for each system size (). The instances are generated by following the prescription in Ref. King et al. (2017) (with , ) and then properly rescaling the inter-cell couplings by a factor $\lambda$. For consistency, the same instances have been used for all values of $\lambda$. Unlike in Ref. King et al. (2017), we used all available qubits, i.e., some of the unit cells are not complete.
DW2000Q Parameters: For all experiments, we use the minimum allowed annealing time of and gauges runs, i.e., total readouts. The initialization time and the readout time have not been included in the calculation of the TTS.
PT+ICM Parameters: The lowest and highest temperature for parallel tempering have been chosen to be and , respectively, to maximize the performance of PT+ICM. Optimal sweeps for each instance and each value of $\lambda$ have been determined by computing the cumulative distribution of the probability to find the ground state ( runs for each instance and $\lambda$). The overall optimal number of sweeps is then obtained by bootstrapping the optimal number of sweeps for each instance. For all simulations, the number of sweeps has been optimized to minimize the TTS of the percentile. The initialization time and the readout time have not been included in the calculation of the TTS.
PT+ICM+L Parameters: The parameters used are the same as for PT+ICM. The computational time to find the ground state of either the virtual planar model or the virtual fully connected bipartite model has been set to zero (in reality, the computational time to find the ground state of the fully connected bipartite model is non-negligible).
HFS Parameters: The option S13, namely “Exhaust maximal tree width 1 subgraphs” with partial random state initialization, has been used. Optimal sweeps for each instance and each value of $\lambda$ have been determined by computing the cumulative distribution of the probability to find the ground state ( runs for each instance and $\lambda$). The overall optimal number of sweeps is then obtained by bootstrapping the optimal number of sweeps for each instance. For all the simulations, the number of sweeps has been optimized to minimize the TTS of the percentile. The initialization time and the readout time have not been included in the calculation of the TTS.
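The sweep optimization described for PT+ICM and HFS can be sketched as follows. This is an illustrative reading of the procedure, not the authors' code: for a single instance, the empirical cumulative distribution of the number of sweeps at which the ground state is first found gives a success probability p(s) for each sweep budget s, and the budget minimizing s times the number of repetitions is selected. The 99% target and the function name `optimal_sweeps` are assumptions.

```python
import math


def optimal_sweeps(first_hit_sweeps, n_runs, target=0.99):
    """Return (sweeps, tts_in_sweeps) minimizing s * R(p(s)) for one instance.

    first_hit_sweeps: sweep counts at which successful runs first hit the ground state.
    n_runs: total number of runs, including those that never found the ground state.
    """
    hits = sorted(first_hit_sweeps)
    best = None
    for k, s in enumerate(hits, start=1):
        p = k / n_runs  # empirical cumulative probability of success within s sweeps
        reps = 1.0 if p >= target else math.log(1.0 - target) / math.log(1.0 - p)
        tts = s * reps  # TTS measured in sweeps
        if best is None or tts < best[1]:
            best = (s, tts)
    return best  # None if no run found the ground state


if __name__ == "__main__":
    # Hypothetical first-hit data from four runs of one instance.
    print(optimal_sweeps([10, 20, 50, 1000], n_runs=4))
```

Repeating this for every instance and then resampling the per-instance optima would give the overall optimal number of sweeps mentioned in the text.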
Exponential Fits: For all the linear regressions, only the last system sizes () have been used. Fits for the percentile have been obtained using a linear least squares model. Fits for the confidence have been computed by randomly extracting values in the confidence interval, one for each size, and then bootstrapping the linear regression data. Figure (7) shows the fits for different values of $\lambda$.
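A minimal Python sketch of such a bootstrapped regression is given below. It assumes that one value of log10(TTS) is drawn uniformly between a lower and an upper percentile for each system size, and that the spread of the fitted slopes is then used as the confidence interval of the scaling exponent; the uniform sampling in log space and all names are assumptions made for illustration.

```python
import math
import random


def slope(xs, ys):
    """Least-squares slope of ys versus xs."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx


def bootstrap_slopes(sizes, tts_low, tts_high, n_boot=1000, seed=0):
    """Sorted slopes of log10(TTS) vs size, resampling TTS inside [tts_low, tts_high]."""
    rng = random.Random(seed)
    slopes = []
    for _ in range(n_boot):
        # One random value per size, drawn uniformly in log space (assumption).
        ys = [rng.uniform(math.log10(lo), math.log10(hi))
              for lo, hi in zip(tts_low, tts_high)]
        slopes.append(slope(sizes, ys))
    slopes.sort()
    return slopes


if __name__ == "__main__":
    sizes = [8, 10, 12, 14, 16]
    mid = [10 ** (1 + 0.25 * L) for L in sizes]  # hypothetical median TTS
    s = bootstrap_slopes(sizes, [m / 2 for m in mid], [m * 2 for m in mid])
    print(s[len(s) // 20], s[-len(s) // 20])  # rough 5% and 95% quantiles
```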
References
 Kadowaki and Nishimori (1998) T. Kadowaki and H. Nishimori, Quantum annealing in the transverse Ising model, Phys. Rev. E 58, 5355 (1998).
 Farhi et al. (2001) E. Farhi, J. Goldstone, S. Gutmann, J. Lapan, A. Lundgren, and D. Preda, A quantum adiabatic evolution algorithm applied to random instances of an NP-complete problem, Science 292, 472 (2001).
 Finnila et al. (1994) A. B. Finnila, M. A. Gomez, C. Sebenik, C. Stenson, and J. D. Doll, Quantum annealing: A new method for minimizing multidimensional functions, Chem. Phys. Lett. 219, 343 (1994).
 Martoňák et al. (2002) R. Martoňák, G. E. Santoro, and E. Tosatti, Quantum annealing by the path-integral Monte Carlo method: The two-dimensional random Ising model, Phys. Rev. B 66, 094203 (2002).
 Santoro et al. (2002) G. E. Santoro, R. Martoňák, E. Tosatti, and R. Car, Theory of quantum annealing of an Ising spin glass, Science 295, 2427 (2002).
 Das and Chakrabarti (2008) A. Das and B. K. Chakrabarti, Quantum Annealing and Analog Quantum Computation, Rev. Mod. Phys. 80, 1061 (2008).
 Rønnow et al. (2014a) T. F. Rønnow, Z. Wang, J. Job, S. Boixo, S. V. Isakov, D. Wecker, J. M. Martinis, D. A. Lidar, and M. Troyer, Defining and detecting quantum speedup, Science 345, 420 (2014a).
 Mandrà et al. (2016) S. Mandrà, Z. Zhu, W. Wang, A. Perdomo-Ortiz, and H. G. Katzgraber, Strengths and weaknesses of weak-strong cluster problems: A detailed overview of state-of-the-art classical heuristics versus quantum approaches, Phys. Rev. A 94, 022337 (2016).
 Johnson et al. (2011) M. W. Johnson, M. H. S. Amin, S. Gildert, T. Lanting, F. Hamze, N. Dickson, R. Harris, A. J. Berkley, J. Johansson, P. Bunyk, et al., Quantum annealing with manufactured spins, Nature 473, 194 (2011).
 Dickson et al. (2013) N. G. Dickson, M. W. Johnson, M. H. Amin, R. Harris, F. Altomare, A. J. Berkley, P. Bunyk, J. Cai, E. M. Chapple, P. Chavez, et al., Thermally assisted quantum annealing of a 16-qubit problem, Nat. Commun. 4, 1903 (2013).
 Boixo et al. (2014) S. Boixo, T. F. Rønnow, S. V. Isakov, Z. Wang, D. Wecker, D. A. Lidar, J. M. Martinis, and M. Troyer, Evidence for quantum annealing with more than one hundred qubits, Nat. Phys. 10, 218 (2014).
 Katzgraber et al. (2014) H. G. Katzgraber, F. Hamze, and R. S. Andrist, Glassy Chimeras Could Be Blind to Quantum Speedup: Designing Better Benchmarks for Quantum Annealing Machines, Phys. Rev. X 4, 021008 (2014).
 Katzgraber et al. (2015) H. G. Katzgraber, F. Hamze, Z. Zhu, A. J. Ochoa, and H. Munoz-Bauza, Seeking Quantum Speedup Through Spin Glasses: The Good, the Bad, and the Ugly, Phys. Rev. X 5, 031026 (2015).
 Heim et al. (2015) B. Heim, T. F. Rønnow, S. V. Isakov, and M. Troyer, Quantum versus classical annealing of Ising spin glasses, Science 348, 215 (2015).
 Hen et al. (2015) I. Hen, J. Job, T. Albash, T. F. Rønnow, M. Troyer, and D. A. Lidar, Probing for quantum speedup in spin-glass problems with planted solutions, Phys. Rev. A 92, 042325 (2015).
 Albash et al. (2015) T. Albash, T. F. Rønnow, M. Troyer, and D. A. Lidar, Reexamining classical and quantum models for the D-Wave One processor, Eur. Phys. J. Spec. Top. 224, 111 (2015).
 Martin-Mayor and Hen (2015) V. Martin-Mayor and I. Hen, Unraveling Quantum Annealers using Classical Hardness (2015), (arXiv:1502.02494).
 Marshall et al. (2016) J. Marshall, V. Martin-Mayor, and I. Hen, Practical engineering of hard spin-glass instances, Phys. Rev. A 94, 012320 (2016).
 Denchev et al. (2016) V. S. Denchev, S. Boixo, S. V. Isakov, N. Ding, R. Babbush, V. Smelyanskiy, J. Martinis, and H. Neven, What is the Computational Value of Finite Range Tunneling?, Phys. Rev. X 6, 031015 (2016).
 King et al. (2017) J. King, S. Yarkoni, J. Raymond, I. Ozfidan, A. D. King, M. M. Nevisi, J. P. Hilton, and C. C. McGeoch, Quantum Annealing amid Local Ruggedness and Global Frustration (2017), (arXiv:1701.04579).
 Albash and Lidar (2017) T. Albash and D. A. Lidar, Evidence for a Limited Quantum Speedup on a Quantum Annealer (2017), (arXiv:1705.07452).
 Bunyk et al. (2014) P. Bunyk, E. Hoskinson, M. W. Johnson, E. Tolkacheva, F. Altomare, A. J. Berkley, R. Harris, J. P. Hilton, T. Lanting, and J. Whittaker, Architectural Considerations in the Design of a Superconducting Quantum Annealing Processor, IEEE Trans. Appl. Supercond. 24, 1 (2014).
 Hamze and de Freitas (2004) F. Hamze and N. de Freitas, in Proceedings of the 20th Conference on Uncertainty in Artificial Intelligence (AUAI Press, Arlington, Virginia, United States, 2004), UAI ’04, p. 243, ISBN 0974903906.
 Selby (2014) A. Selby, Efficient subgraph-based sampling of Ising-type models with frustration (2014), (arXiv:1409.3934).
 Zhu et al. (2015) Z. Zhu, A. J. Ochoa, and H. G. Katzgraber, Efficient Cluster Algorithm for Spin Glasses in Any Space Dimension, Phys. Rev. Lett. 115, 077201 (2015).
 Morita and Nishimori (2008) S. Morita and H. Nishimori, Mathematical Foundation of Quantum Annealing, J. Math. Phys. 49, 125210 (2008).
 Santoro and Tosatti (2006) G. E. Santoro and E. Tosatti, TOPICAL REVIEW: Optimization using quantum mechanics: quantum annealing through adiabatic evolution, J. Phys. A 39, R393 (2006).
 Amin et al. (2009) M. H. S. Amin, D. V. Averin, and J. A. Nesteroff, Decoherence in adiabatic quantum computation, Phys. Rev. A 79, 022107 (2009).
 Albash and Lidar (2015) T. Albash and D. A. Lidar, Decoherence in adiabatic quantum computation, Phys. Rev. A 91, 062320 (2015).
 Wang et al. (2016) W. Wang, J. Machta, and H. G. Katzgraber, Bond chaos in spin glasses revealed through thermal boundary conditions (2016), (arXiv:1603.00543).
 Nishimura et al. (2016) K. Nishimura, H. Nishimori, A. J. Ochoa, and H. G. Katzgraber, Retrieving the ground state of spin glasses using thermal noise: Performance of quantum annealing at finite temperatures, Phys. Rev. E 94, 032105 (2016).
 Marshall et al. (2017) J. Marshall, E. G. Rieffel, and I. Hen, Thermalization, freeze-out and noise: deciphering experimental quantum annealers (2017), (arXiv:1703.03902).
 Mandrà et al. (2015) S. Mandrà, G. G. Guerreschi, and A. Aspuru-Guzik, Adiabatic quantum optimization in the presence of discrete noise: Reducing the problem dimensionality, Phys. Rev. A 92, 062320 (2015).
 Rønnow et al. (2014b) T. F. Rønnow, Z. Wang, J. Job, S. Boixo, S. V. Isakov, D. Wecker, J. M. Martinis, D. A. Lidar, and M. Troyer, Defining and detecting quantum speedup (2014b), (arXiv:1401.2910).
 (35) It has been tempting to refer to the approach developed in Ref. King et al. (2017) as “frustrated cluster loops Hen à la King,” but that would be a mouthful every time it appears in the text.
 Kolmogorov (2009) V. Kolmogorov, Blossom V: A new implementation of a minimum cost perfect matching algorithm, Math. Prog. Comp. 1, 43 (2009).
 Mandrà and Katzgraber (2017) S. Mandrà and H. G. Katzgraber, The pitfalls of planar spin-glass benchmarks: Raising the bar for quantum annealers (again), Quantum Sci. Technol. 2, 038501 (2017).
 Venturelli et al. (2015) D. Venturelli, S. Mandrà, S. Knysh, B. O’Gorman, R. Biswas, and V. Smelyanskiy, Quantum Optimization of Fully Connected Spin Glasses, Phys. Rev. X 5, 031040 (2015).
 Mandrà and Katzgraber (2018) S. Mandrà and H. G. Katzgraber, In preparation (2018).
 (40) In our study we do not include simulated quantum annealing (SQA) simulations because these have been reported elsewhere Albash and Lidar (2017).
 (41) Experiments for have been obtained using the DW2000Q quantum chip recently installed at NASA Ames.
 pow (a) To our knowledge, the only discussion of power consumption between quantum and classical hardware can be found in a whitepaper published on D-Wave Systems Inc.’s website [https://goo.gl/v72k2f].
 top (2016) TOP500 List, June 2016, URL https://www.top500.org/list/2016/06/.
 (44) Private communication.
 Brady et al. (2013) G. A. Brady, N. Kapur, J. L. Summers, and H. M. Thompson, A case study and critical assessment in calculating power usage effectiveness for a data centre, Energy conversion and management 76, 155 (2013).
 (46) The Green Grid, URL http://www.thegreengrid.org.
 pow (b) We note that a comprehensive analysis of the power consumption requires the inclusion of embedding, programming, and readout times, as well as considerations on dissipated energy, for both classical and quantum machines.