
# Tunneling and speedup in quantum optimization for permutation-symmetric problems

## Abstract

Tunneling is often claimed to be the key mechanism underlying possible speedups in quantum optimization via quantum annealing (QA), especially for problems featuring a cost function with tall and thin barriers. We present and analyze several counterexamples from the class of perturbed Hamming-weight optimization problems with qubit permutation symmetry. We first show that, for these problems, the adiabatic dynamics that make tunneling possible should be understood not in terms of the cost function but rather the semi-classical potential arising from the spin-coherent path integral formalism. We then provide an example where the shape of the barrier in the final cost function is short and wide, which might suggest no quantum advantage for QA, yet where tunneling renders QA superior to simulated annealing in the adiabatic regime. However, the adiabatic dynamics turn out not to be optimal. Instead, an evolution involving a sequence of diabatic transitions through many avoided level-crossings, involving no tunneling, is optimal and outperforms adiabatic QA. We show that this phenomenon of speedup by diabatic transitions is not unique to this example, and we provide an example where it provides an exponential speedup over adiabatic QA. In yet another twist, we show that a classical algorithm, spin vector dynamics, is at least as efficient as diabatic QA. Finally, in a different example with a convex cost function, the diabatic transitions result in a speedup relative to both adiabatic QA with tunneling and classical spin vector dynamics.

## I Introduction

The possibility of a quantum speedup for finding the solution of classical optimization problems is tantalizing, as a quantum advantage for this class of problems would provide a wealth of new applications for quantum computing. The goal of many optimization problems can be formulated as finding an $n$-bit string $x_{\mathrm{opt}}$ that minimizes a given cost function $f(x)$, which can be interpreted as the energy of a classical Ising spin system whose ground state is $x_{\mathrm{opt}}$. Finding the ground state of such systems can be hard if, e.g., the system is strongly frustrated, resulting in a complex energy landscape that cannot be efficiently explored with any known algorithm due to the presence of many local minima Nishimori (2001). This difficulty arises, e.g., in classical simulated annealing (SA) Kirkpatrick et al. (1983), when the system’s state is trapped in a local minimum.

Thermal hopping and quantum tunneling provide two starkly different mechanisms for solving optimization problems, and finding optimization problems that favor the latter continues to be an open theoretical question S. Suzuki and A. Das (guest eds.) (2015); Heim et al. (2015). It is often stated that quantum annealing (QA) Ray et al. (1989); Finnila et al. (1994); Kadowaki and Nishimori (1998); Farhi et al. (2001); Das and Chakrabarti (2008) uses tunneling instead of thermal excitations to escape from local minima, which can be advantageous in systems with tall but thin barriers that are easier to tunnel through than to thermally climb over Heim et al. (2015); Das and Chakrabarti (2008); Suzuki et al. (2013). It is with this potential tunneling-induced advantage over classical annealing that QA and the quantum adiabatic algorithm Farhi et al. (2000) were proposed. Our goal in this work is to address the question of the role played by tunneling in providing a quantum speedup, and to elucidate it by studying a number of illustrative examples. We shall demonstrate that the role of tunneling is significantly more subtle than what might be expected on the basis of the “tall and thin barrier” picture.

In order to make progress on this question, the potential with respect to which tunneling occurs must be clearly specified. Tunneling is defined with respect to a semi-classical potential which delineates classically allowed and forbidden regions. In QA, one typically initializes the system in the known ground state of a simple Hamiltonian and evolves the system towards a Hamiltonian representing the final cost function. We shall argue that when one takes a natural semi-classical limit, the semi-classical potential does not become the final cost-function. Instead one obtains a potential appearing in the action of the spin-coherent path-integral representation of the quantum dynamics. This potential, which here we call the spin-coherent potential, has been used profitably before Farhi et al. (2002a, b); Schaller and Schützhold (2010); Boixo et al. (2014). We provide comprehensive evidence that multi-spin tunneling can be understood with respect to this spin-coherent potential.

We analyze the spin-coherent potential for several examples from a well-known class of problems known as perturbed Hamming weight oracle (PHWO) problems. These are problems for which instances can be generated where QA either has an advantage over classical random search algorithms with local updates, such as SA Farhi et al. (2002a); Reichardt (2004), or has no advantage van Dam et al. (2001); Reichardt (2004). Moreover, because PHWO problems exhibit qubit permutation symmetry, their quantum evolutions are easily classically simulatable, and furthermore, their spin-coherent potential is one-dimensional. Tunneling becomes clear and explicit for these problems when using the spin-coherent potential.

We focus on a particular PHWO problem that has a plateau in the final cost function (henceforth, “the Fixed Plateau”). This problem offers a counterexample to two commonly held views: (1) QA has an advantage, due to tunneling, over SA only on problems where the barrier in the final cost function is tall and thin; (2) tunneling is necessary for a quantum speedup in QA. We refute the first statement by showing that for the Fixed Plateau, whose cost function is short and wide, QA significantly outperforms SA by using tunneling. Indeed, we find numerically that adiabatic QA (AQA) needs a time of $O(n^{0.5})$ to find the ground state, where $n$ is the number of spins or qubits. Moreover, using the spin-coherent potential, we observe the presence of tunneling during the quantum anneal. On the other hand, we prove that single-spin update SA takes a time of $O(n^{w})$, where $w$ is the width of the plateau. Thus, QA enjoys a polynomial tunneling speedup over SA, of a degree that can be made essentially arbitrarily large, on a cost function that is not tall and thin. We remark that the result about SA’s performance is also a rigorous proof of a result due to Reichardt Reichardt (2004), namely that classical local search algorithms will fail on a certain class of PHWO problems, and is of independent interest.

We refute the second statement by showing that, for the Fixed Plateau, it is actually optimal to run QA diabatically (henceforth, DQA for diabatic quantum annealing). The system leaves the ground state, only to return through a sequence of diabatic transitions associated with avoided level-crossings. In this regime, the runtime for QA is $O(1)$. Moreover, in this regime, we do not observe any of the standard signatures of tunneling. We show that this feature (that the optimal evolution time for QA is far from being adiabatic) is present in a few other PHWO problems and that this optimal evolution involves no multi-qubit tunneling.

Given that the optimal evolution involves no tunneling, we are inspired to investigate a classical algorithm, spin vector dynamics (SVD), which can be interpreted as a semi-classical limit of the quantum evolution with a product-state approximation. We observe that SVD evolves in an almost identical manner to DQA, and is able to recover the speedup seen by DQA. Thus, in these problems, we show that what may be suspected to be a highly quantum-coherent process—diabatic transitions—can be mimicked by a quantum-inspired classical algorithm.

The structure of this paper is as follows. In Sec. II, we list the PHWO problems we study. In Sec. III, we use these problems to present evidence that tunneling can be understood with respect to the spin-coherent potential. In Sec. IV, we focus on the Fixed Plateau PHWO problem, and exhaustively analyze the performance of various algorithms for this problem. In particular we numerically characterize AQA (Sec. IV.1), provide a rigorous proof of SA’s performance (Sec. IV.2), and numerically analyze DQA (Sec. IV.3), SVD (Sec. IV.4), and a quantum Monte Carlo algorithm (Sec. IV.5). We conclude in Sec. V by discussing the implications of our work and possible directions for future work. Additional background information and technical details can be found in the Appendix.

## II Perturbed Hamming weight optimization problems and the examples studied

The cost function of a PHWO problem is defined as,

$$
f(x) = \begin{cases} |x| + p(|x|), & l < |x| < u, \\ |x|, & \text{elsewhere}, \end{cases} \tag{1}
$$

where $|x|$ denotes the Hamming weight of the bit string $x$. For SA, this is the cost function. For QA, this will be the final Hamiltonian. More precisely, we define QA as the closed-system quantum evolution governed by the time-dependent Hamiltonian,

$$
H(s) = \frac{1}{2}(1-s) \sum_i \left( \mathbb{1} - \sigma^x_i \right) + s \sum_x f(x)\, |x\rangle\langle x| \ , \tag{2}
$$

where we have chosen the standard transverse field “driver” Hamiltonian that assumes no prior knowledge of the form of $f(x)$, and a linear interpolating schedule, with $s = t/t_f \in [0,1]$ being the dimensionless time parameter. The initial state is the ground state of $H(0)$.

Below, we list several PHWO examples that we study in greater detail. We refer to the case with $p = 0$ as the Plain Hamming Weight problem.
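Because Eq. (2) is invariant under qubit permutations, its low-energy spectrum can be explored numerically in the $(n+1)$-dimensional symmetric subspace labelled by Hamming weight. The following is a minimal sketch under that assumption, not the implementation used for the results reported here; the function names, the Sturm-sequence bisection solver, and the example parameters are illustrative choices.

```python
import math

def symmetric_hamiltonian(n, s, f):
    """Tridiagonal form of H(s) from Eq. (2), restricted to the
    permutation-symmetric subspace with basis labelled by w = 0..n."""
    diag = [0.5 * (1 - s) * n + s * f(w) for w in range(n + 1)]
    # <w+1| sum_i sigma^x_i |w> = sqrt((w+1)(n-w)) between symmetric states
    off = [-0.5 * (1 - s) * math.sqrt((w + 1) * (n - w)) for w in range(n)]
    return diag, off

def count_below(diag, off, x):
    """Sturm count: number of eigenvalues of the tridiagonal matrix below x."""
    count, q = 0, 1.0
    for i, d in enumerate(diag):
        q = d - x - (off[i - 1] ** 2 / q if i > 0 else 0.0)
        if q == 0.0:
            q = 1e-12  # avoid division by zero in the next step
        if q < 0.0:
            count += 1
    return count

def eigenvalue(diag, off, k):
    """k-th smallest eigenvalue (k = 0 is the ground state) by bisection."""
    radius = 2 * max((abs(e) for e in off), default=0.0)
    lo, hi = min(diag) - radius, max(diag) + radius  # Gershgorin bounds
    for _ in range(100):
        mid = 0.5 * (lo + hi)
        if count_below(diag, off, mid) > k:
            hi = mid
        else:
            lo = mid
    return 0.5 * (lo + hi)

def gap(n, s, f):
    """Gap between the two lowest symmetric-subspace levels of H(s)."""
    diag, off = symmetric_hamiltonian(n, s, f)
    return eigenvalue(diag, off, 1) - eigenvalue(diag, off, 0)

def plateau_cost(l, u):
    """Illustrative plateau-shaped cost as a function of Hamming weight."""
    return lambda w: (u - 1) if l < w < u else w
```

For the Plain Hamming Weight problem the qubits decouple, so `gap(n, s, lambda w: w)` should reproduce the single-qubit gap $\sqrt{(1-s)^2 + s^2}$ for any `n`, which gives a quick sanity check. A minimum gap can then be estimated by scanning `s` on a grid, e.g. `min(gap(32, k / 200, plateau_cost(0, 6)) for k in range(1, 200))`, where the values `32`, `0`, and `6` are arbitrary.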

1. Fixed Plateau:

$$
f(x) = \begin{cases} u - 1, & l < |x| < u, \\ |x|, & \text{otherwise}. \end{cases} \tag{3}
$$

Clearly, this forms a plateau in Hamming weight space. We take $l, u = O(1)$. Since the location of the plateau does not change with $n$, we refer to it as “fixed.” An instance of this cost function is illustrated in Fig. 1. By numerical diagonalization we find that QA has a constant gap for this cost function.

2. Reichardt:

$$
f(x) = \begin{cases} |x| + h(n), & l(n) < |x| < u(n), \\ |x|, & \text{otherwise}, \end{cases} \tag{4}
$$

with $h\,(u-l)/\sqrt{l} = o(1)$. For this case, Reichardt Reichardt (2004) proved a constant lower bound on the minimum spectral gap during the quantum anneal. In Appendix A we provide a pedagogical review of this proof and fill in some details not explicitly provided in the original proof.

3. Moving Plateau:

$$
f(x) = \begin{cases} u(n) - 1, & l(n) < |x| < u(n), \\ |x|, & \text{otherwise}, \end{cases} \tag{5}
$$

with , and . This is termed “moving” since the location of the plateau changes with . Note that this is a special case from the Reichardt class.

4. Grover:

$$
f(x) = \begin{cases} n, & |x| \geq 1, \\ 0, & |x| = 0. \end{cases} \tag{6}
$$

This is a minor modification of the standard Grover problem: the marked state is the all-zeros string with energy $0$, and the energy of all the other states is $n$. Scaling the energy by $n$ keeps the maximum energy of all the PHWO problems we consider comparable.

5. Spike:

$$
f(x) = \begin{cases} n, & |x| = n/4, \\ |x|, & \text{otherwise}. \end{cases} \tag{7}
$$

This was studied by Farhi et al. in Farhi et al. (2002a), where it was argued that the quantum minimum gap scales as $O(n^{-1/2})$ and that SA will take exponential time to find the ground state. However, we show below (Fig. 8) that SVD is more efficient than QA for this problem.

6. Precipice:

$$
f(x) = \begin{cases} -1, & |x| = n, \\ |x|, & \text{otherwise}. \end{cases} \tag{8}
$$

This was studied by van Dam et al. in van Dam et al. (2001), where it was proved that the quantum minimum gap for this problem is exponentially small in $n$.

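7. $\alpha$-Rectangle:

$$
f(x) = \begin{cases} |x| + n^{\alpha}, & \frac{n}{4} - \frac{1}{2} c\, n^{\alpha} < |x| < \frac{n}{4} + \frac{1}{2} c\, n^{\alpha}, \\ |x|, & \text{otherwise}. \end{cases} \tag{9}
$$

We call this an $\alpha$-Rectangle because the width of the perturbation ($c\, n^{\alpha}$) is $c$ times the height. This was studied in Brady and Dam (2015), where evidence was presented for the following conjecture on the scaling of the quantum minimum gap:

$$
g_{\min} = \begin{cases} \text{constant}, & \alpha < \frac{1}{4}, \\ 1/\mathrm{poly}(n), & \frac{1}{4} < \alpha < \frac{1}{3}, \\ 1/\exp(n), & \alpha > \frac{1}{3}. \end{cases} \tag{10}
$$

Note that the case $\alpha < \frac{1}{4}$ is a member of the Reichardt class, since then $h\,(u-l)/\sqrt{l} \sim n^{2\alpha - 1/2} = o(1)$, and thus the constant lower bound on the minimum gap is a theorem, and not a conjecture. We restrict ourselves to a single fixed value of $c$.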
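Since every cost function in this list depends on the bit string only through its Hamming weight, each can be written as a short function of an integer weight `w`. The sketch below transcribes Eqs. (1), (3), and (6) to (9) under that convention; the function names and argument order are illustrative assumptions rather than notation from the text.

```python
def hamming_weight(bits):
    """Number of ones in a bit string given as a sequence of 0/1 values."""
    return sum(bits)

def phwo_cost(w, p, l, u):
    """General PHWO cost, Eq. (1): perturbation p applied for l < w < u."""
    return w + p(w) if l < w < u else w

def fixed_plateau(w, l, u):
    """Eq. (3): constant value u - 1 on the plateau l < w < u."""
    return u - 1 if l < w < u else w

def grover(w, n):
    """Eq. (6): only the all-zeros string has zero energy."""
    return 0 if w == 0 else n

def spike(w, n):
    """Eq. (7): a single spike of height n at w = n/4."""
    return n if 4 * w == n else w

def precipice(w, n):
    """Eq. (8): the all-ones string is the unique global minimum."""
    return -1 if w == n else w

def alpha_rectangle(w, n, alpha, c):
    """Eq. (9): rectangle of height n**alpha and width c * n**alpha at n/4."""
    half_width = 0.5 * c * n ** alpha
    inside = n / 4 - half_width < w < n / 4 + half_width
    return w + n ** alpha if inside else w
```

As a consistency check, the Fixed Plateau is the PHWO problem with perturbation $p(|x|) = u - 1 - |x|$, so `phwo_cost(w, lambda k: u - 1 - k, l, u)` agrees with `fixed_plateau(w, l, u)` for every weight.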

We remark that all the problems listed above are representative members of a large family of problems: if the input bit string to any of the above problems is transformed by an XOR mask, then all of our analysis below will hold. For QA, the mask can be represented as a unitary transformation: $U_a = \bigotimes_{i=1}^{n} (\sigma^x_i)^{a_i}$, with $a \in \{0,1\}^n$ being the mask string. As this unitary commutes with the QA Hamiltonian at all times, none of our subsequent analysis is affected. Similar arguments go through for SA and all the other algorithms that we consider.

We note that PHWO problems are strictly toy problems since these problems are typically represented by highly non-local Hamiltonians (see Appendix B) and thus are not physically implementable, in the same sense that the adiabatic Grover search problem is unphysical Roland and Cerf (2002); Rezakhani et al. (2010). Nevertheless, these problems provide us with important insights into the mechanisms behind a quantum speed-up, or lack thereof.

## III The semi-classical potential and tunneling

In order to study tunneling, we need a potential arising from a semi-classical limit, which defines classically allowed and forbidden regions. One approach to writing a semi-classical potential for quantum Hamiltonians is to use the spin-coherent path-integral formalism Klauder (1979). This semi-classical potential has been used profitably in various QA studies, e.g., Refs. Farhi et al. (2002a); Schaller and Schützhold (2010); Farhi et al. (2002b); Boixo et al. (2014), and we extend its applications here. For the quantum evolution, since the initial state [the ground state of $H(0)$] is symmetric under permutations of qubits and the unitary dynamics preserves this symmetry (it is a symmetry of $H(s)$ for all $s$), we can consistently restrict ourselves to spin-1/2 symmetric coherent states $|\theta,\phi\rangle$:

$$
|\theta,\phi\rangle = \bigotimes_{i=1}^{n} \left[ \cos\!\left(\frac{\theta}{2}\right) |0\rangle_i + e^{i\phi} \sin\!\left(\frac{\theta}{2}\right) |1\rangle_i \right] . \tag{11}
$$

The spin-coherent potential is then given by:

$$
V_{\mathrm{SC}}(\theta,\phi,s) = \langle \theta,\phi | H(s) | \theta,\phi \rangle \ . \tag{12}
$$
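For a cost function that depends only on Hamming weight, the expectation value in Eq. (12) can be evaluated in closed form: each qubit contributes $\langle\sigma^x\rangle = \sin\theta\cos\phi$ to the driver term, and the Hamming weight is binomially distributed with single-qubit probability $\sin^2(\theta/2)$. The following sketch evaluates the potential under exactly these assumptions; the function name and signature are illustrative and not taken from the text.

```python
import math

def spin_coherent_potential(theta, phi, s, n, f):
    """V_SC(theta, phi, s) of Eq. (12) for H(s) in Eq. (2), with a cost
    function f that depends only on the Hamming weight k = 0..n."""
    p_one = math.sin(theta / 2) ** 2  # probability that a qubit reads 1
    # Driver term: (1/2) sum_i (1 - <sigma^x_i>) with <sigma^x> = sin(theta) cos(phi)
    driver = 0.5 * n * (1 - math.sin(theta) * math.cos(phi))
    # Cost term: average of f over a Binomial(n, p_one) Hamming weight
    cost = sum(
        f(k) * math.comb(n, k) * p_one ** k * (1 - p_one) ** (n - k)
        for k in range(n + 1)
    )
    return (1 - s) * driver + s * cost
```

For the Plain Hamming Weight problem the cost term reduces to $n\sin^2(\theta/2)$, which provides a simple check. Scanning `theta` on a grid at `phi = 0` for a fixed `s` is one way to look for the double-well structure described next.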

We show that for all the examples defined above except the Reichardt class (we address this below), this potential captures important features of the quantum Hamiltonian [Eq. (2)] and reveals the presence of tunneling. Specifically:

1. The spin-coherent potential displays a degenerate double-well almost exactly at the point of the minimum gap. In Fig. 2 we plot, for the Fixed Plateau, the potential near the minimum gap. The potential transitions from having a single minimum on the right to a single minimum on the left. In between, the two minima become degenerate and the potential forms a double well. Since the instantaneous ground state corresponds to the position of the global minimum, which exhibits a discontinuity, the degeneracy point is where tunneling should be most helpful. In Fig. 3, we show that the location of the minimum gap of the quantum evolution is very close to the location of the degenerate double-well in the spin-coherent potential.

2. The ground state predicted by the spin-coherent potential is a good approximation to the quantum ground state except near the degeneracy point. As expected from a potential that arises in a semi-classical limit, the ground state predicted by the spin-coherent potential (i.e., the spin-coherent state corresponding to the instantaneous global minimum in $V_{\mathrm{SC}}$) agrees well with the quantum ground state, except where tunneling is important. In particular, delocalization when the spin-coherent potential is a degenerate double-well (or is close to being one) should imply that approximating the ground state with a wavefunction localized in one of the wells fails. Indeed, we find this to be the case. We illustrate this for the Fixed Plateau in Fig. 2; similar results hold for the other examples we have considered.

3. There is a sharp change in the ground state of the adiabatic quantum evolution at the degeneracy point. Tunneling should be accompanied by a sharp change in the properties of the ground state at the degeneracy point as the state shifts from being localized in one well to the other. We quantify this change by calculating the expectation value of the Hamming weight operator, defined as $\mathrm{HW} \equiv \frac{1}{2}\sum_{i=1}^{n} (\mathbb{1} - \sigma^z_i)$. We expect a discontinuity in the spin-coherent ground state expectation value $\langle \mathrm{HW} \rangle_{\mathrm{SC}}$, because the spin-coherent ground state changes discontinuously at the degeneracy point. We find that there is a nearly identical change in the quantum ground state expectation value $\langle \mathrm{HW} \rangle_{\mathrm{GS}}$, for all of the examples listed above. This is illustrated explicitly for the Fixed Plateau in Fig. 2. In Fig. 3, we show that there is close and increasing agreement (as a function of $n$) between the position of the sudden drop in $\langle \mathrm{HW} \rangle_{\mathrm{GS}}$ and the position of the degeneracy point, for all of the problems considered.

4. The scaling of the barrier height in the spin-coherent potential is positively correlated with the scaling of the minimum gap of the quantum Hamiltonian. In Fig. 4, we see that as the barrier height increases, the inverse of the quantum minimum gap also increases.

Note that the Reichardt class is absent from the discussion above. The reason is that for these problems, the barrier in the spin-coherent potential is very small, which makes its numerical detection difficult. Fortunately, we can make some analytical claims about this class of problems. By adapting Reichardt’s proof (reviewed in Appendix A) that these problems have a constant minimum gap, we are able to prove that the barrier height in the spin-coherent potential for this class vanishes as $n \to \infty$. Therefore, for these easy-for-AQA problems, there is a vanishing barrier in the spin-coherent potential. More precisely, we can show, for any perturbed Hamming weight problem,

$$
V^{\mathrm{pert}}_{\mathrm{SC}} - V^{\mathrm{unpert}}_{\mathrm{SC}} = s \sum_{l < k < u} p(k) \binom{n}{k} \sin^{2k}\!\left(\frac{\theta}{2}\right) \cos^{2(n-k)}\!\left(\frac{\theta}{2}\right) = O\!\left( h\, \frac{u-l}{\sqrt{l}} \right) , \tag{13}
$$

where the unperturbed case refers to $h = 0$ in Eq. (4). Recall that $h\,(u-l)/\sqrt{l} = o(1)$ for the Reichardt class. Thus asymptotically, the spin-coherent potential for this class approaches the spin-coherent potential of the unperturbed Hamming weight problem. It is easy to check that the latter has a single minimum throughout the evolution, and hence no barriers.

Taken together, these observations indicate that the spin-coherent potential (not the cost function alone) is the appropriate potential with respect to which tunneling is to be understood for these problems.

## IV Fixed Plateau: Performance of algorithms

Having motivated the spin-coherent potential for understanding tunneling, we now exhaustively analyze the Fixed Plateau. We choose this problem because it forces us to confront some intuitions about the performance of certain algorithms. Considering the final cost function, the Fixed Plateau has neither local minima nor a barrier going from large to small $|x|$: it just has a long, flat section before the ground state at $|x| = 0$. This might suggest that it is easy for an algorithm such as SA, and is not a candidate for a quantum speedup. Moreover, given the absence of a barrier, one might suspect that the quantum evolution would not even involve multi-qubit tunneling.

We dispel both of these intuitions and summarize our findings first. In the previous section, we already provided evidence that tunneling is unambiguously present for this problem. The spin-coherent potential involves energy barriers, despite their absence in the final cost function, and the adiabatic quantum evolution is forced to tunnel in order to follow the ground state. By a simulation of the Schrödinger equation, we find that AQA needs a time of $O(n^{0.5})$ in order to reach a given success probability (see Sec. IV.1). Therefore, the adiabatic algorithm, via tunneling, is able to solve this problem efficiently.

Turning to SA, an algorithm which performs a local stochastic search on the final cost function, we prove that simulated annealing with single-spin updates will take time $O(n^{w})$, where $w$ is the width of the plateau, to find the ground state (see Sec. IV.2). This result is due to the fact that a random walker on the plateau has no preferred direction and becomes trapped there. More precisely, the probability of a leftward transition while on the plateau is proportional to the probability of flipping one of a constant number of bits (given by the Hamming weight) out of $n$, which scales as $O(1/n)$ if $l, u = O(1)$. And since the walker needs to make as many consecutive leftward transitions as the width of the plateau in order to fall off the plateau, the time taken for this to happen scales as $O(n^{w})$. Consequently, we obtain a polynomial speedup of AQA over SA that can be made as large as desired. Therefore, using the Fixed Plateau, we are able to demonstrate that a quantum speedup over SA is possible via tunneling in the adiabatic regime.

However, is the adiabatic evolution optimal? In order to find the optimal evolution time, we employ the optimal time to solution (TTS), a metric that is commonly used in benchmarking studies Rønnow et al. (2014) (also see Appendix C). It is defined as the minimum total time such that the ground state is observed at least once with desired probability $p_d$:

$$
\mathrm{TTS}_{\mathrm{opt}} = \min_{t_f > 0} \left\{ t_f\, \frac{\ln(1 - p_d)}{\ln\left(1 - p_{\mathrm{GS}}(t_f)\right)} \right\} , \tag{14}
$$

where $t_f$ is the duration (in QA) or the number of single-spin updates (in SA) of a single run of the algorithm, and $p_{\mathrm{GS}}(t_f)$ is the probability of finding the ground state in a single such run. The use of TTS allows for the possibility that multiple short runs of the evolution, each lasting an optimal annealing time $t_f^{\mathrm{opt}}$, result in a better scaling than a single long (adiabatic) run with an unoptimized $t_f$. The quantum evolution that gives the optimal annealing time relative to this cost function is actually DQA, with an asymptotic scaling of $O(1)$. Importantly, this diabatic evolution does not contain any of the signatures of tunneling discussed in the previous section. Therefore, for the Fixed Plateau, tunneling does not give rise to the optimal quantum performance.
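The TTS metric of Eq. (14) is straightforward to evaluate once the single-run success probability is known as a function of the run time. The sketch below implements the formula literally and minimizes it over a user-supplied grid of run times; the function names, the grid-based minimization, and the toy success probabilities in the usage note are illustrative assumptions.

```python
import math

def time_to_solution(t_f, p_gs, p_d):
    """Total time t_f * ln(1 - p_d) / ln(1 - p_gs) from Eq. (14) for a
    single run of length t_f that succeeds with probability p_gs."""
    if not (0.0 < p_gs < 1.0 and 0.0 < p_d < 1.0):
        raise ValueError("probabilities must lie strictly between 0 and 1")
    return t_f * math.log(1 - p_d) / math.log(1 - p_gs)

def optimal_tts(run_times, success_prob, p_d):
    """Minimize Eq. (14) over a finite grid of run times.

    success_prob is a callable mapping t_f to p_GS(t_f).
    Returns the pair (TTS_opt, optimal t_f)."""
    return min(
        (time_to_solution(t, success_prob(t), p_d), t) for t in run_times
    )
```

With made-up success probabilities `{1: 0.1, 10: 0.9, 100: 0.99}` and `p_d = 0.99`, calling `optimal_tts([1, 10, 100], probs.get, 0.99)` selects the intermediate run time: two runs of length 10 beat both many very short runs and one long run, which is the trade-off the metric is designed to capture.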

Motivated by the fact that the optimal quantum evolution involves no multi-qubit tunneling, we consider spin-vector dynamics Smolin and Smith (2014) (see also Refs. Albash et al. (2015); Owerre and Paranjape (2015)), a model that evolves according to the spin-coherent potential in Eq. (12). SVD can be derived as the saddle-point approximation to the path integral formulation of QA in the spin-coherent basis Owerre and Paranjape (2015). The SVD equations are equivalent to the Ehrenfest equations for the magnetization under the assumption that the density matrix is a product state, i.e., $\rho = \bigotimes_{i=1}^{n} \rho_i$, where $\rho_i$ denotes the state of the $i$th qubit. This algorithm is useful since it is derived under the assumption of continuity of the angles $(\theta, \phi)$, so tunneling, which here would amount to a discrete jump in the angles, is absent.

We also consider a quantum Monte Carlo based algorithm, often called simulated quantum annealing (SQA) Martoňák et al. (2002); Santoro et al. (2002). We show that SQA has a scaling that is better than SA’s. Indeed, this is consistent with the fact that SQA thermalizes not just relative to the final cost function, but also during the evolution.

We provide further details of our implementations of each of these algorithms in Appendix D. We now turn to each of the algorithms individually and detail their performance for the Fixed Plateau problem.

In order to study the scaling of adiabatic dynamics, we consider the minimum time $\tau_{\mathrm{Th}}$ required to reach the ground state with some probability $p_{\mathrm{Th}}$, where we choose $p_{\mathrm{Th}}$ to ensure that we are exploring a regime close to adiabaticity for QA. We call this benchmark metric the “threshold criterion.” As seen in Fig. 5, we observe a scaling for AQA that is approximately $O(n^{0.5})$. As is to be expected given that the tunneling for the Fixed Plateau problem is controlled by the width of the plateau, which is constant (does not scale with $n$), we find that $\tau_{\mathrm{Th}}$ scales in the same way for the Fixed Plateau and the Plain Hamming Weight problems (see Appendix A). This suggests that the dominant contribution to the scaling at large $n$ is not the time associated with tunneling but rather the time associated with the Plain Hamming Weight problem.

As also seen in Fig. 5, we find that the textbook adiabatic criterion A. Messiah (1999) given by

$$
t_f \gtrsim \max_{s \in [0,1]} \frac{\left| \langle \varepsilon_0(s) | \partial_s H(s) | \varepsilon_1(s) \rangle \right|}{\mathrm{Gap}(s)^2} \ , \tag{15}
$$

serves as an excellent proxy for the scaling of AQA. The scaling of AQA is matched by the scaling of the numerator of the adiabatic condition, which is explained by the fact that we find a constant minimum gap for the case $l, u = O(1)$. This numerator is a single matrix element between the ground and first excited states, leading to $t_f = O(n^{0.5})$ in the adiabatic limit. Note that calculating this matrix element can easily be done for arbitrarily large systems, and is hence much easier to check directly than the scaling of AQA.

### IV.2 Simulated annealing using random spin selection

We consider a version of SA with random spin-selection as the rule that generates candidates for Metropolis updates. Our main motivation is to understand the behavior of a local, stochastic search algorithm which has access only to the final cost function. We note that our analysis below is general for any Plateau problem, and is not limited to the Fixed Plateau or the Moving Plateau.

If we pick a bit string at random, then for large $n$ we will start with very high probability at a bit string with Hamming weight close to $n/2$. The plateau may be to the left or to the right of $n/2$; if the plateau is to the right, then the random walker is unlikely to encounter it and can quickly descend to the ground state. Thus, the more interesting case is when the random walker arrives at the plateau from the right. We proceed to analyze these two cases separately.

#### Walker starts to the right of the plateau

In this case, how much time would it take, typically, for the walker to fall off the left edge? It is intuitively clear that traversing the plateau will be the dominant contribution to the time taken to reach the ground state, as after that the random walker can easily walk down the potential. We show below (for the walker that starts to the left of the plateau) that this time can be at most $O(n^2)$ if $\beta \geq \log n$.

To evaluate the time to fall off the plateau, note that the perturbation is applied on strings of Hamming weight $l+1, l+2, \dots, u-1$, so the width of the plateau is $w \equiv u - l - 1$. Consider a random walk on a line of $w+1$ nodes labelled $0, 1, \dots, w$. Node $i$ represents the set of bit strings with Hamming weight $l+i$, with $0 \leq i \leq w$. We may assume that the random walker starts at node $w$. Only nearest-neighbor moves are allowed and the walk terminates if the walker reaches node $0$.

Our analysis will provide a lower bound on the actual time to fall off the left edge, because in the actual PHWO problem one can also go back up the slope on the right, and in addition we disallow transitions from strings of Hamming weight $l+w$ to $l+w+1$. This is justified because the Metropolis rule exponentially (in $\beta$) suppresses these transitions.

The transition probabilities for this problem can be written as a row-stochastic matrix $P$. Here $P$ is a tridiagonal matrix with zeroes on the diagonal, except at $i = 0$ and $i = w$. First consider $1 \leq i \leq w-1$. If the walker is at node $i$, then the transition to node $i+1$ (which has Hamming weight $l+i+1$) occurs with probability $1 - (l+i)/n$ (the chance that the bit picked had the value $0$). Similarly, for $1 \leq i \leq w$, the Hamming weight will decrease to $l+i-1$ with probability $(l+i)/n$ (the chance that the bit picked had the value $1$). Combining this with the fact that a walker at node $0$ stays put, we can write:

$$
\begin{aligned}
b_i &\equiv p_{i \to i} = \begin{cases} 1 & \text{if } i = 0 \\ 0 & \text{if } 1 \leq i \leq w-1 \\ 1 - \frac{l+w}{n} & \text{if } i = w, \end{cases} && \text{(16a)} \\
c_i &\equiv p_{i-1 \to i} = \begin{cases} 0 & \text{if } i = 1 \\ 1 - \frac{l+i-1}{n} & \text{if } i = 2, \dots, w, \end{cases} && \text{(16b)} \\
a_i &\equiv p_{i \to i-1} = \frac{l+i}{n} \quad \text{if } i = 1, 2, \dots, w. && \text{(16c)}
\end{aligned}
$$
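As a concrete check of Eq. (16), the transition matrix of the truncated plateau walk can be assembled with exact rational arithmetic and verified to be row-stochastic. This is a minimal sketch; the function name and the example values of `n`, `l`, and `w` in the usage note are illustrative assumptions.

```python
from fractions import Fraction

def plateau_transition_matrix(n, l, w):
    """Row-stochastic matrix P of Eq. (16) on nodes 0..w, where node i
    stands for bit strings of Hamming weight l + i."""
    P = [[Fraction(0)] * (w + 1) for _ in range(w + 1)]
    P[0][0] = Fraction(1)  # b_0 = 1: node 0 is absorbing
    for i in range(1, w + 1):
        P[i][i - 1] = Fraction(l + i, n)  # a_i: a 1-bit is picked and flipped
        if i < w:
            P[i][i + 1] = 1 - Fraction(l + i, n)  # c_{i+1}: a 0-bit is picked
        else:
            P[w][w] = 1 - Fraction(l + w, n)  # b_w: uphill move is disallowed
    return P
```

For example, `plateau_transition_matrix(10, 0, 3)` gives a $4 \times 4$ matrix whose rows each sum to one, with leftward probabilities $1/10$, $2/10$, and $3/10$ from nodes 1, 2, and 3.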

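Let $X(t)$ be the position of the random walker at time-step $t$. The random variable measuring the number of steps the random walker starting from node $r$ would need to take to reach node $s$ for the first time is

$$
\tau_{r,s} \equiv \min\{ t > 0 : X(t) = s,\ X(t-1) \neq s \mid X(0) = r \} \ . \tag{17}
$$

The quantity we are after is $\mathbb{E}\tau_{w,0}$, the expectation value of the random variable $\tau_{w,0}$, i.e., the mean time taken by the random walker to fall off the plateau. Since only nearest-neighbor moves are allowed we have

$$
\mathbb{E}\tau_{w,0} = \sum_{r=1}^{w} \mathbb{E}\tau_{r,r-1} \ . \tag{18}
$$

Stefanov Stefanov (1995) (see also Ref. Krafft and Schaefer (1993)) has shown that

$$
\mathbb{E}\tau_{r,r-1} = \frac{1}{a_r} \left( 1 + \sum_{s=r+1}^{w} \prod_{t=r+1}^{s} \frac{c_t}{a_t} \right) , \tag{19}
$$

where $1 \leq r \leq w$ and the sum is empty for $r = w$. Evaluating the sum term by term, we obtain:

$$
\begin{aligned}
\mathbb{E}\tau_{w,w-1} &= \frac{n}{l+w} , && \text{(20a)} \\
&\ \ \vdots \\
\mathbb{E}\tau_{w-k,w-k-1} &= \frac{n}{l+w-k} \left[ 1 + \frac{n-(l+w-k)}{l+w-(k-1)} + \dots + \frac{n-(l+w-k)}{l+w-(k-1)} \times \cdots \times \frac{n-(l+w-2)}{l+w-1} \times \frac{n-(l+w-1)}{l+w} \right] . && \text{(20b)}
\end{aligned}
$$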
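Equations (18) and (19), combined with the rates of Eq. (16), can be evaluated exactly for any finite $n$, which makes the polynomial growth of the fall-off time easy to inspect. The following is a minimal sketch using rational arithmetic; the function names and the example parameters are illustrative assumptions.

```python
from fractions import Fraction

def step_down_time(r, n, l, w):
    """E[tau_{r,r-1}] from Eq. (19), using the rates of Eq. (16)."""
    def a(i):  # leftward rate a_i, valid for 1 <= i <= w
        return Fraction(l + i, n)

    def c(i):  # rightward rate c_i, only evaluated here for i >= 2
        return 1 - Fraction(l + i - 1, n)

    total, product = Fraction(1), Fraction(1)
    for s in range(r + 1, w + 1):
        product *= c(s) / a(s)  # running product of c_t / a_t, t = r+1..s
        total += product
    return total / a(r)

def fall_off_time(n, l, w):
    """E[tau_{w,0}] from Eq. (18): mean time to leave a plateau of width w."""
    return sum(step_down_time(r, n, l, w) for r in range(1, w + 1))
```

For `n = 10`, `l = 0`, `w = 2` this returns exactly 60, which matches a direct first-step analysis of the three-node chain. Doubling `n` at fixed `l` and `w` multiplies the result by a factor close to $2^{w}$ once $n \gg l + w$, consistent with a leading term of order $n^{w}$.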

Now consider the following cases:

1. Fixed Plateau, $l, u = O(1)$: Here, using the fact that $l + w \ll n$, we conclude that $\mathbb{E}\tau_{w-k,w-k-1} = O(n^{k+1})$. Since the leading order term is $\mathbb{E}\tau_{1,0}$, the time to fall off the plateau is $O(n^{w})$. This result about SA’s performance is confirmed numerically in Fig. 5.

2. In order for Reichardt’s bound (see Appendix A) to give a constant lower-bound to the quantum problem, we need . Since at most we can have , we can conclude . Therefore, the time to fall-off becomes .

• Moving Plateau: If and , we can see that , which is a constant time scaling.

• Moving Plateau with changing width: If and , where , then , which is super-polynomial.

• Most general plateau in the Reichardt class: More generally, if , with and , where , then we get the scaling

#### Walker starts to the left of the plateau

Note that this case is equivalent to the unperturbed Hamming weight problem, which is a straightforward gradient descent problem. We may therefore consider a simple fixed temperature version of SA (i.e., the standard Metropolis algorithm). We will show that the performance of SA on this problem provides an upper bound of $O(n^2)$ on the time for a random walker to arrive at the plateau, and on the time for a random walker to reach the ground state after descending from the plateau. Moreover, our analysis provides a lower bound scaling as $n \log n$ on the runtime of such algorithms.

For this problem, the transition probabilities are:

$$
\begin{aligned}
c_i &\equiv p_{i-1 \to i} = \frac{n-i+1}{n}\, e^{-\beta} \ , && \text{(21a)} \\
a_i &\equiv p_{i \to i-1} = \frac{i}{n} \ , && \text{(21b)}
\end{aligned}
$$

with $i$ denoting strings of Hamming weight $i$, and $\beta$ the inverse temperature. Using the Stefanov formula (19), we can write (after much simplification):

$$
\mathbb{E}\tau_{n-k,n-k-1} = \frac{n}{n-k} \binom{n}{k}^{-1} \sum_{l=0}^{k} e^{-l\beta} \binom{n}{k-l} \ . \tag{22}
$$

We will bound

$$
\mathbb{E}\tau_{n,0} = \sum_{k=0}^{n-1} \frac{n}{n-k} \binom{n}{k}^{-1} \sum_{l=0}^{k} e^{-l\beta} \binom{n}{k-l} \ , \tag{23}
$$

the expected time to reach the all-zeros string starting from the all-ones string. This is the worst-case scenario as we are assuming that we are starting from the string farthest from the all-zeros string. Note again that if we start from a random spin configuration, then with overwhelming probability we will pick a string with Hamming weight close to $n/2$. Thus, most probably, $\mathbb{E}\tau_{n/2,0}$ will be the time to hit the ground state.

We first show that $\beta = O(1)$ will lead to an exponential time to hit the ground state, irrespective of the walker’s starting string. Toward that end,

$$
\begin{aligned}
\mathbb{E}\tau_{1,0} &= \mathbb{E}\tau_{n-(n-1),n-n} && \text{(24a)} \\
&= \sum_{l=0}^{n-1} e^{-l\beta} \binom{n}{n-1-l} && \text{(24b)} \\
&= e^{\beta} \left[ \left( e^{-\beta} + 1 \right)^n - 1 \right] , && \text{(24c)}
\end{aligned}
$$

which is clearly exponential in $n$ if $\beta = O(1)$.

Next, let $\beta = \log n$, i.e., we decrease the temperature logarithmically in system size. In this case,

$$
\mathbb{E}\tau_{1,0} = n \left[ \left( 1 + \frac{1}{n} \right)^n - 1 \right] \leq n(e-1) = O(n) \ . \tag{25}
$$

Now it is intuitively clear that $\mathbb{E}\tau_{k+1,k} \leq \mathbb{E}\tau_{1,0}$ for all $k$, which implies that $\mathbb{E}\tau_{n,0} \leq n\, \mathbb{E}\tau_{1,0}$. Thus, if $\beta = \log n$, then $\mathbb{E}\tau_{n,0} = O(n^2)$ at worst.

To obtain a lower bound on the performance of the algorithm, we take $\beta \to \infty$. Thus, for each $k$ in Eq. (23), only the $l = 0$ term will survive. Hence,

$$
\lim_{\beta \to \infty} \mathbb{E}\tau_{n,0} = \sum_{k=0}^{n-1} \frac{n}{n-k} = n \sum_{i=1}^{n} \frac{1}{i} \approx n (\log n + \gamma) \ , \tag{26}
$$

for large $n$, with $\gamma$ being the Euler-Mascheroni constant. The scaling here is $O(n \log n)$. This is the best possible performance for single-spin update SA with random spin-selection on the plain Hamming weight problem. Therefore, if $\beta \geq \log n$, the scaling will be between $O(n \log n)$ and $O(n^2)$. Of course, this cost needs to be added to the time taken for the walker starting to the right of the plateau.
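The closed forms in Eqs. (22) to (26) can be cross-checked numerically for small $n$. The sketch below evaluates Eq. (22) and Eq. (23) directly in floating point and compares against the limits discussed above; the function names and test sizes are illustrative assumptions.

```python
import math

def metropolis_step_down_time(n, k, beta):
    """E[tau_{n-k, n-k-1}] from Eq. (22) for the unperturbed problem."""
    total = sum(
        math.exp(-j * beta) * math.comb(n, k - j) for j in range(k + 1)
    )
    return n / (n - k) * total / math.comb(n, k)

def metropolis_hitting_time(n, beta):
    """E[tau_{n,0}] from Eq. (23): all-ones string to all-zeros string."""
    return sum(metropolis_step_down_time(n, k, beta) for k in range(n))

def harmonic_bound(n):
    """Zero-temperature limit n * H_n from Eq. (26)."""
    return n * sum(1.0 / i for i in range(1, n + 1))
```

At large `beta` the hitting time approaches `harmonic_bound(n)`, while `metropolis_step_down_time(n, n - 1, math.log(n))` reproduces $n[(1 + 1/n)^n - 1]$ from Eq. (25). Evaluating `metropolis_hitting_time(n, math.log(n))` for increasing `n` is one way to see where the runtime falls between the $n \log n$ and $n^2$ bounds.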

Two clarifications are in order regarding the comparison between our theoretical bound on SA’s performance and the associated numerical simulations we have presented. First, while Fig. 5 displays the time to cross a threshold probability, our theoretical bound of $O(n^{w})$ is on the expected time for the random walker to hit the ground state [Eq. (18)]. However, we found that both metrics show identical scaling. Second, while the SA data in Fig. 5 was generated using sequential spin updates, the theoretical bound assumes random spin updates (see Appendix D.1 for more details on the update schemes). However, we found that the asymptotic scaling for both cases is nearly identical in the long-time regime, and thus have plotted only the former.

### IV.3 Optimal QA via Diabatic Transitions

Having established that for the Fixed Plateau AQA enjoys a quantum speedup over local search algorithms such as SA via tunneling, we are motivated to ask: Is tunneling necessary to achieve a quantum speedup on these problems? In order to answer this question, we demonstrate using the optimal TTS criterion defined in Eq. (14) that the optimal annealing time for QA is far from adiabatic. Instead, as shown in Fig. 6, the optimal TTS for QA is such that the system leaves the instantaneous ground state for most of the evolution and only returns to the ground state towards the end. The cascade down to the ground state is mediated by a sequence of avoided energy level-crossings as seen in Fig. 7. We consider this a diabatic form of QA (DQA) and call this mechanism through which DQA achieves a speedup a diabatic cascade.

As increases for fixed , repopulation of the ground state improves for fixed , hence causing TTS to decrease with , as seen in Fig. 6, until it saturates to a constant at the lowest possible value, corresponding to a single run at . At this point the problem is solved in constant time , compared to the scaling of the adiabatic regime. Moreover, as shown in Fig. 6, there are no sharp changes in , suggesting that the non-adiabatic dynamics do not entail multi-qubit tunneling events, unlike the adiabatic case. Thus, this establishes that we may have speedups in QA that do not involve multi-qubit tunneling.

One may worry that for this diabatic evolution to be successful, the optimal annealing time may need to be very finely tuned. We address this concern in Appendix E, where we show that if is the precision desired in , we need only have a precision of in setting , which means that the diabatic speedup is robust.

Figure 8 shows that the speedup of DQA and SVD over AQA exists for three other PHWO problems: the Moving Plateau, the Spike, and the 0.5-Rectangle problems. Importantly, DQA and SVD have an exponential speedup over AQA for the 0.5-Rectangle problem. We do not observe a diabatic speedup for the Precipice or Grover problems.

### iv.4 Spin Vector Dynamics

Given the absence of tunneling in the time-optimal quantum evolution, we are motivated to consider the behavior of Spin-Vector Dynamics (SVD), which arise in a semi-classical limit (see Appendix D.3 for an overview of this algorithm). As we show in Fig. 6, the scaling of SVD’s optimal TTS also saturates to a constant time, i.e., . Moreover, it reaches this value earlier (as a function of problem size ) than DQA, thus outperforming DQA for small problem sizes, while for large enough both achieve scaling. As seen in the inset, SVD’s advantage persists as a function of at constant .

The dynamics of DQA are well approximated by SVD until close to the end of the evolution, as shown in Fig. 6: the trace-norm distance between the instantaneous states of DQA and SVD is almost zero until , after which the states start to diverge. This suggests that SVD is able to replicate the DQA dynamics up to this point, and only deviates because it is more successful at repopulating the ground state than DQA.

In Fig. 8, we show that SVD’s speedup over AQA is replicated for the Spike, Moving Plateau, and 0.5-Rectangle problems as well. Remarkably, while the 0.5-Rectangle problem has an exponentially small gap [see Eq. (10) and Fig. 4], SVD and DQA both achieve scaling, and hence the diabatic cascade provides an exponential speedup relative to AQA.

It is important to note that SVD is ineffective if one desires to simulate the adiabatic evolution. In the absence of unitary dynamics (which allow for tunneling) or thermal activation (to thermally hop over the barrier), SVD gets trapped behind the barrier that forms in the semi-classical potential separating the two degenerate minima [see Fig. 2] and is unable to reach the new global minimum. In this sense, SVD does not enjoy the guarantee provided by the quantum adiabatic theorem for the unitary evolution Jansen et al. (2007); Amin (2009); Lidar et al. (2009), that for sufficiently long dictated by the adiabatic condition, the ground state can be reached with any desired probability.

Likewise, it is important to keep in mind the distinction between a classical algorithm being able to match, or sometimes outperform, a quantum algorithm (as SVD does here), and the classical algorithm approximating the evolution or instantiating the physics of the quantum algorithm (as SVD fails to do here). Indeed, in both the diabatic and adiabatic regimes, SVD provides a poor approximation to the instantaneous quantum state. For example, in the diabatic regime, it is clear from Fig. 6 that the trace-norm distance between the instantaneous SVD state and the instantaneous quantum state starts to increase significantly for . In the same spirit, consider the instantaneous semi-classical ground state, i.e., the spin-coherent state evaluated at the minimum of the spin-coherent potential, which may be suspected to provide a good approximation to the instantaneous quantum ground state, but does not as shown in Fig. 2. Thus the unentangled semi-classical ground state also fails to provide a good approximation to the quantum ground state.

### iv.5 Simulated Quantum Annealing

Simulated Quantum Annealing (SQA) is a quantum Monte Carlo algorithm performed along the annealing schedule (see Appendix D.4 for further details). It is often used as a benchmark against which QA is compared (though see Ref. Heim et al. (2015) for caveats). SQA scales better than SA for the Fixed Plateau problem using the threshold criterion (see Fig. 5). In order to understand why SQA enjoys an advantage over SA using this benchmark metric, it is useful to study the behavior of the state of SQA along the annealing schedule. We show the behavior of for SQA in Fig. 9, where we observe that SQA at the optimal number of sweeps (the case of sweeps shown in Fig. 9) does not follow the instantaneous ground state. Instead it reaches the threshold success probability by thermally relaxing to the ground state after the minimum gap point (and tunneling event) of the quantum Hamiltonian. Therefore, SQA’s advantage over SA stems from the fact that it thermalizes in a different energy landscape than SA.

We also contrast the behavior of SQA and AQA using the threshold criterion. While SQA is able to follow the instantaneous ground state for a sufficiently large number of sweeps and thus mimic the tunneling of AQA (see Fig. 9), this is not the optimal way for it to reach the threshold criterion. For a fixed threshold success probability, the process of thermal relaxation after the minimum gap point uses fewer sweeps (and hence is more efficient) than following the instantaneous ground state closely throughout the anneal. This is in contrast to AQA, where tunneling is the only means for it to reach a high success probability and nevertheless is more efficient than SQA, as seen in Fig. 5.

We note that SQA’s threshold criterion advantage over SA does not carry over to the optimal TTS criterion. In fact, we find that using the optimal TTS criterion, SQA scales as , while SA scales as , as seen in Fig. 6. The reason for the latter scaling is that the optimal number of sweeps for SA is , simply because there is a small but non-zero probability that in the first sweep all the $1$s are flipped to $0$s.

## V Discussion

It is often assumed that the shape of the final cost-function determines how hard it is for QA to solve the problem (in fact, this was partly the motivation for the Spike problem in Ref. Farhi et al. (2002a)), and that potentials with tall and thin barriers should be advantageous for AQA, since this is where tunneling dominates over thermal hopping (e.g., (Heim et al., 2015, p.215), (Das and Chakrabarti, 2008, p.1062), (Suzuki et al., 2013, p.226)). It is then assumed that problems where the final potential has this feature are those for which there should be a quantum speedup. We have given several counterexamples to such claims, and shown that tunneling is not necessary to achieve the optimal TTS. Instead, the optimal trajectory may use diabatic transitions to first scatter completely out of the ground state and return via a sequence of avoided level crossings. That diabatic transitions can help speed up quantum algorithms has also been noted and advantageously exploited in Refs. Somma et al. (2012); Crosson et al. (2014); Hen (2014); Steiger et al. (2015). Moreover, we have shown that the instantaneous semi-classical potential provides important insight into the role of tunneling, while the final cost function can be rather misleading in this regard.

While both adiabatic and diabatic QA outperform SA for the Fixed Plateau problem, the faster quantum diabatic algorithm is not better than the classical SVD algorithm for this problem. The PHWO problems due to Reichardt Reichardt (2004), which include problems very similar to the Fixed Plateau, have widely been considered an example where tunneling provides a quantum advantage; we have shown that this holds if one limits the comparison to SA, but that there is in fact no quantum speedup in the problem when one compares the quantum diabatic evolution (which outperforms adiabatic quantum annealing) to SVD.

These results of the diabatic optimal evolution extend beyond the plateau problems: even the Spike problem studied in Ref. Farhi et al. (2002a)—which is in some sense the antithesis of the plateau problem since it features a sharp spike at a single Hamming weight—also exhibits the diabatic-beats-adiabatic phenomenon, indicating that tunneling is not required to efficiently solve the problem. Thus diabatic evolution, especially via diabatic cascades, is an important and relatively unexplored mechanism in quantum optimization that is different from tunneling. The fact that we observe a speedup relative to AQA for several problems, especially an exponential speedup for the 0.5-Rectangle, motivates the search for algorithms exploiting this mechanism and may yield fruitful results. However, we also already know that diabatic cascades are not generic. E.g., we have checked that this mechanism is absent in the Grover and Precipice problems, even though the Grover problem is equivalent to a ‘giant’ plateau problem.

In summary, our work provides a counterargument to the widely made claims that tunneling should be understood with respect to the final cost function, that speedups due to tunneling require tall and thin barriers, and that tunneling is needed for a quantum speedup in optimization problems. Which features of Hamiltonians of optimization problems favor diabatic or adiabatic algorithms remains an open question, as is the understanding of tunneling for non-permutation-symmetric problems.

We finish on a positive note for QA. We have given several examples where SVD outperforms QA, e.g., the Spike problem Farhi et al. (2002a). However, we make no claim that SVD will always have an advantage over QA. A simple and instructive example comes from the class of cost functions that are convex in Hamming weight space, which have a constant minimum gap Jarret and Jordan (2014):

$$f(x) = \begin{cases} 2, & |x| = 0 \\ |x|, & \text{otherwise} \end{cases}. \tag{27}$$

We have observed similar diabatic transitions for this problem as for the Fixed Plateau (not shown), but find that DQA outperforms SVD, as shown in Fig. 10. This results because the optimal TTS for QA occurs at a slightly higher optimal annealing time, i.e., there is an advantage to evolving somewhat more slowly, though still far from adiabatically. Thus, this provides an example of a “limited” quantum speedup Rønnow et al. (2014).

###### Acknowledgements.
Special thanks to Ben Reichardt for insightful conversations and for suggesting the plateau problem, and to Bill Kaminsky for inspiring talks Kaminsky (2014a, b). We also thank Itay Hen, Joshua Job, Iman Marvian, Milad Marvian, and Rolando Somma for useful comments. The computing resources were provided by the USC Center for High Performance Computing and Communications and by the Oak Ridge Leadership Computing Facility at the Oak Ridge National Laboratory, which is supported by the Office of Science of the U.S. Department of Energy under Contract No. DE-AC05-00OR22725. This work was supported under ARO grant number W911NF-12-1-0523 and ARO MURI Grant No. W911NF-11-1-0268.

## Appendix A Review of the Hamming weight problem and Reichardt’s bound for PHWO problems

Here we closely follow Ref. Reichardt (2004).

### a.1 The Hamming weight problem

We review the analysis within QA of the minimization of the Hamming weight function $|x|$, which counts the number of $1$'s in the bit string $x$. This problem is of course trivial, and the analysis given here is done in preparation for the perturbed problem. The initial Hamiltonian is

$$H_D = \frac{1}{2}\sum_{i=1}^{n}\left(\mathbb{1}_i - \sigma^x_i\right) = \sum_{i=1}^{n} |-\rangle_i\langle -|, \tag{28}$$

which has $|+\rangle^{\otimes n}$ as the ground state.

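The final Hamiltonian for the cost function is

$$H_P = \frac{1}{2}\sum_{i=1}^{n}\left(\mathbb{1}_i - \sigma^z_i\right) = \sum_{i=1}^{n} |1\rangle_i\langle 1|, \tag{29}$$

which has $|0\rangle^{\otimes n}$ as the ground state.

We interpolate linearly between $H_D$ and $H_P$:

$$\begin{aligned}
H(s) &= (1-s)H_D + sH_P, \quad s\in[0,1] && \text{(30)}\\
&= \sum_{i=1}^{n}\left[\frac{1}{2}\begin{pmatrix} 1-s & -(1-s) \\ -(1-s) & 1-s \end{pmatrix}_i + \begin{pmatrix} 0 & 0 \\ 0 & s \end{pmatrix}_i\right] && \text{(31)}\\
&= \sum_{i=1}^{n} H_i(s), \quad H_i(s) \equiv \frac{1}{2}\left[\mathbb{1} - (1-s)\sigma^x_i - s\sigma^z_i\right]. && \text{(32)}
\end{aligned}$$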

We note that $H_i(s)$ in Eq. (32) is similar to a variant of the Landau-Zener (LZ) Hamiltonian with finite coupling duration Vitanov and Garraway (1996a, b), for which the Schrödinger equation has an analytical solution, except that there it is assumed that the $\sigma^x$ term is constant and only the $\sigma^z$ term has a (linear) time dependence over a finite interval. The analytical solution of the problem obtained in Ref. Vitanov and Garraway (1996a) is rather complicated, and for our purposes a simpler approach suffices.

Since there are no interactions between the qubits, the adiabatic problem can be solved exactly by diagonalizing the Hamiltonian acting on each qubit separately. For each term $H_i(s)$, we have the energy eigenvalues

$$E_\pm(s) = \frac{1}{2}\left(1 \pm \Delta(s)\right); \quad \Delta(s) \equiv \sqrt{1-2s+2s^2}, \tag{33}$$

and associated eigenvectors,

$$|v_\pm(s)\rangle = \frac{1}{\sqrt{2\Delta(\Delta \mp s)}}\left[\mp(\Delta \mp s)|0\rangle + (1-s)|1\rangle\right]. \tag{34}$$
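The eigenpairs in Eqs. (33) and (34) can be verified directly by applying the single-qubit Hamiltonian to the stated eigenvectors. The following is a minimal sketch with illustrative function names, using plain 2x2 arithmetic. The closed form for $|v_+\rangle$ becomes $0/0$ at $s=1$, so the excited state is only checked for $s<1$.

```python
import math


def delta(s):
    """Single-qubit gap Delta(s) from Eq. (33)."""
    return math.sqrt(1.0 - 2.0 * s + 2.0 * s * s)


def single_qubit_hamiltonian(s):
    """H_i(s) = (1/2)[1 - (1-s) sigma_x - s sigma_z] as a 2x2 nested tuple."""
    off = -0.5 * (1.0 - s)
    return ((0.5 * (1.0 - s), off), (off, 0.5 * (1.0 + s)))


def eigenpair(s, sign):
    """Return (E, v) from Eqs. (33)-(34); sign=+1 is the excited state, -1 the ground state."""
    d = delta(s)
    energy = 0.5 * (1.0 + sign * d)
    norm = math.sqrt(2.0 * d * (d - sign * s))
    v0 = -sign * (d - sign * s) / norm  # coefficient of |0>
    v1 = (1.0 - s) / norm               # coefficient of |1>
    return energy, (v0, v1)


def residual(s, sign):
    """Largest component of H v - E v; zero up to rounding for a true eigenpair."""
    h = single_qubit_hamiltonian(s)
    energy, v = eigenpair(s, sign)
    return max(abs(h[r][0] * v[0] + h[r][1] * v[1] - energy * v[r]) for r in range(2))


def norm_squared(s, sign):
    """Squared norm of the eigenvector; should equal 1."""
    _, v = eigenpair(s, sign)
    return v[0] ** 2 + v[1] ** 2
```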

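The ground state of $H(s)$ is

$$|\psi_{\rm GS}(s)\rangle = |v_-(s)\rangle^{\otimes n}. \tag{35}$$

The gap is the energy difference between a first excited state, $|\phi_1(s)\rangle \equiv |v_+(s)\rangle \otimes |v_-(s)\rangle^{\otimes (n-1)}$, and the ground state:

$$\begin{aligned}
\mathrm{Gap}[H(s)] &= \langle \phi_1(s)|H(s)|\phi_1(s)\rangle - \langle \psi_{\rm GS}(s)|H(s)|\psi_{\rm GS}(s)\rangle && \text{(36a)}\\
&= E_+ + (n-1)E_- - nE_- && \text{(36b)}\\
&= E_+ - E_- && \text{(36c)}\\
&= \Delta(s). && \text{(36d)}
\end{aligned}$$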

The gap is minimized at $s = 1/2$ with minimum value $\Delta(1/2) = 1/\sqrt{2}$. The minimum gap is independent of $n$ and hence does not scale with problem size. Therefore we can predict an adiabatic run time to be given by

$$t_f = O\left(\frac{\|\partial_s H\|}{\Delta^2}\right) = O(n), \tag{37}$$

where the $n$-dependence is solely due to $\|\partial_s H\|$ (see Appendix D.2). However, this is actually a loose upper bound. We next provide separate numerical and analytical arguments to demonstrate that the actual scaling for AQA is $O(\sqrt{n})$.
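The location and value of the minimum gap, and the resulting linear-in-$n$ estimate of Eq. (37), can be reproduced numerically. The sketch below scans $\Delta(s)$ on a grid. It takes $\|\partial_s H\| = n/\sqrt{2}$, which follows from $H_P - H_D = \frac{1}{2}\sum_i(\sigma^x_i - \sigma^z_i)$, and drops all other constants. Function names and the grid size are illustrative choices.

```python
import math


def delta(s):
    """Gap of H(s) from Eqs. (33) and (36d)."""
    return math.sqrt(1.0 - 2.0 * s + 2.0 * s * s)


def minimum_gap(grid_points=10001):
    """Scan s in [0, 1] on a uniform grid and return (s_min, gap_min)."""
    best_s, best_gap = 0.0, delta(0.0)
    for k in range(grid_points):
        s = k / (grid_points - 1)
        gap = delta(s)
        if gap < best_gap:
            best_s, best_gap = s, gap
    return best_s, best_gap


def derivative_norm(n):
    """Operator norm of dH/ds = H_P - H_D: n copies of (sigma_x - sigma_z)/2, each of norm 1/sqrt(2)."""
    return n / math.sqrt(2.0)


def adiabatic_time_estimate(n):
    """Eq. (37) with the O(.) constant set to 1: ||dH/ds|| / Delta_min^2."""
    _, gap_min = minimum_gap()
    return derivative_norm(n) / gap_min ** 2
```

Because the minimum gap is a constant, the estimate grows linearly in `n` purely through `derivative_norm(n)`.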

#### Numerical argument

Suppose the adiabatic algorithm runs long enough so as to attain a desired success probability, $p_0$. Let this time be $t_f$. Using the fact that the quantum evolution of the plain Hamming weight problem is the evolution of $n$ non-interacting qubits, we can express the global ground-state probability in terms of the ground-state probabilities of single qubits. So, if the single qubit ground-state probability for this run-time is $p_{\rm GS}(t_f)$, then we must have $p_0 = p_{\rm GS}(t_f)^n$.
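The factorization into single-qubit probabilities means that one two-level simulation suffices for any $n$. The sketch below is one way this could be done and is not the paper's numerical method: it integrates the single-qubit Schrödinger equation for $H_i(s)$ with $s = t/t_f$ using a fourth-order Runge-Kutta step, assuming $\hbar = 1$ and an illustrative step density of 200 steps per unit time.

```python
import math


def single_qubit_hamiltonian(s):
    """H_i(s) = (1/2)[1 - (1-s) sigma_x - s sigma_z] as a 2x2 nested tuple."""
    a = 0.5 * (1.0 - s)
    return ((a, -a), (-a, 0.5 * (1.0 + s)))


def schrodinger_rhs(psi, s):
    """Right-hand side -i H_i(s) psi of the Schrodinger equation (hbar = 1)."""
    h = single_qubit_hamiltonian(s)
    return (
        -1j * (h[0][0] * psi[0] + h[0][1] * psi[1]),
        -1j * (h[1][0] * psi[0] + h[1][1] * psi[1]),
    )


def shifted(psi, k, step):
    """Return psi + step * k componentwise."""
    return (psi[0] + step * k[0], psi[1] + step * k[1])


def single_qubit_success(t_f, steps_per_unit_time=200):
    """Probability of ending in |0>, the ground state of H_i(1), after a linear anneal of duration t_f."""
    n_steps = max(1, int(round(t_f * steps_per_unit_time)))
    dt = t_f / n_steps
    amp = 1.0 / math.sqrt(2.0)
    psi = (complex(amp), complex(amp))  # |+>, the ground state of H_i(0)
    for k in range(n_steps):
        t = k * dt
        k1 = schrodinger_rhs(psi, t / t_f)
        k2 = schrodinger_rhs(shifted(psi, k1, 0.5 * dt), (t + 0.5 * dt) / t_f)
        k3 = schrodinger_rhs(shifted(psi, k2, 0.5 * dt), (t + 0.5 * dt) / t_f)
        k4 = schrodinger_rhs(shifted(psi, k3, dt), (t + dt) / t_f)
        psi = (
            psi[0] + dt * (k1[0] + 2 * k2[0] + 2 * k3[0] + k4[0]) / 6.0,
            psi[1] + dt * (k1[1] + 2 * k2[1] + 2 * k3[1] + k4[1]) / 6.0,
        )
    norm = abs(psi[0]) ** 2 + abs(psi[1]) ** 2  # guards against small integrator drift
    return abs(psi[0]) ** 2 / norm


def global_success(n, t_f):
    """Success probability for n non-interacting qubits: the single-qubit value to the n-th power."""
    return single_qubit_success(t_f) ** n
```

For a fixed `t_f`, `global_success(n, t_f)` decays geometrically in `n`, which is why the anneal time must grow with problem size to hold the success probability fixed.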

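We find numerically (see Fig. 11) that $p_{\rm GS}(t_f)$ has an envelope that is excellently approximated by

$$p_{\rm GS}(t_f) = 1 - \frac{1}{t_f^2} + O(t_f^{-3}), \tag{38}$$

for sufficiently large $t_f$. We therefore can write

$$\ln p_0 = n \ln p_{\rm GS}(t_f) \approx n \ln\left(1 - \frac{1}{t_f^2}\right), \tag{39}$$

and upon expanding the logarithm, $\ln(1 - t_f^{-2}) \approx -t_f^{-2}$, we extract a tighter scaling for our adiabatic time:

$$t_f = O(n^{1/2}). \tag{40}$$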
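The step from Eq. (39) to Eq. (40) can be made concrete by solving for $t_f$ at a fixed target probability. The sketch below compares the exact solution of $p_0 = (1 - t_f^{-2})^n$ with the leading-order result $t_f \approx \sqrt{n/\ln(1/p_0)}$. It assumes the envelope of Eq. (38) without its $O(t_f^{-3})$ correction, and the function names are illustrative.

```python
import math


def exact_anneal_time(n, p0):
    """Solve p0 = (1 - 1/t^2)^n for t, using the envelope of Eq. (38) without the O(t^-3) term."""
    # 1 - p0**(1/n) computed as -expm1(ln(p0)/n) to stay accurate for large n.
    one_minus_root = -math.expm1(math.log(p0) / n)
    return 1.0 / math.sqrt(one_minus_root)


def leading_order_anneal_time(n, p0):
    """Expand ln(1 - 1/t^2) ~ -1/t^2 in Eq. (39), giving t ~ sqrt(n / ln(1/p0))."""
    return math.sqrt(n / math.log(1.0 / p0))
```

Quadrupling `n` at fixed `p0` doubles `leading_order_anneal_time`, which is the $n^{1/2}$ scaling of Eq. (40).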

#### Analytical argument

Here, we invoke a result due to Boixo and Somma Boixo and Somma (2010). This result states,

###### Theorem 1 (Boixo and Somma (2010)).

To adiabatically prepare a final eigenstate using a Hamiltonian evolution requires time that scales at least as $L/\Delta$. Here $L$ is the eigenpath length,

$$L \equiv \int_0^1 \left\| |\partial_s \psi(s)\rangle \right\| ds, \tag{41}$$

where $|\psi(s)\rangle$ is the eigenpath traversed to reach the final eigenstate.

We analytically compute $L$ for the ground-state path in the plain Hamming weight problem, and show that it scales as $\sqrt{n}$. Since we know that in this case $\Delta = O(1)$, we conclude the adiabatic algorithm will require at least $\Omega(\sqrt{n})$ time.

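Recall that the instantaneous ground state is [Eq. (35)] $|\psi_{\rm GS}(s)\rangle = |v_-(s)\rangle^{\otimes n}$, where $|v_-(s)\rangle = \sqrt{1-q(s)}\,|0\rangle + \sqrt{q(s)}\,|1\rangle$, with [Eq. (34)]

$$q(s) = \frac{(1-s)^2}{2\Delta(\Delta+s)}. \tag{42}$$

Differentiating:

$$\frac{d}{ds}|\psi_{\rm GS}(s)\rangle = \sum_{i=1}^{n}\left(\bigotimes_{j\neq i}|v^j_-(s)\rangle \otimes \frac{d}{ds}|v^i_-(s)\rangle\right), \tag{43}$$

so that

$$\begin{aligned}
\left\| |\partial_s \psi_{\rm GS}(s)\rangle \right\|^2 &\equiv \langle \partial_s \psi_{\rm GS}(s)|\partial_s \psi_{\rm GS}(s)\rangle && \text{(44)}\\
&= n\left\| \frac{d}{ds}|v^i_-(s)\rangle \right\|^2 + n(n-1)\left|\langle v^i_-(s)|\frac{d}{ds}|v^i_-(s)\rangle\right|^2. && \text{(45)}
\end{aligned}$$

The term $\left\| \frac{d}{ds}|v^i_-(s)\rangle \right\|^2$ does not have any scaling with $n$, and the second term vanishes because $\langle v^i_-(s)|\frac{d}{ds}|v^i_-(s)\rangle = \frac{1}{2}\frac{d}{ds}\langle v^i_-(s)|v^i_-(s)\rangle = 0$, where we use the fact that $|v^i_-(s)\rangle$ is real-valued and normalized. Thus, taking the square root on both sides and integrating from $s=0$ to $s=1$, we obtain the $\sqrt{n}$ scaling of $L$.

If we desire to fix the constant in front of $\sqrt{n}$, a straightforward calculation will show that

$$\int_0^1 \left\| \frac{d}{ds}|v^i_-(s)\rangle \right\| ds = \pi/4. \tag{46}$$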
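The value $\pi/4$ in Eq. (46) can be checked without computing any derivative by hand: the integral is the arc length of the curve traced by the real unit vector $|v_-(s)\rangle$, which a fine polyline approximates. The sketch below takes that approach; the function names and the number of segments are illustrative choices.

```python
import math


def ground_state(s):
    """Real, normalized |v_-(s)> from Eq. (34) as (amplitude of |0>, amplitude of |1>)."""
    d = math.sqrt(1.0 - 2.0 * s + 2.0 * s * s)
    norm = math.sqrt(2.0 * d * (d + s))
    return ((d + s) / norm, (1.0 - s) / norm)


def single_qubit_path_length(segments=20000):
    """Approximate the integral in Eq. (46) by the length of a polyline through |v_-(s)>."""
    total = 0.0
    prev = ground_state(0.0)
    for k in range(1, segments + 1):
        cur = ground_state(k / segments)
        total += math.hypot(cur[0] - prev[0], cur[1] - prev[1])
        prev = cur
    return total


def eigenpath_length(n, segments=20000):
    """L = sqrt(n) times the single-qubit length, following Eqs. (44)-(45)."""
    return math.sqrt(n) * single_qubit_path_length(segments)
```

Geometrically, the state rotates from $|+\rangle$ to $|0\rangle$, an angle of $\pi/4$ in the real plane, which is why the single-qubit length takes this value.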

### a.2 Reichardt’s bound for PHWO problems

Here we review Reichardt’s derivation of the gap lower-bound for general PHWO problems, but provide additional details not found in the original proof Reichardt (2004).

We use the same initial Hamiltonian [Eq. (28)] and linear interpolation schedule as before, $\tilde{H}(s) = (1-s)H_D + s\tilde{H}_P$, and choose the final Hamiltonian to be

$$\tilde{H}_P = \sum_{x\in\{0,1\}^n} \tilde{f}(x)\,|x\rangle\langle x|, \tag{47}$$

where

 ~f(x)={|x|+p(x)l<|x|

where $p(x)$ is the perturbation. Note that here we have not assumed that the perturbation, $p(x)$, respects qubit permutation symmetry.

We wish to bound the minimum gap of $\tilde{H}(s)$. Unlike the Hamming weight problem $H(s)$, this problem is no longer non-interacting. Define

$$h_k \equiv \max_{|x|=k} p(x); \quad h \equiv \max_k h_k = \max_x p(x). \tag{49}$$

###### Lemma 1 (Reichardt (2004)).

Let and let and be the ground state energies of and , respectively. Then .

###### Proof.

First note that

 ~H(s)−H(s)=s∑x:l<|x|

Below, we suppress the dependence of all the terms for notational simplicity. We know that . Using this,

 ⟨~E0|~H|~E0⟩ ≤⟨ψ|~H|ψ⟩∀|ψ⟩∈H. (51a) ⟹~E0−E0 ≤⟨v⊗n−|~H|v⊗n−⟩−E0 (51b) ≤⟨v⊗n−|~H−H|v⊗n−⟩ (51c) =s∑x:l<|x|

where $\binom{n}{k}$ is the number of strings with Hamming weight $k$, we used the fact that if we measure $|v_-\rangle^{\otimes n}$ in the computational basis, the probability of getting outcome $x$ is $q^{|x|}(1-q)^{n-|x|}$, and $q$ is given in Eq. (42).

Consider the partial binomial sum (dropping the ’s),

 ∑k:l

Using the fact that the binomial is well-approximated by the Gaussian in the large $n$ limit (note that this approximation requires that $q$ and $1-q$ not be too close to zero), we can write:

 ∑k:l

where , and . Note that and depend on , and also on via . The parameters and are specified by the problem Hamiltonian, and are therefore allowed to depend on as long as is satisfied for all .

Let us define:

$$B(s,n,l(n),u(n)) \equiv \int_{(l(n)-\mu(n,s))/\sigma(n,s)}^{(u(n)-\mu(n,s))/\sigma(n,s)} dt\, \frac{e^{-t^2/2}}{\sqrt{2\pi}}. \tag{54}$$
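To get a feel for how well the Gaussian integral of Eq. (54) tracks the partial binomial sum, the two can be compared numerically. The sketch below assumes the standard identification $\mu = nq$ and $\sigma = \sqrt{nq(1-q)}$ for a binomial with success probability $q$, and takes the window to be the open range $l < k < u$. Both choices and all function names are assumptions made for illustration.

```python
import math


def binomial_window(n, q, l, u):
    """Exact sum over l < k < u of C(n, k) q^k (1 - q)^(n - k)."""
    return sum(
        math.comb(n, k) * q ** k * (1.0 - q) ** (n - k)
        for k in range(l + 1, u)
    )


def gaussian_window(n, q, l, u):
    """B of Eq. (54), assuming mu = n q and sigma = sqrt(n q (1 - q))."""
    mu = n * q
    sigma = math.sqrt(n * q * (1.0 - q))

    def standard_normal_cdf(t):
        return 0.5 * (1.0 + math.erf(t / math.sqrt(2.0)))

    return standard_normal_cdf((u - mu) / sigma) - standard_normal_cdf((l - mu) / sigma)
```

With `n = 400` and `q = 0.3`, a window straddling the mean, such as `l = 100, u = 140`, gives two values that agree to within a few parts in a thousand. A window far out in the tail, such as `l = 200, u = 260`, makes both quantities negligibly small.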

We seek an upper bound on this function. We observe that decreases monotonically from to as goes from to . Thus, the mean of the Gaussian decreases from to . Depending on the values of , and , we thus have three possibilities: (i) , (ii) , and (iii) . Note that (ii) and (iii) are cases where the integral runs over the tails of the Gaussian and so the integral is exponentially small. We focus on (i), as this induces the maximum values of the integral. In this case the lower limit of the integral Eq. (