Quantum accelerated approach to the thermal state of classical spin systems with applications to pattern-retrieval in the Hopfield neural network

# Quantum accelerated approach to the thermal state of classical spin systems with applications to pattern-retrieval in the Hopfield neural network

Eliana Fiorelli School of Physics and Astronomy, University of Nottingham, Nottingham, NG7 2RD, UK Centre for the Mathematics and Theoretical Physics of Quantum Non-equilibrium Systems, University of Nottingham, Nottingham NG7 2RD, UK    Pietro Rotondo School of Physics and Astronomy, University of Nottingham, Nottingham, NG7 2RD, UK Centre for the Mathematics and Theoretical Physics of Quantum Non-equilibrium Systems, University of Nottingham, Nottingham NG7 2RD, UK    Matteo Marcuzzi School of Physics and Astronomy, University of Nottingham, Nottingham, NG7 2RD, UK Centre for the Mathematics and Theoretical Physics of Quantum Non-equilibrium Systems, University of Nottingham, Nottingham NG7 2RD, UK    Juan P. Garrahan School of Physics and Astronomy, University of Nottingham, Nottingham, NG7 2RD, UK Centre for the Mathematics and Theoretical Physics of Quantum Non-equilibrium Systems, University of Nottingham, Nottingham NG7 2RD, UK    Igor Lesanovsky School of Physics and Astronomy, University of Nottingham, Nottingham, NG7 2RD, UK Centre for the Mathematics and Theoretical Physics of Quantum Non-equilibrium Systems, University of Nottingham, Nottingham NG7 2RD, UK
July 15, 2019
###### Abstract

We explore the question as to whether quantum effects can yield a speedup of the non-equilibrium evolution of spin systems towards a classical thermal state. In our approach we exploit the fact that the thermal state of a spin system can be mapped onto a node-free quantum state whose coefficients are given by thermal weights. This perspective permits the construction of a dissipative – yet quantum – dynamics which encodes in its stationary state the thermal state of the original problem. We show for the case of an all-to-all connected Ising spin model that an appropriate transformation of this dissipative dynamics allows to interpolate between a regime in which the order parameter obeys the classical equations of motion under Glauber dynamics, to a quantum regime with an accelerated approach to stationarity. We show that this effect enables in principle a speedup of pattern retrieval in a Hopfield neural network.

\externaldocument

[B-]OQN_v12\externalcitedocument[B-]OQN_v12 Introduction — A fundamental question that is currently triggering much attention in the quantum information, quantum many-body and computer science communities is whether quantum effects may lead to advantages in solving computational problems Kadowaki and Nishimori (1998); Morita and Nishimori (2008); Aharonov et al. (2008); Boixo et al. (2014); Baldassi and Zecchina (2018); Biamonte et al. (2017). Several quantum algorithms have been proposed which can outperform their best classical counterparts, such as in the paradigmatic examples of integer factorization Shor (1999) and database search problems Grover (1997). Fluctuations due to quantum effects can moreover be employed to improve the performance of classical algorithms by opening “tunnelling” paths through high potential barriers that could otherwise trap a classical system in configurations potentially very different from the sought solution. This is the case, for instance, of quantum annealing Das and Chakrabarti (2008) which seeks to find the state of minimum energy within the energy landscape of e.g. a highly-connected spin system with random couplings Farhi et al. (2001); Santoro et al. (2002); Nishimori and Takada (2017); Denchev et al. (2016). More recently, a further paradigm emerged seeking to exploit the intrinsic open nature of quantum systems for quantum computing Breuer and Petruccione (2002). Its underlying idea is to encode the result of a computation in the stationary state of a suitably engineered Diehl et al. (2008); Müller et al. (2012) quantum dissipative evolution of a many-body (spin) system Verstraete et al. (2009).

In this work we are interested in the question whether quantum effects in a purely dissipative dynamics can be advantageous for an accelerated approach to the thermal state of an interacting spin system. Analogously to the above-mentioned annealing or quantum computation protocols, this equilibrium state may encode the solution of a computational problem or the result of an optimization protocol. Our construction is based on a dissipative – yet quantum – generalization of a classical equilibrium Markov process. The corresponding dynamics has a pure stationary state which yields expectation values for classical observables that are identical to those of a thermal ensemble Alicki (1976). We show that the stationary state is invariant under a set of unitary transformations which, however, affect the dynamics in a non-trivial way Olmos et al. (2014); Marcuzzi et al. (2014). This freedom permits a (quantum) speedup of the approach towards stationarity in comparison with the classical dynamics. We illustrate this in the case of a fully-connected Ising model and show that similar results hold for a Hopfield neural network (HNN) Hopfield (1982); Rotondo et al. (2018), which hints at the possibility of using quantum effects for the accelerated retrieval of patterns.

Quantum generalization of classical stochastic processes — We focus for simplicity on equilibrium spin models. At the classical level, these are described by a set of configurations with the number of spins and some Ising spin variables. Each model is defined in terms of an energy (or “cost”) function . In hard optimization problems Mezard and Montanari (2009), these functions are typically defined so that their global minima correspond to the sought solutions in configuration space. Under any single-spin-flipping (or, more generally, ergodic) dynamics which satisfies detailed balance, the probability of being in a given configuration at time will approach, in the limit , the Gibbs distribution , where is the inverse temperature and is the partition function.

To define a quantum generalization, we first promote the classical variables to quantum spins and encode the corresponding configurations in quantum states such that . In the following, we shall refer to this as the “classical basis” and to observables diagonal in this basis as “classical observables”. The state of the system is generically described by a density matrix . We restrict for simplicity to Markovian and purely dissipative dynamics, so that in the Lindblad formalism Lindblad (1976); Breuer and Petruccione (2007) evolves according to

 ˙ρ=Lρ=∑j(LjρL†j−12{L†jLj,ρ}). (1)

Here denotes the generator of the time evolution and the operators are jump operators. In order to fix the form of the jump operators we define the pure state and require it to be a dark state of the dynamics, i.e., . This ensures that is a stationary state of Eq. (1), and that any expectation value of a classical observable on it corresponds to the (classical) thermal average . Hence, any classical property can be equivalently retrieved from this system, while quantum fluctuations can affect the typical timescales of the dynamics. We note that imposing is reminiscent of the frustration-free property typically associated to special (Rokhsar-Kivelson) systems Castelnovo et al. (2005), as it ensures that the “global” stationary state property is “locally” satisfied by each jump operator.

The dark-state property does not uniquely fix the jump operators. For instance, if from a given set of one were to construct a second one with a set of unitary transformations (), the dynamics would still have as a stationary state (since ). Note that it is the “frustration-free” property of the dark state that allows one to choose different unitary operators for different . As sketched in Fig. 1, this freedom allows to construct and explore different dynamics, all sharing the same stationary state(s). Generically, the typical timescales of the dynamics will change for different , yielding in some cases a faster, in others a slower approach to stationarity.

In the following we construct explicitly a set of jump operators. To this end we define as the list of Ising variables excluding the -th one. Furthermore, we denote by half the energy cost for flipping the -th spin down, leaving the configuration of the remaining spins fixed. From we define an operator obtained by taking the functional form of and replacing every () with the corresponding Pauli matrix . For example, for one would get . The jump operators (one per site) are then constructed according to

 Li=α−ini−α+iσ+i,withα±i=e±β2ΔEi[2cosh(βΔEi)]12. (2)

Here are the Pauli raising and lowering operators, , and the dependence of on all spins but the -th one is implicit. This particular form of the jump operators is convenient, because it will allow us, in the examples discussed further below, to directly relate the purely-dissipative quantum dynamics to the corresponding classical Glauber dynamics with energy function .

To conclude the construction of the jump operators we use the freedom given by the unitaries . For the sake of simplicity we choose local unitaries parameterized by the two angles and : .

Fully-connected quadratic models — To study the dependence of the relaxation timescales under the angles , we focus here on fully-connected models , with being a symmetric real matrix. This will allow us to study the dynamics in terms of semi-classical collective variables. The equations of motion for the local operators generated by the dynamics (1) read (see SM ())

 ˙σxi = −Axi−β2∑k≠iJkisech(βΔEk)σykσyi+sech(βΔEi)2, ˙σyi = −Ayi+β2∑k≠iJkisech(βΔEk)σykσxi, ˙σzi = −Azi+12tanh(βΔEi). (3)

Here (), which depends on the angles through the functions , , . Note that for [i.e., ] the last equation is defined entirely in terms of combinations of matrices. Further below, we shall use this choice (together with ) as a classical reference case, as it will yield, for the -component of spin operator, the same dynamics one would derive from a purely-classical Glauber dynamics.

Fully-connected Ising model — As a first example we consider the fully-connected Ising model, i.e., we choose . We construct a set of semiclassical collective variables that allow to reduce the problem to a system of coupled ordinary differential equations: (). Since their commutator vanishes in the thermodynamic limit , we can effectively replace them with their expectation values , leading to

 ˙mx = −Ax+12sech(βmz)−β2sech(βmz)(my)2, ˙my = −Ay+β2sech(βmz)mymx, ˙mz = −Az+12tanh(βmz). (4)

Here (). The choice decouples the third equation from the others, implying that the dynamics of proceeds independently from the one of . Furthermore, the equation for is equivalent to the mean-field evolution equation of the order parameter of a classical Ising model under a continuous-time Glauber dynamics. We thereby consider this angle our classical reference point (the specific choice of will not affect the evolution of ).

The stationary state structure can be extracted by setting the l.h.s. of Eqs. (4) to zero: for (high temperature) the equations admit a single (“paramagnetic”) solution , . For (low temperature) the paramagnetic solution becomes unstable and two stable “ferromagnetic” () solutions appear obeying .

In Fig. 2(a) we show the “quantum” (red lines) and “classical” (blue line) evolution of the order parameter in the ferromagnetic phase for a specific choice of and and fixed initial conditions. The observed acceleration away from the classical limit arises in the early stages of the dynamics, whereas the long-time, asymptotic behavior is exponential with the same rate for both curves [panel (b)]. The onset of this exponential decay is thus shifted in time, leading to a speedup.

To quantify this accelerated approach to stationarity, we define the relaxation time as the time it takes the order parameter to approach its stationary value within the threshold (see SM () for more details). The choice of this threshold is arbitrary; however, for most values of and the asymptotic approach to stationarity is exponential with the same rate for both quantum and classical dynamics, as in Fig. 2(b). Hence, the specific value of is not relevant to determine the presence of a speedup. Some small regions where the quantum dynamics is asymptotically slower than the classical one exist, but they are included within regions where the early-time dynamics is slowed down as well. In Fig. 2(c) we show — which is the relaxation time normalized by the classical relaxation time (at and ) — in the -plane for a given choice of very small initial conditions. The plot shows large regions of accelerated relaxation separated by two narrow strips, close to the “classical” regime, where the quantum dynamics experiences instead a slowdown. The shape of these strips shows only a weak dependence on the initial conditions.

This robustness is a consequence of the fact that, while the stationary states are independent of the angles and , their respective basins of attraction change (see sketch in Fig. 1). As can be gleaned from Eqs. (4), identifies an invariant subspace under the classical dynamics (i.e., ). Hence, by choosing the initial conditions the initial dynamics in the -direction will necessarily be slow (due to the almost vanishing derivative). By changing the angles (and thereby the shape of the attraction basins), the same initial condition will generically not be close to such an invariant manifold and the quantum dynamics will start faster, leading to the observed accelerated early-time dynamics. Indeed, the instances in which it is slower [bright regions in panel (c)] correspond to regions where the dynamics swaps from reaching one ferromagnetic stationary state to its opposite [panel (d)]. Hence, for most choices of and the quantum dynamics will be faster than the classical one independently of the initial conditions as long as .

Application to the Hopfield neural network — The HNN is a fundamental model for an associative memory capable of storing a set of spin configurations (). These are referred to as patterns, where each component takes the values (). Hereafter, we shall adopt the vector notation for the -dimensional pattern space, e.g., . For large , the pattern components are assumed to have a random structure symmetrically distributed between , so that and . More specifically, the are described as a set of independent, identically-distributed random variables with distribution . For the HNN, the coupling matrix reads . The corresponding energy function is minimized by choosing configurations for any fixed . Hence, a stochastic Glauber dynamics may have the patterns as stationary states, as long as the number of patterns obeys (Amit et al., 1985a, b). More generally, at finite temperature two phases emerge: for the system is in a paramagnetic phase where the typical configurations have no extensive overlap with any of the patterns []. Instead, identifies a “retrieval” phase where the system picks one of the patterns (say, the -th) and acquires a non-vanishing overlap with it [].

In the corresponding quantum model, this overlap is generalized to with representing the quantum expectation value at time . However, in contrast to the fully-connected Ising model, the equations of motion do not straight-forwardly close in the collective variables (overlaps) and further approximations are required: since the stationary state should yield the same expectation values as a classical HNN, it is natural to think that, at least not too far from the stationary points, the dynamics will be mostly determined by the properties of the overlaps . We thus perform the approximation (consistency checks are discussed in SM ()). Moreover, we exploit the self-averaging property, i.e. that for large we can perform the substitution , where denotes the average over the disorder. This reduces the equations of motion to SM ()

 ˙mx,yμ = −¯Ax,yμ(β), ˙mzμ = −¯Azμ(β)+12¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯ξμtanhβ→ξ⋅→mz, (5)

where (). These equations close in the variables and can be solved numerically, the averages being calculated according to their definition .

As in the fully-connected Ising model, the choice decouples the equations for the -component and makes them equivalent to the classical (mean-field) dynamics. Unlike in the previous case, however, the order parameters [the overlaps ] are not necessarily known (one would have to know the patterns to derive them from the spin configurations) and therefore an averaging over initial conditions is required.

We focus on initial conditions representing “corrupted memories”, i.e., we pick the initial (random) overlaps uniformly over a small interval with the exception of a single one (without loss of generality, we fix ) , which is instead centered around a non-vanishing value . This mimics a case where a classical memory encodes a portion of a given pattern and the task of the neural network is to reconstruct the remainder.

Notably, merely changing and does not lead to any speedup as in the Ising case. To find a regime where quantum acceleration occurs, a possible protocol is the following: for each random choice of the initial conditions a spin rotation is performed which maps the overlaps’ -components in -components and vice versa [, , ]. Afterwards the quantum dynamics is switched on. To quantify the speedup the timescale is measured analogous to the Ising case and compared with the classical value . The ratios corresponding to different initial conditions are subsequently averaged. The results are shown in Fig. 3, where this ratio is displayed in the (,)-plane for three choices of . We can identify choices of the angles for which an accelerated pattern retrieval is indeed achieved. In particular, the gain is more significant for less corrupted initial memories, i.e. larger .

Conclusions — We investigated how quantum effects can accelerate the approach towards stationarity of a classical stochastic system. Our results suggest that indeed complementary methods to the established quantum annealing techniques Kadowaki and Nishimori (1998); Morita and Nishimori (2008); Aharonov et al. (2008); Boixo et al. (2014); Baldassi and Zecchina (2018); Denchev et al. (2016) may be exploited to enhance pattern retrieval in neural networks. In the future it will be interesting to extend this idea to more complex settings, for example spin systems with disorder that realize specific instances of NP-hard problems, such as the Sherrington-Kirkpatrick model Mézard et al. (1990) or the spin glass phase of the HNN (Amit et al., 1985b). Furthermore, it would be interesting to study experimental implementations of the proposed quantum dynamics based on open multimodal cavity arrays, which have been proposed as emulators for quantum neural networks and glassy systems Gopalakrishnan et al. (2011); Strack and Sachdev (2011); Rotondo et al. (2015a, b); Vaidya et al. (2018).

Acknowledgments — We acknowledge discussions with M. Müller. The research leading to these results has received funding from the European Research Council under the European Unions Seventh Framework Programme (FP/2007-2013)/ERC Grant Agreement No. 335266 (ESCQUMA). P.R. acknowledges funding by the European Union through the H2020 - MCIF No. . I.L. gratefully acknowledges funding through the Royal Society Wolfson Research Merit Award.

## Ii Derivation of Eqs. (LABEL:B-eq.spin) of the main text

We firstly consider the equations of motion generated by the Markovian and purely dissipative dynamics defined in Eq.(LABEL:B-eq.Lindblad) with the jumps operators (LABEL:B-eq.jumpop), only secondly applying the unitary transformation obtaining the Eqs. (LABEL:B-eq.spin). Here, any operator evolves according to the adjoint Lindblad equation, .

The equations of motion for the operators , read

 ˙σγi=∑kL†kσγiLk−12{L†kLk,σγi}. (6)

Let us first specialize (6) for the operator. As it is , the terms arising from the sum over vanish, and Eq.(6) reads

 ˙σzi=(α−ini−α+iσ−i)σzi(α−ini−α+iσ+i)−12{(α−ini−α+iσ−i)(α−ini−α+iσ+i),σzi}==(α−i)2ni+(α+i)2σ−iσ+i−α−iα+iσ+i−α−iα+iσ−i−12(α−i)2ni+12(α+i)2σ−iσ+i+−12α−iα+iσ+i+12α−iα+iσ−i−12(α−i)2ni+12(α+i)2σ−iσ+i+12α−iα+iσ+i−12α−iα+iσ−i=−α−iα+iσxi+(α+i)2(1−σzi)==−12sech(βΔEi)σxi+12[1+tanh(βΔEi)](1−σzi), (7)

where it has been used that , , . We can now consider the equation of motion of the operator , which reads

 ˙σ+i=∑k{(α−knk−α+kσ−i)σ+i(α−knk−α+kσ+i)−12[(α−knk−α+kσ−i)(α−knk−α+kσ+i)σ+i+σ+i(α−knk−α+kσ−i)(α−knk−α+kσ+i)]}. (8)

The term arising from the sum over is

 (α−i)2σ+ini+(α+i)2(1−σzi)2σ+i−α−iα+iσ+iσ+i−α−iα+iσ−iσ+ini−12(α−i)2σ+i−−12(α+i)2(1−σzi)2σ+i+12α−iα+iσ+iσ+i+12α−iα+i(1−σzi)2−12(α−i)2σ+ini−12(α+i)2(1+σzi)2σ+i+12α−iα+iσ+iσ+i+12α−iα+i(1+σzi)2==−(α−i)2+(α+i)22σ+i+α−iα+i2=−σ+i2+α−iα+i2. (9)

Before evaluating the part of Eq.(8) coming from the sum over , as it is for , it is useful to consider the following expression

 σ+iα+k=σ+ieβ2∑jJkjσzj√2cosh(β∑jJkjσzj)=eβ2∑j≠iJkjσzje−β2Jki√2cosh(β∑j≠iJkjσzj−βJki)σ+i==eβ2∑jJkjσzje−β2Jkie−β2Jkiσzi√2cosh(β∑j≠iJkjσzj−βJki)σ+i==eβ2∑jJkjσzj√2cosh(β∑jJkjσzj)=α+k(cosh(β∑jJkjσzj)cosh(β∑j≠iJkjσzj−βJki))12≡fkie−βJkiσ+i==α+kfkie−βJkiσ+i, (10)

which, after having defined , can be written as

 σ+iα+k=α+kiσ+i. (11)

Similarly, it is

 σ+i(α+k)2=(α+ki)2σ+i,σ+iα−k=α−kiσ+i,σ+i(α−k)2=(α−ki)2σ+i, (12)

where . Let us now go ahead in evaluating the last part of Eq.(8), which reads

 (13)

so that it is

 ˙σ+i=−σ+i2+α−iα+i2+∑k≠i12[−(α−k−α−ki)2nk−(α+k−α+ki)2σ−kσ+k++(α+k−α+ki)(α−kσ+k−α−kiσ−k)−(α−k−α−ki)(α+kiσ+k−α+kσ−k)]σ+i. (14)

In order to simplify last equation, we focus on fully-connected models with , with the corresponding operator as defined in the main text. Requiring this energy to stay finite in the thermodynamic limit, we have . In the following, we perform a power series expansion with respect the parameter () neglecting terms of the order . In particular, we consider the following expression

 (α+k−α+ki)=e+β2∑jJkjσzj√2cosh(βΔEk)−e+β2∑jJkjσzje−βJki√2cosh(β∑j≠iJkjσzj−βJki)==e+β2∑jJkjσzj√2cosh(βΔEk)⎡⎢⎣1−e−βJki(cosh(βΔEk)cosh(β∑j≠iJkjσzj−βJki))12⎤⎥⎦==e+β2∑jJkjσzj√2cosh(βΔEk)⎡⎢⎣1−e−βJki(cosh(βΔEk)cosh(β∑jJkjσzj−βJki(1+σzi)))12⎤⎥⎦≃=e+β2∑jJkjσzj√2cosh(βΔEk)⎡⎢⎣1−e−βJki(11−tanh(β∑jJkjσzj)βJki(1+σzi))12⎤⎥⎦≃e+β2∑jJkjσzj√2cosh(βΔEk)⎡⎣1−(1−βJki)(1+12tanh(β∑jkJkjσzj))βJki(1+σzi)⎤⎦≃e+β2∑jJkjσzj√2cosh(βΔEk)⎡⎣βJki−12tanh(β∑jkJkjσzj)βJki(1+σzi)⎤⎦≃O(βJki). (15)

Keeping terms up to the first order in , then Eq.(14) reads

 ˙σ+i=−σ+i2+α−iα+i2+∑k≠i(βJki)α−kα+k(iσyk)σ+i=−σ+i2+14sech(βΔEi)+iβ2∑k≠isech(βΔEk)σykσ+i, (16)

and the dynamical equations for are

 ˙σxi=−σxi2+12sech(βΔEi)−β2∑k≠iJkisech(βΔEk)σykσyi,˙σyi=−σyi2+β2∑k≠iJkisech(βΔEk)σykσxi, (17)

### ii.1 Unitary transformation of jump operators

As we stress in the main text, the stationary state of the dynamics given by Eq.(LABEL:B-eq.Lindblad) does not change under any local set of unitary transformation , (), that modifies the set as follows

 L′i≡uiLi=α−iuini−α+iuiσ+i, (18)

We choose to generalize the jump operators with the following equal spin rotation on all sites

 ui=(cosθeiϕsinθ−e−iϕsinθcosθ). (19)

In order to obtain the equations of motion for the operator , it is useful to evaluate the matrix representation of the operator , which is

 L†iLi=(α−i0−α+i0)(α−i−α+i00)=((α−i)2−α−iα+i−α−iα+i(α+i)2). (20)

Le us start by considering the dynamical equations for , which read

 ˙σzi=L′†iσziL′i−12{L′†iL′i,σzi