Scalable Emulation of Stoquastic Hamiltonians with Room Temperature p-bits

Scalable Emulation of Stoquastic Hamiltonians with Room Temperature p-bits

Kerem Y. Camsari, Shuvro Chowdhury and Supriyo Datta School of Electrical and Computer Engineering, Purdue University, IN, 47907
July 2, 2019
Abstract

The growing field of quantum computing is based on the concept of a q-bit which is a delicate superposition of 0 and 1, requiring cryogenic temperatures for its physical realization along with challenging coherent coupling techniques for entangling them. By contrast, a probabilistic bit or a p-bit is a robust classical entity that fluctuates between 0 and 1, and can be implemented at room temperature using present-day technology. Just as three-terminal transistors provide a building block for large functional circuits, a three terminal realization of the p-bit can provide a building block for p-circuits that has been shown to perform a variety of useful functions. In this paper, we introduce a new application of p-circuits, namely to perform Path Integral Monte Carlo (PIMC) simulations of quantum systems based on the well-known Suzuki-Trotter decomposition that maps a -dimensional quantum many body Hamiltonian to a +1-dimensional classical Hamiltonian. This approach allowing a quantum system to be emulated by a number of classical replicas is commonly used in software or high-level hardware simulations. What we show here is that it should be possible to build low-level hardware accelerators that use replicated p-bits to emulate a given q-bit network. Using full device-level SPICE simulations we demonstrate that the correct quantum correlations can be obtained using a classical p-circuit built with existing technology and operating at room temperature. Another advantage of p-bit networks is that they can be interconnected using conventional electronic devices such as GPUs or FPGAs. This could allow complex connectivity beyond nearest neighbor coupling as well as the implementation of arbitrary -body interactions with .

pacs:

I Introduction

FIG. 1: q-bit to p-bit mapping: (a) A -dimensional q-bit array described by the Transverse Ising Hamiltonian can be mapped to a +1-dimensional p-bit array with -replicas that are coupled in the vertical direction by the Suzuki-Trotter decomposition (The case for is illustrated). In this scheme, the replicas are always connected with periodic boundary conditions such that . The many-body quantum and the corresponding classical Hamiltonian are shown where the operators of the quantum system are replaced with binary p-bits in the classical system with . Corresponding coupling terms are and .

The basic building block of conventional digital electronics is the CMOS (Complementary Metal Oxide Semiconductor) transistor that is used to represent deterministic bits, that are either 0 or 1. Quantum computing, on the other hand, is based on q-bits that are coherent, delicate superpositions of 0 and 1. It is possible to define an entity intermediate between bits and q-bits that are classical but probabilistic, which we call “p-bits” Camsari et al. (2017a). It has been argued that just as three-terminal transistors provide a building block for large functional circuits, a three terminal realization of the p-bit can provide a building block for p-circuits Behin-Aein et al. (2016) reminiscent of the probabilistic computer described by Feynman in the same paper that helped launch the field of quantum computing Feynman (1982).

Such p-circuits can perform useful functions broadly relevant in the context of quantum computing and machine learning Camsari et al. (2018). For example, p-circuits can be used to perform classical annealing in hardware Sutton et al. (2017), perform integer factorization by operating multipliers in an invertible mode Camsari et al. (2017a); Pervaiz et al. (2017a), just like quantum annealers that have been used for similar applications Martoňák et al. (2004); Peng et al. (2008). In the machine learning context, p-bits can function as hardware accelerators for binary stochastic neurons Ackley et al. (1985) that can be used to become efficient inference engines Faria et al. (2018); Zand et al. (2018), or they can be used in an efficient calculation of correlations to accelerate learning algorithms, an application area also discussed in the context of quantum computing Bian et al. (2010); Adachi and Henderson (2015); Liu et al. (2018); Amin et al. (2018).

Scope

In this paper, we introduce a new application of p-circuits, namely to perform Path Integral Monte Carlo (PIMC) simulations of quantum systems based on the well-known Suzuki-Trotter decomposition that maps a -dimensional quantum many body Hamiltonian to a +1-classical Hamiltonian. This allows a quantum system to be emulated by a number of classical replicas that are interacting with each other Suzuki (1976) (FIG. 1) and this approach is commonly used in software or high-level hardware simulations Santoro et al. (2002); Heim et al. (2015); Denchev et al. (2016); Baldassi and Zecchina (2018); Okuyama et al. (2017). By contrast, we show that p-bits realized as low-level hardware building blocks to build p-computers can significantly speed up the simulation.

We show that for a class of quantum Hamiltonians generally referred to as stoquastic Hamiltonians Albash and Lidar (2018a) which avoid the sign problem Troyer and Wiese (2005) and are amenable to efficient PIMC simulation, it should be possible to build hardware accelerators using replicated p-bits to emulate the thermodynamics of q-bit networks. The number of p-bits required to emulate a given q-bit network is typically a factor of 25-100 larger, but this is offset by the relative ease of implementation. Three-terminal p-bits are straightforwardly implemented at room temperature with Magnetoresistive Random Access Memory (MRAM) technology which is currently in production with nearly Gb’s of memory elements. Non-magnetic and completely digital implementations of p-bits are also possible Pervaiz et al. (2017a, b) though they would require much larger energy and area Zand et al. (2019).

A p-bit based quantum emulator could overcome fundamental difficulties associated with the low temperature operation of quantum annealers Albash et al. (2017). It has recently been suggested Albash and Lidar (2018b) that among quantum annealing options, simulated quantum annealing (SQA) exhibits the best scaling properties, performing even better than experimental quantum annealers in some cases. As such, accelerating the software-based SQA with very low-level hardware is desirable.

Another advantage of p-bit networks is that unlike q-bit networks they can be interconnected using conventional electronic devices such as GPUs or FPGAs. This could allow all-to-all connectivity beyond nearest neighbor coupling without requiring any special encoding Lechner et al. (2015); De las Cuevas and Cubitt (2016). Moreover, it should allow the implementation of arbitrary -body interactions that are usually avoided by introducing ancillary bits to map them into 2-body interactions Biamonte (2008); Jiang et al. (2018).

Organization of the paper

We start in Section II, with a description of the mapping from the q-bit network to the p-bit network, along with the behavioral equations describing the dynamics of the latter. These behavioral equations for p-circuits are similar to those used for stochastic neural networks and are often implemented in software for machine learning applications. However, a hardware implementation can provide a significant speed-up especially because it can allow parallel asynchronous operation under the right conditions.

Next in Section III we consider one popular example of a stoquastic Hamiltonian, namely the Transverse Ising Hamiltonian Kadowaki and Nishimori (1998); Pfeuty (1970), commonly employed by quantum annealers Johnson et al. (2011). We compare the exact quantum results for the averages and correlations with the results obtained from the p-bit network demonstrating the impressive accuracy that can be achieved with a limited number of replicas. In Section IV we show another example, namely the ferromagnetic Heisenberg Model Suzuki (1976); Barma and Shastry (1978), once again comparing the exact quantum results with probabilistic simulations of the p-bit network.

Finally in Section V we present SPICE simulations of actual hardware implementations that can be built with existing Embedded Magnetoresistive RAM (eMRAM) technology that has been under development by a number of foundries Lin et al. (2009); Song et al. (2016); Shum et al. (2017). Unlike standard eMRAM where a non-volatile MTJ is carefully engineered with a large energy barrier (-) so that the magnetization state is retained for a long time Bhatti et al. (2017), the free layer of the MTJ for the p-bit is designed as a thermally unstable magnet ( whose magnetization rapidly fluctuates in time in the presence of thermal noise Camsari et al. (2017b). Using full device-level SPICE simulations corresponding to the p-bit and a resistive interconnection matrix, we demonstrate that the correct quantum correlations can be obtained using this classical p-circuit which can be built with existing technology at room temperature. With appropriate magnet designs Hassan et al. (2019) individual p-bits can flip in a nanosecond or less so that with a million of them operating in parallel, we should have petaflips per second which is quite impressive compared to existing digital implementations Fang et al. (2014).

Ii q-bit to p-bit

Since the seminal work of Suzuki Suzuki (1976), it is well-known that a -dimensional quantum many-body Hamiltonian can be mapped to a +1-dimensional classical Hamiltonian applying the so-called Suzuki-Trotter decomposition Trotter (1959); Suzuki (1976), which is used as a basis for PIMC methods to simulate quantum annealing using classical computers Santoro et al. (2002). This decomposition results in the quantum system being mapped to a classical system with replicas that are coupled to each other. In this paper we consider two examples as described in the next two Sections, but the principles apply to stoquastic Hamiltonians in general.

Consider for example the Transverse Ising Hamiltonian in 1D written as Pfeuty (1970):

(1)

The Suzuki-Trotter mapping produces the following classical 2D Hamiltonian Santoro et al. (2002):

(2)

where , being the number of replicas, and the vertical coupling term is and . Note how the quantum mechanical operators in Eq. 1 have become classical spins in Eq. 2. The mapping of Eq. 2 becomes exact in the limit of infinite replicas () however, for finite replicas the error scales as Heim et al. (2015) and can be made arbitarily small by choosing an appropriate number of replicas.

Behavioral model for p-bits

The classical system expressed by Eq. 2 can be represented by p-circuits that are built out of p-bits. There are two central equations that are used to describe p-bit networks Camsari et al. (2017a):

(3a)
where is dimensionless time that is incremented one at a time, is a random number uniformly distributed between 1 and +1 and at each time step is uncorrelated with the chosen at the previous step. is the dimensionless current to each p-bit, where is the inverse temperature. in general, is calculated according to,
which in the present case, becomes:
(3b)

where is the interconnection matrix and is the bias term. We refer to Eq. 3 as Probabilistic Spin Logic (PSL) equations and note that these equations are essentially the same as those discussed in the context of stochastic neural networks such as Boltzmann Machines, developed by Hinton and colleagues Ackley et al. (1985).

It is important to note that while Eq. 3b is a linear synapse that typically arises from quadratic Hamiltonians with 2-body interactions, specially designed digital CMOS circuits can be used to implement more complicated interactions arising from cost functions such as generalized Hopfield models with -body interactions Gardner (1987); Seki and Nishimori (2015). Such a flexibility of implementing complicated interactions could be a key advantage for hardware p-circuits.

PSL dynamics

PSL equations can be updated to approximate the steady state joint probability density for any matrix, symmetric or asymmetric. For symmetric matrices, the joint probability density is simply expressed by the classical Boltzmann Law, , where is the energy for a given configuration , . There are two important conditions regarding the updating of Eq. 3. First, Eq. 3b needs to be calculated much faster than Eq. 3a for proper convergence Pervaiz et al. (2017a), a requirement particularly relevant for hardware implementations. Second, Eq. 3a needs to be updated sequentially, as in Gibbs sampling Geman and Geman (1984). The requirement of sequential updating prohibits parallelization in software implementations, except in special cases such as restricted Boltzmann machines where the lack of intralayer connections between “visible” and “hidden” layers allows each layer to be updated in parallel Hinton (2012). For asynchronous hardware implementations, however, a clockless operation seems to satisfy the requirement of sequential updating naturally Pervaiz et al. (2017a); Sutton et al. (2017).

Iii Transverse Ising Hamiltonian

For the 1D Transverse Ising Hamiltonian (Eq. 1), we assume periodic boundary conditions such that . is the local transverse magnetic field and is a local -directed magnetic field. Eq. 1 can be constructed by first writing each term, , as a matrix followed by ordinary matrix multiplication for each product term. These terms are written in terms of Pauli spin matrices () at the lattice point as where is the 2 identity matrix and is the Pauli spin matrix at the term in the product.

FIG. 2: Exact quantum solution of a 1D Transverse Ising Hamiltonian vs Probabilistic Spin Logic (PSL): (a) A 1D ferromagnetic linear chain () with spins (nearest neighbor with periodic boundary conditions) described by the quantum Transverse Ising Hamiltonian (Eq. 1) is solved exactly, as a function of the transverse magnetic field () at an inverse temperature of . A symmetry breaking magnetic field in the direction is used, , so that at , all spins are pointing in the + direction. The green dashed line is obtained by evaluating Eq. 4 as a function of . The red dots represent 100 different PSL runs obtained with different RNGs, each running for time steps. The blue triangles represent the average of PSL simulations and closely match the exact solution, establishing the accuracy of the quantum to classical mapping, with replicas. (b) A probability histogram of correlations of the form are obtained from PSL and Quantum Boltzmann Law at that corresponds to the last point of the -axis in (a). Only a portion of the states are shown for clarity, states in between show essentially identical agreement.

Quantum Boltzmann Law

In principle, Eq. 1 can be exactly solved for any quantity of interest as a function of temperature and all other parameters and , from the principles of quantum statistical mechanics Kadanoff and Baym (1962):

(4)

where is the “inverse temperature” (as defined in Eq. 3a) and we have chosen to use a unit system in which . is the quantity we wish to calculate with a corresponding operator . In practice, directly solving Eq. 4 becomes intractable due to the exponential dependence of the Hamiltonian () to the size of the problem (). Due to its similarity to the classical Boltzmann Law Feynman et al. (2011), we refer to Eq. 4 as the “Quantum Boltzmann Law” throughout this paper and solve it for small 1D systems. To obtain numerically stable results at low temperatures (high ), we first diagonalize the Hamiltonian and subtract the ground state energy from the diagonals, without changing any observable quantities.

Averages and correlations

In FIG. 2a we calculate the average -magnetization of a 1D ferromagnetic () chain with spins, as a function of a transverse magnetic field. The average -magnetization, , is obtained by the operator where provides the net -spin, , at site . To break the symmetry of at low temperatures we introduce a -directed magnetic field. As the transverse magnetic field increases, gradually decreases, while (not shown) increases, as spins become aligned with the transverse magnetic field. Incidentally, the reverse process, starting from a large at a low temperature and slowly decreasing it to find the ground state of a complicated spin-glass, is commonly used in quantum annealing algorithms Heim et al. (2015).

FIG. 2b shows the probabilities of correlated states at a given temperature and transverse field expressed as decimal numbers. This is done by first converting the states to binary numbers such that denotes +1 and denotes 0 and then converting the full state into a decimal number, for example the all down state corresponds to 0, and the all up state corresponds to 255 and so on. There are such states, each with a given probability obtained from Eq. 4. These correlated states are calculated by first constructing an operator for the probability of finding a state at a given site, where is the identity matrix. Similarly, . Using these operators, any correlation of the form can be calculated from the corresponding composite operator:

(5)

There are 256 such operators and Eq. 4 can be used for each of them to obtain a probability for each state for any . FIG. 2b shows these probabilities at a chosen parameter combination and they are in agreement with results obtained from a simulation of p-bits, as we next explain in Section II. Note that this joint probability density contains all statistical information in the system, as averages and other correlations of interest can be calculated from it, for example one can obtain by weighting each state by the net -spin they contribute to the average.

PSL vs Quantum Boltzmann Law

With this picture, the mapped classical Hamiltonian with replicas described in Eq. 2 is used to obtain a consolidated matrix that is of size to be used in Eq. 3. FIG. 2 shows the equivalence of the PSL implementation of the Transverse Ising Hamiltonian to the exact quantum many-body description for a 1D-chain with spins. Note that the p-bit mapping can be applied to much larger spin systems, but an exact solution by Eq. 4 quickly becomes intractable. We investigate the average -spin of this ferromagnetic chain at a constant temperature () as a function of the transverse magnetic field, . A symmetry breaking field (to favor a +1 order) of is applied. As expected, the exact result shows how the average -spin becomes disordered. The PSL results for a replica system reproduce this behavior. The -spin average is obtained by taking an average over the length of the chain, as well as over each replica. The final average (for a given red dot) is recorded at the end of dimensionless time steps. Since a single stochastic point is recorded at the end of , for each point, we observe a variance in the final results, however averaging over 100 different simulations for the same system, we get a very close match to the exact solution.

In FIG. 2b, the full joint probability density for the classical system is obtained from a PSL simulation that is run for dimensionless time steps. The state of each replica with 8-spins is converted into a binary number at each time step, as in the exact solution, and then collected over all replicas. The striking agreement with PSL and the Quantum Boltzmann Law in FIG. 2 establishes the faithful mapping of the quantum system to the classical system, from the behavioral PSL equations Eq. 3.

Iv Ferromagnetic Heisenberg Model

FIG. 3: 1D Heisenberg Model: (a) A 1D Heisenberg model ( spins) with a transverse magnetic field is considered. (b) The chessboard lattice corresponding the 2D classical mapping of the model with periodic boundary conditions in the replica and lattice directions. (c) The interaction terms within the shaded and unshaded unit cells are shown. Within shaded cells, all neighbors are coupled vertically, diagonally and horizontally () in addition to a 4-body interaction energy term that involves the product of all four spins. Terms that arise from the diagonal part of the Hamiltonian, , result in additional horizontal interactions () in both the shaded and unshaded cells. (d) Probability histogram corresponding to a spin ferromagnetic Heisenberg Model. The histogram is obtained by solving the Quantum Boltzmann Law (Eq. 4) with the parameters shown in the inset and by solving behavioral PSL equations but with a modified Eq. 3b to account for a 3-body term arising from the 4-body interaction. For the PSL simulation 20 replicas are used with time steps. Samples taken from different replicas are considered independent and reduced to 16 probabilities to be compared with the original quantum system.

Before proceeding to a hardware implementation showing how replicated networks of p-bits can be built by existing nanodevices, we show another example of a stoquastic Hamiltonian that can be represented by p-bits. The Heisenberg Hamiltonian in 1D in the presence of a transverse magnetic field can be written as:

(6)

Following Suzuki (1976); Bravyi (2014); Barma and Shastry (1978), we apply the Suzuki-Trotter transformation to this system and obtain the chessboard lattice that is shown in Fig. 3b with shaded and unshaded unit cells. The interactions terms for this hardware neural network are shown in Fig. 3c. For the shaded unit cells all two-body interactions () exist in addition to a 4-body interaction () that involves the product of each spin. The two-body interactions can be implemented using a linear synapse of the form of Eq. 3b but the 4-body interaction requires a non-linear synapse that computes the input terms that are products of three neighboring spins. The interaction terms arise due to the diagonal parts of the quantum system, as in the case of the Transverse Ising Hamiltonian, and exists for both the shaded and unshaded unit cells shown in Fig. 3c. The detailed derivations of all interaction terms are shown in Appendix A.

In Fig. 3d, we show a simulation of the classical system using behavioral PSL equations and compare this to the exact solution as before. We choose a set of parameters, that corresponds to the ferromagnetic Heisenberg Model with a small transverse magnetic field in the -direction, such that all off-diagonal terms in the are positive, hence making this system stoquastic Bravyi (2014). In this small example with spins, we observe good agreement between the mapped system and the exact solution.

V p-bit to Stochastic MRAM

FIG. 4: p-bit to Stochastic eMRAM: (a) Each replica in the classical system is represented by a hardware neural network involving p-bits (neurons), interconnected by a resistive network (synapse). The outputs of the p-bits are weighted by the resistive network to become inputs to each other. Bias terms are added as fixed external voltage sources. (b) Detailed circuit schematics of a given p-bit and synapse following Ref. Camsari et al. (2017b): The outputs are collected by a fast operational amplifier (assumed ideal in circuit simulations). Fixed layer ferromagnet (FM) is a stable magnet with a large energy barrier, while the free layer is a circular low-barrier magnet (LBM) () whose magnetization fluctuates in the presence of thermal noise. (c) SPICE simulations for the input-output characteristics of the p-bit: The results for 1000 p-bits where the input voltage is swept from to +, in ns (Inset shows the autocorrelation of the p-bit at ). Each p-bit has a randomized resistance due to the random magnetization of the free layer, showing a range of outputs bounded by the parallel and anti-parallel resistance of the MTJ. Ensemble averaged output for 1000 samples at a given input voltage ( ns for each sample) shows a tanh() behavior. (d) The circuit model that self-consistently solves the stochastic LLG equation with the MTJ and transistor models. is the spin-current exerted on the free layer due to the current polarized by the fixed layer, is the instantaneous magnetization and is the thermal noise field.
FIG. 5: Full circuit simulation of a 4-spin chain with 10 replicas: 1D classical Ising chain of spins is simulated in SPICE with full device models for the stochastic MRAM-based p-bit and a resistive interconnection matrix and compared with Q-PSL and Quantum Boltzmann Equation. , and . The joint probability density is expressed in decimal numbers similar to the previous examples.

We now show how the behavioral p-bit model can be represented by a stochastic neural network in hardware (FIG. 4a). Each replica in the classical system consists of p-bits (neurons) that are interconnected to each other with a resistive network (synapse), a typical architecture often used in many hardware neural networks Yu et al. (2011); Hu et al. (2016) though for more complicated systems involving -body interactions (), standard electronic devices such as FPGA’s could also be used for this purpose, for example as in Ref. McMahon et al. (2016). The extra dimension added by the Suzuki-Trotter transformation would increase the synaptic complexity but for sparse quantum networks, this transformation would only slightly increase the fan-in of each p-bit in the classical network.

We assume that the weighted summation is carried out by ideal operational amplifiers. The replicas are also connected in the vertical direction (not shown in FIG. 4) with nearest neighbor coupling according to the coupling coefficient .

In the case of quantum annealing, the vertical resistors need to be reconfigurable, therefore they need to be designed differently compared to the fixed resistors that represent the transverse coupling . In our device level examples, we use fixed resistors in order to establish the equivalence between the classical and quantum systems and have not performed annealing.

Network parameters

The device equations for the synapse and the p-bit shown in FIG. 4 are given as Camsari et al. (2017b); Faria et al. (2018):

(7)
(8)

Eq. 7 and Eq. 8 are combined with the PSL equations, Eq. 3, to obtain the following equations that map the behavioral PSL equations to physical parameters:

(9)

where is a unit resistor that is used to electrically change the inverse temperature , and is a transistor dependent parameter ( mV) that defines the stochastic window of the p-bit (FIG. 4c). Depending on the sign of the interconnection, , the non-inverted output or the inverted output is used for the synaptic connections.

Device models

The 1T/1MTJ p-bit is modeled by combining a 14 nm-High Performance FinFET model from the open source Predictive Technology Models (PTM) pre () with a stochastic Landau-Lifshitz-Gilbert (sLLG) solver implemented in SPICE Torunbalci et al. (2018), following the design described in Camsari et al. (2017b) (FIG. 4d). The MTJ is modeled as a simple conductor whose conductance depends on the instantaneous magnetization , provided by the sLLG such that

(10)

where and are the parallel and antiparallel resistance of the MTJ and . We use an experimentally measured value for the tunneling magnetoresistance (TMR) after Ref. Lin et al. (2009). is set equal to the transistor resistance at to produce a symmetric sigmoid with no offsets, in this case . The free layer is assumed to be a circular low barrier nanomagnet Cowburn et al. (1999); Debashis et al. (2018) with a diameter of 22 nm and thickness of 2 nm and a saturation magnetization of emu/cc, with a damping coefficient , typical parameters for CoFeB Sankey et al. (2008).

The time dependent magnetization is obtained by solving sLLG equation in the monodomain approximation Sun (2000):

(11a)

is the electron gyromagnetic ratio, is electron charge and is the number of Bohr magnetons () in the volume of the magnet, . contains the external magnetic and internal anisotropy fields of the magnet as well as the noise field. In the case of a circular nanomagnet without an easy-axis anisotropy, the total internal magnetic anisotropy becomes , where - is the easy plane of the magnet. The thermal noise is added in three directions (, and ) with zero mean and in units [] Li and Zhang (2004).

Device operation

The p-bit shown in FIG. 4c is a series-resistance controlled device where the transistor resistance can be made much smaller or much larger compared to the fluctuating MTJ resistance. Therefore, the operation of the p-bit does not require manipulating the magnetization of the free layer unlike in standard spin-transfer-torque MRAM cells. However, the current flowing through the fixed layer of the MTJ produces a spin-polarized spin current that can unintentionally torque the magnet. We assume that this current is given as , where is the fixed layer orientation and is an interface polarization that can be related to TMR Datta et al. (2012). This spin-current is fed back to the sLLG solver and fully accounted for in the calculation of magnetization in our simulations, however for the circular LBM with a large demagnetization field used here, its effects are negligible Faria et al. (2017). Using these models, FIG. 4c shows transient SPICE simulations of a single p-bit output, for 1000 samples where is rapidly swept in 2 ns. The range of stochastic outputs is bounded by a distribution of resistances ranging from to . The ensemble average shows an approximate hyperbolic tangent behavior that allows the mapping shown in Eq. 7.

The inset of FIG. 4d shows the autocorrelation time of the circular in-plane magnet with a lifetime of  ps. The fluctuations for a circular magnet is expected to be faster compared to a magnet with perpendicular anisotropy due to the strong demagnetizing field that keeps the magnetization vector in the easy plane of the magnet Lopez-Diaz et al. (2002). The very short lifetime of such a circular low barrier magnet could allow very fast and efficient sampling times, as long as the interconnection network operates faster than these timescales. In present simulations, the resistive network operates instantaneously with an ideal operational amplifier therefore this requirement is met naturally, however in real implementations the synapse needs to be designed carefully.

The second requirement, the need for sequential updating of each p-bit is met naturally since the probability of simultaneous flips among p-bits is extremely unlikely, therefore hardware p-bits go through an effectively random update order that does not affect their final distribution.

Stochastic MRAM-based p-bit vs Quantum Boltzmann Law

In FIG. 5, using full SPICE simulations for a 40 p-bit network we compute the joint probability density of a spin ferromagnetic chain () using 10 replicas, with and . Unlike FIG. 2, no symmetry breaking field is applied and the network is asynchronously operated for ns, with a time step of 1 ps. All analog voltage values at the end of the SPICE simulation are thresholded (, ) and a time-average is obtained similar to the PSL averaging after converting the state of each p-bit to binary and then to decimal. The results from the full device models seem to be in good agreement with the exact solution obtained from Eq. 4 and the behavioral PSL equations that are included for reference. Note the suppression of states and that correspond to the energetically unfavorable antiferromagnetic configurations and , respectively. The agreement between the full SPICE models with the behavioral and exact solutions establishes the feasibility of the proposed quantum circuit emulator.

Vi Conclusion

We have presented a scalable, room-temperature quantum emulator using stochastic p-bits that can be built by a simple modification of the existing 1T/1MTJ cell of the eMRAM technology. The proposed emulator uses physical replicas for repeated Trotter slices used in software Quantum Monte Carlo methods. Having physical replicas for each slice could enable better scaling properties for quantum annealing compared to classical annealing as discussed in Santoro et al. (2002), since choosing the optimal number of replicas or probing each replica separately to find better energy minima is possible in a physically engineered design, unlike in real quantum systems Heim et al. (2015). The electrical control of annealing parameters, inverse temperature () and transverse field (), could allow a very large number of q-bits to be reliably emulated with room temperature p-bits. Using conventional electronic devices such as GPU’s or FPGA’s to implement the synapses, it should be possible to engineer complicated interactions that extend beyond nearest neighbors and/or involve -body interactions (). We note that even though the “sign problem” limits the universal use of our p-computer, a large number of practically relevant quantum systems could be efficiently emulated by it, considering a large number of optimization problems have been mapped on to the Transverse Ising Hamiltonian Lucas (2014). Our results provide a method of emulating quantum systems with probabilistic hardware in advance of a scalable universal quantum computer.

Acknowledgements.
KYC is grateful to Brian M. Sutton for helpful discussions. This work was supported in part by ASCENT, one of six centers in JUMP, a Semiconductor Research Corporation (SRC) program sponsored by DARPA.

Appendix A : Mapping Quantum Heisenberg Model to a Classical System

The Heisenberg Hamiltonian emulated in Section IV is,

(12)

Following Suzuki (1976); Barma and Shastry (1978), we divide this Hamiltonian into three non-commuting parts, i.e., , where

(13)
(14)
(15)

The -th approximant of the Suzuki-Trotter transformation for this Hamiltonian is then given by

and the classical system becomes,

(16)

with periodicity along ( dimension such that . Next, we evaluate and , with each being a by 1 vector and , where denotes spin of replica. With this choice of and and using some identities, these terms can be decoupled and obtained by evaluating the 44 density matrix:

(17)

where, , are 44 Pauli spin matrices. The corresponding are given by:

We then use the following relation:

(18)

where is a constant that we ignore and

This corresponds to the energy model for the Heisenberg Hamiltonian as shown in Fig. 3.

References

  • Camsari et al. (2017a) Kerem Yunus Camsari, Rafatul Faria, Brian M Sutton,  and Supriyo Datta, “Stochastic p-bits for invertible logic,” Physical Review X 7, 031014 (2017a).
  • Behin-Aein et al. (2016) Behtash Behin-Aein, Vinh Diep,  and Supriyo Datta, “A building block for hardware belief networks,” Scientific reports 6, 29893 (2016).
  • Feynman (1982) Richard P Feynman, “Simulating physics with computers,” International journal of theoretical physics 21, 467–488 (1982).
  • Camsari et al. (2018) Kerem Y Camsari, Brian M Sutton,  and Supriyo Datta, “p-bits for probabilistic spin logic,” arXiv preprint arXiv:1809.04028  (2018).
  • Sutton et al. (2017) Brian Sutton, Kerem Yunus Camsari, Behtash Behin-Aein,  and Supriyo Datta, “Intrinsic optimization using stochastic nanomagnets,” Scientific Reports 7, 44370 (2017).
  • Pervaiz et al. (2017a) Ahmed Zeeshan Pervaiz, Lakshmi Anirudh Ghantasala, Kerem Yunus Camsari,  and Supriyo Datta, “Hardware emulation of stochastic p-bits for invertible logic,” Scientific reports 7, 10994 (2017a).
  • Martoňák et al. (2004) Roman Martoňák, Giuseppe E Santoro,  and Erio Tosatti, “Quantum annealing of the traveling-salesman problem,” Physical Review E 70, 057701 (2004).
  • Peng et al. (2008) Xinhua Peng, Zeyang Liao, Nanyang Xu, Gan Qin, Xianyi Zhou, Dieter Suter,  and Jiangfeng Du, “Quantum adiabatic algorithm for factorization and its experimental implementation,” Physical review letters 101, 220405 (2008).
  • Ackley et al. (1985) David H. Ackley, Geoffrey E. Hinton,  and Terrence J. Sejnowski, “A Learning Algorithm for Boltzmann Machines*,” Cognitive Science 9, 147–169 (1985).
  • Faria et al. (2018) Rafatul Faria, Kerem Y Camsari,  and Supriyo Datta, “Implementing bayesian networks with embedded stochastic mram,” AIP Advances 8, 045101 (2018).
  • Zand et al. (2018) Ramtin Zand, Kerem Yunus Camsari, Steven D Pyle, Ibrahim Ahmed, Chris H Kim,  and Ronald F DeMara, “Low-energy deep belief networks using intrinsic sigmoidal spintronic-based probabilistic neurons,” in Proceedings of the 2018 on Great Lakes Symposium on VLSI (ACM, 2018) pp. 15–20.
  • Bian et al. (2010) Zhengbing Bian, Fabian Chudak, William G Macready,  and Geordie Rose, “The ising model: teaching an old problem new tricks,” D-wave systems 2 (2010).
  • Adachi and Henderson (2015) Steven H Adachi and Maxwell P Henderson, “Application of quantum annealing to training of deep neural networks,” arXiv preprint arXiv:1510.06356  (2015).
  • Liu et al. (2018) Jeremy Liu, Federico M Spedalieri, Ke-Thia Yao, Thomas E Potok, Catherine Schuman, Steven Young, Robert Patton, Garrett S Rose,  and Gangotree Chamka, ‘‘Adiabatic quantum computation applied to deep learning networks,” Entropy 20, 380 (2018).
  • Amin et al. (2018) Mohammad H Amin, Evgeny Andriyash, Jason Rolfe, Bohdan Kulchytskyy,  and Roger Melko, “Quantum boltzmann machine,” Physical Review X 8, 021050 (2018).
  • Suzuki (1976) Masuo Suzuki, “Relationship between d-dimensional quantal spin systems and (d+ 1)-dimensional ising systems: Equivalence, critical exponents and systematic approximants of the partition function and spin correlations,” Progress of theoretical physics 56, 1454–1469 (1976).
  • Santoro et al. (2002) Giuseppe E Santoro, Roman Martoňák, Erio Tosatti,  and Roberto Car, “Theory of quantum annealing of an ising spin glass,” Science 295, 2427–2430 (2002).
  • Heim et al. (2015) Bettina Heim, Troels F Rønnow, Sergei V Isakov,  and Matthias Troyer, “Quantum versus classical annealing of ising spin glasses,” Science 348, 215–217 (2015).
  • Denchev et al. (2016) Vasil S Denchev, Sergio Boixo, Sergei V Isakov, Nan Ding, Ryan Babbush, Vadim Smelyanskiy, John Martinis,  and Hartmut Neven, “What is the computational value of finite-range tunneling?” Physical Review X 6, 031015 (2016).
  • Baldassi and Zecchina (2018) Carlo Baldassi and Riccardo Zecchina, “Efficiency of quantum vs. classical annealing in nonconvex learning problems,” Proceedings of the National Academy of Sciences , 201711456 (2018).
  • Okuyama et al. (2017) T. Okuyama, M. Hayashi,  and M. Yamaoka, “An ising computer based on simulated quantum annealing by path integral monte carlo method,” in 2017 IEEE International Conference on Rebooting Computing (ICRC) (2017) pp. 1–6.
  • Albash and Lidar (2018a) Tameem Albash and Daniel A. Lidar, “Adiabatic quantum computation,” Rev. Mod. Phys. 90, 015002 (2018a).
  • Troyer and Wiese (2005) Matthias Troyer and Uwe-Jens Wiese, “Computational complexity and fundamental limitations to fermionic quantum monte carlo simulations,” Physical review letters 94, 170201 (2005).
  • Pervaiz et al. (2017b) Ahmed Zeeshan Pervaiz, Brian M. Sutton, Lakshmi Anirudh Ghantasala,  and Kerem Y. Camsari, “Weighted p-bits for FPGA implementation of probabilistic circuits,” arXiv:1712.04166 [cs]  (2017b), arXiv: 1712.04166.
  • Zand et al. (2019) Ramtin Zand, Kerem Y Camsari, Supriyo Datta,  and Ronald F DeMara, “Composable probabilistic inference networks using mram-based stochastic neurons,” ACM Journal on Emerging Technologies in Computing Systems (JETC) 15, 17 (2019).
  • Albash et al. (2017) Tameem Albash, Victor Martin-Mayor,  and Itay Hen, “Temperature scaling law for quantum annealing optimizers,” Physical review letters 119, 110502 (2017).
  • Albash and Lidar (2018b) Tameem Albash and Daniel A. Lidar, “Demonstration of a scaling advantage for a quantum annealer over simulated annealing,” Phys. Rev. X 8, 031016 (2018b).
  • Lechner et al. (2015) Wolfgang Lechner, Philipp Hauke,  and Peter Zoller, “A quantum annealing architecture with all-to-all connectivity from local interactions,” Science advances 1, e1500838 (2015).
  • De las Cuevas and Cubitt (2016) Gemma De las Cuevas and Toby S Cubitt, “Simple universal models capture all classical spin physics,” Science 351, 1180–1183 (2016).
  • Biamonte (2008) JD Biamonte, “Nonperturbative k-body to two-body commuting conversion hamiltonians and embedding problem instances into ising spins,” Physical Review A 77, 052331 (2008).
  • Jiang et al. (2018) Shuxian Jiang, Keith A Britt, Travis S Humble,  and Sabre Kais, ‘‘Quantum annealing for prime factorization,” arXiv preprint arXiv:1804.02733  (2018).
  • Kadowaki and Nishimori (1998) Tadashi Kadowaki and Hidetoshi Nishimori, “Quantum annealing in the transverse ising model,” Phys. Rev. E 58, 5355–5363 (1998).
  • Pfeuty (1970) Pierre Pfeuty, “The one-dimensional ising model with a transverse field,” ANNALS of Physics 57, 79–90 (1970).
  • Johnson et al. (2011) Mark W Johnson, Mohammad HS Amin, Suzanne Gildert, Trevor Lanting, Firas Hamze, Neil Dickson, R Harris, Andrew J Berkley, Jan Johansson, Paul Bunyk, et al., “Quantum annealing with manufactured spins,” Nature 473, 194 (2011).
  • Barma and Shastry (1978) Mustansir Barma and B Sriram Shastry, “Classical equivalents of one-dimensional quantum-mechanical systems,” Physical Review B 18, 3351 (1978).
  • Lin et al. (2009) CJ Lin, SH Kang, YJ Wang, K Lee, X Zhu, WC Chen, X Li, WN Hsu, YC Kao, MT Liu, et al., “45nm low power cmos logic compatible embedded stt mram utilizing a reverse-connection 1t/1mtj cell,” in Electron Devices Meeting (IEDM), 2009 IEEE International (IEEE, 2009) pp. 1–4.
  • Song et al. (2016) YJ Song, JH Lee, HC Shin, KH Lee, K Suh, JR Kang, SS Pyo, HT Jung, SH Hwang, GH Koh, et al., “Highly functional and reliable 8mb stt-mram embedded in 28nm logic,” in Electron Devices Meeting (IEDM), 2016 IEEE International (IEEE, 2016) pp. 27–2.
  • Shum et al. (2017) D. Shum, D. Houssameddine, S. T. Woo, Y. S. You, J. Wong, K. W. Wong, C. C. Wang, K. H. Lee, K. Yamane, V. B. Naik, C. S. Seet, T. Tahmasebi, C. Hai, H. W. Yang, N. Thiyagarajah, R. Chao, J. W. Ting, N. L. Chung, T. Ling, T. H. Chan, S. Y. Siah, R. Nair, S. Deshpande, R. Whig, K. Nagel, S. Aggarwal, M. DeHerrera, J. Janesky, M. Lin, H. J. Chia, M. Hossain, H. Lu, S. Ikegawa, F. B. Mancoff, G. Shimon, J. M. Slaughter, J. J. Sun, M. Tran, S. M. Alam,  and T. Andre, “Cmos-embedded stt-mram arrays in 2x nm nodes for gp-mcu applications,” in 2017 Symposium on VLSI Technology (2017) pp. T208–T209.
  • Bhatti et al. (2017) Sabpreet Bhatti, Rachid Sbiaa, Atsufumi Hirohata, Hideo Ohno, Shunsuke Fukami,  and SN Piramanayagam, “Spintronics based random access memory: A review,” Materials Today  (2017).
  • Camsari et al. (2017b) Kerem Yunus Camsari, Sayeef Salahuddin,  and Supriyo Datta, “Implementing p-bits with embedded mtj,” IEEE Electron Device Letters 38, 1767–1770 (2017b).
  • Hassan et al. (2019) Orchi Hassan, Rafatul Faria, Kerem Yunus Camsari, Jonathan Z Sun,  and Supriyo Datta, “Low barrier magnet design for efficient hardware binary stochastic neurons,” IEEE Magnetics Letters  (2019).
  • Fang et al. (2014) Ye Fang, Sheng Feng, Ka-Ming Tam, Zhifeng Yun, Juana Moreno, Jagannathan Ramanujam,  and Mark Jarrell, ‘‘Parallel tempering simulation of the three-dimensional edwards–anderson model with compact asynchronous multispin coding on gpu,” Computer Physics Communications 185, 2467–2478 (2014).
  • Trotter (1959) Hale F Trotter, “On the product of semi-groups of operators,” Proceedings of the American Mathematical Society 10, 545–551 (1959).
  • Gardner (1987) E Gardner, “Multiconnected neural network models,” Journal of Physics A: Mathematical and General 20, 3453 (1987).
  • Seki and Nishimori (2015) Yuya Seki and Hidetoshi Nishimori, ‘‘Quantum annealing with antiferromagnetic transverse interactions for the hopfield model,” Journal of Physics A: Mathematical and Theoretical 48, 335301 (2015).
  • Geman and Geman (1984) Stuart Geman and Donald Geman, “Stochastic relaxation, gibbs distributions, and the bayesian restoration of images,” IEEE Transactions on pattern analysis and machine intelligence , 721–741 (1984).
  • Hinton (2012) Geoffrey E Hinton, “A practical guide to training restricted boltzmann machines,” in Neural networks: Tricks of the trade (Springer, 2012) pp. 599–619.
  • Kadanoff and Baym (1962) Leo P Kadanoff and Gordon A Baym, “Quantum statistical mechanics: Green’s function methods in equilibrium and nonequilibirum problems,”  (1962).
  • Feynman et al. (2011) Richard P Feynman, Robert B Leighton,  and Matthew Sands, The Feynman lectures on physics, Vol. I: The new millennium edition: mainly mechanics, radiation, and heat, Vol. 1 (Basic books, 2011).
  • Bravyi (2014) Sergey Bravyi, “Monte carlo simulation of stoquastic hamiltonians,” arXiv preprint arXiv:1402.2295  (2014).
  • Yu et al. (2011) Shimeng Yu, Yi Wu, Rakesh Jeyasingh, Duygu Kuzum,  and H-S Philip Wong, “An electronic synapse device based on metal oxide resistive switching memory for neuromorphic computation,” IEEE Transactions on Electron Devices 58, 2729–2737 (2011).
  • Hu et al. (2016) Miao Hu, John Paul Strachan, Zhiyong Li, Emmanuelle M Grafals, Noraica Davila, Catherine Graves, Sity Lam, Ning Ge, Jianhua Joshua Yang,  and R Stanley Williams, “Dot-product engine for neuromorphic computing: programming 1t1m crossbar to accelerate matrix-vector multiplication,” in Proceedings of the 53rd annual design automation conference (ACM, 2016) p. 19.
  • McMahon et al. (2016) Peter L McMahon, Alireza Marandi, Yoshitaka Haribara, Ryan Hamerly, Carsten Langrock, Shuhei Tamate, Takahiro Inagaki, Hiroki Takesue, Shoko Utsunomiya, Kazuyuki Aihara, et al., “A fully programmable 100-spin coherent ising machine with all-to-all connections,” Science 354, 614–617 (2016).
  • (54) “Predictive Technology Model (PTM) (http://ptm.asu.edu/),” .
  • Torunbalci et al. (2018) M. M. Torunbalci, P. Upadhyaya, S. A. Bhave,  and K. Y. Camsari, “Modular compact modeling of mtj devices,” IEEE Transactions on Electron Devices , 1–7 (2018).
  • Cowburn et al. (1999) R. P. Cowburn, D. K. Koltsov, A. O. Adeyeye, M. E. Welland,  and D. M. Tricker, “Single-domain circular nanomagnets,” Physical Review Letters 83, 1042 (1999).
  • Debashis et al. (2018) Punyashloka Debashis, Rafatul Faria, Kerem Yunus Camsari,  and Zhihong Chen, “Designing stochastic nanomagnets for probabilistic spin logic,” IEEE Magnetics Letters  (2018).
  • Sankey et al. (2008) Jack C Sankey, Yong-Tao Cui, Jonathan Z Sun, John C Slonczewski, Robert A Buhrman,  and Daniel C Ralph, “Measurement of the spin-transfer-torque vector in magnetic tunnel junctions,” Nature Physics 4, 67 (2008).
  • Sun (2000) Jonathan Z Sun, “Spin-current interaction with a monodomain magnetic body: A model study,” Physical Review B 62, 570 (2000).
  • Li and Zhang (2004) Z Li and S Zhang, ‘‘Thermally assisted magnetization reversal in the presence of a spin-transfer torque,” Physical Review B 69, 134416 (2004).
  • Datta et al. (2012) Deepanjan Datta, Behtash Behin-Aein, Supriyo Datta,  and Sayeef Salahuddin, “Voltage asymmetry of spin-transfer torques,” IEEE Transactions on Nanotechnology 11, 261–272 (2012).
  • Faria et al. (2017) Rafatul Faria, Kerem Yunus Camsari,  and Supriyo Datta, ‘‘Low-barrier nanomagnets as p-bits for spin logic,” IEEE Magnetics Letters 8, 1–5 (2017).
  • Lopez-Diaz et al. (2002) L Lopez-Diaz, L Torres,  and E Moro, “Transition from ferromagnetism to superparamagnetism on the nanosecond time scale,” Physical Review B 65, 224406 (2002).
  • Lucas (2014) Andrew Lucas, “Ising formulations of many np problems,” Frontiers in Physics 2, 5 (2014).
Comments 0
Request Comment
You are adding the first comment!
How to quickly get a good reply:
  • Give credit where it’s due by listing out the positive aspects of a paper before getting into which changes should be made.
  • Be specific in your critique, and provide supporting evidence with appropriate references to substantiate general statements.
  • Your comment should inspire ideas to flow and help the author improves the paper.

The better we are at sharing our knowledge with each other, the faster we move forward.
""
The feedback must be of minimum 40 characters and the title a minimum of 5 characters
   
Add comment
Cancel
Loading ...
366228
This is a comment super asjknd jkasnjk adsnkj
Upvote
Downvote
""
The feedback must be of minumum 40 characters
The feedback must be of minumum 40 characters
Submit
Cancel

You are asking your first question!
How to quickly get a good answer:
  • Keep your question short and to the point
  • Check for grammar or spelling errors.
  • Phrase it like a question
Test
Test description