Finding Quantum Critical Points with Neural-Network Quantum States
Abstract
Finding the precise location of quantum critical points is of particular importance to characterise quantum many-body systems at zero temperature. However, quantum many-body systems are notoriously hard to study because the dimension of their Hilbert space increases exponentially with their size. Recently, machine learning tools known as neural-network quantum states have been shown to effectively and efficiently simulate quantum many-body systems. We present an approach to finding the quantum critical points of the quantum Ising model using neural-network quantum states, analytically constructed innate restricted Boltzmann machines, transfer learning and unsupervised learning. We validate the approach and evaluate its efficiency and effectiveness in comparison with other traditional approaches.
1 Introduction
Matthias Vojta, in [79], highlights that “[…] the presence of […] quantum critical points holds the key to so far unsolved puzzles in many condensed matter systems”. Quantum critical points mark the transition between different phases of quantum many-body systems [74] at zero temperature. Finding their precise location is of particular importance to characterise the physical properties of quantum many-body systems [62, 68]. However, these systems are notoriously hard to study because their associated quantum wave functions live in a huge Hilbert space which is the tensor product of the individual Hilbert spaces associated to each constituent of the system. As such, its dimension increases exponentially with the number of constituents. This entails computational complexity issues and calls for deterministic and stochastic approximation algorithms.
Recently, Carleo and Troyer [15] showed that a machine learning tool, which they called neural-network quantum states, can effectively and efficiently simulate quantum many-body systems in different quantum phases and for different parameters of the system. Their approach can be seen as an unsupervised neural network implementation of a variational quantum Monte Carlo method. The authors used a restricted Boltzmann machine to calculate the ground state energy and the time evolution of quantum many-body systems such as the Ising and Heisenberg models. This work triggered a wave of interest in the design of neural network approaches to the study of quantum many-body systems [19, 20, 21, 41, 52, 53, 84, 92].
We present here an approach to finding the quantum critical points of the quantum Ising model using innate restricted Boltzmann machines, transfer learning and unsupervised learning for neural-network quantum states.
We first propose to analytically construct restricted Boltzmann machine neural-network quantum states for quantum states deeply in each phase of the system. We refer to such restricted Boltzmann machine neural-network quantum states as innate as they have innate knowledge, i.e. built-in knowledge rather than knowledge acquired by training, of the system they represent.
We then devise a transfer learning protocol across parameters of the system to improve both the efficiency and the effectiveness of the approach. We finally combine the transfer learning protocol across system parameters with a transfer learning protocol to larger sizes [92] to find the quantum critical points in the limit of infinite size.
We evaluate the efficiency and effectiveness of the approach for one-, two- and three-dimensional Ising models in comparison with other traditional approaches: the exact diagonalization method [83], a deterministic numerical approximation method called the tensor network method [57, 66] and a stochastic method called the quantum Monte Carlo method [7, 9, 32].
The rest of the paper is structured as follows. Section 2 summarises the related work. Section 3 presents the necessary notions of quantum many-body physics, the Ising model, restricted Boltzmann machines and restricted Boltzmann machine neural-network quantum states. Section 4 presents the general approach and the algorithm for finding quantum critical points, the transfer learning protocols and the analytical construction of an initial restricted Boltzmann machine neural-network quantum state. Section 5 reports the results of the comparative performance evaluation of our approach. We conclude and highlight possible future work in Section 6.
2 Related Work
2.1 Quantum many-body physics
Quantum many-body physics [74] is a branch of quantum physics that studies systems with large numbers of interacting particles, or bodies. Some well-known quantum many-body physics models are the Ising, Heisenberg and Hubbard models and their variants [2, 27, 30]. We focus on the quantum Ising model, which has been studied extensively in the literature [10, 70, 86] as it, albeit a simple model, displays most of the qualitative features present in complex models.
More specifically, we focus on finding the ground state of a system in the Ising model and the quantum critical points where the nature and the properties of the ground state change qualitatively.
Several methods have been developed to find the ground state. The most straightforward method is to diagonalize the Hamiltonian matrix that represents the problem, its eigenvalues giving the possible energies and its eigenvectors representing the corresponding states [83].
Even though iterative methods [43, 47] have been devised to improve the efficiency of the diagonalization method, it still does not scale well as the size of the system increases. Instead, deterministic and stochastic approximation methods have been proposed and used. Tensor network methods [66] are deterministic approximation methods using variational techniques and combining exact diagonalization with the iterative generation of an effective, low-dimensional and local Hamiltonian [87]. Quantum Monte Carlo methods [32] are, instead, stochastic.
This work belongs to the general field of quantum machine learning that addresses machine learning problems with quantum computing as well as quantum physics problems with machine learning [6, 13, 25]. One application is the evaluation of the properties of a quantum system at very cold temperatures [15]. Some other applications in this domain include quantum state tomography which reconstructs quantum states from measurements [77], the estimation [39] and control [17] of the parameters of quantum systems, and the design of better experiments [4]. Machine learning algorithms, specifically neural networks, have also been used to classify phases or to detect phase transitions in a supervised [11, 16, 61, 78] and unsupervised [37, 80, 85] manner.
2.2 Neural-network quantum states
Recently, Carleo and Troyer proposed to use restricted Boltzmann machines to simulate quantum many-body systems and introduced neural-network quantum states [15]. Their scheme falls into the family of variational quantum Monte Carlo methods. They tested their approach on the paradigmatic Ising and Heisenberg models and demonstrated that this new method is capable of finding the lowest-energy state and of reproducing the time evolution of these interacting systems. The neural-network quantum states method has been further explored by studying quantum entanglement properties [26], its connection to other methods [18, 31] and its representation power [23, 29, 40, 50]. It has also been used to find excited states [19], to study different quantum models [20, 21, 52] and to aid the simulation of quantum computing [41].
Several works have tried to analytically or algorithmically construct a representation of a quantum many-body system with a restricted Boltzmann machine. Most of these works focus on the architecture and topology of the network. Restricted Boltzmann machines have been constructed for the Majumdar-Ghosh and AKLT models [31], for the CZX model [50] and for the Heisenberg and Hubbard models (in this case combined with the pair-product method) [56]. In the field of quantum error correction, restricted Boltzmann machines have been proposed for the stabilizer code [40] and for the toric code [29]. The authors of [14] algorithmically and deterministically constructed a deep Boltzmann machine from the system parameters. The authors of [18] algorithmically constructed a mapping between restricted Boltzmann machines and tensor networks.
2.3 Restricted Boltzmann machine
The original architecture of neural-network quantum states, which we adopt, leverages the unsupervised training and generative capabilities of restricted Boltzmann machines. Restricted Boltzmann machines are generative energy-based probabilistic graphical models. They were initially invented under the name Harmonium in 1986 [67].
The training of restricted Boltzmann machines can be supervised or unsupervised. In the supervised case, they are usually used as feature extractors. However, they can also be used for classification and regression tasks as in [44, 49, 54, 55, 76]. In the unsupervised case, they have been used in a variety of domains such as face recognition [73], dimensionality reduction [35], unsupervised feature learning [22] and image denoising [71], topic modelling [36, 90], acoustic modelling [24, 38], collaborative filtering [63], anomaly detection [28], fault detection [48] and credit scoring [76].
One of their most interesting characteristics is that they can be used not only as discriminative models but also as generative models [8, 42, 45, 65, 69, 72, 89]. They have been applied to the generation of images [45, 42], videos [69], music [8], speech [89] and human motions [72]. The authors of [65] use the combination of generative and discriminative models of restricted Boltzmann machines in the medical domain to classify and generate fMRI images.
2.4 Transfer learning
We devise two transfer learning protocols that improve the effectiveness, efficiency and scalability of restricted Boltzmann machine neural-network quantum states.
Gale Martin, in [51], was the first to evaluate the benefit of directly copying neural network weights trained on a particular task to another neural network with a different task in order to improve efficiency. His approach was soon further improved by [59]. The notion was later formalised in [75] and in [5] under the name transfer learning.
Transfer learning has been applied to all kinds of unsupervised, supervised and reinforcement learning tasks as reported in several surveys [58, 82]. Transfer learning has been applied to restricted Boltzmann machines for numerous tasks such as reinforcement learning [3] and classification [81, 93]. The authors of [91] applied transfer learning to neural networks and they observed that it improves both efficiency and effectiveness.
3 Neural-Network Quantum States
3.1 Quantum many-body systems
A quantum many-body system [74] consists of a large number of interacting particles, or bodies, evolving in a discrete or continuous space. A particle is, in general, characterised by its external degrees of freedom, such as its position and momentum, and its internal degrees of freedom, such as its magnetic moment, also referred to as spin. In the following, we will concentrate on the spin degree of freedom and consider identical particles pinned at the nodes of a $d$-dimensional lattice ($d = 1, 2, 3$). The size of the system is then given by the number $N$ of particles, the number of possible spin states per particle being fixed.
A quantum many-body model defines how particles interact with each other or with external fields. Several prototypical models, such as the Ising and Heisenberg models and their variants [27, 30], describe the pairwise interactions of the spins of particles in addition to the interaction with external fields. The physical properties of each model depend on the respective magnitude of all these interactions which enter as parameters in the model.
Specifying the value of the spin for each particle gives a configuration of the system. The number of configurations is, therefore, exponential in the number of particles. We will specifically consider spin one-half in the rest of the paper, meaning that each particle can only have two internal states.
In quantum physics, the possible physical states of a given system are described by state vectors $|\Psi\rangle$, called wave functions, living in the so-called state space. Formally, this state space is a complex separable Hilbert space and state vectors are simply linear combinations of all the basis state vectors, denoted by $|x\rangle$, associated to each possible configuration $x$:
$|\Psi\rangle = \sum_{x} \psi(x)\, |x\rangle$ (1)
As easily seen from Equation 1, the dimension of the Hilbert space is given by the number of possible distinct configurations. For the interacting spin systems we consider, the dimension of the Hilbert space is then $2^N$. Each complex coefficient $\psi(x)$ in Equation (1) is called a probability amplitude. Defining the normalisation constant $Z = \sum_x |\psi(x)|^2$, the quantity $P(x) = |\psi(x)|^2 / Z$ gives the probability of the configuration $x$ in the state $|\Psi\rangle$. The collection of all these probabilities defines the multinomial probability distribution of all possible configurations of the system.
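To make the counting concrete, here is a short sketch (not from the paper) that enumerates the $2^N$ configurations of a small spin system and turns a set of complex amplitudes into the Born probabilities $P(x)$; all names are ours.

```python
import itertools
import numpy as np

N = 3                                                  # number of spin-1/2 particles
configs = list(itertools.product([-1, 1], repeat=N))   # all 2^N configurations

rng = np.random.default_rng(0)
# unnormalised complex amplitudes psi(x), one per configuration
psi = rng.normal(size=2**N) + 1j * rng.normal(size=2**N)

Z = np.sum(np.abs(psi) ** 2)          # normalisation constant
probs = np.abs(psi) ** 2 / Z          # P(x) = |psi(x)|^2 / Z, sums to one
```

Even at this toy scale the exponential growth is visible: moving from $N = 3$ to $N = 30$ would already require a billion amplitudes.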
For a given grid, number of particles and external fields, the dynamics of a system is fully described by its Hamiltonian $H$. The Hamiltonian is a Hermitian matrix of size $2^N \times 2^N$ that describes how the system evolves. Furthermore, the eigenvalues of the Hamiltonian are the possible energies of the system and the corresponding eigenvectors are the only possible states in which the system can be individually found after a measurement of its energy has been performed.
The energy functional $E[\Psi]$ of a state with wave function $\psi$ is given in Equation 2, where $E_{loc}(x)$ is the local energy function of a given configuration $x$, as defined in Equation 3, with $H_{x,x'}$ the entry of the Hamiltonian matrix for the configurations $x$ and $x'$:
$E[\Psi] = \dfrac{\sum_x |\psi(x)|^2\, E_{loc}(x)}{\sum_x |\psi(x)|^2}$ (2)
$E_{loc}(x) = \sum_{x'} H_{x,x'}\, \dfrac{\psi(x')}{\psi(x)}$ (3)
Formally, the energy functional is the expected value of the local energy. Do note that, when $|\Psi\rangle$ is an eigenvector of the Hamiltonian, the local energy of any configuration equals the energy of that eigenvector; this is known as the zero-variance property. Based on the variational principle in quantum mechanics, the energy functional of a given state is always larger than or equal to the lowest possible energy of the system, i.e. to the lowest eigenvalue of the Hamiltonian. It reaches this minimal value when $|\Psi\rangle$ is precisely the corresponding eigenvector, called the ground state of the system.
Being the most relevant state at low enough temperatures, the ground state has important physical implications as it can have emerging properties which could not be trivially predicted from the interactions of the particles.
A phase is a region in the space of the parameters of a model in which systems have similar physical properties. In the thermodynamic limit, each possible phase is characterised by so-called order parameters that achieve different values in each phase region. Finding the order parameters that characterise the phases of a system is an open research area which can benefit from neural networks too [13, 25].
A phase transition occurs when the system crosses the boundary between two phases and the order parameters change values. When this happens, the nature and the properties of the ground state change qualitatively. The transition happens when the parameters of a model are varied. In quantum systems, in the limit of infinite system size, the transition is typically described by an abrupt change in the observable physical properties or their derivatives. In particular, the term “quantum phase transition” is used for phase transitions in the ground state alone (i.e. for a system at zero temperature). The parameters of a model that correspond to this abrupt change define the quantum critical points. For finitesize systems, the transition is not abrupt but smooth. Mathematically, this means that, for a given size of the system, we need to find the inflection point of the order parameter as a function of the parameters of the system. Since it is not possible to empirically determine the parameters that yield the quantum critical point of an infinite system, it will be necessary to extrapolate its limit value from a series of values measured or simulated from systems of increasing sizes. In the remainder of the paper, when we mention a critical point, we refer to the quantum critical point.
3.2 Ising Model
The Ising model describes particles pinned on the sites of a lattice carrying a binary discrete spin. Each spin is in one of two states: up (represented by $+1$) or down (represented by $-1$). We have two states per particle and, therefore, the number of configurations equals $2^N$ for $N$ particles. A configuration is given by the value of the spin on each site: $x = (\sigma_1, \ldots, \sigma_N)$ where $\sigma_i \in \{-1, +1\}$.
Each particle interacts with its nearest neighbours and with an external magnetic field along the $x$ axis. We consider a homogeneous Ising model where the parameters are translationally invariant. The parameters of the model that characterise the interaction among the particles and the external field are denoted with $J$ and $h$, respectively.
Equation 4 gives the Hamiltonian matrix of the Ising model, where $\mathrm{nn}(i)$ is a function that returns the nearest-neighbouring sites of site $i$ (each bond counted once) and $\sigma^z_i$ and $\sigma^x_i$ are the Pauli matrices, where $z$ and $x$ denote the spin direction and $i$ indicates the position of the spin they act upon. Only the relative strength between $J$ and $h$ matters. For instance, a realisation of the Ising model with $J = 1$ and $h = 2$ has the same static properties as a realisation with $J = 2$ and $h = 4$, except that the energy is doubled in the latter. Therefore, we refer to $J/h$ as the parameter of the system in the Ising model.
$H = -J \sum_{i} \sum_{j \in \mathrm{nn}(i)} \sigma^z_i \sigma^z_j \; - \; h \sum_{i} \sigma^x_i$ (4)
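As an illustration of this Hamiltonian, the following sketch builds the dense matrix for a small open chain from Kronecker products, assuming the standard transverse-field convention $H = -J\sum \sigma^z_i \sigma^z_{i+1} - h\sum \sigma^x_i$; the function names are ours, and this brute-force construction is only feasible for very small $N$.

```python
import numpy as np
from functools import reduce

sz = np.array([[1.0, 0.0], [0.0, -1.0]])   # Pauli sigma^z
sx = np.array([[0.0, 1.0], [1.0, 0.0]])    # Pauli sigma^x
I2 = np.eye(2)

def op_at(op, i, N):
    # tensor product acting with `op` on site i of an N-site chain
    mats = [I2] * N
    mats[i] = op
    return reduce(np.kron, mats)

def ising_hamiltonian(N, J, h):
    # H = -J sum_i sz_i sz_{i+1} - h sum_i sx_i, open boundary conditions
    H = np.zeros((2 ** N, 2 ** N))
    for i in range(N - 1):                 # nearest-neighbour bonds
        H -= J * op_at(sz, i, N) @ op_at(sz, i + 1, N)
    for i in range(N):                     # transverse field
        H -= h * op_at(sx, i, N)
    return H

H = ising_hamiltonian(4, J=1.0, h=1.0)
E0 = np.linalg.eigvalsh(H)[0]              # ground-state energy by exact diagonalization
```

This is exactly the exact diagonalization baseline mentioned in Section 2: the $2^N \times 2^N$ matrix makes its exponential cost explicit.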
We are interested in the possible magnetic phases of the system. In the paramagnetic phase, the magnetic field $h$ dominates over the interaction $J$. The ground state is oriented in the $x$ direction and the magnetisation in the $z$ direction (and all corresponding correlations) is zero. All configurations are equally probable in this state. In the ferromagnetic phase, where $J > 0$ and $J$ dominates $h$, the particles interact to align parallel to each other. The configurations where spins are parallel to each other (e.g. all spin-ups and all spin-downs) are the most probable ones. In the antiferromagnetic phase, where $J < 0$ and $|J|$ dominates $h$, neighbouring particles interact to align antiparallel to each other. Due to the symmetry of the Ising model, the antiferromagnetic phase is equivalent to the ferromagnetic one, up to a redefinition of the directions of the spins. In particular, the transitions from the paramagnetic to the ferromagnetic and antiferromagnetic phases will happen at the same absolute value of $J/h$. In the following we consider only positive values of $h$.
We study four order parameters that are commonly used in the literature [60, 64] to find the critical points. Firstly, the squared magnetisation along $z$, denoted by $M_F$ and shown in Equation 5, which signals the presence of ferromagnetism and is zero in the paramagnetic and antiferromagnetic phases. $M_F$ becoming non-zero marks the transition point between the paramagnetic and ferromagnetic phases. We refer to this order parameter as the ferromagnetic magnetisation $M_F$.
$M_F = \Big\langle \Big( \frac{1}{N} \sum_{i=1}^{N} \sigma^z_i \Big)^2 \Big\rangle$ (5)
Secondly, the squared staggered magnetisation, denoted by $M_{AF}$ and shown in Equation 6, which signals the presence of antiferromagnetism and is zero in the paramagnetic and ferromagnetic phases. $M_{AF}$ becoming non-zero marks the transition point between the paramagnetic and antiferromagnetic phases. We refer to this order parameter as the antiferromagnetic magnetisation $M_{AF}$.
$M_{AF} = \Big\langle \Big( \frac{1}{N} \sum_{i=1}^{N} (-1)^i \sigma^z_i \Big)^2 \Big\rangle$ (6)
Thirdly, the average ferromagnetic correlation between the particle at a given position $i$ and any particle at distance $r$ from it, denoted by $C_F$ and shown in Equation 7, where $D$ is the range of distances between two particles that we consider. This order parameter shows the inclination of the particles to be aligned with each other. We refer to this order parameter as the ferromagnetic correlation $C_F$.
$C_F = \frac{1}{|D|} \sum_{r \in D} \langle \sigma^z_i \sigma^z_{i+r} \rangle$ (7)
Finally, the average antiferromagnetic correlation between the particle at a given position $i$ and any particle at distance $r$ from it, denoted by $C_{AF}$ and shown in Equation 8, where $D$ is the range of distances between two particles that we consider. We refer to this order parameter as the antiferromagnetic correlation $C_{AF}$.
$C_{AF} = \frac{1}{|D|} \sum_{r \in D} (-1)^r \langle \sigma^z_i \sigma^z_{i+r} \rangle$ (8)
The factors $(-1)^i$ and $(-1)^r$ in Equations 6 and 8, respectively, are inserted in order to add up the magnetisation of spins that are exactly antiferromagnetically correlated.
For one-dimensional systems, the correlation order parameters are computed from the particle at the first position ($i = 1$) to every other particle in the system ($r = 1, \ldots, N-1$). For two-dimensional systems, the correlation order parameters are computed from the first particle in the centre row to the other particles in the same row. For three-dimensional systems, the correlation order parameters are computed from the particle at the centre of the lattice to the other particles in the same row.
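The magnetisation order parameters are easy to evaluate on sampled configurations. Below is a minimal sketch of our reading of Equations 5 and 6 for a one-dimensional chain, checked on hand-built samples; the function names and the toy samples are ours, not the paper's.

```python
import numpy as np

def m_f(samples):
    # squared ferromagnetic magnetisation, Eq. (5):
    # average over samples of (mean spin per sample)^2
    m = samples.mean(axis=1)
    return float(np.mean(m ** 2))

def m_af(samples):
    # squared staggered magnetisation, Eq. (6), for a 1D chain:
    # the (-1)^i factor flips every second spin before averaging
    signs = (-1.0) ** np.arange(samples.shape[1])
    m = (samples * signs).mean(axis=1)
    return float(np.mean(m ** 2))

ferro = np.ones((10, 6))                 # ten all-spin-up samples
neel = np.tile([1.0, -1.0], (10, 3))     # ten perfectly staggered samples
```

On these extreme samples, $M_F = 1$ and $M_{AF} = 0$ for the ferromagnetic set, and the reverse for the staggered set, matching the phases each order parameter is meant to detect.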
For the one-dimensional Ising model, in the limit of infinite size, it is exactly known that the critical points are located at $J/h = \pm 1$ [70]. The system is antiferromagnetic when $J/h < -1$, paramagnetic when $-1 < J/h < 1$ and ferromagnetic when $J/h > 1$, which results from the symmetry between the ferromagnetic and antiferromagnetic phases. For the two-dimensional model, quantum Monte Carlo simulations [7] showed that the three same phases are observed, in the same order, but with critical points located at $J/h \approx \pm 0.328$. For the three-dimensional model, quantum Monte Carlo simulations [9] showed that the critical points are located at $J/h \approx \pm 0.194$.
3.3 Restricted Boltzmann machine
A restricted Boltzmann machine is an energy-based generative model [35, 46]. As shown in Fig. 1, it consists of a visible layer $v = (v_1, \ldots, v_N)$ and a hidden layer $h = (h_1, \ldots, h_M)$. Each one of the visible nodes represents the value of an input. The only design choice is the number of latent variables. It is usual to consider a multiple, $\alpha$, of the number of visible nodes. Therefore, the hidden layer consists of $M = \alpha N$ hidden nodes. The visible node $v_i$ and the hidden node $h_j$ are connected by the weight $W_{ij}$. A restricted Boltzmann machine is fully described by the matrix $W$ of weights.
A restricted Boltzmann machine represents the joint distribution of the configurations of its visible and hidden layers as a function of its weights, as given in Equation 9, where $Z$ is the normalisation constant. To get the distribution of the visible layer alone, we marginalise over the hidden layer in Equation 9 to obtain Equation 10. When trained with a set of example configurations, whether in a supervised or unsupervised setting, the restricted Boltzmann machine learns their distribution by minimising the negative log-likelihood of the examples under Equation 10, updating the weights by stochastic gradient descent with Gibbs sampling or by contrastive divergence [34]. Once trained, the restricted Boltzmann machine is able to sample configurations from this multinomial distribution.
$P(v, h) = \frac{1}{Z} \exp\Big( \sum_{i,j} v_i W_{ij} h_j \Big)$ (9)
$P(v) = \frac{1}{Z} \sum_{h} \exp\Big( \sum_{i,j} v_i W_{ij} h_j \Big) = \frac{1}{Z} \prod_{j} 2 \cosh\Big( \sum_i v_i W_{ij} \Big)$ (10)
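The marginalisation step can be checked numerically. The sketch below, assuming $\pm 1$-valued hidden units and a bias-free weight matrix, compares the explicit sum over all hidden configurations with the closed form $\prod_j 2\cosh(\sum_i v_i W_{ij})$; the function names are ours.

```python
import itertools
import numpy as np

def unnorm_p_brute(v, W):
    # unnormalised P(v): explicit sum of Eq. (9) over all 2^M hidden configurations
    M = W.shape[1]
    total = 0.0
    for hs in itertools.product([-1.0, 1.0], repeat=M):
        total += np.exp(v @ W @ np.array(hs))
    return total

def unnorm_p_closed(v, W):
    # closed form of Eq. (10) for ±1 hidden units: prod_j 2 cosh((v W)_j)
    return float(np.prod(2.0 * np.cosh(v @ W)))

rng = np.random.default_rng(1)
W = rng.normal(scale=0.3, size=(3, 6))
v = np.array([1.0, -1.0, 1.0])
```

The factorisation is what makes the marginal tractable: the brute-force sum costs $2^M$ terms, the closed form only $M$ hyperbolic cosines.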
The Gibbs sampling process is as follows. From a given initial visible configuration $v$, for each hidden node $h_j$, a value is generated by sampling from the conditional probability given in Equation 11. From this hidden configuration $h$, for each visible node $v_i$, a value is generated by sampling from the conditional probability given in Equation 12, where $\sigma(x) = 1/(1 + e^{-x})$ is the logistic function.
$p(h_j = +1 \mid v) = \sigma\Big( 2 \sum_i v_i W_{ij} \Big)$ (11)
$p(v_i = +1 \mid h) = \sigma\Big( 2 \sum_j W_{ij} h_j \Big)$ (12)
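One full Gibbs sweep under the same assumptions ($\pm 1$ units, bias-free weights) can be sketched as follows; `gibbs_step` and the toy sizes are our choices for illustration.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def gibbs_step(v, W, rng):
    # sample h from p(h | v) (Eq. 11), then v from p(v | h) (Eq. 12), for ±1 units
    p_h = sigmoid(2.0 * (v @ W))                              # p(h_j = +1 | v)
    h = np.where(rng.random(W.shape[1]) < p_h, 1.0, -1.0)
    p_v = sigmoid(2.0 * (W @ h))                              # p(v_i = +1 | h)
    v_new = np.where(rng.random(W.shape[0]) < p_v, 1.0, -1.0)
    return v_new, h

rng = np.random.default_rng(2)
W = rng.normal(scale=0.1, size=(4, 8))
v0 = np.ones(4)
v1, h1 = gibbs_step(v0, W, rng)
```

Iterating this step produces the Markov chain of configurations used both for classical RBM training and, below, for sampling configurations of the quantum system.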
3.4 Restricted Boltzmann machine neural-network quantum states
A restricted Boltzmann machine neural-network quantum state is exactly a restricted Boltzmann machine where the visible node $v_i$ represents one of the $N$ particles of the quantum many-body system and its value represents the value of the spin of that particle. Each node of the restricted Boltzmann machine neural-network quantum state is a Bernoulli random variable with two possible outcomes representing the two values of a spin, namely $-1$ or $+1$.
Instead of minimising the negative log-likelihood of the distribution of training data, as is generally the case for unsupervised energy-based machine learning models, restricted Boltzmann machine neural-network quantum states minimise the expected value of the local energy given in Equation 13.
$\langle E_{loc} \rangle = \sum_x P(x)\, E_{loc}(x)$, with $E_{loc}(x) = \sum_{x'} H_{x,x'} \sqrt{\dfrac{P(x')}{P(x)}}$ (13)
In restricted Boltzmann machine neural-network quantum states, in order to minimise the energy of the system, leveraging the variational principle and the zero-variance property, the expected value of the local energy of the configurations is minimised. This makes the connection between the restricted Boltzmann machine neural-network quantum state and the Hamiltonian of the system it is trying to simulate. Indeed, Equation 13 is Equation 3 with the ratio of wave functions replaced by the square root of the ratio of the corresponding probabilities. Here we recall that $P(x)$ is the probability of a configuration, and we stress that the ground state of the Ising model can be chosen as a real and positive function, which allows us to write $\psi(x) = \sqrt{P(x)}$.
The unsupervised training process does not need any examples: it relies on configurations that the machine itself generates. The iterative minimisation process alternates the Gibbs sampling of configurations, the calculation of the expected value of their local energy and stochastic gradient descent until a predefined stopping criterion is met.
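As an illustration of the objective in Equation 13, the sketch below evaluates the expected local energy of a tiny open Ising chain by full summation (tractable only for very small $N$). It is a sketch of the quantities involved, not the paper's sampling-based implementation, and it omits the gradient step; all names are ours. With zero weights (the uniform distribution) and $J = 0$, the ansatz is the exact ground state and the expectation equals $-Nh$.

```python
import itertools
import numpy as np

def log_unnorm_p(v, W):
    # log of the unnormalised marginal P(v) for a bias-free RBM with ±1 hidden units
    return float(np.sum(np.log(2.0 * np.cosh(v @ W))))

def local_energy(v, W, J, h):
    # E_loc(x) = sum_x' H_{x,x'} sqrt(P(x')/P(x)) for the open 1D chain:
    # diagonal entry -J sum_i s_i s_{i+1}; one entry -h per single spin flip
    e = -J * np.sum(v[:-1] * v[1:])
    logp = log_unnorm_p(v, W)
    for i in range(len(v)):
        v_flip = v.copy()
        v_flip[i] = -v_flip[i]
        e += -h * np.exp(0.5 * (log_unnorm_p(v_flip, W) - logp))
    return e

def energy_expectation(W, J, h, N):
    # full-summation <E_loc> = sum_x P(x) E_loc(x)
    configs = [np.array(c) for c in itertools.product([-1.0, 1.0], repeat=N)]
    logps = np.array([log_unnorm_p(v, W) for v in configs])
    probs = np.exp(logps - logps.max())
    probs /= probs.sum()
    return sum(p * local_energy(v, W, J, h) for p, v in zip(probs, configs))

N = 4
W0 = np.zeros((N, 2 * N))   # zero weights: uniform, deep-paramagnetic ansatz
E = energy_expectation(W0, J=0.0, h=1.0, N=N)
```

In the actual method, the sum over all configurations is replaced by a Gibbs-sampled Monte Carlo estimate, and the weights are updated in the direction that lowers this expectation.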
4 Finding the Quantum Critical Points
4.1 Overview of the approach
The approach that we consider for finding the critical points is as follows. We simulate an initial system at a selected initial value of the parameter, find its ground state and calculate the order parameter corresponding to the critical point that we are looking for. We repeat the operation, increasing and decreasing the parameter with an initial step size. We are looking for an inflection point of the order parameter as a function of the parameter of the system. We recursively reduce the step size until we identify the inflection point. This first algorithm finds the inflection point of a system of a given size.
The algorithm, therefore, receives the following input: the description of the system (its dimension and its size), the initial parameter of the system, the initial step size, the order parameter and the desired precision. The algorithm additionally stores an upper bound on the parameter of the system within which to look for the inflection point, to make sure that the search terminates if it does not find any inflection point. The algorithm terminates when the desired precision is reached or no inflection point is found.
We then repeat, as long as our computing resources reasonably allow, this algorithm for increasing sizes of the system. This is done to find the value of the critical point at the limit of infinite size of the system.
We use restricted Boltzmann machine neural-network quantum states to simulate the system and calculate the order parameters. However, the repeated training of restricted Boltzmann machine neural-network quantum states for systems under different parameters and of increasing sizes is expensive. We devise three optimisations. The first, presented in Subsection 4.2, is the analytical construction of the innate restricted Boltzmann machine neural-network quantum states for a parameter deeply in the quantum phases, to avoid being accidentally trapped in a local minimum. The second, presented in Subsection 4.3, is the use of transfer learning across parameters to avoid successive cold starts. The third, presented in Subsection 4.4, is the use of transfer learning to larger sizes, again to avoid successive cold starts.
4.2 Construction of innate restricted Boltzmann machine neural-network quantum states
From physical understanding, we can infer the form of the probability distribution of the configurations of a system if sufficiently deep in each phase, and construct an innate restricted Boltzmann machine neural-network quantum state that reproduces qualitatively the features of this distribution.
Several works have analytically or algorithmically constructed Boltzmann machine neural-network quantum states, e.g. [14], as effective representations of quantum many-body systems. Here we use a standard restricted Boltzmann machine topology of the network, and instead analytically evaluate its weights.
If $J = 0$, there are no interactions between spins, the system is in a deep paramagnetic phase and all the configurations are equiprobable. Putting all the weights to zero gives such a distribution but forbids optimisation, as all gradients are then identical. Therefore, we sample the weights from a normal distribution with zero mean and a small standard deviation. This construction resembles the common initialisation method of the weights of a restricted Boltzmann machine [34].
If $h = 0$ and $J > 0$, the interactions between particles are dominant and the system is in a deep ferromagnetic phase. The configurations where all spins are up or all spins are down are the most probable. We then construct the weights of the restricted Boltzmann machine neural-network quantum state to ensure that the probability is maximal for these two configurations. This is achieved by setting all of the weights of each visible node to a particular hidden node to the same value, and to zero for the other hidden nodes. Once again, instead of using exactly zero weights, we sample small values of the weights from a normal distribution. A similar procedure can be used for the antiferromagnetic phase, when $h = 0$ and $J < 0$, by setting the weights of the visible nodes to a particular hidden node to the same magnitude but with alternating signs instead.
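The following is a sketch of our reading of this innate construction; the designated hidden node, the weight scale `w` and the noise scale `eps` are our illustrative choices, not the paper's exact values. The resulting unnormalised probability is indeed largest for the configurations favoured by each phase.

```python
import numpy as np

def innate_weights(N, alpha, phase, w=1.0, eps=0.01, seed=0):
    # small Gaussian noise everywhere (the 'para' case is noise only);
    # for 'ferro'/'antiferro', one designated hidden node carries strength ~w
    # with uniform or alternating signs, respectively
    rng = np.random.default_rng(seed)
    W = rng.normal(scale=eps, size=(N, alpha * N))
    if phase == "ferro":
        W[:, 0] += w
    elif phase == "antiferro":
        W[:, 0] += w * (-1.0) ** np.arange(N)
    return W

def unnorm_p(v, W):
    # unnormalised P(v) = prod_j 2 cosh((v W)_j) for ±1 hidden units
    return float(np.prod(2.0 * np.cosh(v @ W)))

N = 6
W = innate_weights(N, alpha=2, phase="ferro")
all_up = np.ones(N)
staggered = (-1.0) ** np.arange(N)
```

Because $\cosh$ is even, the all-up and all-down configurations get exactly the same probability, as required for the two degenerate ferromagnetic ground states.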
As mentioned earlier, in order to avoid being accidentally caught in a local minimum during the initial training of the first restricted Boltzmann machine neural-network quantum state for an arbitrary initial parameter, we choose the initial parameter to be deeply in one of the phases and construct an innate restricted Boltzmann machine neural-network quantum state. We refer to this construction as RBM-NQS-I. Additionally, we refer to restricted Boltzmann machine neural-network quantum states starting from a cold start as RBM-NQS-CS.
4.3 Transfer learning protocol among parameters
Physically, it is expected that the wave functions of systems under different but nearby values of their parameters are neighbours in the Hilbert space, although this may not be true if they are separated by a phase transition. Therefore, we expect the restricted Boltzmann machine neural-network quantum states of two systems to be similar for sufficiently nearby values of the parameters.
Following the terminology in [91], the base network is a trained or innate restricted Boltzmann machine neural-network quantum state for a value of the parameter of the system. The target network is a restricted Boltzmann machine neural-network quantum state for a different value of the parameter with the same number of visible and hidden nodes. We can thus directly transfer the weights from the base network to the target network.
After transferring the weights, we train the target network until it converges to a new ground state. We expect that fewer iterations are needed for the target network to converge than for a cold start initialised with a set of random weights.
We apply this parameter transfer protocol to define an algorithm that looks for the inflection point of a system of a given size. We first construct an innate restricted Boltzmann machine neural-network quantum state using RBM-NQS-I. We then calculate the order parameter value at the ground state and we iterate with this transfer learning protocol with adaptive step sizes until we locate the inflection point. We refer to this algorithm as RBM-NQS-IT.
4.4 Transfer learning protocol to larger sizes
Physically, it is also expected that there is a relationship between the wave functions of systems with the same parameter value but of different sizes, as if they were the same system at different length scales [88]. We have explored such physics-inspired transfer learning protocols in [92] and demonstrated their superiority over a cold start from both the effectiveness and efficiency points of view.
We want to find the critical points in the limit of infinite size. We expect the values of the parameter corresponding to the inflection points of systems of increasing finite sizes to converge asymptotically to this limit.
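A minimal sketch of this extrapolation step on hypothetical data: we assume, purely for illustration, a $1/N$ convergence of the finite-size inflection points (the actual scaling form would have to be chosen from the physics of the model), so the infinite-size estimate is the intercept of a linear fit against $1/N$.

```python
import numpy as np

# hypothetical finite-size inflection points, lambda_c(N) = lambda_inf + a/N
sizes = np.array([8.0, 16.0, 32.0, 64.0, 128.0])
lam_inf_true, a = 1.0, 0.5
lam_c_of_n = lam_inf_true + a / sizes      # stand-in for measured inflection points

# linear fit of lambda_c against 1/N; the intercept estimates the infinite-size limit
slope, intercept = np.polyfit(1.0 / sizes, lam_c_of_n, 1)
```

With real data, the quality of the fit (and the choice of scaling exponent) would itself be a source of uncertainty on the extrapolated critical point.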
In our problem, this means that we need to transfer a restricted Boltzmann machine neural-network quantum state that has been optimised for a system of a certain size to another restricted Boltzmann machine neural-network quantum state of a larger size with identical parameters.
To differentiate between the two transfer learning protocols: the protocol among phases transfers a point within the same Hilbert space, while the protocol among sizes transfers a point across Hilbert spaces.
The base network is a restricted Boltzmann machine neural-network quantum state for a given value of the parameter of the system. The target network is a restricted Boltzmann machine neural-network quantum state for the same value but for a system of larger size. The protocol leverages insights into the physics of the quantum many-body system and model. The details of the protocol are given in [92].
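As a rough illustration only, a tiling-style transfer for a doubled system could copy the trained weights onto each half of the larger lattice and initialise the new cross couplings at zero. This sketch is our assumption of the idea, not the exact protocol of [92]:

```python
import numpy as np

def tile_to_double_size(a, b, W):
    """Tile RBM parameters trained on N visible / M hidden nodes onto a system
    with 2N visible / 2M hidden nodes at the same value of the parameter.
    Each half of the larger system starts from a copy of the small system's
    parameters; couplings between the two halves start at zero."""
    a2 = np.concatenate([a, a])            # visible biases
    b2 = np.concatenate([b, b])            # hidden biases
    W2 = np.zeros((2 * W.shape[0], 2 * W.shape[1]))
    W2[: W.shape[0], : W.shape[1]] = W     # first copy of the small system
    W2[W.shape[0] :, W.shape[1] :] = W     # second copy of the small system
    return a2, b2, W2
```

The zero blocks leave the couplings between the two halves to be learned during the subsequent fine-tuning.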
We use this transfer learning protocol to larger sizes to find the inflection points of a series of systems of increasing size. Instead of starting from the same initial parameter at each system size, we start from the parameter at the inflection point of the next smaller system, using the transfer learning protocol to larger sizes. We then find the inflection point at the larger size. Finally, we extrapolate the value of the critical point in the limit of infinite size. We refer to this algorithm as RBMNQSITT.
We note that our method could fail because we implement the transfer learning at the “hardest” location of the parameter space, namely the inflection point of an order parameter. Several improvements to this strategy, left for future work, could be proposed. For instance, while traversing the parameter space, we could combine the transfer learning protocol to larger sizes with the transfer learning protocol among parameters.
5 Performance Evaluation
The performance evaluation is threefold. We evaluate the performance of the RBMNQSI construction, of RBMNQSIT for finding the inflection point of a system of a given size, and of RBMNQSITT for finding the critical points at the limit of infinite size in Subsections 5.1, 5.2 and 5.3, respectively. We evaluate effectiveness, i.e. the accuracy of the inflection point or critical point, and efficiency, i.e. the processing time. All evaluations are done for systems with open boundary conditions.
The training of the restricted Boltzmann machine neural-network quantum states is done iteratively. In each iteration, we take 10,000 samples to evaluate the local energy and its gradients. At the last iteration, we use these samples to calculate the order parameters. We update the weights using a stochastic gradient descent algorithm with the RMSProp optimiser [33], with the initial learning rate set to 0.001. Based on our empirical experiments, we set considering the efficiency and effectiveness trade-off. For RBMNQSCS, a random weight is sampled from a normal distribution with mean 0.0 and standard deviation 0.01, following the practical guide in [34]. For RBMNQSI, a random weight is sampled from a normal distribution with mean either 0.0 or 1.0 and standard deviation 0.01, as required by the construction. Note that the value of 1.0 was chosen as it results in better performance after testing a range of values between 0.1 and 1.5.
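For reference, one RMSProp step with the stated learning rate of 0.001 can be sketched as below; the decay and epsilon constants are common defaults, not values taken from the paper:

```python
import numpy as np

def rmsprop_update(w, grad, cache, lr=1e-3, decay=0.9, eps=1e-8):
    """One RMSProp step: scale the gradient by a running root mean square
    of its recent history (kept in `cache`)."""
    cache = decay * cache + (1.0 - decay) * grad ** 2
    w = w - lr * grad / (np.sqrt(cache) + eps)
    return w, cache
```

The same update is applied element-wise to the visible biases, hidden biases, and weight matrix of the network.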
The training stops once it reaches the dynamic stopping criterion used in [92], i.e. when the ratio between the standard deviation and the average of the local energy is less than 0.005, or after 30,000 iterations. Since randomisation is involved in the training, each value reported in the paper is an average over 20 realisations of the same calculation.
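The criterion can be sketched as follows; taking the absolute value of the mean is our assumption, since the ground-state energies here are negative:

```python
import numpy as np

MAX_ITERATIONS = 30_000
ENERGY_RATIO_THRESHOLD = 0.005

def should_stop(local_energies, iteration):
    """Dynamic stopping criterion of [92]: stop once the ratio between the
    standard deviation and the average of the sampled local energy drops
    below 0.005, or after 30,000 iterations."""
    e = np.asarray(local_energies)
    ratio = np.std(e) / abs(np.mean(e))
    return ratio < ENERGY_RATIO_THRESHOLD or iteration >= MAX_ITERATIONS
```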
We compare this approach with the traditional methods of exact diagonalization and tensor networks. For exact diagonalization, we use the implicitly restarted Arnoldi method to find the eigenvalues and eigenvectors [47]. Our computational resources only allow us to perform exact diagonalization up to 20 particles. For the tensor network method, we use the matrix product states algorithm [66] with a bond dimension of up to 1000. Both methods run only once since no randomisation is involved.
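SciPy exposes the same ARPACK machinery; the sketch below builds a small open-boundary transverse-field Ising Hamiltonian (our illustrative construction, with a generic convention H = -J Σ σᶻσᶻ - h Σ σˣ) and asks ARPACK for its lowest eigenpair:

```python
import numpy as np
from scipy.sparse import identity, kron, csr_matrix
from scipy.sparse.linalg import eigsh

def tfim_hamiltonian(n, h, j=1.0):
    """Sparse 1D transverse-field Ising Hamiltonian with open boundaries."""
    sx = csr_matrix([[0.0, 1.0], [1.0, 0.0]])
    sz = csr_matrix([[1.0, 0.0], [0.0, -1.0]])

    def site_op(op, i):
        # operator acting as `op` on site i and as the identity elsewhere
        out = identity(1, format="csr")
        for k in range(n):
            out = kron(out, op if k == i else identity(2, format="csr"), format="csr")
        return out

    ham = csr_matrix((2 ** n, 2 ** n))
    for i in range(n - 1):
        ham = ham - j * site_op(sz, i) @ site_op(sz, i + 1)
    for i in range(n):
        ham = ham - h * site_op(sx, i)
    return ham

# ground state via ARPACK (implicitly restarted Arnoldi / Lanczos iteration)
energy, state = eigsh(tfim_hamiltonian(2, h=1.0), k=1, which="SA")
```

For two sites with j = h = 1 the ground-state energy is -√5, which the solver reproduces to machine precision.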
The existing RBMNQS code is implemented in C++ with support for the Message Passing Interface, under a library named NetKet [12]. We ported the code to the TensorFlow library [1] for a significant speed-up on graphics processing units. All experiments run on an NVIDIA DGX-1 server equipped with NVIDIA Tesla V100 graphics processing units with 640 tensor cores, 5,120 CUDA cores and 16 GB of memory.
For the algorithm that finds the inflection point, we choose the initial step size as and we divide the step size by after one iteration. The algorithm stops when the precision reaches . To calculate the gradient, we use second-order accurate central differences.
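The gradient estimate is the standard second-order formula; as a one-line helper:

```python
import math

def central_difference(f, x, dx):
    """Second-order accurate central difference: error is O(dx**2)."""
    return (f(x + dx) - f(x - dx)) / (2.0 * dx)

# e.g. the derivative of sin at 0 is 1, recovered to ~dx**2 accuracy
slope = central_difference(math.sin, 0.0, 1e-3)
```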
5.1 Evaluation of innate restricted Boltzmann machine neural-network quantum states
The performance evaluation of RBMNQSI deep in each phase is twofold. First, we construct RBMNQSI without training and evaluate them. Second, we fine-tune the RBMNQSI until they reach the stopping criterion and evaluate them. We evaluate effectiveness by comparing the values of the energy and the order parameters, and efficiency by comparing the number of iterations needed for training to reach the stopping criterion with that of RBMNQSCS.
We choose , and for the cases of the deep paramagnetic, ferromagnetic and antiferromagnetic phases, respectively. In the ferromagnetic and antiferromagnetic cases, the weights are sampled from a normal distribution with mean either 0.0 or 1.0 and standard deviation 0.01, as prescribed in Subsection 4.2. Table 1, Table 2 and Table 3 show the evaluation of RBMNQSCS and RBMNQSI where the size of the system is and the parameter of the system is , and , respectively.
In the case of a deep paramagnetic phase ( in Table 1), we observe that both the energy and the order parameters of both RBMNQSCS and RBMNQSI without training are very close to the result of the tensor network method. When we train the RBMNQSI, training stops immediately because the stopping criterion is already met. The values of the energy and order parameters are not exactly equal to the tensor network values due to the noise introduced in the weights and in the sampling process.
In the case of a deep ferromagnetic phase ( in Table 2) and a deep antiferromagnetic phase ( in Table 3), we observe that the results of RBMNQSI are closer to the result of the tensor network method and need fewer iterations to reach the stopping criterion than RBMNQSCS. The energy and the order parameters of RBMNQSI without training are, however, quite far from the result of the tensor network method. We hypothesise that this is because or is not deep enough in the ferromagnetic phase. To evaluate this hypothesis, we comparatively evaluate RBMNQSI on a system with 16 particles for , and . We observe that the relative error of the energy with respect to exact diagonalisation of RBMNQSI for , and is , and , respectively. This means that RBMNQSI performs better when it is deeper in the corresponding phase.
For RBMNQSCS, we observe that the values of the order parameters are very far off even though the energy is close to the result of the tensor network method. This means that the training of RBMNQSCS remains in a local minimum and the restricted Boltzmann machine does not converge to the ground state.
Table 6, Table 7 and Table 8 in Appendix A show the evaluation of RBMNQSCS and RBMNQSI for a two-dimensional system where the size of the system is and the parameter of the system is , and , respectively. Table 9, Table 10 and Table 11 in Appendix B show the evaluation of RBMNQSCS and RBMNQSI for a three-dimensional system where the size of the system is and the parameter of the system is , and , respectively. We have chosen such system sizes so as to be able to compare the neural-network quantum state results to exact diagonalization calculations.
We see trends for the two-dimensional and three-dimensional systems similar to those of the one-dimensional system.
In the case of a deep paramagnetic phase ( in Table 6 in Appendix A and Table 9 in Appendix B), the results of RBMNQSCS and RBMNQSI are very close to the result of the exact diagonalization method. Therefore, no further training is needed.
In the case of a deep ferromagnetic phase ( in Table 7 in Appendix A and Table 10 in Appendix B) and a deep antiferromagnetic phase ( in Table 8 in Appendix A and Table 11 in Appendix B), we see that the results of RBMNQSI before training are closer to those of the exact diagonalization calculations than those of RBMNQSCS. We also observe that the constructed RBMNQSI needs fewer iterations to reach the stopping criterion than RBMNQSCS. Therefore, in two-dimensional and three-dimensional systems, we conclude that RBMNQSI is more effective than RBMNQSCS before training. After training they are equally effective, but RBMNQSI is more efficient than RBMNQSCS.
To conclude, we showed that RBMNQSI on one-dimensional, two-dimensional and three-dimensional systems deep in each phase is more effective and efficient than RBMNQSCS. Furthermore, no training at all is needed in the case of a deep paramagnetic phase (i.e. ). Therefore, from this point forward, we choose as our initial parameter in our algorithm for finding the inflection point.
Table 1: One-dimensional system deep in the paramagnetic phase.

Quantity | RBMNQSCS (no training) | RBMNQSI (no training) | RBMNQSCS (trained) | RBMNQSI (trained) | Tensor network
Energy | 127.9799 (0.0029) | 127.9799 (0.0029) | 127.9799 (0.0029) | 127.9799 (0.0029) | 128.00000
Ferromagnetic magnetisation | 0.0079 (0.0001) | 0.0079 (0.0001) | 0.0079 (0.0001) | 0.0079 (0.0001) | 0.00781
Antiferromagnetic magnetisation | 0.0078 (0.0001) | 0.0078 (0.0001) | 0.0078 (0.0001) | 0.0078 (0.0001) | 0.00781
Ferromagnetic correlation | 0.0003 (0.0007) | 0.0003 (0.0007) | 0.0003 (0.0007) | 0.0003 (0.0007) | 0.00000
Antiferromagnetic correlation | 0.0003 (0.0007) | 0.0003 (0.0007) | 0.0003 (0.0007) | 0.0003 (0.0007) | 0.00000
Iterations | - | - | 0 | 0 | -
Table 2: One-dimensional system deep in the ferromagnetic phase.

Quantity | RBMNQSCS (no training) | RBMNQSI (no training) | RBMNQSCS (trained) | RBMNQSI (trained) | Tensor network
Energy | 127.9061 (0.2577) | 217.5726 (0.3152) | 372.2911 (4.7748) | 391.7046 (0.0182) | 391.91198
Ferromagnetic magnetisation | 0.0078 (0.0001) | 0.2934 (0.0009) | 0.0981 (0.1041) | 0.9658 (0.0005) | 0.96980
Antiferromagnetic magnetisation | 0.0078 (0.0001) | 0.0056 (0.0001) | 0.0006 (0.0001) | 0.0003 (0.0000) | 0.00023
Ferromagnetic correlation | 0.0002 (0.0011) | 0.2897 (0.0059) | 0.0360 (0.2928) | 0.9362 (0.0051) | 0.92871
Antiferromagnetic correlation | 0.0001 (0.0010) | 0.0020 (0.0009) | 0.0025 (0.0032) | 0.0072 (0.0002) | 0.00704
Iterations | - | - | 1621.7250 (1271.7007) | 20.8500 (0.4770) | -
Table 3: One-dimensional system deep in the antiferromagnetic phase.

Quantity | RBMNQSCS (no training) | RBMNQSI (no training) | RBMNQSCS (trained) | RBMNQSI (trained) | Tensor network
Energy | 128.0531 (0.2487) | 217.8645 (0.4201) | 371.8311 (5.1623) | 391.7077 (0.0165) | 391.91198
Ferromagnetic magnetisation | 0.0078 (0.0001) | 0.0055 (0.0001) | 0.0006 (0.0001) | 0.0003 (0.0000) | 0.00023
Antiferromagnetic magnetisation | 0.0078 (0.0001) | 0.2939 (0.0010) | 0.1276 (0.1287) | 0.9658 (0.0004) | 0.96980
Ferromagnetic correlation | 0.0004 (0.0009) | 0.0018 (0.0008) | 0.0022 (0.0031) | 0.0072 (0.0002) | 0.00704
Antiferromagnetic correlation | 0.0003 (0.0008) | 0.2874 (0.0050) | 0.0176 (0.3385) | 0.9349 (0.0058) | 0.92871
Iterations | - | - | 1331.6000 (720.1564) | 20.8500 (0.4770) | -
5.2 Finding inflection point for a system of a given size
We evaluate the performance of RBMNQSIT, the algorithm for finding the inflection point of a system of a given size.
The performance evaluation is twofold. We first provide an analysis by plotting the values of the order parameter as a function of the parameter . We then evaluate the inflection point for each system size and compare its value to those of other traditional methods to assess effectiveness.
First, we plot the values of the order parameters as a function of . We use RBMNQSCS and RBMNQSIT to compute the order parameters at the ground state for each point in the space of the parameter of the system. For efficiency, we compare the time needed for the whole computation.
For one-dimensional systems, we calculate the order parameters for within the range with intervals and for systems of size . For two-dimensional systems, we calculate the order parameters for within the range with intervals and for systems of sizes . For three-dimensional systems, we calculate the order parameters for within the range with intervals and for systems of sizes .
Figure 2 shows the values of the order parameters for one-dimensional systems with RBMNQSIT. In the limit of infinite size, there should be an abrupt change of the derivative at the critical point, and the value of the order parameter should change from 0 to an increasing function. We observe that the change in the derivative of the order parameter becomes more abrupt as we increase the size of the system. Similarly, we also observe that, as the system size increases, the value of the order parameter gets closer to zero in one phase and closer to a function of the distance from the critical point in the other phase. Figure 6 in Appendix C shows the result of the tensor network method, which exhibits a trend similar to that observed in Figure 2.
We observe that the weights of RBMNQSIT do not change drastically throughout the space of the parameters of the model. We expect that this is because the order parameters that we study behave smoothly close to the transition, even for the large system sizes we consider. Video animations of the weights of RBMNQSIT for one realisation of a system of size , with the ferromagnetic and the antiferromagnetic magnetisation order parameters and for from to with intervals, are available online.
Figure 3 shows the values of the order parameters for one-dimensional systems with RBMNQSCS. We observe that every order parameter fails to reach the correct value before or after the inflection point for systems of size and . This is possibly due to the network being trapped in a local minimum, and more parameter tuning would be needed. Therefore, RBMNQSIT is more effective than RBMNQSCS.
Figure 7 and Figure 8 in Appendix D show the values of the order parameters for two-dimensional systems with RBMNQSIT and RBMNQSCS, respectively. Figure 9 and Figure 10 in Appendix E show the values of the order parameters for three-dimensional systems with RBMNQSIT and RBMNQSCS, respectively. We see a trend similar to that of the one-dimensional model. In two-dimensional systems, RBMNQSCS remains in a local minimum for size . We note that, in three dimensions, RBMNQSCS performs well even for a system of size . This may be due to the fact that correlations are not as strong in a system with larger connectivity, i.e. where each site is coupled to more sites.
It takes approximately 10 minutes to compute one realisation of a system of 128 particles with RBMNQSIT, where by a realisation we mean the computation for within the range with values spaced by intervals of . Meanwhile, RBMNQSCS takes approximately 5 hours, and the tensor network method that we have implemented takes approximately 60 hours. While this is not a fair comparison, it shows that restricted Boltzmann machine neural-network quantum states leveraging graphics processing units give very good computing times. Furthermore, given the reduced number of iterations required, RBMNQSIT boosts the speed even further.
Next, we evaluate the inflection point for each system size. We evaluate the performance on one-dimensional systems starting from and doubling the size each time until . For two-dimensional systems, we start from and double the size each time until . For three-dimensional systems, we instead start from and increment the size of the system by one until . For three-dimensional systems, we do not use the transfer learning protocol across sizes since we do not double the size of the system.
We comparatively evaluate the effectiveness and efficiency of RBMNQSCS and RBMNQSIT. To evaluate effectiveness, we compare the value of the inflection point at each system size with the tensor network method [66] and exact diagonalization for one-dimensional systems. For two-dimensional and three-dimensional systems, we only compare with exact diagonalization.
Table 4 and Table 5 show the values of the inflection point for different system sizes in one, two and three dimensions with RBMNQSCS, RBMNQSIT and the tensor network method, for the ferromagnetic magnetisation and antiferromagnetic magnetisation order parameters, respectively.
In Table 4 and Table 5, we observe that RBMNQSCS performs the worst overall, since its value of the inflection point is far from both the tensor network and exact diagonalization methods, especially for systems of large size. It is particularly unstable in a one-dimensional system with 64 and 128 particles, as shown by a very large standard deviation.
We observe that, for systems of small size, the tensor network method is closer to the exact diagonalization method than RBMNQSIT. We see that the inflection points of both RBMNQSIT and the tensor network method converge towards , the exact critical point at the infinite size limit [70].
In two-dimensional and three-dimensional systems, the results of both RBMNQSCS and RBMNQSIT are close to the exact diagonalization method. However, RBMNQSIT is closer to the exact diagonalization result than RBMNQSCS by a small margin. We believe that the performance of RBMNQSCS and RBMNQSIT is similar because of the small sizes considered, which were chosen so as to be able to compare to exact diagonalization results.
Table 12 and Table 13 in Appendix F show the values of the inflection point for different system sizes in one, two and three dimensions with RBMNQSCS, RBMNQSIT and the tensor network method, for the ferromagnetic correlation and antiferromagnetic correlation order parameters, respectively. They show trends similar to those observed for the magnetisation order parameters.
To evaluate efficiency, we compare the time needed to detect the inflection point. It takes approximately 5 hours for one realisation to find the inflection point with RBMNQSIT for a system of 128 particles. However, the absolute variance of the inflection point is relatively small, around 0.001; therefore, in practice, one run suffices. Even though RBMNQSCS takes less than approximately 1 hour, it is unstable and gives a wrong value for the inflection point. The tensor network method takes approximately 20 hours to find the inflection point.
Table 4: Inflection point for systems of increasing size with the ferromagnetic magnetisation order parameter. The first five rows are one-dimensional systems of increasing size; the remaining rows are the two- and three-dimensional systems.

RBMNQSCS | RBMNQSIT | Tensor network | Exact diagonalization
1.114 (0.009) | 1.105 (0.006) | 1.11 | 1.109
1.007 (0.008) | 1.040 (0.005) | 1.08 | 1.090
1.011 (0.009) | 1.013 (0.001) | 1.05 | -
1.004 (0.009) | 1 (0.001) | 1.02 | -
0.646 (0.38) | 1 (0.001) | 1.01 | -
0.662 (0.04) | 0.673 (0.05) | - | 0.69
0.5 (0.0) | 0.501 (0.003) | - | 0.51
0.505 (0.012) | 0.502 (0.004) | - | 0.527
Table 5: Inflection point for systems of increasing size with the antiferromagnetic magnetisation order parameter. The first five rows are one-dimensional systems of increasing size, the fifth corresponding to 128 particles; the remaining rows are the two- and three-dimensional systems.

RBMNQSCS | RBMNQSIT | Tensor network | Exact diagonalization
1.072 (0.05) | 1.10 (0.005) | 1.12 | 1.109
1.012 (0.01) | 1.035 (0.006) | 1.08 | 1.090
1.011 (0.01) | 1.010 (0.004) | 1.05 | -
1.108 (0.36) | 1.004 (0.003) | 1.02 | -
0.912 (0.21) | 1.002 (0.002) | 1.01 | -
0.624 (0.08) | 0.656 (0.05) | - | 0.69
0.5 (0.0) | 0.5 (0.0) | - | 0.51
0.502 (0.002) | 0.502 (0.001) | - | 0.527
5.3 Finding quantum critical points at the limit of infinite size
We evaluate the effectiveness of RBMNQSITT for finding the critical points at the limit of infinite size. We use the tiling protocol defined in [92] as the transfer learning protocol to larger sizes, transferring the parameters at the inflection point of a smaller system to a larger one.
The performance evaluation is twofold. We first provide an analysis by plotting the values of the order parameter as a function of the parameter , which has been done in Subsection 5.2. We then provide an evaluation by fitting the values of the inflection point at each system size to show towards which value they converge in the infinite-size limit.
We observe in Figure 2 that with RBMNQSIT, for all order parameters, the inflection point converges towards , the exact critical point at the limit of infinite size [70], as we increase the size of the system.
For two-dimensional and three-dimensional systems, we observe trends similar to those of one-dimensional systems. For two-dimensional systems, we observe in Figures 7 and 8 in Appendix D that the inflection points of RBMNQSIT and RBMNQSCS, respectively, converge towards the value of the critical point at the limit of infinite size obtained with a quantum Monte Carlo method [7]. As in one-dimensional systems, even though RBMNQSCS may remain trapped in a local minimum close to for a system of size , the inflection point is still close to . For three-dimensional systems, we observe in Figures 9 and 10 in Appendix E that the inflection points of RBMNQSIT and RBMNQSCS, respectively, converge towards the value of the critical point at the limit of infinite size obtained with a quantum Monte Carlo method [9].
We evaluate the value of the critical point at the limit of infinite size by extrapolating a series of inflection points at increasing system sizes as a function of the size of the system. We fit a function of the form with nonlinear least squares, where , and are the function parameters. The constraints on the parameters are and for ferromagnetic order parameters, and and for antiferromagnetic order parameters. The value of approximates the value of the critical point at the limit of infinite size. We exclude RBMNQSCS from this evaluation since we have shown in the previous sections that RBMNQSIT is more effective.
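A common finite-size-scaling ansatz of this type is h_c(N) ≈ h_c(∞) + a·N^(−b); the sketch below fits it with SciPy's nonlinear least squares on illustrative data (the exact functional form and the numbers are not taken from the paper):

```python
import numpy as np
from scipy.optimize import curve_fit

def scaling(n, h_inf, a, b):
    # finite-size-scaling ansatz: the inflection point approaches h_inf as a power law
    return h_inf + a * n ** (-b)

sizes = np.array([8.0, 16.0, 32.0, 64.0, 128.0])
inflections = np.array([1.105, 1.040, 1.013, 1.004, 1.001])  # illustrative inflection points

params, _ = curve_fit(scaling, sizes, inflections, p0=[1.0, 1.0, 1.0])
h_c = params[0]  # estimate of the critical point in the infinite-size limit
```

The box constraints on the parameters mentioned above can be imposed through `curve_fit`'s `bounds` argument.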
Figure 4 (a) shows the evaluation of the critical point at the limit of infinite size, obtained by fitting the inflection points as a function of the size of the system in the one-dimensional model with the ferromagnetic magnetisation order parameter. We compare the result of RBMNQSITT with the tensor network method. The value of is and for RBMNQSITT and the tensor network method, respectively.
Figure 4 (b) shows the same evaluation for systems in two dimensions. The value of for RBMNQSITT is , which is close to the value based on the quantum Monte Carlo method [7]. Figure 4 (c) shows the same evaluation for systems in three dimensions. The value of for RBMNQSITT is , which is sizeably different from the value based on the quantum Monte Carlo method [9]. We expect that systems of larger sizes are needed in three dimensions to better estimate the critical point.
Figure 5 shows the same evaluation using the antiferromagnetic magnetisation order parameter. We observe trends similar to those of the ferromagnetic case, except that the value at the limit for the two-dimensional system is further from the quantum Monte Carlo limit than in the ferromagnetic case. The value of is for the antiferromagnetic phase, where the critical point at the limit of infinite size is at .
Figures 11 and 12 in Appendix G show the same evaluation using the correlation order parameters. We observe trends similar to those of the magnetisation order parameters. However, using the ferromagnetic correlation order parameter on one-dimensional systems, we see that the tensor network method captures the correlations better, as its extrapolated critical point is closer to the exact limit than that of RBMNQSITT.
6 Conclusion
We have proposed an approach to finding quantum critical points with innate restricted Boltzmann machine neural-network quantum states and transfer learning protocols. We applied the proposed approach to one-, two- and three-dimensional Ising models, in the limit of infinite size.
We have empirically and comparatively shown that our proposed approach is more effective and efficient than cold start approaches, which start from a network with randomly initialised parameters. It is also more efficient than traditional approaches. Furthermore, we have shown that we can estimate the value of the quantum critical point at the infinite size limit with the transfer learning protocol to larger sizes proposed in [92].
A natural extension of this work is the study of quantum critical exponents, which describe the behaviour of the order parameters close to the phase transitions. We would also like to further explore opportunities to analytically and algebraically construct neural-network quantum states. Such approaches may be used to devise solutions to other problems, such as the characterisation of properties of different quantum many-body systems, the study of their time evolution, as well as the study of quantum few-body systems.
Mathematically, our proposed transfer learning protocols operate inside and across the Hilbert space of the wave function. We would therefore also like to pursue the reverse application, in which machine learning algorithms, especially energy-based generative models, are explained in terms of Hilbert spaces and physical systems.
We acknowledge C. Guo and Supremacy Future Technologies for support on the matrix product states simulations. This work is partially funded by the National University of Singapore, the French Ministry of European and Foreign Affairs and the French Ministry of Higher Education, Research and Innovation under the Merlion programme, project “Deep Quantum”. We acknowledge support from the Singapore Ministry of Education, Singapore Academic Research Fund Tier II (project MOE2018-T2-2-142). The experiments reported in this article were performed on the infrastructure of the Singapore National Supercomputing Centre.
Appendix A Evaluation of innate restricted Boltzmann machine neural-network quantum states for two-dimensional systems
Table 6, Table 7 and Table 8 show the evaluation of RBMNQSCS and RBMNQSI for a two-dimensional system where the size of the system is and the parameter of the system is , and , respectively.
Table 6: Two-dimensional system deep in the paramagnetic phase.

Quantity | RBMNQSCS (no training) | RBMNQSI (no training) | RBMNQSCS (trained) | RBMNQSI (trained) | Exact diagonalization
Energy | 16.0000 (0.0001) | 16.0000 (0.0001) | 16.0000 (0.0001) | 16.0000 (0.0001) | 16.0000
Ferromagnetic magnetisation | 0.0624 (0.0007) | 0.0624 (0.0007) | 0.0624 (0.0007) | 0.0624 (0.0007) | 0.0625
Antiferromagnetic magnetisation | 0.0624 (0.0009) | 0.0624 (0.0009) | 0.0624 (0.0009) | 0.0624 (0.0009) | 0.0625
Ferromagnetic correlation | 0.2503 (0.0041) | 0.2503 (0.0041) | 0.2503 (0.0041) | 0.2503 (0.0041) | 0.2500
Antiferromagnetic correlation | 0.2490 (0.0037) | 0.2490 (0.0037) | 0.2490 (0.0037) | 0.2490 (0.0037) | 0.2500
Iterations | - | - | 0 | 0 | -
Table 7: Two-dimensional system deep in the ferromagnetic phase.

Quantity | RBMNQSCS (no training) | RBMNQSI (no training) | RBMNQSCS (trained) | RBMNQSI (trained) | Exact diagonalization
Energy | 15.9808 (0.1652) | 52.1501 (0.1441) | 72.9401 (0.0035) | 72.9397 (0.0039) | 72.9455
Ferromagnetic magnetisation | 0.0625 (0.0008) | 0.6064 (0.0023) | 0.9856 (0.0005) | 0.9861 (0.0005) | 0.9860
Antiferromagnetic magnetisation | 0.0624 (0.0007) | 0.0263 (0.0003) | 0.0010 (0.0000) | 0.0009 (0.0000) | 0.0009
Ferromagnetic correlation | 0.2483 (0.0042) | 0.6863 (0.0056) | 0.9924 (0.0009) | 0.9924 (0.0009) | 0.9934
Antiferromagnetic correlation | 0.2520 (0.0045) | 0.1042 (0.0036) | 0.0022 (0.0004) | 0.0024 (0.0006) | 0.0017
Iterations | - | - | 180.4000 (4.8311) | 126.0500 (1.7741) | -
Table 8: Two-dimensional system deep in the antiferromagnetic phase.

Quantity | RBMNQSCS (no training) | RBMNQSI (no training) | RBMNQSCS (trained) | RBMNQSI (trained) | Exact diagonalization
Energy | 16.0528 (0.1601) | 52.1049 (0.1821) | 72.9396 (0.0032) | 72.9403 (0.0043) | 72.9455
Ferromagnetic magnetisation | 0.0625 (0.0007) | 0.0263 (0.0004) | 0.0010 (0.0000) | 0.0009 (0.0000) | 0.0009
Antiferromagnetic magnetisation | 0.0627 (0.0009) | 0.6059 (0.0026) | 0.9857 (0.0005) | 0.9861 (0.0006) | 0.9860
Ferromagnetic correlation | 0.2510 (0.0031) | 0.1052 (0.0026) | 0.0022 (0.0004) | 0.0022 (0.0004) | 0.0017
Antiferromagnetic correlation | 0.2492 (0.0050) | 0.6852 (0.0031) | 0.9924 (0.0008) | 0.9923 (0.0009) | 0.9934
Iterations | - | - | 179.5500 (3.6807) | 125.1500 (1.3143) | -
Appendix B Evaluation of innate restricted Boltzmann machine neural-network quantum states for three-dimensional systems
Table 9, Table 10 and Table 11 show the evaluation of RBMNQSCS and RBMNQSI for a three-dimensional system where the size of the system is and the parameter of the system is , and , respectively.
Table 9: Three-dimensional system deep in the paramagnetic phase.

Quantity | RBMNQSCS (no training) | RBMNQSI (no training) | RBMNQSCS (trained) | RBMNQSI (trained) | Exact diagonalization
Energy | 8.0000 (0.0000) | 8.0000 (0.0000) | 8.0000 (0.0000) | 8.0000 (0.0000) | 8
Ferromagnetic magnetisation | 0.1244 (0.0022) | 0.1244 (0.0022) | 0.1244 (0.0022) | 0.1244 (0.0022) | 0.125
Antiferromagnetic magnetisation | 0.1250 (0.0028) | 0.1250 (0.0028) | 0.1250 (0.0028) | 0.1250 (0.0028) | 0.125
Ferromagnetic correlation | 0.4999 (0.0059) | 0.4999 (0.0059) | 0.4999 (0.0059) | 0.4999 (0.0059) | 0.5
Antiferromagnetic correlation | 0.5000 (0.0059) | 0.5000 (0.0059) | 0.5000 (0.0059) | 0.5000 (0.0059) | 0.5
Iterations | - | - | 0.0000 (0.0000) | 0.0000 (0.0000) | -
Table 10: Three-dimensional system deep in the ferromagnetic phase.

Quantity | RBMNQSCS (no training) | RBMNQSI (no training) | RBMNQSCS (trained) | RBMNQSI (trained) | Exact diagonalization
Energy | 7.9931 (0.1337) | 32.8919 (0.0726) | 36.4436 (0.0022) | 36.4445 (0.0015) | 36.4451
Ferromagnetic magnetisation | 0.1254 (0.0022) | 0.8416 (0.0026) | 0.9887 (0.0007) | 0.9884 (0.0004) | 0.9891
Antiferromagnetic magnetisation | 0.1252 (0.0018) | 0.0225 (0.0004) | 0.0016 (0.0001) | 0.0017 (0.0001) | 0.0015
Ferromagnetic correlation | 0.5023 (0.0056) | 0.9094 (0.0026) | 0.9936 (0.0014) | 0.9938 (0.0008) | 0.9938
Antiferromagnetic correlation | 0.4977 (0.0056) | 0.0906 (0.0026) | 0.0064 (0.0014) | 0.0062 (0.0008) | 0.0062
Iterations | - | - | 274.6000 (7.4726) | 171.8000 (2.2716) | -
Table 11: Three-dimensional system deep in the antiferromagnetic phase.

Quantity | RBMNQSCS (no training) | RBMNQSI (no training) | RBMNQSCS (trained) | RBMNQSI (trained) | Exact diagonalization
Energy | 8.0406 (0.0767) | 32.8799 (0.0948) | 36.4422 (0.0032) | 36.4431 (0.0019) | 36.4451
Ferromagnetic magnetisation | 0.1243 (0.0013) | 0.0227 (0.0005) | 0.0016 (0.0001) | 0.0017 (0.0001) | 0.0015
Antiferromagnetic magnetisation | 0.1250 (0.0013) | 0.8412 (0.0028) | 0.9887 (0.0005) | 0.9880 (0.0010) | 0.9891
Ferromagnetic correlation | 0.4996 (0.0031) | 0.0908 (0.0040) | 0.0061 (0.0008) | 0.0075 (0.0007) | 0.0062
Antiferromagnetic correlation | 0.5004 (0.0031) | 0.9092 (0.0040) | 0.9939 (0.0008) | 0.9925 (0.0007) | 0.9938
Iterations | - | - | 273.5000 (5.3898) | 171.5000 (1.9621) | -
Appendix C Analysis of the order parameter for a system of a given size with the tensor network method
Figure 6 shows the values of the order parameters for a one-dimensional system with the tensor network method. We calculate the order parameters for within the ranges and , for the antiferromagnetic and ferromagnetic order parameters respectively, with intervals and for systems of sizes .
Appendix D Analysis of the order parameter for a system of a given size for two-dimensional systems
Figure 7 and Figure 8 show the values of the order parameters for a two-dimensional system with RBMNQSIT and RBMNQSCS, respectively. We calculate the order parameters for within the ranges and , for the antiferromagnetic and ferromagnetic order parameters respectively, with intervals and for systems of sizes .
Appendix E Analysis of the order parameter for a system of a given size for three-dimensional systems
Figure 9 and Figure 10 show the values of the order parameters for a three-dimensional system with RBMNQSIT and RBMNQSCS, respectively. We calculate the order parameters for within the ranges and , for the antiferromagnetic and ferromagnetic order parameters respectively, with intervals and for systems of sizes .
Appendix F Effectiveness of finding the inflection point for a system of a given size for correlation order parameters
Appendix G Effectiveness of finding the inflection point at the limit of infinite size for correlation order parameters
Figure 11 (a), (b) and (c) show the evaluation of the critical point at the limit of infinite size, obtained by fitting the inflection points as a function of the size of the system in the one-dimensional, two-dimensional and three-dimensional models, respectively, with the ferromagnetic correlation order parameter. Figure 12 shows the same evaluation with the antiferromagnetic correlation order parameter.
Table 12: Inflection point for systems of increasing size with the ferromagnetic correlation order parameter. The first five rows are one-dimensional systems of increasing size; the remaining rows are the two- and three-dimensional systems.

RBMNQSCS | RBMNQSIT | Tensor network | Exact diagonalization
1.107 (0.07) | 1.135 (0.005) | 1.16 | 1.156
1.084 (0.04) | 1.106 (0.007) | 1.12 | 1.116
1.007 (0.05) | 1.040 (0.012) | 1.07 | -
0.524 (0.32) | 1.002 (0.009) | 1.03 | -
0.632 (0.43) | 1.001 (0.005) | 1.01 | -
0.611 (0.102) | 0.673 (0.05) | - | 0.7
0.428 (0.041) | 0.501 (0.003) | - | 0.5
0.501 (0.002) | 0.502 (0.001) | - | 0.527
Table 13: Inflection point for systems of increasing size with the antiferromagnetic correlation order parameter. The first five rows are one-dimensional systems of increasing size, the fifth corresponding to 128 particles; the remaining rows are the two- and three-dimensional systems.

RBMNQSCS | RBMNQSIT | Tensor network | Exact diagonalization
1.074 (0.09) | 1.117 (0.007) | 1.16 | 1.109
1.103 (0.03) | 1.054 (0.006) | 1.12 | 1.090
1.016 (0.007) | 1.009 (0.009) | 1.07 | -
0.910 (0.23) | 1.012 (0.009) | 1.03 | -
0.413 (0.27) | 1.002 (0.002) | 1.02 | -
0.617 (0.07) | 0.655 (0.05) | - | 0.7
0.424 (0.03) | 0.453 (0.05) | - | 0.5
0.501 (0.003) | 0.502 (0.002) | - | 0.527
Footnotes
 Video for ferromagnetic magnetisation: https://youtu.be/OSKBC8Fm2r4, video for antiferromagnetic magnetisation: https://youtu.be/kTEzdVfVNMA.
 https://nscc.sg
References
 Martín Abadi, Paul Barham, Jianmin Chen, Zhifeng Chen, Andy Davis, Jeffrey Dean, Matthieu Devin, Sanjay Ghemawat, Geoffrey Irving, Michael Isard, et al., ‘TensorFlow: a system for large-scale machine learning’, in OSDI, volume 16, pp. 265–283, (2016).
 Ian Affleck, Tom Kennedy, Elliott H Lieb, and Hal Tasaki, ‘Rigorous results on valence-bond ground states in antiferromagnets’, in Condensed Matter Physics and Exactly Soluble Models, 249–252, Springer, (2004).
 Haitham Bou Ammar, Decebal Constantin Mocanu, Matthew E Taylor, Kurt Driessens, Karl Tuyls, and Gerhard Weiss, ‘Automatically mapped transfer between reinforcement learning tasks via three-way restricted Boltzmann machines’, in Joint European Conference on Machine Learning and Knowledge Discovery in Databases, pp. 449–464. Springer, (2013).
 Moritz August and Xiaotong Ni, ‘Using recurrent neural networks to optimize dynamical decoupling for quantum memory’, Physical Review A, 95(1), 012335, (2017).
 Jonathan Baxter, ‘Theoretical models of learning to learn’, in Learning to learn, 71–94, Springer, (1998).
 Jacob Biamonte, Peter Wittek, Nicola Pancotti, Patrick Rebentrost, Nathan Wiebe, and Seth Lloyd, ‘Quantum machine learning’, Nature, 549(7671), 195–202, (2017).
 Henk WJ Blöte and Youjin Deng, ‘Cluster Monte Carlo simulation of the transverse Ising model’, Physical Review E, 66(6), 066110, (2002).
 Nicolas Boulanger-Lewandowski, Yoshua Bengio, and Pascal Vincent, ‘Modeling temporal dependencies in high-dimensional sequences: application to polyphonic music generation and transcription’, in Proceedings of the 29th International Conference on Machine Learning, pp. 1881–1888. Omnipress, (2012).
 Briiissuurs Braiorr-Orrs, Michael Weyrauch, and Mykhailo V. Rakov, ‘Phase diagrams of one-, two-, and three-dimensional quantum spin systems’, Quantum Information & Computation, 16(9&10), 885–899, (2016).
 Sergey Bravyi and Matthew Hastings, ‘On complexity of the quantum Ising model’, Communications in Mathematical Physics, 349(1), 1–45, (2017).
 Peter Broecker, Juan Carrasquilla, Roger G Melko, and Simon Trebst, ‘Machine learning quantum phases of matter beyond the fermion sign problem’, Scientific reports, 7(1), 8823, (2017).
 Giuseppe Carleo, Kenny Choo, Damian Hofmann, James E. T. Smith, Tom Westerhout, Fabien Alet, Emily J. Davis, Stavros Efthymiou, Ivan Glasser, Sheng-Hsuan Lin, Marta Mauri, Guglielmo Mazzola, Christian B. Mendl, Evert van Nieuwenburg, Ossian O’Reilly, Hugo Théveniaut, Giacomo Torlai, Filippo Vicentini, and Alexander Wietek, ‘NetKet: a machine learning toolkit for many-body quantum systems’, SoftwareX, 100311, (2019).
 Giuseppe Carleo, Ignacio Cirac, Kyle Cranmer, Laurent Daudet, Maria Schuld, Leslie Vogt-Maranto, and Lenka Zdeborová, ‘Machine learning and the physical sciences’, arXiv preprint arXiv:1903.10563, (2019).
 Giuseppe Carleo, Yusuke Nomura, and Masatoshi Imada, ‘Constructing exact representations of quantum many-body systems with deep neural networks’, arXiv preprint arXiv:1802.09558, (2018).
 Giuseppe Carleo and Matthias Troyer, ‘Solving the quantum many-body problem with artificial neural networks’, Science, 355(6325), 602–606, (2017).
 Juan Carrasquilla and Roger G Melko, ‘Machine learning phases of matter’, Nature Physics, 13(5), 431, (2017).
 Chunlin Chen, Daoyi Dong, Han-Xiong Li, Jian Chu, and Tzyh-Jong Tarn, ‘Fidelity-based probabilistic Q-learning for control of quantum systems’, IEEE Transactions on Neural Networks and Learning Systems, 25(5), 920–933, (2014).
 Jing Chen, Song Cheng, Haidong Xie, Lei Wang, and Tao Xiang, ‘Equivalence of restricted Boltzmann machines and tensor network states’, Physical Review B, 97(8), 085104, (2018).
 Kenny Choo, Giuseppe Carleo, Nicolas Regnault, and Titus Neupert, ‘Symmetries and many-body excitations with neural-network quantum states’, Physical Review Letters, 121(16), 167204, (2018).
 Kenny Choo, Antonio Mezzacapo, and Giuseppe Carleo, ‘Fermionic neural-network states for ab-initio electronic structure’, arXiv preprint arXiv:1909.12852, (2019).
 Kenny Choo, Titus Neupert, and Giuseppe Carleo, ‘Two-dimensional frustrated J1–J2 model studied with neural network quantum states’, Physical Review B, 100(12), 125124, (2019).
 Adam Coates, Andrew Ng, and Honglak Lee, ‘An analysis of single-layer networks in unsupervised feature learning’, in Proceedings of the fourteenth international conference on artificial intelligence and statistics, pp. 215–223, (2011).
 Mario Collura, Luca Dell’Anna, Timo Felser, and Simone Montangero, ‘On the descriptive power of neural networks as constrained tensor networks with exponentially large bond dimension’, arXiv preprint arXiv:1905.11351, (2019).
 George Dahl, Marc’Aurelio Ranzato, Abdel-rahman Mohamed, and Geoffrey E Hinton, ‘Phone recognition with the mean-covariance restricted Boltzmann machine’, in Advances in neural information processing systems, pp. 469–477, (2010).
 Sankar Das Sarma, Dong-Ling Deng, and Lu-Ming Duan, ‘Machine learning meets quantum physics’, Physics Today, 72(3), 48, (2019).
 Dong-Ling Deng, Xiaopeng Li, and S Das Sarma, ‘Quantum entanglement in neural network states’, Physical Review X, 7(2), 021021, (2017).
 RJ Elliott, ‘Phenomenological discussion of magnetic ordering in the heavy rare-earth metals’, Physical Review, 124(2), 346, (1961).
 Ugo Fiore, Francesco Palmieri, Aniello Castiglione, and Alfredo De Santis, ‘Network anomaly detection with the restricted Boltzmann machine’, Neurocomputing, 122, 13–23, (2013).
 Xun Gao and Lu-Ming Duan, ‘Efficient representation of quantum many-body states with deep neural networks’, Nature Communications, 8(1), 662, (2017).
 HA Gersch and GC Knollman, ‘Quantum cell model for bosons’, Physical Review, 129(2), 959, (1963).
 Ivan Glasser, Nicola Pancotti, Moritz August, Ivan D Rodriguez, and J Ignacio Cirac, ‘Neural-network quantum states, string-bond states, and chiral topological states’, Physical Review X, 8(1), 011006, (2018).
 James Gubernatis, Naoki Kawashima, and Philipp Werner, Quantum Monte Carlo Methods: Algorithms for Lattice Models, Cambridge University Press, 2016.
 Geoffrey Hinton, Nitish Srivastava, and Kevin Swersky, ‘Neural networks for machine learning, lecture 6a: overview of mini-batch gradient descent’, (2012).
 Geoffrey E Hinton, ‘A practical guide to training restricted Boltzmann machines’, in Neural networks: Tricks of the trade, 599–619, Springer, (2012).
 Geoffrey E Hinton and Ruslan R Salakhutdinov, ‘Reducing the dimensionality of data with neural networks’, Science, 313(5786), 504–507, (2006).
 Geoffrey E Hinton and Ruslan R Salakhutdinov, ‘Replicated softmax: an undirected topic model’, in Advances in neural information processing systems, pp. 1607–1614, (2009).
 Wenjian Hu, Rajiv RP Singh, and Richard T Scalettar, ‘Discovering phases, phase transitions, and crossovers through unsupervised machine learning: A critical examination’, Physical Review E, 95(6), 062122, (2017).
 Navdeep Jaitly and Geoffrey Hinton, ‘Learning a better representation of speech soundwaves using restricted Boltzmann machines’, in 2011 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 5884–5887. IEEE, (2011).
 Marcin Jarzyna and Rafał Demkowicz-Dobrzański, ‘True precision limits in quantum metrology’, New Journal of Physics, 17(1), 013010, (2015).
 Zhih-Ahn Jia, Yuan-Hang Zhang, Yu-Chun Wu, Liang Kong, Guang-Can Guo, and Guo-Ping Guo, ‘Efficient machine-learning representations of a surface code with boundaries, defects, domain walls, and twists’, Physical Review A, 99(1), 012307, (2019).
 Bjarni Jónsson, Bela Bauer, and Giuseppe Carleo, ‘Neural-network states for the classical simulation of quantum computing’, arXiv preprint arXiv:1808.05232, (2018).
 Jyri Kivinen and Christopher Williams, ‘Multiple texture Boltzmann machines’, in Artificial Intelligence and Statistics, pp. 638–646, (2012).
 Cornelius Lanczos, An iteration method for the solution of the eigenvalue problem of linear differential and integral operators, United States Governm. Press Office Los Angeles, CA, 1950.
 Hugo Larochelle and Yoshua Bengio, ‘Classification using discriminative restricted Boltzmann machines’, in Proceedings of the 25th international conference on Machine learning, pp. 536–543. ACM, (2008).
 Nicolas Le Roux, Nicolas Heess, Jamie Shotton, and John Winn, ‘Learning a generative model of images by factoring appearance and shape’, Neural Computation, 23(3), 593–650, (2011).
 Yann LeCun, Sumit Chopra, Raia Hadsell, Marc’Aurelio Ranzato, and Fu Jie Huang, ‘A tutorial on energy-based learning’, in Predicting structured data, MIT Press, (2006).
 Richard B Lehoucq, Danny C Sorensen, and Chao Yang, ARPACK users’ guide: solution of large-scale eigenvalue problems with implicitly restarted Arnoldi methods, volume 6, SIAM, 1998.
 Linxia Liao, Wenjing Jin, and Radu Pavel, ‘Enhanced restricted Boltzmann machine with prognosability regularization for prognostics and health assessment’, IEEE Transactions on Industrial Electronics, 63(11), 7076–7083, (2016).
 Na Lu, Tengfei Li, Xiaodong Ren, and Hongyu Miao, ‘A deep learning scheme for motor imagery classification based on restricted Boltzmann machines’, IEEE Transactions on Neural Systems and Rehabilitation Engineering, 25(6), 566–576, (2016).
 Sirui Lu, Xun Gao, and LM Duan, ‘Efficient representation of topologically ordered states with restricted Boltzmann machines’, Physical Review B, 99(15), 155136, (2019).
 Gale Martin, The effects of old learning on new in hopfield and backpropagation nets, Microelectronics and Computer Technology Corporation, 1988.
 Kristopher McBrian, Giuseppe Carleo, and Ehsan Khatami, ‘Ground state phase diagram of the one-dimensional Bose-Hubbard model from restricted Boltzmann machines’, arXiv preprint arXiv:1903.03076, (2019).
 Roger G Melko, Giuseppe Carleo, Juan Carrasquilla, and J Ignacio Cirac, ‘Restricted Boltzmann machines in quantum physics’, Nature Physics, 15(9), 887–892, (2019).
 ME Midhun, Sarath R Nair, VT Prabhakar, and S Sachin Kumar, ‘Deep model for classification of hyperspectral image using restricted Boltzmann machine’, in Proceedings of the 2014 international conference on interdisciplinary advances in applied computing, p. 35. ACM, (2014).
 Tu Dinh Nguyen, Dinh Q Phung, Viet Huynh, and Trung Le, ‘Supervised restricted Boltzmann machines’, in UAI, (2017).
 Yusuke Nomura, Andrew S Darmawan, Youhei Yamaji, and Masatoshi Imada, ‘Restricted Boltzmann machine learning for solving strongly correlated quantum systems’, Physical Review B, 96(20), 205152, (2017).
 Román Orús, ‘A practical introduction to tensor networks: Matrix product states and projected entangled pair states’, Ann. Phys., 349, 117–158, (2014).
 Sinno Jialin Pan, Qiang Yang, et al., ‘A survey on transfer learning’, IEEE Transactions on knowledge and data engineering, 22(10), 1345–1359, (2010).
 Lorien Y Pratt, ‘Discriminability-based transfer between neural networks’, in Advances in neural information processing systems, pp. 204–211, (1993).
 V Privman, Finite Size Scaling and Numerical Simulation of Statistical Systems, World Scientific, 1990.
 Benno S Rem, Niklas Käming, Matthias Tarnowski, Luca Asteria, Nick Fläschner, Christoph Becker, Klaus Sengstock, and Christof Weitenberg, ‘Identifying quantum phase transitions using artificial neural networks on experimental data’, Nature Physics, 15(9), 917–920, (2019).
 Subir Sachdev, ‘Quantum phase transitions’, Handbook of Magnetism and Advanced Magnetic Materials, (2007).
 Ruslan Salakhutdinov, Andriy Mnih, and Geoffrey Hinton, ‘Restricted Boltzmann machines for collaborative filtering’, in Proceedings of the 24th international conference on Machine learning, pp. 791–798. ACM, (2007).
 Anders W Sandvik, ‘Finite-size scaling of the ground-state parameters of the two-dimensional Heisenberg model’, Physical Review B, 56(18), 11678, (1997).
 Tanya Schmah, Geoffrey E Hinton, Steven L Small, Stephen Strother, and Richard S Zemel, ‘Generative versus discriminative training of RBMs for classification of fMRI images’, in Advances in neural information processing systems, pp. 1409–1416, (2009).
 Ulrich Schollwöck, ‘The density-matrix renormalization group in the age of matrix product states’, Ann. Phys., 326(1), 96–192, (2011).
 Paul Smolensky, ‘Information processing in dynamical systems: Foundations of harmony theory’, Technical report, Colorado Univ at Boulder Dept of Computer Science, (1986).
 Shivaji Lal Sondhi, SM Girvin, JP Carini, and D Shahar, ‘Continuous quantum phase transitions’, Reviews of modern physics, 69(1), 315, (1997).
 Ilya Sutskever, Geoffrey E Hinton, and Graham W Taylor, ‘The recurrent temporal restricted Boltzmann machine’, in Advances in neural information processing systems, pp. 1601–1608, (2009).
 Sei Suzuki, Jun-ichi Inoue, and Bikas K Chakrabarti, Quantum Ising phases and transitions in transverse Ising models, volume 862, Springer, 2012.
 Yichuan Tang, Ruslan Salakhutdinov, and Geoffrey Hinton, ‘Robust Boltzmann machines for recognition and denoising’, in 2012 IEEE Conference on Computer Vision and Pattern Recognition, pp. 2264–2271. IEEE, (2012).
 Graham W Taylor, Geoffrey E Hinton, and Sam T Roweis, ‘Modeling human motion using binary latent variables’, in Advances in neural information processing systems, pp. 1345–1352, (2007).
 Yee Whye Teh and Geoffrey E Hinton, ‘Rate-coded restricted Boltzmann machines for face recognition’, in Advances in neural information processing systems, pp. 908–914, (2001).
 David J Thouless, The quantum mechanics of many-body systems, Courier Corporation, 2014.
 Sebastian Thrun and Lorien Pratt, ‘Learning to learn: Introduction and overview’, in Learning to learn, 3–17, Springer, (1998).
 Jakub M Tomczak and Maciej Zieba, ‘Classification restricted Boltzmann machine for comprehensible credit scoring model’, Expert Systems with Applications, 42(4), 1789–1796, (2015).
 Giacomo Torlai, Guglielmo Mazzola, Juan Carrasquilla, Matthias Troyer, Roger Melko, and Giuseppe Carleo, ‘Neural-network quantum state tomography’, Nature Physics, 14(5), 447, (2018).
 Evert PL Van Nieuwenburg, Ye-Hua Liu, and Sebastian D Huber, ‘Learning phase transitions by confusion’, Nature Physics, 13(5), 435, (2017).
 Matthias Vojta, ‘Quantum phase transitions’, Reports on Progress in Physics, 66(12), 2069, (2003).
 Lei Wang, ‘Discovering phase transitions with unsupervised learning’, Physical Review B, 94(19), 195105, (2016).
 Bin Wei and Christopher Pal, ‘Heterogeneous transfer learning with RBMs’, in Twenty-fifth AAAI conference on artificial intelligence, (2011).
 Karl Weiss, Taghi M Khoshgoftaar, and DingDing Wang, ‘A survey of transfer learning’, Journal of Big Data, 3(1), 9, (2016).
 Alexander Weiße and Holger Fehske, ‘Exact diagonalization techniques’, in Computational many-particle physics, 529–544, Springer, (2008).
 Tom Westerhout, Nikita Astrakhantsev, Konstantin S. Tikhonov, Mikhail Katsnelson, and Andrey A. Bagrov, ‘Neural quantum states of frustrated magnets: generalization and sign structure’, arXiv preprint arXiv:1907.08186, (2019).
 Sebastian J Wetzel, ‘Unsupervised learning of phase transitions: From principal component analysis to variational autoencoders’, Physical Review E, 96(2), 022140, (2017).
 Robert M White, Robert M White, and Bradford Bayne, Quantum theory of magnetism, volume 1, Springer, 1983.
 Steven R White, ‘Density matrix formulation for quantum renormalization groups’, Phys. Rev. Lett., 69(19), 2863, (1992).
 Kenneth G Wilson, ‘Problems in physics with many scales of length’, Scientific American, 241(2), 158–179, (1979).
 Zhizheng Wu, Eng Siong Chng, and Haizhou Li, ‘Conditional restricted Boltzmann machine for voice conversion’, in 2013 IEEE China Summit and International Conference on Signal and Information Processing, pp. 104–108. IEEE, (2013).
 Pengtao Xie, Yuntian Deng, and Eric Xing, ‘Diversifying restricted Boltzmann machine for document modeling’, in Proceedings of the 21st ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pp. 1315–1324. ACM, (2015).
 Jason Yosinski, Jeff Clune, Yoshua Bengio, and Hod Lipson, ‘How transferable are features in deep neural networks?’, in Advances in neural information processing systems, pp. 3320–3328, (2014).
 Remmy Zen, Long My, Ryan Tan, Frederic Hebert, Mario Gattobigio, Christian Miniatura, Dario Poletti, and Stephane Bressan, ‘Transfer learning for scalability of neural-network quantum states’, arXiv preprint arXiv:1908.09883, (2019).
 Jian Zhang, ‘Deep transfer learning via restricted Boltzmann machine for document classification’, in 2011 10th International Conference on Machine Learning and Applications and Workshops, volume 1, pp. 323–326. IEEE, (2011).