NetKet: A Machine Learning Toolkit for Many-Body Quantum Systems
Abstract
We introduce NetKet, a comprehensive open source framework for the study of many-body quantum systems using machine learning techniques. The framework is built around a general and flexible implementation of neural-network quantum states, which are used as a variational ansatz for quantum wavefunctions. NetKet provides algorithms for several key tasks in quantum many-body physics and quantum technology, namely quantum state tomography, supervised learning from wavefunction data, and ground state searches for a wide range of customizable lattice models. Our aim is to provide a common platform for open research and to stimulate the collaborative development of computational methods at the interface of machine learning and many-body physics.
I Motivation and significance
Recent years have seen a tremendous activity around the development of physics-oriented numerical techniques based on machine learning (ML) tools carleo_machine_2019. In the context of many-body quantum physics, one of the main goals of these approaches is to tackle complex quantum problems using compact representations of many-body states based on artificial neural networks. These representations, dubbed neural-network quantum states (NQS) carleo_solving_2017, can be used for several applications. In the supervised learning setting, they can be used, e.g., to learn existing quantum states for which a non-NQS representation is available cai_approximating_2018. In the unsupervised setting, they can be used to reconstruct complex quantum states from experimental measurements, a task known as quantum state tomography Torlai2018. Finally, in the context of purely variational applications, NQS can be used to find approximate ground- and excited-state solutions of the Schrödinger equation carleo_solving_2017; Choo2018; Glasser2018; Kaubruegger2018; Saito2017; Saito2018, as well as to describe unitary carleo_solving_2017; czischek_quenches_2018; jonsson_neuralnetwork_2018 and dissipative hartmann_neuralnetwork_2019; yoshioka_constructing_2019; nagy_variational_2019; vicentini_variational_2019 many-body dynamics. Despite the increasing methodological and theoretical interest in NQS and their applications, a set of comprehensive, easy-to-use tools for research applications is still lacking. This is particularly pressing as the complexity of NQS-related approaches and algorithms is expected to grow rapidly given these first successes, steepening the learning curve.
The goal of NetKet is to provide a set of primitives and flexible tools to ease the development of cutting-edge ML applications for quantum many-body physics. NetKet also aims to help bridge the gap between the latest, technically demanding developments in the field and those scholars and students who approach the subject for the first time; pedagogical tutorials are provided to this aim. Serving as a common platform for future research, the NetKet project is meant to stimulate the open and easy-to-certify development of new methods and to provide a common set of tools to reproduce published results.
A central philosophy of the NetKet framework is to provide tools that are as simple as possible to use for the end user. Given the huge popularity of the Python programming language and of the many accompanying tools gravitating around the Python ecosystem, we have built NetKet as a full-fledged Python library. This simplicity of use, however, does not come at the expense of performance. With this efficiency requirement in mind, all critical routines and components of NetKet have been written in C++11.
II Software description
We will first give a general overview of the structure of the code in Sect. II.1 and then provide additional details on the functionality of NetKet in Sect. II.2.
II.1 Software architecture
The core of NetKet is implemented in C++. For ease of use and in order to facilitate the integration with other frameworks, a Python interface is provided, which exposes all high-level functionality from the C++ core via pybind11 pybind11 bindings. Use of the Python interface is recommended for users building on the library for research purposes, while the C++ code should be modified for extending the NetKet library itself.
NetKet is divided into several submodules. The modules graph, hilbert, and operator contain the classes necessary for specifying the structure of the many-body Hilbert space, the Hamiltonian, and other observables of a quantum system.
The core component of NetKet is the machine module, which provides different variational representations of the quantum wavefunction, particularly in the form of NQS. The variational, supervised, and unsupervised modules contain driver classes for energy optimization, supervised learning, and quantum state tomography, respectively. These driver classes are supported by the sampler and optimizer modules, which provide classes for performing Variational Monte Carlo (VMC) sampling and optimization steps.
The exact module provides functions for exact diagonalization (ED) and imaginary-time propagation of the full quantum state, in order to allow for easy benchmarking and exploration of small systems within the NetKet framework. ED can be performed by full diagonalization of the Hamiltonian or, alternatively, by a Lanczos procedure, where the user may choose between a sparse matrix representation of the Hamiltonian and a matrix-free implementation. The Lanczos solver is based on the IETL library from the ALPS project Bauer2011; Albuquerque2007, which implements a variant of the Lanczos algorithm due to Cullum and Willoughby Cullum1981; Cullum1985. The dynamics module provides a basic Runge-Kutta ODE solver which is used for the exact imaginary-time propagation.
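The idea behind exact imaginary-time propagation can be sketched in a few lines of plain NumPy: integrate dψ/dτ = −Hψ with a fourth-order Runge-Kutta step and renormalize, so that excited components decay away. This is an illustrative toy (a dense transverse-field Ising Hamiltonian built by a hypothetical helper), not the NetKet implementation, which works with the classes of the exact and dynamics modules.

```python
import numpy as np

# Toy illustration of imaginary-time propagation with a 4th-order Runge-Kutta
# step on the full state vector. tfi_hamiltonian is a hypothetical helper,
# not part of the NetKet API.

def tfi_hamiltonian(n, h=1.0):
    """Dense H = -sum_i Z_i Z_{i+1} - h sum_i X_i on a ring of n spins."""
    X = np.array([[0.0, 1.0], [1.0, 0.0]])
    Z = np.array([[1.0, 0.0], [0.0, -1.0]])

    def site_op(op, i):
        mats = [np.eye(2)] * n
        mats[i] = op
        full = mats[0]
        for m in mats[1:]:
            full = np.kron(full, m)
        return full

    H = np.zeros((2 ** n, 2 ** n))
    for i in range(n):
        H -= h * site_op(X, i)
        H -= site_op(Z, i) @ site_op(Z, (i + 1) % n)
    return H

H = tfi_hamiltonian(6)

# Propagate d psi / d tau = -H psi and renormalize after every step;
# excited-state components decay exponentially, leaving the ground state.
rng = np.random.default_rng(0)
psi = rng.random(H.shape[0])
psi /= np.linalg.norm(psi)
dt = 0.05
for _ in range(2000):
    k1 = -H @ psi
    k2 = -H @ (psi + 0.5 * dt * k1)
    k3 = -H @ (psi + 0.5 * dt * k2)
    k4 = -H @ (psi + dt * k3)
    psi += (dt / 6.0) * (k1 + 2 * k2 + 2 * k3 + k4)
    psi /= np.linalg.norm(psi)

energy = psi @ H @ psi
```

The converged energy can be checked against full diagonalization (np.linalg.eigvalsh), which is the role the ED functions play when benchmarking variational results.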
The utility modules output, stats, and util contain some additional functionality for output and statistics that is used internally in other parts of NetKet.
An overview of the most important modules and their dependencies is given in Figure 1. A more detailed description of the module contents will be given in the next section.
NetKet uses the Eigen 3 library eigen3 for linear algebra routines. In the Python interface, Eigen datatypes are transparently converted to and from NumPy numpy arrays by pybind11. The NetKet driver classes provide methods to directly write the simulation output to JSON files, which is done with the help of the nlohmann/json library for C++ nlohmannjson. Parallelization is implemented using the Message Passing Interface (MPI), which can substantially decrease the running time. Specifically, the Monte Carlo sampling of expectation values implemented in the variational.Vmc class is parallelized, with each node drawing independent samples from the probability distribution, which are then averaged over all nodes.
II.2 Software functionalities
The core feature of NetKet is the variational representation of quantum states by artificial neural networks. Given a variational state, the task is to optimize its parameters with regard to a specified loss function, such as the total energy for ground state searches or the (negative) overlap with a given target state. In this section, we will discuss the models, types of variational wavefunctions, and learning schemes that are available in NetKet.
II.2.1 Model specification
NetKet currently supports lattice models with a finite Hilbert space of the form $\mathcal{H} = \bigotimes_{i=1}^{N} \mathcal{H}_{\mathrm{local}}$, where $N$ denotes the number of lattice sites. The system is defined on a graph $G = (V, E)$ with a set of sites $V$ and a set of edges (also called bonds) $E$ between sites. The graph structure is used to help with the definition of operators on the lattice and to encode the spatial structure of the model, which is necessary, e.g., to work with convolutional neural networks (CNNs). NetKet provides the predefined Hypercube and Lattice graphs. Furthermore, CustomGraph supports arbitrary edge-colored graphs, where each edge is associated with an integer label called its color. This color can be used to describe different types of bonds.
Several predefined Hamiltonians are provided, such as spin models (transverse-field Ising and Heisenberg models) or bosonic models (Bose-Hubbard model). For specifying other observables and custom Hamiltonians, additional classes are available: a convenient option for common lattice models is the GraphOperator class, which allows a Hamiltonian to be constructed from a family of 2-local operators acting on each bond of a selected color and a family of 1-local operators acting on each site. It is also possible to specify general local operators (as well as their products and sums) using the LocalOperator class.
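The bond-operator construction just described can be illustrated in plain NumPy: place one 2-local operator on every edge of a graph and sum the embeddings. The sketch below does this for the Heisenberg exchange on a 4-site ring; embed_two_site is a hypothetical helper written for this illustration, not the GraphOperator implementation.

```python
import numpy as np

# Sketch of the GraphOperator idea: a Hamiltonian assembled from one 2-local
# bond operator per edge of a graph. Here: the Heisenberg exchange
# sigma_i . sigma_j (Pauli matrices) on every edge of a 4-site ring.
sx = np.array([[0, 1], [1, 0]], dtype=complex)
sy = np.array([[0, -1j], [1j, 0]], dtype=complex)
sz = np.array([[1, 0], [0, -1]], dtype=complex)
bond = sum(np.kron(s, s) for s in (sx, sy, sz))  # 4x4 operator on one edge

def embed_two_site(op, i, j, n):
    """Matrix of a two-site operator `op` acting on sites (i, j) of n spins."""
    op = op.reshape(2, 2, 2, 2)  # indices (i_out, j_out, i_in, j_in)
    dim = 2 ** n
    H = np.zeros((dim, dim), dtype=complex)
    for col in range(dim):
        bits = [(col >> (n - 1 - k)) & 1 for k in range(n)]
        for io in range(2):
            for jo in range(2):
                amp = op[io, jo, bits[i], bits[j]]
                if amp != 0:
                    new = bits.copy()
                    new[i], new[j] = io, jo
                    row = 0
                    for bit in new:
                        row = (row << 1) | bit
                    H[row, col] += amp
    return H

n_sites = 4
edges = [(k, (k + 1) % n_sites) for k in range(n_sites)]  # ring graph
H = sum(embed_two_site(bond, i, j, n_sites) for i, j in edges)
e0 = np.linalg.eigvalsh(H)[0]
```

Assigning colors to edges would simply mean selecting a different bond matrix per edge label before summing, which is the mechanism GraphOperator exposes.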
II.2.2 Variational quantum states
The purpose of variational states is to provide a compact and computationally efficient representation of quantum states. Since generally only a subset of the full many-body Hilbert space will be covered by a given variational ansatz, the aim is to use a parametrization that captures the relevant physical states for a given problem.
The variational wavefunctions supported by NetKet are provided as part of the machine module, which currently includes NQS but also Jastrow wavefunctions RevModPhys.63.1; Becca2017 and matrix-product states (MPS) White1992; Rommer1997; Schollwck2011.
Broadly, there are two main types of NQS available in NetKet: restricted Boltzmann machines (RBM) Hinton2006 and feed-forward neural networks (FFNN) LeCun2015; Goodfellow2016; Saito2017; Saito2018. Both types of networks are fully complex, i.e., with both complex-valued parameters and output.
The machine module contains the RbmSpin class for spin systems as well as two other variants: the symmetric RBM (RbmSpinSymm), which captures lattice symmetries such as translation and inversion, and the multi-valued RBM (RbmMultiVal) for systems with larger local Hilbert spaces (such as higher spins or bosonic systems).
FFNNs represent a broad and flexible class of networks and are implemented by the FFNN class. They consist of a sequence of layers available from the layer submodule, each layer applying either an affine transformation or a nonlinear activation function to its input. There are currently two types of affine maps available:

Dense fully-connected layers, which for an input $\mathbf{x} \in \mathbb{C}^n$ and output $\mathbf{y} \in \mathbb{C}^m$ have the form $\mathbf{y} = W\mathbf{x} + \mathbf{b}$, where $W \in \mathbb{C}^{m \times n}$ and $\mathbf{b} \in \mathbb{C}^m$ are called the weight matrix and bias vector, respectively.

Convolutional layers Gu2015; LeCun2015 for hypercubic lattices.
As activation functions, rectified linear units (Relu) Nair2010, hyperbolic tangent (Tanh) LeCun2012, and the logarithm of the hyperbolic cosine (Lncosh) are provided. RBMs without visible bias can be represented as single-layer FFNNs with $\ln\cosh$ activation, allowing for a generalization of these machines to multiple layers Choo2018.
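The last statement can be checked numerically: with zero visible bias, the RBM log-amplitude $\log\Psi(\sigma) = \sum_j \ln\cosh(b_j + \sum_i W_{ji}\sigma_i)$ is exactly a dense layer followed by the $\ln\cosh$ activation and a final sum. A small NumPy sketch with illustrative shapes and parameter values (not the NetKet classes):

```python
import numpy as np

# An RBM with no visible bias versus a single dense layer + lncosh + sum.
# Complex parameters, as in NetKet's fully complex machines; all shapes and
# values here are illustrative.
rng = np.random.default_rng(42)
n_v, n_h = 6, 12
W = 0.1 * (rng.normal(size=(n_h, n_v)) + 1j * rng.normal(size=(n_h, n_v)))
b = 0.1 * (rng.normal(size=n_h) + 1j * rng.normal(size=n_h))

def psi_rbm(sigma):
    # RBM amplitude with hidden units traced out (visible bias set to zero);
    # the constant 2^M prefactor is dropped, irrelevant for an unnormalized state.
    return np.prod(np.cosh(W @ sigma + b))

# The same machine written as a feed-forward network: affine map, lncosh
# activation, then a sum layer producing the scalar log-amplitude.
layers = [
    lambda x: W @ x + b,           # dense fully-connected layer
    lambda x: np.log(np.cosh(x)),  # lncosh activation
    np.sum,                        # sum output layer
]

def log_psi_ffnn(sigma):
    for layer in layers:
        sigma = layer(sigma)
    return sigma

sigma = rng.choice([-1.0, 1.0], size=n_v)
```

Exponentiating the FFNN output reproduces the RBM amplitude, and stacking further affine/activation pairs in `layers` is exactly the multi-layer generalization referred to above.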
Finally, the machine module also provides more traditional variational wavefunctions, namely MPS with periodic boundary conditions (MPSPeriodic) and long-range Jastrow (Jastrow) wavefunctions, which allows for a comparison of NQS with results obtained using these approaches.
Custom wavefunctions may be provided by implementing subclasses of the AbstractMachine class in C++.
II.2.3 Supervised learning
In supervised learning, a target wavefunction is given and the task is to optimize a chosen ansatz to represent it. This functionality is contained within the supervised module. Given a variational state $|\Psi(\theta)\rangle$ depending on the parameters $\theta$ and a target state $|\Phi\rangle$, the negative log overlap
$$\mathcal{L}(\theta) = -\log\frac{\langle\Phi|\Psi(\theta)\rangle\,\langle\Psi(\theta)|\Phi\rangle}{\langle\Phi|\Phi\rangle\,\langle\Psi(\theta)|\Psi(\theta)\rangle} \qquad (1)$$
is taken as the loss function to be minimized. The loss is computed in a Monte Carlo fashion by direct sampling of the target wavefunction. To minimize the loss, the gradient $\nabla_\theta \mathcal{L}$ of the loss function with respect to the parameters $\theta$ is calculated. This gradient is then used to update the parameters according to a specified gradient-based optimization scheme. For example, in stochastic gradient descent (SGD) the parameters are updated as
$$\theta_{k+1} = \theta_k - \eta\,\nabla_\theta \mathcal{L}(\theta_k) \qquad (2)$$
where $\eta$ is the learning rate. The different update rules supported by NetKet are contained in the optimizer module. Various types of optimizers are available, including SGD, AdaGrad Duchi2011, AdaMax and AdaDelta Kingma2014, AMSGrad Reddi2018, and RMSProp.
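As a sketch, the plain SGD rule of Eq. (2) and a variant such as AdaGrad differ only in how the step is scaled. The NumPy functions below are illustrative stand-ins, not the netket.optimizer classes:

```python
import numpy as np

# Illustrative update rules; not the netket.optimizer implementations.

def sgd_update(theta, grad, eta=0.1):
    """Plain SGD: theta_{k+1} = theta_k - eta * grad, as in Eq. (2)."""
    return theta - eta * grad

def adagrad_update(theta, grad, accum, eta=0.1, eps=1e-8):
    """AdaGrad: per-parameter steps scaled by accumulated squared gradients."""
    accum = accum + np.abs(grad) ** 2
    return theta - eta * grad / (np.sqrt(accum) + eps), accum

# Toy usage: minimize L(theta) = |theta|^2 / 2, whose gradient is theta itself.
theta = np.array([1.0, -2.0])
for _ in range(200):
    theta = sgd_update(theta, theta)
```

In the drivers, `grad` would be the stochastic estimate of $\nabla_\theta \mathcal{L}$ described above, so every optimizer listed here acts on noisy gradients.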
II.2.4 Unsupervised learning
NetKet also makes it possible to carry out unsupervised learning of unknown probability distributions, which in this context corresponds to quantum state tomography PhysRevA.64.052312. Given an unknown quantum state, a neural network can be trained on projective measurement data to discover an approximate reconstruction of the state Torlai2018. In NetKet, this functionality is contained within the unsupervised.Qsr class.
For some given target quantum state $|\Phi\rangle$, the training dataset $D$ consists of a sequence of projective measurements $\sigma^b$ in different bases $b$, with underlying probability distribution $P(\sigma^b) = |\langle\sigma^b|\Phi\rangle|^2$. The quantum reconstruction of the target state translates into minimizing the statistical divergence between the distribution of the measurement outcomes and the distribution generated by the NQS. This corresponds, up to a constant dataset entropy contribution, to maximizing the log-likelihood of the network distribution over the measurement data
$$\mathcal{L} = \sum_{\sigma^b \in D} \log \pi(\sigma^b) \qquad (3)$$
where $\pi$ denotes the probability distribution
$$\pi(\sigma) = \frac{|\Psi(\sigma)|^2}{\sum_{\sigma'} |\Psi(\sigma')|^2} \qquad (4)$$
generated by the NQS wavefunction.
Note that, for every training sample where the measurement basis $b$ differs from the reference basis of the NQS, a unitary transformation $\hat{\mathcal{U}}_b$ should be applied to appropriately change the basis, $\Psi(\sigma^b) = \hat{\mathcal{U}}_b\,\Psi(\sigma)$.
The network parameters are updated according to the gradient of the log-likelihood, $\nabla_\theta \mathcal{L}$. This gradient can be computed analytically, and it requires expectation values over both the training data points and the network distribution $\pi$. While the former are trivial to compute, the latter should be approximated by a Monte Carlo average over configurations sampled from a Markov chain.
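For a toy model small enough to normalize exactly, the structure of this gradient (a data average minus a model expectation) can be written out explicitly. The product-state ansatz below, with $\log\Psi(\sigma) = \theta \cdot \sigma$, is purely illustrative; the actual Qsr class replaces the exact model expectation with the Markov-chain average just described.

```python
import numpy as np

# Log-likelihood of Eq. (3) and its gradient for a toy real ansatz
# log Psi(sigma) = theta . sigma on 3 spins, where the normalization of
# Eq. (4) can be summed exactly. Illustrative only.
n = 3
states = np.array([[2 * int(c) - 1 for c in f"{k:0{n}b}"] for k in range(2 ** n)])

def log_pi(theta, sigma):
    """log pi(sigma) = 2 log Psi(sigma) - log Z for Psi(sigma) = exp(theta.sigma)."""
    log_z = np.log(np.sum(np.exp(2 * states @ theta)))
    return 2 * theta @ sigma - log_z

def log_likelihood(theta, data):
    return sum(log_pi(theta, s) for s in data)

def grad_log_likelihood(theta, data):
    # Data term minus model term: sum_b 2 sigma^b  -  |D| * 2 <sigma>_pi
    pi = np.exp(2 * states @ theta)
    pi /= pi.sum()
    model_term = 2 * (pi @ states)
    data_term = sum(2 * np.asarray(s, float) for s in data)
    return data_term - len(data) * model_term

theta = np.array([0.3, -0.2, 0.1])
data = [states[1], states[5], states[6], states[1]]  # toy "measurement" records
```

A finite-difference check on `log_likelihood` confirms the analytic gradient; in the library, the model term would be the Monte Carlo average over sampled configurations.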
II.2.5 Variational Monte Carlo
Finally, NetKet supports ground state searches for a given many-body quantum Hamiltonian $H$. In this context, the task is to optimize the parameters $\theta$ of a variational wavefunction $\Psi(\theta)$ in order to minimize the energy $\langle\Psi(\theta)|H|\Psi(\theta)\rangle / \langle\Psi(\theta)|\Psi(\theta)\rangle$. The variational.Vmc driver class contains the main logic to optimize a variational wavefunction given a Hamiltonian, a sampler, and an optimizer.
The energy of a wavefunction $\Psi$ can be estimated as
$$\langle H \rangle = \frac{\sum_\sigma |\Psi(\sigma)|^2 \, E_{\mathrm{loc}}(\sigma)}{\sum_{\sigma'} |\Psi(\sigma')|^2} \approx \langle\!\langle E_{\mathrm{loc}} \rangle\!\rangle, \qquad E_{\mathrm{loc}}(\sigma) = \frac{\langle\sigma|H|\Psi\rangle}{\langle\sigma|\Psi\rangle} \qquad (5)$$
where $\langle\!\langle \cdot \rangle\!\rangle$ denotes a stochastic expectation value taken over a sample of configurations $\sigma$ drawn from the probability distribution (4) corresponding to the variational wavefunction. This sampling is performed by classes from the sampler module, which generate Markov chains of configurations using the Metropolis algorithm Metropolis1953 to ensure detailed balance. Parallel tempering Swendsen1986 options are also available to improve sampling efficiency.
In order to optimize the parameters of a machine to minimize the energy, a gradient-based optimization scheme can be applied as discussed in the previous section. The energy gradient can be estimated at the same time as $\langle H \rangle$ Becca2017; carleo_solving_2017. This requires computing the partial derivatives of the wavefunction with respect to the variational parameters, which can be obtained analytically for the RBM carleo_solving_2017 or via backpropagation LeCun2015; LeCun2012; Goodfellow2016 for multi-layer FFNNs. In this case, the steepest-descent update according to Eq. (2) is also a form of SGD, because the energy is estimated using a subset of the full data available from the variational wavefunction. Alternatively, more stable convergence can often be achieved by using the stochastic reconfiguration (SR) method Sorella_SR; casula, which approximates the imaginary-time evolution of the system on the submanifold of variational states. The SR approach is closely related to the natural-gradient descent method used in machine learning Amari1998. In the NetKet implementation, SR is performed using either an exact or an iterative linear solver, the latter being recommended when the number of variational parameters is large.
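The sampling and local-energy machinery behind Eq. (5) can be sketched end to end for a system small enough to check by brute force. The following NumPy toy uses a positive Jastrow-style trial state (illustrative parameters, not an optimized ansatz) and single-spin-flip Metropolis sampling, mirroring what the sampler and variational modules do internally; it is not the NetKet code itself.

```python
import numpy as np

# VMC estimate of <H> for a transverse-field Ising ring,
# H = -J sum_i s^z_i s^z_{i+1} - h sum_i s^x_i, with a toy positive
# Jastrow-style trial state. Illustrative NumPy sketch only.
rng = np.random.default_rng(1)
n, J, h = 8, 1.0, 1.0
a, b = 0.1, 0.2  # toy variational parameters

def log_psi(s):
    """log psi(s) = a sum_i s_i + b sum_i s_i s_{i+1} (periodic chain)."""
    return a * s.sum() + b * np.sum(s * np.roll(s, -1))

def local_energy(s):
    """E_loc(s) = <s|H|psi> / <s|psi>."""
    e = -J * np.sum(s * np.roll(s, -1))  # diagonal ZZ term
    for i in range(n):                   # off-diagonal X terms: single flips
        t = s.copy()
        t[i] = -t[i]
        e -= h * np.exp(log_psi(t) - log_psi(s))
    return e

# Metropolis sampling of |psi|^2 with single spin flips.
s = rng.choice([-1, 1], size=n)
e_samples = []
for sweep in range(5000):
    for _ in range(n):
        i = rng.integers(n)
        t = s.copy()
        t[i] = -t[i]
        if rng.random() < np.exp(2 * (log_psi(t) - log_psi(s))):
            s = t
    if sweep >= 500:  # discard burn-in sweeps
        e_samples.append(local_energy(s))
e_mc = np.mean(e_samples)

# Exact expectation by brute-force summation (possible only for small n).
confs = np.array([[2 * int(c) - 1 for c in f"{k:0{n}b}"] for k in range(2 ** n)])
w = np.exp(2 * np.array([log_psi(c) for c in confs]))
e_exact = np.sum(w * np.array([local_energy(c) for c in confs])) / w.sum()
```

The stochastic estimate agrees with the exact sum up to Monte Carlo error; the driver additionally accumulates the parameter derivatives alongside the same samples to form the energy gradient (or the SR linear system).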
Information on the optimization run (sampler acceptance rates, energy, energy variance, expectation of additional observables, and the current variational parameters) for each iteration can be written to a log file in JSON format. Alternatively, they can be accessed directly inside the simulation loop in Python to allow for more flexible output.
III Illustrative examples
NetKet is available as a Python package and can be obtained from the Python package index (PyPI) PyPI. Assuming a properly configured Python environment, NetKet can be installed via the shell command
pip install netket
which will download, compile, and install the package. A working MPI environment is required to run NetKet. In case multiple MPI installations are present on the system, and in order to avoid potential conflicts, we recommend running the installation command as
CC=mpicc CXX=mpicxx pip install netket
with the desired MPI environment loaded in order to perform the build with the correct compiler. After a successful installation, the NetKet module can be imported in Python scripts.
As an alternative to a local installation, NetKet also uses the BinderHub deployment at mybinder.org Jupyter2018 to build and deploy a stable version of the software, which can be found at https://mybinder.org/v2/gh/netket/netket/master. This allows users to run the tutorials or other small jobs without installing NetKet.
III.1 One-dimensional Heisenberg model
As a first example, we present a Python script for obtaining a variational RBM representation of the ground state of the spin-1/2 Heisenberg model on a one-dimensional chain with periodic boundary conditions. The code for this example is shown in Listing 1. Figure 2 shows the evolution of the energy expectation value over the course of the optimization run. We see that for a small chain of 20 sites and an RBM with 20 hidden units, the energy converges to within a small relative error of the exact ground-state energy within about 100 iteration steps.
III.2 Supervised learning
As a second example, we use the supervised learning module in NetKet to optimize an RBM to represent the ground state of the transverse-field Ising model. The example script is shown in Listing 2. The exact ground state wavefunction is first obtained by exact diagonalization and then used for training the RBM state by minimizing the overlap loss (1). Figure 3 shows the evolution of the overlap over the training iterations.
IV Impact
Given the flexibility of NetKet, we envision several potential applications of this library both in data-driven experimental research and in more theoretical, problem-driven research on interacting quantum many-body systems. For example, several important theoretical and practical questions concerning the expressibility of NQS, the learnability of experimental quantum states, and the efficiency at finding ground states of local Hamiltonians, can be directly addressed using the current functionality of the software.
Moreover, having an easy-to-extend set of tools to work with NQS-based applications can propel future research in the field, without researchers having to pay a significant cost of entry in terms of algorithm implementation and testing. Since its early release in April 2017, NetKet has already been used for research purposes by several groups worldwide. We also hope that, building upon a common set of tools, practices like publishing accompanying codes to research papers, largely popular in the ML community, can become standard practice also for ML applications in quantum physics.
Finally, for a fast-growing community like ML for quantum science, it is also crucial to have pedagogical tools available that can be conveniently used by new generations of students and researchers. Benefiting from a growing set of tutorials and step-by-step explanations, NetKet can be comfortably used in schools and lectures.
V Conclusions and future directions
We have introduced NetKet, a comprehensive open source framework for the study of many-body quantum systems using machine learning techniques. Central to this framework are variational parameterizations of many-body wavefunctions in the form of artificial neural networks. NetKet is a Python framework implemented in C++11, designed with efficiency as well as ease of use in mind. Several examples, tutorials, and notebooks are provided with our software in order to reduce the learning curve for newcomers.
The NetKet project is meant to continuously evolve in future releases, welcoming suggestions and contributions from its users. For example, future versions may provide a natural interface with general ML frameworks such as PyTorch paszke2017automatic and TensorFlow tensorflow2015whitepaper. On the algorithmic side, future goals include the extension of NetKet to incorporate unitary dynamics carleo_localization_2012; jonsson_neuralnetwork_2018 as well as support for neural density matrices torlai_latent_2018.
Acknowledgements
We acknowledge support from the Flatiron Institute of the Simons Foundation. J.E.T.S. gratefully acknowledges support from a fellowship through The Molecular Sciences Software Institute under NSF Grant ACI-1547580. H.T. is supported by a grant from the Fondation CFM pour la Recherche. S.E. and I.G. are supported by an ERC Advanced Grant QENOCOBA under the EU Horizon2020 program (grant agreement 742102) and the German Research Foundation (DFG) under Germany's Excellence Strategy through Project No. EXC-2111 – 390814868 (MCQST). This project makes use of other open source software, namely pybind11 pybind11, Eigen eigen3, ALPS IETL Bauer2011; Albuquerque2007, nlohmann/json nlohmannjson, and NumPy numpy. We further acknowledge discussions with, as well as bug reports, comments, and more from S. Arnold, A. Booth, A. Borin, J. Carrasquilla, S. Lederer, Y. Levine, T. Neupert, O. Parcollet, A. Rubio, M. A. Sentef, O. Sharir, M. Stoudenmire, and N. Wies.