# NetKet: A Machine Learning Toolkit for Many-Body Quantum Systems

Giuseppe Carleo (Center for Computational Quantum Physics, Flatiron Institute, 162 5th Avenue, NY 10010, New York, USA)

Kenny Choo (Department of Physics, University of Zurich, Winterthurerstrasse 190, 8057 Zürich, Switzerland)

Damian Hofmann (Max Planck Institute for the Structure and Dynamics of Matter, Luruper Chaussee 149, 22761 Hamburg, Germany)

James E. T. Smith (Department of Chemistry, University of Colorado Boulder, Boulder, Colorado 80302, USA)

Tom Westerhout (Institute for Molecules and Materials, Radboud University, NL-6525 AJ Nijmegen, The Netherlands)

Fabien Alet (Laboratoire de Physique Théorique, IRSAMC, Université de Toulouse, CNRS, UPS, 31062 Toulouse, France)

Emily J. Davis (Department of Physics, Stanford University, Stanford, California 94305, USA)

Stavros Efthymiou (Max-Planck-Institut für Quantenoptik, Hans-Kopfermann-Straße 1, 85748 Garching bei München, Germany)

Ivan Glasser (Max-Planck-Institut für Quantenoptik, Hans-Kopfermann-Straße 1, 85748 Garching bei München, Germany)

Sheng-Hsuan Lin (Department of Physics, T42, Technische Universität München, James-Franck-Straße 1, 85748 Garching bei München, Germany)

Marta Mauri (Center for Computational Quantum Physics, Flatiron Institute, 162 5th Avenue, NY 10010, New York, USA; Dipartimento di Fisica, Università degli Studi di Milano, via Celoria 16, I-20133 Milano, Italy)

Guglielmo Mazzola (Theoretische Physik, ETH Zürich, 8093 Zürich, Switzerland)

Christian B. Mendl (Technische Universität Dresden, Institute of Scientific Computing, Zellescher Weg 12-14, 01069 Dresden, Germany)

Evert van Nieuwenburg (Institute for Quantum Information and Matter, California Institute of Technology, Pasadena, CA 91125, USA)

Ossian O’Reilly (Southern California Earthquake Center, University of Southern California, 3651 Trousdale Pkwy, Los Angeles, CA 90089, USA)

Hugo Théveniaut (Laboratoire de Physique Théorique, IRSAMC, Université de Toulouse, CNRS, UPS, 31062 Toulouse, France)

Giacomo Torlai (Center for Computational Quantum Physics, Flatiron Institute, 162 5th Avenue, NY 10010, New York, USA)

Alexander Wietek (Center for Computational Quantum Physics, Flatiron Institute, 162 5th Avenue, NY 10010, New York, USA)

###### Abstract

We introduce NetKet, a comprehensive open source framework for the study of many-body quantum systems using machine learning techniques. The framework is built around a general and flexible implementation of neural-network quantum states, which are used as a variational ansatz for quantum wavefunctions. NetKet provides algorithms for several key tasks in quantum many-body physics and quantum technology, namely quantum state tomography, supervised learning from wavefunction data, and ground state searches for a wide range of customizable lattice models. Our aim is to provide a common platform for open research and to stimulate the collaborative development of computational methods at the interface of machine learning and many-body physics.

## I Motivation and significance

Recent years have seen tremendous activity in the development of physics-oriented numerical techniques based on machine learning (ML) tools [carleo_machine_2019]. In the context of many-body quantum physics, one of the main goals of these approaches is to tackle complex quantum problems using compact representations of many-body states based on artificial neural networks. These representations, dubbed neural-network quantum states (NQS) [carleo_solving_2017], can be used for several applications. In the supervised learning setting, they can be used, e.g., to learn existing quantum states for which a non-NQS representation is available [cai_approximating_2018]. In the unsupervised setting, they can be used to reconstruct complex quantum states from experimental measurements, a task known as quantum state tomography [Torlai2018]. Finally, in the context of purely variational applications, NQS can be used to find approximate ground- and excited-state solutions of the Schrödinger equation [carleo_solving_2017; Choo2018; Glasser2018; Kaubruegger2018; Saito2017; Saito2018], as well as to describe unitary [carleo_solving_2017; czischek_quenches_2018; jonsson_neural-network_2018] and dissipative [hartmann_neural-network_2019; yoshioka_constructing_2019; nagy_variational_2019; vicentini_variational_2019] many-body dynamics. Despite the increasing methodological and theoretical interest in NQS and their applications, a set of comprehensive, easy-to-use tools for research applications is still lacking. This is particularly pressing as the complexity of NQS-related approaches and algorithms is expected to grow rapidly given these first successes, steepening the learning curve.

The goal of NetKet is to provide a set of primitives and flexible tools to ease the development of cutting-edge ML applications for quantum many-body physics. NetKet also aims to help bridge the gap between the latest, technically demanding developments in the field and scholars and students who approach the subject for the first time. Pedagogical tutorials are provided to this end. Serving as a common platform for future research, the NetKet project is meant to stimulate the open and easy-to-certify development of new methods and to provide a common set of tools to reproduce published results.

A central philosophy of the NetKet framework is to provide tools that are as simple as possible to use for the end user. Given the huge popularity of the Python programming language and of the many tools in the Python ecosystem, we have built NetKet as a full-fledged Python library. This simplicity of use, however, does not come at the expense of performance: with efficiency in mind, all critical routines and components of NetKet have been written in C++11.

## II Software description

We will first give a general overview of the structure of the code in Sect. II.1 and then provide additional details on the functionality of NetKet in Sect. II.2.

### II.1 Software architecture

The core of NetKet is implemented in C++. For ease of use and in order to facilitate the integration with other frameworks, a Python interface is provided, which exposes all high-level functionality from the C++ core via pybind11 [pybind11] bindings. Use of the Python interface is recommended for users building on the library for research purposes, while the C++ code should be modified for extending the NetKet library itself.

NetKet is divided into several submodules. The modules graph, hilbert, and operator contain the classes necessary for specifying the structure of the many-body Hilbert space, the Hamiltonian, and other observables of a quantum system.

The core component of NetKet is the machine module, which provides different variational representations of the quantum wavefunction, particularly in the form of NQS. The variational, supervised, and unsupervised modules contain driver classes for energy optimization, supervised learning, and quantum state tomography, respectively. These driver classes are supported by the sampler and optimizer modules, which provide classes for performing Variational Monte Carlo (VMC) sampling and optimization steps.

The exact module provides functions for exact diagonalization (ED) and imaginary time propagation of the full quantum state, in order to allow for easy benchmarking and exploration of small systems within the NetKet framework. ED can be performed by full diagonalization of the Hamiltonian or, alternatively, by a Lanczos procedure, where the user may choose between a sparse matrix representation of the Hamiltonian and a matrix-free implementation. The Lanczos solver is based on the IETL library from the ALPS project [Bauer2011; Albuquerque2007], which implements a variant of the Lanczos algorithm due to Cullum and Willoughby [Cullum1981; Cullum1985]. The dynamics module provides a basic Runge-Kutta ODE solver which is used for the exact imaginary time propagation.
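The imaginary time propagation described above can be illustrated with a minimal sketch in plain Python. This is not NetKet's implementation: it assumes a two-site Heisenberg Hamiltonian in Pauli-matrix convention, applies it matrix-free to a four-component state vector, and integrates dψ/dτ = −Hψ with a classical fourth-order Runge-Kutta step, renormalizing after each step. All function names, the step size, and the number of steps are hypothetical choices made for illustration.

```python
import math

# Basis ordering for two spins: |uu>, |ud>, |du>, |dd>.
def apply_h(psi):
    """Matrix-free action of H = sx.sx + sy.sy + sz.sz on a 2-spin state."""
    uu, ud, du, dd = psi
    # Diagonal part: +1 for aligned spins, -1 for anti-aligned spins.
    # Off-diagonal part: exchanges |ud> and |du> with matrix element 2.
    return [uu, -ud + 2.0 * du, -du + 2.0 * ud, dd]

def axpy(a, x, y):
    """Return a * x + y for plain Python lists."""
    return [a * xi + yi for xi, yi in zip(x, y)]

def rk4_step(psi, dtau):
    """One fourth-order Runge-Kutta step of d(psi)/d(tau) = -H psi."""
    f = lambda v: [-c for c in apply_h(v)]
    k1 = f(psi)
    k2 = f(axpy(0.5 * dtau, k1, psi))
    k3 = f(axpy(0.5 * dtau, k2, psi))
    k4 = f(axpy(dtau, k3, psi))
    return [p + dtau / 6.0 * (a + 2 * b + 2 * c + d)
            for p, a, b, c, d in zip(psi, k1, k2, k3, k4)]

def normalize(psi):
    norm = math.sqrt(sum(c * c for c in psi))
    return [c / norm for c in psi]

def energy(psi):
    """Rayleigh quotient <psi|H|psi> / <psi|psi> for a real state."""
    h_psi = apply_h(psi)
    return sum(a * b for a, b in zip(psi, h_psi)) / sum(a * a for a in psi)

def ground_state(psi0, dtau=0.05, n_steps=400):
    """Propagate in imaginary time, renormalizing after every step."""
    psi = normalize(psi0)
    for _ in range(n_steps):
        psi = normalize(rk4_step(psi, dtau))
    return psi

psi_gs = ground_state([0.3, 1.0, 0.2, -0.4])
e_gs = energy(psi_gs)
```

Because the excited (triplet) components decay exponentially faster than the singlet, the propagated state approaches the singlet, provided the initial state has a nonzero singlet component.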

The utility modules output, stats, and util contain some additional functionality for output and statistics that is used internally in other parts of NetKet.

An overview of the most important modules and their dependencies is given in Figure 1. A more detailed description of the module contents will be given in the next section.

NetKet uses the Eigen 3 library [eigen3] for linear algebra routines. In the Python interface, Eigen datatypes are transparently converted to and from NumPy [numpy] arrays by pybind11. The NetKet driver classes provide methods to directly write the simulation output to JSON files, which is done with the help of the nlohmann/json library for C++ [nlohmann-json]. Parallelization is implemented based on the Message Passing Interface (MPI), which can substantially decrease the running time. Specifically, the Monte Carlo sampling of expectation values implemented in the variational.Vmc class is parallelized: each node draws independent samples from the probability distribution, and the resulting estimates are averaged over all nodes.
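The averaging over nodes can be pictured with the following sketch. It does not use MPI at all: the ranks are simulated by a plain loop, the sampled "observable" is a Gaussian random number standing in for a Monte Carlo estimate, and the names `local_estimate` and `parallel_mean` are hypothetical. It only shows that, with equally many samples per rank, averaging the per-rank means reproduces the mean over the pooled samples.

```python
import random

def local_estimate(rank, n_samples, seed=1234):
    """Mean of an observable over samples drawn independently on one node."""
    rng = random.Random(seed + rank)  # distinct random stream per simulated rank
    samples = [rng.gauss(-1.0, 0.5) for _ in range(n_samples)]
    return sum(samples) / n_samples, samples

def parallel_mean(n_ranks, n_samples_per_rank):
    """Average the per-rank means, as a sum-reduction over nodes would do."""
    results = [local_estimate(r, n_samples_per_rank) for r in range(n_ranks)]
    mean_of_means = sum(m for m, _ in results) / n_ranks
    # Pooled mean over all samples, kept only for comparison.
    pooled = [x for _, s in results for x in s]
    return mean_of_means, sum(pooled) / len(pooled)

mean_of_means, pooled_mean = parallel_mean(n_ranks=4, n_samples_per_rank=250)
```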

### II.2 Software functionalities

The core feature of NetKet is the variational representation of quantum states by artificial neural networks. Given a variational state, the task is to optimize its parameters with regard to a specified loss function, such as the total energy for ground state searches or the (negative) overlap with a given target state. In this section, we will discuss the models, types of variational wavefunctions, and learning schemes that are available in NetKet.

#### II.2.1 Model specification

NetKet currently supports lattice models with a finite Hilbert space of the form $\mathcal{H} = \mathcal{H}_{\mathrm{local}}^{\otimes N}$, where $N$ denotes the number of lattice sites. The system is defined on a graph $\mathcal{G} = (\mathcal{V}, \mathcal{E})$ with a set of sites $\mathcal{V}$ and a set of edges $\mathcal{E}$ (also called bonds) between sites. The graph structure is used to help with the definition of operators on the lattice and to encode the spatial structure of the model, which is necessary, e.g., to work with convolutional neural networks (CNNs). NetKet provides the predefined Hypercube and Lattice graphs. Furthermore, CustomGraph supports arbitrary edge-colored graphs, where each edge is associated with an integer label called its color. This color can be used to describe different types of bonds.
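As an illustration of the graph description above, the following sketch builds the nearest-neighbour edge list of a hypercubic lattice and attaches integer colors to the edges. It is a standalone toy in plain Python, not NetKet's Hypercube or CustomGraph class; the helper names `hypercube_edges` and `colored_edges`, the row-major site numbering, and the coloring rule are assumptions made here.

```python
from itertools import product

def hypercube_edges(length, n_dim=1, pbc=True):
    """Nearest-neighbour edges of a hypercubic lattice with length**n_dim sites.

    Sites are numbered in row-major order; each edge is a sorted pair (i, j).
    """
    def site_index(coords):
        index = 0
        for c in coords:
            index = index * length + c
        return index

    edges = set()
    for coords in product(range(length), repeat=n_dim):
        for axis in range(n_dim):
            nxt = coords[axis] + 1
            if nxt == length:
                if not pbc:
                    continue  # open boundary: no wrap-around bond
                nxt = 0
            neighbour = coords[:axis] + (nxt,) + coords[axis + 1:]
            i, j = site_index(coords), site_index(neighbour)
            if i != j:
                edges.add((min(i, j), max(i, j)))
    return sorted(edges)

def colored_edges(edges, color_of):
    """Attach an integer color to each edge, as in an edge-colored graph."""
    return [(i, j, color_of(i, j)) for i, j in edges]

chain = hypercube_edges(20, n_dim=1, pbc=True)
# Hypothetical coloring rule: the parity of the smaller site index.
colored_chain = colored_edges(chain, lambda i, j: i % 2)
```

The set of sorted pairs removes the duplicate bonds that would otherwise appear for very short periodic lattices.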

Several predefined Hamiltonians are provided, such as spin models (transverse field Ising, Heisenberg models) or bosonic models (Bose-Hubbard model). For specifying other observables and custom Hamiltonians, additional classes are available. A convenient option for common lattice models is the GraphOperator class, which allows one to construct a Hamiltonian from a family of 2-local operators acting on each bond of a selected color and a family of 1-local operators acting on each site. It is also possible to specify general $k$-local operators (as well as their products and sums) using the LocalOperator class.
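To make the construction of a Hamiltonian from 2-local bond operators concrete, here is a small sketch that is hard-coded for the Heisenberg interaction and does not use the GraphOperator class. Given a spin configuration and an edge list, it returns the configurations connected by the Hamiltonian together with the corresponding matrix elements. The function name `heisenberg_connections`, the Pauli-matrix convention, and the ±1 encoding of the spins are assumptions of this example.

```python
def heisenberg_connections(config, edges):
    """Non-zero matrix elements <config'|H|config> of
    H = sum over bonds of (sx.sx + sy.sy + sz.sz), for spins stored as +1 / -1.

    Returns a list of (config', matrix_element) pairs; the first entry is the
    diagonal element.
    """
    diagonal = 0.0
    connected = []
    for i, j in edges:
        # sz.sz is diagonal: +1 for aligned, -1 for anti-aligned spins.
        diagonal += config[i] * config[j]
        if config[i] != config[j]:
            # sx.sx + sy.sy exchanges anti-aligned spins with amplitude 2.
            flipped = list(config)
            flipped[i], flipped[j] = flipped[j], flipped[i]
            connected.append((tuple(flipped), 2.0))
    return [(tuple(config), diagonal)] + connected

# Four-site ring; the edge list plays the role of the graph.
ring = [(0, 1), (1, 2), (2, 3), (0, 3)]
neel = (1, -1, 1, -1)
elements = heisenberg_connections(neel, ring)
```

Only the connected configurations are ever generated, so the full Hamiltonian matrix is never stored.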

#### II.2.2 Variational quantum states

The purpose of variational states is to provide a compact and computationally efficient representation of quantum states. Since generally only a subset of the full many-body Hilbert space will be covered by a given variational ansatz, the aim is to use a parametrization that captures the relevant physical states for a given problem.

The variational wavefunctions supported by NetKet are provided as part of the machine module, which currently includes NQS but also Jastrow wavefunctions [RevModPhys.63.1; Becca2017] and matrix-product states (MPS) [White1992; Rommer1997; Schollwck2011].

Broadly, there are two main types of NQS available in NetKet: restricted Boltzmann machines (RBM) [Hinton2006] and feed-forward neural networks (FFNN) [LeCun2015; Goodfellow2016; Saito2017; Saito2018]. Both types of networks are fully complex, i.e., with both complex-valued parameters and output.

The machine module contains the RbmSpin class for spin-1/2 systems as well as two other variants: the symmetric RBM (RbmSpinSymm) to capture lattice symmetries such as translation and inversion symmetries and the multi-valued RBM (RbmMultiVal) for systems with larger local Hilbert spaces (such as higher spins or bosonic systems).

FFNNs represent a broad and flexible class of networks and are implemented by the FFNN class. They consist of a sequence of layers available from the layer submodule, each layer applying either an affine transformation or a non-linear activation function to its input vector. There are currently two types of affine maps available:

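• Dense fully-connected layers, which for an input $\mathbf{x} \in \mathbb{C}^n$ and output $\mathbf{y} \in \mathbb{C}^m$ have the form $\mathbf{y} = \mathbf{W}\mathbf{x} + \mathbf{b}$, where $\mathbf{W} \in \mathbb{C}^{m \times n}$ and $\mathbf{b} \in \mathbb{C}^m$ are called the weight matrix and bias vector, respectively.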

• Convolutional layers [Gu2015; LeCun2015] for hypercubic lattices.

As activation functions, rectified linear units (Relu) [Nair2010], hyperbolic tangent (Tanh) [LeCun2012], and the logarithm of the hyperbolic cosine (Lncosh) are provided. RBMs without visible bias can be represented as single-layer FFNNs with Lncosh activation, allowing for a generalization of these machines to multiple layers [Choo2018].
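The correspondence between an RBM without visible bias and a single dense layer with a log-cosh activation can be checked numerically with a short sketch. It is plain Python with complex parameters and does not use NetKet's FFNN or RbmSpin classes. Summing the layer outputs into a scalar log-amplitude is an assumption of this sketch, and the RBM amplitude is written with the hidden units traced out and constant prefactors dropped. All names and parameter values are hypothetical.

```python
import cmath

def dense(weights, bias, x):
    """Affine map y = W x + b for a list-of-rows weight matrix."""
    return [sum(w * xi for w, xi in zip(row, x)) + b
            for row, b in zip(weights, bias)]

def lncosh(x):
    """Element-wise logarithm of the hyperbolic cosine."""
    return [cmath.log(cmath.cosh(xi)) for xi in x]

def log_psi_ffnn(weights, bias, sigma):
    """Single dense layer + log-cosh activation, summed to a log-amplitude."""
    return sum(lncosh(dense(weights, bias, sigma)))

def psi_rbm_no_visible_bias(weights, bias, sigma):
    """RBM amplitude with hidden units traced out, up to a constant factor:
    the product over hidden units j of cosh(b_j + sum_i W_ji sigma_i)."""
    amplitude = 1.0 + 0.0j
    for theta in dense(weights, bias, sigma):
        amplitude *= cmath.cosh(theta)
    return amplitude

# Arbitrary complex parameters for 3 visible spins and 2 hidden units.
W = [[0.1 + 0.2j, -0.3j, 0.05], [0.2, 0.1 - 0.1j, -0.4 + 0.3j]]
b = [0.01j, -0.02]
sigma = [1, -1, 1]

log_amp = log_psi_ffnn(W, b, sigma)
amp = psi_rbm_no_visible_bias(W, b, sigma)
```

Exponentiating `log_amp` recovers `amp`, independently of the branch chosen by the complex logarithm.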

Finally, the machine module also provides more traditional variational wavefunctions, namely MPS with periodic boundary conditions (MPSPeriodic) and long-range Jastrow (Jastrow) wavefunctions, which allows for comparison of NQS with results obtained using these approaches.

Custom wavefunctions may be provided by implementing subclasses of the AbstractMachine class in C++.

#### II.2.3 Supervised learning

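In supervised learning, a target wavefunction is given and the task is to optimize a chosen ansatz to represent it. This functionality is contained within the supervised module. Given a variational state $|\Psi_{\mathrm{NN}}(\alpha)\rangle$ depending on the parameters $\alpha$ and a target state $|\Psi_{\mathrm{tar}}\rangle$, the negative log overlap

$$\mathcal{L}(\alpha) = -\log \left[ \frac{\langle \Psi_{\mathrm{tar}} | \Psi_{\mathrm{NN}}(\alpha) \rangle \, \langle \Psi_{\mathrm{NN}}(\alpha) | \Psi_{\mathrm{tar}} \rangle}{\langle \Psi_{\mathrm{tar}} | \Psi_{\mathrm{tar}} \rangle \, \langle \Psi_{\mathrm{NN}}(\alpha) | \Psi_{\mathrm{NN}}(\alpha) \rangle} \right] \tag{1}$$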

is taken as the loss function to be minimized. The loss is computed in a Monte Carlo fashion by direct sampling of the target wavefunction. To minimize the loss, the gradient of the loss function with respect to the parameters is calculated. This gradient is then used to update the parameters according to a specified gradient-based optimization scheme. For example, in stochastic gradient descent (SGD) the parameters are updated as

$$\alpha \to \alpha - \lambda \nabla_{\alpha} \mathcal{L} \tag{2}$$

where $\lambda$ is the learning rate. The different update rules supported by NetKet are contained in the optimizer module. Various types of optimizers are available, including SGD, AdaGrad [Duchi2011], AdaMax and AdaDelta [Kingma2014], AMSGrad [Reddi2018], and RMSProp.
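The following sketch shows the structure of such an optimization on a four-component toy state. Several assumptions are made here and are not taken from NetKet: the negative log overlap is taken to be the negative logarithm of the normalized squared overlap, the state vectors are real, the ansatz has one parameter per basis state, and the gradient is obtained by finite differences over the full state rather than by Monte Carlo sampling, so the update is plain gradient descent using the rule of Eq. (2). All function names are hypothetical.

```python
import math

def neg_log_overlap(psi, target):
    """-log of <t|p>^2 / (<t|t> <p|p>) for real vectors p = psi, t = target."""
    overlap = sum(t * p for t, p in zip(target, psi))
    norm_t = sum(t * t for t in target)
    norm_p = sum(p * p for p in psi)
    return -math.log(overlap * overlap / (norm_t * norm_p))

def psi_of(alpha):
    """Toy ansatz: amplitudes exp(alpha_k), one parameter per basis state."""
    return [math.exp(a) for a in alpha]

def gradient(loss, alpha, eps=1e-6):
    """Central finite-difference gradient of loss with respect to alpha."""
    grad = []
    for k in range(len(alpha)):
        up = alpha[:k] + [alpha[k] + eps] + alpha[k + 1:]
        down = alpha[:k] + [alpha[k] - eps] + alpha[k + 1:]
        grad.append((loss(up) - loss(down)) / (2 * eps))
    return grad

def sgd_step(alpha, grad, learning_rate):
    """Update rule of Eq. (2): alpha -> alpha - lambda * grad."""
    return [a - learning_rate * g for a, g in zip(alpha, grad)]

target = [0.8, 0.5, 0.3, 0.1]
loss = lambda alpha: neg_log_overlap(psi_of(alpha), target)

alpha = [0.0, 0.0, 0.0, 0.0]
initial_loss = loss(alpha)
for _ in range(500):
    alpha = sgd_step(alpha, gradient(loss, alpha), learning_rate=0.5)
final_loss = loss(alpha)
```

Since the loss is invariant under rescaling of the state, only the ratios between the optimized amplitudes are meaningful.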

#### II.2.4 Unsupervised learning

NetKet also supports unsupervised learning of unknown probability distributions, which in this context corresponds to quantum state tomography [PhysRevA.64.052312]. Given an unknown quantum state, a neural network can be trained on projective measurement data to discover an approximate reconstruction of the state [Torlai2018]. In NetKet, this functionality is contained within the unsupervised.Qsr class.

For some given target quantum state $|\Psi_{\mathrm{tar}}\rangle$, the training dataset $\mathcal{D}$ consists of a sequence of projective measurements $\sigma_b$ in different bases $b$, with underlying probability distribution $P(\sigma_b) = |\langle \sigma_b | \Psi_{\mathrm{tar}} \rangle|^2$. The quantum reconstruction of the target state translates into minimizing the statistical divergence between the distribution of the measurement outcomes and the distribution generated by the NQS. This corresponds, up to a constant dataset entropy contribution, to maximizing the log-likelihood of the network distribution over the measurement data

$$\mathcal{L} = \sum_{\sigma_b \in \mathcal{D}} \log \pi(\sigma_b), \tag{3}$$

where $\pi(\sigma)$ denotes the probability distribution

$$\pi(\sigma) = \frac{|\Psi_{\mathrm{NN}}(\sigma)|^2}{\sum_{\sigma'} |\Psi_{\mathrm{NN}}(\sigma')|^2} \tag{4}$$

generated by the NQS wavefunction.

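Note that, for every training sample where the measurement basis differs from the reference basis of the NQS, a unitary transformation should be applied to appropriately change the basis, $\Psi_{\mathrm{NN}}(\sigma_b) = \sum_{\sigma} \langle \sigma_b | \sigma \rangle \Psi_{\mathrm{NN}}(\sigma)$.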

The network parameters are updated according to the gradient of the log-likelihood $\mathcal{L}$. This can be computed analytically, and it requires expectation values over both the training data points $\mathcal{D}$ and the network distribution $\pi(\sigma)$. While the first is trivial to compute, the latter should be approximated by a Monte Carlo average over configurations sampled from a Markov chain.
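The two quantities entering this objective, the Born distribution generated by the wavefunction and the log-likelihood of a dataset, can be written down directly for a system small enough to enumerate. The sketch below is such a toy in plain Python. It assumes that all measurements are taken in the reference basis, so no basis rotation is applied, and it stores the wavefunction as a lookup table. The dataset and the names `born_distribution` and `log_likelihood` are hypothetical.

```python
import math
from itertools import product

def born_distribution(psi):
    """Eq. (4): pi(sigma) = |psi(sigma)|^2 / sum over sigma' of |psi(sigma')|^2."""
    weights = {sigma: abs(amp) ** 2 for sigma, amp in psi.items()}
    total = sum(weights.values())
    return {sigma: w / total for sigma, w in weights.items()}

def log_likelihood(psi, dataset):
    """Eq. (3), restricted to measurements taken in the reference basis."""
    pi = born_distribution(psi)
    return sum(math.log(pi[sigma]) for sigma in dataset)

# Unnormalized two-spin wavefunction stored as a table over configurations.
configs = list(product((1, -1), repeat=2))
psi = {sigma: 1.0 + 0.0j for sigma in configs}
psi[(1, 1)] = 2.0j  # make one configuration more likely than the others

dataset = [(1, 1), (1, 1), (-1, 1)]  # hypothetical measurement outcomes
ll = log_likelihood(psi, dataset)
```

For larger systems the normalization sum is not tractable, which is why the corresponding expectation values are estimated by Monte Carlo sampling as described above.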

#### II.2.5 Variational Monte Carlo

Finally, NetKet supports ground state searches for a given many-body quantum Hamiltonian $\hat{H}$. In this context, the task is to optimize the parameters of a variational wavefunction $\Psi$ in order to minimize the energy $\langle \hat{H} \rangle$. The variational.Vmc driver class contains the main logic to optimize a variational wavefunction given a Hamiltonian, a sampler, and an optimizer.

The energy of a wavefunction can be estimated as

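$$\begin{aligned}
\langle \hat{H} \rangle &= \frac{\sum_{\sigma,\sigma'} \Psi^*(\sigma) \langle \sigma | \hat{H} | \sigma' \rangle \Psi(\sigma')}{\sum_{\sigma} |\Psi(\sigma)|^2} \\
&= \frac{\sum_{\sigma} \left( \sum_{\sigma'} \langle \sigma | \hat{H} | \sigma' \rangle \frac{\Psi(\sigma')}{\Psi(\sigma)} \right) |\Psi(\sigma)|^2}{\sum_{\sigma'} |\Psi(\sigma')|^2} \\
&\approx \left\langle \sum_{\sigma'} \langle \sigma | \hat{H} | \sigma' \rangle \frac{\Psi(\sigma')}{\Psi(\sigma)} \right\rangle_{\sigma}
\end{aligned} \tag{5}$$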

where in the last line $\langle \cdot \rangle_{\sigma}$ denotes a stochastic expectation value taken over a sample of configurations drawn from the probability distribution corresponding to the variational wavefunction, Eq. (4). This sampling is performed by classes from the sampler module, which generate Markov chains of configurations using the Metropolis algorithm [Metropolis1953] to ensure detailed balance. Parallel tempering [Swendsen1986] options are also available to improve sampling efficiency.
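The stochastic estimate in the last line of Eq. (5) can be sketched for a system small enough that the exact sum is available for comparison. The example below is not NetKet code: it assumes a four-site transverse-field Ising ring, a one-parameter real toy wavefunction, single-spin-flip Metropolis updates, and no burn-in or thinning of the Markov chain. All names and parameter values are hypothetical.

```python
import math
import random
from itertools import product

N, J, H_FIELD, W = 4, 1.0, 1.0, 0.2  # ring size, couplings, ansatz parameter

def log_psi(sigma):
    """Toy real ansatz: log psi = W * sum_i sigma_i sigma_{i+1} on a ring."""
    return W * sum(sigma[i] * sigma[(i + 1) % N] for i in range(N))

def flip(sigma, i):
    return sigma[:i] + (-sigma[i],) + sigma[i + 1:]

def local_energy(sigma):
    """sum over sigma' of <sigma|H|sigma'> psi(sigma') / psi(sigma) for the
    transverse-field Ising model H = -J sum sz.sz - h sum sx."""
    diagonal = -J * sum(sigma[i] * sigma[(i + 1) % N] for i in range(N))
    off_diagonal = -H_FIELD * sum(
        math.exp(log_psi(flip(sigma, i)) - log_psi(sigma)) for i in range(N))
    return diagonal + off_diagonal

def metropolis_energy(n_samples, seed=0):
    """Estimate <H> from a single-spin-flip Metropolis chain sampling |psi|^2."""
    rng = random.Random(seed)
    sigma = tuple(rng.choice((1, -1)) for _ in range(N))
    total = 0.0
    for _ in range(n_samples):
        for _ in range(N):  # one sweep = N attempted flips
            proposal = flip(sigma, rng.randrange(N))
            # Acceptance probability min(1, |psi(proposal) / psi(sigma)|^2).
            log_ratio = 2.0 * (log_psi(proposal) - log_psi(sigma))
            if log_ratio >= 0 or rng.random() < math.exp(log_ratio):
                sigma = proposal
        total += local_energy(sigma)
    return total / n_samples

def exact_energy():
    """Reference value from the full sum over all 2**N configurations."""
    configs = list(product((1, -1), repeat=N))
    weights = [math.exp(2.0 * log_psi(s)) for s in configs]
    return sum(w * local_energy(s) for w, s in zip(weights, configs)) / sum(weights)

e_mc = metropolis_energy(n_samples=5000)
e_exact = exact_energy()
```

The Monte Carlo estimate agrees with the exact sum only up to statistical error, which shrinks as the number of samples grows.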

In order to optimize the parameters of a machine to minimize the energy, a gradient-based optimization scheme can be applied as discussed in the previous section. The energy gradient can be estimated at the same time as $\langle \hat{H} \rangle$ [Becca2017; carleo_solving_2017]. This requires computing the partial derivatives of the wavefunction with respect to the variational parameters, which can be obtained analytically for the RBM [carleo_solving_2017] or via backpropagation [LeCun2015; LeCun2012; Goodfellow2016] for multi-layer FFNNs. In this case, the steepest descent update according to Eq. (2) is also a form of SGD, because the energy is estimated using a subset of the full data available from the variational wavefunction. Alternatively, more stable convergence can often be achieved by using the stochastic reconfiguration (SR) method [Sorella_SR; casula], which approximates the imaginary time evolution of the system on the submanifold of variational states. The SR approach is closely related to the natural gradient descent method used in machine learning [Amari1998]. In the NetKet implementation, SR is performed using either an exact or an iterative linear solver, the latter being recommended when the number of variational parameters is large.
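A single SR step with an iterative solver can be sketched as follows. This is an illustrative toy, not NetKet's implementation. It assumes real parameters, takes the sampled log-derivatives of the wavefunction and the local energies as given lists, adds a small diagonal shift to the covariance matrix for stability, absorbs constant prefactors of the gradient into the learning rate, and solves the linear system with a hand-written conjugate gradient routine. All names and sample values are hypothetical.

```python
def mean(values):
    return sum(values) / len(values)

def sr_matrices(log_derivs, local_energies):
    """Covariance matrix S and force vector F from real log-derivative samples.

    S[k][l] = <O_k O_l> - <O_k><O_l>,  F[k] = <O_k E_loc> - <O_k><E_loc>.
    """
    n_par = len(log_derivs[0])
    o_mean = [mean([o[k] for o in log_derivs]) for k in range(n_par)]
    e_mean = mean(local_energies)
    S = [[mean([o[k] * o[l] for o in log_derivs]) - o_mean[k] * o_mean[l]
          for l in range(n_par)] for k in range(n_par)]
    F = [mean([o[k] * e for o, e in zip(log_derivs, local_energies)])
         - o_mean[k] * e_mean for k in range(n_par)]
    return S, F

def conjugate_gradient(matvec, rhs, tol=1e-12, max_iter=100):
    """Iteratively solve A x = rhs for a symmetric positive-definite A."""
    x = [0.0] * len(rhs)
    r = list(rhs)
    p = list(r)
    rr = sum(ri * ri for ri in r)
    for _ in range(max_iter):
        if rr < tol:
            break
        Ap = matvec(p)
        step = rr / sum(pi * api for pi, api in zip(p, Ap))
        x = [xi + step * pi for xi, pi in zip(x, p)]
        r = [ri - step * api for ri, api in zip(r, Ap)]
        rr_new = sum(ri * ri for ri in r)
        p = [ri + (rr_new / rr) * pi for ri, pi in zip(r, p)]
        rr = rr_new
    return x

def sr_update(alpha, log_derivs, local_energies, learning_rate=0.1, shift=1e-3):
    """alpha -> alpha - lambda * (S + shift * I)^(-1) F."""
    S, F = sr_matrices(log_derivs, local_energies)
    matvec = lambda v: [sum(S[k][l] * v[l] for l in range(len(v))) + shift * v[k]
                        for k in range(len(v))]
    delta = conjugate_gradient(matvec, F)
    return [a - learning_rate * d for a, d in zip(alpha, delta)], delta, S, F

# Hypothetical samples: 4 configurations, 2 variational parameters.
O = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [0.0, 0.0]]
E = [-1.0, -2.0, -3.5, 0.0]
new_alpha, delta, S, F = sr_update([0.0, 0.0], O, E)
```

The iterative solver only needs matrix-vector products with the covariance matrix, which is what makes it attractive when the number of parameters is large.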

Information on the optimization run (sampler acceptance rates, energy, energy variance, expectation of additional observables, and the current variational parameters) for each iteration can be written to a log file in JSON format. Alternatively, this information can be accessed directly inside the simulation loop in Python to allow for more flexible output.

## III Illustrative examples

NetKet is available as a Python package and can be obtained from the Python package index (PyPI) [PyPI]. Assuming a properly configured Python environment, NetKet can be installed via the shell command

    pip install netket


which will download, compile, and install the package. A working MPI environment is required to run NetKet. If multiple MPI installations are present on the system, we recommend running the installation command as follows to avoid potential conflicts:

    CC=mpicc CXX=mpicxx pip install netket


with the desired MPI environment loaded in order to perform the build with the correct compiler. After a successful installation, the NetKet module can be imported in Python scripts.

As an alternative to a local installation, NetKet uses the BinderHub deployment at mybinder.org [Jupyter2018] to build and deploy a stable version of the software, which can be found at https://mybinder.org/v2/gh/netket/netket/master. This allows users to run the tutorials or other small jobs without installing NetKet.

### III.1 One-dimensional Heisenberg model

As a first example, we present a Python script for obtaining a variational RBM representation of the ground state of the spin-1/2 Heisenberg model on a one-dimensional chain with periodic boundary conditions. The code for this example is shown in Listing 1. Figure 2 shows the evolution of the energy expectation value over the course of the optimization run. We see that for a small chain of 20 sites and an RBM with 20 hidden units, the energy converges to a relative error of the order within about 100 iteration steps.

### III.2 Supervised learning

As a second example, we use the supervised learning module in NetKet to optimize an RBM to represent the ground state of the transverse field Ising model. The example script is shown in Listing 2. The exact ground state wavefunction is first obtained by exact diagonalization and then used for training the RBM state by minimizing the overlap loss (1). Figure 3 shows the evolution of the overlap over the training iterations.

## IV Impact

Given the flexibility of NetKet, we envision several potential applications of this library both in data-driven experimental research and in more theoretical, problem-driven research on interacting quantum many-body systems. For example, several important theoretical and practical questions concerning the expressibility of NQS, the learnability of experimental quantum states, and the efficiency at finding ground states of $k$-local Hamiltonians can be directly addressed using the current functionality of the software.

Moreover, having an easy-to-extend set of tools to work with NQS-based applications can propel future research in the field, without researchers having to pay a significant cost of entry in terms of algorithm implementation and testing. Since its early release in April 2017, NetKet has already been used for research purposes by several groups worldwide. We also hope that, building upon a common set of tools, practices like publishing accompanying codes to research papers, largely popular in the ML community, can become standard practice also for ML applications in quantum physics.

Finally, for a fast-growing community like ML for quantum science, it is also crucial to have pedagogical tools available that can be conveniently used by new generations of students and researchers. Benefiting from a growing set of tutorials and step-by-step explanations, NetKet can be comfortably used in schools and lectures.

## V Conclusions and future directions

We have introduced NetKet, a comprehensive open source framework for the study of many-body quantum systems using machine learning techniques. Central to this framework are variational parameterizations of many-body wavefunctions in the form of artificial neural networks. NetKet is a Python framework implemented in C++11, designed with efficiency as well as ease of use in mind. Several examples, tutorials, and notebooks are provided with our software in order to reduce the learning curve for newcomers.

The NetKet project is meant to continuously evolve in future releases, welcoming suggestions and contributions from its users. For example, future versions may provide a natural interface with general ML frameworks such as PyTorch [paszke2017automatic] and TensorFlow [tensorflow2015-whitepaper]. On the algorithmic side, future goals include the extension of NetKet to incorporate unitary dynamics [carleo_localization_2012; jonsson_neural-network_2018] as well as support for neural density matrices [torlai_latent_2018].

## Acknowledgements

We acknowledge support from the Flatiron Institute of the Simons Foundation. J.E.T.S. gratefully acknowledges support from a fellowship through The Molecular Sciences Software Institute under NSF Grant ACI1547580. H.T. is supported by a grant from the Fondation CFM pour la Recherche. S.E. and I.G. are supported by an ERC Advanced Grant QENOCOBA under the EU Horizon2020 program (grant agreement 742102) and the German Research Foundation (DFG) under Germany’s Excellence Strategy through Project No. EXC-2111 - 390814868 (MCQST). This project makes use of other open source software, namely pybind11 [pybind11], Eigen [eigen3], ALPS IETL [Bauer2011; Albuquerque2007], nlohmann/json [nlohmann-json], and NumPy [numpy]. We further acknowledge discussions with, as well as bug reports, comments, and more from S. Arnold, A. Booth, A. Borin, J. Carrasquilla, S. Lederer, Y. Levine, T. Neupert, O. Parcollet, A. Rubio, M. A. Sentef, O. Sharir, M. Stoudenmire, and N. Wies.

## References
