# Som: Implementation of the Stochastic Optimization Method for Analytic Continuation

Igor Krivenko and Malte Harland
I. Institut für Theoretische Physik, Universität Hamburg, Jungiusstraße 9, 20355 Hamburg, Germany
###### Abstract

We present the SOM analytic continuation package, an efficient implementation of the Stochastic Optimization Method proposed by A. Mishchenko et al. SOM strives to provide a high quality open source (distributed under the GNU General Public License version 3) alternative to the more widely adopted Maximum Entropy continuation programs. It supports a variety of analytic continuation problems encountered in the field of computational condensed matter physics. Those problems can be formulated in terms of response functions of imaginary time, Matsubara frequencies or in the Legendre polynomial basis representation. The application is based on the TRIQS C++/Python framework, which allows for easy interoperability with TRIQS-based quantum impurity solvers, electronic band structure codes and visualization tools. Similar to other TRIQS packages, it comes with a convenient Python interface.

journal: Computer Physics Communications

PROGRAM SUMMARY

Program Title: SOM
Operating system: Unix, Linux, macOS
Programming language: C++/Python
Computers: any architecture with suitable compilers including PCs and clusters
RAM: Highly problem-dependent
Licensing provisions: GNU General Public License (GPLv3)
Classification: 4.8, 4.9, 4.13, 7.3
Keywords: Many-body physics, Quantum Monte Carlo, Analytic continuation, Stochastic optimization, C++, Python
External routines/libraries: TRIQS 1.4.2 TRIQS (), Boost >= 1.58, CMake.
Nature of problem: Quantum Monte Carlo (QMC) methods are powerful numerical techniques widely used to study quantum many-body effects and the electronic structure of correlated materials. Obtaining physically relevant spectral functions from noisy QMC results requires solving an ill-posed analytic continuation problem as a post-processing step.
Solution method: We present an efficient C++/Python open-source implementation of the stochastic optimization method for analytic continuation.

## 1 Introduction

Quantum Monte Carlo (QMC) methods Gull11RMP () are one of the most versatile classes of numerical tools used in quantum many-body theory and adjacent fields. When applied in combination with the dynamical mean-field theory (DMFT) VollhardtMetzner (); DMFT (); DMFTReview () and an electronic structure method, such as density functional theory (DFT), they also make it possible to assess diverse effects of electronic correlations in real materials Kotliar2006 ().

In spite of considerable progress in the development of real time quantum Monte Carlo solvers Muehlbacher08 (); Werner09 (); RealTimeCTAUX (); Chen17A (); Chen17B (), QMC methods formulated in imaginary time Rubtsov2005 (); Werner2006 (); Gull2008 (); Otsuki2007 (); Prokofev2007 (); Prokofev2008 (); Blankenbecler1981 () remain the more practical choice in most situations. Such algorithms have a fundamental limitation: they calculate dynamical response functions in the imaginary time/Matsubara frequency representation, or in a basis of orthogonal polynomials in more recent implementations Boehnke2011 (). Getting access to the spectral functions that are directly observable in experiment requires solving the infamous analytic continuation problem. The precise mathematical statement of the problem varies with the response function in question and the representation chosen to measure it. However, all formulations amount to the solution of an inhomogeneous Fredholm integral equation of the first kind with respect to the spectral function, with an approximately known right hand side.

Such equations are known to be ill-posed in the sense of Hadamard, i.e. neither existence nor uniqueness of their solutions is guaranteed. Moreover, the solution is unstable with respect to changes in the right hand side (RHS). This last feature makes the numerical treatment of the problem particularly hard, as the RHS is known only up to the stochastic QMC noise. There is little hope that a general solution of the problem of numerical analytic continuation can be attained. Nonetheless, the ongoing effort to develop useful continuation tools has given rise to dozens of diverse methods.

One of the earliest continuation methods, which is still in use, is based on the construction of Padé approximants of the input data Vidberg1977 (); Zhu2002 (); Beach2000 (). Its applicability is largely limited to cases with a high signal-to-noise ratio of the input.

Variations of the Maximum Entropy Method (MEM) currently represent the de facto standard among tools for numerical analytic continuation Jarrell1996 (); Bryan1990 (). While not computationally demanding, they yield reasonable results for realistic input noise levels. There are high quality open source implementations of MEM available to the community, such as Maxent Bergeron2016 () and a package based on the ALPSCore library Levy2017 (). The typical criticism of the MEM targets its bias towards the specified default model. Besides, MEM results are prone to the so-called saw-tooth instability Silver1990 ().

Another group of approaches, collectively dubbed stochastic continuation methods, exploits the idea of random sampling of the ensemble of possible solutions (spectral functions). The method of A. Sandvik is one of the most well-known members of this family Sandvik1998 (). It can be shown that the MEM is equivalent to the saddle-point approximation of the statistical ensemble considered by Sandvik Beach2004 ().

The Stochastic Optimization Method (SOM) Mishchenko2000 (); Goulko2017 (), also referred to as the Mishchenko method, is a stochastic algorithm that demonstrates good performance in practical tests Schott2016 (). It is bias-free in the sense that it favours neither solutions close to a specific default model, nor those possessing special properties beyond the standard non-negativity and normalization requirements. It also features a continuous-energy representation of solutions, which is more natural than the traditional projection onto a fixed energy mesh. The lack of a high quality open source implementation of such a well-established and successful method urged us to write the program package called SOM. Besides, there had been no good analytic continuation tool compatible with the otherwise powerful Toolbox for Research on Interacting Quantum Systems (TRIQS) TRIQS (), so we decided to fill this gap.

There are other continuation methods that are worth mentioning. Among these are the average spectrum method Syljuasen2008 (), the method of consistent constraints (MCC) Prokofev2013 (), stochastic methods based on the Bayesian inference Vafayi2007 (); Fuchs2010 (), the Stochastic Optimization with Consistent Constraints protocol (SOCC, a hybrid between SOM and MCC that allows for characterization of the result accuracy) Goulko2017 (), and finally the recent and promising sparse modelling approach Otsuki2017 ().

This paper is organized as follows. In Section 2 we recapitulate the Stochastic Optimization Method. Section 3 gives a brief description of the performance optimizations devised in our implementation of SOM. We proceed in Section 4 with instructions on how to use the program. Section 5 contains some illustrative results of analytic continuation with SOM. Practical information on how to obtain and install the program is given in Section 6. Section 7 summarizes and concludes the paper. Some technical details, which could be of interest to authors of similar codes, can be found in the appendices.

## 2 Stochastic Optimization Method

We will now briefly formulate the analytic continuation problem, and summarize Mishchenko’s algorithm Pavarini2012 () as implemented in SOM.

### 2.1 Formulation of problem

Given a function $O(x)$ (referred to as the observable throughout the text), whose numerical values are approximately known at a discrete set of points $\{x_m\}$, $m = 1, \ldots, M$, we seek to solve a Fredholm integral equation of the first kind,

$$O(x_m) = \int_{\epsilon_{\min}}^{\epsilon_{\max}} d\epsilon\, K(x_m,\epsilon)\, A(\epsilon), \qquad (1)$$

with respect to the spectral function $A(\epsilon)$ defined on the interval $[\epsilon_{\min}; \epsilon_{\max}]$. The observable is usually obtained as a result of QMC accumulation, and is therefore known with some uncertainty. It is also assumed that the values of $O(x_m)$ for different $m$ are uncorrelated. The domain of definition of $O$, as well as the explicit form of the integral kernel $K(x,\epsilon)$, vary between different kinds of observables and their representations (choices of $x$). The sought spectral function is required to be smooth and non-negative within its domain, and is subject to the normalization condition

$$\int_{\epsilon_{\min}}^{\epsilon_{\max}} d\epsilon\, A(\epsilon) = \mathcal{N}, \qquad (2)$$

where $\mathcal{N}$ is a known normalization constant.

Currently, SOM supports four kinds of observables. Each of them can be given as a function of imaginary time $\tau$ at points of a regular grid on $[0;\beta]$ ($\beta$ is the inverse temperature), as a set of Matsubara components, or as coefficients of the Legendre polynomial expansion Boehnke2011 (). This amounts to a total of 12 special cases of equation (1). Here is a list of all supported observables.

1. Finite temperature Green’s function of fermions, $G(\tau) = -\langle \mathcal{T}_\tau c(\tau) c^\dagger(0)\rangle$.

$G(\tau)$ must fulfil the anti-periodicity condition $G(\tau+\beta) = -G(\tau)$. Off-diagonal matrix elements are not supported, since they are not in general representable in terms of a non-negative $A(\epsilon)$. It is possible to analytically continue similar anti-periodic functions, such as the fermionic self-energy, provided that the zeroth spectral moment is known a priori (for an example of how it can be computed, see Potthoff1997 ()).

2. Finite temperature correlation function of boson-like operators $\hat O$ and $\hat O^\dagger$, $\chi(\tau) = \langle \mathcal{T}_\tau \hat O(\tau) \hat O^\dagger(0)\rangle$.

$\chi(\tau)$ must be $\beta$-periodic, $\chi(\tau+\beta) = \chi(\tau)$. Typical examples of such functions are Green’s functions of bosons and the transverse spin susceptibility $\chi_{+-}(\tau)$.

3. Autocorrelator of a Hermitian operator.

The most widely used observables of this kind are the longitudinal spin susceptibility $\chi_{zz}(\tau)$ and the charge susceptibility. This is a special case of the previous observable kind with $\hat O^\dagger = \hat O$, and its use is in general preferred due to the reduced definition domain (see below).

4. Correlation function of operators at zero temperature.

Strictly speaking, the imaginary time argument of a correlation function is allowed to grow to positive infinity in the zero temperature limit $\beta\to\infty$. Therefore, one has to choose a fixed cutoff $\tau_{\max}$ up to which the input data is available. Similarly, in the zero temperature limit the spacing between Matsubara frequencies vanishes. Neither the periodicity nor the antiperiodicity property makes sense at zero temperature, so one can opt to use fictitious Matsubara frequencies with either of the two possible statistics and with a spacing determined by the cutoff $\tau_{\max}$.

The spectral function is defined on $[\epsilon_{\min}; \epsilon_{\max}]$ for observables (1)-(2), and on $[0; \epsilon_{\max}]$ for observables (3)-(4).

The Matsubara representation of an imaginary time function is assumed to be

$$F(iz_n) = \int_0^{\tilde\beta} d\tau\, e^{iz_n\tau} F(\tau), \qquad (3)$$

with $\tilde\beta = \beta$ in the cases (1)-(3) and $\tilde\beta = \tau_{\max}$ in the case (4). $z_n$ are fermionic Matsubara frequencies in the case (1), bosonic Matsubara frequencies in the cases (2)-(3), and fictitious frequencies (of either statistics) in the case (4). The Legendre polynomial representation is similar for all four observables,

$$F_\ell = \sqrt{2\ell+1}\int_0^{\tilde\beta} d\tau\, P_\ell[x(\tau)]\, F(\tau), \qquad x(\tau) = 2\tau/\tilde\beta - 1. \qquad (4)$$

Supported observables and their respective integral kernels are summarized in Table 1.

### 2.2 Method

The main idea of the Stochastic Optimization Method is to use Markov chain sampling to minimize an objective function

$$D[A] = \sum_{m=1}^{M} |\Delta(m)|, \qquad \Delta(m) = \frac{1}{S(m)}\left[\int_{\epsilon_{\min}}^{\epsilon_{\max}} d\epsilon\, K(x_m,\epsilon)\, A(\epsilon) - O(x_m)\right], \qquad (5)$$

with respect to the spectrum $A(\epsilon)$. $\Delta(m)$ is the deviation function for the $m$-th component of the observable. The weight factors $S(m)$ can be set to QMC error-bar estimates (if available) to stress the importance of some input data points $O(x_m)$. In contrast to most other analytic continuation methods, $S(m)$ should not, however, be identified with the error-bars. Multiplication of all $S(m)$ by a common scalar would result in an equivalent objective function, i.e. only the relative magnitudes of different components of $S$ matter.

The integral equation (1) is not guaranteed to have a unique solution, which means $D[A]$ has in general infinitely many minima. Most of these minima correspond to non-smooth spectra containing strong sawtooth noise. SOM addresses the issue of the sawtooth instability in the following way. At first, it generates $L$ particular solutions $\tilde A_j(\epsilon)$ using a stochastic optimization procedure. Each generated solution is characterized by a value of the objective function $D[\tilde A_j]$. A subset of relevant, or “good” solutions is then isolated by picking only those that satisfy $D[\tilde A_j] \le \alpha_{\mathrm{good}}\min\{D[\tilde A_j]\}$. Here $\min\{D[\tilde A_j]\}$ is the absolute minimum among all computed particular solutions, and $\alpha_{\mathrm{good}}$ is a deviation threshold factor (usually set to 2). The final solution is constructed as a simple average of all good solutions, and has the sawtooth noise approximately cancelled out from it. In mathematical form, this procedure summarizes as

$$A(\epsilon) = \frac{1}{L_{\mathrm{good}}}\sum_{j=1}^{L}\theta\left(\alpha_{\mathrm{good}}\min\{D[\tilde A_j]\} - D[\tilde A_j]\right)\tilde A_j(\epsilon), \qquad (6)$$
$$L_{\mathrm{good}} = \sum_{j=1}^{L}\theta\left(\alpha_{\mathrm{good}}\min\{D[\tilde A_j]\} - D[\tilde A_j]\right).$$
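The selection-and-averaging step of Eq. (6) can be sketched in a few lines of numpy. This is an illustration only, not part of the SOM API; the function and variable names are ours:

```python
import numpy as np

def average_good_solutions(solutions, deviations, alpha_good=2.0):
    """Average the particular solutions whose objective function value
    D[A_j] does not exceed alpha_good * min_j D[A_j], cf. Eq. (6)."""
    solutions = np.asarray(solutions)
    deviations = np.asarray(deviations)
    good = deviations <= alpha_good * deviations.min()  # theta-functions in Eq. (6)
    return solutions[good].mean(axis=0)                 # (1/L_good) * sum over good A_j

# Three toy "particular solutions" on a two-point energy grid;
# the third one fits poorly (D = 10) and is excluded by the threshold.
A = [[1.0, 2.0], [1.2, 1.8], [5.0, 5.0]]
D = [1.0, 1.5, 10.0]
print(average_good_solutions(A, D))  # -> [1.1 1.9]
```

In the real algorithm the particular solutions are rectangle configurations rather than grid-sampled curves, but the thresholding logic is the same.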

SOM features a special way to parametrize spectra as a sum of possibly overlapping rectangles with positive weights.

$$\tilde A(\epsilon) = \sum_{k=1}^{K} R_{\{c_k,w_k,h_k\}}(\epsilon), \qquad (7)$$
$$R_{\{c_k,w_k,h_k\}}(\epsilon) = h_k\,\theta(\epsilon-(c_k-w_k/2))\,\theta((c_k+w_k/2)-\epsilon). \qquad (8)$$

Here $c_k$, $w_k$ and $h_k$ stand for the centre, width and height of the $k$-th rectangle, respectively. The positivity constraint is enforced by requiring $h_k > 0$ and $w_k > 0$ for all $k$, while the normalization condition (2) translates into

$$\sum_{k=1}^{K} h_k w_k = \mathcal{N}. \qquad (9)$$

Sums of rectangles offer more versatility as compared to fixed energy grids and sums of delta-peaks (as in Beach2004 ()). They are well suited to represent both large scale structures and narrow features of the spectra.
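The rectangle parametrization of Eqs. (7)-(9) is straightforward to evaluate directly; a numpy sketch with our own function names (not SOM code):

```python
import numpy as np

def spectrum(eps, rects):
    """A~(eps) = sum_k R_{c_k,w_k,h_k}(eps), Eqs. (7)-(8).
    rects is an iterable of (centre, width, height) triples."""
    A = np.zeros_like(eps, dtype=float)
    for c, w, h in rects:
        # Product of the two theta-functions: 1 inside the rectangle, 0 outside.
        A += h * ((eps >= c - w / 2) & (eps <= c + w / 2))
    return A

def norm(rects):
    """Total spectral weight sum_k h_k w_k, Eq. (9)."""
    return sum(h * w for _, w, h in rects)

# A two-rectangle configuration normalized to N = 1.
rects = [(-1.0, 2.0, 0.3), (1.5, 1.0, 0.4)]
eps = np.linspace(-3.0, 3.0, 601)
assert abs(norm(rects) - 1.0) < 1e-12
assert spectrum(np.array([1.5]), rects)[0] == 0.4
```

Overlapping rectangles simply add their heights, which is why the parametrization can build up both broad backgrounds and sharp peaks.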

The optimization procedure used to generate each particular solution is organized as follows. In the beginning, a random configuration (sum of rectangles) is generated and height-normalized to fulfil Eq. (9). Then a fixed number $F$ of global updates is performed. Each global update is a short Markov chain of $T$ elementary updates governed by the Metropolis-Hastings algorithm Metropolis1953 (); Hastings1970 () with acceptance probability

$$P_{t\to t+1} = \begin{cases} 1, & D[C_{t+1}] \le D[C_t], \\ \left(D[C_t]/D[C_{t+1}]\right)^{d+1}, & D[C_{t+1}] > D[C_t]. \end{cases} \qquad (10)$$
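In code, the acceptance rule (10) is just a few lines (a sketch; the parameter names are ours):

```python
import random

def acceptance_probability(D_old, D_new, d):
    """Metropolis-Hastings acceptance probability of Eq. (10)."""
    if D_new <= D_old:
        return 1.0                      # improvements are always accepted
    return (D_old / D_new) ** (1 + d)   # uphill moves suppressed by exponent d+1

def accept(D_old, D_new, d, u=random.random):
    """Accept the proposed elementary update with probability (10)."""
    return u() < acceptance_probability(D_old, D_new, d)

assert acceptance_probability(2.0, 1.0, d=1.0) == 1.0
assert acceptance_probability(1.0, 2.0, d=1.0) == 0.25  # (1/2)**2
```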

The global update as a whole is only accepted if it has decreased the objective function. SOM implements all seven types of elementary updates proposed by Mishchenko et al.

1. Shift of rectangle;

2. Change of width without change of weight;

3. Change of weight of two rectangles preserving their total weight;

4. Adding a new rectangle while reducing weight of another one;

5. Removing a rectangle while increasing weight of another one;

6. Splitting a rectangle;

7. Glueing two rectangles.

To further improve ergodicity, the Markov chain of elementary updates is split into two stages. During the first $T_1$ elementary updates, large fluctuations of the deviation measure are allowed by setting the exponent $d$ in Eq. (10) to a value $d_1 \le 1$. $d$ is then changed to $d_2 > 1$ for the rest of the Markov chain, so that chances to escape a local minimum are strongly suppressed during the second stage. $T_1$, $d_1$ and $d_2$ are chosen randomly anew at the beginning of each global update. The introduction of global updates helps accelerate convergence towards a deep minimum of the objective function $D$, whereas the MC evolution within each global update ensures the possibility to escape a shallow local minimum.
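The two-stage global update described above might be sketched as follows. This is our own simplified pseudo-implementation with hypothetical names; the real code operates on rectangle configurations and randomizes the stage parameters:

```python
import random

def global_update(config, D, propose, T, T1, d1, d2):
    """One global update: a Markov chain of T elementary updates.
    The first T1 steps use a 'hot' exponent d1 <= 1, the rest a 'cold'
    d2 > 1, mimicking the two-stage scheme in the text. The whole global
    update is kept only if it decreased the objective function D."""
    D_start = D(config)
    current = config
    for t in range(T):
        d = d1 if t < T1 else d2
        candidate = propose(current)
        D_cur, D_new = D(current), D(candidate)
        # Metropolis-Hastings acceptance, Eq. (10).
        if D_new <= D_cur or random.random() < (D_cur / D_new) ** (1 + d):
            current = candidate
    return current if D(current) < D_start else config

# Toy usage: the 'configuration' is a number, D is its magnitude,
# and every proposal halves it, so each elementary step is accepted.
assert global_update(8.0, abs, lambda x: x / 2, T=3, T1=1, d1=0.5, d2=2.0) == 1.0
```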

An insufficient number of global updates may pose a serious problem and result in particular solutions that fit (1) poorly. Mishchenko suggested a relatively cheap way to check whether a given $F$ is large enough to guarantee good fit quality (see Appendix F for a detailed description of this recipe for real- and complex-valued observables). Although SOM can optionally try to automatically adjust $F$, this feature is switched off by default. In practical calculations, we have found that the $F$-adjustment procedure may fail to converge for input data with a low noise level.

Another important calculation parameter is the number $L$ of particular solutions to be accumulated. A larger $L$ leads to a smoother final solution, but the computation time grows linearly with $L$. Setting $L$ manually to a value in the range from a few hundred to a few thousand gives reasonably smooth spectral functions in most cases. Nonetheless, SOM can be requested to dynamically increase $L$ until a special convergence criterion is met. This criterion can be formulated in terms of a frequency distribution (histogram) of the objective function values $D[\tilde A_j]$. $L$ is considered sufficient if the histogram is peaked near $D_{\min} \equiv \min\{D[\tilde A_j]\}$. In addition to $\alpha_{\mathrm{good}}$ we define $\alpha_{\mathrm{verygood}}$ (4/3 by default), and count those good solutions which also satisfy $D[\tilde A_j] \le \alpha_{\mathrm{verygood}} D_{\min}$. If nearly all good solutions (95% or more) are very good, the histogram is well localized in the vicinity of $D_{\min}$, and all solutions included in the final answer provide a good fit. It is worth noting, however, that a simple increase of $L$ does not in general guarantee localization of the histogram.

## 3 Performance optimizations

The most computationally intensive part of the algorithm is the evaluation of the integral in (1) for a given configuration (set of rectangles),

$$\int_{\epsilon_{\min}}^{\epsilon_{\max}} d\epsilon\, K(x_m,\epsilon)\,\tilde A(\epsilon) = \sum_{k=1}^{K}\int_{\epsilon_{\min}}^{\epsilon_{\max}} d\epsilon\, K(x_m,\epsilon)\, R_{\{c_k,w_k,h_k\}}(\epsilon) = \sum_{k=1}^{K} h_k\left[\Lambda(x_m, c_k+w_k/2) - \Lambda(x_m, c_k-w_k/2)\right]. \qquad (11)$$
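Eq. (11) reduces the integral for any configuration to differences of the integrated kernel at rectangle edges; schematically (our own sketch, with the integrated kernel passed in as a callable):

```python
def kernel_integral(Lam, x_m, rects):
    """Right hand side of Eq. (11):
    sum_k h_k * [Lam(x_m, c_k + w_k/2) - Lam(x_m, c_k - w_k/2)],
    where Lam is the integrated kernel (antiderivative of K in eps)
    and rects is a list of (centre, width, height) triples."""
    return sum(h * (Lam(x_m, c + w / 2) - Lam(x_m, c - w / 2))
               for c, w, h in rects)

# Sanity check with a constant kernel K = 1, i.e. Lam(x, e) = e:
# the integral must equal the total weight sum_k h_k * w_k.
rects = [(0.0, 2.0, 0.5), (1.0, 1.0, 1.0)]
assert kernel_integral(lambda x, e: e, 0, rects) == 0.5 * 2.0 + 1.0 * 1.0
```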

In the Matsubara representation, the integrated kernels $\Lambda(x_m, \Omega)$ are simple analytic functions (see Appendix B). For the imaginary-time and Legendre representations, however, there are no closed form expressions for $\Lambda$. Doing the integrals with a numerical quadrature method for each rectangle and upon each proposed elementary update would be prohibitively slow. SOM features two major optimizations that strongly alleviate the problem.

First, it uses pre-computation and special interpolation schemes to quickly evaluate $\Lambda$. As an example, let us consider the kernel for the fermionic Green’s function in imaginary time,

$$\Lambda(\tau;\Omega) = -\frac{1}{\beta}\int_0^{\beta\Omega}\frac{e^{-(\tau/\beta)x}}{1+e^{-x}}\,dx. \qquad (12)$$

The integral on the right hand side is analytically doable only for a few values of $\tau$, namely $\tau = 0, \beta/2, \beta$. For all other $\tau$ it has to be done numerically and approximated using a cubic spline interpolation. The spline interpolation would be easy to construct if the integrand were localized near $x = 0$ for all $\tau$. Unfortunately, this is not the case. The integrand develops a long “tail” on the positive/negative half-axis as $\alpha \equiv \tau/\beta$ approaches 0/1, respectively. The length of this tail scales as $1/\alpha$ (or $1/(1-\alpha)$), which would require an excessively large number of spline knots to construct a reliable approximation. This issue is solved by subtracting an exponential tail contribution $t_\alpha(x)$ from the integrand, such that the difference is well localized, while the integral of $t_\alpha(x)$ itself is elementary (Fig. 1).

$$\Lambda(\tau;\Omega) = -\frac{1}{\beta}\left[\theta(-\Omega)\,S^-_\alpha(\beta\Omega) + \theta(\Omega)\,S^+_\alpha(\beta\Omega) + T_\alpha(\beta\Omega)\right], \qquad (13)$$
$$S^-_\alpha(z) = -\int_z^0\left[\frac{e^{-\alpha x}}{1+e^{-x}} - t_\alpha(x)\right]dx, \qquad S^+_\alpha(z) = \int_0^z\left[\frac{e^{-\alpha x}}{1+e^{-x}} - t_\alpha(x)\right]dx, \qquad (14)$$
$$T_\alpha(z) = \int_0^z t_\alpha(x)\,dx. \qquad (15)$$

For each $\alpha$, the integrals $S^\pm_\alpha(z)$ are expressed in terms of special functions or rapidly convergent series and are precomputed on a uniform grid with a fixed number of $z$-points. The precomputed values provide the data to construct the cubic spline interpolation between those points, while $T_\alpha(z)$ is a simple analytical function that is quick to evaluate. The length of the segment on which the splines are defined is chosen such that the functions can be considered constant outside it. Appendix C contains relevant expressions and necessary implementation details for all observable kinds.

A somewhat similar approach is taken to evaluate the integrated Legendre kernels. Once again, let us take the fermionic Green’s function as an example,

$$\Lambda(\ell;\Omega) = \int_0^{\Omega} d\epsilon\,\frac{\beta\sqrt{2\ell+1}\,(-\operatorname{sgn}(\epsilon))^{\ell}\, i_\ell(\beta|\epsilon|/2)}{2\cosh(\beta\epsilon/2)} = (-\operatorname{sgn}(\Omega))^{\ell+1}\sqrt{2\ell+1}\int_0^{|\Omega|\beta/2}\frac{i_\ell(x)}{\cosh(x)}\,dx, \qquad (16)$$

where $i_\ell(x)$ is the modified spherical Bessel function of the first kind spherical_bessel (). The integrand is rather inconvenient for numerical evaluation, because both the Bessel function and the hyperbolic cosine grow exponentially. Moreover, the integral itself grows logarithmically with $|\Omega|$, which makes naive pre-computation and construction of a spline infeasible. This time, our evaluation scheme is based on the following expansion of the integrand,

$$\frac{i_\ell(x)}{\cosh(x)} = \frac{e^x}{e^x+e^{-x}}\sum_{n=0}^{\ell}\frac{(-1)^n a_n(\ell+1/2)}{x^{n+1}} + \frac{e^{-x}}{e^x+e^{-x}}\sum_{n=0}^{\ell}\frac{(-1)^{\ell+1} a_n(\ell+1/2)}{x^{n+1}}, \qquad (17)$$

with $a_n(\ell+1/2)$ being coefficients of the Bessel polynomials carlitz1957 (),

$$a_n(\ell+1/2) = \frac{(\ell+n)!}{2^n\, n!\, (\ell-n)!}. \qquad (18)$$
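The expansion (17)-(18) is easy to verify numerically. A small sketch with our own helper names, checking it against the closed form $i_1(x) = \cosh(x)/x - \sinh(x)/x^2$:

```python
from math import cosh, sinh, exp, factorial

def a(n, l):
    """Bessel polynomial coefficient a_n(l + 1/2), Eq. (18)."""
    return factorial(l + n) // (2 ** n * factorial(n) * factorial(l - n))

def i_over_cosh(l, x):
    """Right hand side of the expansion (17)."""
    s = sum((-1) ** n * a(n, l) / x ** (n + 1) for n in range(l + 1))
    t = (-1) ** (l + 1) * sum(a(n, l) / x ** (n + 1) for n in range(l + 1))
    ep, em = exp(x), exp(-x)
    return (ep * s + em * t) / (ep + em)

# Compare with the exact i_1(x)/cosh(x) at x = 2.5.
x = 2.5
exact = (cosh(x) / x - sinh(x) / x ** 2) / cosh(x)
assert abs(i_over_cosh(1, x) - exact) < 1e-12
```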

For large $x$ (high-energy region), the integrand can be approximated as

$$\frac{i_\ell(x)}{\cosh(x)} \approx \sum_{n=0}^{\ell}\frac{(-1)^n a_n(\ell+1/2)}{x^{n+1}}. \qquad (19)$$

In the low-energy region we do the integral on a fixed $x$-mesh, $x \in [0; x_0]$, using the adaptive Simpson’s method. Results of the integration are used to construct a cubic spline interpolation of $F_<(x)$. For each $\ell$, the boundary value $x_0$ between the low- and high-energy regions is found by comparing values of (17) and (19). In the high-energy region we use the asymptotic form (19) to do the integral,

$$F_>(z)\Big|_{z>x_0} = F_<(x_0) + \left\{\log(x) + \sum_{n=1}^{\ell}\frac{(-1)^{n+1} a_n(\ell+1/2)}{n\, x^{n}}\right\}\Bigg|_{x=x_0}^{x=z}. \qquad (20)$$

Implementation details for other Legendre kernels can be found in Appendix D.

The second optimization implemented in SOM consists in aggressive caching of the rectangles’ contributions to the integral (1). As seen from (11), the integral is a simple sum over all rectangles in a configuration. Thanks to the fact that elementary updates affect at most two rectangles at once, it makes sense to store the values of every individual term in the sum. In the proposal phase of an update, evaluation of the integrated kernel is then required only for the affected rectangles. If the update is accepted, contributions of the added/changed rectangles are stored in a special cache and can be reused in a later update. Since the size of a configuration does not typically exceed 100 rectangles, the cache requires only a moderate amount of memory.
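The caching idea can be illustrated with a small memoization class. This is a hash-based sketch of the concept with our own names; SOM's actual cache is index-based and implemented in C++:

```python
class RectContributionCache:
    """Store each rectangle's contribution to Eq. (11) for all input
    points, so that only the one or two rectangles touched by an
    elementary update need a fresh evaluation of the integrated kernel."""

    def __init__(self, Lam, x_points):
        self.Lam, self.x_points = Lam, x_points
        self._cache = {}

    def __call__(self, rect):
        if rect not in self._cache:   # evaluate only on a cache miss
            c, w, h = rect
            self._cache[rect] = [h * (self.Lam(x, c + w / 2) -
                                      self.Lam(x, c - w / 2))
                                 for x in self.x_points]
        return self._cache[rect]

# With Lam(x, e) = e (constant kernel) every contribution equals h*w.
cache = RectContributionCache(lambda x, e: e, x_points=[0, 1, 2])
assert cache((0.0, 2.0, 0.5)) == [1.0, 1.0, 1.0]
cache((0.0, 2.0, 0.5))            # second call is a cache hit
assert len(cache._cache) == 1
```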

Besides the two aforementioned optimizations, SOM can take advantage of MPI parallel execution. Generation of particular solutions is “embarrassingly parallel”, because every solution is calculated in a completely independent manner from the others. When a SOM calculation is run in the MPI context, the accumulation of particular solutions is dynamically distributed among available MPI ranks.

## 4 Usage

### 4.1 Basic usage example

Running SOM to analytically continue input data requires writing a simple Python script. This usage method is standard for TRIQS applications. We refer the reader to Sec. 9.3 of TRIQS () for instructions on how to execute Python scripts in the TRIQS environment. Details of the script will vary depending on the physical observable to be continued, and its representation. Nonetheless, a typical script will have the following basic parts.

• Import TRIQS and SOM Python modules.

from pytriqs.gf.local import *
# HDFArchive interface to HDF5 files.
from pytriqs.archive import HDFArchive
# HDF5 archives must be modified only by one MPI rank.
# is_master_node() checks we are on rank 0.
from pytriqs.utility.mpi import is_master_node
# Main SOM class.
from pytriqs.applications.analytical_continuation.som import Som
• Load the observable to be continued from an HDF5 archive.

arch = HDFArchive('input.h5', 'r')
# arch['g'] must be an object of type GfImTime, GfImFreq or GfLegendre.
g = arch['g']

This step can be omitted or modified if the input data comes from a different source. For instance, it could be loaded from text files or generated in the very same script by a quantum Monte Carlo impurity solver, such as TRIQS/CTHYB CTHYB (). Only the values stored in the g.data array will be used by SOM, while any high frequency expansion information (g.tail) will be disregarded. If g is matrix-valued (or, in TRIQS’s terminology, has a target shape bigger than 1x1), SOM will only construct analytic continuation of the diagonal matrix elements.

• Set the importance function

S = g.copy()
# Set all elements of 'S' to a constant.
S.data[:] = 1.

In this example we assume that all elements of the input are equally important. Alternatively, one could read the importance function from an HDF5 archive or another source.

• Construct Som object.

cont = Som(g, S, kind = "FermionGf", norms = 1.0)

The argument kind must be used to specify the kind of the physical observable in question. Its recognized values are FermionGf (Green’s function of fermions), BosonCorr (correlator of boson-like operators), BosonAutoCorr (autocorrelator of a Hermitian operator), and ZeroTemp (zero temperature correlator). The optional argument norms is a list containing the expected norms (Eq. (2)) of the spectral functions to be found, one positive real number per diagonal element of g. Instead of setting all elements of norms to the same constant x, one may simply pass norms = x. By default, all norms are set to 1, which is correct for fermionic Green’s functions. However, adjustments would normally be needed for self-energies and bosonic correlators/autocorrelators.

• Set simulation parameters.

params = {}
# SOM will construct a spectral function
# within this energy window (mandatory).
params['energy_window'] = (-5.0, 5.0)
# Number of particular solutions to accumulate (mandatory).
params['l'] = 1000
# Set verbosity level to the maximum on MPI rank 0,
# and silence all other ranks.
params['verbosity'] = 3 if is_master_node() else 0
# Do not auto-adjust the number of particular solutions to accumulate
# (default and recommended).
# Do not auto-adjust the number of global updates (default and recommended).
# Number of global updates per particular solution.
params['f'] = 200
# Number of local updates per global update.
params['t'] = 500
# Accumulate histogram of the objective function values,
# useful to analyse quality of solutions.
params['make_histograms'] = True

In Section 4.2 we provide a table of main accepted simulation parameters.

• Run simulation.

cont.run(**params)

This function call is normally the most expensive part of the script.

• Extract solution and reconstructed input.

g_w = GfReFreq(window = (-5.0, 5.0), n_points = 1000, indices = g.indices)
g_w << cont
# Imaginary time/frequency/Legendre data reconstructed from the solution.
g_rec = g.copy()
g_rec << cont

g_w is the retarded fermionic Green’s function of real frequency corresponding to the input g. Its imaginary part is set to $-\pi A(\omega)$, whereas the real part is computed semi-analytically using the Kramers-Kronig relations (there are closed form expressions for the contribution of one rectangle to the real part). The relation between g_w and $A(\omega)$ slightly varies with the observable type; relevant details are to be found in the online documentation.
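For a single rectangle the Kramers-Kronig principal-value integral is indeed elementary: integrating $h/(\omega-\epsilon)$ over the rectangle support gives a logarithm. A numpy illustration based on our own derivation (not SOM's internal code; frequencies at the rectangle edges are excluded to avoid the logarithmic singularity):

```python
import numpy as np

def rect_gf_realfreq(omega, c, width, h):
    """Contribution of one rectangle R_{c,width,h} to a retarded GF:
    Im G(omega) = -pi * A(omega), and the principal-value integral yields
    Re G(omega) = h * ln|(omega - c + width/2) / (omega - c - width/2)|."""
    A = h * ((omega >= c - width / 2) & (omega <= c + width / 2))
    reG = h * np.log(np.abs((omega - c + width / 2) /
                            (omega - c - width / 2)))
    return reG - 1j * np.pi * A

# Far from the rectangle, Re G approaches (total weight) / (distance).
G = rect_gf_realfreq(np.array([10.0]), c=0.0, width=1.0, h=1.0)
assert np.isclose(G[0].real, 1.0 * 1.0 / 10.0, atol=1e-3)
assert G[0].imag == 0.0
```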

The high frequency expansion coefficients will be computed from the sum-of-rectangles representation of $A(\epsilon)$ and written into g_w as well. If g_w is constructed with an energy window wider than the one set in params, $A(\epsilon)$ is assumed to be zero at all “excessive” energy points.

The reconstructed function g_rec is the result of substituting $A(\epsilon)$ back into the integral equation (1). The correctness of the results should always be controlled by comparing g with g_rec.

• Save results to an HDF5 archive.

if is_master_node():
    with HDFArchive("results.h5", 'w') as ar:
        ar['g'] = g
        ar['g_w'] = g_w
        ar['g_rec'] = g_rec
        # Save histograms for further analysis.
        ar['histograms'] = cont.histograms

The output archive can be read by other TRIQS scripts and outside utilities, in order to analyse, post-process and visualize the resulting spectra. More elaborate examples of SOM Python scripts can be found on the official documentation page, http://krivenko.github.io/som/documentation.html.

### 4.2 Simulation parameters

Table 2 contains the main simulation parameters understood by the run() method of class Som. More advanced parameters intended for experienced users are listed in Appendix A.

## 5 Illustrative results

### 5.1 Testing methodology

We apply SOM to a few synthetic test cases where the exact spectral function is known. These tests serve as an illustration of SOM’s capabilities and as guidance for prospective users. Being based on a Markov chain algorithm with a fair number of adjustable parameters, SOM may suffer from ergodicity problems and produce characteristic spectral artefacts if the parameters are not chosen properly. It is therefore instructive to consider analytic continuation problems that can potentially pose difficulty to SOM, and to study the effect of the various input parameters on the quality of the solution.

To perform most of the tests, we have devised a special procedure that models the stochastic noise characteristic of fermionic Green’s functions measured by QMC solvers. The procedure involves running a single-loop DMFT DMFTReview () calculation for a non-interacting single-band tight-binding model on the Bethe lattice using the TRIQS/CTHYB solver. The exact Green’s function of this model is easily computed from the well known semi-elliptic spectral function. The impurity solver introduces some noise, which can easily be estimated from the accumulation result as

$$\tilde\eta(\tau) = G^{\mathrm{MC}}_{\mathrm{Bethe}}(\tau) - G^{\mathrm{ex}}_{\mathrm{Bethe}}(\tau), \qquad (21)$$

and normalized according to

$$\eta(\tau) = \tilde\eta(\tau)\Bigg/\sqrt{\frac{1}{M}\sum_m\left(\tilde\eta(\tau_m) - \frac{1}{M}\sum_{m'}\tilde\eta(\tau_{m'})\right)^2}. \qquad (22)$$

We ensure that values of the noise are uncorrelated between different time slices by taking a sufficiently large number of Monte Carlo updates per measurement. Eventually, the extracted and normalized noise is rescaled and added to a model Green’s function to be continued with SOM (Fig. 2),

$$G^{\mathrm{noisy}}_\sigma(\tau) = G^{\mathrm{ex}}(\tau) + \sigma\,\eta(\tau). \qquad (23)$$

In the case of a susceptibility (two-pole model, see below), we simply add Gaussian noise to $\chi(\tau)$, generated independently for each $\tau_m$ with zero mean and equal dispersion $\sigma$.
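The noise model of Eqs. (21)-(23) amounts to a few numpy operations; a sketch with our own function names and synthetic data:

```python
import numpy as np

def normalized_noise(G_mc, G_exact):
    """Eqs. (21)-(22): extract the QMC noise and normalize it to unit
    standard deviation (mean-subtracted RMS in the denominator)."""
    eta = G_mc - G_exact          # Eq. (21)
    return eta / eta.std()        # Eq. (22)

def noisy_gf(G_exact, eta, sigma):
    """Eq. (23): model input with a controlled noise level sigma."""
    return G_exact + sigma * eta

# Toy usage with synthetic 'QMC data' = exact curve + random noise.
rng = np.random.default_rng(0)
G_ex = -np.exp(-np.linspace(0.0, 5.0, 100))
eta = normalized_noise(G_ex + 0.01 * rng.standard_normal(100), G_ex)
assert np.isclose(eta.std(), 1.0)
G_noisy = noisy_gf(G_ex, eta, sigma=1e-4)
```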

### 5.2 Comparison of model spectra

Results of tests with four model spectra and various levels of added noise are shown in Fig. 3. In all tests presented in that figure, the inverse temperature $\beta$ has been kept fixed (with the corresponding cutoff $\tau_{\max}$ for the zero-temperature kernel). All curves have been calculated from imaginary time input data on a regular grid of $\tau$-points (the input values are computed from the analytically known spectral functions via (1)). The importance function $S$ has been chosen to be a constant in all simulations. The number of global updates $F$ and the number of elementary updates per global update $T$ have been kept fixed across the tests. The remaining parameters are set to SOM defaults, if not stated otherwise.

The model studied first is the single Hubbard atom with Coulomb repulsion strength $U$ at half-filling. Its spectral function comprises two discrete levels at $\epsilon = \pm U/2$, corresponding to excitations through the empty and doubly occupied states. For a general complex frequency $z$, the Green’s function of the Hubbard atom reads

$$G(z) = \frac{1/2}{z+U/2+i\delta} + \frac{1/2}{z-U/2+i\delta}, \qquad (24)$$

where $\delta$ is the resonance broadening term. The analytic continuation with SOM shows that strong noise smears the calculated spectral function: the peak height is diminished, and the spectral weight is spread asymmetrically around the peak. The smearing at large frequencies is more pronounced. The peak positions converge to the exact solution as the noise level $\sigma$ decreases.

The second model is the non-interacting double Bethe lattice Moeller1999 (),

$$G(z) = \zeta(z-t_\perp) + \zeta(z+t_\perp), \qquad (25)$$
$$\zeta(z) = \frac{z - i\operatorname{sign}(\Im z)\sqrt{4t_b^2 - z^2}}{2t_b^2}. \qquad (26)$$

The specific feature of this model is a gapped symmetric spectrum with sharp band edges. The splitting between the semi-elliptic bands is controlled by $t_\perp$, and their bandwidth by $t_b$. The top right part of Fig. 3 shows that large noise can break the particle-hole symmetry. As the noise level goes down, the SOM solution approaches the exact one, but the shape of the bands remains somewhat Lorentzian-like, which leads to an underestimated gap.

As an example of a gapless spectrum (Fig. 3, bottom left), we take a Green’s function that produces a sum of four Lorentzian peaks, i.e.

$$G(z) = \sum_i \frac{c_i}{z - b_i + i\delta}, \qquad (27)$$

with coefficients $c_i$, pole positions $b_i$ and a small broadening $\delta$. The Lorentzians overlap and form a single peak with two shoulders. The peak has no mirror symmetry with respect to the vertical axis, and its centre is not at the Fermi level. The shoulders are not resolved for the largest noise level, but for smaller noise levels both shoulders are correctly detected by SOM. The spectra for the smallest noise levels look notably more noisy. The origin of this effect is SOM’s ability to find a best particular solution that fits the input data very well (with a very small $D_{\min}$). With a fixed threshold $\alpha_{\mathrm{good}}$, only a few particular solutions are then considered for inclusion in the final solution; hence the less smooth curves.

The fourth model (Fig. 3, bottom right) appears in the context of the Fermi polaron problem, and it has previously been used in Goulko2017 () to assess performance of other continuation methods. It consists of two peaks modelled via Gaussians,

 G(z) = \frac{1}{N}\sum_i c_i\, e^{-(z - a_i)^2/b_i} \qquad (28)

with parameters a_i, b_i and c_i; the normalization constant N fixes the spectral norm. Here, we apply the zero-temperature imaginary time kernel. This case is difficult as it contains sharp features over a broad energy range. We observe that the low-energy peak is relatively well resolved. However, the high-energy peak is strongly broadened, with a width dependent on the noise level. The positions of both peaks are reproduced with reasonable accuracy.
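The corresponding imaginary-time input data for the zero-temperature kernel, G(τ) = −∫dε e^(−τε) A(ε), can be synthesised as follows; the peak parameters are illustrative placeholders, not the values used in Goulko2017 ().

```python
import numpy as np

a = np.array([1.0, 5.0])    # peak positions (placeholders)
bw = np.array([0.04, 1.0])  # squared peak widths b_i (placeholders)
c = np.array([1.0, 0.5])    # peak weights (placeholders)

eps = np.linspace(0.0, 12.0, 6001)
A = np.sum(c * np.exp(-(eps[:, None] - a)**2 / bw), axis=-1)   # Eq. (28)
A /= np.trapz(A, eps)       # fix the spectral norm to 1

# Zero-temperature imaginary-time kernel: G(tau) = -∫ de exp(-tau*e) A(e)
tau = np.linspace(0.0, 10.0, 101)
G_tau = -np.trapz(np.exp(-tau[:, None] * eps) * A, eps, axis=-1)
```

The narrow low-energy Gaussian dominates G(τ) at long times, while the broad high-energy peak only affects the small-τ region, which illustrates why its reconstruction is the harder part of the problem.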

### 5.3 Effect of F

We have observed in practical calculations that the choice of the number of global updates F can drastically affect the ergodicity of the SOM Markov chains and, eventually, the output spectra. Fig. 4 shows the evolution of the SOM spectral functions with F for the Fermi polaron model. Parameters other than F are kept at their default values, and the noise level is fixed. The Fermi polaron spectra obtained from the imaginary time representation (Fig. 4, left) demonstrate that SOM generally has no problem resolving the low-energy peak regardless of F. The high-energy peak, however, significantly changes its shape and position as F grows, slowly approaching the reference curve.

The situation is rather different when the Legendre representation is used (Fig. 4, right). As before, the noise is generated and added to the Green’s function in the imaginary time representation; the data is subsequently transformed to the Legendre basis and passed to SOM. The positions of both peaks do not change much with F. For sufficiently large F, their widths and heights are visibly closer to the exact solution than in the imaginary time results. In contrast, for small F the peaks contain relatively strong noise. The same effect has also been found with the asymmetric metal model. It occurs when the final solution is dominated by a few particular solutions with very small objective function values, i.e. when the Markov chains suffer from poor ergodicity and struggle to escape an isolated deep minimum of the objective function.
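The transformation from the imaginary time to the Legendre representation follows the convention of Boehnke2011 (), G_ℓ = √(2ℓ+1) ∫₀^β dτ P_ℓ(2τ/β − 1) G(τ). A minimal numpy sketch of this step is shown below; in practice SOM receives the Legendre data through TRIQS Green's function containers.

```python
import numpy as np
from numpy.polynomial.legendre import legval

def to_legendre(tau, G_tau, beta, n_l):
    # G_l = sqrt(2l+1) * \int_0^beta dtau P_l(2*tau/beta - 1) G(tau),
    # evaluated with simple trapezoidal quadrature on the given tau grid
    x = 2.0 * tau / beta - 1.0
    Gl = np.empty(n_l)
    for l in range(n_l):
        coeffs = np.zeros(l + 1)
        coeffs[l] = 1.0                  # select the pure Legendre polynomial P_l
        Gl[l] = np.sqrt(2 * l + 1) * np.trapz(legval(x, coeffs) * G_tau, tau)
    return Gl
```

For instance, a constant G(τ) = −1/2 on [0, β] has G_0 = −β/2 and vanishing higher coefficients, which provides a quick sanity check of the convention.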

In Fig. 5, a similar comparison is presented for the two-pole susceptibility model Schott2016 (),

 \chi(z) = \frac{1}{\chi_0}\sum_i \frac{c_i}{z^2 - b_i^2}, \qquad (29) \qquad \chi_0 = -\sum_i \frac{c_i}{b_i^2},

with a particular choice of the parameters c_i and b_i. We use the Hermitian autocorrelator kernel for Matsubara frequencies and add uncorrelated Gaussian noise; the remaining simulation parameters keep their previous values. For the same numbers of global updates, we obtain a double-peak structure (Fig. 5, left) similar to that shown in Fig. 2 of Schott2016 (). Additionally, we transform the same set of noisy input data into the Legendre representation and apply the corresponding kernel (Fig. 5, right). As in the Fermi polaron case, use of the Legendre basis results in slightly more pronounced peaks and a deeper valley between them.
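Eq. (29) is normalized so that χ(iν₀ = 0) = 1. The model is easily evaluated on the bosonic Matsubara grid; the pole parameters below are placeholders, not the values used in the figure.

```python
import numpy as np

c = np.array([1.0, 1.0])   # pole weights (placeholders)
b = np.array([0.5, 2.0])   # pole positions (placeholders)
chi0 = -np.sum(c / b**2)   # normalization of Eq. (29)

def chi(z):
    # chi(z) = (1/chi0) * sum_i c_i / (z^2 - b_i^2)
    z = np.asarray(z, dtype=complex)
    return np.sum(c / (z[..., None]**2 - b**2), axis=-1) / chi0

beta = 10.0
nu = 2j * np.pi * np.arange(20) / beta   # bosonic frequencies i*nu_n
chi_n = chi(nu)                          # purely real for this model
```

Since (iν_n)² is real, the model susceptibility is real on the Matsubara axis and decays monotonically with n, as expected for an autocorrelator of a Hermitian operator.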

### 5.4 Diagnostics

After a solver run, SOM provides not only the continued function, but also the reconstructed input function as well as a histogram of the objective function values. These features are mainly intended for debugging and for estimating the quality of the result. Ideally, the reconstructed Green’s function should be a smooth version of the input Green’s function. In the upper left part of Fig. 6 we plot the difference between the exact Green’s function of the Fermi polaron model and its SOM reconstruction after an imaginary time simulation (the corresponding spectrum is shown in Fig. 4). First of all, we observe that the difference shrinks with increasing F. Furthermore, the main difference comes from small imaginary times. The Green’s function takes its largest values there, and so does the noise. The relevant information is encoded at the small imaginary time points, while data values at the long-time plateau are less important. We learn from that figure that runs with a smaller number of imaginary time points could have been performed in order to reduce the required CPU time; in our case, half that number would suffice.
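A quick way to see where the misfit concentrates is to bin the deviation between the input and reconstructed data along the imaginary-time axis. The helper below is a hypothetical post-processing snippet, not part of the SOM API.

```python
import numpy as np

def deviation_profile(G_in, G_rec, nbins=4):
    # RMS deviation between input and reconstructed data, resolved over
    # consecutive segments of the imaginary-time grid
    d = np.abs(np.asarray(G_in) - np.asarray(G_rec))
    return np.array([np.sqrt(np.mean(seg**2)) for seg in np.array_split(d, nbins)])
```

A profile that decays along the τ axis, as in Fig. 6, indicates that the long-time plateau carries little information and the number of τ-points could be reduced.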

The Legendre basis has been identified as a useful tool to filter noise by cutting off high-order expansion coefficients Boehnke2011 (). In Fig. 6 (bottom left) we provide a similar comparison of the exact and reconstructed Green’s functions in the Legendre representation. They lie on top of each other everywhere except for the high-order Legendre coefficients. This kind of plot hints at how many Legendre coefficients should be retained to filter out the noise without losing relevant information in the input data.
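One simple heuristic for choosing the cutoff is to locate the order at which |G_ℓ| stops decaying and hits the statistical noise floor. The function below is an illustrative sketch; the `window` and `factor` parameters are ad hoc choices, not SOM defaults.

```python
import numpy as np

def suggest_cutoff(Gl, window=4, factor=3.0):
    # Estimate the noise floor from the highest-order coefficients and
    # keep every coefficient that still stands clearly above it
    absG = np.abs(np.asarray(Gl))
    floor = absG[-window:].mean()
    above = np.nonzero(absG > factor * floor)[0]
    return int(above[-1]) + 1 if above.size else len(absG)
```

On coefficients that decay exponentially before flattening into noise, the heuristic returns the order at which the plateau begins.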

The objective function histogram helps determine whether the chosen values of simulation parameters such as F are adequate. The histogram is usually stretched when these values are too small, and it converges to a sharper peak shape for larger values. Objective function histograms for both kernels used are plotted in the right part of Fig. 6. It is clearly seen that increased parameter values lead to localization of the histograms and, as a result, to more efficient accumulation of particular solutions.

### 5.5 Final remarks

In conclusion, we would like to make some remarks about the rest of the main simulation parameters. The energy window must be chosen such that the entire spectral weight fits into it, but it should stay within the same order of magnitude as the bandwidth. Extremely wide energy windows are wasteful, as they are likely to cause low acceptance rates in the Markov chains. The effect of an increased number of accumulated particular solutions has not been explicitly studied here, but it is similar to that of a moving-average noise filter. It is, however, not recommended to post-process (smooth) SOM-generated spectra with such a filter, because it can introduce artefacts that depend on the moving frame size. Regarding efficiency, it is worth noting that simulations with the Legendre kernels typically require less CPU time, mainly because of the smaller input data arrays (fewer Legendre coefficients than τ-points or Matsubara frequencies).
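The frame-size dependence of moving-average artefacts is easy to demonstrate on a spectrum with sharp edges; this toy example is not part of SOM.

```python
import numpy as np

w = np.linspace(-2.0, 2.0, 401)
A = np.where(np.abs(w) < 1.0, 0.5, 0.0)   # flat band with sharp edges

def moving_average(y, M):
    # boxcar filter of frame size M (odd), preserving the array length
    return np.convolve(y, np.ones(M) / M, mode="same")

A5, A21 = moving_average(A, 5), moving_average(A, 21)
# The band edges get smeared over roughly M grid points,
# so the artefact grows with the moving frame size
```

The filter conserves the total spectral weight but replaces each sharp edge with a ramp whose width is set by the frame size, which is precisely the frame-dependent artefact mentioned above.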

## 6 Getting started

### 6.1 Obtaining Som

The SOM source code is available publicly and can be obtained from a GitHub repository located at https://github.com/krivenko/som. It is recommended to always use the ‘master’ branch of the repository, as it contains the most recent bug fixes and new features being added to SOM.

### 6.2 Installation

The current version of SOM requires the TRIQS library 1.4.2 as a prerequisite. The installation procedure for TRIQS is outlined in Sec. 9.2 of TRIQS (). More detailed step-by-step instructions are available on the TRIQS documentation website (https://triqs.github.io/triqs/1.4/install.html). It is of crucial importance to check out the version tag 1.4.2 before compiling and installing TRIQS. This can be done by running the following shell command

$ git checkout 1.4.2

in the cloned source directory of the TRIQS library. It is also important to make sure that TRIQS is compiled against a recent version of the Boost C++ libraries (1.58 or newer). The installation of SOM is similar to that of other TRIQS-based applications. Assuming that TRIQS 1.4.2 has been installed at /path/to/TRIQS/install/dir, SOM is installed by issuing the following commands at the shell prompt:

$ git clone https://github.com/krivenko/som som_src
$ mkdir som_build && cd som_build
$ cmake -DTRIQS_PATH=/path/to/TRIQS/install/dir ../som_src
$ make
$ make test
$ make install


This will install SOM in the same location as TRIQS. Further installation instructions are given in the online documentation.

### 6.3 Citation policy

SOM is provided free of charge. We kindly ask users to help us with its continued maintenance and development by citing the present paper, as well as the accompanying paper of the TRIQS library TRIQS (), in any published work using the SOM package. We also strongly recommend citing Mishchenko2000 (), the original work where the Stochastic Optimization Method was first introduced.

### 6.4 Contributing

SOM is an open source project, and we encourage feedback and contributions from the user community. Issues should be reported via the GitHub issue tracker at https://github.com/krivenko/som/issues. For contributions, we recommend using pull requests on the GitHub website. Please coordinate with the SOM developers before starting any major contribution.

## 7 Summary

We have presented the free SOM package that implements the Stochastic Optimization Method for analytic continuation. On a set of practical examples, we demonstrated the versatility of SOM, as well as its potential to become a viable alternative to various implementations of the Maximum Entropy Method, the current de facto standard in the field of computational condensed matter physics. The algorithm contains a number of performance optimizations that significantly reduce overall computation costs.

Our plans for future releases include porting the code base to more recent versions of the TRIQS library and adding support for more integral kernels, such as kernels for superconducting correlators Levy2017 (). Another interesting addition would be the ability to compute the two-point correlators of particular solutions as defined in Goulko2017 (). Following the methodology of Goulko et al., one could use the accumulated correlators to characterize the accuracy of the output spectral functions.

## 8 Acknowledgements

The authors are grateful to Alexander Lichtenstein and Olga Goulko for fruitful discussions, to Roberto Mozara for his valuable comments on the usability of the code, and to Andrey Mishchenko for giving access to the original FORTRAN implementation of the SOM algorithm. We would also like to thank Vladislav Pokorný and Hanna Terletska, who stimulated the writing of this paper. I.K. acknowledges support from the Deutsche Forschungsgemeinschaft via project SFB 668 (subproject A3, “Electronic structure and magnetism of correlated systems”). The computations were performed with resources provided by the North-German Supercomputing Alliance (HLRN).

## Appendix B Evaluation of integrated Matsubara kernels

Observable kind, Integrated kernel,
 Green’s function of fermions
 Correlator of boson-like operators
 Autocorrelator of a Hermitian operator
 Zero temperature correlator: $h\log\left(\dfrac{iz_n - c + w/2}{iz_n - c - w/2}\right)$, with $z_n = \pi(2n+1)/\tau_\mathrm{max}$ or $z_n = 2\pi n/\tau_\mathrm{max}$

## Appendix C Evaluation of integrated imaginary time kernels

The spline interpolation procedure used to evaluate the integrated kernels is outlined in Sec. 3. Here we only give the observable-specific details. Throughout this Appendix, we denote the dimensionless imaginary time and energy arguments by $\alpha$ and $z$, respectively; $T_\alpha(z)$ denotes the integrated tail contribution. All series found in the tables below are rapidly (exponentially) convergent. $\mathrm{Li}_2$ is the dilogarithm (Spence’s function), $\psi$ is the digamma function, and $\psi_1$ is the trigamma function.

### C.1 Green’s function of fermions

 \Lambda(\tau;\Omega) = -\frac{1}{\beta}\int_0^z \frac{e^{-\alpha x}}{1+e^{-x}}\,dx = -\frac{1}{\beta}\left[\theta(-z)\,S^-_\alpha(z) + \theta(z)\,S^+_\alpha(z) + T_\alpha(z)\right], \qquad (30)
 S^-_\alpha(z) = -\int_z^0\left[\frac{e^{-\alpha x}}{1+e^{-x}} - t_\alpha(x)\right]dx, \qquad S^+_\alpha(z) = \int_0^z\left[\frac{e^{-\alpha x}}{1+e^{-x}} - t_\alpha(x)\right]dx. \qquad (31)

Segments to construct the splines on:

 S^-_\alpha(z):\; z \in [-z_0, 0], \qquad S^+_\alpha(z):\; z \in [0, z_0], \qquad z_0 = -2\log(\mathrm{tolerance}). \qquad (32)

### C.2 Correlator of boson-like operators

 \Lambda(\tau;\Omega) = \frac{1}{\pi\beta^2}\int_0^z \frac{x\,e^{-\alpha x}}{1-e^{-x}}\,dx = \frac{1}{\pi\beta^2}\left[\theta(-z)\,S^-_\alpha(z) + \theta(z)\,S^+_\alpha(z) + T_\alpha(z)\right], \qquad (33)
 S^-_\alpha(z) = -\int_z^0\left[\frac{x\,e^{-\alpha x}}{1-e^{-x}} - t_\alpha(x)\right]dx, \qquad S^+_\alpha(z) = \int_0^z\left[\frac{x\,e^{-\alpha x}}{1-e^{-x}} - t_\alpha(x)\right]dx. \qquad (34)

Segments to construct the splines on:

 S^-_\alpha(z):\; z \in [-z_0, 0], \qquad S^+_\alpha(z):\; z \in [0, z_0]. \qquad (35)

The factor of $x$ in the kernel tends to strengthen its delocalization, so it makes sense to choose $z_0$ slightly larger in this case.
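As an illustration of the procedure, the fermionic $S^+_\alpha(z)$ of Eq. (31) can be tabulated and interpolated as follows. For brevity the tail subtraction $t_\alpha(x)$ is omitted, so this is only a sketch of the scheme, not the optimized implementation used in SOM.

```python
import numpy as np
from scipy.integrate import cumulative_trapezoid
from scipy.interpolate import CubicSpline

alpha, tol = 0.3, 1e-10                # alpha is the dimensionless imaginary time
z0 = -2.0 * np.log(tol)                # Eq. (32): spline segment [0, z0]
x = np.linspace(0.0, z0, 20001)
f = np.exp(-alpha * x) / (1.0 + np.exp(-x))   # fermionic integrand of Eq. (30)
S = cumulative_trapezoid(f, x, initial=0.0)   # antiderivative tabulated on the grid
S_plus = CubicSpline(x, S)             # fast evaluation of S_plus(z) for z in [0, z0]
```

After the one-time tabulation, each kernel evaluation reduces to a cheap spline lookup, which is what makes the rectangle-based updates inexpensive.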