# A first look at quasi-Monte Carlo

for lattice field theory problems

###### Abstract

In this project we initiate an investigation of the applicability of Quasi-Monte Carlo methods to lattice field theories in order to improve the asymptotic error behavior of observables for such theories. In most cases the error of an observable calculated by averaging over random observations generated from an ordinary Monte Carlo simulation behaves like , where is the number of observations. By means of Quasi-Monte Carlo methods it is possible to improve this behavior for certain problems to up to . We adapted and applied this approach to simple systems like the quantum harmonic and anharmonic oscillator and verified an improved error scaling.

HU-EP-12/46

SFB/CPP-12-88

DESY 12-211

## 1 Introduction

Four-dimensional quantum field theories play a crucial role in the mathematical
description of the fundamental forces of nature. The path integral formalism developed
by R.P. Feynman has been established since decades to quantize classical
field theories and thus, to formulate quantum field theories.

A quantum field theory (QFT) describes the behavior of certain particles and their
interaction. In the absence of an interaction the path integral of a QFT is usually
trivial. It is mostly the interaction term that makes the path integral challenging
to evaluate. In cases where the coefficient of the interaction term, the coupling
constant, is small enough a perturbative treatment of the path integral is possible
and often sufficient.

However many interesting quantities, like e.g. the hadron spectrum, decay constants,
certain matrix elements and form factors, have to be calculated in a regime of a
strong coupling(-constant), where a perturbative approach must fail. In such
situations it is necessary to treat the path integral non-perturbatively. Because of
the lack of closed form solutions the path integral can only be evaluated numerically.
In order to do so, it is necessary to give the path integral a mathematically well
defined meaning. A straightforward method is it to discretize space and time by
introducing a, moreover Euclidean space-time lattice with a fixed spacing between two
neighboring lattice points, the lattice spacing.
Restricting the system to a finite lattice extension (and applying certain boundary
conditions) the
infinite-dimensional path integral is converted to a finite-dimensional integral.
(In principle, one still has to perform the transition to zero lattice spacing and
therefore to infinitely many space-time points at the end.)

Such lattice path integrals can easily have dimensions of
(e.g. simulations of lattice-discretized quantum chromodynamics (QCD), the theory of
strong interactions of elementary particles).
The high dimensionality of the problem restricts the spectrum of applicable algorithms
to Monte Carlo-based methods. Especially Markov chain-Monte Carlo (Mc-MC) methods
have been successfully applied since the beginning of the study of lattice field theories.
With the algorithms employed observables calculated from a Monte Carlo chain of
steps will obey a statistical error proportional to .

Recent developments in the field of Quasi-Monte Carlo (QMC) methods, discussed in
more detail in sections 3 to 5, show that under certain
conditions it is possible to construct sets of samples of integration points leading
to much faster rates of convergence of an observable and a much better asymptotic
error behavior of up to .

Such an improved error behavior would decrease the number of samples necessary to
achieve a certain error bound, resulting in a drastic reduction of runtime. Note that
for present computations in field theory state-of-the-art supercomputers are used.
It is unclear, whether QMC methods can be used for lattice field theory simulations.
As a first step, to nevertheless investigate this possibility, we will focus in this
work on the study of much simpler models, namely the quantum mechanical harmonic
and anharmonic oscillator.

## 2 Quantum Mechanical Harmonic and Anharmonic Oscillator

In this section we will discuss the basic steps for the quantization of the theory in the path integral approach and the discretization on a time lattice. The first step is the construction of the Lagrangian (resp. the action) of the corresponding classical mechanical system for a given path of a particle with mass . For a numerically stable evaluation of the path integral it is essential to pass on to Euclidean time. In this case the Lagrangian and the action is given by:

(1) | ||||

(2) |

Depending on the scenario (harmonic or anharmonic oscillator) the potential consists of two parts

(3) |

such that the parameter controls the anharmonic part of the theory. It should also be mentioned that in the anharmonic case the parameter can take on negative values, leading then to a double well potential.

The next step is to discretize time into equidistant time slices with a spacing of . The path is then only defined on the time slices:

(4) | ||||

(5) |

On the lattice the derivative with respect to the time appearing in (1) (first term) will be replaced by the forward finite difference . The choice of the lattice derivative is not unique and requires special care, particularly if one considers more complicated models like lattice QCD. But in [1] it was shown that the lattice derivative chosen here permits a well defined continuum limit. Putting all the ingredients together, we can write down the lattice action for the (an)harmonic oscillator

(6) |

For the path a cyclic boundary condition can be assumed. In the following the superscript “latt” will be dropped, as we will only refer to the lattice action from now on. The expectation value of an observable of the quantized theory expressed in terms of the path integral reads as follows:

(7) |

This expression is suitable for a numerical evaluation of certain quantities of the underlying theory. Up to now only Monte Carlo methods are known to give reliable results for dimensions . One type of such methods, often used in physics, is the Markov chain-Monte Carlo approach mostly applying the weight for sampling paths (so-called “importance sampling”). Especially the Metropolis algorithm[2] is suitable and a straightforward solution of (7) (also described in [1]) and serves as a reference method for the QMC approach, which is much less intuitive. The theory of QMC methods is a purely mathematical topic. During the discussion of the key aspects of QMC, following in the next sections, we will stick to a rather mathematical language, being more adequate for the description of a mathematical issue.

## 3 Direct Monte Carlo and quasi–Monte Carlo methods

In many practical applications one is interested in calculating quotients of the form (7) where the action and the observables are usually smooth functions in high dimensions. In some special situations where one would like to deal with integrands of moderately high dimensions, one may consider an estimator for the integral in the numerator and in the denominator of (7) separately, and then take as an estimation of . Another possibility is to take a joint estimator for the total quantity using a single direct sampling method. A well known approach based on direct sampling is the so called weighted uniform sampling (WUS) estimator, analyzed in [3]. We will show some characteristics of the WUS estimator in section 6, and we will refer from now on to these methods as plain or direct sampling methods for estimating (7). In many interesting examples, we encounter the case were the action and the observable lead to integrals , of Gaussian type. Then the integrals , can be written in the form

where denotes the covariance matrix of the Gaussian density function. A transformation to the unit cube in can be applied such that the corresponding integrals take the form

(8) |

Here is some symmetric factorization of the covariance matrix,
and ,
where represents
the inverse of the normal cumulative distribution function .

In the classical plain or direct Monte–Carlo (MC) approach one tries
to estimate (8)
by generating samples pseudo-randomly. One starts with a finite sequence of
independent identically distributed (i.i.d.) samples ,
where the points ,
have been generated from the uniform distribution in .
Then, the quadrature rule is fixed by taking the average of the function evaluations for

as an approximation of the desired integral . The resulting estimator is unbiased. The integration error can be approximated via the central limit theorem, given that belongs to . The variance of the estimator is given by

As measured by its standard deviation from zero the integration error associated with the MC approach is then of order . The quality of the MC samples relies on the selected pseudo–random number generators of uniform samples, here we use the Mersenne Twister generator from Matsumoto and Nishimura (see [4]). MC is in general a very reliable tool in high–dimensional integration, but the order of convergence is in fact rather poor.

In contrast, quasi–Monte Carlo (QMC) methods generates deterministically point sets that are more regularly distributed than the pseudo–random points from MC (see [5], [6], [7], [8]). Typical examples of QMC are shifted lattice rules and low–discrepancy sequences. To explain what we mean by “regularly distributed”, we define now the classical notion of discrepancy of a finite sequence of points in . Given a set of points in , and a nonempty family of Lebesgue-measurable sets in , we define the classical discrepancy function by

where is the characteristic function of . This allows us to define the so called star discrepancy.

###### Definition 3.1

We define the star discrepancy of the point set by , where is the family of all sub-intervals of the form , with .

The star discrepancy can be considered as a measure of the worst difference between the uniform distribution and the sampled distribution in attributed to the point set . The usual way to analyze QMC as a deterministic method is by choosing a class of integrand functions , and a measure of discrepancy for the point sets . Then, the deterministic integration error is usually given in the form

where measures a particular variation of the
function . A classical particular error bound in this form is
the famous Koksma–Hlawka inequality,
where is taken to be the star discrepancy
of the point set , and is the variation
in the sense of Hardy and Krause of .
In the context of QMC, a sequence of points in
is called a low–discrepancy sequence
if
for all truncations of the sequence to its first terms.

### 3.1 Quasi–Monte Carlo errors and complexity

There are certain reproducing kernel Hilbert spaces of functions which are particularly useful for estimating the quadrature error of QMC methods (see [9]). Consider a kernel satisfying and for each and . We denote now with and the inner product and norm in . If the integral

is a continuous functional on the space , then the worst case quadrature error for point sets and quasi-Monte Carlo algorithms for the space can be given by

due to Riesz’ representation theorem for linear bounded functionals. In this case, the representer of the quadrature error is given by

In QMC error analysis, one usually considers the weighted (anchored) tensor product Sobolev space introduced in [10]

with the weighted norm and inner product

where for we denote by its cardinality, and denotes the vector containing the coordinates of with indices in , and the other coordinates set equal to .

The corresponding reproducing kernel is given by

There are several other examples considered for error analysis. For example, the weighted Walsh space consisting of Walsh series (see [7, Example 2.8] and [11]). The weighted tensor product Sobolev space allow for explicit QMC constructions deriving error estimates of the form

(9) |

where the constant is independent on the dimension , given that the sequence of weights satisfies (see [12])

Traditional unweighted function spaces considered for integration suffer the from the curse of dimensionality. Their weighted variants describe a setting where the variables or group of variables may vary in importance. Thus, they give a partial explanation of why some very high-dimensional spaces become tractable for QMC.

Explicit QMC constructions satisfying (9) are shifted lattice rules for weighted spaces. The rate (9) can be also obtained for Niederreiter and Sobol’ sequences (see [13]).

The idea of “weighting” the norm of the spaces to obtain tractable results can be applied in fact to more general function spaces than smooth function spaces of tensor product form, and many integration examples can be found in [6]. In our numerical experiments, we used so far QMC algorithms based on a particular type of low–discrepancy sequences. Numerical experiments with shifted lattice rules will be carried out in the near future, following new techniques for fixing adequate weights introduced in [14].

## 4 Low–discrepancy -sequences

The most well known type of low–discrepancy sequences are the so called -sequences. To introduce how -nets and -sequences are defined, we consider first elementary intervals in a integer base . Let be any sub-interval of of the form with for . An interval of this form is called an elementary interval in base .

###### Definition 4.1

Let be integers. A -net in base is a point set of points in such that every elementary interval in base with contains exactly points.

###### Definition 4.2

Let be an integer. A sequence of points in is a -sequence in base if for all integers and , the point set consisting of points with , is a -net in base .

The parameter is called the quality parameter of the –sequences. In [15], theorem 4.17, it is shown that -sequences are in fact low–discrepancy sequences. We reproduce this result in the following

###### Theorem 4.3

The star-discrepancy of the first terms of a -sequence in base , satisfies

where the implied constants depend only on and . If either or , , we have

and otherwise

Explicit constructions of -sequences are available. Some of them are the generalized Faure, Sobol’, Niederreiter and Niederreiter–Xing sequences. All these examples fall into the category of constructions called digital sequences. We refer to [7] for further reading on this topic.

## 5 Randomized QMC

There are some advantages in retaining the probabilistic properties of the sampling. There are practical hybrid methods permitting us to combine the good features of MC and QMC. Randomization is an important tool for QMC if we are interested for a practical error estimate of our sample quadrature to the desired integral. One goal is to randomize the deterministic point set generated by QMC in a way that the estimator preserves unbiasedness. Another important goal is to preserve the better equidistribution properties of the deterministic construction.

The simplest form of randomization applied to digital sequences seems to be
the technique called digital –ary shifting. In this case, we add
a random shift to each point of the deterministic set
using
operations over the selected ring .
The application of this randomization preserves in particular the value of any projection of
the point set (see [5] and references therein). The resulting estimator is
unbiased.

The second randomization method we present is the one introduced by
Art B. Owen ([16]) in 1995. He considered -nets and -sequences
in base and applied a randomization procedure based on permutations of the digits of
the values of the coordinates of points in these nets and sequences. This can be interpreted
as a random scrambling of the points of the given sequence in such a way
that the net structure remains unaffected.
We do not discuss here in detail Owen’s randomization procedure,
or from now on called Owen’s scrambling.
The main results of this randomization procedure can be stated in the following

###### Proposition 5.1

(Equidistribution)

A randomized -net in base using Owen’s scrambling is again a -net
in base with probability 1. A randomized -sequence in base using Owen’s
scrambling is again a -sequence in base with probability 1.

###### Proposition 5.2

(Uniformity)

Let be the randomized version of a point
originally belonging to a -net
in base or a -sequence in base , using Owen’s scrambling.
Then has
the uniform distribution in , that is, for any Lebesgue measurable set , ,
with the -dimensional Lebesgue measure.

The last two propositions state that after Owen’s scrambling of digital sequences
we retain unaffected the low discrepancy properties of the constructions, and that
after this randomization procedure we obtain random samples uniformly distributed in .

The basic results about the variance of the randomized QMC estimator after applying Owen’s scrambling to -nets in base (or of -sequences in base ) can be found in [17]. We summarize these results in the following

###### Theorem 5.3

Let , , be the points of a scrambled -net in base , and let be a function on with integral and variance Let , where . Then for the variance of the randomized QMC estimator it holds

For we have

The above theorem says that the variance of scrambled –nets is never more than
times the variance of the corresponding MC estimator.
The bound of the theorem above can be improved (see theorem 13.9 in [7]) to show that the
variance of scrambled –nets are in fact always smaller than the variance of the MC estimator.
If the integrand at hand is smooth enough, using Owen’s scrambling
it can be shown that one can obtain an improved asymptotic error estimate of order
, for any , see [18].
Improved scrambling techniques have been developed in [19],[20].

## 6 Weighted uniform sampling

Weighted uniform sampling is a way of estimating a quotient of integrals of the form

by taking the estimator

(10) |

where the points , have been generated from the uniform distribution in . This estimator was analyzed in [3] and applications have been investigated for example in [21] and [22]. The bias and the root mean square error (RMSE) of this estimator satisfy

The bias of the estimator is asymptotically negligible compared with the RMSE. One clear disadvantage of WUS against Mc-MC or Importance Sampling for problems with large regions of relative low values of the integrands is that with WUS we sample over the entire unit cube uniformly, while Mc-MC and Importance Sampling based techniques try to concentrate in more characteristic or important regions of the integrands. These limitations where observed in our numerical experiments.

## 7 Numerical experiments

We consider for our numerical tests the quantum mechanical harmonic and anharmonic oscillator in the path integral approach as described in section 2. For definiteness we repeat here the expression for the action of the system:

(11) |

We investigate the two observable functions

using the notation , for , in our tests.

### 7.1 Harmonic Oscillator

For the harmonic oscillator we can apply immediately the direct sampling approach described in sections 3 and 6 for calculating estimates of observables by setting

in (10). The matrix is a square root of , the covariance matrix of the variables , appearing in the action if it is expressed as a bilinear form: . Different factorizations, namely Cholesky and PCA (principle component analysis) have been tried out. The PCA based factorization seemed to perform better in our tests, which is the reason why we will only show results for this method. The PCA can be explicitely obtained for circulant Toeplitz matrices and the matrix–vector products can be efficiently computed by means of the fast Fourier transform. In the ordinary Mc-MC approximation, we used the Mersenne Twister[4] pseudo random number generator. For the QMC tests, we use randomly scrambled Sobol’ sequences using the technique proposed by J. Matousěk[19]. The error of was obtained by scrambling 10 times the QMC sequence and making 10 runs of an Mc-MC simulation (with different seeds). This procedure is repeated 30 times in both cases to obtain the error of the error. From the results, shown in figure 1, we can see a scaling that agrees perfectly with the expected behavior, namely for Mc-MC and for QMC, for large .

Although this example is trivial, it was our first successful application of the QMC approach in a physical model and motivated us to pass on to more complicated models.

### 7.2 Anharmonic Oscillator

The WUS approach was also used for this problem to estimate and .

With the anharmonic term in the action the distribution function of the variables becomes very complicated. This makes it very hard to generate the samples directly from the PDF of the anharmonic oscillator. Instead of this, the anharmonic term and a part of the harmonic term is treated as part of the weight functions and in (10), leaving the sampling procedure of the as it was for the harmonic oscillator, accept for a different factor in front of the harmonic term

(12) |

This procedure is neccessary, because of being positive definite only if , which is neccessary for the existence of during the sampling procedure. Further, it is important to note that the PCA factorization during the generation of the gaussian samples is essential for an efficient reduction of the effective dimension (see [23]) of the problem. For the parameters listed below, we estimated the effective dimensions of the functions 12 to be close to (for a variance concentration). On the other hand we found out that the effective dimension depends also very strong on the parameter , the physical time extent of the system. For small -values, say , the effective dimension is reduced sufficiently good like in the harmonic case, such that the QMC approach leads to a error scaling. The situation changes for , where the error behaves only like , due to the increase of the effective dimension. The parameters were set to , , . In the two tests the parameters and had been adjusted such, that was kept fixed. We set and for , whereas for and was chosen. The error analysis of and has been adopted from the harmonic oscillator test case described in the last subsection 7.1. The result is shown in figure 2. For reasons mentioned earlier, WUS shows its limitations for large in our experiments. If and , then we observe poor results with the Mc-MC or RQMC direct WUS sampling method. For and the PCA results for RQMC seem satisfactory. The resulting estimation of the ground state energy matches in at least two significant digits with the theoretical value, , calculated in [24], namely for and for .

## 8 Concluding Remarks

For the harmonic oscillator we found a large- error behavior as expected for QMC () and Mc-MC (). Also for the anharmonic oscillator the estimation procedure leads to a significant improvement when employing the QMC approach. In this case, the error scaling is only of instead of the theoretically best case of . Further, we found that the applicability of the WUS approach seems to be limited by the physical time extent . Stable results could only by found for values . On the other hand, the choice of does not seem to have any effect and the accessible range of values gives already estimates of the ground state energy, compatible (within errors) with the theoretical prediction (valid in the limit and ). For the case that the improved error scaling and the mild dependence on the lattice spacing found here will also be present in more elaborate models, the QMC has the potential to become very valuable in the future.

## Acknowledgement

The authors wish to express their gratitude to Alan Genz (Washington State University) and Frances Kuo (University of New South Wales, Sydney) for inspiring comments and conversations, which helped to develop the work in this report. Frances Kuo collaborated with us during her visit to the Humboldt-University Berlin in 2011. A.N., K.J. and M.M.-P. acknowledge financial support by the DFG-funded corroborative research center SFB/TR9.

## References

## References

- [1] Creutz M and Freedman B A 1981 Ann. Phys. 132 427–462
- [2] Metropolis N, Rosenbluth A, Rosenbluth M, Teller A and Teller E 1953 J. Chem. Phys. 21 1087–1092
- [3] Powell M J D and Swann J 1966 J.Inst.Maths Applics 2 228–236
- [4] Matsumoto M and Nishimura T 1998 ACM Trans. Model. Comput. Simul. 8 3–30 ISSN 1049-3301 URL http://doi.acm.org/10.1145/272991.272995
- [5] L’Ecuyer P and Lemieux C 2005 Modeling Uncertainty (International Series in Operations Research & Management Science vol 46) ed Dror M, LâEcuyer P and Szidarovszky F (Springer US) pp 419–474 ISBN 978-0-7923-7463-3 URL http://dx.doi.org/10.1007/0-306-48102-2_20
- [6] Novak E and Woźniakowski H 2010 Tractability of multivariate problems. Volume II: Standard information for functionals (EMS Tracts in Mathematics vol 12) (European Mathematical Society (EMS), Zürich) ISBN 978-3-03719-084-5 URL http://dx.doi.org/10.4171/084
- [7] Dick J and Pillichshammer F 2010 Digital Nets and Sequences: Discrepancy Theory and Quasi-Monte Carlo Integration (New York, NY, USA: Cambridge University Press) ISBN 0521191599, 9780521191593
- [8] Kuo F, Schwab C and Sloan I 2012 ANZIAM Journal 53
- [9] Hickernell F J 1998 Math. Comp 67 299–322
- [10] Sloan I H and Wozniakowski H 1997 J. Complexity 14 1–33
- [11] Dick J 2008 SIAM J. Numer. Anal. 46 1519–1553 ISSN 0036-1429 URL http://dx.doi.org/10.1137/060666639
- [12] Kuo F Y 2003 J. Complexity 19 301–320 ISSN 0885-064X numerical integration and its complexity (Oberwolfach, 2001) URL http://dx.doi.org/10.1016/S0885-064X(03)00006-2
- [13] Wang X 2003 Math. Comput. 72 823–838 ISSN 0025-5718 URL http://dx.doi.org/10.1090/S0025-5718-02-01440-0
- [14] Griewank A, Lehmann L, Leovey H and Zilberman M 2012 Math. Comp.
- [15] Niederreiter H 1992 Random number generation and quasi-Monte Carlo methods (CBMS-NSF Regional Conference Series in Applied Mathematics vol 63) (Philadelphia, PA: Society for Industrial and Applied Mathematics (SIAM)) ISBN 0-89871-295-5
- [16] Owen A B 1995 Monte Carlo and Quasi-Monte Carlo Methods in Scientific Computing (Lecture Notes in Statistics vol 106) ed Niederreiter H and Shiue P J S (Springer-Verlag) pp 299–317
- [17] Owen A B 1997 SIAM J. Numer. Anal. 34 1884–1910 ISSN 0036-1429 URL http://dx.doi.org/10.1137/S0036142994277468
- [18] Owen A B 2008 Ann. Statist. 36 2319–2343
- [19] Matousěk J 1998 Journal of Complexity 14 527–556
- [20] Tezuka S and Faure H 2003 Journal of Complexity 19 744 – 757 ISSN 0885-064X URL http://www.sciencedirect.com/science/article/pii/S0885064X03000359
- [21] Spanier J and Maize E H 1994 SIAM Rev. 36 18–44 ISSN 0036-1445 URL http://dx.doi.org/10.1137/1036002
- [22] Caflisch R E and Moskowitz B 1995 Lecture Notes in Statistics 106 (Springer-Verlag) pp 1–16
- [23] Caflisch R E, Morokoff W and Owen A 1997 The Journal of Computational Finance 1 27–46
- [24] Blankenbecler R, DeGrand T A and Sugar R 1980 Phys.Rev. D21 1055