Directed Worm Algorithm with Geometric Optimization
Abstract
The worm algorithm is a versatile technique in use of the Markov chain Monte Carlo method for both quantum and classical systems. In particular, the dynamic critical exponents of classical spin systems are greatly reduced, compared to the case of a single spin update. We improve the efficiency of the worm algorithm for classical models in combination with the directedloop framework and the geometric probability optimization. The worm scattering process is optimized for reducing its diffusive nature, which is evident in the probability distribution of worm position. In addition, we carefully discuss how to quantify the computational efficiency of a Markov chain Monte Carlo sampler. Performance improvement is demonstrated for the Ising model at the critical temperature by measurement of exponential autocorrelation times and asymptotic variances. The present method is approximately 25 times as efficient as the classical worm algorithm for the simplecubiclattice model. Remarkably, it is even more efficient than the Wolff cluster algorithm. We estimate the dynamic critical exponent of the longest time scale in the worm update to be and expect it to be the same with the Wolff cluster algorithm. While the integrated autocorrelation time of the susceptibility estimator shows the quite different size scaling between the worm and the Wolff algorithms, the asymptotic variance has the same dynamic critical exponent. Our approach is generally applicable to a wide range of physical systems, such as the model, the Potts model, the O() loop model, the lattice QCD, and also some frustrated systems in combination with the dual worm formalism.
pacs:
I Introduction
A Markov chain Monte Carlo (MCMC) method is a powerful numerical tool to study a wide variety of statistical mechanical problems Landau and Binder (2005); Newman and Barkema (1999). Many kinds of nontrivial phases and phase transitions, both in classical and quantum systems, have been uncovered as its applications. The essence of the method is to construct a transition kernel as a series of local kernels acting on local state variables and to achieve eventual sampling from an arbitrary target distribution even in a huge number of dimensions (or degrees of freedom) of a state space.
Because next state (sample) is generated from the previous state, one has to care correlation of samples, which can be described by an autocorrelation function Landau and Binder (2005); Newman and Barkema (1999):
(1) 
where is an observable, or a realization of a random variable, at the th Monte Carlo step, the bracket denotes the Monte Carlo average, and is an estimator of physical quantity , e.g., energy or susceptibility.
The autocorrelation function eventually becomes independent of in Eq. (1) after distribution convergence (thermalization or burnin). In many cases, the function dumps exponentially for large :
(2) 
where
(3) 
is the exponential autocorrelation time. Here we assume is finite, which is the case for finite size systems we study in the present paper. A thermalization period in a Monte Carlo simulation should be, at least, several times as long as the exponential autocorrelation time.
Meanwhile, autocorrelation reduces the effective number of Monte Carlo samples to , where is the number of samples in simulation and
(4) 
is the integrated autocorrelation time. The constant comes from the discrete nature of Monte Carlo steps. Then needed computational time is proportional to these autocorrelation times: and , which may differ among estimators and update methods.
MCMC methods can be applied to many kinds of phase transitions in principle, but distributionconvergence rate and sampling efficiency can become quite poor in some cases, such as critical slowing down Hohenberg and Halperin (1977); Sokal (1997). For example, in the case of the Ising model on the square lattice, the Metropolis algorithm for the single spin update suffers from the rapid growth of the autocorrelation times: with Wang and Hu (1997); Nightingale and Blöte (2000); Murase and Ito (2008); Liu et al. (2014), where is the system length. Here is called the dynamic critical exponent, which may differ between and depending on estimator. Nevertheless, the exponent is expected to be universal among many types of updates used for MCMC sampling Nightingale and Blöte (2000). This slowing down hampers accurate or precise analysis of phase transition. It is thus crucial to devise an efficient update method that alleviates or avoids the slowing down.
In the case of unfrustrated models, the cluster update, such as the SwendsenWang Swendsen and Wang (1987) and the Wolff Wolff (1989) algorithm, reduces the dynamic critical exponents significantly Tamayo et al. (1990); Coddington and Baillie (1992); Liu et al. (2014): for example, for the Ising model in two dimensions.
The size of a cluster corresponds to the correlation length in systems, and the flip of clusters, which can be performed with probability one, achieves an efficient global (nonlocal) spin update. Forming such an efficient cluster, however, is nontrivial or impractical in general cases. Application of a nonlocal update is thus limited to specific cases so far.
In the meantime, the worm algorithm has been one of the most versatile techniques in the worldline quantum Monte Carlo method Prokof’ev et al. (1998); Syljuasen and Sandvik (2002). In quantum cases, a naive local update is often not allowed because of a certain conservation law; for example, the total magnetization is conserved in the quantum spin model including the and Heisenberg models as special cases. The worm algorithm works well especially for systems with such a conservation law. The key point of the algorithm is to extend physical state space and allow configurations with kinks breaking the conservation law, a pair of which is called a worm.
A whole procedure of the worm algorithm is described by the repetition of the following processes: (i) A pair of kinks is inserted at a randomly chosen position of a system. (ii) These kinks move on the system in a stochastic way and update configurations. (iii) When the kinks meet each other, the conservation law is again satisfied and an original physical configuration is sampled. Although each worm move is local, a nonlocal update is achieved in terms of the original state space after a worm update (from insertion to removal of a worm). We review the detail of the algorithm in Sec. II.
The worm algorithm has been proposed also for classical systems Prokof’ev and Svistunov (2001). It aims at a random walk of kinks on sites of a lattice. Remarkably, in spite of the local nature of the update, the worm algorithm greatly reduces the dynamic critical exponents in the case of the Ising models and other classical models Deng et al. (2007); Liu et al. (2011).
It is critical to optimize the stochastic worm update for efficient computation. The original algorithm (hereafter called the classical algorithm) uses the simple Metropolis algorithm; the next site is chosen at random among nearest sites and the site shifting process is accepted or rejected according to the distribution weights. The detailed balance is satisfied in every worm shifting process.
Meanwhile, the efficient directedloop algorithm was proposed in the worldline quantum Monte Carlo method Syljuasen and Sandvik (2002). The directed worm does not satisfy the detailed balance for each worm scattering process but does for one worm update from insertion to removal. Here the worm backscattering process canceling the previous update, which is thus a rejection process, should be averted for efficient sampling.
In the present paper, we propose a modified worm update moving on bonds instead of sites. The worm scattering probability is optimized using the geometric allocation Suwa and Todo (2010); Suwa (2014), which is a universal approach for probability optimization under the nonnegativity condition. One can find solutions satisfying global balance even without detailed balance. In our modified worm update, we minimize the worm backscattering rate and maximize the probability of going straight on lattices for reducing the diffusive nature of worm random walk. The reduction of the diffusive behavior is confirmed in the probability distribution of worm position. The present algorithm is detailed in Sec. III.
We demonstrate, in Sec. V, that the computational efficiency quantified by exponential autocorrelation time and an asymptotic variance is approximately 25 times as high as that of the classical worm for the simplecubiclattice Ising model at the critical temperature. Our algorithm is even more efficient than the Wolff algorithm. We stress that sampling efficiency of an MCMC method should be compared in asymptotic variance, which is discussed in Sec. IV. There is no extra computational cost in the present algorithm, compared with the classical algorithm. Our approach is applicable to generic classical models, such as the model, the Potts model Mercado et al. (2012), the O() loop model Janke et al. (2010); Liu et al. (2011); Shimada et al. (2014), and the lattice QCD Adams and Chandrasekharan (2003). The present paper is summarized in Sec. VI.
Ii Classical Algorithm
We review the classical, or conventional, worm algorithm Prokof’ev and Svistunov (2001) for the Ising model in this section. Let the model be represented by , where is the Hamiltonian, is a temperature, and is the Ising spin variable at site (vertex) of the lattice (graph). The partition function of the canonical ensemble is expanded into the Taylor series as
(5) 
where denotes the bond variable on bond , the identity is used in the second line, and and are the total number of sites and bonds of a lattice, respectively. In the last line, the sum runs over bond configurations forming loops whose total length is denoted by . The terms of other (nonloop) configurations cancel with each others in tracing out the spin degrees of freedom. Then the bond variables on the constraint of loop structure are sampled by means of the MCMC method. Any set of bond variables can be used as the initial state in a simulation as long as the loop constraint is satisfied. We choose the state with all bond variables deactivated ( ) as the initial one.
The worm algorithm is an efficient update method to sample under such a constraint or a conservation law. The main idea is to extend the sampled state space and allow configurations with kinks breaking the constraint. Let us consider inserting two kinks, and move one of them in a stochastic way. The moving kink is called the head of a worm and the other is the tail of it. The classical worm algorithm is then described as follows:

Choose a site at random as the starting point and set . Insert the head and tail of the worm at . Go to step 2.

Choose a nearest neighbor site, , at random among nearest neighbor sites of site and shift the head of the worm from to with probability , where is the bond variable on before shifting. If the shift is accepted, update as and set . If , go to step 3. Otherwise, repeat step 2.

Measure observables. Go to step 1 after removing the worm with probability , or go to step 2 with probability .
The probability can be set an arbitrary value in : in Ref. 15.
As for measurement, energy can be measured by the total number of activated bonds:
(6) 
where is the inverse temperature. The correlation function, , can be calculated by , where is the number of times when the head is at and the tail is at in the interval between measurements (at above Step 3), and is the number of times when both the head and tail are at at Step 3. Susceptibility, , can be calculated by , where is the worm length, or the total number of worm shifting processes (Step 2) in the interval between measurements. It is straightforward to calculate the Fourier transformed (or the Fourier series of) correlation function and the correlation length by use of the moment method Suwa and Todo (2015), where one only needs to consider a phase factor depending on worm position.
In several cases, the worm update produces a much smaller dynamic critical exponent than a naive local spin update does: in Ref. 16, and for the Ising model on the square and the cubic lattices, respectively. The worm algorithm is efficient not only for the Ising model but for many fundamental physical systems: the Potts model Mercado et al. (2012), the model, the O() loop model Janke et al. (2010); Liu et al. (2011); Shimada et al. (2014), the Lattice QCD Adams and Chandrasekharan (2003), and so on.
Meanwhile, one can use the worm algorithm for dual variables on a dual lattice Prokof’ev and Svistunov (2001); Hitchcock et al. (2004); Wang (2005); Rakala and Damle (). The dual worm algorithm samples domain walls of original spin variables; in other words, it samples “unsatisfied” bonds increasing energy. While the classical worm explained above is formulated in the hightemperature expansion, the dual worm is in the lowtemperature expansion with dual inverse temperature Kramers and Wannier (1941); Kogut (1979). One of the advantages of the dual worm is that it is applicable also to frustrated cases while the original worm suffers from the negative sign problem Wang (2005); Rakala and Damle ().
Contrary to the hightemperature expansion, dual variables have the constraint that the winding numbers of unsatisfied bonds are even in the case of the periodic boundary condition. If the constraint is ignored, nevertheless, the free energy difference between the periodic and the antiperiodic boundaries can be calculated from the winding number histogram Hitchcock et al. (2004): , where and are the free energies, and are the partition functions, and and are the numbers of times when the winding numbers are even (odd) in one (the other) direction, and both are even, respectively. Since the square lattice has the selfduality, the dual worm update at the critical temperature is exactly the same with the original update except for the winding number constraint. The free energy difference is then available in both the formalisms.
Iii Present Approach
We present a modified directedworm update in this section. The worm backscattering (rejection) probability is minimized using the geometric allocation approach, and our algorithm is indeed free from rejection at the critical temperature of the Ising model on the square and the cubic lattices. As a result, the random walk behavior of the kink is successfully suppressed, which is evident in the probability distribution of worm position. We also explain how to measure relevant physical quantities, such as energy and susceptibility.
iii.1 Worm on Bonds
We adopt the same representation of the partition function [Eq. (5)] with the classical worm algorithm. Our goal is to sample bond variables efficiently under the loop constraint. We insert a worm, namely a pair of kinks, at a bond of a lattice, and move the head, namely one of the pair, in a stochastic way. When the head meets a site, it scatters onto another (or possibly same) bond with a certain probability. This scattering process continues until the head comes back to the tail, namely the other of the pair.
In the case of the square lattice, a typical example of configurations with the present worm after several scattering processes is illustrated in Fig. 1. In our algorithm, the head and tail breaking the loop constraint are located at the center of bonds, or edges, of a lattice. As a result, bond variables can take , or . The head has a moving direction in the same sense of the directedloop algorithm Syljuasen and Sandvik (2002).
Let us move the head in Fig. 1 upward and scatter it around the next vertex. The head is going to scatter onto a bond connecting to the vertex, which we define as a worm scattering process. The four candidate states after scattering are shown as , , , and in Fig. 2. The next state is chosen with a certain (optimized) probability, which we will discuss in Sec. III.2. After the worm scattering, the bond variables are updated as shown in Fig. 2; the halves of bonds are updated () since the kink is assumed at the center of a bond. We repeat this worm scattering process until the head comes back to the tail.
The whole algorithm of the present method is described as follows:

Choose a bond at random as a starting point and . Insert the head and tail of the worm at . Choose a moving direction at random. Go to step 2.

Choose the next bond according to a set of probabilities (optimized using the geometric allocation). If , update bond variables and , and set . if , go to step 3. Otherwise, repeat step 2.

Measure observables and go to step 1 after removing the worm.
Compared with the classical worm algorithm, we simply set in our procedure.
As an advantage of our approach, it is straightforward to optimize the worm scattering probability and improve efficiency. In the MCMC method, transition probabilities are set under global balance. The worm shifting probability at a site, in the classical algorithm, is determined by other shifting processes at the nearest neighboring sites. The processes at the nearest neighboring sites are then affected by the next nearest sites. Thus, it is nontrivial to write down the balance condition in a closed form. The Metropolis scheme reduces the condition to a local form, but no room for optimization is left except for increasing candidate states. In our approach, on the other hand, the balance condition of the worm scattering process is represented in a closed form. This simple structure of the balance condition leaves much room for optimization we discuss in the next subsection, compared to the previous directedloop approaches Hitchcock et al. (2004).
iii.2 Geometric Optimization
We detail the transitionprobability optimization in the geometric allocation approach Suwa and Todo (2010); Suwa (2014). To satisfy the global balance or the detailed balance, raw stochastic flow is allocated in a geometric way instead of solving simultaneous algebraic equations. Let us denote a raw stochastic flow from state to as , where is the weight, or the measure, of state apart from the normalization of a target distribution, and is the transition probability from to . The law of probability conservation and the global balance condition are expressed by
(7) 
and
(8) 
respectively, where is the number of candidate states. The average rejection (worm backscattering) rate is written as .
We optimize the flows so that the average rejection rate is minimized. To reduce further the worm diffusive behavior, we maximize the average probability that the worm head goes straight on the square or the cubic lattice under the condition of rejection minimization. This preference at local transition is expected to reduce the variance of the worm length, namely the variance of return time for the head to come back to the same position.
In our approach, raw stochastic flows are optimized using the geometric allocation. The optimized flows in the case of the square lattice are illustrated in Figs. 3 and 4. It is easy to confirm that Eqs. (7) and (8) are both satisfied: the area of each weight (color) is conserved, which is nothing but the probability conservation, and the whole box shape is intact after the allocation, which ensures the global balance. As for the probability that the head goes straight, the value is maximized. The ratio of the weights, , varies according to temperature in simulation. The rejection free condition is, in general, : that is, in the case of Fig. 3. This condition is always satisfied in the case of Fig. 4. Thus, our update is rejection free at the critical temperature, Kramers and Wannier (1941).
In our implementation, we calculate all transition probabilities before sampling, and prepare a lookup table storing probabilities. We then choose the next state at each worm scattering by using the Walker’s method of alias in computational time Fukui and Todo (2009); Horita et al. (). The advantage of this method is that the computational cost does not increase with the number of candidates , while the binary search does in . There is no extra computational cost in our algorithm, compared with the classical worm algorithm.
We choose a set of flows satisfying detailed balance, which is expressed by in the allocation approach. It is easy to find many, or actually infinite, solutions to satisfy the required conditions [Eqs. (7) and (8)] thanks to the geometric picture. Even solutions without detailed balance can be readily found Suwa and Todo (2010), but the performance is not different much as far as we checked for the Ising model, compared with the present choice of flows.
In the simplecubiclattice case, we choose a set of flows as illustrated in Fig. 5. The six candidate states are indexed so that (1, 2), (3, 4), and (5, 6) are pairs of states where the worm head is on a bond in the same direction; that is, the pairs are states before and after worm going straight in a similar way to the squarelattice case. All possible cases of the simple cubic lattice are shown in Fig. 5. The rejectionfree condition is satisfied for including the critical temperature, Deng and Blöte (2003). In addition to the conditions of the rejection minimization and the goingstraightprobability maximization, we here put a further condition to find the unique solution; the variance of the going straight flow, which is , where , is minimized. Other solutions, nevertheless, are expected to work as well as our choice as long as the rejection rate is minimized and the going straight flow is maximized. Our worm update optimized by the geometric allocation can be generalized to many classical models, such as the model, the Potts model, and the O() loop model. It is promising to improve computational efficiency, in general, as well as the case of the Ising model in the present paper.
iii.3 Reduction of Diffusive Behavior
We demonstrate here that the present worm algorithm indeed reduces the diffusive behavior of worm position. Figure 6 shows the probability distribution of the distance between two kinks (the head and tail of the worm) for the squarelattice case with after 256 worm shifting or scattering processes since the insertion. The present worm exhibits the much broader distribution than the classical worm does. Worms removed at before 256 worm processes are not shown in Fig. 6 but counted in the distribution normalization.
The tail of the distribution of kink distance is well approximated by a Gaussian distribution. We estimate the variance of the Gaussian distribution after 64, 128, and 256 worm shifting or scattering processes for on the square lattice, and find the linear growth of the variance as a function of the number of processes, as shown in Fig. 7: , where is the estimated variance and is the number of processes. In Fig. 7, although some faster decay is seen in the distribution after 64 scattering processes of the present worm, the tails of the distributions after 128 and 256 processes are well fitted to the Gaussian form until longer distance. The distribution of distance in the present algorithm is much broader: the variance in the present approach grows approximately 6 times as fast as that in the classical approach does, as a function of the number of worm shifting or scattering processes. These observations clearly show that the present method successfully reduces the diffusive nature of the worm random walk, which is expected to make sampling more efficient.
iii.4 Estimators
The present worm can measure all physical quantities accessible by the classical worm. Energy can be calculated by the same formula (6). Meanwhile, quantities related to worm position are measured in a different way. We explain how to measure the spin correlation function and the susceptibility here.
In the classical worm algorithm, the sampled configuration space is extended including configurations with two kinks on sites of a lattice. Then the number of times when the two kinks are at sites and naturally becomes an estimator of the spin correlation between and , as mentioned in Sec. II. On the other hand, the present worm never visits sites, so we need a different estimator for measuring the correlation.
Let us here consider a virtual Monte Carlo step to shift two kinks from bonds to sites. There are four choices of sites because each bond connects two sites. Let us then choose a pair of sites at random and consider the Metropolis algorithm for accepting or rejecting the virtual shift. Since we assume the kink is at the center of a bond in the present algorithm, the weight of configurations will be changed by this kink shift according to the change in . If this virtual shift would be accepted, we would count one for measuring the corresponding spin correlation in a similar way to the classical worm algorithm. Thus, we can use the acceptance probability as the reweighting factor from bondkink configurations to sitekink configurations. To calculate the susceptibility, we simply need to sum up the reweighting factor during the onetime worm update from insertion to removal.
To sum up the argument of the above virtual shift, the susceptibility estimator in the present algorithm is expressed by
(9) 
where is the inverse temperature, is the coordination number (the number of bonds connecting to a site, which is four on the square lattice and six on the cubic lattice),
(10) 
is the reweighting factor after a worm scattering process, and . In Eq. (10), is if the head is on an activated bond, on a deactivated bond, and on a halfactivated and halfdeactivated bond. In other words, takes or if the head comes back to the tail, otherwise. The summation in Eq. (9) means that is calculated after each worm scattering process and summed in the interval between measurements. Note that
(11) 
where is the worm length, because takes except for the case where the head comes back to the tail.
Regarding the prefactor in Eq. (9), we assume the present worm has the extra weight, , for accepting the worm insertion and removal with probability one, and need to consider it for estimators related to the extended configuration. The value of the extra weight comes from the fact that there are two possible directions for the worm head to go in. Then, in the susceptibility estimator (9), the reweighting factor is multiplied by two, which is the inverse of the extra weight the worm carries. In addition, it is divided by four because we need the average of four reweighting factors, and divided by the coordination number because of multiple counts caused by the virtual shift from bonds to sites. It is straightforward to calculate other quantities, such as the Fourier transformed correlation function and the correlation length. Note that while the extra weight worm carries is two in the case of the Ising model, it depends on model and worm variant used in simulation.
iii.5 Possible Bias
Before closing this section on the methodology, we note possible bias caused by fixedtime computation of the worm algorithm as well as the Wolff cluster algorithm Wolff (1989). In these methods, computational time cost in a Monte Carlo step fluctuates depending on worm length, or cluster size. The mean worm length, which corresponds to susceptibility, is usually a decreasing function of temperature. Energy, on the other hand, is an increasing function. Thus, when a configuration is at a higher energy, the computational time spent at the subsequent Monte Carlo step will be shorter on average. As a result, given a simulation time, say, one hour, highenergy configurations tend to be sampled more times than lowenergy configurations. Therefore, such a fixedtime computation causes bias. For example, if parallel simulations are run using different random numbers and the average among them is naively calculated after some runtime, the estimator has a bias. To avoid it, we have to fix the number of Monte Carlo steps instead of runtime and calculate the average among runs of the same number of Monte Carlo steps. The bias we discuss here is caused in quantum Monte Carlo simulations Prokof’ev et al. (1998); Syljuasen and Sandvik (2002) as well as classical simulations. Although the systematic error might be negligible compared to statistical error, our simulations are carefully done avoiding the bias.
Iv How to Compare MCMC Samplers
We discuss how to quantify the computational efficiency of an MCMC sampler. There are two points to care Suwa (2014): thermalization (namely distribution convergence or burnin) and sampling efficiency. On the former, since Monte Carlo samples are taken after thermalization process, faster distribution convergence to a target one allows for sampling from an earlier Monte Carlo step; on the latter, more efficient sampling yields a smaller statistical error. The mean squared error of an estimator in an MCMC method is proportional to the inverse of the number of samples (Monte Carlo steps) according to the central limit theorem Robert and Casella (2004). Efficiency then should be compared in the prefactor of the scaling, that is, the asymptotic variance Suwa (2014). We explain here how to measure these quantities.
Thermalization rate is quantified by exponential autocorrelation time. An autocorrelation function exponentially decays in large Monte Carlo steps as shown in Eq. (2), which is the case for finitesize systems we study in the present paper. We calculate the function by running independent simulations and estimate the exponential autocorrelation time as a fitting parameter. The error of an estimate is calculated by bootstrapping Davison and Hinkley (1997); Sen et al. (2015). We simply choose a single exponential function as the fitting function.
In the worm algorithm, we consider a Monte Carlo step in a simulation to be a onetime worm update. In other words, the number of Monte Carlo steps is equal to the number of times when the head comes back to the tail. Here, one Monte Carlo step should be renormalized in units of the number of sites. Then an autocorrelation time () estimated by regression is rescaled:
(12) 
where is the mean worm length, and is the number of sites. The mean worm length differs between the classical and the present algorithms since the sampled space is extended in different manners.
Sampling efficiency is, on the other hand, related to an integrated autocorrelation time. It can be estimated by
(13) 
where is a mean squared error, namely the square of a statistical error, calculated by the binning analysis using a much larger bin size than the exponential autocorrelation time, and is calculated without binning. The above estimator (13) will give an exact integrated autocorrelation time in the large MonteCarlostep limit Landau and Binder (2005). In a similar way to Eq. (12), it is also rescaled as
(14) 
for comparison.
Although the integrated autocorrelation time is useful to study Monte Carlo dynamics, we stress that sampling efficiency should be compared in asymptotic variance that is the prefactor of the scaling:
(15) 
where is the mean squared error of an estimator , is the asymptotic variance of , and is the number of Monte Carlo steps used for sampling. Here we assume is an unbiased estimator of a physical quantity : . Then the asymptotic variance is represented by:
(16) 
where is the variance of . In other words, if Monte Carlo samples are all independent with each others, the mean squared error of is estimated to be . In actual MCMC simulations, samples are correlated, so ; the effective number of samples are reduced to . Note in most cases: in the present paper. Because different estimators may have different variances, an integrated autocorrelation time may not be adequate for comparison of sampling efficiency between MCMC methods. We, therefore, use asymptotic variances to compare MCMC samplers.
In the present paper, according to Eqs. (13), (14), (15) and (16), we estimate the variances by using the jackknife method Berg (2004) in the following ways:
(17)  
(18) 
where is the number of Monte Carlo steps used for sampling, is the average, and and are the mean squared errors of an estimator with and without binning, respectively. The squared coefficient of variation () is used here for avoiding the dependence of the definition of estimator: for example, an asymptotic variance estimated by Eq. (17) is the same for total energy and energy density. Note that the rescaling factor in Eqs. (12), (14), and (17) is necessary for the worm and also the Wolff algorithms, but not for the single spin update.
V Results
We investigate the performance of our worm algorithm for the Ising model on the simple cubic lattice, focusing on the critical slowing down at the transition temperature. We compare the present algorithm to the classical worm algorithm Prokof’ev and Svistunov (2001) and the Wolff cluster algorithm Wolff (1989). The ensemble used in simulations is represented by the Boltzmann distribution at the critical temperature: Deng and Blöte (2003). The boundary condition is periodic in all the spatial directions. We optimize the worm scattering probability as illustrated in Fig. 5. More than Monte Carlo samples were taken, in total, after thermalization steps.
For fair comparison, autocorrelation times are rescaled in units of the number of sites as shown in Eqs. (12) and (14). In the case of the Wolff algorithm, we rescale integrated autocorrelation time by using cluster size: , where is estimated using Eq. (13), is the mean cluster size, and is the number of sites. The mean worm length of the classical algorithm is proportional to the susceptibility: , where and are the critical exponents of the susceptibility and the correlation length, respectively Landau and Binder (2005). We found the relation of the worm length between the present and the classical algorithms: for .
The integrated autocorrelation time, the variance, and the asymptotic variance of the energy estimator are shown in Fig. 8. These quantities were calculated in the manner explained above in Sec. IV. In the Wolff algorithm, energy is simply calculated from spin configuration. In Fig. 8, the present algorithm gives the shortest integrated autocorrelation time and the smallest asymptotic variance. Meanwhile, the variance of the estimator is almost the same for the three algorithms. It is interesting that the variance decays fast as the system size increases, which is sort of a selfaveraging effect. The shorter correlation time of the present algorithm allowed us to calculate the larger system size.
By fitting data to a powerlaw form (), we estimate the dynamic critical exponents to be , , and for , in common for , and , , and for , in the use of the Wolff, the classical worm, and the present worm algorithms, respectively. We expect three algorithms produce the same exponent asymptotically. Nevertheless, the present algorithm yields approximately 27 and 2.2 times as small asymptotic variance as the classical worm and the Wolff algorithm do, respectively, as shown in the inset of the bottom panel of Fig. 8.
The quantities of the susceptibility estimator are shown in Fig. 9 in a similar way to the energy estimator. In the case of the Wolff algorithm, we compare two estimators: , where is the total magnetization of spins, and , where is cluster size. The dynamic critical exponents are estimated to be , , , and for ; , , , and for ; and , , , and for , in the use of spins in the Wolff, cluster size in the Wolff, the classical worm, and the present algorithms, respectively. The numbers in parentheses hereafter indicate statistical uncertainty, one standard deviation, on the preceding digit. Interestingly, while the scalings of and are quite different between the estimators and algorithms, the dynamic critical exponent of is likely common. Particularly, in the case of the Wolff algorithm, is almost the same between the estimators using spins and cluster size. The present worm, nonetheless, yields approximately 23 and 1.6 times as small asymptotic variance as the classical worm and the Wolff cluster do, respectively, as shown in the inset of the bottom panel of Fig. 9.
We emphasize sampling efficiency should be compared in asymptotic variance. The integrated autocorrelation time of the susceptibility estimator in the classical worm algorithm is orders of magnitude shorter than that measured by spins in the Wolff algorithm, showing the quite different size scaling. However, the asymptotic variances of the two estimators show almost the same scaling. In fact, the asymptotic variance is much larger in the classical worm algorithm because of the different scalings of the estimator variances. If you only look at integrated autocorrelation time, it may lead to an incorrect conclusion about efficiency.
Compared to the classical worm, the variance of the susceptibility estimator is significantly reduced by the present worm update as shown in the middle panel of Fig. 9. Because the worm length is, exactly in the case of the classical worm and almost in the case of the present worm as shown in Eq. (11), proportional to the susceptibility estimator, also the variance of the worm length is reduced by the present algorithm in a similar way to the middle panel of Fig. 9. We expect the performance improvement to be attributed to the variance reduction of the worm length.
We compare not only sampling efficiency but also thermalization rate. The autocorrelation functions (1) of the energy and the susceptibility estimators for and are shown in Fig. 10, calculated from more than independent Markov chains (sample paths). The function of the energy estimator shows the almost single exponential decay. As for the susceptibility estimator, on the other hand, some fast and slow decays are observed. While decreases with as shown in the top panel of Fig. 9, the autocorrelation function for clearly has slower decay than that for does, in the case of the classical algorithm, as shown in the right panel of Fig. 10. The reason why decreases with is that the prefactor of the slowly decaying mode decreases, which is also seen in Fig. 10.
The scaling of the exponential autocorrelation time is shown in Fig. 11, calculated by bootstrapping as mentioned in Sec. IV. We found , which is also seen in Fig. 10. Since the autocorrelation function of the energy estimator is well approximated by a single exponential function, it is expected that , which we indeed confirmed. Therefore, we expect the asymptotic scaling: , where the power was estimated from the plots in Fig. 8. The ratio of the exponential autocorrelation times between the classical and the present algorithms is approximately 26 as shown in the inset of Fig. 11, which is consistent with the ratio regarding the integrated autocorrelation time or the asymptotic variance of the energy estimator. Note that the summation of the autocorrelation function with respect to the renormalized time is somewhat different from the renormalized integrated autocorrelation time (14) because of the constant in the definition (4). Nonetheless, the asymptotic scaling with respect to system size should be the same for the two quantities.
A lesson to learn from these analyses is that we must be careful for estimating and needed thermalization (burnin) period. Because is usually easier to estimate than , people roughly estimate from in some (or maybe many) cases. Assume that an autocorrelation function is approximated by a single exponential function and , then . When an autocorrelation function has a slow decay with prefactor , . Thus can be much larger than possibly in orders of magnitude as we have estimated , but for the classical algorithm.
Vi Summary
We have proposed the modified worm algorithm for classical models and demonstrated the performance improvement for the Ising model at the critical temperature. The kinks of the present worm are located on bonds instead of sites on a lattice. The probabilities at worm scattering are optimized using the directedloop framework and the geometric allocation approach so that the bounce (rejection) probability is minimized and indeed reduced to zero in a wide range of temperature including the critical point. Furthermore, the probability for the worm head to go straight is maximized to reduce the diffusive nature of worm position.
The reduction of the diffusive behavior is confirmed in the probability distribution of worm position as the rapid growth of the variance of the distribution as a function of scattering processes in Figs. 6 and 7. As a result, the variance of worm length, as well as the variance of the susceptibility estimator, is significantly reduced as seen in the middle panel of Fig. 9, which is expected to lead to efficient sampling.
We have discussed how to compare Monte Carlo samplers and measure relevant quantities. While thermalization rate is quantified by exponential autocorrelation time, sampling efficiency, as we have stressed, should be compared in asymptotic variance.
The exponential autocorrelation time and the asymptotic variance in the present method are approximately only 4% as large as those in the classical algorithm for the Ising model on the simple cubic lattice, shown in Figs. 8, 9, and 11. Our algorithm is remarkably even more efficient than the Wolff cluster algorithm. The compared algorithms likely produce the same dynamic critical exponents of the exponential autocorrelation time and the asymptotic variance. The exponent of the longest time scale in the worm update is estimated to be , which is somewhat larger than the previous estimate in the classical worm algorithm, Deng et al. (2007), but consistent with the recent estimate in the Wolff update, Liu et al. (2014). This agreement suggests that the worm and the Wolff algorithms share the same exponent of the exponential autocorrelation time as well as the asymptotic variance.
Meanwhile, we observed the integrated autocorrelation time of the susceptibility estimator shows the quite different scaling from the exponential autocorrelation time: . These findings clearly indicate that asymptotic variance is more adequate than integrated autocorrelation time for comparison of sampling efficiency.
The present approach can be generalized to a wide range of physical models, such as the model, the Potts model Mercado et al. (2012), the O() loop model Janke et al. (2010); Liu et al. (2011); Shimada et al. (2014), and the lattice QCD Adams and Chandrasekharan (2003). The directedloop framework and the geometric allocation approach are expected to reduce computational cost significantly for many kinds of systems. In particular, it is promising to apply to frustrated models in combination with the dual worm formalism Rakala and Damle (). The performance of the present worm algorithm in the case of other models needs to be investigated in the future.
Acknowledgements.
The author is grateful to Synge Todo for discussion on measurement and the distribution of worm position. Some simulations have been done using computational resources of the Supercomputer Center at the Institute for Solid State Physics, the University of Tokyo. The author acknowledges support by KAKENHI under Grant No. 16K17762 from JSPS.References
 Landau and Binder (2005) D. P. Landau and K. Binder, A Guide to Monte Carlo Simulations in Statistical Physics, 2nd ed. (Cambridge University Press, Cambridge, 2005).
 Newman and Barkema (1999) M.E.J. Newman and G.T. Barkema, Monte Carlo Methods in Statistical Physics (Oxford University Press, 1999).
 Hohenberg and Halperin (1977) P. C. Hohenberg and B. I. Halperin, “Theory of dynamic critical phenomena,” Rev. Mod. Phys. 49, 435–479 (1977).
 Sokal (1997) A. Sokal, “Monte carlo methods in statistical mechanics: Foundations and new algorithms,” in Functional Integration: Basics and Applications, edited by Cecile DeWittMorette, Pierre Cartier, and Antoine Folacci (Springer US, 1997).
 Wang and Hu (1997) FuGao Wang and ChinKun Hu, “Universality in dynamic critical phenomena,” Phys. Rev. E 56, 2310–2313 (1997).
 Nightingale and Blöte (2000) M. P. Nightingale and H. W. J. Blöte, “Monte carlo computation of correlation times of independent relaxation modes at criticality,” Phys. Rev. B 62, 1089–1101 (2000).
 Murase and Ito (2008) Yohsuke Murase and Nobuyasu Ito, “Dynamic critical exponents of threedimensional ising models and twodimensional threestates potts models,” J. Phys. Soc. Jpn. 77, 014002 (2008).
 Liu et al. (2014) ChengWei Liu, Anatoli Polkovnikov, and Anders W. Sandvik, ‘‘Dynamic scaling at classical phase transitions approached through nonequilibrium quenching,” Phys. Rev. B 89, 054307 (2014).
 Swendsen and Wang (1987) R. H. Swendsen and J. S. Wang, “Nonuniversal critical dynamics in Monte Carlo simulations,” Phys. Rev. Lett. 58, 86 (1987).
 Wolff (1989) U. Wolff, “Collective Monte Carlo updating for spin systems,” Phys. Rev. Lett. 62, 361 (1989).
 Tamayo et al. (1990) P. Tamayo, R. C. Brower, and W. Klein, “Singlecluster monte carlo dynamics for the ising model,” Journal of Statistical Physics 58, 1083–1094 (1990).
 Coddington and Baillie (1992) Paul D. Coddington and Clive F. Baillie, “Empirical relations between static and dynamic exponents for ising model cluster algorithms,” Phys. Rev. Lett. 68, 962–965 (1992).
 Prokof’ev et al. (1998) N. V. Prokof’ev, B. V. Svistunov, and I. S. Tupitsyn, “Exact, complete, and universal continuoustime worldline Monte Carlo approach to the statistics of. discrete quantum systems,” Sov. Phys. JETP 87, 310 (1998).
 Syljuasen and Sandvik (2002) O. F. Syljuasen and A. W. Sandvik, “Quantum Monte Carlo with directed loops,” Phys. Rev. E 66, 046701 (2002).
 Prokof’ev and Svistunov (2001) Nikolay Prokof’ev and Boris Svistunov, “Worm algorithms for classical statistical models,” Phys. Rev. Lett. 87, 160601 (2001).
 Deng et al. (2007) Youjin Deng, Timothy M. Garoni, and Alan D. Sokal, “Dynamic critical behavior of the worm algorithm for the ising model,” Phys. Rev. Lett. 99, 110601 (2007).
 Liu et al. (2011) Qingquan Liu, Youjin Deng, and Timothy M. Garoni, “Worm monte carlo study of the honeycomblattice loop model,” Nucl. Phys. B 846, 283 – 315 (2011).
 Suwa and Todo (2010) Hidemaro Suwa and Synge Todo, “Markov chain Monte Carlo method without detailed balance,” Phys. Rev. Lett. 105, 120603 (2010).
 Suwa (2014) Hidemaro Suwa, Geometrically Constructed Markov Chain Monte Carlo Study of Quantum Spinphonon Complex Systems, Springer Theses (Springer Japan, 2014).
 Mercado et al. (2012) Ydalia Delgado Mercado, Hans Gerd Evertz, and Christof Gattringer, “Worm algorithms for the 3state potts model with magnetic field and chemical potential,” Comput. Phys. Commun. 183, 1920 – 1927 (2012).
 Janke et al. (2010) Wolfhard Janke, Thomas Neuhaus, and Adriaan M.J. Schakel, “Critical loop gases and the worm algorithm,” Nucl. Phys. B 829, 573 – 599 (2010).
 Shimada et al. (2014) H. Shimada, J. L. Jacobsen, and Y. Kamiya, “Phase diagram and strongcoupling fixed point in the disordered o() loop model,” J. Phys. A: Math. Theor. 47, 122001 (2014).
 Adams and Chandrasekharan (2003) David H. Adams and Shailesh Chandrasekharan, “Chiral limit of strongly coupled lattice gauge theories,” Nuclear Physics B 662, 220 – 246 (2003).
 Suwa and Todo (2015) Hidemaro Suwa and Synge Todo, “Generalized moment method for gap estimation and quantum Monte Carlo level spectroscopy,” Phys. Rev. Lett. 115, 080601 (2015).
 Hitchcock et al. (2004) Peter Hitchcock, Erik S. Sørensen, and Fabien Alet, ‘‘Dual geometric worm algorithm for twodimensional discrete classical lattice models,” Phys. Rev. E 70, 016702 (2004).
 Wang (2005) JianSheng Wang, “Worm algorithm for twodimensional spin glasses,” Phys. Rev. E 72, 036706 (2005).
 (27) G. Rakala and K. Damle, “Cluster algorithms for frustrated two dimensional Ising antiferromagnets via dual worm constructions,” arXiv:1612.00851 .
 Kramers and Wannier (1941) H. A. Kramers and G. H. Wannier, “Statistics of the twodimensional ferromagnet. part i,” Phys. Rev. 60, 252–262 (1941).
 Kogut (1979) John B. Kogut, “An introduction to lattice gauge theory and spin systems,” Rev. Mod. Phys. 51, 659–713 (1979).
 Fukui and Todo (2009) K. Fukui and S. Todo, “Order cluster Monte Carlo method for spin systems with longrange interactions,” J. Comp. Phys. 228, 2629 (2009).
 (31) Toshiki Horita, Hidemaro Suwa, and Synge Todo, “Upper and lower critical decay exponents of ising ferromagnets with longrange interaction,” arXiv:1605.09496 .
 Deng and Blöte (2003) Youjin Deng and Henk W. J. Blöte, “Simultaneous analysis of several models in the threedimensional ising universality class,” Phys. Rev. E 68, 036125 (2003).
 Robert and Casella (2004) Christian P. Robert and George Casella, Monte Carlo Statistical Methods, 2nd ed. (Springer, New York, 2004).
 Davison and Hinkley (1997) A.C. Davison and D.V. Hinkley, Bootstrap Methods and Their Application (Cambridge University Press, Cambridge, 1997).
 Sen et al. (2015) Arnab Sen, Hidemaro Suwa, and Anders W. Sandvik, “Velocity of excitations in ordered, disordered and critical antiferromagnets,” Phys. Rev. B 92, 195145 (2015).
 Berg (2004) Bernd A. Berg, Markov Chain Monte Carlo Simulations and Their Statistical Analysis (World Scientific Publishing, 2004).