Measuring free energy in parallel tempering Monte Carlo
Abstract
An efficient approach of measuring the absolute free energy in parallel tempering Monte Carlo using the exponential averaging method is discussed and the results are compared with those of population annealing Monte Carlo using the threedimensional EdwardsAnderson Ising spin glass model as benchmark tests. Numerical results show that parallel tempering, even though uses a much less number of temperatures than population annealing, can nevertheless equally efficiently measure the absolute free energy by simulating each temperature for longer times.
pacs:
75.50.Lk, 75.40.Mg, 05.50.+q, 64.60.iI Introduction
Measuring free energy is difficult in Monte Carlo simulations, but nevertheless can be very helpful if available. More phenomena and finite size effects can be analyzed using free energy. This is not only for the sake of free energy, but also as a consequence entropy can be obtained indirectly since energy can be readily measured in Monte Carlo simulations. The Bennett acceptance ratio method Bennett (1976) is mainly used to compute the free energy difference of two different systems but with the same state space at the same temperature. To access free energy differences of the same system at two different temperatures, exponential averaging Zwanzig (1954) and thermodynamic integration Thomas et al. (2011); Schilling and Schmid (2009) are often used. Thermodynamic integration has the drawback of integration errors and therefore needs data at many temperatures to be accurate. When data is available at only few temperatures interpolation techniques are often needed. This is especially a problem when the specific heat shows complicated multipeak structures like in spin glass Wang et al. ((2015) and protein folding Trebst and Hansmann (2007) problems due to various reasons like phase transitions and temperature chaos. The exponential averaging method however doesn’t have this problem and therefore is still beneficial and suitable for such models.
For systems with complicated free energy landscapes like spin glasses, a simple Monte Carlo method is not sufficient. Two algorithms that both have a hierarchical structure are shown to be efficient: parallel tempering Swendsen and Wang (1986); Geyer (1991); Hukushima and Nemoto (1996) and population annealing Hukushima and Iba (2003); Machta (2010). It has been known for more than a decade that population annealing can accurately estimate the absolute free energy Machta (2010) using the exponential averaging method. But slightly negative view has been held for parallel tempering due to mainly two reasons: the first is that there are too few temperatures, therefore the measurement would not be very accurate while the other is that the highest temperature is usually finite for parallel tempering while in population annealing this can be naturally chosen as infinity. The second problem can be trivially solved by adding a high temperature stage before the parallel tempering stage using simple single temperature Monte Carlo since thermal equilibration is fast at high temperatures. For the first problem, here we show numerically that the larger temperature steps in parallel tempering compared with population annealing are compensated by simulating each temperature for longer times, therefore, permitting equally accurate measurements of free energy in parallel tempering. Parallel tempering however does suffer some minor technical problems, but nevertheless can be readily solved to get access to free energy.
The work is organized as follows: Section II introduces the EdwardsAnderson model, the free energy estimator using the exponential averaging method and the twostage free energy measurement algorithm in parallel tempering. The results of parallel tempering and population annealing are compared in Sec. III and the conclusions are stated in Sec. IV.
Ii Models and Methods
ii.1 The EdwardsAnderson model
The EdwardsAnderson (EA) Hamiltonian is defined as
(1) 
where are the spin degrees of freedom defined on a threedimensional cubic lattice with . The sum over means sum over the nearest neighbour sites. is the coupling between spin and and is independently chosen from the standard Gaussian distribution with mean zero and variance one. I will refer to each disorder realization as a sample. Periodic boundary conditions are used in the simulations of this work.
ii.2 The free energy estimator
It has been known that population annealing (PA) can naturally estimate the absolute free energy of the EA model Machta (2010); Machta and Ellis (2011). The method used, exponential averaging, is nevertheless general. The ratio of and , which are the partition functions at inverse temperatures and respectively are
(2)  
(3)  
(4)  
(5)  
(6) 
where the sum over is the sum over all states and the sum over is the sum over measured equilibrium states in a Monte Carlo run. is the energy of a microstate, the number of measurements and means a thermal average. Take the natural logarithm of both sides
(7)  
(8) 
where is the free energy. Note that by sampling equilibrium states at a temperature, one can integrate the free energy between the temperature and a nearby temperature, usually chosen as a lower one. The initial condition of is known at theoretically. At , , where is the total number of microstates and with the total number of spins. If we order a set of temperatures as and , then we have
(9) 
ii.3 Measuring free energy in parallel tempering
Since parallel tempering (PT) doesn’t usually work from , we need to modify the parallel tempering algorithm somewhat to get the absolute free energy. One simple way is to implement the simulation in two stages with a high temperature stage using simple single temperature Monte Carlo and a regular parallel tempering Monte Carlo stage. The two stages together will cover from to a low temperature of interest.

Simple Monte Carlo stage
Suppose the regular parallel tempering Monte Carlo works between and , the minimum and maximum temperatures, respectively. The first stage is to use simple single temperature Monte Carlo to simulate the system and work from to , not including . Then the free energy can be integrated from to , including . 
Parallel tempering Monte Carlo stage
Simulate the system using the regular parallel tempering Monte Carlo between and . Then the free energy can be further integrated to the low temperature spin glass phase.
In my implementation, I used temperatures evenly distributed in for the first stage while used temperatures evenly distributed in for the second stage. A uniform distribution in for the first stage is easier to work with because of the infinite temperature. For the second stage, a uniform distribution in can yield better performance of parallel tempering since the swap probabilities are more constant at different temperatures. Also, no Monte Carlo sweeps were done at , it is sufficient to generate random states and make measurements like that of population annealing. In this work, the amount of work is counted in terms of Monte Carlo sweeps and one Monte Carlo sweep is a sequential update of all the spins for one replica at one temperature once. The same number of sweeps were done for both the thermal equilibration run and the data collection run.
It is important to stress here that the only difference of this algorithm compared with the regular parallel tempering algorithm is the additional simple Monte Carlo stage. Since the free energy landscape is not rough at high temperatures, the autocorrelation time in sweeps therefore doesn’t depend on the energy landscape and system size. It is not necessary to do as many sweeps as in the second stage, where the free energy landscape is rough. Therefore when the algorithm is well optimized, most of the work would be spent in the second stage and the overhead would be small when adding the first stage. In fact, the work spend in the first stage compared with the second stage should vanish in the thermodynamic limit. In addition, due to the similarity of this algorithm with the regular parallel tempering algorithm, it is straightforward to modify a regular parallel tempering code to collect the free energy data.
Finally, I want to point out that there is a technical issue of overflow when computing for large systems if is small or there are too many data collection steps for parallel tempering. The later problem can be easily solved since data collection after each Monte Carlo sweep is neither necessary nor desired. For the former problem, one can subtract a low energy from the energy of a state when computing the exponential term and then add it back when integrating . Another method is averaging on the fly. One can also use reasonably more temperatures, but this requires more computational work and may also increase the round trip time. Note that this problem could appear as well for population annealing. However, since population annealing naturally uses a lot more temperatures and the temperatures are usually evenly distributed in , no such problem was encountered in our studies of the population annealing algorithm at least up to size down to Wang et al. ((2015).
Iii Results
The free energy of each sample was measured using both the parallel tempering and population annealing algorithms from infinite temperature to low temperatures deep in the spin glass phase. The parallel tempering results are compared with the production run of population annealing Wang et al. (2014). The reference simulation parameters of population annealing are summarized in Table 1 while the simulation parameters of parallel tempering are summarized in Table 2. Since the free energy results of the two algorithms agree to a very high degree of precision, I have therefore studied a random subset of samples of Wang et al. (2014), 100 samples per system size. This is sufficient to show the equally efficiency of parallel tempering in measuring free energy compared with population annealing. In the following, I will first make a detailed comparison for a single hard sample and then a large scale comparison for the two algorithms.
10  
20  
30  
40 
iii.1 Detailed comparison of a single hard sample
In this section, I will do a detailed comparison of the efficiency of parallel tempering and population annealing in measuring the free energy using the hardest sample out of about 5000 samples of Wang et al. (2014). I will first focus on the dimensionless quantity of at a low temperature and study how the mean and the errorbar of the systematic error of change as a function of the amount of work . Then I will study how the relative errors of the estimated free energy evolves as a function of temperature at a fixed amount of work. The amount of work is counted as the total number of sweeps performed in the simulation.
The systematic error is defined as the differences of the estimated and the exact i.e. . The exact is estimated using the reference runs of population annealing Wang et al. (2014). The population annealing comparison data with different amount of work is taken from a previous study of finding ground states Wang et al. ((2014). The work is varied by changing the number of replicas while holding and constant. The parallel tempering data was done using the parameters of Tabel 2 but varying . The result is shown in Fig. 1. One thing we can learn from this study is that the accuracy is about the same considering neither algorithms is very carefully optimized. Furthermore, both algorithms underestimate the value of when the amount of work is too small i.e. the sample is not in thermal equilibrium. This can be understood as those runs are so small so that some low energy states were probably not properly found so that was underestimated, and therefore also .
It is also important and interesting to study how the relative error of the estimated free energy behaves as temperature is lowered due to the integration nature of the method. The relative error is defined as minus the ratio of the standard deviation and mean of the estimated free energy. The minus sign is to ensure the defined relative error is positive. The relative error is measured via multiple runs in practice. Figure 2 shows this quantity as a function of inverse temperature for both PA and PT for the largest run of Fig. 1. Note that there is no trend that the relative error grows as temperature is lowered, showing the effectiveness of both algorithms. In addition, the magnitude of the relative errors is also about the same.
iii.2 A large scale comparison
Before showing the large scale comparison, I would like to show a comparison of the result of for a typical sample of each system size in the whole range of temperatures studied. A typical result is shown in Fig. 3. Note that the parallel tempering data falls right on top of the population annealing curve, showing the effectiveness of parallel tempering in measuring free energy in a wide range of temperatures.
To make a comparison of more samples, a scatter plot of the free energy per spin at the lowest simulation temperature of PA and PT is shown in Fig. 4. Note that the statistical error compared with the absolute value is too small to be seen in this plot, it is therefore interesting to look at the relative error of the free energy per spin of parallel tempering against population annealing. The relative error is shown in Fig. 5. The errors are well bounded within the accuracy of about and are well scattered around zero suggesting the nature of the errors is essentially statistical and again showing PT is as efficient as PA in accurately measuring the absolute free energy.
Iv Conclusion
In this paper, I showed an efficient approach to measure the absolute free energy in parallel tempering using the exponential averaging method. The algorithm is tested using the threedimensional EA model and the results of parallel tempering agree very well with those of population annealing, showing parallel tempering can equally efficiently measure free energy as population annealing. There are technical issues with parallel tempering but nevertheless can be readily solved. The fact that the reweighting technique works so well suggests that many quantities including energy and free energy can be accurately estimated using the reweighting technique at many temperatures with little overhead in parallel tempering without using interpolation techniques. Finally, I hope that this simple and direct access to free energy in parallel tempering can make the spin glass and other relevant research fields potentially richer.
Acknowledgments
I gratefully acknowledge support from NSF (Grant No. DMR1208046). I thank Helmut Katzgraber for kindly allowing me to use part of the reference population annealing data. I thanks Jon Machta and Helmut Katzgraber for helpful discussions and suggestions, and also for the careful reading of the manuscript.
References
 C. H. Bennett, Journal of Computational Physics 22, 245 (1976).
 R. W. Zwanzig, Journal of Chemical Physics 22 (1954).
 C. K. Thomas, D. A.Huse, and A. Middleton, Phys. Rev. Lett. 107, 047203 (2011).
 T. Schilling and F. Schmid, The Journal of Chemical Physics 131, 231102 (2009).
 W. Wang, J. Machta, and H. G. Katzgraber ((2015), in preparationa).
 S. Trebst and U. Hansmann, The European Physical Journal E 24, 311 (2007).
 R. H. Swendsen and J.S. Wang, Phys. Rev. Lett. 57, 2607 (1986).
 C. Geyer, in Computing Science and Statistics: 23rd Symposium on the Interface, edited by E. M. Keramidas (Interface Foundation, Fairfax Station, 1991), p. 156.
 K. Hukushima and K. Nemoto, J. Phys. Soc. Jpn. 65, 1604 (1996).
 K. Hukushima and Y. Iba, in The Monte Carlo method in the physical sciences: celebrating the 50th anniversary of the Metropolis algorithm, edited by J. E. Gubernatis (AIP, 2003), vol. 690, pp. 200–206.
 J. Machta, Phys. Rev. E 82, 026704 (2010).
 J. Machta and R. Ellis, J. Stat. Phys. 144, 541 (2011).
 W. Wang, J. Machta, and H. G. Katzgraber ((2015), in preparationb).
 W. Wang, J. Machta, and H. G. Katzgraber, Phys. Rev. B 90, 184412 (2014).
 W. Wang, J. Machta, and H. G. Katzgraber, Phys. Rev. E ((2014), submitted).