Auxiliary-field based trial wave functions in quantum Monte Carlo calculations
Quantum Monte Carlo (QMC) algorithms have long relied on Jastrow factors to incorporate dynamic correlation into trial wave functions. While Jastrow-type wave functions have been widely employed in real-space algorithms, they have seen limited use in second-quantized QMC methods, particularly in projection methods that involve a stochastic evolution of the wave function in imaginary time. Here we propose a scheme for generating Jastrow-type correlated trial wave functions for auxiliary-field QMC methods. The method is based on decoupling the two-body Jastrow into one-body projectors coupled to auxiliary fields, which then operate on a single determinant to produce a multi-determinant trial wave function. We demonstrate that intelligent sampling of the most significant determinants in this expansion can produce compact trial wave functions that reduce errors in the calculated energies. Our technique may be readily generalized to accommodate a wide range of two-body Jastrow factors and applied to a variety of model and chemical systems.
The development of predictive quantum simulation methods is one of the foremost challenges in the fields of quantum chemistry and condensed matter physics. One step toward being able to accurately predict the properties of a variety of complex molecules and solids is to develop improved variational trial wave functionsToulouse et al. (2015); Needs et al. (2010) for projection quantum Monte Carlo methods, such as diffusion Monte Carlo (DMC).Toulouse et al. (2015); Needs et al. (2010); Umrigar et al. (1988); Umrigar and Filippi (2005); Foulkes et al. (2001) In these methods, the trial wave function serves not only as an importance function to drive the sampling of configurations, but also as a constraint used to suppress the development of the sign/phase problems. Accurate variational wave functions are therefore pivotal for guaranteeing convergence to the correct ground state energy with minimal bias and for improving the efficiency of simulations.Huang et al. (1997); Toulouse and Umrigar (2008); Petruzielo et al. (2012) This is especially true for strongly correlated systems, for which non-interacting or mean field trial wave functions are known to yield substantial statistical and systematic errors.Rubenstein et al. (2012) Developing more accurate variational wave functions is thus a crucial step toward being able to properly model many technologically important, yet theoretically challenging materials, such as high- superconductors, the lanthanides and actinides.
One route toward more accurate variational wave functions for larger, multidimensional systems has been to develop more sophisticated variational ansatzes, most of which explicitly include some amount of correlation. Such forms include antisymmetric geminal product (AGP),Casula and Sorella (2003); Casula et al. (2004); Neuscamman (2012, 2013a, 2013b, 2015) Bardeen-Cooper-Schrieffer (BCS),Guerrero et al. (2000); Carlson et al. (2011); Guerrero et al. (1999) Pfaffian,Bajdich et al. (2006, 2008) and matrix product state (MPS) wave functions.Wouters et al. (2014) All of these forms have a long history of being used in calculations performed at the variational level, but have assumed a more limited role in projector QMC calculations. A second path toward more accurate variational states is to create such wave functions by applying a physically-inspired projection operator onto a trial wave function. For years, the DMC community has generated trial wave functions using Jastrow factors containing one-, two-, and/or three-body terms that, among other things, provide a compact way of reinforcing cusp conditions.Huang et al. (1997); Toulouse et al. (2015); Needs et al. (2010); Foulkes et al. (2001); Clark et al. (2011); Morales et al. (2012); Clay and Morales (2015) These Slater-Jastrow wave functions and the advent of new techniques for variationally optimizing themUmrigar et al. (2007); Toulouse and Umrigar (2007) have greatly expanded the fidelity and reach of this method in recent years. Symmetry-projected wave functions have also been shown to recover substantial portions of the correlation energy at the variational level and to considerably reduce the statistical noise observed in auxiliary-field quantum Monte Carlo (AFQMC) calculations when used as trial wave functions.Shi et al. (2014); Shi and Zhang (2013); Rodríguez-Guzmán et al. (2013); Rodriguez-Guzman et al. (2012); Jimenez-Hoyos et al. (2013) Even more sophisticated projectors could be imagined, but key to unlocking their potential is the ability to apply and evaluate them in an efficient manner in the framework of AFQMC.
In this work, we propose a scheme to create strongly correlated variational/trial wave functions by exploiting the Hubbard-Stratonovich (HS) transformation, commonly used to decouple the Coulomb term in AFQMC simulations, to decouple two-body projection operators.Hirsch (1985) Based upon this scheme, we generate Slater-Jastrow trial states for use in second-quantized projector QMC methods, thus extending the benefits of the Jastrow wave function beyond the realm of first-quantized techniques. Within our method, the exact form of the Slater-Jastrow wave function yields a multi-determinant expansion whose size scales exponentially with the system size. We therefore explore a few techniques that allow us to generate representations that quickly converge to the exact variational energy using but a fraction of the total number of determinants. In this paper, we use the one-band Hubbard model and the Gutzwiller projector, the simplest form of a Jastrow factor, to demonstrate our methodology. Nevertheless, the method we propose is completely general and can be extended to more sophisticated wave function forms and systems.
ii.1 The Gutzwiller wave function
Here, denotes a single Slater determinant, such as the free-electron or Hartree-Fock wave function. is a variational parameter. The projector ,Gutzwiller (1963) which is the simplest Jastrow correlator, introduces correlations among electrons by suppressing doubly occupied configurations in . is a one-body operator often chosen to be the kinetic energy term of the Hamiltonian, and is also a variational parameter. It is shown that the projector enhances kinetic exchange, and can improve the variational energy of the Hubbard model.Otsuka (1992); Yanagisawa et al. (1998) For simplicity, we still refer to the state given by Eq. (1) as the Gutzwiller wave function.
The two-body nature of hinders the direct application of in QMC simulations. Nonetheless, using the discrete HS transformation,Hirsch (1985) the projector can be decoupled as follows
where , and is the auxiliary field on the -th site. The Gutzwiller wave function produced after decoupling may be viewed as a finite sum over determinants, each of which is a function of a discrete set of HS fields . We refer to this wave function (Eq. (2)) as the exact GWF (exGWF).
For a given system size and filling (where and represent the number of spin-up and spin-down electrons), we optimize the variational energy as a function of . We have verified that our optimized are consistent with those reported in Ref. Otsuka, 1992 for the half-filled Hubbard model in one and two dimensions.
ii.2 The Hubbard model and the Constrained-Path Monte Carlo algorithm
To showcase the GWF, we study the ground state of the one-band repulsive Hubbard model in two dimensions using the constrained-path Monte Carlo (CPMC) technique.Zhang et al. (1997); Zhang and Krakauer (2003) The system is defined by the Hamiltonian
The parameters and represent the hopping amplitude and on-site repulsion, respectively. () creates (destroys) an electron with spin at site , and is the number operator for a spin- electron.
The CPMC algorithm is an AFQMC method that works in the second-quantized framework. For a detailed discussion of the CPMC method and benchmark results, we refer readers to Ref. Zhang et al., 1995, 1997; Zhang and Krakauer, 2003. Here we note that CPMC eliminates the sign problem much as the fixed-node approximation does in DMC by rejecting random walkers that have negative overlaps with the trial wave function. We use as the unit of energy and set throughout this work.
Iii Results and Discussion
In order to demonstrate the benefits of using a Slater-Jastrow wave function, we first examine how the quality of the trial wave function affects the magnitude of the systematic Trotter factorization error. To do so, we compare the ground state energies obtained using the free electron (FE) and exGWF trial wave functions for various time steps for the half-filled 10-site 2D Hubbard model at under periodic boundary conditions. Although quantum Monte Carlo does not exhibit the sign problem at this filling, we deliberately apply the constrained-path approximationZhang et al. (1997) in the simulations so that we can gauge how the bias and errors that result from the constrained-path approximation vary with the quality of trial states. In both sets of calculations, we have utilized the second order Trotter break-up formula for the propagators. Fig. 1 compares the correction of the Trotter error obtained by extrapolating the CPMC energy to the limit . As illustrated in Fig. 1, the FE case has a strong dependence on , and the extrapolated energy is off by 6.2%. In contrast, the energies are not only more accurate, but have a much weaker dependence on when the exGWF is used Similar conclusions may be drawn from other simulation results (e.g. Fig. 6 in Appendix A).
Next, we compare the fully extrapolated CPMC ground state energy obtained using a FE trial state with that obtained using the exGWF. We again consider the half-filled 2D Hubbard model on the 10-site square lattice with periodic boundary conditions, and retain the constrained-path approximation in order to gauge the effects of the different trial wave functions.
The results of this comparison are listed in Table 1. The deviation between the exact and FE trial wave function results generally grows with , and can be as large as at . The exGWF data, on the other hand, are in excellent agreement with the exact energies regardless of . As manifest in Table 2, the same conclusion may be drawn for other half-filled systems. While Table 1 illustrates that the exGWF is an accurate trial wave function for the system examined, the exGWF quickly becomes computationally intractable because the number of determinants in the expansion in Eq. (2) scales exponentially with . In order to reduce the computational cost, we propose several compact representations of the exGWF and discuss their performance below.
Method I : Using Monte Carlo sampling, we construct a representation of the exGWF by randomly choosing determinants from the states of the full exGWF expansion. The wave functions constructed in this manner will be called randomly-sampled GWFs (rGWFs). Fig. 2 illustrates results obtained from using 30 independent rGWF samples for the half-filled 10-site 2D Hubbard at . In panels (a) and (b), each rGWF consists of 100 and 800 determinants respectively. The final CPMC energy is computed by averaging the 30 simulations in each case. Using 100-determinant rGWFs as trial states, the averaged CPMC ground state energy is about away from the exact result, which compares favorably against the deviation of the FE result. By increasing the sample size to 800 determinants, the deviation is reduced to . We note that the average overlap between the rGWF and exGWF is and for the 100- and 800-determinant respectively. The higher overlap explains the improvement seen in the 800-determinant data, as more terms are involved in the sampling process.
Although each rGWF contains a subset of terms from the exGWF, the comparisons demonstrate that the approach could still capture the essential physics of the exact Gutzwiller wave function. There are two factors that can affect the accuracy and computational cost of the rGWF: the number of determinants in a given sample, and the total number of independent rGWF samples. As the system size and interaction strength are increased, we expect the number of determinants needed to achieve the same level of accuracy as the exGWF to also increase for a fixed number of independent samples.
Method II : In the rGWF approach, because the determinants (and their corresponding HS field configurations) are selected randomly, one clear way of reducing the effort needed to sample rGWFs is to select the determinants more intelligently. This is precisely what motivates importance sampling in efficient Monte Carlo algorithms. To make progress, we proceed as follows. Let denote the determinants in the expansion Eq. (2). We construct a Hamiltonian matrix using the non-orthogonal determinants . After diagonalizing , we interpret the eigenvector of the lowest eigenvalue of the matrix as the weight of the determinant .
The inset of Fig. 3 shows one example of the distribution of the determinant weights for the kagome lattice doped with 4 holes at . Based on the information in , we construct our trial wave functions by linearly combining determinants with weights satisfying , where is a cutoff, and study their variational energy as a function of the number of determinants (hence ) retained. The empty (red) circles in the main panel of Fig. 3 depict the results for the doped 12-site kagome lattice system. As the curve indicates, the variational energy quickly converges with the number of “important”Â (i.e., large weight) determinants included in the wave function. For instance, in the main panel of Fig. 3, the vertical arrow indicates a state containing 252 determinants, corresponding to the cut-off . This state contains of the total determinants, and has a overlap with the exGWF. Its energy is of the exact GWF variational energy.
These results indicate that, as long as importance sampling is employed, it is possible to construct a trial wave function much reduced in size which still recovers a sizable fraction of the variational energy. The same trend carries over to CPMC calculations using these same trial wave functions. In fact, as shown in the main panel of Fig. 3, the CPMC energies converge faster than the variational energies do. For example, using the 252-determinant trial state indicated by the vertical arrow, the resulting CPMC result is away from the exact energy. Using the next exGWF representation (which contains 504 determinants), the deviation is further reduced to . Therefore, by properly sampling the most important determinants, one is able to create an accurate representation of the exact Gutzwiller wave function that is also computationally tractable.
Method III : In order to gain insight into the distribution of weights, we took a closer look at the determinants’ corresponding HS field configurations. Let 1 and 0 denote the field values and respectively. Drawing upon the half-filled 10-site square lattice case as an example, we make the following observations. Firstly, the field configurations of many of the important determinants are all permutations of the configuration , which has an equal number of and fields. Secondly, the field structures with nearly degenerate weights (as indicated by the “band”-like structure in the inset of Fig. 3) are related via translational symmetry. For instance, determinants generated from configurations and have almost identical weights while these two configurations are related via a translation in the -direction by one lattice constant, c.f. Fig. 5.
Based on this idea, we generate a trial wave function (denoted as sfGWF) using only the HS field configuration and its permutations. The resulting sfGWF has determinants and a surprisingly large overlap with the exGWF. For the same system, we observe the same behavior at other interaction strengths: sfGWFs constructed in this manner ubiquitously have almost unity overlap with the corresponding exGWF.
Encouraged by these observations at half-filling, we subsequently considered the kagome lattice doped with 4 holes. This is a closed-shell filling under periodic boundary conditions. The coefficients from diagonalizing the matrix indicate that highly-weighted determinants have two degenerate HS field configurations:
where is the number of holes. Because these configurations are degenerate, we adopt one of them to construct the sfGWF trial wave functions for our doped system.
Benchmark results of the sfGWFs for the Hubbard model are depicted in Fig. 4. FE trial wave function data are also included for comparison. More detailed data can be found in Appendix A. In the case of the half-filled 10-site square lattice and the doped 12-site kagome lattice, the non-interacting ground state is closed-shell under periodic boundary conditions. On the square lattice, we have implemented twist boundary conditions in order to have closed-shell free-electron states at the fillings considered. In all cases, the CPMC ground state energy is improved when sfGWF is adopted as the trial wave function. This is particularly true for the 10-site square and 12-site kagome lattice systems where the sfGWF results are almost exact. On the square lattice, the deviation in the sfGWF result is typically less than from the exact energy for half-filling. In the doped case, the error only becomes smaller at large couplings.
Before we close the discussion, we would like to make a few remarks regarding the three methods presented in this section. Because the exGWF expansion scales exponential with system size, reducing the computational cost of using Slater-Jastrow wave functions like the Gutzwiller wave function discussed here is essential to making these wave functions practically useful. The “importance sampling” scheme (i.e., Method II) appears to be the most efficient approach, at least for the clusters tested. To converge the CPMC energy, the number of dominant determinants required is only a fraction of the total number of terms . Obviously, the diagonalization technique is only suitable for small clusters. A full variational approach will be required for large simulation cells and realistic Hamiltonians in quantum chemistry. This idea will be explored in a future publication.
The computational cost of the proposed sfGWF approach scales as ( being the total number of electrons), which compares favorably with the scaling of the exGWF, but nevertheless is substantial. The cost may be further reduced if the symmetry among degenerate fields is exploited. We also note that the HS fields generated in the sfGWF do not exhaust all the highly-weighted determinants. This may be responsible for the slightly larger deviation (comparing to exGWF data) observed in Table 2 for systems simulated with periodic boundary conditions.
Lastly, the comparisons in Fig. 4 and Table 2 indicate that the best agreement between the sfGWF and exact energies is achieved for closed-shell systems with PBCs. For the lattice cases, while twist boundary conditions allow the free-electron state at any filling to be unique (i.e., closed-shell), they nevertheless break the rotational symmetry of the lattice. We speculate that the effective magnetic flux resulting from twist boundary conditions may be responsible for the relatively large deviations in the sfGWF results because the decomposition Eq. (2) breaks the spin symmetry. This speculation is partly supported by the fact that, for the lattice system, the variational energy is at least higher than the exact energy and can be as large as (half-filled, ). In contrast, the variational energy of the sfGWF is quite accurate for the 10-site square and 12-site kagome lattice cases, with the smallest deviation being only (4-hole doped 12-site kagome at , c.f., Table 2 in the Appendix). In addition to the “spin” decomposition scheme Eq. (2), one could also employ the charge decomposition which preserves the spin rotation symmetry.Hirsch (1983) This issue of different HS decomposition techniques will be explored in subsequent works.
Using the Gutzwiller projected wave function as an example, we have illustrated an auxiliary-field based scheme for generating Slater-Jastrow trial wave functions for second-quantized AFQMC simulations. We have shown that, by intelligently sampling multi-determinant representations of these wave functions, we can produce trial wave functions that recover substantial amounts of both the variational and correlation energies. These wave functions decrease CPMC errors when compared with those produced by traditional AFQMC techniques that rely on single determinant trial wave functions. Although the HS field structure is unique to the discrete HS transformation adopted in this work, the results presented shed light on how to develop a more efficient sampling scheme for more general Jastrow-type wave functions, paving the way toward more accurate AFQMC simulations of not only strongly-correlated model systems, but of molecules and solid-state materials as well. This work was performed under the auspices of the U.S. Department of Energy by Lawrence Livermore National Laboratory under Contract DE-AC52-07NA27344, 15-ERD-013. The authors would like to thank Hao Shi for providing exact diagonalization results for the square lattice.
Appendix A Benchmarking the sfGWF
- Toulouse et al. (2015) J. Toulouse, R. Assaraf, and C. Umrigar, arXiv:1508.02989v1 (2015).
- Needs et al. (2010) R. Needs, M. Towler, N. Drummond, and P. Lopez Rios, J. Phys.: Condens. Matter 22, 023201 (2010).
- Umrigar et al. (1988) C. Umrigar, K. Wilson, and J. Wilkins, Phys. Rev. Lett. 60, 1719 (1988).
- Umrigar and Filippi (2005) C. Umrigar and C. Filippi, Phys. Rev. Lett. 94, 150201 (2005).
- Foulkes et al. (2001) W. M. C. Foulkes, L. Mitas, R. J. Needs, and G. Rajagopal, Rev. Mod. Phys. 73, 33 (2001).
- Huang et al. (1997) C.-J. Huang, C. Umrigar, and M. Nightingale, J. Chem. Phys. 107, 3007 (1997).
- Toulouse and Umrigar (2008) J. Toulouse and C. Umrigar, J. Chem. Phys. 128, 174101 (2008).
- Petruzielo et al. (2012) F. R. Petruzielo, J. Toulouse, and C. J. Umrigar, J. Chem. Phys. 136, 124116 (2012).
- Rubenstein et al. (2012) B. M. Rubenstein, S. W. Zhang, and D. R. Reichman, Phys. Rev. A 86, 053606 (2012).
- Casula and Sorella (2003) M. Casula and S. Sorella, J. Chem. Phys. 119, 6500 (2003).
- Casula et al. (2004) M. Casula, C. Attaccalite, and S. Sorella, J. Chem. Phys. 121, 7110 (2004).
- Neuscamman (2012) E. Neuscamman, Phys. Rev. Lett. 109, 203001 (2012).
- Neuscamman (2013a) E. Neuscamman, J. Chem. Phys. 139, 194105 (2013a).
- Neuscamman (2013b) E. Neuscamman, J. Chem. Phys. 139, 181101 (2013b).
- Neuscamman (2015) E. Neuscamman, Mol. Phys. 114, 577 (2015).
- Guerrero et al. (2000) M. Guerrero, G. Ortiz, and J. E. Gubernatis, Comp. Phys. Comm. 127, 143 (2000).
- Carlson et al. (2011) J. Carlson, S. Gandolfi, K. E. Schmidt, and S. Zhang, Phys. Rev. A 84, 061602 (2011).
- Guerrero et al. (1999) M. Guerrero, G. Ortiz, and J. E. Gubernatis, Phys. Rev. B 59, 1706 (1999).
- Bajdich et al. (2006) M. Bajdich, L. Mitas, G. Drobny, L. K. Wagner, and K. Schmidt, Phys. Rev. Lett. 96, 130201 (2006).
- Bajdich et al. (2008) M. Bajdich, L. Mitas, L. K. Wagner, and K. Schmidt, Phys. Rev. B 77, 115112 (2008).
- Wouters et al. (2014) S. Wouters, B. Verstichel, D. Van Neck, and G. K.-L. Chan, Phys. Rev. B 90, 045104 (2014).
- Clark et al. (2011) B. K. Clark, M. Morales, J. McMinis, J. Kim, and G. E. Scuseria, J. Chem. Phys. 135, 244105 (2011).
- Morales et al. (2012) M. Morales, J. McMinis, B. K. Clark, J. Kim, and G. E. Scuseria, J. Chem. Theory Comput. 8, 2181 (2012).
- Clay and Morales (2015) R. C. Clay and M. A. Morales, J. Chem. Phys. 142, 234103 (2015).
- Umrigar et al. (2007) C. J. Umrigar, J. Toulouse, C. Filippi, S. Sorella, and R. Hennig, Phys. Rev. Lett. 98, 110201 (2007).
- Toulouse and Umrigar (2007) J. Toulouse and C. J. Umrigar, J. Chem. Phys. 126, 084102 (2007).
- Shi et al. (2014) H. Shi, C. A. Jiménez-Hoyos, R. Rodríguez-Guzmán, G. E. Scuseria, and S. Zhang, Phys. Rev. B 89, 125129 (2014).
- Shi and Zhang (2013) H. Shi and S. Zhang, Phys. Rev. B 88, 125132 (2013).
- Rodríguez-Guzmán et al. (2013) R. Rodríguez-Guzmán, C. A. Jiménez-Hoyos, R. Schutski, and G. E. Scuseria, Phys. Rev. B 87, 235129 (2013).
- Rodriguez-Guzman et al. (2012) R. Rodriguez-Guzman, K. W. Schmid, C. A. Jimenez-Hoyos, and G. E. Scuseria, Phys. Rev. B 85, 245130 (2012).
- Jimenez-Hoyos et al. (2013) C. A. Jimenez-Hoyos, R. Rodriguez-Guzman, and G. E. Scuseria, J. Chem. Phys. 139, 204102 (2013).
- Hirsch (1985) J. E. Hirsch, Phys. Rev. B 31, 4403 (1985).
- Otsuka (1992) H. Otsuka, J. Phys. Soc. Jpn. 61, 1645 (1992).
- Yanagisawa et al. (1998) T. Yanagisawa, S. Koike, and K. Yamaji, J. Phys. Soc. Jpn. 67, 3867 (1998).
- Gutzwiller (1963) M. C. Gutzwiller, Phys. Rev. Lett. 10, 159 (1963).
- Zhang et al. (1997) S. Zhang et al., Phys. Rev. B 55, 7464 (1997).
- Zhang and Krakauer (2003) S. Zhang and H. Krakauer, Phys. Rev. Lett. 90, 136401 (2003).
- Zhang et al. (1995) S. Zhang et al., Phys. Rev. Lett. 74, 3652 (1995).
- Hirsch (1983) J. E. Hirsch, Phys. Rev. B 28, 4059 (1983).
- Carlson et al. (1999) J. Carlson, J. E. Gubernatis, G. Ortiz, and S. Zhang, Phys. Rev. B 59, 12788 (1999).