# Toward an improved control of the fixed-node error in quantum Monte Carlo: The case of the water molecule

###### Abstract

All-electron Fixed-node Diffusion Monte Carlo (FN-DMC) calculations for the nonrelativistic ground-state energy of the water molecule at equilibrium geometry are presented. The determinantal part of the trial wavefunction is obtained from a perturbatively selected Configuration Interaction calculation (CIPSI method) including up to about 1.4 million determinants. Calculations are made using the cc-pCV$n$Z family of basis sets, with $n=2$ to 5. In contrast with most QMC works, no re-optimization of the determinantal part in the presence of a Jastrow factor is performed. For the largest cc-pCV5Z basis set, the lowest upper bound for the ground-state energy reported so far, -76.43744(18), is obtained. The fixed-node energy is found to decrease regularly as a function of the cardinal number $n$, and the Complete Basis Set (CBS) limit associated with exact nodes is easily extracted. The resulting energy of -76.43894(12), in perfect agreement with the best experimentally derived value, is the most accurate theoretical estimate reported so far. We emphasize that employing selected CI nodes of increasing quality in a given family of basis sets may represent a simple, deterministic, reproducible, and systematic way of controlling the fixed-node error in DMC.

The only uncontrolled source of error (see note (a) in the references) in quantum Monte Carlo (QMC) methods is the fixed-node approximation, introduced to suppress the wild fluctuations of the sign of the wavefunction (fermion sign problem). Although the fixed-node error is small (typically, a few percent of the correlation energy), and many fixed-node QMC calculations of impressive accuracy have been carried out, the error can still be too large in some applications, particularly in the important case of the computation of (very) small energy differences.

A major challenge for QMC is thus to devise a strategy for constructing trial wavefunctions with “good” nodes and, even more importantly, to propose a systematic way of improving such nodes. In practice, a standard strategy consists in introducing trial wavefunctions of the best possible quality and then optimizing their parameters in a preliminary Variational Monte Carlo (VMC) step through minimization of the variational energy or its variance [Umrigar et al. (2007)]. Many functional forms for the trial function have been explored in the literature, the most popular one being the Jastrow-Slater form [Schmidt and Moskowitz (1990)]

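$$\Psi_T(\mathbf{R}) = e^{J(\mathbf{R})} \sum_k c_k\, D_k^{\uparrow}(\mathbf{R}_{\uparrow})\, D_k^{\downarrow}(\mathbf{R}_{\downarrow}) \qquad (1)$$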

combining a Jastrow prefactor containing explicit electronic correlations and a short multideterminant expansion (typically, a few thousand determinants) describing the multireference character of the wavefunction (static correlation effects).
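Because the Jastrow prefactor is a strictly positive exponential, the nodes of a Jastrow-Slater function are those of its determinantal part alone. The following minimal sketch illustrates this on a toy model that is not taken from this work: two same-spin electrons in one dimension, two hypothetical model orbitals, and a Padé-type pair term with arbitrary parameters `a` and `b`.

```python
import numpy as np

def slater_2e(x1, x2):
    """Antisymmetric 2x2 determinant built from two 1D model orbitals."""
    phi0 = lambda x: np.exp(-0.5 * x**2)        # nodeless model orbital
    phi1 = lambda x: x * np.exp(-0.5 * x**2)    # model orbital with one node
    return phi0(x1) * phi1(x2) - phi0(x2) * phi1(x1)

def jastrow(x1, x2, a=0.5, b=1.0):
    """Pade-type pair term J = a*r12 / (1 + b*r12); a and b are illustrative."""
    r12 = np.abs(x1 - x2)
    return a * r12 / (1.0 + b * r12)

def psi_trial(x1, x2):
    """Jastrow-Slater product: exp(J) times the determinantal part."""
    return np.exp(jastrow(x1, x2)) * slater_2e(x1, x2)

# Compare the sign of the full trial function with that of the determinant
# at randomly drawn configurations.
rng = np.random.default_rng(0)
x1, x2 = rng.normal(size=(2, 1000))
same_sign = np.sign(psi_trial(x1, x2)) == np.sign(slater_2e(x1, x2))
```

In this toy model the sign of the trial function follows the determinant at every sampled configuration, which is why the fixed-node energy depends only on the determinantal part.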

Very recently some of us have proposed to keep the standard Jastrow-Slater form for the trial wavefunction but to rely on the more conventional Configuration Interaction (CI) expansions of quantum chemistry for the multideterminant part. No stochastic re-optimization of the CI expansion is performed, so that “pure CI” nodes are employed [Giner et al. (2013); Scemama et al. (2014); Giner et al. (2015)]. The rationale behind this proposal is to seek better control of the fixed-node error by exploiting the unique properties of CI wavefunctions. Indeed, CI approaches provide a simple, deterministic, and systematic way to build wavefunctions of controllable quality. In a given one-particle basis set, the wavefunction is improved by increasing the number of determinants, up to the Full CI (FCI) limit. Then, by enlarging the basis set, the wavefunction can be further improved, up to the complete basis set (CBS) limit where the exact solution of the electronic Schrödinger equation is reached. CI nodes, defined as the zeroes of the CI expansions, are also expected to display such a systematic improvement. The main difficulty is of course the exponential growth of the space of determinants with respect to the number of electrons and orbitals. However, this growth can be dramatically attenuated by considering Selected CI (SCI) approaches designed to keep only the most important determinants. In practice, we have proposed to make use of the CIPSI method (Configuration Interaction using a Perturbative Selection done Iteratively) [Huron et al. (1973); Evangelisti et al. (1983)], one of the numerous variants of SCI proposed in the literature (see, e.g., Bender and Davidson (1969); Huron et al. (1973); Buenker and Peyerimhoff (1974, 1975); Buenker et al. (1978); Bruna et al. (1980); Buenker et al. (1981); Evangelisti et al. (1983); Harrison (1991)).

In this approach the multideterminant expansion is built iteratively by selecting determinants according to the importance of their second-order perturbative contribution to the total energy. As illustrated by a number of applications, CIPSI represents a very efficient way of approaching the FCI limit using only a tiny fraction of the total FCI space (see, for example, a recent all-electron FCI-converged CIPSI calculation for CuCl involving 25 electrons and 36 active orbitals for a FCI space including 10 determinants [Caffarel et al. (2014)]). This remarkable result is actually common to all variants of SCI approaches, including the FCI-QMC approach of Alavi et al. [Booth et al. (2009); Cleland et al. (2010)], which can be considered as a stochastic version of SCI. In practice, the main difficulty in using lengthy multideterminant expansions in QMC is the high cost of evaluating, at each Monte Carlo step, the first and second derivatives of the trial wavefunction (drift vector and local energy). However, efficient algorithms have been proposed to perform such calculations [Nukala and Kent (2009); Clark et al. (2011); Weerasinghe et al. (2014)]. Here, we shall use our recently introduced algorithm, which allows converged DMC calculations with multideterminant expansions including up to a few million determinants for a system like the water molecule [Scemama et al. (2016)].

A remarkable property systematically observed so far in our first DMC applications using large CIPSI expansions [Giner et al. (2013); Scemama et al. (2014); Giner et al. (2015)] is that, except for a possible transient regime at small numbers of determinants (see note (b) in the references), the fixed-node error associated with CIPSI nodes decreases monotonically, both as a function of the number of determinants and of the basis set size, which opens the possibility of controlling the fixed-node error. Such a result is known not to be systematically true for a general CI expansion (see, e.g., Flad et al. (1997)). However, its validity here could be attributed to the fact that determinants are selected in a hierarchical way (the most important ones first), so that the quality of the wavefunction, and with it the quality of the nodes, increases step by step.

In this Communication, all-electron DMC/CIPSI calculations for the water molecule at equilibrium geometry are presented, using the cc-pCV$n$Z family of basis sets with $n$ ranging from 2 to 5 and large multideterminant expansions including up to 1 423 377 determinants. The lowest (upper bound) fixed-node energy reported so far, -76.43744(18), is obtained. Extrapolating the fixed-node energies to the Complete Basis Set (CBS) limit as a function of the cardinal number of the basis set yields a value of -76.43894(12) for the total energy associated with exact nodes, in full agreement with the best known estimate of -76.4389 [Klopper (2001)].

CIPSI expansion. The multideterminant CIPSI expansion is built by iteratively selecting the most important determinants of the FCI expansion. In short (for more details, see Giner et al. (2013)), at iteration $n$ the multideterminant expansion is written as the sum of the previously selected determinants (thus defining the reference space $S_n$ at this iteration)

$$|\Psi_0^{(n)}\rangle = \sum_{i \in S_n} c_i^{(n)}\, |D_i\rangle \qquad (2)$$

with energy $E_0^{(n)}$. Then, one determinant $|D_{i_c}\rangle$ (or a group of determinants) not belonging to the reference space and corresponding to the greatest second-order energy change (or close to it within some threshold),

$$\delta e(|D_{i_c}\rangle) = -\frac{|\langle \Psi_0^{(n)}|H|D_{i_c}\rangle|^2}{\langle D_{i_c}|H|D_{i_c}\rangle - E_0^{(n)}} \qquad (3)$$

is (are) selected and added to the reference space. At iteration $n+1$ the new expansion and energy are obtained by diagonalizing the Hamiltonian matrix within the new set of selected determinants. The iterative process is started with the Hartree-Fock determinant or a short expansion and is stopped when a target number of determinants is reached. In what follows the variational energy associated with the final CI expansion will be denoted as $E_{\rm var}$.
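The selection loop described above can be sketched on a small dense model Hamiltonian. This is a toy illustration under stated assumptions, not the QUANTUM PACKAGE implementation: the function name `cipsi_toy`, the random model matrix, and the choice of adding exactly one determinant per iteration are all hypothetical, and the second-order energy change is taken in its usual perturbative form (squared coupling to the current wavefunction divided by the energy gap).

```python
import numpy as np

def cipsi_toy(H, n_target, start=0):
    """Grow a selected-CI space one determinant at a time.

    H        : dense symmetric Hamiltonian matrix in a determinant basis
    n_target : number of determinants at which to stop
    start    : index of the starting determinant (stand-in for Hartree-Fock)
    Returns the selected indices and the variational energy at each iteration.
    """
    selected = [start]
    energies = []
    while True:
        # Diagonalize H within the current reference space.
        sub = H[np.ix_(selected, selected)]
        evals, evecs = np.linalg.eigh(sub)
        e0, c0 = evals[0], evecs[:, 0]
        energies.append(e0)
        if len(selected) >= n_target:
            break
        # Second-order energy change for every external determinant.
        external = [i for i in range(H.shape[0]) if i not in selected]
        coupling = H[np.ix_(external, selected)] @ c0   # <D_i|H|Psi_0>
        denom = H[external, external] - e0              # <D_i|H|D_i> - E_0
        delta_e = -coupling**2 / denom
        # Add the determinant giving the largest energy lowering.
        selected.append(external[int(np.argmin(delta_e))])
    return selected, np.array(energies)

# Model Hamiltonian (assumption): increasing diagonal, weak random couplings,
# so that determinant 0 plays the role of the Hartree-Fock determinant.
rng = np.random.default_rng(1)
n = 30
A = 0.1 * rng.normal(size=(n, n))
H = (A + A.T) / 2
H[np.diag_indices(n)] = 0.5 * np.arange(n)

selected, energies = cipsi_toy(H, n_target=10)
```

Since each reference space contains the previous one, the variational energy can only decrease from one iteration to the next, and it reaches the lowest eigenvalue of `H` (the FCI energy of the model) once every determinant has been selected.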

Water molecule. In this study we present benchmark calculations for the non-relativistic ground-state energy of the water molecule at equilibrium geometry, and .

CIPSI results. All configuration interaction calculations have been carried out using our perturbatively selected CI program QUANTUM PACKAGE (downloadable at Scemama et al. (2015)). Standard Dunning-type correlation-consistent polarized core-valence basis sets cc-pCV$n$Z, with $n$ going from 2 to 5, are employed. CIPSI calculations have been performed using natural orbitals obtained by diagonalizing the one-body density matrix of a preliminary CIPSI run. For each basis set, the selected CI expansion has been stopped at one million determinants, except for the largest cc-pCV5Z basis set, for which two million determinants were considered. Results are presented in Table 1 and compared to the recent benchmark CI calculations of Almora-Díaz including up to sextuple excitations [Almora-Díaz (2014)]. As we shall see below, truncated versions of these one- and two-million-determinant CIPSI expansions will actually be used in DMC; results are thus presented for these shorter expansions. A remarkable point is the high efficiency of CIPSI in obtaining accurate CI expansions with a small number of determinants. For the cc-pCVDZ basis set, the variational energy obtained with the 172 256 determinants used in DMC differs from the FCI value of Almora-Díaz by only 0.7 mhartree. For the other basis sets, the differences remain small, namely 1.8, 1.8, and 2.5 mhartree for the cc-pCVTZ, cc-pCVQZ, and cc-pCV5Z basis sets, respectively.

Table 1. CIPSI variational energies for the expansions used in DMC, compared with the FCI values of Almora-Díaz (energies in atomic units).

Basis set | FCI size | # dets used in DMC | CIPSI variational energy | FCI, Almora-Díaz (2014) | Deviation
---|---|---|---|---|---
cc-pCVDZ | | 172 256 | -76.282136 | -76.282865 | 0.0007
cc-pCVTZ | | 640 426 | -76.388287 | -76.390158 | 0.0018
cc-pCVQZ | | 666 927 | -76.419324 | -76.421148 | 0.0018
cc-pCV5Z | | 1 423 377 | -76.428550 | -76.431105 | 0.0025

FN-DMC results. All-electron DMC calculations have been performed using our general-purpose QMC program QMC=CHEM (downloadable at the address given in Ref. (28)). A minimal Jastrow prefactor taking care of the electron-electron cusp condition is employed, and molecular orbitals are slightly modified at very short electron-nucleus distances to impose exact electron-nucleus cusp conditions. The time step used (in a.u.) has been chosen small enough for the finite time-step error to be unobservable within statistical fluctuations.

To accelerate the DMC calculations and avoid using the full one- and two-million-determinant expansions of the initial CIPSI calculations, we have employed the improved truncation scheme described in Scemama et al. (2016). In short, the approach consists in writing the CI expansion as

$$\Psi = \sum_{i=1}^{N_{\rm det}^{\uparrow}} \sum_{j=1}^{N_{\rm det}^{\downarrow}} C_{ij}\, D_i^{\uparrow} D_j^{\downarrow}$$

For each $\sigma$-determinant ($\sigma = \uparrow, \downarrow$), the contribution to the norm of the wavefunction is given either by $\mathcal{N}_i^{\uparrow} = \sum_j C_{ij}^2$ for $\sigma = \uparrow$ or by $\mathcal{N}_j^{\downarrow} = \sum_i C_{ij}^2$ for $\sigma = \downarrow$. If this contribution is below a given threshold $\epsilon$, the $\sigma$-determinant is discarded ($\mathcal{N}_i^{\uparrow} < \epsilon$ or $\mathcal{N}_j^{\downarrow} < \epsilon$ for the $\uparrow$- or $\downarrow$-sector, respectively). Here, $N_{\rm det}^{\sigma}$ denotes the number of different $\sigma$-determinants in the CI expansion. Since these numbers are usually much smaller than the total number of products of determinants $N_{\rm det}$, the gain in computational cost can be important (see Table 5 of Scemama et al. (2016)). Here, the same threshold $\epsilon$ has been used for all basis sets except cc-pCV5Z, for which a different value has been used. Values for $\epsilon$ have been chosen small enough to get fixed-node energies converged, within statistical errors, as a function of the number of selected determinants. In other words, the nodes employed in this work are expected to be close to FCI nodes. The final numbers of selected determinants used are given in Table 1.
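The truncation step can be illustrated with a few lines of NumPy. This is only a sketch under our reading of the scheme: the contribution of a spin determinant to the norm is assumed to be the sum of the squared coefficients of all products containing it, the wavefunction is assumed normalized, and the function name, the example coefficients, and the threshold value are hypothetical.

```python
import numpy as np

def truncate_spin_determinants(C, eps):
    """Drop up- and down-spin determinants whose norm contribution is below eps.

    C[i, j] is the coefficient of the product D_i(up) * D_j(down).
    Returns the reduced coefficient matrix and the kept row/column indices.
    """
    C = np.asarray(C, dtype=float)
    C = C / np.linalg.norm(C)            # normalize so that sum(C**2) == 1
    norm_up = (C**2).sum(axis=1)         # contribution of each up determinant
    norm_dn = (C**2).sum(axis=0)         # contribution of each down determinant
    keep_up = np.flatnonzero(norm_up >= eps)
    keep_dn = np.flatnonzero(norm_dn >= eps)
    return C[np.ix_(keep_up, keep_dn)], keep_up, keep_dn

# Illustrative 3 x 3 coefficient matrix (made-up numbers).
C = [[0.9, 0.1,   0.0],
     [0.1, 0.3,   0.001],
     [0.0, 0.002, 0.0005]]
C_trunc, keep_up, keep_dn = truncate_spin_determinants(C, eps=1e-4)
retained_norm = (C_trunc**2).sum()
```

Working on rows and columns of the coefficient matrix, rather than on individual products, is what reduces the number of distinct spin determinants to be evaluated at each Monte Carlo step.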

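Table 2. Ratio of CPU times per Monte Carlo step (full expansion versus single HF determinant) and fixed-node DMC energies (in atomic units) obtained with CIPSI nodes.

Basis set [Ndets] | T(Ndets)/T(1det) | FN-DMC energy
---|---|---
cc-pCVDZ [172 256] | 101 | -76.41571(20)
cc-pCVTZ [640 426] | 185 | -76.43182(19)
cc-pCVQZ [666 927] | 128 | -76.43622(14)
cc-pCV5Z [1 423 377] | 235 | -76.43744(18)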

The efficiency of our algorithm for computing large multideterminant expansions can be quantified by measuring the ratio of CPU times needed to perform one Monte Carlo step using either the full expansion or only the single HF determinant. Such ratios are presented in Table 2 for each basis set. Fixed-node DMC energies (in atomic units) obtained with CIPSI nodes for the various basis sets are also given in Table 2 and plotted in Fig. 1 as a function of the inverse of the cardinal number $n$, for $n=2$ to 5. The horizontal line is the best estimate of the total nonrelativistic energy reported in the literature, see Klopper (2001). For comparison, we have also reported the best estimates of the FCI energies of Almora-Díaz. Quite remarkably, both sets of points display a very similar overall behavior. In particular, the values converge smoothly to the same CBS limit as a function of the cardinal number, with a typical inverse third power law. Using a simple two-parameter fitting function, $E(n) = E_{\rm CBS} + a/n^3$, the CBS extrapolation of the DMC results gives a value of -76.43894(12). The error bar has been estimated by repeating the fit over a large statistical ensemble of independent data drawn according to their respective error bars. Since no correlation between data is considered, the error value should be regarded as rather conservative. Note that the energy of -76.43744(18) obtained with the cc-pCV5Z nodes is the lowest upper bound reported so far in DMC or any other approach. Regarding computational aspects, each FN-DMC energy of Table 2 was computed using 800 cores on the Curie machine (TGCC/CEA/Genci) for about 15 hours. The cost of the deterministic CIPSI calculations needed to build the trial wavefunctions is marginal. Roughly speaking, it is similar to that of CISD calculations with the same basis sets.
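The extrapolation and its error estimate can be reproduced approximately from the Table 2 data. The sketch below rests on two assumptions that the text does not spell out: the two-parameter form is taken as $E(n) = E_{\rm CBS} + a/n^3$, as suggested by the inverse third power law, and the fit is an unweighted least-squares fit. Function names, the number of resampled data sets, and the random seed are arbitrary choices.

```python
import numpy as np

# Fixed-node DMC energies and statistical errors from Table 2 (atomic units).
n = np.array([2, 3, 4, 5])
e_fn = np.array([-76.41571, -76.43182, -76.43622, -76.43744])
err = np.array([0.00020, 0.00019, 0.00014, 0.00018])

def cbs_fit(n, energies):
    """Unweighted least-squares fit of E(n) = E_CBS + a / n**3."""
    a, e_cbs = np.polyfit(1.0 / n**3, energies, 1)
    return e_cbs, a

def cbs_with_error(n, energies, errors, n_samples=20000, seed=0):
    """Propagate error bars by refitting independently resampled data sets."""
    rng = np.random.default_rng(seed)
    # One synthetic data set per row, drawn from independent Gaussians.
    samples = rng.normal(energies, errors, size=(n_samples, len(n)))
    # polyfit accepts a 2D y array with one data set per column.
    coeffs = np.polyfit(1.0 / n**3, samples.T, 1)
    return cbs_fit(n, energies)[0], coeffs[1].std()

e_cbs, e_cbs_err = cbs_with_error(n, e_fn, err)
print(f"E_CBS = {e_cbs:.5f} +/- {e_cbs_err:.5f}")
```

Under these assumptions the central-value fit lands very close to the extrapolated energy quoted above, and the spread of the resampled intercepts is of the same size as the quoted error bar.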

Table 3. Selection of the best (lowest) values reported in the literature for the total energy of the water molecule (in atomic units).

Method | Total energy
---|---
Clark et al. (2011), DMC (upper bound) | -76.4368(4)
This work, DMC (upper bound) | -76.43744(18)
Almora-Díaz (2014), CISDTQQnSx (upper bound) | -76.4343
Helgaker et al. (1997), R12-CCSD(T) | -76.439(2)
Muller and Kutzelnigg (1997), R12-CCSD(T) | -76.4373
Almora-Díaz (2014), FCI + CBS | -76.4386(9)
Halkier et al. (1998), CCSD(T) + CBS | -76.4386
Bytautas and Ruedenberg (2006), FCI + CBS | -76.4390(4)
This work, DMC + CBS | -76.43894(12)
Experimentally derived estimate, Klopper (2001) | -76.4389

In Table 3 a selection of the best (lowest) values reported in the literature for the total energy of the water molecule is presented. Using DMC, the lowest value published so far is that of Clark et al., -76.4368(4). Here, using the nodes of the CIPSI/cc-pCV5Z expansion, an improved value of -76.43744(18) is obtained. The lowest upper bound reached using a post-Hartree-Fock correlated approach is that of Almora-Díaz, -76.4343, a value significantly higher than the DMC values. Finally, the best (non-variational) estimates are those obtained by performing a CBS extrapolation. At the FCI level the most accurate one is that of Bytautas and Ruedenberg [Bytautas and Ruedenberg (2006)], -76.4390(4). Here, our value of -76.43894(12) is, to the best of our knowledge, the most accurate value reported so far. In both cases the best experimentally derived estimate of -76.4389 is recovered within error bars.
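The statement that the experimentally derived estimate is recovered within error bars amounts to simple arithmetic on the Table 3 entries, as the short check below shows. It assumes that the experimentally derived value carries no error bar of its own (none is quoted) and that the number in parentheses is a one-sigma error on the last digits; the helper name `within_error` is ours.

```python
# Energies in atomic units, copied from Table 3.
reference = -76.4389   # experimentally derived estimate, Klopper (2001)

# (energy, one-sigma error) for the CBS-extrapolated estimates.
cbs_estimates = {
    "This work, DMC + CBS": (-76.43894, 0.00012),
    "Bytautas and Ruedenberg, FCI + CBS": (-76.4390, 0.0004),
    "Almora-Diaz, FCI + CBS": (-76.4386, 0.0009),
}

# Variational upper bounds, which must lie above the exact energy.
upper_bounds = {
    "Clark et al., DMC": -76.4368,
    "This work, DMC": -76.43744,
    "Almora-Diaz, CISDTQQnSx": -76.4343,
}

def within_error(value, error, target):
    """True if target lies within one error bar of value."""
    return abs(value - target) <= error

agreement = {name: within_error(e, s, reference)
             for name, (e, s) in cbs_estimates.items()}
variational_ok = all(e > reference for e in upper_bounds.values())
```

The DMC + CBS value differs from the reference by 0.04 mhartree, well inside its 0.12 mhartree error bar, while the three upper bounds all lie above the reference, as expected for variational energies.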

Conclusion. In this study we have performed DMC calculations using the nodes of multideterminant CI expansions obtained through a perturbative selection of the most important determinants (selected CI). In contrast with most QMC works, no re-optimization of the nodes in the presence of a Jastrow prefactor has been performed. For each basis set of the cc-pCV$n$Z family ($n=2$ to 5), the CIPSI nodes obtained are of near-Full-CI quality. As a result of the deterministic construction of nodes using CI expansions, the total fixed-node energy is found to be a smoothly decreasing function of the cardinal number of the basis set, with a typical inverse third power law. The extrapolation to the Complete Basis Set (CBS) limit, which gives the total energy associated with exact nodes, is then easy to perform. From a general perspective, we emphasize that employing selected CI nodes of increasing quality in a given family of basis sets may represent a simple, deterministic, reproducible, and systematic way of controlling the fixed-node error in DMC.

Acknowledgments. AS and MC thank the Agence Nationale pour la Recherche (ANR) for support through Grant No ANR 2011 BS08 004 01. This work has been made through generous computational support from CALMIP (Toulouse) under the allocation 2015-0510, and GENCI under the allocation x2015081738.

## References

- Note (a) Here, by “uncontrolled” it is meant that there does not exist a systematic way of arbitrarily reducing the fixed-node error.
- Umrigar et al. (2007) C. J. Umrigar, J. Toulouse, C. Filippi, S. Sorella, and R. G. Hennig, Phys. Rev. Lett. 98, 110201 (2007).
- Schmidt and Moskowitz (1990) K. E. Schmidt and J. W. Moskowitz, J. Chem. Phys. 93, 4172 (1990).
- Giner et al. (2013) E. Giner, A. Scemama, and M. Caffarel, Can. J. Chem. 91, 879 (2013).
- Scemama et al. (2014) A. Scemama, T. Applencourt, E. Giner, and M. Caffarel, J. Chem. Phys. 141, 244110 (2014).
- Giner et al. (2015) E. Giner, A. Scemama, and M. Caffarel, J. Chem. Phys. 142, 044115 (2015).
- Huron et al. (1973) B. Huron, P. Rancurel, and J. P. Malrieu, J. Chem. Phys. 58, 5745 (1973).
- Evangelisti et al. (1983) S. Evangelisti, J. P. Daudey, and J. P. Malrieu, Chem. Phys. 75, 91 (1983).
- Bender and Davidson (1969) C. F. Bender and E. R. Davidson, Phys. Rev. 183, 23 (1969).
- Buenker and Peyerimhoff (1974) R. J. Buenker and S. D. Peyerimhoff, Theor. Chim. Acta 35, 33 (1974).
- Buenker and Peyerimhoff (1975) R. J. Buenker and S. D. Peyerimhoff, Theor. Chim. Acta 39, 217 (1975).
- Buenker et al. (1978) R. J. Buenker, S. D. Peyerimhoff, and W. Butscher, Mol. Phys. 35, 771 (1978).
- Bruna et al. (1980) P. J. Bruna, S. D. Peyerimhoff, and R. J. Buenker, Chem. Phys. Lett. 72, 278 (1980).
- Buenker et al. (1981) R. J. Buenker, S. D. Peyerimhoff, and P. J. Bruna, Computational Theoretical Organic Chemistry (Reidel, Dordrecht, 1981) p. 55.
- Harrison (1991) R. J. Harrison, J. Chem. Phys. 94, 5021 (1991).
- Caffarel et al. (2014) M. Caffarel, E. Giner, A. Scemama, and A. Ramírez-Solís, J. Chem. Theory Comput. 10, 5286 (2014).
- Booth et al. (2009) G. H. Booth, A. J. W. Thom, and A. Alavi, J. Chem. Phys. 131, 054106 (2009).
- Cleland et al. (2010) D. Cleland, G. H. Booth, and A. Alavi, J. Chem. Phys. 132, 041103 (2010).
- Nukala and Kent (2009) P. K. V. V. Nukala and P. R. C. Kent, J. Chem. Phys. 130, 204105 (2009).
- Clark et al. (2011) B. K. Clark, M. A. Morales, J. McMinis, J. Kim, and G. E. Scuseria, J. Chem. Phys. 135, 244105 (2011).
- Weerasinghe et al. (2014) G. L. Weerasinghe, P. López-Ríos, and R. J. Needs, Phys. Rev. E 89, 023304 (2014).
- Scemama et al. (2016) A. Scemama, T. Applencourt, E. Giner, and M. Caffarel, J. Comput. Chem. (in press) (2016).
- Note (b) An increase of the fixed-node energy may sometimes be observed at small numbers of determinants (say, fewer than a few thousand) when large basis sets and/or canonical orbitals are used. This transient behavior has been found to systematically disappear when natural orbitals are used and/or larger expansions are considered.
- Flad et al. (1997) H. J. Flad, M. Caffarel, and A. Savin, in Recent Advances in Quantum Monte Carlo Methods (World Scientific Publishing, 1997).
- Klopper (2001) W. Klopper, Mol. Phys. 99, 481 (2001).
- Scemama et al. (2015) A. Scemama, E. Giner, T. Applencourt, G. David, and M. Caffarel, “Quantum package v0.6,” (2015), doi:10.5281/zenodo.30624.
- Almora-Díaz (2014) C. X. Almora-Díaz, J. Chem. Phys. 140, 184302 (2014).
- (28) A. Scemama, E. Giner, T. Applencourt, and M. Caffarel, “QMC=Chem,” https://github.com/scemama/qmcchem.
- Helgaker et al. (1997) T. Helgaker, W. Klopper, H. Koch, and J. Noga, J. Chem. Phys. 106, 9639 (1997).
- Muller and Kutzelnigg (1997) H. Muller and W. Kutzelnigg, Mol. Phys. 92, 535 (1997).
- Halkier et al. (1998) A. Halkier, T. Helgaker, P. Jorgensen, W. Klopper, H. Koch, J. Olsen, and A. K. Wilson, Chem. Phys. Lett. 286, 243 (1998).
- Bytautas and Ruedenberg (2006) L. Bytautas and K. Ruedenberg, J. Chem. Phys. 124, 174304 (2006).