Using Local Operator Fluctuations to Identify Wave Function Improvements
A method is developed that allows analysis of quantum Monte Carlo simulations to identify errors in trial wave functions. The purpose of this method is to allow for the systematic improvement of variational wave functions by identifying degrees of freedom that are not well-described by an initial trial state. We provide proof of concept implementations of this method by identifying the need for a Jastrow correlation factor, and implementing a selected multi-determinant wave function algorithm for small dimers that systematically decreases the variational energy. Selection of the two-particle excitations is done using quantum Monte Carlo within the presence of a Jastrow correlation factor, and without the need to explicitly construct the determinants. We also show how this technique can be used to design compact wave functions for transition metal systems. This method may provide a route to analyze and systematically improve descriptions of complex quantum systems in a scalable way.
First principles quantum Monte Carlo calculationsfoulkes_quantum_2001 () for solids are a promising way to go beyond density functional theory (DFT). These methods directly simulate electron-electron correlations and can obtain very high accuracy on challenging materialswagner_quantum_2014 (); wagner_ground_2015 (); foyevtsova_textitab_2014 (); zheng_computation_2015 () using current state of the art techniques like fixed node diffusion Monte Carlo (DMC). Despite this success, the DMC method’s accuracy is limited by the fixed node approximation, which allows for polynomial scaling of the computational cost with system size, but results in a DMC energy that is only an upper bound to the true ground state energy. In practical calculations, improvement of the accuracy and efficiency of fixed node diffusion Monte Carlo is reliant on improving trial wave functions which determine the fixed nodal surface.
In order for a trial wave function to be appropriate for quantum Monte Carlo calculations, it should be compact and efficient to calculate. For application to bulk materials, it must also be size-extensive; that is, the total energy must scale with the system size. By far the most common trial wave function is the Slater-Jastrow wave functionjastrow_many-body_1955 (); ceperley_monte_1977 (), which is simple, extensive, and initial guesses are easily obtainable from DFT codes. While truncated determinant expansions can be effective in describing small moleculespetruzielo_approaching_2012 (), they cannot be used in bulk materials because they are not size-extensive. Backflow wave functionsfeynman_energy_1956 () have proven effective in homogeneouskwon_effects_1998 () and inhomogeneouslopez_rios_inhomogeneous_2006 () systems, but worsen the computational scaling with system size of QMC methods and sometimes do not offer improvement of results. It is thus of great interest, given a Slater-Jastrow wave function, whether there is a compact wave function that describes the most important improvements relative to the ground state.
In this article, we present some first steps towards a method that uses fluctuations of the local energy , not to optimize a given parameterization, but to identify directions in Hilbert space that can improve trial wave functions. We first provide a summary of the principle of using the imaginary time projector to improve wave functions and the notation that will be used in the article. Then we show a proof of concept for multi-Slater Jastrow wave functions, in which this method is used to select determinants in the wave function. Finally, we show how the local energy fluctuations can be used to determine a priori what terms to add to a variational wave function for a transition metal system TiO. These results set the stage for data mining of many-body wave functions to determine how they should be improved.
In this work, we use ideas that have been known for a long time for optimizing parameters in wave functionsneuscamman_optimizing_2012 (); toulouse_optimization_2007 (); filippi_optimal_2000 (), but we follow more the work of Holtzmannholzmann_backflow_2003 () in that we would like to use the Feynman-Kac formulae to discover which parameterization to add to a given wave function. The quantum variational principle states that for any appropriately normalized trial wave function , where R is the many-body electron coordinate and is a set of parameter values, the expectation value of the Hamiltonian of the system in state equals or exceeds the ground state energy of the Hamiltonian:
We then minimize with respect to the parameter set . Once this is done, we must alter the parameterization of the trial wave function to obtain further improvement in the energy estimate. Our ultimate goal will be not to optimize the parameters within a fixed set , but to identify new parameters that must be added to to improve the qualitative structure of the particular trial state.
Iteratively applying the projection operator to a trial function produces a sequence of new wave functions:
This converges to the exact ground-state wave function in the infinite limit:
Performing this operation directly corresponds to a projector Monte Carlo method, such as diffusion Monte Carlo. The challenge in doing this is that compact representations of the operator are generally not known, and so the imaginary time dynamics must operate in very high dimensions. Our objective here will be to find a compact representation of the short-time projector operator.
We begin by considering an arbitrary set of linear operators . Iteratively applying this set of operators generates a new sequence of wave functions:
For brevity, we define:
We force the minimal set of operators to mimic the projection operator by minimizing the square deviation of from :
This minimization procedure provides an estimate of the set of associated operator amplitudes. We define the local operators and a local energy . By expanding the projection operator to first-order and minimizing the square deviation, we find that:
where we have assumed that elements of the set are orthogonal, or approximately orthogonal, such that . Fig. 1 depicts this scheme pictorially, with the exact and mimicked projection operators represented by the black and tangential red arrows respectively. We see then that the mimicked projection operator evaluated for can be viewed as a linearized approximation to the exact path to the ground state through Hilbert space. In this way, our approximation to the projection operator identifies the most significant elements of Hilbert space absent from an initial trial state.
The derivation of our method is similar in spirit to the stochastic reconfiguration (SR) of Sorella sorella_green_1998 (); sorella_green_2000 (); sorella_generalized_2001 (); sorella_weak_2007 (); neuscamman_optimizing_2012 (). The energy fluctutation potential method (EFP) also shares some similarities with our technique in its focus on the correlation between the local behavior of the energy and some chosen operator filippi_optimal_2000 (); schautz_optimized_2004 (); scemama_simple_2006 (). A set of operators is a good set if only a few terms in Eqn 9 are non-zero, while a set with many small values in Eqn 9 is not an efficient descriptor of the wave function improvement.
Iii QMC Methodology
We first compute the single-particle Hartree-Fock (HF) orbitals for a molecular system. We obtain all orbitals using the GAMESS computational package schmidt_general_1993 (); gordon_chapter_2005 (). Core electrons were replaced by the corresponding Burkatzki-Filippi-Dolg pseudopotential burkatzki_energy-consistent_2007 () with triple- basis sets.
We perform variational Monte Carlo with the QWalk computational package wagner_qwalk:_2009 (). We begin with a trial wave function of the Slater-Jastrow form:
We use the linear method of Umrigar et. al. umrigar_energy_2005 (); umrigar_alleviation_2007 (); toulouse_optimization_2007 () to optimize the Jastrow . The form of the Jastrow correlation factor is a function of the electron and ionic coordinates:
where and indices represent electronic coordinates and represents ionic coordinates. The functions are given by:
where the and functions have the general form:
and is a polynomial chosen to smoothly go to zero at schmidt_correlated_1990 (). This form of the Jastrow factor explicitly incorporates three-body interactions between two electrons and an ion.
Iv Determinant selection
The set of second-quantized two-body creation/destruction operators,
offers one possible choice of linear operators in Eqn 9. If are occupied orbitals and are unoccupied orbitals, then applying a to a Slater determinant generates an additional excited-state determinant. The elements of the two-body reduced density matrix (2-RDM) are given by the expectation values of these two-body creation/destruction operators. We thus make the analogy with local energy to define a local density matrix element, given a wave function :
where , refers to the set of coordinates generated by changing the positions of two electrons, and we have omitted overall normalization. We evaluate this 2-body integral in a QMC calculation by sampling the coordinates and from the sum over orbitals and the many-body electron coordinate R from wagner_types_2013 (). With this, the expression given in Eqn 16 can be rearranged to give:
where the normalization factor is given by:
The two particle operators in Eqn 14 are used to evaluate Eqn 9 and generate a list of important determinants missing from the initial wave function. Hence, we can select the determinants most important to the exact ground state without the need to first evaluate those determinants. The entire process of wave function generation is summarized as such: .
Obtain single-particle orbitals from a HF calculation.
Optimize single-determinant Slater-Jastrow:
Rank 2-RDM elements by covariance of with .
Add corresponding determinant to the expansion:
Optimize coefficients of using the linear method.
Iterating steps 3-5 of this process generates a determinantal expansion of arbitrary length up to the full size of the active space.
iv.1 H molecule
For the case of H, we restrict our active Hilbert space to the set of bonding/antibonding -symmetry orbitals. Fig. 2 shows the contours of the sampled amplitude of the local operator associated with a 2-body bonding-to-antibonding excitation in an isolated hydrogen dimer versus the sampled local energy for each of two trial states:
where and are the Slater-Jastrow and multi-Slater Jastrow wave functions containing the bonding and antibonding single-particle orbitals respectively.
The line segments on each panel in Fig 2 indicate the principal components of the resulting distribution. These components are given by the eigenvectors of the covariance matrix of the local energy distribution taken with respect to the local operator :
in this two-dimensional representation. The rotation of the principle components relative to the axes in the left panel of Fig. 2 shows that the covariance matrix contains nonvanishing off-diagonal elements. It follows that and are correlated for this single-determinant trial state. After the addition of the associated determinant to the wave function in the right panel of Fig. 2, the principle components rotate to align with the axes, indicating that the covariance matrix has become diagonal. This implies that the covariance between the local energy and local operator has vanished, and the two variables now have zero covariance. That is, a key element absent from the initial trial state has been identified and added based on the covariance, pushing the wave function closer to the exact ground state.
iv.2 Dimer Molecules
As a further proof of concept, we apply the covariance method to select determinants for a set of stretched molecules: H (0.88 Å bond length), N (1.7 Å bond length), O (1.6 Å bond length), and F (1.5 Å bond length). By stretching the molecules, the electron correlations are enhanced, increasing the strength of the covariance signal. We obtain single-particle orbitals for each system from a restricted open-shell Hartree-Fock (ROHF) calculation using GAMESS. This method doubly-fills molecular orbitals (MOs) to the greatest extent possible, and places remaining unpaired electrons into singly-filled MOs. We limit our active space to a set of bonding and antibonding MOs with cylindrical symmetry and either - or -symmetry. Other states exist within the full orbital space, but their inclusion yields only small improvement to the final wave function and system energy. Because different methods of determinant selection produce significantly different rates of energy convergence iii_influence_2015 (), the covariance-based method we have described can yield interesting results even at the level of a multi-Slater-Jastrow ansatz. Our chief objective in this section is to show that the covariance technique can select the most significant determinants for a particular molecule before performing a variational optimization of the wave function.
We consider only 2-particle excitations featuring 1 particle in each spin channel. We compare these results to those obtained with the usual configuration interaction method with singles and doubles excitation (CISD). This is natural for molecules such as N with a ground state singlet spin configuration, though it can lead to the exclusion of significant excitations in molecules like O which contain unpaired electrons. Fig. 3 compares the normalized weight of each CSF in conventional CISD, the optimized weight of each CSF in a multi-Slater-Jastrow ansatz, and the local energy covariance for each relevant CSF in each material respectively. We see that the determinant orderings predicted by both traditional CISD and our method based on local energy covariance are equivalent for each system across the dominant particle excitations. This indicates that the path to the ground state through Hilbert space obtained by successively applying the projection operator is approximately equivalent to that produced by the usual CI procedure in this case.
From Eqn 9, we see that the covariance signal in a 2-RDM element should fall identically to zero once the corresponding excitation has been added to the trial state. In practice, we observe that the signal in an added excitation falls significantly once it has been added to the trial wave function, but it does not vanish entirely. This is a consequence of the Jastrow factor in the trial state, which spoils the orthogonality of determinants and prevents the covariance in each 2-RDM element from fully vanishing. That is, the overlap matrix is non-diagonal when a Jastrow factor multiplies the determinant expansion. Practically speaking, this slight non-orthogonality did not seem to affect the performance of the technique. Assuming that is diagonal allows the method to scale linearly with the number of excitation operators, so we use this approximation.
We also find the rate of energy convergence for the predicted CSF ordering in each model molecular system. Fig. 4 shows the variational Monte Carlo energy of an optimized multi-Slater-Jastrow wave function as a function of the CSFs included in the trial state. The CSFs are ordered here according to the weight given by a conventional CISD calculation. We see that the energy converges rapidly with respect to the number of CSFs included in the wave function. This explicitly illustrates that the CISD method and our covariance-based technique can drive the initial trial state asymptotically close to the exact ground state.
Finally, we also assess the degree to which the covariance in a 2-RDM element predicts the energy gain obtained from adding the associated determinant to the trial state. Fig. 5 compares the decrease in total system energy obtained from each additional CSF with the corresponding covariance signal. We observe that the energy gain and the covariance signal are negatively correlated with one another. This correlation indicates that the covariance in a 2-RDM element can be used as a proxy for estimating the energy change from adding a determinant to the trial state.
As a method of determinant selection for these systems, this technique is less efficient than using CI to determine the weights, and the results are similar. We therefore would not recommend this technique as a selection method for small molecules. However, the point of this section is that the energy fluctuations can be data mined to find the correct directions in Hilbert space to improve trial wave functions. In the case of stretched dimers, it is well-known that the most important improvement over Slater-Jastrow consists of multiple determinants, and the energy fluctuation technique selects the correct ones.
V Comparing real and orbital spaces: TiO molecule
We now proceed to use the technique to selectively improve wave function parameterizations in a more challenging case. As an example of a system where we do not know a priori the most important degrees of freedom, we consider a transition metal molecule, TiO. The dynamic correlation present in transition metal systems is larger than in s-p systems like the dimers considered above, so the Jastrow factor could be expected to play a larger rolevisscher_electronic_1993 (); roos_new_2005 ().
In this case, the 1-RDM can be used to select the orbital space by calculating the covariance of the diagonal elements of the 1-RDM with the local energy. In the case of the stretched N dimer, the elements corresponding to the bonding and antibonding orbitals have a covariance with the local energy of approximately 0.001, while other orbitals have much smaller signals.
In Fig 6, the covariances of the 1-RDM and the real-space electron-electron correlation function distance are shown. The covariance signal for the 1-RDM is very small, and in fact the selection of determinants was very difficult in this case, although we do obtain larger signals for the and states as one would expect. On the other hand, for our starting wave function, labeled J12, with 12 three-body parameters per atom, there is a large spin-dependent covariance with . Increasing the number of parameters in the Jastrow factor without making it spin-dependent (J30, with 30 3-body parameters per atom) does not resolve this covariance, but adding in four spin dependent parameters (JS30) immediately reduces the covariance signal to nearly zero across all separation distances.
Since the determinant selection of TiO via energy covariance was not efficient, we used a CI calculation with sextuple excitations into 8 virtual states to select CSFs, then formed a set of multi-Slater Jastrow wave functions. If the covariance analysis was correct, then we would expect the spin-dependent terms in the Jastrow to be most effective in lowering the energy, followed by either the extra three-body terms or multiple determinants. As can be seen in Fig 7, this supposition is correct: with only four parameters, the spin-dependent terms lower the energy by nearly 10 mHartree, while 30 determinants or a similar number of 3-body parameters are necessary to achieve that decrease in energy.
This example illustrates some the strengths and weaknesses of this covariance-based selection. If the set is selected in a basis that does not describe the needed improvement efficiently, in this case the determinant basis, then it is not the best tool. On the other hand, if several different basis sets are used, then the best basis can be used to improve the wave function. In this case, we learned that a spin-dependent Jastrow factor can improve the energy significantly for magnetic molecules, while the determinant basis is not an efficient way to improve the wave function for this molecule. The cost for performing these calculations was about a factor of two larger than a variational Monte Carlo calculation and much smaller than the energy optimization technique.
We have presented an outline of a technique to select, not just terms in a many-body ansatz, but which type of ansatz with which to proceed. For example, the selection method can quickly determined whether a determinant-type basis is appropriate by evaluating the 1-RDM covariance with the local energy. Similarly, if an explicitly correlated approach such as a Jastrow is more appropriate, then the covariance of the local energy with the electron-electron distance is large. The computational cost of this assessment is quite low: is essentially zero cost over a VMC energy evaluation, and the 1-RDM is approximately a factor of two additional, regardless of system size. This is much less expensive than attempting energy minimization on multiple ansatz.
As proof of concept, we demonstrated that the selection technique both selects the correct directions in Hilbert within a defined ansatz space, and also can select between alternate viewpoints of the electron correlation problem. We demonstrated the former by selecting determinants for stretched dimer molecules, and the latter by differentiating between short range ’dynamic’ correlation best described by a Jastrow factor and long range ’static’ correlation best described by multiple determinants in the transition metal oxygen system TiO. Using standard wave functions for this problem, the dynamic correlation in TiO is more important. This work forms the base for an algorithm in which the local energy can be analyzed directly in the many-body space using feature extraction techniques to describe the most efficient basis in which to improve many-body wave functions.
This material is based upon work supported by NSF DMR-1206242 (L.K.W.) and the National Science Foundation Graduate Research Fellowship Program under Grant Number DGE-1144245 (K.T.W.). We also acknowledge computer resources from the Campus Cluster program at Illinois. Useful conversations with David Ceperley are gratefully acknowledged.
- (1) W. M. C. Foulkes, L. Mitas, R. J. Needs, and G. Rajagopal, “Quantum monte carlo simulations of solids,” vol. 73, no. 1, pp. 33–83.
- (2) L. K. Wagner, “Quantum monte carlo for ab initio calculations of energy-relevant materials,” vol. 114, no. 2, pp. 94–101.
- (3) L. K. Wagner, “Ground state of doped cuprates from first-principles quantum monte carlo calculations,” vol. 92, no. 16, p. 161116.
- (4) K. Foyevtsova, J. T. Krogel, J. Kim, P. Kent, E. Dagotto, and F. A. Reboredo, “Ab initio quantum monte carlo calculations of spin superexchange in cuprates: The benchmarking case of ,” vol. 4, no. 3, p. 031003.
- (5) H. Zheng and L. K. Wagner, “Computation of the correlated metal-insulator transition in vanadium dioxide from first principles,” vol. 114, no. 17, p. 176401.
- (6) R. Jastrow, “Many-body problem with strong forces,” vol. 98, no. 5, pp. 1479–1484.
- (7) D. Ceperley, G. V. Chester, and M. H. Kalos, “Monte carlo simulation of a many-fermion study,” vol. 16, no. 7, pp. 3081–3099.
- (8) F. R. Petruzielo, J. Toulouse, and C. J. Umrigar, “Approaching chemical accuracy with quantum monte carlo,” vol. 136, no. 12, p. 124116.
- (9) R. P. Feynman and M. Cohen, “Energy spectrum of the excitations in liquid helium,” vol. 102, no. 5, pp. 1189–1204.
- (10) Y. Kwon, D. M. Ceperley, and R. M. Martin, “Effects of backflow correlation in the three-dimensional electron gas: Quantum monte carlo study,” vol. 58, no. 11, pp. 6800–6806.
- (11) P. López Ríos, A. Ma, N. D. Drummond, M. D. Towler, and R. J. Needs, “Inhomogeneous backflow transformations in quantum monte carlo calculations,” vol. 74, no. 6, p. 066701.
- (12) E. Neuscamman, C. J. Umrigar, and G. K.-L. Chan, “Optimizing large parameter sets in variational quantum monte carlo,” vol. 85, no. 4, p. 045103.
- (13) J. Toulouse and C. J. Umrigar, “Optimization of quantum monte carlo wave functions by energy minimization,” vol. 126, no. 8, p. 084102.
- (14) C. Filippi and S. Fahy, “Optimal orbitals from energy fluctuations in correlated wave functions,” vol. 112, no. 8, pp. 3523–3531.
- (15) M. Holzmann, D. M. Ceperley, C. Pierleoni, and K. Esler, “Backflow correlations for the electron gas and metallic hydrogen,” vol. 68, no. 4, p. 046707.
- (16) S. Sorella, “Green function monte carlo with stochastic reconfiguration,” vol. 80, no. 20, pp. 4558–4561.
- (17) S. Sorella and L. Capriotti, “Green function monte carlo with stochastic reconfiguration: An effective remedy for the sign problem,” vol. 61, no. 4, pp. 2599–2612.
- (18) S. Sorella, “Generalized lanczos algorithm for variational quantum monte carlo,” vol. 64, no. 2, p. 024512.
- (19) S. Sorella, M. Casula, and D. Rocca, “Weak binding between two aromatic rings: Feeling the van der waals attraction by quantum monte carlo methods,” vol. 127, no. 1, p. 014105.
- (20) F. Schautz and C. Filippi, “Optimized jastrow-slater wave functions for ground and excited states: Application to the lowest states of ethene,” vol. 120, no. 23, pp. 10931–10941.
- (21) A. Scemama and C. Filippi, “Simple and efficient approach to the optimization of correlated wave functions,” vol. 73, no. 24, p. 241101.
- (22) M. W. Schmidt, K. K. Baldridge, J. A. Boatz, S. T. Elbert, M. S. Gordon, J. H. Jensen, S. Koseki, N. Matsunaga, K. A. Nguyen, S. Su, T. L. Windus, M. Dupuis, and J. A. Montgomery, “General atomic and molecular electronic structure system,” vol. 14, no. 11, pp. 1347–1363.
- (23) M. S. Gordon and M. W. Schmidt, “Chapter 41 - advances in electronic structure theory: GAMESS a decade later,” in Theory and Applications of Computational Chemistry (C. E. D. F. S. K. E. Scuseria, ed.), pp. 1167–1189, Elsevier.
- (24) M. Burkatzki, C. Filippi, and M. Dolg, “Energy-consistent pseudopotentials for quantum monte carlo calculations,” vol. 126, no. 23, p. 234105.
- (25) L. K. Wagner, M. Bajdich, and L. Mitas, “QWalk: A quantum monte carlo program for electronic structure,” vol. 228, no. 9, pp. 3390–3404.
- (26) C. J. Umrigar and C. Filippi, “Energy and variance optimization of many-body wave functions,” vol. 94, no. 15, p. 150201.
- (27) C. J. Umrigar, J. Toulouse, C. Filippi, S. Sorella, and R. G. Hennig, “Alleviation of the fermion-sign problem by optimization of many-body wave functions,” vol. 99, no. 17.
- (28) K. E. Schmidt and J. W. Moskowitz, “Correlated monte carlo wave functions for the atoms he through ne,” vol. 93, no. 6, pp. 4172–4178.
- (29) L. K. Wagner, “Types of single particle symmetry breaking in transition metal oxides due to electron correlation,” vol. 138, no. 9, p. 094106.
- (30) R. C. Clay and M. A. Morales, “Influence of single particle orbital sets and configuration selection on multideterminant wavefunctions in quantum monte carlo,” vol. 142, no. 23, p. 234103.
- (31) L. Visscher, T. Saue, W. C. Nieuwpoort, K. Faegri, and O. Gropen, “The electronic structure of the PtH molecule: Fully relativistic configuration interaction calculations of the ground and excited states,” vol. 99, no. 9, pp. 6704–6715.
- (32) B. O. Roos, R. Lindh, P.-Ã. Malmqvist, V. Veryazov, and P.-O. Widmark, “New relativistic ANO basis sets for transition metal atoms,” vol. 109, no. 29, pp. 6575–6579.