Stochastic multi-configurational self-consistent field theory
Abstract
The multi-configurational self-consistent field theory is considered the standard starting point for almost all multireference approaches required for strongly-correlated molecular problems. The limitation of the approach is generally given by the number of strongly-correlated orbitals in the molecule, as its cost will grow exponentially with this number. We present a new multi-configurational self-consistent field approach, wherein linear determinant coefficients of a multi-configurational wavefunction are optimized via the stochastic full configuration interaction quantum Monte Carlo technique at greatly reduced computational cost, with non-linear orbital rotation parameters updated variationally based on this sampled wavefunction. This extends this approach to strongly-correlated systems with far larger active spaces than it is possible to treat by conventional means. By comparison with this traditional approach, we demonstrate that the introduction of stochastic noise in both the determinant amplitudes and the gradient and Hessian of the orbital rotations does not preclude robust and reliable convergence of the orbital optimization. It can even improve the ability to avoid convergence to local minima in the orbital space, and therefore aid in finding variationally lower-energy solutions. We consider the effect on the convergence of the orbitals as the number of walkers and the sampling time within the active space increases, as well as the effect on the final energy and error. The scope of the new protocol is demonstrated with a study of the increasingly strongly correlated electronic structure in a series of polycyclic aromatic hydrocarbons, up to the large coronene molecule in a complete active space of 24 electrons in 24 orbitals, requiring only modest computational resources.
I Introduction
Much of the fields of computational electronic-structure theory and quantum chemistry is concerned with describing the changes in energies and properties due to the correlated motion of electrons.Helgaker et al. (2000) The treatment of these effects is, in general, required for the fields to be predictive and increasingly relevant to wider areas of chemistry, physics, and material science. This correlated physics of the electrons is generally divided into two qualitative categories: dynamic and static correlation.
As a broad categorization, dynamic correlation describes the short-ranged, two-electron phenomena of the Coulomb holes and cusps denoting the depletion in probability amplitude of finding two electrons close to each other. Static correlation, on the other hand, derives from the breakdown in the qualitative utility of the single-particle picture, which manifests as strong mixing between a number of determinants, with many-body co-operative effects giving rise to significant probability amplitudes of the wavefunction among its energetically low-lying configurations. If an electronic system contains significant static correlation, then perturbative approaches such as the “gold standard” of quantum chemistry — coupled-cluster with singles, doubles and non-iterative triples (CCSD(T)) — will necessarily fail, since they rely on the single-particle picture to generate a single many-electron state about which to construct a perturbative expansion. This situation is prevalent in systems with localized, partially-occupied atomic states (such as the -shells of transition metals), as well as low-spin, open-shell systems (including magnetic interactions) and important transition or stretched-bond configurations of molecular systems.
In quantum chemistry, it can be common to define the amount of static correlation in a problem as the difference between the uncorrelated Hartree–Fock energy and that of complete active space self-consistent field (CASSCF) theory. The CASSCF method is a variational approach for optimizing a linear combination of low-energy configurations which arise from a set of correlated “active” degrees of freedom, optimized self-consistently to provide the majority of the static portion of the correlation effects.Roos (1972); Hinze (1973); Siegbahn et al. (1980); Roos et al. (1980); Siegbahn et al. (1981); Knowles and Werner (1985); Werner and Knowles (1985); Olsen (2011); Lin and Demkov (2014) This is then considered as the standard starting point for almost all multi-reference approaches required for strongly-correlated molecular problems. A number of “active” orbitals and electrons are chosen, and the configurational subspace is constructed from all the possible distributions of those electrons in those orbitals, maintaining an inert “core” of the remaining particles. The energy is then minimized within this configurational subspace. In addition to this, a unitary rotation of the non-redundant orbital space is found, again in a variational fashion, in order to optimize the “best” orbitals, in a variational sense, to describe these configurations. This optimization can be crucial, since the original orbitals are generally based on the canonical Hartree–Fock orbitals, and hence do not necessarily span the qualitatively correct degrees of freedom to describe the strongly-correlated physics.
Since the high-energy part of the configurational spectrum is missing, this approach will not, with normal usage, be able to capture dynamic correlation from high-energy virtual scattering effects to obtain a complete description of the correlated wavefunction. However, the final CASSCF wavefunction will achieve a qualitative description of a strongly-correlated molecular problem, which will have been previously missing if building upon a single-reference. Dynamic correlation can often be included subsequently in the description on top of a CASSCF “zeroth-order” wavefunction in a perturbative or variational manner.Andersson et al. (1992); Werner and Knowles (1988); Shamasundar et al. (2011)
The greatest limitation upon the applicability of the CASSCF approach is the size of the active space which can be treated. While this space would, ideally, include all valence electrons and low-lying molecular orbitals, the resultant number of variational parameters to optimize grows factorially with its size. Thus, computational bottlenecks usually demand that no more than around active electrons and degrees of freedom can be considered to contribute to the strong correlation. For larger systems with multiple stronger correlation centres, this is often simply insufficient, and even for relatively small systems, testing the influence of additional effects such as ‘-’ mixing or a double- shell to include - transitions and wider energy scales in metal complexes is generally prohibitively expensive.
In order to access these required larger space sizes, approximations are generally made to reduce the number of parameters which need to be optimized. The most prevalent of these are the restricted active space,Olsen et al. (1988); Malmqvist et al. (1990) and more recently the generalized active space constructions.Fleig et al. (2001, 2003); Ma et al. (2011); Vogiatzis et al. (2015) These approaches choose only a subset of the complete active space configurations within which to optimize, thereby dramatically cutting the number of configurations that need be considered. However, this adds additional uncertainties and approximations in the quality of the results, as well as additional complexity in setting up the calculation.
To maintain the full variational flexibility of the active space and the treatment of strong correlation effects which that entails, we adopt a different approach in this work. The optimization of the linear determinant coefficients is decoupled from the non-linear orbital rotation parameters. The linear coefficients are then solved for stochastically each iteration using the machinery of the full configuration interaction quantum Monte Carlo (FCIQMC) method.Booth et al. (2009) Stochastically-derived quantities have been previously used for quantum-chemical methods,Neuhauser et al. (2012); Willow et al. (2013); Neuhauser et al. (2014) and FCIQMC provides a probablistic approach to the solution of the configuration interaction (CI) problem, whereby sparsity in the wavefunction is exploited to allow for large numbers of determinants to be optimized in systems for which a deterministic solution (or even enumeration of the space) would be impossible. Simultaneous to this sampling, one- and two-body reduced density matrices are accumulated, which are used in a subsequent step to form the gradient and Hessian for the orbital rotations amongst the active and inactive spaces. The rotations are then solved for deterministically (using the stochastically sampled gradient and Hessian), since there are generally a manageable number of these variational parameters, which grow only as where is the total number of orbitals in the full space. This process is repeated until convergence, implicitly coupling the linear parameters of the active space back to the non-linear rotations within the full space.
We begin with a more detailed discussion of these steps, before comparing the new approach to traditional techniques. Of particular interest is how the systematic and random errors of the result depend upon the number of walkers, and the upon amount of time for which these walkers stochastically sample the active space in the FCIQMC step, as well as an analysis of the nature of the convergence over updates of the non-linear parameters. Moving beyond what can be feasibly optimized within traditional CASSCF, we turn to a series of increasingly large polycyclic aromatic hydrocarbons. These are important systems in the development of new photovoltaic devices,Smith and Michl (2013); Teichen and Eaves (2012) while the trend from molecular systems towards the infinite graphene system and its novel properties is also of interest.Clar (1964); Havey (1997) However, as the conjugated system grows in size, there is no natural separation of energy scales from which an active space can be formed, and so the problem of a balanced choice of active space with which to compare systems of different sizes traditionally becomes difficult. With this new approach, however, we can consider the complete space in all cases, allowing for a consistent truncation of the correlation effects, and can examine trends as these systems grow in size. Density matrix renormalization group (DMRG) calculations have previously indicated a trend for increasing strong correlation effects, towards polyradical character, in one-dimensional polycyclic acenes and narrow nano-ribbons,Hachmann et al. (2007); Mizukami et al. (2012) and we compare this to the character of growth in two-dimensional polycyclic systems, up to the coronene system, including all 24 electrons and orbitals in the active space. It is found that, whilst there is a general increase in correlated behavior with increasing system size, the two-dimensional series also sees the development of degeneracies in the highest-occupied and lowest-unoccupied parts of the natural-orbital spectrum not observed in the linear cases.
Ii Methodology
The wavefunction of MCSCF theory is ultimately expressed as a linear combination of -electron Slater determinants, . Its crucial difference from a wavefunction expanded as a subset of configurations, however, is the inclusion of an orbital rotation operator, , to allow for the iterative optimization of the space spanned by the correlated orbitals, in addition to the set of configurational coefficients, , within it:
(1) |
The anti-Hermitian operator, , takes the form of a sum over antisymmetrized excitation operators, which in this work we take to be spin-free,
(2) | ||||
(3) |
with
(4) |
such that the rotations are real and preserve spin. The task is thus to determine the coefficients, , and the matrix elements, .
It is specifically the complete-active-space (CASSCF) treatment, which truncates the configurational space by partitioning the orbital space into inactive, active, and virtual subspaces, with which we are concerned here. No restrictions are placed on the allowed occupancies in the active manifold, but the inactive and virtual orbitals are constrained to be always doubly occupied and unoccupied, respectively, in all configurations. The related restricted-active-space (RASSCF) theory, which further divides the active space by placing restrictions on the number of holes and electrons allowed to appear at its lower and upper energetic limits, is also implemented, and will be considered in future work. In CASSCF, simplifications arise as all intra-space rotations (active-active rotations, for instance) are redundant, while all inter-space transformations are not.Helgaker et al. (2000) The only transformations which need be considered, therefore, are inactive-virtual, inactive-active, and active-virtual rotations, which induces a sizeable reduction in the effective size of the matrix .
In principle, CASSCF may be undertaken as a simultaneous second-order optimization of the orbital and configurational parameters. Whilst some of the established CASSCF implementations take this approach,Knowles and Werner (1985); Werner and Knowles (1985); Werner (1987); Shepard (1987); Schmidt and Gordon (1998) it restricts the flexibility for using newly-developed solvers for strongly-correlated quantum problems to obtain the CI coefficients. Increasingly, therefore, a two-step, macroiterative approach, in which the configurational and orbital parameters are optimized separately, has come to prominence. These approaches are predicated on the fact that the equations governing configurational optimization are equivalent to those found in “stand-alone” CI theory, and may thus be solved with any of the techniques which seek to approximate it. The approach using DMRG as the configurational solver has enjoyed considerable success,Zgid and Nooijen (2008); Ghosh et al. (2008); Yanai et al. (2009, 2010); Kurashige et al. (2013) with studies on long-chain polyenes, -carotene, and transition-metal clusters all proving fruitful, and the stochastic approach discussed here seeks, ultimately, to continue in that vein.
Our formulation uses the full configuration interaction quantum Monte Carlo (FCIQMC) technique as its configurational optimizer, which we have elucidated in previous work.Booth et al. (2009); Cleland et al. (2010); Booth and Alavi (2010); Booth et al. (2011); Cleland et al. (2012); Booth et al. (2013); Thomas et al. (2014); Booth et al. (2014); Thomas et al. (2015) This approach achieves the stochastic integration of the underlying Schrödinger equation via an imaginary-time evolution of an ensemble of signed walkers in Slater-determinant space. This evolution is performed as an iterative application of “spawning”, “death”, and “annihilation” steps, with the effect that the walker populations on each determinant, , become proportional to the coefficient, , when averaged over the long-imaginary-time limit. To control the sign problem, the initiator approximation is used, which results in a method which exhibits a systematic error at too low walker number, but which is systematically improvable to exactness as the walker number increases. Throughout this work, the initiator threshold parameter, , is chosen to be .Cleland et al. (2010) To further reduce the size of random error bars, we make use of the semi-stochastic adaptation, in which the projection operator for the ground-state is applied deterministically to a small, “core” space, and stochastically elsewhere.Petruzielo et al. (2012); Blunt et al. (2015) For the systems studied here, we choose our core spaces to be either all the determinants coupled to the reference, or such objects, whichever is the smaller.
Crucially for our present purposes, the FCIQMC approach also facilitates the stochastically unbiased sampling of the two-body reduced density matrices (RDMs),Overy et al. (2014); Thomas et al. (2015) given by
(5) | ||||
(6) |
from which the one-body reduced density matrix can be found as
(7) | ||||
(8) |
An unbiased sampling of these quantities demands the introduction of a second walker ensemble, to which the population dynamics are applied separately, and the statistics acquired independently, from the first. This “replica” sampling eliminates biasing by ensuring that all the required products of determinant amplitudes are calculated from populations from both simulations, and has found application in a variety of related Monte Carlo techniques.Zhang and Kalos (1993); Blunt et al. (2014, 2015)
The reduced density matrices allow for the calculation of a host of molecular properties including electrical moments, forces, and coupling to two-particle geminal functions,Overy et al. (2014); Thomas et al. (2015); Booth et al. (2012) but their chief use in this present study is in their access to the gradient, , and Hessian, , of the energy with respect to the orbital parameters,
(9) |
and
(10) | ||||
(11) |
in which the operator permutes the orbital pair indices and , and is the relevant Hamiltonian.Ghosh et al. (2008); Yanai et al. (2009); Helgaker et al. (2000); Lin and Demkov (2014) These objects may be expressed entirely in terms of the reduced density matrices and the one- and two-electron integrals, and , in much the same manner as the analogous quantities are formulated in Hartree–Fock theory. Indeed, convergence is defined when the gradient vanishes, at which point the generalized Brillouin theorem is satisfied, indicating that there is no interaction between the CASSCF wavefunction and its singly-excited configurations.
The energies presented in this work are quasi-variational energy estimates, derived from these sampled density matrices, and given by
(12) |
This ensures that consistency with the corresponding gradient and Hessian expressions is maintained. It would, however, be instructive in a future study also to consider two other, non-variational energy expressions commonly used in FCIQMC, derived from a mixed estimator and the “shift” – a measure of the offset required to yield a norm-conserving propagator.Booth et al. (2009)
Although the matrix, , may be obtained by solution of the stationary equation,
(13) |
convergence is improved by instead constructing the augmented Hessian and its corresponding eigenequation.Lengsfield III (1980, 1982). Thus, is found as the lowest root of the equation
(14) |
with , whose solution is found at modest computational expense via iterative subspace methods, as implemented in the PySCF package.Sun et al.
The protocol outlined above represents a single macroiterative cycle – a single FCIQMC convergence and update of the CASSCF energy and orbital coefficients. Once performed, the newly-obtained orbitals and transformed integrals are used to construct a new Hamiltonian, and the process continued until the desired convergence is achieved. The initial studies to which we now turn address the suitability of this approach for achieving meaningful and satisfactory convergences of this type, and exactly what level of convergence can be expected of the macroiterations in the presence of random errors in the sampled energy, gradient and Hessian expressions.
Iii Results and discussion
iii.1 Preliminary studies
The natural first test for this new formulation is the comparison with entirely deterministic calculations. It ought, in principle, to provide the same results as an approach using conventional configuration-interaction theory for the configurational optimizer, and a comparison between the two thus yields some insight into the effects of the inevitable introduction of stochastic noise up the convergence. As concrete examples, we turn to the benzene and naphthalene molecules in their experimental equilibrium geometries and in cc-pVDZ basis sets.Herzberg (1966); Dunning, Jr (1989) The active spaces in these cases are (6,6) and (10,10), respectively, for which the diagonalization of the active Hamiltonian is straightforward.
Of particular concern for the stochastic formulation is the satisfactory sampling of the density matrices. We return to a consideration of the behavior with number of walkers () for the later systems, but our initial focus with these smaller molecules is to assess the effect of the number of iterations of the FCIQMC dynamic for the accurate accumulation of the density matrices for these purposes, and the effect of this sampling time on the overall convergence.
The convergence of the CASSCF macroiterations as the number of FCIQMC iterations, , for which the density matrices are sampled changes is shown in Figure 1. These RDMs are sampled afresh every macroiteration, and no prior initialization information is currently used between macroiteration steps, although the potential benefits of so doing are to be investigated in future studies. It can be seen that the overall qualitative convergence profile of the macroiterations are relatively insensitive to the length of time sampling the density matrices. At each FCIQMC iteration, each walker has the possibility to sample only a single contribution to the sum in Equation 6 if a doubly excited walker is successfully spawned, and will contribute terms if a singly excited walker is created. It further contributes (diagonal) terms during the course of a configuration being occupied (this is accumulated and only included once during the lifetime of the determinant for computational expediency – see Reference Overy et al. (2014) for more details). The fact that as few as 500 sampling iterations with walkers (and a successful spawning rate of ) can provide qualitatively correct results, therefore, demonstrates an efficient sampling for these purposes. This is substantially helped by the fact that the diagonal (generally dominant) part of the two-body RDM is averaged over the stochastic wavefunction exactly for the sampling iterations, without requiring an additional stochastic step as is required for the off-diagonal elements in Equation 6. Furthermore, any deterministic space defined as part of the ‘semi-stochastic’ adaptation of the algorithm (which generally contains some of the largest weighted configurations) has its contribution to the density matrices included exactly.Petruzielo et al. (2012); Blunt et al. (2015); Overy et al. (2014) Whether this robust sampling is maintained or more iterations are required for systems with larger degrees of static correlation remains an open question, though this is likely offset by the larger number of walkers also required to resolve the wavefunction.
Despite the similarity and robustness of the qualitative results, the number of sampling iterations does have the expected effect on the standard deviation in the energy estimator over independent samples at each macroiteration. The variation between independent calculations, as measured by the standard deviations in Figure 1, can be traced to two effects due to the stochastic error in the density matrices. The first is the random error in the calculation of the variational energy for a given macroiteration as given by Equation 12, while the second is in the random error in the gradient and Hessian information in previous iterations, which gives different updates for the matrix and therefore orbital spaces in each macroiteration. Both of these effects are systematically controllable by changing both and . Too few sampling iterations of the density matrices () yields noisy behavior, and a standard deviation between independent calculations of up to . Increasing the sampling decreases the changes between energies at each macroiteration as expected. The ultimate convergence criterion for the optimization of the overall wavefunction must be greater than this standard deviation, as even at “convergence”, it is expected that the parameters will vary in a random fashion of the order of this standard deviation. For this reason, the choice of sampling iterations is chosen for the rest of the results in this work as providing close agreement with the fully deterministic result, and allows for a convergence threshold of the order .
Having established that the density matrices can be obtained with sufficient quality for the orbital optimization step, we may compare the specific behavior of the stochastic convergence with that of its deterministic counterpart, as given in Figure 2. The conclusion borne out by these results appears to be that, once the density matrices are resolved with sufficient clarity, the stochastic noise inherent in this new formulation does not particularly change the convergence of the energy over that of the conventional approach. The stochastic method does, as is clear, demand a somewhat more permissive convergence criterion than its deterministic cousin — of the order of , as established above — but the sub-millihartree agreement with deterministic results where comparison is possible suggests the feasibility of this new approach for addressing questions of physical and chemical significance.
As an initial illustration of such problems, we consider the torsional barrier of the ethene molecule, CH, as an archetypal example of the type of substantial static correlation problem to which CASSCF is traditionally best suited.Krylov et al. (1998) In this case, the bonding and anti-bonding and orbitals become degenerate as the dihedral angle, , approaches , and a proper treatment of the twisted geometry thus demands contributions from both the and the configurations to give rise to a diradical singlet.
Figure 3 details the description of the torsional barrier by various levels of theory, and illustrates an important advantage of the CASSCF approach in its ability to eliminate the energetic cusp at . This cusp appears in the restricted HF curve as a result of that theory’s complete neglect of the excited configuration, and the inconsistent treatment of this state by both CCSD and MP2 fails to remove this feature completely. By contrast, the CASSCF allows for the necessarily equal contributions from the and the configurations at the barrier, and hence achieves the smooth curve required for a qualitatively correct description of the behavior. This also demonstrates the ability for the FCIQMC-CASSCF to also work in situations with changing levels of correlation, robustly and consistently optimizing the orbitals across the potential energy curve.
iii.2 Towards extended systems
The ethene molecule provides a useful, albeit simple, illustration of the general importance of treating the full -valence space when considering -electron systems. The formulation introduced here, however, is applicable to active spaces of much greater size, and it is to these more extended systems which our attention now turns.
The polycyclic aromatic hydrocarbons, formed of fused six-membered rings,Clar (1964); Havey (1997) have long been a focus of both experimental and theoretical interest,Angliker et al. (1982); Kertesz and Hoffmann (1983); Kivelson and Chapman (1983); Wiberg (1997); Houk et al. (2001); Bendikov et al. (2004, 2004, 2004); Mondal et al. (2006); Reddy and Bendikov (2006); Hachmann et al. (2007) providing technological potential,Smith and Michl (2013); Teichen and Eaves (2012); Reese et al. (2004) as well as being a source of medical and environmental concern.Samanta et al. (2002); Armstrong et al. (2004); Srogi (2007) The linear acenes have been extensively studied with DMRG,Hachmann et al. (2007); Mizukami et al. (2012) as their quasi-one-dimensional structure is ideally suited to the method. However, an advantage of FCIQMC is that its efficiency is not particularly dependent on the dimensionality of the system. As exemplar problems of this new technique, therefore, we explore a series of non-linear polycyclic aromatic hydrocarbons in cc-pVDZ basis sets,Dunning, Jr (1989) up to the coronene — or superbenzene — molecule, taking in the full carbon system in each case. This allows for a balanced active space for comparison between the systems as they increase in size, without truncation of the conjugated electrons. The geometries used are optimized at the level of density-functional theory,Eckert et al. (1997); Werner et al. (2012, 2012) using the B3LYP functional.Vosko et al. (1980); Lee et al. (1988); Becke (1993); Stephens et al. (1994)
Following on from the benzene and naphthalene studies of the previous section, we first consider the phenanthrene and triphenylene molecules, with (14,14) and (18,18) active spaces, respectively. This latter system is beyond the reach of deterministic CASSCF theories but, as illustrated in Figure 4, the stochastic formulation provides a smooth, robust convergence.
The energy for each macroiteration is obtained directly from the reduced density matrices as in Equation 12, and, as in the previous section, we calculate an average energy for each point from the results of three independent calculations, initialized from different random number seeds, with the corresponding standard deviation serving as the error bar. As can be seen from Figure 4, the precise trajectories of the independent convergences differ somewhat in their initial phases, as small random errors in the gradients and Hessians lead to propagation of differing orbitals through the macroiterative optimization. However, the later iterations agree sufficiently well that a meaningful energy and error bar may be extracted, providing confidence that the same orbital space is sampled at convergence. In particular, it is evident that performing each configurational step for density-matrix sampling iterations after an equilibration period — the rule of thumb inferred in the previous section — is sufficient for the robust, reliable convergence which is our aim. In addition, walker populations were chosen such that the population residing on the reference exceeded , which has previously been shown to be a satisfactory criterion for reliable convergence of any initiator error for systems with substantial weight on a single determinant.Cleland et al. (2012); Thomas et al. (2014, 2015) This results in small changes to the total walker number for each iteration, as the wavefunction sparsity alters as the orbitals change. However, this was found to give consistently converged results for these systems, and we now turn to a more detailed consideration of the number of walkers required for larger active spaces in the coronene molecule.
iii.2.1 The effect of walker number
It has been stressed that the macroiterative optimization of the orbitals in CASSCF is a non-linear problem, and as such ensuring convergence to a global minimum is close to impossible, while sensitivity to initial conditions or simulation procedure can lead to differing solutions. One way in which convergence to global minima is encouraged in non-linear optimization problems is somewhat counterintuitively via the addition of random noise into the gradient and Hessian in an optimization step. An example of this is the essential addition of “noise” to the optimization in DMRG, where a non-linear factorization of the wavefunction coefficients is optimized.White (2005); Zgid and Nooijen (2008) Although the added noise in DMRG is not generally stochastic (though can be), neglect of this noise can lead to the convergence to meta-stable solutions and local minima, rather than the desired converged state.White (2005) With this in mind, we detail the three anticipated effects of increasing the number of walkers in the FCIQMC-CASSCF procedure.
First among these will be the decrease in “initiator error” in the FCIQMC calculation.Cleland et al. (2010) The use of the initiator approximation (applied throughout this work) entails the introduction of a systematically improvable error into the sampling of the wavefunction as a way of encouraging annihilation events and ameliorating the sign problem in the space. As the number of walkers increases, it can be rigorously shown that the energy (generally rapidly) converges to the exact (FCI) result. Secondly, the increase in walkers will lead to a reduction in the error in the sampled density matrices (and hence the gradient and Hessian of the orbital rotations), again tending to an exact limit. This is due both to a reduction in the initiator error in the wavefunction from which the density matrices are sampled, and also an improvement in the sampling of the matrices themselves. However, a third effect of changing the walker number relates to the convergence profile in the non-linear optimization of the orbitals. As the walker number increases, it will decrease the random noise in the sampled gradient and Hessian information for the orbital rotation steps, and this will influence the stability of the global optimization of the orbitals.
The convergences for coronene, using a (24,24) active space and different values for , are shown in Figure 5. The number of active space determinants is . The simulations which are carried out with and million walkers per FCIQMC calculation (the latter value of which fulfils the criterion of placing walkers upon the reference) provide the kind of smooth, robust convergences in close agreement with one another which we observe for the smaller systems. Although slightly different optimization pathways may be taken, these walker numbers both achieve a quantitative agreement at convergence within a small standard deviation of each other. However, a calculation using only million walkers for each step, results in a very different convergence to a lower-energy solution, in marked disagreement with the other results.
This discrepancy is not merely a simple manifestation of initiator error, but rather a clear example of the sensitivity of the optimization to its initial and early conditions. As Figure 6 illustrates, increasing the number of walkers in FCIQMC calculations to eliminate any effects of initiator error using the converged orbitals of both the meta-stable (higher energy) and stable (lower energy) states decreases their energy as the initiator error is removed, but the two states remain well separated by , while the initiator error is less than .
It appears, therefore, that rather noisy initial macroiterations may, in fact, be advantageous in converging upon a true ground state, much in the manner that added noise is helpful for DMRG. Indeed, in Figure 7, we adopt a protocol wherein the first configurational optimizations are performed with walkers, and macroiteration and later steps with million walkers to converge residual initiator error. This approach yields a trajectory towards the converged energy of the stable state in Figure 6, and is the recommended procedure for future studies to encourage convergence to the global minima of the orbital optimization problem. We speculate further that a deterministic FCI solver of the (24,24) coronene problem (if possible) with the same CASSCF algorithm would also likely converge to the higher energy orbital solution, and that for large active spaces or in the presence of strong correlation and difficult convergence, the addition of artificial noise to the CASSCF optimization may also yield benefits in convergence stability. The protocol of optimizing first at lower walker numbers also reduces the computational cost of an FCIQMC-CASSCF calculation, since only a few final iterations are required at the higher walker number finally to eliminate initiator error. While the optimisation is weakly exponentially scaling with the number of orbitals and electrons in the active space,Booth et al. (2011); Cleland et al. (2012); Booth et al. (2014); Shepherd et al. (2014) in the example of the (24,24) coronene system in Figure 7, this corresponded to only hours on cores per macroiteration (including savings due to symmetry), taking advantage of the highly parallelizable nature of the FCIQMC algorithm. Memory requirements were also small, at GB of distributed memory (in contrast to the unfeasible traditional calculation which would require a minimum of TB of RAM).
iii.2.2 The electronic structure of polycyclic hydrocarbons from FCIQMC-CASSCF
Further insight into the electronic structure of these systems may be gleaned by considering the natural orbital occupation numbers, , of the converged wavefunctions. Integrated measures of the number of “effectively unpaired” electrons in a system have been proposed by Takatsuka,Takatsuka et al. (1978); Takatsuka and Fueno (1978); Bochicchio (1998); Staroverov and Davidson (2000, 2000) as
(15) |
and by Head-Gordon,Head-Gordon (2003, 2003); Bochicchio et al. (2003) as
(16) |
in which . In addition, the HONO-LUNO gap, , can be used, where HONO and LUNO correspond to the highest/lowest occupied/unoccupied natural orbital respectively. This quantity should not be considered as a measure of an energetic excitation, but rather as a metric for the deviation from the suitability of a Hartree–Fock picture of the electronic wavefunction. As has been previously pointed out,Hachmann et al. (2007); Mizukami et al. (2012) a little care must also be exercised in interpreting the values of ; certainly, they should not be taken literally as the number of unpaired electrons found in a given system, but, taken together with , provide a useful indication of the correlated effects present in differently-sized molecules.
Figure 8 compares these metrics for the two-dimensional systems discussed thus far to the corresponding linear acenes, which we have also calculated. The trend for the linear systems reveals, as has been previously reported,Hachmann et al. (2007) the steady increase in correlated behavior as the chain length (and CAS size) is increased, leading to increasingly polyradical character of the wavefunction. The behavior for the two-dimensional systems, however, is rather less straightforward, indicating the strength of the correlation effects and radical nature of the resulting wavefunction is not simply related to the number of electrons in the conjugated system, but also to the precise geometry of the molecule under study. Whilst the integrated measure of the radical character is approximately the same for the 1D and 2D structures when comparing the same number of conjugated electrons, the HONO-LUNO gap does not increase at the same rate.
This behavior can be rationalized by examining the occupation-number spectra, as shown in Figure 9 for linear tetracene and two-dimensional triphenylene, which have the same number of electrons in its -system. These exemplar spectra highlight that, whilst the HONO-LUNO gap of the linear system is smaller than that of the two-dimensional case, the latter system sees the emergence of degeneracy in both the HONO and the LUNO not present in the former, contributing to the integrated measures of the radical nature. This is expected due to the increased symmetry of the 2D system, as compared to the linear acene. It has previously been shown that the radical HONO and LUNO orbitals are related to edge effects of these conjugated systems, with larger systems exhibiting HONO and LUNO orbitals in these systems which increasingly localize onto the perimeter of conjugated system.Mizukami et al. (2012) This rationalizes the results of Figure 8, where the smaller perimeter of the two-dimensional systems compared to the linear acenes for the same number of electrons is reflected in the size of the HONO-LUNO gap. Visualization of the coronene HONO in Figure 10, also indicates that these orbitals that are most radical in nature are indeed beginning to localize on the edge of the molecule. It is anticipated that the increased degeneracy of the 2D conjugated systems will mean that for a given perimeter of these systems, the 2D systems will exhibit stronger correlation effects than their 1D counterparts. Theoretically observing these correlated trends provides a stern challenge for future applications of this new approach. However, the applicability and relative ease in dealing with these large correlated spaces with a stochastic solver bodes well for its future utility and scope to probe these systems.
Iv Conclusions
A novel formulation of multi-configurational self consistent field theory, using a two-step, macroiterative algorithm with full configuration interaction quantum Monte Carlo for its configurational optimizations, is capable of performing successful calculations with large active spaces. In this work, this ability is demonstrated with a study of a series of two-dimensional polycyclic aromatic hydrocarbons, culminating in calculations on the coronene molecule with a (24,24) active space.
The inevitable introduction of stochastic noise in this approach does not significantly worsen convergence over a deterministic method, provided that satisfactory sampling is undertaken. In particular, we observe that the FCIQMC calculations must be performed for sufficiently many density-matrix sampling iterations (with a useful, if informal, lower bound) and with sufficiently many walkers if the adverse effects of errors in the density matrices are to be avoided. Moreover, it is shown that the noise brought about by using comparatively few walkers for the initial iterations can be of benefit in converging to a true ground state, with high populations only required in the last stages of convergence in order to provide a satisfactory treatment of initiator error. The modest computational resources thus required suggest that larger active spaces still will be within range, and thus that major new avenues of inquiry — including studies of transition-metal compounds and clusters — are now open and navigable. Furthermore, an approach for obtaining excited states within FCIQMC will allow for a state-averaged-CASSCF implementation in the future, allowing for studies of excitations and photochemistry in correlated molecules.Blunt et al. (2015)
V Acknowledgements
The authors wish to thank Nicholas Blunt for helpful comments on the manuscript. R.E.T gratefully acknowledges Trinity College, Cambridge for funding, while G.H.B gratefully acknowledges the Royal Society via a University Research Fellowship. This work was supported by EPSRC under Grant No. EP/J003867/1. The calculations made use of the facilities of the Max Planck Society’s Rechenzentrum Garching.
References
- Helgaker et al. (2000) Helgaker, T.; Jørgensen, P.; Olsen, J. Molecular Electronic-Structure Theory; John Wiley & Sons: Chichester, 2000.
- Roos (1972) Roos, B. O. Chem. Phys. Lett. 1972, 15, 153.
- Hinze (1973) Hinze, J. J. Chem. Phys. 1973, 59, 6424.
- Siegbahn et al. (1980) Siegbahn, P. E. M.; Heiberg, A.; Roos, B. O.; Levy, B. Phys. Scripta 1980, 21, 323.
- Roos et al. (1980) Roos, B. O.; Taylor, P. R.; Siegbahn, P. E. M. Chem. Phys. 1980, 48, 157.
- Siegbahn et al. (1981) Siegbahn, P. E. M.; Almlöf, J.; Heiberg, A.; Roos, B. O. J. Chem. Phys. 1981, 74, 2384.
- Knowles and Werner (1985) Knowles, P. J.; Werner, H.-J. Chem. Phys. Lett. 1985, 115, 259.
- Werner and Knowles (1985) Werner, H.-J.; Knowles, P. J. J. Chem. Phys. 1985, 82, 5053.
- Olsen (2011) Olsen, J. Int. J. Quantum Chem. 2011, 111, 3267.
- Lin and Demkov (2014) Lin, C.; Demkov, A. A. Phys. Rev. B 2014, 90, 235122.
- Andersson et al. (1992) Andersson, K.; Malmqvist, P.-Å.; Roos, B. O. J. Chem. Phys. 1992, 96, 1218.
- Werner and Knowles (1988) Werner, H.-J.; Knowles, P. J. J. Chem. Phys. 1988, 89, 5803.
- Shamasundar et al. (2011) Shamasundar, K. R.; Knizia, G.; Werner, H.-J. J. Chem. Phys. 2011, 135, 054101.
- Olsen et al. (1988) Olsen, J.; Roos, B. O.; Jørgensen, P.; Aa. Jensen, H. J. J. Chem Phys. 1988, 89, 2185.
- Malmqvist et al. (1990) Malmqvist, P.-Å.; Rendell, A.; Roos, B. O. J. Phys. Chem. 1990, 94, 5477.
- Fleig et al. (2001) Fleig, T.; Olsen, J.; Marian, C. M. J. Chem. Phys. 2001, 114, 4775.
- Fleig et al. (2003) Fleig, T.; Olsen, J.; Visscher, L. J. Chem. Phys. 2003, 119, 2963.
- Ma et al. (2011) Ma, D.; Li Manni, G.; Gagliardi, L. J. Chem. Phys. 2011, 135, 044128.
- Vogiatzis et al. (2015) Vogiatzis, K. D.; Li Manni, G.; Stoneburner, S. J.; Ma, D.; Gagliardi, L. J. Chem. Theory Comput. 2015, 11, 3010.
- Booth et al. (2009) Booth, G. H.; Thom, A. J. W.; Alavi, A. J. Chem. Phys. 2009, 131, 054106.
- Neuhauser et al. (2012) Neuhauser, D.; Rabani, E.; Baer, R. J. Chem. Theory Comput. 2012, 9, 24.
- Willow et al. (2013) Willow, S. Y.; Kim, K. S.; Hirata, S. J. Chem. Phys. 2013, 138, 164111.
- Neuhauser et al. (2014) Neuhauser, D.; Gao, Y.; Arntsen, C.; Karshenas, C.; Rabani, E.; Baer, R. Phys. Rev. Lett. 2014, 113, 076402.
- Smith and Michl (2013) Smith, M. B.; Michl, J. Annu. Rev. Phys. Chem. 2013, 64, 361–386.
- Teichen and Eaves (2012) Teichen, P. E.; Eaves, J. D. J. Phys. Chem. B 2012, 116, 11473–11481.
- Clar (1964) Clar, E. Polycyclic Hydrocarbons; Academic: London, 1964.
- Havey (1997) Havey, R. G. Polycyclic Aromatic Hydrocarbons; Wiley-VCH: New York, 1997.
- Hachmann et al. (2007) Hachmann, J.; Dorando, J. J.; Avilés, M.; Chan, G. K.-L. J. Chem. Phys. 2007, 127, 134309.
- Mizukami et al. (2012) Mizukami, W.; Kurashige, Y.; Yanai, T. J. Chem. Theory Comput. 2012, 9, 401.
- Werner (1987) Werner, H.-J. Adv. Chem. Phys. 1987, 69, 1.
- Shepard (1987) Shepard, R. Adv. Chem. Phys. 1987, 69, 63.
- Schmidt and Gordon (1998) Schmidt, M. W.; Gordon, M. S. Ann. Rev. Phys. Chem. 1998, 49, 233.
- Zgid and Nooijen (2008) Zgid, D.; Nooijen, M. J. Chem. Phys. 2008, 128, 144116.
- Ghosh et al. (2008) Ghosh, D.; Hachmann, J.; Yanai, T.; Chan, G. K.-L. J. Chem. Phys. 2008, 128, 144117.
- Yanai et al. (2009) Yanai, T.; Kurashige, Y.; Ghosh, D.; Chan, G. K.-L. Int. J. Quantum Chem. 2009, 109, 2178.
- Yanai et al. (2010) Yanai, T.; Kurashige, Y.; Neuscamman, E.; Chan, G. K.-L. J. Chem. Phys. 2010, 132, 024105.
- Kurashige et al. (2013) Kurashige, Y.; Chan, G. K.-L.; Yanai, T. Nature Chem. 2013, 5, 660.
- Cleland et al. (2010) Cleland, D. M.; Booth, G. H.; Alavi, A. J. Chem. Phys. 2010, 132, 041103.
- Booth and Alavi (2010) Booth, G. H.; Alavi, A. J. Chem. Phys. 2010, 132, 174104.
- Booth et al. (2011) Booth, G. H.; Cleland, D. M.; Thom, A. J. W.; Alavi, A. J. Chem. Phys. 2011, 135, 084104.
- Cleland et al. (2012) Cleland, D. M.; Booth, G. H.; Overy, C.; Alavi, A. J. Chem. Theory Comput. 2012, 8, 4138.
- Booth et al. (2013) Booth, G. H.; Grüneis, A.; Kresse, G.; Alavi, A. Nature 2013, 493, 365.
- Thomas et al. (2014) Thomas, R. E.; Overy, C.; Booth, G. H.; Alavi, A. J. Chem. Theory Comput. 2014, 10, 1915.
- Booth et al. (2014) Booth, G. H.; Smart, S. D.; Alavi, A. Mol. Phys. 2014, 112, 1855.
- Thomas et al. (2015) Thomas, R. E.; Booth, G. H.; Alavi, A. Phys. Rev. Lett. 2015, 114, 033001.
- Petruzielo et al. (2012) Petruzielo, F. R.; Holmes, A. A.; Changlani, H. J.; Nightingale, M. P.; Umrigar, C. J. Phys. Rev. Lett. 2012, 109, 230201.
- Blunt et al. (2015) Blunt, N. S.; Smart, S. D.; Kersten, J. A.-F.; Spencer, J. S.; Booth, G. H.; Alavi, A. J. Chem. Phys. 2015, 142, 184107.
- Overy et al. (2014) Overy, C.; Booth, G. H.; Blunt, N. S.; Shepherd, J. J.; Cleland, D. M.; Alavi, A. J. Chem. Phys. 2014, 141, 244117.
- Thomas et al. (2015) Thomas, R. E.; Opalka, D.; Overy, C.; Knowles, P. J.; Alavi, A.; Booth, G. H. J. Chem. Phys. 2015, 143, 054108.
- Zhang and Kalos (1993) Zhang, S.; Kalos, M. H. J. Stat. Phys. 1993, 70, 515.
- Blunt et al. (2014) Blunt, N. S.; Rogers, T. W.; Spencer, J. S.; Foulkes, W. M. C. Phys. Rev. B 2014, 89, 245124.
- Blunt et al. (2015) Blunt, N. S.; Alavi, A.; Booth, G. H. Phys. Rev. Lett. 2015, 115, 050603.
- Booth et al. (2012) Booth, G. H.; Cleland, D. M.; Alavi, A.; Tew, D. P. J. Chem. Phys. 2012, 137, 164112.
- Lengsfield III (1980) Lengsfield III, B. H. J. Chem. Phys. 1980, 73, 382.
- Lengsfield III (1982) Lengsfield III, B. H. J. Chem. Phys. 1982, 77, 4073.
- (56) others,, et al. PySCF, a python module for quantum chemistry. see https://github.com/sunqm/pyscf.git.
- Herzberg (1966) Herzberg, G. Molecular Spectra and Molecular Structure: III. Electronic spectra and electronic structure of polyatomic molecules; Van Nostrand: New York, 1966.
- Dunning, Jr (1989) Dunning, Jr, T. H. J. Chem. Phys. 1989, 90, 1007.
- Krylov et al. (1998) Krylov, A. I.; Sherrill, C. D.; Byrd, E. F. C.; Head-Gordon, M. J. Chem. Phys. 1998, 109, 10669.
- Angliker et al. (1982) Angliker, H.; Rommel, E.; Wirz, J. Chem. Phys. Lett. 1982, 87, 208.
- Kertesz and Hoffmann (1983) Kertesz, M.; Hoffmann, R. Solid State Commun. 1983, 47, 97.
- Kivelson and Chapman (1983) Kivelson, S.; Chapman, O. L. Phys. Rev. B 1983, 28, 7236.
- Wiberg (1997) Wiberg, K. B. J. Org. Chem. 1997, 62, 5720.
- Houk et al. (2001) Houk, K. N.; Lee, P. S.; Nendel, M. J. Org. Chem. 2001, 66, 5517.
- Bendikov et al. (2004) Bendikov, M.; Duong, H. M.; Starkey, K.; Houk, K. N.; Carter, E. A.; Wudl, F. J. Am. Chem. Soc. 2004, 126, 7416.
- Bendikov et al. (2004) Bendikov, M.; Duong, H. M.; Starkey, K.; Houk, K. N.; Carter, E. A.; Wudl, F. J. Am. Chem. Soc. 2004, 126, 10493.
- Bendikov et al. (2004) Bendikov, M.; Wudl, F.; Perepichka, D. F. Chem. Rev. 2004, 104, 4891.
- Mondal et al. (2006) Mondal, R.; Shah, B. K.; Neckers, D. C. J. Am. Chem. Soc. 2006, 128, 9612.
- Reddy and Bendikov (2006) Reddy, A. R.; Bendikov, M. Chem. Commun 2006, 1179–1181.
- Reese et al. (2004) Reese, C.; Roberts, M.; Ling, M.; Bao, Z. Mater. Today 2004, 7, 20.
- Samanta et al. (2002) Samanta, S. K.; Singh, O. V.; Jain, R. K. Trends Biotechnol. 2002, 20, 243.
- Armstrong et al. (2004) Armstrong, B.; Hutchinson, E.; Unwin, J.; Fletcher, T. Environ. Health Persp. 2004, 112, 970.
- Srogi (2007) Srogi, K. Environ. Chem. Lett. 2007, 5, 169.
- Eckert et al. (1997) Eckert, F.; Pulay, P.; Werner, H.-J. J. Comp. Chem. 1997, 18, 1473.
- Werner et al. (2012) Werner, H.-J.; Knowles, P. J.; Knizia, G.; Manby, F. R.; Schütz, M. Wiley Interdiscip. Rev.: Comput. Mol. Sci. 2012, 2, 242.
- Werner et al. (2012) others,, et al. MOLPRO, version 2012.1, a package of ab initio programs. 2012; see http://molpro.net.
- Vosko et al. (1980) Vosko, S. H.; Wilk, L.; Nusair, M. Can. J. Phys. 1980, 58, 1200.
- Lee et al. (1988) Lee, C.; Yang, W.; Parr, R. G. Phys. Rev. B 1988, 37, 785.
- Becke (1993) Becke, A. D. J. Chem. Phys. 1993, 98, 5648.
- Stephens et al. (1994) Stephens, P. J.; Devlin, F. J.; Chabalowski, C. F.; Frisch, M. J. J. Phys. Chem. 1994, 98, 11623.
- White (2005) White, S. R. Phys. Rev. B 2005, 72, 180403(R).
- Shepherd et al. (2014) Shepherd, J. J.; Scuseria, G. E.; Spencer, J. S. Phys. Rev. B 2014, 90, 155130.
- Takatsuka et al. (1978) Takatsuka, K.; Fueno, T.; Yamaguchi, K. Theor. Chim. Acta 1978, 48, 175.
- Takatsuka and Fueno (1978) Takatsuka, K.; Fueno, T. J. Chem. Phys. 1978, 69, 661.
- Bochicchio (1998) Bochicchio, R. C. J. Mol. Struct.: THEOCHEM 1998, 429, 229.
- Staroverov and Davidson (2000) Staroverov, V. N.; Davidson, E. R. J. Am. Chem. Soc. 2000, 122, 186.
- Staroverov and Davidson (2000) Staroverov, V. N.; Davidson, E. R. Chem. Phys. Lett. 2000, 330, 161.
- Head-Gordon (2003) Head-Gordon, M. Chem. Phys. Lett. 2003, 372, 508.
- Head-Gordon (2003) Head-Gordon, M. Chem. Phys. Lett. 2003, 380, 488.
- Bochicchio et al. (2003) Bochicchio, R. C.; Torre, A.; Lain, L. Chem. Phys. Lett. 2003, 380, 486.
- Schaftenaar and Noordik (2000) Schaftenaar, G.; Noordik, J. H. J. Comput.-Aided Mol. Design 2000, 14, 123.
- Blunt et al. (2015) Blunt, N. S.; Smart, S. D.; Booth, G. H.; Alavi, A. J. Chem. Phys. 2015, 143, 134117.