GW method with the self-consistent Sternheimer equation

GW method with the self-consistent Sternheimer equation

Feliciano Giustino feliciano.giustino@materials.ox.ac.uk Department of Materials, University of Oxford, Parks Road, Oxford OX1 3PH, United Kingdom Department of Physics, University of California at Berkeley, Berkeley, California 94720, USA, and Materials Sciences Division, Lawrence Berkeley National Laboratory, Berkeley, California 94720, USA    Marvin L. Cohen    Steven G. Louie Department of Physics, University of California at Berkeley, Berkeley, California 94720, USA, and Materials Sciences Division, Lawrence Berkeley National Laboratory, Berkeley, California 94720, USA
September 9, 2019
Abstract

We propose a novel approach to quasiparticle GW calculations which does not require the computation of unoccupied electronic states. In our approach the screened Coulomb interaction is evaluated by solving self-consistent linear-response Sternheimer equations, and the noninteracting Green’s function is evaluated by solving inhomogeneous linear systems. The frequency-dependence of the screened Coulomb interaction is explicitly taken into account. In order to avoid the singularities of the screened Coulomb interaction the calculations are performed along the imaginary axis, and the results are analytically continued to the real axis through Padé approximants. As a proof of concept we implemented the proposed methodology within the empirical pseudopotential formalism and we validated our implementation using silicon as a test case. We examine the advantages and limitations of our method and describe promising future directions.

pacs:
71.15.-m, 71.15.Qe

I Introduction

During the past two and a half decades the methodhedin1 ; hl86 for the study of electron quasiparticle excitations has had a number of successes and witnessed significant growth of interest within the computational electronic structure community. The method is currently being used for predicting electron quasi-particle excitation spectra as well as optical spectra in a variety of materials ranging from bulk solids to nanostructures and organic systems. The method is also of widespread use as a starting point for Bethe-Salpeter calculations of two-particle neutral excitations.onida ; bse1 ; bse2 ; bse3 ; rolfing ; reining-review Current implementations find many diverse applications, including among others the calculation of the optical response of nanostructures,catalin quantum transport in nanoscale junctions,rubio pump-probe spectroscopy,catalin-lw angle-resolved photoemission spectroscopy,cheolhwan and strongly-correlated systems.bruneval-oxide

Current trends in the development of improved computational approaches for quasiparticle excitations based on the method include the refinement of the initial guess for the non-interacting Green’s function and for the polarization operator,rinke ; schilfegarde the inclusion of approximate vertex corrections or higher-order self-energy diagrams,york and the description of the frequency-dependent dielectric response beyond the original generalized plasmon-pole approximation.spacetime ; blochl Detailed reviews of past and current developments in techniques can be found in Refs. hl, ; gunnarsson, ; reining-review, ; rinke, ; wilkins, .

The majority of current implementations obtain the screened Coulomb interaction and the non-interacting Green’s function using a perturbative expansion over the Kohn-Sham eigenstates (cf. Sec. II.1 below). Such expansion requires the calculation of both occupied and unoccupied electronic states, as well as their associated optical matrix elements.hl86 A common bottleneck of this approach is that the convergence of the quasi-particle excitation energies with the number of unoccupied states is rather slow.sohrab This difficulty is particularly relevant when calculating the absolute values of the quasiparticle excitation energies.bruneval-gonze Several avenues have been explored so far in order to circumvent this bottleneck and to perform calculations by employing only occupied electronic states,reining-sternheimer ; umari1 ; umari2 ; gygi or a small number of unoccupied states.bruneval-gonze

The main aim of the present work is to demonstrate the feasibility of calculations entirely based on occupied states only.baroni.rmp In practice we adopt the principles of density-functional perturbation theory (DFPT) to determine (i) the frequency-dependent screened Coulomb interaction by directly solving self-consistent linear response Sternheimer equations, and (ii) the non-interacting Green’s function by solving inhomogeneous linear systems. The main advantage of the proposed method is that it does not require the computation of unoccupied electronic states. In addition, we demonstrate the possibility of fast evaluations of the frequency dependence of the screened Coulomb interaction based on multishift linear-system solvers.frommer As a proof of concept we have implemented our method within a planewaves empirical pseudopotential scheme,cohen_berg and validated it by comparing with previous work for the prototypical test case of silicon.

The use of the Sternheimer equation for calculating the polarizability in the random-phase approximation (RPA) or the inverse dielectric matrix has already been discussed in Refs. fleszar-resta, and kunc-tosatti, , respectively, within the framework of a non-perturbative supercell approach. After the introduction of DFPT in the context of lattice-dynamical calculations,giannozzi the authors of Ref. reining-sternheimer, proposed the use of the non self-consistent Sternheimer method for the calculation of the dielectric matrix. The elimination of unoccupied electronic states in the evaluation of the screened Coulomb interaction has also been proposed recently within the framework of a Wannier-like representation of the polarization propagator and the Lanczos recursion method.umari1 ; umari2

This manuscript is organized as follows. In Sec. II we describe how the self-consistent Sternheimer formalism can be adapted to perform calculations. In particular, we outline the procedure to obtain the screened Coulomb interaction in Sec. II.1, the non-interacting Green’s function in Sec. II.2, and the self-energy in Sec. II.3. In Sec. III we specialize to a planewave basis set representation and derive the key equations for the case of Bloch electrons. Sections III.1, III.2, III.3 parallel the corresponding sections in the general theory part, respectively. In Sec. IV we critically analyze the advantages and limitations of the present approach with an emphasis on the scaling of the calculations with system size. In Sec. V we report the results of our test calculations for silicon and compare with previous calculations in the literature. Specifically, we presents results for the direct and inverse dielectric matrix (Sec. V.1), for the analytic continuation of the dielectric matrix using Padé approximants (Sec. V.2), for the self-energy (Sec. V.3), and for the spectral function (Sec. V.4). In Sec. VI we discuss possible future developments of our method and discuss our conclusions. The Appendices provide technical details on some numerical algorithms adopted in this work, in particular the preconditioned complex biconjugate gradient method (Appendix A), the analysis of the conditioning of the Sternheimer equations (Appendix B), the analytic continuation using Padé approximants (Appendix C), and the use of multishift methods for the simultaneous calculation of the polarization at multiple frequencies (Appendix D).

Ii General theory

ii.1 Screened Coulomb interaction

In this section we describe how to exploit the Sternheimer scheme within density-functional perturbation theory in order to calculate the screened Coulomb interaction (where , are the space variables and is the excitation frequency). While the use of the Sternheimer approach in DFPT was originally developed bearing in mind the Kohn-Sham effective Hamiltonian, we note that the present procedure applies without restrictions also to post-DFT methods such as the LDA+U method,anisimov hybrid functionals,becke and exact exchange.exx We assume Rydberg atomic units throughout this manuscript. The Hedin’s equation which defines the screened Coulomb interaction reads:hl

(1)

where denotes the bare Coulomb interaction and the irreducible polarization propagator. As Eq. (1) is a self-consistent Dyson equation for the screened Coulomb interaction, it should be possible to solve it recursively in the spirit of density-functional perturbation theory. For simplicity, we here specialize to the case of the random-phase approximation (RPA) for the polarization propagator. The generalization of this procedure to include exchange and correlation effects can be performed without difficulties (cf. Sec. II.1.1). Within the random-phase approximation the polarization propagator can be written as:hl

(2)

where indicates an electronic eigenstate of the single-particle Hamiltonian with energy eigenvalue and occupation number . In the following we assume that the are Kohn-Sham eigenstates for definiteness. In Eq. (2) the summation indices and run over both occupied and unoccupied electronic states, and the factor of 2 accounts for the spin degeneracy.hl Although the expression for the RPA polarization Eq. (2) has been derived for real frequencies in Ref. hl, , it is possible to continue the polarization throughout the complex plane by using Eq. (2) as a definition outside of the real axis.

Our goal is to rewrite Eqs. (1) and (2) by avoiding explicit summations over the unoccupied electronic states. For this purpose it is convenient to regard the screened Coulomb interaction as a function of the second space variable , whilst the first space variable and the frequency are kept as parameters: . If the system under consideration is subject to the perturbation , then within the RPA the first-order variation of the single-particle density matrix reads

(3)

In Eq. (3) the index stands for “valence” and runs over the occupied electronic states only, the factor of 2 is for the spin degeneracy, and the superscript refer to the positive and negative frequency components of the induced charge. The first-order variations of the occupied wavefunctions can be determined by solving the following two Sternheimer equations:

(4)

where is the effective single-particle Hamiltonian and is the projector on the occupied states manifold. In the particular case of vanishing frequency () the variations of the wavefunctions do coincide, and the standard DFPT equations are recovered. The screening Hartree potential associated with the induced charge is calculated as usual through

(5)

and finally the screened Coulomb interaction in the RPA is obtained as

(6)

It is tedious but otherwise straightforward to verify that Eqs. (3)-(6) are equivalent to the original Eqs. (1)-(2). The only assumptions made in our derivation are that time-reversal symmetry applies, and that the system under consideration has a finite energy gap for electronic excitations. The assumption of time-reversal symmetry is not essential and is mainly used to obtain a compact expression for the wavefunction perturbations. The assumption of finite energy gap can be relaxed by using the extension of DFPT to metallic systems developed in Ref. degironcoli, .

There is a simple and intuitive physical meaning associated with the calculation scheme outlined above. To see this we consider an external test charge introduced in the system at the point . This charge generates a bare Coulomb potential , and the system responds to such perturbation by generating the induced charge and the associated screening potential . The sum of the external perturbation and the screening potential yields the screened Coulomb interaction at the point within the RPA.

The linear systems in Eq. (4) must be solved self-consistently. For this purpose we begin by initializing the screened Coulomb interaction using the bare interaction . We then calculate the linear variations of the wavefunctions . Using the calculated linear variations we update the induced charge density and the associated screening potential . This allow us to generate an improved estimate of the screened Coulomb interaction . We cycle through these steps until convergence of the screened Coulomb interaction is achieved. Equations (3),(4) can be regarded as the generalization of the self-consistent Sternheimer equations used for lattice-dynamical calculationsbaroni.rmp to finite-frequency test-charge perturbations.

In practical calculations we solve Eq. (4) along the imaginary frequency axis in order to avoid the null eigenvalues of the operator , and then we perform the analytic continuation of the screened Coulomb interaction to real frequencies (cf. Appendixes A-C). In the special case of it is convenient to modify the linear operator on the left-hand side of Eq. (4) by adding the projector on the occupied states manifold :

(7)

with set to twice the occupied bandwith. This extra term does not affect the solutions which are linear combinations of unoccupied electronic states. At the same time, the extra term has the effect of shifting away from zero the null eigenvalues of the linear operator thereby making it non singular. This strategy is common practice in DFPT implementations,baroni.rmp ; espresso and is discussed in greater detail in Appendix B.

ii.1.1 Vertex correction

Within the scheme outlined here it is rather straightforward to introduce an approximate vertex correction to the self-energy along the lines of Refs. hl86, ; reining94, . This correction results from setting the self-energy in the first iteration of Hedin’s equations to the DFT exchange-correlation (XC) potential, . Within the present scheme this correction is simply obtained by including the variation of the exchange-correlation potential in the self-consistent potential used in Eq. (4):

(8)

being the functional derivative of the XC potential with respect to the density. The screened Coulomb interaction is still to be calculated through Eq. (6). This approach has been called “” approximation in Ref. reining94, due to the inclusion of the XC contribution in the screening of the test charge. That the inclusion of the XC term in the self-consistent induced potential leads to the approximation can easily be seen as follows. We combine Eqs. (3),(8) to yield the induced charge density (we use symbolic operator notation for clarity):

(9)

Then, we substitute this result in the definition of the screened Coulomb interaction Eqs. (5),(6) to find

(10)

The last equation yields precisely the screened Coulomb interaction in the approximation.hl86 ; reining94 The difference between this approach and the standard approximation is that in this case the screening charge is calculated for an electron, while in the -RPA approximation the screening is calculated for a test charge. It is worth pointing out that in standard implementations of DFPT the XC term is already included in the variation of the self-consistent potential,baroni.rmp therefore the use of the approximation would not require any additional computational developments if the present approach was to be implemented on top of existing DFT software.

ii.1.2 Non self-consistent calculation of the dielectric matrix

An alternative approach to the calculation of the screened Coulomb interaction using the self-consistent Sternheimer equation consists in solving Eq. (4) non self-consistently. For this purpose we can replace the self-consistent perturbation in the right-hand side of Eq. (4) by the bare Coulomb potential as follows:

(11)

and we can solve this Sternheimer equation with the known term on the right-hand side kept fixed. By constructing the non self-consistent induced charge density as in Eq. (3) we then obtain the dielectric matrix :

(12)

It is straightforward to check that this procedure correctly leads to the RPA dielectric matrix.hl86-prb The difference between this approach and the self-consistent calculation described in Sec. II.1 is that here we also need to invert the dielectric matrix obtained through Eq. (12) in order to calculate the screened Coulomb interaction.

This non self-consistent procedure was first proposed in Ref. reining-sternheimer, . One additional step that we make in the present work is to notice that Eq. (11) constitutes a shifted linear system, i.e. a system where the linear operator differs from the “seed” operator only by a constant shift ( being the identity operator). In this case we can take advantage of the multishift linear system solver of Ref. frommer, to determine for every frequency at the computational cost of one single calculation for the seed system . This procedure is extremely advantageous as it makes it possible to calculate the entire frequency-dependence by performing one single iterative minimization. The technical implementation of this procedure is described in Appendix D.

ii.2 Green’s function

The calculation of the Green’s function can efficiently be performed by adopting a strategy similar to the Sternheimer approach described in Sec. II.1. We introduce the noninteracting Green’s function following Ref. hl, :

(13)

where the sum extends over occupied as well as unoccupied electronic states. The real infinitesimal is positive () for occupied states and negative () for unoccupied states.hl ; hl86 ; note.spinfactor We now split the sum in Eq. (13) into occupied () and unoccupied () states:

(14)

and we add and subtract to obtain:

(15)

with

(16)
(17)

In the above derivation we assumed again time-reversal symmetry, we used the Lorentzian representation of the Dirac’s delta function for small [], and we defined . The component of the Green’s function is obviously analytic in the upper half of the complex energy plane as its poles lie below the real axis. The non-analytic component vanishes whenever the frequency is above the chemical potential. For frequencies below the chemical potential, the non-analytic component introduces the poles associated with the occupied electronic states. The partitioning of Eqs. (16), (17) closely reflects the analytic structure of the non-interacting Green’s function. A detailed discussion of this aspect can be found in Ref. hl, . The two components of the Green’s function in Eq. (16) are associated with the Coulomb hole (COH) and the screened exchange (SEX) terms of the self-energy, and , respectively.hl86

The computation of the non-analytic component of the Green’s function in Eq. (17) is straightforward once the occupied electronic eigenstates have been determined. In order to calculate the analytic component it is convenient to proceed as in the case of the screened Coulomb interaction, by regarding as a parametric function of the the first space variable and of the frequency: . If we apply the operator to both sides of Eq. (16), with , and we use the completeness relation , then we find immediately:

(18)

As expected, we can determine the analytic part of the Green’s function by directly solving a linear system. As Eq. (18) does not explicitly require unoccupied electronic states, this procedure mimics the Sternheimer approach for the screened Coulomb interaction outlined in Sec. II.1, albeit without the self-consistency requirement.

The procedure described here is especially advantageous because Eq. (18) constitutes a shifted linear system, in the same way as Eq. (11). Also in this case we exploit the multishift method of Ref. frommer, to determine for every frequency at the computational cost of one single calculation for the seed system (cf. Appendix D).

The presence of the infinitesimal in guarantees that the linear operator in Eq. (41) is never singular. This operator can nonetheless become ill-conditioned, hence the use of appropriate preconditioners may become necessary. We discuss this aspect in Appendix B.

ii.3 Self-energy

The electron self-energy in the approximation is:hl86

(19)

where is a positive infinitesimal. At large frequencies the Green’s function decays as and the screened Coulomb interaction tends to the frequency-independent bare Coulomb interaction . As a consequence, the integrand in Eq. (19) decays as and the integration requires some care.

It is convenient to split the self-energy into an exchange contribution and a Coulomb term .blochl It is easy to check that the integrand in the Coulmb term decays as at large frequencies, therefore the integral is well behaved and the integration can be performed by using a numerical cutoff in Eq. (19). A detailed analysis of the analytic properties of the Coulomb term shows that it must decay as at large frequencis, and that the use of the cutoff in the integration introduces an error of the order of , where denotes the characteristic plasmon frequency of the system.

In order to integrate the exchange term we observe that and that the poles of lie in the lower half of the complex plane, hence the integration of the term yields a vanishing contribution. On the other hand, the integration of yields a constant (frequency-independent) term.note.integral In summary, we perform the frequency integration in Eq. (19) by evaluating numerically the Coulomb term using an energy cutoff, and by integrating analitycally the exchange term:

(20)

with

(21)

and

(22)

Iii Implementation in a basis of planewaves

We here describe our planewaves implementation of the scheme developed in Sec. II. The choice of a planewaves representation was motivated by the need for making contact with existing literature on dielectric matrices,cpm ; hl86-prb ; balde_tosa ; baroni-resta and by the availability of DFPT software for lattice-dynamical calculationsespresso which was used as a reference for our implementation.

We adopt the following conventions for the transformation from real to reciprocal space. The wavefunctions transform as usual according to:

(23)

with the volume of the unit cell and the Bloch wavevector. The bare Coulomb interaction transforms according to

(24)

where is also a Bloch wavevector and is the number of such wavevectors in our discretized Brillouin zone. In Eq. (24) we have . The latter expression for is arrived at by replacing the integration over the crystal volume by an integration over all space. This choice corresponds to assuming that we can rely on a very fine sampling of the Brillouin zone. Had we performed the integration on a sphere with a radius defined by the crystal volume (), then we would have obtained

(25)

which corresponds to the truncated Coulomb potential introduced in Ref. alavi, . We will come back to this aspect in Sec. III.1. The screened Coulomb interaction and the non interacting Green’s function transform according to

(26)

and

(27)

with similar expressions for , . We note that the sign convention adopted here in the Fourier transforms [e.g.  in the rhs of Eq. (26)] is necessary to obtain the compact expression Eq. (35) below for the induced charge, and is opposite to the convention adopted in Ref. hl86, . Before proceeding it is also convenient to introduce the “right-sided” inverse dielectric matrix through

(28)

By adopting the same convention for the inverse dielectric matrix as for the screened Coulomb interaction the above equation can be rewritten as:

(29)

We note that our Eq. (29) is slightly different from the standard expression [e.g. Eq. (22) of Ref. hl86, ], due to our choice of using the right-sided inverse dielectric matrix.

iii.1 Screened Coulomb interaction

In order to rewrite Eqs. (3)-(6) in the Bloch representation and in reciprocal space we proceed as follows. We first write the linear systems Eq. (4) by relabeling the wavefunctions as Bloch states :

(30)

The linear variations of the wavefunctions can be expanded in terms of Bloch waves as follows:

(31)

where is cell-periodic in . From the linear variations of the wavefunctions we construct the induced charge density using Eq. (3):

(32)

Here the factor takes into account the normalization of the Bloch states in the unit cell [the wavefunctions in Eq. (4) are normalized in the whole crystal]. Next we expand the screened Coulomb interaction in terms of Bloch waves:

(33)

where is cell-periodic in . If we now place Eqs. (33) and (31) into Eq. (30) we discover that the component of the perturbing potential corresponding to the Bloch wave couples only to the variations of the wavefunctions corresponding to the Bloch wave . As a result the linear system Eq. (4) becomes:

(34)

where and . The induced charge density associated with the Bloch wave now reads

(35)

This result is very similar to the case of standard DFPT.baroni.rmp The main difference is that in the present case the translational invariance of the screened Coulomb interaction induces a coupling between the perturbation with Bloch wave in the variable and its induced response with Bloch wave in the variable . To conclude our derivation, we rewrite the screened Coulomb interaction Eq. (6) after expanding the cell-periodic function in planewaves:

(36)

In practical calculations we proceed as follows: we first initialize the perturbation in the linear systems using . The solution of the linear systems yields the change in the wavefunctions, which are then used to construct the induced charge, the induced Hartree potential, and the updated screened Coulomb interaction. We repeat this procedure by starting with the updated screened Coulomb interaction until convergence is achieved. At convergence the self-consistent perturbing potential yields the screened Coulomb interaction . The calculation must be repeated for every perturbation, i.e. for each set of parameters . At the end of this procedure it is straightforward to obtain the inverse dielectric matrix using Eqs. (36) and (29). Alternatively, it is also possible to scale the initial perturbation and use to obtain the inverse dielectric matrix at the end of the self-consistent procedure [indeed Eq. (34) is a linear system].

The scheme developed here allows us to calculate one row (in ) of the inverse dielectric matrix by determining the linear response to the perturbation . This idea has been discussed already in Ref. kunc-tosatti, in the framework of nonperturbative methods based on supercell calculations.

iii.1.1 Singularities in the inverse dielectric matrix and the screened Coulomb interaction

In order to avoid the singular behavior of the wings of the inverse dielectric matrix in the long wavelength limit () it is convenient to work with the symmetrized inverse dielectric matrix defined as follows:balde_tosa

(37)

Unlike its unsymmetrized counterpart , the wings of have finite limits at long wavelengths. The screened Coulomb interaction of Eq. (29) is now rewritten in symmetrized form as

(38)

While the symmetrized inverse dielectric matrix has finite limits at long wavelengths, the screened Coulomb interaction still presents a divergence corresponding to the long-range tail of the Coulomb potential in real space. This divergence requires special handling when performing the Brillouin zone integration to calculate the self-energy.hl86 We here overcome this difficulty following the prescription of Ref. alavi, . For this purpose we replace the bare Coulomb potential by the truncated potential , being the Heaviside step function. The truncation radius is defined as in Sec. III.1. Using this truncated Coulomb potential, the final expression for the screened Coulomb interaction in reciprocal space becomes:

(39)

In the long-wavelength limit the head of the truncated screened Coulomb interaction () tends to the finite limit and the singular behavior is removed. Optimized truncation strategies have been developed for non-isotropic materials and systems with reduced dimensionality.sohrab

iii.2 Green’s function

We now specialize Eqs. (13)-(18) to the case of a planewaves basis and the Bloch representation. We start by rewriting Eq. (13) after relabeling the electronic states as Bloch states and taking into account the normalization, as already done in Sec. III.1. Next we expand the Green’s function in terms of the Bloch waves and :

(40)

with cell-periodic in . An analogous expansion holds for the non-analytic component . Equations (16),(17) are now rewritten as:

(41)
(42)

In deriving Eqs. (41),(42) we made use once again of time-reversal symmetry, yielding . Similarly to the case of the screened Coulomb interaction, by solving the linear system in Eq. (41) for a set of parameters we obtain an entire row of the analytic component of the Green’s function .

iii.3 Self-energy

The self-energy in Eq. (19) is calculated in real space after performing the Fourier transforms of and . The result is then transformed back in reciprocal space to obtain . The evaluation of the matrix elements of the self-energy in the basis of Kohn-Sham eigenstates is performed in reciprocal space. Since the planewaves cutoff required to describe the inverse dielectric matrix and the self-energy is typically much smaller than the cutoff used in density-functional calculations,hl86 the procedure described here does not require an excessive computational effort and accounts for only a fraction of the total computation time.

Iv Scaling properties

In this section we analyze the computational complexity of the algorithms proposed in Sec. III, by focusing on our planewaves implementation. Without loss of generality we consider a point sampling of the Brillouin zone and we leave aside the frequency-dependence. We assume that the Kohn-Sham electronic wavefunctions are expanded in a basis of planewaves with a kinetic energy cutoff , corresponding to plane waves. In the simplest case of norm-conserving pseudopotential approaches the electronic charge density is described using a basis set with a cutoff , and the corresponding numbers of basis functions and real-space grid points are and , respectively. The screened Coulomb interaction and the Green’s function are described by a smaller cutoff and planewaves. The self-energy is expanded in a planewaves basis with cuttoff , and we denote by the number of real-space grid points associated with this basis.

iv.0.1 Screened Coulomb interaction

Equation (34) needs to be solved for each one of the planewave perturbations and the occupied electronic states. For the solution of Eq. (34) we adopt the complex bi-orthogonal conjugate gradient method of Ref. jacobs, (cBiCG), as described in Appendix A. Each solution of Eq. (34) requires two cBiCG minimizations (for ), and each cBiCG minimization consists of two conjugate gradients (CG) sequences. The most time-consuming operation in each CG step is the application of the Hamiltonian to the previous search direction, and in particular the Fourier transform of the wavefunctions to real-space and back for evaluating the product with the local potential. Fast-Fourier-transform (FFT) algorithms allow us to perform these calculation in floating point operations.frigo If in average the CG minimization requires steps and the self-consistency loop requires iterations, then the total cost of the entire calculation corresponds to a number of floating point operations

(43)

where S stands for “Sternheimer GW”. As , , and scale linearly with the size of the system as measured by the number of atoms , the overall scaling of this procedure is .

For comparison it is useful to consider the scaling of standard calculations based on the expansion over unoccupied states (hereafter referred to as the “HL” method).hl86 The calculation of the irreducible RPA polarization requires the evaluation of the optical matrix elements between each of the occupied states and each of the unoccupied states. These matrix elements are typically computed by using Fourier transforms of , therefore this procedure requires essentially Fourier transforms from real- to reciprocal-space. Each Fourier transform is performed on the real-space grid for the density with grid points, therefore the total cost of the standard method corresponds to a number of floating point operations

(44)

Even in this case therefore the overall scaling is .

Since the method of Ref. hl86, calculates the dielectric matrix and then performs a matrix inversion, in order to compare the prefactors in Eqs. (43) and (44) we consider the non self-consistent calculation of the dielectric matrix as described in Sec. II.1.2 [ in Eq. (43)], and we restrict ourselves to the calculation of the static dielectric matrix. In this case only one calculation of Eq. (4) is required instead of two for , and the two CG sequences of the cBiCG algorithm do coincide. As a result, a factor 4 drops out of the prefactor in Eq. (43). If we assume for definiteness a perfectly well-conditioned linear system (condition number ), and express the number of CG iterations required to achieve convergence through Eq. (65), then the ratio between the complexity of the S approach in a planewaves implementation and the standard approach becomes

(45)

where is the relative accuracy of the results. As an example, for a relative accuracy of we find this ratio to be . In the case of silicon, using a typical cutoff Ry we obtain , therefore the S approach becomes convenient when more than 1650 unoccupied states are used in the standard approach. This is rarely the case as most calculations reported to date use only a few hundreds of unoccupied electronic states. Of course the accuracy of the standard sum-over-states expression is difficult to quantify, and probably a convergence on 5 significant digits is not warranted by a few hundreds of electronic states.

Our estimate suggests that the planewaves implementation of our method can be as expensive as the standard approach. It should be noted, however, that our scheme has the advantage of providing the whole self-energy , while the standard approach typically provides the matrix elements of the self-energy on a small subset of states of the order of . Therefore if we were to perform a comparison based on the same amount of output information, we should use in Eq. (44) instead of . In this case Eq. (45) would change into

(46)

and for we would have . This clearly shows that, if the entire self-energy was needed (as opposed to a few matrix elements), then our proposed S approach would be more convenient that the standard sum-over-states approach.

The above analysis shows that the main bottlenecks of our method are (i) the Fourier transform for the application of the Hamiltonian and (ii) the large basis sets adopted. In order to make the approach proposed here more efficient we could either move to real-space methods where the application of the Hamiltonian scales linearly with system size,chelikowsky or reduce the size of the basis set by using local orbitals.siesta Fast evaluations of the operation in order- operations should make it feasible calculations with scaling and with a very favorable prefactor. We will come back to this aspect in Sec. VI.

iv.0.2 Green’s function

The complexity of the procedure for calculating the Green’s function proposed in Sec. II.2 can be analyzed along the same lines of Sec. IV.0.1. The main differences in this case are that (i) the linear system Eq. (18) does not depend on the occupied states, (ii) the calculation is non self-consistent, and (iii) the calculation is performed for one single frequency , while the entire frequency-dependence is generated through the multishift method. As a result, a factor drops out of Eq. (43) and the computational cost of the Green’s function calculation reads:

(47)

The complexity of this calculation is significantly smaller than the complexity of the algorithm for the screened Coulomb interaction. In particular, the calculation of the Green’s function scales as . This procedure for calculating the Green’s function is advantageous with respect to an expansion over empty states, as the orthogonalization of the unoccupied states would require a number of floating point operations scaling as .

iv.0.3 Scaling of the self-energy calculation

The self-energy is computed in real-space after obtaining and from and , respectively, and then is tranformed back into reciprocal space. The 6-dimensional FFT transforms require operations for each frequency of the screened Coulomb interaction, having defined . The computational cost of this procedure scales as , and is small with respect to the cost of calculating the screened Coulomb interaction.

V Results

In order to demonstrate the approach proposed in Secs. II,III we have realized a prototype implementation within the empirical pseudopotential method (EPM) of Ref. cohen_berg, , and we have validated our implementation for the test case of silicon.

v.1 Dielectric matrix

Table 1 contains some of the components of the inverse dielectric matrix calculated using the self-consistent Sternheimer method described in Sec. III. In all our calculations we used inverse dielectric matrices of size 5959, corresponding to a kinetic energy cutoff of 5 Ry for the screened Coulomb interaction. For the purpose of comparison with Ref. balde_tosa, we calculated the static and long wavelength limit (, ) of the inverse dielectric matrix for the first few reciprocal lattice vectors. The authors of Ref. balde_tosa, adopted the standard approach based on the expansion over the unoccupied electronic states of the dielectric matrix, and obtained the inverse dielectric matrix by performing matrix inversions. In our calculations we used the self-consistent method of Sec. III and no matrix inversion was necessary. The excellent agreement which can be seen in Table 1 between our calculations and Ref. balde_tosa, supports the validity of our approach.

ciao Ref. balde_tosa, Present work
(0,0,0) (0,0,0) -0.083 -0.0866
(1,1,1) (1,1,1) -0.605 -0.6055
(,1,1) (1,1,1) -0.008 -0.0076
(1,,1) (,1,1) -0.010 -0.0102
(1,,) (,1,1) -0.045 -0.0463
(2,0,0) (1,1,1) -0.038 -0.0382
(2,0,0) (,1,1) -0.005 -0.0049
(2,0,0) (2,0,0) -0.667 -0.6671
(,0,0) (2,0,0) -0.006 -0.0063
(0,2,0) (2,0,0) -0.016 -0.0166
Table 1: Long-wavelength limit of the static symmetrized inverse dielectric matrix of silicon [ and ]. We compare our calculations performed within the self-consistent Sternheimer approach with the results obtained in Ref. balde_tosa, using the expansion over empty states and the inversion of the dielectric matrix. For the calculations we sampled the Brillouin zone with 29 irreducible -points, corresponding to a grid,balde_tosa ; baroni-resta and a plane wave cutoff of 5 Ry.balde_tosa Following Ref. balde_tosa, we employed the empirical pseudopotential parameters from Ref. cohen_berg, . The reciprocal lattice vectors are in units of , being the lattice parameter.

Next we consider the wavevector dependence of the head of the dielectric matrix . We performed the calculation by using the non self-consistent method described in Sec. II.1.2 in order to compare our results with Ref. cohen.diel1, . Figure 1 shows that our calculations are in very good agreement with the results of Ref. cohen.diel1, . The slight differences between our results and those of Ref. cohen.diel1, at large wavevectors can likely be ascribed to the use of a limited number of empty states in the perturbative expansion over unoccupied states in the latter work.

Figure 2 compares our results for the frequency dependence of the dielectric matrix at long wavelength with the results reported in Ref. hl86, . We focused in particular on the cases illustrated in Fig. 3 of Ref. hl86, . Apart from some small differences possibly arising from the use of the expansion over unoccupied states in Ref. hl86, , even in this case the agreement between our calculations and those of Ref. hl86, is very good throughout the entire frequency range. The agreement is consistently good for the head of the dielectric matrix and for diagonal and off-diagonal matrix elements.

[t]

Figure 1: (Color online) Dielectric function of silicon calculated using the empirical pseudopotential method and the non self-consistent Sternheimer method of Sec. II.1.2: calculated head of the static dielectric function as a function of wavevector (blue solid line), and results from Ref. cohen.diel1, (red dashed line). We used a planewave kinetic energy cutoff of 5 Ry and sampled the Brillouin zone through a uniform grid. The wavevectors are in units of , being the lattice parameter.

[b]

Figure 2: (Color online) Frequency-dependent dielectric matrix of silicon at long wavelength []. The calculations were performed using the empirical pseudopotential method of Ref. cohen_berg, and the non self-consistent Sternheimer method of Sec. II.1.2. We used a planewave kinetic energy cutoff of 5 Ry and sampled the Brillouin zone thorough a uniform grid. Such a dense Brillouin zone sampling was necessary to correctly describe the absorption onset. In order to avoid null eigenvalues in the linear system Eq. (11) we performed the calculations by including a small imaginary component of 0.1 eV in the frequency . The panels (a)-(c) correspond to the cases illustrated in Fig. 3 of Ref. hl86, and show for (a) , (b) , and (c) , . The blue solid lines are our calculations, the red dashed lines are from Ref. hl86, . The real and imaginary parts of the dielectric matrix are indicated in each panel. We note that the scales on the vertical axes correspond to three different orders of magnitude.

v.2 Padé approximants and convergence

[t]

Figure 3: (Color online) Real part of the long-wavelength dielectric matrix of silicon as a function of frequency. Panels (a) and (b) of this figure correspond to panels (a) and (b) of Fig. 2, respectively. The technical details of the calculations are the same as those described in the caption of Fig. 2. Solid blue lines: dielectric matrices calculated directly along the real frequency axis, from Fig. 2. Dotted, dashed, and dash-dotted red lines: dielectric matrices obtained from the analytic continuation on the real axis using Padé approximants of order 6, 11, and 26, respectively. The Padé approximants were generated using dielectric matrices calculated along the imaginary frequency axis on uniform frequency grids in the range 0-50 eV. For instance, the 6-points approximant corresponds to calculations at the imaginary frequencies of eV.

Figure 3 shows the quality of the analytic continuation from the imaginary to the real frequency axis using Padé approximants (cf. Appendix C).pade1 ; pade2 We found that this procedure based on Padé approximants is generally very stable and requires minimal manual intervention. Approximants of order 5 and higher are able to reproduce the location, the strength, and the width of the main plasmon-pole structure of the dielectric matrix. Head, wings, and body of the dielectric matrix are all described consistently (cf. Fig. 3). Although the singularity corresponding to the absorption onset in Fig. 3(a) is smoothed out by Padé approximants of low order, this effect is washed out when calculating the frequency convolution of the Green’s function with the screened Coulomb interaction for the self-energy. The advantages of performing calculations along the imaginary axis are that (i) the linear system in Eq. (4) becomes increasingly more well-conditioned when approaching large imaginary frequencies, and (ii) a moderate Brillouin-zone sampling is required to perform calculations along the imaginary axis unlike the case of real-axis calculations. As a result, the worst case scenario for the solution of the linear system Eq. (4) corresponds to the static case . These technical aspects are described in detail in Appendix B.

The typical number of non self-consistent iterations required to solve Eq. (4) for a fixed with a relative accuracy of using the cBiCG algorithm described in Appendix A is (using the preconditioner of Ref. tpa, ). This estimate has been obtained by averaging over all the reciprocal lattice vectors, -vectors, and imaginary frequencies. The typical number of self-consistent cycles required to obtain the screened Coulomb interaction through Eq. (4) with a relative accuracy of is . Charge-sloshing effects are attenuated by using the potential mixing method proposed in Ref. johnson, , appropriately modified to deal with complex potentials.

For completeness we report here the corresponding figures for the calculation of the Green’s function using Eq. (18). The average number of non self-consistent iterations required to obtain the analytic part of the Green’s function is when preconditioning is adopted (for this purpose we used a straightforward adaptation of the method of Ref. tpa, ). However, multishift minimizations as described in Appendix D do not allow for the use of preconditioning, and in the latter case the number of iterations required to achieve convergence (within a relative accuracy ) can be as high as .

v.3 Self-energy

Figures 4 and 5 show the real part and the imaginary part of the self-energy calculated for the first few silicon eigenstates at using our S method within the EPM scheme. Our results are compared with the calculations of of Ref. blochl, performed within DFT/LDA and the projector-augmented wave method (PAW). We calculated the screened Coulomb interaction by using a uniform grid to sample the Brillouin zone, and Padé approximants of order 7 along the imaginary frequency axis. The frequency integration of the Coulomb term of the self-energy in Eq. (21) was performed by using a Coulomb cutoff eV and a grid spacing of 0.5 eV. The Green’s function was calculated using an imaginary component eV in Eq. (18). Apart from some differences in the damping of the plasmaron peaks, the agreement between our calculated self-energy and the results of Ref. blochl, is rather good throughout the entire frequency range 100 eV. This finding is quite surprising since we are comparing our empirical pseudopotential calculations with low kinetic energy cutoff (5 Ry) with the ab-initio LDA calculations including PAW core-reconstruction of Ref. blochl, . Such agreement probably reflects the ability of the EPM method to provide not only a good description of the band structure of silicon, but also a reasonable description of the electronic wavefunctions.

[t]

Figure 4: (Color online) Real part of the expectation value of the self-energy for the first 8 silicon eigenstates at . The solid blue lines are our S results using the empirical pseudopotential method. The dashed red lines are from the first-principles calculation of Ref. blochl, using the LDA and the PAW method. Panels (a)-(d) correspond to the states (Band 1), (Bands 2-4), (Bands 5-7), and (Band 8), respectively. The energy axis is aligned with the top of the valence band. The horizontal dotted lines indicate the calculated bare exchange contribution to the self-energy .

v.4 Quasi-particle excitations and spectral function

[t]

Figure 5: (Color online) Imaginary part of the expectation value of the self-energy for the first 8 silicon eigenstates at . The solid blue lines are our S results using the empirical pseudopotential method. The dashed red lines are from the first-principles calculation of Ref. blochl, . Panels (a)-(d) correspond to the states , , , and , respectively. The energy axis is aligned with the top of the valence band. For comparison, the vertical grey lines indicate the locations of the logarithmic singularities at ( being the plasmon energy) that would arise using a model plasmon-pole dielectric function.hl
Figure 6: (Color online) Expectation values of the quasiparticle spectral function for the first 8 silicon eigenstates at . The solid blue lines are our S results using the EPM, the dashed red lines are from the first-principles calculation of Ref. blochl, using the LDA and the PAW method. The expectation values have been calculated within the diagonal approximation of Eq. (48). In each panel the sharp peak near the band extrema corresponds to a well-defined quasi-particle, while the two broad peaks corresponds to plasmarons, i.e. electrons or holes coupled to a cloud of real plasmons.hl The suppression of one of the plasmaron peaks reflects the large imaginary parts of the self-energy in the corresponding panels of Fig. 5. We point out the different vertical scale in panel (d).
EPM/LDA eigenvalues Quasiparticle energies
Present Ref. blochl,          Present     Ref. hl86,     Ref. blochl,           Expt.
3. 89 3. 23           4. 21     4. 08     4. 05          4.23,4.1