Optical properties of periodic systems within the currentcurrent response framework: pitfalls and remedies
Abstract
We compare the optical absorption of extended systems using the densitydensity and currentcurrent linear response functions calculated within manybody perturbation theory. The two approaches are formally equivalent for a finite momentum of the external perturbation. At , however, the equivalence is maintained only if a small expansion of the densitydensity response function is used. Moreover, in practical calculations this equivalence can be lost if one naively extends the strategies usually employed in the densitybased approach to the currentbased approach. Specifically we discuss the use of a smearing parameter or of the quasiparticle lifetimes to describe the finite width of the spectral peaks and the inclusion of electronhole interaction. In those instances we show that the incorrect definition of the velocity operator and the violation of the conductivity sum rule introduce unphysical features in the optical absorption spectra of three paradigmatic systems: silicon (semiconductor), copper (metal) and lithium fluoride (insulator). We then demonstrate how to correctly introduce lifetime effects and electronhole interactions within the currentbased approach.
pacs:
I Introduction
The electromagnetic linear response of solids can be measured experimentally by applying a small external perturbationor probewhich induces a small change in the sample material. This change is the response of the material to the perturbing field and it can have both longitudinal and transverse components, depending on the experimental setup and the inhomogeneity of the system. Wellknown examples of such experiments are the measurements of absorption, electron energy loss, Kerr and Faraday effects, and dichroism.
From the theoretical point of view the linear response of a system to longitudinal fields can be obtained from both the densitydensity response function and the currentcurrent response function , which is a tensor Strinati (1988). Instead, the response to transverse fields can only be described using . This is due to the fact that the density determines the longitudinal current—through the continuity equation— but not the transverse current. An important instance where transverse electric fields come into play is the recent experimental progress on topological insulators Checkelsky et al. (2012); Chang et al. (2013). Therefore approaches based on are more general than those based on . On the other hand the current–density based approach is more susceptible to numerical issues and instabilities. For example the paramagnetic and diamagnetic contributions to must be treated on equal footing Kootstra et al. (2000); Raimbault et al. (2015) otherwise divergencies can arise.
The differences between the two response functions are intrinsically related to the different gauge used to define the coupling of the external field with the electrons. While transverse fields can be described only in terms of the transverse component of the vector potential, for longitudinal fields there are two options. One can use a scalar potential, which couples to the electron density of the system, or the longitudinal component of the vector potential, which couples to the current density. In the optical limit, i.e., in the limit where the momentum carried by the external perturbation is negligible, these two gauges are called length and velocity gauge, respectively. In this limit the distinction between longitudinal, or parallel to , and transverse, or perpendicular to , fields vanishes. However the optical limit is also the case where many subtle differences between the two approaches arise, fundamentally because the densitybased approach is illdefined at .
Various works in the literature have discussed and compared the two approaches either
within the independentparticle (IP) picture Sipe and Ghahramani (1993); Lamb et al. (1987); Sangalli et al. (2012),
in which optical properties can be described in terms of a simple sumoverstates expression,
or within a generalized KohnSham approach. Springborg and Kirtman (2008)
These studies focus mostly on the limitations of the densitybased approaches and the avoidance of divergencies in the
currentbased approaches.
An important point of debate is whether the two approaches give the same longitudinal macroscopic dielectric function. At the
linearresponse level the equivalence has been shown, but only in some specific cases, i.e., for semiconductors and insulators,
for Hamiltonians with only local operators, and for absorption at resonance Schafer and Wegener (2002).
The comparison for metals, for Hamiltonians with nonlocal operators, and for absorption out of resonance due to smearing,
The objectives of this work are (i) to better elucidate the differences between the density and the currentbased approaches at the IP level; (ii) to extend the discussion to the manybody framework. In particular, at the IP level, we compare the two approaches when lifetime effects are introduced (either from firstprinciples or by introducing a smearing parameter). At the manybody level we consider the BetheSalpeter equation (BSE). Based on Green functions theory, the BSE is the fundamental equation of ManyBody Perturbation Theory for the description of linear response properties Fetter and Walecka (2003). In particular we address some important and rather subtle aspects related to the definition of the velocity operator in the currentbased BSE approach Rohlfing and Louie (2000) which did not receive proper attention so far. Strinati (1988); Rohlfing and Louie (2000); Del Sole and Fiorino (1984)
We note that alternative approaches exist in which the explicit calculation of response functions is avoided, for example by calculating the induced currentdensity in the frequency domain Kootstra et al. (2000); Romaniello and de Boeij (2005); Berger et al. (2005, 2006) or by calculating the timedependent density or currentdensity using realtime propagation Attaccalite et al. (2011); Castro and Appel (2006); Bertsch et al. (2000). Many of the issues discussed in this work are also pertinent to these methods.
The manuscript is organized as follows. In Sec. II we compare the density and the currentbased approaches at a formal level, within both the independentparticle approximation and the BSE. We compare the two approaches in the optical limit and elucidate the origin of their fundamental difference at . In Sec III we show how an incorrect implementation of smearing and an erroneous definition of the velocity operator can lead to different spectra in the two approaches. We then show how these problems can be solved. For sake of completeness, in Sec. IV we briefly discuss alternative approaches to calculate optical spectra. Finally, in Sec. V we draw our conclusions.
Ii Formal equivalence of density and currentbased approaches
In this section we review the basic equations which describe optical properties within both a densitybased and a currentbased approach
and we compare these approaches in the optical limit .
We first consider the case of noninteracting electronhole pairs
referred to in the following as IP level,
ii.1 Noninteracting electronhole pairs
Let us consider a collection of electrons moving in a periodic potential , with and inside the unit cell volume , whose position is identified by the Bravais lattice vector . We are interested in the response of the system to a macroscopic timedependent electromagnetic field characterized by a set of scalar and vector potentials, . A macroscopic quantity is defined as an average taken over the unit cell whose location is given by Wiser (1963) (see App. A).
The motion of the system is then governed by the following oneparticle Hamiltonian,
(1) 
where , and
(2a)  
(2b) 
for which the macroscopic external perturbations and vanish identically for . The term instead describes a periodic potential (i.e. it does not depend on ) and can be seen as the mean field felt by the electrons. Here and throughout the article we use atomic units with the gaussian convention for electromagnetism. It is important to notice that the perturbing potentials and are external potentials and they do not take into account the contributions due to the response of the system.
The density and the currentdensity of the system are defined in terms of the Bloch wave functions , which are eigenstates of the oneparticle hamiltonian , as
(3a)  
(3b) 
Here is a generalized index that comprises the band index and the wave vector , and are the occupations factors. In the collinear case can be further specified as the band with the corresponding spin index , while in the spinorial case it is the spinorial band index. The density operator is the identity in real space . The currentdensity is given in terms of the velocity operator Landau and Lifshits (1965).
(4) 
where , is the commutator, while is the anticommutator. With the Hermitian Hamiltonian defined in Eq. (1) . The momentum operator gives the paramagnetic current density, i.e. , and the potentialdependent term gives the diamagnetic current density. We note that the velocity operator, and therefore the current operator, depends on the Hamiltonian. We will discuss this important point later in the paper. Density and currentdensity variations are induced as a response to the perturbing potentials in Eqs. (2a) and (2b).
We restrict ourselves to macroscopic longitudinal perturbations with a small transferred momentum . In the following, therefore, we consider the longitudinali, i.e., parallel to , component only of the vector potential and the current (see App. B). Longitudinal perturbations can be described by a scalar potential , for example in the Coulomb gauge , or by a longitudinal vector potential , for example in the (incomplete) Weyl gauge . In the optical limit the two gauges generate the socalled length and velocity gauges (see App. C).
The macroscopic densitydensity,and the longitudinal currentcurrent response functions, and , respectively, can be written in reciprocal space as (see App. D)
(5a)  
(5b) 
When the induced density and current density are calculated from the Hamiltonian in (1), then the response functions in Eqs. (5) are the independent particle ones, and . We can now define the longitudinal component of the macroscopic dielectric tensor , which conveniently describes the optical properties of semiconductors in the long wavelength limit. It can be obtained, for , in terms of the longitudinal currentcurrent response function as Strinati (1988)
(6) 
For , the dielectric function can also be expressed in terms of the densitydensity response function as
(7) 
with . Equation (7), in the limit , is the expression commonly used for the calculations of optical properties in solids. For both expressions (6) and (7) the nonanalytic point can be described only via a limiting procedure and the final result depends on the direction of the limit.
Both and should lead to the same result. Indeed the two response functions are related by the expression Giuliani and Vignale (2005); Schafer and Wegener (2002)
(8) 
which follows from the continuity equation , which guarantees local charge conservation. Thus at and relation (8) ensures . We note that although we have established formal equivalence this does not guarantee numerical equivalence, as we show in Section III.
Choice of the reference oneparticle hamiltonian
Within the IP picture defined by the Hamiltonian in Eq. (1) the response functions can be written as
(9) 
with
(10)  
(11)  
(12)  
(13) 
(the spin index in and is identical in the collinear case) and
(14) 
where it is understood that the limit is taken at the end of the calculation.
Here is either the oneparticle density operator or
the paramagnetic currentdensity operator , and are Bloch states, which are eigenstates of the equilibrium oneparticle Hamiltonian ),
with corresponding energies and occupation numbers .
(15) 
We shall now choose the stationary part of the oneparticle Hamiltonian (1). A common choice is the KohnSham Hamiltonian from density functional theory (DFT) where
(16) 
is the equilibrium KohnSham (KS) potential. It is the sum of the potential generated by the nuclei, , the Hartree potential, , and the exchangecorrelation potential, . We refer to the IP response function derived from the KS hamiltonian as . However, usually DFTKS band structures are not a good starting point for response calculations, since, for example, the fundamental band gap is systematically underestimated in semiconductors and insulators. To improve over DFT one can introduce a more general nonlocal and frequency dependent quasiparticle (QP) potential
(17) 
where is the exchangecorrelation manybody selfenergy, defined within the formalism of ManyBody Perturbation Theory (MBPT), evaluated at equilibrium. In particular we consider the (firstorder) QP hamiltonian defined as the firstorder correction in the perturbation of the KS hamiltonian:
(18) 
We have here emphasized the dependence of , and dropped the space and spin dependence of both and . The QP eigenvalues are thus corrected KS energies
(19) 
while we keep the same wave functions . From the quasiparticle energies , solution of Eq. (19), we can define the QP response function, , which differs from the KS response function by the replacement .
Expansion of the currentbased approach at finite momentum
We now focus on the relation between the densitybased and currentbased formalism by extending the approach of Ref. Sipe and Ghahramani, 1993 to . To simplify the notation we rewrite the response function (9) as
(20) 
where .
By using the exact relation
(21) 
in Eq.(20) and inserting the result into Eq.(6), we can decompose as
(22) 
with
(23a)  
(23b)  
(23c) 
where in Eq.(23a) we used Eqs.(15) and (20). From the conductivity sum rule (CSR) for , given by
(24) 
we see that . Moreover, if timereversal symmetry holds, also . Therefore, in case of timereversal symmetry, only the term survives and we have, at finite momentum, the general result
(25) 
The last equality
The optical limit
We are interested in computing optical properties, for which , with in the order of a few eV. Therefore we consider compared to the size of the Brillouin zone. However, , given in Eq. (7), is not defined for because of the term. Therefore, to obtain an explicit expression for for small , we Taylor expand , defined in Eq. (13), around ,
(26) 
where
(27) 
the longitudinal dipoledipole response function. Thus, for small , we can rewrite Eq. (7) as
(28) 
which is well defined at .
However, Eq. (28) is formally exact only in the limit ,
since at contributions from intraband transitions, i.e., transitions within a single band, are excluded
(29a)  
(29b) 
We now discuss the cases and in more detail by separating , and into inter and intraband contributions. Moreover, also contains the constant diamagnetic term .
The case: interband transitions.
At only interband transitions,
i.e., transitions between two different bands, contribute to the summations
in Eqs. (23a)(23c), since for intraband transitions .
Because of the missing intraband contributions in the CSR in Eq. (24) in general does not hold at .
Let us define
(30a)  
(30b)  
(30c) 
One can verify
Let us first consider systems with a gap. Then since the dielectric function must go to a constant Sipe and Ghahramani (1993) as . We can thus focus on . Using Eq. (12) for the paramagnetic currentdensity, we can write
(31)  
where we used and given by Eq. (12) with . From this relation we deduce that
(32) 
Substitution of this identity into Eq. (28) shows that at both and can be expressed in terms of for systems with a gap:
(33) 
Equation (33) proves the equivalence between the densitybased and the currentbased approaches for cold semiconductors and insulators.
For metals, however, we also need to describe the divergent Drudelike term. Since the currentbased approach is exact also at , this term must be described by , i.e. the Drude tail originates from the breaking of the CSR. Thus we have
(34) 
Instead, the densitybased approach does not contain any extra term beyond and thus cannot describe metals at . The Drudelike tails in the densitybased approach can be obtained only explicitly dealing with the limit as explained in the next subsection.
The limit: intraband transitions.
In the limit also intraband transitions contribute to the summations
in Eqs. (23a)(23c).
One can thus define
(35a)  
(35b)  
(35c) 
where we used Eq. (30b) and the fact that owing to the CSR in Eq. (24). If timereversal symmetry holds and hence . We note that here and in the rest of the paper the limit is taken at finite . Using the results of Eqs. (35a)(35c) in Eq. (22) we obtain
(36) 
where we used Eq. (30c). Owing to Eq. (29a) we can compare this result to Eq. (34) and conclude that
(37) 
This means that, within the currentbased approach, the Drudelike tail, which at is described by , in the limit is described via . Indeed, the exact expression for at finite in Eq. (25) only depends on . Thus must describe all intraband transitions in metals when .Marini (2001).
To have an explicit expression of such intraband contribution in the densitybased approach, one needs to Taylor expand the energy and occupation number differences as Romaniello and de Boeij (2005)
(38)  
(39) 
where . At the IP level there is no Drude tail in the absorption (see App. E for details). Only when introducing a smearing a Drudelike peak in the absorption appears Romaniello and de Boeij (2005); Berger et al. (2005). Experimentally such smearing also exist, because of the interaction between electrons and thus in practice a peak is always measured.
In conclusion, the currentbased approach is exact both in the limit and at . In particular, if metals are considered, the Drudelike tail is present in both cases, but it is described by different terms, namely by in the limit and by at . Also the densitybased approach is exact in the limit . This can be summarized by the following set of relations
ii.2 Interacting electronhole pairs
So far we have only considered independent particles. We now also take into account the electron–hole interaction by considering the variations induced in the Hartree and the exchange–correlation self–energy by the external potential. Within MBPT such variations can be conveniently described using the BSEStrinati (1988) for the (timeordered) twoparticle propagator , which reads
(40) 
where . In Eq. (40) is given in terms of QP Green’s functions as and the fourpoint kernel is given by
(41) 
where is the Hartree potential without the longrange component of the Coulomb potential, .
We are now looking at the selfconsistent response of the system to a macroscopic field composed of both the external and the induced macroscopic field.
Therefore
is effectively removed Bussi (2004).
We use the selfenergy with an instantaneous screened Coulomb potential .
In this case the propagator depends only on the time difference
(42) 
where, for notational convenience, we dropped the spin and space
arguments
The BSE can be mapped onto an effective twoparticle equation, written in a basis of electronhole transitions , according to Onida et al. (2002); Gatti and Sottile (2013)
(43) 
where is the excitonic Hamiltonian with eigenvectors and eigenvalues .
The (retarded)
where and the overlap matrix are defined by
(44)  
(45) 
Upon solving Eq. (43), one can obtain the densitydensity and the currentcurrent excitonic response functions as
(46) 
where and are defined analogously to and in
Eqs. (12) and (13) but the expectation value is with respect to the excitonic wave function.
As for the IP case, the zeroenergy transitions give no contribution to the summation
Substitution of and in Eqs. (6) and (7), respectively, yields an expression for the macroscopic dielectric tensor in the current and densitybased approach. As in the IP case the two expressions are identical for and thanks to Eq. (8).
The optical limit
The analysis for the optical limit in Sec. II.1.2Sec. II.1.3 also applies to the excitonic case. In particular, by rewriting the expression for the excitonic response function as
(47) 
where one obtains
(48a)  
(48b)  
(48c) 
The interband dielectric function in terms of the excitonic dipoledipole response function reads
(49) 
where now contains the longitudinal excitonic dipoles .
Finally, the analogous of the relation (32) exists also for the exitonic case:
(50) 
from which expressions analogous to Eqs. (34) and (35) can be written for .
Iii Issues in the practical implementation of currentbased approaches
In the previous section the equivalence between the currentbased and densitybased approaches was established for the longitudinal macroscopic dielectric function. In the present section we compute optical absorption in a semiconductor (bulk silicon), an insulator (LiF), and a metal (copper) comparing the two approaches numerically, i.e., using both Eq. (6) for the currentbased approach and Eq. (7) for the densitybased approach. In doing so we will show potential pitfalls in the implementation of currentbased approaches due to: (i) the use of a smearing parameter (Sec. III.2); (ii) the violation of the conductivity sum rule (Sec. III.3); (iii) the inclusion of lifetimes (Sec. III.4); (iv) the inclusion of QP energy corrections and excitonic effects with an incorrect velocity operator (Sec. III.5). Moreover, we demonstrate how those pitfalls can be avoided.
iii.1 Computational details
The response functions entering Eqs. (6) and (7) are constructed starting from ground state DFT calculations performed with either the QuantumEspresso codeGiannozzi et al. (2009) (Si and Cu) or the Abinit codeGonze et al. (2002) (LiF) using norm conserving pseudopotentials. The groundstate of all three materials have been calculated by using the local density approximation (LDA) for the exchangecorrelation functional. For Si we used 4 valence electrons ( configuration), a facecentered cubic (FCC) cell with a two atoms (diamond structure) and the experimental lattice parameter of Bohr. For calculating the ground state density we use an energy cutoff of 10 Ha and a sampling of the Brillouin zone (BZ). For the calculation of the response function we used instead a sampling of the BZ, resulting in 145 kpoints in the irreducible BZ (IBZ) and 4096 in the BZ. For LiF we used 8 valence electrons ( configuration for Li and for F)Goedecker et al. (1996), a FCC cell with two atoms (sodiumchloride structure) and the experimental lattice parameter of Bohr. For calculating the ground state density and the response function we use an energy cutoff of 40 Ha with a sampling of the BZ, resulting in 29 (512) kpoints in the IBZ (BZ). For Cu we used with 11 valence electrons ( configuration), a FCC cell with a single atom and the experimental lattice parameter of Bohr. For calculating the ground state density and the response function we use 32.5 Ha and a sampling of the BZ, resulting in 145 (4096) kpoints in the IBZ (BZ).
We computed the dielectric function starting from the DFTKS wave functions and energies
using the Yambo code Marini et al. (2009) where we implemented the equations for the currentbased approach.
We also consider QP lifetimes by introducing an imaginary part in the QP energies. To mimic the effect of the electronphonon Fan selfenergy Giustino (2016); Bernardi et al. (2014), we use a term proportional to the density of states. To mimic the effect of the selfenergy Aryasetiawan and Gunnarsson (1998) we use a term which grows quadratically in energy, from the Fermi level in Cu and from the conduction band maximum (valence band minimum) plus (minus) the band gap in Si.
iii.2 Smearing parameter and conductivity sum rule
The macroscopic densitydensity and paramagnetic currentcurrent response functions in Eq. (9) contain the infinitesimal which ensures causality and avoids having poles on the real axis. Numerically can be used as a smearing parameter to simulate finite lifetime effects of the excitations. As a result each peak in the absorption spectrum acquires a finite width. This is done by replacing
(51) 
in Eq. (9) for both the densitybased and the currentbased approach, where is now a finite number. In the currentbased approach however the frequency enters also in the definition of the dielectric function, Eq. (6), as a factor . Moreover it is common to numerically impose Berger et al. (2006); Raimbault et al. (2015) the CSR (Eq. 24) replacing the diamagnetic term, , with minus the paramagnetic term, , evaluated at . It is thus natural to wonder whether we should replace also by and if we should use evaluated at or while imposing the CSR.
In Fig. 1 we plot the optical spectra () obtained for LiF and Si inserting in Eq. (7) and in Eq. (6). For the currentbased approach we consider both the factors and in the definition of the dielectric function (Eq. (6)). In both cases the diamagnetic term is replaced by , i.e. the CSR is imposed with . The optical spectra obtained within the currentbased approach are different from those obtained within the densitybased approach. In particular they present clearly unphysical features: the case with the factor in Eq. (6) shows a divergent low energy contribution which resembles the Drude like behaviour of metals; the case with the factor shows a negative peak at . Both unphysical features are related to the existence of a finite smearing in the low frequency region of the spectrum.
One possible solution is to adopt the recipe proposed by Cazzaniga et al. Cazzaniga et al. (2010) in another context, i.e. to choose , instead of . The latter choice implies no smearing at and a CSR uniquely defined by . For the case of Si we show this recipe cures both unphysical features although some residual numerical noise remains.
A more rigorous solution requires to consider again the expansion defined by Eq.(21). Having a smearing parameter we can now expand either around or around (i.e. ). The expansion around is more suited to analyze the case with the factor in Eq. (6) and yields
(52) 
The paramagnetic term entering is correctly balanced by replacing in the diamagnetic term. However is not zero anymore (it is zero only for ) and thus leads to a divergence in the spectrum at .
The expansion around is more suited to analyze the case with the factor in Eq. (6) and yields
(53) 
In this case is numerically zero as expected theoretically, however the the paramagnetic term entering is not correctly balanced (since we are using ) and the CSR is broken. is here multiplied by leading to a negative energy peak around . To summarize, in order to avoid unphysical divergencies and negative peaks there are two options:

the factor can be used in Eq. (6) together with the CSR imposed by