Optical properties of periodic systems within the current-current response framework: pitfalls and remedies

Optical properties of periodic systems within the current-current response framework: pitfalls and remedies

Abstract

We compare the optical absorption of extended systems using the density-density and current-current linear response functions calculated within many-body perturbation theory. The two approaches are formally equivalent for a finite momentum of the external perturbation. At , however, the equivalence is maintained only if a small expansion of the density-density response function is used. Moreover, in practical calculations this equivalence can be lost if one naively extends the strategies usually employed in the density-based approach to the current-based approach. Specifically we discuss the use of a smearing parameter or of the quasiparticle lifetimes to describe the finite width of the spectral peaks and the inclusion of electron-hole interaction. In those instances we show that the incorrect definition of the velocity operator and the violation of the conductivity sum rule introduce unphysical features in the optical absorption spectra of three paradigmatic systems: silicon (semiconductor), copper (metal) and lithium fluoride (insulator). We then demonstrate how to correctly introduce lifetime effects and electron-hole interactions within the current-based approach.

pacs:

I Introduction

The electromagnetic linear response of solids can be measured experimentally by applying a small external perturbation-or probe-which induces a small change in the sample material. This change is the response of the material to the perturbing field and it can have both longitudinal and transverse components, depending on the experimental setup and the inhomogeneity of the system. Well-known examples of such experiments are the measurements of absorption, electron energy loss, Kerr and Faraday effects, and dichroism.

From the theoretical point of view the linear response of a system to longitudinal fields can be obtained from both the density-density response function and the current-current response function , which is a tensor Strinati (1988). Instead, the response to transverse fields can only be described using . This is due to the fact that the density determines the longitudinal current—through the continuity equation— but not the transverse current. An important instance where transverse electric fields come into play is the recent experimental progress on topological insulators Checkelsky et al. (2012); Chang et al. (2013). Therefore approaches based on are more general than those based on . On the other hand the current–density based approach is more susceptible to numerical issues and instabilities. For example the paramagnetic and diamagnetic contributions to must be treated on equal footing Kootstra et al. (2000); Raimbault et al. (2015) otherwise divergencies can arise.

The differences between the two response functions are intrinsically related to the different gauge used to define the coupling of the external field with the electrons. While transverse fields can be described only in terms of the transverse component of the vector potential, for longitudinal fields there are two options. One can use a scalar potential, which couples to the electron density of the system, or the longitudinal component of the vector potential, which couples to the current density. In the optical limit, i.e., in the limit where the momentum carried by the external perturbation is negligible, these two gauges are called length and velocity gauge, respectively. In this limit the distinction between longitudinal, or parallel to , and transverse, or perpendicular to , fields vanishes. However the optical limit is also the case where many subtle differences between the two approaches arise, fundamentally because the density-based approach is ill-defined at .

Various works in the literature have discussed and compared the two approaches either within the independent-particle (IP) picture Sipe and Ghahramani (1993); Lamb et al. (1987); Sangalli et al. (2012), in which optical properties can be described in terms of a simple sum-over-states expression, or within a generalized Kohn-Sham approach. Springborg and Kirtman (2008) These studies focus mostly on the limitations of the density-based approaches and the avoidance of divergencies in the current-based approaches. An important point of debate is whether the two approaches give the same longitudinal macroscopic dielectric function. At the linear-response level the equivalence has been shown, but only in some specific cases, i.e., for semiconductors and insulators, for Hamiltonians with only local operators, and for absorption at resonance Schafer and Wegener (2002). The comparison for metals, for Hamiltonians with nonlocal operators, and for absorption out of resonance due to smearing,  1 has received less or no attention. More importantly, no systematic comparison of the density-based and the current-based approaches has been carried out so far at the many-body level.

The objectives of this work are (i) to better elucidate the differences between the density and the current-based approaches at the IP level; (ii) to extend the discussion to the many-body framework. In particular, at the IP level, we compare the two approaches when lifetime effects are introduced (either from first-principles or by introducing a smearing parameter). At the many-body level we consider the Bethe-Salpeter equation (BSE). Based on Green functions theory, the BSE is the fundamental equation of Many-Body Perturbation Theory for the description of linear response properties Fetter and Walecka (2003). In particular we address some important and rather subtle aspects related to the definition of the velocity operator in the current-based BSE approach Rohlfing and Louie (2000) which did not receive proper attention so far. Strinati (1988); Rohlfing and Louie (2000); Del Sole and Fiorino (1984)

We note that alternative approaches exist in which the explicit calculation of response functions is avoided, for example by calculating the induced current-density in the frequency domain Kootstra et al. (2000); Romaniello and de Boeij (2005); Berger et al. (2005, 2006) or by calculating the time-dependent density or current-density using real-time propagation Attaccalite et al. (2011); Castro and Appel (2006); Bertsch et al. (2000). Many of the issues discussed in this work are also pertinent to these methods.

The manuscript is organized as follows. In Sec. II we compare the density and the current-based approaches at a formal level, within both the independent-particle approximation and the BSE. We compare the two approaches in the optical limit and elucidate the origin of their fundamental difference at . In Sec III we show how an incorrect implementation of smearing and an erroneous definition of the velocity operator can lead to different spectra in the two approaches. We then show how these problems can be solved. For sake of completeness, in Sec. IV we briefly discuss alternative approaches to calculate optical spectra. Finally, in Sec. V we draw our conclusions.

Ii Formal equivalence of density and current-based approaches

In this section we review the basic equations which describe optical properties within both a density-based and a current-based approach and we compare these approaches in the optical limit . We first consider the case of noninteracting electron-hole pairs referred to in the following as IP level, 2 and then the case in which the electron-hole interaction is treated within the BSE of Many-Body Perturbation Theory. In both cases we demonstrate that the two approaches should lead to the same results for the longitudinal macroscopic dielectric function (see App. B for its definition).

ii.1 Noninteracting electron-hole pairs

Let us consider a collection of electrons moving in a periodic potential , with and inside the unit cell volume , whose position is identified by the Bravais lattice vector . We are interested in the response of the system to a macroscopic time-dependent electromagnetic field characterized by a set of scalar and vector potentials, . A macroscopic quantity is defined as an average taken over the unit cell whose location is given by  Wiser (1963) (see App. A).

The motion of the system is then governed by the following one-particle Hamiltonian,

(1)

where , and

(2a)
(2b)

for which the macroscopic external perturbations and vanish identically for . The term instead describes a periodic potential (i.e. it does not depend on ) and can be seen as the mean field felt by the electrons. Here and throughout the article we use atomic units with the gaussian convention for electromagnetism. It is important to notice that the perturbing potentials and are external potentials and they do not take into account the contributions due to the response of the system.

The density and the current-density of the system are defined in terms of the Bloch wave functions , which are eigenstates of the one-particle hamiltonian , as

(3a)
(3b)

Here is a generalized index that comprises the band index and the wave vector , and are the occupations factors. In the collinear case can be further specified as the band with the corresponding spin index , while in the spinorial case it is the spinorial band index. The density operator is the identity in real space . The current-density is given in terms of the velocity operator Landau and Lifshits (1965).

(4)

where , is the commutator, while is the anticommutator. With the Hermitian Hamiltonian defined in Eq. (1) . The momentum operator gives the paramagnetic current density, i.e. , and the potential-dependent term gives the diamagnetic current density. We note that the velocity operator, and therefore the current operator, depends on the Hamiltonian. We will discuss this important point later in the paper. Density and current-density variations are induced as a response to the perturbing potentials in Eqs. (2a) and (2b).

We restrict ourselves to macroscopic longitudinal perturbations with a small transferred momentum . In the following, therefore, we consider the longitudinali, i.e., parallel to , component only of the vector potential and the current (see App. B). Longitudinal perturbations can be described by a scalar potential , for example in the Coulomb gauge , or by a longitudinal vector potential , for example in the (incomplete) Weyl gauge . In the optical limit the two gauges generate the so-called length and velocity gauges (see App. C).

The macroscopic density-density,and the longitudinal current-current response functions, and , respectively, can be written in reciprocal space as (see App. D)

(5a)
(5b)

When the induced density and current density are calculated from the Hamiltonian in (1), then the response functions in Eqs. (5) are the independent particle ones, and . We can now define the longitudinal component of the macroscopic dielectric tensor , which conveniently describes the optical properties of semiconductors in the long wavelength limit. It can be obtained, for , in terms of the longitudinal current-current response function as Strinati (1988)

(6)

For , the dielectric function can also be expressed in terms of the density-density response function as

(7)

with . Equation (7), in the limit , is the expression commonly used for the calculations of optical properties in solids. For both expressions (6) and (7) the non-analytic point can be described only via a limiting procedure and the final result depends on the direction of the limit.

Both and should lead to the same result. Indeed the two response functions are related by the expression Giuliani and Vignale (2005); Schafer and Wegener (2002)

(8)

which follows from the continuity equation , which guarantees local charge conservation. Thus at and relation (8) ensures . We note that although we have established formal equivalence this does not guarantee numerical equivalence, as we show in Section III.

Choice of the reference one-particle hamiltonian

Within the IP picture defined by the Hamiltonian in Eq. (1) the response functions can be written as

(9)

with

(10)
(11)
(12)
(13)

(the spin index in and is identical in the collinear case) and

(14)

where it is understood that the limit is taken at the end of the calculation. Here is either the one-particle density operator or the paramagnetic current-density operator , and are Bloch states, which are eigenstates of the equilibrium one-particle Hamiltonian ), with corresponding energies and occupation numbers .3 It is important to notice that the transitions with give no contribution since at equilibrium . Thus all the summations in the present manuscript are intended without the zero energy poles. The full current-current response function is obtained via

(15)

We shall now choose the stationary part of the one-particle Hamiltonian (1). A common choice is the Kohn-Sham Hamiltonian from density functional theory (DFT) where

(16)

is the equilibrium Kohn-Sham (KS) potential. It is the sum of the potential generated by the nuclei, , the Hartree potential, , and the exchange-correlation potential, . We refer to the IP response function derived from the KS hamiltonian as . However, usually DFT-KS band structures are not a good starting point for response calculations, since, for example, the fundamental band gap is systematically underestimated in semiconductors and insulators. To improve over DFT one can introduce a more general nonlocal and frequency dependent quasiparticle (QP) potential

(17)

where is the exchange-correlation many-body self-energy, defined within the formalism of Many-Body Perturbation Theory (MBPT), evaluated at equilibrium. In particular we consider the (first-order) QP hamiltonian defined as the first-order correction in the perturbation of the KS hamiltonian:

(18)

We have here emphasized the dependence of , and dropped the space and spin dependence of both and . The QP eigenvalues are thus corrected KS energies

(19)

while we keep the same wave functions . From the quasiparticle energies , solution of Eq. (19), we can define the QP response function, , which differs from the KS response function by the replacement .

Expansion of the current-based approach at finite momentum

We now focus on the relation between the density-based and current-based formalism by extending the approach of Ref. Sipe and Ghahramani, 1993 to . To simplify the notation we rewrite the response function (9) as

(20)

where . By using the exact relation 4

(21)

in Eq.(20) and inserting the result into Eq.(6), we can decompose as

(22)

with

(23a)
(23b)
(23c)

where in Eq.(23a) we used Eqs.(15) and (20). From the conductivity sum rule (CSR) for , given by

(24)

we see that . Moreover, if time-reversal symmetry holds, also . Therefore, in case of time-reversal symmetry, only the term survives and we have, at finite momentum, the general result

(25)

The last equality 5 holds thanks to Eq. (8). In the next section we will use (23a)-(23c) to discuss the optical limit.

The optical limit

We are interested in computing optical properties, for which , with in the order of a few eV. Therefore we consider compared to the size of the Brillouin zone. However, , given in Eq. (7), is not defined for because of the term. Therefore, to obtain an explicit expression for for small , we Taylor expand , defined in Eq. (13), around ,

(26)

where 6 with the direction of . Substitution into Eq. (20) leads to with

(27)

the longitudinal dipole-dipole response function. Thus, for small , we can rewrite Eq. (7) as

(28)

which is well defined at .7 It is Eq. (28) that is usually implemented to compute absorption spectra of cold semiconductors and insulators in the density-based approach.

However, Eq. (28) is formally exact only in the limit , since at contributions from intraband transitions, i.e., transitions within a single band, are excluded 8. Instead, no such problem exists for the current-based approach. We can summarize this difference between the two approaches as

(29a)
(29b)

We now discuss the cases and in more detail by separating , and into inter- and intraband contributions. Moreover, also contains the constant diamagnetic term .

The case: interband transitions.
At only interband transitions, i.e., transitions between two different bands, contribute to the summations in Eqs. (23a)-(23c), since for intraband transitions . Because of the missing intraband contributions in the CSR in Eq. (24) in general does not hold at . Let us define

(30a)
(30b)
(30c)

One can verify 9 that by exchanging and in Eq. (23b).

Let us first consider systems with a gap. Then since the dielectric function must go to a constant Sipe and Ghahramani (1993) as . We can thus focus on . Using Eq. (12) for the paramagnetic current-density, we can write

(31)

where we used and given by Eq. (12) with . From this relation we deduce that

(32)

Substitution of this identity into Eq. (28) shows that at both and can be expressed in terms of for systems with a gap:

(33)

Equation (33) proves the equivalence between the density-based and the current-based approaches for cold semiconductors and insulators.

For metals, however, we also need to describe the divergent Drude-like term. Since the current-based approach is exact also at , this term must be described by , i.e. the Drude tail originates from the breaking of the CSR. Thus we have

(34)

Instead, the density-based approach does not contain any extra term beyond and thus cannot describe metals at . The Drude-like tails in the density-based approach can be obtained only explicitly dealing with the limit as explained in the next subsection.

The limit: intraband transitions.
In the limit also intraband transitions contribute to the summations in Eqs. (23a)-(23c). One can thus define

(35a)
(35b)
(35c)

where we used Eq. (30b) and the fact that owing to the CSR in Eq. (24). If time-reversal symmetry holds and hence . We note that here and in the rest of the paper the limit is taken at finite . Using the results of Eqs. (35a)-(35c) in Eq. (22) we obtain

(36)

where we used Eq. (30c). Owing to Eq. (29a) we can compare this result to Eq. (34) and conclude that

(37)

This means that, within the current-based approach, the Drude-like tail, which at is described by , in the limit is described via . Indeed, the exact expression for at finite in Eq. (25) only depends on . Thus must describe all intraband transitions in metals when .Marini (2001).

To have an explicit expression of such intraband contribution in the density-based approach, one needs to Taylor expand the energy and occupation number differences as Romaniello and de Boeij (2005)

(38)
(39)

where . At the IP level there is no Drude tail in the absorption (see App.  E for details). Only when introducing a smearing a Drude-like peak in the absorption appears Romaniello and de Boeij (2005); Berger et al. (2005). Experimentally such smearing also exist, because of the interaction between electrons and thus in practice a peak is always measured.

In conclusion, the current-based approach is exact both in the limit and at . In particular, if metals are considered, the Drude-like tail is present in both cases, but it is described by different terms, namely by in the limit and by at . Also the density-based approach is exact in the limit . This can be summarized by the following set of relations

ii.2 Interacting electron-hole pairs

So far we have only considered independent particles. We now also take into account the electron–hole interaction by considering the variations induced in the Hartree and the exchange–correlation self–energy by the external potential. Within MBPT such variations can be conveniently described using the BSEStrinati (1988) for the (time-ordered) two-particle propagator , which reads

(40)

where . In Eq. (40) is given in terms of QP Green’s functions as and the four-point kernel is given by

(41)

where is the Hartree potential without the long-range component of the Coulomb potential, . We are now looking at the self-consistent response of the system to a macroscopic field composed of both the external and the induced macroscopic field. Therefore is effectively removed Bussi (2004). We use the self-energy with an instantaneous screened Coulomb potential . In this case the propagator depends only on the time difference 10. A Fourier transformation to frequency space leads to

(42)

where, for notational convenience, we dropped the spin and space arguments 11.

The BSE can be mapped onto an effective two-particle equation, written in a basis of electron-hole transitions , according to  Onida et al. (2002); Gatti and Sottile (2013)

(43)

where is the excitonic Hamiltonian with eigenvectors and eigenvalues . The (retarded)12 particle-hole propagator can be obtained via

where and the overlap matrix are defined by

(44)
(45)

Upon solving Eq. (43), one can obtain the density-density and the current-current excitonic response functions as

(46)

where and are defined analogously to and in Eqs. (12) and (13) but the expectation value is with respect to the excitonic wave function. As for the IP case, the zero-energy transitions give no contribution to the summation 13 since and thus . Note that neglecting the kernel in the BSE (42), Eq. (46) reduces to the IP response function given in Eq. (9).

Substitution of and in Eqs. (6) and (7), respectively, yields an expression for the macroscopic dielectric tensor in the current- and density-based approach. As in the IP case the two expressions are identical for and thanks to Eq. (8).

The optical limit

The analysis for the optical limit in Sec. II.1.2-Sec. II.1.3 also applies to the excitonic case. In particular, by rewriting the expression for the excitonic response function as

(47)

where one obtains

(48a)
(48b)
(48c)

The interband dielectric function in terms of the excitonic dipole-dipole response function reads

(49)

where now contains the longitudinal excitonic dipoles .

Finally, the analogous of the relation (32) exists also for the exitonic case:

(50)

from which expressions analogous to Eqs. (34) and (35) can be written for .

Iii Issues in the practical implementation of current-based approaches

In the previous section the equivalence between the current-based and density-based approaches was established for the longitudinal macroscopic dielectric function. In the present section we compute optical absorption in a semiconductor (bulk silicon), an insulator (LiF), and a metal (copper) comparing the two approaches numerically, i.e., using both Eq. (6) for the current-based approach and Eq. (7) for the density-based approach. In doing so we will show potential pitfalls in the implementation of current-based approaches due to: (i) the use of a smearing parameter (Sec. III.2); (ii) the violation of the conductivity sum rule (Sec. III.3); (iii) the inclusion of lifetimes (Sec. III.4); (iv) the inclusion of QP energy corrections and excitonic effects with an incorrect velocity operator (Sec. III.5). Moreover, we demonstrate how those pitfalls can be avoided.

iii.1 Computational details

The response functions entering Eqs. (6) and (7) are constructed starting from ground state DFT calculations performed with either the Quantum-Espresso codeGiannozzi et al. (2009) (Si and Cu) or the Abinit codeGonze et al. (2002) (LiF) using norm conserving pseudopotentials. The ground-state of all three materials have been calculated by using the local density approximation (LDA) for the exchange-correlation functional. For Si we used 4 valence electrons ( configuration), a face-centered cubic (FCC) cell with a two atoms (diamond structure) and the experimental lattice parameter of Bohr. For calculating the ground state density we use an energy cutoff of 10 Ha and a sampling of the Brillouin zone (BZ). For the calculation of the response function we used instead a sampling of the BZ, resulting in 145 k-points in the irreducible BZ (IBZ) and 4096 in the BZ. For LiF we used 8 valence electrons ( configuration for Li and for F)Goedecker et al. (1996), a FCC cell with two atoms (sodium-chloride structure) and the experimental lattice parameter of Bohr. For calculating the ground state density and the response function we use an energy cutoff of 40 Ha with a sampling of the BZ, resulting in 29 (512) k-points in the IBZ (BZ). For Cu we used with 11 valence electrons ( configuration), a FCC cell with a single atom and the experimental lattice parameter of Bohr. For calculating the ground state density and the response function we use 32.5 Ha and a sampling of the BZ, resulting in 145 (4096) k-points in the IBZ (BZ).

We computed the dielectric function starting from the DFT-KS wave functions and energies using the Yambo code Marini et al. (2009) where we implemented the equations for the current-based approach.14. For Si and LiF, we considered electron-hole pairs built from the top 3 valence and the bottom 3 conduction bands. For Cu we included 30 bands in the band summation. A scissor operator is used to mimic the effect of the quasiparticle corrections in Si ( eV) and LiF ( eV) consistently to what already reported in the literature. For the BSE calculations the static screening in the random-phase approximation is computed using 50 bands in the band summation and 2.3 Ha energy cutoff in the summation over reciprocal lattice vectors for Si, and 30 bands and a 3.6 Ha energy cutoff for LiF.

We also consider QP lifetimes by introducing an imaginary part in the QP energies. To mimic the effect of the electron-phonon Fan self-energy Giustino (2016); Bernardi et al. (2014), we use a term proportional to the density of states. To mimic the effect of the self-energy Aryasetiawan and Gunnarsson (1998) we use a term which grows quadratically in energy, from the Fermi level in Cu and from the conduction band maximum (valence band minimum) plus (minus) the band gap in Si.

iii.2 Smearing parameter and conductivity sum rule

The macroscopic density-density and paramagnetic current-current response functions in Eq. (9) contain the infinitesimal which ensures causality and avoids having poles on the real axis. Numerically can be used as a smearing parameter to simulate finite lifetime effects of the excitations. As a result each peak in the absorption spectrum acquires a finite width. This is done by replacing

(51)

in Eq. (9) for both the density-based and the current-based approach, where is now a finite number. In the current-based approach however the frequency enters also in the definition of the dielectric function, Eq. (6), as a factor . Moreover it is common to numerically impose Berger et al. (2006); Raimbault et al. (2015) the CSR (Eq. 24) replacing the diamagnetic term, , with minus the paramagnetic term, , evaluated at . It is thus natural to wonder whether we should replace also by and if we should use evaluated at or while imposing the CSR.

Figure 1: (color online) Optical absorption in bulk LiF (panel a) and Si (panel b) at the IP level. The spectra obtained by replacing by in Eq. (9), for the density-based approach, Eq. (7), (grey shadow) and the current-based approach, Eq. (6) with either (red dashed line) or (blue continuous line), are compared. Results with the numerical recipe are also shown. Here eV for Si and eV for LiF. The conductivity sum rule is always enforced as explained in the main text.

In Fig. 1 we plot the optical spectra () obtained for LiF and Si inserting in Eq. (7) and in Eq. (6). For the current-based approach we consider both the factors and in the definition of the dielectric function (Eq. (6)). In both cases the diamagnetic term is replaced by , i.e. the CSR is imposed with . The optical spectra obtained within the current-based approach are different from those obtained within the density-based approach. In particular they present clearly unphysical features: the case with the factor in Eq. (6) shows a divergent low energy contribution which resembles the Drude like behaviour of metals; the case with the factor shows a negative peak at . Both unphysical features are related to the existence of a finite smearing in the low frequency region of the spectrum.

One possible solution is to adopt the recipe proposed by Cazzaniga et al. Cazzaniga et al. (2010) in another context, i.e. to choose , instead of . The latter choice implies no smearing at and a CSR uniquely defined by . For the case of Si we show this recipe cures both unphysical features although some residual numerical noise remains.

A more rigorous solution requires to consider again the expansion defined by Eq.(21). Having a smearing parameter we can now expand either around or around (i.e. ). The expansion around is more suited to analyze the case with the factor in Eq. (6) and yields

(52)

The paramagnetic term entering is correctly balanced by replacing in the diamagnetic term. However is not zero anymore (it is zero only for ) and thus leads to a divergence in the spectrum at .

The expansion around is more suited to analyze the case with the factor in Eq. (6) and yields

(53)

In this case is numerically zero as expected theoretically, however the the paramagnetic term entering is not correctly balanced (since we are using ) and the CSR is broken. is here multiplied by leading to a negative energy peak around . To summarize, in order to avoid unphysical divergencies and negative peaks there are two options:

  • the factor can be used in Eq. (6) together with the CSR imposed by