Static and dynamic variational principles for strongly correlated electron systems
The equilibrium state of a system consisting of a large number of strongly interacting electrons can be characterized by its density operator. This gives direct access to the ground-state energy or, at finite temperatures, to the free energy of the system as well as to other static physical quantities. Elementary excitations of the system, on the other hand, are described within the language of Green’s functions, i.e. time- or frequency-dependent dynamic quantities which give direct access to the linear response of the system subjected to a weak time-dependent external perturbation. A typical example is angle-resolved photoemission spectroscopy, which is linked to the single-electron Green’s function. Since neither the static nor the dynamic physical quantities can usually be obtained exactly for lattice fermion models like the Hubbard model, one has to resort to approximations. As opposed to more ad hoc treatments, variational principles promise to provide consistent and controlled approximations. Here, the Ritz principle and a generalized version of the Ritz principle at finite temperatures for the static case on the one hand and a dynamical variational principle for the single-electron Green’s function or the self-energy on the other hand are introduced, discussed in detail and compared to each other to reveal conceptual similarities and differences. In particular, the construction recipe for non-perturbative dynamic approximations is taken over from the construction of static mean-field theory based on the generalized Ritz principle. Within the two different frameworks, it is shown which types of approximations are accessible, and their respective weaknesses and strengths are worked out. Static Hartree-Fock theory as well as dynamical mean-field theory are found as the prototypical approximations.
PACS: 71.10.-w, 71.10.Fd, 71.27.+a, 71.30.+h, 79.60.-i
Address: I. Institut für Theoretische Physik, Universität Hamburg, Jungiusstr. 9, 20355 Hamburg, Germany
- 1 Motivation
- 2 Models and variational methods
- 3 Static variational principle
- 4 Using the Ritz principle to construct approximations
- 5 Dynamical quantities
- 6 Self-energy-functional theory
- 7 Consistency, symmetry and systematics
- 8 Bath sites and dynamical mean-field theory
- 9 Concluding discussion
1 Motivation
Understanding the physics of systems consisting of a large number of interacting fermions is one of the central problems in physics. In condensed-matter physics many material properties are governed, for example, by the interacting “gas” of valence electrons. From the theoretical perspective, the Coulomb interaction among the valence electrons must be considered as strong or at least of the same order of magnitude as their kinetic energy, for example in transition metals and their oxides. This implies that usual weak-coupling perturbation theory [Abrikosow et al.(1964), Fetter and Walecka(1971), Negele and Orland(1988)] does not apply. Density-functional theory (DFT) [Hohenberg and Kohn(1964), Kohn and Sham(1965), Almbladh and von Barth(1985), Jones and Gunnarsson(1989), Eschrig(1996)] can be regarded as a standard technique in the field of electronic-structure calculations for condensed-matter systems. It provides an in principle exact approach which yields the electron density and the energy of the ground state. In practice, however, it must be combined with approximations such as the famous local density approximation (LDA). While this DFT-LDA scheme has proven to be extremely successful in predicting ground-state properties of a large class of materials, there are also several well-known shortcomings for so-called strongly correlated systems. These include many 3d or 4f transition metals and their oxides, for example. Another defect of standard DFT is its inability to predict excited-state properties and the dynamic linear response. This is crucial, however, to make contact with experimental probes such as angle-resolved photoemission. Interpretations of photoemission spectra are often based on the DFT-LDA band structure. This lacks a fundamental justification and is essentially equivalent to a Hartree-Fock-like picture of independent electrons.
The Hartree-Fock theory can be derived from a “static” variational principle where the ground-state energy or, at finite temperatures, the grand potential is minimized when expressed in a proper way as a functional of the pure or mixed state of the system, respectively. This is the Ritz variational principle.
In contrast to the static variational principle, however, there is a well-known “dynamical” variational principle which directly focuses on the one-electron excitation spectrum [Luttinger and Ward(1960)]. Here the grand potential is expressed as a functional of the one-electron Green’s function or the self-energy and can be shown to be stationary at the respective physical quantity. Similar to the density functional and similar to the Ritz principle, the dynamical variational principle is formally exact but needs additional approximations for a practical evaluation. For a long time the approximations constructed in this way [Baym and Kadanoff(1961), Baym(1962)] have been perturbative, as they are defined via partial resummations of diagrams where contributions at some finite order are missing. Hence, they are valid in the weak-coupling regime only. The question arises whether it is possible to derive approximations from a dynamical variational principle which are non-perturbative and able to access the physics of strongly correlated electron systems, where several interesting phenomena emerge, such as spontaneous magnetic order [M. Donath and Nolting(1998), Baberschke et al.(2001)], correlation-driven metal-insulator transitions [Mott(1949), Mott(1990), Gebhard(1997)] or high-temperature superconductivity [Anderson(1987), Orenstein and Millis(2000)].
Rather than starting from the non-interacting Fermi gas as the reference point around which the perturbative expansion is developed, a local perspective appears to be more attractive for strongly correlated electron systems, in particular for prototypical lattice models with local interaction, such as the famous Hubbard model [Hubbard(1963), Gutzwiller(1963), Kanamori(1963)]. The idea is that the local physics of a solid-state ion with a strong and, due to screening effects, essentially local Coulomb interaction is the more appropriate starting point for a systematic theory, and that a self-consistent embedding of the ion in the lattice environment captures the main effects. Since the invention of dynamical mean-field theory (DMFT) [Metzner and Vollhardt(1989), Jarrell(1992), Georges and Krauth(1992), Georges et al.(1996), Kotliar and Vollhardt(2004)], a non-perturbative approximation with many attractive properties is available which relies on precisely this local perspective. The paradigmatic field of applications for the DMFT is the Mott-Hubbard metal-insulator transition [Mott(1961), Gebhard(1997)] which, at zero temperature, can be seen as a prototypical quantum phase transition that is driven by the electron-electron interaction and cannot be captured by perturbative methods. “Mottness”, i.e. physical phenomena originating from a close parametric distance to the Mott transition or the Mott insulator, is also believed to be a possible key feature for an understanding of the many unusual and highly interesting properties of cuprate-based high-temperature superconductors. This example shows that the DMFT, at least as a starting point for further methodological improvements, nowadays appears as an attractive approach to the electronic structure of unconventional materials.
In particular, there is the exciting perspective that, when combined with DFT-LDA, dynamical mean-field theory will ultimately be able to constitute a new standard for ab initio electronic-structure calculations with a high predictive power.
The DMFT can be derived in an elegant way from the dynamical variational principle. The purpose of these lecture notes is to demonstrate how this is achieved, to examine whether similar or new approximations can be derived in the same way, and to characterize the strengths and weak points of these “dynamical” approximations. The strategy pursued here is to first understand the formalism related to the static Ritz principle and then to highlight the differences from, but also the close analogies with, the dynamic approach.
The notes are organized as follows: The next section introduces the systems we are interested in and discusses on a general level the variational approach as such. Sec. 3 then develops the static variational principle as a generalization of the Ritz principle. This is used in Sec. 4 to construct static mean-field theory. To transfer the insight that has been gained from the static approach to the dynamic one, Sec. 5 introduces the concept of Green’s functions and diagrammatic perturbation theory. With this it becomes possible to define the central Luttinger-Ward functional and the self-energy functional which serve to set up the dynamical variational principle. These points are discussed in Sec. 6. With the variational cluster approximation we give a standard example for a non-perturbative approximation constructed from the dynamical variational principle. Consistency issues, symmetry breaking and the systematics of dynamical approximations are discussed in Sec. 7. Sec. 8 particularly focuses on approximations related to dynamical mean-field theory. A summary and the conclusions are given in Sec. 9.
Secs. 2 – 5 are written on a standard textbook level and can be understood with basic knowledge of many-body theory. The content of Secs. 6 – 8 is basically taken from Ref. [Potthoff(in press)] but includes some extensions and changes necessary for a self-contained presentation and to make the topic more accessible to the less experienced reader.
2 Models and variational methods
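We consider a system of electrons in thermodynamical equilibrium at temperature $T$ and chemical potential $\mu$. The Hamiltonian of the system, $H = H_{t,U} = H_0(t) + H_1(U)$, consists of a non-interacting part specified by one-particle parameters $t$ and an interaction part with interaction parameters $U$:
$$H_{t,U} = \sum_{\alpha\beta} t_{\alpha\beta}\, c^\dagger_\alpha c_\beta + \frac{1}{2} \sum_{\alpha\beta\gamma\delta} U_{\alpha\beta\delta\gamma}\, c^\dagger_\alpha c^\dagger_\beta c_\gamma c_\delta . \tag{1}$$
The index $\alpha$ refers to an arbitrary set of quantum numbers labeling an orthonormal basis of one-particle states $|\alpha\rangle$. As is apparent from the form of $H$, the total particle number $N = \sum_\alpha n_\alpha$ with $n_\alpha = c^\dagger_\alpha c_\alpha$ is conserved. $t$ and $U$ refer to the set of hopping matrix elements and interaction parameters and are formally given by:
$$t_{\alpha\beta} = \langle \alpha | \left( \frac{p^2}{2m} + V(r) \right) | \beta \rangle , \qquad U_{\alpha\beta\delta\gamma} = \langle \alpha^{(1)} \beta^{(2)} | \frac{e^2}{|r^{(1)} - r^{(2)}|} | \delta^{(1)} \gamma^{(2)} \rangle ,$$
where $p^2/2m$ is the electron’s kinetic energy, $V(r)$ the external potential and $e^2/|r^{(1)} - r^{(2)}|$ the electrostatic Coulomb interaction between two electrons “1” and “2”. One has to be aware, however, that in many contexts $t$ and $U$ are merely seen as model parameters or considered as effective parameters which in addition account for effects not included explicitly in the Hamiltonian, such as metallic screening, for example.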
The Hamiltonian $H_{t,U}$ describes the most general two-particle interaction. To give examples and to apply the techniques to be discussed below to a more concrete situation, it is sometimes helpful to focus on a less general model. The famous Hubbard model [Hubbard(1963), Gutzwiller(1963), Kanamori(1963)],
$$H = \sum_{ij\sigma} t_{ij}\, c^\dagger_{i\sigma} c_{j\sigma} + \frac{U}{2} \sum_{i\sigma} n_{i\sigma} n_{i-\sigma} ,$$
is a prototypical model for a system of strongly correlated electrons. Here, electrons are assumed to hop over the sites $i$ of an infinitely extended lattice with a single spin-degenerate atomic orbital per lattice site: $\alpha = (i, \sigma)$. The hopping integrals $t_{ij}$ are assumed to be diagonal with respect to the spin index and to be spin-independent. Furthermore, the interaction is assumed to be strongly screened and to act only locally, i.e. two electrons must occupy the same lattice site to interact via the Hubbard-$U$. Due to the Pauli principle, these electrons must then have opposite spin projections $\sigma = \uparrow, \downarrow$.
There are numerous and quite different many-body techniques for an approximate solution of the Hubbard model or of the more general model Eq. (1). Here, we will concentrate on ground-state properties or properties of the system in thermal equilibrium and focus on two classes of approaches, namely techniques based on a “static” variational principle as well as techniques based on a “dynamic” variational principle, which represent prototypical examples of different variants of variational principles. These two classes of principles are different, and there is actually no (known) mapping between them. On the other hand, there are a number of illuminating and apparent analogies which are worth discussing. Formally, the principles are exact. The static principle provides the exact state of the quantum system or, at finite temperature, the exact density matrix of the system in thermal equilibrium. The dynamical principle, on the other hand, yields the exact equilibrium self-energy or Green’s function of the system. For all practical purposes, it is clear, however, that approximations are necessary.
There are some obvious advantages of approximations constructed from a variational principle of the form $\delta \Omega[x] = 0$:
- The usual way to apply the variational principle is to propose some physically motivated form for the quantity of interest $x$ which may depend on a number of variational parameters $\lambda$. The optimal $x$ is then found by varying $\lambda$ to find a set of parameters $\lambda_0$ that satisfies $\partial \Omega[x(\lambda)] / \partial \lambda = 0$ at $\lambda = \lambda_0$. This yields the approximation $x(\lambda_0)$ to the exact $x_0$. As there is not necessarily a small parameter involved, this way of constructing approximations is essentially non-perturbative. This also means, however, that the ansatz has to be justified very carefully.
- The variational procedure not only yields an approximation for $x$ but also for the grand potential $\Omega$. As is obvious from Fig. 1, if the approximate $x$ is sufficiently close to the exact or physical value $x_0$, i.e. if $\delta x = x - x_0$ is sufficiently small, then the error in the grand potential is of second order only, $\delta \Omega \propto \delta x^2$.
- From the approximate grand potential one can derive, by differentiation with respect to parameters of the Hamiltonian, an in principle arbitrary set of physical quantities comprising thermal expectation values but also time-dependent correlation functions via higher-order derivatives. As a rule of thumb, the higher the derivative, the more accurate the approximate grand potential must be to get reliable estimates. The fact that an approximate but explicit form for a thermodynamical potential is available ensures that all quantities are derived consistently. For example, the thermodynamical Maxwell relations are fulfilled by construction.
- An approximation based on a variational principle can in most cases be generalized systematically. One simply has to allow for more variational parameters in the ansatz $x(\lambda)$. When increasing the parameter space, there is a clear tradeoff between the accuracy of the approximation on the one hand and the computational effort necessary to evaluate the resulting Euler equation on the other hand.
- If the grand potential is at a (global) minimum for the physical value $x_0$, then any approximation yields an upper bound to the physical $\Omega$. This is an extremely helpful property since it allows one to judge the relative quality of an approximation (i.e. as compared to another one). However, not all variational principles are minimum principles since usually $x$ is a multicomponent quantity. Then, $\delta \Omega[x] = 0$ merely means that the grand potential is stationary at $x_0$ but it is not necessarily at a minimum (or maximum). As will be seen below, the static principle is a minimum principle while the dynamical variational principle is not.
3 Static variational principle
To derive the static (generalized Ritz) variational principle, we will first compute the static response of an observable to a small static perturbation. This will be used to prove the concavity of the grand potential which is necessary to derive the desired minimum principle.
3.1 Static response
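The grand potential of the system with Hamiltonian $H_{t,U}$ at temperature $T$ and chemical potential $\mu$ is given by $\Omega_{t,U} = -T \ln Z_{t,U}$ where
$$Z_{t,U} = \mathrm{tr}\, \exp(-\beta (H_{t,U} - \mu N))$$
is the partition function and
$$\rho_{t,U} = \frac{\exp(-\beta (H_{t,U} - \mu N))}{Z_{t,U}}$$
the equilibrium density operator and $\beta = 1/T$. The dependence of the partition function (and of other quantities discussed below) on the parameters $t$ and $U$ is frequently made explicit through the subscripts.
Let $\lambda$ be a (one-particle or an interaction) parameter of the Hamiltonian that couples linearly to the observable $A$. We furthermore assume that the “physical” Hamiltonian $H_{t,U}$ is obtained for $\lambda = \lambda_0$, i.e. $H_{t,U} = H(\lambda_0)$ where $H(\lambda) = H_{\rm rest} + \lambda A$ with a $\lambda$-independent remainder $H_{\rm rest}$. With $\Omega(\lambda) = -T \ln \mathrm{tr}\, \exp(-\beta(H(\lambda) - \mu N))$, a straightforward calculation using the cyclic invariance of the trace then yields
$$\frac{\partial \Omega(\lambda)}{\partial \lambda} = \langle A \rangle . \tag{6}$$
Note that $A$ and $H(\lambda)$ do not necessarily commute and that the physical value of the expectation value is obtained by setting $\lambda = \lambda_0$.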
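The computation of the second derivative is a bit more involved but also straightforward. With $\mathcal{H} = H(\lambda) - \mu N$ and $Z = \mathrm{tr}\, e^{-\beta \mathcal{H}}$ we have:
$$\frac{\partial^2 \Omega}{\partial \lambda^2} = \frac{\partial}{\partial \lambda} \frac{\mathrm{tr}\,(A\, e^{-\beta \mathcal{H}})}{Z} = - \frac{1}{Z^2} \frac{\partial Z}{\partial \lambda}\, \mathrm{tr}\,(A\, e^{-\beta \mathcal{H}}) + \frac{1}{Z}\, \mathrm{tr} \left( A\, \frac{\partial e^{-\beta \mathcal{H}}}{\partial \lambda} \right) .$$
Using $\partial Z / \partial \lambda = - \beta Z \langle A \rangle$, this yields
$$\frac{\partial^2 \Omega}{\partial \lambda^2} = \beta \langle A \rangle^2 + \frac{1}{Z}\, \mathrm{tr} \left( A\, \frac{\partial e^{-\beta \mathcal{H}}}{\partial \lambda} \right) .$$
The derivative can be performed with the help of the Trotter decomposition:
$$\frac{\partial}{\partial \lambda} e^{-\beta \mathcal{H}} = \lim_{m \to \infty} \frac{\partial}{\partial \lambda} \left( e^{-\frac{\beta}{m} \mathcal{H}} \right)^m .$$
Writing $X = e^{-\frac{\beta}{m} \mathcal{H}}$ for short,
$$\frac{\partial}{\partial \lambda} X^m = \sum_{r=1}^{m} X^{r-1}\, \frac{\partial X}{\partial \lambda}\, X^{m-r} .$$
With $\partial X / \partial \lambda = - (\beta/m)\, A X + \mathcal{O}(1/m^2)$ we get:
$$\mathrm{tr} \left( A\, \frac{\partial e^{-\beta \mathcal{H}}}{\partial \lambda} \right) = - \lim_{m \to \infty} \frac{\beta}{m} \sum_{r=1}^{m} \mathrm{tr} \left( A\, X^{r-1} A\, X^{m-r+1} \right) .$$
In the continuum limit $m \to \infty$ we define $\tau = r \beta / m \in [0, \beta]$. Hence $d\tau = \beta/m$, $X^{r} = e^{-\tau \mathcal{H}}$, and
$$\frac{1}{Z}\, \mathrm{tr} \left( A\, \frac{\partial e^{-\beta \mathcal{H}}}{\partial \lambda} \right) = - \frac{1}{Z} \int_0^\beta d\tau\, \mathrm{tr} \left( A\, e^{-\tau \mathcal{H}} A\, e^{-(\beta - \tau) \mathcal{H}} \right) = - \int_0^\beta d\tau\, \langle A(\tau) A(0) \rangle$$
with the Heisenberg representation
$$A(\tau) = e^{\mathcal{H} \tau} A\, e^{-\mathcal{H} \tau}$$
for imaginary time $t = -i\tau$ where $\mathcal{H} = H(\lambda) - \mu N$. Collecting the results, we finally find:
$$\frac{\partial^2 \Omega}{\partial \lambda^2} = \frac{\partial \langle A \rangle}{\partial \lambda} = \beta \langle A \rangle^2 - \int_0^\beta d\tau\, \langle A(\tau) A(0) \rangle . \tag{14}$$
Physically, this is the response of the grand-canonical expectation value of the observable $A$ subjected to a small static external perturbation $\lambda A$.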
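Here the result can be used to show that the grand potential is a concave function of $\lambda$:
$$\frac{\partial^2 \Omega(\lambda)}{\partial \lambda^2} \le 0 .$$
This is seen as follows: With $\Delta A = A - \langle A \rangle$ we have
$$\frac{\partial^2 \Omega}{\partial \lambda^2} = - \int_0^\beta d\tau\, \langle \Delta A(\tau) \Delta A(0) \rangle .$$
Using the definition of the quantum-statistical average and $\Delta A = \Delta A^\dagger$, this implies
$$\frac{\partial^2 \Omega}{\partial \lambda^2} = - \frac{1}{Z} \int_0^\beta d\tau\, \mathrm{tr} \left( B_\tau^\dagger B_\tau \right) \le 0 , \qquad B_\tau = e^{-\tau \mathcal{H}/2}\, \Delta A\, e^{-(\beta - \tau) \mathcal{H}/2} ,$$
where $\mathcal{H} = H(\lambda) - \mu N$ and where the cyclic invariance of the trace has been used. Hence the grand potential is a concave function of any parameter that linearly enters the Hamiltonian.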
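The two statements of this section, namely that the first derivative of the grand potential equals the thermal expectation value and that the second derivative is never positive, can be checked numerically on a small matrix model. The following is a minimal sketch and not part of the original derivation: the random Hermitian matrices `K0` and `A`, the helper names, and the parameter values are illustrative assumptions, and the term $-\mu N$ is assumed to be absorbed into `K0`.

```python
import numpy as np

def grand_potential(K0, A, lam, T):
    """Omega(lam) = -T ln tr exp(-(K0 + lam*A)/T); K0 stands for H - mu*N at lam = 0."""
    evals = np.linalg.eigvalsh(K0 + lam * A)
    e0 = evals.min()  # shift by the lowest eigenvalue for numerical stability
    return e0 - T * np.log(np.sum(np.exp(-(evals - e0) / T)))

def thermal_average(K0, A, lam, T):
    """<A> = tr(rho A) with rho = exp(-(K0 + lam*A)/T) / Z."""
    evals, vecs = np.linalg.eigh(K0 + lam * A)
    weights = np.exp(-(evals - evals.min()) / T)
    weights /= weights.sum()
    # diagonal matrix elements <n|A|n> in the eigenbasis of K0 + lam*A
    diag_A = np.einsum("in,ij,jn->n", vecs.conj(), A, vecs).real
    return float(weights @ diag_A)

def random_hermitian(n, rng):
    """Hypothetical test matrix: a random complex Hermitian n x n matrix."""
    M = rng.normal(size=(n, n)) + 1j * rng.normal(size=(n, n))
    return (M + M.conj().T) / 2

rng = np.random.default_rng(0)
K0, A = random_hermitian(6, rng), random_hermitian(6, rng)
T, lam, h = 0.7, 0.3, 1e-4

# central finite differences of Omega(lam)
om_minus, om_0, om_plus = (grand_potential(K0, A, lam + s * h, T) for s in (-1, 0, 1))
first_derivative = (om_plus - om_minus) / (2 * h)
second_derivative = (om_plus - 2 * om_0 + om_minus) / h**2
mean_A = thermal_average(K0, A, lam, T)

print(first_derivative, mean_A)  # the two numbers should agree
print(second_derivative)         # should not be positive
```

Since `A` and `K0` do not commute here, the check also covers the non-commuting case mentioned above.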
3.2 Generalized Ritz principle
To set up the famous Ritz variational principle, we define
$$E_{t,U}[|\Psi\rangle] = \frac{\langle \Psi | H_{t,U} | \Psi \rangle}{\langle \Psi | \Psi \rangle} .$$
This represents the energy of the quantum system as a functional of the state vector. The functional parametrically depends on $t$ and $U$. The Ritz variational principle then states that the functional is at a (global) minimum for the ground state $|\Psi_0\rangle$ of the system:
$$E_{t,U}[|\Psi_0\rangle] = \min .$$
The proof is straightforward and can be found in standard textbooks on quantum mechanics.
In the following we will generalize this principle to cover systems in thermal equilibrium with a heat bath at finite temperature and refer to this as the generalized Ritz principle. The classical version of the generalized principle goes back to Gibbs [Gibbs(1948)] and was later proven for quantum systems by von Neumann and Feynman [von Neumann(1955), Feynman(1955), Mermin(1965)].
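Let us first define a functional which gives the grand potential of the system in terms of the density operator:
$$\Omega_{t,U}[\rho] = \mathrm{tr} \left( \rho\, ( H_{t,U} - \mu N + T \ln \rho ) \right) . \tag{20}$$
Again, the functional parametrically depends on $t$ and $U$, as made explicit by the subscripts, and on $T$ and $\mu$ (this dependence is suppressed in the notation). The generalized Ritz principle then states that the grand potential is at a (global) minimum,
$$\Omega_{t,U}[\rho] = \min ,$$
for the exact (the “physical”) density operator of the system, i.e. for
$$\rho = \rho_{t,U} = \frac{\exp(-\beta(H_{t,U} - \mu N))}{\mathrm{tr}\, \exp(-\beta(H_{t,U} - \mu N))} , \tag{22}$$
and that, if evaluated at the physical density operator, the functional yields the physical value for the grand potential:
$$\Omega_{t,U}[\rho_{t,U}] = \Omega_{t,U} .$$
For the proof, we first note that the latter is satisfied immediately when inserting Eq. (22) into Eq. (20). Hence, it remains to show that $\Omega_{t,U}[\rho] \ge \Omega_{t,U}$ for “arbitrary” $\rho$. The argument of the functional, however, should represent a physically meaningful density operator, i.e. $\rho$ shall be normalized ($\mathrm{tr}\, \rho = 1$), positive definite ($\rho \ge 0$) and Hermitian ($\rho = \rho^\dagger$).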
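To get a sufficiently general ansatz, we introduce the concept of a reference system. This is an auxiliary system with a Hamiltonian
$$H' = H_{t',U'} = H_0(t') + H_1(U')$$
that has the same structure as the Hamiltonian of the original model but with different one-particle and interaction parameters. The only purpose of the reference system is to span a space of trial density operators
$$\rho_{t',U'} = \frac{\exp(-\beta(H_{t',U'} - \mu N))}{\mathrm{tr}\, \exp(-\beta(H_{t',U'} - \mu N))}$$
which are given as the exact density operators of the reference system when varying the parameters $t'$ and $U'$. Hence, evaluated at a trial $\rho_{t',U'}$, the functional is given by
$$\Omega_{t,U}[\rho_{t',U'}] = \langle H_{t,U} - H_{t',U'} \rangle' + \Omega_{t',U'} , \tag{27}$$
where the expectation value $\langle \cdots \rangle'$ is taken with respect to the reference system. This follows from Eq. (20) with $T \ln \rho_{t',U'} = -(H_{t',U'} - \mu N) + \Omega_{t',U'}$.
Now, consider the following partition:
$$H(\lambda) = H_{t',U'} + \lambda\, ( H_{t,U} - H_{t',U'} ) . \tag{28}$$
We have $H(0) = H_{t',U'}$ and $H(1) = H_{t,U}$, and with
$$\Omega(\lambda) = -T \ln \mathrm{tr}\, \exp(-\beta(H(\lambda) - \mu N))$$
we get $\Omega(0) = \Omega_{t',U'}$ and $\Omega(1) = \Omega_{t,U}$. The first term on the r.h.s. of Eq. (27) represents the expectation value of a Hermitian operator that couples linearly via $\lambda$ to the Hamiltonian $H(\lambda)$, see Eq. (28). Using Eq. (6) we can therefore immediately write Eq. (27) in the form
$$\Omega_{t,U}[\rho_{t',U'}] = \Omega(0) + \left. \frac{\partial \Omega(\lambda)}{\partial \lambda} \right|_{\lambda = 0} . \tag{30}$$
On the other hand, $\Omega(\lambda)$ is a concave function of $\lambda$, as has been shown in the preceding section. Since a concave function is nowhere larger than its linear approximation around some fixed point, e.g. around $\lambda = 0$, we have:
$$\Omega(0) + \lambda \left. \frac{\partial \Omega(\lambda)}{\partial \lambda} \right|_{\lambda = 0} \ge \Omega(\lambda) .$$
Evaluating this relation for $\lambda = 1$ and using Eq. (30) yields
$$\Omega_{t,U}[\rho_{t',U'}] \ge \Omega_{t,U} .$$
This proves the validity of the generalized Ritz principle.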
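The minimum property can be illustrated numerically for a finite-dimensional toy problem. The sketch below assumes, for illustration only, that $H - \mu N$ is represented by a random Hermitian matrix `K` and that the trial density operators are Gibbs states of other random Hermitian matrices playing the role of reference systems; none of these names or parameter values come from the text.

```python
import numpy as np

def gibbs_state(K, T):
    """rho = exp(-K/T) / tr exp(-K/T) for a Hermitian matrix K (standing for H - mu*N)."""
    evals, vecs = np.linalg.eigh(K)
    weights = np.exp(-(evals - evals.min()) / T)
    weights /= weights.sum()
    return (vecs * weights) @ vecs.conj().T

def omega_exact(K, T):
    """Physical grand potential -T ln tr exp(-K/T)."""
    evals = np.linalg.eigvalsh(K)
    e0 = evals.min()
    return e0 - T * np.log(np.sum(np.exp(-(evals - e0) / T)))

def omega_functional(rho, K, T):
    """Omega[rho] = tr(rho (K + T ln rho)), evaluated via the eigenvalues of rho."""
    p = np.linalg.eigvalsh(rho)
    p = p[p > 1e-300]  # drop numerically vanishing eigenvalues (p ln p -> 0)
    return float(np.trace(rho @ K).real + T * np.sum(p * np.log(p)))

def random_hermitian(n, rng):
    """Hypothetical test matrix: a random complex Hermitian n x n matrix."""
    M = rng.normal(size=(n, n)) + 1j * rng.normal(size=(n, n))
    return (M + M.conj().T) / 2

rng = np.random.default_rng(1)
T = 0.5
K = random_hermitian(5, rng)  # plays the role of the original system
omega_phys = omega_exact(K, T)

# trial density operators spanned by randomly chosen "reference systems"
trial_values = [
    omega_functional(gibbs_state(random_hermitian(5, rng), T), K, T)
    for _ in range(200)
]
omega_at_physical_state = omega_functional(gibbs_state(K, T), K, T)

print(min(trial_values) >= omega_phys)        # every trial value is an upper bound
print(omega_at_physical_state - omega_phys)   # should vanish up to rounding
```

Random reference systems are of course a poor search strategy; the point of the sketch is only that no trial value falls below the physical grand potential.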
4 Using the Ritz principle to construct approximations
The standard application of the (generalized) Ritz principle is to construct the static mean-field approximation. This represents the well-known Hartree-Fock approximation but generalized to systems at finite temperatures.
4.1 Variational construction of static mean-field theory
The general scheme to define variational approximations which can be evaluated in practice is to start from the variational principle $\delta \Omega_{t,U}[\rho] = 0$ and to insert an ansatz for $\rho$ for which the functional can be evaluated exactly. To this end, one has to restrict the domain of the functional:
$$\mathcal{D} = \{ \rho_{t',U'} \mid t', U' \ \mathrm{arbitrary} \} \ \to \ \mathcal{D}' \subset \mathcal{D} .$$
Usually, this is necessary since the grand potential and the expectation value on the r.h.s. of Eq. (27) are not available for interacting systems. A restriction of the domain of the functional is equivalent to a restriction of the reference system, i.e. to a restricted set of parameters $t'$ and $U'$. Any choice for $\mathcal{D}'$ results in a particular approximation.
Static mean-field theory emerges for the reference system
$$H' = H_{t',0} = \sum_{\alpha\beta} t'_{\alpha\beta}\, c^\dagger_\alpha c_\beta ,$$
where the interaction term is dropped, $U' = 0$, but where all one-particle parameters $t'$ are considered as variational parameters. This is an auxiliary system of non-interacting electrons. The corresponding restricted domain is:
$$\mathcal{D}' = \{ \rho_{t',0} \mid t' \ \mathrm{arbitrary} \} .$$
Hence, static mean-field theory aims at the optimal independent-electron density operator to describe an interacting system. In the following we write
$$\rho_{t'} \equiv \rho_{t',0}$$
and $\langle \cdots \rangle' \equiv \langle \cdots \rangle_{t',0}$ for short. Our goal is to determine the optimal set of variational parameters $t'$ from the conditional equation
$$\frac{\partial}{\partial t'_{\mu\nu}}\, \Omega_{t,U}[\rho_{t'}] = 0 . \tag{37}$$
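To start with, we note
$$\Omega_{t,U}[\rho_{t'}] = \mathrm{tr} \left( \rho_{t'}\, ( H_0(t) + H_1(U) - \mu N + T \ln \rho_{t'} ) \right) .$$
Inserting the trial density operator of the non-interacting reference system, $\rho_{t'} = \exp(-\beta(H_0(t') - \mu N)) / Z_{t',0}$, we find:
$$\Omega_{t,U}[\rho_{t'}] = \sum_{\alpha\beta} t_{\alpha\beta} \langle c^\dagger_\alpha c_\beta \rangle' + \frac{1}{2} \sum_{\alpha\beta\gamma\delta} U_{\alpha\beta\delta\gamma} \langle c^\dagger_\alpha c^\dagger_\beta c_\gamma c_\delta \rangle' + \Omega_{t',0} - \sum_{\alpha\beta} t'_{\alpha\beta} \langle c^\dagger_\alpha c_\beta \rangle' .$$
The dependence on the variational parameters is twofold: There is an explicit dependence that is obvious in the third and the fourth term on the r.h.s., and there is an additional implicit dependence via the expectation value $\langle \cdots \rangle'$. To calculate the derivative in Eq. (37), we first note that according to Eq. (6) $\partial \Omega_{t',0} / \partial t'_{\mu\nu} = \langle c^\dagger_\mu c_\nu \rangle'$. Furthermore, we define (see Eq. (14)):
$$K'_{\alpha\beta\mu\nu} = \frac{\partial \langle c^\dagger_\alpha c_\beta \rangle'}{\partial t'_{\mu\nu}} .$$
Therewith, Eq. (37) reads:
$$0 = \sum_{\alpha\beta} t_{\alpha\beta} K'_{\alpha\beta\mu\nu} + \frac{1}{2} \sum_{\alpha\beta\gamma\delta} U_{\alpha\beta\delta\gamma}\, \frac{\partial}{\partial t'_{\mu\nu}} \langle c^\dagger_\alpha c^\dagger_\beta c_\gamma c_\delta \rangle' - \sum_{\alpha\beta} t'_{\alpha\beta} K'_{\alpha\beta\mu\nu} ,$$
where the contributions $\pm \langle c^\dagger_\mu c_\nu \rangle'$ stemming from the explicit $t'$ dependence have cancelled.
At this point we can make use of the fact that the reference system is given by a Hamiltonian that is bilinear in the creators and annihilators. In this case Wick’s theorem (see e.g. Ref. [Abrikosow et al.(1964), Fetter and Walecka(1971), Negele and Orland(1988)]) applies: Any $2n$-point correlation function consisting of $n$ creators and $n$ annihilators in an expectation value with respect to a bilinear Hamiltonian can be simplified and written as the sum over all different full contractions. Here, a full contraction is a distinct factorization of the $2n$-point correlation function into a product of $n$ two-point correlation functions. Usually, Wick’s theorem is formulated for time-ordered products of creators and annihilators, and time ordering produces, in the case of fermions, a minus sign for each transposition of these operators. For the static expectation value encountered here, we only have to consider a creator to be “later” than an annihilator, to realize that in this sense the expectation value in the second term on the r.h.s. is already time ordered, and to take care of the minus sign when ordering the resulting two-point correlation functions. We find:
$$\langle c^\dagger_\alpha c^\dagger_\beta c_\gamma c_\delta \rangle' = \langle c^\dagger_\alpha c_\delta \rangle' \langle c^\dagger_\beta c_\gamma \rangle' - \langle c^\dagger_\alpha c_\gamma \rangle' \langle c^\dagger_\beta c_\delta \rangle' .$$
The result can also be derived in a more direct (but less elegant) way without using Wick’s theorem, of course. Using this and rearranging terms, we have:
$$\frac{1}{2} \sum_{\alpha\beta\gamma\delta} U_{\alpha\beta\delta\gamma} \langle c^\dagger_\alpha c^\dagger_\beta c_\gamma c_\delta \rangle' = \frac{1}{2} \sum_{\alpha\beta\gamma\delta} ( U_{\alpha\beta\delta\gamma} - U_{\alpha\beta\gamma\delta} )\, \langle c^\dagger_\alpha c_\delta \rangle' \langle c^\dagger_\beta c_\gamma \rangle' .$$
We carry out the differentiation:
$$\frac{\partial}{\partial t'_{\mu\nu}}\, \frac{1}{2} \sum_{\alpha\beta\gamma\delta} U_{\alpha\beta\delta\gamma} \langle c^\dagger_\alpha c^\dagger_\beta c_\gamma c_\delta \rangle' = \frac{1}{2} \sum_{\alpha\beta\gamma\delta} ( U_{\alpha\beta\delta\gamma} - U_{\alpha\beta\gamma\delta} ) \left( K'_{\alpha\delta\mu\nu} \langle c^\dagger_\beta c_\gamma \rangle' + \langle c^\dagger_\alpha c_\delta \rangle' K'_{\beta\gamma\mu\nu} \right)$$
and again rearrange terms to get:
$$\frac{\partial}{\partial t'_{\mu\nu}}\, \frac{1}{2} \sum_{\alpha\beta\gamma\delta} U_{\alpha\beta\delta\gamma} \langle c^\dagger_\alpha c^\dagger_\beta c_\gamma c_\delta \rangle' = \sum_{\alpha\beta\gamma\delta} ( U_{\alpha\gamma\beta\delta} - U_{\gamma\alpha\beta\delta} )\, \langle c^\dagger_\gamma c_\delta \rangle' K'_{\alpha\beta\mu\nu} ,$$
where we made use of $U_{\alpha\beta\delta\gamma} = U_{\beta\alpha\gamma\delta}$. Collecting the results, we have
$$0 = \sum_{\alpha\beta} \left( t_{\alpha\beta} - t'_{\alpha\beta} + \sum_{\gamma\delta} ( U_{\alpha\gamma\beta\delta} - U_{\gamma\alpha\beta\delta} ) \langle c^\dagger_\gamma c_\delta \rangle' \right) K'_{\alpha\beta\mu\nu} .$$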
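Assuming that the matrix $K'$ of derivatives $\partial \langle c^\dagger_\alpha c_\beta \rangle' / \partial t'_{\mu\nu}$ can be inverted, this implies
$$t'_{\alpha\beta} = t_{\alpha\beta} + \sum_{\gamma\delta} ( U_{\alpha\gamma\beta\delta} - U_{\gamma\alpha\beta\delta} )\, \langle c^\dagger_\gamma c_\delta \rangle' . \tag{50}$$
Hence, the optimal one-particle Hamiltonian of the reference system reads:
$$H' = \sum_{\alpha\beta} \left( t_{\alpha\beta} + \Sigma^{\rm (HF)}_{\alpha\beta} \right) c^\dagger_\alpha c_\beta ,$$
where
$$\Sigma^{\rm (HF)}_{\alpha\beta} = \sum_{\gamma\delta} ( U_{\alpha\gamma\beta\delta} - U_{\gamma\alpha\beta\delta} )\, \langle c^\dagger_\gamma c_\delta \rangle' \tag{52}$$
is the (frequency-independent) Hartree-Fock self-energy. Note that the self-energy has to be determined self-consistently: Starting with a guess for $\Sigma^{\rm (HF)}$, we can fix the reference system’s Hamiltonian $H'$. The two-point correlation function of the reference system is then easily calculated by a unitary transformation of the one-particle basis set such that the correlation function becomes diagonal, $\langle c^\dagger_k c_{k'} \rangle' = \delta_{kk'} n_k$, by using Fermi gas theory to get $n_k = 1/(e^{\beta(\varepsilon_k - \mu)} + 1)$ from the Fermi-Dirac distribution, where $\varepsilon_k$ denotes the eigenvalues of $t'$, and, finally, by back-transformation to find $\langle c^\dagger_\alpha c_\beta \rangle'$. With this, a new update of the Hartree-Fock self-energy is obtained from Eq. (52).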
The first term in Eq. (52) is the so-called Hartree potential. It can be interpreted classically as the electrostatic potential of the charge density distribution resulting from the electrons of the system. Opposed to the first term, the second one is spatially non-local if written in real-space representation. This is the Fock potential produced by the electrons and has no classical analogue. Note that there is no self-interaction of an electron with the potential generated by itself: Within the real-space representation, the corresponding Hartree and Fock terms are seen to cancel each other exactly.
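The self-consistency cycle described above (guess the self-energy, diagonalize the one-particle Hamiltonian, occupy the eigenstates with the Fermi-Dirac distribution, transform back, update) can be sketched for the Hubbard model. The code below is an illustration under stated assumptions, not the author's implementation: it assumes a collinear state without spin-flip expectation values, for which the Hartree-Fock self-energy of the Hubbard model reduces to the site-diagonal term $U \langle n_{i,-\sigma} \rangle$; the two-site hopping matrix, the linear mixing, and all parameter values are hypothetical choices.

```python
import numpy as np

def fermi(energies, mu, T):
    """Fermi-Dirac distribution."""
    return 1.0 / (np.exp((energies - mu) / T) + 1.0)

def site_occupations(h, mu, T):
    """<c_i^dagger c_i> for non-interacting electrons with one-particle matrix h."""
    evals, vecs = np.linalg.eigh(h)   # unitary transformation to the diagonal basis
    occ = fermi(evals, mu, T)         # occupation of each one-particle eigenstate
    return np.einsum("ik,k,ik->i", vecs.conj(), occ, vecs).real  # back-transformation

def hartree_fock_hubbard(t, U, mu, T, n_up, n_dn, mix=0.5, tol=1e-10, max_iter=1000):
    """Iterate t'_sigma = t + U diag(<n_{i,-sigma}>) until the occupations are stationary."""
    for _ in range(max_iter):
        new_up = site_occupations(t + U * np.diag(n_dn), mu, T)
        new_dn = site_occupations(t + U * np.diag(n_up), mu, T)
        change = max(np.abs(new_up - n_up).max(), np.abs(new_dn - n_dn).max())
        n_up = (1 - mix) * n_up + mix * new_up  # linear mixing stabilizes the iteration
        n_dn = (1 - mix) * n_dn + mix * new_dn
        if change < tol:
            break
    return n_up, n_dn

# two-site Hubbard model at half filling (mu = U/2), starting from a staggered guess
t = np.array([[0.0, -1.0], [-1.0, 0.0]])
U, T = 1.0, 0.5
n_up, n_dn = hartree_fock_hubbard(t, U, U / 2, T,
                                  np.array([0.6, 0.4]), np.array([0.4, 0.6]))
print(n_up, n_dn)
```

For these weak-coupling, high-temperature parameters the staggered starting guess relaxes to the uniform paramagnetic solution; other parameters or starting points may converge to symmetry-broken solutions or require a different mixing.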
4.2 Grand potential within static mean-field theory
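The final task is to compute the grand potential for the optimal (Hartree-Fock) density operator, i.e.
$$\Omega_{\rm HF} \equiv \Omega_{t,U}[\rho_{t'}] ,$$
where (the self-consistent) $t'$ is taken from Eq. (50). We find:
$$\Omega_{t,U}[\rho_{t'}] = \Omega_{t',0} + \sum_{\alpha\beta} ( t_{\alpha\beta} - t'_{\alpha\beta} ) \langle c^\dagger_\alpha c_\beta \rangle' + \frac{1}{2} \sum_{\alpha\beta\gamma\delta} U_{\alpha\beta\delta\gamma} \langle c^\dagger_\alpha c^\dagger_\beta c_\gamma c_\delta \rangle' . \tag{54}$$
Using Wick’s theorem,
$$\langle c^\dagger_\alpha c^\dagger_\beta c_\gamma c_\delta \rangle' = \langle c^\dagger_\alpha c_\delta \rangle' \langle c^\dagger_\beta c_\gamma \rangle' - \langle c^\dagger_\alpha c_\gamma \rangle' \langle c^\dagger_\beta c_\delta \rangle' ,$$
and inserting the optimal $t'$, we arrive at:
$$\Omega_{t,U}[\rho_{t'}] = \Omega_{t',0} - \sum_{\alpha\beta\gamma\delta} ( U_{\alpha\gamma\beta\delta} - U_{\gamma\alpha\beta\delta} ) \langle c^\dagger_\gamma c_\delta \rangle' \langle c^\dagger_\alpha c_\beta \rangle' + \frac{1}{2} \sum_{\alpha\beta\gamma\delta} U_{\alpha\beta\delta\gamma} \left( \langle c^\dagger_\alpha c_\delta \rangle' \langle c^\dagger_\beta c_\gamma \rangle' - \langle c^\dagger_\alpha c_\gamma \rangle' \langle c^\dagger_\beta c_\delta \rangle' \right) .$$
Relabeling the summation indices in the second term, this yields
$$\Omega_{t,U}[\rho_{t'}] = \Omega_{t',0} - \frac{1}{2} \sum_{\alpha\beta\gamma\delta} U_{\alpha\beta\delta\gamma} \left( \langle c^\dagger_\alpha c_\delta \rangle' \langle c^\dagger_\beta c_\gamma \rangle' - \langle c^\dagger_\alpha c_\gamma \rangle' \langle c^\dagger_\beta c_\delta \rangle' \right) .$$
Using Wick’s theorem “inversely”,
$$\Omega_{t,U}[\rho_{t'}] = \Omega_{t',0} - \frac{1}{2} \sum_{\alpha\beta\gamma\delta} U_{\alpha\beta\delta\gamma} \langle c^\dagger_\alpha c^\dagger_\beta c_\gamma c_\delta \rangle' . \tag{58}$$
This is an interesting result as it shows that the Hartree-Fock grand potential is different from the grand potential of the reference system, $\Omega_{t',0}$, which is the grand potential of a system of non-interacting electrons. Due to the “renormalization” of the one-particle parameters $t \to t'$, the grand potential of the reference system does already include some interaction effects. As Eq. (58) shows, however, there is a certain amount of “double counting” of interactions in $\Omega_{t',0}$ which has to be corrected for by the second term. The second term is the Coulomb interaction energy of the electrons in the renormalized one-particle potential $t'$ and lowers the Hartree-Fock grand potential. This is important as we know that $\Omega_{t,U}[\rho_{t'}]$ must represent an upper bound to the exact grand potential of the system:
$$\Omega_{t,U}[\rho_{t'}] \ge \Omega_{t,U} .$$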
Concluding, we can state that Hartree-Fock theory can very easily be derived from the generalized Ritz principle. The only approximation consists in the choice of the reference system which serves to span a set of trial density operators. The rest of the calculation is straightforward and provides consistent results.
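To estimate the quality of the approximation, we replace $U \to \lambda U$ and expand the exact grand potential in powers of the interaction strength $\lambda$:
$$\Omega_{t,\lambda U} = \Omega_{t,0} + \lambda \left. \frac{\partial \Omega_{t,\lambda U}}{\partial \lambda} \right|_{\lambda = 0} + \mathcal{O}(\lambda^2) .$$
Using Eq. (6), this gives
$$\Omega_{t,\lambda U} = \Omega_{t,0} + \lambda \langle H_1(U) \rangle_{t,0} + \mathcal{O}(\lambda^2) = \Omega_{t,0} + \lambda \langle H_1(U) \rangle' + \mathcal{O}(\lambda^2) ,$$
where we have replaced the expectation value with respect to the non-interacting system by the one with respect to the Hartree-Fock reference system $H' = H_0(t')$ where $t'$ is the self-consistent one-particle potential. Since $t' = t + \mathcal{O}(\lambda)$, this is correct up to terms of order $\lambda^2$. With the same argument we can treat the first term on the r.h.s., yielding:
$$\Omega_{t,0} = \Omega_{t',0} + \sum_{\alpha\beta} ( t_{\alpha\beta} - t'_{\alpha\beta} )\, \frac{\partial \Omega_{t',0}}{\partial t'_{\alpha\beta}} + \mathcal{O}(\lambda^2) .$$
Using Eq. (6) once more, $\partial \Omega_{t',0} / \partial t'_{\alpha\beta} = \langle c^\dagger_\alpha c_\beta \rangle'$, it follows
$$\Omega_{t,U} = \Omega_{t',0} + \sum_{\alpha\beta} ( t_{\alpha\beta} - t'_{\alpha\beta} ) \langle c^\dagger_\alpha c_\beta \rangle' + \langle H_1(U) \rangle' + \mathcal{O}(U^2) ,$$
where we have set $\lambda = 1$ in the end. Comparing with Eq. (54), this shows that
$$\Omega_{t,U} = \Omega_{t,U}[\rho_{t'}] + \mathcal{O}(U^2) .$$
Static mean-field theory thus predicts the correct grand potential of the interacting electron system up to first order in the interaction strength. It is easy to see, however, that already at second order there are deviations. In fact, diagrammatic perturbation theory shows that static mean-field (Hartree-Fock) theory is fully equivalent to self-consistent first-order perturbation theory only. This leads us to the conclusion that despite its conceptual beauty, static mean-field theory must be expected to give poor results if applied to a system of strongly correlated electrons.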
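Both statements, the upper-bound property and the agreement with the exact grand potential up to first order in the interaction, can be made concrete for the simplest possible case. The sketch below considers a single Hubbard site (“Hubbard atom”) with on-site energy zero, for which the exact grand potential is known in closed form. The restriction to the paramagnetic Hartree-Fock solution, the bisection solver (valid for $U \ge 0$), and the parameter values are assumptions made for this illustration.

```python
import numpy as np

def fermi(x, T):
    """Fermi-Dirac function of the energy x measured from the chemical potential."""
    return 1.0 / (np.exp(x / T) + 1.0)

def omega_exact_atom(U, mu, T):
    """Exact grand potential; the eigenvalues of H - mu*N are 0, -mu, -mu, U - 2*mu."""
    return -T * np.log(1.0 + 2.0 * np.exp(mu / T) + np.exp(-(U - 2.0 * mu) / T))

def hf_occupation(U, mu, T, iterations=80):
    """Paramagnetic solution of n = f(U*n - mu) by bisection (assumes U >= 0)."""
    lo, hi = 0.0, 1.0
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if fermi(U * mid - mu, T) > mid:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def omega_hf_atom(U, mu, T):
    """Hartree-Fock grand potential: reference-system value minus double counting."""
    n = hf_occupation(U, mu, T)
    omega_ref = -2.0 * T * np.log(1.0 + np.exp(-(U * n - mu) / T))  # two spin species
    return omega_ref - U * n * n

mu, T = 0.3, 1.0
errors = {U: omega_hf_atom(U, mu, T) - omega_exact_atom(U, mu, T)
          for U in (0.01, 0.02, 1.0, 4.0)}
ratio = errors[0.02] / errors[0.01]

print(errors)  # all entries should be non-negative (upper bound)
print(ratio)   # should be close to 4 if the error is of second order in U
```

Doubling a small $U$ should roughly quadruple the error, which is the numerical signature of a deviation starting at second order in the interaction.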
4.3 Approximation schemes
Let us try to learn from the presented construction of static mean-field theory using the generalized Ritz principle and pinpoint the main concepts such that these can be transferred to another variational principle which might then lead to more reliable approximations. We consider a variational principle of the form
$$\delta \Omega_{t,U}[x] = 0 ,$$
where $x$ is some unspecified multicomponent physical quantity. It is assumed that the functional is stationary at the physical value $x_0$ of $x$ and, if evaluated at the physical value, yields the physical grand potential $\Omega_{t,U}$. Typically, and as we have seen for the Ritz principle, it is impossible to evaluate the functional exactly for a given $x$, so that one has to resort to approximations. Three different types of approximation strategies may be distinguished, see also Fig. 2:
- In a type-I approximation one derives the Euler equation first and then chooses (a physically motivated) simplification of the equation afterwards to render the determination of $x_0$ possible. This is most general but also questionable a priori, as normally the approximated Euler equation no longer derives from some approximate functional. This may result in thermodynamical inconsistencies.
- A type-II approximation modifies the form of the functional dependence, $\Omega_{t,U}[\cdots] \to \widetilde{\Omega}_{t,U}[\cdots]$, to get a simpler one that allows for a solution of the resulting Euler equation $\delta \widetilde{\Omega}_{t,U}[x] = 0$. This type is more particular and yields a thermodynamical potential consistent with $x_0$. Generally, however, it is not easy to find a sensible approximation of a functional form.
- Finally, in a type-III approximation one restricts the domain of the functional, which must then be defined precisely. This type is most specific and, from a conceptual point of view, should be preferred to type-I or type-II approximations as the exact functional form is retained. In addition to conceptual clarity and thermodynamical consistency, type-III approximations are truly systematic since improvements can be obtained by a corresponding extension of the domain.
Examples for the different cases can be found e.g. in Ref. [Potthoff(2005)]. The presented derivation of the Hartree-Fock approximation shows that it is of type III. The classification of approximation schemes is hierarchical: Any type-III approximation can also be understood as a type-II one, and any type-II approximation as a type-I one, but not vice versa (see Fig. 2). This does not mean, however, that type-III approximations are superior to type-II and type-I ones. They are conceptually more appealing but do not necessarily provide “better” results.
5 Dynamical quantities
To set up the dynamical variational principle, some preparations are necessary. We first introduce the one-particle Green’s function and the self-energy, briefly sketch diagrammatic perturbation theory and also discuss how within the framework of perturbation theory it is possible to construct non-perturbative approximations like the so-called cluster perturbation theory (CPT).
5.1 Green’s functions
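We again consider a system of interacting electrons in thermodynamical equilibrium at temperature $T$ and chemical potential $\mu$. The Hamiltonian of the system is $H = H_{t,U}$, see Eq. (1). Now, the one-particle Green’s function [Abrikosow et al.(1964), Fetter and Walecka(1971), Negele and Orland(1988)]
$$G_{\alpha\beta}(\omega)$$
of the system will be the main object of interest. This is a frequency-dependent quantity which provides information on the static expectation value of the one-particle density matrix but also on the spectrum of one-particle excitations related to a photoemission experiment [Potthoff(2001a)]. The Green’s function can be defined for complex $\omega$ via its spectral representation:
$$G_{\alpha\beta}(\omega) = \int_{-\infty}^{\infty} dz\, \frac{A_{\alpha\beta}(z)}{\omega - z} ,$$
where the spectral density
$$A_{\alpha\beta}(z) = \int_{-\infty}^{\infty} dt\, e^{izt} A_{\alpha\beta}(t)$$
is the Fourier transform of
$$A_{\alpha\beta}(t) = \frac{1}{2\pi} \langle [ c_\alpha(t) , c^\dagger_\beta(0) ]_+ \rangle ,$$
which involves the anticommutator of an annihilator and a creator with a Heisenberg time dependence $\mathcal{O}(t) = \exp(i(H - \mu N)t)\, \mathcal{O} \exp(-i(H - \mu N)t)$. Due to the thermal average, $\langle \mathcal{O} \rangle = \mathrm{tr}\,(\rho_{t,U}\, \mathcal{O})$, the Green’s function parametrically depends on $t$ and $U$ and is denoted by $G_{t,U;\alpha\beta}(\omega)$.
For the diagram technique employed below, we need the Green’s function on the imaginary Matsubara frequencies $\omega = i\omega_n \equiv i(2n+1)\pi T$ with integer $n$ [Abrikosow et al.(1964), Fetter and Walecka(1971), Negele and Orland(1988)]. In the following the elements $G_{t,U;\alpha\beta}(i\omega_n)$ are considered to form a matrix $G_{t,U}$ which is diagonal with respect to $n$.
The “free” Green’s function $G_{t,0}$ is obtained for $U = 0$, and its elements are given by:
$$G_{t,0;\alpha\beta}(i\omega_n) = \left( \frac{1}{i\omega_n + \mu - t} \right)_{\alpha\beta} .$$
This is a result that can easily be derived from the equation-of-motion technique [Fetter and Walecka(1971)]. Therewith, we can define the self-energy via Dyson’s equation
$$G_{t,U} = \frac{1}{G_{t,0}^{-1} - \Sigma_{t,U}} .$$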
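As a small numerical illustration of the free Green’s function and of Dyson’s equation, the sketch below evaluates both as matrix inversions at a single Matsubara frequency. It uses a frequency-independent (Hartree-Fock-like) self-energy, for which the Dyson equation must reproduce the free Green’s function of the renormalized hopping $t + \Sigma$. The two-orbital hopping matrix, the self-energy values, and the function names are hypothetical choices for this illustration.

```python
import numpy as np

def matsubara_frequency(n, T):
    """Fermionic Matsubara frequency omega_n = (2n + 1) * pi * T."""
    return (2 * n + 1) * np.pi * T

def free_green(t, mu, omega):
    """G_{t,0}(omega) = (omega + mu - t)^(-1) as a matrix in the orbital indices."""
    return np.linalg.inv((omega + mu) * np.eye(t.shape[0]) - t)

def dyson(G0, Sigma):
    """Dyson's equation: G = (G0^(-1) - Sigma)^(-1)."""
    return np.linalg.inv(np.linalg.inv(G0) - Sigma)

t = np.array([[0.0, -1.0], [-1.0, 0.0]])   # hypothetical two-orbital hopping matrix
Sigma = np.diag([0.3, 0.7])                # hypothetical static self-energy
mu, T = 0.5, 0.1
omega = 1j * matsubara_frequency(0, T)

G0 = free_green(t, mu, omega)
G = dyson(G0, Sigma)
G_direct = free_green(t + Sigma, mu, omega)  # free propagator of the renormalized system

print(np.allclose(G, G_direct))
```

For a frequency-dependent self-energy, as it appears in the dynamical approximations discussed later, the same inversion would be carried out separately at each Matsubara frequency.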
5.2 Diagrammatic perturbation theory
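The main reason why the Green’s function is put into the focus of the theory is that a systematic expansion in powers of the interaction can be set up. Here, only a brief sketch of perturbation theory can be given (see Refs. [Abrikosow et al.(1964), Fetter and Walecka(1971), Negele and Orland(1988)] for details). The starting point is the so-called S-matrix defined for $0 \le \tau, \tau' \le \beta$ as
$$S_{t,U}(\tau, \tau') = e^{\mathcal{H}_{t,0} \tau}\, e^{-\mathcal{H}_{t,U} (\tau - \tau')}\, e^{-\mathcal{H}_{t,0} \tau'} ,$$
where $\mathcal{H}_{t,U} = H_{t,U} - \mu N$ and $\mathcal{H}_{t,0} = H_{t,0} - \mu N$.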
There are two main purposes of the S-matrix. First, it serves to rewrite the partition function in the following way: