Chebyshev Matrix Product State Impurity Solverfor the Dynamical Mean-Field Theory

Chebyshev Matrix Product State Impurity Solver for the Dynamical Mean-Field Theory

Abstract

We compute the spectral functions for the two-site dynamical cluster theory and for the two-orbital dynamical mean-field theory in the density-matrix renormalization group (DMRG) framework using Chebyshev expansions represented with matrix product states (MPS). We obtain quantitatively precise results at modest computational effort through technical improvements regarding the truncation scheme and the Chebyshev rescaling procedure. We furthermore establish the relation of the Chebyshev iteration to real-time evolution, and discuss technical aspects as computation time and implementation in detail.

I Introduction

The dynamical mean-field theory (DMFT)(Metzner and Vollhardt, 1989; Georges and Kotliar, 1992; Georges et al., 1996; Kotliar et al., 2006) and its cluster extensions(Maier et al., 2005) are among the most successful methods to study strongly correlated electron systems in dimensions higher than one. The impurity problem within DMFT is usually solved with continuous-time quantum Monte Carlo (CTQMC) algorithms, (Gull et al., 2011; Rubtsov et al., 2005; Gull et al., 2008; Werner et al., 2006) the numerical renormalization group (NRG) (Bulla et al., 2008) or exact diagonalization (ED).(Caffarel and Krauth, 1994; Granath and Strand, 2012; Lu et al., 2014) While CTQMC is computationally feasible even for problems with many bands or a high number of cluster sites, it provides numerically exact results only on the imaginary frequency axis. Many experimentally relevant frequency-dependent quantities like e.g. the conductivity therefore can only be obtained via the numerically ill-conditioned analytical continuation. NRG, by contrast, solves the problem on the real frequency axis. But it badly resolves spectral functions at high energies and cannot treat DMFT calculations with more than e.g. two bands. The limiting factor for this is the exponential growth of the local Hilbert space with the number of bands. Only recently, a reformulation of the mapping problem could avoid this exponential growth,(Mitchell et al., 2014) but it is still unclear whether this can be efficiently exploited in the context of DMFT. ED faces the problem of a limited spectral resolution due to the limited number of bath sites it can treat, although recent publications could substantially improve that. (Granath and Strand, 2012; Lu et al., 2014)

As the impurity problem of DMFT is one-dimensional, there has been a long-time interest to solve it using density matrix renormalization group (DMRG),(White, 1992; Schollwöck, 2005, 2011) which operates on the class of matrix product states (MPS). DMRG features an unbiased energy resolution and shows no exponential growth of the local Hilbert space with respect to the number of baths. It also works directly on the real-frequency axis, avoiding analytic continuation. The earliest DMRG approach to spectral functions, the Lanczos algorithm approach,(Hallberg, 1995) is computationally cheap, but does not yield high-quality DMFT results due to its intrinsic numerical instability.(García et al., 2004) Recent improvements using a fully MPS-based representation of this algorithm(Dargel et al., 2012) are not sufficient to resolve this issue. (Wolf, 2013) The dynamical DMRG (DDMRG) approach(Kühner and White, 1999; Jeckelmann, 2002) yields very precise results for single-site DMFT on the real frequency axis,(Karski et al., 2008, 2005; Nishimoto and Jeckelmann, 2004) but is computationally extremely costly and therefore not competitive with other impurity solvers for DMFT.

Recently, a new approach to spectral functions based on expansions in Chebyshev polynomials(Weiße et al., 2006) represented with matrix product states (CheMPS)(Holzner et al., 2011; Braun and Schmitteckert, 2013; Tiegel et al., 2013; Ganahl et al., 2014a) was introduced by two of us in Ref. Holzner et al., 2011, which gave essentially the accuracy as the DDMRG approach at a fraction of the computational cost. At the same time, the availability of real-time evolution(Daley et al., 2004; Vidal, 2004; White and E., 2004) within time-dependent DMRG (tDMRG) and closely related methods generally also permits access to spectral functions by a Fourier transformation.(White and E., 2004) Both Chebyshev expansions (CheMPS)(Ganahl et al., 2014a) and tDMRG(Ganahl et al., 2014b) were recently seen to be applicable to the solution of the DMFT. Both approaches are computationally cheaper than DDMRG and numerically stable. For the single-impurity single-band case, results on the real-frequency axis are excellent, but for more typical present-day DMFT setups involving clusters or multiple bands, results are not available in the case of Chebyshev expansions or do so far not reach the quality of the competing QMC and NRG methods in the case of real-time evolution.

In this paper, we push the application of CheMPS to DMFT further: (i) We solve the dynamical cluster approximation (DCA)(Maier et al., 2005) for a two-site cluster and the DMFT for a two-band Hubbard model. The accuracy of the results for the latter case is better than those shown in Ref. Ganahl et al., 2014b, where the problem has been solved using tDMRG. (ii) We consider the experimentally relevant case of finite doping, which is significantly more complicated than the half-filled cases treated so far. (iii) We suggest a new truncation scheme for CheMPS, which allows to maintain the same error level at strongly reduced computational cost. (iv) We establish that the Chebyshev recurrence iteration can be interpreted as a discrete real-time evolution. (v) By comparing different methods to set up CheMPS, we obtain another substantial increase in computation speed. (vi) We discuss limitations of post-processing methods, which have been crucial to the success of DMRG as an DMFT impurity solver.

With these improvements, CheMPS immediately provides an efficient, precise and controlled way to solve DMFT problems with two baths (two-site clusters) on the real-frequency axis with feasible extensions to problems with more bands. The presentation proceeds as follows. After a general introduction to Chebyshev expansions of spectral functions in Sec. II, we move on to discuss its implementation in the approximate framework of MPS: in Sec. III, we present a new truncation scheme, and in Sec. IV, we discuss the mapping of the Hamiltonian to the convergence interval of Chebyshev polynomials, because this interacts non-trivially with efficient MPS calculations. Sec. V treats the post-processing of Chebyshev moments obtained in the expansion. These improvements are then applied to various DMFT problems. As the case of the single-impurity single-band DMFT has been treated extensively in the literature and just serves as an initial benchmark, we move those results to the Appendix. In the main text, we give examples for the relevance of our improvements to CheMPS by solving a two-site DCA in Sec. VI.1 and a single-site two-orbital DMFT in Sec. VI.2. Technical details of these calculations are again found in the Appendix. Sec. VII concludes the paper.

Ii Chebyshev expansion of spectral functions

In this Section, we establish notation and explain the general ideas behind Chebyshev expansions of spectral functions. The zero-temperature single-particle Green’s function associated with a many-body hamiltonian is

(1)

where creates a particle in a particular quantum state and is the ground state with energy . The spectral function reads

(2)

with weights . If evaluated exactly in a finite system, is a comb of delta peaks, which only in the thermodynamic limit becomes a smooth function . If evaluated in an approximate way that averages over the finite-size structure of , it is possible to extract also from a sufficiently big finite-size system. Among various techniques that provide such an approximation,(Lin et al., 2013) the most popular one is the definition of a broadened representation of

(3)

where the broadening function is given by the Gaussian kernel

(4)

Besides the Gaussian kernel, a Lorentzian kernel

(5)

is often implicitly used as it emerges automatically when computing the spectral function from the shifted Green’s function . In general, is indistinguishable from if the latter has no structure on a scale smaller than .

An efficient way to generate the broadened version of is via iterative expansions in orthogonal polynomials. Historically most frequently used in this context is the Lanczos algorithm, which is intrinsically numerically unstable, though. By contrast, expansions in Chebyshev polynomials can be generated in a numerically stable way. As they haven’t been used much in either the DMRG or DMFT community so far, we briefly introduce them based on Ref. Weiße et al., 2006.

ii.1 General implementation

The Chebyshev polynomials of the first kind can be represented explicitly by

(6)

or generated with the recursion

(7)

which is numerically stable if . Chebyshev polynomials are orthonormal with respect to the weighted scalar product

(8a)
(8b)

Any sufficiently well-behaved function can be expanded in Chebyshev polynomials

(9a)
(9b)

where the definition of the so-called Chebyshev moments via the non-weighted scalar product follows when applying to both sides of (9a).

If is smooth, the envelope of decreases at least exponentially to zero with respect to ; if is the step function, the envelope decreases algebraically; and if is the delta function, the envelope remains constant. (Boyd, 2001) For a smooth function, the truncated expansion therefore approximates very well if is chosen high enough. But for the delta function, any truncated expansion yields an approximation with spurious (Gibbs) oscillations. A controlled damping scheme for the oscillations, the so-called kernel polynomial approximation (KPM), can be obtained with a simple modification of the Chebyshev expansion,

(10a)
(10b)

where is the so-called Jackson kernel that leads to a very good Gaussian approximation with -dependent width of the delta function, and hence directly leads to (4).

In the case of the spectral function (II), one aims at an expansion of a superposition of delta functions. This can in practice often be done without damping: When expanding (II) in Chebyshev polynomials, the integration in (9b) averages over the delta-peak as well as over the finite-size peak structure of . If the weights vary slowly on the scale of the spacing of finite-size peaks, the sequence approaches zero as soon as the characteristic form of this slow variation is resolved. The value of at which this pseudo-convergence occurs is the one that resolves the spectral function in the thermodynamic limit , provided that has no structure on a smaller scale than the spacing of finite-size peaks. Only for much higher values of , the Chebyshev moments start deviating from zero again to then oscillate forever, resolving first the finite-size structure of and finally the delta-peak structure. Therefore, if one can generate the sequence up to pseudo-convergence, then there is no need for Jackson damping.

ii.2 Operator valued Chebyshev expansion

In order to expand the spectral function (II), one usually introduces a rescaled and shifted version of in order to map its spectrum into the interval , where Chebyshev polynomials are bounded and have a stable recursion relationship,

(11)

Obviously, there is a lot of leeway in the choice of and , which will be found to have large implications for CheMPS (Sec. IV). Generally,

(12)

Expanding in Chebyshev polynomials yields the moments

(13)

Inserting the recursive definition (7) of in the definition of one obtains a practical calculation scheme for the power series expansion of

(14a)
(14b)

One can double the expansion order with the following relation(Weiße et al., 2006)

(15a)
(15b)

but has to be aware of the fact that moments computed this way are more prone to numerical errors.(Holzner et al., 2011)

ii.3 Retarded fermionic Green’s function

In the case of fermionic problems, as encountered in DMFT, an additional technical complication comes up. The spectral representation of the fermionic retarded Green’s function is the sum of its particle and hole parts

(16)

As have steps at , their representation in terms of smooth polynomials is notoriously ill-conditioned. One should therefore try to represent the smooth function by a single Chebyshev expansion: Allowing for two different rescaling prescriptions, one has

(17a)
(17b)

In order to write in terms of a single Chebyshev expansion, one can use the symmetries and . These restrict the rescaling parameters via to and . Making the particular choice hence defines a common expansion via(Ganahl et al., 2014a)

(18)

Although provides one with a controlled treatment of the step function, it comes at the price of a loss in computational speed. We will compare advantages and disadvantages of two practical shifting possibilities ( and ) in detail in Sec. IV.

Iii Matrix Product implementation

So far everything has been general, or it was somehow assumed that all calculations can be carried out exactly, which meets severe limitations in computational practice. Representing Chebyshev states with matrix product states (MPS)(Holzner et al., 2011) enables more efficient computations than in an exact representation, as the size of the effective Hilbert space can be tremendously reduced. As an MPS is usually only an approximate representation of a strongly correlated quantum state, the issue of optimal compression, i.e. the representation of a quantum state as an MPS using finite-dimensional matrices with a minimal loss of accuracy (information), is crucial. Here, we argue in the following that instead of controlling the maximal matrix dimension,(Holzner et al., 2011; Ganahl et al., 2014a; Tiegel et al., 2013) one should rather control the cumulated truncated weight (a proxy measure of the loss of accuracy), allowing for more efficient and more controlled calculations of Chebyshev moments.

iii.1 Adaptive matrix dimension

If one follows through the recursive scheme for Chebyshev vectors, one starts out from a ground state, which we may assume has been obtained by a standard DMRG (MPS) calculation to extremely high precision, this means that an optimally compressed starting MPS is available where matrices have some computationally feasible dimension at very small loss of accuracy compared to the exact starting state. This, in turn, yields an extremely precise starting Chebyshev state . Now, in each step of the recursion (14a), one applies and subtracts a preceding Chebyshev state. As is well-known for MPS, the application of (and to a lesser extent the subtraction) lead to a drastic increase in matrix dimension, which necessitates a state compression (Sec. 4.5 of Ref. Schollwöck, 2011) of the new Chebyshev state to a computationally manageable state with smaller matrix dimension , which generates the error

(19)

Here, we used the upper error bound(Verstraete and Cirac, 2006) provided by the cumulated truncated weight

(20)

where is the sum over the discarded reduced density-matrix eigenvalues per bond and the sum over is over all bonds. This error bound for a single step of the recursion unfortunately does not provide a statement about the total error that accumulates over all compression steps in preceding Chebyshev recursion steps. Still, we experienced that the numerical stability of the Chebyshev recursion rather leads to a helpful compensation of errors of single recursion steps. Fig. 1 shows that the total error stays at the order of the error of a single step also for high iteration numbers . In the case in which one fixes the matrix dimension , Fig. 1 shows a steady, uncontrolled increase of the total error. This is particularly undesirable in view of the desired post-processing of Chebyshev moments (Sec. V).

Figure 1: (Color online) Error of Chebyshev moments (as they appear in (17a)), computed as , where is obtained with a quasi-exact calculation with high matrix dimension . If one fixes the matrix dimension , the error steadily increases. If, instead, one fixes the cumulated truncated weight , the error remains approximately constant and does not accumulate. This is the procedure followed in this paper. As here, , equals the upper error bound of a single compression step. Results shown are for the spectral function of the half-filled single-impurity Anderson model (SIAM) (Appendix C.1) with semi-elliptic density of states of half-bandwidth , interaction , represented on a chain with lattice sites. This is equivalent to considering the local density of states at the first site of a fermionic chain with constant hopping and an interaction of that acts solely at the first site.

Another possibility would be to fix the local discarded weight as defined in (20). But this does in general not lead to a viable computation scheme for impurity models: In the simplest and most-employed chain representation of impurity models, the impurity site is located at an edge of the chain. Fixing the same value for for all bonds then leads to extremely high matrix dimensions in the center of the chain, i.e. in the center of the bath, where entanglement for systems with open boundary conditions is maximal. The relevant entanglement, by contrast, is the one between the impurity site and the bath. This becomes clear when noticing that upon projecting the Chebyshev state on to compute , only correlations with respect to the local excitation are measured. The high computational effort of high matrix dimensions that follows when faithfully representing entanglement within the bath, is therefore in vain. For geometries with the impurity at the center, like the two-chain geometry used for the two-bath problems in this paper, the preceding argument is not valid. An inhomogeneous distribution of matrix dimensions with high values at the center and low values at the boundaries is a priori consistent with open boundary conditions. This distribution can therefore be achieved by fixing a constant value for for each bond. Another possible truncation scheme could be obtained by using an estimator for the correlations of the impurity with the bath, which then fixes the matrix dimensions as a function of bonds (distance to the impurity). Both approaches constitute possible future refinements. For simplicity, in this paper, we consider the truncation scheme that fixes a constant value of based on the cumulative truncated weight.

iii.2 State compression

During the repeated solution of (14a) we monitor the truncated weight . If exceeds a certain threshold of the order of to , we slightly increase the matrix dimension , and repeat the compression. For the first compression step we take as an initial guess the previous Chebyshev state . For repeated compression steps we take as an initial guess the state of the previous compression step. It turns out that in practice one almost never faces repeated compressions, which gains one approximately a factor 2 in computation speed compared to the error monitoring of Ref. Holzner et al., 2011: in Ref. Holzner et al., 2011, the authors keep the matrix dimension fixed and variationally(Schollwöck, 2011) compress an exact representation of the right hand side of (14a) for fixed by repeated iterations (“sweeps”) until the error

(21)

drops below a certain threshold. Here, denotes the state before a sweep, and the state after a sweep. This error measure is not related to the factual error of Chebyshev moments, for any but the first sweep. Its monitoring is costly to compute and leads to at least two compression sweeps.

Iv Optimal Chebyshev setup

One can generally state that the effectiveness of the MPS evaluation of the Chebyshev recursion (14a) for a certain system is unknown a priori but must be experienced by observing how strong entanglement in the Chebyshev vectors, and therefore matrix dimension needed for a faithful representation grows as compared to the speed of convergence of . For very high iteration numbers one will always reach a regime in which matrix dimensions have grown so much that further calculations become too expensive computationally. This is known from tDMRG as hitting an exponential wall and defines an accessible time scale, or in our case, an accessible expansion order. In the case of the computation of Chebyshev moments, the accessible time scale strongly depends on the choice of the shifting parameter , which leads us to consider the two cases and .

Comparing these cases, one finds a much slower speed of convergence of the Chebyshev moments in the case than in the case . Putting that differently: per fixed amount of entanglement growth (application of in one step of (14a)), much less information about the spectral function is extracted in case than in case . Independent of that, one finds that the advantage of the choice to provide one with an analytic expression for in terms of a single Chebyshev expansion (Sec. II.3) can be detrimental. We therefore need to study both cases in more detail.

iv.1 No shift:

If choosing , one can derive a scaling property of Chebyshev moments that simplifies extracting the thermodynamic limit as well as the examination of computational performance.

The spectral function of a one-particle operator is non-zero only in the vicinity of the groundstate energy , up to a distance of the order of the single-particle bandwidth . The rescaled spectral function is non-zero up to a distance of from . For all rescaling parameters that have been proposed up to now,(Weiße et al., 2006; Holzner et al., 2011; Ganahl et al., 2014a) one has . Usually is much smaller than the upper bound . As is well approximated by its linear term already for , Chebyshev polynomials (6) behave like a shifted cosine function in the region where is non-zero. The expansion of in Chebyshev polynomials is therefore essentially equivalent to a Fourier expansion. This means that the iteration number of the Chebyshev expansion has the same meaning as a discrete propagation time, the evolution of which is mediated by simple applications of instead of the ordinary continuous time propagation . To answer the question of whether an ordinary time evolution(Ganahl et al., 2014b) is more effective in generating information about the spectral function, one has to study the entanglement entropy production of repeated applications of compared to the one of . The following results are first steps in this direction.

In discrete time evolution, the rescaling of the frequency directly translates to an inverse scaling of time. Considering two calculations of Chebyshev moments, one for performed with and another for performed with , one therefore has the simple approximate relation

(22)

This means that if rescaling with , one has to compute times more Chebyshev moments than in the case without rescaling. An exact version of statement (IV.1) is given in (32) in Appendix A. Fig. 2(a) illustrates the scaling property (IV.1) for a system of fixed size.

Figure 2: (Color online) Panel (a): Chebyshev moments vs for fixed system size and different values of and . Except for a different total number of points, the rescaled moments all lie on the line obtained when and becomes continuous. Here, we study the half-filled SIAM (Appendix C.1) with semi-elliptic density of states of half-bandwidth and , represented on a chain with length . The full many-body bandwidth is . Panel (b): Chebyshev moments for different system sizes . Except for the system size and the scaling parameter, parameters are the same as in panel (a). Here all calculations were done with a rescaling constant of . For low values of , the results for different system sizes are virtually indistinguishable. For higher values of , moments start to disagree as finite-size features start to be resolved. The and the results would be indistinguishable in this plot.

Extracting the thermodynamic limit. One direct application of the scaling property (IV.1), lies in the study of the thermodynamic limit by comparing systems of increasing size . For low values of , even small systems have the same Chebyshev moments as in the thermodynamic limit. Finite-size features are averaged out in the integral (9b) as long as oscillates slowly enough. oscillates times on . An th order Chebyshev expansion therefore resolves features on the scale , which on the original energy scale is . Finite-size oscillations appear at a spacing of , where is the single-particle bandwidth. Equating resolution with the spacing of finite-size oscillations

(23)

gives the expansion order at which finite-size features are first resolved. Fig. 2(b) illustrates these statements by comparing Chebyshev moments computed for different system sizes.

Optimizing computation time. Fig. 3 shows how computation time depends on the rescaling constant for the example of the moments shown in Fig. 2(a). As already qualitatively stated previously (Holzner et al., 2011; Ganahl et al., 2014a), one observes that upon using a lower value of computation time is reduced. In all cases, computation time diverges exponentially (Fig. 2(b)). Note that rescaling with a higher value of allows to compute at smaller matrix dimensions. Note further that if choosing too small, numerical errors can render the recursion (14a) unstable. In contrast to common belief, it is possible to use much smaller values of than the full many-body bandwidth. Achieving even smaller values of can be done with the so-called energy truncation(Holzner et al., 2011), but after several tests, we did not find this to lead to an effective speed-up of calculations. We therefore discard it in our calculations as a source of additional tuning parameters. We have also tested the idea of Ganahl et al. (2014a) to map the spectrum of into via . The idea might be worth to study in more detail, but again, we could not gain any performance improvement over a simple rescaling procedure.

Figure 3: (Color online) Performance of the adaptive matrix dimension algorithm (Sec. III.1) for the example described in the caption of Fig. 2. Panel (a): Adaption of matrix dimensions for different rescaling factors, fixing a truncation error of . Panel (b): Computer time needed to generate the same amount of information for different scalings running on a single-core 2.0GHz workstation. Solid lines: fixing a truncation error of . Dashed lines: . The iteration number where the irregular behavior of the dashed line for starts corresponds to the point where numerical errors render the Chebyshev recursion unstable. Note that while small leads to the largest matrix sizes, which is costly in MPS, the overall cost of CPU time nevertheless is lowest, as a smaller expansion order is needed.

iv.2 Shifting by .

The choice in (11) makes an analytic expression of the complete spectral function in terms of a single Chebyshev expansion impossible, but has beneficial effects on the computation time. This is to be understood in the following sense: Due to the increased oscillation frequency of close to the interval boundaries of , the integral (9a) extracts much more information about the spectral function in the vicinity of these boundaries. This is reflected e.g. in the fact that the width of the Gaussian obtained by the kernel polynomial expansion approaches zero close the interval boundaries of (see the discussion below (10b)). It is therefore desirable to shift the relevant part of the spectral function, the part slightly above the Fermi edge, to match the left boundary . This is achieved by the choice . In practice, one adds a small correction , , to avoid problems with the diverging weight function in (8b).

Another advantage of the setup is that one can use a smaller scaling constant than in the setup. The Chebyshev iteration becomes unstable when the iteration number becomes so high that has accumulated erroneous contributions from eigen states with eigen energies . For fixed , the additional subtraction in the setup ensures that the instability appears for a higher iteration number than in the setup. Therefore, the setup allows smaller values of . We finally note that the choice is equivalent to the choice suggested by Weiße et al. (2006), if one rescales with the full many-body bandwidth . In this case, the computation can be carried out to arbitrarily high order and will never become unstable. In the setup, one would have to choose to reach arbitrarily high expansion orders.

Figure 4: (Color online) Local particle density of states of the half-filled SIAM (Appendix C.1) with semi-elliptic density of states of half-bandwidth . , and in all cases. Panel (a): Chebyshev moments. Lines connect every 4th moment and by that reveal the relevant slow oscillation. They are a guide to the eye. Panel (b): Corresponding spectral functions evaluated using Jackson damping (10b). The calculation requires three times more iterations than the calculation to resolve the right Hubbard peak with the same resolution. In this case the central peak is still much better resolved for .

In Fig. 4(a), we plot Chebyshev moments for both types of shifts and . The moments obtained for show a slow structureless oscillation whereas the moments obtained for show a much faster oscillation. Fig. 4(b) shows that upon using the same rescaling constant and the same expansion order , which leads to very similar entanglement growth, both shift types differ strongly in the achieved resolution. To resolve at least the right Hubbard peak with a calculation at the resolution of calculation, one needs moments. As computation time increases exponentially (Fig. 3(b)) with respect to expansion order in both cases, this difference is highly relevant.

We apply both setups, and , to the benchmark test of the DCA in Sec. VI.1, and find a significant speed-up for at a small loss in accuracy. Previously,(Ganahl et al., 2014a) only has been considered for the solution of the DMFT.

V Post-processing moments

Whereas Jackson damping (10b) can be seen as one possibility to post-process Chebyshev moments in order to achieve uniform convergence even for the truncated Chebyshev expansion of a delta function, there is another, fundamentally different approach.

The computation of the Chebyshev moments becomes very costly for high iteration numbers. In the case in which Chebyshev moments start to follow a regular pattern when exceeds a certain threshold, it is possible to continue this pattern to infinity, and one can avoid the costly computation of moments. Consider the typical example in which the spectral function is a superposition of Lorentzians (quasiparticle peaks) and of a slowly varying background density. As for low values of , extracts information via (9b) only about the slowly varying background density, while for high values of , extracts information only about the sharp and regular Lorentzian structures, starts to follow a regular pattern for high numbers of . For a sum of Lorentzians, with weights , widths , and positions , this pattern can be obtained analytically:

(24)

as shown in Appendix B. If one recalls (Sec. IV) that the Chebyshev recursion corresponds to a discrete time evolution if choosing , the result of (24) could have been anticipated.

Fig. 5(a) shows the spectral density for a SIAM together with a fitted superposition of three Lorentzians. Their difference corresponds to a background density that is composed of either slowly varying features or features with negligible weight. Fig. 5(b) shows the corresponding Chebyshev moments. The slowly varying background density only contributes for the first 200 moments. After that, the Chebyshev moments for the superposition of Lorentzians starts to be a very good approximation to the original moments, and it seems unnecessary to compute more than about 400 moments. For , one can simply fit the analytical expression (24) to the original data. Using the analytical expression with the fitted parameters, one can then continue the Chebyshev moments to infinity.

Figure 5: (Color online) Panel (a): for a semi-elliptic density of states, half filling and (Appendix C.1). Quantities are shown in units of the full many-body bandwidth . The superposition of three Lorentz peaks has been fitted to . Panel (b): Corresponding Chebyshev moments. The result presented here was obtained with a fermionic chain and CheMPS. It agrees with the result of Raas et al. (2004), see Appendix C.1. The legend in panel (a) is valid also for panel (b).

Fitting (24) to the data between iterations 200 and 400 is a nonlinear optimization problem, which can easily be solved numerically. Still, there exists a linear reformulation of this optimization problem, coined under the name linear prediction (Press et al., 2007). The linear problem can be analytically reformulated as a matrix inversion problem. Its solution is faster and more stable than that of the original non-linear problem. This allows in principle to optimize a superposition of many more Lorentzians than in the non-linear case.

Linear prediction

In the context of time evolution linear prediction has been long established in the DMRG community,(White and Affleck, 2008; Barthel et al., 2009) but it has only recently been applied to the computation of Chebyshev moments.(Ganahl et al., 2014a) The optimization problem for the sequence becomes linear, if the sequence can be defined recursively

(25)

which is easily found to be equivalent to (24)(Barthel et al., 2009). The strategy is then as follows. Compute Chebyshev moments, and predict moments for higher values of using (25). The coefficients are optimized by minimizing the least-square error for a subset of the computed data. We confirmed to be a robust choice,(Barthel et al., 2009; Ganahl et al., 2014a) small enough to go beyond spurious short-time behavior and large enough to have a good statistics for the fit. Minimization yields

(26)

We found that linear prediction loses its favorable filter properties if choosing to be very high. Therefore one should restrict the number of Lorentzians to . Furthermore, one adds a small constant to the diagonal of in order to enable the inversion of the singular matrix . Defining(Barthel et al., 2009)

one obtains the predicted moments , where . The matrix usually has eigenvalues with absolute value larger than , either due to numerical inaccuracies or due to the fact that linear prediction cannot be applied as rather increases than decreases on the training subset . In order to obtain a convergent prediction, we set the weights that correspond to these eigenvalues to zero measuring the ratio of the associated discarded weight compared to the total weight. If this ratio is higher than a few percent, we conclude that linear prediction cannot yet be applied and restart the Chebyshev calculation to increase the number of computed moments .

Failure of linear prediction

It is not a priori clear that the spectral function can be well approximated by a superposition of Lorentzians, although this is true for the SIAM as shown in Fig. 5. Other types of smooth functions lead to a different functional dependence of the moments on than the exponentially damped behavior. Close to phase transitions, e.g. one might find an algebraic decay in the time evolution, corresponding to an algebraic decay in the Chebyshev moments. If the spectral function has rather Gaussian shaped peaks, the decrease of Chebyshev moments is (Appendix B). For both scenarios, linear prediction is a non-controlled extrapolation scheme. It still extracts oscillation frequencies (peak positions) with high reliability, but predicts a wrong decrease of the envelope, which often leads to an overestimation of peak weights.

In practice it turns out that a combination of damping with a Jackson kernel (Kernel Polynomial Method) and linear prediction is a powerful way to get controlled estimates for the spectral function. While damping always underestimates peak heights, linear prediction typically overestimates peak heights. Both methods trivially converge to the exact result, when . One therefore obtains upper and lower bounds for the spectral function. This is particularly valuable in the DMFT as overestimated (diverging) peak heights can spoil convergence of the DMFT loop.

A historically much used alternative to linear prediction, suitable for arbitrary forms of the spectral function, is an extrapolation of Chebyshev moments using maximum entropy methods (Silver and Röder, 1997). These suffer from severe numerical instabilities, though. Of course, one might also think of fitting another ansatz than the one of the exponential decrease. As it is a priori not clear which ansatz should be better, it is meaningful to stick to the easily implemented linear prediction that is moreover known to be applicable for the description of quasi-particle features.

Vi Results for DMFT calculations with two baths

vi.1 Results for two-site DCA (VBDMFT)

In order to benchmark the Chebyshev technique for a two-bath situation, which goes beyond previous work(Ganahl et al., 2014a) (see Appendix C), we study the Hubbard model on the two dimensional square lattice

(27)

in a two-site dynamical cluster approximation(Maier et al., 2005) (DCA) developed by Ferrero et al. (2009). This so-called valence bond DMFT (VBDMFT) is a minimal description of the normal phase of the high-temperature superconductors, using a minimal two patches DCA cluster. It leads to a simple physical picture of the pseudogap phase in terms of a selective Mott transition in the momentum space. We choose this model here as a benchmark since its solution contains low energy features in the spectral functions (pseudogap), which have required high-precision QMC computations followed by a careful Padé analytic continuation. Moreover, real-frequency computations are very important for the comparison with experiments that measure e.g. the optical conductivity along c-axis.(Ferrero et al., 2010) It is therefore a non-trivial case where DMRG impurity solvers would bring significant improvements over the QMC in practice.

To set up the VBDMFT, one splits the Brioullin zone into a central patch , where , and a border patch . In the DCA, the -dependence of the self-energy within each patch is neglected and one computes a Green’s function for a patch by averaging over all vectors in the patch

(28a)
(28b)

Representing the non-interacting baths in a chain-geometry, and taking the two impurities to be the first of two chains , the model Hamiltonian that needs to be solved is

(29)

where and the term accounts for high-frequency contributions of the hybridization function (see Appendix D.4).

The -space interaction term in (29) arises when diagonalizing the hybridization function of a real-space two-site cluster , where are annihilation operators for the cluster sites in real-space, and for the cluster sites in space. In real-space, the interaction is a simple Hubbard expression, but then the hybridization function is non-diagonal. A diagonal hybridization function, which leads to two uncoupled baths for the patches and by that allows a simple chain-geometry for the whole system, is therefore only possible in -space. The more complex form of the interaction in -space does not affect the efficiency of DMRG.

We iteratively solve the self-consistency equation obtained by inserting the self-energy estimates of the impurity model (29) into the lattice Green functions (28a). We do that on the real-energy axis with an unbiased energy resolution. The details of this calculation are described in Appendix D.

In Fig. 6(a) and (b), we compare our CheMPS results for the spectral densities of the two momentum patches with those of Ferrero et al. (2009) obtained using CTQMC and analytical continuation. We observe a good overall agreement between the two methods, in particular at low frequencies. Low energy features (pseudogap), in particular in , are well reproduced by both methods. At high energy (Hubbard bands) however, there are some differences between QMC and CheMPS (and also between the two variants of CheMPS). This is to be expected since the Padé analytic continuation technique used on the QMC data in Ref. Ferrero et al., 2009 is not a precision method at high energy.

Figure 6: (Color online) Spectral functions (a,b) and Green’s functions on the imaginary axis (c,d) within VBDMFT (Ferrero et al., 2009) for and . We compare our zero-temperature CheMPS results (solid lines) with CTQMC data for (dashed lines) from Ferrero et al. (2009). For this computation, we used the setup, a chain length of per bath, a truncation error of , and , .
Figure 7: (Color online) The same comparison as in Fig. 6. For this computation, we used the setup, a chain length of per bath, a truncation error of , and and . For the setup, one can use a smaller value of as in the setup, as discussed in Sec. IV.

In Fig. 6(c) and (d), we do the analogous comparison on the imaginary axis, and find much better agreement. On the imaginary axis, the QMC results can be considered numerically exact. The very low temperature () used for QMC should yield results that are indistinguishable from a zero-temperature calculation. The slight disagreement of our data and the QMC data on the Matsubara axis could probably be removed if we were able to reach higher expansion orders. One DMFT iteration for the presented calculation took around \SI5h running on four cores with \SI2.5GHz. Convergence is achieved after 10 iterations starting from the non-interacting solution. The calculation has been carried out with two attached chains of lattice sites each. We did not observe changes for higher bath sizes up to , but could not reach high enough expansion orders for chains longer than . We computed moments using a scaling constant , which corresponds to the full bandwidth.

The calculation can be accelerated significantly by using the setup of Sec. IV.2 and avoiding linear prediction. This leads to the same quality of agreement with QMC on the Matsubara axis, but on the real axis, peaks are a bit less pronounced while the pseudogap is still well resolved (Fig. 7). While the study of systems with higher bath sizes increases the computational cost tremendously in the setup, we could easily go to within the setup. This did not change the results. Computation times varied from \SI1.2h per iteration for , over \SI3h for to around \SI10h for the calculation. We computed moments using a scaling of in all cases.

vi.2 Single-site two-orbital DMFT

In the following, we apply CheMPS to the DMFT treatment of the two-orbital Hubbard model

(30)

on the Bethe lattice. We study a parameter regime close to the Metal-Insulator phase transition. This regime is computationally particularly expensive and we had to use a logarithmic discretization to reach Chebyshev expansion orders at which spectral functions are completely converged with respect to expansion order and system size. The linear discretization was feasible in the case of the VBDMFT studied in the previous section, as there, we faced a smaller entanglement entropy production during Chebyshev iterations.

Figure 8: (Color online) Spectral function for the two-band Hubbard model. Panel (a): , (half filling). Panel (b) , (quarter filling). In both cases , . We fixed a truncation error , used a scaling , computed moments and used linear prediction. To represent the two baths, we used two chains of length each, obtained with a logarithmic discretization parameter of , leading to grid energies (see e.g. Ref. Bulla et al., 2008). The NRG calculation was done for temperature , the QMC calculation for . Both should be almost indistinguishable from a calculation. NRG data from K. Stadler(Stadler, 2013) computed with a code of A. Weichselbaum.(Weichselbaum, 2012) QMC data from M. Ferrero.(Ferrero, )

Using a logarithmic discretization is not necessary for CheMPS. But as it leads to exponentially decaying hopping constants, it gives rise to three advantages: (i) One can use smaller scaling constants as the many-body bandwidth is considerably reduced due to the exponentially small value of most hopping constants in the system. (ii) One faces a smaller entanglement entropy production: at the edges of the bath chains (far away from the impurity), hopping constants are exponentially small, and application of therefore creates much less entanglement than in the case in which a linear discretization is used. In (14a), the action of on is then only a small perturbation for most parts of the system, and the recursion is therefore dominated by the second term . Entanglement therefore builds up only in the region where it is relevant, that is, in the vicinity of the impurity. Hence, matrix dimensions grow considerably more slowly when using a logarithmic discretization as compared to a linear discretization. (iii) One faces a faster speed of convergence of the Chebyshev moments as in the linear case: The complexity of the spectral function is considerably reduced when averaging over possible peaks in the high-energy structure of the spectral function, as is done when using a logarithmic grid. The associated Chebyshev expansion therefore converges more quickly than in the case of a linear grid.

When using a logarithmic discretization, one has to convolute the resulting spectral function with a Gaussian(Weichselbaum and von Delft, 2007) to average over the finite-size features that originate from the coarse log resolution at high energies.

Figure 9: (Color online) Spectral function for the two-band Hubbard model. The system parameters , , , are very similar to the one in Fig. 8(a). We performed a calculation with linear (“lin”, per bath) and one with logarithmic discretization (“log”, per bath). We fixed a truncation error . For the calculation with logarithmic discretization, we used a scaling and computed moments. For the calculation with linear discretization, we used a scaling of and computed moments. We used linear prediction in all cases. The logarithmic discretization used a discretization parameter , leading to grid energies (see e.g. Ref. Bulla et al., 2008).

In Fig. 8, we compare exemplary calculations for the two-band Hubbard model with NRG and analytically continued QMC data. We find good agreement in the regions around the Fermi energy, where the pinning criterion is respected to high accuracy without being enforced. We explain the observed disagreement far away from the Fermi energy with a different specific implementation of the broadening convolution. One DMFT iteration for our calculations took around \SI20min running on two \SI2.5GHz cores.

In Fig. 9, we study the case of Ref. Ganahl et al., 2014b, which is very similar to the one studied in Fig. 8(a). Our results suggest that the data shown in Ref. Ganahl et al., 2014b is not fully converged with respect to computed time in tDMRG, as it does not fulfill the pinning criterion. We face a similar problem when using a linear discretization: For the reachable Chebyshev expansion orders, we do not observe convergence of the central peak height for increasing expansion orders. All peaks, side peaks as well as central peak, increase for increasing expansion order and the pinning criterion is not fulfilled. The additional structure in the Hubbard band, which is not visible in the calculation with the logarithmic discretization, is seen to be similar to the one observed in Ref. Ganahl et al., 2014b. One DMFT iteration for the computation that uses a logarithmic grid took \SI20min running on two \SI2.5GHz cores. For the linear grid this time was \SI10h per DMFT iteration.

Figure 10: (Color online) Results for the spectral densities in the two orbitals for . Our results are for , , (half filling) and depicted by the solid lines. The reference NRG results(Greger et al., 2013) are for , and depicted by the dashed lines. We had to choose a slightly smaller interaction for a meaningful comparison, as for the parameters of Greger et al. (2013), we converged, though very slowly, into an insulating solution without central peak. The non-interacting single-particle half-bandwidth of the first band is , and the one of the second band is . We used two chains of length each, and a logarithmic discretization parameter of , leading to grid energies (see e.g. Ref. Bulla et al., 2008).

Finally, we study parameters that lead to a system close to the metal-insulator phase transition. Fig. 10 shows that we obtain satisfactory agreement with NRG data, given the fact that we had to reduce the interaction slightly in order to stay in the metallic phase. This slight quantitative mismatch can possibly again be explained with a differing broadening convolutions in the two calculations. One DMFT iteration took \SI2h for the calculation of Fig. 10, when fixing a truncated weight of .

Vii Conclusions

We solved several DMFT problems with two baths on the real frequency axis with unbiased energy-resolution based on an DMRG impurity solver using Chebyshev polynomials for the representation of spectral functions at moderate numerical effort. DMRG is thereby seen to be a viable alternative for DMFT impurity solvers also beyond the well-understood single-impurity single-band case.

Technically, it was crucial to apply the adaptive truncation scheme of Sec. III to maintain a modest numerical effort: in all cases, the new scheme gave much better results than the previously employed scheme based on fixed matrix dimensions. Another important way of tuning the calculation is provided by the mapping of the spectrum to the convergence interval of Chebyshev polynomials: The different options to set up a CheMPS calculation can be summarized to yield two alternatives. (i) One uses the setup and post-processes moments with linear prediction. (ii) One uses the setup and avoids linear prediction, using simple Jackson damping. Depending on the problem, the first or the second method can be more efficient. The second alternative is computationally much more efficient for cases in which linear prediction is a non-controlled extrapolation scheme, but has problems to resolve sharp peaks at the Fermi edge.

The method presented in this paper can in principle be extended to the case of more than two baths without major changes to the DMFT-DMRG interface and the Chebyshev-based impurity solver as such. However, while two baths can still be modeled by a single chain with the impurity at the center (instead of at the end, as in single-band DMFT), this is no longer possible for three and more baths. This will necessitate a new setup of the DMRG calculation replacing the chain-like by a star-like geometry with the impurity at the center of the star, hence a generalization from a matrix-based to a tensor-based representation at the location of the impurity. It remains to be seen at which numerical cost reliable results on the real frequency axis will be obtainable.

Viii Acknowledgements

FAW acknowledges discussions with C. Hubig and K. Stadler. US acknowledges discussions with M. Ganahl and H.-G. Evertz. FAW and US acknowledge discussions with P. Werner and support by the research unit FOR 1807 of the DFG. O. P. acknowledges support from the ERC Starting Grant 278472–MottMetals. We acknowledge K. Stadler and M. Ferrero for providing the data of their NRG and QMC calculations.

Appendix A Scaling of Chebyshev Moments with respect to energy scaling

The Chebyshev moments obtained by using two different scalings and are from (13) and . As we consider one-particle operators the weights fulfill

(31)

where is the single-particle bandwidth. If the scalings are chosen large enough, , then

(32)

Proof: If these requirements are met, the eigenvalues with are close to the groundstate energy: . The Taylor expansion becomes reliable already when , which is fulfilled if is at least twice the single-particle bandwidth as in all hitherto known applications(Weiße et al., 2006; Holzner et al., 2011; Ganahl et al., 2014a).

Consider a particular energy for which . It holds

A sufficient condition for the last line to hold is that both and are multiples of 2, i.e. the statement of (32).

Appendix B Chebyshev Moments of Lorentzian and Gaussian

If we fix the shift to be , equation (24) is obtained as follows. As we only have to compute the moments for a single Lorentzian, which allows to drop the index