# Efficient numerical solver for first-principles transport calculation based on real-space finite-difference method

## Abstract

We propose an efficient procedure to obtain Green’s functions by combining the shifted conjugate orthogonal conjugate gradient (shifted COCG) method with the nonequilibrium Green’s function (NEGF) method based on a real-space finite-difference (RSFD) approach. The bottleneck of the computation in the NEGF scheme is matrix inversion of the Hamiltonian including the self-energy terms of electrodes to obtain perturbed Green’s function in the transition region. This procedure first computes unperturbed Green’s functions and calculates perturbed Green’s functions from the unperturbed ones using a mathematically strict relation. Since the matrices to be inverted to obtain the unperturbed Green’s functions are sparse, complex-symmetric and shifted for a given set of sampling energy points, we can use the shifted COCG method, in which once the Green’s function for a reference energy point has been calculated, the Green’s functions for the other energy points can be obtained with a moderate computational cost. We calculate the transport properties of a C@(10,10) carbon nanotube (CNT) peapod suspended by (10,10)CNTs as an example of a large-scale transport calculation. The proposed scheme opens the possibility of performing large-scale RSFD-NEGF transport calculations using massively parallel computers without the loss of accuracy originating from the incompleteness of the localized basis set.

###### pacs:

## I Introduction

Since the discovery of quantized conductance in atomic-size contacts, there has been considerable interest in studying the electronic transport in nanoscale systems. So far, a number of calculation methods to investigate the electron transport properties through a nanostructure embedded by semi-infinite electrodes have been proposed such as the nonequilibrium Green’s function (NEGF) method (1), recursion transfer matrix method (2), Lippmann-Schwinger method (3) and overbridging boundary-matching (OBM) method (5); (4). Among them, the NEGF approach is most commonly used in connection with tight-binding models and first-principles methods for solving transport problems. In the NEGF formalism, localized basis sets consisting of either atomic orbitals or Gaussians are extensively utilized because it is straightforward to partition a whole system into a transition region and electrodes. However, the incompleteness of the basis sets sometimes degrades the accuracy (6) and the diffuse functions in the basis sets prevent us from achieving high-performance computing on massively parallel computers.

On the other hand, the calculation method using the real-space finite-difference (RSFD) approach (5); (7); (8) does not suffer from these problems. However, the linear-equation solver used to obtain the perturbed Green’s functions of a transition region becomes computationally intensive as the numbers of grid points and sampling energy points increase. More efficient linear-equation solvers are required to perform large-scale first-principles transport calculations by the NEGF approach using the RSFD scheme.

In this study, we propose an efficient scheme to compute perturbed Green’s functions that combines the procedure developed in the OBM method and the shifted conjugate orthogonal conjugate gradient (COCG) algorithm (9); (10). In the NEGF method, the matrix to be inverted includes the self-energy terms of electrodes, which are energy-dependent and destroy the shift invariance of the Krylov subspace with respect to the energy of electrons. The total computational cost of obtaining perturbed Green’s functions is proportional to the number of energy points. On the other hand, unperturbed Green’s functions are first calculated by the OBM method and then the effect of the electrodes is introduced in the wavefunction-matching procedure. Since the matrix to be inverted does not include nonlinear energy dependent terms, the unperturbed Green’s functions are obtained by solving a set of shifted linear equations with respect to the energy. The shifted linear equations can be solved by the shifted COCG method, in which the calculation is actually carried out only for a reference energy point and the solutions for the other shifted energy points are obtained with a moderate computational cost owing to the shift invariance of the Krylov subspace. Employing the mathematically strict relationship between perturbed and unperturbed Green’s functions developed in the OBM method (11), one can greatly decrease the computational cost of the linear-equation solver.

The rest of this paper is organized as follows. In Sec. II, the NEGF method using the RSFD approach is briefly introduced. In Sec. III, we state the problem of obtaining perturbed Green’s functions and introduce the computational scheme combining the procedure developed in the OBM method and the shifted COCG algorithm together with a numerical example. In Sec. IV, to demonstrate the efficiency and applicability of the newly developed method, we present a transport calculation of a C@(10,10) carbon nanotube (CNT) peapod connected to CNT electrodes. Finally, we summarize our work in Sec. V.

## Ii Computational scheme

### ii.1 Hamiltonian of whole system

We consider ballistic transport through a nanostructure sandwiched between two semi-infinite crystalline electrodes as shown in Fig. 1. The periodic boundary condition is applied in the and directions, while the semi-infinite boundary condition is imposed in the direction. The system under investigation can be divided into three regions: the left electrode (L), the transition region (T) and the right electrode (R). By introducing localized basis sets or real-space grids, we can decompose a whole Hamiltonian into the following matrix form:

(1) |

where , and are the Hamiltonian matrices for the left electrode, transition region and right electrode, respectively. () is the coupling term between the transition region and the left (right) electrode. Because of electronic screening effects, the coupling between the left and right electrodes is zero when the transition region is sufficiently large.

We next derive the Kohn-Sham (KS) Hamiltonian (12) for the RSFD scheme. Since the direction of current flow is assumed to be along the axis, the KS Hamiltonian is composed of the block-matrix elements with a dimension of , where and are the numbers of grid points in the and directions, respectively. Wavefunction values on the - plane at the point are expressed by the column vector with components given by {}. To simplify the explanation, we here restrict ourselves to the case of the central finite-difference approximation, i.e., in Eq. (1) of Ref. (7), and local pseudopotentials. The extensions to higher-order finite-difference approximations and the inclusion of nonlocal parts of pseudopotentials are given in Ref. (5). The KS equation for a whole system is expressed as a relation between three adjacent terms along the direction

(2) |

where is the energy of an electron, is an -dimensional block tridiagonal matrix including the potentials on the - plane at and the coefficients of the finite-difference approximation, and is a constant matrix that is proportional to the -dimensional unit matrix . The discretized KS equation of the whole system is written as

(3) |

where is the block tridiagonal matrix

(4) |

with . The partitioning in Eq. (4) corresponds to those in Fig. 1 and Eq. (1). Since the KS Hamiltonian matrix for the RSFD scheme is very sparse and block tridiagonal, we can employ efficient iterative algorithms such as the steepest descent (SD) or conjugate gradient (CG) method to solve the eigenvalue and matrix inversion problems.

### ii.2 NEGF method using RSFD approach

This subsection gives a brief explanation of the NEGF method for later convenience. The NEGF method gives the electron transport properties without calculating the scattering wavefunction explicitly. The Green’s function of a whole system can be defined as the resolvent of the Hamiltonian matrix of Eq. (4),

(5) |

where () is a complex energy with being a positive infinitesimal. The important quantity determining electron transport properties is the perturbed Green’s function of the transition region written as

(6) |

where

(7) |

are the self-energy terms of the left and right electrodes, respectively. Here

(8) |

are the surface Green’s functions of the semi-infinite left and right electrodes, respectively. In the RSFD-NEGF scheme, since () has only one nonzero -dimensional block-matrix element [see Eq. (4)], the self-energy terms are found to take the very simple forms of

(9) |

and

(10) |

where

(11) |

and

(12) |

with being the -dimensional block-matrix element of . The perturbed Green’s function of the transition region can be obtained by inverting the finite-size matrix given as

(13) |

where .

The expression for the conductance at a real energy is given by the Fisher-Lee formula (13):

(14) |

where is the retarded Green’s function of the transition region defined as

(15) |

is the advanced Green’s function, and is the coupling matrix calculated from the imaginary part of the self-energy term,

(16) |

## Iii Efficient procedure to obtain perturbed Green’s function

### iii.1 Difficulties in obtaining perturbed Green’s function by iterative method

The simplest way to obtain is the direct inversion of the matrix given by Eq. (13). However, direct matrix inversion becomes impractical when the matrix size increases. In addition, iterative solvers for linear equations, such as the SD and CG methods, are not efficient in this case because the matrix to be inverted given by Eq. (13) is neither Hermitian nor very sparse owing to the self-energy terms. In addition, the a computation of , which is carried out independently at each energy, is a computationally demanding task. To circumvent these difficulties, we propose a computational procedure employing the unperturbed Green’s function used in the OBM method (4); (5) and the shifted COCG algorithm (9); (10) that uses the collinear residual theorems (14) in the following subsection.

### iii.2 Unperturbed Green’s function and shifted COCG method

The unperturbed Green’s function of the transition is written as

(17) |

The relationship between the unperturbed and perturbed Green’s functions is expressed as the following equation, which was derived in Ref. (11).

(18) |

For and ,

(19) | |||||

(20) | |||||

(21) | |||||

and

(22) |

where is the modified under the influence of the self-energy terms, and , which is expressed as

(23) | |||||

and

(24) | |||||

The advantage of computing the unperturbed Green’s function is that the matrix inverted in Eq. (17), , is very sparse and complex symmetric. Moreover, are shifted matrices for a given set of energy points , which is in contrast to in Eq. (6). Therefore, shifted methods such as the shifted COCG method (9) can be applied to obtain unperturbed Green’s functions.

To solve the complex-symmetric linear equations , where denotes the th unit vector, by the conventional COCG method (15), we build up the Krylov subspaces independently at each complex energy , which requires time-consuming matrix-vector operations for each energy. On the other hand, the shifted COCG method makes use of the fact that Krylov subspaces are shift-invariant; to solve the linear equations for a given set of sampling energy points, the spanned Krylov subspace is common to all the energy points. Once the Krylov subspace has been built up for the reference energy point, we can obtain the Green’s functions for all the sampling energy points without matrix-vector operations during the iteration steps.

It is noteworthy that we do not need to store the whole matrices of the unperturbed Green’s functions in memory. To calculate the transport properties by Eq. (14), the block-matrix elements of the perturbed Green’s functions, and , are required. In addition, the density of states is relevant to the diagonal elements of the perturbed Green’s functions, . The evaluation of requires the additional computation of the off-diagonal block elements and , which are analytically determined in an analogous manner to the derivation of Eqs. (19)-(22) from Eq. (18). It is sufficient to store a couple of block-matrix elements of the unperturbed Green’s functions so as to perform matrix-matrix operations between the -dimensional block-matrix elements of the Green’s functions and the self-energy terms. Thus, we can obtain the perturbed Green’s functions with a moderate computation time and memory load.

There is no guarantee that the residual norms of all the sampling energy points satisfy the convergence criterion when that of the reference energy point is sufficiently small. Because the iterations are performed until all the residual norms satisfy the convergence criterion, the total number of iterations increases, resulting in the slight increase of CPU time in the shifted COCG method with respect to the number of sampling energy points. In addition, when the norm of the residual vector of the seed system becomes small, the numerical precision of the residual vectors for the sampling energy points degrades. The seed switching technique (10), which switches the reference energy point to the point where the residual norm is largest among the sampling energy points after the solution of the reference energy point converges, enables us to build up the residual vectors for the sampling energy points without loss of numerical precision and significant increase of the computational cost. The detail of the seed switching technique is introduced in Appendix.

### iii.3 Numerical test

To demonstrate the efficiency of the present scheme, we apply the shifted COCG method to the calculation of the unperturbed Green’s functions of the transition region for a Na atomic wire. The valence electron-ion interaction is described using norm-conserving pseudopotentials (16) generated by the scheme proposed by Troullier and Martins (17). Exchange-correlation effects are treated within the local density approximation (18) of density functional theory (19); (12). The central finite-difference approximation is adopted for the second-order derivation arising from the kinetic energy operator in the KS equation. The dimensions of the supercell are bohr and a single Na atom is contained in the supercell. The real-space grid spacing is chosen to be bohr, which corresponds to a total Hamiltonian dimension of 12800. All calculations are carried out on a workstation with an Intel Xeon E5-2667v2 CPU.

Figure 2 shows the CPU time versus the number of sampling energy points for the shifted COCG and conventional COCG methods. The initial reference energy point is chosen to be the Fermi energy. For the conventional COCG method, the CPU time is proportional to the number of energy points because the Green’s functions are calculated independently at each energy point. On the other hand, in the shifted COCG method, the matrix-vector operations are carried out only at the reference energy point and the cost of the scalar-vector operations to update the approximate solutions for all the shifted energy points is negligibly small. The increase in the computational time is attributed to the time-consuming matrix-vector operations in the additional iteration steps to reach global convergence.

## Iv Application

We next compute the conductance spectrum of a C@(10,10)CNT peapod, in which a (10,10)CNT encapsulates a C molecule, as an application of the proposed method to a large system. Figure 3 shows the computational model. This system has been observed by transmission electron microscopy (20) and its electronic structure has been intensively studied by theoretical calculations (21); (22); (23). However, the transport properties of peapods have mostly been investigated using the tight-binding approximation (23) because of the large system size required for calculations.

The size of the transition region is taken to be bohr and the transition region contains 340 carbon atoms. The initial carbon-carbon distance is set to 2.68 bohr (22) and structural relaxation is carried out. To determine the KS effective potential for the transport calculation, the conventional supercell indicated by the rectangle in Fig. 3 is used, where the periodic boundary condition in imposed in all directions, and the grid spacing is bohr, which gives a matrix of for the Hamiltonian of the transition region. The number of sampling energy points in the transport calculation is 101. Calculations are implemented using 64 Intel Xeon E5-2697v2 CPUs. The total elapsed time required to calculate the unperturbed Green’s functions for 101 energy points is 15294 second, in which the average number of iterations required to satisfy the convergence criterion is 6713, while the elapsed time and the average number of iterations are 6029 second and 3122, respectively, when the unperturbed Green’s function is only calculated for the Fermi energy.

The electronic band structures of the (10,10)CNT and C@(10,10)CNT peapod are shown in Figs. 4(a) and 4(b), respectively. In the band structure of the C@(10,10)CNT peapod, there are three flat bands above the Fermi level, which originate from the orbitals of the isolated C molecule. Owing to the hybridization between the orbitals of the C molecule and orbitals of the (10,10)CNT, the bands of the (10,10)CNT are divided by the bands originating from the orbitals.

The conductance spectrum of the C@(10,10)CNT is plotted in Fig. 5. Owing to the use of the shifted COCG method, we can sample a large number of energy points so as to identify the spiky dips in the spectrum. It is found that three dips appear near the energy range of the flat bands. Figure 6 shows a real-space picture of the charge densities of scattering wavefunctions, which is useful for understanding the spatial behaviors of the transport phenomenon. The charge density distribution with the energy at spreads around the (10,10)CNT, while that with the energy at the dip of the conductance spectrum accumulates in the vicinity of the C molecule. These results imply the incident electrons are scattered by the orbitals of the encapsulated C molecule. Since peapods have recently become experimentally accessible structures, this information should be helpful in understanding and predicting the resonant scattering in experimental transport measurements (24).

## V Conclusion

An efficient scheme to compute perturbed Green’s functions, which combines the procedure developed in the OBM method based on the RSFD approach and the shifted COCG algorithm, has been proposed. Since the number of grid points in the RSFD approach is larger than the number of bases in the tight-binding approach, the computational cost of obtaining the perturbed Green’s functions of the transition region has been the bottleneck. In this procedure, unperturbed Green’s functions are calculated and perturbed Green’s functions are obtained by employing the mathematically strict relationship between the perturbed and unperturbed Green’s functions. The main advantage of the present scheme for computing unperturbed Green’s functions is that the spanned Krylov subspace is common to all the sampling energy points. Therefore, once the Krylov subspace for the reference energy point has been built up, the Green’s functions for all the sampling energy points can be obtained without time-consuming matrix-vector operations. To demonstrate the potential power of the present method, the transport properties of a C@(10,10)CNT peapod suspended between (10,10)CNTs are investigated. By sampling a large number of energy points owing to the use of the present method, the scattering attributed to the coupling between the states of the CNT and C can be observed in the calculated conductance spectrum.

The present technique enables us to efficiently obtain perturbed Green’s functions in the framework of the RSFD approach. The RSFD scheme of first-principles calculations is free of problems concerning the completeness of the basis set that appear in the methods using localized basis sets of either atomic orbitals or Gaussians. Moreover, this scheme has the advantage of being scalable to massively parallel computers without compromising on precision. We believe that this development will be important for executing large-scale transport calculations using massively parallel computers.

## Acknowledgment

The authors would like to thank Professor Shao-Liang Zhang of Nagoya University, Professor Tomohiro Sogabe of Aichi Prefectural University, and Professor Susumu Yamamoto of Tokyo University of Technology for contributing to our fruitful discussions. This research was partially supported by the Computational Materials Science Initiative (CMSI) and a Grant-in-Aid for Scientific Research on Innovative Areas (Grant No. 22104007) from the Ministry of Education, Culture, Sports, Science and Technology, Japan. The numerical calculations were carried out using the computer facilities of the Institute for Solid State Physics at the University of Tokyo, the Center for Computational Sciences at University of Tsukuba, and K computer under a trial use at Advanced Institute for Computational Science at RIKEN.

*

## Appendix A Shifted COCG algorithm and seed switching technique

The COCG method is a numerical method for complex-symmetric linear equations as

(25) |

where A is not Hermitian but complex symmetric, . Its algorithm gives the vectors , and of the th iteration as follows under the initial conditions, , , , and (10).

(26) | |||||

(27) | |||||

(28) | |||||

(29) | |||||

(30) |

Now we solve sets of the shifted linear equations using the reference system , where is a given set of the sampling energy points and is an identity matrix. It is proved that the residual vector for the sampling energy point and the residual vector for the reference system are collinear:

(31) |

where is a scalar variable (14). Using the collinear relation of Eq. (31), the algorithm of the shifted COCG method is given as follows.

(32) | |||||

(33) | |||||

(34) | |||||

(35) | |||||

(36) |

where and the other initial conditions are same as the reference system. Using the information of the reference system, the shifted linear equations are readily solved. Note that the computational costs for Eqs. (31)-(36) are negligible because they consist of scalar-scalar and/or scalar-vector products. In addition, and as well as the required elements of and for the sampling energy points are stored in memory during the iterations. The iterations are carried out until all norms of the residual vectors satisfy the convergence criterion.

When the residual vector of the reference system is small, the numerical precision of the residual vectors of the sampling energy points degrades. The seed switching technique replaces the reference system by the energy point having the largest norm of the residual vector among the sampling energy points to circumvent this numerical difficulty. It is obvious from Eqs. (32)-(36) that and have to be updated after the reference system changes while , , , and are independent from the choice of the reference system. From the collinear relation of the residual vectors in Eq. (31), and are derived as

(37) |

where is the index of the new reference system. The extension to the switching to an arbitrary energy point () is also available (10).

### References

- S. Datta, in Electronic Transport in Mesoscopic Systems, edited by H. Ahme, M. Pepper, and A. Broers, Cambridge Studies in Semiconductor Physics and Microelectronic Engineering,Vol. 3 (Cambridge University Press, Cambridge, 1995).
- K. Hirose and M. Tsukada, Phys. Rev. B 51, 5278 (1995).
- N. D. Lang, Phys. Rev. B 52, 5335 (1995).
- Y. Fujimoto and K. Hirose, Phys. Rev. B 67, 195315 (2003).
- K. Hirose, T. Ono, Y. Fujimoto, and S. Tsukamoto, First-Principles Calculations in Real-Space Formalism, Electronic Configurations and Transport Properties of Nanostructures, (Imperial College Press, London, 2005).
- S.-H. Ke, H. U. Baranger, and W. Yang, J. Chem. Phys. 127, 144107 (2007); M. Strange, I. S. Kristensen, K. S. Thygesen, and K. W. Jacobsen, J. Chem. Phys. 128, 114714 (2008); C. Herrmann, G. C. Solomon, J. E. Subotnik, V. Mujica, and M. A. Ratner, J. Chem. Phys. 132, 024103 (2010).
- J. R. Chelikowsky, N. Troullier, and Y. Saad, Phys. Rev. Lett. 72, 1240 (1994); J. R. Chelikowsky, N. Troullier, K. Wu, and Y. Saad, Phys. Rev. B. 50, 11355 (1994).
- T. Ono and K. Hirose, Phys. Rev. Lett. 82, 5016 (1999); Phys. Rev. B 72, 085115 (2005); T. Ono, M. Heide, N. Atodiresei, P. Baumeister, S. Tsukamoto, and S. Blügel, Phys. Rev. B 82, 205115 (2010).
- R. Takayama, T. Hoshi, T. Sogabe, S.-L. Zhang, and T. Fujiwara, Phys. Rev. B 73, 165108 (2006)
- S. Yamamoto, T. Sogabe, T. Hoshi, S.-L. Zhang, and T. Fujiwara, J. Phys. Soc. Jpn. 77, 114713 (2008).
- T. Ono, Y. Egami, and K. Hirose, Phys. Rev. B 86, 195406 (2012).
- W. Kohn and L. J. Sham, Phys. Rev. 140, A1133 (1965).
- D. S. Fisher and P. A. Lee, Phys. Rev. B 23, R6851 (1981).
- A. Frommer, Computing 70, 87 (2003)
- H. A. Van Der Vorst and J. B. M. Melissen, IEEE Trans. Magn. 26 706 (1990).
- We used the norm-conserving pseudopotentials NCPS97 constructed by K. Kobayashi using the procedure proposed in Ref. (17). See K. Kobayashi, Comput. Mater. Sci. 14, 72 (1999).
- N. Troullier and J. L. Martins, Phys. Rev. B 43, 1993 (1991).
- S. H. Vosko, L. Wilk, and M. Nusair, Can. J. Phys. 58, 1200 (1980); J. P. Perdew and A. Zunger, Phys. Rev. B 23, 5048 (1981).
- P. Hohenberg and W. Kohn, Phys. Rev. 136, B864 (1964).
- Z. Liu, M. Koshino, K. Suenaga, A. Mrzel, H. Kataura, and S. Iijima, Phys. Rev. Lett. 96, 088304 (2006).
- S. Okada, S. Saito, and A. Oshiyama, Phys. Rev. Lett. 86, 3835 (2001); O. Dubay and G. Kresse, Phys. Rev. B 70, 165424 (2004).
- O. Dubay and G. Kresse, Phys. Rev. B 67, 035401 (2003).
- H. Kondo, H. Kino, and T. Ohno, Phys. Rev. B 71, 115413 (2005).
- D. J. Hornbaker, S.-J. Kahng, S. Misra, B. W. Smith, A. T. Johnson, E. J. Mele, D. E. Luzzi, and A. Yazdani, Science 295, 825 (2002); C. L. Kane, E. J. Mele, A. T. Johnson, D. E. Luzzi, B. W. Smith, D. J. Hornbaker, and A. Yazdani, Phys. Rev. B 66, 235423 (2002).