# QCAD Simulation and Optimization of Semiconductor Quantum Dots

## Abstract

We present the Quantum Computer Aided Design (QCAD) simulator that targets modeling multi-dimensional quantum devices, particularly silicon multi-quantum dots (QDs) developed for quantum bits (qubits). This finite-element simulator has three differentiating features: (i) its core contains nonlinear Poisson, effective mass Schrodinger, and Configuration Interaction solvers that have massively parallel capability for high simulation throughput, and can be run individually or combined self-consistently for 1D/2D/3D quantum devices; (ii) the core solvers show superior convergence even at near-zero-Kelvin temperatures, which is critical for modeling quantum computing devices; and (iii) it interfaces directly with the full-featured optimization engine Dakota. In this work, we describe the capabilities and implementation of the QCAD simulation tool, and show how it can be used to both analyze existing experimental QD devices through capacitance calculations, and aid in the design of few-electron multi-QDs. In particular, we observe that computed capacitances are in rough agreement with experiment, and that quantum confinement increases capacitance when the number of electrons is fixed in a quantum dot. Coupling of QCAD with the optimizer Dakota allows for rapid identification and improvement of device layouts that are likely to exhibit few-electron quantum dot characteristics.

###### pacs:

. \normalcaption \captionwidth

## I Introduction

Silicon-based qubits for quantum information processing have attracted enormous interest and made remarkable progress (1); (2); (3); (4) during recent years due to long spin decoherence times and the well-established Si nanoelectronic fabrication infrastructure. There has been a wide range of research work (5); (6); (7); (8) in academia and national labs dedicated to design and fabricate Si-based multi-quantum dot (QD) devices for qubit application. However, many practical issues have made it challenging to achieve reliable few-electron multi-QD qubits, including fabrication process variation, interface quality, defect/disorder, and device design.

To facilitate the experimental development of the technology, a simulation tool was needed to efficiently solve for semiclassical (*i.e.*, using the Thomas-Fermi approximation) and self-consistent quantum electrostatic potentials, single- and multi-electron wave functions, and energies at near-zero temperatures. It was also essential that such a tool be able to simulate and optimize many different single- and multi-QD structures very efficiently, and provide fast feedback on which device layouts are more likely to lead to few-electron behavior. Although there have been numerous papers (9); (10); (11); (12); (13) on simulation methods for quantum dots, they focus on different issues than those addressed in this work. Existing commercial (14) and academic (15) device simulators either target room-temperature and many-electron devices, whereas qubit applications require temperatures close to zero Kelvin and one/few-electron devices, or target simple and few geometries, whereas multi-QD devices have very complex three-dimensional (3D) shapes and can have many different layouts due to their inherently large design space. These existing tools often have license and platform restrictions which place non-fundamental but very real limits on simulation efficiency. The Quantum Computer Aided Design (QCAD) simulator has been developed to address these challenges associated with modeling realistic multi-QDs, including complex geometries, many device layouts, low temperature operation, and 3D quantum confinement effects. It is an integrated open-source finite-element-based tool that leverages a number of Sandia-developed software programs (16), including the Trilinos suite, the Albany code (17), the Dakota toolbox, and the Cubit geometry and meshing tool.

The first and major part of this work details the QCAD simulator (18) (or just “QCAD” for brevity), which is built upon the Albany finite element framework (17) and contains three core modules of Poisson (P), Schrodinger (S), and Configuration Interaction (CI) solvers. These physical solvers can be run individually or combined self-consistently (i.e., self-consistent S-P and S-P-CI solvers) for simulating arbitrary 1D/2D/3D quantum devices made from multiple different materials. They have demonstrated fast and robust convergence behavior even at very low (milli-Kelvin) temperatures. Furthermore, very high simulation throughput has been achieved using a combination of pre- and post-processing scripting, automated structure creation and meshing, and distributed parallel computing capability and resources. In Sec. II, details of the QCAD simulator and its core physical solvers are presented, with the semiclassical Poisson solver detailed in Sec. II.1, the Schrodinger solver described in Sec. II.2, and the self-consistent S-P solver detailed in Sec. II.3.

The second part of this work presents two examples of applying QCAD to simulate realistic QD structures. Since QCAD solves for the electron density at a given set of device control voltages, it is able to compute the capacitance between a quantum dot and a control gate. Such capacitances can also be determined experimentally. In Sec. III, we compute the dot-to-gate capacitances in an experimental QD structure, using the semiclassical and self-consistent S-P solvers, and compare them with experiment. A detailed analytic analysis is then conducted to help understand the effect of quantum confinement on capacitances. Finally, in Sec. IV, QCAD’s ability to assess and improve upon the design of nano-scale devices is demonstrated with few-electron double QDs (DQDs). Through its coupling with the optimization driver Dakota, QCAD is able to optimize gate voltages to simultaneously achieve multiple desired targets (*e.g.* few electrons in a dot and a tunnel barrier being controllable) in many different DQD device designs. We show how this aids in the identification of device designs which are likely to exhibit few-electron quantum dot behavior.

## Ii Qcad Simulator

The QCAD code structure is given in Fig. 1. The base of the structure is Trilinos (16), an open-source suite consisting of mathematical libraries (nonlinear and linear solvers, preconditioners, eigensolvers, etc.), discretization (finite element and finite volume) utilities, automatic differentiation library, distributed parallel library, and many more packages (refer to trilinos.sandia.gov for details). The Albany code (17) provide a unified and flexible interface to those Trilinos packages to minimize the coding effort for users developing physical models. The Dakota driver (16) interfaces with QCAD and repeatedly calls the QCAD executable to perform specified optimization tasks. It is an open-source tool that provides a broad spectrum of capabilities including nonlinear least square optimization (see dakota.sandia.gov for details). Pre- and post-processing scripts have been developed to automatically generate QCAD input files for multiple DQD devices, submit jobs to computing clusters, and collect simulation results. These scripts are one of the enabling factors that make efficient QCAD simulations possible.

The QCAD core contains nonlinear Poisson (P), effective-mass Schrodinger (S), and CI solvers. These solvers can be run individually or combined in a self-consistent manner. The Poisson and Schrodinger solvers can be coupled self-consistently to obtain single-particle envelope wave functions and energy levels. The S-P solutions can then be used as wave function basis for the CI solver which is then coupled with the Poisson solver. This self-consistent S-P-CI technique produces accurate many-particle wave functions and energies within the effective mass approximation, which are important for quantitative study of few-electron multi-QD behavior (*e.g.* for computing the exchange energy in DQDs). Details about the Poisson, Schrodinger, and self-consistent S-P solvers are described in the following subsections, and the CI method can be found in Ref. (19). We note that the current version of the QCAD simulator solves one or a system of equations in the full number of dimensions of the problem, and does not make any separability assumptions that allow a higher dimensional problem (e.g. 3D) to be divided into multiple lower dimensional problems (e.g. 2D+1D). Such simplifying approximations(20) are a possible future enhancement to the QCAD core.

### ii.1 Poisson Solver

The well-known Poisson equation in a bulk semiconductor is given by

(1) |

where is the electrostatic potential, is the static permittivity, is the elementary charge, and are the electron and hole concentrations respectively, and are ionized donor and acceptor concentrations respectively. Note that the form of the left hand side in Eq. (1) allows to have spatial dependence.

#### Carrier Statistics

and are given by carrier statistics for bulk (spatially unconfined) semiconductors. Both Maxwell-Boltzmann (MB) and Fermi-Dirac (FD) statistics are implemented in QCAD. For the MB statistics, and take the exponential forms,

(2) |

where is the Boltzmann constant, is the lattice temperature, and are the conduction and valence band edge respectively, and is the extrinsic Fermi level (more details on , , and are given in Sec. II.1.2). For the FD statistics, and are expressed in terms of the Fermi-Dirac integrals (see Appendix A for the derivation),

(3) |

and are effective density of states (DOS) in the conduction and valence band, respectively. Assuming parabolic band structure, we have

(4) |

where is the reduced Planck constant, and are respectively the electron and hole DOS effective mass including all equivalent band minima. For bulk silicon, there are six equivalent conduction minima, and the valence band minimum is degenerate including heavy hole and light hole bands at the valley, hence

(5) |

with , , , and being the electron longitudinal, electron transverse, heavy hole, and light hole effective mass, respectively.

The function in Eq. (II.1.1) is the Fermi-Dirac integral of order and is defined as (22)

(6) |

Although the closed form of this integral can be formally expressed by the polylogarithm function(23) or by a complete expansion discussed in Ref. (24), the polylogarithm function and the complete expansion involve summations of infinite series. Hence in practice, one has to either use certain approximation to obtain a computable analytic expression, or use numerical integration techniques(21); (25). There have been a few approximate analytic forms proposed in literature (26); (22); (27) that offer relatively simple expressions and sufficient accuracy. Among them, the approximate expression in Ref. (26) takes a single simple form and provides a relative error less than for , hence has been widely used in the device modeling community (28). The expression in Ref. (26) is given as

The asymptotic expansion at leads to , implying that Eq. (II.1.1) becomes equivalent to Eq. (II.1.1), which is the case for non-degenerate semiconductors where [a few ]) for , and [a few ]) for . The asymptotic form at is , which corresponds to the strongly degenerate case near 0 K, where [a few ]) for (i.e., the Fermi level is located within the conduction band), and [a few ]) for (i.e., the Fermi level is inside the valence band). Figure 2 shows a comparison of the 1/2-order Fermi-Dirac integral calculated by different methods. It is clear that the approximate expression in Eq. (II.1.1) produces visually the same result as the numerical approach (21), and follows the proper asymptotic forms for large . In the small regime, neither of the asymptotic forms is valid. Since semiconductor DQD qubits are currently operated in this regime (corresponding to very low temperatures, mK to a couple of K), it is important to adopt a sufficiently accurate evaluation of the Fermi-Dirac integral. Due to the good accuracy and simplicity of Eq. (II.1.1), we implemented this form in QCAD for the FD statistics. In the actual implementation, we approximate by for to avoid numerical instability caused by the term in Eq. (II.1.1). Such large negative values of can occur at very low temperatures, and this approximation results in no discernible loss of accuracy, as shown in Fig. 2.

#### The Reference Potential

Before solving the Poisson Eq. (1) for the electrostatic potential , one needs to relate to the band energies of the materials making up the device. One requirement for the electrostatic potential is that it must be continuous everywhere in a device. For a homo-junction device such as a PN silicon diode, , , and (the intrinsic Fermi level) as functions of position are parallel to each other and continuous across the device, so it is natural to choose , i.e., to solve for the inverse of intrinsic Fermi level. In an arbitrary hetero-junction structure, however, , , and could all be discontinuous. Figure 3 shows a schematic of the band structure of a MOS-type device under zero bias illustrating the discontinuity of and . What *is* always continuous in arbitrary homo- and hetero-junction devices is the vacuum level indicated as in Fig. 3. Therefore, we choose to satisfy (29)

(8) |

where is a constant reference potential and is the electron affinity of a material. This choice implies that we are solving for the inverse of the vacuum level shifted by a constant value. While in theory different values only change the resulting solution, , by a constant offset, in practice they can lead to different numerical convergence behavior during simulation. A good choice of that has shown numerical robustness in devices containing silicon is to select as the reference potential the intrinsic Fermi level of silicon relative to the vacuum level, i.e., (Si). For a more detailed explanation of band diagrams schematics such as that in Fig. 3, we refer the reader to Ref. (30).

With the choice of in Eq. (8), we can rewrite and as

(9) |

with being the band gap of a material. Then the and in Eq. (1) becomes a function of only, i.e., and , as and are material-dependent parameters and known for most semiconductors.

As QCAD does not solve the carrier transport (e.g., drift-diffusion) equations, all calculations must assume that thermal equilibrium (zero current flow) has been attained. The Fermi level, , is taken to be a constant throughout any electrically-connected region of a device. The value of this constant is set by the voltages applied to the device. For example, if a voltage is applied to the substrate (right side) of the device in Fig. 3, will become . The band structure shown in Fig. 3 corresponds to a MOS-type structure with metal gate (left side). If the structure instead had a highly-doped semiconductor gate (e.g., n polysilicon gate) with applied voltage , in the gate region would be while in the substrate region remains at .

#### Incomplete Ionization

When impurities are introduced into the semiconductor crystals, depending on the impurity energy level and the lattice temperature, not all dopants are necessarily ionized, especially at very low lattice temperatures (where DQD qubits are commonly operated). The ionized concentration for donors and acceptors is given by (31)

(10) |

where is the donor energy level, is the donor ground state degeneracy factor, is the acceptor energy level, and is the acceptor ground state degeneracy factor. is equal to 2 because a donor level can accept one electron with either spin or can have no electron when filled. is equal to 4 because in most semiconductors each acceptor level can accept one hole of either spin and the impurity level is doubly degenerate as a result of the two degenerate valence bands (heavy hole and light hole bands) at the point.

To write and as a function of , we need to rewrite with being the donor ionization energy, and with being the acceptor ionization energy. The most common donors in bulk Si are phosphorus (P) and arsenic (As), which have ionization energies of meV and 54 meV respectively(31). The most common acceptor dopant in bulk Si is boron (B), which has meV(31).

Substituting Eq. (II.1.2) and , into Eq. (II.1.3), we get (32)

(11) |

With these expressions, and also become a function of , i.e., and . Hence, the entire right hand side (RHS) of Eq. (1) can be written as a nonlinear function of . Applying integration by parts and divergence theorem, we then rewrite the equation into the finite element (FE) weak form,

(12) |

where is the FE nodal basis function and the second term is a line integral over the simulation domain boundary with being the unit normal vector of the surface element . The weak form is discretized using the Trilinos/Intrepid library, and the resulting discrete equation is solved by a nonlinear Newton solver also in Trilinos. Both the discretization library and the Newton solver were made directly available to QCAD through the Albany framework (cf. Fig. 1).

#### Boundary Conditions

An essential ingredient to the formulation of a differential equation are boundary conditions (BCs). QCAD supports three types of BCs: Dirichlet, Neumann, and Robin BCs. We will next discuss the implementation of these types in turn.

Dirichlet BCs are divided into two cases: (1) setting a voltage on the surface of a metallic region that borders insulator, and (2) setting a voltage on the surface of an Ohmic contact region which borders semiconductor. Case (1) is used for gate electrodes in field effect transistor (FET)-like devices, and the Dirichlet BC value on the bordering insulator(s) is given by the simple expression

(13) |

with being the applied gate voltage and being the metal work function.

In the second case (used for Ohmic contacts in semiconductors), the potential on the bordering semiconductor surfaces is computed assuming thermal equilibrium and charge neutrality at the contacts. The calculation depends on carrier statistics and dopant ionization. For MB statistics, the charge neutrality condition leads to

(14) |

With complete ionization of dopants (i.e., and ), the potentials at n-type and p-type Ohmic contacts are respectively given by

(15) |

where is an externally applied voltage. When including incomplete ionization effect of dopants, we have for n-type and p-type semiconductors respectively,

(16) |

Let us denote and . Then, by the definitions of and , we obtain the identities, and . Substituting the identities into Eq. (II.1.4), we obtain

(17) |

From the definitions of and , the use of , and Eq. (II.1.2), we can obtain the potentials that include dopant incomplete ionization effect at the n-type and p-type Ohmic contacts respectively as,

At very low temperatures, the exponential terms in Eq. (II.1.4) could blow up numerically. To avoid numerical instability in QCAD, we approximate the and terms for very low temperatures as,

(19) |

For FD statistics and assuming complete ionization of dopants, we have

(20) |

To solve for the potentials at Ohmic contacts, in principle, we need to numerically solve Eq. (II.1.4) as the Fermi-Dirac integral does not have an analytic result. In QCAD, we use an approximate expression for the inverse of the 1/2 order Fermi-Dirac integral, that is, given , is computed as (33),

(21) |

This approximation has an error of less than for the entire range. Using this expression, we obtain the BC potentials as,

(22) |

with given in Eq. (II.1.4) where for n-type and for p-type. For the case of FD statistics and incomplete ionization, there exists no approximate analytic expressions for the BC potentials, and one has to solve a non-trivial nonlinear equation if want to be very accurate. In QCAD, we approximate this case using MD statistics with incomplete ionization and utilize the BC potentials given in Eq. (II.1.4).

Neumann BCs in finite element methods are used to specify how “flux” is conserved across boundaries. By default, all boundaries that are not given any other type of boundary conditions, assume implicit Neumann BCs which preserve the flux. In the case of the Poisson equation, the flux is , where is the unit normal of the boundary surface. Thus, by default (i.e. when no other boundary condition is specified), on outer boundaries of the finite element mesh and on internal boundaries. These two conditions are automatically satisfied in the finite element framework by setting the term to 0 in Eq. (II.1.3)

QCAD has the ability to specify non-flux-conserving Neumann BCs on specific boundaries such that the difference between the fluxes on either side of the boundary are equal to some specified constant value. Written mathematically, , where is the unit normal vector of the interface pointing from material 2 to 1 and is the specified constant. Physically, is a surface charge density located at the boundary. Note that when (i.e. no surface charge), the boundary condition reduces to the default flux-conserving condition. Within the finite element discretization in QCAD, this type of Neumann BC is implemented in the integral form

(23) |

A major shortcoming of the Neumann BCs is their inability to characterize surface charge on (or extremely close to) an interface whose voltage is set by a Dirichlet BC. This is due to the simple fact that specifying both Dirichlet and Neumann BCs on the same surface over-determines the problem. Yet, this is essentially what is needed to model a layer of charge that is stuck to one of the conducting gates (often polysilicon) used to control a device. One way around this technical difficulty is to place a layer of very thin finite-element cells around the charged gate and set a Neumann BC on the new surface lying a small distance away from the gate itself. This approach, however, suffers due to the thin finite elements adversely affecting convergence and their being hard to create in the first place. Instead, we use what are called Robin boundary conditions(34) to address the issue of charged gates. Robin BCs are similar to Neumann BCs but allow the flux at a surface to depend on the value of the solution (in this case the potential) there. Specifically, the Robin BC for an internal surface element can be written , where and are fixed constants. At an external surface, we have . We would like to roll into a single boundary condition, a Dirichlet condition at one surface followed by a Neumann boundary condition at a parallel surface lying a very small distance away from the first surface. This can be done at an external surface using and , as shown in Fig. 4, which places a surface charge of a distance away from a point at which is pinned to . We somewhat arbitrarily choose , which is much smaller than any of the mesh features (for semiclassical Poisson simulations) in our devices of interest. Robin BCs are enforced in an integral form, similar to that of Neumann BCs (cf. Eq. 23).

### ii.2 Schrodinger Solver

The time-independent single-particle effective mass Schrodinger equation takes the form of

(24) |

The FE weak form of the equation is

(25) |

The weak form is discretized by the FE method and the resulting eigenvalue problem is solved by the Trilinos eigensolver package called Anasazi (16).

The Schrodinger solver supports two types of boundary conditions: Dirichlet and Neumann. For Dirichlet boundaries, . All other boundaries excluding Dirichlet are treated as Neumann BCs which, require on outer boundaries, and being continuous (i.e., flux conservation) across material interfaces on internal boundaries. As in the Poisson solver, Neumann BCs are automatically satisfied in the FE framework by setting in Eq. (II.2). (The ability to set non-flux-conserving Neumann boundary conditions is absent in the Schrodinger solver since it would have no application in our work.)

Figure 5 shows a comparison between the QCAD Schrodinger solver and analytic results of the lowest six wave functions and energies for a 1D parabolic potential well. The QCAD and analytic results are in excellent agreement. The solver was also applied to 2D and 3D infinite potential wells. The obtained wave functions and energies were compared with the analytic results and excellent agreement was also observed.

One of the advantages of using the FE discretization over the finite difference discretization is that the continuities of and across heterojunctions are automatically satisfied in the former case, whereas they have to be explicitly enforced in the latter case. Specifically, when going from a homojunction device to a heterojunction device, the QCAD Schrodinger solver does not require any code change except setting the proper effective masses for the materials used. As an example, consider a 1D finite potential well that has a width of 20 nm, a potential height of 100 meV, and can have different effective masses for the well and barrier. Figure 6(a) shows the lowest three wave functions and energies obtained from QCAD for a homojunction device with the same effective mass for the well and barrier. In this case, the wave functions and their first derivatives are all continuous across the junctions. The wave functions and energies agree very well with the corresponding analytic results in Figure 2.15, Ref. (35). When the well and barrier have different effective masses, as shown in Fig. 6(b) for a heterojunction device, the wave functions are still continuous across the junctions, but their first derivatives are discontinuous due to the difference in effective masses.

### ii.3 Self-Consistent Schrodinger-Poisson Solver

In realistic quantum devices such as DQDs, we can divide the entire structure (relatively large) into semiclassical and quantum regions. These regions are chosen such that in semiclassical regions, solving the nonlinear Poisson equation alone is often sufficient to obtain a good estimate of electrostatics, whereas in quantum regions, the Poisson and Schrodinger equations need to be coupled self-consistently for electrons (we focus on electrons only as they are used for qubit operation). The coupled two equations take the following form

(26) |

where the electron density becomes a function of the th energy level and the envelope wave function of the Schrodinger equation, while the potential energy is a function of and . The general expression for is given by , where the term takes different expressions depending on confinement dimensionality.

#### Quantum Electron Density

In Si quantum devices, when we focus on those devices where the Si/other material (e.g., Si/SiO) interfaces are parallel to the [100] plane, the six equivalent conduction band minima of the bulk silicon are split into two groups due to the breaking of crystallographic symmetry, widely known as (fourfold degeneracy) and (double degeneracy) valleys, with valleys are lower in energy. At low temperatures, especially the operating temperatures for DQD qubits which are in the mK to a few Kelvin range, only the valleys are occupied by electrons; therefore, we consider the valleys only for Si devices in QCAD. Due to the ellipsoidal energy surfaces at the minima, the electron effective mass in the Schrodinger equation is different from that used in computing the electron density, and it depends on the confinement direction and the number of confined directions.

Two un-confined dimensions. In 1D-confined devices such as a 1D Si MOS capacitor, electrons are spatially confined in one direction (assumed direction in QCAD) but free to move in the and directions. The Si/SiO interface is in the plane and perpendicular to the longitudinal axis of the valleys. The coupled Schrodinger equation is 1D and given by

(27) |

where is the electron longitudinal effective mass of silicon. From Appendix A, the volume electron density is computed as

where is the electron transverse effective mass of silicon and 2 accounts for the double degeneracy of the valleys. is spatially normalized to 1, i.e., , and hence has the unit of 1/length. When , the term in Eq. (II.3.1) can numerically go to infinity for very small T (mK to a few K), which can cause numerical instability. To avoid such problems, when the argument is relatively large (e.g., ), we replace the term with .

One un-confined dimension. We next consider devices, such as quantum wire structures, where electrons are confined along two dimensions (assumed and directions in QCAD) and are free to move in the direction (the wire direction). The Si/SiO interface is perpendicular to the axis and also the longitudinal axis of the valleys. The coupled Schrodinger equation is 2D and given by

(29) |

From Appendix A, the volume electron density is computed as

(30) | |||||

where is the Fermi-Dirac integral of -1/2 order. It is computed by using the approximate analytic expressions in Ref. (27), which have a very small error less than 0.001 in the entire range. is spatially normalized to 1, i.e., , and hence has the unit of 1/length.

Zero un-confined dimensions. In devices such as quantum dot structures, electrons are spatially confined in all three directions and there are no (zero) dimensions in which they are free to move. The Si/SiO interface is perpendicular to the direction and also the longitudinal axis of the valleys. The coupled Schrodinger equation is 3D and given by

(31) |

The volume electron density is computed as

(32) |

where 4 accounts for the double degeneracy of the valleys and that of the spin. is normalized to 1 in the 3D quantum domain, and has the unit of 1/length. When , the term in Eq. (32) can blow up numerically. To avoid such problem, when is relatively large (e.g., ), we replace the term with .

All the above derivations are also applicable to other devices where the semiconductors have a single conduction band minimum located at the valley such as GaAs-based devices, except that the valley degeneracy is 1 and a single electron effective mass is used in all the equations.

Next we discuss the potential energy term in the coupled Schrodinger equation. It takes the form of

(33) |

where is the exchange-correlation correction due to the Pauli exclusion principle in real many-electron systems. For the term, we use the well-known local density parameterization suggested by Hedin and Lundqvist (36) that has also been widely used by other authors (37); (28); (38). It is given as

(34) |

Since this parameterization requires a scalar effective mass as input, we use an average mass for Si as suggested in Ref. (38),

(35) |

#### Self-Consistency

The Schrodinger (S) and Poisson (P) equations in Eq. (II.3) have strong nonlinear coupling. They need to be solved self-consistently by certain iterative numerical schemes. Various iteration schemes (39); (40); (41); (42); (43); (44); (45); (46) have been proposed and used over the past few decades. Among them, three are notable: the under-relaxation method, the damped Newton method, and the predictor-corrector approach.

The under-relaxation scheme (39); (42) (also called convergence-factor or simple average method) solves the S and P equations in succession, and under-relaxes the electron density or the electrostatic potential for the th S-P outer iteration, using either a pre-set constant or an adaptively determined relaxation parameter (see the references for details). The advantage of this method is its simplicity. Its weakness is that the relaxation parameter is not known in advance and needs to be dynamically but heuristically readjusted during the course of iterations; if is too large, the iteration loop cannot reach convergence, whereas, if is too small, it takes too many iteration steps to achieve convergence.

The damped Newton method (41); (44) also solves the S and P equations in succession, but uses a damped Newton method (47) for the outer S-P iteration. Specifically, this approach starts from an initial guess , solves the Schrodinger eigenvalue problem, computes quantum electron density according to section II.3.1, and then solves a linear P equation, obtained by linearizing the Poisson equation in Eq. (II.3) according to the Newton method (47) with an approximate Jacobian matrix; the potential from the linearized P equation is used to obtain , which is then input to the S equation, and the procedure continues until self-consistency is reached. Here, the damping parameter is not heuristic, but can be determined by the selection algorithm of the damped Newton method (cf. Ref. (47)). represents the th Newton iteration and also the th outer S-P iteration. The Jacobian matrix must be approximated because the quantum electron density does not have explicit dependence on the potential . The approximate Jacobian matrix is obtained by simply assuming a semi-classical electron density expression in the P equation. The approach has shown reasonably robustness (41), however, because of the approximate nature of the Jacobian, it often takes many Newton iterations (e.g., 50) to achieve sufficient self-consistent accuracy (e.g., is converged within 0.01 meV).

It is well-known that the under-relaxation (39), damped Newton (41), and other similar iteration schemes (40); (43) do not necessarily lead to convergence, or take too many iterations to achieve it. These schemes have been used mostly in 1D S-P problems and in only a limited number of 2D applications, and one may rightly expect that they would have much more difficulty in achieving convergence in 3D S-P problems (e.g. in quantum dots). The key reason for the instability of these methods is that they do not physically address the strong nonlinear coupling between the S and P equations. In 1997 Trellakis et. al (45) proposed the predictor-corrector (p-c) iteration scheme based on a perturbation argument. Due to its solid physical groundings, the p-c method has shown fast and robust convergence behavior (45); (32); (46), and has been widely used in 2D and 3D simulations of various quantum semiconductor devices (48); (49); (50). Given its excellent track record, we implemented this p-c method in QCAD for the self-consistent S-P loop.

The key feature of the p-c method is that it partially decouples the S and P equations by moving most nonlinearities into the nonlinear Poisson equation

(36) |

where is an approximate expression for the quantum electron density , which has an explicit dependence on the potential (note the exact quantum density does not have explicit dependence on ). The nonlinear Poisson equation can be solved by a Newton method (the predictor step). The predicted result for and from this equation is then corrected in an outer iteration step by the solution of Schrodinger equation (the corrector step).

The approximate quantum density is obtained by using the first-order perturbation theory and the derivative property of Fermi-Dirac integrals (45). The resulting expression is the same as the exact quantum density given in Section II.3.1, except that the argument in the Fermi-Dirac integral is modified to include an explicit dependence on . For 1D-confined Si devices, is given by

where the superscripts denote quantities obtained in the previous th outer S-P iteration step (hence they are known quantities). For 2D-confined Si devices,

(38) | |||||

For 3D-confined Si devices,

(39) |

Note that there is a minus sign in the term in Eq. (39). In principle, once the self-consistent S-P loop is converged, this term should be numerically zero, which might suggest that the sign shall not matter. However, our experience with QCAD is that the minus sign is very important for the 3D-confined case to achieve self-consistent convergence; if we used a plus sign here, the outer S-P loop ran into numerical oscillations.

The self-consistent p-c procedure in QCAD is done in the following steps.

(1) Solve the semiclassical nonlinear Poisson equation, Eq. (1), using the Newton solver in Trilinos (16), to obtain an initial potential and compute the initial total potential energy without the exchange-correlation correction .

(2) Solve the coupled Schrodinger equation for the th () S-P iteration step,

to obtain and (performed using an eigensolver available in Trilinos).

(3) Solve the coupled nonlinear Poisson equation with the approximate quantum electron density ,

using the Trilinos Newton solver to obtain the updated potential , and compute and including . Note we want to use the latest electron density to compute for good convergence.

(4) Check if for everywhere in the device, with being a pre-defined tolerance often chosen as V; if not, repeat steps (2) to (4).

It is clear from the above procedure that there is no under-relaxation step between two S-P iterations and the outer iteration reduces to a simple alternation between solving S and P equations. In addition, the Newton Jacobian matrix for the nonlinear P equation, Eq. (36), can be found analytically, avoiding the necessity of using an approximate Jacobian matrix in the damped Newton iteration scheme (41). In terms of code implementation, the p-c method was very straightforward to implement in QCAD within the Albany framework. Here we emphasize that, because of the automatic differential capability in Trilinos, the Newton Jacobian matrix is computed directly by the code, and we do not need to derive the Jacobian matrix. More details on the implementation and the Albany code structure are found in Ref. (17).

#### Validation Example

To validate the self-consistent S-P solver, we performed simulations on two structures and compared with other simulation results. The first one is a 1D MOS Si capacitor with 4-nm oxide and cm p-substrate doping. Figure 7 compares the -valley lowest four wave functions and energies in the capacitor obtained from QCAD and SCHRED (15) at T = 50 K, the lowest temperature allowed by SCHRED. (SCHRED is a 1D self-consistent Poisson-Schrodinger solver for MOS capacitors available on www.nanohub.org.) There are two simulation differences between QCAD and SCHRED: (i) QCAD applies the S-P solver to both Si and SiO regions, leading to slight wave function penetration in the oxide () as seen in Fig. 7a, while SCHRED assumes at the Si/SiO interface; (ii) QCAD considers the two valleys only, whereas SCHRED includes both the two and the four valleys. A typical effective mass of 0.5 is often assumed (51); (52) for SiO, where is the free electron mass. To minimize the wave function difference near the Si/SiO interface due to the different boundary conditions imposed by the two tools, we used 0.005 as the SiO effective mass for the QCAD simulations. The choice of setting is because, at the Si/SiO interface, QCAD applies the flux conservation condition of