Perturbation Analysis of An Eigenvector-Dependent Nonlinear Eigenvalue Problem With Applications
This work was supported by NSFC No. 11671023, 11421101, 11671337, 11771188.
The eigenvector-dependent nonlinear eigenvalue problem (NEPv) $A(P)V = V\Lambda$, where the columns of $V \in \mathbb{C}^{n\times k}$ are orthonormal, $P = VV^{\rm H}$, $A(P)$ is Hermitian, and $\Lambda = V^{\rm H}A(P)V$, arises in many important applications, such as the discretized Kohn-Sham equation in electronic structure calculations and the trace ratio problem in linear discriminant analysis. In this paper, we perform a perturbation analysis for the NEPv, which gives upper bounds for the distance between the solution to the original NEPv and the solution to the perturbed NEPv. A condition number for the NEPv is introduced, which reveals the factors that affect the sensitivity of the solution. Furthermore, two computable error bounds are given for the NEPv, which can be used to measure the quality of an approximate solution. The theoretical results are validated by numerical experiments for the Kohn-Sham equation and the trace ratio optimization.
Keywords. nonlinear eigenvalue problem, perturbation analysis, Kohn-Sham equation, trace ratio optimization
AMS subject classifications. 65F15, 65F30, 15A18, 47J10
In this paper, we study the perturbation theory of the following eigenvector-dependent nonlinear eigenvalue problem (NEPv)
$$A(P)V = V\Lambda, \tag{1}$$
where $V \in \mathbb{C}^{n\times k}$ has orthonormal column vectors, $P = VV^{\rm H}$, $A(P) \in \mathbb{C}^{n\times n}$ is a continuous Hermitian matrix-valued function of $P$, and $\Lambda = V^{\rm H}A(P)V \in \mathbb{C}^{k\times k}$ is Hermitian; the eigenvalues of $\Lambda$ are also eigenvalues of $A(P)$. Usually, in practical applications, $k \ll n$, and the eigenvalues of $\Lambda$ are the $k$ smallest or largest eigenvalues of $A(P)$. In this paper, we restrict our discussion to the case of the $k$ smallest eigenvalues. Furthermore, we consider $A(P)$ in the following form
$$A(P) = A_0 + A_1(P) + A_2(P), \tag{2}$$
where $A_0$, $A_1(P)$ and $A_2(P)$ are all Hermitian, $A_0$ is a constant matrix, $A_1(P)$ is a homogeneous linear function of $P$, and $A_2(P)$ is a nonlinear function of $P$.
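Notice that if $V$ is a solution to (1), then so is $VQ$ for any $k\times k$ unitary matrix $Q$. Therefore, two solutions $V$, $\widetilde{V}$ are essentially the same if $\mathcal{R}(V) = \mathcal{R}(\widetilde{V})$, where $\mathcal{R}(V)$ and $\mathcal{R}(\widetilde{V})$ are the subspaces spanned by the column vectors of $V$ and $\widetilde{V}$, respectively. Throughout the rest of this paper, when we say that $V$ is a solution to (1), we mean that the class $\{VQ : Q \in \mathbb{C}^{k\times k} \text{ is unitary}\}$ solves (1).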
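The invariance claimed here can be verified in two lines. The following short derivation is a sketch under the assumption that NEPv (1) reads $A(P)V = V\Lambda$ with $P = VV^{\rm H}$ and $\Lambda = V^{\rm H}A(P)V$; the symbol $Q$ for an arbitrary $k\times k$ unitary matrix is introduced only for this illustration.

```latex
\begin{align*}
(VQ)(VQ)^{\rm H} &= VQQ^{\rm H}V^{\rm H} = VV^{\rm H} = P, \\
A(P)(VQ) &= \bigl(A(P)V\bigr)Q = V\Lambda Q = (VQ)\,(Q^{\rm H}\Lambda Q).
\end{align*}
```

Hence $VQ$ has orthonormal columns, leaves $P$ (and therefore $A(P)$) unchanged, and satisfies (1) with $\Lambda$ replaced by $Q^{\rm H}\Lambda Q$, which is Hermitian and has the same eigenvalues as $\Lambda$.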
Perhaps the most well-known NEPv of the form (1) is the discretized Kohn-Sham (KS) equation arising from density functional theory in electronic structure calculations (see [3, 11, 14] and references therein). NEPv (1) also arises from the trace ratio optimization in linear discriminant analysis for dimension reduction [12, 20, 21], and from the Gross-Pitaevskii equation for modeling particles in the state of matter called the Bose-Einstein condensate [1, 5, 6]. We believe that more potential applications will emerge.
The most widely used method for solving NEPv (1) is the so-called self-consistent field (SCF) iteration [11, 14]. Starting with an orthonormal $V_0 \in \mathbb{C}^{n\times k}$, at the $i$th SCF iteration one computes an orthonormal eigenvector matrix $V_i$ associated with the $k$ smallest eigenvalues of $A(V_{i-1}V_{i-1}^{\rm H})$, and then $V_i$ is used as the approximation in the next iteration. Convergence analysis of the SCF iteration has been carried out for the KS equation in [9, 10, 19], and also for the trace ratio problem. Quite recently, an existence and uniqueness condition for the solutions to NEPv (1) was given, and the convergence of the SCF iteration was also studied.
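The SCF iteration described above can be sketched in a few lines of NumPy. This is a minimal illustration rather than the implementation used in the cited works: the function name `scf`, the stopping rule based on the change of the projector $VV^{\rm H}$, and the toy matrix function `A_toy` (a 1-D discrete Laplacian plus a small diagonal term depending on $P$) are all assumptions made here for demonstration.

```python
import numpy as np

def scf(A, V0, tol=1e-10, max_iter=200):
    """Plain SCF iteration for A(P) V = V Lambda with P = V V^H.

    A  : callable mapping an n-by-n Hermitian P to an n-by-n Hermitian matrix
    V0 : n-by-k starting matrix with orthonormal columns
    Returns the final V and the number of iterations performed.
    """
    V = V0
    k = V0.shape[1]
    for it in range(1, max_iter + 1):
        P = V @ V.conj().T
        # eigh returns eigenvalues in ascending order, so the first k
        # eigenvectors belong to the k smallest eigenvalues of A(P).
        _, U = np.linalg.eigh(A(P))
        V_new = U[:, :k]
        # Compare projectors: they do not depend on the choice of basis.
        change = np.linalg.norm(V_new @ V_new.conj().T - P, 2)
        V = V_new
        if change < tol:
            return V, it
    return V, max_iter

# Hypothetical toy problem (not from the paper): A(P) = A0 + alpha * diag(diag(P)).
n, k, alpha = 10, 2, 0.1
A0 = 2 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)

def A_toy(P):
    return A0 + alpha * np.diag(np.diag(P).real)

rng = np.random.default_rng(0)
V0, _ = np.linalg.qr(rng.standard_normal((n, k)))
V, iters = scf(A_toy, V0)

# Residual of the NEPv at the computed V.
P = V @ V.conj().T
Lam = V.conj().T @ A_toy(P) @ V
residual = np.linalg.norm(A_toy(P) @ V - V @ Lam, 2)
```

The stopping test uses the projector instead of $V$ itself because the eigenvector matrix is only determined up to a unitary factor. For this toy problem the $P$-dependent term is small relative to the eigenvalue gap of `A0`, which is why plain SCF is expected to converge; for harder problems some form of damping or mixing is commonly needed.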
In practical applications, the matrix function $A(P)$ is usually obtained from the discretization of operators or constructed from empirical data, and is thus contaminated by errors and noise. As a result, the NEPv (1) to be solved is in fact a perturbed NEPv. It is therefore natural to ask whether we can trust the approximate solution obtained by solving the perturbed NEPv with a numerical method such as the SCF iteration. To be specific, let the perturbed NEPv be of the form
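$$\widetilde{A}(\widetilde{P})\widetilde{V} = \widetilde{V}\widetilde{\Lambda}, \tag{3}$$
where $\widetilde{V} \in \mathbb{C}^{n\times k}$ has orthonormal column vectors, $\widetilde{P} = \widetilde{V}\widetilde{V}^{\rm H}$, $\widetilde{\Lambda} = \widetilde{V}^{\rm H}\widetilde{A}(\widetilde{P})\widetilde{V}$, and
$\widetilde{A}(\widetilde{P}) = \widetilde{A}_0 + \widetilde{A}_1(\widetilde{P}) + \widetilde{A}_2(\widetilde{P})$ is a continuous Hermitian matrix-valued function of $\widetilde{P}$,
$\widetilde{A}_0$ is a constant Hermitian matrix,
$\widetilde{A}_1$ and $\widetilde{A}_2$ are perturbed functions of $A_1$ and $A_2$, respectively,
and $\widetilde{A}_1(\widetilde{P})$, $\widetilde{A}_2(\widetilde{P})$ are still Hermitian.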
Assume that the original NEPv (1) has a solution $V_*$.
Then we need to answer the following two fundamental questions:
Q1. Under what conditions does the perturbed NEPv (3) have a solution $\widetilde{V}_*$ near $V_*$?
Q2. What is the distance between $\mathcal{R}(V_*)$ and $\mathcal{R}(\widetilde{V}_*)$?
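Let $\mathcal{X}$ and $\mathcal{Y}$ be two $k$-dimensional subspaces of $\mathbb{C}^n$. Let the columns of $X$ form an orthonormal basis for $\mathcal{X}$ and the columns of $Y$ form an orthonormal basis for $\mathcal{Y}$. We use $\|\sin\Theta(\mathcal{X},\mathcal{Y})\|_2$ to measure the distance between $\mathcal{X}$ and $\mathcal{Y}$, where
$$\Theta(\mathcal{X},\mathcal{Y}) = \operatorname{diag}\bigl(\theta_1(\mathcal{X},\mathcal{Y}),\dots,\theta_k(\mathcal{X},\mathcal{Y})\bigr).$$
Here, the $\theta_j(\mathcal{X},\mathcal{Y})$ denote the canonical angles between $\mathcal{X}$ and $\mathcal{Y}$ [15, p. 43], which can be defined as
$$0 \le \theta_j(\mathcal{X},\mathcal{Y}) := \arccos\sigma_j \le \frac{\pi}{2}, \qquad 1 \le j \le k,$$
where the $\sigma_j$ are the singular values of $X^{\rm H}Y$.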
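The distance just defined is easy to compute numerically. The sketch below assumes that the canonical angles are the arccosines of the singular values of $X^{\rm H}Y$ and that the distance is the 2-norm of $\sin\Theta$, i.e. the sine of the largest angle; the function name `sin_theta_distance` and the small test subspaces are hypothetical choices made for illustration.

```python
import numpy as np

def sin_theta_distance(X, Y):
    """Return (||sin Theta||_2, canonical angles) for range(X) and range(Y).

    X and Y are n-by-k matrices with orthonormal columns.
    """
    sigma = np.linalg.svd(X.conj().T @ Y, compute_uv=False)
    # Guard against rounding that pushes a singular value slightly above 1.
    sigma = np.clip(sigma, 0.0, 1.0)
    theta = np.arccos(sigma)
    # sin Theta is diagonal, so its 2-norm is the largest diagonal entry.
    return np.sin(theta).max(), theta

# Two planes in R^3 that share e1 and are tilted by t = 0.3 about it.
t = 0.3
X = np.eye(3)[:, :2]
Y = np.column_stack([[1.0, 0.0, 0.0], [0.0, np.cos(t), np.sin(t)]])
dist, theta = sin_theta_distance(X, Y)

# Changing the orthonormal basis of range(X) must not change the distance.
c, s = np.cos(1.1), np.sin(1.1)
Q = np.array([[c, -s], [s, c]])
dist_rot, _ = sin_theta_distance(X @ Q, Y)
```

Here the canonical angles are $0$ and $0.3$, so the distance is $\sin(0.3)$, and it is unchanged when $X$ is replaced by $XQ$. Computing small angles through `arccos` loses accuracy when a singular value is very close to 1; routines such as `scipy.linalg.subspace_angles` are designed to avoid this.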
In this paper, we will focus on Q1 and Q2. The results are established via two approaches. One is based on the well-known $\sin\Theta$ theorem in the perturbation theory of Hermitian matrices and Brouwer's fixed-point theorem; the other is inspired by J.-G. Sun's technique (e.g., [8, 16, 17, 18]), which finds the radius of the perturbation by constructing an equation for the radius via the fixed-point theorem. Two perturbation bounds are obtained from these two approaches, and each of them has its own merits. Based on the perturbation bounds, a condition number for the NEPv (1) is introduced, which quantitatively reveals the factors that affect the sensitivity of the solution. As corollaries, two computable error bounds are provided to measure the quality of the computed solution. Theoretical results are validated by numerical experiments for the KS equation and the trace ratio optimization.
The rest of this paper is organized as follows. In section 2, we use two approaches to answer Q1 and Q2, followed by some discussions on the condition number and error bounds for NEPv (1). In section 3, we apply our theoretical results to the KS equation and the trace ratio optimization problem, respectively. Finally, we give our concluding remarks in section 4.
2 Main results
In this section we provide two approaches to answer Q1 and Q2. A condition number and error bounds for NEPv will also be discussed. Before we proceed, we introduce the following notation, which will be used throughout the rest of this paper.
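$\mathbb{C}^{m\times n}$ stands for the set of all $m\times n$ matrices with complex entries. The superscripts "$\cdot^{\rm T}$" and "$\cdot^{\rm H}$" take the transpose and the complex conjugate transpose of a matrix or vector, respectively. The symbol $\|\cdot\|_2$ denotes the 2-norm of a matrix or vector. Unless otherwise specified, we denote by $\lambda_j(A)$, for $1 \le j \le n$, the eigenvalues of a Hermitian matrix $A \in \mathbb{C}^{n\times n}$, and they are always arranged in nondecreasing order: $\lambda_1(A) \le \lambda_2(A) \le \cdots \le \lambda_n(A)$. Define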
Denote , , , and also
Note here that can be used to measure the magnitude of the perturbation, and is a “local Lipschitz constant” such that
for all . Thus, we may use to measure the sensitivity of within .
2.1 Approach one
Proof: Using (13), we know that given by (14) is a positive constant. Then it is easy to see that is a nonempty bounded closed convex set in . For any , letting , we define for , where is an eigenvector of corresponding with for and . If we can show that
(which implies that the mapping is well-defined in the sense that is unique);
is a continuous mapping within ;
then by Brouwer’s fixed-point theorem , has a fixed point in . Let be the fixed point, where . Then is a solution to the perturbed NEPv (3). Hence the conclusion follows immediately. Next, we show , and in order.
Third, by the famous Weyl Theorem [15, p.203], we have
Then it follows that
Proof of We verify that is a continuous mapping within by showing that for any , , as , where and .
Let , , and
By the Davis-Kahan theorem, we have
Proof of Define
where . Then
Then it follows that
2.2 Approach two
Proof: Let be a unitary matrix such that
Then the perturbed NEPv (3) has a solution if and only if there exists a unitary matrix such that
where is Hermitian and its eigenvalues are the smallest eigenvalues of .
Without loss of generality¹, we let
¹ Note that and thus . By the CS decomposition [15, Chapter 1, Theorem 5.1], we know that there exist unitary matrices and such that . Rewrite . Then (28) still holds, and (29) follows immediately by setting .
Then the statement that the perturbed NEPv (3) has a solution is equivalent to the following:
there exists such that (30c) holds;
Next, we first prove (a) then (b).
Note that since defined in (12) is positive, is an invertible linear operator with
Therefore, we may define a mapping as
By (29), we have
Proof of If
then by Weyl Theorem , we have . Consequently,
Therefore, we only need to show (37), under the assumption .
Direct calculations give rise to