Perturbation Analysis of An Eigenvector-Dependent Nonlinear Eigenvalue Problem With Applications
(This work was supported by NSFC No. 11671023, 11421101, 11671337, 11771188.)
Abstract
The eigenvector-dependent nonlinear eigenvalue problem (NEPv) $H(P)V = V\Lambda$, where the columns of $V \in \mathbb{C}^{n\times k}$ are orthonormal, $P = VV^{H}$, $H(P)$ is Hermitian, and $\Lambda = V^{H}H(P)V$, arises in many important applications, such as the discretized Kohn-Sham equation in electronic structure calculations and the trace ratio problem in linear discriminant analysis. In this paper, we perform a perturbation analysis for the NEPv, which gives upper bounds for the distance between the solution to the original NEPv and the solution to the perturbed NEPv. A condition number for the NEPv is introduced, which reveals the factors that affect the sensitivity of the solution. Furthermore, two computable error bounds are given for the NEPv, which can be used to measure the quality of an approximate solution. The theoretical results are validated by numerical experiments for the Kohn-Sham equation and the trace ratio optimization.
Keywords. nonlinear eigenvalue problem, perturbation analysis, Kohn-Sham equation, trace ratio optimization
AMS subject classifications. 65F15, 65F30, 15A18, 47J10
1 Introduction
In this paper, we study the perturbation theory of the following eigenvector-dependent nonlinear eigenvalue problem (NEPv)
(1) $H(P)V = V\Lambda,$
where $V \in \mathbb{C}^{n\times k}$ has orthonormal column vectors, $P = VV^{H}$, $H(P)$ is a continuous Hermitian matrix-valued function of $P$, and $\Lambda = V^{H}H(P)V$ is Hermitian, so that the eigenvalues of $\Lambda$ are also eigenvalues of $H(P)$. Usually, in practical applications, $k \ll n$, and the eigenvalues of $\Lambda$ are the $k$ smallest or largest eigenvalues of $H(P)$. In this paper, we restrict our discussions to the case of the $k$ smallest eigenvalues. Furthermore, we consider $H(P)$ in the following form
(2) $H(P) = H_0 + H_1(P) + H_2(P),$
where $H_0$, $H_1(P)$ and $H_2(P)$ are all Hermitian, $H_0$ is a constant matrix, $H_1(P)$ is a homogeneous linear function of $P$, and $H_2(P)$ is a nonlinear function of $P$.
Notice that if $V$ is a solution to (1), then so is $VQ$ for any unitary matrix $Q \in \mathbb{C}^{k\times k}$. Therefore, two solutions $V$, $\tilde V$ are essentially the same if $\mathcal{R}(V) = \mathcal{R}(\tilde V)$, where $\mathcal{R}(V)$ and $\mathcal{R}(\tilde V)$ are the subspaces spanned by the column vectors of $V$ and $\tilde V$, respectively. Throughout the rest of this paper, when we say that $V$ is a solution to (1), we mean that the class $\{VQ : Q \in \mathbb{C}^{k\times k} \text{ is unitary}\}$ solves (1).
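The invariance under unitary basis changes can be checked in two lines. The following is a short worked derivation, assuming the NEPv has the form $H(P)V = V\Lambda$ with $P = VV^{H}$ and $\Lambda = V^{H}H(P)V$ (this notation is an assumption of the sketch), and taking an arbitrary unitary $Q \in \mathbb{C}^{k\times k}$:

```latex
\begin{align*}
(VQ)(VQ)^{H} &= V Q Q^{H} V^{H} = V V^{H} = P, \\
H(P)\,(VQ)   &= \bigl(H(P)V\bigr)Q = V\Lambda Q = (VQ)\,\bigl(Q^{H}\Lambda Q\bigr).
\end{align*}
```

So $VQ$ has orthonormal columns, produces the same $P$ (and hence the same $H(P)$), and satisfies the same equation with $\Lambda$ replaced by $Q^{H}\Lambda Q$, which is Hermitian and has the same eigenvalues as $\Lambda$.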
Perhaps the most well-known NEPv of the form (1) is the discretized Kohn-Sham (KS) equation arising from density functional theory in electronic structure calculations (see [3, 11, 14] and references therein). NEPv (1) also arises from the trace ratio optimization in linear discriminant analysis for dimension reduction [12, 20, 21], and from the Gross-Pitaevskii equation for modeling particles in the state of matter called the Bose-Einstein condensate [1, 5, 6]. We believe that more potential applications will emerge.
The most widely used method for solving NEPv (1) is the so-called self-consistent field (SCF) iteration [11, 14]. Starting with an orthonormal $V_0$, at the $i$th SCF iteration, one computes an orthonormal eigenvector matrix $V_i$ associated with the $k$ smallest eigenvalues of $H(V_{i-1}V_{i-1}^{H})$, and then $V_i$ is used as the approximation in the next iteration. Convergence analysis of the SCF iteration is studied in [9, 10, 19] for the KS equation and in [21] for the trace ratio problem. Quite recently, in [2], an existence and uniqueness condition for the solutions to NEPv (1) is given, and the convergence of the SCF iteration is also studied.
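The SCF iteration described above can be sketched in a few lines of NumPy. This is a minimal illustration, not the implementation used in [2, 11, 14]: the function names `scf` and `make_toy_H`, the stopping test on the projector $P = VV^{H}$, and the toy matrix function $H(P) = H_0 + \alpha\,\mathrm{Diag}(\mathrm{diag}(P))$ are all hypothetical choices made for this example.

```python
import numpy as np

def scf(H, V0, k, tol=1e-10, max_iter=200):
    """Plain SCF iteration for H(P) V = V Lambda with P = V V^H.

    H  : callable returning a Hermitian matrix H(P)   (assumption)
    V0 : n-by-k starting matrix with orthonormal columns
    """
    V = V0
    for it in range(max_iter):
        P = V @ V.conj().T
        # eigh returns eigenvalues in ascending order, so the first k
        # eigenvectors belong to the k smallest eigenvalues of H(P).
        _, U = np.linalg.eigh(H(P))
        V_new = U[:, :k]
        P_new = V_new @ V_new.conj().T
        # Compare projectors rather than bases: V and V Q span the same subspace.
        if np.linalg.norm(P_new - P, 2) < tol:
            return V_new, it + 1
        V = V_new
    return V, max_iter

def make_toy_H(n, alpha):
    """Hypothetical toy problem: H(P) = H0 + alpha * Diag(diag(P))."""
    H0 = np.diag(np.arange(1.0, n + 1)) + 0.1 * (np.eye(n, k=1) + np.eye(n, k=-1))
    def H(P):
        return H0 + alpha * np.diag(np.real(np.diag(P)))
    return H

n, k = 10, 2
H = make_toy_H(n, alpha=0.2)
V, iters = scf(H, np.eye(n)[:, :k], k)
P = V @ V.conj().T
Lam = V.conj().T @ H(P) @ V
residual = np.linalg.norm(H(P) @ V - V @ Lam, 2)
print("iterations:", iters, "residual:", residual)
```

The toy problem is chosen so that the nonlinear term is small relative to the eigenvalue gap, which is the regime in which the plain iteration can be expected to converge; for harder problems, practical codes typically add mixing or damping, which is not shown here.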
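In practical applications, $H(P)$ is usually obtained from the discretization of operators or constructed from empirical data, and is thus contaminated by errors and noise. As a result, the NEPv (1) to be solved is in fact a perturbed NEPv. So it is natural to ask whether we can trust the approximate solution obtained by solving the perturbed NEPv via certain numerical methods, say the SCF iteration. To be specific, let the perturbed NEPv be of the form
(3) $\tilde H(\tilde P)\tilde V = \tilde V\tilde\Lambda,$
where $\tilde V \in \mathbb{C}^{n\times k}$ has orthonormal column vectors, $\tilde P = \tilde V\tilde V^{H}$, $\tilde\Lambda = \tilde V^{H}\tilde H(\tilde P)\tilde V$, and
(4) $\tilde H(\tilde P) = \tilde H_0 + \tilde H_1(\tilde P) + \tilde H_2(\tilde P)$
is a continuous Hermitian matrix-valued function of $\tilde P$, $\tilde H_0$ is a constant Hermitian matrix, $\tilde H_1$ and $\tilde H_2$ are perturbed functions of $H_1$ and $H_2$, respectively, and $\tilde H_1(\tilde P)$, $\tilde H_2(\tilde P)$ are still Hermitian.
Assume that the original NEPv (1) has a solution $V_*$.
Then we need to answer the following two fundamental questions:
Q1. Under what conditions does the perturbed NEPv (3) have a solution $\tilde V_*$ near $V_*$?
Q2. What is the distance between $\mathcal{R}(V_*)$ and $\mathcal{R}(\tilde V_*)$?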
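Let $\mathcal{X}$ and $\mathcal{Y}$ be two $k$-dimensional subspaces of $\mathbb{C}^{n}$. Let the columns of $X$ form an orthonormal basis for $\mathcal{X}$ and the columns of $Y$ form an orthonormal basis for $\mathcal{Y}$. We use $\|\sin\Theta(X,Y)\|_2$ to measure the distance between $\mathcal{X}$ and $\mathcal{Y}$, where
(5) $\Theta(X,Y) = \operatorname{diag}\bigl(\theta_1(X,Y), \dots, \theta_k(X,Y)\bigr).$
Here, the $\theta_j(X,Y)$'s denote the $k$ canonical angles between $\mathcal{X}$ and $\mathcal{Y}$ [15, p. 43], which can be defined as
(6) $0 \le \theta_j(X,Y) := \arccos\sigma_j \le \frac{\pi}{2}, \quad 1 \le j \le k,$
where the $\sigma_j$'s are the singular values of $X^{H}Y$.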
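The distance measure above is straightforward to compute. The sketch below assumes the standard definition of canonical angles, $\theta_j = \arccos\sigma_j$ with $\sigma_j$ the singular values of $X^{H}Y$ for orthonormal bases $X$ and $Y$; the function names `canonical_angles` and `sin_theta_distance` are hypothetical and introduced only for illustration.

```python
import numpy as np

def canonical_angles(X, Y):
    """Canonical angles between span(X) and span(Y), in ascending order.

    X, Y are assumed to have orthonormal columns.
    """
    s = np.linalg.svd(X.conj().T @ Y, compute_uv=False)
    # Rounding can push singular values slightly above 1; clip before arccos.
    s = np.clip(s, 0.0, 1.0)
    # svd returns singular values in descending order, so the angles ascend.
    return np.arccos(s)

def sin_theta_distance(X, Y):
    """2-norm of sin(Theta), i.e. the sine of the largest canonical angle."""
    return np.sin(canonical_angles(X, Y)).max()

# Two planes in R^3 sharing the first axis and tilted by t about it.
t = 0.3
X = np.eye(3)[:, :2]
Y = np.array([[1.0, 0.0],
              [0.0, np.cos(t)],
              [0.0, np.sin(t)]])
print(canonical_angles(X, Y), sin_theta_distance(X, Y))
```

Because only the singular values of $X^{H}Y$ enter, the result does not change when either basis is multiplied on the right by a unitary matrix, which matches the convention that solutions are identified with the subspaces they span. For very small angles, `arccos` loses accuracy; SciPy's `scipy.linalg.subspace_angles` is one alternative in that regime.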
In this paper, we will focus on Q1 and Q2. The results are established via two approaches. One is based on the well-known $\sin\Theta$ theorem in the perturbation theory of Hermitian matrices [4] and Brouwer's fixed-point theorem [7]; the other is inspired by J.G. Sun's technique (e.g., [8, 16, 17, 18]), which finds the radius of the perturbation by constructing an equation for the radius via the fixed-point theorem. Two perturbation bounds can be obtained from these two approaches, and each of them has its own merits. Based on the perturbation bounds, a condition number for the NEPv (1) is introduced, which quantitatively reveals the factors that affect the sensitivity of the solution. As corollaries, two computable error bounds are provided to measure the quality of the computed solution. Theoretical results are validated by numerical experiments for the KS equation and the trace ratio optimization.
The rest of this paper is organized as follows. In section 2, we use two approaches to answer Q1 and Q2, followed by some discussions on the condition number and error bounds for NEPv (1). In section 3, we apply our theoretical results to the KS equation and the trace ratio optimization problem, respectively. Finally, we give our concluding remarks in section 4.
2 Main results
In this section we provide two approaches to answer Q1 and Q2. A condition number and error bounds for NEPv will also be discussed. Before we proceed, we introduce the following notation, which will be used throughout the rest of this paper.
$\mathbb{C}^{n\times k}$ stands for the set of all $n\times k$ matrices with complex entries. The superscripts "$T$" and "$H$" take the transpose and the complex conjugate transpose of a matrix or vector, respectively. The symbol $\|\cdot\|_2$ denotes the 2-norm of a matrix or vector. Unless otherwise specified, we denote by $\lambda_j(A)$ for $1 \le j \le n$ the eigenvalues of a Hermitian matrix $A \in \mathbb{C}^{n\times n}$, and they are always arranged in nondecreasing order: $\lambda_1(A) \le \lambda_2(A) \le \dots \le \lambda_n(A)$. Define
(7a)  
(7b) 
Let , be the solutions to (1) and (3), respectively. For any , define
(8)  
(9) 
Denote , , , and also
(10a)  
(10b)  
(10c)  
(10d) 
Note here that can be used to measure the magnitude of the perturbation, and is a “local Lipschitz constant” such that
(11) 
for all . Thus, we may use to measure the sensitivity of within .
2.1 Approach one
In this subsection, we use the famous Weyl theorem [15, p.203], the Davis-Kahan theorem [4], and Brouwer's fixed-point theorem [7] to answer questions Q1 and Q2.
Theorem 2.1
Proof: Using (13), we know that given by (14) is a positive constant. Then it is easy to see that is a nonempty bounded closed convex set in . For any , letting , we define for , where is an eigenvector of corresponding to for and . If we can show that
(a) (which implies that the mapping is well-defined in the sense that is unique);
(b) is a continuous mapping within ;
(c) ,
then by Brouwer's fixed-point theorem [7], has a fixed point in . Let be the fixed point, where . Then is a solution to the perturbed NEPv (3). Hence the conclusion follows immediately. Next, we show (a), (b) and (c) in order.
Third, by the famous Weyl Theorem [15, p.203], we have
(17) 
Then it follows that
(18a)  
(18b)  
(18c) 
where (18a) uses (17), (18b) uses (16), (18c) uses (15) and (13).
Proof of (b). We verify that is a continuous mapping within by showing that for any , , as , where and .
Let , , and
Then
and hence
(19) 
By the Davis-Kahan theorem [4], we have
(20) 
Letting , we know that since is continuous. Then it follows from (19) and (20) that
Therefore, .
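The two classical tools used in this subsection can be illustrated numerically on a fixed (eigenvector-independent) Hermitian matrix. The sketch below checks Weyl's inequality $|\lambda_j(A+E) - \lambda_j(A)| \le \|E\|_2$ and one common variant of the Davis-Kahan $\sin\Theta$ bound, $\|\sin\Theta\|_2 \le \|E\|_2/\delta$ with $\delta = \lambda_{k+1}(A+E) - \lambda_k(A) > 0$. This variant, the real symmetric test matrix, and all variable names are assumptions of the example; it is not claimed to be the exact form used in (17) or (20).

```python
import numpy as np

rng = np.random.default_rng(0)
n, k = 8, 3

# Symmetric A with a clear gap between its k-th and (k+1)-th eigenvalues.
Q, _ = np.linalg.qr(rng.standard_normal((n, n)))
lam = np.array([0.0, 1.0, 2.0, 10.0, 11.0, 12.0, 13.0, 14.0])
A = Q @ np.diag(lam) @ Q.T
A = (A + A.T) / 2  # remove rounding asymmetry

# Symmetric perturbation E, scaled so that ||E||_2 = 0.1.
E = rng.standard_normal((n, n))
E = (E + E.T) / 2
E *= 0.1 / np.linalg.norm(E, 2)
norm_E = np.linalg.norm(E, 2)

w, U = np.linalg.eigh(A)        # ascending eigenvalues of A
wt, Ut = np.linalg.eigh(A + E)  # ascending eigenvalues of A + E
V, Vt = U[:, :k], Ut[:, :k]     # bases for the k smallest eigenvalues

# Weyl: every eigenvalue moves by at most ||E||_2.
max_shift = np.max(np.abs(wt - w))

# Largest canonical angle between span(V) and span(Vt).
s = np.linalg.svd(V.T @ Vt, compute_uv=False)
sin_theta = np.sqrt(max(0.0, 1.0 - s.min() ** 2))

# sin(Theta) bound with the mixed gap delta.
delta = wt[k] - w[k - 1]
dk_bound = norm_E / delta

print("Weyl:", max_shift, "<=", norm_E)
print("sin Theta:", sin_theta, "<=", dk_bound)
```

In the NEPv setting the matrix itself depends on the subspace, so bounds of this kind are applied with the perturbation replaced by the change in the matrix function between two candidate solutions, which is where the local Lipschitz-type constants of (10) and (11) enter.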
2.2 Approach two
In this subsection, we use another approach to answer questions Q1 and Q2, which is inspired by J.G. Sun’s technique, see e.g., [8, 16, 17, 18].
Theorem 2.3
Proof: Let be a unitary matrix such that
(27) 
where
Then the perturbed NEPv (3) has a solution if and only if there exists a unitary matrix such that
(28) 
where is Hermitian and its eigenvalues are the smallest eigenvalues of .
Without loss of generality (footnote 1: Note that and thus . By the CS decomposition [15, Chapter 1, Theorem 5.1], we know that there exist unitary matrices and such that . Rewrite . Then (28) still holds, and (29) follows immediately by setting .), we let
(29) 
where is a parameter matrix, and are arbitrary unitary matrices. Substituting (29) into (28), we get
(30a)  
(30b)  
(30c) 
where
(31) 
Then the perturbed NEPv (3) has a solution if and only if
(a) there exists such that (30c) holds;
(b) .
Next, we first prove (a) then (b).
Note that since defined in (12) is positive, is an invertible linear operator with
(33) 
Therefore, we may define a mapping as
(34) 
By (29), we have
(35) 
Then it follows from (32), (16) and (35) that
(36) 
Denote
Note that is a nonempty bounded closed convex set, defined in (34) is a continuous mapping, and for any , by (36) and (25), it holds
i.e., maps into itself. So by Brouwer's fixed-point theorem [7], has a fixed point in . In other words, (30c) has a solution . This completes the proof of (a).
Proof of (b). If
(37) 
then by the Weyl theorem [15], we have . Consequently,
Therefore, we only need to show (37), under the assumption .
Let the singular value decomposition (SVD) of be , where has orthonormal columns, , , , and is unitary. Let for , , . Then using (30a), (38), (39), we have
(40) 
Similarly,
(41) 
Direct calculations give rise to