The inexact residual iteration method for the quadratic eigenvalue problem and its convergence analysis
Abstract
In this paper, we first establish convergence criteria for the residual iteration method for solving quadratic eigenvalue problems. We analyze the impact of the shift point and the subspace expansion on the convergence of this method. In the process of expanding the subspace, this method needs to solve a linear system at every step. For large-scale problems in which the equations cannot be solved directly, we propose an inner-outer iteration version of the residual iteration method. The new method uses an iterative method to solve the equations and uses the approximate solution to expand the subspace. We analyze the relationship between the inner and outer iterations and provide a quantitative criterion for the inner iteration which ensures the convergence of the outer iteration. Finally, our numerical experiments support our analysis and demonstrate the effectiveness of the inexact residual iteration method.
1 Introduction
The quadratic eigenvalue problem (QEP) is to find scalars $\lambda$ and nonzero vectors $x, y$ satisfying

(1) $Q(\lambda)x = (\lambda^2 M + \lambda C + K)x = 0,$

(2) $y^* Q(\lambda) = y^*(\lambda^2 M + \lambda C + K) = 0,$

where $M, C, K \in \mathbb{C}^{n \times n}$, and $x, y$ are the right and left eigenvectors, respectively, corresponding to the eigenvalue $\lambda$. It has many important applications, including constrained least squares problems, fluid mechanics, circuit simulation, and structural mechanics. In [1], Tisseur and Meerbergen systematically summarized and reviewed the QEP.
There are two major classes of numerical methods for solving large QEPs. The first is to linearize the QEP into an equivalent generalized eigenvalue problem (GEP), such as [5]

(3) $Az = \lambda Bz,$

where

$A = \begin{pmatrix} -C & -K \\ I & 0 \end{pmatrix}, \quad B = \begin{pmatrix} M & 0 \\ 0 & I \end{pmatrix}, \quad z = \begin{pmatrix} \lambda x \\ x \end{pmatrix}.$

When $M$ is nonsingular, we can transform it into an equivalent standard eigenvalue problem (SEP)

(4) $B^{-1}A\,z = \lambda z.$

Then we can obtain the eigenpair $(\lambda, x)$ of (1) from the eigenpairs of (3) or (4). There are also other linearization forms [1, 6].
The biggest advantage of this approach is that it can use all the theoretical and numerical results for standard and generalized eigenvalue problems. There are many well-developed methods available, for example, the rational Krylov method [7], the shift-and-invert Arnoldi method [6, 8], and the Jacobi-Davidson method [4, 9]. The disadvantages of this approach are also obvious. First, the size of the linearization is twice that of the original QEP, which significantly increases the computational and storage requirements. Second, this approach can encounter stability problems [11].
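As a concrete illustration, the linearization route can be sketched in a few lines of NumPy. This is a minimal sketch using a standard first companion form, assuming $M$ is nonsingular so the GEP reduces to an SEP; a production code would use a structured GEP solver instead.

```python
import numpy as np

def qep_eigs_linearized(M, C, K):
    """Solve (lam^2 M + lam C + K) x = 0 via a first companion
    linearization A z = lam B z with z = [lam*x; x], reduced to a
    standard eigenvalue problem (assumes M nonsingular)."""
    n = M.shape[0]
    A = np.block([[-C, -K], [np.eye(n), np.zeros((n, n))]])
    B = np.block([[M, np.zeros((n, n))], [np.zeros((n, n)), np.eye(n)]])
    lams, Z = np.linalg.eig(np.linalg.solve(B, A))  # SEP B^{-1} A z = lam z
    X = Z[n:, :]  # the lower block of z carries the QEP eigenvector x
    return lams, X

# tiny check: M = I, C = 0, K = -diag(1, 4) has eigenvalues +-1, +-2
lams, _ = qep_eigs_linearized(np.eye(2), np.zeros((2, 2)), -np.diag([1.0, 4.0]))
```

Note the doubled size $2n$ of the linearized problem, which is exactly the storage and cost drawback mentioned above.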
The second class of methods for large QEPs consists of direct projection methods. These methods project the large QEP directly onto a subspace to obtain a smaller QEP. They preserve the structure of the original problem and have better numerical stability. Some direct projection methods are the residual iteration method [12, 13, 14] and the Jacobi-Davidson method [4, 9, 10, 16].
The main difficulty with these methods is the lack of theoretical basis. QEPs are an important type of nonlinear eigenvalue problem that is less familiar and less routinely solved than the SEP and the GEP. An $n$-dimensional QEP can have $2n$ eigenvalues (when $M$ is singular, the number of finite eigenvalues is less than $2n$) and eigenvectors. The SEP has a Schur decomposition and the GEP has a generalized Schur decomposition, but the QEP has no such form. For the direct projection methods, the properties of the projection subspace have not been thoroughly analyzed, and we do not know precisely what information the subspace contains. Therefore, there is no complete convergence analysis of these algorithms. Another disadvantage is that most methods require a matrix inversion at each iteration; while the convergence is fast, the computational cost per iteration is high.
In this paper, we first analyze the convergence of the residual iteration method. We characterize the subspace constructed by the residual iteration process and establish the relationship between the subspace and the desired eigenvectors. We analyze the impact of the shift selection and the subspace expansion on how well the subspace captures the desired eigenvectors. In the process of expanding the subspace, this method needs to solve a linear system at every step. For large-scale problems in which the equations cannot be solved directly, we propose an inner-outer iteration version of the residual iteration method. The new method uses an iterative method to solve the equations and uses the approximate solution to expand the subspace. We establish the relationship between the inner and outer iterations and give a quantitative criterion for the inner iteration which ensures the convergence of the outer iteration.
The remainder of this paper is organized as follows. In Section 2, we introduce the quadratic residual iteration method and analyze its convergence. In Section 3, we propose an inexact residual iteration method and give a quantitative convergence analysis of the method. In Section 4, several numerical experiments are presented to verify the results of this paper.
Throughout the paper, we denote by $\|\cdot\|$ the 2-norm of a vector or matrix, by $I$ the identity matrix with its order clear from the context, and by the superscript $*$ the conjugate transpose of a vector or matrix. We measure the distance between a nonzero vector $x$ and a subspace $\mathcal{V}$ by

$\operatorname{dist}(x, \mathcal{V}) = \|(I - P_{\mathcal{V}})x\| / \|x\| = \|V_\perp^* x\| / \|x\|,$

where $P_{\mathcal{V}}$ is the orthogonal projector onto $\mathcal{V}$ and the columns of $V_\perp$ form an orthonormal basis of the orthogonal complement of $\mathcal{V}$.
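This distance can be evaluated directly from an orthonormal basis of the subspace via the projector form; a minimal sketch (the function name is illustrative):

```python
import numpy as np

def dist_to_subspace(x, V):
    """Distance between the direction of a nonzero vector x and span(V),
    where the columns of V are orthonormal: ||(I - V V^*) x|| / ||x||."""
    x = x / np.linalg.norm(x)
    return np.linalg.norm(x - V @ (V.conj().T @ x))
```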
2 Convergence analysis of the quadratic residual iteration method
In this section, we first introduce the quadratic residual iteration method. Then we analyze the convergence property of the method. This provides a theoretical basis for further improving this type of method.
The original residual iteration method can be motivated by Newton's method. If we rewrite the quadratic eigenvalue problem as the nonlinear equation

$Q(\lambda)x = (\lambda^2 M + \lambda C + K)x = 0,$ together with a normalization condition on $x$,

and solve it by Newton's method, then we obtain the following method.
Method 1: One step quadratic residual iteration method

1. choose an initial approximation $\lambda_0$ and a unit vector $x_0$.
2. for $k = 0, 1, 2, \dots$ do:
3. compute $y_k = Q(\lambda_k)^{-1} (2\lambda_k M + C)\, x_k$.
4. set $x_{k+1} = y_k / \|y_k\|$.
5. compute $\lambda_{k+1}$ from $x_{k+1}$.
6. end for

where $Q(\lambda) = \lambda^2 M + \lambda C + K$. This is a very elementary method which can only find one eigenvalue and eigenvector. It is similar to the Rayleigh quotient iteration method for the SEP. By the nature of Newton-type iterations, it converges quickly once it starts to converge, but the result depends on the initial values $\lambda_0, x_0$; under certain conditions the method may converge to a wrong eigenpair. In [12], some of these deficiencies are addressed, resulting in an improved method. Each step of Method 1 needs one matrix inversion (a linear solve with $Q(\lambda_k)$) and one extra matrix-vector product $(2\lambda_k M + C)x_k$.
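Method 1 can be sketched as follows. The eigenvalue update used here, taking the root of the scalar quadratic $x^* Q(\lambda) x = 0$ nearest the current iterate, is one natural Rayleigh-functional choice and is an assumption filling in details the text leaves implicit.

```python
import numpy as np

def qep_residual_iteration(M, C, K, lam0, x0, tol=1e-10, maxit=50):
    """One-vector residual (Newton-type) iteration for
    (lam^2 M + lam C + K) x = 0."""
    x = x0 / np.linalg.norm(x0)
    lam = lam0
    for _ in range(maxit):
        Q = lam**2 * M + lam * C + K
        if np.linalg.norm(Q @ x) < tol:      # residual small: converged
            break
        # inverse-iteration-style step: y = Q(lam)^{-1} Q'(lam) x
        y = np.linalg.solve(Q, (2 * lam * M + C) @ x)
        x = y / np.linalg.norm(y)
        # Rayleigh functional: root of a*lam^2 + b*lam + c closest to lam
        a, b, c = x.conj() @ (M @ x), x.conj() @ (C @ x), x.conj() @ (K @ x)
        roots = np.roots([a, b, c])
        lam = roots[np.argmin(np.abs(roots - lam))]
    return lam, x
```

Each pass performs exactly the work counted above: one linear solve with $Q(\lambda_k)$ and one extra product $(2\lambda_k M + C)x_k$.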
When we use a subspace projection method, we use the residual iteration process to expand the subspace [18] as follows:
Method 2: Subspace quadratic residual iteration method (QRI)

1. set shift $\sigma$, initial vector $v_1$ with $\|v_1\| = 1$, and convergence tolerance $\varepsilon$.
2. for $k = 1, 2, \dots$ do:
3. set $V_k = [v_1, \dots, v_k]$.
4. project the large QEP (1) onto the subspace $\mathcal{V}_k = \operatorname{span}(V_k)$ and solve the smaller QEP
$(\theta^2 V_k^* M V_k + \theta V_k^* C V_k + V_k^* K V_k)\, g = 0$
to get the eigenvalues $\theta_i$ and eigenvectors $g_i$.
5. compute the Ritz vectors $u_i = V_k g_i$ as approximate eigenvectors.
6. compute the residuals $r_i = (\theta_i^2 M + \theta_i C + K)\, u_i$.
7. select the first non-converged residual $r = r_i$ with $\|r_i\| > \varepsilon$.
8. compute $t = Q(\sigma)^{-1} r$.
9. get $v_{k+1}$ from orthogonalizing $t$ against $V_k$.
10. end for
This method projects the large QEP onto the subspace directly at step 4. It constructs the subspace by using the residual iteration process to generate a new vector at each step. At step 7, if the first residual has converged, we can use the second residual to expand the subspace, so this method can compute more than one eigenpair. In Method 1, the new vector is $Q(\lambda_k)^{-1}(2\lambda_k M + C)x_k$; if we used it to expand the subspace in Method 2, it would need one more matrix-vector product. We can drop the factor $(2\lambda_k M + C)$ and use a fixed shift $\sigma$ in Method 2. In order to get an orthonormal basis of the projection subspace, we orthogonalize $t$ against $V_k$ at step 9. At step 8, when the matrix size is small, the expanding vector $t = Q(\sigma)^{-1} r$ can be computed directly; usually, $t$ is obtained by solving the linear system $Q(\sigma)t = r$.
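The whole of Method 2 can be sketched compactly. This is an illustrative implementation, not the paper's exact pseudocode: it targets a single eigenpair nearest the shift, solves the projected QEP by its companion linearization, and expands with $Q(\sigma)^{-1}r$ followed by reorthogonalization.

```python
import numpy as np

def subspace_qri(M, C, K, sigma, v0, tol=1e-8, maxit=30):
    """Sketch of Method 2 for one eigenpair nearest the shift sigma.
    The projected QEP is solved via its companion linearization."""
    n = M.shape[0]
    V = (v0 / np.linalg.norm(v0)).reshape(n, 1)
    Qs = sigma**2 * M + sigma * C + K        # Q(sigma); factor once in practice
    theta, u = sigma, V[:, 0]
    for _ in range(maxit):
        m = V.shape[1]
        Mk, Ck, Kk = (V.conj().T @ X @ V for X in (M, C, K))
        A = np.block([[-Ck, -Kk], [np.eye(m), np.zeros((m, m))]])
        B = np.block([[Mk, np.zeros((m, m))], [np.zeros((m, m)), np.eye(m)]])
        lams, Z = np.linalg.eig(np.linalg.solve(B, A))  # projected eigenpairs
        idx = np.argmin(np.abs(lams - sigma))           # Ritz value nearest sigma
        theta = lams[idx]
        u = V @ Z[m:, idx]                              # Ritz vector
        u = u / np.linalg.norm(u)
        r = (theta**2 * M + theta * C + K) @ u          # residual
        if np.linalg.norm(r) < tol:
            break
        t = np.linalg.solve(Qs, r)                      # t = Q(sigma)^{-1} r
        t = t - V @ (V.conj().T @ t)                    # orthogonalize against V
        t = t - V @ (V.conj().T @ t)                    # reorthogonalize
        V = np.hstack([V, (t / np.linalg.norm(t)).reshape(n, 1)])
    return theta, u
```

Since the shift is fixed, $Q(\sigma)$ can be factored once outside the loop, so each expansion costs only one pair of triangular solves.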
Jia and Sun proposed a refined residual iteration method [15], introducing the idea of refined projection to the quadratic eigenvalue problem. For computing the approximate eigenvector corresponding to the Ritz value $\theta$ at step 5, the refined vector $u$ satisfies

(5) $\|Q(\theta)\,u\| = \min_{v \in \mathcal{V}_k,\ \|v\| = 1} \|Q(\theta)\,v\|.$

This method admits further changes and modifications; for example, we can use a different shift at each step, or use a different vector to expand the subspace. So this method is closely related to the Jacobi-Davidson method [4, 9, 10, 16]. All these methods have shown good convergence behavior; however, their convergence has not been thoroughly analyzed. According to the properties of the residual iteration method for the SEP, this method easily finds the eigenvalues close to the target point $\sigma$, so a beneficial shift can accelerate the convergence of the method. For a given target point $\sigma$, let $\lambda$ be the closest eigenvalue and $x$ the corresponding eigenvector; then $(\lambda, x)$ is the eigenpair the subspace should capture. We can use the distance between $x$ and the subspace, or the angle between them, to measure the convergence of the method.
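Assuming (5) is the usual refined-projection condition, the refined vector is the minimizer of $\|Q(\theta)v\|$ over unit vectors in the subspace, which can be computed from a small SVD; a minimal sketch:

```python
import numpy as np

def refined_ritz_vector(M, C, K, theta, V):
    """Refined projection: among unit vectors in span(V) (columns of V
    orthonormal), return u minimizing ||Q(theta) u||. The minimizer is
    V g, where g is the right singular vector of Q(theta) V associated
    with the smallest singular value."""
    QV = (theta**2 * M + theta * C + K) @ V
    _, _, Wh = np.linalg.svd(QV, full_matrices=False)
    g = Wh[-1].conj()  # right singular vector for the smallest singular value
    return V @ g
```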
Let $\mathcal{V}_k$ be a subspace produced by Method 2, with the columns of $V_k$ forming an orthonormal basis of $\mathcal{V}_k$. Let $P_k = V_k V_k^*$ be the orthogonal projector onto $\mathcal{V}_k$. When we get the expansion vector $t = Q(\sigma)^{-1} r$, where $r$ is the residual, define $w = (I - P_k)t$; the expanded subspace can be written as $\mathcal{V}_{k+1} = \mathcal{V}_k \oplus \operatorname{span}\{v_{k+1}\}$, where $v_{k+1} = w / \|w\|$.
For the effect of one subspace expansion step, we have the following result:
Theorem 1 Let and where the subspace is computed by Method 2, and is the desired eigenpair. Suppose , and ; then we have:
(6) 
Proof
Because
and . Then we have
(7)  
Finally, we get
Since , we know that the value represents the ability of the method to obtain information from outside the current subspace. At the th step, we set , and then we get the following result
(8) 
and
This means that the subspace can contain more information about the desired eigenvector through expansion. The value determines the effect of each expansion, so it should be as small as possible, and we can use its upper bound to estimate the convergence rate. To give this bound, we first need the following lemma.
Assume that is a generalized linear form of ; then we have [1]

(9) 

Lemma [15] Let the diagonal elements of be the eigenvalues of , and let , be the corresponding right and left eigenvectors. Assume that is not an eigenvalue of and ; then we have
(10) 
For the convergence rate of Method 2, we have the following result.
Theorem 2 Let , , , ,() be the same as the earlier definitions. For a given shift , is the closest eigenvalue and is the corresponding eigenvector, is the residual, and
Let
Then we have
(11) 
Proof
From
we know
then
so
Since , , then it holds that
(12) 
Finally we get
(13) 
We can use this result to show the convergence of both Method 1 and Method 2. When we use a fixed shift in Method 1, we get a constant matrix, and Method 1 becomes the power method applied to that matrix. Let be the eigenvalues of . Then we can get the eigenpair by the power method. If is the nearest eigenvalue to , then is near , but there must be a constant gap between and . We cannot find the exact eigenvector from a constant matrix. In Method 1, we therefore use a constantly changing shift to get a changing matrix .
We have two ways to achieve convergence. The first is to use a better shift at each iteration; Method 1 works this way. At each iteration, we use the new shift to get a new approximate eigenvector . From (13), we have
(14) 
When converges to , . We can get a better from ; similarly, a better gives a better .
The second way to achieve convergence is subspace expansion. Now we use a fixed shift in the iterative process and obtain a better approximate eigenvector in the expanded subspace with factor . Combining (8) and (13), we obtain an estimate of the total convergence rate of Method 2,
(15) 
Assuming a moderate value at each iteration, the convergence rate is determined by the value . Its size depends on the distribution of the eigenvalues and the selection of the shift . So a good shift can accelerate the convergence of the method.
3 Inexact quadratic residual iteration method
The good convergence of the quadratic residual iteration method mainly benefits from the special expanding vector . When we compute the expanding vector at step 8 of Method 2, we do not form the inverse matrix explicitly; we compute by solving the corresponding linear system. Usually, this is one of the most resource-intensive parts of the method. For large-scale problems, it is hard to compute the expanding vector, even by solving the linear system. For linear eigenvalue problems, one remedy is to compute an inexact solution of the linear system. Such methods are called inexact methods or inner-outer iteration methods [14, 15].
Based on this principle, we propose the inexact iteration method for quadratic eigenvalue problems.
Method 3: Inexact quadratic residual iteration method

1. set shift $\sigma$, initial vector $v_1$ with $\|v_1\| = 1$, and convergence tolerance $\varepsilon$.
2. for $k = 1, 2, \dots$ do:
3. set $V_k = [v_1, \dots, v_k]$.
4. project the large QEP (1) onto the subspace $\mathcal{V}_k = \operatorname{span}(V_k)$ and solve the smaller QEP
$(\theta^2 V_k^* M V_k + \theta V_k^* C V_k + V_k^* K V_k)\, g = 0$
to get the eigenvalues $\theta_i$ and eigenvectors $g_i$.
5. compute the Ritz vectors $u_i = V_k g_i$ as approximate eigenvectors.
6. compute the residuals $r_i = (\theta_i^2 M + \theta_i C + K)\, u_i$.
7. select the first non-converged residual $r = r_i$ with $\|r_i\| > \varepsilon$.
8. solve the equations $Q(\sigma)\, t = r$ approximately by an iterative method (the inner iteration) to get $\tilde{t}$.
9. get $v_{k+1}$ from orthogonalizing $\tilde{t}$ against $V_k$.
10. end for
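The approximate solve at step 8 is what Section 4 performs with GMRES. As an illustration of such an inner iteration, here is a minimal unpreconditioned GMRES sketch in real arithmetic (a library routine would normally be used, with complex dtype for complex shifts); note it needs only a matrix-vector product with $Q(\sigma)$, never a factorization:

```python
import numpy as np

def gmres_inner(apply_A, b, rtol=1e-3, maxit=100):
    """Minimal unpreconditioned GMRES for A t = b (b nonzero), stopping
    when the residual norm drops below rtol * ||b||. apply_A is a
    matvec callback, so Q(sigma) never has to be formed or factored."""
    n = b.shape[0]
    beta = np.linalg.norm(b)
    V = np.zeros((n, maxit + 1))
    H = np.zeros((maxit + 1, maxit))
    V[:, 0] = b / beta
    for j in range(maxit):
        w = apply_A(V[:, j])
        for i in range(j + 1):                  # modified Gram-Schmidt
            H[i, j] = V[:, i] @ w
            w = w - H[i, j] * V[:, i]
        H[j + 1, j] = np.linalg.norm(w)
        e1 = np.zeros(j + 2)
        e1[0] = beta
        # least-squares solve of the small Hessenberg system
        y = np.linalg.lstsq(H[:j + 2, :j + 1], e1, rcond=None)[0]
        rnorm = np.linalg.norm(H[:j + 2, :j + 1] @ y - e1)
        if rnorm <= rtol * beta or H[j + 1, j] == 0:  # converged or breakdown
            break
        V[:, j + 1] = w / H[j + 1, j]
    return V[:, :j + 1] @ y
```

Calling `gmres_inner(lambda v: Qs @ v, r, rtol=tol)` with a loose `rtol` (such as the 1e-3 used in the experiments) yields the inexact expansion vector $\tilde{t}$.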
The difference between Method 2 and Method 3 is at step 8. Method 2 computes an exact solution, while Method 3 only needs an approximate one. For large-scale problems, it is difficult to solve the equations exactly, so it is practical to compute an approximate solution, normally by an iterative method. We call this the inner iteration, and we call the iteration of Method 3 itself the outer iteration. The main difficulty of the method is to determine the accuracy required of the approximate solution: we must find a balance between the outer and inner iterations. Let $t$ be the exact solution of $Q(\sigma)t = r$ and $\tilde{t}$ be the approximate solution. Then the relative error between them is
(16) 
Then we can write
where is the normalized error direction vector.
So we get:
(17) 
where
(18) 
Define
(19) 
and
(20) 
where and are the normalized subspace expansion vectors in the inexact and exact methods, respectively. We can measure the difference between and by or . The relationship between and is [17],
(21) 
The relationship between and is,
(22) 
Then we obtain:
(23) 
For the standard eigenvalue problem, the convergence properties of the exact method are known, and the convergence of the inexact method can be proved by showing that mimics . According to the above relationships, we can show that is a good imitation of under a moderately accurate inner iteration. But these convergence results are more or less based on prior knowledge of the eigenvalues. For the quadratic eigenvalue problem, we need to analyze the requirements on the inner iteration directly from the convergence of the inexact method.
The convergence condition of Method 3 is , where is the desired eigenpair. According to the conclusion , it is equivalent to

(24) 

According to the relationships among these quantities, the convergence can be analyzed through the value or .
For Method 3, we can regard the approximate solution as the exact solution of the following perturbed equation

(25) 

where is the perturbation matrix of .
Lemma 1 If , the approximate solution and the exact solution have the following relationship
(26) 
Proof
For a matrix and the corresponding identity matrix , if , then is invertible [19] and
(27) 
Now, we can get
(28)  
We apply formula (27) to and ignore higher-order terms to obtain
Then we obtain the relationship between and .
For the convergence condition of Method 3, we have the following result.
Theorem 4 Suppose is the desired eigenpair, and , then the angle between the inexact solution and the desired eigenvector satisfies
Proof
From
we can get
(29) 
When , we can ignore the small value . Then the formula (29) can be written as
(30) 
We can rewrite this formula as
(31) 
Similarly, the formula (26) also can be written as
(32) 
Therefore, combining the last relation with (31) establishes
(33) 
Then we can get
From the following relationship
(34) 
we can finish the proof.
From Theorem 4, we know that the inexact method can have fast convergence when the difference between and is very small. Because and , these two values are always close, regardless of whether they are large or small.
When is a very small value, both the angle between and and that between and are small. So the exact residual iteration method can have a fast convergence rate. For the inexact method, if the inner iteration is not very precise, there is a large error between and . At first glance, the convergence rate of the inexact method may not seem very fast. But a small means a small , so the angle between and is also small. That is to say, the inexact method can have a fast convergence rate in spite of the moderate accuracy of the inner iteration.
When the value is not too small, the inexact method cannot achieve a fast convergence speed even with high accuracy in the inner iteration. But the convergence of the inexact method can still be guaranteed by , where is a constant less than one. This condition is relatively easy to satisfy and is independent of the inner iteration. The analysis shows that the convergence of the inexact method is mainly determined by the shift selection and the subspace expansion, and that increasing the precision of the inner iteration brings limited benefit.
4 Numerical experiments
Several numerical experiments are presented in this section to demonstrate the effectiveness of Method 3 and the analysis above. For all examples, DIM stands for the dimension of the matrices, TOL denotes the convergence tolerance of the outer iteration, and tol is the tolerance of the inner iteration. We use GMRES for the inner iteration. TOTAL is the total CPU time in seconds, CGMRES is the number of GMRES steps, TGMRES is the CPU time of GMRES in seconds, and ITER is the number of outer iterations. We select three different values of tol in our experiments, denoted respectively by "inexact1", "inexact2", "inexact3". In the following figures, the horizontal axis is the dimension of the subspace and the vertical axis is the largest relative residual norm of the six desired eigenpairs at each subspace expansion.
Example 1 For a fixed shift , let and . We show that the eigenvector of is different from the eigenvector of the corresponding quadratic eigenvalue problem, so we cannot use the power method on to compute the eigenvector of .
We use an example from [1]. The matrices are , , and is the identity matrix. The six eigenpairs are
k  1  2  3  4  5  6
We choose ; the eigenvalue closest to is , and the corresponding eigenvector is . The eigenvalues of E are 0.735294117647062, 9.999999999999933, 0.552486187845304, and the eigenvector of the largest eigenvalue is .
We can see that this eigenvector of is different from the wanted eigenvector. This example shows that there is a difference between the eigenvectors of the shifted standard eigenvalue problem and those of the quadratic eigenvalue problem.
Example 2 This example arises from a finite element model of a linear spring in parallel with Maxwell elements (a Maxwell element is a spring in series with a dashpot) [20]. The quadratic matrix polynomial is , where the mass matrix is rank deficient and symmetric, the damping matrix is rank deficient and block diagonal, and the stiffness matrix is symmetric and has arrowhead structure. The matrices have the form:
where and are element mass and stiffness matrices, and measure the spring stiffness, and is the material density. In this example, the matrices are randomly generated by the method of reference [20]. The size of the matrices is 20000. We use the new method to compute the 6 eigenvalues closest to the shift . We select three tolerances for the inner iteration to show the effect of the inner iteration accuracy on the outer iteration. The tolerance of the outer iteration is . Table 1 and Figure 1 show the results.
Standard   tol    TOTAL   CGMRES   TGMRES   ITER
inexact1   1e-3    6.66     69      1.41     22
inexact2   1e-4    7.04     78      1.48     22
inexact3   1e-5    7.55     84      1.54     19
It can be seen from Table 1 that the method with the lowest inner iteration precision uses more outer iterations than the method with the highest precision, but its total time is less. Using high precision in the inner iteration may reduce the number of outer iterations, but it requires more inner iteration steps. So the inner iteration with low precision requires less time at each outer iteration and has an advantage in total CPU time.
Figure 1 shows the largest residual norm for the different inner tolerances at each outer iteration. They have the same convergence history in the first 14 outer iterations. The convergence with higher inner accuracy is smoother than with lower accuracy, but the method with the lowest precision reaches the desired result using only three more outer iterations. In general, whether in CPU time or iteration number, the differences between methods with different inner precisions are not very pronounced.
Example 3 This example was also tested in [20]. It comes from a finite element discretization of a 2D time-harmonic wave equation. On the unit square with mesh size , the coefficient matrices of with are given by:
where denotes the Kronecker product, , is the (possibly complex) impedance, and , . We set , and use the new method to compute the 6 eigenvalues closest to the shift . We select three tolerances for the inner iteration. The results are shown in Table 2 and Figure 2.
Standard   tol    TOTAL     CGMRES   TGMRES   ITER
inexact1   1e-3    663.27    471     291.95    63
inexact2   1e-4    972.78    672     481.89    63
inexact3   1e-5   1168.20    890     599.73    63
For a large-scale problem, a larger subspace is needed to find the desired eigenvalues, and all three methods with different inner precisions need more outer iterations. But we can see from Figure 2 that they have the same convergence history. There is no advantage in using higher precision for the inner iteration.
We can see from Table 2 that the lower the inner iteration precision, the less time each outer iteration takes, and the total time saving is considerable. So the method with the lowest inner precision is superior to the one using the highest inner precision. This is consistent with our theoretical analysis. In many practical applications, it is impossible to find a very accurate solution for the inner iteration even if we want to. So both theory and experiments show that this new method is feasible and efficient.
References
 [1] Tisseur F, Meerbergen K. The quadratic eigenvalue problem. SIAM Review, 2001, 43: 235–286
 [2] Kublanovskaya V N. On an approach to the solution of the generalized latent value problem for λ-matrices. SIAM J Numer Anal, 1970, 7: 532–537
 [3] Guo J S, Lin W W, and Wang C S. Numerical solution for large sparse quadratic eigenvalue problems. Lin Alg Appl, 1995, 225: 57–85
 [4] Sleijpen G L G, Booten A G L, Fokkema D R, et al. Jacobi-Davidson type methods for generalized eigenproblems and polynomial eigenproblems. BIT, 1996, 36: 595–633
 [5] Tisseur F. Backward error and condition of polynomial eigenvalue problems. Lin Alg Appl, 2000, 309: 339–361
 [6] Saad Y. Numerical Methods for Large Eigenvalue Problems. Algorithms and Architectures for Advanced Scientific Computing, Manchester, UK: Manchester University Press, 1992
 [7] Ruhe A. Rational Krylov: A practical algorithm for large sparse nonsymmetric matrix pencils. SIAM J Sci Comput, 1998, 19: 1535–1551
 [8] Natarajan R. An Arnoldi-based iterative scheme for nonsymmetric matrix pencils arising in finite element stability problems. J Comput Phys, 1992, 100: 128–142
 [9] Sleijpen G L G, Van der Vorst H A. A Jacobi-Davidson iteration method for linear eigenvalue problems. SIAM J Matrix Anal Appl, 1996, 17: 401–425
 [10] Sleijpen G L G, Van der Vorst H A, and van Gijzen M. Quadratic eigenproblems are no problem. SIAM News, 1996, 29: 8–9
 [11] Higham N J, Li R C, Tisseur F. Backward error of polynomial eigenproblems solved by linearization. SIAM J Matrix Anal Appl, 2007, 29: 1218–1241
 [12] Neumaier A. Residual inverse iteration for the nonlinear eigenvalue problem. SIAM J Numer Anal, 1985, 22: 914–923
 [13] Huitfeldt J, Ruhe A. A new algorithm for numerical path following applied to an example from hydrodynamical flow. SIAM J Sci Statist Comput, 1990, 11: 1181–1192
 [14] Jia Z X. Refined iterative algorithms based on Arnoldi’s process for large unsymmetric eigenproblems. Linear Algebra Appl, 1997, 259: 1–23
 [15] Jia Z X, Sun Y Q. A refined residual iteration method. AMS subject classifications, 2004, 26: 147–155
 [16] Van Gijzen M B. The parallel computation of the smallest eigenpair of an acoustic problem with damping. Internat J Numer Methods Engrg, 1999, 45: 765–777
 [17] Jia Z X, Li C. Inner iterations in the shift-invert residual Arnoldi method and the Jacobi-Davidson method. Sci China Math, 2014, 57: 1733–1752
 [18] Meerbergen K. Locking and restarting quadratic eigenvalue solvers. SIAM J Sci Comput, 2001, 22: 1814–1839
 [19] Demmel J. Applied numerical linear algebra. SIAM, Philadelphia, PA, 1997
 [20] Betcke T, Higham N J, Mehrmann V, et al. NLEVP: A collection of nonlinear eigenvalue problems. ACM TOMS, 2011, 116: 18–19
 [21] ChaitinChatelin F, Van Gijzen M B. Analysis of parameterized quadratic eigenvalue problems in computational acoustics with homotopic deviation theory. Numer Linear Algebra Appl, 2006, 13: 487–512