The inexact residual iteration method for quadratic eigenvalue problem and the analysis of convergence


Liu Yang, Yuquan Sun (corresponding author, sunyq@buaa.edu.cn), Fanghui Gong
LMIB & School of Mathematics and Systems Science, BeiHang University, Beijing, China, 100191.
Abstract

In this paper, we first establish convergence criteria for the residual iteration method for solving quadratic eigenvalue problems. We analyze the impact of the shift point and of the subspace expansion on the convergence of this method. In the process of expanding the subspace, the method needs to solve a linear system at every step. For large-scale problems in which these equations cannot be solved directly, we propose an inner-outer iteration version of the residual iteration method. The new method solves the equations with an iterative method and uses the approximate solution to expand the subspace. We analyze the relationship between the inner and outer iterations and provide a quantitative criterion for the inner iteration that ensures the convergence of the outer iteration. Finally, our numerical experiments support the analysis and demonstrate the effectiveness of the inexact residual iteration method.

1 Introduction

The quadratic eigenvalue problem (QEP) is to find scalars $\lambda$ and nonzero vectors $x, y$ satisfying

$Q(\lambda)x = (\lambda^{2}M + \lambda C + K)x = 0,$   (1)

$y^{H}Q(\lambda) = y^{H}(\lambda^{2}M + \lambda C + K) = 0,$   (2)

where $M, C, K \in \mathbb{C}^{n\times n}$, and $x, y$ are the right and left eigenvectors, respectively, corresponding to the eigenvalue $\lambda$. The QEP has many important applications, including least squares problems with constraints, fluid mechanics, circuit simulation, and structural mechanics. In [1], Tisseur and Meerbergen systematically summarized and reviewed the QEP.

There are two major classes of numerical methods for solving large QEPs. The first is to linearize the QEP into an equivalent generalized eigenvalue problem (GEP) such as [5]

$Az = \lambda Bz,$   (3)

where

$A = \begin{pmatrix} 0 & I \\ -K & -C \end{pmatrix}, \quad B = \begin{pmatrix} I & 0 \\ 0 & M \end{pmatrix}, \quad z = \begin{pmatrix} x \\ \lambda x \end{pmatrix}.$

When $B$ is invertible, we can transform it into an equivalent standard eigenvalue problem (SEP)

$B^{-1}Az = \lambda z.$   (4)

Then we can obtain the eigenpairs of $Q(\lambda)$ from the eigenpairs of $(A, B)$ or of $B^{-1}A$. There are also other linearization forms [1, 6].
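As a concrete illustration of the linearization (3)-(4), the following short Python sketch (not part of the original paper; the function name and the use of SciPy are our choices) solves a small dense QEP through the companion form above and checks the residual of one computed eigenpair.

```python
# A minimal sketch (assuming Q(l) = l^2*M + l*C + K) of solving a small dense
# QEP via the companion linearization A z = lambda B z of (3).
import numpy as np
from scipy.linalg import eig

def qep_by_linearization(M, C, K):
    n = M.shape[0]
    I, Z = np.eye(n), np.zeros((n, n))
    A = np.block([[Z, I], [-K, -C]])   # first companion form
    B = np.block([[I, Z], [Z, M]])
    lam, W = eig(A, B)                 # solve the GEP (3)
    X = W[:n, :]                       # the top block of z = [x; lambda*x] carries x
    return lam, X

# quick check on a random 4-by-4 problem
rng = np.random.default_rng(0)
M, C, K = (rng.standard_normal((4, 4)) for _ in range(3))
lam, X = qep_by_linearization(M, C, K)
x = X[:, 0] / np.linalg.norm(X[:, 0])
print(np.linalg.norm((lam[0]**2 * M + lam[0] * C + K) @ x))  # close to machine precision
```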

The biggest advantage of this approach is that it can exploit all the theoretical and numerical results for standard and generalized eigenvalue problems. Many well-developed methods are available, for example the rational Krylov method [7], the shift-and-invert Arnoldi method [6, 8], and the Jacobi-Davidson method [4, 9]. The disadvantages of this approach are also obvious. First, the size of the linearization is twice that of the original QEP, which results in a significant increase in computational and storage requirements. Second, this approach may encounter stability problems [11].

The second class of methods for solving large QEPs consists of direct projection methods. These methods project the large QEP directly onto a subspace to obtain a smaller QEP. They preserve the structure of the original problem and have better numerical stability. Direct projection methods include the residual iteration method [12, 13, 14] and the Jacobi-Davidson method [4, 9, 10, 16].

The main difficulty with these methods is the lack of a theoretical basis. QEPs are an important type of nonlinear eigenvalue problem that is less familiar and less routinely solved than the SEP and the GEP. An $n$-dimensional QEP has $2n$ eigenvalues (when $M$ is singular, the number of finite eigenvalues is less than $2n$) and up to $2n$ eigenvectors. The SEP has a Schur decomposition and the GEP has a generalized Schur decomposition, but the QEP has no such form. For the direct projection methods, the properties of the projection subspace have not been thoroughly analyzed, and it is not clear exactly what information the subspace contains. Therefore, there is no complete convergence analysis of these algorithms. Another disadvantage is that most of these methods require a matrix inversion at each iteration; while the convergence is fast, the computational cost is large.

In this paper, we first analyze the convergence of the residual iteration method. We describe the properties of the subspace constructed by the residual iteration process and establish the relationship between this subspace and the desired eigenvectors. We analyze the impact of the shift selection and of the subspace expansion on how well the subspace captures the desired eigenvectors. In the process of expanding the subspace, the method needs to solve a linear system at every step. For large-scale problems in which these equations cannot be solved directly, we propose an inner-outer iteration version of the residual iteration method. The new method solves the equations with an iterative method and uses the approximate solution to expand the subspace. We establish the relationship between the inner and outer iterations and give a quantitative criterion for the inner iteration that ensures the convergence of the outer iteration.

The remainder of this paper is arranged as follows. In Section 2, we introduce the quadratic residual iteration method and analyze its convergence. In Section 3, we propose an inexact residual iteration method and give a quantitative convergence analysis of it. In Section 4, several numerical experiments are presented to verify the results of this paper.

Throughout the paper, $\|\cdot\|$ denotes the 2-norm of a vector or matrix, $I$ denotes the identity matrix with its order clear from the context, and the superscript $H$ denotes the conjugate transpose of a vector or matrix. We measure the distance between a nonzero vector $x$ and a subspace $\mathcal{W}$ by

$\mathrm{dist}(x, \mathcal{W}) = \|(I - P_{\mathcal{W}})x\| / \|x\| = \|W_{\perp}^{H}x\| / \|x\|,$

where $P_{\mathcal{W}}$ is the orthogonal projector onto $\mathcal{W}$ and the columns of $W_{\perp}$ form an orthonormal basis of the orthogonal complement of $\mathcal{W}$.
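The sketch below (our illustration, not from the paper) shows how this distance can be evaluated when an orthonormal basis V of the subspace is available.

```python
# Distance between a nonzero vector x and span(V), with V having orthonormal
# columns: dist(x, span(V)) = ||(I - V V^H) x|| / ||x||.
import numpy as np

def dist_to_subspace(x, V):
    x = x / np.linalg.norm(x)
    return np.linalg.norm(x - V @ (V.conj().T @ x))

rng = np.random.default_rng(1)
V, _ = np.linalg.qr(rng.standard_normal((50, 5)))  # orthonormal basis of a 5-dimensional subspace
x = rng.standard_normal(50)
print(dist_to_subspace(x, V))  # 0 would mean x already lies in span(V)
```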

2 Convergence analysis of the quadratic residual iteration method

In this section, we first introduce the quadratic residual iteration method and then analyze its convergence properties. This provides a theoretical basis for further improving this type of method.

For the most basic residual iteration method, we can find inspiration in Newton's method. If we transform the quadratic eigenvalue problem into a nonlinear system of equations

and solve it with Newton's method, we obtain the following method.

Method 1: One step quadratic residual iteration method

  1. for do:

  2. .

  3. .

  4. .

  5. end for

where $Q(\lambda) = \lambda^{2}M + \lambda C + K$. This is a very elementary method which can find only one eigenvalue and eigenvector. It is similar to the Rayleigh quotient iteration for the SEP. Owing to the characteristics of Newton's method, it converges quickly once it starts to converge, but the result is affected by the initial value, and under certain conditions the method may converge to a wrong eigenpair. In [12], some of these deficiencies are addressed, resulting in an improved method. Each step of Method 1 requires one matrix inversion and one additional matrix-vector product.
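Since the individual steps of Method 1 are Newton-type updates, the following Python sketch shows one common realization of such a one-step iteration: the eigenvector update uses one solve with Q(lambda_k) and one extra matrix-vector product with Q'(lambda_k), and lambda is updated from the scalar quadratic x^H Q(lambda) x = 0 (a Rayleigh-functional-type step). These concrete formulas are our illustration and are not claimed to be the paper's exact Method 1.

```python
# Hypothetical one-step Newton/inverse-iteration sketch for the QEP:
#   x_{k+1} ~ Q(lam_k)^{-1} Q'(lam_k) x_k,  lam_{k+1} from x^H Q(lam) x = 0.
import numpy as np

def one_step_iteration(M, C, K, lam0, x0, steps=20):
    x, lam = x0 / np.linalg.norm(x0), lam0
    for _ in range(steps):
        Q = lam**2 * M + lam * C + K
        y = (2 * lam * M + C) @ x            # the extra matrix-vector product Q'(lam) x
        x = np.linalg.solve(Q, y)            # the "matrix inversion" Q(lam)^{-1}
        x = x / np.linalg.norm(x)
        a = x.conj() @ (M @ x)
        b = x.conj() @ (C @ x)
        c = x.conj() @ (K @ x)
        roots = np.roots([a, b, c])          # scalar quadratic a*lam^2 + b*lam + c = 0
        lam = roots[np.argmin(np.abs(roots - lam))]  # keep the root closest to the current lam
    return lam, x
```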

When we use the subspace projection method, we use the residual iteration process to expand the subspace[18] as follows:

Method 2: Subspace quadratic residual iteration method (QRI)

  1. set shift and initial vector , and convergence tolerance .

  2. for do:

  3. set .

  4. project the large QEP (1) onto subspace and solve the smaller QEP

    to get the eigenvalue and eigenvector .

  5. compute Ritz vector as approximate eigenvector.

  6. compute the residual .

  7. select the first non-convergence .

  8. compute .

  9. get from .

  10. end for

This method projects the large QEP onto the subspace directly at step 4. It constructs the subspace by using the residual iteration process to generate a new vector at each step. At step 7, if the first residual has already converged, we use the second residual to expand the subspace, so this method can compute more than one eigenpair. In Method 1, the new vector requires an extra matrix-vector product; if we used it to expand the subspace in Method 2, that extra product would be needed as well. We can remove this term and use a fixed shift in Method 2. In order to get an orthonormal basis of the projection subspace, the new vector should be orthogonalized against the current basis at step 9. At step 8, when the matrix size is small, the expansion vector can be computed directly; usually, it is obtained by solving the linear system $Q(\sigma)t = r$.
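The following Python sketch outlines one outer step of Method 2 under the assumption stated above that the expansion vector is obtained from the linear system Q(sigma) t = r (here solved with a sparse LU factorization computed once); the projected small QEP is solved through its companion linearization. The function names are ours, not the paper's.

```python
# A minimal sketch of one expansion step of Method 2 (our illustration).
import numpy as np
import scipy.sparse.linalg as spla   # used in the usage sketch below
from scipy.linalg import eig

def projected_ritz_pair(M, C, K, V, sigma):
    """Project the QEP onto span(V) and return the Ritz pair closest to sigma."""
    Mh, Ch, Kh = (V.conj().T @ (X @ V) for X in (M, C, K))
    m = V.shape[1]
    Im, Zm = np.eye(m), np.zeros((m, m))
    theta, Z = eig(np.block([[Zm, Im], [-Kh, -Ch]]), np.block([[Im, Zm], [Zm, Mh]]))
    j = np.argmin(np.abs(theta - sigma))
    y = Z[:m, j] / np.linalg.norm(Z[:m, j])
    return theta[j], V @ y                       # Ritz value and Ritz vector (steps 4-5)

def expand_once(M, C, K, V, sigma, lu):
    theta, u = projected_ritz_pair(M, C, K, V, sigma)
    r = (theta**2 * M + theta * C + K) @ u       # residual (step 6)
    t = lu.solve(r)                              # t = Q(sigma)^{-1} r (step 8)
    t = t - V @ (V.conj().T @ t)                 # orthogonalize against V (step 9)
    return np.hstack([V, (t / np.linalg.norm(t))[:, None]]), theta, np.linalg.norm(r)

# usage sketch: factor Q(sigma) once, then expand until the residual is small enough
# lu = spla.splu((sigma**2 * M + sigma * C + K).astype(complex).tocsc())
# V  = np.linalg.qr(rng.standard_normal((n, 1)))[0]
# for _ in range(30):
#     V, theta, res = expand_once(M, C, K, V, sigma, lu)
```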

Jia and Sun proposed a refined residual iteration method [15]. They introduced the idea of refined projection into the quadratic eigenvalue problem. For the Ritz value $\theta$ computed at step 5, the refined approximate eigenvector $u$ satisfies

$\|Q(\theta)u\| = \min_{v \in \mathcal{V},\ \|v\| = 1} \|Q(\theta)v\|.$   (5)

This method also admits changes and modifications; for example, we can use a different shift at each step, and we can use the refined vector to expand the subspace. Hence this method is closely related to the Jacobi-Davidson method [4, 9, 10, 16]. All of these methods show good convergence behavior in practice, but their convergence has not been thoroughly analyzed. According to the properties of the residual iteration method for the SEP, this kind of method easily finds the eigenvalues close to the target point $\sigma$, so a well-chosen shift can accelerate the convergence of the method. For a given target point $\sigma$, let $\lambda$ be the closest eigenvalue and $x$ the corresponding eigenvector; then $(\lambda, x)$ is the desired eigenpair for the subspace. We can use the distance between $x$ and the subspace, or the corresponding angle, to measure the convergence of the method.
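For the refined vector in (5), a standard way to compute it (sketched below; the function name is ours) is via the singular value decomposition of Q(theta)V: the minimizer is V times the right singular vector associated with the smallest singular value.

```python
# Refined approximate eigenvector of (5): minimize ||Q(theta) v|| over unit
# vectors v in span(V), where V has orthonormal columns.
import numpy as np

def refined_vector(M, C, K, V, theta):
    W = theta**2 * (M @ V) + theta * (C @ V) + K @ V   # Q(theta) V
    _, _, Vh = np.linalg.svd(W, full_matrices=False)
    z = Vh[-1, :].conj()                               # right singular vector of the smallest singular value
    return V @ z                                       # unit refined vector, since ||z|| = 1
```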

Let $\mathcal{V}$ be a subspace produced by Method 2, and let $V$ denote an orthonormal basis of this projection subspace. Let $P$ be the orthogonal projection operator onto $\mathcal{V}$. When we obtain a Ritz pair with residual $r$ and set $t = Q(\sigma)^{-1}r$, the expanded subspace can be written as $\mathcal{V}^{+} = \mathcal{V} \oplus \mathrm{span}\{t\}$.

For the effect of the subspace expansion, we have the following result:

Theorem 1 Let and where the subspace is computed by Method 2, is the desired eigenpair. Suppose , and , then we have:

(6)
Proof

Because

and . Then we have

(7)

Finally, we get

Since , we know that this value represents the ability of the method to obtain information from outside the current subspace. At the $k$th step, setting , we get the following result

(8)

and

This means that the subspace can contain more information about the desired eigenvector through the expansion. The value determines the effect of each expansion, so it should be as small as possible, and we can use its upper bound to estimate the convergence rate. To give this bound, we first need the following lemma.

Assume that is a generalized linear form of , then we have [1]

(9)

Lemma ([15]) Let the diagonal elements of be the eigenvalues of , and let , be the corresponding right and left eigenvectors. Assume that is not an eigenvalue of and that ; then we have

(10)

For the convergence rate of the Method 2, we have the following result.

Theorem 2 Let , , , () be as defined earlier. For a given shift , let be the closest eigenvalue, the corresponding eigenvector, and the residual, and

Let

Then we have

(11)
Proof

From

we know

then

so

Since , , then it holds that

(12)

Finally we get

(13)

We can use this result to explain the convergence of both Method 1 and Method 2. When we use a fixed shift in Method 1, we get a constant matrix, and Method 1 becomes the power method for this matrix. Let be its eigenvalues; then the power method yields the dominant eigenpair. If is the eigenvalue nearest to the shift, the dominant eigenvector is near the desired eigenvector, but there is always a constant gap between them, so we cannot find the exact eigenvector from a constant matrix. In Method 1, we therefore use a constantly changing shift to get a changing matrix.

There are two ways to achieve convergence. The first is to use a better shift at each iteration; Method 1 is one such approach. At each iteration, we use a new shift to get a new approximate eigenvector. From (13), we have

(14)

When the shift converges to the eigenvalue, this bound tends to zero. A better shift gives a better approximate eigenvector, and similarly a better approximate eigenvector gives a better shift.

The second way to achieve convergence is subspace expansion. Here we use a fixed shift in the iterative process and obtain a better approximate eigenvector from the expanded subspace. Combining (8) and (13), we obtain an estimate of the total convergence rate of Method 2,

(15)

Assuming a moderate value of at each iteration, the convergence rate is determined by the value . The size of this value depends on the distribution of the eigenvalues and on the selection of the shift, so a good shift can accelerate the convergence of the method.

3 Inexact quadratic residual iteration method

The good convergence of the quadratic residual iteration method mainly benefits from the special expansion vector. When we compute the expansion vector at step 8 of Method 2, we do not form the inverse matrix explicitly; we compute the vector by solving the corresponding linear system. Usually, this is one of the most expensive parts of the method. For large-scale problems it is hard to compute the expansion vector exactly, even by solving the linear system. For linear eigenvalue problems, one remedy is to compute an inexact solution of the linear system; such methods are called inexact methods or inner-outer iteration methods [14, 15].

Based on this principle, we propose an inexact residual iteration method for quadratic eigenvalue problems.

Method 3: Inexact quadratic residual iteration method

  1. set shift and initial vector , and convergence tolerance .

  2. for do:

  3. set .

  4. project the large QEP (1) onto subspace and solve the smaller QEP

    to get the eigenvalue and eigenvector .

  5. compute Ritz vector as approximate eigenvector.

  6. compute the residual .

  7. select the first non-convergence .

  8. solve the equations

  9. get from .

  10. end for
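A hedged sketch of the inexact step 8 above: the linear system Q(sigma) t = r is solved only approximately with GMRES, with the inner tolerance tol controlling the accuracy. The operator-based formulation avoids forming Q(sigma) explicitly; the names and the absence of a preconditioner are our choices, not prescriptions of the paper.

```python
# Inexact expansion step: approximately solve Q(sigma) t = r with GMRES.
import numpy as np
import scipy.sparse.linalg as spla

def inexact_expansion(M, C, K, sigma, r, tol=1e-3, restart=50, maxiter=200):
    n = r.shape[0]
    matvec = lambda v: sigma**2 * (M @ v) + sigma * (C @ v) + K @ v   # v -> Q(sigma) v
    Qs = spla.LinearOperator((n, n), matvec=matvec, dtype=complex)
    # note: SciPy versions before 1.12 use the keyword `tol` instead of `rtol`
    t, info = spla.gmres(Qs, r, rtol=tol, restart=restart, maxiter=maxiter)
    return t, info   # info == 0 means the requested inner tolerance was reached
```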

The difference between Method 2 and Method 3 lies in step 8. Method 2 computes an exact solution, while Method 3 only needs an approximate solution. For large-scale problems it is difficult to solve the equations exactly, so it is practical to compute an approximate solution, normally with an iterative method. We therefore call this the inner iteration and call the iteration of Method 3 the outer iteration. The main difficulty of the method is to determine how accurate the approximate solution must be: we must find a balance between the outer and inner iterations. Let $t$ be the exact solution of the linear system and $\tilde{t}$ the approximate solution. Then the relative error between them is

$\tau = \frac{\|\tilde{t} - t\|}{\|t\|}.$   (16)

Then we can write $\tilde{t} = t + \|t\|\,\tau\, e$,

where $e$ is the normalized error direction vector.
So we get:

(17)

where

(18)

Define

(19)

and

(20)

where and are the normalized subspace expansion vectors in the inexact and exact methods, respectively. We can measure the difference between and by or . The relationship between and is [17],

(21)

The relationship between and is,

(22)

Then we obtain:

(23)

For the standard eigenvalue problem, the convergence properties of the exact method are known, and the convergence of the inexact method can be proved by showing that the inexact expansion vector mimics the exact one. According to the relationships above, the inexact vector can be a good imitation of the exact one under a moderately accurate inner iteration. But these convergence results rely more or less on prior knowledge of the eigenvalues. For the quadratic eigenvalue problem, we need to analyze the requirements on the inner iteration directly from the convergence of the inexact method.

The convergence condition of Method 3 is , is the desired eigenpair. According to the conclusion , it is equivalent to

(24)

According to their relationships of , the convergence can be analyzed by the value or .

For Method 3, we can take the approximate solution as an exact solution of the following perturbed equation

(25)

where is the perturbation matrix arising from the inexact solve.

Lemma 1 If , the approximate solution and the exact solution have the following relationship

(26)
Proof

For a matrix $F$ with $\|F\| < 1$, the matrix $I - F$ is invertible [19] and

$(I - F)^{-1} = \sum_{j=0}^{\infty} F^{j}, \qquad \|(I - F)^{-1}\| \le \frac{1}{1 - \|F\|}.$   (27)

Now, we can get

(28)

We apply formula (27) and neglect higher-order terms to obtain

Then we obtain the desired relationship between the approximate solution and the exact solution.

For the convergence condition of Method 3, we have the following result.

Theorem 4 Suppose is the desired eigenpair, and , then the angle between the inexact solution and the desired eigenvector satisfies

Proof

From

we can get

(29)

When , we can ignore the small value . Then the formula (29) can be written as

(30)

We can rewrite this formula as

(31)

Similarly, the formula (26) also can be written as

(32)

Therefore, combining the last relation with (31) establishes

(33)

Then we can get

From the following relationship

(34)

we can finish the proof.

From Theorem 4, we know that the inexact method can have fast convergence when the difference between and is very small. Because and , these two values are always close, regardless of whether they are large or small.

When is a very small value, it means that both angles between and , and are small. So the exact residual iteration method can have fast convergence rate. For the inexact method, if the inner iteration is not very precise, there is a large error between and . At first glance, the convergence rate of the inexact method may not be very fast. But small means small . So the angle between and is also small. That is to say, the inexact method can have fast convergence rate in spite of the moderate accuracy of inner iteration.

When the value is not too small, the inexact method cannot achieve a fast convergence speed even with a highly accurate inner iteration. But the convergence of the inexact method can still be guaranteed by , where is a constant less than one. This condition is relatively easy to satisfy and is independent of the inner iteration. The analysis shows that the convergence of the inexact method is mainly determined by the shift selection and the subspace expansion, and that increasing the precision of the inner iteration brings only limited benefit.

4 Numerical experiments

Several numerical experiments are presented in this section to demonstrate the effectiveness of Method 3 and to illustrate the analysis. For all examples, DIM stands for the dimension of the matrices, TOL denotes the convergence tolerance of the outer iteration, and tol is the tolerance of the inner iteration. We use GMRES for the inner iteration. We report the CPU time of each part of the method in seconds: TOTAL is the total CPU time, CGMRES is the total number of GMRES steps, TGMRES is the CPU time spent in GMRES, and ITER is the number of outer iterations. We select three different values of tol in our experiments, denoted by "inexact1", "inexact2", and "inexact3". In the following figures, the horizontal axis is the dimension of the subspace and the vertical axis is the largest relative residual norm of the six desired eigenpairs at each subspace expansion.
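The exact normalization behind the relative residual norms plotted in the figures is not spelled out above; a common backward-error-style choice (cf. [1]) is sketched below for reference, and it is an assumption rather than the paper's formula.

```python
# One common way to form a relative residual norm for an approximate
# eigenpair (theta, x) of the QEP, scaled by precomputed norms of M, C, K.
import numpy as np

def relative_residual(M, C, K, theta, x, nrmM, nrmC, nrmK):
    r = (theta**2 * M + theta * C + K) @ x
    scale = (abs(theta)**2 * nrmM + abs(theta) * nrmC + nrmK) * np.linalg.norm(x)
    return np.linalg.norm(r) / scale
```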

Example 1. For a fixed shift, we form the shifted matrix E. We show that the eigenvector of E is different from the eigenvector of the corresponding quadratic eigenvalue problem, so we cannot use the power method on E to compute the eigenvector of the QEP.

We use the example in [1]. The matrices are of order three, and one of them is the identity matrix. The six eigenpairs are

[Table of the six eigenpairs, k = 1, ..., 6]

We choose a shift; the eigenvalue of the QEP closest to it and its eigenvector form the desired eigenpair. The eigenvalues of E are 0.735294117647062, 9.999999999999933, and 0.552486187845304, and we compute the eigenvector belonging to the largest of them.

We can see that the corresponding eigenvector of E is different from the wanted eigenvector. Through this example, we show that there is a difference between the eigenvectors of the shifted standard eigenvalue problem and those of the quadratic eigenvalue problem.

Example 2. This example arises from a finite element model of a linear spring in parallel with Maxwell elements (a Maxwell element is a spring in series with a dashpot) [20]. The quadratic matrix polynomial is , where the mass matrix is rank deficient and symmetric, the damping matrix is rank deficient and block diagonal, and the stiffness matrix is symmetric and has an arrowhead structure. The matrices have the form:

where and are element mass and stiffness matrices, and measure the spring stiffness, and is the material density. In this example, the matrices are randomly generated by the method of reference [20]. The size of the matrices is 20000. We use the new method to compute the 6 eigenvalues closest to the shift . We select three tolerances for the inner iteration to show the effect of the inner precision on the outer iteration. The tolerance of the outer iteration is . Table 1 and Figure 1 show the results.

Figure 1: Example 2.

Method     tol    TOTAL (s)   CGMRES   TGMRES (s)   ITER
inexact1   1e-3   6.66        69       1.41         22
inexact2   1e-4   7.04        78       1.48         22
inexact3   1e-5   7.55        84       1.54         19
Table 1: Example 2, DIM = 20000.

It can be seen from Table 1 that the method with the lowest inner-iteration precision uses more outer iterations than the method with the highest precision, but its total time is less. A high-precision inner iteration may reduce the number of outer iterations, but it requires more inner iteration steps. Hence the low-precision inner iteration takes less time at each outer iteration and has an advantage in total CPU time.

Figure 1 shows the largest residual norm at each outer iteration for the different inner tolerances. The three runs have the same convergence history during the first 14 outer iterations. The convergence of the higher-accuracy runs is smoother than that of the lower-accuracy ones, but the method with the lowest precision reaches the desired result using only three more outer iterations. In general, in terms of both CPU time and iteration count, the differences between the methods with different inner precisions are not very pronounced.

Example 3. This example was tested in [20]. It comes from a finite element discretization of a 2D time-harmonic wave equation. On the unit square with mesh size , the coefficient matrices of are given by:

where denotes the Kronecker product, , is the (possibly complex) impedance, and , . We set , and use the new method to compute the 6 eigenvalues closest to the shift . We select three tolerances for the inner iteration. The results are shown in Table 2 and Figure 2.

Figure 2: Example 3.

Method     tol    TOTAL (s)   CGMRES   TGMRES (s)   ITER
inexact1   1e-3   663.27      471      291.95       63
inexact2   1e-4   972.78      672      481.89       63
inexact3   1e-5   1168.20     890      599.73       63
Table 2: Example 3, DIM = 160000.

For a large-scale problem, a larger subspace is needed to find the desired eigenvalues, so all three methods with different inner precisions need more outer iterations. However, Figure 2 shows that they have the same convergence history; there is no advantage in using a higher precision for the inner iteration.

We can see from Table 2 that the lower the precision of the inner iteration, the less time each outer iteration takes, and the resulting saving in total time is considerable. Hence the method with the lowest inner precision is superior to the one with the highest. This is consistent with our theoretical analysis. In many practical applications, it is not even possible to compute a very accurate solution in the inner iteration. Both theory and experiments therefore show that the new method is feasible and efficient.

References

  • [1] Tisseur F, Meerbergen K. The quadratic eigenvalue problem. SIAM Review, 2001, 43: 235–286
  • [2] Kublanovskaya V N. On an approach to the solution of the generalized latent value problem for Lambda-matrices. SIAM J Numer Anal, 1970, 7: 532–537
  • [3] Guo J S, Lin W W, and Wang C S. Numerical solution for large sparse quadratic eigenvalue problems. Lin Alg Appl, 1995, 225: 57–85
  • [4] Sleijpen G L G, Booten A G L, Fokkema D R, et al. Jacobi-Davidson type methods for generalized eigenproblems and polynomial eigenproblems. BIT, 1996, 36: 595–633
  • [5] Tisseur F. Backward error and condition of polynomial eigenvalue problems. Lin Alg Appl, 2000, 309: 339–361
  • [6] Saad Y. Numerical Methods for Large Eigenvalue Problems. Algorithms and Architectures for Advanced Scientific Computing, Manchester, UK: Manchester University Press, 1992
  • [7] Ruhe A. Rational Krylov: A practical algorithm for large sparse nonsymmetric matrix pencils. SIAM J Sci Comput, 1998, 19: 1535–1551
  • [8] Natarajan R. An Arnoldi-based iterative scheme for nonsymmetric matrix pencils arising in finite element stability problems. J Comput Phys, 1992, 100: 128–142
  • [9] Sleijpen G L G, Van der Vorst H A. A Jacobi-Davidson iteration method for linear eigenvalue problems. SIAM J Matrix Anal Appl, 1996, 17: 401–425
  • [10] Sleijpen G L G, Van der Vorst H A, and Van Gijzen M B. Quadratic eigenproblems are no problem. SIAM News, 1996, 29: 8–9
  • [11] Higham N J, Li R C, Tisseur F. Backward error of polynomial eigenproblems solved by linearization. SIAM J Matrix Anal Appl, 2007, 29: 1218–1241
  • [12] Neumaier A. Residual inverse iteration for the nonlinear eigenvalue problem. SIAM J Numer Anal, 1985, 22: 914–923
  • [13] Huitfeldt J, Ruhe A. A new algorithm for numerical path following applied to an example from hydrodynamical flow. SIAM J Sci Statist Comput, 1990, 11: 1181–1192
  • [14] Jia Z X. Refined iterative algorithms based on Arnoldi’s process for large unsymmetric eigenproblems. Linear Algebra Appl, 1997, 259: 1–23
  • [15] Jia Z X, Sun Y Q. A refined residual iteration method. AMS subject classifications, 2004, 26: 147–155
  • [16] Van Gijzen M B. The parallel computation of the smallest eigenpair of an acoustic problem with damping. Internat J Numer Methods Engrg, 1999, 45: 765–777
  • [17] Jia Z X, Li C. Inner iterations in the shift-invert residual Arnoldi method and the Jacobi-Davidson method. Sci China Math, 2014, 57: 1733–1752
  • [18] Meerbergen K. Locking and restarting quadratic eigenvalue solvers. SIAM J Sci Comput, 2001, 22: 1814–1839
  • [19] Demmel J. Applied numerical linear algebra. SIAM, Philadelphia, PA, 1997
  • [20] Betcke T, Higham N J, Mehrmann V, et al. NLEVP: A collection of nonlinear eigenvalue problems. ACM TOMS, 2011, 116: 18–19
  • [21] Chaitin-Chatelin F, Van Gijzen M B. Analysis of parameterized quadratic eigenvalue problems in computational acoustics with homotopic deviation theory. Numer Linear Algebra Appl, 2006, 13: 487–512