Two recursive GMRES-type methods for shifted linear systems with general preconditioningThis version dated July 13, 2019.

Two recursive GMRES-type methods for shifted linear systems with general preconditioningthanks: This version dated July 13, 2019.

Kirk M. Soodhalter222Industrial Mathematics Institute, Johannes Kepler University, Altenbergerstraße 69, A-4040 Linz, Austria. (kirk.soodhalter@indmath.uni-linz.ac.at)
Abstract

We present two minimum residual methods for solving sequences of shifted linear systems, the right-preconditioned shifted GMRES and shifted Recycled GMRES algorithms which use a seed projection strategy often employed to solve multiple related problems. These methods are compatible with general preconditioning of all systems, and when restricted to right preconditioning, require no extra applications of the operator or preconditioner. These seed projection methods perform a minimum residual iteration for the base system while improving the approximations for the shifted systems at little additional cost. The iteration continues until the base system approximation is of satisfactory quality. The method is then recursively called for the remaining unconverged systems. We present both methods inside of a general framework which allows these techniques to be extended to the setting of flexible preconditioning and inexact Krylov methods. We present some analysis of such methods and numerical experiments demonstrating the effectiveness of the algorithms we have derived.

�

Key words. Krylov subspace methods, shifted linear systems, parameterized linear systems, quantum chromodynamics

AMS subject classifications. 65F10, 65F50, 65F08

1 Introduction

We develop techniques for solving a family (or a sequence of families) of linear systems in which the coefficient matrices differ only by a scalar multiple of the identity. There are many applications which warrant the solution of a family of shifted linear systems, such as those arising in lattice quantum chromodynamics (QCD) (see, e.g., [14]) as well as other applications such as Tikhonov-Phillips regularization, global methods of nonlinear analysis, and Newton trust region methods [5]. The goal is to develop a framework in which minimum residual methods can be applied to shifted systems in a way that:

  1. allows us to exploit the relationships between the coefficient matrices

  2. is compatible with general (right) preconditioning.

In this paper, we use such a framework to propose two new methods: one which is built on top of the GMRES method [31] for solving a family of shifted systems (cf. (LABEL:eqn.one-shifted-fam)) and one which is built on top of a GCRO-type augmented Krylov method [10] which, when paired with a harmonic Ritz vector recycling strategy [25, 26], is an extension of the GCRO-DR method [27] to solve a sequence of shifted system families (cf. (LABEL:eqn.one-shifted-family-seq)). To do this, we use a seed projection strategy, often proposed for use in conjunction with short-term recurrence iterative methods [6, 7, 19, 28].

The rest of this paper is organized as follows. In the next section, we discuss some previous strategies to treat such problems and discuss some of their limitations. In Section LABEL:section.prelim, we review the minimum residual Krylov subspace method GMRES as well as two GMRES variants, one for shifted linear systems and the other extending GMRES to the augmented Krylov subspace setting, i.e., Recycled GMRES. In Section LABEL:section.dir-proj-framework, we present a general framework to perform minimum residual projections of the shifted system residuals with respect to the search space generated for the base system. In Subsection LABEL:subsection.GMRES-shift we use this framework to derive our shifted GMRES method and in Subsection LABEL:subsection.rgmres-shift we derive a shifted Recycled GMRES method. In Section LABEL:section.analysis, we present some analysis of the expected performance of these methods. In Section LABEL:section.num-results, we present some numerical results before concluding in Section LABEL:section.conclusions.

2 Background

Consider a family of shifted linear systems, which we parameterize by , i.e.,

\hb@xt@.01(2.1)

We call the numbers shifts, the base matrix, and a shifted matrix. Systems of the form (LABEL:eqn.one-shifted-fam) are called shifted linear systems. Krylov subspace methods have been proposed to simultaneously solve this family of systems, see, e.g., [8, 12, 13, 20, 36]. These methods satisfy requirement (LABEL:item.shift-rel) but are not compatible with general preconditioning strategies, as they rely on the invariance of the Krylov subspace under constant shift of the coefficient matrix; cf. (LABEL:eqn.shift-invariance). Specially chosen polynomial preconditioners, however, have been shown to be compatible with such methods; see, e.g., [1, 3, 4, 18, 23, 33, 42].

We can introduce an additional parameter , which indexes a sequence of matrices , and for each , we solve a family of the form

\hb@xt@.01(2.2)

We consider the case that the right-hand side varies with respect to but not for each shift. What we propose is indeed applicable in the more general setting, but we do not treat that here. Augmented Krylov subspace methods have been proposed for efficiently solving a sequence of linear systems with a slowly changing coefficient matrix, allowing important spectral information generated while solving to be used to augment the Krylov subspace generated when solving ; see, e.g., [27, 32, 41]. In cases such as a Newton iteration, these matrices are available one at a time, while in a case such as an implicit time-stepping scheme, the matrix may not change at all.

In [40], the authors explored solving a family of shifted systems over an augmented Krylov subspace. Specifically, the goal was to develop a method which solved the family of systems simultaneously, using one augmented subspace to extract all candidate solutions, which also had a fixed storage requirement, independent of the number of shifts . It was shown that in general within the framework of GMRES for shifted systems [13] and subspace recycling [27], such a method, does not exist. In the context of subspace recycling for Hermitian linear systems, in the absence of preconditioning Kilmer and de Sturler proposed a MINRES method in a subspace recycling framework which simultaneously solves multiple non-Hermitian systems, which all differ from a real-symmetric system by a complex multiple of the identity [20], by minimizing the shifted residuals over the augmented Krylov subspace subspace, built using the symmetric Lanczos process. In this paper, we focus exclusively on problems in which the base coefficient matrices are non-Hermitian.

A conclusion one can draw from [40] is that we should consider avoiding methods relying on the invariance of Krylov subspaces under a constant shift of the identity; cf. (LABEL:eqn.shift-invariance). Relying on this invariance imposes restrictions on our ability to develop an algorithm. Furthermore, relying on this shift invariance means we cannot use arbitrary preconditioners. General preconditioners are unavailable if we want to exploit shift invariance, as Krylov subspaces generated by preconditioned systems are not invariant with respect to a shift in the coefficient matrix. In the case that preconditioning is not used, a subspace recycling technique has been proposed [39], built on top of the Sylvester equation interpretation of (LABEL:eqn.one-shifted-fam) observed by Simoncini in [35]. However, this is also not compatible with general preconditioning.

Learning from the results in [40], we focus on methods which do not rely on the shift invariance. Rather than focusing on specific Krylov subspace techniques (augmented or not), we instead begin by developing a general framework of minimum residual projection techniques for shifted linear systems. In this framework, we extract candidate solutions for all shifted systems from the augmented Krylov subspace for one linear system and we select each candidate solution according to a minimum residual Petrov-Galerkin condition. This framework is compatible with arbitrary right preconditioners, and the computational cost for each additional shifted system is relatively small but nontrivial. By specifying subspaces once the framework is developed, we derive minimum residual methods for shifted systems that are compatible with general right preconditioning. Though not considered in this paper, the framework is also compatible with flexible and inexact Krylov methods. These methods descend from the Lanczos-Galerkin seed methods, see, e.g., [6, 7, 19, 28].

In this work, we restrict ourselves to right preconditioned methods. Doing this allows us to derive methods which require extra storage but no extra applications of the operator or preconditioner, and, we minimize the unpreconditioned residual -norm rather than in some other norm; see [34] for more details.

3 Preliminaries

We begin with a brief review of Krylov subspace methods as well as techniques of subspace recycling and for solving shifted linear system. Recall that in many Krylov subspace iterative methods for solving

\hb@xt@.01(3.1)

with , we generate an orthonormal basis for

\hb@xt@.01(3.2)

with the Arnoldi process, where is some starting vector. Let be the matrix with orthonormal columns generated by the Arnoldi process spanning . Then we have the Arnoldi relation

\hb@xt@.01(3.3)

with ; see, e.g., [30, Section 6.3] and [37]. Let be an initial approximation to the solution of a linear system we wish to solve and be the initial residual. At iteration , we choose , with . In GMRES [31], satisfies

which is equivalent to solving the smaller minimization problem

\hb@xt@.01(3.4)

where denotes the th Cartesian basis vector in . We then set . Recall that in restarted GMRES, often called GMRES(), we run an -step cycle of the GMRES method and compute an approximation . We halt the process, discard , and restart with the new residual. This process is repeated until we achieve convergence. An adaption of restarted GMRES to solve (LABEL:eqn.one-shifted-fam) has been previously proposed; see, e.g., [13].

Many methods for the simultaneous solution of shifted systems (see, e.g., [8, 12, 13, 14, 21, 36]) take advantage of the fact that for any shift , the Krylov subspace generated by and is invariant under the shift, i.e.,

\hb@xt@.01(3.5)

as long as the starting vectors are collinear, i.e., for some , with a shifted Arnoldi relation similar to (LABEL:eqn.arnoldi-relation)

\hb@xt@.01(3.6)

where . This collinearity must be maintained at restart. In [40], this was shown to be a troublesome restriction when attempting to extend such techniques augmented Krylov methods. In the case of GMRES, Frommer and Glässner were able to overcome this by minimizing only one residual in the common Krylov subspace and forcing the others to be collinear. This strategy also works in the case of GMRES with deflated restarts [8] because of properties of the augmented space generated using harmonic Ritz vectors. However, it was shown in [40] that residual collinearity cannot be enforced in general. Furthermore, it is not compatible with general preconditioning. The invariance (LABEL:eqn.shift-invariance) can lead to great savings in memory costs; but with a loss of algorithmic flexibility. Thus in Section LABEL:section.dir-proj-framework, we explore an alternative.

We briefly review Recycled GMRES for non-Hermitian . Augmentation techniques designed specifically for Hermitian linear systems have also been proposed; see, e.g., [19, 32, 41]. For a more general framework for these types of methods, see [16], elements of which form a part of the thesis of Gaul [15], which contains a wealth of information on this topic. Gaul and Schlömer describe recycling techniques in the context of self-adjoint operator equations in a general Hilbert space [17].

We begin by clarifying what we mean by Recycled GMRES. We use this expression to describe the general category of augmented GMRES-type methods which are then differentiated by the choice of augmenting subspace. As we subsequently explain, these methods can all be formulated as a GMRES iteration being applied to a linear system premultiplied with a projector. The intermediate solution to this projected problem can then be further corrected yielding a minimum residual approximation for the original problem over an augmented Krylov subspace. GCRO-DR [27] is one such method in this category, in which the augmented subspace is built from harmonic Ritz vectors.

The GCRO-DR method represents the confluence of two approaches: those descending from the implicitly restarted Arnoldi method [22], such as Morgan’s GMRES-DR [24], and those descending from de Sturler’s GCRO method [10]. GMRES-DR is a restarted GMRES algorithm, where at the end of each cycle, harmonic Ritz vectors are computed, and a subset of them is used to augment the Krylov subspace generated at the next cycle. The GCRO method allows the user to select the optimal correction over arbitrary subspaces. This concept is extended by de Sturler in [11], where a framework is provided to optimally reduce convergence rate slowdown due to discarding information upon restart. This algorithm is called GCROT, where OT stands for optimal truncation. A simplified version of the GCROT approach, based on restarted GMRES (called LGMRES) is presented in [2]. Parks et al. in [27] combine the ideas of [24] and [11] and extend them to a sequence of slowly-changing linear systems. They call their method GCRO-DR. This method and GCROT are Recycled GMRES methods.

Suppose we are solving (LABEL:eqn.Axb), and we have a -dimensional subspace whose image under the action of is . Let be the orthogonal projector onto . Let be such that . At iteration , the Recycled GMRES method generates the approximation

where and . The corrections and are chosen according to the minimum residual, Petrov-Galerkin condition over the augmented Krylov subspace, i.e.,

\hb@xt@.01(3.7)

At the end of the cycle, an updated is constructed, the Krylov subspace basis is discarded, and we restart. At convergence, is saved, to be used when solving the next linear system. This process is equivalent to applying GMRES to the projected problem

\hb@xt@.01(3.8)

where is the th GMRES correction for (LABEL:eqn.rGMRES-proj-prob) the second correction is the orthogonal projection of onto where the orthogonality is with respect to the inner product induced by the positive-definite matrix 222We can write explicitly where we define which can be rewritten ; see, e.g., [15, 16].

Recycled GMRES can be described as a modified GMRES iteration. Let have columns spanning , scaled such that has orthonormal columns. Then we can apply to using steps of the Modified Gram-Schmidt process. The orthogonalization coefficients are stored in the th column of , which is simply with one new column appended. Let and be defined as before, but for the projected Krylov subspace . Enforcing (LABEL:eqn.rgmres-petrov-galerkin) is equivalent to solving the GMRES minimization problem (LABEL:eqn.gmres-min) for and setting

so that

This is a consequence of the fact that the Recycled GMRES least squares problem, as stated in [27, Equation 2.13] can be satisfied exactly in the first rows, and this was first observed in [10]. The choice of the subspace then determines the actual method.

4 A direct projection framework

We develop a general framework of minimum residual methods for shifted linear systems which encompasses both unpreconditioned and preconditioned systems. We propose to solve both a single family of shifted systems (LABEL:eqn.one-shifted-fam) and sequences of shifted system families of the form (LABEL:eqn.one-shifted-family-seq). However, it suffices to propose our method in a simpler setting in which we drop the index and assume there are only two systems, a base system and a shifted system. Thus for simplicity, we restrict our description to two model problems: the unpreconditioned problem

\hb@xt@.01(4.1)

and the right-preconditioned problem

\hb@xt@.01(4.2)

where and , and after iterations we set and we set . In this setting, we can propose minimum residual Krylov subspace methods in the cases that we do and do not have an augmenting subspace .

We describe the proposed methods in terms of a general sequence of nested subspaces

This allows us to cleanly present these techniques as minimum residual projection methods and later to give clear analysis, applicable to any method fitting into this framework. Then we can derive different methods by specifying , e.g., .

Let be the nested sequence of subspaces produced by some some iterative method for solving (LABEL:eqn.model-problem-unprec) or (LABEL:eqn.model-problem-prec), after iterations. In the unpreconditioned case (LABEL:eqn.model-problem-unprec), suppose we have initial approximations and for the base and shifted systems, respectively. For conciseness, let us denote . At iteration , we compute corrections which satisfy the minimum residual conditions

\hb@xt@.01(4.3)

In the preconditioned case (LABEL:eqn.model-problem-prec), suppose we begin with initial approximations and . Let us denote the preconditioned operators

At iteration , we compute corrections which satisfy the minimum residual conditions

\hb@xt@.01(4.4)

We emphasize that the same preconditioner is used for all systems.

In this framework, we assume that the minimizer for the base case is constructed via a predefined iterative method, the method which generates the sequence . Therefore, it suffices to describe the residual projection for the shifted system. We can write the update of the shifted system approximation by explicitly constructing the orthogonal projector which is applied during a Petrov-Galerkin projection. Let be a basis for which we take as the columns of . Then we can write this projection and update

\hb@xt@.01(4.5)

where and is the projection scaling matrix, since we assume that does not have orthonormal columns. For well-chosen , these projections can be applied using already-computed quantities.

In the following subsections, we derive new methods by specifying subspaces and a matrix . This will define . We show that for these choices, is composed of blocks which can be built from already-computed quantities. Thus, for appropriate choices of , either (LABEL:eqn.unprec-Galerkin) or (LABEL:eqn.prec-Galerkin), can be applied with manageable additional costs.

We highlight that a strength of this framework that we can develop methods for shifted systems on top of an existing iterative methods, with a few modifications. As the framework only requires a sequence of nested subspaces, it is completely compatible with with both standard Krylov subspace methods as well as flexible and inexact Krylov subspace methods.

4.1 A GMRES method for shifted systems

In the case that we apply the GMRES iteration to the base system, at iteration , our search space is , and the matrix has the first Arnoldi vectors as columns. The projection and update (LABEL:eqn.GMRES-shift-proj-unprec) can be simplified due to the shifted Arnoldi relation (LABEL:eqn.shifted-arnoldi-relation). The matrix can be constructed from the already computed upper Hessenberg matrix. Thus the projection (LABEL:eqn.unprec-Galerkin) can be rewritten

where . As it can be appreciated, applying this is equivalent to solving the least squares problem

\hb@xt@.01(4.6)

and setting . This method has similarities with the GMRES method for shifted systems of Frommer and Glässner [13], which is derived from the invariance (LABEL:eqn.shift-invariance). In the method proposed in [13], one must solve small linear systems for each shifted system whereas here one must solve the small least-squares problem (LABEL:eqn.y-shift-LS). The main difference is that what we propose does not guarantee convergence of all system in one Krylov subspace whereas in [13], this is guaranteed under certain conditions. The strength here comes from the ability to precondition.

4.1.1 Preconditioning

Introducing preconditioning into this setting presents complications. No longer can we use the shifted Arnoldi relation (LABEL:eqn.shifted-arnoldi-relation) as we could in the unpreconditioned case. However, by storing some extra vectors, as in Flexible GMRES [29], one can enforce (LABEL:eqn.prec-Galerkin) with no additional application of the operator or preconditioner.

Recall that in right-preconditioned GMRES (see, e.g., [30, Sections 9.3.2 and 9.4.1]) that , and . This space is never explicitly constructed, though, since if is the solution to the GMRES least squares problem (LABEL:eqn.gmres-min) in the preconditioned case, we simply set . However, in flexible GMRES, one must store this basis. For all , let , and let these vectors be the columns of so that .

With these vectors, one can enforce (LABEL:eqn.prec-Galerkin). Observe that we can write . We explicitly project the residual, but this time onto ,

\hb@xt@.01(4.7)

where . With the right-preconditioned shifted Arnoldi relation

we rewrite

Thus, the approximation update and the residual projection (LABEL:eqn.shift-gmres-prec-proj) can be rewritten

where . This projection process involves only the precomputed matrices (, , and ). The matrices , , and can be computed once, independent of the number of shifted systems. The solution of a dense Hermitian linear system with must be performed for each . This solution of a Hermitian linear system costs floating point operations (FLOPS). The right-preconditioned shifted GMRES algorithm (sGMRES) is shown in Algorithm LABEL:algorithm.sgmres. Observe that an implementation can rely heavily on an existing GMRES code. It should be noted that all but one step of the shifted residual projections can be formulated in terms of block/BLAS-3 operations so that most computations for all shifts are performed simultaneously.

Input : ; ; ; Initial Approximations ; ; Cycle length
Output :  such that for all
1 for  do
2      
3
4 if  then
5       while  do
6             Compute and overwrite , , , , by calling GMRES() for , , , , and
7             Compute and overwrite , , and
8             for  do
9                  
10                  
11                  
12                  
13            For all compute and overwrite by recursively calling sGMRES() for , , , , , , and
14      
15else
16       while  do
17             Compute and overwrite , by calling GMRES() for , , , , and
18      
Algorithm 1 Right preconditioned shifted GMRES (sGMRES())

4.2 An rGMRES method for shifted systems

Suppose now that our iteration for the base system is a Recycled GMRES method.

We begin by projecting the initial residual associated to initial approximation , so that we begin with . This is equivalent to computing the minimum residual correction and setting . In Recycled GMRES, such a projection is necessary to correctly derive the algorithm. For the shifted system, the projection is not necessary, but it does allow for some simplifications later in the derivation. We have then,

\hb@xt@.01(4.8)

where and . Since , this projection can be simplified and computed with manageable additional expense,

where we rewrite . The matrices and must only be computed once, regardless of the number of shifts, and for each shift we solve .

After a cycle of Recycled GMRES for the base system, (LABEL:eqn.unprec-Galerkin) must be enforced for each shifted system. At iteration , our search space . The augmented matrix contains as columns the basis for and . In this case, we have . From [40], we have the identity

Thus, in the unpreconditioned case, for the augmented Krylov subspace, we can rewrite (LABEL:eqn.unprec-Galerkin)

\hb@xt@.01(4.9)

where and

This projection can be performed using already computed quantities, and the matrices , , , , , , and need only be computed once, regardless of the number of shifts. The computation of must be performed for every shift at a cost of .

4.2.1 Preconditioning

Introducing right preconditioning creates some difficulties which we can again surmount by storing some extra vectors. We note that in the case of preconditioning, we have . In this case, for right preconditioned Recycled GMRES, the search space for the base system is . Let and , as in Section LABEL:subsection.GMRES-shift.

Using , we can cheaply perform the initial residual projection,

\hb@xt@.01(4.10)

where and . We can write

The subspace either is available from at the start of the algorithm (in which case must be scaled so that has orthonormal columns), or it is constructed at the end of a restart cycle. In either case, is available in the course of the computation and can be saved. Thus the projection (LABEL:eqn.init-recyc-proj) can be performed with already computed quantities,

\hb@xt@.01(4.11)

where we rewrite and

After a cycle of right-preconditioned Recycled GMRES, we must perform the projection (LABEL:eqn.prec-Galerkin) for each shifted system. We proceed slightly differently in this derivation than in the unpreconditioned case. We have

Following from [27], we define

which yields the augmented Arnoldi relation

\hb@xt@.01(4.12)

Using the relation (LABEL:eqn.aug-arn-rel), an identity for the shifted operator with right preconditioning follows,

\hb@xt@.01(4.13)

We use the relation (LABEL:eqn.shifted-prec-aug-arnoldi) to derive the expansion

\hb@xt@.01(4.14)

Thus, the projection can be performed for each shift using already computed quantities. This yields the following updates of the approximation and residual

Input : ; ; ; Initial Approximations ; ; ; Cycle length
Output :  such that for all
1 for  do
2      
3
4
5
6 Compute QR-factorization
7 , ,
8 and
9 Compute and
10 for  do
       %%%%% Shifted System Initial Projections %%%%%
11      
12