# Inexact methods for symmetric stochastic eigenvalue problems^{†}^{†}thanks: This work is based upon work supported by the U. S. Department of Energy Office of Advanced Scientific Computing Research,
Applied Mathematics program under Award Number DE-SC0009301, and by the
U. S. National Science Foundation under grant DMS1521563.

###### Abstract

We study two inexact methods for solutions of random eigenvalue problems in the context of spectral stochastic finite elements. In particular, given a parameter-dependent, symmetric matrix operator, the methods solve for eigenvalues and eigenvectors represented using polynomial chaos expansions. Both methods are based on the stochastic Galerkin formulation of the eigenvalue problem and they exploit its Kronecker-product structure. The first method is an inexact variant of the stochastic inverse subspace iteration [B. Sousedík, H. C. Elman, SIAM/ASA Journal on Uncertainty Quantification 4(1), pp. 163–189, 2016]. The second method is based on an inexact variant of Newton iteration. In both cases, the problems are formulated so that the associated stochastic Galerkin matrices are symmetric, and the corresponding linear problems are solved using preconditioned Krylov subspace methods with several novel hierarchical preconditioners. The accuracy of the methods is compared with that of Monte Carlo and stochastic collocation, and the effectiveness of the methods is illustrated by numerical experiments.

igenvalues, subspace iteration, inverse iteration, Newton iteration, stochastic spectral finite element method

35R60, 65F15, 65F18, 65N22

## 1 Introduction

Eigenvalue analysis is important in a number of applications, for example in modeling of vibrations of mechanical structures, neutron transport criticality computations, or stability of dynamical systems, to name a few. The behavior of the underlying mathematical models depends on proper choice of parameters entering the model through coefficients, boundary conditions or forces. However, in practice the exact values of these parameters are not known and they are treated as random processes. The uncertainty is translated by discretization into the matrix operators and subsequently into eigenvalues and eigenvectors. The standard techniques to solve this problems include Monte Carlo (MC) methods [2, 25, 31], which are robust but relatively slow, and perturbation methods [15, 17, 32, 38], which are limited to models with low variability of uncertainty.

In this study, we use spectral stochastic finite element methods (SSFEM) [7, 18, 20, 43] for the solution of symmetric eigenvalue problems. The assumption in these methods is that the parametric uncertainty is described in terms of polynomials of random variables, and they compute solutions that are also polynomials in the same random variables in the so-called generalized polynomial chaos (gPC) framework [7, 44]. There are two main approaches: stochastic collocation (SC) and stochastic Galerkin (SG) methods. The first approach is based on sampling, so the problem is translated into a set of independent deterministic systems; the second one is based on stochastic Galerkin projection and the problem is translated into one large coupled deterministic system. While the SSFEM methods have become quite popular for solving stochastic partial differential equations, the literature addressing eigenvalue problems is relatively limited. The stochastic inverse iteration in the context of the SG framework was proposed by Verhoosel et al. [37]. Meidani and Ghanem [22, 23] formulated stochastic subspace iteration using a stochastic version of the modified Gram-Schmidt algorithm. Sousedík and Elman [33] introduced stochastic inverse subspace iteration by combining the two techniques, they showed that deflation of the mean matrix can be used to compute expansions of the interior eigenvalues, and they also showed that the stochastic Rayleigh quotient alone provides a good approximation of an eigenvalue expansion; see also [3, 4, 27] for closely related methods. The authors of [23, 33] used a quadrature-based normalization of eigenvectors. Normalization based on a solution of a small nonlinear problem was proposed by Hakula et al. [12], and Hakula and Laaksonen [13] also provided an asymptotic convergence theory for the stochastic iteration. In an alternative approach, Ghanem and Ghosh [6, 9] proposed two numerical schemes—one based on Newton iteration and another based on an optimization problem (see also [8, 10]). Most recently, Benner et al. [1] formulated an inexact low-rank Newton–Krylov method, in which the stochastic Galerkin linear systems are solved using the BiCGStab method with a variant of mean-based preconditioner. In alternative approaches, Pascual and Adhikari [28] introduced several hybrid perturbation-polynomial chaos methods, and Williams [40, 41, 42] presented a method that avoids the nonlinear terms in the conventional method of stochastic eigenvalue calculation but introduces an additional independent variable.

We formulate two inexact methods for symmetric eigenvalue problems formulated in the SSFEM framework and based on the SG formulation. The first method is an inexact variant of the stochastic inverse subspace iteration from [33], in which the linear stochastic Galerkin systems are solved using the conjugate gradient method with the truncated hierarchical preconditioner [35] (see also [36]). The second method is an inexact variant of the Newton iteration from [6], in which the linear stochastic Galerkin systems are solved using preconditioned MINRES and GMRES. The methods are derived using the Kronecker-product formulation and we also comment on the so-called matricized format. The formulation of the Newton’s method is closely related to that of [1], however we consider general parametrization of stochastic coefficients, the Jacobian matrices are symmetrized, and we propose a class of hierarchical preconditioners, which can be viewed as extensions of the hierarchical preconditioners used for the first method. We also note that we have recently successfully combined an inexact Newton–Krylov method with the stochastic Galerkin framework in a different context [19, 34]. The performance of both methods is illustrated by numerical experiments, and the results are compared to that of MC and SC methods.

The paper is organized as follows. In Section 2 we introduce the stochastic eigenvalue problem, in Section 2.1 we recall the solution techniques using sampling methods (Monte Carlo and stochastic collocation), in Section 2.2 we introduce the stochastic Galerkin formulation, in Section 3 we formulate the inverse subspace iteration and in Section 4 the Newton iteration, in Section 5.1 we report the results of numerical experiments, and in Section 6 we summarize our work. In Appendices A and B we describe algorithmic details, and in Appendix C we discuss the computational cost.

## 2 Stochastic eigenvalue problem

Let be a bounded physical domain, and let be a complete probability space, that is, is the a sample space with a -algebra and a probability measure . We assume that the randomness in the mathematical model is induced by a vector of independent, identically distributed random variables , where . Let denote the Borel -algebra on induced by and denote the induced measure. The expected value of the product of measurable functions on determines a Hilbert space with inner product

(1) |

where the symbol denotes the mathematical expectation.

In computations we will use a finite-dimensional subspace spanned by a set of multivariate polynomials that are orthonormal with respect to the density function , that is , where is the Kronecker delta, and is constant. This will be referred to as the gPC basis [44]. The dimension of the space depends on the polynomial degree. For polynomials of total degree, the dimension is . We suppose we are given a symmetric matrix-valued random variable represented as

(2) |

where each is a deterministic matrix of size with size determined by the discretization of the physical domain, and is the mean value matrix, that is . The representation (2) is obtained from either the Karhunen-Loève expansion or, more generally, a stochastic expansion of an underlying random process.

We are interested in a solution of the following stochastic eigenvalue problem: find a set of stochastic eigenvalues and corresponding eigenvectors , , which almost surely (a.s.) satisfy the equation

(3) |

where and , along with a normalization condition

(4) |

where denotes the inner product of two vectors.

We will search for expansions of eigenpairs, , in the form

(5) |

where and are the coefficients corresponding to the basis . Equivalently to (5), using the symbol for the Kronecker product, we write

(6) |

where , , and .

###### Remark \thetheorem

One can in general consider different number of terms in the two expansions (5). However, since the numerical experiments in [33] and also in the present work indicate virtually no effect when the number of terms in eigenvalue expansion is larger than in the eigenvector expansion, we consider here the same number of terms in both expansions, see also Remark 3.

### 2.1 Sampling methods

Both Monte Carlo and stochastic collocation methods are based on sampling. The coefficients are defined by a discrete projection

(7) |

The evaluations of coefficients in (7) entail solving a set of independent deterministic eigenvalue problems at a set of sample points, or,

In the Monte Carlo method, the sample points, are generated randomly following the distribution of the random
variables, , and moments of solution are
computed by ensemble averaging. In addition, the coefficients in (5) can be computed as^{1}^{1}1In numerical experiments we avoid projections on the gPC and work with the sampled quantities.

where is the th element of . For stochastic
collocation, which is used here in the form of so-called nonintrusive stochastic
Galerkin method, the sample points, consist
of a predetermined set of *collocation points*, and the coefficients and in expansions (5) are
determined by evaluating (7) in the sense of (1) using numerical quadrature

(8) |

where are the quadrature (collocation) points and are quadrature weights. We refer, e.g., to [18] for a discussion of quadrature rules. Details of the rule we use in our numerical experiments are discussed in Section 5.1.

### 2.2 Stochastic Galerkin formulation

The main contribution of this paper is the development of two inexact methods based on the stochastic Galerkin formulation of eigenvalue problem (3)–(4). The formulation entails a projection

(9) | ||||||

(10) |

Let us introduce the notation

(11) |

Substituting (2) and (5) into (9)–(10) yields a nonlinear system,

(12) | |||||

(13) |

where the symbol is the Hadamard product, see, e.g., [14, Chapter 5]. An equivalent formulation of (12)–(13) is obtained as follows. Substituting (6) into (3)–(4) and rearranging, we get

and employing Galerkin projection (9)–(10) yields the equivalent formulation

(14) | |||||

(15) |

Finally, we note that the methods can be equivalently formulated in the
so-called *matricized* format, which can also simplify the
implementation. To this end, we make use of isomorphism between and determined by the operators vec and mat: , (, where , and the upper/lower case notation is assumed
throughout the paper, so (, etc.
Specifically, we define the *matricized* coefficients of the
eigenvector expansion

(16) |

where the column contains the coefficients associated with the basis function.

## 3 Inexact stochastic inverse subspace iteration

We formulate an inexact variant of the inverse subspace iteration from [33] for the solution of (12)–(13). Stochastic inverse iteration was formulated in [37] for the case when a stochastic expansion of a single eigenvalue is sought. It was suggested in [33] that the matrix can be deflated, rather than applying a shift, to find an expansion of an interior eigenvalue, and a stochastic version of modified Gram-Schmidt process [23] can be applied if more eigenvalues are of interest. In this section, we formulate an inexact variant of the stochastic inverse subspace iteration [33, Algorithm 3.2], whereby the linear systems (19) are solved only approximately using preconditioned conjugate gradient method (PCG). The method is formulated as Algorithm 1. We now describe its components in detail, and for simplicity we drop the superscript in the description.

(17) |

(18) |

.

#### Matrix-vector product

The conjugate gradient method and computation of the stochastic Rayleigh quotient require a stochastic version of a matrix-vector product, which corresponds to evaluation of the projection

Since , the coefficients of the expansion are

(20) |

The use of this computation for the Rayleigh quotient is described below. We also note that Algorithm 1 can be modified to perform subspace iteration [23, Algorithm ] for identifying the largest eigenpairs. In this case, the solve (19) is simply replaced by a matrix-vector product (20).

#### Stochastic Rayleigh quotient

In the deterministic case, the Rayleigh quotient is used to compute the eigenvalue corresponding to a normalized eigenvector as , where . For the stochastic Galerkin method, the Rayleigh quotient defines the coefficients of a stochastic expansion of the eigenvalue defined via a projection

The coefficients of are computed using (20) and the coefficients are

which is

(21) |

###### Remark \thetheorem

The Rayleigh quotient (21) finds
coefficients of the eigenvalue expansion, which is consistent with Newton
iteration formulated in Section 4 and also with the literature [23, 37]. We note that it would be
possible to compute the coefficients for as
well, because the inner product of two eigenvectors which are
expanded using chaos polynomials up to degree has nonzero chaos
coefficients up to degree . An alternative is to use a *full*
representation of the Rayleigh quotient based on the projection of . However, from our experience in the present and the previous work [33], the representation (21) is sufficient.

#### Normalization and the Gram-Schmidt process

Let denote the vector norm, induced by the inner product. That is, for a vector evaluated at a point,

(22) |

At each step of stochastic iteration the coefficients of a given set of vectors are transformed into an orthonormal set such that the condition

(23) |

and in particular (13), is satisfied. We adopt the same strategy as in[23, 33], whereby the coefficients of the orthonormal eigenvectors are calculated using a discrete projection and a quadrature rule. An alternative approach to normalization, based on solution of a relatively small nonlinear system was proposed by Hakula et al. [12].

Let us first consider *normalization* of a vector, so . The
coefficients in column of corresponding to coefficients of
a normalized vector are computed as

(24) |

When , the *orthonormalization* (23) is
performed by a combination of stochastic Galerkin projection and the
modified Gram-Schmidt algorithm as proposed in [23],

(25) |

Using the expansion (6) and rearranging, the coefficients in column of are

where

and the coefficients are computed using a discrete projection as in (8),

#### Stopping criteria

The inexact iteration entails in each step of Algorithm 1 a solution of the stochastic Galerkin problem (19) using the preconditioned conjugate gradient method. We use the criteria proposed by Golub and Ye [11, Eq. (1)]; the criteria is satisfied when the relative residual of PCG gets smaller than a factor of the nonlinear residual from the previous step, that is

(26) |

where the factor . It is important to note that Algorithm 1 provides only the coefficients of expansion of the projection of residual on the gPC basis, that is

(27) |

One could assess accuracy using Monte Carlo sampling of this residual by computing

However, in the numerical experiments we use a much less expensive computation, which is based on using coefficients directly as an error indicator. In particular, we monitor the norms of the terms of corresponding to expected value and variance,

(28) |

### 3.1 Preconditioners for the stochastic inverse iteration

We use two preconditioners for problem (19) – the mean-based preconditioner [29, 30] and the hierarchical Gauss-Seidel preconditioner [35]. Both preconditioners are formulated in the Kronecker-product format and we also comment on the matricized formulation. We assume that a preconditioner for the mean matrix is available.

The mean-based preconditioner (MB) is listed as Algorithm 2. Since , the preconditioner entails block diagonal solves with, and recalling that we can write , , its action can be equivalently obtained by solving

(29) |

The preconditioner for (19) is defined as follows.

(30) |

(31) |

The hierarchical Gauss-Seidel preconditioner (hGS) is listed as Algorithm 3. We will denote by a subvector of containing gPC coefficients , and, in particular,. There are two components of the preconditioner. The first component consists of block-diagonal solves with blocks of varying sizes, but computed just as in Algorithm 2, resp. in (29). The second component is used in the setup of the right-hand sides for the solves and consists of matrix-vector products by certain subblocks of the stochastic Galerkin matrix by vectors of corresponding sizes. To this end, we will write , with and denoting a set of (consecutive) rows and columns of matrix so that, in particular, . Then, the matrix-vector products can be written, cf. (20) and note the symmetry of, as

(32) |

where is an index set indicating that the matrix-vector products may be truncated. Possible strategies for truncation are discussed in [35]. In this study, we use with for some and, in particular, we set . We also note that, since the initial guess is zero in Algorithm 3, the multiplications by and vanish from (30)–(31).

## 4 Newton iteration

Use of Newton iteration to solve (9)–(10) was proposed in [6], and most recently studied in [1]. We use a similar strategy also here and formulate a line-search Newton method as Algorithm 4. To begin, we consider the system of nonlinear equations (14)–(15) and rewrite it as

(33) |

where

(34) | ||||

(35) |

The Jacobian matrix of(33) is

(36) |

where

(37) | ||||

(38) | ||||

(39) |

Step of Newton iteration entails solving a linear system

(40) |

followed by an update of the solution

(41) |

The matrix is non-symmetric, but since , we modify linear system (40) in our implementation as

(42) |

which restores symmetry of linear systems solved in each step of Newton iteration. The symmetric Jacobian matrix in (42) will be denoted by. The hierarchical structure of the Jacobian matrix, which is due to the stochastic Galerkin projection, is illustrated by the left panel of Figure 1. The systems (42) are solved inexactly using a preconditioned Krylov subspace method, and the details of evaluation of the right-hand side and the matrix-vector product are given in Appendix A.

### 4.1 Inexact line-search Newton method

In order to improve global convergence behavior of Newton iteration, we consider a line-search modification of the method following [26, Algorithm 11.4]. To begin, let us define the merit function as the sum of squares,

where is the residual of (33), and denote

As the initial approximation of the solution, we use the eigenvectors and eigenvalues of the associated mean problem given by the matrix concatenated by zeros, that is and , and the initial residual is

The line-search Newton method is summarized in our setting as Algorithm 4, and the choice of parameters and in the numerical experiments is discussed in Section 5.1.

The inexact iteration entails in each step a solution of the stochastic Galerkin linear system in Line 4 of Algorithm 4 given by (42) using a Krylov subspace method. In our algorithm we use the adaptive stopping criteria for the method,

(43) |

where . The for-loop is terminated when the convergence check in Line 12 is satisfied; in our numerical experiments we check if .

### 4.2 Preconditioners for the Newton iteration

The Jacobian matrices in (42) are symmetric, indefinite, and so the linear systems can be ideally solved using MINRES iterative method. It is well known that a preconditioner for MINRES must be symmetric and positive definite cf., e.g., [39]. A popular choice is a block diagonal preconditioner, cf. [24],

where and the Schur complement are obtained as approximations of the blocks in (60). Such preconditioner, based on truncation of the series in(57) and (58) to the very first term, was used in [1]. In such setup, we get