Block preconditioning of stochastic Galerkin problems:
New two-sided guaranteed spectral bounds
This version dated: April 30, 2019.
Funding: The work of M. K. was supported by the Czech Academy of Sciences through the project L100861901 (Programme for promising human resources – postdocs) and by the Ministry of Education, Youth and Sports of the Czech Republic through the project LQ1602 (IT4Innovations excellence in science). The work of I. P. was supported by the Grant Agency of the Czech Republic under the contract No. 17-04150J.
The paper focuses on the numerical solution of parametrized second-order partial differential equations with a scalar parameter-dependent coefficient function by the stochastic (spectral) Galerkin method. We study preconditioning of the related discretized problems using preconditioners obtained by modifying the stochastic part of the partial differential equation. We present a simple but general approach for obtaining two-sided bounds to the spectrum of the resulting matrices, based on a particular splitting of the discretized operator. Using this tool and considering the stochastic approximation space formed by classical orthogonal polynomials, we obtain new spectral bounds depending solely on the properties of the coefficient function and the type of the approximation polynomials for several classes of block-diagonal preconditioners. These bounds are guaranteed and applicable to various distributions of parameters. Moreover, the conditions on the parameter-dependent coefficient function are only local, and therefore less restrictive than those usually assumed in the literature. The technique introduced in the paper can also be employed in a posteriori error estimation and in adaptive algorithms.
M. Kubínová, I. Pultarová
Keywords: stochastic Galerkin method, preconditioning, block-diagonal preconditioning, spectral bounds
Growing interest in uncertainty quantification of numerical solutions of partial differential equations stimulates new modifications of standard numerical methods. A popular choice for partial differential equations with parametrized or uncertain data is the stochastic Galerkin method [4, 35]. As in deterministic problems, approximate solutions, which depend on physical and stochastic variables (parameters), are sought in finite-dimensional subspaces of the original Hilbert space. More precisely, the approximate solutions are orthogonal projections of the exact solution onto the finite-dimensional subspaces with respect to the energy inner product defined by the operator of the equation; see, e.g., [5, 9, 10, 23]. The approximation subspaces are considered in the form of a tensor product of a physical variable space (finite-element functions) and a stochastic variable space (polynomials); see, e.g., [4, 14]. The form and properties of the system matrix of the discretized problem are determined by the structure of the uncertain data and the type of the finite-dimensional solution spaces. For special classes of parameters, it was shown, see, e.g., [23, 31], that certain block-diagonal matrices are spectrally equivalent to the system matrix independently of the degree of polynomials and the number of random parameters, and thus they can be used for preconditioning. Having a good preconditioning method or, in other words, a good and feasible approximation of the system matrix, we may also efficiently estimate a posteriori the energy norm of the error during iterative solution processes [1, 5, 6, 9, 17]. This estimate can be used in adaptive algorithms [5, 7, 8]. In practice, the system matrix is never built explicitly; only matrix-vector products are evaluated.
In this paper, we focus on matrices arising in the discretized stochastic Galerkin method and present new guaranteed two-sided bounds to the spectra of the preconditioned matrices for several types of preconditioners. We consider only preconditioning with respect to the stochastic parts of problems, and thus we assume that a suitable preconditioning method or an efficient solver for the underlying deterministic problem is available; see, e.g., [12, 22, 32]. We formulate an idea of obtaining bounds to the spectra of the preconditioned matrix from the spectrum of small Gram matrices depending solely on the stochastic part of the approximation space. The motivation, however, comes from techniques and tools of the algebraic multilevel preconditioning introduced in [11, 2]. A similar idea was already used, in a simpler form, in [27, 28]. In the current paper, it is applied in a more general setting, and we believe that the derived technique may lead to an improvement of some other recently introduced estimates, such as [17, 22]. The derived technique is also applicable to systems in the form of a multi-term matrix equation (see [24, eq. (1.8)]).
The paper is organized as follows. In Section 2, we briefly recall the stochastic Galerkin method and the structure of the matrices of the resulting systems of linear equations for the tensor product polynomials and complete polynomials. Since the structure of the system matrix plays a crucial role in the analysis, theoretical considerations will be accompanied by illustrative examples throughout the paper. Section 3 formulates a general concept of proving spectral equivalence for a broad class of (not only) stochastic Galerkin preconditioners. In Section 4, we apply this idea to preconditioners which are represented by a special type of block-diagonal (or Schur complement) approximations of the system matrix, and show how to obtain the spectral bounds of the preconditioned problems from the spectral bounds of small Gram matrices of the corresponding polynomial chaos. We also evaluate those bounds explicitly for the considered polynomial chaoses. Simple numerical examples demonstrating the obtained theoretical outcomes are presented at the end of the section.
Throughout the paper, we denote by , where and are symmetric positive definite, the spectral condition number of , i.e., the standard condition number of or, in other words, . By we denote the -th column of the identity matrix, where its size follows from the context.
2 Stochastic Galerkin matrices
Consider the variational problem of finding , such that
where is a bounded polygonal domain, or , is a parametric measure space, , , , and . The gradient is applied only with respect to the (physical) variable . Let , where are outcomes of independent random variables with probability densities , . The joint probability density is then . In the following, we consider defined on such that outside . Thus, instead of and , we further write and , respectively. For the convenience of notation, the probability densities are not normalized, see also Table 1, and we further refer to them as weights.
We assume in the affine form
where , . While it is usually assumed that there exist constants and such that
in this paper we consider more general functions . We will only require that the left-hand side of Eq. 1 defines an inner product on a finite-dimensional approximation space ; see Section 2.3. This will allow us to use random variables with unbounded images and still obtain positive definite system matrices. In other words, we can avoid truncating the supports of distribution functions or modifying them in any other way. Of course, under such a (weaker) condition on , Eq. 1 may not be well-defined. In this paper, however, we focus only on the discretized problem obtained from Eq. 1; see also the discussion in .
We consider discretization using the tensor product space [3, 4, 14] of the form , where is an -dimensional space spanned by the finite-element (FE) functions , and is an -dimensional space spanned by -variate polynomials , …, of variables , …, . Denoting the basis functions of by a couple of coordinates and , we obtain the matrix of the system of linear equations of the discretized Galerkin problem Eq. 1 with elements
where we formally set . If the numbering of the basis functions is anti-lexicographical, the structure of is
In other words, the matrix is composed of blocks, each of size .
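The block structure suggests how a product with the system matrix can be evaluated without assembling it. The following is a minimal sketch, assuming the system matrix has the Kronecker form of a sum of terms `kron(G_k, K_k)`, where `G_k` are small stochastic matrices and `K_k` are FE matrices. The names `apply_A`, `G_list`, and `K_list` are hypothetical, and the random matrices are placeholders rather than data from the paper.

```python
import numpy as np

def apply_A(G_list, K_list, x):
    """Return (sum_k kron(G_k, K_k)) @ x without forming the Kronecker products."""
    n_p = G_list[0].shape[0]    # number of stochastic basis polynomials
    n_fe = K_list[0].shape[0]   # number of FE basis functions
    # Row i of X holds the FE coefficients that belong to the i-th polynomial.
    X = x.reshape(n_p, n_fe)
    Y = np.zeros_like(X)
    for G, K in zip(G_list, K_list):
        # kron(G, K) @ vec(X) corresponds to G @ X @ K.T in this row-major layout.
        Y += G @ X @ K.T
    return Y.reshape(-1)

# Placeholder data (assumption): two terms, 3 polynomials, 4 FE functions.
rng = np.random.default_rng(0)
n_p, n_fe = 3, 4
G_list = [rng.standard_normal((n_p, n_p)) for _ in range(2)]
K_list = [rng.standard_normal((n_fe, n_fe)) for _ in range(2)]
x = rng.standard_normal(n_p * n_fe)

# Reference product with the explicitly assembled matrix, for comparison only.
A_full = sum(np.kron(G, K) for G, K in zip(G_list, K_list))
y_matrix_free = apply_A(G_list, K_list, x)
y_assembled = A_full @ x
```

In this sketch the cost per term is two small dense products instead of one product with a matrix of size `n_p * n_fe`; with sparse FE matrices the same reshaping idea applies.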
Assume , the uniform distribution , and let , and be the normalized Legendre orthogonal polynomials of degrees 0, 1, and 2, see Table 1. Then and
2.1 Approximation spaces and their bases
For approximation of the physical part of the solution, we use an -dimensional space . To approximate the stochastic part of the solution we use the -dimensional space of -variate polynomials , . To simplify the notation, we assume that the parameters , …, are identically distributed, i.e., . Thus, we omit the superscripts and subscripts in and , respectively. The extension of the results to polynomial bases with different is straightforward.
In practice, sets of complete polynomials (C) or tensor product polynomials (TP) are usually used; see, e.g., [14, 23]. The set of the tensor product polynomials of the degree at most in variable , , is defined as
Let us denote by the corresponding approximation space of Eq. 1. The set of complete polynomials of the maximum total degree is defined as
Let us denote by the corresponding approximation space of Eq. 1.
For both and , the bases are usually constructed as products of classical orthogonal polynomials. More precisely , , where are normalized orthogonal polynomials of the degrees , with respect to the weight function , i.e.,
The basis functions of the discretization space are then of the form
For the tensor product polynomials, we consider the anti-lexicographical ordering of the basis functions, i.e., the leftmost index () in Eq. 8 is changing the fastest, while the rightmost index () is changing the slowest. For the complete polynomials, we consider ordering by the total degree of the polynomials, going from the smallest to the largest.
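The two orderings can be illustrated on the multi-indices of the polynomial degrees. This is a sketch under stated assumptions: the same maximal degree `p` is used in every variable for the tensor product set, and ties within one total degree of the complete set keep the tensor product order (the text does not fix the tie-breaking rule). The function names are hypothetical.

```python
from itertools import product
from math import comb

def tensor_product_indices(n_vars, p):
    """Multi-indices with every entry <= p; the leftmost entry changes fastest."""
    # itertools.product varies the last position fastest, so each tuple is reversed.
    return [idx[::-1] for idx in product(range(p + 1), repeat=n_vars)]

def complete_indices(n_vars, p):
    """Multi-indices with total degree <= p, ordered by increasing total degree."""
    kept = [idx for idx in tensor_product_indices(n_vars, p) if sum(idx) <= p]
    # sorted() is stable, so ties keep the tensor product order (an assumption).
    return sorted(kept, key=sum)

tp = tensor_product_indices(2, 2)
c = complete_indices(2, 2)
print(tp)
print(c)
```

For two variables and degree 2, the tensor product set has 9 members and the complete set has 6, which agrees with the binomial count `comb(n_vars + p, p)` for polynomials of bounded total degree.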
If are orthogonal polynomials of degrees with respect to , then the double orthogonal polynomials are obtained as the Lagrange polynomials corresponding to the zeros of . If we use the double orthogonal polynomials as a basis of , the matrix becomes block-diagonal with the diagonal blocks of the sizes . Such a block-diagonal matrix can also be obtained by simultaneous diagonalization of all matrices , see . This block-diagonal structure of the resulting matrices seems favourable for practical computations. However, the double orthogonal polynomials cannot be used as a basis for complete polynomials . Moreover, for this basis, we cannot obtain methods for a posteriori error estimation or adaptivity control in a straightforward way. In addition, to refine the space , all diagonal blocks of the matrix must be recomputed, because a new set of zeros of (instead of the zeros of ) and thus a new set of basis polynomials must be used. Therefore, in this paper, we only consider the classical orthogonal polynomials to construct the bases of or .
2.2 Matrices for classical orthogonal polynomials
The form of the matrices , , in Eq. 4 depends on the choice of the basis of or and will be important for our future analysis. As will be described later, the matrices can be constructed from (the elements of) a sequence of smaller matrices
Let the normalized orthogonal polynomials satisfy the well-known three-term recurrence
then , where is the identity matrix, and have the form of the Jacobi matrix
The eigenvalues of this matrix are given by the roots of the polynomial , which are distinct and lie in the support of ; see, e.g., . In Table 1, we list the classical orthogonal polynomials with symmetric statistical distribution considered here together with the weight function corresponding to the non-normalized probability density. Note that due to the symmetry, the diagonal entries of in Eq. 12 become trivially zero. These matrices will play a crucial role in deriving spectral bounds, see Section 4.
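The relation between the Jacobi matrix and the polynomial roots can be checked numerically. The sketch below assumes the uniform distribution on [-1, 1] with orthonormal Legendre polynomials, for which the off-diagonal recurrence coefficients are k / sqrt(4k^2 - 1); the function name `legendre_jacobi` is hypothetical.

```python
import numpy as np

def legendre_jacobi(n):
    """n-by-n Jacobi matrix of the orthonormal Legendre polynomials on [-1, 1]."""
    k = np.arange(1, n)
    off = k / np.sqrt(4.0 * k**2 - 1.0)   # off-diagonal recurrence coefficients
    # The diagonal is zero because the weight is symmetric about the origin.
    return np.diag(off, 1) + np.diag(off, -1)

n = 4
G = legendre_jacobi(n)
eigs = np.linalg.eigvalsh(G)                    # eigenvalues in ascending order
nodes, _ = np.polynomial.legendre.leggauss(n)   # roots of the degree-n Legendre polynomial
nodes = np.sort(nodes)
```

The eigenvalues of the 4-by-4 matrix coincide with the four Gauss-Legendre nodes, which are distinct and lie inside the support (-1, 1).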
|statistical distribution||weight function||support||polynomial chaos|
|Wigner semicircle||Chebyshev (second kind)||0|
For the tensor product polynomials, the matrices , , are obtained as
Consider the tensor product Legendre polynomials of two variables and , with , then and the matrix has the form
where the blocks corresponding to the changing degree of the approximation polynomials of the variable are separated graphically.
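The Kronecker structure for two variables can be verified independently by quadrature. This sketch assumes orthonormal Legendre polynomials of degree at most `p = 2` in each variable and the ordering in which the index of the first variable changes fastest; all variable names are illustrative and not taken from the paper.

```python
import numpy as np

p = 2          # maximal degree in each variable (illustrative choice)
n = p + 1

# 1D matrix G from the three-term recurrence of orthonormal Legendre polynomials.
k = np.arange(1, n)
off = k / np.sqrt(4.0 * k**2 - 1.0)
G = np.diag(off, 1) + np.diag(off, -1)

# Kronecker form when the index of the first variable changes fastest.
G1 = np.kron(np.eye(n), G)   # multiplication by the first variable
G2 = np.kron(G, np.eye(n))   # multiplication by the second variable

# Independent check by tensor Gauss-Legendre quadrature (exact for these degrees).
nodes, weights = np.polynomial.legendre.leggauss(p + 2)
scale = np.sqrt((2 * np.arange(n) + 1) / 2.0)             # normalization on [-1, 1]
V = np.polynomial.legendre.legvander(nodes, p) * scale    # orthonormal basis at the nodes
V2 = np.kron(V, V)                           # 2D basis values, first variable fastest
w2 = np.kron(weights, weights)               # tensor quadrature weights
xi1 = np.kron(np.ones_like(nodes), nodes)    # first variable on the tensor grid
xi2 = np.kron(nodes, np.ones_like(nodes))    # second variable on the tensor grid
G1_quad = V2.T @ (V2 * (w2 * xi1)[:, None])
G2_quad = V2.T @ (V2 * (w2 * xi2)[:, None])
```

The quadrature-based matrices agree with the Kronecker products, so the fastest-changing index corresponds to the last Kronecker factor.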
For complete polynomials, the matrices lose the Kronecker product structure, since is not a tensor product space. However, since , each matrix is permutation-similar to a submatrix of the matrices in Eq. 13, [14, Lemma 3].
Consider the complete Legendre polynomials of two variables and and , then and the relevant submatrix of the tensor-product matrix Eq. 17 is
Reordering the entries by the total degree of the corresponding polynomial, we obtain
where the blocks corresponding to the total degrees 0, 1, and 2 are separated graphically.
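The extraction and reordering described above can be sketched as follows, again assuming two variables, orthonormal Legendre polynomials, and total degree at most 2. The tie-breaking inside one total degree is an assumption, and the variable names are hypothetical.

```python
import numpy as np

p = 2
k = np.arange(1, p + 1)
off = k / np.sqrt(4.0 * k**2 - 1.0)
G = np.diag(off, 1) + np.diag(off, -1)   # 1D Jacobi matrix, size (p+1) x (p+1)
G1_tp = np.kron(np.eye(p + 1), G)        # tensor product matrix for the first variable

# Multi-indices (i1, i2) in the tensor product ordering, i1 changing fastest.
tp_indices = [(i1, i2) for i2 in range(p + 1) for i1 in range(p + 1)]

# Keep total degree <= p and reorder by total degree (stable sort for ties).
kept = [pos for pos, idx in enumerate(tp_indices) if sum(idx) <= p]
order = sorted(kept, key=lambda pos: sum(tp_indices[pos]))
degrees = [sum(tp_indices[pos]) for pos in order]

# Permuted submatrix that represents the complete-polynomial case.
G1_c = G1_tp[np.ix_(order, order)]
```

Since multiplication by one variable changes the degree in that variable by exactly one, the diagonal blocks of `G1_c` (same total degree) vanish, and only neighbouring total degrees are coupled.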
2.3 Positive definiteness
The left-hand side of the equation Eq. 1 defines the bilinear form on . We present sufficient conditions on the function , under which becomes an inner product (called energy inner product; see, e.g., [5, 9, 10]) on the finite-dimensional space . To achieve positive definiteness of the bilinear form , we need to assume some dominance of the deterministic part over the stochastic part , . In this paper, we will assume that there exists a constant such that
where the particular choice of depends on the weight . For the Beta distribution on , it suffices to take , while for the Gauss distribution, we take for tensor product polynomials and for complete polynomials. (Footnote 1: Since the eigenvalues of matrix are the roots of the Hermite polynomials and thus lie in the interval [33, p.120], the eigenvalues of are strictly positive.) Note that this choice of also trivially implies that is positive definite. For further discussion on bounds of Hermite and Lagrange polynomials see, e.g., .
We emphasize that the assumption Eq. 20 is weaker than the classical assumption widely used to obtain spectral estimates, e.g.,
for the uniform distribution; see [13, 17, 22, 23, 34]. The main difference between Eq. 20 and Eq. 21 is that the former is considered point-wise, while the latter uses the norms of over . The condition Eq. 20 not only yields more accurate two-sided guaranteed bounds to the spectra, but also applies to parameter distributions and functions for which no estimate could be obtained using the standard approach; see Section 4.4. Assumption Eq. 20 is sufficient to achieve positive definiteness of . In some applications, we can assume a stronger dominance of , i.e.,
The smaller the , the more favourable spectral bounds of the matrices and of the preconditioned matrices are generally achieved. We will further assume that is the smallest number for which Eq. 22 is satisfied.
3 Proving spectral equivalence of inner products on
We consider preconditioning methods based on inner products that are spectrally equivalent to the energy inner product on , but are represented by matrices with more favourable non-zero structures such as, for example, block-diagonal matrices. We base our approach on a splitting of the inner products to subdomains (Lemma 1) and on a preconditioning of a tensor product matrix (Lemma 2).
Let be partitioned into arbitrary non-overlapping elements (subdomains) , . Consider the following decomposition of from Eq. 5
We further assume that the functions , , (and thus the function ) are constant on every element (subdomain) , . We define
If are not constant on elements, we would assume a stronger, element-wise, dominance of over , i.e.,
instead of Eq. 22, which would result in a slight modification of the spectral estimates derived in subsequent sections. To simplify the presentation, we do not describe these modifications in more detail.
Using Eq. 25, we obtain
Therefore, we can write
In other words, we obtained a decomposition of in which the dependence of the FE matrices on is compensated by splitting of the operator to elements.
In this paper, we consider preconditioners corresponding to an inner product defined on whose matrix representation (with respect to the same basis) is of the form analogous to Eq. 29, in particular
where and are such that the matrices are positive semidefinite for all . We will see later that many of the preconditioners that are used in practice are indeed of the form Eq. 30.
The following theorem shows that the spectral equivalence between and can be obtained from the spectral equivalence between and on each element , . The obtained spectral bounds do not depend on the type and the number of the FE basis functions.
The proof of Theorem 1 is based on the following two lemmas.
Let and be two inner products on a Hilbert space . Let the inner products be composed as
where and , , are positive semidefinite bilinear forms on . Let there exist two positive real constants and such that the induced seminorms are uniformly equivalent in the following sense
Then the induced (cumulative) norms are also equivalent with the same constants, i.e.,
The proof follows trivially from:
Lemma 1 can also be formulated in terms of matrices: If , and for all and , then for all .
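The matrix form of Lemma 1 can be illustrated numerically. In this sketch the names `A_j` and `M_j` for the positive semidefinite terms, the constants `alpha` and `beta`, and the random low-rank construction are all assumptions made for illustration; they are not taken from the paper.

```python
import numpy as np

rng = np.random.default_rng(1)
n, n_terms = 6, 5
alpha, beta = 0.5, 3.0

A = np.zeros((n, n))
M = np.zeros((n, n))
for _ in range(n_terms):
    B = rng.standard_normal((n, 2))        # rank-2 factor: each term is only semidefinite
    d = rng.uniform(alpha, beta, size=2)   # term-wise equivalence constants in [alpha, beta]
    M += B @ B.T                           # M_j = B B^T
    A += B @ np.diag(d) @ B.T              # alpha * M_j <= A_j <= beta * M_j

# Generalized eigenvalues of (A, M) through the Cholesky factor of M.
L = np.linalg.cholesky(M)
L_inv = np.linalg.inv(L)
eigs = np.linalg.eigvalsh(L_inv @ A @ L_inv.T)
```

Although every single term is singular, the sums are positive definite here, and all eigenvalues of the preconditioned sum stay within the term-wise bounds `alpha` and `beta`.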
Let be symmetric positive definite and be symmetric positive semidefinite. Let
hold for some positive real constants and . Then also
If is invertible, then the proof follows trivially from