Analysis of quasi-optimal polynomial approximations for parameterized PDEs with deterministic and stochastic coefficients
In this work, we present a generalized methodology for analyzing the convergence of quasi-optimal Taylor and Legendre approximations, applicable to a wide class of parameterized elliptic PDEs with finite-dimensional deterministic and stochastic inputs. Such methods construct an optimal index set that corresponds to the sharp estimates of the polynomial coefficients. Our analysis furthermore represents a new approach for estimating best -term approximation errors by means of coefficient bounds, without using Stechkin inequality. The framework we propose for analyzing asymptotic truncation errors is based on an extension of the underlying multi-index set into a continuous domain, and then an approximation of the cardinality (number of integer multi-indices) by its Lebesgue measure. Several types of isotropic and anisotropic (weighted) multi-index sets are explored, and rigorous proofs reveal sharp asymptotic error estimates in which we achieve sub-exponential convergence rates (of the form , with a constant depending on the shape and size of multi-index sets) with respect to the total number of degrees of freedom. Through several theoretical examples, we explicitly derive the constant and use the resulting sharp bounds to illustrate the effectiveness of Legendre over Taylor approximations, as well as compare our rates of convergence with current published results. Computational evidence complements the theory and shows the advantage of our generalized framework compared to previously developed estimates.
This paper focuses on a relevant model boundary value problem, involving the simultaneous solution of a family of equations, parameterized by a finite-dimensional vector , on a bounded Lipschitz domain . In particular, we consider a differential operator defined on , and let , with and , represent the input coefficient associated with the operator . The forcing term is assumed to be a fixed function of . We concentrate on the following parameterized boundary value problem: for all , find , such that the following equation holds
subject to suitable (possibly parameterized) boundary conditions. We require and to be chosen such that system is well-posed in a Banach space, with unique solution , such that, when suppressing the explicit dependence on , the map is defined from the parameter domain into the solution space .
Problems such as arise in contexts of both deterministic and stochastic modeling. In the deterministic setting, the parameter vector is known or controlled by the user, and a typical goal is to study the dependence of on these parameters, e.g., optimizing an output of the equation with respect to (see  for more details). On the other hand, stochastic modeling is motivated by many engineering and science problems in which the input data is not known exactly. A quantification of the effect of the input uncertainties on the output of simulations is necessary to obtain a reliable prediction of the physical system. A natural way to incorporate the presence of input uncertainties into the governing model is to consider the parameters as random variables and a random vector, where and is the set of outcomes. In this setting, we assume the components of have a joint probability density function (PDF) , with known directly through, e.g., truncations of correlated random fields , such that the probability space is equivalent to , where denotes the Borel -algebra on and is the probability measure of .
Monte Carlo (MC) methods (see, e.g., ) are the most popular approaches for approximating high-dimensional integrals, such as expectation or two-point correlation, based on independent realizations , , of the solution to ; approximations of the expectation or other QoIs are obtained by averaging over the corresponding realizations of that quantity. The resulting numerical error is proportional to , thus, achieving convergence rates independent of dimension , but requiring a very large number of samples to achieve reasonably small errors. Moreover, MC methods do not have the ability to simultaneously approximate the solution map , since they are quadrature techniques and do not exploit the fact in many scenarios, the solutions smoothly depend on the coefficient Taking this smooth dependence into account, several global polynomial approximation techniques, for instance, intrusive Galerkin methods  and non-intrusive collocation methods , have been proposed, often featuring much faster convergence rates.
Let . Global polynomial approximation methods seek to build an approximation to the solution of the form:
for a finite multi-index set , where is a multivariate polynomial in for and is the coefficient to be computed, both of which are method specific. Here, for two vectors , we say if and only if for all . Also, given a vector of real numbers, we define with the convention . We will often suppress the dependence on and use the notations and without loss of generality. In this paper, we are interested in solving using a class of polynomial approximations based on the Taylor and Legendre expansions of solution . The polynomial basis considered herein is thus given by the monomials (in the former case) and the tensorized Legendre polynomials (in the latter case).
The evaluation of in requires the computation of coefficients , where is the cardinality of . A naive choice of and their corresponding polynomial spaces , for instance, tensor product polynomial spaces, could lead to an infeasible computational cost, especially when the dimension of the parameter domain is high. It is important to be able to construct the set of the most effective indices for the approximation , which provides maximum accuracy for a given cardinality. In other words, given a fixed , one searches for a set which minimizes the error among all index subsets of of cardinality . This practice has been known as best M-term approximations.
The literature on the best -term Taylor and Galerkin approximations has been growing fast recently, among them we refer to . In the benchmark work , the analytic dependence of the solutions of parametric elliptic PDEs on the parameters was proved under mild assumptions on the input coefficients, and convergence analysis of the best -term Taylor and Legendre approximations was established subsequently. Consider, for example, the expansion of on by a family of normalized polynomials, i.e., . Application of the triangle inequality yields
which suggests determining the optimal index set by choosing the set of indices corresponding to largest . Here, denotes the complement of in . In , the error of such approximation was estimated due to Stechkin inequality (see, e.g., ) such that
where is some number in such that is -summable. It should be noted that the convergence rate does not depend on the dimension of the parameter domain (which is possibly countably infinite therein). This error estimate, however, has some limitations. First, explicit evaluation of the coefficient is inaccessible in general (thus so is the total estimate). Secondly, often occurs with infinitely many values of and stronger rates, corresponding to smaller , are also attached to bigger coefficients. For a specific range of , the effective rate of convergence is unclear. In implementation, finding the best index set and polynomial space is an infeasible task, since this requires computation of all of the . As a strategy to circumvent this challenge, adaptive algorithms which generate the index set in a near optimal, greedy procedure were developed in . This method however comes with a high cost of exploring the polynomial space, which may be daunting in high-dimensional problems.
Instead of building the index set based on exact values of polynomial coefficients , an attractive alternative approach (referred to as quasi-optimal approximation throughout this paper) is to establish sharp upper bounds of (by a priori or a posteriori methods), and then construct the index set corresponding to largest such bounds. For this strategy, the main computational work for the selection of the (near) best terms reduces to determining sharp coefficient estimates, which is expected to be significantly cheaper than exact calculations. Quasi-optimal polynomial approximation has been performed for some parametric elliptic models with optimistic results: while the upper bounds of (denoted from now by ) were computed with a negligible cost, the method was comparably as accurate as best -term approach, as shown in . The first rigorous numerical analysis of quasi-optimal approximation was presented in  for with being a vector with . In that work, the asymptotic sub-exponential convergence rate was proved based on optimizing the Stechkin estimation. Briefly, the analysis applied Stechkin inequality to yield
then took advantage of the formula of to compute , depending on , which minimizes .
Although known as an essential tool to study the convergence rate of best -term approximations, Stechkin inequality is probably less efficient for quasi-optimal methods. As a generic estimate, it does not fully exploit the available information of the decay of coefficient bounds. In such a setting, a direct estimate of may be viable and advantageous to provide an improved result. In addition, the process of solving the minimization problem needs to be tailored to , making this approach not ideal for generalization. Currently, this minimization approach has been limited for some quite simple types of upper bounds. In many scenarios, the sharp estimates of the coefficients may involve complicated bounds which are not even explicitly computable, such as those proposed in . The extension of this approach to such cases seems to be impossible.
In this work, we present a generalized methodology for convergence analysis of quasi-optimal polynomial approximations for parameterized PDEs with deterministic and stochastic coefficients. As the errors of best -term approximations are bounded by those of quasi-optimal methods our analysis also gives accessible estimates (i.e., estimates depending only on known or computable terms) for best -term approximation errors under several established properties on the decaying of the polynomial coefficients. These are sharp explicit theoretical estimates in the scenario that: 1) the triangle inequality has to be employed; 2) one has to evaluate via their bounds , in which case represents the smallest accessible bound of . We particularly focus on elliptic equations where the input coefficient depends affinely and non-affinely on the parameters (see Section 2). However, since our error analysis only depends on the coefficient upper bounds, we expect that the methods and results presented herein can be applied to other, more general model problems with finite parametric dimension, including nonlinear elliptic PDEs, initial value problems and parabolic equations . Our approach seeks a direct estimate of without using the Stechkin inequality. It involves a partition of into a family of small positive real intervals and the corresponding splitting of into disjoint subsets of indices , such that . Under this process, the truncation error can be bounded as
and therefore, the quality of the error estimate mainly depends on the approximation of cardinality of . To tackle this problem, we develop a strategy which extends into continuous domain and, through relating the number of -dimensional lattice points to continuous volume (Lebesgue measure), establishes a sharp estimate of the cardinality up to any prescribed accuracy. This development includes the utilization and extension of several results on lattice point enumeration; for a survey we refer to . Under some weak assumptions on (which are satisfied by all existing coefficient estimates we are aware of), we achieve an asymptotic sub-exponential convergence rate of truncation error of the form , where is a constant depending on the shape and size of quasi-optimal index sets. Through several examples, we explicitly derive and demonstrate the optimality of our estimate both theoretically (by proving a lower bound) and computationally (via comparison with exact calculation of truncation error). The advantage of our analysis framework is therefore twofold. First, it applies to a general class of coefficient decay (and correspondingly, quasi-optimal and best -term approximations); and second, it yields sharp estimates of the asymptotic convergence rates. For convenience, for the rest this paper, we will drop the superscript and simply refer to the quasi-optimal set of cardinality as .
Our paper is organized as follows. In Section 2, we describe the elliptic equations with parameterized input coefficient and necessary mathematical notations. In Section 3, we present the analyticity of the solution with respect to parameter and derive coefficient estimates of Taylor and Legendre expansions of . The advantage of Legendre over Taylor expansions will also be discussed. Our main results on the convergence analysis for a general class of multi-indexed series are established in Section . By means of these results, we give accessible asymptotic error estimate of several quasi-optimal and best- term polynomial approximations in Section . Finally, Section is devoted to further discussions on the error lower bound, as well as the pre-asymptotic estimate in a simplified case.
We consider solving simultaneously the following parameterized linear, elliptic PDE:
on a bounded Lipschitz domain , with the coefficient defined on , with . We require the following assumption:
The Lax-Milgram lemma ensures the existence and uniqueness of solution in , where and is the space of square integrable functions on with respect to the measure with . This setting represents parametric elliptic models as well as stochastic models with bounded support random coefficient. We denote and, without loss of generality, assume in this work.
The corresponding weak formulation for is written as follows: find such that
Following the arguments in , we derive the convergence of Taylor and Legendre approximations based on the analyticity of the solution on complex domains. Here, the convergence is proved under the affine parameter dependence of diffusion coefficients for the Taylor series, but we relax this assumption for the Legendre series. More specifically, we only assume a holomorphic extension of for the complex variable :
This condition is easily fulfilled with consisting of polynomials, exponential, sine and cosine functions of the variables . Below, we give some examples of diffusion coefficients which can be accommodated in our framework. The rigorous proofs and discussion on the advantage of Legendre over Taylor approximations will be postponed to the next section.
Example 1. For the input coefficient depending affinely on the parameters, i.e.,
where , such that satisfies Assumption ?; both Taylor and Legendre series approximations of to converge.
Example 2. Consider the input coefficient defined as
with and . It is easy to see that satisfies Assumptions ?– ?. Thus, the Legendre series approximation of to converges for this model.
Example 3. Consider the input coefficient defined as
with and . We have satisfies Assumptions ?– ? and Legendre series approximation of to converges.
Another framework for establishing the convergence of Legendre series was presented in  and applied to a more general setting of parametric PDEs (non-elliptic, infinite dimensional noise and non-affine dependence on parameters). This approach imposes analyticity assumptions on the solution, which requires nontrivial validation in practice. Instead, in this work, we focus on elliptic equations which allows us to derive concise, minimal assumptions on the input coefficient (as seen above), under which the convergence of Legendre approximations holds straightforwardly. It is also worth recalling that our error estimates only depend on the sharp upper bound of the polynomial coefficients. Therefore, while not studied herein, PDE models covered by , bringing about same types of coefficient bounds as those considered in Section 5, can be treated by the forthcoming analysis.
3Analyticity of the solutions and estimates of the polynomial coefficients
Loosely speaking, the coefficients of Taylor and Legendre expansions can be estimated via three steps:
Extending the uniform ellipticity of from to certain polydiscs/polyellipses in ;
Proving the analyticity of the solution on those extended domains; and
Estimating the expansion coefficients using the analyticity properties and Cauchy’s integral formula.
We will discuss each step in the next subsections. By and , we denote the real and imaginary part of a complex number .
3.1Complex uniform ellipticity
The convergence of Taylor approximations is proved using the uniform ellipticity of the input coefficient in polydiscs containing , based on complex analysis argument.
At the same time, Legendre expansions require the uniform ellipticity in smaller complex domains: the polyellipses.
A close look at DUE and EUE reveals the advantage of Legendre over Taylor expansions. The polyellipses extend the real domain in a continuous manner, so that if tends toward , shrinks to . Thus, it is hopeful that the uniform ellipticity property of in (Assumption ?) can carry over to some polyellipses (at least with close to ). In fact, we prove that EUE property is a consequence of Assumptions ? and ?.
On the other hand, DUE always requires an extension of the coercive property in to the unit polydisc , to say the least, which is not possible generally. For illustration, the sets of such that for all with some fixed (referred to as the domains of uniform ellipticity) are plotted in Figure 1 for some typical -dimensional parametric coefficients. The maximal ellipses and discs contained in these domains are shown. We observe that for the affine coefficient, the set spans unrestrictedly along the imaginery axis, and discs covering can easily be placed inside. It highlights the success of Taylor approximations for parameterized models which depend affinely on the parameters, . This property however no longer holds for non-affine, yet holomorphic diffusion coefficients. Taylor approximations for these cases can be treated by a real analysis approach, but under additional strong constraints, see .
We close this subsection with a proof of Lemma ?.
Proof. (of Lemma ?). Since is holomorphic in , we have is a continuous mapping. By Heine-Cantor theorem, is uniformly continuous in any compact subset of . Fixing a , without loss of generality, we can choose such that satisfying , there holds
for all such that with some . Denoting , we proceed to prove there exists with such that the polyellipse is included in .
First, consider the “polyrectangle”
we will show that . Indeed, for every , choose as follows: if , then ; otherwise, . It is easy to see that . Furthermore, for all ,
Thus, and we have
This gives and .
It remains to find satisfying . To make this hold, we only need to select such that the lengths of axes of each ellipse are less than the lengths of corresponding sizes of the rectangle, i.e.,
The choice of fulfills this condition, with which . There follows
for all and satisfies EUE, as desired.
3.2Analyticity of the solutions with respect to the parameters
If DUE/EUE holds, according to the Lax-Milgram theorem, is defined and uniformly bounded in certain polydiscs /polyellipses containing . Exploiting this fact and the analyticity of in , we establish the analyticity of the map . The results given in this section essentially follow those in Section 2.1 of , but apply to the more general cases of smooth, non-affine diffusion coefficients.
Proof. We will prove this theorem for satisfying DUE. The other case should follow similarly. Defining
the proof consists of two steps showing that
is an open neighborhood of the polydisc ; and
the map is holomorphic in .
Here, is the interior of .
First, let us choose an arbitrary element in . For , we denote the open ball radius centered at in . Observing that the map is holomorphic in , we have is a continuous function in . There exists depending on such that for all ,
and for all . We obtain for all , which concludes Step 1.
As is holomorphic in , for all , there exists
where and denotes the Kronecker sequence with at index and at other indices. The proof of Step 2 is then similar to that of Lemma 2.2,  and would be omitted here.
3.3Estimates of the polynomial coefficients
Under the analyticity properties established in Theorem ?, the convergence of Taylor and Legendre expansions of the solutions, as well as estimates of the expansion coefficients, are well-studied, e.g., in . In this subsection, we review without proof those results in the context of finite-dimensional, possibly non-affine parametric coefficients. Recall that . For all , we introduce the multivariate notations , and define the partial derivative . The Taylor series of reads
where the coefficients are defined as
The convergence of the Taylor expansion in and estimates of are given in the following.
[Proof was removed.]
On the other hand, the tensorized Legendre series of is defined as
where , with denoting the monodimensional Legendre polynomials of degree according to normalization .
A second type of Legendre expansion, which employs the normalized version of , is also considered. Denote the multivariate polynomials , with given by
The Legendre series in this case can be written as
We note that the coefficients are defined by
The following proposition establishes estimates of and the convergence of the Legendre expansions of in .
[Proof was removed.]
Under Assumptions ?– ?, we remark that EUE and, if adding affine dependence on parameters, DUE normally hold for infinitely many couples of . We call the set of all such that EUE/DUE is fulfilled the admissible set and denote it by for both cases. For a fixed , the best coefficient bounds given by Propositions ? and ? will be
Finding an efficient computation of these infimums and algorithm to construct the corresponding quasi-optimal index sets is an open question. In the specific case where the basis functions have non-overlapping supports, however, the vectors solving the minimization problems in can be found easily. In this case, the best a priori estimates retrieve the forms and . Recent studies have shown that although these theoretical bounds are not sharp, they construct quite accurate polynomial spaces, see .
4Asymptotic convergence analysis for a general class of multi-indexed series
Consider a multi-indexed sequence of coefficient estimates written in the form . In this section, we introduce a new, generalized approach to estimating the asymptotic convergence of with respect to , under some general assumptions of which accommodate most types of Taylor and Legendre coefficient bounds established in current literature. We recall that this truncation error represents the error of quasi-optimal methods, as well as accessible error of best -term approximations. It is enough to conduct the analysis with being the sets of all such that with some .
Our method can be summarized as follows. First, we split into a family of disjoint subsets of based on values of , where contains satisfying , so that the truncation error can be bounded as
Obviously, finding a sharp approximation of is central to estimate . We define the superlevel sets of -dimensional real points
and, with notice that , seek to count points with integer coordinates in . An appealing approach to solving this problem is to study the interplay between and the continuous volume (Lebesgue measure) of . We first employ the following well-known result in measure theory, reflecting the intuitive fact that for a geometric body in , the volume of , denoted by , can be approximated by the number of shrunken integer points inside , see, e.g., Section 7.2 in  and Section 1.1 in .
Concerning our goal of estimating , Lemma ? has an interesting consequence: If is defined such that , , with some , one obtains a simple asymptotic formula for :
Such approximation is powerful since, loosely speaking, it would allow replacing by and reduce to a much easier, yet equivalent problem of estimating the truncation error via
The property that the sets are unchanged over is, however, restrictive, corresponding to only a few types of coefficient upper bounds, for instance, is linear in . For this approach of estimation to be considered useful in general quasi-optimal approximation setting, this condition needs to be relaxed.
For the technicality, we now extend definition to equip the superlevel sets with real indices: for , define
Note that the assertion of Lemma ? still holds if replacing by . We establish, in Lemma ? below, formula under some weaker assumptions on :
is Jordan measurable for countably infinite ,
The chain is either ascending or descending towards a Jordan measurable limiting set with .
As we shall see later, these properties are satisfied by most existing polynomial coefficient estimates.
Proof. We will give a proof with satisfying . The other case can be shown analogously. Let be an arbitrary positive number. By Lemma ?,
Since , we can choose such that ,
On the other hand, from , it yields . Let us pick an so that is Jordan measurable and . By Lemma ?,
There exists satisfying ,
Since , we have , which gives
Combining and proves .
Lemma ? provides us with an asymptotic formula of the form to approximate the number of integer points inside , under some conditions on . Given a coefficient upper bound , it is desirable to derive properties of such that its corresponding superlevel sets fulfill these conditions. For all , define the map as
We proceed to state and validate the following assumptions on .
Proof. From the continuity of in (Assumption ?.1), is Jordan measurable for all except a countable number of values of (see ).
Next, from Assumption ?.2, if is increasing for all , one has , which implies
Since converges towards as , is bounded for every . It is trivial that is bounded. Let , we have for large enough. Combining with Assumption ?.3 yields . Thus, and .
If, on the other hand, is decreasing for all , then , which gives
Since , it is trivial that . Furthermore, for any , with large enough. Combining with Assumption ?.3 that , this implies . Thus, and .
If Jordan measurable, since the family has been proved to satisfy the conditions of Lemma ?, we can apply this to get .
As seen in the proof, the continuity of (Assumption ?.1) assures that the superlevel sets are “well-behaved” (Jordan measurable). Meanwhile, the monotonicity of (Assumption ?.2) leads to the ascending (or descending) property of the chain . To guarantee the limiting set is bounded and not null, we assume for some (Assumption ?.3), so that . It should be noted that and are generic constants, which are only utilized to represent the boundedness of and do not affect our convergence rate, thus a specification of and is not necessary. In the subsequent analysis, we applies to derive an error estimate, only depending on and the parameter dimension , of the form . This rate is consistent with the proven sub-exponential convergence for some simple coefficient upper bounds . Nevertheless, our analysis completely exploits detailed information on the size and shape of the index sets in the asymptotic regime via the introduction of and, as a result, acquires the optimal value of .
It is worth remarking that Lemma ? requires to be Jordan measurable. Indeed, we show here a simple counterexample in which is not Jordan measurable and fails to hold. Consider the integer-indexed collection of Jordan measurable sets defined by
Observing that is descending towards , which is not Jordan measurable. We have while , contradictory to . The conditions on Jordan measurability of is, however, not restrictive in the context of quasi-optimal methods, since the shapes of limiting sets are often not very complicated, e.g., fractal. Indeed, all examples investigated herein show the convexity of , which trivially implies its Jordan measurability, as required.
The mathematical evidence that Assumption ? is satisfied by published Taylor and Legendre coefficient estimates will be presented in Section 5. Four examples of upper bounds will be considered, including (as in ), (as in ), (as in ), and (as in ). For now, with Lemma ? giving an approximation for , it remains to study the estimation problem . We proceed to prove the following supporting result.
Proof. We have