# On discrete least square projection in unbounded domain with random evaluations and its application to parametric uncertainty quantification

Tao Tang, Department of Mathematics, The Hong Kong Baptist University, Kowloon Tong, Kowloon, Hong Kong, China. Email: ttang@math.hkbu.edu.hk

Tao Zhou, Institute of Computational Mathematics and Scientific/Engineering Computing, AMSS, the Chinese Academy of Sciences, Beijing, China. Email: tzhou@lsec.cc.ac.cn
###### Abstract

This work is concerned with approximating multivariate functions in unbounded domains by discrete least-squares projection with random point evaluations. Particular attention is given to functions with Gaussian or Gamma random parameters. We first demonstrate that the traditional Hermite (Laguerre) polynomial chaos expansion suffers from instability, in the sense that an unfeasible number of points, depending on the dimension of the approximation space, is needed to guarantee stability in the least-squares framework. We then propose to use the Hermite/Laguerre functions (rather than polynomials) as bases in the expansion. The corresponding design points are obtained by mapping uniformly distributed random points in bounded intervals to the unbounded domain; the mapping involves a parameter $L$. By using the Hermite/Laguerre functions and a proper mapping parameter, the stability can be significantly improved, even if the number of design points scales only linearly (up to a logarithmic factor) with the dimension of the approximation space. Apart from stability, another important issue is the rate of convergence. To speed up the convergence, an effective scaling factor is introduced, and a principle for choosing a quasi-optimal scaling factor is discussed. Applications to parametric uncertainty quantification are illustrated by considering a random ODE model and an elliptic problem with lognormal random input.

## 1 Introduction

In recent years, there has been a growing need to model uncertainty in mathematical and physical models and to quantify the resulting effect on output quantities of interest (QoI). Several methodologies for accomplishing these tasks fall under the growing sub-discipline of Uncertainty Quantification (UQ). In general, one can use a probabilistic setting to include these uncertainties in mathematical models. In such a framework, the random input parameters are modeled as random variables; infinite-dimensional analogues leveraging random fields with a prescribed correlation structure extend this procedure to more general settings. Frequently, the goal of this mathematical and computational analysis becomes the prediction of statistical moments of the solution, or statistics of some QoI, given the probability distribution of the input random data.

A fundamental problem in UQ is the approximation of a multivariate function $f(y)$ whose parameter $y$ is a $d$-dimensional random vector. The function might be the solution of a stochastic PDE problem or a derived QoI of such a system. Efficient and robust numerical methods that address such problems have been investigated in detail in recent years (see, e.g. [7, 34, 35, 13, 32, 33, 25] and references therein). One method that has enjoyed much attention and success is the generalized Polynomial Chaos (gPC) method, see, e.g., [32, 34, 35, 13], which is a generalization of the Wiener-Hermite polynomial chaos expansion [31]. In gPC, we expand the solution in polynomials of the input random variables $y$. When the solution depends smoothly on $y$, gPC yields efficient convergence rates with respect to the polynomial degree of the expansion. With intrusive gPC approaches, existing deterministic solvers must be rewritten, and solvers for a coupled system of deterministic equations are needed, which can be very complicated if the underlying differential equations have a nontrivial nonlinear form, see, e.g., [34, 7, 38]. By contrast, non-intrusive methods build a polynomial approximation by leveraging only existing deterministic solvers in a Monte-Carlo-like fashion.

To efficiently build a gPC approximation, one can resort to the discrete least-squares projection onto a polynomial space. A major design criterion for this approach is the specification of the $m$ sample locations. There exist a number of popular design grids: randomly generated points, quasi-Monte Carlo points, specially designed points, etc., see e.g., [17, 10, 18, 37]. Obtaining the optimal sample design is not straightforward, as demonstrated by the recent comparison in [12]. Analysis of the least-squares approach with random points has been addressed in several contexts, see e.g., [24, 9, 37]. Generally speaking, the least-squares approach is stable when the number of sample points scales quadratically with the dimension of the approximation space. This quadratic condition can be weakened if we deal with the Chebyshev measure [8].

Note that all the above results are for random parameters in bounded domains. To the best of our knowledge, there is no exhaustive investigation for problems in unbounded domains, i.e., for functions with Gaussian or Gamma random parameters. In this paper, we consider the problem of approximating functions with Gaussian or Gamma random parameters by discrete least-squares projection with random point evaluations. In this case, the traditional approach is to use the so-called Hermite or Laguerre chaos expansions, with collocation points generated with respect to the Gaussian or Gamma measure. However, we will show that such an approach suffers from instability, in the sense that the corresponding design matrices in the least-squares approach are well conditioned only when the number of random points is exponentially related to the dimension of the approximation space, i.e., $m=N^{cN}$ with $N$ the dimension of the approximation space. This is obviously unacceptable for practical computations.

To improve the stability, we propose to use Hermite (Laguerre) function approximation in place of the Hermite (Laguerre) polynomial approach. Then mapped uniformly distributed random points are used to control the condition number of the design matrix. By choosing a suitable mapping parameter, it is demonstrated numerically that these two strategies keep the condition number small even when the number of design points is only linearly proportional to the dimension of the approximation space. This stability result is further justified by a theoretical proof.

The rate of convergence is another serious issue. In fact, approximating a function by Hermite polynomials or functions was rejected by Gottlieb and Orszag ([14], pp. 44-45). Studying the rate of convergence of a Hermite series for a simple oscillatory function, they concluded that the result is very bad: resolving a moderate number of wavelengths requires a disproportionately large number of Hermite polynomials, and that "because of the poor resolution properties of Hermite polynomials the authors doubt they will be of much practical value in applications of spectral methods."

How can the resolution properties of Hermite expansion methods be improved? One remedy is to use a so-called scaling factor, which expands the underlying function in terms of $\widetilde H_n(\alpha y)$ instead of $\widetilde H_n(y)$, where $\alpha>0$ is a properly chosen constant. In [29], a scaling-factor formula combining the decay rate of the solution and the roots of the largest expansion term in the Hermite spectral expansion is proposed. Asymptotic analysis and numerical experiments demonstrate that the use of the scaling factor can indeed provide a significant improvement over the observation of Gottlieb and Orszag. The theoretical justification of the scaling factor proposed in [29] was given in [11, 22]. In particular, Hermite spectral methods are investigated in [22] for linear diffusion equations and nonlinear convection-diffusion equations in unbounded domains. When the solution domain is unbounded, the diffusion operator no longer has a compact resolvent, which makes the Hermite spectral methods unstable. To overcome this difficulty, a time-dependent scaling factor is employed in the Hermite expansions, which yields a positive bilinear form. As a consequence, stability is recovered and the spectral convergence speed is significantly enhanced. In the past ten years, the scaling factor proposed in [29] has been used in many areas including computational optics [19], computational astrophysics [26], etc. In particular, the scaling-factor formula is included in the recent MATLAB code GSGPEs [5].

When studying uncertainty using gPC methods, Karniadakis, Xiu and co-workers pointed out in [20, 34] that the relatively poor resolution properties of Hermite and Laguerre expansions are well documented in [14]. They further pointed out that the re-scaling procedure of [29] can be employed to accelerate convergence. However, little progress has been made on using the scaling factor for UQ problems; this is one of the main motivations of the present work. In this work, we introduce suitable scaling factors to speed up the convergence. Applications to parametric UQ are discussed by considering random ODE models and elliptic problems with lognormal random input. A number of numerical examples are provided to confirm the efficiency of the Hermite (Laguerre) function approach with the use of scaling factors.

We summarize here the distinct features of our approach:

• We investigate the discrete least square approach for functions with Gaussian or Gamma random parameters; applications to UQ are discussed.

• We propose to use the Hermite (Laguerre) functions as the approximation bases, which differs from the traditional Hermite (Laguerre) polynomials. Stability is guaranteed with an acceptable number of evaluation points, and the relevant theoretical justification is provided.

• We introduce the scaling factor in the least-squares approach to speed up the convergence, and a principle for choosing the scaling factor is provided. The numerical results indicate that the proposed scaling factor is indeed very useful.

The rest of this paper is organized as follows. In Section 2, we introduce the problem of approximating a function of $d$ random variables by discrete least-squares projection. Some commonly used high-dimensional approximation spaces are discussed. We also show that the Hermite (Laguerre) gPC expansions need an unacceptable number of evaluation points to guarantee stability. In Section 3, we propose the Hermite (Laguerre) function approach. Stability of this approach is ensured by using mapped uniform random points. Moreover, a useful scaling factor is introduced to speed up the convergence. Applications to parametric UQ are discussed in Section 4. Conclusions are drawn in the final section.

## 2 The least square projection

In this section, we follow closely the works [24, 9, 37] to give a basic introduction to the discrete least-squares approach; note, however, that we focus on problems in unbounded domains.

Let $y=(y_1,\dots,y_d)$ be a vector of $d$ random variables taking values in $\Gamma=\mathbb{R}^d$ or $\mathbb{R}_+^d$. We will focus on the cases where the $y_i$ are Gaussian random variables ($\Gamma=\mathbb{R}^d$) or Gamma random variables ($\Gamma=\mathbb{R}_+^d$). We suppose that the variables are independent, with marginal probability density function (PDF) $\rho_i(y_i)$ for each random variable $y_i$. The joint PDF is given by $\rho(y)=\prod_{i=1}^d\rho_i(y_i)$.

Assume that the functions considered in this paper are in the space $L^2_\rho(\Gamma)$ endowed with the norm

$$\|f\|_{L^2_\rho}=\big(\mathbb{E}[f^2(y)]\big)^{1/2}=\left(\int_\Gamma f^2(y)\,\rho(y)\,dy\right)^{1/2}. \tag{2.1}$$

The purpose is to efficiently build a finite-dimensional approximation of $f$, or of some general functionals associated with $f$. To this end, we first choose the one-dimensional orthogonal bases (not necessarily polynomials) with respect to each random variable $y_i$:

$$\{\phi^i_j\}_{j=1}^{\infty}\subset L^2_{\rho_i},\qquad i=1,\dots,d,$$

where $\phi^i_j$ is called the $j$-th order basis function. Then the multi-dimensional bases can be formed by tensorizing the univariate bases. To explicitly form these bases, let us first define the following multi-index:

$$\mathbf{n}=(n_1,\cdots,n_d)\in\mathbb{N}^d,\qquad\text{with } |\mathbf{n}|=\sum_{i=1}^d n_i.$$

Define the $d$-dimensional bases as

$$\Phi_{\mathbf{n}}(y)=\prod_{i=1}^d\phi^i_{n_i}(y_i), \tag{2.2}$$

where $\phi^i_{n_i}$ is the one-dimensional basis of order $n_i$. Let $\Lambda\subset\mathbb{N}^d$ be a finite multi-index set, and denote by $\#\Lambda$ the cardinality of the index set $\Lambda$. The finite-dimensional approximation space defined by $\Lambda$ is given by

$$P_\Lambda:=\mathrm{span}\{\Phi_{\mathbf{n}}(y),\ \mathbf{n}\in\Lambda\}.$$

Throughout the paper, the best approximation of $f$ in $P_\Lambda$ will be denoted by $P_\Lambda f$, namely,

$$P_\Lambda f:=\mathop{\mathrm{argmin}}_{p\in P_\Lambda}\|f-p\|_{L^2_\rho}. \tag{2.3}$$

A formula for the best approximation involves the standard Fourier coefficients with respect to the basis $\{\Phi_{\mathbf{n}}\}$, but these coefficients require high-order moment information for the function and in general cannot be computed explicitly.

Alternatively, we consider the construction of such an approximation by the least-squares approach. To this end, we compute the exact function values of $f$ at points $\{y^k\}_{k=1}^m$ with $m\ge\#\Lambda$, and then find a discrete least-squares approximation by requiring

$$f_\Lambda=P^m_\Lambda f=\mathop{\mathrm{argmin}}_{p\in P_\Lambda}\sum_{k=1}^m\big(p(y^k)-f(y^k)\big)^2. \tag{2.4}$$

We introduce the discrete inner product

$$\langle u,v\rangle_m=\sum_{k=1}^m u(y^k)v(y^k). \tag{2.5}$$
###### Remark 2.1.

We remark that usually the orthogonal polynomials associated with the measure $\rho$ are chosen as the approximation bases, which yields the so-called gPC method. For example, the Hermite polynomials are used for functions with Gaussian parameters, the Laguerre polynomials are suitable for functions with Gamma parameters, and so on [35]. In such gPC expansions, a natural way to choose the design points is random sampling, that is, the random samples are generated with respect to $\rho$. Of course, other kinds of (non-polynomial) orthogonal bases can also be used in the least-squares approach.

### 2.1 Multivariate approximation spaces

Given a basis order $q$ and the dimension parameter $d$, define the following index sets:

$$\Lambda^P_{q,d}:=\{\mathbf{n}=(n_1,\dots,n_d)\in\mathbb{N}^d:\ \max_{j=1,\dots,d}n_j\le q\}, \tag{2.6}$$

$$\Lambda^D_{q,d}:=\{\mathbf{n}=(n_1,\dots,n_d)\in\mathbb{N}^d:\ |\mathbf{n}|\le q\}. \tag{2.7}$$

The traditional tensor product (TP) space is defined as

$$P^d_q:=\mathrm{span}\{\Phi_{\mathbf{n}}(y):\ \mathbf{n}\in\Lambda^P_{q,d}\}. \tag{2.8}$$

That is, in $P^d_q$ the basis order in each variable is less than or equal to $q$. A simple observation is that the dimension of $P^d_q$ is

$$\dim(P^d_q)=\#\Lambda^P_{q,d}=(q+1)^d. \tag{2.9}$$

Note that when $d$ is large, the dimension of the TP space grows very quickly with respect to the degree $q$; this is the so-called curse of dimensionality. As a result, TP spaces are rarely used in practice when $d$ is large. Alternatively, when $d$ is large, the following total degree (TD) space is often employed instead [25, 33]:

$$D^d_q:=\mathrm{span}\{\Phi_{\mathbf{n}}(y):\ \mathbf{n}\in\Lambda^D_{q,d}\}. \tag{2.10}$$

The dimension of $D^d_q$ is

$$\dim(D^d_q)=\#\Lambda^D_{q,d}=\binom{q+d}{d}. \tag{2.11}$$

It is seen that the dimension of $D^d_q$ grows much more slowly than that of $P^d_q$.
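To make the comparison concrete, the dimension counts (2.9) and (2.11) can be tabulated with a few lines of code (a small illustrative sketch; the function names are ours):

```python
from math import comb

def dim_tp(q: int, d: int) -> int:
    """Dimension of the tensor product (TP) space, Eq. (2.9)."""
    return (q + 1) ** d

def dim_td(q: int, d: int) -> int:
    """Dimension of the total degree (TD) space, Eq. (2.11)."""
    return comb(q + d, d)

# For fixed degree q = 5, compare the growth with the dimension d.
for d in (1, 2, 5, 10):
    print(d, dim_tp(5, d), dim_td(5, d))
```

For instance, at $q=5$ and $d=10$ the TP space already has more than $6\times 10^7$ basis functions, while the TD space has only a few thousand.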

###### Remark 2.2.

We remark that the TP and TD spaces were originally defined for polynomial spaces. However, spaces based on general one-dimensional bases can be constructed in the same way. Consequently, we will still use the names TP and TD for the spaces with general bases. Moreover, other types of multivariate approximation spaces can be constructed in a similar way, e.g., the hyperbolic cross [8].

### 2.2 Algebraic formulation

Consider the approximation in the space $P_\Lambda$ with random samples $\{y^k\}_{k=1}^m$. If we choose a proper ordering scheme for the multi-indices, we can order the multi-dimensional bases via a single index. For example, we can arrange the index set $\Lambda$ in lexicographical order:

$$\mathbf{n}'<\mathbf{n}\ \iff\ \text{the first nonzero entry of }\mathbf{n}-\mathbf{n}'\text{ is positive}. \tag{2.12}$$

Then the space can be rewritten as $P_\Lambda=\mathrm{span}\{\Phi_j\}_{j=1}^N$ with $N=\#\Lambda$. The least-squares solution can be written as

$$f_\Lambda=\sum_{j=1}^N c_j\Phi_j, \tag{2.13}$$

where $\mathbf{c}=(c_1,\dots,c_N)^\top$ is the coefficient vector. The algebraic problem for the unknown coefficients can be formulated as

$$\mathbf{c}=\mathop{\mathrm{argmin}}_{\mathbf{z}\in\mathbb{R}^N}\|D\mathbf{z}-\mathbf{b}\|_2, \tag{2.14}$$

where

where $D$ is the $m\times N$ design matrix with entries

$$D_{kj}=\Phi_j(y^k),\qquad j=1,\dots,N,\ k=1,\dots,m,$$

and $\mathbf{b}=\big(f(y^1),\dots,f(y^m)\big)^\top$ contains the evaluations of the target function at the collocation points. The solution of the least-squares problem (2.14) can also be computed by solving an $N\times N$ system, namely,

$$A\mathbf{z}=\mathbf{f} \tag{2.15}$$

with

$$A=D^\top D,\qquad \mathbf{f}=D^\top\mathbf{b}. \tag{2.16}$$

From the computational point of view, we can solve problem (2.14) by the QR factorization. Alternatively, we can solve (2.15) by the Cholesky factorization.
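As an illustrative sketch (not the authors' code), the two solution routes can be compared on a small random design matrix; both recover the same least-squares coefficients when $D$ has full column rank:

```python
import numpy as np

rng = np.random.default_rng(0)

# A toy m-by-N design matrix D and data vector b (values are arbitrary;
# in the paper, D holds basis evaluations at the collocation points).
m, N = 50, 6
D = rng.standard_normal((m, N))
b = rng.standard_normal(m)

# Route 1: QR factorization applied to the least-squares problem (2.14).
Q, R = np.linalg.qr(D)
c_qr = np.linalg.solve(R, Q.T @ b)

# Route 2: normal equations (2.15)-(2.16) via Cholesky factorization.
A = D.T @ D
f = D.T @ b
Lc = np.linalg.cholesky(A)                       # A = Lc Lc^T
c_chol = np.linalg.solve(Lc.T, np.linalg.solve(Lc, f))

print(np.allclose(c_qr, c_chol))
```

Note that the normal-equations route squares the condition number of $D$, which is why the QR route is usually preferred when conditioning is a concern.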

### 2.3 The Hermite (Laguerre) chaos expansion: stability issue

As discussed in Remark 2.1, a natural way to approximate functions with Gaussian (Gamma) parameters is the Hermite (Laguerre) chaos expansion. In this section, we shall show, by numerical examples, that the least-squares projection with the Hermite (Laguerre) polynomial expansion is unstable, in the sense that an unfeasible number of random points, i.e.,

$$m=(\#\Lambda)^{c\,\#\Lambda},$$

is needed to guarantee stability.

To this end, recall that the one-dimensional normalized Hermite polynomials $\{H_n\}$ defined on the whole line are orthogonal with respect to the weight function

$$\rho_G(y)=e^{-y^2}, \tag{2.17}$$

namely,

$$\int_{-\infty}^{+\infty}\rho_G(y)H_m(y)H_n(y)\,dy=\delta_{mn}. \tag{2.18}$$

We denote by $H_{\mathbf{n}}(y)$ the multivariate Hermite polynomial with multi-index $\mathbf{n}$, obtained by tensorizing the one-dimensional Hermite polynomials. Then, a natural way to approximate a multivariate function with Gaussian parameters is

$$f_G(y)=\sum_{\mathbf{n}\in\Lambda}c_{\mathbf{n}}H_{\mathbf{n}}(y), \tag{2.19}$$

where the index set $\Lambda$ can be either $\Lambda^P_{q,d}$ or $\Lambda^D_{q,d}$.

Similarly, for a function with Gamma random parameters, a natural basis for the expansion is the tensorized Laguerre polynomials $L_{\mathbf{n}}(y)$, which are orthogonal with respect to the weight function $\rho_E(y)=e^{-y}$. More precisely, we expand

$$f_E(y)=\sum_{\mathbf{n}\in\Lambda}c_{\mathbf{n}}L_{\mathbf{n}}(y). \tag{2.20}$$

Note that we consider here a special type of Gamma random parameter, with PDF $\rho_E(y)=e^{-y}$; such random variables are also referred to as exponential random variables. More general types of Gamma random parameters, with PDF

$$\rho_E(y)=\frac{\beta^\alpha y^{\alpha-1}e^{-\beta y}}{\Gamma(\alpha)}, \tag{2.21}$$

can be treated in a similar way; the corresponding chaos expansions are the generalized Laguerre chaos expansions.

In the least-squares framework, to construct the expansions (2.19) and (2.20), a natural choice of collocation points is to generate random points according to the Gaussian (Gamma) measure. In both cases, we obtain the corresponding design matrices, denoted by $D_G$ and $D_E$, respectively.

We remark that for problems in bounded domains, e.g., uniform random parameters in $[-1,1]^d$, the relevant tests have been performed by many researchers, see, e.g., [24, 9, 8, 37]. For instance, for the uniform measure it is known that a quadratic number of random points, i.e., $m\sim(\#\Lambda)^2$, is sufficient to guarantee the stability of the least-squares approach. Moreover, if the Chebyshev measure is considered, fewer points are needed to guarantee stability [8].

What changes if the underlying domain is unbounded? The answer is quite negative: quadratically many random points cannot guarantee stability.

We will demonstrate the above claim by testing the condition number of the design matrices, i.e.,

$$\mathrm{cond}(D)=\frac{\sigma_{\max}(D)}{\sigma_{\min}(D)}, \tag{2.22}$$

where $\sigma_{\max}(D)$ and $\sigma_{\min}(D)$ denote the largest and smallest singular values of $D$.

Let us first consider the Hermite chaos expansion (2.19). In this case, the random points are generated with respect to the Gaussian measure. Note that the design matrix is a random matrix; therefore, in the computations we repeat each test 100 times and report the mean condition number. In Fig. 1, the growth of the condition number with respect to the polynomial order is shown for the one-dimensional case. The condition number grows exponentially with the polynomial order, both for the linear dependence (left) and for the quadratic dependence (right). In fact, similar tests with other dependences of $m$ on $\#\Lambda$ produce results similar to those in Fig. 1.

We further consider the Laguerre chaos expansion, which is suitable for approximating functions supported in $\mathbb{R}_+$. Note that the corresponding random points are generated by the Gamma measure. The bottom of Fig. 1 shows the results of the one-dimensional tests, which indicate that the condition number in the Gamma case grows even faster than in the Gaussian case.

Fig. 2 presents the two-dimensional tests for both the TP and TD constructions. The left figure is for the Gaussian case, the right one for the Gamma case. Again the exponential growth of the condition number is observed, and it is seen that the TD spaces behave better than the TP spaces.

With the above observations, it seems hopeless to control the condition number in the unbounded domain. In fact, to have good control of the condition number, it is observed in the thesis of G. Migliorati [23] that an unfeasible number of points, $m=(\#\Lambda)^{c\,\#\Lambda}$, is needed. To improve on this, we shall introduce the Hermite (Laguerre) function approach to replace the Hermite (Laguerre) polynomial expansion.
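The qualitative behavior reported above is easy to reproduce. The following sketch (our own minimal setup, with the quadratic rule $m=K^2$ and Hermite polynomials normalized to be orthonormal with respect to $e^{-y^2}$) shows the mean condition number growing rapidly with the order:

```python
import numpy as np

rng = np.random.default_rng(0)

def hermite_matrix(y, K):
    """Evaluate the first K normalized Hermite polynomials (orthonormal
    w.r.t. the weight exp(-y^2)) at the points y, via the three-term
    recurrence H_{n+1} = sqrt(2/(n+1)) y H_n - sqrt(n/(n+1)) H_{n-1}."""
    D = np.zeros((len(y), K))
    D[:, 0] = np.pi ** (-0.25)
    if K > 1:
        D[:, 1] = np.sqrt(2.0) * y * D[:, 0]
    for n in range(1, K - 1):
        D[:, n + 1] = (np.sqrt(2.0 / (n + 1)) * y * D[:, n]
                       - np.sqrt(n / (n + 1)) * D[:, n - 1])
    return D

# Mean condition number of the design matrix with m = K^2 Gaussian points
# (density proportional to exp(-y^2), i.e. standard normals / sqrt(2)),
# averaged over 20 repetitions, for increasing basis dimension K.
conds = []
for K in (5, 10, 20):
    cs = [np.linalg.cond(hermite_matrix(rng.standard_normal(K * K) / np.sqrt(2.0), K))
          for _ in range(20)]
    conds.append(float(np.mean(cs)))
print(conds)   # grows rapidly with the order
```

The deterioration reflects the fact that high-order Hermite polynomials are poorly sampled by Gaussian points, which essentially never reach the outer part of the polynomials' oscillatory region.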

###### Remark 2.3.

We remark that we are not saying that the Hermite (Laguerre) polynomial chaos expansions are unfeasible in the least-squares framework. In fact, such approaches can still be used with low polynomial degrees, and fast convergence can then still be expected. However, the convergence deteriorates when a large polynomial degree is used, due to the exponential growth of the condition number. Some numerical tests are provided in [23].

## 3 The Hermite (Laguerre) function expansions

In this section, we propose to use Hermite (Laguerre) function approximation instead of the traditional Hermite (Laguerre) polynomial approximation. The one-dimensional Hermite functions, also called modified Hermite polynomials, are defined by

$$\widetilde H_m(y)=e^{-\frac{y^2}{2}}H_m(y),\qquad m=0,1,\dots, \tag{3.1}$$

where $H_m$ are the normalized Hermite polynomials. Note that the Hermite functions are orthogonal in the following sense:

$$\int_{-\infty}^{+\infty}\widetilde H_m(y)\widetilde H_n(y)\,dy=\delta_{mn}. \tag{3.2}$$

The corresponding multivariate Hermite functions can be defined by tensorizing the one dimensional Hermite functions.

The Laguerre functions are defined as

$$\widetilde L_m(y)=e^{-\frac{y}{2}}L_m(y),\qquad m=0,1,\dots, \tag{3.3}$$

where $L_m$ are the Laguerre polynomials. The corresponding multivariate Laguerre functions can be defined in a similar way. Note that the Hermite/Laguerre functions are no longer polynomials. Nevertheless, in what follows, whenever we speak of polynomial order $q$, it refers to the $q$-th Hermite/Laguerre function.
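For later use, the Hermite functions can be evaluated stably by a three-term recurrence. The sketch below (our own code; the starting value $\widetilde H_0(y)=\pi^{-1/4}e^{-y^2/2}$ is the normalization consistent with (3.2)) also verifies the orthonormality numerically:

```python
import numpy as np

def hermite_functions(y, K):
    """First K orthonormal Hermite functions, Eq. (3.2), evaluated at y.
    Recurrence: Ht_{n+1} = sqrt(2/(n+1)) y Ht_n - sqrt(n/(n+1)) Ht_{n-1},
    started from Ht_0 = pi^(-1/4) exp(-y^2/2)."""
    y = np.asarray(y, dtype=float)
    H = np.zeros((y.size, K))
    H[:, 0] = np.pi ** (-0.25) * np.exp(-y ** 2 / 2.0)
    if K > 1:
        H[:, 1] = np.sqrt(2.0) * y * H[:, 0]
    for n in range(1, K - 1):
        H[:, n + 1] = (np.sqrt(2.0 / (n + 1)) * y * H[:, n]
                       - np.sqrt(n / (n + 1)) * H[:, n - 1])
    return H

# Check the orthonormality (3.2) numerically on a fine uniform grid;
# the integrand decays fast, so a simple Riemann sum is very accurate.
y = np.linspace(-12.0, 12.0, 4001)
dy = y[1] - y[0]
H = hermite_functions(y, 8)
G = H.T @ H * dy                     # numerical Gram matrix
print(np.max(np.abs(G - np.eye(8))))  # close to zero
```

The recurrence avoids forming the (exponentially large) polynomial values explicitly, which is exactly why the Hermite functions are numerically benign where the polynomials are not.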

It is clear that the Hermite (Laguerre) function expansions are suitable for approximating functions that decay to zero as $|y|$ goes to infinity. We claim that in UQ applications we can almost always restrict ourselves to approximating decaying functions. To see this, let $f(y)$ (scalar case, for simplicity) be a function with a Gaussian parameter $y$, which might be the solution of a stochastic ODE/PDE. In UQ applications, one is interested in statistical quantities of $f$, such as its moments. Let us consider a general expression of such a QoI:

$$\mathrm{QoI}=\int_\Gamma\rho(y)\,(g\circ f)(y)\,dy, \tag{3.4}$$

where $g$ is a general smooth functional of $f$. Even if $f$ is not a decaying function, the product $\rho\cdot(g\circ f)$ is, provided $g\circ f$ grows more slowly than the Gaussian weight decays. Thus, we can in fact consider the approximation of $\widetilde f:=\rho\cdot(g\circ f)$. As long as a good approximation of $\widetilde f$ is found, we obtain a good approximation of the QoI in (3.4).

Without loss of generality, we assume that the function to be approximated decays exponentially. Consider the expansions

$$f_G(y)=\sum_{n=0}^{K-1}c_n\widetilde H_n(y),\qquad f_E(y)=\sum_{n=0}^{K-1}c_n\widetilde L_n(y). \tag{3.5}$$

We are now in a position to find good collocation points in the least-squares framework. As discussed before, the most natural way is to generate samples with respect to the PDF of the random parameters. Moreover, if this PDF coincides with the weight function of the bases, the expectation of the (scaled) design matrix is the identity matrix, a feature that helps in the rigorous stability analysis [9]. In our setting, however, the Hermite (Laguerre) functions are orthogonal with respect to the Lebesgue measure, and it is impossible to generate random points with respect to the Lebesgue (uniform) measure on an unbounded domain. To overcome this difficulty, we shall introduce mapped uniform samples, which transform uniform random points $\xi$ in $(-1,1)$ (or $(0,1)$) to points $y$ in $\mathbb{R}$ (or $\mathbb{R}_+$).

Although many feasible mappings exist, we restrict ourselves to a family of mappings defined by

$$y'(\xi)=\frac{L}{(1-\xi^2)^{1+r/2}},\qquad r\ge 0, \tag{3.6}$$

where $L>0$ is a constant and $r$ determines how fast $y$ goes to infinity as $\xi$ goes to $\pm 1$; see, e.g., [4, 28] for a thorough discussion of the pros and cons of different mappings. It is easy to verify that

$$y(\xi)=\begin{cases}\dfrac{L}{2}\log\dfrac{1+\xi}{1-\xi}, & r=0,\\[2mm]\dfrac{L\xi}{\sqrt{1-\xi^2}}, & r=1,\end{cases}\qquad \xi(y)=\begin{cases}\tanh\Big(\dfrac{y}{L}\Big), & r=0,\\[2mm]\dfrac{y/L}{\sqrt{y^2/L^2+1}}, & r=1.\end{cases} \tag{3.7}$$

For other positive integers $r$, one can always use a symbolic computation package to derive the explicit expression of the mapping. The mapping with $r=0$ is often referred to as the logarithmic mapping, which makes the density of the transformed points decay exponentially, while the mapping with $r=1$ is referred to as the algebraic mapping. In our setting, the mapping with $r=0$ will be used when the Gaussian measure is considered, while the mapping with $r=1$ will be adopted when the Gamma measure is used.
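A minimal sketch of the mapped uniform sampling (the value of $L$ is illustrative only):

```python
import numpy as np

L = 3.0  # mapping parameter (chosen for illustration)

def forward(xi, r):
    """Map xi in (-1, 1) to y on the whole line, Eq. (3.7)."""
    if r == 0:                      # logarithmic mapping
        return 0.5 * L * np.log((1.0 + xi) / (1.0 - xi))
    if r == 1:                      # algebraic mapping
        return L * xi / np.sqrt(1.0 - xi ** 2)
    raise ValueError("only r = 0, 1 implemented in closed form")

def inverse(y, r):
    """Inverse maps xi(y), Eq. (3.7)."""
    if r == 0:
        return np.tanh(y / L)
    if r == 1:
        return (y / L) / np.sqrt((y / L) ** 2 + 1.0)
    raise ValueError("only r = 0, 1 implemented in closed form")

# Mapped uniform samples: uniform xi in (-1, 1) pushed to the real line.
rng = np.random.default_rng(0)
xi = rng.uniform(-1.0, 1.0, 5)
y = forward(xi, r=0)
print(np.allclose(inverse(y, r=0), xi))   # round trip recovers xi
```

Note how the logarithmic mapping sends $\xi\to\pm 1$ to $y\to\pm\infty$ only logarithmically, so the density of mapped points decays exponentially in $|y|$.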

We now summarize our least-squares approach by taking a one-dimensional function with a Gaussian parameter as an example. Given the function $f(y)$ to be approximated, we are interested in a QoI of $f$.

• Step 1. Motivated by the discussion at the beginning of this section, we seek the following Hermite function expansion for $\widetilde f$:

$$\widetilde f(y)\approx\sum_{k=0}^{K-1}c_k\widetilde H_k(y). \tag{3.8}$$

• Step 2. Let $P_K=\mathrm{span}\{\widetilde H_k\}_{k=0}^{K-1}$. We find the least-squares solution

$$f_K=P^m_K f=\mathop{\mathrm{argmin}}_{p\in P_K}\sum_{k=1}^m\big(p(y^k)-\widetilde f(y^k)\big)^2, \tag{3.9}$$

where the collocation points $\{y^k\}_{k=1}^m$ are the transformed uniform random points given by the mapping (3.6) with $r=0$.

This procedure will lead to the desired QoI.
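The two steps can be sketched as follows (an illustrative implementation with our own choices of target function, $K$, $m$, and $L$; it is not the authors' code):

```python
import numpy as np

rng = np.random.default_rng(0)

def hermite_functions(y, K):
    """First K orthonormal Hermite functions via the stable recurrence."""
    H = np.zeros((y.size, K))
    H[:, 0] = np.pi ** (-0.25) * np.exp(-y ** 2 / 2.0)
    if K > 1:
        H[:, 1] = np.sqrt(2.0) * y * H[:, 0]
    for n in range(1, K - 1):
        H[:, n + 1] = (np.sqrt(2.0 / (n + 1)) * y * H[:, n]
                       - np.sqrt(n / (n + 1)) * H[:, n - 1])
    return H

f = lambda y: np.exp(-y ** 2)        # decaying target (illustrative)

K, m, L = 20, 400, 3.0               # basis size, samples, mapping parameter
xi = rng.uniform(-1.0, 1.0, m)
y = 0.5 * L * np.log((1.0 + xi) / (1.0 - xi))   # mapped uniform points, r = 0

D = hermite_functions(y, K)          # m-by-K design matrix
c = np.linalg.lstsq(D, f(y), rcond=None)[0]     # Eq. (3.9)

yt = np.linspace(-3.0, 3.0, 200)     # test grid
err = np.max(np.abs(hermite_functions(yt, K) @ c - f(yt)))
print(err)                           # small: the expansion converges fast here
```

The target here decays like a Gaussian, so its Hermite-function coefficients decay geometrically and a moderate $K$ already gives a small maximum error.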

### 3.1 Stability

In this section, we investigate the stability of the least-squares approach based on the Hermite (Laguerre) functions with mapped uniformly distributed random points. Again, we test the condition numbers of the corresponding design matrices:

$$\mathrm{cond}(D_G)=\frac{\sigma_{\max}(D_G)}{\sigma_{\min}(D_G)},\qquad \mathrm{cond}(D_E)=\frac{\sigma_{\max}(D_E)}{\sigma_{\min}(D_E)}. \tag{3.10}$$

Here we keep the notation $D_G$ ($D_E$) to avoid introducing too many symbols, although we point out that these matrices now contain evaluations of the Hermite (Laguerre) functions at the mapped random points in $\mathbb{R}$ and $\mathbb{R}_+$, respectively. As these matrices are random, each test is repeated 100 times and the resulting mean condition number is reported in the figures below.

In Fig. 3, the condition numbers with respect to the basis order are given for the one-dimensional Hermite function basis. The left plot is devoted to the linear rule $m=cK$, while the right plot is for the quadratic rule $m=cK^2$. In both cases, we see that with a relatively large mapping parameter $L$ the random matrices are well conditioned. The two-dimensional cases are reported in the bottom of Fig. 3 for both the TP space and the TD space. Again, a relatively large mapping parameter results in a well-conditioned design matrix for both the TP and TD spaces. However, for the same parameter, the design matrices of the TD spaces are much better conditioned than those of the TP spaces, which is one of the reasons the TD space is preferred for higher-dimensional approximation.

Similar numerical tests are carried out for the Laguerre bases; in this case the mapping (3.7) with $r=1$ is used. The 1D results in the left of Fig. 4 suggest that a small mapping parameter can no longer guarantee stability, while a larger parameter will work. The two-dimensional results are given in the right of the figure. Again, a larger parameter results in a better condition number for the design matrices. We also note that more points and larger parameters are needed in higher-dimensional cases. Moreover, the TD space provides better stability than the TP space.

We conclude that the design matrix can be made well conditioned by using transformed random points with a relatively large mapping parameter $L$. As the Hermite functions decay faster than the Laguerre functions, the mapping parameter for the Gaussian case can be smaller than that for the Laguerre functions.

In the following, a rigorous analysis for the stability will be provided. We will only provide the proof for the one-dimensional Hermite functions case; the proof can be extended to the Laguerre case in a straightforward manner.

We first give a lemma concerning the decay properties of the Hermite functions.

###### Lemma 3.1.

For any integer $K$, there exists a constant $\tau=\tau(K)$ such that

$$|\widetilde H_k(y)|\le |y|^{-5/2},\qquad \forall\, 0\le k\le K-1, \tag{3.11}$$

provided that $|y|\ge\tau$.

###### Proof.

Such a simple result holds because for any $t>0$ we have

$$|\widetilde H_k(y)|\cdot|y|^t\to 0\quad\text{as } |y|\to\infty, \tag{3.12}$$

due to the factor $e^{-y^2/2}$ in the Hermite functions. ∎

We are now ready to prove the stability. The analysis requires an understanding of how the scaled random matrix $\widehat A$ deviates from its expectation in probability. Note that $\widehat A$ can be written as

$$\widehat A=X_1+X_2+\cdots+X_m,$$

where the $X_i$ are i.i.d. copies of the rank-one random matrix

$$X=\frac{L}{m}\big(\widetilde H_i(y)\widetilde H_j(y)\big)_{i,j=0,\dots,K-1}, \tag{3.13}$$

where $y$ is a transformed uniform random variable. We now state the stability result.

###### Theorem 3.2.

The least-squares approach using the Hermite functions (3.1) and the transformed uniform random points (3.7) is stable, in the sense that the scaled design matrix satisfies

$$\Pr\Big\{\,|||\widehat A-I|||\ge\tfrac58\Big\}\le 2m^{-r}, \tag{3.14}$$

provided that

$$K\le\kappa\frac{m}{\log m}\quad\text{with}\quad\kappa:=\frac{3c_{1/2}}{4(1+r)},\qquad c_{1/2}=\tfrac12+\tfrac12\log\tfrac12>0, \tag{3.15}$$

and the mapping parameter $L$ in (3.7) satisfies

$$L>\max\{3\tau,\,5\sqrt{K}\}, \tag{3.16}$$

where $m$ is the number of random points, $K$ is the dimension of the approximation space, and $\tau=\tau(K)$ is the constant given in Lemma 3.1.

###### Proof.

The analysis follows closely [9] and uses the following Chernoff bound [1, 30]. Let $\{X_i\}$ be independent, random, self-adjoint, positive semi-definite $K\times K$ matrices satisfying

$$\lambda_{\max}(X_i)=|||X_i|||\le R$$

almost surely, and let

$$\mu_{\min}:=\lambda_{\min}\Big(\sum_{i=1}^m\mathbb{E}[X_i]\Big),\qquad \mu_{\max}:=\lambda_{\max}\Big(\sum_{i=1}^m\mathbb{E}[X_i]\Big).$$

Then, for $\delta\in(0,1)$, one has

$$\Pr\Big\{\lambda_{\min}\Big(\sum_{i=1}^m X_i\Big)<(1-\delta)\mu_{\min}\Big\}\le K\Big(\frac{e^{-\delta}}{(1-\delta)^{1-\delta}}\Big)^{\mu_{\min}/R}, \tag{3.17}$$

$$\Pr\Big\{\lambda_{\max}\Big(\sum_{i=1}^m X_i\Big)>(1+\delta)\mu_{\max}\Big\}\le K\Big(\frac{e^{\delta}}{(1+\delta)^{1+\delta}}\Big)^{\mu_{\max}/R}. \tag{3.18}$$

Note that a rank-one symmetric matrix has spectral norm equal to the squared Euclidean norm of its generating vector, and therefore

$$|||X_i|||\le\frac{1}{m}\sum_{i=0}^{K-1}\widetilde H_i^2=\frac{M(K)}{m}=:R\quad\text{with}\quad M(K)=\sup_{y\in\mathbb{R}}\sum_{i=0}^{K-1}\widetilde H_i^2(y).$$

We now need to find $\mu_{\min}$ and $\mu_{\max}$. Let

$$\bar A=\mathbb{E}[\widehat A]=\sum_{i=1}^m\mathbb{E}[X_i].$$

Using the definition of the expectation and Eq. (3.6), the elements of $\bar A$ satisfy

$$a_{i,j}=\int_{-1}^{1}L\,\widetilde H_i(y(\xi))\widetilde H_j(y(\xi))\,d\xi=\int_{-\infty}^{+\infty}\Big(1-\tanh^2\Big(\frac{y}{L}\Big)\Big)\widetilde H_i(y)\widetilde H_j(y)\,dy.$$

Let $V=\mathrm{span}\{\widetilde H_0,\dots,\widetilde H_{K-1}\}$ and

$$a(u,v)=\int_{-\infty}^{+\infty}\Big(1-\tanh^2\Big(\frac{y}{L}\Big)\Big)uv\,dy,\qquad b(u,v)=\int_{-\infty}^{+\infty}uv\,dy.$$

By the Rayleigh quotient argument [2], we have

$$\mu_{\min}=\min_{v\in V}\frac{a(v,v)}{b(v,v)},\qquad \mu_{\max}=\max_{v\in V}\frac{a(v,v)}{b(v,v)}. \tag{3.19}$$

It is easy to verify that

$$\mu_{\max}=\max_{v\in V}\frac{a(v,v)}{b(v,v)}\le 1. \tag{3.20}$$

We now estimate $\mu_{\min}$. Let $v=\sum_{k=0}^{K-1}c_k\widetilde H_k\in V$ with $b(v,v)=\sum_k c_k^2=1$. Since $1-\tanh^2(y/L)\ge 1-\tanh^2(1/3)$ for $|y|\le L/3$, we have

$$a(v,v)\ge\Big(1-\tanh^2\tfrac13\Big)\int_{-L/3}^{L/3}v^2\,dy=\Big(1-\tanh^2\tfrac13\Big)\Big(\int_{-\infty}^{\infty}v^2\,dy-2\varepsilon\Big), \tag{3.21}$$

where

$$\varepsilon=\int_{L/3}^{\infty}v^2\,dy=\int_{L/3}^{\infty}\Big(\sum_{k=0}^{K-1}c_k\widetilde H_k\Big)^2dy\le K^2\max_k\{c_k^2\}\int_{L/3}^{\infty}y^{-5}\,dy\le\frac{3^4K^2}{4L^4}\Big(\sum_{k=0}^{K-1}c_k^2\Big), \tag{3.22}$$

where we have used Lemma 3.1 with $\tau\le L/3$. If $L\ge 5\sqrt{K}$, then using Eqs. (3.19), (3.21) and (3.22) gives

$$\mu_{\min}\ge\Big(1-\tanh^2\tfrac13\Big)\Big(1-\frac{3^4K^2}{2L^4}\Big)\ge\frac34. \tag{3.23}$$

Putting $R$, $\mu_{\min}$ and $\mu_{\max}$ into Eqs. (3.17) and (3.18), respectively, and taking $\delta=\tfrac12$ (and similarly for the upper tail), yield

$$\Pr\Big\{\,|||\widehat A-I|||>\tfrac58\Big\}\le 2K\Big(e^{-\frac12}\big(\tfrac12\big)^{-\frac12}\Big)^{\frac{3}{4R}}=2K\exp\Big(-\frac{3c_{1/2}\,m}{4M(K)}\Big), \tag{3.24}$$

where $c_{1/2}=\tfrac12+\tfrac12\log\tfrac12$. Finally, letting

$$M(K)\le\kappa\frac{m}{\log m}\quad\text{with}\quad\kappa:=\frac{3c_{1/2}}{4(1+r)} \tag{3.25}$$

yields the desired result (3.14). Note that the Hermite functions are uniformly bounded by $1$ (Cramér's inequality), so that $M(K)\le K$; consequently, we can replace $M(K)$ by $K$ in (3.25), which gives the condition (3.15). The proof is complete. ∎

Note that the requirement (3.16) on $L$ may not be optimal, as the numerical tests in Fig. 3 suggest that a moderate mapping parameter already results in a very stable approach (up to polynomial order 25). In fact, inspired by the above proof, we need to choose the parameter $L$ large enough that the integral (3.22) is sufficiently small. On the other hand, it is known that the largest root of the Hermite polynomial $H_K$ behaves like $\sqrt{2K}$, so the requirement $L\gtrsim\sqrt{K}$ asymptotically coincides with requiring $L$ to be larger than the largest root of $H_K$.

We also point out that the proof above can be extended to the Laguerre case. However, as the Laguerre functions decay much more slowly than the Hermite functions, a larger parameter (approximately the square of that in the Hermite case) should be used. This can also be estimated by noting that the largest root of the Laguerre polynomial $L_K$ behaves like $4K$. Again, these theoretical results are in good agreement with our numerical tests in Fig. 4, where a suitably large parameter leads to a very stable approach under the linear rule.

### 3.2 Convergence and the scaling factor: motivation

In this subsection, we investigate the convergence issue. By the discussion in the last section, we know that a suitable transformation parameter $L$ yields a stable approach. Furthermore, inspired by the proofs in [9, 8], one can expect the following convergence property of the least squares approach

\[
\mathrm{Pr}\Big\{\|f-f_m\|_{\rho}\ge C\min_{v\in V}\|f-v\|_{L^\infty(\mathbb{R})}\Big\}\le 2m^{-r}, \tag{3.26}
\]

where $\|\cdot\|_\rho$ is a suitable norm associated with the transformation and $f_m$ is the least squares solution. As the proof follows directly from the framework of [8], it is omitted here. Although the above result provides an error estimate in terms of the best approximation in the finite-dimensional space, the rate of convergence may depend strongly on properties of the underlying function, such as its regularity and decay rate.

To this end, we first present some numerical results for approximating a test function involving a Gaussian parameter and a constant. In the following experiments, we report the error in the maximum norm; more precisely, we compute the maximum error on 4000 random grid points. The approximation error using the Hermite functions against the polynomial order is given in Fig. 5. In the computations, the mapping parameter is chosen to guarantee the stability. It can be seen from Fig. 5 that both the linear rule (right) and the quadratic rule (left) produce a very stable approach up to the largest degree tested.
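The least squares procedure of such an experiment can be sketched as follows. All concrete choices here are illustrative assumptions, not the paper's settings: the target $f(y)=e^{-y^2}$, the mapping parameter $L=4\sqrt{K}$, and the linear rule $m=4K$.

```python
import numpy as np
from numpy.polynomial.hermite import hermval
from math import factorial, pi, sqrt

def hermite_function(k, y):
    """Hermite function Ht_k(y) = h_k(y) e^{-y^2/2}, orthonormal on R."""
    c = np.zeros(k + 1); c[k] = 1.0
    return hermval(y, c) * np.exp(-y**2 / 2) / sqrt(2.0**k * factorial(k) * sqrt(pi))

rng = np.random.default_rng(0)
K = 10                              # dimension of the approximation space
L = 4.0 * sqrt(K)                   # assumed mapping parameter
m = 8 * K                           # linear rule: m scales linearly with K
xi = rng.uniform(-1.0, 1.0, m)      # uniform random points on (-1, 1)
y = L * np.arctanh(xi)              # mapped to the real line, cf. (3.6)

f = lambda t: np.exp(-t**2)         # hypothetical target with Gaussian-like decay
D = np.vstack([hermite_function(k, y) for k in range(K)]).T   # m x K design matrix
c = np.linalg.lstsq(D, f(y), rcond=None)[0]                   # least squares fit

yt = np.linspace(-3.0, 3.0, 1000)   # test grid
approx = sum(c[k] * hermite_function(k, yt) for k in range(K))
err = np.abs(approx - f(yt)).max()
print(err)
```

Because the assumed target decays like the Gaussian weight, the Hermite-function expansion converges quickly here; replacing it with a much faster or slower decaying function reproduces the slow convergence discussed next.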

Another simple observation is that although the function is sufficiently smooth for any value of the constant, the convergence rate differs dramatically. For some values the convergence is very fast, while for others it is very slow (yet still stable). This is due to the use of the Hermite functions, which behave approximately like $e^{-y^2/2}$ at infinity. When the approximated function matches such a decay property, the convergence is fast, while it is very slow when the approximated function decays much faster or much slower than the Gaussian function.

A remedy for the above problem is the use of the so-called scaling factor [27, 29]. In spectral methods, the scaling factor is often used to speed up the convergence when approximating functions that decay fast at infinity. Such an idea has been successfully applied to the study of various problems [22, 21, 5].

We now introduce the basic idea of the scaling factor. To this end, let $f$ be a function that decays exponentially, namely,

\[
|f(y)|<\epsilon,\quad\forall\,|y|>M, \tag{3.27}
\]

where $\epsilon$ and $M$ are some constants. The idea of the scaling factor is to expand $f$ as

\[
f(y)\approx\sum_{k=0}^{N-1}c_k\,\widetilde{H}_k(\alpha y), \tag{3.28}
\]

where $\alpha$ is a scaling factor. The key point of using $\alpha$ is to scale the points so that they fall well within the effective support of $f$.

To see the effect of the scaling, let us carry out some numerical tests. We first consider a fast decaying function. In the top of Fig. 6, the maximum approximation error with respect to the polynomial order is shown for the one-dimensional case. In the left of the figure, we fix the mapping parameter to ensure stability. It is noticed that the convergence of the original Hermite function approach is very slow (although stable), while the use of a scaling factor can significantly improve the convergence rate; an optimal scaling factor can be identified from the plots. The right of the figure presents the convergence properties using the scaling but with varying mapping parameters. It is noticed that, for small parameters, the convergence rate deteriorates when a large polynomial order is used; this is due to the instability associated with small parameters. In contrast, a sufficiently large parameter results in a very stable approach.

Let us now consider a slowly decaying function. The corresponding convergence results are shown in the bottom of Fig. 6. The bottom left uses a fixed mapping parameter and several values of the scaling factor $\alpha$. An optimal scaling factor can again be identified in terms of the rate of convergence, although the results for all tested $\alpha$ are stable. The bottom right shows the error curves using the optimal scaling but with various mapping parameters. Again, for small parameters the convergence rate deteriorates when a large polynomial order is used, while a sufficiently large parameter results in a very stable approach.

### 3.3 Scaling factor: application to least square approach

The above tests suggest that proper scaling factors should be employed to speed up the rate of convergence. We now discuss how to find a feasible scaling in our least squares approach. Note that the numerical solution (the vector $c$ of expansion coefficients) satisfies

\[
A\,c=f \tag{3.29}
\]

with $A$ being the design matrix, where

\[
A=\big(\langle\widetilde{H}_i,\widetilde{H}_j\rangle_m\big)_{i,j=1}^{N},\qquad
f=\big(\langle f,\widetilde{H}_j\rangle_m\big)_{j=1}^{N}. \tag{3.30}
\]

For ease of discussion, we assume that the points are ordered by increasing absolute value, i.e.,

\[
|y_1|\le|y_2|\le\cdots\le|y_m|.
\]

Note that

\[
f_k=\langle f,\widetilde{H}_k\rangle_m=\sum_{i=1}^{m}f\Big(\frac{y_i}{\alpha}\Big)\widetilde{H}_k(y_i),\qquad k=1,\dots,N. \tag{3.31}
\]

Clearly, in order to compute $f_k$ we need to use information of $f$ from the interval $\big[-\max_j|y_j|/\alpha,\ \max_j|y_j|/\alpha\big]$, outside of which the contribution of $f$ is 0 in the sense of floating point arithmetic. This observation suggests that

\[
\max_{1\le j\le m}\{|y_j|\}/\alpha\le M\quad\Rightarrow\quad \alpha=\max_{1\le j\le m}\{|y_j|\}/M. \tag{3.32}
\]

This idea is similar to the proposal given in [29] in the context of pseudospectral methods. However, in our least squares approach the points are generated randomly, and the scaling in (3.32) may not be efficient from a probabilistic point of view: with some probability only a few points (maybe only 2 or 3) are extremely large (we refer to these as bad points), which means that the scaling (3.32) may over-scale the points. This motivates us to drop such bad points. More precisely, we choose

\[
\widetilde{\alpha}=\max_{1\le j\le\widetilde{m}}\{|y_j|\}/M,\qquad \widetilde{m}=\lfloor\mu m\rfloor, \tag{3.33}
\]

where $\mu$ is a parameter close to 1. That is, we drop the possible bad points and require only the first $\widetilde{m}$ points to contribute to the computation of the scaling. In practice, it is found that a value of $\mu$ very close to 1 suffices, meaning that the probability of generating bad points is small.
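The two scalings (3.32) and (3.33) can be sketched in a few lines; the values $M=5$ and $\mu=0.99$ below are illustrative assumptions, not the paper's settings.

```python
import numpy as np

def scalings(y, M, mu=0.99):
    """Return the maximum scaling (3.32) and the trimmed scaling (3.33).

    y  : random sample points on the real line
    M  : effective support of f, i.e. |f| is negligible for |y| > M
    mu : fraction of points kept; the largest (1 - mu) fraction of the
         points (the 'bad points') is dropped before scaling.
    """
    r = np.sort(np.abs(y))                  # |y_1| <= ... <= |y_m|
    alpha_max = r[-1] / M                   # (3.32): largest point mapped to M
    m_tilde = int(np.floor(mu * len(y)))    # (3.33): drop the bad points
    alpha_trim = r[m_tilde - 1] / M
    return alpha_max, alpha_trim

rng = np.random.default_rng(1)
xi = rng.uniform(-1.0, 1.0, 1000)
y = 10.0 * np.arctanh(xi)                   # mapped uniform points, cf. (3.6)
a_max, a_trim = scalings(y, M=5.0)
print(a_max, a_trim)                        # the trimmed scaling is never larger
```

Because $\operatorname{artanh}$ stretches points near $\pm 1$ far out, the trimmed scaling is typically markedly smaller than the maximum one, which is precisely the over-scaling effect discussed above.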

We now repeat the numerical test in Fig. 6 (the left plots), with particular attention to the use of the scalings (3.32) and (3.33). The numerical results are given in Fig. 7, where "scaling free" stands for the results without using a scaling, "maximum scaling" denotes the scaling computed by (3.32), and "scaling with $\mu$" means that the scaling is computed by (3.33); the effective support parameter $M$ is set according to the decay of the target function. The left of Fig. 7 shows the convergence for the fast decaying function. It can be seen that the numerical error with the scaling factor (3.33) decays very fast compared to the scaling-free case, while the results with the maximum scaling behave almost the same as the scaling-free case. The right plot is for the slowly decaying function, and a similar phenomenon is observed.

For high-dimensional problems, a suitable scaling should be chosen in each direction. A two-dimensional test is provided in Fig. 8, where the approximating space is the TD space. In the left plot we have used the linear rule, while the quadratic rule is used in the right plot. The scaling factors are computed by (3.33). It is shown that the convergence is stable and the scaling works very well. Furthermore, it is noticed that the convergence rate of the quadratic rule (right) is better than that of the linear rule; this might be because the linear rule uses fewer points than the quadratic rule.

###### Remark 3.3.

In practice, finding the optimal scaling parameter is not straightforward due to the limited information on the function $f$. Nevertheless, we can always find a reasonable scaling if the information on $f$ is reasonably sufficient. It remains a research issue how to find an acceptable scaling if only a few evaluations of $f$ are available; we leave this problem for future studies.

## 4 Parametric UQ: illustrative examples

In this section, we discuss the application of the least squares Hermite (Laguerre) approximations to parametric uncertainty analysis. More precisely, we shall use the least squares approach based on the Hermite (Laguerre) functions to compute the quantities of interest (QoI) of UQ problems.

### 4.1 A simple random ODE

We first consider a simple random ODE problem with Gamma random input:

\[
\frac{\mathrm{d}f}{\mathrm{d}t}=-k(y)f,\qquad f(0)=1, \tag{4.1}
\]

where $k(y)$ is a function of a random Gamma parameter $y$. Note that for such problems with Gamma random input, the Laguerre functions are used as the bases. To illustrate the idea, we take a simple choice of $k(y)$. We are interested in the second moment of the solution, i.e.,

\[
\textmd{QoI}=\int_{\mathbb{R}^+}\textmd{e}^{-y}f^2(t,y)\,\mathrm{d}y.
\]
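For reference solutions, this QoI can be computed by Gauss–Laguerre quadrature, whose weight $e^{-y}$ matches the Gamma measure. The sketch below uses the hypothetical choice $k(y)=y$ (not necessarily the paper's choice), for which $f(t,y)=e^{-yt}$ and the QoI has the closed form $1/(1+2t)$.

```python
import numpy as np
from numpy.polynomial.laguerre import laggauss

def qoi(t, n=60):
    """QoI = int_0^inf e^{-y} f(t, y)^2 dy with the assumed f(t, y) = e^{-y t}
    (i.e. k(y) = y in (4.1)), computed by n-point Gauss-Laguerre quadrature,
    which has the weight e^{-y} built in."""
    x, w = laggauss(n)
    return np.dot(w, np.exp(-x * t)**2)

t = 1.0
print(qoi(t), 1.0 / (1.0 + 2.0 * t))   # quadrature vs closed form
```

In the least squares approach of this section, each evaluation $f(t,y_i)$ instead comes from an ODE solve at a random point $y_i$, and the quadrature above only serves to validate the computed QoI.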

Note that in the least squares approach, for each random point one has to solve the ODE to obtain the corresponding evaluation. The random points located in $\mathbb{R}^+$ used here are the transformed uniform random points; we use the mapping (3.6) with parameters chosen to guarantee the stability. The numerical convergence results are shown in Fig. 9. It is noticed from Fig. 9 that the convergence is very slow without using a scaling, and this is again due to the fast decay of the solution compared to the Gamma measure. In this test, both the maximum scaling and the scaling with $\mu$ work well, which is different from the observations for the Gaussian measure. This is likely due to the slow decay of the Gamma measure, which results in a very big effective support (outside of which $f$ is 0 to machine accuracy), and thus the probability of over-scaling is not as large as in the Gaussian case. In the right plot, again all scaling choices work well, although the scaling computed by (3.33) behaves more stably.

It is seen from the above example that for problems with Gamma random parameters the maximum scaling can be used. Moreover, if the partial maximum scaling (3.33) associated with the parameter $\mu$ is used, then a larger $\mu$ should be used. This is quite different from the Gaussian case.

### 4.2 Elliptic problems with lognormal random input

We now take the following elliptic problem with lognormal random input as an example:

\[
-\nabla\cdot\big(a(x,\omega)\nabla u\big)=f,\quad x\in D,\ \omega\in\Omega,\qquad u(x,\omega)|_{\partial D}=0. \tag{4.2}
\]

The coefficient $a(x,\omega)$ is a lognormal random field, i.e.,

\[
a(x,\omega)=\textmd{e}^{\gamma(x,\omega)},\qquad \gamma(x,\omega)\sim N(\mu,\sigma^2),\quad\forall\, x\in D, \tag{4.3}
\]

where $N(\mu,\sigma^2)$ denotes a Gaussian probability distribution with expected value $\mu$ and variance $\sigma^2$, and $\gamma$ is such that the covariance function depends only on the distance between two points (isotropic property). Moreover, the covariance function is Lipschitz continuous and positive definite.

Several types of covariance functions have been proposed in the literature [3], such as the exponential correlation function
\[
C_\gamma(x,x')=\sigma^2\,\textmd{e}^{-\|x-x'\|/l_c},
\]
where $l_c$ denotes the correlation length.