Computing derivative-based global sensitivity measures using polynomial chaos expansions

Computing derivative-based global sensitivity measures using polynomial chaos expansions

B. Sudret sudret@ibk.baug.ethz.ch [ C.V. Mai mai@ibk.baug.ethz.ch ETH Zürich, Institute of Structural Engineering, Chair of Risk, Safety & Uncertainty Quantification, Stefano-Franscini-Platz 5, CH-8093 Zürich, Switzerland
Abstract

In the field of computer experiments sensitivity analysis aims at quantifying the relative importance of each input parameter (or combinations thereof) of a computational model with respect to the model output uncertainty. Variance decomposition methods leading to the well-known Sobol’ indices are recognized as accurate techniques, at a rather high computational cost though. The use of polynomial chaos expansions (PCE) to compute Sobol’ indices has allowed to alleviate the computational burden though. However, when dealing with large dimensional input vectors, it is good practice to first use screening methods in order to discard unimportant variables. The derivative-based global sensitivity measures (DGSM) have been developed recently in this respect. In this paper we show how polynomial chaos expansions may be used to compute analytically DGSMs as a mere post-processing. This requires the analytical derivation of derivatives of the orthonormal polynomials which enter PC expansions. The efficiency of the approach is illustrated on two well-known benchmark problems in sensitivity analysis.

keywords:
global sensitivity analysis, derivative-based global sensitivity measures (DGSM), polynomial chaos expansions, Morris method
journal: Reliability Engineering & System Safety

1 Introduction

Nowadays, the increasing computing power allows one to use numerical models to simulate or predict the behavior of physical systems in various fields, e.g. mechanical engineering (Reuter and Liebscher, 2008), civil engineering (Hasofer, 2009), chemistry (Campolongo et al., 2007), etc. The considered systems usually lead to highly complex models with numerous input factors (possibly tens to hundreds (Sudret et al., 2013; Patelli and Pradlwarter, 2010)) that are required to represent all the parameters driving the system’s behaviour, e.g. boundary and initial conditions, material properties, external excitations, etc. On the one hand, this increases dramatically the computational cost. On the other hand, the input factors that are not perfectly known may introduce uncertainties into the predictions of the system response. In order to take into account the uncertainty, probabilistic approaches have been developed in the last two decades, in which the model input parameters are represented by random variables. Then the input uncertainties are propagated through the computational model and the distribution, moments or probability of exceeding prescribed thresholds may be computed (Sudret, 2007; de Rocquigny, 2012).

In this context, sensitivity analysis (SA) examines the sensitivity of the model output with respect to the input parameters, i.e. how the output variability is affected by the uncertain input factors (Saltelli et al., 2000, 2004, 2008). The use of SA is common in various fields: engineering (Hasofer, 2009; Pappenberger et al., 2008; Reuter and Liebscher, 2008; Kala, 2011), chemistry Campolongo et al. (2007), nuclear safety Fassò (2013), economy Borgonovo and Peccati (2006), biology Marino et al. (2008), and medicine Abraham et al. (2007), among others. One can traditionally classify SA into local and global sensitivity analyses. The former aims at assessing the output sensitivity to small input perturbations around the nominal values of input parameters. The latter aims at assessing the overall or average influence of input parameters onto the output. Local SA has the disadvantages of being related to a fixed nominal point in the input space, and the interaction between the inputs is not accounted for Kucherenko et al. (2009). On the other hand, global SA techniques take into account the input interaction and are not based on the choice of a nominal point but account for the whole input space.

The most common sensitivity analysis methods found in the literature are the method of Morris (1991), FAST Cukier et al. (1973, 1978); Mara (2009) and variance decomposition methods originally investigated in Sobol’ (1993); Sobol′ (2001); Sobol’ and Kucherenko (2005); Archer et al. (1997); Saltelli (2002); Saltelli et al. (2010). Usually standard Monte Carlo simulation (MCS) is employed for estimating the sensitivity indices in all these techniques. This approach requires a large number of model evaluations though and therefore is usually computationally expensive. When complex systems are of interest, the standard MCS becomes unaffordable. To overcome this problem, metamodels (also called surrogate models or emulators) are usually used in order to carry out the Monte Carlo simulation Sathyanarayanamurthy and Chinnam (2009); Zhang and Pandey (2014). In particular, polynomial chaos expansions (PCE) have been recognized as a versatile tool for building surrogate models and for conducting reliability and sensitivity analyses, as originally shown in Sudret (2006, 2008); Blatman and Sudret (2010a). Using PCE, variance-based sensitivity analysis becomes a mere post-processing of the polynomial coefficients once they have been computed.

More recently, a new gradient-based technique has been proposed for screening unimportant factors. The so-called derivative-based global sensitivity measures (DGSM) are shown to be upper bounds of the total Sobol’ indices while being less computationally demanding Sobol’ and Kucherenko (2009, 2010); Kucherenko et al. (2009); Lamboni et al. (2013). Although the computational cost of this technique is reduced compared to the variance-based technique Kucherenko et al. (2009), its practical computation still relies on the Monte Carlo simulation approach.

In this paper we investigate the potential of polynomial chaos expansions for computing derivative-based sensitivity indices and allow for an efficient screening procedure. The paper is organized as follows: the classical derivation of Sobol’ indices and their link to derivative-based sensitivity indices is summarized in Section 2. The machinery of polynomial chaos expansions and the link with sensitivity analysis is developed in Section 3. The computation of the DGSM based on PC expansions is then presented in Section 4, in which an original method for computing the derivatives of orthogonal polynomials is presented. Finally two numerical tests are carried out in Section 5.

2 Derivative-based global sensitivity measures

2.1 Variance-based sensitivity measures

Global sensitivity analysis (SA) aims at quantifying the impact of input parameters onto the output quantities of interest. One input factor is considered insignificant (unessential) when it has little or no effect on the output variability. In practice, screening out the insignificant factors allows one to reduce the dimension of the problem, e.g. by fixing the unessential parameters.

Variance-based SA relies upon the decomposition of the output variance into contributions of different components, i.e. marginal effects and interactions of input factors. Consider a numerical model where the input vector contains independent input variables with uniform distribution over the unit-hypercube and is the scalar output. The Sobol’ decomposition reads Sobol’ (1993):

(1)

in which is a constant term and each summand is a function of the variables . For the sake of conciseness we introduce the following notation for the subset of indices:

(2)

and denote by the subvector of that consists of the variables indexed by . Using this set notation, Eq. (1) rewrites:

(3)

in which is the summand including the subset of parameters . According to Sobol’ (1993), a unique decomposition requires the orthogonality of the summands, i.e.:

(4)

In particular each summand shall be of zero mean value. Accordingly the variance of the response reads:

(5)

In this expansion is the contribution of summand to the output variance.

The Sobol’ sensitivity index for the subset of variables is defined as follows Sobol′ (2001):

(6)

The total sensitivity index for subset is given by Sobol′ (2001):

(7)

where the sum is extended over all sets which contains . It represents the total amount of uncertainty apportioned to the subset of variables . For instance, for a single variable the Sobol’ sensitivity index reads:

(8)

and the total Sobol’ sensitivity index reads:

(9)

and respectively represent the sole and total effect of the factor on the system’s output variability. The smaller is, the less important the factor is. In the case when , is considered as unimportant (unessential or insignificant) and may be replaced in the analysis by a deterministic value.

In the literature one can find different approaches for computing the total Sobol’ indices, such as the Monte Carlo simulation (MCS) and the spectral approach. Sobol’ (1993); Sobol′ (2001) proposed direct estimation of the sensitivity indices for subsets of variables using only the model evaluations at specially selected points. The approach relies on computing analytically the integral representations of and respectively defined in Eq. (6) and Eq. (7).

Let us denote by the set that is complementary to , i.e. . Let and be vectors of independent uniform variables defined on the unit hypercube and define . The partial variance is represented as follows Sobol’ and Kucherenko (2005):

(10)

The total variance is given by Sobol’ and Kucherenko (2005):

(11)

A Monte Carlo algorithm is used to estimate the above integrals. For each sample point, one generates two -dimensional samples and . The function is evaluated at three points , and . Using independent sample points, one computes the quantities of interest , and by means of the following crude Monte Carlo estimators:

(12)
(13)
(14)
(15)

The computation of the sensitivity indices by MCS may exhibit reduced accuracy when the mean value is large. In addition, the computational cost is prohibitive: in order to compute the sensitivity indices for parameters, MCS requires model runs, in which is the number of sample points typically chosen equal to to reach an acceptable accuracy. Saltelli (2002) suggested a procedure that is more efficient for computing the first and total sensitivity indices. Sobol’ et al. (2007) modified the MCS procedure in order to reduce the lack of accuracy. Some other estimators for the sensitivity indices by MCS may be found in Monod et al. (2006); Janon et al. (2013).

2.2 Derivative-based sensitivity indices

The total Sobol’ indices may be used for screening purposes. Indeeed a negligible total Sobol’ index means that variable does not contribute to the output variance, neither directly nor in interaction with orther variables. In order to avoid the computational burden associated with estimating all total Sobol’ indices, a new technique based on derivatives has been recently proposed by Kucherenko et al. (2009).

Derivative-based sensitivity analysis originates from the Morris method introduced in Morris (1991). The idea is to measure the average of the elementary effects over the input space. Considering variable one first samples an experimental design (ED) in the input space and then varies this sample in the direction. The elementary effect () is defined as:

(16)

in which is the sample point and is the perturbed sample point in the -th direction. The Morris importance measure (Morris factor) is defined as the average of the ’s:

(17)

By definition, the variance of the s is calculated from:

(18)

Kucherenko et al.(Kucherenko et al., 2009) generalized these quantities as follows:

(19)
(20)

provided that is square-integrable. Any input parameter with and is considered as unimportant. It has been shown that the Morris factor has a higher convergence rate compared to variance-based methods, which makes it attractive from a computational viewpoint. Because the elementary effects may be positive or negative, they can cancel each other, which might lead to a misinterpretation of the importance of . To avoid this, Campolongo et al. (2007) modified the Morris factor as follows:

(21)

Recently, Sobol’ and Kucherenko (2009) introduced a new sensitivity measure (SM) which is the mean-squared derivative of the model with respect to :

(22)

Sobol’ and Kucherenko (2009) and Lamboni et al. (2013) could establish a link between the in Eq. (22) and the total Sobol’ indices in Eq. (9). In case is a uniform random variable over , one gets:

(23)

where is the upper-bound to the total sensitivity index and is the model output variance. In case of a uniform variable this upper bound scales to:

(24)

Finally the above results can be extended to other types of distributions. If is a Gaussian random variable with mean and variance and respectively, one gets:

(25)

In the general case, Lamboni et al. (2013) define the upper bound of the total Sobol’ index of as:

(26)

in which is the Cheeger constant, is the cumulative distribution function of and is the probability density function of .

3 Polynomial chaos expansions

Let us consider a numerical model where the input vector is composed of independent random variables and is the output quantity of interest. Assuming that has a finite variance, it can be represented as follows Ghanem and Spanos (2003); Soize and Ghanem (2004):

(27)

in which form a basis on the space of second order random variables and ’s are the coordinates of onto this basis. In case the basis terms are multivariate orthonormal polynomials of the input variables , Eq. (27) is called polynomial chaos expansion.

Assuming that the input vector has independent components with prescribed probability distribution functions , one obtains the joint probability density function:

(28)

For each , one can construct a family of orthogonal univariate polynomials with respect to the probability measure satifying:

(29)

where is the inner product defined on the space associated with the probability measure , is the Kronecker symbol with if , otherwise and is a constant. The univariate polynomial belongs to a specific class according to the distribution of . For instance, if is standard uniform (resp. Gaussian) random variable, are orthogonal Legendre (resp. Hermite) polynomials. Then the orthonormal univariate polynomials are obtained by normalization:

(30)

Introducing the multi-indices , a multivariate polynomial can be defined by tensor product as:

(31)

Soize and Ghanem (2004) prove that the set of all multivariate polynomials in the input random vector forms a basis of the Hilbert space of second order random variables:

(32)

where ’s are the deterministic coefficients of the representation.

In practice, the input random variables are usually not standardized, therefore it is necessary to transform the input vector into a set of standard variables. We define the isoprobabilistic transform which is a unique mapping from the original random space of ’s onto a standard space of basic independent random variables ’s. As an example may be a standard normal random variable or a uniform variable over .

In engineering applications, only a finite number of terms can be computed in Eq.(32). Accordingly, the truncated polynomial chaos expansion of can be represented as follows Sudret (2007):

(33)

in which is the set of multi-indices ’s retained by the truncation scheme.

The application of PCE consists in choosing a suitable polynomial basis and then computing the appropriate coefficients ’s. To this end, there exist several techniques including spectral projection Le Maître et al. (2002); Matthies and Keese (2005), stochastic collocation method Xiu (2009) or least square analysis (also called regression, see (Berveiller et al., 2004, 2006)). A review of these so-called non-intrusive techniques is given in (Blatman, 2009). Recently, the least-square approach has been extended to obtain sparse expansions (Blatman and Sudret, 2010b, 2011). This technique has been applied to global sensitivity analysis in Blatman and Sudret (2010a). The main results are now summarized.

First note that the orthonormality of the polynomial basis leads to the following properties:

(34)

As a consequence the mean value of the model output is whereas the variance is the sum of the square of the other coefficients:

(35)

Making use of the unique orthonormality properties of the basis, Sudret (2006, 2008) proposed an original post-processing of the PCE for performing global sensitivity analysis. For any subset variables , one defines the set of multivariate polynomials which depends only on :

(36)

The ’s form a partition of , thus the Sobol’ decomposition of the truncated PCE in Eq. (33) may be written as follows:

(37)

where:

(38)

In other words the Sobol’ decomposition is directly read from the PC expansion. Consequently, due to the orthonormality of PC basis, the partial variance reads:

(39)

As a cosequence the Sobol’ indices at any order may be computed by a mere combination of the squares of the coefficients. As an illustration, the first order PC-based Sobol’ indices read:

(40)

whereas the total PC-based Sobol’ indices are:

(41)

4 Derivative of polynomial chaos expansions

In this paper, we consider the combination of polynomial chaos expansions with derivative-based global sensitivity analysis. On the one hand, PCE are already known to provide accurate metamodels at reasonable cost. On the other hand, the derivative-based sensitivity measure (DGSM) is effective for screening unimportant input factors. The combination of PCE and DGSM appears as a promising approach for effective low-cost SA. In fact, once the PCE metamodel is built, the DGSM can be computed as a mere post-processing of the metamodel which simply consists of polynomial functions.

As seen in Section 2, a DGSM is related to the expectation of the square of the model derivative, which was denoted by . We will express the model derivative in a way such that the expectation operator can be easily computed, more precisely by projecting the components of the gradient onto a PC expansion.

4.1 Hermite polynomial chaos expansions

In this section we consider a numerical model where is the scalar output and is the input vector composed of independent Gaussian variables . The isoprobabilistic transform reads:

(42)

where are standard normal random variables. The truncated PCE of reads:

(43)

in which is a multi-index, is the set of indices in the truncated expansion, is the multivariate polynomial basis obtained as the tensor product of univariate orthonormal Hermite polynomials (see A) and is the deterministic coefficient associated with .

Since is a one-to-one mapping with , the derivative-based sensitivity index reads:

(44)

The DGSM of , in other words the corresponding upper bound to the total Sobol’ index , is computed according to Eq. (25):

(45)

in which . This requires computing the partial derivatives of polynomial functions of the form . One can prove that the derivatives read (see A):

(46)

Therefore the derivative of the multivariate orthonormal Hermite polynomial with respect to reads:

(47)

provided that , and otherwise. Then the derivative of a Hermite PCE with respect to is given the following expression:

(48)

in which is the set of multi-indices having a non-zero component and is the index vector derived from by subtracting 1 from . The expectation of the squared derivative in Eq. (45) is reformulated as:

(49)

Due to the linearity of the expectation operator, the above equation requires computing . Note that the orthonormality of the polynomial basis leads to where is the Kronecker symbol. Thus one has:

(50)

As a consequence, in case of a Hermite PCE the DGSM can be given the following analytical expression:

(51)

Note that the total Sobol’ indices can be obtained directly from the PCE by as shown in Eq. (41). With integer indices , it is clear that the inequality is always true by construction.

4.2 Legendre polynomial chaos expansions

Consider now a computational model where the input vector contains independent uniform random variables . We first use an isoprobabilistic transform to convert the input factors into normalized variables :

(52)

where are uniform random variables. The Legendre PCE has the form of the expansion in Eq. (43), except that is now the multivariate polynomial basis made of univariate orthonormal Legendre polynomials (see B). Again, since is a one-to-one linear mapping with the derivative-based sensitivity index reads:

(53)

Similarly to Eq. (45), the upper bound DGSM to the total Sobol’ index is computed from Eq. (24) as:

(54)

Thus the derivative of univariate and multivariate Legendre polynomials are required. Denoting by , one shows in B that:

(55)

in which is a constant matrix whose row contains the coordinates of the derivative of onto a basis made of lower-degree polynomials . In other words, . Using this notation, the derivative of the multivariate orthonormal Legendre polynomials with respect to reads:

(56)

For a given let us define by the index vector having the component equal to :

(57)

Using this notation Eq. (56) rewrites as follows:

(58)

Denote by the set of having a non-zero index , i.e. . The derivative of a Legendre PCE with respect to then reads:

(59)

Denote by the set of multi-indices representing the ensemble of multivariate polynomials generated by differentiating the linear combination of polynomials . is obtained as:

(60)

where:

(61)

The derivative of Legendre PCE rewrites:

(62)

in which the coefficient is obtained from Eq.(59). Since the polynomials are also orthonormal, one obtains:

(63)

Finally, the DGSMs read:

(64)

4.3 General case

Consider now the general case where the input vector contains independent random variables with different prescribed probability distribution functions, i.e. Gaussian, uniform or others. Such a problem can be addressed using generalized polynomial chaos expansions Xiu and Karniadakis (2002). As the above derivations for Hermite and Legendre polynomials are valid componentwise, they remain identical when dealing with generalized expansions. Only the proper matrix yielding the derivative of the univariate polynomials in the same univariate orthonormal basis is needed, see Appendix A for Hermite polynomials and Appendix B for Legendre polynomials. The derivation for Laguerre polynomials is also given in Appendix C for the sake of completeness.

5 Application examples

5.1 Morris function

We first consider the Morris function that is widely used in the literature for sensitivity analysis Morris (1991); Lamboni et al. (2013). This function reads:

(65)

in which:

  • except for where ,

  • the input vector contains 20 uniform random variables ,

  • for ,

  • for ,

  • for ,

  • ,

  • the remaining first and second order coefficients are defined by , and ,

  • and the remaining third order coefficients are set to 0.

First, a PCE is built using the Least Angle Regression technique based on a Latin Hypercube experimental design of size . Then the PCE is post-processed to obtain the total Sobol’ indices and the upper-bound derivative-based sensitivity measures (DGSMs) using Eq. (41) and Eq. (64), respectively. The procedure is replicated 100 times in order to provide the 95% confidence interval of the resulting sensitivity indices.

As a reference, the total Sobol’ indices are computed by Monte Carlo simulation as described in Section 2.1 using the sensitivity package in R Pujol et al. (2013). One samples two experimental designs of size denoted respectively by and then computes the corresponding output vectors and . To estimate the total sensitivity index with respect to random variable , one replaces the entire column in sample (which contains the samples of ) by the column in sample to obtain a new experimental design denoted by . Then the output is computed from the input . The variance-based is obtained by means of , and using the sobol2007 function Pujol et al. (2013); Saltelli et al. (2010). The total number of model evaluations required by the MCS approach is . After 100 replications we also obtain the 95% confidence interval of the sensitivity indices.

Figure 1: Morris function: PCE-based vs. MCS-based sensitivity measures

For each input variable, Figure 1 depicts the total sensitivity indices computed by MCS and PCE approaches, and the DGSM derived from PCE as well as their corresponding 95% confidence intervals (the bounds of the latter being obtained from the 100 replications). Figure 1 shows that PCE derivative-based sensitivity measures and total Sobol’ indices for parameters are both close to zero, i.e. are unimportant factors. This is consistent with the MCS-based sensitivity measures. Using a sample set of size , the PCE-based approach can detect the non-significant input parameters with acceptable accuracy compared to MCS which requires model runs. In addition, the PCE-based total Sobol’ indices for the remaining parameters vary in a significantly narrower confidence intervals, i.e. are more reliable, than the MCS-based sensitivity indices. Finally, the obtained total Sobol’ indices are always smaller than the DGSMs. The less significant the parameter, the closer DGSM gets to the total Sobol’ index.

5.2 Oakley & O’Hagan function

The second numerical example is the Oakley & O’Hagan function Sobol’ and Kucherenko (2009); Oakley and O’Hagan (2004) which reads:

(66)

in which the input vector consists of 15 independent standard normal random variables . The vectors and the matrix are provided at www.sheffield.ac.uk/st1jeo.

Given the complexity of the function, the PCE-based approach is run with a Latin Hypercube experimental design of size . The size of a single sample set for the MCS approach is resulting in model runs. The procedure is similar as in Section 5.1.

Figure 2: Oakley & O’Hagan function: PCE-based vs. MCS-based sensitivity measures

Figure 2 shows that the PCE-based approach using only model evaluations can estimate the least significant parameters