Shapley effects for sensitivity analysis with dependent inputs: comparisons with Sobol’ indices, numerical estimation and applications
The global sensitivity analysis of a numerical model aims to quantify, by means of sensitivity indices estimate, the contributions of each uncertain input variable to the model output uncertainty. The so-called Sobol’ indices, which are based on the functional variance analysis, present a difficult interpretation in the presence of statistical dependence between inputs. The Shapley effect was recently introduced to overcome this problem as they allocate the mutual contribution (due to correlation and interaction) of a group of inputs to each individual input within the group. In this paper, using several new analytical results, we study the effects of linear correlation between some Gaussian input variables on Shapley effects, and compare these effects to classical first-order and total Sobol’ indices. This illustrates the interest, in terms of sensitivity analysis setting and interpretation, of the Shapley effects in the case of dependent inputs. We also investigate the numerical convergence of the estimated Shapley effects. For the practical issue of computationally demanding computer models, we show that the substitution of the original model by a metamodel (here, kriging) makes it possible to estimate these indices with precision at a reasonable computational cost.
keywords:Computer experiments, Kriging, Sensitivity Analysis, Shapley effects, Sobol’ indices, Uncertainty Quantification
When constructing and using numerical models simulating physical phenomena, global sensitivity analysis (SA) methods are valuable tools ioolem15 (). These methods allow one to determine which model input variables contribute the most to the variability of the model outputs, or on the contrary which are not important and possibly which variables interact with each other. The standard quantitative methods are based on the measure of variance and lead to the so-called Sobol’ indices. In the simple framework of scalar inputs, denoted by , and a single scalar output , the model response is
In the case of independent inputs, the interpretation of the Sobol’ indices is simple because the variance decomposition of is unique sob93 (). For instance, the first-order Sobol’ index of , denoted , represents the amount of the output variance solely due to . The second-order Sobol’ index () expresses the contribution of the interactions of the pairs of variables and , and so on for the higher orders. As the sum of all Sobol’ indices is equal to one, the indices are interpreted as proportions of explained variance.
However, in many applications, it is common that the input variables have a statistical dependence structure imposed by a probabilistic dependence function kurcoo06 () (e.g., a copula function) or by physical constraints upon the input or the output space petioo10 (); kuckly17 (); lopbol17 (). As shown in previous studies, estimating and interpreting Sobol’ indices with correlated inputs are not trivial saltar02 (); davwah08 (). Many propositions appear in the literature that are not always easy to interpret. One strategy proposed by jaclav06 () is to evaluate the Sobol’ indices of subsets of inputs which are correlated within the subsets but not correlated without. However, this approach is not satisfactory because one may need to compute the Sobol’ indices of the individual variables.
In xuger08 () the authors proposed to decompose each partial variance contributed by parameter , and defined as , into partial variance () contributed by the uncorrelated variations of parameter and partial variance () contributed by the correlated variations of with all other parameters , . Such an approach allows to exhibit inputs that have an impact on the output only through their strong correlation with other inputs. However, their approach only applies to linear model output with linearly dependent inputs. In the same spirit, the authors in martar12 () proposed a strategy which support non-linear models and non-linear dependencies. Their methodology is decomposed in two steps: a first step of decorrelation of the inputs and then a second step based on the concept of High Dimensional Model Representation (HDMR). HDMR (see, e.g., lira08 ()) relies on a hierarchy of component functions of increasing dimensions (truncation of ANOVA-Sobol’ decomposition in the case of independent variables). The second step in martar12 () thus consists in performing the HDMR on the decorrelated inputs. At the same time, the authors in kuctar12 () proposed a non-parametric procedure to estimate first-order and total indices in the presence of dependencies, not necessarily of linear type. Their methodology requires knowledge of the conditional probability density functions and the ability to draw random samples from those. Later, in martar15 (), the authors established a link between the approaches in kuctar12 () and martar12 (), allowing the distinction between the independent contributions of inputs to the response variance and their mutual dependent contributions, via the estimation of four sensitivity indices for each input, namely full and independent first-order indices, and full and independent total indices. They proposed two sampling strategies for dependent continuous inputs. The first one is based on the Rosenblatt transform ros52 (). The second one is a simpler method that estimates the sensitivity indices without requiring the knowledge of conditional probability density functions, and which can be applied in case the inputs dependence structure is defined by a rank correlation matrix (see, e.g., imacon82 ()).
A different approach was introduced in lirab10 (). It is once more based on the HDMR. The component functions are approximated by expansions in terms of some suitable basis functions (e.g., polynomials sudcan13 (), splines …). This meta-modeling approach allows a covariance decomposition of the response variance.
It is worth noting that none of these works has given an exact and unambiguous definition of the functional ANOVA for correlated inputs as the one provided by Hoeffding-Sobol’ decomposition hoe48 (); sob93 () when inputs are independent. A generalization of the Hoeffding-Sobol’ decomposition was proposed in Stone sto94 () (see also Hooker hoo07 ()). Then, in chagam12 (), the authors defined a new variable importance measure based on this decomposition. However, such an important measure suffers from two conceptual problems underlined in owepri17 (): the sensitivity indices can be negative and the approach places strong restrictions on the joint probability distribution of the inputs.
As a different approach, bor07 (); borcas11 () initiated the construction of novel generalized moment-free sensitivity indices, called -importance measures. Based on some geometrical considerations, these indices measure the shift area between the outcome probability density function and this same density conditioned to a parameter. Thanks to the properties of these new indices, a methodology is available to obtain them analytically through test cases. lopbol17 () has shown an application of these sensitivity measures on a gas transmission model with dependent inputs. zholu14 () has also proposed a methodology to evaluate the -importance measures by employing the Rosenblatt transform for the decorrelation procedure .
The Shapley values sha53 (), well known in game theory and economics, have been proposed by owe14 () in the framework of SA of model outputs. In their variance-based form, the Shapley values, also called Shapley effects, have been discussed in sonnel16 (); owepri17 (); ioopri17 () to carry out a SA in the case where the model entries exhibit dependencies among them. SA based on Shapley effects does not rely anymore on the Hoeffding-Sobol’ decomposition, it rather consists in a direct allocation of a part of the variance of the output at each input. Then, the two main properties and advantages of the Shapley values are that they cannot be negative and they sum-up to the total output variance. The allocation rule is based on an equitable principle. For example, an interaction effect is equally apportioned to each input involved in the interaction. From the conceptual point of view, the major issue is to understand what is the effect of the dependence between inputs on the variance-based Shapley values.
Several test-cases where the variance-based Shapley values can be analytically computed have been presented in owepri17 (). This study has highlighted several properties of these indices (e.g., for , if there is a bijection between any two of the , , then those two variables have the same Shapley value). Moreover, owepri17 () gives their general analytical form in the case of Gaussian inputs and a linear function (with ). However, from this analytical result (given in Section 3.1), it is difficult to understand the effects of the dependence between the inputs in the variance-based Shapley values. Therefore, it appears important in this paper to provide a thorough investigation of several particular cases, sufficiently simple to provide some interpretation.
For the sake of practical applications, sonnel16 () has proposed two estimation algorithms of the Shapley effects (that we define as the normalized variance-based Shapley values), and illustrated them on two application cases. By using analytical solutions of Shapley effects on several particular test functions, we perform in the present work a numerical convergence study and we compare the theoretical and numerical results of the Shapley effects. As for the Sobol’ indices, one important issue in practice is the numerical cost in terms of number of required model evaluations of estimating the Shapley effects. In this case, a classical solution is to use a metamodel which is a mathematical approximation of the numerical model (1) from an initial and limited set of runs fanli06 (); ioovan06 (). The metamodel solution is a current engineering practice for estimating sensitivity indices legmar17 ().
In the following section, we recall the general mathematical formulation of Sobol’ indices and Shapley effects when the inputs are dependent. We also provide a discussion of the SA setting saltar02 () that can be addressed with the Shapley effects. In Section 3, we develop the analytical formulas that one can obtain in several particular cases: linear models with Gaussian inputs in various dimensions, block-additive structure, Ishigami model with various dependencies between the three inputs. In particular, we focus on inequalities that can be developed between Sobol’ indices and Shapley effects. Numerical algorithms for estimating Shapley effects are studied in Section 4. Section 5 presents an industrial application which requires the use of a metamodel to estimate the sensitivity indices. A conclusion synthesizes the findings and contributions of this work.
2 General formulation of sensitivity indices
2.1 Sobol’ indices
Starting from the model (1) , the Sobol’ index associated with a set of inputs indexed by () is defined by:
Sobol’ indices have been introduced in sob93 (). Indices defined by (2) are referred as closed Sobol’ indices in the literature (see, e.g., pritar17 ()) and are always comprised between and . For the mathematical developments of the following sections, we denote the numerator of as:
where is the vector not containing and with the variables over which the conditional operators are applied indicated in subscript.
Recall that this paper is not restricted to independent inputs, so that the knowledge of first-order and total Sobol’ indices does not give complete information on the way an input influences the output . In martar15 (), the authors propose a strategy based on the estimation of four sensitivity indices per input, namely , , and . and are the classical Sobol’ indices, while and are new ones that can be expressed by means of Rosenblatt transform ros52 (). These indices can also be derived from the law of total variance as discussed in tarmar18 ().
The indices and include the effects of the dependence of with other inputs, and are referred to as full sensitivity indices in martar12 (). The indices and measure the effect of an input , that is not due to its dependence with other variables . Such indices have also been introduced as uncorrelated effects in martar12 () and further discussed in martar15 () which refers to them as the independent Sobol’ indices. In martar15 (), the authors propose to estimate the four indices , , and for the full set of inputs (). Note that and that , but other inequalities are not known.
In the following, we focus our attention on the full first-order Sobol’ indices and the independent total Sobol’ indices . Indeed, these are the indices which are used in the definition of the Shapley effects (see Section 2.2). Moreover, the full first-order indices coincide with the associated classical first-order Sobol’ indices , and the independent total indices coincide with the associated classical total Sobol’ indices . Our preliminary comparative study is completed by the computations of and in beneli18 ().
2.2 Shapley effects
As an alternative to the Sobol’ indices, the Shapley effects are defined as
where is a cost function, is the set of indices not containing and is the cardinality of . The Shapley values sha53 (), well known in game theory and by economists, have been proposed by owe14 () in the framework of SA of model outputs. By using, from Eq. (2),
the corresponding Shapley values are new importance measures for SA of model output. These indices are called Shapley effects by the authors in sonnel16 () which also prove that it is equivalent to define as or as . Note that, in owe14 () and sonnel16 (), the cost function is not normalized by the variance of while, in the present paper, we consider its normalized version.
The Shapley effect relies on an equitable allocation of part of the variance of the output to each input. Then, it is not any linear combination of Sobol’ indices. Indeed, the Shapley effect equitably shares interaction effects of a subset of inputs with each individual input within the subset. Moreover, the Shapley effect associated with input factor () takes into account both interactions and correlations of with , , . The share allocation has a consequence that Shapley effects are non negative and sum-up to one, allowing an easy interpretation for ranking input factors.
The theoretical formula (5) with shows that the Shapley effect of an input is a by-product of its Sobol’ indices. Thus, if one can compute the complete set of Sobol’ indices, we can compute the Shapley effect of each input. Note that both algorithms proposed in sonnel16 () are based on consistent estimators of the Shapley effects. From the exact permutation algorithm, we can extract a consistent estimator of any Sobol’ index. Concerning the random permutation algorithm, the sample size related to the inner loop (conditional variance estimation), and the one related to the outer loop (expectation estimation) are fixed to and respectively. Thus it is not possible to extract from that algorithm accurate estimates of Sobol’ indices. However, this last algorithm is consistent for the estimation of Shapley effects and is particularly adapted in the case of high-dimensional inputs space (see Section 4.1 for more details).
In case the input factors are independent, first-order (resp. total) Sobol’ indices provide effectively computable lower- (resp. upper-) bounds for the Shapley effects. In the following, as in sonnel16 (), we will show that these bounds do not hold anymore in case the input factors present some dependencies.
2.3 SA settings
saltar02 () and saltar04 () have defined several objectives, called SA settings, that sensitivity indices can address. These SA settings clarify the objectives of the analysis. They are listed in saltar04 () as follows:
Factors Prioritization (FP) Setting, to know on which inputs the reduction of uncertainty leads to the largest reduction of the output uncertainty;
Factors Fixing (FF) Setting, to determine which inputs can be fixed at given values without any loss of information in the model output;
Variance Cutting (VC) Setting, to know which inputs have to be fixed to obtain a target value on the output variance;
Factors mapping (FM) Setting, to determine which inputs are most responsible for producing values of the output in a specific region of interest.
In the case of independent inputs, the Sobol’ indices directly address the FP, FF and VC Settings (see saltar04 ()). In the case of dependent inputs, the classical ANOVA-Sobol’ decomposition does not hold anymore and the VC setting cannot be directly obtained with Sobol’ indices (see saltar02 ()). However, the FP Setting can be achieved using the independent and full first order Sobol’ indices. The FF Setting is more difficult to address in the presence of dependencies. Indeed, fixing one or more of the input factors, has an impact on all the input factors that are correlated with them. However, as explained in martar15 (), independent and full total Sobol’ indices can help understanding the FF Setting in the dependent framework.
It is of interest now to give some hints about how modelers can use the Shapley effects to address some SA settings. The VC Setting is not achieved, in the independent and dependent inputs cases, because the Shapley effect of an input contains some effects due to other inputs.
In the case of independent inputs, each Shapley effect is framed by the corresponding first-order and total indices :
In addition to the individual effect of the variable , the Shapley effects take into account the effects of interactions by distributing them equally in the index of each input that plays in the interaction owe14 (). Therefore, the FF Setting is achieved by using the Shapley effects. However, the FP Setting is not precisely achieved because we cannot distinguish the contributions of the main and interaction effects in a Shapley effect.
In the case of dependent inputs, the inequality in Eq. (7) does not hold true anymore. However, due to the equitable principle on which the allocation rule is based, a Shapley effect close to zero means that the input has no significant contribution to the variance of the output, nor by its interactions nor by its dependencies with other inputs. Then, the FF Setting is also achieved.
3 Relations and inequalities between Sobol’ indices and Shapley effects
As said before, in the case of dependent inputs, no relation such as the one in Eq. (7) can be directly deduced. The goal of this section is to study particular cases where analytical deduction can be made. The linear model is the first model to be studied in sensitivity analysis because of its simplicity, its interpretability, and the tractable analytical results it provides when the inputs follow Gaussian distributions. Considering two and three inputs allow us to compare our new results to previous studies on Sobol’ indices. The Ishigami function is the most used analytical test function in sensitivity analysis. It has been considered in numerous papers dealing with Sobol’ indices with dependent inputs.
We focus the analysis on the full first-order indices (corresponding to classical first-order Sobol’ indices) (called “First-order Sobol” in the figures) and the independent total indices (corresponding to classical total Sobol’ indices) (called “Ind total Sobol” in the figures). Indeed these are the indices mainly studied and discussed in the previous works on this subject, in particular in kuctar12 (). The numerical tests of kuctar12 () have inspired the ones proposed in this section. Moreover, these indices are easily and directly provided by the Shapley effects estimation algorithms (sonnel16 (), see Section 4), during the first and last iterations of the algorithm. Finally, the estimation of the independent first-order Sobol’ and of the full total Sobol’ indices is based on a rather cumbersome process, based on the use of Rosenblatt transforms (see more details in Section 2.1). Comparisons with these complementary indices are made in beneli18 ().
3.1 Gaussian framework and linear model
Let us consider
with and a positive-definite matrix. We have . Note that the subscripts are added on (resp. ) in order to precise which components are contained in the vector (resp. matrix).
We get from owepri17 ():
Recall now the following classical formula:
From (10) and according to the law of total variance, we easily obtain the Sobol’ indices:
Note that and do not play any role as translation parameters in variance-based sensitivity analysis. However, no general conclusion can be drawn from this too general example. Therefore, particular cases are studied in the three following sections.
3.2 Gaussian framework, linear model with two inputs
We focus now on the case . We consider and , with
From that, we get
The equality of the first three assertions is obtained in the absence of correlation (). In this case, the Shapley effects are equal to the first-order and total Sobol’ indices in the absence of correlation. In the presence of non-zero correlation, the Shapley effects lie between the full first-order indices and the independent total indices: when then , and when then . We call this the sandwich effect. We remark that the effects of the dependence that we can see in the independent total indices (e.g. for ) and in the full first-order indices (e.g. for ) is allocated in half to the Shapley effect, in addition to the elementary effect (e.g. for ).
These results are also illustrated in Figure 1. In the left figure, as the standard deviations of each variable are equal, the different sensitivity indices are superimposed and the Shapley effects are constant. In the right figure, exhibits more variability than , thus the corresponding sensitivity indices of are logically larger than the ones of . The effect of the dependence between the inputs is clearly shared on each input variable. The dependence between the two inputs lead to a rebalancing of their corresponding Shapley effects, while a full Sobol’ index of an input comprises the effect of another input on which it is dependent. We also see on the right figure that the Shapley effects of two perfectly correlated variables are equal. Finally, the sandwich effect is respected for each input: From Eq (17), we can prove that when and that elsewhere.
3.3 Particular case: Effect of a correlated input non included in the model
Consider the model with two dependent standard Gaussian variables with a correlation coefficient . It corresponds to the case , , , in the model introduced in Section 3.2. We then derive the Shapley effects:
Eq. (18) leads to the important remark that an input not involved in the numerical model can have a non-zero effect if it is correlated with an influential input of the model. If the two inputs are perfectly correlated, their Shapley effects are equal. This example also illustrates the FF setting that can be achieved with the Shapley effects: if is close to zero, is small and can be fixed without changing the output variance.
For the Sobol’ indices, we have
which indicates that is only important because of its strong correlation with (FP setting) and that by only accounting for the uncertainty in , one should be able to evaluate the uncertainty of accurately.
3.4 Gaussian framework, linear model with three inputs
We consider a linear model with being a Gaussian random vector with . We assume that is independent from both and , and that and may be correlated. The covariance matrix reads:
We obtained the following analytical results.
As expected, we have and we see in and how the correlation effect is distributed in each index. In the case of fully correlated variables, we obtain with .
We study the particular case , and , for which in kuctar12 (), the authors provide the formulas of full first-order and independent total Sobol’ indices. The analytical indices are depicted on Figure 2 as a function of the correlation coefficient . The Shapley effects are equal to the Sobol’ indices in the absence of correlation, and then lie between the associated full first-order and independent total indices in the presence of correlation. The sandwich effect is respected. The effect of an increasing correlation (in absolute value) can be interpreted as an attractive effect both for the full Sobol’ indices (here the first-order one) and the Shapley effect. However, for the Shapley effects, the contribution of the correlation is shared with each correlated variable. This leads to the increase of one Shapley effect and the decrease of the other. The Shapley effects allow an easy interpretation. As before, we see that the Shapley effects of two perfectly correlated variables are equal.
3.5 Gaussian framework, linear model with an interaction and three inputs
In Sections 3.2 and 3.4, we have presented models for which the Shapley effect takes a value between the associated full first-order and independent total indices. In the present section, we will show that it is not always the case. Let us define the model
By simple computations, we get and . Recall that
We thus get
A straightforward computation yields
We also get , and . Thus
Concerning the third input variable , one gets , and . Thus the two following assertions are equivalent:
The two following assertions are also equivalent:
It also happens that is not comprised between and . The two following assertions are equivalent:
Figure 3 illustrates the previous findings about the Sobol’ indices and the Shapley effects for this model. As expected, when the correlation coefficient belongs to two intervals, and , the Shapley effects of are larger than the full first-order Sobol’ indices and the independent total Sobol’ indices.
3.6 General model with three inputs and a block-additive structure
We consider the following model:
which is called a “block-additive” structure. We consider the general case where the vector is not restricted to a Gaussian vector. We only assume that the three inputs have finite variances and that is independent from . From the independence properties one has:
We also get that, for , the three following assertions are equivalent
We now consider, as in kuctar12 (), the Ishigami function, a non-linear model involving interaction effects which writes:
where with a non-zero correlation between a pair of variables.
Our first study considers a correlation between and only. Moreover is assumed to be independent from and . This model has a block-additive structure (as in Eq. (23) up to a permutation between and ). The sensitivity measures depicted on Figure 4 were obtained with a numerical procedure explained in the next section. We observe that the sandwich effect is respected for and . As is independent from the group and it has no interaction with that group, the Shapley index of equals both its full first-order and independent total indices. Moreover, the Shapley effects of and get closer as the correlation between them increases.
Our second (resp. third) study considers some correlation between and (resp. between and ). Let us note that the model structure of Eq. (23) with independence between the two blocks is not conserved: The model cannot be decomposed in the sum of two independent terms as in the block-additive model. Figure 5 shows the results of these studies. Surprisingly, the sandwich effect appears still respected. However, it seems difficult to deduce general results on that more general kind of models. In the two studies, we also observe that the correlation effects are smaller on the Shapley effects than in the previous case (correlation between and ), except when the absolute value of the correlation is close to one. It is clear that the strong effects observed in the previous case are due to the conjunction of correlation and interaction between and .
4 Numerical estimation issues
4.1 Estimation by direct sampling
being the cost function (which has been shown to be more efficient than the variance of the conditional expectation). The first algorithm traverses all possible permutations between the inputs and is called the “Exact permutation method”. The second algorithm consists of randomly sampling some permutations of the inputs and is called the “Random permutation method”. It only makes sense when all permutations is too large to exhaust and then the first algorithm is not applicable.
For each iteration of the inputs’ permutations loop, a conditional variance expectation must be computed. The cost , in terms of model evaluations, of these algorithms are then the following sonnel16 ():
Random permutation method: , with the number of random permutations for discretizing the principal sum in (5), the inner loop size, the outer loop size and the sample size for the variance computation.
The term that appears in these costs comes from the fact that Shapley effects are estimated (the last Shapley effect is estimated by using the sum-to-one property). Note that the full first-order Sobol’ indices (Eq. (2)) and the independent total Sobol’ indices (Eq. (4)) are also estimated by applying these algorithms, each one during only one loop iteration.
From theoretical arguments, the authors in sonnel16 () have shown that the near-optimal values of the sizes of the different loops are the following:
and as large as possible for the exact permutation method,
, and as large as possible for the random permutation method.
We consider these values in all our numerical tests and applications. The value of which leads to the same numerical cost for the two algorithms is:
In practical applications, this choice is not realistic and the random permutation method is applied with a much smaller value for (see Section 5). The exact permutation algorithm with fixed is consistent as tends to infinity. The random permutation one with fixed and is consistent as the number of sampled permutations, , tends to infinity. Indeed, both algorithms are based on unbiased estimators of and , whose variance tends to zero (see Appendix A, Equations (18) and (22) in sonnel16 () for more details). From Theorem 3 in sonnel16 (), we know that the variance of the estimator of obtained from the random permutation algorithm is bounded by . More intuitively, it seems reasonable to think that the difficulty to estimate is related on the effective dimension of the model as far as on the complexity of the dependence structure of the inputs.
To illustrate these numerical estimators, we consider the linear model with Gaussian inputs of Section 3.4 with , , . We first set . On the left-hand side of Figure 6 (resp. right-hand side) are plotted the results of the exact permutation method (resp. random permutation method) versus (resp. ). The error was obtained from the central limit theorem on the permutation loop (Monte Carlo sample of size ) and then by taking two times the standard deviation of the estimates ( confidence intervals). Similarly, the one in Figure 6 (left) was obtained from the central limit theorem on the outer loop (Monte Carlo sample of size ). In the both cases, we observe the convergence of the estimated values toward the exact values as (reps. ) increases. Further analysis about the choice of , and is nevertheless necessary.
While varying between and , Figure 7 shows the Shapley effects and the Sobol’ indices estimated by the two methods (with the same cost by using Eq. (27) with and for both algorithms). We recall that the theoretical results have been obtained in Section 3.4. First, we note that the numerical results and the theoretical values (see Figure 2) are in a good agreement. Second, we can observe that the accuracy of the two methods are equivalent for this low dimensional problem ().
4.2 Metamodel-based estimation
In this section, we consider a relatively common case in industrial applications where the numerical code is expensive in computational time. As a consequence, it cannot be evaluated intensively (e.g. only several hundred calculations are possible). It is therefore not possible to estimate the sensitivity indices with direct use of the model. Indeed, the estimate of Sobol’ indices requires for each input several hundreds or thousands of model evaluations ioolem15 ()). For the Shapley effects, an additional loop is required which increases the computational burden.
In this case, it is recommended to use a metamodel instead of the original numerical model in the estimation procedure. A metamodel is an approximation of the numerical model, built on a learning dataset fanli06 (). The appeal to a metamodel is a current engineering practice for estimating sensitivity indices legmar17 (). In the present work, we use the Gaussian process metamodel (also called kriging) sacwel89 (); sanwil03 () which has demonstrated in many practical situations to have good predictive capacities (see marioo08 () for example). The Gaussian process model is defined as follows:
where is a deterministic trend function (typically a multiple linear model) and is a centered Gaussian process. The practical implementation details of kriging can be found in rougin12 (). We make the assumption that is second-order stationary with a Matérn covariance parameterized by the vector of its correlation lengths and variance . The hyperparameters and are classically estimated by the maximum likelihood method on a learning sample comprising input/output of a limited number of simulations. Kriging provides an estimator of which is called the kriging predictor denoted by . To quantify the predictive capability of the metamodel and to validate the predictor, the metamodel predictivity coefficient is estimated by cross-validation or on a test sample (marioo08, ). More precisely, the Gaussian process model gives the following predictive distribution:
where is a point of the input space not contained in the learning sample, is the output vector of the learning sample of size and is the kriging variance that can also be explicitly estimated. In particular, the kriging variance quantifies the uncertainty induced by estimating with .
As an illustration, we study the Ishigami function (Eq. (25)) with a correlation coefficient between and , on which kuctar12 () studied the Sobol’ indices (see also Section 3.6 of this paper). When constructing the models, three different sizes of the learning sample (, and ) respectively give three predictive coefficients ( which is equivalent to the in prediction) different for the kriging predictor: , and . Figure 8 shows that with a strong predictive metamodel ( with ), the estimations of the Shapley effects by the metamodel are satisfactorily accurate. The precision of the estimated effects deteriorate rapidly with the decrease of the metamodel predictivity.
This example has just illustrated the need to have a sufficiently accurate metamodel in order to have precise estimates of Shapley effects. Controlling the error made on the Shapley effects estimates due to the metamodel approximation is possible thanks to the properties of the Gaussian process model. To provide such confidence bounds on Shapley effects estimates, beneli18 () has developed an algorithm based on conditional Gaussian process simulations (as in legcan14 () for Sobol’ indices).
5 Industrial application
This application concerns a probabilistic analysis of an ultrasonic non-destructive control of a weld containing manufacturing defect. Complex phenomena occur in such heterogeneous medium during the ultrasonic wave propagation and a fine analysis to understand the effect of uncertain parameters is important. The simulation of these phenomena is performed via the finite element code ATHENA2D, developed by EDF (Electricité de France). This code is dedicated to the simulation of elastic wave propagation in heterogeneous and anisotropic materials like welds.
A first study rupbla14 () has been realized with an inspection configuration aiming to detect a manufactured volumic defect located in a mm thick V grooveweld made of L steel (Figure 9). The weld material reveals a heterogeneous and anisotropic structure. It was represented by a simplified model consisting of a partition of equivalent homogeneous regions with a specific grain orientation. Eleven scalar input variables ( elastic coefficients and orientations of the columnar grains of the weld inspections) have been considered as uncertain and modeled by independent random variables, each one associated to a probability density function. The scalar output variable of the model is the amplitude of the defect echoes resulting from an ultrasonic inspection (maximum value on a so-called Bscan). Uncertainty and sensitivity analysis (based on polynomial chaos expansion legmar17 ()) have then been applied from Monte Carlo simulations of ATHENA2D in rupbla14 (). The sensitivity analysis has shown that almost all inputs are influential (only one input has a total Sobol’ index smaller than ), that the interaction effects are non-negligible (approximately of the total variance) and that the orientations play a major role for explaining the amplitude variability. The analysis confirms that an accurate determination of the micro-structure is essential in these simulation studies. Finally, as a perspective of their work, the authors in rupbla14 () explain that the real configuration has been strongly simplified by considering independent input random variables. Indeed, due to the welding physical process, a dependence structure exists between the orientations, in particular between two neighbor domains (see Figure 9 right).
The purpose of the present study is then to perform a sensitivity analysis by using a more realistic probabilistic model for the input random variables. Our SA setting is mainly a FF objective (see Section 2.3): Which parameters can be fixed without impacting the predicted model response uncertainty? Indeed, these SA results are expected to be useful with regards to the qualification process of the non-destructive control technique. As explained in Section 2.3, the Shapley effects are well adapted to FF setting in the case of dependent inputs.
In our study, the probability distributions of all the inputs are considered Gaussian, with the same mean and standard deviation as in rupbla14 (). From physical models of welding process and solidification moyapf03 (), engineers have been able to estimate the following correlation matrix between the orientations of Figure 9 (right),
As only several hundreds of numerical simulations of ATHENA2D can be performed in the schedule time of the present study, our strategy consists of generating a space filling design in order to have a “good” learning sample for a metamodel building process. A Sobol’ sequence of points has then been generated for the input variables on . After transformation of this sample to a sample of inputs which follow their physical scales and their joint probability density function, the corresponding runs of ATHENA2D have been computed.
Remark: The Monte Carlo simulations performed in the previous study rupbla14 () were not stored, and thus could not be reused. As already mentioned, the metamodel built in that previous study was based on polynomial chaos expansion, and was not stored as well.
From the resulting -size learning sample, a Gaussian process metamodel (parameterized as explained in Section 4.2) has then be fitted. We refer to legmar17 () for a comparative study between metamodels based on polynomial chaos expansions and the one based on Gaussian processes. We obtain a predictivity coefficient of . This result is rather satisfactory, especially when it is compared to the predictivity coefficient obtained by a simple linear model (). Moreover, the test on Ishigami function (Section 4.2) has shown that the estimation of Shapley effects with a metamodel of predictivity close to gives results rather close to the exact values.
The Shapley effects are estimated by using the metamodel predictor instead of ATHENA2D (Section 4.2). Due to the input dimension (), the random permutation method is used with , , and . The cost is then in terms of required metamodel evaluations. It would be prohibitive with the “true” computer code ATHENA2D, but it is feasible by using the metamodel predictor. Figure 10 gives the Shapley effects of the elasticity coefficients (, , , ) and orientations (, , , , , , ). The lengths of the -confidence interval (see Section 4.1) are approximately equal to , which is sufficient to provide a reliable interpretation. Note that negative values of some Shapley effects are caused by the Monte Carlo estimation errors.
By visualizing the Shapley effects, we can propose a discrimination in four groups of inputs according to their degree of influence (note that such discrimination is questionable due to the residual uncertainties on Shapley effects):
and whose effects are larger than ,
whose effect is ,
, , , and whose effects range between and ,
, and whose effects are smaller than . The FF setting could be addressed with the inputs in this group.
To be convinced by this FF setting, the variance of the metamodel output when (,,) are fixed is compared with the variance of the metamodel output when all the inputs vary. The variance with all inputs is and the variance with the three fixed inputs is . As expected, the decrease of corresponds approximately to the sum of the Shapley effects of , and (approximately ).
In the study of rupbla14 () which did not take into account the correlation, and have been identified as influential inputs (effects larger than ). This result shows the importance of taking into account the dependence structure between inputs and the usefulness of the Shapley effects for FF setting in this case. If we compare the (normalized) total Sobol’ indices of rupbla14 () and the Shapley effects of our study, taking into account the correlation has led to:
an increase in sensitivity indices for , and ,
a decrease in sensitivity indices for ,
similar sensitivity indices for , and .
By looking at the input correlation matrix (Eq. (30)), we remark that we can distinguish two groups of inputs as a function of their correlation degrees: and . We observe the homogeneity of the correlation structure effects: the inputs inside the first group correspond to an increase (or a stability) in sensitivity indices and that the inputs inside the second group correspond to a decrease (or a stability) in sensitivity indices.
In many applications of global sensitivity analysis methods, it is common that the input variables have a known statistical dependence structure or that the input space is constrained to a non-rectangular region. In this paper we considered two answers to that issue: the Shapley effects (a normalized version of the variance-based Shapley values proposed in owe14 () in the framework of sensitivity analysis) and the methodology developed in martar15 (). The latter suggests the joint analysis of full and independent first-order and total indices to analyze the sensitivity of a model to dependent inputs. In the present paper, we conducted a comparative analysis between Shapley effects on one side and full first-order and independent total indices on the other side. From analytical solutions obtained with linear models and Gaussian variables, we have shown that the dependence between inputs lead to a rebalancing of the corresponding Shapley effects, while a full Sobol’ index of an input captures the effect of any input on which it is dependent. Comparisons of Shapley effects with the complementary independent first-order and full total indices are currently under investigation.
We have also illustrated the convergence of two numerical algorithms for estimating Shapley effects. These algorithms depend on various parameters: (conditional variance estimation sample size), (expectation estimation sample size), (output variance estimation sample size) and (random permutation number). It would be interesting to investigate further the response of the algorithms to these different parameters and to derive empirical and asymptotic confidence intervals for the Shapley effects estimates. Introducing a sequential procedure in the random permutation algorithm, in order to increase until a sufficient precision on the Shapley effects, seems also promising. Moreover, it would be important in a future work to consider the estimation algorithms capabilities on more complex dependence structures than the pairwise cases exclusively discussed in the present paper.
brobac18 () has started to develop more efficient algorithms in the Gaussian linear case. Finally, we have shown the relevance of using a metamodel (here the Gaussian process predictor) in the industrial situations where the computer model is too time consuming to be evaluated thousands of times for the previous algorithms to be applied. Future work (started in beneli18 ()) will consist in developing an algorithm exploiting the complete structure of the Gaussian process allowing to infer the error due to this approximation (see jannod14 (), jankle13 () and legcan14 () for the Sobol’ indices and delmar16 () for the Derivative-based Global Sensitivity Measures).
We are grateful to Thierry Mara and Stefano Tarantola for organizing this SAMO special issue, and to the reviewers for their numerous remarks on the paper. We also thank Géraud Blatman who has performed the computations on ATHENA2D model, Roman Sueur for helpful discussions about Shapley effects and Chu Mai for his remarks and his help for the paper proofreading. Numerical estimation of Shapley effects and Sobol’ indices have been realized using the sensitivity package of the R software. Thanks to Eunhye Song, Barry L. Nelson and Jeremy Staum for providing preliminary versions of the Shapley effects estimation functions.
- (1) B. Iooss, P. Lemaître, A review on global sensitivity analysis methods, in: C. Meloni, G. Dellino (Eds.), Uncertainty management in Simulation-Optimization of Complex Systems: Algorithms and Applications, Springer, 2015, pp. 101–122.
- (2) I. Sobol, Sensitivity estimates for non linear mathematical models, Mathematical Modelling and Computational Experiments 1 (1993) 407–414.
- (3) D. Kurowicka, R. Cooke, Uncertainty analysis with high dimensional dependence modelling, Wiley, 2006.
- (4) M. Petelet, B. Iooss, O. Asserin, A. Loredo, Latin hypercube sampling with inequality constraints, Advances in Statistical Analysis 94 (2010) 325–339.
- (5) S. Kucherenko, O. Klymenko, N. Shah, Sobol’ indices for problems defined in non-rectangular domains, Reliability Engineering and System Safety 167 (2017) 218–231.
- (6) A. Lopez-Benito, R. Bolado-Lavin, A case study on global sensitivity analysis with dependent inputs: The natural gas transmission model, Reliability Engineering and System Safety 165 (2017) 11–21.
- (7) A. Saltelli, S. Tarantola, On the relative importance of input factors in mathematical models: Safety assessment for nuclear waste disposal, Journal of American Statistical Association 97 (2002) 702–709.
- (8) S. Da Veiga, F. Wahl, F. Gamboa, Local polynomial estimation for sensitivity analysis on models with correlated inputs, Technometrics 51 (4) (2009) 452–463.
- (9) J. Jacques, C. Lavergne, N. Devictor, Sensitivity analysis in presence of model uncertainty and correlated inputs, Reliability Engineering and System Safety 91 (2006) 1126–1134.
- (10) C. Xu, G. Gertner, Uncertainty and sensitivity analysis for models with correlated parameters, Reliability Engineering and System Safety 93 (2008) 1563–1573.
- (11) T. Mara, S. Tarantola, Variance-based sensitivity indices for models with dependent inputs, Reliability Engineering and System Safety 107 (2012) 115–121.
- (12) G. Li, H. Rabitz, J. Hu, Z. Chen, Y. Ju, Regularized random-sampling high dimensional model representation, Journal of Mathematical Chemistry 43 (3) (2008) 6022–6032.
- (13) S. Kucherenko, S. Tarantola, P. Annoni, Estimation of global sensitivity indices for models with dependent variables, Computer Physics Communications 183 (2012) 937–946.
- (14) T. Mara, S. Tarantola, P. Annoni, Non-parametric methods for global sensitivity analysis of model output with dependent inputs, Environmental Modeling & Software 72 (2015) 173–183.
- (15) M. Rosenblatt, Remarks on a multivariate transformation, The Annals of Mathematical Statistics 23 (3) (1952) 470–472.
- (16) R. Iman, W. Conover, A distribution-free approach to inducing rank correlation among input variables, Communications in Statistics 11 (3) (1982) 311–334.
- (17) G. Li, H. Rabitz, P. Yelvington, O. Oluwole, F. Bacon, C. Kolb, J. Schoendorf, Global sensitivity analysis for systems with independent and/or correlated inputs, Journal of Physical Chemistry 114 (2010) 6022–6032.
- (18) B. Sudret, Y. Caniou, Analysis of covariance (ANCOVA) using polynomial chaos expansions, in: G. Deodatis (Ed.), Proceedings of 11th Int. Conf. Struct. Safety and Reliability (ICOSSARâ2013), New-York, USA, 2013.
- (19) W. Hoeffding, A class of statistics with asymptotically normal distributions, Annals of Mathematical Statistics 19 (1948) 293–325.
- (20) C. J. Stone, The use of polynomial splines and their tensor products in multivariate function estimation, The Annals of Statistics 22 (1) (1994) 118–184.
- (21) G. Hooker, Gen