Multifidelity probability estimation
via fusion of estimators
Abstract
This paper develops a multifidelity method that enables estimation of failure probabilities for expensivetoevaluate models via information fusion and importance sampling. The presented general fusion method combines multiple probability estimators with the goal of variance reduction. We use lowfidelity models to derive biasing densities for importance sampling and then fuse the importance sampling estimators such that the fused multifidelity estimator is unbiased and has meansquared error lower than or equal to that of any of the importance sampling estimators alone. By fusing all available estimators, the method circumvents the challenging problem of selecting the best biasing density and using only that density for sampling. A rigorous analysis shows that the fused estimator is optimal in the sense that it has minimal variance amongst all possible combinations of the estimators. The asymptotic behavior of the proposed method is demonstrated on a convectiondiffusionreaction partial differential equation model for which samples can be afforded. To illustrate the proposed method at scale, we consider a model of a free plane jet and quantify how uncertainties at the flow inlet propagate to a quantity of interest related to turbulent mixing. Compared to an importance sampling estimator that uses the highfidelity model alone, our multifidelity estimator reduces the required CPU time by 65% while achieving a similar coefficient of variation.
ultifidelity modeling, uncertainty quantification, information fusion, importance sampling, reducedorder modeling, failure probability estimation, PDEs, turbulent jet
1 Introduction
This paper considers multifidelity estimation of failure probabilities for largescale applications with expensivetoevaluate models. Failure probabilities are required in, e.g., reliable engineering design and risk analysis. Yet failure probability estimation with expensivetoevaluate nonlinear models is computationally challenging due to the large number of Monte Carlo samples needed for lowvariance estimates.
Efficient failure probability estimation methods aim to reduce the number of samples at which the expensive model is evaluated, e.g., by exploiting variancereducing sampling strategies, multifidelity/multilevel estimation methods, or sequential sampling approaches. Variance reduction can be obtained through importance sampling [33], which allows for orderofmagnitude reductions in the number of samples needed to reliably estimate a small probability. However, importance sampling relies on having a good biasing distribution which in turn requires insight into the system. Surrogate models can provide such insight at much lower computational cost. Multifidelity approaches (see [38] for a review) that use surrogates for failure probability estimation via sampling have seen great interest recently [26, 8, 27, 13, 42, 35, 14], but require that the user selects a good importance sampling density. Multifidelity methods that avoid the selection of a single biasing density and instead use a suite of surrogate models to generate importance sampling densities were proposed in [36, 37, 32]. Nevertheless, this framework requires all knowledge about the small probability event to be available in the form of biasing densities, and is therefore only applicable to importance sampling estimators. Multilevel Monte Carlo [15, 2] methods use a hierarchy of approximations to the highfidelity model in the sampling scheme. However, those model hierarchies have to satisfy certain error decay criteria, an assumption we do not make here. Subset simulation [3, 34] and line search [40, 11] can be used directly on the highfidelity models, and therefore are of a blackbox nature.
In this work, in addition to the computationally expensive model, we also have information about the system in form of surrogate models, analytical models, expert elicitation, and reduced models. In other settings where such a variety of information is available, information fusion has been used to combine multisource probabilistic information into a single estimator, see [9, 31, 28]. Moreover, combining information from multiple models and sources via a weighted multifidelity method can lead to efficient data assimilation strategies [30].
Here, we propose a new approach to enable small probability estimation for largescale, computationally expensive models that draws from prior work in information fusion, importance sampling, and multifidelity modeling. We use information fusion in combination with multifidelity importancesamplingbased failure probability estimators, where in addition to the variance reduction from importance sampling, we obtain further variance reduction through information fusion. The proposed multifidelity framework uses the available surrogates to compute multiple unbiased failure probability estimators. We then combine them optimally into a new unbiased estimator that has minimal variance amongst all possible linear combinations of those estimators. The method therefore avoids the selection of the lowest variance biasing density to be used for sampling. Selecting the density that leads to the lowest variance in the failure probability estimator would require additional information, and not even error estimates on the surrogate model would suffice. Thus, we circumvent this step and optimally use all information available to us in form of probability estimators.
This paper is structured as follows: In Section 2 we illustrate the challenges in small failure probability computation and cover the necessary background material for multifidelity importance sampling used herein. Section 3 details our proposed approach of information fusion, importance sampling and multifidelity modeling. We then present in Section 4 a moderately expensive convectiondiffusionreaction test case, where we illustrate the asymptotic behavior of our approach. Section 5 discusses a turbulent jet model and demonstrates the computational efficiency of our proposed methods for this computationally expensive model. We close with conclusions in Section 6.
2 Small probability events and importance sampling estimators
We are interested in computing events with small probabilities, e.g., failure events, where the system fails to meet critical constraints. Section 2.1 describes small probability events, Section 2.2 introduces importance sampling and Section 2.3 briefly summarizes multifidelity importance sampling.
2.1 Small probability events
Let be a sample space which, together with a sigma algebra and probability measure, defines a probability space. Define a dimensional random variable with probability density , and let be a realization of . Let be an expensivetoevaluate model of high fidelity with corresponding dimensional quantity of interest . Let denote a limit state function that defines failure of the system. If , then is a configuration where the system fails. This defines a failure set
Define the indicator function via
The standard Monte Carlo estimator of the failure probability
uses realizations of the random variable and estimates
(1) 
In the special case of small probabilities, standard Monte Carlo may be unfeasible due to the large number of samples needed to obtain good estimators. Since failure probabilities are generally small, most realizations will be outside the failure domain , and conversely, only a small fraction of the samples lies in the failure region. The coefficient of variation (also called relative rootmeansquared error) of the estimator is given by
(2) 
Thus, to obtain estimators with a small coefficient of variation, a large number of samples is necessary. For instance, if the small probability is and if we want we would need samples via standard Monte Carlo approaches. This challenge is amplified by the presence of an expensivetoevaluate model, such as the model of a free plane jet in Section 5.
2.2 Importance sampling
Importance sampling achieves variance reduction by using realizations of a random variable with probability density . This random variable is chosen such that its probability density function has higher mass (compared to the nominal density ) in the region of the event of interest. For a general introduction to importance sampling, see [33, Sec.9]. Define the support , and let . Then
(3) 
is well defined, where is the likelihood ratio—in the context of importance sampling also called importance weight. The importancesampling estimate of the failure probability then draws realizations of the random variable with density and evaluates
(4) 
The variance of the importance sampling estimator is
(5) 
where
(6) 
If , and by using (3), one can show that the importance sampling estimator is an unbiased estimator of the failure probability, i.e.,
The importance sampling estimator has mean and variance , and by the central limit theorem converges in distribution to the normal random variable . Constructing a good biasing density that leads to small is challenging [33]. We next introduce lowfidelity surrogate models, which are then used to construct biasing densities.
2.3 Multifidelity Importance Sampling (MFIS)
Recall that by we denote an expensivetoevaluate model of high fidelity with corresponding quantity of interest . Let surrogates
of lower fidelities be available, which are cheaper to evaluate than the highfidelity model . We do not assume any information about the accuracy of the with respect to the highfidelity model . Sections 4.2 and 5.4 detail the specific surrogate models used for the respective applications.
We use the MFIS method (see [35] for details) to obtain estimators of the failure probability. First, MFIS evaluates the surrogate models at samples to obtain a surrogatemodel specific failure set . Second, MFIS computes a biasing density by fitting a distribution in form of a Gaussian mixture model to the parameters in the failure set. If no failed samples are found by the surrogate model, i.e., if , then we set the biasing density to be the nominal density. This leads to biasing densities from which we get importance sampling estimators
(7) 
for . The variance of the importance sampling estimator is given by (5) with and , with being the asymptotic variance from (6) with .
3 Fusion of multifidelity estimators
In many practical situations, a range of probability estimators are available, for instance in form of MFIS estimators derived from different biasing densities, in form of analytical models, or estimators derived from expert elicitation [31]. If one a priori knew which was the lowest variance estimator then a good strategy would be to sample only from that estimator. However, knowing a priori which estimator has the lowest variance is a formidable task, and one has to draw samples to assess which estimator has the lowest variance. In this section, we present our new approach that combines all available estimators in an optimal fashion by solving the following problem.
Problem 1.
Given unbiased estimators, with expected value , i.e. , find an estimator with expected value of the form
(8) 
such that it attains minimal variance amongst all estimators of the form (8). That is, find the optimal weights such that
(9) 
The fused estimator approach allows to still use information coming from the other (highvariance) estimators, whose samples would have otherwise gone to waste. Moreover, with the proposed method we can estimate smallprobabilities for expensivetoevaluate models by exploiting a variety of surrogates. We derive expressions for the mean and variance of the fused estimator in Section 3.1. In Section 3.2, we derive the optimal weights for the fused estimator. Section 3.3 then discusses the special case of uncorrelated estimators. Our proposed algorithm is discussed in Section 3.4, followed by a brief Section 3.5 that discusses measures of convergence of the estimators.
3.1 Mean and variance of fused estimator
We start with the observation that if the weights of the fused estimator sum to one, then the fused estimator is unbiased:
Let the estimators have corresponding variances . To compute the variance of the fused estimator we have to consider covariances between the individual estimators. Define the Pearson productmoment correlation coefficient as
(10) 
where . We also define the symmetric, positive semidefinite covariance matrix as:
(11) 
It is worth noticing that if the estimators are independent, then is diagonal. The variance of the fused estimator from (8) is
which can be written in vector form as
(12) 
In the following section, we provide an explicit formula to find the optimal weights for the general case of (possibly)correlated estimators ; while in Section 3.3 we discuss the case of independent estimators, such as those constructed with the MFIS method.
3.2 Optimizing the weights for minimumvariance estimate
Problem (9) seeks the optimal such that the variance in (12) is minimized and remains unbiased. In this section, we show that such weights exist, are unique, and present a closedform solution, provided that the covariance matrix is invertible. This is summarized in the following result.
Proposition 1.
Let be the vector of probability estimators and assume that is not singular. Define as a columnvector of length . The optimization problem (9) has the unique solution
That is, the minimal variance unbiased estimator is such that
Proof.
We have seen above that if and only if . Define the cost function by using equation (12). Therefore, the optimization problem (9) can be written as the quadratic program
(13) 
Letting denote the Lagrangian cost function associated to (13), the optimality conditions are and . This optimality system is written as
(14) 
For invertible , the unique weights to this quadratic program are then obtained by
(15) 
and the expression for the variance follows by inserting these weights into (13). The estimator is obtained by inserting the weights into (8). ∎
The weights can be expressed explicitly in terms of the components of the covariance matrix as
(16) 
Note, that the weights are inversely proportional to the variance of the individual estimators and the weight depends on the covariance between the estimators and . Also, note that if are correlated some weights may be negative, while for a diagonal all weights are strictly positive. In the next section, we have a closer look at the uncorrelated case.
3.3 The special case of uncorrelated estimators
In the situation where all estimators are uncorrelated, we recover the classical result of the inverse varianceweighted mean [29]. As a corollary from Proposition 1 we get the following result.
Corollary 1.
A few observations about this special case are in order:

The optimal coefficients of the combined estimator are inversely proportional to the asymptotic variance of the corresponding estimator . To reduce the variance via a weighted combination of estimators, smaller weights are assigned to estimators with larger variance.

If one variance is small compared to all other ones, say , then so that . The estimators with large variance cannot reduce the variance of the fused estimator much more.

If all estimators have equal variance, , then so that . Hence, combining the estimators reduces the variance by a factor of .

Since , it follows from both equations in (17) that
(18) Consequently, we are guaranteed to reduce the variance in by combining all estimators in the optimal way described above.
3.4 Fused multifidelity importance sampling: Algorithm and Analysis
We now use the general fusion framework to obtain a failure probability estimator. Thus, we solve Problem 1 in the context of importancesamplingbased failure probability estimators so that . Our proposed method optimally fuses the MFIS estimators from (7), such that
(19) 
with the optimal weights chosen as in Proposition 1 and . Since estimator is computed from samples, uses samples.
We now discuss how compares to a single importance sampling estimator with samples. Consider the estimator that uses samples drawn from a single biasing density for . This estimator would require selection of the lowest biasing density a priori, a formidable task. The next results compares and , and gives a criterion for which the former has lower variance than the latter.
Proposition 2.
Let estimators with samples be given. Let , and be a biasing density that is used to derive an IS estimator with samples. If
then the variance of the fused estimator in (19) with samples is smaller than the variance of the estimator with biasing density with samples, i.e.,
Proof.
Set , so that all estimators use the same number of samples. According to equation (17),
as well as , so that
∎
The importance sampling estimate (7) requires evaluating the highfidelity model at samples from the biasing density. While not required, we use to distribute the computational load evenly. Extension of Proposition 2 is straightforward to the case with different number of samples for each estimator
The computational procedure is summarized in Algorithm 1. Here, we denote samplingbased estimates as , which are realizations of the estimator .
(20) 
(21) 
3.5 Error measures and practical computation
4 Test case: Convectiondiffusionreaction
We first consider a PDE model whose solution can be numerically evaluated with moderate computational cost. With this model, we demonstrate the asymptotic behavior of our method because we can afford to sample the highfidelity model times, which will be too costly for the model in Section 5. The test problem is the convectiondiffusionreaction PDE introduced in Section 4.1. Its discretizations and reducedorder models are described in Section 4.2. Numerical results are presented in Section 4.3.
4.1 Convectiondiffusionreaction PDE model
We consider a simplified model of a premixed combustion flame at constant and uniform pressure, and follow the notation and setup in [7, Sec.3]. The model includes a onestep reaction of the species
in the presence of an additional nonreactive species, nitrogen. The physical combustor domain is in length (direction), and in height (direction), as shown in Figure 1.
The velocity field is set to be constant in the positive direction, and divergence free. The molecular diffusivity is modeled as constant, equal and uniform for all species and temperature. The PDE model is given by
(25) 
where the state is comprised of the components , with the being the mass fractions of the species (fuel, oxidizer, product), and denoting the temperature. Referring to Figure 1, we have that is the Dirichlet part of the boundary and combines the top, bottom and right boundary, where Neumann conditions are prescribed. In sum, ; the boundary conditions are imposed as given in Table 1. The nonlinear reaction term is of Arrhenius type [10], and modeled as
(26)  
(27) 
The parameters of the model are defined in Table 2. The uncertain parameters are the preexponential factor and the activation energy of the Arrhenius model. The domain for these parameters is denoted as . In particular, we have that
Boundary  Temperature  Species 

K  
K  
quantity  physical meaning  assumptions  value 

molecular diffusivity  const., equal, uniform  
velocity  const.  
molecular weight  const.  
molecular weight  const.  
molecular weight  const.  
density of mixture  const.  
univ. gas constant  const.  
heat of reaction  const.  K  
stochiometric coefficient  const.  2  
stochiometric coefficient  const.  1  
stochiometric coefficient  const.  2 
4.2 Discretization and reducedorder models
The model is discretized using a finite difference approximation in two spatial dimensions, with 72 nodes in direction, and 36 nodes in direction, leading to unknowns in the model. The nonlinear system is solved with Newton’s method. Let be the vector with components corresponding to the approximations of the temperature at the grid points. The highfidelity model (HFM) is and the quantity of interest is the maximum temperature over all grid points:
Reducedorder models provide a powerful framework to obtain surrogates for expensivetoevaluate models. In the case of nonlinear systems, reducedorder models can be obtained via reducedbasis methods [19], dynamic mode decomposition [24], proper orthogonal decomposition [5], and many others; for a survey, see [4]. Here, we compute reducedorder models for our multifidelity approach via Proper Orthogonal Decomposition and the Discrete Empirical Interpolation Method (DEIM) for an efficient evaluation of the nonlinear term. The training snapshots are generated from solutions to the highfidelity model on a parameter grid of equally spaced values . The three surrogate models are built from POD basis vectors, and accordingly DEIM interpolation points. The corresponding models are denoted as ROM1, ROM2, ROM3, respectively. We denote by the approximation to the temperature via the th ROM. The surrogate models are the mappings with corresponding quantity of interest denoted as
We refer the reader to [7] for more details on the discretization and ROM construction for this convectiondiffusionreaction model.
4.3 Results for multifidelity fusion of failure probabilities
We define a failure of the system when the maximum temperature in the combustor exceeds K, so that the limit state function is
(28) 
and likewise for the reducedorder models .
To compute the biasing densities, we draw samples from the uniform distribution on , compute surrogatebased solutions, and evaluate the limit state function for those solutions. If the limit state function indicates failure of the system for a solution obtained from the th surrogate model, the corresponding parameter is added to , the failure set computed from the th surrogate model. We compute the biasing densities via MFIS (see Section 2.3) as Gaussian mixture distributions with a single component. Table 3 shows the computational cost in CPU time of computing the biasing distributions from the various ROMs and the HFM. Computing a biasing density using the highfidelity model with samples costs approximately CPUhours. Constructing the biasing density via the lowfidelity models ROM2 and ROM3 reduces the computational time by a factor of 66 and 58, respectively. Note, that ROM1 is the reducedorder model that is cheapest to execute per model evaluation, but it is also the least accurate. In our case, ROM1 did not produce any samples in the failure region, even after samples. It is not unexpected that ROM1 is so inaccurate, since only two POD modes are not enough to resolve the important character of this problem. ROM1 is included to demonstrate how the fusion approach can be effective even in the presence of highly inaccurate surrogate models.
ROM1  ROM2  ROM3  HFM  

# of samples drawn  
# of samples in failure domain  0  13  17  17 
time needed  N.A.  [s]  [s]  [h] 
In Figure 2 we show the quantity of interest, i.e., the maximum temperature. The plots are obtained by generating samples from the nominal distribution (left) and the respective biasing distributions (right), and evaluating the HFM at those samples. Figure 2, left, shows that the typical range of the quantity of interest is between approximately 1200K and 2440K. However, only the events where the quantity of interest is above 2430K are relevant for the failure probability computation. By using the biasing distributions in Figure 2, right, a large portion of the outputs leads to a failure of the system. This indicates that the biasing distributions are successful in generating samples at the failure region of the highfidelity model.
Next, we show results for the fused multifidelity estimator with samples and compare it with importance sampling estimators that only use a single biasing density and also samples. The fused estimator is obtained via Algorithm 1 with samples by fusing the three surrogatemodelbased importance sampling estimators. For reference purposes, a biasing density is constructed as described above using the HFM with samples. Based on this density, we compute an importance sampling estimate of the failure probability with samples, resulting in .
To assess the quality of the fused estimator , we consider the error measures introduced in Section 3.5. In Figure 3, left, we show the root meansquared error of the importance sampling estimators as well as the combined estimator . Figure 3, right, shows the coefficient of variation defined in (24) for the estimators. The fused estimator is competitive in RMSE and coefficient of variation with the estimator using the highfidelity biasing density, but comes at a much lower computational cost.
Note, that the fused estimator does not use any of the highfidelity information. We only plotted the highfidelity estimator for comparison reasons, but the highfidelity density is not used in our algorithm. Heuristically, we could expect the fused estimator to perform better than the MFIS estimator with highfidelityderived biasing density in the following situation. Let the HFM be so expensive that the HF biasing density is built only from a few failure samples, and assume the lowfidelity models are good surrogates, hence able to cheaply explore the failure region. Then the lowfidelity biasing density could be better than the highfidelity biasing density.
In Table 4 we show the weights for the fused estimator . The fused estimator assigns only a small weight to the estimator which uses biasing density . This was expected, as the estimator has large variance due to the fact that biasing density is actually the nominal density, see Table 3 as the ROM1 evaluation did not yield any samples in the failure domain.
0  0  0.005  0.001  0.002  0.005  
0.587  0.471  0.331  0.294  0.415  0.742  
0.413  0.529  0.664  0.705  0.583  0.253 
4.4 Comparison to subset simulation methods
To demonstrate the efficiency of our proposed multifidelity method compared to stateoftheart existing methods in failure probability estimation, we compare our results to subset simulation [3], a widely used method for reliability analysis and failure probability estimation. The method defines intermediate failure events
for a sequence of threshold levels and being the final level. This ensures that the intermediate failure events are nested as . The failure probability can then be expressed as
Thus, this method requires sampling from the conditional events , and the efficiency of this sampling is pivotal to the success of subset simulation. Markov Chain Monte Carlo (MCMC) methods provide efficient solutions to this problem [34]. Note, that the cannot be determined in advance, but are found adaptively by specifying an intermediate failure probability . A typical choice is which yields efficient subset simulation results, see [3].
Here, we compare our fused importance sampling approach for failure probability estimation to a direct application of subset simulation to the full model. We follow the recent MCMC implementation for subset simulation of [34]. Table 5 lists the computational results that include the number of levels that subset simulation needed to arrive at the failure probability estimate, the samples at each level (user defined), the failure probability estimate, and the overall number of samples needed (not known beforehand). All results were averaged over ten independent runs. We also give an approximate coefficient of variation, although we caution that this is not the same coefficient of variation defined in (2), since at the intermediate levels, subset simulation produces correlated samples. Thus, we used an approximated coefficient of variation as suggested in [45, Eq. (19)]. For a thorough discussion of the coefficient of variation estimation within subset simulation we refer to [6, Sec.5.3]. We observe from Table 5 that the coefficient of variation monotonically decreases as more samples are added. To compare our proposed multifidelity fusion method with subset simulation, we first note that the estimate from subset simulation is biased for finite (see [3, Sec.6.3]), whereas our fused estimator is unbiased. Moreover, the numerical results in Table 5 show that the estimated coefficients of variation are about one order of magnitude larger than the coefficients of variation we reported in Figure 3, right. From a computational cost perspective, the estimator with 20,000 samples in subset simulation produces an approximated coefficient of variation of whereas our fused estimator produces a coefficient of variation of for the same number of highfidelity model evaluations. Thus, the fused estimator outperforms subset simulation in this particular example. In sum, our method can successfully take advantage of cheaper lowfidelity methods to get accurate estimators, while the subset simulation method works directly with the full model and therefore does not have access to cheaper surrogate model information.
samples  samples each level  No of levels  failure Prob.  estimated C.o.V. 

2000  500  4  
4000  800  5  
4000  1000  4  
6000  1500  4  
10000  2000  5  
20000  4000  5 
5 Failure probability estimation related to a free plane jet
We apply the proposed fusion of estimators to quantify the influence of uncertain parameters on the amount of turbulent mixing produced by a free plane jet.
This is a challenging problem, since it involves an expensivetoevaluate model for which the naive computation of low probabilities requires thousands of hours of computation. We reduce this number significantly with our multifidelity importance sampling framework via fusion of estimators.
The remainder of this section is organized as follows. Section 5.1 introduces the free plane jet, followed by details of the model and its governing equations in Section 5.2. The uncertain parameters and quantity of interest are defined in Section 5.3. The lowfidelity surrogate models used in this investigation are discussed in Section 5.4. Finally, the results for multifidelity fusion of small probability estimators are presented in Section 5.5.
5.1 Largescale application: Free plane jet
Free turbulent jets are prototypical flows believed to represent the dynamics in many engineering applications, such as combustion and propulsion. As such, free jet flows are the subject of several experimental [18, 17, 23] and numerical investigations [44, 39, 41, 21, 22] and constitute an important benchmark for turbulent flows.
Our expensivetoevaluate model of a free plane jet is based on the twodimensional incompressible Reynoldsaveraged NavierStokes (RANS) equations, complemented by the turbulence model. Although a RANS model does not resolve all relevant turbulent features of the flow, it represents a challenging largescale application for the computation of small probabilities. We use this model to investigate the influence of five uncertain parameters on the amount of turbulent mixing produced by the jet. We quantify turbulent mixing using a relatively simple metric: the integral jet width. One of the uncertain parameters is the Reynolds number at the inlet of the jet, which is assumed to vary from 5,000 to 15,000. The other four uncertain parameters correspond to coefficients of the turbulence model and its boundary condition, as detailed in Section 5.3. Figure 4 shows a flow field typical of the cases considered here.
5.2 Modeling and governing equations
We consider a free plane jet in conditions similar to the ones reported in [21, 22]. Namely, the flow exits a rectangular nozzle into quiescent surroundings with a prescribed tophat velocity profile and turbulence intensity. The nozzle has width , and is infinite along the spanwise direction. The main difference between the free plane jet we considered here and the one described in [21, 22] is the Reynolds number at the exit nozzle. Here the Reynolds number varies between 5,000 and 15,000.
Our simulation model computes the flow in a rectangular domain located at a distance downstream from the exit of the jet nozzle, as illustrated in Figure 5. By doing so, modeling the conditions at the exit plane of the jet nozzle is avoided. Instead, direct numerical simulation data are used to define inlet conditions at the surface .
The dynamics are modeled with the incompressible Reynoldsaveraged NavierStokes equations, complemented by the turbulence model [25]:
(29)  
(30)  
(31)  
(32) 
where denotes the velocity vector, denotes pressure, is the density, is the kinematic viscosity, and is the strain rate tensor given by
In the turbulence model, denotes the turbulent kinetic energy, denotes the turbulent dissipation, and denotes the turbulent kinematic viscosity, defined as
(33) 
The coefficients^{1}^{1}1We use and here as model coefficients, which is typical notation in fluids community. These are only used in this section, and throughout the paper ’s are variances. , , , , in (31)–(33) are either considered as uncertain parameters, or are functions of uncertain parameters, as detailed in Section 5.3.
At the inlet surface Dirichlet boundary conditions are imposed. Data obtained by the direct numerical simulation described in [22] (Reynolds number 10,000) are used to determine reference inlet profiles for velocity, , and for turbulent kinetic energy, . Inlet conditions are allowed to vary by defining a velocity intensity () scale, which is applied to the reference profiles. Turbulent dissipation at the inlet is estimated by assuming a mixing length model. Thus, the boundary conditions at the inlet surface are given by
where denotes the mixing length parameter.
At the symmetry axis surface, , noflux boundary conditions are imposed through a combination of Dirichlet and Neumann conditions of the form
Finally, at the surface “farfield” conditions that allow the entrainment of air around the jet are imposed through weak Dirichlet conditions, as detailed in [43].
The complete model includes additional features that make it more amenable to numerical discretization. The most delicate issue in the solution of the RANS model is the possible loss of positivity of the turbulence variables. To avoid this issue, we introduce an appropriately mollified (and thus smoothly differentiable) max function to ensure positivity of and . In addition, if inflow is detected at any point on the farfield boundary, the boundary condition is switched from Neumann to Dirichlet by means of a suitably mollified indicator of the inflow region. Finally, we stabilize the discrete equations using a strongly consistent stabilization technique (Galerkin Least Squares, GLS, stabilization) to address the convectiondominated nature of the RANS equations. The complete formulation is shown in [43].
The model equations described above are solved numerically using a finite element discretization. The discretization is implemented in FEniCS [1] by specifying the weak form of the residual, including the GLS stabilization and mollified versions of the positivity constraints on and and the switching boundary condition on the outflow boundary. To solve the nonlinear system of equations that arise from the finite element discretization, we employ a damped Newton method. The bilinear form of the state Jacobian operator is computed using FEniCS’s symbolic differentiation capabilities. Finally, we use pseudotime continuation to guarantee global convergence of the Newton method to a physically stable solution (if such solution exists) [20]. The finite element solver is detailed in [43].
5.3 Uncertain parameters and quantity of interest
In this investigation five uncertain parameters are considered: velocity intensity at inlet^{2}^{2}2Since we keep other physical parameters constant, by varying the velocity intensity we effectively change the Reynolds number. (), mixing length at inlet (), and the turbulence model coefficients , , and :
The parameter domain is , and the nominal distribution of parameters is uniform in .
The other two coefficients of the turbulence model, and , are also uncertain but do not vary independently. According to Dunn et al. [12], empirical evidence suggests that is related to by
In addition, as noted in [12, 16], the loglaw implies that must follow from
where is the von Kárman constant.
The quantity of interest is the integral jet width measured at :
(34) 
where . Figure 6 illustrates a typical solution behavior for this turbulent jet by plotting contours of the turbulent kinetic energy for selected samples in .
5.4 Simplifiedphysics surrogate models
We consider four surrogate models to represent the dynamics of the free plane jet flow. The models are based on two distinct computational grids (fine and coarse), and on two representations of turbulence effects. The fine computational grid contains 10,000 elements and 5,151 nodes, while the coarse grid contains 2,500 elements and 1,326 nodes. Furthermore, the models are based either on the complete turbulence model described in the previous section, or on a prescribed turbulent viscosity field.
In the latter case, the turbulent viscosity field is estimated by a linear interpolation based on 243 conditions that span the input parameter space on a uniform grid (3 points along each of the 5 dimensions). At each of these 243 conditions, the turbulent