Multifidelity probability estimation via fusion of estimators


Boris Kramer (Department of Aeronautics & Astronautics, Massachusetts Institute of Technology, Cambridge, MA; bokramer@mit.edu), Alexandre Noll Marques (Department of Aeronautics & Astronautics, Massachusetts Institute of Technology, Cambridge, MA; noll@mit.edu), Benjamin Peherstorfer (Courant Institute of Mathematical Sciences, New York University, NY; pehersto@cims.nyu.edu), Umberto Villa (Department of Electrical and Systems Engineering, Washington University in St. Louis, MO; uvilla@wustl.edu), Karen Willcox (Oden Institute for Computational Engineering & Sciences, The University of Texas at Austin, TX; kwillcox@ices.utexas.edu)
April 22, 2019
Abstract

This paper develops a multifidelity method that enables estimation of failure probabilities for expensive-to-evaluate models via information fusion and importance sampling. The presented general fusion method combines multiple probability estimators with the goal of variance reduction. We use low-fidelity models to derive biasing densities for importance sampling and then fuse the importance sampling estimators such that the fused multifidelity estimator is unbiased and has mean-squared error lower than or equal to that of any of the importance sampling estimators alone. By fusing all available estimators, the method circumvents the challenging problem of selecting the best biasing density and using only that density for sampling. A rigorous analysis shows that the fused estimator is optimal in the sense that it has minimal variance amongst all possible combinations of the estimators. The asymptotic behavior of the proposed method is demonstrated on a convection-diffusion-reaction partial differential equation model for which samples can be afforded. To illustrate the proposed method at scale, we consider a model of a free plane jet and quantify how uncertainties at the flow inlet propagate to a quantity of interest related to turbulent mixing. Compared to an importance sampling estimator that uses the high-fidelity model alone, our multifidelity estimator reduces the required CPU time by 65% while achieving a similar coefficient of variation.

Keywords: multifidelity modeling, uncertainty quantification, information fusion, importance sampling, reduced-order modeling, failure probability estimation, PDEs, turbulent jet

1 Introduction

This paper considers multifidelity estimation of failure probabilities for large-scale applications with expensive-to-evaluate models. Failure probabilities are required in, e.g., reliable engineering design and risk analysis. Yet failure probability estimation with expensive-to-evaluate nonlinear models is computationally challenging due to the large number of Monte Carlo samples needed for low-variance estimates.

Efficient failure probability estimation methods aim to reduce the number of samples at which the expensive model is evaluated, e.g., by exploiting variance-reducing sampling strategies, multifidelity/multilevel estimation methods, or sequential sampling approaches. Variance reduction can be obtained through importance sampling [33], which allows for order-of-magnitude reductions in the number of samples needed to reliably estimate a small probability. However, importance sampling relies on having a good biasing distribution, which in turn requires insight into the system. Surrogate models can provide such insight at much lower computational cost. Multifidelity approaches (see [38] for a review) that use surrogates for failure probability estimation via sampling have seen great interest recently [26, 8, 27, 13, 42, 35, 14], but require that the user select a good importance sampling density. Multifidelity methods that avoid the selection of a single biasing density and instead use a suite of surrogate models to generate importance sampling densities were proposed in [36, 37, 32]. Nevertheless, this framework requires all knowledge about the small probability event to be available in the form of biasing densities, and is therefore only applicable to importance sampling estimators. Multilevel Monte Carlo [15, 2] methods use a hierarchy of approximations to the high-fidelity model in the sampling scheme. However, those model hierarchies have to satisfy certain error decay criteria, an assumption we do not make here. Subset simulation [3, 34] and line search [40, 11] can be applied directly to the high-fidelity model and are therefore black-box in nature.

In this work, in addition to the computationally expensive model, we also have information about the system in the form of surrogate models, analytical models, expert elicitation, and reduced models. In other settings where such a variety of information is available, information fusion has been used to combine multi-source probabilistic information into a single estimator; see [9, 31, 28]. Moreover, combining information from multiple models and sources via a weighted multifidelity method can lead to efficient data assimilation strategies [30].

Here, we propose a new approach to enable small probability estimation for large-scale, computationally expensive models that draws from prior work in information fusion, importance sampling, and multifidelity modeling. We use information fusion in combination with multifidelity importance-sampling-based failure probability estimators, where in addition to the variance reduction from importance sampling, we obtain further variance reduction through information fusion. The proposed multifidelity framework uses the available surrogates to compute multiple unbiased failure probability estimators. We then combine them optimally into a new unbiased estimator that has minimal variance amongst all possible linear combinations of those estimators. The method therefore avoids the selection of the lowest variance biasing density to be used for sampling. Selecting the density that leads to the lowest variance in the failure probability estimator would require additional information, and not even error estimates on the surrogate model would suffice. Thus, we circumvent this step and optimally use all information available to us in form of probability estimators.

This paper is structured as follows: In Section 2 we illustrate the challenges in small failure probability computation and cover the necessary background material for multifidelity importance sampling used herein. Section 3 details our proposed approach of information fusion, importance sampling and multifidelity modeling. We then present in Section 4 a moderately expensive convection-diffusion-reaction test case, where we illustrate the asymptotic behavior of our approach. Section 5 discusses a turbulent jet model and demonstrates the computational efficiency of our proposed methods for this computationally expensive model. We close with conclusions in Section 6.

2 Small probability events and importance sampling estimators

We are interested in computing events with small probabilities, e.g., failure events, where the system fails to meet critical constraints. Section 2.1 describes small probability events, Section 2.2 introduces importance sampling and Section 2.3 briefly summarizes multifidelity importance sampling.

2.1 Small probability events

Let be a sample space which, together with a sigma algebra and probability measure, defines a probability space. Define a -dimensional random variable with probability density , and let be a realization of . Let be an expensive-to-evaluate model of high fidelity with corresponding -dimensional quantity of interest . Let denote a limit state function that defines failure of the system. If , then is a configuration where the system fails. This defines a failure set

Define the indicator function via

The standard Monte Carlo estimator of the failure probability

uses realizations of the random variable and estimates

(1)

In the special case of small probabilities, standard Monte Carlo may be infeasible due to the large number of samples needed to obtain good estimators. Since failure probabilities are generally small, most realizations lie outside the failure domain , and only a small fraction of the samples falls in the failure region. The coefficient of variation (also called relative root-mean-squared error) of the estimator is given by

(2)

Thus, to obtain estimators with a small coefficient of variation, a large number of samples is necessary. For instance, if the small probability is and if we want we would need samples via standard Monte Carlo approaches. This challenge is amplified by the presence of an expensive-to-evaluate model, such as the model of a free plane jet in Section 5.
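To make the sampling burden concrete, the following minimal sketch (Python with NumPy; the helper names limit_state and nominal_sampler are hypothetical, not from this paper) computes a standard Monte Carlo failure probability estimate and the plug-in coefficient of variation, assuming the common convention that a negative limit-state value indicates failure.

```python
import numpy as np

def mc_failure_probability(limit_state, nominal_sampler, n_samples, rng):
    """Standard Monte Carlo estimate of a failure probability (illustrative sketch).

    limit_state(z) < 0 indicates failure; nominal_sampler(n, rng) draws n
    realizations of the random input from its nominal density.
    """
    z = nominal_sampler(n_samples, rng)              # (n, d) array of input samples
    indicator = (limit_state(z) < 0).astype(float)   # 1 on failure, 0 otherwise
    p_hat = indicator.mean()
    # Coefficient of variation of the estimator: sqrt((1 - p) / (n * p)),
    # estimated here by plugging in p_hat.
    cov = np.sqrt((1.0 - p_hat) / (n_samples * p_hat)) if p_hat > 0 else np.inf
    return p_hat, cov

# Toy usage: failure when a standard normal variable exceeds 3 (p is about 1.35e-3).
rng = np.random.default_rng(0)
sampler = lambda n, rng: rng.standard_normal((n, 1))
g = lambda z: 3.0 - z[:, 0]                          # g < 0 exactly when z > 3
p_hat, cov = mc_failure_probability(g, sampler, 10**6, rng)
print(p_hat, cov)
```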

2.2 Importance sampling

Importance sampling achieves variance reduction by using realizations of a random variable with probability density . This random variable is chosen such that its probability density function has higher mass (compared to the nominal density ) in the region of the event of interest. For a general introduction to importance sampling, see [33, Sec.9]. Define the support , and let . Then

(3)

is well defined, where is the likelihood ratio—in the context of importance sampling also called importance weight. The importance-sampling estimate of the failure probability then draws realizations of the random variable with density and evaluates

(4)

The variance of the importance sampling estimator is

(5)

where

(6)

If , and by using (3), one can show that the importance sampling estimator is an unbiased estimator of the failure probability, i.e.,

The importance sampling estimator has mean and variance , and by the central limit theorem converges in distribution to the normal random variable . Constructing a good biasing density that leads to small is challenging [33]. We next introduce low-fidelity surrogate models, which are then used to construct biasing densities.
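As a complement to the formulas above, here is a minimal sketch of an importance sampling estimator of the failure probability together with an estimate of its variance; the density handles nominal_pdf and biasing_pdf and the sampler biasing_sampler are hypothetical placeholders, and failure is again taken as a negative limit-state value.

```python
import numpy as np

def is_failure_probability(limit_state, biasing_sampler, nominal_pdf, biasing_pdf,
                           n_samples, rng):
    """Importance sampling estimate of the failure probability (illustrative sketch).

    Samples are drawn from the biasing density; each failure indicator is weighted
    by the likelihood ratio p(z)/q(z), the importance weight.
    """
    z = biasing_sampler(n_samples, rng)
    weights = nominal_pdf(z) / biasing_pdf(z)        # likelihood ratios
    weighted_ind = (limit_state(z) < 0) * weights    # weighted failure indicators
    p_hat = weighted_ind.mean()
    var_hat = weighted_ind.var(ddof=1) / n_samples   # estimated variance of the estimator
    return p_hat, var_hat
```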

2.3 Multifidelity Importance Sampling (MFIS)

Recall that by we denote an expensive-to-evaluate model of high fidelity with corresponding quantity of interest . Let surrogates

of lower fidelities be available, which are cheaper to evaluate than the high-fidelity model . We do not assume any information about the accuracy of the with respect to the high-fidelity model . Sections 4.2 and 5.4 detail the specific surrogate models used for the respective applications.

We use the MFIS method (see [35] for details) to obtain estimators of the failure probability. First, MFIS evaluates the surrogate models at samples to obtain a surrogate-model-specific failure set . Second, MFIS computes a biasing density by fitting a distribution in the form of a Gaussian mixture model to the parameters in the failure set. If no failed samples are found by the surrogate model, i.e., if , then we set the biasing density to be the nominal density. This leads to biasing densities from which we get importance sampling estimators

(7)

for . The variance of the importance sampling estimator is given by (5) with and , with being the asymptotic variance from (6) with .
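A sketch of the MFIS biasing-density construction described above might look as follows; it uses scikit-learn's GaussianMixture as a stand-in for the Gaussian mixture fit, the helpers surrogate_limit_state and nominal_sampler are hypothetical, and the code falls back to the nominal density when the surrogate finds no failed samples.

```python
import numpy as np
from sklearn.mixture import GaussianMixture

def build_biasing_density(surrogate_limit_state, nominal_sampler,
                          n_surrogate_samples, n_components, rng):
    """MFIS-style construction of a biasing density from a cheap surrogate (sketch).

    Returns a fitted GaussianMixture, or None if the surrogate finds no failures
    (in which case the nominal density is used as the biasing density).
    """
    z = nominal_sampler(n_surrogate_samples, rng)
    failed = z[surrogate_limit_state(z) < 0]         # surrogate-model failure set
    if failed.shape[0] == 0:
        return None                                  # fall back to the nominal density
    gmm = GaussianMixture(n_components=n_components, random_state=0)
    gmm.fit(failed)                                  # fit mixture to failed parameters
    return gmm
```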

3 Fusion of multifidelity estimators

In many practical situations, a range of probability estimators is available, for instance in the form of MFIS estimators derived from different biasing densities, analytical models, or estimators derived from expert elicitation [31]. If one knew a priori which estimator had the lowest variance, then a good strategy would be to sample only from that estimator. However, determining a priori which estimator has the lowest variance is a formidable task, and one has to draw samples to assess it. In this section, we present our new approach that combines all available estimators in an optimal fashion by solving the following problem.

Problem 1.

Given unbiased estimators, with expected value , i.e. , find an estimator with expected value of the form

(8)

such that it attains minimal variance amongst all estimators of the form (8). That is, find the optimal weights such that

(9)

The fused estimator approach allows us to still use the information coming from the other (high-variance) estimators, whose samples would otherwise go to waste. Moreover, with the proposed method we can estimate small probabilities for expensive-to-evaluate models by exploiting a variety of surrogates. We derive expressions for the mean and variance of the fused estimator in Section 3.1. In Section 3.2, we derive the optimal weights for the fused estimator. Section 3.3 then discusses the special case of uncorrelated estimators. Our proposed algorithm is discussed in Section 3.4, followed by a brief Section 3.5 that discusses measures of convergence of the estimators.

3.1 Mean and variance of fused estimator

We start with the observation that if the weights of the fused estimator sum to one, then the fused estimator is unbiased:

Let the estimators have corresponding variances . To compute the variance of the fused estimator we have to consider covariances between the individual estimators. Define the Pearson product-moment correlation coefficient as

(10)

where . We also define the symmetric, positive semi-definite covariance matrix as:

(11)

It is worth noting that if the estimators are independent, then is diagonal. The variance of the fused estimator from (8) is

which can be written in vector form as

(12)

In the following section, we provide an explicit formula for the optimal weights in the general case of (possibly) correlated estimators ; in Section 3.3 we discuss the case of independent estimators, such as those constructed with the MFIS method.

3.2 Optimizing the weights for minimum-variance estimate

Problem (9) seeks the optimal such that the variance in (12) is minimized and remains unbiased. In this section, we show that such weights exist and are unique, and we present a closed-form solution, provided that the covariance matrix is invertible. This is summarized in the following result.

Proposition 1.

Let be the vector of probability estimators and assume that is not singular. Define as a column-vector of length . The optimization problem (9) has the unique solution

That is, the minimal variance unbiased estimator is such that

Proof.

We have seen above that if and only if . Define the cost function by using equation (12). Therefore, the optimization problem (9) can be written as the quadratic program

(13)

Letting denote the Lagrangian cost function associated to (13), the optimality conditions are and . This optimality system is written as

(14)

For invertible , the unique weights to this quadratic program are then obtained by

(15)

and the expression for the variance follows by inserting these weights into (13). The estimator is obtained by inserting the weights into (8). ∎

The weights can be expressed explicitly in terms of the components of the covariance matrix as

(16)

Note that the weights are inversely proportional to the variance of the individual estimators, and the weight depends on the covariance between the estimators and . Also note that if are correlated, some weights may be negative, while for a diagonal all weights are strictly positive. In the next section, we take a closer look at the uncorrelated case.
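A compact sketch of the closed-form solution in Proposition 1 is given below: the optimal weights are proportional to the row sums of the inverse covariance matrix, normalized to sum to one, and the fused variance follows directly. The covariance matrix is assumed invertible, as in the proposition.

```python
import numpy as np

def fuse_estimators(p_hats, cov_matrix):
    """Minimum-variance unbiased linear fusion of unbiased estimators (sketch).

    p_hats:     length-k vector of individual probability estimates
    cov_matrix: k x k covariance matrix of the estimators (assumed invertible)
    Returns the optimal weights, the fused estimate, and the fused variance.
    """
    ones = np.ones(len(p_hats))
    sigma_inv_ones = np.linalg.solve(cov_matrix, ones)   # Sigma^{-1} 1
    weights = sigma_inv_ones / (ones @ sigma_inv_ones)   # weights sum to one
    p_fused = weights @ p_hats
    var_fused = 1.0 / (ones @ sigma_inv_ones)            # w^T Sigma w at the optimum
    return weights, p_fused, var_fused
```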

3.3 The special case of uncorrelated estimators

In the situation where all estimators are uncorrelated, we recover the classical result of the inverse variance-weighted mean [29]. As a corollary from Proposition 1 we get the following result.

Corollary 1.

Consider the setting from Proposition 1, and let be diagonal. Then the unique solution to the optimization problem (9) is given by

(17)

A few observations about this special case are in order (a short numerical sketch follows the list):

  1. The optimal coefficients of the combined estimator are inversely proportional to the asymptotic variance of the corresponding estimator . To reduce the variance via a weighted combination of estimators, smaller weights are assigned to estimators with larger variance.

  2. If one variance is small compared to all other ones, say , then so that . The estimators with large variance cannot reduce the variance of the fused estimator much more.

  3. If all estimators have equal variance, , then so that . Hence, combining the estimators reduces the variance by a factor of .

  4. Since , it follows from both equations in (17) that

    (18)

    Consequently, we are guaranteed to reduce the variance in by combining all estimators in the optimal way described above.
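As a small numerical illustration of the inverse-variance weighting in (17), consider three independent estimators with hypothetical variances 1e-6, 4e-6, and 4e-6; the fused variance drops below the smallest individual variance, consistent with (18).

```python
import numpy as np

# Hypothetical per-estimator variances (diagonal covariance case).
variances = np.array([1e-6, 4e-6, 4e-6])
weights = (1.0 / variances) / np.sum(1.0 / variances)   # inverse-variance weights
fused_variance = 1.0 / np.sum(1.0 / variances)          # variance of the fused estimator

print(weights)          # approximately [0.667, 0.167, 0.167]
print(fused_variance)   # approximately 6.7e-7 < min(variances) = 1e-6
```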

3.4 Fused multifidelity importance sampling: Algorithm and Analysis

We now use the general fusion framework to obtain a failure probability estimator. Thus, we solve Problem 1 in the context of importance-sampling-based failure probability estimators so that . Our proposed method optimally fuses the MFIS estimators from (7), such that

(19)

with the optimal weights chosen as in Proposition 1 and . Since estimator is computed from samples, uses samples.

We now discuss how compares to a single importance sampling estimator with samples. Consider the estimator that uses samples drawn from a single biasing density for . This estimator would require selecting the lowest-variance biasing density a priori, a formidable task. The next result compares and , and gives a criterion under which the former has lower variance than the latter.

Proposition 2.

Let estimators with samples be given. Let , and be a biasing density that is used to derive an IS estimator with samples. If

then the variance of the fused estimator in (19) with samples is smaller than the variance of the estimator with biasing density with samples, i.e.,

Proof.

Set , so that all estimators use the same number of samples. According to equation (17),

as well as , so that

The importance sampling estimate (7) requires evaluating the high-fidelity model at samples from the biasing density. While not required, we use to distribute the computational load evenly. The extension of Proposition 2 to the case of a different number of samples for each estimator is straightforward.

The computational procedure is summarized in Algorithm 1. Here, we denote sampling-based estimates as , which are realizations of the estimator .

Input: Nominal distribution , biasing distributions , # of evaluations , limit state function .
Output: Failure probability estimate and variance estimate
1:  for   do {Loop over all surrogates}
2:     Draw independent realizations from with density and compute
(20)
3:     Compute the sample variances
(21)
4:  end for
5:  Define the vector
6:  Let
7:  Compute the fused estimate as in (13):
(22)
Algorithm 1 Computing failure probability estimate via fused importance sampling
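The following sketch strings Algorithm 1 together end to end (Python; all helper names are hypothetical). Because the MFIS estimators are built from independent samples of distinct biasing densities, their covariance matrix is diagonal, so the fusion step reduces to the inverse-variance weighting of Corollary 1.

```python
import numpy as np

def fused_mfis_estimate(limit_state, nominal_pdf, biasing_densities,
                        n_per_estimator, rng):
    """Sketch of Algorithm 1: fused importance sampling over several biasing densities.

    Each element of biasing_densities is a (sampler, pdf) pair; the resulting
    estimators are statistically independent, so the covariance matrix is diagonal.
    """
    k = len(biasing_densities)
    p_hats = np.empty(k)
    variances = np.empty(k)
    for i, (sampler, pdf) in enumerate(biasing_densities):   # loop over all surrogates
        z = sampler(n_per_estimator, rng)
        weighted_ind = (limit_state(z) < 0) * (nominal_pdf(z) / pdf(z))
        p_hats[i] = weighted_ind.mean()                       # per-density IS estimate
        variances[i] = weighted_ind.var(ddof=1) / n_per_estimator
    weights = (1.0 / variances) / np.sum(1.0 / variances)     # inverse-variance weights
    p_fused = weights @ p_hats                                # fused estimate
    var_fused = 1.0 / np.sum(1.0 / variances)                 # fused variance estimate
    return p_fused, var_fused, weights
```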

3.5 Error measures and practical computation

The failure probability estimate is computed as in (20) and the sample variance as in (21). The root-mean-squared-error (RMSE) of the estimate is

(23)

and the relative root-mean-squared error, or coefficient of variation, is computed as

(24)

4 Test case: Convection-diffusion-reaction

We first consider a PDE model whose solution can be numerically evaluated with moderate computational cost. With this model, we demonstrate the asymptotic behavior of our method because we can afford to sample the high-fidelity model times, which will be too costly for the model in Section 5. The test problem is the convection-diffusion-reaction PDE introduced in Section 4.1. Its discretizations and reduced-order models are described in Section 4.2. Numerical results are presented in Section 4.3.

4.1 Convection-diffusion-reaction PDE model

We consider a simplified model of a premixed combustion flame at constant and uniform pressure, and follow the notation and set-up in [7, Sec.3]. The model includes a one-step reaction of the species

in the presence of an additional non-reactive species, nitrogen. The physical combustor domain is in length (-direction), and in height (-direction), as shown in Figure 1.

Figure 1: Set-up of combustor, with details of the boundary conditions in Table 1.

The velocity field is set to be constant in the positive direction, and divergence free. The molecular diffusivity is modeled as constant, equal and uniform for all species and temperature. The PDE model is given by

(25)

where the state is comprised of the components , with the being the mass fractions of the species (fuel, oxidizer, product), and denoting the temperature. Referring to Figure 1, we have that is the Dirichlet part of the boundary and combines the top, bottom and right boundary, where Neumann conditions are prescribed. In sum, ; the boundary conditions are imposed as given in Table 1. The nonlinear reaction term is of Arrhenius type [10], and modeled as

(26)
(27)

The parameters of the model are defined in Table 2. The uncertain parameters are the pre-exponential factor and the activation energy of the Arrhenius model. The domain for these parameters is denoted as . In particular, we have that

Boundary Temperature Species
K
K
Table 1: Boundary conditions for the combustion model from [7].
quantity physical meaning assumptions value
molecular diffusivity const., equal, uniform
velocity const.
molecular weight const.
molecular weight const.
molecular weight const.
density of mixture const.
univ. gas constant const.
heat of reaction const. K
stochiometric coefficient const. 2
stochiometric coefficient const. 1
stochiometric coefficient const. 2
Table 2: Parameters for the combustion model from [7].

4.2 Discretization and reduced-order models

The model is discretized using a finite difference approximation in two spatial dimensions, with 72 nodes in direction, and 36 nodes in direction, leading to unknowns in the model. The nonlinear system is solved with Newton’s method. Let be the vector with components corresponding to the approximations of the temperature at the grid points. The high-fidelity model (HFM) is and the quantity of interest is the maximum temperature over all grid points:

Reduced-order models provide a powerful framework to obtain surrogates for expensive-to-evaluate models. In the case of nonlinear systems, reduced-order models can be obtained via reduced-basis methods [19], dynamic mode decomposition [24], proper orthogonal decomposition [5], and many others; for a survey, see [4]. Here, we compute reduced-order models for our multifidelity approach via Proper Orthogonal Decomposition and the Discrete Empirical Interpolation Method (DEIM) for an efficient evaluation of the nonlinear term. The training snapshots are generated from solutions to the high-fidelity model on a parameter grid of equally spaced values . The three surrogate models are built from POD basis vectors, and accordingly DEIM interpolation points. The corresponding models are denoted as ROM1, ROM2, ROM3, respectively. We denote by the approximation to the temperature via the th ROM. The surrogate models are the mappings with corresponding quantity of interest denoted as

We refer the reader to [7] for more details on the discretization and ROM construction for this convection-diffusion-reaction model.
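For readers unfamiliar with these ROM ingredients, the sketch below shows how a POD basis of a given size is typically extracted from a snapshot matrix via the SVD, together with the standard greedy selection of DEIM interpolation points; this is generic illustration code, not the specific construction used in [7].

```python
import numpy as np

def pod_basis(snapshots, r):
    """Leading r POD modes of a snapshot matrix (columns are solution snapshots)."""
    u, _, _ = np.linalg.svd(snapshots, full_matrices=False)
    return u[:, :r]

def deim_points(nonlinear_basis):
    """Greedy DEIM interpolation point selection, one point per basis vector."""
    u = nonlinear_basis
    points = [int(np.argmax(np.abs(u[:, 0])))]
    for j in range(1, u.shape[1]):
        # Interpolate the j-th basis vector with the current points and take the
        # location of the largest residual as the next interpolation point.
        c = np.linalg.solve(u[np.ix_(points, range(j))], u[points, j])
        residual = u[:, j] - u[:, :j] @ c
        points.append(int(np.argmax(np.abs(residual))))
    return points
```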

4.3 Results for multifidelity fusion of failure probabilities

We define a failure of the system when the maximum temperature in the combustor exceeds K, so that the limit state function is

(28)

and likewise for the reduced-order models .

To compute the biasing densities, we draw samples from the uniform distribution on , compute surrogate-based solutions, and evaluate the limit state function for those solutions. If the limit state function indicates failure of the system for a solution obtained from the th surrogate model, the corresponding parameter is added to , the failure set computed from the th surrogate model. We compute the biasing densities via MFIS (see Section 2.3) as Gaussian mixture distributions with a single component. Table 3 shows the computational cost in CPU time of computing the biasing distributions from the various ROMs and the HFM. Computing a biasing density using the high-fidelity model with samples costs approximately CPU-hours. Constructing the biasing density via the low-fidelity models ROM2 and ROM3 reduces the computational time by factors of 66 and 58, respectively. Note that ROM1 is the reduced-order model that is cheapest to evaluate, but it is also the least accurate. In our case, ROM1 did not produce any samples in the failure region, even after samples. This is not unexpected, since only two POD modes are not enough to resolve the essential features of this problem. ROM1 is included to demonstrate how the fusion approach can be effective even in the presence of highly inaccurate surrogate models.

ROM1 ROM2 ROM3 HFM
# of samples drawn
# of samples in failure domain 0 13 17 17
time needed N.A. [s] [s] [h]
Table 3: CPU time to generate the biasing densities, and the number of samples in the failure domain.

In Figure 2 we show the quantity of interest, i.e., the maximum temperature. The plots are obtained by generating samples from the nominal distribution (left) and the respective biasing distributions (right), and evaluating the HFM at those samples. Figure 2, left, shows that the typical range of the quantity of interest is between approximately 1200K and 2440K. However, only the events where the quantity of interest is above 2430K are relevant for the failure probability computation. When samples are drawn from the biasing distributions (Figure 2, right), a large portion of the outputs leads to failure of the system. This indicates that the biasing distributions are successful in generating samples in the failure region of the high-fidelity model.

Figure 2: Quantity of interest in [K] of HFM ordered by magnitude versus # of samples , for samples. Left: Samples are from the nominal (uniform) distribution. Right: The parameter samples are drawn from different biasing distributions (biased towards failure above K). This demonstrates that the biasing distributions are good since the outputs are largely above the failure threshold. Here, ROM1 did not have any parameters in the failure domain, and hence defaulted to being the nominal distribution and is therefore not plotted.

Next, we show results for the fused multifidelity estimator with samples and compare it with importance sampling estimators that only use a single biasing density and also samples. The fused estimator is obtained via Algorithm 1 with samples by fusing the three surrogate-model-based importance sampling estimators. For reference purposes, a biasing density is constructed as described above using the HFM with samples. Based on this density, we compute an importance sampling estimate of the failure probability with samples, resulting in .

To assess the quality of the fused estimator , we consider the error measures introduced in Section 3.5. In Figure 3, left, we show the root mean-squared error of the importance sampling estimators as well as the combined estimator . Figure 3, right, shows the coefficient of variation defined in (24) for the estimators. The fused estimator is competitive in RMSE and coefficient of variation with the estimator using the high-fidelity biasing density, but comes at a much lower computational cost.

Note that the fused estimator does not use any high-fidelity information. We plot the high-fidelity estimator only for comparison; the high-fidelity density is not used in our algorithm. Heuristically, we could expect the fused estimator to perform better than the MFIS estimator with high-fidelity-derived biasing density in the following situation. Let the HFM be so expensive that the HF biasing density is built from only a few failure samples, and assume the low-fidelity models are good surrogates, hence able to cheaply explore the failure region. Then the low-fidelity biasing density could be better than the high-fidelity biasing density.

Figure 3: Left: Root mean-squared error from (23); Right: Coefficient of variation as defined in (24) for the convection-diffusion-reaction simulation.

In Table 4 we show the weights for the fused estimator . The fused estimator assigns only a small weight to the estimator which uses biasing density . This was expected: that estimator has large variance because its biasing density is in fact the nominal density, since the ROM1 evaluations did not yield any samples in the failure domain (see Table 3).

0 0 0.005 0.001 0.002 0.005
0.587 0.471 0.331 0.294 0.415 0.742
0.413 0.529 0.664 0.705 0.583 0.253
Table 4: Weights of the fused estimator with samples.

4.4 Comparison to subset simulation methods

To demonstrate the efficiency of our proposed multifidelity method compared to state-of-the-art existing methods in failure probability estimation, we compare our results to subset simulation [3], a widely used method for reliability analysis and failure probability estimation. The method defines intermediate failure events

for a sequence of threshold levels and being the final level. This ensures that the intermediate failure events are nested as . The failure probability can then be expressed as

Thus, this method requires sampling from the conditional events , and the efficiency of this sampling is pivotal to the success of subset simulation. Markov Chain Monte Carlo (MCMC) methods provide efficient solutions to this problem [34]. Note that the cannot be determined in advance; they are found adaptively by specifying an intermediate failure probability . A typical choice is , which yields efficient subset simulation results; see [3].
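For orientation, a schematic subset simulation loop with adaptively chosen intermediate thresholds is sketched below; conditional_sampler is a hypothetical placeholder for the MCMC step of [34], and failure is again taken as a negative limit-state value.

```python
import numpy as np

def subset_simulation(limit_state, nominal_sampler, conditional_sampler,
                      n_per_level, p0, rng, max_levels=20):
    """Schematic subset simulation with adaptive intermediate thresholds (sketch).

    conditional_sampler(seeds, threshold, n, rng) stands in for the MCMC step that
    draws n samples conditioned on the intermediate event, seeded at `seeds`.
    """
    z = nominal_sampler(n_per_level, rng)
    g = limit_state(z)
    p_estimate = 1.0
    for _ in range(max_levels):
        b = np.quantile(g, p0)                        # adaptive intermediate threshold
        if b <= 0.0:                                  # final failure level reached
            return p_estimate * np.mean(g < 0.0)
        in_event = g <= b
        p_estimate *= np.mean(in_event)               # conditional level probability (about p0)
        z = conditional_sampler(z[in_event], b, n_per_level, rng)
        g = limit_state(z)
    return p_estimate * np.mean(g < 0.0)
```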

Here, we compare our fused importance sampling approach for failure probability estimation to a direct application of subset simulation to the full model. We follow the recent MCMC implementation for subset simulation of [34]. Table 5 lists the computational results, including the number of levels that subset simulation needed to arrive at the failure probability estimate, the number of samples at each level (user defined), the failure probability estimate, and the overall number of samples needed (not known beforehand). All results were averaged over ten independent runs. We also give an approximate coefficient of variation, although we caution that this is not the same coefficient of variation defined in (2), since at the intermediate levels subset simulation produces correlated samples. Thus, we used an approximate coefficient of variation as suggested in [45, Eq. (19)]. For a thorough discussion of the coefficient of variation estimation within subset simulation we refer to [6, Sec.5.3]. We observe from Table 5 that the coefficient of variation monotonically decreases as more samples are added.

To compare our proposed multifidelity fusion method with subset simulation, we first note that the estimate from subset simulation is biased for finite (see [3, Sec.6.3]), whereas our fused estimator is unbiased. Moreover, the numerical results in Table 5 show that the estimated coefficients of variation are about one order of magnitude larger than the coefficients of variation we reported in Figure 3, right. From a computational cost perspective, the estimator with 20,000 samples in subset simulation produces an approximate coefficient of variation of whereas our fused estimator produces a coefficient of variation of for the same number of high-fidelity model evaluations. Thus, the fused estimator outperforms subset simulation in this particular example. In sum, our method can successfully take advantage of cheaper low-fidelity models to obtain accurate estimators, while subset simulation works directly with the full model and therefore does not have access to cheaper surrogate-model information.

total # of samples   # of samples per level   # of levels   failure prob. estimate   estimated C.o.V.
2000 500 4
4000 800 5
4000 1000 4
6000 1500 4
10000 2000 5
20000 4000 5
Table 5: Results for subset simulation to compute failure probabilities for the convection-diffusion-reaction problem.

5 Failure probability estimation related to a free plane jet

We apply the proposed fusion of estimators to quantify the influence of uncertain parameters on the amount of turbulent mixing produced by a free plane jet.

This is a challenging problem, since it involves an expensive-to-evaluate model for which the naive computation of low probabilities requires thousands of hours of computation. We reduce this number significantly with our multifidelity importance sampling framework via fusion of estimators.

The remainder of this section is organized as follows. Section 5.1 introduces the free plane jet, followed by details of the model and its governing equations in Section 5.2. The uncertain parameters and quantity of interest are defined in Section 5.3. The low-fidelity surrogate models used in this investigation are discussed in Section 5.4. Finally, the results for multifidelity fusion of small probability estimators are presented in Section 5.5.

5.1 Large-scale application: Free plane jet

Free turbulent jets are prototypical flows believed to represent the dynamics in many engineering applications, such as combustion and propulsion. As such, free jet flows are the subject of several experimental [18, 17, 23] and numerical investigations [44, 39, 41, 21, 22] and constitute an important benchmark for turbulent flows.

Our expensive-to-evaluate model of a free plane jet is based on the two-dimensional incompressible Reynolds-averaged Navier-Stokes (RANS) equations, complemented by the turbulence model. Although a RANS model does not resolve all relevant turbulent features of the flow, it represents a challenging large-scale application for the computation of small probabilities. We use this model to investigate the influence of five uncertain parameters on the amount of turbulent mixing produced by the jet. We quantify turbulent mixing using a relatively simple metric: the integral jet width. One of the uncertain parameters is the Reynolds number at the inlet of the jet, which is assumed to vary from 5,000 to 15,000. The other four uncertain parameters correspond to coefficients of the turbulence model and its boundary condition, as detailed in Section 5.3. Figure 4 shows a flow field typical of the cases considered here.

(a) Contours of turbulent kinetic energy.
(b) Streamlines colored by the intensity of the velocity.
Figure 4: Flow field of a two-dimensional plane jet at Reynolds number 10,000, computed with the standard coefficients of the turbulence model.

5.2 Modeling and governing equations

We consider a free plane jet in conditions similar to the ones reported in [21, 22]. Namely, the flow exits a rectangular nozzle into quiescent surroundings with a prescribed top-hat velocity profile and turbulence intensity. The nozzle has width , and is infinite along the span-wise direction. The main difference between the free plane jet we considered here and the one described in [21, 22] is the Reynolds number at the exit nozzle. Here the Reynolds number varies between 5,000 and 15,000.

Our simulation model computes the flow in a rectangular domain located at a distance downstream from the exit of the jet nozzle, as illustrated in Figure 5. By doing so, we avoid modeling the conditions at the exit plane of the jet nozzle. Instead, direct numerical simulation data are used to define the inlet conditions at the surface .

Figure 5: Illustration of the free plane jet setup. The diameter of the nozzle is denoted by . The simulation domain is composed of a box situated at a distance downstream of the nozzle exit.

The dynamics are modeled with the incompressible Reynolds-averaged Navier-Stokes equations, complemented by the turbulence model [25]:

(29)
(30)
(31)
(32)

where denotes the velocity vector, denotes pressure, is the density, is the kinematic viscosity, and is the strain rate tensor given by

In the turbulence model, denotes the turbulent kinetic energy, denotes the turbulent dissipation, and denotes the turbulent kinematic viscosity, defined as

(33)

The coefficients , , , , in (31)–(33) are either considered as uncertain parameters or are functions of uncertain parameters, as detailed in Section 5.3. (Footnote: we use and here as model coefficients, which is typical notation in the fluids community; these symbols are used only in this section, while throughout the rest of the paper 's denote variances.)

At the inlet surface Dirichlet boundary conditions are imposed. Data obtained by the direct numerical simulation described in [22] (Reynolds number 10,000) are used to determine reference inlet profiles for velocity, , and for turbulent kinetic energy, . Inlet conditions are allowed to vary by defining a velocity intensity () scale, which is applied to the reference profiles. Turbulent dissipation at the inlet is estimated by assuming a mixing length model. Thus, the boundary conditions at the inlet surface are given by

where denotes the mixing length parameter.

At the symmetry axis surface, , no-flux boundary conditions are imposed through a combination of Dirichlet and Neumann conditions of the form

Finally, at the surface “far-field” conditions that allow the entrainment of air around the jet are imposed through weak Dirichlet conditions, as detailed in [43].

The complete model includes additional features that make it more amenable to numerical discretization. The most delicate issue in the solution of the RANS model is the possible loss of positivity of the turbulence variables. To avoid this issue, we introduce an appropriately mollified (and thus smoothly differentiable) max function to ensure positivity of and . In addition, if inflow is detected at any point on the far-field boundary, the boundary condition is switched from Neumann to Dirichlet by means of a suitably mollified indicator of the inflow region. Finally, we stabilize the discrete equations using a strongly consistent stabilization technique (Galerkin Least Squares, GLS, stabilization) to address the convection-dominated nature of the RANS equations. The complete formulation is shown in [43].

The model equations described above are solved numerically using a finite element discretization. The discretization is implemented in FEniCS [1] by specifying the weak form of the residual, including the GLS stabilization and mollified versions of the positivity constraints on and and the switching boundary condition on the outflow boundary. To solve the nonlinear system of equations that arise from the finite element discretization, we employ a damped Newton method. The bilinear form of the state Jacobian operator is computed using FEniCS’s symbolic differentiation capabilities. Finally, we use pseudo-time continuation to guarantee global convergence of the Newton method to a physically stable solution (if such solution exists) [20]. The finite element solver is detailed in [43].

5.3 Uncertain parameters and quantity of interest

In this investigation five uncertain parameters are considered: velocity intensity at the inlet () (footnote: since we keep the other physical parameters constant, varying the velocity intensity effectively changes the Reynolds number), mixing length at the inlet (), and the turbulence model coefficients , , and :

The parameter domain is , and the nominal distribution of parameters is uniform in .

The other two coefficients of the turbulence model, and , are also uncertain but do not vary independently. According to Dunn et al. [12], empirical evidence suggests that is related to by

In addition, as noted in [12, 16], the log-law implies that must follow from

where is the von Kármán constant.

Figure 6: Samples of the flow solution computed at different points (a)–(d) of the input parameter space . The plots show contours of turbulent kinetic energy and velocity streamlines. The white bars denote the integral jet width associated with each case.

The quantity of interest is the integral jet width measured at :

(34)

where . Figure 6 illustrates a typical solution behavior for this turbulent jet by plotting contours of the turbulent kinetic energy for selected samples in .

5.4 Simplified-physics surrogate models

We consider four surrogate models to represent the dynamics of the free plane jet flow. The models are based on two distinct computational grids (fine and coarse), and on two representations of turbulence effects. The fine computational grid contains 10,000 elements and 5,151 nodes, while the coarse grid contains 2,500 elements and 1,326 nodes. Furthermore, the models are based either on the complete turbulence model described in the previous section, or on a prescribed turbulent viscosity field.

In the latter case, the turbulent viscosity field is estimated by a linear interpolation based on 243 conditions that span the input parameter space on a uniform grid (3 points along each of the 5 dimensions). At each of these 243 conditions, the turbulent