Solving Stochastic Inverse Problems using Sigma-Algebras on Contour Maps
We compute approximate solutions to inverse problems for determining parameters in differential equation models with stochastic data on output quantities. The formulation of the problem and modeling framework define a solution as a probability measure on the parameter domain for a given algebra. In the case where the number of output quantities is less than the number of parameters, the inverse of the map from parameters to data defines a type of generalized contour map. The approximate contour maps define a geometric structure on events in the algebra for the parameter domain. We develop and analyze an inherently non-intrusive method of sampling the parameter domain and events in the given algebra to approximate the probability measure. We use results from stochastic geometry for point processes to prove convergence of a random sample based approximation method. We define a numerical algebra on which we compute probabilities and derive computable estimates for the error in the probability measure. We present numerical results to illustrate the various sources of error for a model of fluid flow past a cylinder.
There are a seemingly unlimited number of proposed methods for solving forward and inverse problems of interest in the scientific community. The computational model considered and the framework assumed in formulating the forward and inverse problems play a critical role in defining the solution methodology. In this work, we consider a set-approximation method for approximating solutions to a stochastic inverse problem for deterministic models formulated within a measure-theoretic framework . The solutions we seek are non-parametrically defined probability measures on the model parameter domain defined by uncertain model inputs, e.g. viscosity of a fluid, initial or boundary conditions, and/or domain parameters such as locations of holes.
The models we consider are deterministic physics-based models for which a limited number of physical observations of the solution are available. We call the observations quantities of interest (QoI) and are particularly interested in the situation where there are a smaller number of QoI than model parameters. In this case, solution of the deterministic inverse problem is complicated by the fact that the map defined by inverting the QoI is a set-valued map. This is often referred to as ill-posedness. Common methods such as regularization effectively alter the QoI map so that it has a well-posed deterministic inverse albeit with an altered geometric structure. Introducing stochasticity in the form of probability measures on inputs and/or outputs also fails to address this fundamental geometric issue111Statistical inverse problems where the QoI map is replaced by a statistical map constitute an entirely different framework for formulating inverse problems, see [7, 2] for more detailed relations to statistical and Bayesian inference problems.. The method we use is based on a computational measure-theoretic approach that fully exploits the set-valued inverses directly preserving specific geometric information contained in the map from inputs to outputs [2, 6, 5, 7]. Approximate solutions are defined using simple function approximations to the density of the unique probability measure on the parameter domain and given algebra.
There are two common steps in computing any simple function approximation to a measurable function such as the density (i.e. Radon-Nikodym derivative) of a probability measure:
identify measurable sets in a given algebra partitioning the domain of the function;
assign a nominal function value on each of these sets.
These steps are explicitly used in classical examples applying the Lebesgue Monotone Convergence Theorem. First, a specific sequence of partitions on the range of a given measurable real-valued non-negative function is identified. The preimages of the sets in each partition are used to identify the measurable sets in the -algebra on the domain of the function. This completes the first step and requires evaluation of the inverse map to identify inverse sets. The second step is trivial in this case where the infimum of each output set is assigned to its preimage to define the simple function approximations for each partition.
A probability measure with density on the parameter domain and original algebra is a solution to the stochastic inverse problem if the QoI maps this parameter density to the output density. Unique solutions to the stochastic inverse problem exist in a contour algebra  embedded in the original algebra. Moreover, combining the Disintegration Theorem with an Ansatz describing probabilities on the contour events proves unique solutions to the stochastic inverse problem exist in the original parameter algebra . As shown in [2, 7], the key step in constructing any simple function approximation to the density solving the stochastic inverse problem is the approximation of the QoI contour map in the given algebra. The contour map contains all the geometric information available by the QoI map. It is then straightforward to assign probability values to any approximate partitioning sets of the contour events in the original algebra [2, 6, 5, 7] such that a sequence converges using either the Monotone Convergence Theorem  or the Lebesgue Dominated Convergence Theorem .
The formulations of the forward and inverse problems within this measure-theoretic framework, the existence and uniqueness of solutions, and the approximations of solutions by deterministic approximation techniques have been recently studied [2, 6, 5, 7]. The focus of this work is the development and analysis of sampling techniques within a parameter domain to implicitly define approximating sets of both a contour algebra and the original algebra. In this way we create a new computational algorithm for the probability measure based on the samples and their approximation properties. We prove that a counting measure approximation converges to the exact probability measure solving the stochastic inverse problem as the number of samples increases. Finally, for the computed probabilities of events, we derive computable error estimates and bounds for the effects of using a finite number of samples and numerical errors in the evaluation of the model.
The outline of this paper is as follows. In Section LABEL:S:Formulation, we summarize the measure-theoretic framework and formulation of the stochastic inverse problem. In Section LABEL:S:Algorithms, we summarize the theoretical results involving algebras on contour maps and solutions to the stochastic inverse problem. We then present computational algorithms including a point-sample based algorithm. We describe the implicit set-approximation properties defined by the sampling of points in parameter space and define a counting measure approximation to the inverse probability measure. In Section LABEL:S:Theory, we prove that the counting measure approximation converges to the correct probability measure for arbitrary events in the original algebra. In Section LABEL:S:Errors, we summarize the types and sources of error in the computed probabilities and how we estimate and/or bound these errors. In Section LABEL:S:Numerics, we provide numerical results for a Navier-Stokes flow past a cylinder with uncertain viscosity and cylinder location.
2 Formulating stochastic inverse problems in a measure-theoretic framework
Let denote the parameter domain for the model. We assume is a metric space whose metric is specified as part of the model. Thus, there is an induced topology on and Borel algebra . We define the measure space using the measure induced by the metric . Let denote the vector-valued QoI map where , , is defined by functionals of solution to the model.
We say that the component maps of -dimensional locally differentiable vector-valued map are geometrically distinct (GD) if the Jacobian of has full rank at every point in .
In practice, since we deal with measurable events of non-zero measure, we can weaken local differentiabillity of and/or the full rank of the Jacobian of to hold at a.e. point in 222Since we use probability densities, which are functions, to compute probabilities of events, any set of zero measure has no affect on the final solutions and can simply be ignored or removed from in the formulation of the inverse problem for convenience.. Assuming is locally differentiable with GD components, defines a “push-forward” measure on , where is the Borel algebra on , and for any , . This yields the measure space . The measures and are volume measures not probability measures. We often assume the probability measures are absolutely continuous with respect to the volume measures in which cases we typically refer to the associated probability density functions (i.e. the Radon-Nikodym derivatives).
The inverse sensitivity problem of interest is the direct inversion of the forward stochastic sensitivity analysis problem333The forward problem is standard: given probability measure on measurable space compute the probability measure on measurable space .. In other words, the stochastic inverse problem is to determine probability measures on measurable space such that the push-forward probability measures match given probability measures on . Fundamentally, this requires a description of the mapping between sets in the various algebras and .
When , the Implicit Function Theorem guarantees that for any fixed point in , there exists some union of locally continuous -dimensional manifolds defining the set-valued inverse. We use the notation and definitions of [7, 2, 6, 5] and define a generalized contour as any such set-valued inverse of the map . We assume that is defined by , i.e. is the range of the QoI map containing all possible physically observable data that can be mapped to from . Thus, we can decompose into a union of generalized contours in 1-1 correspondence with the points in . In other words, the map defines a type of generalized contour map on . There exists a (possibly piecewise-defined) continuous -dimensional indexing manifold, called a transverse parameterization, in defining a bijection between and the generalized contours, see  for more details. As way of analogy, a specific transverse parameterization is like a particular path of ascent up a mountain that a hiker plots out using a contour map of elevations.
We note that the contour map obtained from defines an equivalence class relation on . Denote this space of equivalence classes as where each point in corresponds to a set of points in . We obtain a measure space , where the algebra on can be generated using inverse images of a collection of Borel sets in and the volume measure induced by .
We exploit the equivalence relation to define the induced algebra on that can be generated from the set of equivalence classes for a set of generating events in . For , is a proper subset of . We define as the contour algebra on and call events in contour events. The geometric structure of on is fully exploited to define the contour events in the contour -algebra . We emphasize that events in can be uniquely determined by events in .
The goal is to define solutions to the stochastic inverse problem on , e.g. in terms of a density defined on points in not points on a contour map. Below, we prove there exists a algebra defined on the generalized contour map on equivalent to the algebra on . We then use this algebra equivalency in the Disintegration Theorems that follow to prove the existence and uniqueness of solutions to the stochastic inverse problem.
3 Solving the stochastic inverse problem using sigma-algebras on contour maps
A common starting point for constructing a measure on a algebra is to first define an algebra used to generate the algebra of interest. The next steps can vary, but typically we define a premeasure on the algebra which induces an outer-measure in a natural way. Finally, employ Carathéodory’s Theorem to extend the outer-measure uniquely to a complete measure on the generated algebra. We can define an algebra with elementary families of elementary sets for which a clear notion of measure is defined, e.g. by using a partitioning of a space where sets are either disjoint or possibly intersect only at boundaries such as hypercubes or generalized rectangles in . We will exploit such generating sets in much of the theory below.
In Section LABEL:S:SetDisintegration, we relate algebras on to algebras on the generalized contour maps. In Section LABEL:S:TheoryReview, we summarize the theory of the existence and uniqueness of solutions to the stochastic inverse problem with respect to these algebras. In Section LABEL:S:ComputingDetails, we describe the approximation of solutions.
3.1 Sigma-Algebras on COntour Maps (SACOM)
The generalized contours define a type of contour map on . Given the geometry of these generalized contours and a topology, we can define algebras on contour maps (SACOM). In Section LABEL:S:TheoryReview below, we show that the solution to the stochastic inverse problem can be solved uniquely using SACOM. The goal is to relate the algebras on (e.g. or ) to SACOM.
To motivate what follows, consider the case where the map is linear. In this case, the -dimensional generalized contours are hyperplanes. For simplicity, let , and assume we use the typical Euclidean distance metric and induced topology giving the Borel algebra on the product space . Let denote the projection map from to the equivalence class containing . For arbitrary , let denote the associated generalized contour as a subset of . Changing coordinates with respect to the directions orthogonal and parallel to the hyperplanes gives 444See Propositions 1.5 in  identifying and from the change of coordinates and .. Identifying such a product decomposition is useful for applying Fubini’s theorem in order to integrate certain functions using iterated integrals. The key is that the domain of integration, which is a set in the original algebra, can be “cut” into products of lower-dimensional sets in the component measure space algebras. In this case, such sets are “cut” into products of sets both along and transverse to the generalized contours. Below, we extend this for the general case of nonlinear generalized contours indexed by a nonlinear manifold defining a transverse parameterization.
Let denote a algebra on a given . Below, we describe two natural choices for for each defining a family of measurable spaces indexed by .
Any space contains the so-called trivial algebra consisting of the empty set or the entire space. Therefore, there exists at least one algebra on each , given by . Alternatively, using the induced topology on each , we define Borel algebras . These are equivalently defined by 555See Lemma 6.2.4 of .. In other words, the Borel algebra on can be defined by the restriction of Borel measurable sets in to the given generalized contour. These two particular choices for prove useful in the Disintegration Theorems of Section LABEL:S:TheoryReview for showing existence and uniqueness of solutions to the stochastic inverse problem.
As in many of the classical proofs of theorems involving product algebras, we make explicit use of the generating sets for each algebra. Since each measurable space we consider is metrizable, we assume the generating sets are taken from the implied topology. For each , let denote any family of subsets of generating . A standard measure-theory result states that is the unique smallest algebra generated by for each . Similarly, we let denote a generating set for the Borel algebra .
We define the transverse product algebra as the smallest algebra on generated by where and for all . We denote this algebra by .
Proof: This follows immediately from the definition of contour algebra using the equivalence relation determined by and the generating sets for and .
Before we prove this theorem, we note that the inclusion is not at all obvious since may be uncountable and algebras are closed under countable unions. Since we assume is locally differentiable, there exists a bijection between and a transverse parameterization in [2, 7]. Since the transverse parameterization is a piecewise-continuous -dimensional manifold in separable space , it follows that is separable. This is used below. Also, we let for any . We use this notation elsewhere as convenient.
Proof: Let denote any Borel generating set of and consider any . Restrict the map to , then by the assumption of GD component maps, there exists an -dimensional (piecewise) continuous manifold defining a transverse parameterization on . Moreover, the transverse parameterization is a Borel set in by assumption of the local differentiability of and is in 1-1 correspondence with . Thus, we have666See Theorem 6.9.7 . that the Borel set defines the unique Borel set . Restricting any generating sets of to a generalized contour defines a generating set for . From Definition LABEL:d:tp_algebra, we have that . It follows that 777See Lemma 1.1 of ..
Suppose we are given Borel generating set and family of Borel generating sets defined by all the open sets on these spaces. For all , any Borel set on is also Borel in . By assumption, is Borel in for any . If is countable, then for any . Suppose is uncountable. Let denote a countable dense set in and the collection of open balls in with rational radius and center in . Then every open set in is a countable union of open balls in . In other words, is countably generated by , so the set is countably generated by unions of Borel sets and is itself Borel. Thus, , so .
3.2 Existence and uniqueness of solutions
To solve the stochastic inverse problem, we use a form of the Disintegration Theorem [7, 9, 11], which is a powerful theoretical tool for rigorous definition of conditional probabilities. We focus on convenient forms of the theorem written for probabilities and direct the interested reader to  for a more thorough presentation. Using Theorem LABEL:thm:ContourSigmaAlgebraDecomp and following the steps of , we have,
Theorem 3.3 (Disintegration of Contour Map Probabilities)
Let be a measurable space. Assume that is a probability measure on . There exists a family of conditional probability measures on giving the disintegration,
It is clear we can compute the induced probability measure on defined by , and is defined by a probability density on with respect to ,
From Theorem LABEL:thm:ProbabililtyDisintegrationTrivial, the conditional probability measures for are given by if and otherwise. This proves the following
The stochastic inverse problem has a unique solution on .
The primary goal is not a probability measure on , but a probability measure on . Since we will often work with densities given as Radon-Nikodym derivatives (i.e. ), we find useful the following corollary of the Disintegration Theorem ,
Corollary 3.1 (Disintegration of the Volume Measure)
There exists a family of volume measures on such that for any ,
Using Theorem LABEL:thm:OriginalSigmaAlgebraDecomp, we arrive at the more common form of the Disintegration Theorem for probability measures given by
Let be a measurable space. Assume that is a probability measure on . There exists a family of conditional probability measures on giving the disintegration,
In the above Disintegration Theorems, the conditional probability measures can be extended to or by extending the measures for all for each . Theorem LABEL:thm:ProbabililtyDisintegration guarantees that any probability measure on can be decomposed into a form involving a probability measure on uniquely defined by and probability measures on each measurable generalized contour space defined by the conditional probabilities . It is not obvious what these conditional probability measures are in Theorem LABEL:thm:ProbabililtyDisintegration. Clearly, any conditional probability measures on can not be determined by observations of . This motivates the adoption of an Ansatz in which the probability measures along are specified.
Ansatz: For all , assume a probability measure is given on .
Under the Ansatz, the stochastic inverse problem has a unique solution on .
We prefer a specific Ansatz that can be interpreted as prescribing a “non-probabilistic” or “non-preferential” weighting determined by the disintegration of volume measure to compute probabilities of events inside of a contour event888This is valid when , which is generally satisfied in practice using compact domains. If this is not the case, implying certain physical parameters are unbounded, then we may employ techniques similar to those used in Bayesian analysis for “non-informative” priors.. However, the approximation method and resulting Algorithm LABEL:Alg_old (summarized below) can be easily modified for any Ansatz (see  for more details).
Standard Choice for Ansatz:
If we do not specify an Ansatz, the stochastic inverse problem can only be solved on where there is only one possible probability measure on each . From Theorems LABEL:thm:ContourSigmaAlgebraDecomp and LABEL:thm:OriginalSigmaAlgebraDecomp, it is evident that the choice of algebra on dictates the SACOM and whether or not we must adopt an Ansatz for the solution.
3.3 Approximating solutions
In , we describe several approximation issues that need to be addressed in any practical computation of . As described below, the fundamental approximation issues of any measure are the approximation of events in the various algebras. The numerical approximation issues involve error in numerical evaluation of the model and computation of probability measures on some collection of events. Here, we discuss the event approximations. In Section LABEL:S:Errors, we analyze the effect of errors on the approximate probability measures computed from the approximating sets.
Brief review of approximations
Repeated application of the Lebesgue Dominated Convergence Theorem yields the following,
Given probability measure , absolutely continuous with respect to , on with density and event , there exists a sequence of approximations using simple function approximations to probability densities and requiring only calculations of volumes in that converges to as .
The proof of Theorem LABEL:thm:original_approx details approximating probability densities and (i.e. the Radon-Nikodym derivatives of the probability measures) in order to apply the Lebesgue Dominated Convergence Theorem and outlines the computational measure-theoretic Algorithm LABEL:Alg_old (see  for more details). Unsurprisingly, the first step is the generation of partitions and such that (1) arbitrary events in and are approximated by unions of sets from these partitions, and (2) simple function approximations to the densities can be computed on these partitions. These partitions can be chosen from an algebra of elementary sets.
Since densities are functions, we can use convex sets with continuous boundaries to define a finite partition and associated simple function approximation of sufficient accuracy in the -norm999See Theorem 2.41 in .. The local differentiability of implies that is a measurable event in both and with boundary of zero -measure. Such events can have their measures approximated within any prescribed tolerance by a finite set of hypercubes101010See Theorem 2.40 of .. This defines an obvious choice for the cells partitioning . Once the probabilities of cells have been approximated by Algorithm LABEL:Alg_old, we may estimate for arbitrary event in any of the usual measure-theoretic ways such as using inner or outer sums, averages of inner and outer sums, or direct integration of the simple function approximation .
Suppose instead that we wish to solve the stochastic inverse problem on instead of . In this case, we can approximate the solution using the same approximate solution on computed using Algorithm LABEL:Alg_old since . In other words, we may use the same approximate finite generating set for as an approximate finite generating set for .
Unless otherwise stated, we let denote a partition of used to approximate events in and define the simple function approximation , where . In other words, the probability on each set is defined by the expected value of on . This is a natural choice equivalent to using the Integral Mean Value Theorem to define a normalized approximation, i.e. is a density with this choice of .
A sample based approximation and counting measure
Previous implementations of Algorithm LABEL:Alg_old used regular grids so that was defined as a set of generalized rectangles or hypercubes [2, 6, 5]. While such sets may be refined and the measure of unions of these sets made to approximate any Borel set arbitrarily well , there are obvious practical difficulties with using such an approximating set in high dimensions. Here, we consider an alternative where the sets are defined implicitly by a finite collection of samples in satisfying some particular properties detailed below.
For a fixed number of samples , there is a Voronoi tessellation of denoted by defined by
Here, denotes a metric on used to define the Voronoi cells111111The metric is possibly different from the metric that induces the volume measure and Borel -algebra . Common choices for are the standard Euclidean 2-norm or 1-norm..
Clearly the goal is to define a set of samples implicitly defining a tessellation of that is useful for approximating events, i.e. measurable sets, in . We often approximate events in a given algebra by a Voronoi coverage.
We say that is the Voronoi coverage of and its volume defined by
A rule for defining any samples is called -consistent if
Clearly any -consistent rule implies that any generalized rectangle or finite union of generalized rectangles may be approximated arbitrarily well by a Voronoi coverage defined by a finite number of samples. Replacing with a Voronoi tesselation generated from a -consistent rule in Algorithm LABEL:Alg_old, we define the following,
Let denote the counting probability measure (or simply counting measure on defined by
Note that is simply a measure of point masses at each sample from . A technical point is that we may use a counting measure to estimate the probability of any . However, we only prove convergence to the exact probability when .
It is possible to have compact Borel sets with boundaries of non-zero measure. We describe in Section LABEL:S:non-intrusive-converge how to approximate the probability of such sets. In practical computations, we are not concerned with such sets with fractal boundaries. Furthermore, such sets do not enter into any of the computational algorithms. This is due to the assumption of local differentiability of the map , which implies that all generalized contours, and subsequently, all induced regions of generalized contours defined by for any Borel set are sets with boundaries of zero -measure.
A Monte Carlo approach
There are many possible -consistent rules that we may choose. For example, we may define a sequence of uniformly refined grids of which the grid points are numbered and sequentially sampled in a serpentine manner. This choice produces Voronoi cells that are generalized rectangles (for certain choices of ) and has been used previously, e.g. see [2, 6, 5]. While this produces small Voronoi cells everywhere in the domain, the approximation properties of the cells have negative dependence on the dimension of the parameter space in terms of requiring large to achieve reasonable -approximations by the Voronoi coverage.
We may choose to normalize the volume measure (if possible) and generate N i.i.d. samples from this distribution corresponding to a “uniform” sampling density (see Section LABEL:S:StochGeom below). This amounts to a standard Monte Carlo (MC) approach for sampling on . Furthermore, we may use the standard MC approximation that all Voronoi cells have the same volume which greatly reduces the computational demands of the algorithm, see Algorithm LABEL:Alg_MC.
The counting measure definition is consistent with the MC approach of Algorithm LABEL:Alg_MC to estimate the probabilities of individual Voronoi cells inside of a particular induced region of generalized contours by counting the number of samples within this event. Specifically, in the algorithm, counts the number of Voronoi cells we associate within the induced generalized region of contours defined by .
Standard MC sampling algorithms tend to produce random samples that are clustered resulting in large variations in the sizes of the corresponding Voronoi cells. This may not provide good set-approximation properties for small sample sizes. Sampling points equidistant along a space filling curve or using Latin hypercube sampling are other options. Or, we may simply choose the samples deterministically based on other information we know or we want to impose a certain resolution on the problem, e.g. we may wish to obtain a fine resolution of some particular subset of for some design purposes. Some pseudo-MCMC sampling might be best. The underlying theory only requires that the samples be generated from a -consistent rule. Exploring all possibilities is beyond the scope of this paper and we limit presentation to the basic theory of convergence of counting measures for -consistent rules with special attention paid to random sampling rules satisfying criterion as discussed in Section LABEL:S:StochGeom below.
4 Convergence of counting measures
4.1 General theory
The proof of convergence is based upon classical measure-theoretic arguments using simple function approximations defined using some approximate generating set to a Borel algebra. We use a triangle inequality to separate the error in probability into two parts involving the approximation properties of the Voronoi coverages to generalized rectangles and the approximation properties of the generalized rectangles to other Borel-measurable sets.
For any and , there exists simple function approximation and generalized rectangles in partitioning compact with .
Proof: Let be given. There exists a partition of into generalized rectangles, , such that the error of and is less than . For each , which have boundaries of zero -measure, there is a finite number of generalized rectangles partitioning such that . Moreover, for any there is a finite number of generalized rectangles, in such that . The conclusion follows from Theorem LABEL:thm:original_approx where is constructed as in Algorithm LABEL:Alg_old with given by sufficiently fine generalized rectangles partitioning from which the finite number of generalized rectangles used above may be constructed by finite unions.
Given probability measure , absolutely continuous with respect to , on with density and a -consistent rule for generating samples, then for any event with , there exists a sequence of approximations of simple functions and counting measures such that a.s. as .
Proof: Let be given and denote a sequence of samples in computed from the -consistent rule. We have by the assumptions of that for any fixed with continuous boundary that the induced region of generalized contours has boundary with zero -measure. By assumption of , there exists a sequence of partitions such that for any fixed but arbitrary and we have that as where denotes the Voronoi coverage of . By construction of , the definition of , and the fact that the Borel sets on can be generated by the collection of all generalized rectangles, there is a sufficiently fine partition of by generalized rectangles and an such that for all .
By Lemma LABEL:lem:help, there is a sufficiently fine partition of by generalized rectangles such that . Let denote the number of generalized rectangles needed to generate all of the above generalized rectangles by finite unions and intersections. Then by a standard triangle inequality and Lemma LABEL:lem:help, we have that for partition as above defining simple function approximation to that for all for each . The result extends to any with by applying the Lebesgue Dominated Convergence Theorem to a sequence of constructed from a sequence of partitions of generalized rectangles in and using Theorem LABEL:thm:Lambda_inverse.
We can approximate the probability of general using the counting measure with error less than any . For such an , we use Lemma LABEL:lem:help with to define a finite union of generalized rectangles in approximating . Let denote this finite union of generalized rectangles. Then has the property that and Theorem LABEL:thm:counting_converges applies to the set . Let denote the Voronoi coverage of with samples in satisfying for any . Then using and a triangle inequality, the result follows. This proves the following
Given probability measure , absolutely continuous with respect to , on with density and a -consistent rule for generating samples. For any event and , there exists simple function , finite samples defining counting measure , and a set with such that .
4.2 Convergence of the Monte Carlo approximation
We present a result that can be viewed as a strong law of large numbers for Voronoi tessellations. We direct the interested reader to [19, 17] for a more thorough exposition on the subject within the context of stochastic geometry and point processes.
A sampling distribution, denoted by , is any distribution absolutely continuous with respect to such that the corresponding sampling density for almost every .
The following Lemma from  has been modified to conform to our notation. To our knowledge,  was the first to summarize and explicitly prove the fundamental elements of Lemma LABEL:lem:SLLN. Below, we modify and expand the proof presented in  to highlight details of the convergence and uniqueness that are of practical use, e.g., when considering design of adaptive sampling procedures to determine more accurate Voronoi coverages of specific events  or in understanding issues related to a posteriori error estimates described below in Section LABEL:S:Errors.
Given sampling density almost everywhere on , if such that , then almost surely
Proof. Suppose , and . Let denote the Minkowski sum of and restricted to domain , where is the unit ball in . Let . Since , for all there exists such that and .
Let and . We now prove that almost surely. For any , let denote the finite set of hypercubes in with edge-length partitioning compact . By Kninchine’s strong law of large numbers,
Suppose does not converge to zero almost surely, then there exists such that infinitely often. By construction, for any , and there exists a fixed such that for all . Choosing a set of hypercubes intersecting with edge-length sufficiently small (and positive) such that the maximum distance in the -metric between any two points in the hypercubes is less than yields a contradiction since almost surely for each of these hypercubes, so there must exist a sample with such that its distance is less than from . Thus almost surely. It follows that, for all , there exists a Voronoi coverage of such that for all almost surely, which implies almost surely.
This implies that any random sampling scheme with sampling density produces a -consistent rule with probability . Theorem LABEL:thm:counting_converges then applies to counting measures computed using random sampling as long as the sampling density is positive.
4.3 Choosing sampling densities
If on any set such that , then any sequence of counting measures evaluated on will result in a sequence of zeros. If shares any volume with a region of induced generalized contours of non-zero probability, then according to Theorem LABEL:thm:ProbabililtyDisintegration121212With any Ansatz such that the generalized contours in have non-zero probability., and the sequence of approximate probabilities fails to converge. This highlights the importance of the sampling density being strictly positive on .
In Algorithm LABEL:Alg_MC, a natural choice is to let , i.e. we sample from according to its volume measure. However, we may choose a non-uniform intensity to improve resolution or accuracy of the approximate probability measure for certain induced regions of generalized contours, e.g. in locations where is largest. Such ideas of computational accuracy with a finite number of samples and adaptive sampling are the subjects of future work.
We note that in Algorithm LABEL:Alg_MC, we use a standard MC approximation that can be interpreted as approximating the volumes of Voronoi cells to be equal. We may instead opt to more accurately estimate the volumes for each , in which case the algorithm more closely resembles Algorithm LABEL:Alg_old where we explicitly use volumes of the partition of to obtain . Similarly, if there exists such that and for , then the standard MC approximation that all are equal for all no longer applies. The necessary modification is that approximation to the ratio of volumes of replace division by in the last for-loop131313In this case, the counting vector can be entirely removed from Algorithm LABEL:Alg_MC.. Similar modifications are required for a non-standard choice of the Ansatz. Below, we assume such modifications to Algorithm LABEL:Alg_MC are made as necessary to correctly compute the counting measure .
5 Sources of error in the counting measure
There are two types of error in approximating probability measure using Algorithm LABEL:Alg_MC: stochastic and deterministic. Stochastic error arises from using finite samples to implicitly define a Voronoi tessellation of . Deterministic error arises from the numerical evaluation of the map for each of the samples. Each of these errors affects in different ways. The deterministic error may lead to misidentification of induced regions of generalized contours (possibly leading to incorrect component values of the vectors and in Algorithm LABEL:Alg_MC). We let denote the computed counting measure using numerical computations approximating the map , where denotes some numerical discretization parameter. The choice of samples defines the set-approximation properties of events in in terms of unions and complements of Voronoi cells (the first step in Algorithm LABEL:Alg_MC). In fact, the ability to estimate induced regions of generalized contours is limited by the choice of samples, so it is the stochastic source of error that we examine first.
Below, denotes the exact probability measure on for the standard Ansatz141414The error analysis can be altered for a non-standard Ansatz where volume measures and counting are replaced by the probabilistic weighting of the contour events. given a fixed simple-function approximation to . In practice, when repeated experiments and the subsequent measurements are used to empirically determine , such an approximation would be determined by the binning of the relevant QoI. In general, we decompose the error in the computed probability measure as
Here, term is the error in approximating the probability of event by the counting measure . Term is the error in using the numerical map to identify induced regions of generalized contours in Algorithm LABEL:Alg_MC. The events belong to a algebra that is a subset of the original algebra .
5.1 Stochastic set-approximation and numerical algebras
We find convenient the following,
For a given set of samples on , , let denote the numerical algebra generated by the implicitly defined Voronoi tessellation 151515Borel algebras on are commonly generated using generating sets defined, for example, by all half-open intervals. Analogously, is equivalently generated by the set of all Voronoi coverages for all . This particular definition uses the minimal generating set..
By the definition or Borel sets and Voronoi cells, is a proper subset of for any finite , i.e. any event in is -measurable.
Definition LABEL:def:numerical_algebra is illustrative of the fundamental types of events we may compute probabilities for using Algorithm LABEL:Alg_MC with no set-approximation errors, i.e. with no term error. Specifically, for a fixed number of samples, a fixed approximation to , and given exact map in Algorithm LABEL:Alg_MC, we may compute exactly for all . From Lemma LABEL:lem:SLLN, we have that any can be sufficiently approximated in -measure by a Voronoi coverage almost surely, i.e. we are almost surely guaranteed to find an element of for some that approximates to the desired accuracy. This was exploited in the proof of Theorem LABEL:thm:counting_converges that the counting measures converge. Here, we focus on the events of non-zero -measure and the errors in the computed probabilities of such events.
Below, we show that for certain events in that there are, in a probabilistic sense, optimal and unique approximations to these events in .
If and , then there exists at least one , ), such that is in one and only one of the measurable sets or .
Proof: By definition, the algebra is constructed by the closure of complements and countable unions of the finite set of individual Voronoi cells . Thus, by construction, any sets of non-zero -measure in must contain at least the interior of a single Voronoi cell. Since , the conclusion follows.
Suppose is fixed and choose any such that there exists at least one . There exists unique (up to a set of zero -measure) such that for any (absolutely continuous with respect to ), for any simple function approximation to , and for any choice of Ansatz, we have .
Proof: Fix an arbitrary containing at least one sample and let denote the set of indices such that for and . By construction, is a Voronoi coverage of and by definition of counting measure , for any probability measure absolutely continuous with respect to and any simple function approximation to , we have that . Lemma LABEL:lemma:nonempty gives uniqueness (see the Appendix for details).
Theorem LABEL:thm:stoch_error guarantees that for any event in the original -algebra of which we want to compute its probability that there is a “best” approximation in the numerical -algebra as long as the original event in contains at least one sample in . While we use probabilities to prove this, the probabilities on are independent of the choices of samples on (the probability measure on outputs exists independently of our choice of approximating events in the parameter domain). In other words, Theorem LABEL:thm:stoch_error is a statement about set-approximation not the approximation of probabilities since is exact.
The set-approximation error can be determined a priori to any computation of the model (and bounded in -measure independently of any probability measure) either globally or for a certain collection of events of physical importance or interest in . For the sake of simplicity and also for practical reasons, we consider any event of interest to belong to , i.e. in Eq. (LABEL:eq:P_error_decompose) we set . In other words, we assume that all the events we wish to compute the probabilities of can be represented exactly (or with negligible error) in . This prevents the problem of trying to compute probabilities of complicated events that exist even in standard Borel algebras, which are usually not of interest in a realistic setting where events are often defined by hypercubes or balls. Moreover, all such events have the property that .
Recall that denotes the partition on