A hybrid HDMR for mixed multiscale finite element methods with application to flows in random porous media ^{†}^{†}thanks: L. Jiang and J. D. Moulton acknowledge funding by the Department of Energy at Los Alamos National Laboratory under contracts DEAC5206NA25396 and the DOE Office of Science Advanced Computing Research (ASCR) program in Applied Mathematical Sciences.
Abstract
Stochastic modeling has become a popular approach to quantify uncertainty in flows through heterogeneous porous media. In this approach the uncertainty in the heterogeneous structure of material properties is often parametrized by a highdimensional random variable, leading to a family of deterministic models. The numerical treatment of this stochastic model becomes very challenging as the dimension of the parameter space increases. To efficiently tackle the highdimensionality, we propose a hybrid highdimensional model representation (HDMR) technique, through which the highdimensional stochastic model is decomposed into a moderatedimensional stochastic model in the most active random subspace, and a few onedimensional stochastic models. The derived lowdimensional stochastic models are solved by incorporating the sparsegrid stochastic collocation method with the proposed hybrid HDMR. In addition, the properties of porous media, such as permeability, often display heterogeneous structure across multiple spatial scales. To treat this heterogeneity we use a mixed multiscale finite element method (MMsFEM). To capture the nonlocal spatial features (i.e. channelized structures) of the porous media and the important effects of random variables, we can hierarchically incorporate the global information individually from each of the random parameters. This significantly enhances the accuracy of the multiscale simulation. Thus, the synergy of the hybrid HDMR and the MMsFEM reduces the dimension of the flow model in both the stochastic and physical spaces, and hence, significantly decreases the computational complexity. We analyze the proposed hybrid HDMR technique and the derived stochastic MMsFEM. Numerical experiments are carried out for twophase flows in random porous media to demonstrate the efficiency and accuracy of the proposed hybrid HDMR with MMsFEM.
Key words. Hybrid highdimensional model representation, Sparse grid collocation method, Mixed multiscale finite element method, Approximate global information
AMS subject classifications. 65N30, 65N15, 65C20
1 Introduction
The modeling of dynamic flow and transport processes in geologic porous media plays a significant role in the management of natural resources, such as oil reservoirs and water aquifers. These porous media are often created by complex geological processes and may contain materials with widely varying abilities to transmit fluids. Thus, multiscale phenomena are inherent in these applications and must be captured accurately in the model. In addition, due to measurement errors and limited knowledge of the material properties and external forcing, modeling of subsurface flow and transport often uses random fields to represent model inputs (e.g., permeability). Then the system’s behavior can be accurately predicted by efficiently simulating the stochastic multiscale model. The existence of heterogeneity at multiple scales combined with this uncertainty brings significant challenges to the development of efficient algorithms for the simulation of the dynamic processes in random porous media. As a result, the interest in developing stochastic multiscale methods for stochastic subsurface models has steadily grown in recent years.
The existence of uncertainty in random porous media is an important challenge for simulations. One way to describe the uncertainty is to model the random porous media as a random field which satisfies certain statistical correlations. This naturally results in describing the flow and transport problem using stochastic partial differential equations (SPDE). Over the last few decades several numerical methods have been developed for solving SPDEs. The existing stochastic numerical approaches roughly fall into two classes: (1) nonintrusive schemes and (2) intrusive schemes. In nonintrusive schemes, the existing deterministic solvers are used without any modification to solve a (large) set of deterministic problems, which correspond to a set of samples from the random space. This leads to a set of outputs, which are used to recover the desired statistical quantities. Monte Carlo methods [14] and stochastic collocation methods fall into this category. In contrast, intrusive schemes treat the approximation of the probabilistic dependence directly in the SPDE. Thus, discretization leads to a coupled set of equations that are fundamentally different from discretizations of the original deterministic model, and hence, raises new challenges for nonlinear solvers. Typical examples from this class are stochastic Galerkin [19, 13, 34] and perturbation methods [48]. A broad survey of these methods can be found in [29]. Among these methods, stochastic collocation methods have gained significant attention in the research community because they share the fast convergence property of the stochastic finite element methods while having the decoupled nature of Monte Carlo methods. A stochastic collocation method consists of two components: a set of collocation points and an interpolation operator. The collocation points are selected based on specific objectives, and choosing different objectives or constraints leads to different methods. Two popular collocation methods are the fulltensor product method [5] and the Smolyak sparsegrid method [6, 36, 40]. Sparsegrid stochastic collocation is known to have the same asymptotic accuracy as fulltensor product collocation, but uses significantly fewer collocation points. Unfortunately, the number of collocation points required in the sparsegrid method still increases dramatically for highdimensional stochastic problems. Hence, application of this method is still limited to problems in a moderatedimensional stochastic space.
Due to small correlation lengths in the covariance structure, the uncertainty in characterizations of porous media are often parametrized by a large number of random variables, e.g., using a truncated Fourier type expansion for random fields. Consequently, the model’s input is defined in a highdimensional random parameter space. Sampling in highdimensional random spaces is computationally demanding and very timeconsuming. If the sampling of random space is conducted in the full random space through the stochastic collocation method, then the number of samples increases sharply with respect to the dimension of the random space. This is the notorious curse of dimensionality, and is a fundamental problem facing stochastic approximations in highdimensional stochastic spaces. Dimension reduction techniques such as proper orthogonal decomposition, principal component analysis, reduced basis methods [39], novel random field expansion techniques [27, 45, 46] and highdimensional model representations (HDMRs) [38], strive to address this problem. Among these methods, the HDMR is one of the most promising methods for efficient stochastic dimension reduction in subsurface applications, and has received considerable attention in recent years [32, 33, 35, 47]. The HDMR was originally developed in the application of chemical models by [38], but was later cast as a general set of quantitative assessment and analysis tools for capturing the highdimensional relationship between model inputs and model outputs. HDMR has been used for improving the efficiency of deducing highdimensional inputoutput system behavior, and can be employed to reduce the computational effort. In practice, to avoid dealing with the full highdimensional random space, a truncated HDMR technique is used to decompose a highdimensional model into a set of lowdimensional models, each of which needs much less computational effort. The curse of dimensionality can be suppressed significantly by using this approach to HDMR. However, some drawbacks exist in the traditional truncated HDMR [9, 38] and related methods such as the adaptive HDMR [32, 47]. For example, often too many lowdimensional models need to be computed, and highorder cooperative effects from important random variables may be neglected. To overcome these drawbacks, we propose a hybrid HDMR, which implicitly uses the complete HDMR in a moderatedimensional space and explicitly uses the firstorder truncated HDMR for the remaining dimensions. Thus the hybrid HDMR decomposes a highdimensional model into a moderatedimensional model and a few onedimensional models. The moderatedimensional space is spanned by the most active random dimensions. We use sensitivity analysis to identify the most active random dimensions. Then each lowdimensional stochastic model in the hybrid HDMR is solved by using sparsegrid stochastic collocation method. The proposed hybrid HDMR renders a good approximation in stochastic space and substantially improves the efficiency for highdimensional stochastic models. Therefore, a good tradeoff between computational complexity and dimension reduction error can be achieved with the hybrid HDMR technique.
Porous media often exhibit complex heterogeneous structures that are inherently hierarchical and multiscale in nature, and hence, pose a significant challenge for developing accurate and efficient numerical methods. Simulating flow in porous media using a very fine grid to resolve this heterogeneous structure is computationally very expensive and possibly infeasible. However, disregarding the heterogeneities can lead to large errors. Thus, many multiscale methods including the MMsFEM [7], variational multiscale method [23], twoscale conservative subgrid approaches [3] and heterogeneous multiscale method [10], multiscale finite volume method [24], spectral MsFEM [11] (see [12] for more complete references), have been developed over the last few decades to capture the influence of fine and multiscale heterogeneities in underresolved simulations. We note that these multiscale methods share some similarities [12]. In this paper, we focus on the MMsFEM. The main idea behind MMsFEM is to incorporate finescale information into the finite element velocity basis functions, and hence, capture the influence of these fine scales in a largescale mixed finite element formulation. The MMsFEM retains local conservation of velocity flux and has been shown to be effective for solving flow and transport equations in heterogeneous porous media. In many cases, the overhead of computing multiscale basis functions can be amortized as they can be computed once for a particular medium, and then reused in subsequent computations with different source terms and boundary conditions. This leads to a large computational savings in simulating the flow and transport process where the flow equation needs to be solved many times dynamically. When porous media exhibit nonseparable scales, some global information is needed to represent nonlocal effects (e.g., channel, fracture and shale barriers) and is used to construct the multiscale basis functions. Using global information can significantly improve the simulation accuracy in these important geological settings [1, 2, 25, 26]. If the global information is incorporated into MMsFEM, we refer to the MMsFEM as global MMsFEM (GMMsFEM). In the paper, we extend the central concept of the global information to MMsFEM based on the hybrid HDMR.
Combining different multiscale methods and stochastic numerical approaches yields various stochastic multiscale methods [15, 16, 20, 26, 27, 33, 44]. These methods established a general approach to solve problems in multiscale stochastic media. Specifically, dimension reduction techniques are applied, particularly focusing on the multiscale spatial dimension of these stochastic models, to reduce the overall computational cost. For example, within the stochastic Galerkin framework several earlier works (e.g., [4, 35, 39]) proposed solving an optimization problem to select a set of optimal basis functions to define a reduced model. Although, these approaches reduce the resolution of the spatial dimension, the stochastic dimension is not reduced. Therefore, the computational cost of these approaches may still be too expensive in highdimensional stochastic spaces.
In this paper, we propose a hybrid HDMR framework, and use the sparsegrid stochastic collocation method with MMsFEM to develop a reduced multiscale stochastic model. Sensitivity analysis is used to control the reduction of the stochastic dimension. For the MMsFEM, we hierarchically utilize approximate global information that is aligned with the terms of the hybrid HDMR. Specifically, only the required basis functions are computed and to amortize their computational cost they are held constant in time. Thus, the proposed multiscale stochastic model reduction approach is able to reduce a highdimensional stochastic multiscale model in both stochastic space, and the resolution of physical space. Compared with traditional truncated HDMR techniques, much better efficiency and very good accuracy are achieved with this hybrid HDMR. We analyze the proposed multiscale stochastic model reduction approach and investigate its application to flows in random and heterogeneous porous media. Important statistical properties (e.g., mean and variance) of the outputs of the stochastic flow models are computed and discussed.
The rest of the paper is organized as follows. In Section , we briefly introduce the background on the flow and transport models in random porous media. In Section , we present a general framework for HDMR and propose the hybrid HDMR technique. Some theoretical results and computational complexity are also addressed in this section. Section is devoted to presenting MMsFEM integration with the hybrid HDMR. In Section , numerical examples using twophase flow are presented to demonstrate the performance of the proposed the hybrid HDMR with MMsFEM. Finally, some conclusions and closing remarks are made.
2 Background and notations
2.1 Twophase flow system and its stochastic parametrization
Let be a convex bounded domain in () and be a probability space, where is the set of outcomes, is the algebra generated by , and is a probability measure.
We consider twophase flow and transport in a random permeability field, . Here the two phases are referred to as water and oil, and designated by subscripts and , respectively. The equations of twophase flow and transport (in the absence of gravity and capillary effects) can be written:
\hb@xt@.01(2.1)  
\hb@xt@.01(2.2) 
where the total mobility is given by and is a source term. Here and , where and are viscosities of oil and water phases, respectively, and and are relative permeabilities of oil and water phases, respectively. Here is the fractional flow of water and given by , Equation (LABEL:preseqn) is the flow equation governing the water pressure, and (LABEL:sateqn) is the transport (or saturation) equation. According to Darcy’s law, the total velocity in (LABEL:sateqn) is given by
\hb@xt@.01(2.3) 
A random field may be parameterized by a Fourier type expansion, such as the expansion of KarhunenLoève (ref.[31]), polynomial chaos or wavelet [29]. This often gives rise to an infinitedimensional random space. For computation, we truncate such an expansion to approximate the random field. Then can be formally described by
\hb@xt@.01(2.4) 
For example, if a random field is characterized by a covariance structure, then the random field can be approximated by a finite sum of uncorrelated random variables through a truncated KarhunenLoève expansion (KLE). To obtain an accurate approximation, a large number of random parameters is required in (LABEL:approxrandom). This leads to a family of deterministic models in a highdimensional random parameter space.
To simplify the presentation, we make the following assumption,
Let be the image of , i.e., , and be the joint probability function of . By equations (LABEL:preseqn), (LABEL:sateqn), (LABEL:tveqn) and parametrization of permeability field, we formulate the following stochastic twophase flow system: Find random fields , , such that they almost surely (a.s) satisfy the following equations subject to initial and boundary conditions,
\hb@xt@.01(2.5) 
To further simply the notation, we will suppress the spatial variable and temporal variable in the rest of paper when no ambiguity occurs.
2.2 Sparse grid stochastic collocation method
For stochastic twophase flow systems (LABEL:tpsystem), the statistical properties (e.g., mean and variance) of solutions are of interest. These properties may be obtained by first sampling the parameter random space using, for example, a Monte Carlo method or sparse grid collocation method, then solving the deterministic problems for the samples and analyzing the corresponding results to obtain the desired statistical quantities. The convergence of Monte Carlo methods is slow and a very large number of samples may be required, which leads to high computational cost. Instead, we use the Smolyak sparsegrid collocation method [40], where the Smolyak interpolant () is a linear combination of tensor product interpolants with the property: only products with a relatively small number of nodes are used and the linear combination is chosen in such a way that an interpolation property for is preserved for [6]. In the notation , the represents the interpolation level. The sparsegrid collocation method is known to have the same asymptotic accuracy as tensor product collocation method, while requiring many fewer collocation points as the parameter dimension increases [36].
Sparse grids have been successfully applied to stochastic collocation in many recent works (e.g., [32, 33, 26]). Based on Smolyak formula (ref.[6]), a set of collocation points in are specially chosen, where is the number of collocation points. With these chosen collocation points and corresponding weights , the statistical properties of the solutions can be obtained. At each of the collocation points, the deterministic system (LABEL:tpsystem) is solved and the output, for example, is obtained. Then the mean of can be estimated by
Here the weights are determined by the basis functions of and joint probability function (ref.[18]). Similarly, the variance of can be obtained by
\hb@xt@.01(2.6) 
Let denote the number of collocation points for Smolyak sparse grid interpolation . Then it follows that (see [6])
\hb@xt@.01(2.7) 
This implies that the number of collocation points algebraically increases with respect to the dimension . We utilize Smolyak sparse grid collocation for the numerical computation. The stochastic approximation of the Smolyak sparse grid collocation method depends on the number of collocation points and the dimension of the random parameter space. The convergence analysis in [36] implies that the convergence of Smolyak sparse grid collocation is exponential with respect to the number of Smolyak points, but depends on the parameter dimension . This exponential convergence rate behaves algebraically for .
3 High dimensional model representation
The truncated KLE leads to the family of twophase flow deterministic models (LABEL:tpsystem) with a highdimensional parameter . The most challenging part of solving such a highdimensional stochastic system is to discretize the highdimensional stochastic space. There exist a few methods for the discretization of the random space [4, 6, 5, 13, 14, 16, 19, 29, 33, 35, 36, 42, 48] (more references can be found therein). Among these methods, the sparsegrid collocation method has been widely used and generates completely decoupled systems, each of which is the same size as the deterministic system. This method is usually very efficient in moderatedimensional spaces. However, when the dimension of the random parameter is large, a large number of collocation points are required (LABEL:collnodes), and the deterministic model (LABEL:tpsystem) must be solved at each of these collocation points. Consequently, the efficiency of this collocation method will deteriorate in a highdimensional space. To overcome this difficulty, we use a highdimensional model representation (HDMR) to reduce the stochastic dimension and enhance the efficiency of the simulation. By truncating HDMR, the highdimensional model can be decomposed into a set of lowdimensional models. Hence the computational effort can be significantly reduced [32, 47].
In the following section, we present a general HDMR framework and propose a hybrid HDMR method.
3.1 A general HDMR framework
In this section, we adopt the decomposition of multivariate functions described in [28] to present a general HDMR framework in terms of operator theory. Let be a linear space of realvalued functions defined on a cube and . In this discussion represents a relationship between the random vector model input, , and a model output (e.g., saturation, watercut).
To introduce HDMR approach, we define a set of projection operators as follows.
Definition 3.1
[28] Let be a set of commuting projection operators on satisfying the following property:
\hb@xt@.01(3.1) 
Let be a subset and the identity operator, we define
Associated with projection , we define . Then only depends on the variables with indices in .
Let be a measure on Borel subsets of and a product measure with unit mass, i.e.,
\hb@xt@.01(3.2) 
The inner product on induced by the measure is defined as follows:
The norm on is defined by . Given the measure , we can specify the projection operator in (LABEL:projectassum). We assume that the functions in are integrable with respect to . For any and , we define
\hb@xt@.01(3.3) 
Consequently, for any . By the definitions of these projection operator, we can obtain a decomposition of as following (see [37]),
\hb@xt@.01(3.4) 
For any , we define
Then we can show the operators are commutative projection operators and mutually orthogonal, i.e.,
The equation (LABEL:DecompF) gives an abstract HDMR expansion of by
\hb@xt@.01(3.5) 
The equation (LABEL:DecompF) also implies that
Any set of commutative projectors generate a distributive lattice whose elements are obtained by all possible combinations (addition and multiplication) of the projectors in the set. The lattice has a unique maximal projection operator , which gives the algebraically best approximation to the functions in [21]. The range of the maximal projection operator for the lattice is the union of the ranges of . Because the commutative projectors are mutually orthogonal here, the maximal projection operator and the range of have explicit expressions as follows:
As more orthogonal projectors are retained in the set, the resulting approximation by the maximal projection operator will become better.
If the measure in (LABEL:defmu) is taken to be the probability measure
then the resulting HDMR defined in (LABEL:HDMR1) is called ANOVAHDMR [38] . Let us fix a point . If the measure in (LABEL:defmu) is taken as the Dirac measure located at the point , i.e.,
then the resulting HDMR is CutHDMR. The point is so called cutpoint or anchor point. In ANOVAHDMR, the involves dimensional integration and the computation of the components of is expensive. In CutHDMR, the only involves the evaluation in a dimensional space and the computation is cheap and straightforward. Because of this reason, we use CutHDMR for the analysis and computation in the paper.
3.2 A hybrid HDMR
In this subsection, we use a less abstract representation of the HDMR to motivate approximations suitable for practical computation. Specifically, we review the traditional approach to truncated HDMR, and then propose and analyze an alternative hybrid HDMR.
Expanding elements of (LABEL:HDMR1), the HDMR of can be written in the form,
\hb@xt@.01(3.6) 
Here is the zerothorder component denoting the mean effect of . The firstorder component represents the individual contribution of the input and the secondorder component represents the cooperative effects of and and so on. We define th order truncated HDMR by
\hb@xt@.01(3.7) 
For most realistic physical systems, loworder HDMR (e.g., ) may give a good approximation [38]. The can approximate through truncating the HDMR. The consists of a large number of component terms for highdimensional models, the computation of all the components may be costly. For many cases, some highorder cooperative effects cannot be neglected in the model’s output, especially when a model strongly relies on a few dependent variables. Consequently, the traditional truncated HDMR in (LABEL:truncatedHDMR) may not yield a very good approximation. To eliminate or alleviate these drawbacks of traditional truncated HDMR, we propose a new truncated HDMR, which we refer to as hybrid HDMR.
We use the Fourier amplitude sensitivity test [9] to select the most active dimensions from the components of the highdimensional random vector . Let be the firstorder components defined in Eq.(LABEL:HDMR2) and let denote the variance of . We may assume that as we can reorder the index set to satisfy this requirement. Alternatively, we can order the index set to produce monotonically decreasing expectations, , and use this ordering to select the most active dimensions. In this paper, we only consider the variance sensitivity, and we calculate () using the method described in (LABEL:compvar). Because the firstorder components are defined in onedimensional random parameter spaces, the computational cost for is very small. Moreover, most of the elements in will be reused when we calculate the variance of outputs represented by hybrid HDMR. We set a threshold constant with and find an optimal such that
\hb@xt@.01(3.8) 
Then we define to be the most active dimensions. We note that provides information on the impact of when it is acting alone on the output. It is clear that if a change of the random input within its range leads to a significant change in the output, then is large. Therefore, (LABEL:findJ) provides a reasonable criteria for identifying the most active dimensions, and for a given stochastic model, we can adjust the value of in Eq. (LABEL:findJ) such that is much less than . Defining the index set , the proposed hybrid HDMR may be written,
\hb@xt@.01(3.9) 
Here the set of projectors generate a lattice and its maximal projection operator is . By (LABEL:newHDMR1), the hybrid HDMR consists of two parts: the first part is the complete HDMR on the most active dimensions indexed in , the second part is the firstorder truncated HDMR on the remaining dimensions . We note that the component of defined in (LABEL:HDMR1) can be recursively computed and explicitly computed, respectively by (ref. [28])
\hb@xt@.01(3.10) 
By (LABEL:HDMRexpress1), we can rewrite equation (LABEL:newHDMR1) by
\hb@xt@.01(3.11) 
We note that the operator projects the variable function to a function defined on the the most active dimensions. The term gives the dominant contribution to . Since any firstorder components are usually important, these components are retained in . In CutHDMR, there is no error for the approximation of whenever the point is located the th dimensional subvolume across the cutpoint .
In this paper, identification of the most important dimensions is based on a global sensitivity analysis on the univariate terms of HDMR. This criteria has been shown to be reasonable for many cases and is often used [9, 32, 47, 22]. However, it is important to note that this criteria may not effectively identify the most important dimensions in some cases. For example, in some situations the individual contribution of an input parameter may not be significant, even though its cooperative influence with other inputs is significant. Similarly, the identification of the most active dimensions for highly variable solutions may require different criteria. Recent work [41] presents a preliminary discussion on the choice of this critera. Since, the new hybrid HDMR naturally includes the cooperative effects of higherorder terms within the most important dimensions, this selection criteria is key to addressing these challenging cases and is an important topic for furture work.
If we use traditional truncated HDMR to approximate the term in (LABEL:newHDMR2), then we can obtain the adaptive HDMR developed in [32, 47],
\hb@xt@.01(3.12) 
To simplify the notation, we will suppress in in the paper when the truncation order is not emphasized. The following proposition gives the relationship among , and .
Proposition 3.2
Let and be defined in (LABEL:newHDMR2) and (LABEL:adHDMR), respectively. Then
\hb@xt@.01(3.13) 
Proof. We note the equality
\hb@xt@.01(3.14) 
Because the set of HDMR components in do not intersect with the set of HDMR components in , The equation (LABEL:DecompF) implies that
\hb@xt@.01(3.15) 
Then the equation (LABEL:eq0ad) follows immediately by combining equation (LABEL:eq1ad) and (LABEL:eq2ad).
The equation (LABEL:eq0ad) implies that hybrid HDMR has better approximation properties than adaptive HDMR.
The mean of can be computed by summing the mean of all HDMR components of for both ANOVAHDMR and CutHDMR. The variance of is the sum of the variance of HDMR components of for ANOVAHDMR. However, the direct summation of variance of components of may not equal to variance of for CutHDMR. Consequently, we may want to have a truncated ANOVAHDMR to approximate the variance of for CutHDMR. If we directly derive a truncated ANOVAHDMR, the computation is very costly and more expensive than the direct computation of the variance of itself. This is because a truncated ANOVAHDMR requires computing many highdimensional integrations. To overcome the difficulty, we can use an efficient twostep approach to derive a hybrid ANOVAHDMR through the hybrid CutHDMR.
Let be the dim variable with indices in . Using the hybrid HDMR formulation (LABEL:newHDMR2), the hybrid CutHDMR of has the following form
\hb@xt@.01(3.16) 
We use to act on to get a hybrid ANOVAHDMR of , which has the following form
\hb@xt@.01(3.17) 
Then we have the following theorem.
Theorem 3.3
Let and be defined in (LABEL:cut) and (LABEL:anova), respectively. Then
\hb@xt@.01(3.18)  
\hb@xt@.01(3.19) 
The proof of Theorem LABEL:HDMRthm3 is presented in Appendix LABEL:app1.
Theorem LABEL:HDMRthm3 implies that we can compute the mean and variance of by directly summing the means and variances of the components of . This observation can significantly reduce the complexity of this computation. In addition, the derived hybrid ANOVAHDMR is easily obtained by using the twostep approach through the hybrid CutHDMR . Our further calculation shows that the traditional truncated CutHDMR (LABEL:truncatedHDMR) and adaptive CutHDMR (LABEL:adHDMR) do not have these properties. We use sparsegrid quadrature [18] to compute the mean and variance in the paper.
Compared with the traditional truncated HDMR and adaptive HDMR, the hybrid HDMR defined in Eq.(LABEL:newHDMR2) has the following advantages: (a) The hybrid HDMR consists of significantly fewer terms than the traditional truncated HDMR and adaptive HDMR, the total computational effort of the hybrid HDMR can be substantially reduced. (b) Since the hybrid HDMR inherently includes all the cooperative contributions from the most active dimensions, approximation accuracy is not worse (maybe better) in hybrid HDMR than the traditional truncated HDMR and adaptive HDMR; (c) According to Theorem LABEL:HDMRthm3, the computation of variance of hybrid HDMR is much more efficient.
3.3 Analysis of computational complexity
In the previous subsection, we have addressed the accuracy of the hybrid HDMR and developed some comparisons with traditional truncated HDMR and adaptive HDMR. In this subsection, we discuss the computational efficiency for the various HDMR techniques when Smolyak sparsegrid collocation is used. We remind the reader that the HDMR refers to CutHDMR.
Let be the number of sparse grid collocation points with level in full random parameter dimension space and the total number of sparse grid collocation points with level in the traditional truncated HDMR . We define and in a similar way. We first consider the case and and calculate the total number of sparse grid collocation points for the different approaches. By using (LABEL:collnodes), we have for
Consequently, if , it follows immediately that
This means that the hybrid HDMR requires the smallest number of collocation points when and . This case is of particular interest in this paper.
Next we consider two other cases of interest. First, it can be shown that if and , then
This implies that the computational effort in the hybrid HDMR is the least for when the number of the most active dimensions is moderate for a highdimensional problem. However, if the number of the most active dimensions is large (e.g., ), we can show that
\hb@xt@.01(3.20) 
This relationship tells us that adaptive HDMR may be the most efficient if higherlevel collocation is required and is large as well.
For a highdimensional stochastic model, if the number of the most active dimensions is large and highlevel (e.g., level 3 and above) sparsegrid collocation is required to approximate the term in (LABEL:newHDMR2), using the adaptive HDMR (LABEL:adHDMR) may improve efficiency according to the bounds given in (LABEL:largeJ). Adaptive HDMR has been extensively considered in many recent papers [22, 32, 33, 47]. However, in our experience, the number of the most active dimensions is often less than as is large. If is smooth with respect to , lowlevel sparse grid collocation (e.g., level 2) will generally provide an accurate approximation. Many problems fall into this class, and we focus on them in this paper.
3.4 Integrating HDMR and sparsegrid collocation
As stated in Subsection LABEL:sectsgc, the sparse grid stochastic collocation method can reduce to the stochastic twophase flow system (LABEL:tpsystem) into a set of deterministic twophase flow systems. However, it suffers from curse of dimensionality with increasing stochastic dimension. Integrating HDMR and a sparsegrid collocation methods provides an approach to overcome this difficulty and may significantly enhance the efficiency.
Without loss of generality, we consider the saturation solution as an example to present the technique for the hybrid CutHDMR. Using (LABEL:newHDMR2), (LABEL:HDMRexpress1) and Smolyak sparsegrid interpolation, we have
\hb@xt@.01(3.21) 
where , and . Here is the cutpoint for a CutHDMR. The choice of the cutpoint may affect the accuracy of the truncated CutHDMR approximation. The study in [17] argued that an optimal choice of the cutpoint is the center point of a sparsegrid quadrature. In this paper, we will use such a cutpoint for computation. Due to Theorem LABEL:HDMRthm3, the mean and variance of can be approximated by
\hb@xt@.01(3.22)  
From (LABEL:approxeq1), we find that the approximation error of the mean and variance comes from two sources: truncated HDMR and Smolyak sparse grid quadrature. Let be the set of pairs of collocation points and weights in the most active subspace and the set of pairs of collocation points and weights in . We note that and . Then by (LABEL:meancomp1), the mean of can be computed by
\hb@xt@.01(3.23) 
Similarly, we can compute in terms of evaluations of and at the collocation points.
Because each of the terms in adaptive HDMR are usually correlated, the variance of is not equal to the sum of the variance of each term in (LABEL:adHDMR). Let (where ) be the set of pairs of collocation points and weights in the full random dimension . To compute the variance using adaptive HDMR, we need to project each collocation point onto the components and and interpolate all terms in (LABEL:adHDMR). Then we use (LABEL:compvar) to calculate the variance. This process involves at most Smolyak sparsegrid interpolations. The computation of these interpolations is usually very expensive when and are large, and hence, computing the variance using adaptive HDMR is usually much more expensive than using hybrid HDMR. The numerical experiments in Section LABEL:sectnum demonstrate this performance advantage for hybrid HDMR.
4 Mixed multiscale finite element method
In Section LABEL:sectHDMRsgc, we have discussed integrating hybrid HDMR and sparse grid collocation method to reduce the computation complexity from highdimensional stochastic spaces. The permeability field is often heterogeneous in porous media. It is necessary to use a numerical method to capture the heterogeneity. MMsFEM is one of such numerical methods and has been widely used in simulating flows in heterogeneous porous media [2, 26]. To simulate the twophase flow system (LABEL:tpsystem), it is necessary to retain local conservation for velocity (or flux). To this end, we use MMsFEM to solve the flow equation and obtain locally conservative velocity. Using MMsFEM coarsens the multiscale model in spatial space and can significantly enhance the computation efficiency.
Corresponding to the hybrid HDMR expansion of in (LABEL:approxeq1), i.e.,
the velocity in (LABEL:tpsystem) also admits the same hybrid HDMR expansion as following
\hb@xt@.01(4.1) 
where , and . We use to obtain , to obtain and to obtain .
Without loss of generality, we may assume that the boundary condition in the flow equation of (LABEL:approxeq1) is no flow boundary condition. Let and . Then we can uniformly formulate the mixed formulation of equations of and as following,
\hb@xt@.01(4.2) 
Here and are corresponding to the coefficients and , respectively. Let . Then is the solution to equation (LABEL:mixedeq1) if and are replaced by and , respectively.
The weak mixed formulation of (LABEL:mixedeq1) reads: seek such that they satisfy the equation
Let and be the finite element spaces for velocity and pressure, respectively. Then the numerical mixed formulation of (LABEL:mixedeq1) is to find such that they satisfy
\hb@xt@.01(4.3) 
We use MMsFEM for (LABEL:numweak). It means that mixed finite element approximation is performed on coarse grid, where the multiscale basis functions are defined. In MMsFEM, we use piecewise constant basis functions on coarse grid for pressure. For the velocity, we define multiscale velocity basis functions. The degree of freedom of multiscale velocity basis function is defined on interface of coarse grid. Let be a generic edge or face of the coarse block . The velocity multiscale basis equation associated with is defined by
\hb@xt@.01(4.4) 
For local mixed MsFEM [7], , the normal vector. If the media demonstrate strong nonlocal features including channels, fracture and shale barriers, some limited global information is needed to define the boundary condition to improve accuracy of approximation [2, 25]. We will specify the boundary condition for different parts in (LABEL:HDMRu). Then defines multiscale velocity basis function associated to , and the multiscale finite dimensional space for velocity is defined by
It is wellknown that using approximate singlephase global velocity information can considerably improve accuracy for multiscale simulation of twophase flows [1, 2, 25, 26]. For the twophase flow system (LABEL:tpsystem), the singlephase global velocity solves the following equation
\hb@xt@.01(4.5) 
By using hybrid HDMR and sparse grid interpolation, the admits the following approximation
where and . Here the interpolation levels and can be different. Because are the most active dimensions, it is usually desirable that . Now we are ready to specifically describe the multiscale finite element space for , () and .

To construct the multiscale basis functions for , we take in (LABEL:basiseq) and
\hb@xt@.01(4.6) The multiscale finite element space for is defined by
\hb@xt@.01(4.7) 
To construct the multiscale basis functions for (), we take in (LABEL:basiseq) and
\hb@xt@.01(4.8) The multiscale finite element space for is defined by
\hb@xt@.01(4.9) 
To construct the multiscale basis functions for