A Bayesian Estimation for the Fractional Order of the Differential Equation That Models Transport in Unconventional Hydrocarbon Reservoirs
Abstract
The extraction of natural gas from the earth has been shown to be governed by differential equations concerning flow through a porous material. Recently, models such as fractional differential equations have been developed to model this phenomenon. One key issue with these models is estimating the fraction of the differential equation. Traditional methods such as maximum likelihood, least squares, and even method of moments are not available to estimate this parameter as traditional calculus methods do not apply. We develop a Bayesian approach to estimate the fraction of the order of the differential equation that models transport in unconventional hydrocarbon reservoirs. In this paper, we use this approach to adequately quantify the uncertainties associated with the error and predictions. A simulation study is presented as well to assess the utility of the modeling approach.
Keywords: Bayesian Estimation; Fractional Differential Equations; Modeling Error; Validation;
1 Introduction
Fractional calculus and its application to different disciplines of science has grown. Recently, varied works have been published about fractional differential equations for a range of topics, from the case of Lévy motion discussed by Benson et al. (2000), to the modeling of an Ebola epidemic in Area et al. (2015). Fractional calculus has also been discussed to great length in Meerschaert and Tadjeran (2004).
Wyss (1986) discussed fractional derivative concepts, including various ways to solve them. Liu et al. (2009) considered a finite domain spacetime fractional advection dispersion equation. Malik et al. (2015) discuss the lack of knowledge of transport in porus material. The use of fractional differential equations is becoming more popular in the field of hydrocarbon reservoirs, as it utilizes all the parameters previously stated. de Andrade et al. (2005) discuss anomalous flow necessitating the use for fractional calculus, specificaly fractional diffusion equation. In this paper, the goal is to estimate the fractional order of equation (1.1) which is difficult to isolate. Fan et al. (2016) discuss parameter estimation of a fractional fractal diffusion model, specifically across porous media. Malik et al. (2014) considered the problem of modeling transport in unconventional hydrocarbon reservoirs using a fractional partial differential equation. Awotunde et al. (2016) show there is deviation between real data obtained and the model using Darcys law from Darcy (1856), which is why new models must be implemented. ArizaHernandez et al. (2017) utilize a Bayesian approach to solve the inverse problem of a fractional population growth model. They employ the Plummer (2012) JAGS (Just Another Gibbs Sampler) software to generate Markov Chain Monte Carlo (MCMC) samples, which for more information on MCMC, see Gilks et al. (1996). While JAGS is effective, it cannot reliably be guaranteed to produce quality samples. This work utilizes a Sampling Importance Resampling approach to obtain the posterior samples giving the user more control over the quality of the sample as well as ensuring the differential equation model is solved correctly. Furthermore, this work is applied to hydrocarbon reservoirs. Other publications, such as Razmina et al. (2014), discuss fractional calculus use in fractured reservoirs.
The timefractional advectiondiffusion equation is given by Malik et al. (2014):
(1.1) 
where is the pressure distribution in unconventional reservoirs, is the diffusivity (which is related to the rock permeability), and is a convection velocity; both and are highly nonlinear.
Observed data rarely follows the solution exactly, as realizations are often contaminated by some source of error. The question becomes: where are the sources of error? There are two sources of error, internal error (also known as noise or process error) and external error (also known as measurement or experimental error). Strong and Oakley (2014) state that internal error is minor misspecification of the model across the process space, specifically in the homogeneity of the substrate. With external error, measuring instruments often fail to yield the same measurement twice.
Internal error:
Here, internal error means the error in the differential equation model. In this case, the internal error will accumulate over the integral associated with the solution. Hence, may be something like a Brownian motion or OrnsteinUhlenbeck process.
External error:
where is the observed value of the process at time and . In this case, will follow some appropriate probability distribution. The external error problem is an easy problem in contrast to the internal error problem. For the proof of concept, the model will only focus on external error. Ultimately, both types of error structures in the model would be ideal.
2 Model
The fractional differential equation of interest is taken from Malik et al. (2014) to model the pressure as a function of time and location under exponential uploading, given by , (). In this case, has a closed form solution under some mild conditions.
(2.1) 
The main parameter in this model is , the fraction of the derivative, and it is bound between 0 and 1.
2.1 Statistical Model and Parameter Estimation
A Bayesian approach is used to estimate the parameters in the model which requires that both a likelihood be specified as well as prior distributions on the model parameters:
(2.2) 
where . Here, the logNormal likelihood is chosen to ensure that pressure is always a positive value. In this specification, the likelihood has two parameters, and . The prior distributions are specified as to reflect the prior knowledge that is bound between 0 and 1. For the prior distribution is specified as to reflect the prior knowledge that must be a positive value. For more on prior distribution and selection, see Berger (1985).
The posterior distribution can be found using Bayes’ Theorem:
(2.3) 
In the case considered here, there is no analytic solution to . Thus, a sampling method must be employed to draw samples from the posterior distribution, from which all inferences will be made. There are many choices for the algorithm to sample from the posterior distribution such as Acceptance Sampling, MetropolisHastings Sampling, Sampling Importance Resampling, etc, found in Gelman et al. (2013) and Gilks et al. (1996). Due to the low number of model parameters, the Sampling Importance Resampling approach is employed in this work. For generality of notation, let .
Algorithm

Draw where is a distribution similar to the posterior distribution from which it is easy to draw candidate values.

Calculate

Calculate the posterior probability weights .

Resample samples with replacement from using their corresponding posterior probability weights .

The set of posterior samples of size is obtained by taking samples, with replacement, from using their corresponding posterior probability weights .
3 Simulated Example
Suppose the system of interest is given by (1.2) where . Further, suppose that data for has been observed, with noise, at all combinations of equally spaced levels of from to and equally spaced times from to and the noise is multiplicative following a LogNormal distribution with mean 1 and . Figure 1 shows the unperturbed data surface on the left and the perturbed data surface on the right. Notice that this set parameter specification produces a quite noisy surface.
The prior distributions for and were specified as follows:
Using the LogNormal likelihood, the prior above, and Bayes Formula, the posterior distribution is given as:
(3.1) 
where is the realized value and is the theoretical without noise. For clarity, is the solution to the system for for the sampling point.
The Sampling Importance Resampling algorithm was employed with 10,000 candidate samples with 1,000 posterior samples drawn. The sample generated a posterior sample of 89.8% unique samples indicating a high quality sample from the posterior distribution. Histograms of the posterior distributions of and can be found in Figure 2, which shows the true values and in the middle of the distribution. For reference, the 95% posterior credible intervals were generated by taking the 2.5% and 97.5% quantiles from the marginal posterior distributions and gave: for , (0.8169, 0.8226), which contains the true value 0.82, and for , (0.0907, 0.1050), which also contains the true value of 0.1. This shows that the procedure proposed can be employed to use data to estimate the fraction of the differential equation.
Not only can the method proposed be used to estimate the fraction of the differential equation, it can also be used to quantify the prediction uncertainty associated with the model and parameter estimates. To do this, the posterior predictive distribution can be employed to generate a distribution for a new observation at the value of and . Recall that the posterior predictive distribution is given by:
(3.2) 
In the case considered here, the posterior predictive distribution is difficult to visualize in three dimensions. Instead, profile plots of the median surface are created, the surfaces generated by the 2.5% and 97.5% quantiles for given values of and given values of along with the data. Figures 3 and 4 give these profile plots for and , respectively. Notice that the posterior predictive intervals capture most of the observed data. This gives evidence that the modeling approach is properly quantifying the uncertainties associated with both estimation as well as inherent noise in the data.
4 Robustness
In order to determine if the approach proposed for estimating is robust to the value of in the underlying process, a robustness analysis was conducted for varying values of and . For this study, a new dataset was simulated using each combination of to by increments of and and . With these simulated datasets the estimation algorithm was used to estimate the 95% credible intervals for each parameter using the 2.5% and 97.5% quantiles from the 1,000 samples from the posterior distribution of and . Table 3 shows the results of this study. Notice that all of the 95% credible intervals contain the correct parameter value, which indicates that the estimation algorithm is robust to the underlying values of and .
0.1  (0.0999, 0.1001)  (0.0999, 0.1002)  (0.0999, 0.1002)  

(0.0090, 0.0104)  (0.0912, 0.1065)  (0.2461, 0.2838)  
0.2  (0.1999, 0.2001)  (0.1999, 0.2001)  (0.1998, 0.2001)  
(0.0094, 0.0110)  (0.0916, 0.1067)  (0.2356, 0.2725)  
0.3  (0.2999, 0.3001)  (0.2998, 0.3002)  (0.2995, 0.3001)  
(0.0093, 0.0108)  (0.0886, 0.1022)  (0.2322, 0.2699)  
0.4  (0.3999, 0.4001)  (0.3997, 0.4005)  (0.3984, 0.4003)  
(0.0098, 0.0114)  (0.0868, 0.1003)  (0.2245, 0.2603)  
0.5  (0.4998, 0.5001)  (0.4990, 0.5006)  (0.4980, 0.5016)  
(0.0094, 0.0109)  (0.0987, 0.1146)  (0.2175, 0.2544)  
0.6  (0.5998, 0.6002)  (0.5986, 0.6015)  (0.5970, 0.6035)  
(0.0096, 0.0112)  (0.0987, 0.1161)  (0.2276, 0.2629)  
0.7  (0.6999, 0.7003)  (0.6996, 0.7038)  (0.6938, 0.7043)  
(0.0094, 0.0110)  (0.0902, 0.1048)  (0.2498, 0.2893)  
0.8  (0.7999, 0.8004)  (0.7965, 0.8019)  (0.7946, 0.8082)  
(0.0088, 0.0103)  (0.0926, 0.1071)  (0.2309, 0.2673)  
0.9  (0.8997, 0.9005)  (0.8979, 0.9051)  (0.8953, 0.9146)  
(0.0098, 0.0114)  (0.0939, 0.1088)  (0.2458, 0.2847) 
Additionally, as this paper is not particularly concerned about the extremes to the values of , the Beta distribution was used, and to further check robustness, * and * were increased each to 5, 10, 20, 50, etc, to make sure the 95% credible interval would still capture both * and *. As seen in Table 2 when this test was run, * and * were both at 100 before the interval failed to capture the true value, in this case . To show robustness, both and are changed, starting at and , also known as a uniform distribution, and increased at an equal rate until the 95% credible interval no longer captured the parameters of the equation successfully. As is evident in the table above, the properties of the method derived capture the parameters in the case of a uniform distribution and continued to capture the parameters until the beta distribution was at and both equal to 100. Even under a very strict beta distribution, the properties of the method capture the parameters.
* & *  

1  (0.8174, 0.8237)  
(0.0984, 0.1143)  
3  (0.8184, 0.8244)  
(0.0982, 0.1141)  
5  (0.8190, 0.8249)  
(0.0958, 0.1115)  
10  (0.8176, 0.8230)  
(0.0908, 0.1046)  
20  (0.8165, 0.8220)  
(0.0853, 0.1000)  
50  (0.8141, 0.8201)  
(0.0920, 0.1065)  
100  (0.8202, 0.8260)  
(0.0894, 0.1042) 
To further expound upon the robustness of the algorithm, the algorithm used 200 unique data sets to ensure that the algorithm successfully captured both and at least % of the iterations tested.
(4.1) 
where is an indicator variable, indicating whether or not the was inside the credible interval, and the total amount of ’ in the interval divided by the total number of samples yielded a percentage, indicating coverage probability.
0.1  0.995  0.980  0.965  

0.995  0.980  0.975  
0.3  0.995  0.975  0.965  
0.995  0.965  0.955  
0.5  1.000  0.980  0.965  
0.995  0.975  0.965  
0.7  0.990  0.975  0.955  
1.000  0.975  0.950  
0.9  0.995  0.985  0.960  
0.995  0.990  0.955 
5 Conclusion
Proof of concept for accurate estimation of the fractional parameter with varying external error was concluded. This work demonstrates a method for researchers to employ fractional advectiondiffusion equations on real world problems where the fraction can be estimated using observed data. This approach, while computationally intensive, not only can estimate the fraction but can also adequately quantify the uncertainty associated with the parameters and predictions. The method is shown to be robust with respect to the underlying value of the fraction. By testing the robustness of changing the true value of and , the approach created credible intervals capturing the true value of and . Furthermore, by changing the prior distribution to a Uniform distribution or Beta distribution with varying and , the method is robust to capture the true values. By running the 200 MCMC simulations, the coverage probabilities successfully captures both parameters at least of the time. Utilizing the Sample Importance Resampling Regime and then tuning the importance sampler, the control of the quality of the sample ensured the differential equation model was solved correctly.
6 Future Research
This work has demonstrated that the Bayesian framework will allow for accurate estimation of under the scenario of external error; a similar study should be performed under internal error. Due to the stochastic nature of this scenario, the techniques for estimation may differ greatly such as using MetropolisHasting sampling possibly combined with Slice sampling. This work has been omitted this paper as it is drastically different and the work performed loses focus. The study should be a simulation study and verify that the methods can be combined to produce accurate results. Ultimately, future research should focus on developing a Bayesian estimation approach when both the internal and external error exist in a fractional differential equation system. In addition to error, future research on scenarios with more parameters such as rock porosity, hydrocarbon density, hydrocarbon viscosity, and rock permeability to better model the transport of unconventional hydrocarbon reservoirs. Lastly, applying varying numerical and hybrid methods to test for parameter redundancy by adding constraints to result in better modeling in the future will impact the effectiveness of models.
Competing Interests
The authors declare that there is no conflict of interests regarding the publication of this paper.
References
 Area et al. (2015) 00 Area I, Losada H, Nieto JJ, Shammakh W, and Torres Á (2015) On a fractional order Ebola epidemic model, Advances in Difference Equations, 2015:278, DOI:10.1186/s1366201506135.
 ArizaHernandez et al. (2017) ArizaHernandez FJ, SanchezOrtiz J, ArcigaAlejandre M, and VivasCruz L (2017) Bayesian Analysis for a Fractional Population Growth Model, Journal of Applied Mathematics, Article ID 9654506, 1–9, DOI: 10.1155/2017/9654506.
 Awotunde et al. (2016) Awotunde A, Ghanam RA, AlHomidan S, and Tatar Nassereddine (2016) Numerical schemes for anomalous diffusion of singlephase fluids in porous media, Communications in Nonlinear Science and Numerical Simulation, Article ID 10075704, 381–395, DOI: 10.1016/j.cnsns.2016.03.006.
 Benson et al. (2000) Benson DA, Wheatcraft SW, and Meerschaert MM (2000) The fractionalorder governing equation of Lévy motion Water Resources Research, 36(6), 1413 – 1423, DOI: 10.1029/2000WR900032.
 Berger (1985) Berger JO (1985) Statistical Decision Theory and Bayesian Analysis, Second Edition, Springer, New York.
 Darcy (1856) Darcy H (1856) Les Fontaines Publiques de la Ville de Dijion. Paris: Dalmont.
 de Andrade et al. (2005) de Andrade MF, Lenzi EK, Evangelista LR, Mendes RS, and Malacarne LC (2005) Anomalous diffusion and fractional diffusion equation: anisotropic media and external forces, Physics Letters A, 347, 160–169, DOI:10.1016/physleta.2005.07.090.
 Fan et al. (2016) Fan W, Jian X, and Chen S (2016) Parameter estimation for thre fractional fractal diffusion model based on its numerical solution, Computer and Mathematics with Applications, 71, 642–651, DOI: 10.1016/j.camwa.2015.12.030.
 Gelman et al. (2013) Gelman A, Carlin JB, Stern HS, Dunson DB, Vehtari A, and Rubin DB (2013) Bayesian Data Analysis, Third Edition, CRC Press, Boca Raton, FL.
 Gilks et al. (1996) Gilks WR, Richardson S, and Spiegelhalter DJ (1996) Markov Chain Monte Carlo in Practice, First Edition, Chapman & Hall, Boca Raton.
 Liu et al. (2009) Liu F, Bayarriy MJ, and Berger JO (2009) Modularization in Bayesian Analysis, with Emphasis on Analysis of Computer Models, Bayesian Analysis, 4(1), 119–150, DOI:10.1214/09BA404.
 Liu et al. (2009) Liu F, Zhuang P, Anh V, Turner I, and Burrage K (2007) Stability and convergence of the difference methods for the spacetime fractional advectiondiffusion equation, Applied Mathematics and Computation, 191, 12–20, DOI:10.1016/j.amc.2006.08.162.
 Malik et al. (2014) Malik NA, Ali I, and Chanane B (2014) Numerical Solutions of Nonlinear Fractional Transport Models in Unconventional Hydrocarbon Reservoirs using Variational Iteration Method, Proceedings of the 5th International Conference on Porous Media and its Applications in Science and Engineering, Kona Hawaii, June 22 to 27, 2014.
 Malik et al. (2015) Malik NA, Ghanam RA, and AlHomidan S (2015) Sensitivity of the pressure distribution to the fractional order in the fractional diffusion equation, Canadian Journal of Physics, 93, 18–36, DOI:10.1139/cjp20130387.
 Plummer (2012) Plummer M (2012) JAGS: A program for analysis of bayesian graphical models using Gibbs sampling, 2012.
 Meerschaert and Tadjeran (2004) Meerschaert and Tadjeran (2004) Finite difference approximations for fractional advectiondispersion flow equations, Journal of Computation and Applied Mathematics, 172, 65–77, DOI:10.1016/j.cam.2004.01.033.
 Razmina et al. (2014) Razminia K, Razminia A, and Tenreiro Machado JA (2014) Analysis of diffusion process in fractured reservoirs using fractional derivative approach, Communications in Nonlinear Science and Numerical Simulation, 19, 3161–3170, DOI:10.1016/j.cnsns.2014.01.025.
 Strong and Oakley (2014) Strong M and Oakley JE (2014) When Is a Model Good Enough? Deriving the Expected Value of Model Improvement via Specifying Internal Model Discrepancies Journal of Uncertainty Quantification, 2(1), 106 – 125, DOI:10.1137/120889563.
 Wyss (1986) Wyss W (1986) The fractional diffusion equation Journal of Mathematical Physics, 27(11), 2782–2785, DOI:10.1063/1.527251.