A variable high-order shock-capturing finite difference method with GP-WENO
We present a new finite difference shock-capturing scheme for hyperbolic equations on static uniform grids. The method provides selectable high-order accuracy by employing a kernel-based Gaussian Process (GP) data prediction method, an extension of the GP high-order method originally introduced in a finite volume framework by the same authors. The method interpolates Riemann states to high order, replacing the conventional polynomial interpolations with polynomial-free GP-based interpolations. For shocks and discontinuities, this GP interpolation scheme uses a nonlinear shock handling strategy similar to that of Weighted Essentially Non-oscillatory (WENO) methods, with the novelty that the nonlinear smoothness indicators are formulated in terms of the Gaussian likelihood of the local stencil data, replacing the conventional $L_2$-type smoothness indicators of the original WENO method. We demonstrate that these GP-based smoothness indicators play a key role in the new algorithm, providing significant improvements in delivering high (and selectable) order accuracy in smooth flows, while successfully delivering non-oscillatory solution behavior in discontinuous flows.
keywords:Gaussian processes; GP-WENO; high-order methods; finite difference method; variable order; gas dynamics; shock-capturing
High-order discrete methods for hyperbolic conservative equations comprise an important research area in computational fluid dynamics (CFD). The rapid growth in development of high-order methods has been to a great extent driven by a radical change in the balance between computation and memory resources in modern high-performance computing (HPC) architectures. To efficiently use the computing resources of modern HPC machines, CFD algorithms need to adapt to hardware designs in which memory per compute core has become progressively more limited Attig2011 (); Dongarra2012future (); Subcommittee2014top (). A computationally efficient numerical algorithm should exploit the greater availability of processing power, while keeping memory use low. This goal can be achieved by discretizing the CFD governing equations to high order, thus providing the desired solution accuracy at a higher cost in processor power, but with a smaller memory requirement Hesthaven2007 (); LeVeque2002 (); leveque2007finite (). Of course, a practical consideration in designing such high-order numerical algorithms is that time-to-solution at a given grid resolution should not increase due to the additional floating point operations.
The most popular approach to designing high-order methods for shock-capturing is based on implementing highly accurate approximations to partial differential equations (PDEs) using piecewise local polynomials. By and large, polynomial approaches fall into three categories: finite difference (FD) methods, finite volume (FV) methods, and discontinuous Galerkin (DG) methods. These three formulations all interpolate or reconstruct fluid states using Taylor series expansions, with accuracy controlled by the number of expansion terms retained in the interpolation or reconstruction. Below, we briefly summarize the aspects of those schemes that are most relevant to this paper.
The DG method, first proposed by Reed and Hill reed1973triangular () in 1973 for solving neutron transport problems, approximates a conservation law by first multiplying a given PDE by a test function and then integrating it in each cell to express the governing dynamics in integral form shu2009high (). The method approximates both the numerical solution and the test function by piecewise polynomials of a chosen degree in each cell. These polynomials are permitted to be discontinuous at each cell interface, allowing flexibility in achieving high-order accuracy in smooth regions, while achieving non-oscillatory shock capturing at discontinuities. Solutions are typically integrated with a multi-stage Runge-Kutta (RK) method, in which case the scheme is referred to as RKDG. The advantages of RKDG are that it is well-adapted to complicated geometries cockburn2001runge (); it is easy to parallelize due to data locality beck2014high (); atak2015discontinuous (); it lends itself to GPU-friendly computing implementations klockner2009nodal (); it accommodates arbitrary adaptivity baccouch2015asymptotically (); cao2015superconvergence (); and it permits designs that preserve given structures in local approximation spaces cockburn2004locally (); li2005locally (). The weaknesses of the method include a significantly more complicated algorithmic design; potentially less robust solution behavior at strong shocks and discontinuities shu2009high (); and a stability limit on the timestep size that becomes progressively more restrictive with increasing order of accuracy cockburn2001runge (); zhang2005analysis (); liu20082 (). For more discussion and an extended list of references, see also Cockburn and Shu cockburn1998runge (); cockburn2001runge (); Shu shu2009high (); shu2016high (); Balsara balsara_higher-order_2017 ().
The finite volume method (FVM) also uses the governing equations in integral form, making use of volume-averaged conservative variables. The discrete formulation of FVM provides a natural way of maintaining conservation laws toro2013riemann (). This inherently conservative property of the scheme makes FVM a very popular algorithmic choice for application problems where shocks and discontinuities are dominant. Historically, work on high-order FVM methods began with the seminal works by van Leer van1974towards (); van1977towards (); vanleer1979 (), which overcame the Godunov Barrier theorem godunov1959difference () to produce effective second-order accurate methods. More advanced numerical methods with higher solution accuracy than second-order became available soon thereafter, including the piecewise parabolic method (PPM) colella1984piecewise (), the essentially non-oscillatory (ENO) schemes harten1987uniformly (); shu_efficient_1988 (), and the weighted ENO (WENO) schemes liu1994weighted (); jiang1996efficient (), which improved ENO by means of nonlinear weights. (Although the original WENO-JS scheme introduced in jiang1996efficient () is an FDM scheme, the key ideas of WENO-JS have been reformulated and studied in FVM by numerous practitioners.) Most of these early schemes focused on obtaining high-order accuracy in 1D, and their naive extension to multidimensional problems using a dimension-by-dimension approach buchmuller2014improved () resulted in a second-order solution accuracy bottleneck due to inaccuracies in both space and time associated with computing a face-averaged flux function shu2009high (); buchmuller2014improved (); zhang2011order (); mccorquodale2011high (). Zhang et al. zhang2011order () studied the effect of second-order versus high-order quadrature approximations combined with the 1D 5th order accurate WENO reconstruction as a baseline spatial formulation.
They demonstrated that for nonlinear test problems in 2D, the simple and popular dimension-by-dimension approach only converges to second-order, despite the fact that the order of accuracy of the baseline 1D algorithm (e.g., 5th order accuracy for WENO) does in fact carry over to linear 2D problems. See also a recent work on the piecewise cubic method (PCM) proposed by Lee et al. lee2017piecewise (). The spatial aspect of the problem is addressable by using multiple quadrature points per cell-face shu2009high (); titarev_finite-volume_2004 () or quadrature-free flux integration through freezing the Riemann fan along the cell interfaces dumbser2007quadrature (), while high-order temporal accuracy is achievable by using multi-stage Runge-Kutta (RK) methods zhang2011order (); buchmuller2014improved (); mccorquodale2011high () or single-step ADER-type formulations titarev2002ader (); toro2006derivative (); Titarev2005 (); Dumbser2008 (); balsara2009efficient (); dumbser2013ader (); balsara2013efficient (); qian_generalized_2014 ().
In the conservative FDM approach, one evolves pointwise quantities, thus avoiding the complexities of FVM associated with the need for dealing with volume-averaged quantities. For this reason, FDM has also been a popular choice for obtaining high-order solution accuracy in hyperbolic PDE solution, particularly when some of its well-known difficulties, such as those with geometry, AMR, and free-streaming preservation, are not considerations. See, e.g., jiang1996efficient (); del_zanna_echo:_2007 (); mignone_high-order_2010 (); jiang_alternative_2013 (); chen_5th-order_2016 (); zhu2016new (). Traditional FDM formulations directly reconstruct high order numerical fluxes from cell-centered point values shu1989 (); jiang1996efficient (); mignone_high-order_2010 (). In this approach, the pointwise flux at a cell center is regarded as the 1D cell average of the targeted high-order numerical flux, thereby recovering the same reconstruction procedure as in FVM (i.e., reconstructing point values of Riemann states at each interface, given the volume-averaged quantities at cell centers) for its flux approximation. To ensure stability, upwinding is enforced through a splitting of the FD fluxes into left and right moving fluxes. The most commonly used flux splitting is the global Lax-Friedrichs splitting shu2009high (), which, while able to maintain a high formal order of accuracy in smooth flows, is known to be quite diffusive chen_5th-order_2016 (). Alternatively, an improvement can be achieved with the use of local characteristic field decompositions in the flux splitting as part of the flux reconstruction mignone_high-order_2010 (), although the characteristic field calculation depends heavily on the system of equations under consideration and adds an extra computational expense. Recently, some practitioners including Del Zanna et al. and Chen et al.
studied another form of high-order FD flux formulation, called FD-Primitive del_zanna_echo:_2007 (); chen_5th-order_2016 () (referred to as FD-Prim hereafter), in which, instead of designing high-order numerical fluxes directly from cell pointwise values, high-order fluxes are constructed first by solving Riemann problems, followed by a high-order correction step where the correction terms are derived from the so-called compact-like finite difference schemes lele1992compact (); deng2000developing (); zhang2008development (). In this way, FD-Prim is algorithmically quite analogous to FVM, in that it first interpolates (rather than reconstructs) high-order Riemann states at each cell interface, then uses them to calculate interface fluxes by solving Riemann problems using either exact or approximate solvers, and finally makes corrections to the fluxes to deliver high-order-accurate FD numerical fluxes del_zanna_echo:_2007 (); chen_5th-order_2016 (). As a result, the FD-Prim approach allows the added flexibility of choosing a Riemann solver (e.g., exact saurel1994exact (); delmont2009exact (); takahashi2014exact (); toro2013riemann (), HLL-types harten1983upstream (); toro1994restoration (); miyoshi2005multi (); guo2016hlld (), or Roe roe1981approximate (), etc.) in a manner analogous to the FVM approach.
The aforementioned traditional polynomial-based high-order methods are complemented by a family of “non-polynomial” (or “polynomial-free”) methods called radial basis function (RBF) approximation methods. As a family of “mesh-free” or “meshless” methods, RBF approximation methods have been extensively studied (see buhmann2000radial ()) to provide more flexible approximations, in terms of approximating functions powell1985radial () as well as scattered data Franke1982 (). Unlike local polynomial methods, RBF methods have degrees of freedom that relax the tight coupling between the stencil configuration and the local interpolating (or reconstructing) polynomials under consideration. For this reason, interest has grown in meshfree methods based on RBF as a means of designing numerical methods that achieve high order convergence while retaining a simple (and flexible) algorithmic framework in all spatial dimensions sarra2009multiquadric (); safdari2018radial (). Approximations based on RBF have been used to solve hyperbolic PDEs Katz2009 (); morton2007 (); sonar1996 (); guo2017rbf (); guo_non-polynomial_2016 (); bigoni_adaptive_2017 (), parabolic PDEs moroney2006 (); moroney2007 (), diffusion and reaction-diffusion PDEs shankar2015radial (), and boundary value problems of elliptic PDEs liu2015kansa (), as well as for interpolations on irregular domains chen2016reduced (); heryudono2010radial (); martel2016stability () and on more general sets of scattered data Franke1982 ().
In this paper, we develop a new high-order FDM in which the core interpolation formulation is based on Gaussian Process (GP) Modeling bishop2007pattern (); rasmussen2005 (); wahba1995 (). This work is an extension of our previous GP high-order method reyes_new_2016 () introduced in a FVM framework. Analogous to RBF, our GP approach is a non-polynomial method. By being a meshless method, the proposed GP method features attractive properties similar to those of RBF and allows flexibility in code implementation and selectable orders of solution accuracy in any number of spatial dimensions. An important feature of the GP approach is that it comes equipped with a likelihood estimator for a given dataset, which we have leveraged to form a new set of smoothness indicators. Based on the probabilistic theory of Gaussian Process Modeling bishop2007pattern (); rasmussen2005 (), the new formulation of our smoothness indicators is a key component of our GP scheme that we present in Section 3.3. We call our new GP scheme with the GP-based smoothness indicators GP-WENO in this paper. As demonstrated below in our numerical convergence study, GP-WENO’s formal accuracy is $2R+1$ and is controlled by the parameter $R$, called the GP radius, which determines the size of the GP stencil on which GP-WENO interpolations take place. These numerical experiments also show that the new GP-based smoothness indicators are better able to preserve high order solutions in the presence of discontinuities than are the conventional WENO smoothness indicators based on the $L_2$-like norms of the local polynomials jiang1996efficient ().
2 Finite Difference Method
We are concerned with the solution of 3D conservation laws:
\[
\frac{\partial \mathbf{U}}{\partial t} + \frac{\partial \mathbf{F}(\mathbf{U})}{\partial x} + \frac{\partial \mathbf{G}(\mathbf{U})}{\partial y} + \frac{\partial \mathbf{H}(\mathbf{U})}{\partial z} = 0, \tag{1}
\]
where $\mathbf{U}$ is a vector of conserved variables and $\mathbf{F}$, $\mathbf{G}$, and $\mathbf{H}$ are the fluxes. For the Euler equations these are defined as
\[
\mathbf{U} = \begin{pmatrix} \rho \\ \rho u \\ \rho v \\ \rho w \\ E \end{pmatrix}, \quad
\mathbf{F} = \begin{pmatrix} \rho u \\ \rho u^2 + p \\ \rho u v \\ \rho u w \\ (E + p)u \end{pmatrix}, \quad
\mathbf{G} = \begin{pmatrix} \rho v \\ \rho u v \\ \rho v^2 + p \\ \rho v w \\ (E + p)v \end{pmatrix}, \quad
\mathbf{H} = \begin{pmatrix} \rho w \\ \rho u w \\ \rho v w \\ \rho w^2 + p \\ (E + p)w \end{pmatrix}. \tag{2}
\]
We wish to produce a conservative discretization of the pointwise values of $\mathbf{U}$, i.e., $\mathbf{U}_{i,j,k} = \mathbf{U}(x_i, y_j, z_k)$, and we write Eq. (1) in the semi-discrete form:
\[
\frac{d \mathbf{U}_{i,j,k}}{dt} = -\frac{\hat{\mathbf{F}}_{i+1/2,j,k} - \hat{\mathbf{F}}_{i-1/2,j,k}}{\Delta x}
- \frac{\hat{\mathbf{G}}_{i,j+1/2,k} - \hat{\mathbf{G}}_{i,j-1/2,k}}{\Delta y}
- \frac{\hat{\mathbf{H}}_{i,j,k+1/2} - \hat{\mathbf{H}}_{i,j,k-1/2}}{\Delta z}. \tag{3}
\]
Here $\hat{\mathbf{F}}_{i\pm1/2,j,k}$, $\hat{\mathbf{G}}_{i,j\pm1/2,k}$, and $\hat{\mathbf{H}}_{i,j,k\pm1/2}$ are the $x$-, $y$-, and $z$-directional numerical fluxes evaluated at the halfway points between cells in their respective directions. The numerical fluxes are defined so that their divided difference approximates the exact flux derivative to $p$-th order:
\[
\frac{\hat{\mathbf{F}}_{i+1/2,j,k} - \hat{\mathbf{F}}_{i-1/2,j,k}}{\Delta x} = \left.\frac{\partial \mathbf{F}}{\partial x}\right|_{(x_i, y_j, z_k)} + \mathcal{O}(\Delta x^p), \tag{4}
\]
and similarly for the $y$ and $z$ fluxes. As a result, the overall finite difference scheme in Eq. (3) approximates the original conservation law in Eq. (1) with spatial accuracy of order $p$. The temporal part of Eq. (3) can be discretized by a method-of-lines approach with a Runge-Kutta time discretization gottlieb_strong_2011 ().
To determine the numerical fluxes in Eq. (3), let us consider first the pointwise $x$-flux, $\mathbf{F}$, as the 1D cell average of an auxiliary function, $h$, in the $x$-direction. If we also define another function, $\mathcal{H}(x) = \int_{x_0}^{x} h(\xi)\, d\xi$, we can write $\mathbf{F}$ as
\[
\mathbf{F}(x) = \frac{1}{\Delta x} \int_{x - \Delta x/2}^{x + \Delta x/2} h(\xi)\, d\xi = \frac{\mathcal{H}(x + \Delta x/2) - \mathcal{H}(x - \Delta x/2)}{\Delta x}. \tag{5}
\]
Differentiating Eq. (5) with respect to $x$ gives
\[
\frac{\partial \mathbf{F}}{\partial x}(x) = \frac{h(x + \Delta x/2) - h(x - \Delta x/2)}{\Delta x},
\]
and comparing with Eq. (4) we can identify $h$ as the analytic flux function we wish to approximate with the numerical flux $\hat{\mathbf{F}}$. This can be repeated in a similar fashion for the $y$ and $z$ fluxes, $\mathbf{G}$ and $\mathbf{H}$. The goal is then to form a high order approximation to the integrand quantities, knowing that the mathematically cell-averaged integral quantities in Eq. (5) are the physically pointwise fluxes, up to some design accuracy of order $p$, e.g.,
\[
\hat{\mathbf{F}}_{i+1/2,j,k} = h(x_{i+1/2}, y_j, z_k) + \mathcal{O}(\Delta x^p). \tag{6}
\]
Note that this is exactly the same as the 1D FVM reconstruction procedure of computing high-order accurate Riemann states at cell interfaces given the integral volume-averaged quantities at cell centers.
In the high order finite difference method originally put forward by Shu and Osher shu1989 (), the problem of approximating the numerical fluxes is accomplished by directly reconstructing the face-centered numerical fluxes $\hat{\mathbf{F}}_{i+1/2,j,k}$ from the cell-centered fluxes $\mathbf{F}_{i,j,k}$ on a stencil of cells surrounding the interface. That is, in complete analogy to reconstruction in the context of finite volume schemes, a high order accurate procedure reconstructs face-centered values from cell-averaged ones. Such flux-based finite difference methods (or FD-Flux in short), as just described, are easily implemented using the same reconstruction procedures as in 1D finite volume codes and provide high order of convergence on multidimensional problems. For this reason, they have been widely adopted mignone_high-order_2010 (); shu_high-order_2003 (); jiang1996efficient (). One pitfall of this approach is that proper upwinding is typically achieved by splitting the fluxes into parts moving towards and away from the interface of interest using the global Lax-Friedrichs splitting mignone_high-order_2010 (); shu_high-order_2003 (), at the cost of introducing significant diffusion to the scheme.
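As a side note, the global Lax-Friedrichs splitting can be illustrated in a few lines. The sketch below is our own minimal Python illustration (not code from the cited works): it splits the 1D Burgers flux into right- and left-moving parts using the global maximum wave speed and checks the monotonicity property that makes upwind-biased reconstruction of each part stable.

```python
import numpy as np

def lax_friedrichs_split(u, flux, alpha):
    """Global Lax-Friedrichs flux splitting: f = f^+ + f^-,
    with d(f^+)/du >= 0 and d(f^-)/du <= 0 when alpha >= max|f'(u)|."""
    f = flux(u)
    return 0.5 * (f + alpha * u), 0.5 * (f - alpha * u)

u = np.linspace(-1.0, 1.0, 201)
burgers = lambda v: 0.5 * v ** 2
alpha = np.max(np.abs(u))          # global max wave speed, |f'(u)| = |u|
fp, fm = lax_friedrichs_split(u, burgers, alpha)

# f^+ is non-decreasing and f^- is non-increasing in u (upwind property),
# and the two parts sum back to the original flux.
assert np.all(np.diff(fp) >= 0) and np.all(np.diff(fm) <= 0)
```

In a full FD-Flux scheme, each split flux would then be reconstructed at the interface with its own one-sided high-order stencil; the splitting guarantees a fixed upwind direction for each part, at the cost of the extra dissipation proportional to `alpha` noted above.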
On the other hand, it can be readily seen from Eq. (5) that the naive use of the interface value of the flux, $\mathbf{F}_{i+1/2,j,k}$, as the numerical flux can provide at most a second order approximation and should be avoided in designing a high-order FDM, no matter how accurately it is computed, since
\[
\mathbf{F}(x_{i+1/2}) = h(x_{i+1/2}) + \frac{\Delta x^2}{24}\, h''(x_{i+1/2}) + \mathcal{O}(\Delta x^4).
\]
Alternatively, in an approach originally proposed by Del Zanna del_zanna_echo:_2007 (); del_zanna_efficient_2003 (), upwinding is provided by solving a Riemann problem at each of the face centers, $(x_{i+1/2}, y_j, z_k)$, $(x_i, y_{j+1/2}, z_k)$, and $(x_i, y_j, z_{k+1/2})$, for the corresponding face-normal fluxes $\mathbf{F}$, $\mathbf{G}$, and $\mathbf{H}$. The numerical flux is then viewed as being the face-center flux from the Riemann problem, i.e., at the cell interface $x_{i+1/2}$,
\[
f_{i+1/2} = \mathrm{RP}\!\left(\mathbf{V}^L_{i+1/2}, \mathbf{V}^R_{i+1/2}\right), \tag{9}
\]
plus a series of high order corrections using the spatial derivatives of the flux evaluated at the face-center,
\[
\hat{\mathbf{F}}_{i+1/2,j,k} = f_{i+1/2} + a\, \Delta x^2 f^{(2x)}_{i+1/2} + b\, \Delta x^4 f^{(4x)}_{i+1/2}, \tag{10}
\]
where parenthesized superscripts denote numerical derivatives in the corresponding dimension, and where $a$ and $b$ are constants chosen so Eq. (6) holds up to the desired order of accuracy, e.g.,
\[
\hat{\mathbf{F}}_{i+1/2,j,k} = h(x_{i+1/2}, y_j, z_k) + \mathcal{O}(\Delta x^6). \tag{11}
\]
For this choice of corrections only the terms up to the fourth derivative in Eq. (10) need to be retained. The constants $a$ and $b$ are determined by Taylor expanding the terms in Eq. (10) and enforcing the condition in Eq. (11). For this reason, the values of $a$ and $b$ depend on the stencil geometry used to approximate the derivatives. Del Zanna del_zanna_echo:_2007 () used the Riemann fluxes at neighboring face-centers to calculate the derivatives and found $a = -1/24$ and $b = 3/640$. A disadvantage of this choice is that it requires additional guard cells on which to solve the Riemann problem in order to compute the high order correction terms near the domain boundaries. Chen et al. chen_5th-order_2016 () converted the cell-centered conservative variables to cell-centered fluxes around the face-center of interest to compute the flux derivatives, which leads to a different pair of constants. While this approach does not require additional guard cells, the conversion from conservative (or primitive) variables to flux variables needs to be performed at each grid point in the domain, incurring additional computational cost. For this reason, we adopt the face-center flux approach of Del Zanna. For example, the second and fourth derivatives of the $x$-flux are then given by the finite difference formulas
\[
f^{(2x)}_{i+1/2} = \frac{f_{i-1/2} - 2 f_{i+1/2} + f_{i+3/2}}{\Delta x^2}, \qquad
f^{(4x)}_{i+1/2} = \frac{f_{i-3/2} - 4 f_{i-1/2} + 6 f_{i+1/2} - 4 f_{i+3/2} + f_{i+5/2}}{\Delta x^4}. \tag{12}
\]
The derivatives of the $y$-fluxes and $z$-fluxes are given in the same way. These correction terms were originally derived in the context of compact finite difference interpolation lele1992compact (); deng2000developing (); zhang2008development (). For instance, the explicit formula in lele1992compact () for the first derivative approximation using six neighboring interface fluxes reduces to the high order correction formula in Eqs. (10)–(12) (see also the Appendix in del_zanna_echo:_2007 ()).
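The effect of the correction terms can be checked numerically. The following Python sketch (an illustration of our own, using the face-centered stencil choice with $a = -1/24$ and $b = 3/640$) builds the corrected numerical flux for $f(x) = \sin(x)$ and verifies that its divided difference converges to $f'(x)$ at roughly sixth order, i.e., the error drops by about $2^6 = 64$ when the resolution is doubled.

```python
import numpy as np

def flux_derivative_error(N):
    """Max error of (h[i+1/2] - h[i-1/2])/dx vs f'(x_i) for f = sin."""
    dx = 2.0 * np.pi / N
    faces = dx * np.arange(-3, N + 4)        # face coordinates with guards
    f = np.sin(faces)                        # pointwise face-centered fluxes
    a, b = -1.0 / 24.0, 3.0 / 640.0
    j = np.arange(2, len(f) - 2)             # faces with full FD stencils
    # corrected numerical flux: f + a*D2(f) + b*D4(f), undivided differences
    h = (f[j]
         + a * (f[j - 1] - 2 * f[j] + f[j + 1])
         + b * (f[j - 2] - 4 * f[j - 1] + 6 * f[j] - 4 * f[j + 1] + f[j + 2]))
    centers = faces[j][:-1] + 0.5 * dx       # cell centers between faces
    dfdx = (h[1:] - h[:-1]) / dx             # conservative divided difference
    return np.max(np.abs(dfdx - np.cos(centers)))

e1, e2 = flux_derivative_error(16), flux_derivative_error(32)
ratio = e1 / e2                              # ~ 2^6 = 64 for 6th order
```

Without the correction terms (`a = b = 0`) the same divided difference would be only second-order accurate, as noted around Eq. (5) above.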
In summary, the finite difference method described in this paper consists of the following steps:
Pointwise values of either the primitive or conservative variables are given on a uniform grid at time $t = t^n$.
The Riemann states $\mathbf{V}^L$ and $\mathbf{V}^R$ as given in Eq. (9) at the face-centers between grid points are interpolated from pointwise cell-centered values. These interpolations should be carried out in a non-oscillatory way with the desired spatial accuracy.
The face-center normal fluxes are calculated from the Riemann problem in Eq. (9) at the halfway points between grid points.
The second and fourth derivatives of the fluxes are calculated by following Eq. (12) using the Riemann fluxes from the previous step. In principle these should be carried out in a non-oscillatory fashion using a nonlinear slope limiter as was done in del_zanna_echo:_2007 (). However, it was pointed out in chen_5th-order_2016 () that the flux-derivatives’ contribution to the numerical fluxes in Eq. (10) is relatively small and will produce only small oscillations near discontinuities. For this reason, the derivatives are finite differenced without any limiters in our implementation. The numerical fluxes are constructed as in Eq. (10).
The conservative variables can then be updated using a standard SSP-RK method gottlieb_strong_2011 ().
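To make the procedure concrete, the steps above can be sketched end-to-end in the simplest possible setting: 1D linear advection on a periodic grid, where the exact upwind flux plays the role of the Riemann solver and a centered GP stencil (left-biased with respect to the face for $a > 0$) provides the interpolation. Everything below is our own illustrative Python; the GP length scale $\ell = 3\Delta x$, the grid size, and the CFL number are arbitrary choices, and the correction constants correspond to the face-centered derivative stencils.

```python
import numpy as np

R = 2                    # GP radius: 5-point stencil
ELL_OVER_DX = 3.0        # GP length scale hyperparameter (our choice)

def se_kernel(x, y, ell):
    return np.exp(-0.5 * (x - y) ** 2 / ell ** 2)

def gp_weights():
    """z^T = k_*^T K^{-1} for predicting at x_{i+1/2} from x_{i-R..i+R}."""
    pts = np.arange(-R, R + 1, dtype=float)          # in units of dx
    K = se_kernel(pts[:, None], pts[None, :], ELL_OVER_DX)
    return np.linalg.solve(K, se_kernel(pts, 0.5, ELL_OVER_DX))

def rhs(u, dx, a, z):
    # Step 2: interpolate u at faces x_{i+1/2} (left-biased for a > 0)
    uf = sum(z[m + R] * np.roll(u, -m) for m in range(-R, R + 1))
    # Step 3: the exact "Riemann solve" for linear advection is upwind
    f = a * uf
    # Step 4: high-order corrections from neighboring face fluxes
    f2 = np.roll(f, 1) - 2 * f + np.roll(f, -1)
    f4 = (np.roll(f, 2) - 4 * np.roll(f, 1) + 6 * f
          - 4 * np.roll(f, -1) + np.roll(f, -2))
    fhat = f - f2 / 24.0 + 3.0 * f4 / 640.0
    # conservative divided difference (fhat[i] lives at x_{i+1/2})
    return -(fhat - np.roll(fhat, 1)) / dx

# Step 1: pointwise initial data on a uniform periodic grid
N, a, T = 32, 1.0, 2.0 * np.pi
dx = 2.0 * np.pi / N
x = dx * np.arange(N)
u = np.sin(x)
z = gp_weights()
steps = int(np.ceil(T / (0.4 * dx / a)))
dt = T / steps
for _ in range(steps):
    # Step 5: SSP-RK3 update
    u1 = u + dt * rhs(u, dx, a, z)
    u2 = 0.75 * u + 0.25 * (u1 + dt * rhs(u1, dx, a, z))
    u = u / 3.0 + 2.0 / 3.0 * (u2 + dt * rhs(u2, dx, a, z))

err = np.max(np.abs(u - np.sin(x)))   # exact solution returns to sin(x)
```

After one advection period the sine wave should return to its initial state to within the (small) truncation error of the scheme; the telescoping flux differences also keep the discrete mean of `u` conserved to roundoff.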
So far, we have not yet described what type of spatial interpolation method is to be used in Step 2 to compute high-order Riemann states at each cell interface. Typically, non-oscillatory high-order accurate local polynomial schemes are adopted such as MP5 suresh_accurate_1997 () in chen_5th-order_2016 () or WENO-JS jiang1996efficient () in del_zanna_echo:_2007 (). In the next section, we will introduce our high-order polynomial-free interpolation scheme, based on Gaussian Process Modeling.
3 Gaussian Process Modeling
In this section, we briefly outline the statistical theory underlying the construction of GP-based Bayesian prior and posterior distributions (see Section 3.1). Interested readers are encouraged to refer to our previous paper reyes_new_2016 () for a more detailed discussion in the context of applying GP for the purpose of achieving high-order algorithms for FVM schemes. For a more general discussion of GP theory see rasmussen2005 ().
3.1 GP Interpolation
GP is a class of stochastic processes, i.e., processes that sample functions (rather than points) from an infinite-dimensional function space. The distribution over the space of functions is specified by the prior mean and covariance functions, which give rise to the GP prior:
a mean function $\bar{f}(x) = \mathbb{E}[f(x)]$, and
a covariance (GP kernel) function, $K(x, y) = \mathbb{E}\left[(f(x) - \bar{f}(x))(f(y) - \bar{f}(y))\right]$, which is a symmetric and positive-definite integral kernel.
The GP approach to interpolation is to regard the values of the function at a series of points as samples from a function that is only known probabilistically in terms of the prior GP distribution, and to form a posterior distribution on that function conditioned on the observed values. One frequently refers to the observed values as training points, and to the passage from prior distribution to posterior predictive distribution as a training process. We may use the trained distribution to predict the behavior of functions at a new point $x_*$. For example, in the current context, the fluid variables are assumed to be sample functions from GP distributions with prescribed mean and covariance functions, written as $f \sim \mathcal{GP}(\bar{f}, K)$. We then train the GP on the known values of the fluid variables at the cell centers, $x_i$, to predict the Riemann states at cell-face centers, e.g., at $x_* = x_{i \pm 1/2}$.
The mean function is often taken to be constant. We have found that a choice of zero mean, $\bar{f} = 0$, works well, and we adopt this choice in this paper. For the kernel function we will use the “Squared Exponential” (SE) kernel,
\[
K(x, y) = \Sigma^2 \exp\left[-\frac{(x - y)^2}{2 \ell^2}\right]. \tag{13}
\]
For other choices of kernel functions and the related discussion in the context of designing high-order approximations for numerical PDEs, readers are referred to reyes_new_2016 ().
The SE kernel has two free parameters, $\Sigma$ and $\ell$, called hyperparameters. We will see below that $\Sigma$ plays no role in the calculations that are presented here, and may as well be chosen to be $\Sigma = 1$. However, $\ell$ is a length scale that controls the characteristic scale on which the GP sample functions vary. As was demonstrated in reyes_new_2016 (), $\ell$ plays a critical role in the solution accuracy of a GP interpolation/reconstruction scheme and ideally should match the length scales of the problem that are to be resolved.
Formally, a GP is a collection of random variables, any finite collection of which has a joint Gaussian distribution bishop2007pattern (); rasmussen2005 (). We consider the function values at $N$ points $x_i$, $i = 1, \ldots, N$, as our “training” points. Introducing the data vector $\mathbf{f}$ with components $f_i = f(x_i)$, the likelihood, $\mathcal{L}$, of $\mathbf{f}$ given a GP model (i.e., $\bar{f}$ and $K$) is given by
\[
\mathcal{L} = (2\pi)^{-N/2} \det(\mathbf{K})^{-1/2} \exp\left[-\frac{1}{2} (\mathbf{f} - \bar{\mathbf{f}})^T \mathbf{K}^{-1} (\mathbf{f} - \bar{\mathbf{f}})\right], \tag{14}
\]
where $[\mathbf{K}]_{ij} = K(x_i, x_j)$ and $[\bar{\mathbf{f}}]_i = \bar{f}(x_i)$. The likelihood is a measure of how compatible the data $\mathbf{f}$ is with the GP model specified by the mean $\bar{f}$ and the covariance $K$.
Given the function value samples $\mathbf{f}$, GP theory furnishes the posterior predictive distribution over the value of the unknown function at any new point $x_*$. The mean of this distribution is the posterior mean function,
\[
\tilde{f}(x_*) = \bar{f}(x_*) + \mathbf{k}_*^T \mathbf{K}^{-1} (\mathbf{f} - \bar{\mathbf{f}}), \tag{15}
\]
where $[\mathbf{k}_*]_i = K(x_*, x_i)$. Taking a zero mean GP, $\bar{f} = 0$, Eq. (15) reduces to
\[
\tilde{f}(x_*) = \mathbf{k}_*^T \mathbf{K}^{-1} \mathbf{f}. \tag{16}
\]
According to Eqs. (15) and (16), the GP posterior mean is a linear function of the data $\mathbf{f}$, with a weight vector specified entirely by the choice of covariance kernel function, the stencil points, and the prediction point. We take this posterior mean of the distribution in Eq. (16) as the interpolation of the function at the point $x_*$, where $f$ is any one of the fluid variables in primitive, conservative, or characteristic form. Note that had we retained the multiplicative scale factor $\Sigma^2$ as a model hyperparameter, it would have canceled out in Eq. (16), since it multiplies both $\mathbf{k}_*$ and $\mathbf{K}$. This justifies our choice of $\Sigma = 1$.
3.2 GP Interpolation for FD-Prim
Hereafter, we restrict ourselves to describing our new multidimensional GP high-order interpolation scheme in the framework of FD-Prim, which only requires us to consider 1D data interpolations, as is typically done in the dimension-by-dimension approach for FDM. The notation and the relevant discussion will therefore be formulated in 1D.
We wish to interpolate the Riemann states from the pointwise cell centered values. We consider a (fluid) variable $q$ on a 1D stencil of $R$ points on either side of the central point $x_i$ of the $i$-th cell, and write
\[
S = \{x_{i-R}, \ldots, x_{i-1}, x_i, x_{i+1}, \ldots, x_{i+R}\}. \tag{17}
\]
We seek a high-order interpolation of $q$ at the interface locations $x_{i \pm 1/2}$,
\[
q_{i \pm 1/2} = \tilde{q}(x_{i \pm 1/2}), \tag{18}
\]
where $\tilde{q}$ is the GP interpolation given in Eq. (16). We define the data vector $\mathbf{q}$ on $S$ by
\[
\mathbf{q} = (q_{i-R}, \ldots, q_i, \ldots, q_{i+R})^T, \tag{19}
\]
and we define a vector of weights $\mathbf{z}^T = \mathbf{k}_*^T \mathbf{K}^{-1}$, so that the interpolation in Eq. (16) can be cast as a product between the (row) vector of weights and the data $\mathbf{q}$,
\[
q_{i \pm 1/2} = \mathbf{z}^T \mathbf{q}. \tag{20}
\]
Note here that $\mathbf{K}$ is a covariance kernel matrix of size $(2R+1) \times (2R+1)$ whose entries are defined by
\[
[\mathbf{K}]_{mn} = K(x_m, x_n), \quad m, n = i-R, \ldots, i+R, \tag{21}
\]
and $\mathbf{k}_*$ is a vector of length $2R+1$ whose entries are defined by
\[
[\mathbf{k}_*]_m = K(x_{i \pm 1/2}, x_m), \quad m = i-R, \ldots, i+R. \tag{22}
\]
The weights $\mathbf{z}$ are independent of the data and depend only on the locations of the data points $x_m$ and the interpolation point $x_{i \pm 1/2}$. Therefore, for cases where the grid configurations are known in advance, the weights can be computed and stored a priori for use during the simulation.
3.3 Handling Discontinuities: GP-WENO
The above GP interpolation procedure works well for smooth flows without any additional modifications. For non-smooth flows, however, it requires some form of limiting to avoid numerical oscillations at discontinuities that can lead to numerical instability. To this end, we adopt the approach of the Weighted Essentially Non-Oscillatory (WENO) methods jiang1996efficient (), where the effective stencil size is adaptively changed to avoid interpolating through a discontinuity, while retaining high-order convergence in smooth regions. In the work by Del Zanna et al. del_zanna_echo:_2007 (), a high-order Riemann state is constructed by considering the conventional WENO weighted combination of interpolations from a set of candidate sub-stencils. The weights are chosen based on $L_2$-norms of the derivatives of polynomial reconstructions on each local stencil (e.g., the sub-stencils $S_k$ below) in such a way that a weight is close to zero when the corresponding stencil contains a discontinuity, while the weights are optimal in smooth regions in the sense that they reduce to the interpolation over a global†††The term global here is to be understood in the sense that the desired order of accuracy, e.g., 5th-order in WENO-JS, is to be optimally achieved on this “larger” or “global” stencil, rather than on the entire computational domain. stencil (e.g., the global stencil $S$ below).
For the proposed GP-WENO scheme, we introduce a new GP smoothness indicator inspired by the probabilistic interpretation of GP, replacing the standard $L_2$-norm-based formulations of WENO. The GP-WENO scheme is fully specified by combining the linear GP interpolation in Section 3.2 with the nonlinear local GP smoothness indicators introduced in this section.
We begin by considering the global stencil, $S$, in Eq. (17) with $2R+1$ points centered at the cell $x_i$, and the $R+1$ candidate sub-stencils, $S_k$, each with $R+1$ points,
\[
S_k = \{x_{i+k-R}, \ldots, x_{i+k}\}, \quad k = 0, \ldots, R. \tag{23}
\]
Eq. (20) can then be evaluated to give a GP interpolation at the location $x_{i \pm 1/2}$ from the $k$-th candidate stencil $S_k$,
\[
q^k_{i \pm 1/2} = \mathbf{z}_k^T \mathbf{q}_k, \tag{24}
\]
where $\mathbf{z}_k$ and $\mathbf{q}_k$ are the weight and data vectors restricted to $S_k$.
We now take the weighted combination of these candidate GP approximations as the final interpolated value,
\[
q_{i \pm 1/2} = \sum_{k=0}^{R} \omega_k\, q^k_{i \pm 1/2}. \tag{25}
\]
As in the traditional WENO approach, the nonlinear weights, $\omega_k$, should reduce to some optimal weights $\gamma_k$ in smooth regions, so that the approximation in Eq. (25) reduces to the GP approximation (Eq. (20)) over the global $(2R+1)$-point stencil $S$. The $\gamma_k$'s then should satisfy
\[
\sum_{k=0}^{R} \gamma_k\, \tilde{\mathbf{z}}_k = \mathbf{z}, \tag{26}
\]
where $\tilde{\mathbf{z}}_k$ denotes $\mathbf{z}_k$ extended to length $2R+1$ by zeros according to the position of $S_k$ within $S$.
We then seek the $\gamma_k$'s as the solution to the overdetermined system
\[
\mathbf{M} \boldsymbol{\gamma} = \mathbf{z}, \tag{27}
\]
where the $k$-th column of $\mathbf{M}$ is given by $\mathbf{z}_k$ for the row entries aligned with $S_k$ and zeros for the rest:
\[
[\mathbf{M}]_{mk} = [\tilde{\mathbf{z}}_k]_m, \quad m = 1, \ldots, 2R+1, \quad k = 0, \ldots, R, \tag{28}
\]
where $\tilde{\mathbf{z}}_k$ is the zero-padded sub-stencil weight vector defined above. For example, in the case of $R = 2$ the above system reduces to the overdetermined system
\[
\begin{aligned}
\gamma_0 [\mathbf{z}_0]_1 &= [\mathbf{z}]_1, \\
\gamma_0 [\mathbf{z}_0]_2 + \gamma_1 [\mathbf{z}_1]_1 &= [\mathbf{z}]_2, \\
\gamma_0 [\mathbf{z}_0]_3 + \gamma_1 [\mathbf{z}_1]_2 + \gamma_2 [\mathbf{z}_2]_1 &= [\mathbf{z}]_3, \\
\gamma_1 [\mathbf{z}_1]_3 + \gamma_2 [\mathbf{z}_2]_2 &= [\mathbf{z}]_4, \\
\gamma_2 [\mathbf{z}_2]_3 &= [\mathbf{z}]_5,
\end{aligned} \tag{29}
\]
or in matrix form, $\mathbf{M} \boldsymbol{\gamma} = \mathbf{z}$,
\[
\begin{pmatrix}
[\mathbf{z}_0]_1 & 0 & 0 \\
[\mathbf{z}_0]_2 & [\mathbf{z}_1]_1 & 0 \\
[\mathbf{z}_0]_3 & [\mathbf{z}_1]_2 & [\mathbf{z}_2]_1 \\
0 & [\mathbf{z}_1]_3 & [\mathbf{z}_2]_2 \\
0 & 0 & [\mathbf{z}_2]_3
\end{pmatrix}
\begin{pmatrix} \gamma_0 \\ \gamma_1 \\ \gamma_2 \end{pmatrix}
=
\begin{pmatrix} [\mathbf{z}]_1 \\ [\mathbf{z}]_2 \\ [\mathbf{z}]_3 \\ [\mathbf{z}]_4 \\ [\mathbf{z}]_5 \end{pmatrix}. \tag{30}
\]
The optimal weights, $\gamma_k$, then depend only on the choice of kernel (Eq. (13)) and the stencil $S$, and as with the weights $\mathbf{z}$ and $\mathbf{z}_k$, the $\gamma_k$'s need only be computed once and used throughout the simulation. We take $\boldsymbol{\gamma}$ as the least squares solution to Eq. (29), which can be determined numerically.
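The least squares solve for the optimal weights can be sketched as follows (Python, our own illustration with $R = 2$ and $\ell = 3$ in units of the grid spacing; `numpy.linalg.lstsq` plays the role of the numerical least squares solver):

```python
import numpy as np

def se_kernel(x, y, ell):
    return np.exp(-0.5 * (x - y) ** 2 / ell ** 2)

def weights(pts, xstar, ell):
    """GP prediction weights z^T = k_*^T K^{-1} on the given points."""
    K = se_kernel(pts[:, None], pts[None, :], ell)
    return np.linalg.solve(K, se_kernel(pts, xstar, ell))

R, ell, xstar = 2, 3.0, 0.5                     # units of dx
pts = np.arange(-R, R + 1, dtype=float)
z = weights(pts, xstar, ell)                    # global 5-point weights

# Embed the three 3-point sub-stencil weight vectors into a 5 x 3 matrix M
M = np.zeros((2 * R + 1, R + 1))
for k in range(R + 1):
    M[k:k + R + 1, k] = weights(pts[k:k + R + 1], xstar, ell)

gamma, *_ = np.linalg.lstsq(M, z, rcond=None)   # optimal weights gamma_k
residual = M @ gamma - z
```

Because the system is overdetermined, the residual is generally nonzero, but the least squares solution is characterized by the normal equations (the residual is orthogonal to the columns of `M`), and the resulting `gamma` is computed once per grid geometry, alongside the weight vectors themselves.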
All that remains to complete GP-WENO is to specify the nonlinear weights $\omega_k$ in Eq. (25). These should reduce to the optimal weights in smooth regions, and more importantly, they need to serve as an indicator of the continuity of the data on the candidate stencil $S_k$, becoming small when there is a strong discontinuity on $S_k$. We first adopt the weighting scheme of the WENO-JS schemes jiang1996efficient (),
\[
\omega_k = \frac{\alpha_k}{\sum_{s=0}^{R} \alpha_s}, \qquad \alpha_k = \frac{\gamma_k}{(\beta_k + \varepsilon)^p}, \tag{31}
\]
where in our tests we have set $p = 2$ and taken $\varepsilon$ to be a small positive number that prevents division by zero. The quantity $\beta_k$ is the so-called smoothness indicator of the data on the stencil $S_k$. In WENO schemes the smoothness indicators are taken as the scaled sum of the square $L_2$ norms of all the derivatives of the local reconstruction polynomials over the cell where the interpolating points are located.
In our GP formulation, however, there is no polynomial to use for $\beta_k$, and hence a non-polynomial approach is required. GP theory furnishes the concept of the data likelihood function, which measures how likely the data is to have been sampled from the chosen GP distribution. The likelihood function is very well-adapted to detecting departures from smoothness, because the SE kernel (Eq. (13)) is a covariance over the space of smooth ($C^\infty$) functions rasmussen2005 (), so that non-smooth functions are naturally assigned smaller likelihood by the model. As in reyes_new_2016 () we construct the smoothness indicators within the GP framework from the negative log of the GP likelihood in Eq. (14),
\[
\beta_k \propto -\ln \mathcal{L}_k = \frac{R+1}{2} \ln(2\pi) + \frac{1}{2} \ln \det \mathbf{K} + \frac{1}{2} (\mathbf{q}_k - \bar{\mathbf{q}})^T \mathbf{K}^{-1} (\mathbf{q}_k - \bar{\mathbf{q}}), \tag{34}
\]
whose data-dependent part is non-negative. The three terms on the right hand side of Eq. (34) can be identified as a normalization, a complexity penalty, and a data fit term, respectively bishop2007pattern (); rasmussen2005 (). The GP covariance matrices, $\mathbf{K}$, on each of the sub-stencils are identical here in the uniform grid geometry, causing the first two terms in Eq. (34) (the normalization and complexity penalty terms) to be the same on each candidate stencil regardless of the data. For this reason, we use only the data fit term in our GP smoothness indicators. With the choice of zero mean the GP-based smoothness indicator becomes
\[
\beta_k = \mathbf{q}_k^T \mathbf{K}^{-1} \mathbf{q}_k, \tag{35}
\]
where $\mathbf{q}_k$ is the data vector on $S_k$. Using the eigen-decomposition of the symmetric positive-definite matrix $\mathbf{K} = \sum_j \lambda_j \mathbf{v}_j \mathbf{v}_j^T$, this can be written as a sum of perfect squares,
\[
\beta_k = \sum_{j=1}^{R+1} \left( \frac{\mathbf{v}_j^T \mathbf{q}_k}{\sqrt{\lambda_j}} \right)^2. \tag{36}
\]
Let us consider a case in which the data on is discontinuous, while the other sub-stencils () contain smooth data. The statistical interpretation of Eq. (35) is that the short length-scale variability in the data (i.e., the short shock width, ranging over a couple of grid spacings ) makes unlikely (i.e., of low probability) according to the smoothness of the model represented by , in which case becomes relatively larger than the other , . On the other hand, on a smooth sub-stencil , where the data is likely (i.e., of high probability), becomes relatively smaller than the other , .
As in the standard WENO schemes, the nonlinear GP-WENO interpolation relies on the “relative ratio” of each individual to the others. For this reason, the choice of for in Eq. (35) can also be justified, since any common factor cancels in these ratios.
We note that, with the use of a zero mean, does not reduce to zero on a sub-stencil where the data is a non-zero constant. In this case, the value of could be any non-zero value proportional to , which could be arbitrarily large depending on the constant value of . One way to guarantee in this case is to use a non-zero mean . In our numerical experiments, the use of a non-zero mean helps to reduce some under- and over-shoots adjacent to constant flow regions. However, away from such constant regions, the GP solution becomes more diffusive than with a zero mean function. Moreover, in some multidimensional problems with an assumed flow symmetry, the GP solutions with non-zero mean failed to preserve the symmetry during the course of the evolution. For this reason, we use a zero mean function in this paper, leaving further investigation of this issue to a future study.
As previously mentioned, for the uniform grids considered here the ’s are the same for every candidate stencil. Hence, like and , the combination need only be computed once before starting the simulation and can then be reused throughout.
It is worthwhile to note that our smoothness indicators in Eq. (36) can be written compactly as a sum of perfect squares, an advantage recently studied by Balsara et al. balsara_efficient_2016 () for their WENO-AO formulations. In addition, all eigenvalues of the symmetric, positive-definite matrix are positive, so that the smoothness indicators are always positive by construction.
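The sum-of-perfect-squares form of the indicator can be sketched as below. This is an illustrative reduction, not the paper's code: the function names, the 3-point stencil, and the unit length scale are our assumptions. The eigendecomposition of the (stencil-independent) SE covariance matrix is computed once, and the data-fit term is then a sum of squared, eigenvalue-scaled projections of the stencil data.

```python
import numpy as np

def se_covariance(x, ell):
    """SE-kernel covariance matrix over the stencil point locations x."""
    d = x[:, None] - x[None, :]
    return np.exp(-0.5 * (d / ell) ** 2)

def gp_beta(f, eigvals, eigvecs):
    """Data-fit term f^T K^{-1} f written as a sum of perfect squares."""
    proj = eigvecs.T @ f                       # project data onto eigenvectors
    return float(np.sum((proj / np.sqrt(eigvals)) ** 2))

# One-time setup: on a uniform grid K (hence its eigensystem) is the same on
# every sub-stencil, so it is computed once before the simulation starts.
x = np.array([-1.0, 0.0, 1.0])                 # hypothetical 3-point sub-stencil
lam, V = np.linalg.eigh(se_covariance(x, ell=1.0))
```

Because `K` is symmetric positive-definite, every `lam` is positive and the indicator is positive by construction; a step in the data projects strongly onto the small-eigenvalue (rough) modes and yields a larger indicator than a linear profile of comparable amplitude.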
3.4 The Length Hyperparameter
As mentioned in Section 3.1, the SE kernel in Eq. (13) used in this paper contains a length hyperparameter that controls the characteristic length scale of the GP model. Alternatively, the GP prediction given in Eq. (15), using the SE kernel, can be viewed as a linear smoother of the data over the length scale . For this reason, should be comparable to, if not larger than, the size of the given stencil, . We demonstrate in Section 5 how this length scale can be tuned to obtain better accuracy for smooth flows, as was also seen in reyes_new_2016 ().
It is important to note that the length hyperparameter serves a second, logically separate purpose when used to compute the smoothness indicators according to the GP likelihood in Eq. (35). In this application we are not using the GP model to smooth the data over the given sub-stencil but rather to determine whether there is a discontinuity in any of the candidate sub-stencils. In general, these two applications have different purposes and requirements. We therefore introduce a second length hyperparameter for determining the GP smoothness indicators in Eq. (35) so that we compute using instead of , thus treating separately from the “smoothing” length hyperparameter . This modification allows us to obtain greater stability in the presence of discontinuities by considering length scales comparable to the grid spacing , , based on the fact that the typical shock width in high-order shock capturing codes is of order . Viewed as a whole, the method essentially first attempts to detect discontinuities on the (shorter) scale of , and then smooths the data on the (larger) scale of .
We have found that using and works well on a wide range of problems, especially for the GP-WENO scheme. Additional stability for problems with strong shocks using larger stencil radii can be achieved by using lower values of , down to . Larger values of can also lead to marginal gains in stability. However, we observe that setting makes the GP-WENO scheme unstable regardless of the choice of the remaining parameters.
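The role of the length hyperparameter in the prediction step can be sketched as follows. This is a hedged illustration of the zero-mean GP prediction viewed as a linear smoother (Eq. (15)), not the paper's implementation: the names `se_cov` and `gp_prediction_weights`, the 5-point stencil, and the particular value of `ell` are our assumptions.

```python
import numpy as np

def se_cov(xa, xb, ell):
    """SE kernel evaluated between two sets of points."""
    d = xa[:, None] - xb[None, :]
    return np.exp(-0.5 * (d / ell) ** 2)

def gp_prediction_weights(x_stencil, x_star, ell):
    """Weights z such that the zero-mean GP prediction at x_star is z @ f."""
    K = se_cov(x_stencil, x_stencil, ell)          # covariance of stencil data
    k_star = se_cov(np.array([x_star]), x_stencil, ell)[0]
    return np.linalg.solve(K, k_star)              # z = K^{-1} k_*

dx = 1.0
stencil = dx * np.arange(-2.0, 3.0)                # 5-point stencil, radius R = 2
# "Smoothing" length scale larger than the stencil; a separate, shorter
# length scale (comparable to dx) would be used only in the indicators.
z_smooth = gp_prediction_weights(stencil, 0.5 * dx, ell=3.0 * dx)
```

Requesting the prediction at a stencil point returns a unit weight vector (the interpolation property), while an intermediate point blends the data over the scale `ell`; swapping in a much shorter `ell` reproduces the separate, discontinuity-sensitive length scale used for the indicators.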
4 GP-WENO Code Implementation and Distribution
The implementation of the GP-WENO FD-Prim scheme is parallelized using Coarray Fortran (CAF) garain_comparing_2015 (). CAF support for the GNU Fortran compiler is provided through the OpenCoarrays library eachempati_open-source_2010 (), and I/O is handled with the HDF5 library.
The source code is available at https://github.com/acreyes/GP-WENO_FD-Prim under a Creative Commons Attribution 4.0 International License.
5 Accuracy and Performance Results on Smooth Test Problems
In this section we assess the accuracy and performance of the proposed GP-WENO scheme on smooth advection problems in 1D and 2D. As demonstrated below, the high-order GP-WENO solutions converge at the expected rate in , and target solution errors are reached faster, in terms of CPU time, than with the 5th-order WENO-JS scheme that we chose as the polynomial-based counterpart algorithm for comparison. A suite of nonlinear discontinuous problems in 1D, 2D and 3D is presented in Section 6 to demonstrate the shock-capturing capability of the GP-WENO scheme.
For smooth-flow problems where there is no shock or discontinuity, treating differently from is not required because the associated nonlinear smoothness indicators all become equally small in magnitude and do not play an important role. We performed the smooth advection problems in Section 5 by setting to follow the same convention we use for all discontinuous problems in Section 6 (i.e., and ). Alternatively, one can set in all smooth flow problems, which does not qualitatively change the results reported in this section.
5.1 1D Smooth Gaussian Advection
The test presented here considers the passive advection of a Gaussian density profile in 1D. The problem is set up on a computational domain with periodic boundary conditions. The initial condition is given by a density profile of with , with constant velocity and pressure, and . The ratio of specific heats is chosen to be . The profile is propagated through the periodic boundaries for one full period, until , at which point it returns to its initial position. Any deformation of the initial profile is then solely due to phase errors and/or numerical diffusion of the algorithm under consideration, serving as a metric of the algorithm’s accuracy.
We perform this test for the GP-WENO method using values of , a length hyperparameter of (the choice of this value becomes apparent below), and . We employ RK4 for time integration, adjusting the time step to match the temporal and spatial accuracy of the scheme as the resolution is increased (e.g., see mignone_high-order_2010-1 ()). The results are summarized in Fig. 1 and Table 1. All three choices of demonstrate a convergence rate, as shown in reyes_new_2016 () for the same problem using the GP-WENO finite volume scheme reported therein.
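The convergence rates reported in Table 1 follow from the standard two-resolution formula; a minimal sketch (the function name and the factor-of-two refinement default are ours):

```python
import math

def observed_order(err_coarse, err_fine, refinement=2.0):
    """Observed order of accuracy from errors at two successive resolutions."""
    return math.log(err_coarse / err_fine) / math.log(refinement)
```

For instance, a 5th-order scheme should cut the error by a factor of about 32 each time the grid spacing is halved.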
The length hyperparameter provides an additional knob for tuning the solution accuracy. Fig. 2 shows how the error changes with the choice of on the 1D Gaussian advection problem at different resolutions, using the 5th-order GP-R2 scheme compared to the 5th-order WENO-JS scheme (denoted by dotted lines). The dependence of the errors on is qualitatively similar at all resolutions. At larger values of (e.g., ) the errors plateau and are generally higher than in the corresponding WENO-JS simulation. This can be explained by the nature of the GP’s kernel-based data prediction, in which larger values of result in the GP model under-fitting the data. On the other hand, all errors diverge at small (e.g., ) because the GP model over-fits the data (i.e., produces large oscillations between the data points). The errors of GP-WENO reach a local minimum at , roughly the full-width at half-maximum (FWHM) of the initial Gaussian density profile. In all cases this local minimum in the error is lower than the errors obtained with the WENO-JS scheme. This behavior is similar to that observed for radial basis function (RBF) methods in CFD bigoni_adaptive_2017 (), where the RBF shape parameter represents an inverse length scale, i.e., . Nonetheless, the connection between the optimal and the length scales of the problem has only been made in the context of Gaussian process interpolations/reconstructions in our recent work reyes_new_2016 (). This suggests that, to best resolve the smallest features attainable in a simulation at a given grid spacing , the choice of may be optimal.
5.2 2D Isentropic Vortex
Next, we test the accuracy of the GP-WENO schemes using the multidimensional nonlinear isentropic vortex problem, originally presented by Shu shu_essentially_1998 (). The problem consists of the advection of an isentropic vortex along the diagonal of a Cartesian computational box with periodic boundary conditions. We set up the problem as presented in spiegel_survey_2015 (), where the size of the domain is doubled compared to the original setup in shu_essentially_1998 () to prevent self-interaction of the vortex across the periodic domain. The problem is run for one period of advection through the domain, until the vortex returns to its initial position, where the solution accuracy can be measured against the initial condition.
Our error results are shown in Fig. 3 and summarized in Table 2, using , , and RK4 for time integration, once more reducing the time step appropriately to match the spatial and temporal accuracies. As with the 1D smooth advection in the previous section, the GP-WENO method exhibits a convergence rate.
We also repeat the test of the dependence of the errors on the length hyperparameter and show the results in Fig. 4. Similar to the 1D case in Fig. 2, there is a minimum in the error at higher resolutions around . The errors diverge at small values of and plateau at large . Shown as dotted lines in Fig. 4 are the errors for the WENO-JS interpolation. At all resolutions, the minimum of the error for the GP-WENO scheme is significantly smaller than the errors of the WENO-JS scheme. This can also be seen by comparing the GP-R2 column of Table 2 to the WENO-JS column of Table 3. Moreover, the order of convergence for WENO-JS is smaller than that of the GP-R2 method.
We attribute the significant improvement in errors for the GP-R2 scheme over the classical WENO-JS to the use of the GP-based smoothness indicators in Eq. (36), which seem to better suit the adaptive stencil process in WENO-type methods for nonlinear problems like the isentropic vortex test. It is known that the original weighting scheme of the WENO-JS method jiang1996efficient () suffers from reduced accuracy in the presence of inflection points, which lowers the scheme’s formal order of accuracy. This has been addressed with schemes such as the Mapped WENO henrick_mapped_2005 () or the WENO-Z borges2008improved () methods. All of these methods use the same smoothness indicators as WENO-JS and achieve improved behavior by modifying the way the nonlinear weights are formulated (see Eq. (33)). The GP-WENO method uses exactly the same weighting as the classical WENO-JS scheme, and the observed improvement originates from the new GP-based smoothness indicators.
This suggests that the GP-based smoothness indicators could also be applied in conventional polynomial-based WENO interpolations/reconstructions to achieve improved accuracy in smooth solutions. More specifically, a WENO polynomial interpolation is used for on the candidate stencils (Eq. (26)), while the GP-based smoothness indicators are adopted in Eq. (36). We refer to such a scheme as the WENO-GP weighting scheme. ‡‡‡On a static, uniform grid configuration, WENO-GP requires a one-time formulation and storage of the GP covariance matrix on a sub-stencil , followed by the computation of its eigenvalues and eigenvectors . The GP-based smoothness indicators are then computed using on each via Eq. (36), and can be applied to any polynomial-based WENO scheme. In Table 3, we compare errors for the WENO-JS and WENO-GP schemes. The latter outperforms the former, without changing the formulation of the weights .
In Fig. 5 we show the CPU efficiency of WENO-JS and of GP-WENO with , and , plotted as the error versus the CPU time for the isentropic vortex problem. The 5th-order GP-R2 and 7th-order GP-R3 schemes reach a target solution accuracy in less CPU time than the 5th-order WENO-JS scheme. The comparison is quantitatively summarized in Table 4.
6 Numerical Test Problems with Shocks and Discontinuities
In this section we present test problems using the GP-WENO interpolation method described in Section 3, applied to the compressible Euler equations in 1, 2, and 3D. The GP-WENO scheme with (or GP-R2) is chosen as the default method and is compared to the 5th order WENO method chen_5th-order_2016 (); jiang1996efficient (); del_zanna_echo:_2007 () that is nominally of the same order of accuracy. All interpolations are carried out on characteristic variables to minimize unphysical oscillations in the presence of discontinuities qiu_construction_2002 (). A 3rd-order TVD Runge-Kutta method (RK3) gottlieb_strong_2011 () for temporal integration and an HLLC li_hllc_2005 (); toro1994restoration () Riemann solver are used throughout, unless otherwise specified. The two hyperparameters and are chosen so that and .
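The 3rd-order TVD (strong-stability-preserving) Runge-Kutta scheme referenced above advances the solution through three forward-Euler stages combined convexly. A minimal sketch, where `L(u)` stands for whatever spatial operator the scheme supplies (here a hypothetical placeholder, not the paper's actual flux routine):

```python
def ssp_rk3_step(u, dt, L):
    """One TVD/SSP-RK3 (Shu-Osher) step: three Euler stages, convexly combined."""
    u1 = u + dt * L(u)                                # stage 1: forward Euler
    u2 = 0.75 * u + 0.25 * (u1 + dt * L(u1))          # stage 2: convex combination
    return u / 3.0 + (2.0 / 3.0) * (u2 + dt * L(u2))  # stage 3: convex combination
```

Because each stage is a convex combination of forward-Euler updates, the step inherits the stability (TVD) property of forward Euler at a suitably restricted time step.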
6.1 1D Shu-Osher Problem
The Shu-Osher problem shu1989 () is a compound test of a method’s accuracy and stability. The goal is to resolve small-scale features (i.e., high-frequency waves) in a post-shock region while concurrently capturing a Mach 3 shock in a stable and accurate fashion.
In this problem, a (nominally) Mach 3 shock wave propagates into a constant density field with sinusoidal perturbations. As the shock advances, two sets of density features develop behind it: one that has the same spatial frequency as the initial perturbation, and one that has twice that frequency and lies closer to the shock. The numerical method must correctly capture the dynamics and the amplitude of the oscillations behind the shock, and is compared against a reference solution obtained at much higher resolution.
The results for this problem are shown in Fig. 6(a) for the whole domain, with a close-up of the frequency-doubled oscillations in Fig. 6(b). We compare the GP-R2 method, with and , to the WENO-JS method. The GP-R2 scheme clearly captures the amplitude of the oscillations better than the 5th order WENO-JS. Again, the improvement over the WENO-JS scheme is attributed to the use of the GP smoothness indicators.
Fig. 7 shows the Shu-Osher problem for the 5th order GP-R2 and the 7th order GP-R3 schemes, for different values of and . Changing results in small changes in the amplitude of the oscillations, while varying has a more significant impact. Smaller values of result in more oscillations, while larger values better match the reference solution. From this parameter study we conclude that is a fairly robust choice for this shock tube problem. Further, we found that can be reduced closer to on higher resolution runs and in problems with stronger shocks.
In Fig. 8 we show the combination of the GP-R2 interpolation with the classical WENO smoothness indicators. The FWHM of the post-shock oscillations is times the grid spacing . Following the FWHM discussion in Section 5, this suggests that a choice of is optimal for GP-R2. This is confirmed in Fig. 8, where the solution in panel (a) with better captures the amplitude of the oscillations, compared to WENO-JS and to the GP-R2 solution with . Nevertheless, the default combination of GP-based smoothness indicators with GP-WENO yields much better results overall (Fig. 8(b)).
6.2 1D Two Blast Wave Interaction
This problem was introduced by Woodward and Colella woodward_numerical_1984 () and consists of two strong shocks interacting with one another.
We follow the original setup of the problem and compare the GP-R2 scheme with the 5th order WENO-JS scheme on a computational domain of , resolved with 128 grid points. We set and . Fig. 9 shows the density profiles for GP-R2 and WENO-JS at , compared against a highly resolved WENO-JS solution (2056 grid points). As shown in Fig. 9(a), both methods yield acceptable solutions. However, the close-ups in Fig. 9(b) reveal that the GP-R2 scheme better resolves the peaks and is closer to the reference solution.
6.3 2D Sedov
Next, we consider the Sedov blast test sedov1993similarity (). This problem examines the method’s ability to maintain the symmetry of the self-similar evolution of a spherical shock, generated by a high-pressure point source at the center of the domain. We follow the setup of fryxell2000flash ().
Fig. 10 shows density profiles along (i.e., the diagonal) and (i.e., the -axis) with GP-R2, at two different resolutions ( and ) and different choices of . All runs used a value of to perform a parameter scan on . The top two panels show the solutions obtained with WENO-JS. Again, as in the Shu-Osher problem, small values of introduce oscillations and lead to asymmetric shock evolution, whereas a choice of gives a good balance.
6.4 2D Mach 3 Wind Tunnel with a Step
The next 2D shock problem consists of a Mach 3 wind tunnel setup with a forward facing reflecting step, originally proposed by Woodward and Colella woodward_numerical_1984 (). We initialize the problem as in woodward_numerical_1984 () with an entropy fix for the region immediately above the corner of the step. After the bow shock reflects on the step, the shock progressively reaches the top reflecting wall of the tunnel at around . A triple point is formed due to the reflections and interactions of the shocks, from which a trail of vortices travels towards the right boundary.
Shown in Fig. 11 are the results computed using the GP-R2 and GP-R3 schemes, along with the WENO-JS solution on a grid at the final time . Using the HLLC Riemann solver, we noticed that the GP and WENO-JS schemes converged to different solutions, on account of the singularity at the corner of the step and despite the entropy fix. To compare the two schemes, we therefore ran our simulations using an HLL Riemann solver, for which the two schemes converged to similar solutions. Both methods are able to capture the main features of the flow, but the GP schemes produce better-developed Kelvin-Helmholtz roll-ups originating from the triple point.
6.5 2D Riemann Problem – Configuration 3
Next, we consider a specific Riemann Problem configuration that is drawn from a class of two dimensional Riemann problems that have been studied extensively zhang1990conjecture (); schulz-rinne_classification_1993 () and have been applied to code verification efforts buchmuller2014improved (); balsara2010 (); lax1998solution (); schulz-rinne_numerical_1993 (); don_hybrid_2016 (); lee2017piecewise (). Specifically, we look at the third configuration of the 2D Riemann problem presented in don_hybrid_2016 (); lee2017piecewise ().
Panels in Fig. 12 show density profiles at , along with 40 contour lines, for different choices of GP radii, on a grid resolution. All GP methods correctly capture the expected features of the problem. In this experiment, we see that the increase of results in a sharper representation of the flow features. Notably, the 5th-order GP-R2 solution in (c) captures significantly more features when compared to the 5th-order WENO-JS in (a), as evinced by the formation of more developed Kelvin-Helmholtz vortices along the slip lines (i.e., the interface boundaries between the green triangular regions and the sky blue areas surrounding the mushroom-shaped jet).
6.6 Double Mach Reflection
Our last 2D test is the double Mach reflection problem introduced by Woodward and Colella woodward_numerical_1984 (). This test problem consists of a strong Mach 10 shock that is incident on a reflecting wedge forming an angle with the plane of the shock. Fig. 13 shows density profiles from grid resolution runs, for a variety of GP stencil radii (), as well as for the GP adaptive order (AO) hybridization balsara_efficient_2016 () (detailed in A). We present these GP solutions together with a 5th-order WENO-JS solution for comparison. The reflection of the shock at the wall forms two Mach stems and two contact discontinuities. The contact discontinuity that emanates from the roll-up of the stronger shocks is known to be Kelvin-Helmholtz unstable provided there is sufficiently low numerical dissipation in the method. Thus, the problem quantifies the method’s numerical dissipation by the number of Kelvin-Helmholtz vortices present at the end of the run.
The presence of a highly supersonic shock is a stringent test for the stability of the algorithm. We find that the GP solutions remain stable for small values of . The choice of does not appear to considerably affect stability and thus can be set to relatively larger values than . All GP runs successfully reach for and on two different resolutions, in Fig. 13 and in Fig. 15.
Close-ups in Fig. 14 reveal that GP-WENO is significantly better at capturing the development of Kelvin-Helmholtz vortices in the Mach stem than the WENO-JS method. More specifically, the 5th order GP-R2 scheme is less dissipative than the WENO-JS scheme, which in turn is less dissipative than the 3rd order GP-R1 scheme. Both 7th order GP-R3 and 9th order GP-R4 schemes are able to resolve more vortices. While the GP-AO() scheme is less dissipative than the GP-R2, it does not capture as many features as the GP-R3 scheme. This is despite the fact that both GP-AO() and GP-R3 are of the same formal order of accuracy in smooth flows.
In Fig. 15 we provide results for double the grid resolution, . The ranking derived from the lower resolution solutions still holds and the reduced dissipation of the GP-R2 scheme over the 5th order WENO-JS is more evident. Further, the GP-R3 on the grid in Fig. 14(d) captures more vortices than WENO-JS on the grid in Fig. 15(a). Our results from Fig. 15 can be directly compared to one of the most recent finite difference WENO-AO solutions by Balsara et al. balsara_efficient_2016 () (see their Fig. 7).
6.7 3D Explosion
This 3D explosion test problem was introduced by Boscheri and Dumbser noauthor_direct_2014 () as a three-dimensional extension of the Sod problem sod1978survey (). The test is set up on a domain with outflow boundary conditions. The initial condition is given by
where and . The ratio of specific heats is and the simulation completes at .