A variable high-order shock-capturing finite difference method with GP-WENO
Abstract
We present a new finite difference shock-capturing scheme for hyperbolic equations on static uniform grids. The method provides selectable high-order accuracy by employing a kernel-based Gaussian Process (GP) data prediction method, an extension of the GP high-order method originally introduced in a finite volume framework by the same authors. The method interpolates Riemann states to high order, replacing the conventional polynomial interpolations with polynomial-free GP-based interpolations. For shocks and discontinuities, this GP interpolation scheme uses a nonlinear shock handling strategy similar to Weighted Essentially Non-oscillatory (WENO) methods; the novelty is that the nonlinear smoothness indicators are formulated in terms of the Gaussian likelihood of the local stencil data, replacing the conventional $L_2$-type smoothness indicators of the original WENO method. We demonstrate that these GP-based smoothness indicators play a key role in the new algorithm, providing significant improvements in delivering high (and selectable) order accuracy in smooth flows, while successfully delivering non-oscillatory solution behavior in discontinuous flows.
keywords:
Gaussian processes; GP-WENO; high-order methods; finite difference method; variable order; gas dynamics; shock-capturing

1 Introduction
High-order discrete methods for hyperbolic conservative equations comprise an important research area in computational fluid dynamics (CFD). The rapid growth in the development of high-order methods has been to a great extent driven by a radical change in the balance between computation and memory resources in modern high-performance computing (HPC) architectures. To efficiently use the computing resources of modern HPC machines, CFD algorithms need to adapt to hardware designs in which memory per compute core has become progressively more limited Attig2011 (); Dongarra2012future (); Subcommittee2014top (). A computationally efficient numerical algorithm should exploit the greater availability of processing power, while keeping memory use low. This goal can be achieved by discretizing the CFD governing equations to high order, thus providing the desired solution accuracy at a higher cost in processor power, but with a smaller memory requirement Hesthaven2007 (); LeVeque2002 (); leveque2007finite (). Of course, a practical consideration in designing such high-order numerical algorithms is that time-to-solution at a given grid resolution should not increase due to the additional floating point operations.
The most popular approach to designing high-order methods for shock capturing is based on implementing highly accurate approximations to partial differential equations (PDEs) using piecewise local polynomials. By and large, polynomial approaches fall into three categories: finite difference (FD) methods, finite volume (FV) methods, and discontinuous Galerkin (DG) methods. These three formulations all interpolate or reconstruct fluid states using Taylor series expansions, with accuracy controlled by the number of expansion terms retained in the interpolation or reconstruction. Below, we briefly summarize the aspects of those schemes that are most relevant to this paper.
The DG method, first proposed by Reed and Hill reed1973triangular () in 1973 for solving neutron transport problems, approximates a conservation law by first multiplying a given PDE by a test function and then integrating it in each cell to express the governing dynamics in integral form shu2009high (). The method approximates both the numerical solution and the test function by piecewise polynomials of chosen degree in each cell. These polynomials are permitted to be discontinuous at each cell interface, allowing flexibility in achieving high-order accuracy in smooth regions, while achieving non-oscillatory shock capturing at discontinuities. Solutions are typically integrated with a multi-stage Runge-Kutta (RK) method, in which case the scheme is referred to as RKDG. The advantages of RKDG are that it is well-adapted to complicated geometries cockburn2001runge (); it is easy to parallelize due to data locality beck2014high (); atak2015discontinuous (); it lends itself to GPU-friendly computing implementations klockner2009nodal (); it accommodates arbitrary adaptivity baccouch2015asymptotically (); cao2015superconvergence (); and it permits designs that preserve given structures in local approximation spaces cockburn2004locally (); li2005locally (). The weaknesses of the method include the fact that it is significantly more complicated in terms of algorithmic design; it potentially features less robust solution behaviors at strong shocks and discontinuities shu2009high (); and its stability limit on the timestep size becomes progressively more restrictive with increasing order of accuracy cockburn2001runge (); zhang2005analysis (); liu20082 (). For more discussion and an extended list of references, see also Cockburn and Shu cockburn1998runge (); cockburn2001runge (); Shu shu2009high (); shu2016high (); Balsara balsara_higherorder_2017 ().
The finite volume method (FVM) also uses the governing equations in integral form, making use of volume-averaged conservative variables. The discrete formulation of FVM provides a natural way of maintaining conservation laws toro2013riemann (). This inherently conservative property of the scheme makes FVM a very popular algorithmic choice for application problems where shocks and discontinuities are dominant. Historically, work on high-order FVM methods began with the seminal works by van Leer van1974towards (); van1977towards (); vanleer1979 (), which overcame the Godunov Barrier theorem godunov1959difference () to produce effective second-order accurate methods. More advanced numerical methods with solution accuracies higher than second order became available soon thereafter, including the piecewise parabolic method (PPM) colella1984piecewise (), the essentially non-oscillatory (ENO) schemes harten1987uniformly (); shu_efficient_1988 (), and the weighted ENO (WENO) schemes liu1994weighted (); jiang1996efficient (), which improved ENO by means of nonlinear weights. (Although the original WENO-JS scheme introduced in jiang1996efficient () is an FDM scheme, the key ideas of WENO-JS have been reformulated and studied in FVM by numerous practitioners.) Most of these early schemes focused on obtaining high-order accuracy in 1D, and their naive extension to multidimensional problems using a dimension-by-dimension approach buchmuller2014improved () resulted in a second-order solution accuracy bottleneck due to inaccuracies in both space and time associated with computing a face-averaged flux function shu2009high (); buchmuller2014improved (); zhang2011order (); mccorquodale2011high (). Zhang et al. zhang2011order () studied the effect of second-order versus high-order quadrature approximations combined with the 1D 5th-order accurate WENO reconstruction as a baseline spatial formulation.
They demonstrated that for nonlinear test problems in 2D, the simple and popular dimension-by-dimension approach only converges to second order, despite the fact that the order of accuracy of the baseline 1D algorithm (e.g., 5th-order accuracy for WENO) does in fact carry over to linear 2D problems. See also a recent work on the piecewise cubic method (PCM) proposed by Lee et al. lee2017piecewise (). The spatial aspect of the problem is addressable by using multiple quadrature points per cell face shu2009high (); titarev_finitevolume_2004 () or quadrature-free flux integration through freezing the Riemann fan along the cell interfaces dumbser2007quadrature (), while high-order temporal accuracy is achievable by using multi-stage Runge-Kutta (RK) methods zhang2011order (); buchmuller2014improved (); mccorquodale2011high () or single-step ADER-type formulations titarev2002ader (); toro2006derivative (); Titarev2005 (); Dumbser2008 (); balsara2009efficient (); dumbser2013ader (); balsara2013efficient (); qian_generalized_2014 ().
In the conservative FDM approach, one evolves pointwise quantities, thus avoiding the complexities of FVM associated with the need to deal with volume-averaged quantities. For this reason, FDM has also been a popular choice for obtaining high-order solution accuracy in hyperbolic PDE solutions, particularly when some of its well-known difficulties, such as with geometry, AMR, and free-streaming preservation, are not considerations. See, e.g., jiang1996efficient (); del_zanna_echo:_2007 (); mignone_highorder_2010 (); jiang_alternative_2013 (); chen_5thorder_2016 (); zhu2016new (). Traditional FDM formulations directly reconstruct high-order numerical fluxes from cell-centered point values shu1989 (); jiang1996efficient (); mignone_highorder_2010 (). In this approach, the pointwise flux is taken to be the volume average of the targeted high-order numerical flux, thereby recovering the same reconstruction procedure as in FVM (i.e., reconstructing point values of Riemann states at each interface, given the volume-averaged quantities at cell centers) for its flux approximation. To ensure stability, upwinding is enforced through a splitting of the FD fluxes into left- and right-moving fluxes. The most commonly used flux splitting is the global Lax-Friedrichs splitting shu2009high (), which, while able to maintain a high formal order of accuracy in smooth flows, is known to be quite diffusive chen_5thorder_2016 (). Alternatively, an improvement can be achieved with the use of local characteristic field decompositions in the flux splitting as part of the flux reconstruction mignone_highorder_2010 (), although the characteristic field calculation depends heavily on the system of equations under consideration and adds an extra computational expense. Recently, some practitioners including Del Zanna et al. and Chen et al.
studied another form of high-order FD flux formulation, called FD-Primitive del_zanna_echo:_2007 (); chen_5thorder_2016 () (referred to as FD-Prim hereafter), in which, instead of designing high-order numerical fluxes directly from cell pointwise values, high-order fluxes are constructed first by solving Riemann problems, followed by a high-order correction step in which the correction terms are derived from so-called compact-like finite difference schemes lele1992compact (); deng2000developing (); zhang2008development (). In this way, FD-Prim is algorithmically quite analogous to FVM, in that it first interpolates (rather than reconstructs) high-order Riemann states at each cell interface, then uses them to calculate interface fluxes by solving Riemann problems using either exact or approximate solvers, and finally makes corrections to the fluxes to deliver high-order-accurate FD numerical fluxes del_zanna_echo:_2007 (); chen_5thorder_2016 (). The FD-Prim approach thereby allows the added flexibility of choosing a Riemann solver (e.g., exact saurel1994exact (); delmont2009exact (); takahashi2014exact (); toro2013riemann (), HLL-type harten1983upstream (); toro1994restoration (); miyoshi2005multi (); guo2016hlld (), or Roe roe1981approximate (), etc.) in a manner analogous to the FVM approach.
The aforementioned traditional polynomial-based high-order methods are complemented by a family of "non-polynomial" (or "polynomial-free") methods called radial basis function (RBF) approximation methods. As a family of "mesh-free" or "meshless" methods, RBF approximation methods have been extensively studied (see buhmann2000radial ()) to provide more flexible approximations, in terms of approximating functions powell1985radial () as well as scattered data Franke1982 (). Unlike local polynomial methods, RBF methods have degrees of freedom that disassociate the tight coupling between the stencil configuration and the local interpolating (or reconstructing) polynomials under consideration. For this reason, interest has grown in mesh-free methods based on RBF as a means of designing numerical methods that achieve high-order convergence while retaining a simple (and flexible) algorithmic framework in all spatial dimensions sarra2009multiquadric (); safdari2018radial (). Approximations based on RBF have been used to solve hyperbolic PDEs Katz2009 (); morton2007 (); sonar1996 (); guo2017rbf (); guo_nonpolynomial_2016 (); bigoni_adaptive_2017 (), parabolic PDEs moroney2006 (); moroney2007 (), diffusion and reaction-diffusion PDEs shankar2015radial (), and boundary value problems of elliptic PDEs liu2015kansa (), as well as for interpolations on irregular domains chen2016reduced (); heryudono2010radial (); martel2016stability () and on more general sets of scattered data Franke1982 ().
In this paper, we develop a new high-order FDM in which the core interpolation formulation is based on Gaussian Process (GP) modeling bishop2007pattern (); rasmussen2005 (); wahba1995 (). This work is an extension of our previous GP high-order method reyes_new_2016 () introduced in an FVM framework. Analogous to RBF, our GP approach is a non-polynomial method. Being a meshless method, the proposed GP method features attractive properties similar to those of RBF and allows flexibility in code implementation and selectable orders of solution accuracy in any number of spatial dimensions. An important feature of the GP approach is that it comes equipped with a likelihood estimator for a given dataset, which we have leveraged to form a new set of smoothness indicators. Based on the probabilistic theory of Gaussian Process modeling bishop2007pattern (); rasmussen2005 (), the new formulation of our smoothness indicators is a key component of our GP scheme, which we present in Section 3.3. We call our new GP scheme with the GP-based smoothness indicators GP-WENO in this paper. As demonstrated below in our numerical convergence study, GP-WENO's formal accuracy is $(2R+1)$ and is controlled by the integer parameter $R$, called the GP radius, which determines the size of the GP stencil on which GP-WENO interpolations take place. These numerical experiments also show that the new GP-based smoothness indicators are better able to preserve high-order solutions in the presence of discontinuities than are the conventional WENO smoothness indicators based on the $L_2$-like norms of the local polynomials jiang1996efficient ().
2 Finite Difference Method
We are concerned with the solution of 3D conservation laws:
(1) $\frac{\partial \mathbf{U}}{\partial t} + \frac{\partial \mathbf{F}}{\partial x} + \frac{\partial \mathbf{G}}{\partial y} + \frac{\partial \mathbf{H}}{\partial z} = 0,$
where $\mathbf{U}$ is a vector of conserved variables and $\mathbf{F}$, $\mathbf{G}$ and $\mathbf{H}$ are the fluxes. For the Euler equations these are defined as
(2) $\mathbf{U} = \left[\rho,\ \rho u,\ \rho v,\ \rho w,\ E\right]^T$, $\quad \mathbf{F} = \left[\rho u,\ \rho u^2 + p,\ \rho u v,\ \rho u w,\ u(E+p)\right]^T$, $\quad \mathbf{G} = \left[\rho v,\ \rho u v,\ \rho v^2 + p,\ \rho v w,\ v(E+p)\right]^T$, $\quad \mathbf{H} = \left[\rho w,\ \rho u w,\ \rho v w,\ \rho w^2 + p,\ w(E+p)\right]^T.$
We wish to produce a conservative discretization of the pointwise values of $\mathbf{U}$, i.e., $\mathbf{U}_{i,j,k} = \mathbf{U}(x_i, y_j, z_k)$, and we write Eq. (1) in the form:
(3) $\frac{d\mathbf{U}_{i,j,k}}{dt} = -\frac{1}{\Delta x}\left(\hat{\mathbf{F}}_{i+1/2,j,k} - \hat{\mathbf{F}}_{i-1/2,j,k}\right) - \frac{1}{\Delta y}\left(\hat{\mathbf{G}}_{i,j+1/2,k} - \hat{\mathbf{G}}_{i,j-1/2,k}\right) - \frac{1}{\Delta z}\left(\hat{\mathbf{H}}_{i,j,k+1/2} - \hat{\mathbf{H}}_{i,j,k-1/2}\right).$
Here $\hat{\mathbf{F}}_{i\pm1/2,j,k}$, $\hat{\mathbf{G}}_{i,j\pm1/2,k}$ and $\hat{\mathbf{H}}_{i,j,k\pm1/2}$ are the $x$, $y$ and $z$ numerical fluxes evaluated at the halfway points between cells in their respective directions. The numerical fluxes are defined so that their divided differences approximate the exact flux derivatives to $p$th order:
(4) $\frac{\hat{\mathbf{F}}_{i+1/2,j,k} - \hat{\mathbf{F}}_{i-1/2,j,k}}{\Delta x} = \left.\frac{\partial \mathbf{F}}{\partial x}\right|_{(x_i,y_j,z_k)} + \mathcal{O}(\Delta x^p),$
and similarly for the $y$ and $z$ fluxes. As a result, the overall finite difference scheme in Eq. (3) approximates the original conservation law in Eq. (1) with spatial accuracy of order $p$. The temporal part of Eq. (3) can be discretized by a method-of-lines approach with a Runge-Kutta time discretization gottlieb_strong_2011 ().
To determine the numerical fluxes in Eq. (3), let us consider first the pointwise flux, $\mathbf{F}(x)$ (suppressing the $y$ and $z$ dependence), as the 1D cell average of an auxiliary function, $\mathbf{h}(x)$, in the $x$ direction. If we also define the primitive function, $\boldsymbol{\mathcal{H}}(x) = \int_{-\infty}^{x} \mathbf{h}(\xi)\, d\xi$, we can write $\mathbf{F}$ as
(5) $\mathbf{F}(x) = \frac{1}{\Delta x} \int_{x - \Delta x/2}^{x + \Delta x/2} \mathbf{h}(\xi)\, d\xi = \frac{1}{\Delta x}\left[\boldsymbol{\mathcal{H}}(x + \Delta x/2) - \boldsymbol{\mathcal{H}}(x - \Delta x/2)\right].$
Differentiating Eq. (5) with respect to $x$ gives
(6) $\left.\frac{\partial \mathbf{F}}{\partial x}\right|_{x_i} = \frac{\mathbf{h}(x_{i+1/2}) - \mathbf{h}(x_{i-1/2})}{\Delta x},$
and comparing with Eq. (4) we can identify $\mathbf{h}$ as the analytic flux function we wish to approximate with the numerical flux $\hat{\mathbf{F}}$. This can be repeated in a similar fashion for the $y$ and $z$ fluxes, $\hat{\mathbf{G}}$ and $\hat{\mathbf{H}}$. The goal is then to form a high order approximation to the integrand quantity $\mathbf{h}$ (and its $y$ and $z$ counterparts), knowing the quantities $\mathbf{F}$, $\mathbf{G}$ and $\mathbf{H}$, which are mathematically cell-averaged integral quantities but physically pointwise fluxes, up to some design accuracy of order $p$, e.g.,
(7) $\hat{\mathbf{F}}_{i+1/2,j,k} = \mathbf{h}(x_{i+1/2}) + \mathcal{O}(\Delta x^p).$
Note that this is exactly the same reconstruction procedure as that of computing high-order accurate Riemann states at cell interfaces given the integral volume-averaged quantities at cell centers in 1D FVM.
In the high order finite difference method originally put forward by Shu and Osher shu1989 (), the problem of approximating the numerical fluxes is accomplished by directly reconstructing the face-centered numerical fluxes from the cell-centered fluxes on a stencil that extends from the points $x_{i-R}$ to $x_{i+R}$. That is, $\hat{\mathbf{F}}_{i+1/2,j,k} = \mathcal{R}\left(\mathbf{F}_{i-R,j,k}, \ldots, \mathbf{F}_{i+R,j,k}\right)$, in complete analogy to reconstruction in the context of finite volume schemes, where $\mathcal{R}$ is a high order accurate procedure to reconstruct face-centered values from cell-averaged ones. Such flux-based finite difference methods (or FD-Flux for short), as just described, are easily implemented using the same reconstruction procedures as in 1D finite volume codes and provide high order of convergence on multidimensional problems. For this reason, they have been widely adopted mignone_highorder_2010 (); shu_highorder_2003 (); jiang1996efficient (). One pitfall of this approach is that proper upwinding is typically achieved by appropriately splitting the fluxes into parts moving towards and away from the interface of interest using the global Lax-Friedrichs splitting mignone_highorder_2010 (); shu_highorder_2003 (), at the cost of introducing significant diffusion into the scheme.
On the other hand, it can be readily seen from Eq. (5) that the naive use of the interface value of the flux, $\mathbf{F}_{i+1/2,j,k}$, as the numerical flux can provide at most a second order approximation and should be avoided when designing a high-order FDM, no matter how accurately $\mathbf{F}_{i+1/2,j,k}$ is computed, since
(8) $\frac{\mathbf{F}_{i+1/2,j,k} - \mathbf{F}_{i-1/2,j,k}}{\Delta x} = \left.\frac{\partial \mathbf{F}}{\partial x}\right|_{(x_i,y_j,z_k)} + \mathcal{O}(\Delta x^2).$
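This second-order barrier is easy to verify numerically. The sketch below (our own illustration, not code from the paper) feeds exact face values of a smooth periodic flux $F(x) = \sin x$ into the naive divided difference and measures the observed convergence order:

```python
import numpy as np

def divided_difference_error(n):
    # Exact interface fluxes for F(x) = sin(x) on a periodic uniform grid;
    # the exact derivative at the cell centers is cos(x).
    dx = 2.0 * np.pi / n
    x = np.arange(n) * dx                    # cell centers
    F_face = np.sin(x + 0.5 * dx)            # exact values of F_{i+1/2}
    dF = (F_face - np.roll(F_face, 1)) / dx  # (F_{i+1/2} - F_{i-1/2}) / dx
    return np.max(np.abs(dF - np.cos(x)))

# Halving dx cuts the error by ~4x, i.e., second order, no matter how
# accurately the interface values themselves are known.
e1, e2 = divided_difference_error(64), divided_difference_error(128)
order = np.log2(e1 / e2)
print(order)  # ~2.0
```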
Alternatively, in an approach originally proposed by Del Zanna del_zanna_echo:_2007 (); del_zanna_efficient_2003 (), upwinding is provided by solving a Riemann problem at each of the face centers, $(x_{i+1/2}, y_j, z_k)$, $(x_i, y_{j+1/2}, z_k)$ and $(x_i, y_j, z_{k+1/2})$, for the corresponding face-normal fluxes, $\mathbf{F}$, $\mathbf{G}$ and $\mathbf{H}$. The numerical flux is then viewed as being the face-center flux from the Riemann problem, i.e., at the cell interface $(x_{i+1/2}, y_j, z_k)$,
(9) $\mathbf{F}_{i+1/2,j,k} = \mathcal{RP}\left(\mathbf{U}^L_{i+1/2,j,k},\ \mathbf{U}^R_{i+1/2,j,k}\right),$
where $\mathbf{U}^L$ and $\mathbf{U}^R$ are the left and right Riemann states, plus a series of high order corrections using the spatial derivatives of the flux evaluated at the face center,
(10) $\hat{\mathbf{F}}_{i+1/2,j,k} = \mathbf{F}_{i+1/2,j,k} + a\,\Delta x^2\, \mathbf{F}^{(2x)}_{i+1/2,j,k} + b\,\Delta x^4\, \mathbf{F}^{(4x)}_{i+1/2,j,k} + \ldots,$
where parenthesized superscripts denote numerical derivatives in the corresponding dimension, and where $a$ and $b$ are constants chosen so that Eq. (6) holds up to the desired order of accuracy, e.g.,
(11) $\frac{\hat{\mathbf{F}}_{i+1/2,j,k} - \hat{\mathbf{F}}_{i-1/2,j,k}}{\Delta x} = \left.\frac{\partial \mathbf{F}}{\partial x}\right|_{(x_i,y_j,z_k)} + \mathcal{O}(\Delta x^6).$
For accuracy up to $\mathcal{O}(\Delta x^6)$, only the terms up to the fourth derivative in Eq. (10) need to be retained. The constants $a$ and $b$ are determined by Taylor expanding the terms in (10) and enforcing the condition in (11). For this reason, the values of $a$ and $b$ depend on the stencil geometry used to approximate the derivatives. Del Zanna del_zanna_echo:_2007 () used the Riemann fluxes at neighboring face centers to calculate the derivatives and found $a = -\frac{1}{24}$ and $b = \frac{3}{640}$. A disadvantage of this choice is that it requires additional guard cells on which to solve the Riemann problem in order to compute the high order correction terms near the domain boundaries. Chen et al. chen_5thorder_2016 () converted the cell-centered conservative variables to cell-centered fluxes around the face center of interest to compute the flux derivatives. This again yields $a = -\frac{1}{24}$, with a different value of $b$ reflecting the cell-centered derivative stencils. While this approach doesn't require additional guard cells, the conversion from conservative (or primitive) variables to flux variables needs to be performed at each grid point in the domain, incurring additional computational cost. For this reason, we adopt the face-center flux approach of Del Zanna. For example, the second and fourth derivatives of the $x$ flux are then given by the finite difference formulas,
(12) $\mathbf{F}^{(2x)}_{i+1/2} = \frac{\mathbf{F}_{i+3/2} - 2\,\mathbf{F}_{i+1/2} + \mathbf{F}_{i-1/2}}{\Delta x^2}, \qquad \mathbf{F}^{(4x)}_{i+1/2} = \frac{\mathbf{F}_{i+5/2} - 4\,\mathbf{F}_{i+3/2} + 6\,\mathbf{F}_{i+1/2} - 4\,\mathbf{F}_{i-1/2} + \mathbf{F}_{i-3/2}}{\Delta x^4},$
where the $j$ and $k$ indices are suppressed. The derivatives of the $\mathbf{G}$ and $\mathbf{H}$ fluxes are given in the same way. These correction terms were originally derived in the context of compact finite difference interpolation lele1992compact (); deng2000developing (); zhang2008development (). For instance, the explicit formula in lele1992compact () for the first derivative approximation using six neighboring interface fluxes reduces to the high order correction formula in Eqs. (10) through (12) (see also the Appendix in del_zanna_echo:_2007 ()).
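The correction procedure can be checked numerically. In the sketch below (our own illustration, not code from the paper; the coefficients $a = -1/24$ and $b = 3/640$ are those implied by the face-centered difference stencils of Eq. (12), chosen so that the condition in Eq. (11) holds through sixth order), exact face values of a smooth periodic flux play the role of the Riemann fluxes:

```python
import numpy as np

def corrected_derivative_error(n, a=-1.0 / 24.0, b=3.0 / 640.0):
    # Smooth periodic stand-in for the Riemann fluxes: F_{i+1/2} = sin(x_{i+1/2})
    dx = 2.0 * np.pi / n
    x = np.arange(n) * dx                 # cell centers
    F = np.sin(x + 0.5 * dx)              # face-centered fluxes (index i <-> face i+1/2)
    # Central differences of the face-centered fluxes, cf. Eq. (12)
    F2 = (np.roll(F, -1) - 2.0 * F + np.roll(F, 1)) / dx**2
    F4 = (np.roll(F, -2) - 4.0 * np.roll(F, -1) + 6.0 * F
          - 4.0 * np.roll(F, 1) + np.roll(F, 2)) / dx**4
    Fhat = F + a * dx**2 * F2 + b * dx**4 * F4  # corrected flux, cf. Eq. (10)
    dF = (Fhat - np.roll(Fhat, 1)) / dx         # conservative difference, Eq. (4)
    return np.max(np.abs(dF - np.cos(x)))       # exact derivative is cos(x)

e1, e2 = corrected_derivative_error(32), corrected_derivative_error(64)
order = np.log2(e1 / e2)
print(order)  # ~6: the corrections lift the second-order barrier of Eq. (8)
```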
In summary, the finite difference method described in this paper consists of the following steps:

1. Pointwise values of either the primitive or conservative variables are given on a uniform grid at time $t^n$, $\mathbf{U}^n_{i,j,k}$.

2. The Riemann states $\mathbf{U}^L$ and $\mathbf{U}^R$ as given in Eq. (9) are interpolated from pointwise cell-centered values at the face centers between grid points, $(x_{i+1/2}, y_j, z_k)$, $(x_i, y_{j+1/2}, z_k)$ and $(x_i, y_j, z_{k+1/2})$. These interpolations should be carried out in a non-oscillatory way with the desired spatial accuracy.

3. The face-center normal fluxes are calculated from the Riemann problem in Eq. (9) at the halfway points between grid points.

4. The second and fourth derivatives of the fluxes are calculated following Eq. (12), using the Riemann fluxes from the previous step. In principle these should be computed in a non-oscillatory fashion using a nonlinear slope limiter, as was done in del_zanna_echo:_2007 (). However, it was pointed out in chen_5thorder_2016 () that the flux derivatives' contribution to the numerical fluxes in Eq. (10) is relatively small and produces only small oscillations near discontinuities. For this reason, the derivatives are finite differenced without any limiters in our implementation. The numerical fluxes are then constructed as in Eq. (10).

5. The conservative variables can then be updated using a standard SSP-RK method gottlieb_strong_2011 ().
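The time update in the last step can be sketched with the classic three-stage, third-order SSP-RK method of Shu and Osher; the following minimal illustration (our own, with hypothetical names) applies it to a scalar model problem in place of the full flux divergence:

```python
def ssprk3_step(u, dt, L):
    """One step of the three-stage, third-order SSP Runge-Kutta method,
    written as convex combinations of forward-Euler steps; L(u) stands in
    for the right-hand side of Eq. (3)."""
    u1 = u + dt * L(u)
    u2 = 0.75 * u + 0.25 * (u1 + dt * L(u1))
    return u / 3.0 + 2.0 / 3.0 * (u2 + dt * L(u2))

# Usage on the scalar model problem u' = -u, u(0) = 1:
u, dt = 1.0, 0.1
for _ in range(10):
    u = ssprk3_step(u, dt, lambda v: -v)
print(u)  # close to exp(-1) ~ 0.3679
```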
So far, we have not yet described what type of spatial interpolation method is to be used in Step 2 to compute high-order Riemann states at each cell interface. Typically, non-oscillatory high-order accurate local polynomial schemes are adopted, such as MP5 suresh_accurate_1997 () in chen_5thorder_2016 () or WENO-JS jiang1996efficient () in del_zanna_echo:_2007 (). In the next section, we will introduce our high-order polynomial-free interpolation scheme, based on Gaussian Process modeling.
3 Gaussian Process Modeling
In this section, we briefly outline the statistical theory underlying the construction of GP-based Bayesian prior and posterior distributions (see Section 3.1). Interested readers are encouraged to refer to our previous paper reyes_new_2016 () for a more detailed discussion in the context of applying GP to achieve high-order algorithms for FVM schemes. For a more general discussion of GP theory see rasmussen2005 ().
3.1 GP Interpolation
GP is a class of stochastic processes, i.e., processes that sample functions (rather than points) from an infinite-dimensional function space. The distribution over the space of functions is specified by the prior mean and covariance functions, which give rise to the GP prior:

a mean function $\bar{f}(x) = \mathbb{E}[f(x)]$ over $x \in \mathbb{R}$, and

a covariance GP kernel function, a symmetric and positive-definite integral kernel over $\mathbb{R} \times \mathbb{R}$, given by $K(x, x') = \mathbb{E}\left[\left(f(x) - \bar{f}(x)\right)\left(f(x') - \bar{f}(x')\right)\right]$.
The GP approach to interpolation is to regard the values of the function at a series of points as samples from a function that is only known probabilistically in terms of the prior GP distribution, and to form a posterior distribution on $f$ that is conditioned on the observed values. One frequently refers to the observed values as training points, and to the passage from prior distribution to posterior predictive distribution as a training process. We may use the trained distribution to predict the behavior of $f$ at a new point $x_*$. For example, in the current context, the fluid variables are assumed to be sample functions from GP distributions with prescribed mean and covariance functions, written as $q(x) \sim \mathcal{GP}\left(\bar{q}(x), K(x,x')\right)$. We then train the GP on the known values of the fluid variables at the cell centers, $x_i$, to predict the Riemann states at cell-face centers, e.g., $q(x_*)$ with $x_* = x_{i\pm1/2}$.
The mean function is often taken to be constant, $\bar{f}(x) = f_0$. We have found that a choice of zero mean, $f_0 = 0$, works well, and we adopt this choice in this paper. For the kernel function, $K(x,x')$, we will use the "Squared Exponential" (SE) kernel,
(13) $K(x, x') = \Sigma^2 \exp\left[-\frac{(x - x')^2}{2\ell^2}\right].$
For other choices of kernel functions and the related discussion in the context of designing highorder approximations for numerical PDEs, readers are referred to reyes_new_2016 ().
The SE kernel has two free parameters, $\Sigma$ and $\ell$, called hyperparameters. We will see below that $\Sigma$ plays no role in the calculations presented here, and may as well be chosen to be $\Sigma = 1$. However, $\ell$ is a length scale that controls the characteristic scale on which the GP sample functions vary. As was demonstrated in reyes_new_2016 (), $\ell$ plays a critical role in the solution accuracy of a GP interpolation/reconstruction scheme and ideally should match the length scales of the problem that are to be resolved.
Formally, a GP is a collection of random variables, any finite collection of which has a joint Gaussian distribution bishop2007pattern (); rasmussen2005 (). We consider the function values at the points $x_i$, $i = 1, \ldots, N$, as our "training" points. Introducing the data vector $\mathbf{f}$ with components $f_i = f(x_i)$, the likelihood, $p(\mathbf{f})$, of the data given a GP model (i.e., $\bar{f}$ and $K$) is given by
(14) $p(\mathbf{f}) = \left[(2\pi)^{N} \det \mathbf{K}\right]^{-1/2} \exp\left[-\frac{1}{2}\left(\mathbf{f} - \bar{\mathbf{f}}\right)^T \mathbf{K}^{-1} \left(\mathbf{f} - \bar{\mathbf{f}}\right)\right],$
where $[\bar{\mathbf{f}}]_i = \bar{f}(x_i)$ and $[\mathbf{K}]_{ij} = K(x_i, x_j)$. The likelihood is a measure of how compatible the data $\mathbf{f}$ is with the GP model specified by the mean $\bar{f}$ and the covariance $K$.
Given the function value samples $\mathbf{f}$, GP theory furnishes the posterior predictive distribution over the value of the unknown function at any new point $x_*$. The mean of this distribution is the posterior mean function,
(15) $\tilde{f}(x_*) = \bar{f}(x_*) + \mathbf{k}_*^T \mathbf{K}^{-1} \left(\mathbf{f} - \bar{\mathbf{f}}\right),$
where $[\mathbf{k}_*]_i = K(x_*, x_i)$. Taking a zero mean GP, $\bar{f} = 0$, Eq. (15) reduces to
(16) $\tilde{f}(x_*) = \mathbf{k}_*^T \mathbf{K}^{-1} \mathbf{f}.$
According to Eqs. (15) and (16), the GP posterior mean is a linear function of the data $\mathbf{f}$, with a weight vector specified entirely by the choice of covariance kernel function, the stencil points, and the prediction point. We take this posterior mean of the distribution in Eq. (16) as the interpolation of the function at the point $x_*$, where $f$ is any one of the fluid variables in primitive, conservative or characteristic form, which we will denote as $q$. Note that had we retained the multiplicative scale factor $\Sigma^2$ as a model hyperparameter, it would have canceled out in Eq. (16). This justifies our choice of $\Sigma = 1$.
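The weight construction of Eq. (16) amounts to a small linear solve. The sketch below is our own minimal illustration (the value $\ell/\Delta x = 6$ and all function names are hypothetical choices, not prescriptions from this paper) for a 5-point stencil in units of $\Delta x$:

```python
import numpy as np

def se_kernel(x1, x2, ell):
    # Squared Exponential kernel of Eq. (13), with Sigma = 1
    # (Sigma cancels out of the weights, as noted above)
    return np.exp(-0.5 * ((x1[:, None] - x2[None, :]) / ell) ** 2)

def gp_interp_weights(R, ell_over_dx, x_star=0.5):
    """Weights z such that the GP prediction at x_star is z @ q on a
    (2R+1)-point stencil. Coordinates are in units of dx, centered on
    cell i, so x_star = 0.5 is the right face."""
    x = np.arange(-R, R + 1, dtype=float)  # stencil point locations
    K = se_kernel(x, x, ell_over_dx)       # covariance matrix
    k_star = se_kernel(np.array([x_star]), x, ell_over_dx)[0]
    return np.linalg.solve(K, k_star)      # z = K^{-1} k_*, cf. Eq. (16)

z = gp_interp_weights(R=2, ell_over_dx=6.0)
q = np.sin(0.1 * np.arange(-2, 3))  # smooth data on the stencil
print(z @ q)                        # ~ sin(0.05), i.e. ~ 0.05
```

Because the weight vector depends only on the stencil geometry and the kernel, it can be precomputed once for a uniform grid and reused for every cell.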
3.2 GP Interpolation for FD-Prim
Hereafter, we restrict ourselves to describing our new multidimensional GP high-order interpolation scheme in the framework of FD-Prim, which only requires us to consider 1D data interpolations, as is typically done in the dimension-by-dimension approach for FDM. The notation and the relevant discussion will therefore be formulated in 1D.
We wish to interpolate the Riemann states $q_{i\pm1/2}$ from the pointwise cell-centered values $q_i$. We consider a (fluid) variable $q$ on a 1D stencil of $2R+1$ points, with $R$ points on either side of the central point of the $i$th cell, $x_i$, and write
(17) $S = \left\{x_{i-R}, \ldots, x_{i-1}, x_i, x_{i+1}, \ldots, x_{i+R}\right\}.$
We seek a high-order interpolation of $q$ at $x_{i\pm1/2}$,
(18) $q_{i\pm1/2} = \tilde{q}(x_{i\pm1/2}),$
where $\tilde{q}$ is the GP interpolation given in Eq. (16). We define the data vector on $S$ by
(19) $\mathbf{q} = \left[q_{i-R}, \ldots, q_i, \ldots, q_{i+R}\right]^T,$
and we define a vector of weights $\mathbf{z}_* = \mathbf{K}^{-1}\mathbf{k}_*$, so that the interpolation in Eq. (16) can be cast as a product between the (row) vector of weights $\mathbf{z}_*^T$ and the data $\mathbf{q}$,
(20) $q_{i\pm1/2} = \mathbf{z}_*^T \mathbf{q}.$
Note here that $\mathbf{K}$ is a covariance kernel matrix of size $(2R+1) \times (2R+1)$ whose entries are defined by
(21) $[\mathbf{K}]_{mn} = K(x_{i+m}, x_{i+n}), \quad m, n = -R, \ldots, R,$
and $\mathbf{k}_*$ is a vector of length $2R+1$ with entries defined by
(22) $[\mathbf{k}_*]_m = K(x_*, x_{i+m}), \quad m = -R, \ldots, R, \quad x_* = x_{i\pm1/2}.$
The weights $\mathbf{z}_*$ are independent of the data and depend only on the locations of the data points $x_{i+m}$ and the interpolation point $x_*$. Therefore, for cases where the grid configurations are known in advance, the weights can be computed and stored a priori for use during the simulation.
3.3 Handling Discontinuities: GP-WENO
The above GP interpolation procedure works well for smooth flows without any additional modifications. For non-smooth flows, however, it requires some form of limiting to avoid numerical oscillations at discontinuities that can lead to numerical instability. To this end, we adopt the approach of the Weighted Essentially Non-Oscillatory (WENO) methods jiang1996efficient (), where the effective stencil size is adaptively changed to avoid interpolating through a discontinuity, while retaining high-order convergence in smooth regions. In the work by Del Zanna et al. del_zanna_echo:_2007 (), a high-order Riemann state is constructed by considering the conventional WENO weighted combination of interpolations from a set of candidate substencils. The weights are chosen based on $L_2$ norms of the derivatives of polynomial reconstructions on each local stencil (e.g., $S_m$, see below) in such a way that a weight is close to zero when the corresponding stencil contains a discontinuity, while the weights are optimal in smooth regions in the sense that they reduce to the interpolation over a global stencil (e.g., $S$, see below). (The term global here is to be understood in the sense that the desired order of accuracy, e.g., 5th order in WENO-JS, is to be optimally achieved on this "larger" or "global" stencil, rather than on the entire computational domain.)
For the proposed GP-WENO scheme, we introduce a new GP smoothness indicator inspired by the probabilistic interpretation of GP, replacing the standard $L_2$-norm-based formulations of WENO. The GP-WENO scheme is fully specified by combining the linear GP interpolation of Section 3.2 with the nonlinear local GP smoothness indicators of this section.
We begin by considering the global stencil, $S$, in Eq. (17) with $2R+1$ points centered at the cell $x_i$, and the $R+1$ candidate substencils $S_m$, each with $R+1$ points,
(23) $S_m = \left\{x_{i+m-R}, \ldots, x_{i+m}\right\}, \quad m = 0, \ldots, R,$
which satisfy
(24) $S = \bigcup_{m=0}^{R} S_m.$
Eq. (20) can then be evaluated to give a GP interpolation at the location $x_{i+1/2}$ from the $m$th candidate stencil $S_m$,
(25) $q^m_{i+1/2} = \mathbf{z}_m^T \mathbf{q}_m,$ where $\mathbf{q}_m$ is the data vector on $S_m$ and $\mathbf{z}_m$ is the corresponding weight vector.
We now take the weighted combination of these candidate GP approximations as the final interpolated value,
(26) $q_{i+1/2} = \sum_{m=0}^{R} \omega_m\, q^m_{i+1/2}.$
As in the traditional WENO approach, the nonlinear weights, $\omega_m$, should reduce to some optimal weights $\gamma_m$ in smooth regions, so that the approximation in Eq. (26) reduces to the GP approximation (Eq. (20)) over the global $(2R+1)$-point stencil $S$. The $\gamma_m$'s then should satisfy,
(27) $\sum_{m=0}^{R} \gamma_m\, q^m_{i+1/2} = \mathbf{z}_*^T \mathbf{q},$
or equivalently,
(28) $\sum_{m=0}^{R} \gamma_m\, \mathbf{z}_m^T \mathbf{q}_m = \mathbf{z}_*^T \mathbf{q} \quad \text{for arbitrary data } \mathbf{q}.$
We then seek $\boldsymbol{\gamma} = [\gamma_0, \ldots, \gamma_R]^T$ as the solution to the overdetermined system
(29) $\mathbf{M} \boldsymbol{\gamma} = \mathbf{z}_*,$
where the $m$th column of $\mathbf{M}$ is given by the entries of $\mathbf{z}_m$ for the row entries $j = m+1, \ldots, m+R+1$ and zeros for the rest:
(30) $[\mathbf{M}]_{jm} = \begin{cases} [\mathbf{z}_m]_{j-m}, & m+1 \le j \le m+R+1, \\ 0, & \text{otherwise}, \end{cases}$
where $j = 1, \ldots, 2R+1$ indexes the rows. For example, in the case of $R = 1$ the above system reduces to the overdetermined system,
(31) $\gamma_0 [\mathbf{z}_0]_1 = [\mathbf{z}_*]_1, \quad \gamma_0 [\mathbf{z}_0]_2 + \gamma_1 [\mathbf{z}_1]_1 = [\mathbf{z}_*]_2, \quad \gamma_1 [\mathbf{z}_1]_2 = [\mathbf{z}_*]_3,$
or in matrix form, $\mathbf{M}\boldsymbol{\gamma} = \mathbf{z}_*$,
(32) $\begin{bmatrix} [\mathbf{z}_0]_1 & 0 \\ [\mathbf{z}_0]_2 & [\mathbf{z}_1]_1 \\ 0 & [\mathbf{z}_1]_2 \end{bmatrix} \begin{bmatrix} \gamma_0 \\ \gamma_1 \end{bmatrix} = \begin{bmatrix} [\mathbf{z}_*]_1 \\ [\mathbf{z}_*]_2 \\ [\mathbf{z}_*]_3 \end{bmatrix}.$
The optimal weights, $\gamma_m$, then depend only on the choice of kernel (Eq. (13)) and the stencil $S$, and as with the weights $\mathbf{z}_*$ and $\mathbf{z}_m$, the $\gamma_m$'s need only be computed once and used throughout the simulation. We take $\boldsymbol{\gamma}$ as the least squares solution to Eq. (29), which can be determined numerically.
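The least squares determination of the optimal weights can be sketched directly with a standard solver. The following is again our own illustration (the kernel length scale and the choice $R = 2$ are arbitrary, and the helper names are hypothetical):

```python
import numpy as np

def se(x1, x2, ell=6.0):
    # SE kernel of Eq. (13) with Sigma = 1; ell/dx = 6 is an arbitrary choice
    return np.exp(-0.5 * ((x1[:, None] - x2[None, :]) / ell) ** 2)

def weights(pts, x_star=0.5):
    # GP interpolation weights z = K^{-1} k_* to the face x_star, cf. Eq. (20)
    K = se(pts, pts)
    k = se(np.array([x_star]), pts)[0]
    return np.linalg.solve(K, k)

R = 2
xs = np.arange(-R, R + 1, dtype=float)  # global (2R+1)-point stencil, dx units
z_star = weights(xs)                    # global weights of Eq. (20)

# Assemble M column by column: substencil S_m = {x_{i+m-R}, ..., x_{i+m}}
M = np.zeros((2 * R + 1, R + 1))
for m in range(R + 1):
    M[m:m + R + 1, m] = weights(xs[m:m + R + 1])

# Least squares solution of the overdetermined system M gamma = z_*, Eq. (29)
gamma, *_ = np.linalg.lstsq(M, z_star, rcond=None)
print(gamma)
```

Since the grid is uniform, this solve happens once at startup; the resulting `gamma` is reused for every cell and every time step.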
All that remains to complete GP-WENO is to specify the nonlinear weights $\omega_m$ in Eq. (26). These should reduce to the optimal weights $\gamma_m$ in smooth regions, and more importantly, they need to serve as an indicator of the continuity of the data on the candidate stencil $S_m$, becoming small when there is a strong discontinuity on $S_m$. We first adopt the weighting scheme of the WENO-JS schemes jiang1996efficient (),
(33) $\omega_m = \frac{\tilde{\omega}_m}{\sum_{l=0}^{R} \tilde{\omega}_l}, \qquad \tilde{\omega}_m = \frac{\gamma_m}{(\beta_m + \epsilon)^p},$
where $\epsilon$ is a small positive number that prevents division by zero and the exponent $p$ controls the sensitivity of the weights to the smoothness indicators; fixed values of $\epsilon$ and $p$ were used in all our tests. The quantity $\beta_m$ is the so-called smoothness indicator of the data on the stencil $S_m$. In WENO schemes the smoothness indicators are taken as the scaled sum of the square $L_2$ norms of all the derivatives of the local reconstruction polynomials over the cell where the interpolating points are located.
In our GP formulation, however, there is no polynomial to use for $\beta_m$, and hence a non-polynomial approach is required. GP theory furnishes the concept of the data likelihood function, which measures how likely the data is to have been sampled from the chosen GP distribution. The likelihood function is very well-adapted to detecting departures from smoothness, because the SE kernel (Eq. (13)) is a covariance over the space of smooth ($C^\infty$) functions rasmussen2005 (), so that non-smooth functions are naturally assigned smaller likelihood by the model. As in reyes_new_2016 () we construct the smoothness indicators within the GP framework as the negative log of the GP likelihood in Eq. (14),
(34) $\beta_m = -\ln p(\mathbf{q}_m) = \frac{R+1}{2}\ln(2\pi) + \frac{1}{2}\ln \det \mathbf{K} + \frac{1}{2}\left(\mathbf{q}_m - \bar{\mathbf{q}}\right)^T \mathbf{K}^{-1} \left(\mathbf{q}_m - \bar{\mathbf{q}}\right),$
which is nonnegative. The three terms on the right hand side of Eq. (34) can be identified as a normalization, a complexity penalty and a data fit term, respectively bishop2007pattern (); rasmussen2005 (). The GP covariance matrices, $\mathbf{K}$, on each of the substencils are identical here in the uniform grid geometry, causing the first two terms in Eq. (34) (the normalization and complexity penalty terms) to be the same on each candidate stencil regardless of the data $\mathbf{q}_m$. For this reason, we use only the data fit term in our GP smoothness indicators. With the choice of zero mean, the GP-based smoothness indicator becomes
\beta_k = \mathbf{f}_k^T K^{-1}\mathbf{f}_k.    (35)
Let us consider a case in which the data on S_0 is discontinuous, while the other substencils S_k (k ≠ 0) contain smooth data. The statistical interpretation of Eq. (35) is that the short length-scale variability in the data (i.e., a shock width spanning only a couple of grid spacings Δ) makes f_0 unlikely (i.e., of low probability) according to the smoothness of the model represented by K, in which case β_0 is relatively larger than the other β_k, k ≠ 0. Conversely, on a smooth S_k the data f_k is likely (i.e., of high probability), and β_k becomes relatively smaller than the indicator on the discontinuous stencil.
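The behavior described above can be sketched numerically. In this minimal Python example (an illustrative three-point substencil with unit grid spacing and an assumed length scale of one grid spacing, not the paper's actual parameters), the indicator of Eq. (35) is far larger for stencil data containing a jump than for nearly linear data:

```python
import numpy as np

def se_kernel_matrix(x, ell):
    """SE covariance matrix K_ij = exp(-(x_i - x_j)^2 / (2 ell^2))."""
    d = x[:, None] - x[None, :]
    return np.exp(-0.5 * (d / ell) ** 2)

def gp_beta(K, f):
    """GP smoothness indicator: the data-fit term f^T K^{-1} f (Eq. 35)."""
    return float(f @ np.linalg.solve(K, f))

# Three-point substencil on a unit-spaced grid; ell comparable to the
# grid spacing, i.e., the shock-detecting regime discussed in Section 3.4.
x = np.array([0.0, 1.0, 2.0])
K = se_kernel_matrix(x, ell=1.0)

beta_smooth = gp_beta(K, np.array([0.0, 0.1, 0.2]))  # nearly linear data
beta_disc   = gp_beta(K, np.array([0.0, 0.0, 1.0]))  # jump at last point
# The jump is far less likely under the smooth GP model, so its
# indicator is much larger than that of the smooth data.
```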
As in the standard WENO schemes, the nonlinear GP-WENO interpolation relies on the relative ratio of each individual β_k to the others. For this reason, the omission in Eq. (35) of the terms that are identical on every candidate stencil can also be justified, as they drop out of these relative comparisons.
We note that, with the use of a zero mean, β_k does not reduce to zero on a substencil where the data is a nonzero constant. In this case, the value of β_k is proportional to the square of that constant and could therefore be arbitrarily large. One resolution to this issue, guaranteeing β_k = 0 in this case, is to use a nonzero mean function. In our numerical experiments, the use of a nonzero mean helps to reduce some under- and/or overshoots adjacent to constant flow regions. However, away from such constant regions, the GP solution becomes more diffusive than with a zero mean function. In some multidimensional problems with an assumed flow symmetry, the GP solutions with nonzero mean failed to preserve the symmetry during the course of evolution. For this reason, we use a zero mean function in this paper, leaving further investigation of this issue to a future study.
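The non-vanishing of β_k on constant data under a zero mean can be verified directly. In this illustrative sketch (same hypothetical three-point substencil and length scale as above), the indicator scales with the square of the constant, as expected from the quadratic form in Eq. (35):

```python
import numpy as np

# Illustrative 3-point substencil, unit spacing, ell = 1 (assumed values).
x = np.array([0.0, 1.0, 2.0])
K = np.exp(-0.5 * (x[:, None] - x[None, :]) ** 2)

def beta(f):
    """Data-fit smoothness indicator f^T K^{-1} f of Eq. (35)."""
    return float(f @ np.linalg.solve(K, f))

ones = np.ones(3)
beta_c1 = beta(1.0 * ones)   # constant data f = 1
beta_c2 = beta(2.0 * ones)   # constant data f = 2
# With a zero prior mean the indicator does not vanish on constant data,
# and it scales quadratically with the constant: beta(C) = C^2 * beta(1).
```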
The calculation of β_k in Eq. (35) can be sped up by considering the eigenvalues λ_j and eigenvectors v_j of the square matrix K, which allow β_k to be expressed as (see reyes_new_2016 () for the derivation)
\beta_k = \sum_{j=1}^{N} \frac{1}{\lambda_j}\left(\mathbf{v}_j^T \mathbf{f}_k\right)^2.    (36)
As previously mentioned, for the uniform grids considered here the K's are the same for every candidate stencil. Hence, like λ_j and v_j, the combinations v_j/√λ_j need only be computed once before starting the simulation and then reused throughout.
It is worthwhile to note that our smoothness indicators in Eq. (36) are written compactly as a sum of perfect squares, an advantage recently studied by Balsara et al. balsara_efficient_2016 () for their WENO-AO formulations. In addition, all eigenvalues λ_j of the symmetric, positive-definite matrix K are strictly positive, so the smoothness indicators are guaranteed to be nonnegative by construction.
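A minimal Python sketch of this precomputation strategy (again with an illustrative three-point substencil and assumed length scale) verifies that the sum-of-squares form of Eq. (36) reproduces the direct evaluation of the quadratic form in Eq. (35):

```python
import numpy as np

# SE covariance matrix on an illustrative 3-point substencil
# (unit spacing, ell = 1; assumed values, not the paper's settings).
x = np.array([0.0, 1.0, 2.0])
K = np.exp(-0.5 * (x[:, None] - x[None, :]) ** 2)

# One-time precomputation: eigen-decomposition of the symmetric
# positive-definite K, and the scaled eigenvectors v_j / sqrt(lambda_j).
lam, V = np.linalg.eigh(K)
P = V / np.sqrt(lam)            # column j holds v_j / sqrt(lambda_j)

f = np.array([0.0, 0.0, 1.0])   # sample stencil data with a jump

beta_direct = float(f @ np.linalg.solve(K, f))       # Eq. (35)
beta_eigen  = float(np.sum((P.T @ f) ** 2))          # Eq. (36), perfect squares
```

At run time only the dot products (P.T @ f) and a sum of squares are needed per substencil; no linear solve appears in the time-stepping loop.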
3.4 The Length Hyperparameter
As mentioned in Section 3.1, the SE kernel in Eq. (13) used in this paper contains a length hyperparameter ℓ that controls the characteristic length scale of the GP model. Alternatively, the GP prediction given in Eq. (15), using the SE kernel, can be viewed as a linear smoother of the data over the length scale ℓ. For this reason, ℓ should be comparable to, if not larger than, the size of the given stencil. We demonstrate in Section 5 how this length scale can be tuned to obtain better accuracy for smooth flows, as was also seen in reyes_new_2016 ().
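The linear-smoother viewpoint can be made explicit: the GP prediction at a point x* is a fixed linear combination of the stencil data, with weights K^{-1} k* that depend only on the grid geometry and ℓ, and can therefore be computed once per grid. The stencil, the value of ℓ, and the interpolation points below are illustrative assumptions, not the paper's settings:

```python
import numpy as np

def se(a, b, ell):
    """Squared-exponential covariance between point sets a and b."""
    return np.exp(-0.5 * ((a - b) / ell) ** 2)

# Five cell centers at unit spacing, and ell comparable to the stencil size.
x = np.arange(-2.0, 3.0)          # [-2, -1, 0, 1, 2]
ell = 4.0
K = se(x[:, None], x[None, :], ell)

def prediction_weights(xstar):
    """Weights w such that the GP prediction is f(x*) ~ w . f."""
    kstar = se(x, xstar, ell)
    return np.linalg.solve(K, kstar)

w_right = prediction_weights(0.5)   # e.g., a right cell-face point
w_left  = prediction_weights(-0.5)  # and its mirrored counterpart
# On a symmetric stencil the mirrored weights are reverses of each other,
# and each weight vector is reused unchanged for every cell of the grid.
```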
It is important to note that the length hyperparameter serves a second, logically separate purpose when used to compute the smoothness indicators according to the GP likelihood in Eq. (35). In this application we are not using the GP model to smooth the data over the given substencil, but rather to determine whether there is a discontinuity on any of the candidate substencils. In general, these two applications have different purposes and requirements. We therefore introduce a second length hyperparameter, ℓ_σ, for determining the GP smoothness indicators in Eq. (35), so that we compute β_k using ℓ_σ instead of ℓ, thus treating the shock-detecting scale separately from the “smoothing” length hyperparameter ℓ. This modification allows us to obtain greater stability in the presence of discontinuities by considering length scales comparable to the grid spacing, ℓ_σ ~ Δ, based on the fact that the typical shock width in high-order shock-capturing codes is of order Δ. Viewed as a whole, the method essentially first attempts to detect discontinuities on the (shorter) scale of ℓ_σ, and then smooths the data on the (larger) scale of ℓ.
We have found that a single fixed pair of values for ℓ and ℓ_σ works well on a wide range of problems, especially for the GP-WENO scheme. Additional stability for problems with strong shocks using larger stencil radii can be achieved by lowering ℓ_σ, down to the grid spacing Δ. Larger values of ℓ can also lead to marginal gains in stability. However, we observe that taking ℓ_σ well beyond the grid scale makes the GP-WENO scheme unstable regardless of the choice of ℓ.
4 GP-WENO Code Implementation and Distribution
The implementation of the GP-WENO FD-Prim scheme is parallelized using Coarray Fortran (CAF) garain_comparing_2015 (). CAF is made to work with the GNU Fortran compiler through the OpenCoarrays library eachempati_opensource_2010 (), and I/O is handled with the HDF5 library.
The source code is available at https://github.com/acreyes/GPWENO_FDPrim under a Creative Commons Attribution 4.0 International License.
5 Accuracy and Performance Results on Smooth Test Problems
In this section we assess the accuracy and performance of the proposed GP-WENO scheme on smooth advection problems in 1D and 2D. As demonstrated below, the high-order GP-WENO solutions converge at their expected rates as the grid is refined, and target solution errors are reached faster, in terms of CPU time, than with the 5th-order WENO-JS scheme we chose as the polynomial-based counterpart algorithm for comparison. A suite of nonlinear discontinuous problems in 1D, 2D, and 3D is presented in Section 6 to display the shock-capturing capability of the GP-WENO scheme.
For smooth-flow problems, where there is no shock or discontinuity, treating ℓ_σ differently from ℓ is not required, because the associated nonlinear smoothness indicators all become equally small in magnitude and do not play an important role. We nonetheless performed the smooth advection problems in Section 5 with a separate ℓ_σ, following the same convention we use for all discontinuous problems in Section 6. Alternatively, one can set ℓ_σ = ℓ in all smooth-flow problems, which does not qualitatively change the results reported in this section.
5.1 1D Smooth Gaussian Advection
The test presented here considers the passive advection of a Gaussian density profile in 1D. The problem is set up on a computational domain with periodic boundary conditions. The initial condition consists of a Gaussian density profile on top of a constant background, with constant velocity and pressure and a fixed ratio of specific heats. The profile is propagated for one period through the boundaries until it returns to its initial position. At this point, any deformation of the initial profile is solely due to phase errors and/or numerical diffusion of the algorithm under consideration, serving as a metric of the algorithm's accuracy.
We perform this test for the GP-WENO method using stencil radii R = 1, 2, and 3, with a fixed length hyperparameter ℓ (the motivation for its value becomes apparent below). We employ RK4 for time integration, adjusting the time step to match the temporal and spatial accuracy of the scheme as the resolution is increased (e.g., see mignone_highorder_20101 ()). The results are summarized in Fig. 1 and Table 1. All three choices of R demonstrate the expected (2R+1)th-order convergence rate, as shown in reyes_new_2016 () for the same problem using the GP-WENO finite volume scheme reported therein.
Grid     GPR1 Order   GPR2 Order   GPR3 Order
1/25     –            –            –
1/50     2.02         4.11         5.49
1/100    2.66         4.28         6.36
1/200    2.78         4.75         6.76
1/400    2.96         4.99         6.88
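The convergence orders reported in the tables of this section are the standard observed orders obtained from errors on successively refined grids. A small sketch with hypothetical error values illustrates the computation:

```python
import math

def observed_order(err_coarse, err_fine, refinement=2.0):
    """Observed convergence rate between two grids whose spacings
    differ by the given refinement factor."""
    return math.log(err_coarse / err_fine) / math.log(refinement)

# Hypothetical L1 errors dropping by a factor of 2^5 per refinement,
# as a 5th-order scheme such as GPR2 would produce in its asymptotic regime.
errors = [1.0e-2, 3.125e-4, 9.765625e-6]
orders = [observed_order(e1, e2) for e1, e2 in zip(errors, errors[1:])]
# orders -> [5.0, 5.0]
```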
The length hyperparameter ℓ provides an additional knob to tune the solution accuracy. Fig. 2 shows how the error changes with the choice of ℓ on the 1D Gaussian advection problem for different resolutions, using the 5th-order GPR2 scheme compared to the 5th-order WENO-JS scheme (denoted with dotted lines). The dependence of the errors on ℓ is qualitatively similar for all resolutions. At larger values of ℓ the errors plateau and are generally higher than in the corresponding WENO-JS simulation. This can be explained by the nature of the GP's kernel-based data prediction, in which larger values of ℓ result in the GP model underfitting the data. On the other hand, we see that all errors diverge at small ℓ, due to the fact that the GP model overfits the data (i.e., large oscillations between the data points). The errors of GP-WENO reach a local minimum at a value of ℓ roughly equal to the full-width at half-maximum (FWHM) of the initial Gaussian density profile. In all cases this local minimum in the error is lower than the errors obtained using the WENO-JS scheme. This behavior is similar to that observed for radial basis function (RBF) methods for CFD bigoni_adaptive_2017 (), where the RBF shape parameter represents an inverse length scale. Nonetheless, the connection between the optimal ℓ and the length scales of the problem has only been made in the context of Gaussian process interpolations/reconstructions in our recent work reyes_new_2016 (). This suggests that, to best resolve the smallest possible features in a simulation for a given grid spacing Δ, choosing ℓ comparable to the length scale of those features may be optimal.
5.2 2D Isentropic Vortex
Next, we test the accuracy of the GP-WENO schemes using the multidimensional nonlinear isentropic vortex problem, initially presented by Shu shu_essentially_1998 (). The problem consists of the advection of an isentropic vortex along the diagonal of a Cartesian computational box with periodic boundary conditions. We set up the problem as presented in spiegel_survey_2015 (), where the size of the domain is doubled compared to the original setup in shu_essentially_1998 () to prevent self-interaction of the vortex across the periodic domain. The problem is run for one period of advection through the domain until the vortex returns to its initial position, where the solution accuracy can be measured against the initial condition.
Our error results are shown in Fig. 3 and summarized in Table 2, using the same ℓ and ℓ_σ as before and RK4 for time integration, utilizing once more the appropriate reduction in time step to match the spatial and temporal accuracies. As with the 1D smooth advection in the previous section, the GP-WENO method obeys the expected (2R+1)th-order convergence rate.
Grid    GPR1 Order   GPR2 Order   GPR3 Order
2/5     –            –            –
1/5     1.74         4.82         5.82
1/10    2.62         4.93         6.68
1/20    2.94         4.75         6.75
1/40    2.99         4.61         6.66
We also repeat the test of the dependence of the errors on the length hyperparameter ℓ and show the results in Fig. 4. Similar to the 1D case in Fig. 2, there is a minimum of the error at higher resolutions around a particular value of ℓ. The errors diverge at small values of ℓ and plateau at large ℓ. Shown as dotted lines in Fig. 4 are the errors for the WENO-JS interpolation. For all resolutions, the minimum of the error for the GP-WENO scheme is significantly smaller than the errors of the WENO-JS scheme. This can also be seen by comparing the GPR2 column of Table 2 to the WENO-JS column of Table 3. Also, the order of convergence for WENO-JS is smaller than that of the GPR2 method.
Grid    WENO-JS Order   WENO-GP Order
2/5     –               –
1/5     4.72            4.07
1/10    2.50            4.80
1/20    3.92            4.55
1/40    4.50            4.78
We attribute the significant improvement in errors for the GPR2 scheme over the classical WENO-JS to the use of the GP-based smoothness indicators in Eq. (36), which seem to better suit the adaptive stencil process in WENO-type methods for nonlinear problems like the isentropic vortex test. It is known that the original weighting scheme of the WENO-JS method jiang1996efficient () suffers from reduced accuracy in the presence of inflection points, which lowers the scheme's formal order of accuracy. This has been addressed by schemes such as the Mapped WENO henrick_mapped_2005 () and WENO-Z borges2008improved () methods. All of these methods use the same smoothness indicators as WENO-JS and obtain improved behavior by modifying the way the nonlinear weights are formulated (see Eq. (33)). The GP-WENO method uses exactly the same weighting as the classical WENO-JS scheme; the observed improvement originates from the new GP-based smoothness indicators.
This suggests that the GP-based smoothness indicators could also be applied to conventional polynomial-based WENO interpolations/reconstructions to achieve improved accuracy in smooth solutions. More specifically, a WENO polynomial interpolation is used for the Riemann states on the candidate stencils (Eq. (26)), while the GP-based smoothness indicators of Eq. (36) are adopted in place of the classical ones. We refer to such a scheme as the WENO-GP weighting scheme. ‡On a static, uniform grid configuration, WENO-GP requires a one-time formulation and storage of the GP covariance matrix K on a substencil, followed by the computation of its eigenvalues λ_j and eigenvectors v_j. The GP-based smoothness indicators are then computed on each substencil via Eq. (36) and applied to any polynomial-based WENO scheme. In Table 3, we compare errors for the WENO-JS and WENO-GP schemes. The latter outperforms the former, without changing the formulation of the weights ω_k.
In Fig. 5 we show the CPU efficiency of WENO-JS and of GP-WENO with R = 1, 2, and 3, plotted as error versus CPU time for the isentropic vortex problem. The 5th-order GPR2 and 7th-order GPR3 schemes yield faster time-to-solution accuracy when compared to the 5th-order WENO-JS scheme. The comparison is quantitatively summarized in Table 4.
Scheme     Relative time-to-error
GPR1       35.81
GPR2       0.43
GPR3       0.22
WENO-JS    1.0
6 Numerical Test Problems with Shocks and Discontinuities
In this section we present test problems using the GP-WENO interpolation method described in Section 3, applied to the compressible Euler equations in 1, 2, and 3D. The GP-WENO scheme with R = 2 (or GPR2) is chosen as the default method and is compared to the 5th-order WENO method chen_5thorder_2016 (); jiang1996efficient (); del_zanna_echo:_2007 () that is nominally of the same order of accuracy. All interpolations are carried out on characteristic variables to minimize unphysical oscillations in the presence of discontinuities qiu_construction_2002 (). A 3rd-order TVD Runge-Kutta method (RK3) gottlieb_strong_2011 () for temporal integration and an HLLC li_hllc_2005 (); toro1994restoration () Riemann solver are used throughout unless otherwise specified. The two hyperparameters ℓ and ℓ_σ are set to the default values discussed in Section 3.4, with ℓ_σ comparable to the grid spacing and ℓ comparable to the stencil size.
6.1 1D Shu-Osher Problem
The Shu-Osher problem shu1989 () is a compound test of a method's accuracy and stability. The goal is to resolve small-scale features (i.e., high-frequency waves) in a post-shock region and concurrently capture a Mach 3 shock in a stable and accurate fashion.
In this problem, a (nominally) Mach 3 shock wave propagates into a constant density field with sinusoidal perturbations. As the shock advances, two sets of density features develop behind the shock: one with the same spatial frequency as the initial perturbation, and one, closer to the shock, with twice that frequency. The numerical method must correctly capture the dynamics and the amplitude of the oscillations behind the shock, and is compared against a reference solution obtained at much higher resolution.
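For reference, a sketch of the commonly used Shu-Osher initial condition is given below. The numerical values (the post-shock left state, the sinusoidally perturbed right state, and a shock initially at x = -4 on the domain [-5, 5]) are those of the standard setup in the literature and are assumed here; the paper's exact parameters may differ:

```python
import numpy as np

def shu_osher_init(n=400, x_left=-5.0, x_right=5.0, x_shock=-4.0):
    """Commonly used Shu-Osher initial condition on cell centers.

    The state values below are the standard ones from the literature
    (assumed, not taken from the paper): a Mach 3 post-shock state on
    the left, and a sinusoidally perturbed density field on the right.
    """
    dx = (x_right - x_left) / n
    x = x_left + (np.arange(n) + 0.5) * dx          # cell centers
    rho = np.where(x < x_shock, 3.857143, 1.0 + 0.2 * np.sin(5.0 * x))
    u   = np.where(x < x_shock, 2.629369, 0.0)
    p   = np.where(x < x_shock, 10.33333, 1.0)
    return x, rho, u, p

x, rho, u, p = shu_osher_init()
```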
The results for this problem are shown over the whole domain in Fig. 6(a). A close-up of the frequency-doubled oscillations is shown in Fig. 6(b). We compare the GPR2 method, using the default ℓ and ℓ_σ, to the WENO-JS method. The GPR2 scheme clearly captures the amplitude of the oscillations better than the 5th-order WENO-JS. Again, the improvement over the WENO-JS scheme is attributed to the use of the GP smoothness indicators.
Fig. 7 shows the Shu-Osher problem for the 5th-order GPR2 and the 7th-order GPR3 schemes, for different values of ℓ and ℓ_σ. Changing ℓ results in small changes in the amplitude of the oscillations, while the variation of ℓ_σ has a more significant impact. Smaller values of ℓ_σ result in more oscillations, while larger values better match the reference solution. From this parameter study we conclude that our default ℓ_σ is a fairly robust choice for this shock tube problem. Further, we found that ℓ_σ can be reduced closer to the grid spacing Δ on higher-resolution runs and in problems with stronger shocks.
In Fig. 8 we show the combination of the GPR2 interpolation with the classical WENO smoothness indicators. The FWHM of the post-shock oscillations is a few times the grid spacing Δ. This suggests, following the FWHM discussion in Section 5, that a choice of ℓ comparable to this FWHM is optimal for GPR2. This is confirmed in Fig. 8, where the solution in panel (a) with such a choice of ℓ better captures the amplitude of the oscillations when compared to WENO-JS and the GPR2 solution with a larger ℓ. Notwithstanding, the default combination of the GP-based smoothness indicators with GP-WENO yields much better results overall (Fig. 8(b)).
6.2 1D Two Blast Wave Interaction
This problem was introduced by Woodward and Colella woodward_numerical_1984 () and consists of two strong shocks interacting with one another.
We follow the original setup of the problem and compare the GPR2 scheme with the 5th-order WENO-JS scheme on a computational domain resolved onto 128 grid points, using the default values of ℓ and ℓ_σ. Fig. 9 shows the density profiles for GPR2 and WENO-JS at the final time, compared against a highly resolved WENO-JS solution (2056 grid points). As shown in Fig. 9(a), both methods yield acceptable solutions. However, the close-ups in Fig. 9(b) reveal that the GPR2 scheme better resolves the peaks and is closer to the reference solution.
6.3 2D Sedov
Next, we consider the Sedov blast test sedov1993similarity (). This problem studies the method's ability to maintain the symmetry of the self-similar evolution of a spherical shock, generated by a high-pressure point source at the center of the domain. We follow the setup of fryxell2000flash ().
Fig. 10 shows density profiles with GPR2 along the domain diagonal and along a coordinate axis, at two different resolutions and for different choices of ℓ_σ. All runs used a fixed value of ℓ in order to perform a parameter scan on ℓ_σ. The top two panels show the solutions obtained with WENO-JS. Again, as in the Shu-Osher problem, small values of ℓ_σ introduce oscillations and lead to asymmetric shock evolution, whereas a moderate choice of ℓ_σ gives a good balance.
6.4 2D Mach 3 Wind Tunnel with a Step
The next 2D shock problem consists of a Mach 3 wind tunnel setup with a forward-facing reflecting step, originally proposed by Woodward and Colella woodward_numerical_1984 (). We initialize the problem as in woodward_numerical_1984 (), with an entropy fix for the region immediately above the corner of the step. After the bow shock reflects off the step, it progressively reaches the top reflecting wall of the tunnel. A triple point is formed due to the reflections and interactions of the shocks, from which a trail of vortices travels towards the right boundary.
Shown in Fig. 11 are the results computed using the GPR2 and GPR3 schemes, along with the WENO-JS solution at the final time. Using the HLLC Riemann solver, we noticed that the GP and WENO-JS schemes converged to different solutions, on account of the singularity at the corner of the step and despite the entropy fix. To compare the two schemes, we ran our simulations using an HLL Riemann solver, for which the two schemes converged to similar solutions. Both methods are able to capture the main features of the flow, but the GP schemes produce more well-developed Kelvin-Helmholtz roll-ups originating from the triple point.
6.5 2D Riemann Problem – Configuration 3
Next, we consider a specific Riemann problem configuration drawn from a class of two-dimensional Riemann problems that have been studied extensively zhang1990conjecture (); schulzrinne_classification_1993 () and have been applied to code verification efforts buchmuller2014improved (); balsara2010 (); lax1998solution (); schulzrinne_numerical_1993 (); don_hybrid_2016 (); lee2017piecewise (). Specifically, we look at the third configuration of the 2D Riemann problem presented in don_hybrid_2016 (); lee2017piecewise ().
Panels in Fig. 12 show density profiles at the final time, along with 40 contour lines, for different choices of the GP radius R at a fixed grid resolution. All GP methods correctly capture the expected features of the problem. In this experiment, we see that increasing R results in a sharper representation of the flow features. Notably, the 5th-order GPR2 solution in (c) captures significantly more features than the 5th-order WENO-JS in (a), as evinced by the formation of more developed Kelvin-Helmholtz vortices along the slip lines (i.e., the interface boundaries between the green triangular regions and the sky-blue areas surrounding the mushroom-shaped jet).
6.6 Double Mach Reflection
Our last 2D test is the double Mach reflection problem introduced by Woodward and Colella woodward_numerical_1984 (). This test problem consists of a strong Mach 10 shock that is incident on a reflecting wedge, forming an oblique angle with the plane of the shock. Fig. 13 shows density profiles for a variety of GP stencil radii R, as well as for the GP adaptive-order (AO) hybridization balsara_efficient_2016 () (detailed in A). We present these GP solutions together with a 5th-order WENO-JS solution for comparison. The reflection of the shock at the wall forms two Mach stems and two contact discontinuities. The contact discontinuity that emanates from the roll-up of the stronger shocks is known to be Kelvin-Helmholtz unstable, provided there is sufficiently low numerical dissipation in the method. Thus, the problem quantifies the method's numerical dissipation through the number of Kelvin-Helmholtz vortices present at the end of the run.
The presence of a highly supersonic shock is a stringent test for the stability of the algorithm. We find that the GP solutions remain stable for small values of ℓ_σ. The choice of ℓ does not appear to considerably affect stability, and thus ℓ can be set to relatively larger values than ℓ_σ. All GP runs successfully reach the final time for all choices of R on the two different resolutions shown in Fig. 13 and Fig. 15.
Close-ups in Fig. 14 reveal that GP-WENO is significantly better at capturing the development of Kelvin-Helmholtz vortices in the Mach stem than the WENO-JS method. More specifically, the 5th-order GPR2 scheme is less dissipative than the WENO-JS scheme, which in turn is less dissipative than the 3rd-order GPR1 scheme. Both the 7th-order GPR3 and 9th-order GPR4 schemes are able to resolve more vortices. While the GP-AO scheme is less dissipative than GPR2, it does not capture as many features as the GPR3 scheme, despite the fact that both GP-AO and GPR3 are of the same formal order of accuracy in smooth flows.
In Fig. 15 we provide results at double the grid resolution. The ranking derived from the lower-resolution solutions still holds, and the reduced dissipation of the GPR2 scheme relative to the 5th-order WENO-JS is more evident. Further, GPR3 on the coarser grid in Fig. 14(d) captures more vortices than WENO-JS on the finer grid in Fig. 15(a). Our results from Fig. 15 can be directly compared to one of the most recent finite difference WENO-AO solutions by Balsara et al. balsara_efficient_2016 () (see their Fig. 7).
6.7 3D Explosion
This 3D explosion test problem was introduced by Boscheri and Dumbser noauthor_direct_2014 () as a three-dimensional extension of the Sod problem sod1978survey (). The test is set up on a cubic domain with outflow boundary conditions. The initial condition is given by
(\rho, \mathbf{v}, p) = \begin{cases} (1, \mathbf{0}, 1), & r \le R, \\ (0.125, \mathbf{0}, 0.1), & r > R, \end{cases}    (37)
where r = \sqrt{x^2 + y^2 + z^2} is the spherical radius and R is the radius of the initial spherical discontinuity. The ratio of specific heats and the final simulation time follow the setup of noauthor_direct_2014 ().