Boosting the accuracy of SPH techniques:
Newtonian and special-relativistic tests

S. Rosswog
The Oskar Klein Centre, Department of Astronomy, AlbaNova, Stockholm University, SE-106 91 Stockholm, Sweden
School of Engineering and Science, Jacobs University Bremen, Campus Ring 1, 28759, Bremen, Germany
TASC, Department of Astronomy and Astrophysics, University of California, Santa Cruz, CA 95064, USA
Draft version

We study the impact of different discretization choices on the accuracy of SPH and explore them in a large number of Newtonian and special-relativistic benchmark tests. As a first improvement, we explore a gradient prescription that requires the (analytical) inversion of a small matrix. For a regular particle distribution this improves gradient accuracies by approximately ten orders of magnitude, and the SPH formulations with this gradient outperform the standard approach in all benchmark tests. Second, we demonstrate that a simple change of the kernel function can substantially increase the accuracy of an SPH scheme. While the "standard" cubic spline kernel generally performs poorly, the best overall performance is found for a high-order Wendland kernel, which allows for only very little velocity noise and enforces a very regular particle distribution, even in highly dynamical tests. Third, we explore new SPH volume elements that enhance the treatment of fluid instabilities and, last but not least, we design new dissipation triggers. They switch on near shocks and in regions where the flow, without dissipation, starts to become noisy. The resulting new SPH formulation yields excellent results, even in challenging tests where standard techniques fail completely.


1 Introduction

Smoothed Particle Hydrodynamics is a completely mesh-free, fully conservative hydrodynamics method originally suggested by Lucy (1977) and Gingold & Monaghan (1977) in an astrophysical context. By now, SPH has spread far beyond its original scope and has also found a multitude of applications in engineering. For detailed overviews of the various aspects of the method and its applications, the interested reader is referred to recent SPH reviews (Monaghan, 2005; Rosswog, 2009; Springel, 2010b; Price, 2012; Monaghan, 2012; Rosswog, 2014).
SPH has long been appreciated for a number of properties that are highly desirable in an astrophysical simulation. One of them is SPH's natural adaptivity, which comes without the burden of additional infrastructure such as an adaptive mesh. SPH is not constrained by any prescribed geometry (as is usually the case in Eulerian approaches); the SPH particles simply follow the gas flow. SPH naturally has a tendency to "refine on density" and therefore vacuum is modelled with ease: it is simply devoid of SPH particles and no computational resources are wasted on modelling it. In Eulerian approaches, vacuum usually needs to be modelled as a low-density fluid, and the interaction of the "real" fluid with the background medium can introduce substantial artifacts such as spurious friction or shocks which need to be disentangled from the physical results.
Probably the most salient advantage of SPH, however, is that the conservation of mass, energy, momentum and angular momentum can be "hardwired" into discrete SPH formulations so that conservation is guaranteed independently of the numerical resolution. In practice, this conservation is only limited by, say, the time integration accuracy or the accuracy with which gravitational forces are calculated. These issues, however, are fully controllable and can be adjusted to the desired accuracy. The exact conservation is usually enforced via symmetries in the SPH particle labels together with gradients that are antisymmetric with respect to the exchange of two particles. In SPH this is usually ensured by the use of radial kernels, $W_{ab} = W(|\vec{r}_a - \vec{r}_b|, h)$, and the application of direct kernel gradients with the property $\nabla_a W_{ab} = -\nabla_b W_{ab}$. See, for example, Sec. 2.4 in Rosswog (2009) for a detailed discussion of conservation in SPH. Below we will also discuss an alternative gradient estimate that shares the same antisymmetry properties.
Another highly desirable property is Galilean invariance. For an Eulerian approach it is crucial to perform a simulation in the best possible reference frame. For example, a binary system modelled in a space-fixed frame may completely spuriously spiral in and merge, while the same simulation performed in the frame of the binary system would have delivered the correct, stable orbital revolution (New & Tohline, 1997). For further examples related to Galilean invariance see Springel (2010a). Closely related is SPH's perfect advection property: a property assigned to an SPH particle, say a nuclear composition, is simply (and exactly) carried along as the particle moves. This is highly desirable in a number of contexts, for example, when fluid trajectories need to be post-processed with nuclear reaction networks.
But like every numerical method, SPH also has properties where improvements would be welcome, and this is the topic of this paper. Most SPH codes use artificial viscosity to ensure that the Rankine-Hugoniot relations are fulfilled in shocks. Artificial viscosity is often considered a drawback, but a well-designed artificial dissipation scheme should perform similarly to an approximate Riemann solver. The major challenge in this respect is to design switches that apply dissipation only where needed and not elsewhere. We discuss such switches in detail in Sec. 6.3. Another drawback of standard SPH approaches that has received much attention in recent years is that, depending on the numerical setup, spurious surface tension forces can emerge at contact discontinuities and suppress subtle fluid instabilities (Agertz et al., 2007; Springel, 2010b; Read et al., 2010). The reason for these surface tension forces is a mismatch in the smoothness of the internal energy and the density. This observation also provides strategies to cure the problem, either by smoothing the internal energies via artificial conductivity (Price, 2008) or by reformulating SPH in terms of volume elements that differ from the usual choice (Saitoh & Makino, 2013; Hopkins, 2013). Such approaches are discussed in detail and generalized to the special-relativistic case in Sec. 5.
SPH as derived from a Lagrangian has a built-in "re-meshing" mechanism: while the particles follow the fluid flow, they try to arrange themselves into optimal relative positions that maximize the sum of the particle volumes. In other words, the SPH force is the sum of a regularization term and the approximation to the hydrodynamic force. For optimally distributed particles the first term vanishes and the second becomes identical to the Euler momentum equation. If the particles are arranged in a non-optimal way, however, they start to move to improve the local distribution, and this motion appears as noise. Such motion is unavoidable, for example, in a multi-dimensional shock where the particles have to transition from the pre- to the post-shock lattice. While this re-meshing mechanism is highly desirable, the related motions should be very subtle. This can be achieved by using very smooth kernels and some dissipation in regions where such re-meshing occurs. Two improvements for this "noise issue" are discussed below: one is related to the choice of the smoothing kernel, see Sec. 4, and the other to a "noise trigger" for artificial dissipation, see Sec. 6.3.
Last but not least, there is a drawback that is related to one of SPH's strengths, its refinement on density. If low-density regions are geometrically close to high-density regions and are the major interest of the investigation, SPH is seriously challenged: even when huge particle numbers are invested, the low-density region will always be poorly resolved. For such problems SPH may not be the right method, and one may possibly obtain good results with substantially less effort by resorting to adaptive mesh refinement methods.
In this article, we want to explore a number of technical improvements of SPH methods. Our main focus is multi-dimensional, special-relativistic hydrodynamics, but all of the discussed improvements can also be applied straightforwardly in purely Newtonian simulations. To demonstrate that the improvements also work in the Newtonian regime, we show a number of tests in this limit. We thereby address in particular tests where "standard" SPH approaches yield very poor results. As we will demonstrate below, the suggested new SPH formulations yield excellent results even in these challenging tests. All the tests that are shown here, whether Newtonian or special-relativistic, have been performed with a new, special-relativistic SPH code called SPHINCS_SR.
The article is organized as follows. In Sec. 3 we discuss a number of kernel interpolation techniques as a basis for the following sections. In Sec. 4 we discuss different kernel functions and assess their accuracy in estimating a constant density and the gradient of a linear function for the case that the particles are arranged on a perfect lattice. We subsequently generalize the family of SPH volume elements that has recently been suggested by Hopkins (2013) and an integral-based gradient estimate (Garcia-Senz et al., 2012) to the special-relativistic case. These ingredients are combined into a number of different SPH formulations in Sec. 6, which are systematically explored in Sec. 7. By comparing the performance of the different SPH formulations one can gauge how important a particular measure is for the considered benchmark test. Our large set of benchmark tests shows that the accuracy of SPH can be drastically improved with respect to commonly made choices (high, constant dissipation; standard volume element; direct kernel gradients and the $M_4$ kernel). Our results are finally summarized in Sec. 8.

2 Translation between Newtonian and special-relativistic approaches

The focus of this work is the improvement of SPH techniques, and we demonstrate them in Newtonian and, with future applications in mind, special-relativistic hydrodynamic examples. In the following we mainly stay in the special-relativistic picture, but the notation can be straightforwardly translated to the Newtonian case:

  • Mass
    We use here the (fixed) baryon number per SPH particle, $\nu$; in the Newtonian case this corresponds to the (fixed) SPH particle mass, $m$.

  • Density
    In the relativistic treatment we need to distinguish between densities measured in our computing frame ($N$) and those measured in the local fluid rest frame ($n$); they differ by a Lorentz factor, see Eq. (46). In the Newtonian limit ($v \ll c$), of course, both densities coincide. The prescription for calculating the computing frame density, Eq. (40), corresponds to the usual Newtonian density summation.

  • Velocity
    The derivation from a relativistic Lagrangian suggests using the canonical momentum per baryon as the momentum variable. This is beneficial, since the form of the equations becomes very similar to the Newtonian case and since it avoids numerical complications (such as time derivatives of Lorentz factors) which would otherwise appear (Laguna et al., 1993). Inspection of Eq. (54) shows that in the Newtonian limit the corresponding quantity is the velocity.

  • Energy
    Again, the Lagrangian derivation suggests using the canonical energy per baryon. The resulting energy equation looks very similar to the Newtonian equation for the thermokinetic energy, $e = u + v^2/2$, see, for example, Sec. 2.3.2 in Rosswog (2009). But the simplicity of the evolution equations comes at the price of having to recover the primitive variables by an iteration at every time step. This topic is discussed in Sec. 6.4.

  • Suggested improvements
    The suggested improvements, such as the gradient estimate (Sec. 3), the kernel choice (Sec. 4), the volume element (Sec. 5) and the dissipation triggers (Sec. 6.3), then carry over straightforwardly. We have not performed extensive tests, though, to double-check whether our parameter choices can be blindly applied in Newtonian formulations.

3 Interpolation and gradients

As a basis for the later SPH discretizations in Sec. 6 we briefly discuss here some basic properties of discrete kernel interpolations. For now we keep the SPH volume element, $V_b$, unspecified; particular choices are discussed in Sec. 5. Nothing in this section is specific to the special-relativistic case. In the following, we use the convention that upper indices refer to spatial directions while lower indices refer to SPH particle identities. Usually, we will use "a" for the particle of interest and "b" for a neighbor particle. We follow the convention of summing from 1 to the number of spatial dimensions over spatial indices that appear twice. In some cases, though, we explicitly write out the summation for clarity.

3.1 SPH kernel interpolation

At the heart of SPH is the smooth representation of a quantity $f$ known at discrete ("particle") positions $\vec{r}_b$ by

$\tilde{f}(\vec{r}) = \sum_b V_b\, f_b\, W(|\vec{r} - \vec{r}_b|, h)$,   (1)

where $W$ is a smoothing kernel whose width is set by the smoothing length $h$. Particular choices of the smoothing kernel are discussed in Sec. 4. To assess the accuracy of the approximation Eq. (1) one can Taylor-expand $f_b$ around $\vec{r}$,

$f_b = f(\vec{r}) + (\nabla f)_{\vec{r}} \cdot (\vec{r}_b - \vec{r}) + \mathcal{O}\left((\vec{r}_b - \vec{r})^2\right)$,   (2)

and insert it into Eq. (1),

$\tilde{f}(\vec{r}) = f(\vec{r}) \sum_b V_b\, W_b(\vec{r}) + (\nabla f)_{\vec{r}} \cdot \sum_b V_b\, (\vec{r}_b - \vec{r})\, W_b(\vec{r}) + \ldots$,   (3)

where $W_b(\vec{r}) \equiv W(|\vec{r} - \vec{r}_b|, h)$. Requiring that $\tilde{f}$ be a close approximation to $f$ then provides us with the "quality indicators" for the discrete kernel interpolation,

$\mathcal{Q}_1:\; \sum_b V_b\, W_b(\vec{r}) \simeq 1$   (4)
$\mathcal{Q}_2:\; \sum_b V_b\, (\vec{r}_b - \vec{r})\, W_b(\vec{r}) \simeq \vec{0}$,   (5)

which, for a good particle distribution, should be fulfilled to high accuracy. $\mathcal{Q}_1$ simply states that the particles should provide a good partition of unity.
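As a small numerical illustration of the partition-of-unity requirement $\mathcal{Q}_1$, the following Python sketch evaluates the kernel sum on a uniform 1D particle lattice with the cubic spline kernel; the lattice spacing, smoothing length parameter and evaluation point are illustrative choices, not values from this paper:

```python
import numpy as np

def m4_1d(q):
    """Normalized 1D M4 cubic spline kernel shape, support q in [0, 2]."""
    w = np.where(q < 1.0, 1.0 - 1.5*q**2 + 0.75*q**3,
        np.where(q < 2.0, 0.25*(2.0 - q)**3, 0.0))
    return (2.0/3.0)*w   # sigma = 2/3 in 1D

# uniform 1D particle lattice with spacing dx
dx = 0.01
x = np.arange(0.0, 1.0, dx)
h = 1.3*dx                        # smoothing length (eta = 1.3, illustrative)
V = np.full_like(x, dx)           # particle volumes

# quality indicator Q1 at an interior point r: sum_b V_b W(|r - x_b|, h) ~ 1
r = 0.5
Q1 = np.sum(V*m4_1d(np.abs(r - x)/h)/h)
print(abs(Q1 - 1.0))              # small: good partition of unity on a lattice
```

On such a regular lattice the deviation from unity is at the sub-percent level; disordering the particle positions degrades it, which is exactly what Sec. 3.6 quantifies for the gradients.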

3.2 Standard SPH gradient

A standard SPH procedure is to take directly the gradient of the interpolant Eq. (1),

$\nabla \tilde{f}(\vec{r}) = \sum_b V_b\, f_b\, \nabla W_b(\vec{r})$.   (6)

Although this estimate is known to be of only moderate accuracy, it is advantageous because the involved kernel gradient has the desired antisymmetry property, $\nabla_a W_{ab} = -\nabla_b W_{ab}$, which makes it easy to obtain a fully conservative set of SPH equations, see e.g. Sec. 2.4 in Rosswog (2009) for an explicit discussion of numerical conservation in SPH. One can again proceed as above and insert a Taylor expansion into Eq. (6) to obtain

$\nabla \tilde{f}(\vec{r}) = f(\vec{r}) \sum_b V_b\, \nabla W_b(\vec{r}) + (\partial_k f)_{\vec{r}} \sum_b V_b\, (\vec{r}_b - \vec{r})^k\, \nabla W_b(\vec{r}) + \ldots$   (7)

Requiring that $\nabla \tilde{f}$ be a close approximation to $\nabla f$ then delivers the quality indicators of this gradient estimate:

$\mathcal{Q}_3:\; \sum_b V_b\, \nabla W_b(\vec{r}) \simeq \vec{0}$   (8)
$\mathcal{Q}_4:\; \sum_b V_b\, (\vec{r}_b - \vec{r})^k\, \partial^i W_b(\vec{r}) \simeq \delta^{ki}$.   (9)

$\mathcal{Q}_3$ is simply the gradient of $\mathcal{Q}_1$ and therefore again an expression of the partition of unity requirement.

3.3 Constant-exact gradient

It is obvious that the gradient estimate Eq. (6) does not necessarily vanish for constant function values $f$, which is sometimes referred to as a lack of zeroth order consistency. The gradient only vanishes in the case when $\mathcal{Q}_3$, Eq. (8), is fulfilled exactly. This property, however, can be enforced by simply subtracting the leading error term, see Eq. (7),

$(\nabla f)^{\rm CE}_a = \sum_b V_b\, (f_b - f_a)\, \nabla_a W_{ab}$,   (10)

so that now a constant function is reproduced exactly. For a non-regular particle distribution this substantially improves the gradient estimate, see Sec. 3.6, but it comes at the price that the expression does not have the desired antisymmetry and therefore makes it much harder to obtain exact conservation.
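A minimal sketch of the difference between Eq. (6) and Eq. (10): for a constant function on an irregular 1D particle distribution, the standard estimate generally returns a nonzero gradient while the constant-exact form vanishes identically. Particle number, volumes and smoothing length are arbitrary illustrative choices:

```python
import numpy as np

rng = np.random.default_rng(0)

def grad_w_1d(dx, h):
    """d/dx of the un-normalized 1D M4 cubic spline, support 2h."""
    q = np.abs(dx)/h
    dwdq = np.where(q < 1.0, -3.0*q + 2.25*q**2,
            np.where(q < 2.0, -0.75*(2.0 - q)**2, 0.0))
    return np.sign(dx)*dwdq/h

# irregular 1D particle distribution; particle of interest a sits at x_a = 0
x = np.sort(rng.uniform(-1.0, 1.0, 200))
V = np.full(200, 2.0/200)
h = 0.1
f = np.full(200, 3.7)            # constant function
fa = 3.7

gw = grad_w_1d(0.0 - x, h)       # grad_a W_ab for particle a at the origin
grad_std = np.sum(V*f*gw)        # Eq. (6): generally nonzero on disorder
grad_ce = np.sum(V*(f - fa)*gw)  # Eq. (10): exactly zero for constant f
print(grad_std, grad_ce)
```

The constant-exact sum is built from exact zeros, so it vanishes to the last bit, while the standard sum inherits the noise of the particle distribution.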

3.4 Linear-exact gradient

A linear-exact gradient estimate can be constructed (Price, 2004) starting from Eqs. (6) and (7) specified to the particle position $\vec{r}_a$,

$\sum_b V_b\, (f_b - f_a)\, \partial^i_a W_{ab} \simeq (\partial^k f)_a \sum_b V_b\, (\vec{r}_b - \vec{r}_a)^k\, \partial^i_a W_{ab}$,   (11)

which can be rearranged into

$b^i = M^{ik}\, (\partial^k f)_a$.   (12)

By matrix inversion one obtains a linearly exact gradient estimate,

$(\partial^k f)^{\rm LE}_a = (M^{-1})^{ki}\, b^i$,   (13)

where

$M^{ik} = \sum_b V_b\, (\vec{r}_b - \vec{r}_a)^k\, \partial^i_a W_{ab}$   (14)

and

$b^i = \sum_b V_b\, (f_b - f_a)\, \partial^i_a W_{ab}$.   (15)

$M$ contains information about the local particle distribution while $\vec{b}$ contains the function values at the neighboring particles. Obviously, the calculation of $(\partial^k f)^{\rm LE}_a$ requires the inversion of a $D \times D$ matrix ($D$ being the number of spatial dimensions), but this can be done analytically and does not represent a major computational burden.
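The linear-exact construction, Eq. (13), can be sketched in a few lines of Python; in 2D the matrix is 2x2 and is inverted analytically. The particle cloud, kernel and test function below are illustrative assumptions; the point is that the gradient of a linear function is recovered to machine precision for an arbitrary particle arrangement (the overall sign convention of the kernel gradient cancels between the matrix and the right-hand side):

```python
import numpy as np

rng = np.random.default_rng(1)

def grad_w_2d(dx, dy, h):
    """Gradient of the normalized 2D M4 cubic spline kernel w.r.t. (dx, dy)."""
    r = np.sqrt(dx**2 + dy**2)
    q = r/h
    dwdq = np.where(q < 1.0, -3.0*q + 2.25*q**2,
            np.where(q < 2.0, -0.75*(2.0 - q)**2, 0.0))
    sig = 10.0/(7.0*np.pi*h**2)
    with np.errstate(invalid='ignore', divide='ignore'):
        fac = np.where(r > 0.0, sig*dwdq/(h*r), 0.0)
    return fac*dx, fac*dy

# irregular 2D neighbor cloud around the particle of interest at the origin
pos = rng.uniform(-1.0, 1.0, size=(400, 2))
V = np.full(400, 4.0/400)            # equal volume elements
h = 0.3
f = 2.0*pos[:, 0] + 3.0*pos[:, 1]    # linear test function, gradient (2, 3)
fa = 0.0                             # its value at the origin

gx, gy = grad_w_2d(pos[:, 0], pos[:, 1], h)
bvec = np.array([np.sum(V*(f - fa)*gx), np.sum(V*(f - fa)*gy)])
M = np.array([[np.sum(V*pos[:, 0]*gx), np.sum(V*pos[:, 1]*gx)],
              [np.sum(V*pos[:, 0]*gy), np.sum(V*pos[:, 1]*gy)]])

# analytic inversion of the 2x2 matrix M, grad = M^{-1} b, Eq. (13)
det = M[0, 0]*M[1, 1] - M[0, 1]*M[1, 0]
grad = np.array([M[1, 1]*bvec[0] - M[0, 1]*bvec[1],
                 M[0, 0]*bvec[1] - M[1, 0]*bvec[0]])/det
print(grad)   # recovers (2, 3) for any particle arrangement
```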

3.5 Integral-based gradient

More than five decades ago it was realized that derivatives can also be estimated by actually performing an integration (Lanczos, 1956). The resulting generalized derivative has a number of interesting properties. Among them is its existence even where conventional derivatives are not defined, and the property that its value is the average of the left- and right-hand side limits of the derivative. As an example, the Lanczos derivative of $f(x) = |x|$ at $x = 0$ is $0$, the average of the one-sided slopes $-1$ and $+1$. From a numerical perspective, this derivative has the desirable property that it is rather insensitive to noise in the data from which the derivative is to be estimated.
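The Lanczos derivative can be written as $D f(x) = \lim_{\epsilon\to 0}\, \frac{3}{2\epsilon^3}\int_{-\epsilon}^{\epsilon} t\, f(x+t)\, dt$; a small numerical sketch (step sizes are illustrative) reproduces the two properties quoted above:

```python
import numpy as np

def lanczos_derivative(f, x, eps=1e-3, n=20001):
    """D f(x) = 3/(2 eps^3) * int_{-eps}^{eps} t f(x+t) dt (Riemann sum)."""
    t = np.linspace(-eps, eps, n)
    dt = t[1] - t[0]
    return 3.0/(2.0*eps**3)*np.sum(t*f(x + t))*dt

d_abs = lanczos_derivative(np.abs, 0.0)   # 0: mean of left/right slopes -1, +1
d_sin = lanczos_derivative(np.sin, 0.0)   # close to 1, with O(eps^2) error
print(d_abs, d_sin)
```

Because the integration averages over a small interval, adding zero-mean noise to the data changes the result only weakly, which is the noise insensitivity referred to in the text.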
In an SPH context, integral-based estimates for second derivatives have been applied frequently, mainly because they are substantially less noise-prone than those resulting from directly taking second derivatives of kernel approximations (Brookshaw, 1985; Monaghan, 2005). For first order derivatives, however, such integral approximations have only been explored very recently (Garcia-Senz et al., 2012; Cabezon et al., 2012; Jiang et al., 2014). We will assess the accuracy of the integral-based gradient estimates under idealized conditions in Sec. 3.6, and, in more practical, dynamical tests in Sec. 7.
The function $f(\vec{r}\,')$ in the expression

$\tilde{I}^i(\vec{r}) = \int \left[f(\vec{r}\,') - f(\vec{r})\right] (\vec{r}\,' - \vec{r})^i\, W(|\vec{r}\,' - \vec{r}|, h)\, dV'$   (16)

can be Taylor-expanded around $\vec{r}$, so that one finds

$\tilde{I}^i(\vec{r}) \simeq \tilde{\tau}^{ik}(\vec{r})\, (\partial_k f)(\vec{r})$.   (17)

Therefore the gradient component representation, which is exact for linear functions, is given by

$(\partial_k f)(\vec{r}) = \tilde{C}^{ki}(\vec{r})\, \tilde{I}^i(\vec{r})$,   (18)

where the matrix $\tilde{C}$ is the inverse of the symmetric matrix $\tilde{\tau}$ whose components read

$\tilde{\tau}^{ik}(\vec{r}) = \int (\vec{r}\,' - \vec{r})^i\, (\vec{r}\,' - \vec{r})^k\, W(|\vec{r}\,' - \vec{r}|, h)\, dV'$.   (19)

$\tilde{\tau}$ contains only position information while $\tilde{I}$ also contains the function to be differentiated. In the following we will approximate the integrals in Eqs. (16) and (19) by conventional SPH summations over particles (the resulting summation approximations have no tilde), which yields

$\tau^{ik}_a = \sum_b V_b\, (\vec{r}_b - \vec{r}_a)^i\, (\vec{r}_b - \vec{r}_a)^k\, W_{ab}$   (20)

and

$(\partial_k f)^{\rm fIA}_a = C^{ki}_a \sum_b V_b\, (f_b - f_a)\, (\vec{r}_b - \vec{r}_a)^i\, W_{ab}$.   (21)
Whenever we use this expression in a gradient estimate, we refer to it as the “full integral approximation”, or fIA for short.
It is worth mentioning that for a radial kernel the gradient can be written as

$\nabla_a W_{ab} = (\vec{r}_b - \vec{r}_a)\, Y_{ab}$,   (22)

where $Y_{ab} = Y(|\vec{r}_a - \vec{r}_b|, h) \equiv -\frac{1}{r}\left.\frac{\partial W}{\partial r}\right|_{r = |\vec{r}_a - \vec{r}_b|}$ is also a valid, positive definite and compactly supported kernel function. Therefore, if Eq. (22) is inserted in Eqs. (14) and (15), one recovers the fIA-gradient formula, i.e. the LE- and fIA-gradient approaches are actually equivalent.
If we now assume that the quality indicator $\mathcal{Q}_2$, Eq. (5), is fulfilled to good accuracy, we can drop the term containing $f_a$ to obtain

$(\partial_k f)^{\rm IA}_a = C^{ki}_a \sum_b V_b\, f_b\, (\vec{r}_b - \vec{r}_a)^i\, W_{ab}$.   (23)

We refer to Eq. (18) with this approximation as the "integral approximation", or IA for short. How good this approximation is in practice depends on the regularity of the particle distribution, see also Sec. 3.6. As pointed out by Garcia-Senz et al. (2012), this last approximation breaks the exactness of the gradient for linear functions but, on the other hand, it rewards us with a gradient estimate that is antisymmetric with respect to the exchange of $a$ and $b$. This is crucial to ensure that the strongest property of SPH, the exact conservation, remains preserved. From Eq. (23) it is obvious that the gradient only vanishes exactly in the case of constant $f$ if $\mathcal{Q}_2$, Eq. (5), is fulfilled exactly. So rather than having to fulfill several quality criteria for the function interpolant and its gradient, Eqs. (4) to (9), we only need to ensure the interpolation quality in the form of Eq. (5) in order to also have accurate gradient estimates.
So our gradient estimate in integral approximation reads explicitly

$(\partial_i f)^{\rm IA}_a = \sum_k C^{ik}_a \sum_b V_b\, f_b\, (\vec{r}_b - \vec{r}_a)^k\, W_{ab}$.   (24)

From the comparison with Eq. (6) it is obvious that the second sum takes over the role that is usually played by the kernel gradient:

$(\nabla_a W_{ab})^i \;\longrightarrow\; \sum_k C^{ik}_a\, (\vec{r}_b - \vec{r}_a)^k\, W_{ab}$.   (25)
We will make use of this replacement in Sec. 6.2 to obtain an alternative SPH formulation with integral-based derivative estimates.
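The full integral approximation, Eq. (21), is compact enough to sketch directly. The following Python fragment implements it for an irregular 2D particle cloud; the kernel, particle numbers and the linear test function are illustrative assumptions. Since the kernel normalization appears in both the moment matrix and the function sum, it cancels and the un-normalized kernel shape suffices:

```python
import numpy as np

rng = np.random.default_rng(2)

def w_m4(q):
    """Un-normalized M4 shape; the normalization cancels in the fIA estimate."""
    return np.where(q < 1.0, 1.0 - 1.5*q**2 + 0.75*q**3,
           np.where(q < 2.0, 0.25*(2.0 - q)**3, 0.0))

# irregular 2D particle cloud; gradient evaluated at the origin
pos = rng.uniform(-1.0, 1.0, size=(500, 2))
V = np.full(500, 4.0/500)
h = 0.3
f = 1.5*pos[:, 0] - 0.5*pos[:, 1]    # linear test function
fa = 0.0                             # its value at the origin

W = w_m4(np.linalg.norm(pos, axis=1)/h)

# tau^{ik} = sum_b V_b d^i d^k W_ab,   I^i = sum_b V_b (f_b - f_a) d^i W_ab
tau = np.einsum('b,bi,bk,b->ik', V, pos, pos, W)
I = np.einsum('b,b,bi,b->i', V, f - fa, pos, W)
grad = np.linalg.solve(tau, I)       # applies C = tau^{-1}
print(grad)                          # (1.5, -0.5) for any particle arrangement
```

Dropping the $f_a$ terms from the sum turns this into the antisymmetric IA variant discussed above, at the cost of the exactness for linear functions.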

3.6 Assessment of the gradient accuracy

Figure 1: Sensitivity of different gradient prescriptions to the regularity of the particle distribution. The left panel shows results for a perfect 2D hexagonal lattice corresponding to the closest packing of spheres of radius $r_s$. The right panel shows results for a slightly perturbed hexagonal lattice that was obtained by displacing each particle of a hexagonal lattice in a random direction by a small, randomly chosen distance. The parameter $\eta$ determines the smoothing length via $h = \eta\, V^{1/D}$, where $V$ is the particle volume. "Std. SPH gradient" refers to the direct gradient of the SPH interpolant, Eq. (6), "CE gradient" stands for "constant-exact gradient" and is calculated according to Eq. (10), the "IA gradient" is calculated from Eq. (24), the "fIA gradient" from Eq. (21) and the "LE gradient" from Eq. (13). For a regular particle distribution the gradient estimate can be improved by about ten orders of magnitude by using the IA prescription (left panel; the CE gradient coincides with the standard SPH estimate and is therefore not shown). But even a small perturbation of the lattice degrades the gradient quality to an accuracy similar to the standard SPH estimate (right panel). The LE and fIA gradients are hardly affected and are therefore not shown in the right panel. See text for more details.

We briefly want to assess the accuracy of the different gradient estimates, Eqs. (6), (10), (13), (21) and (24) (with the standard cubic spline kernel), in a numerical experiment. Our experiment is similar to the one in Rosswog (2010): we set up particles on a 2D hexagonal lattice, corresponding to a close-packed distribution of spheres with radius $r_s$. The particles are assigned equal baryon numbers/masses and pressures that rise linearly with the x-coordinate. The numerical gradient estimate is calculated via Eq. (6) ("SPH gradient"), the linear-exact gradient ("LE gradient"), Eq. (13), Eq. (21) (full integral approximation, "fIA gradient") and Eq. (23) ("IA gradient"). In Fig. 1 we display the error as a function of the parameter $\eta$ by which we set the smoothing length,

$h_a = \eta\, V_a^{1/D}$,   (26)

based on the particle volume $V_a$. For this perfectly regular particle distribution the quality indicators, Eqs. (4) and (5), are fulfilled to high accuracy; the constant-exact (CE) gradient is therefore practically identical to the standard SPH estimate and is not shown in the left panel of Fig. 1. For the same reason the IA approximation, Eq. (23), is very accurate and yields a gradient estimate (red) that is roughly ten orders of magnitude better than the standard SPH gradient estimate (black). The full integral approximation and the linear-exact prescription reproduce the exact result to within machine precision but, as noted above, they lack the desirable antisymmetry property that facilitates exact numerical conservation.
As expected from the term that was neglected to obtain Eq. (24), the quality of the antisymmetric gradient estimate is sensitive to the particle distribution. To illustrate this, we perform a variant of the previous experiment in which we slightly perturb the perfect hexagonal lattice: to each particle position we add a randomly oriented vector of small, randomly chosen modulus. Even this very subtle perturbation substantially degrades the accuracy of those gradients that do not account for the actual particle distribution, i.e. the standard SPH and the IA gradient prescriptions. The CE, LE and fIA gradient estimates make use of the information on the local particle distribution and are therefore hardly affected. For the perturbed distribution the IA gradient no longer has an obvious advantage over the standard SPH gradient; both now show comparable errors. Therefore further, dynamical, tests are required to see whether a sufficiently regular particle distribution can be maintained during a dynamical simulation and to judge whether the IA prescription really improves the accuracy in practice. As will be shown below, and consistent with the findings of Garcia-Senz et al. (2012), the integral-based, antisymmetric IA approach by far outperforms the traditional direct kernel gradients in all dynamical tests.
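The perturbed-lattice experiment can be sketched as follows; the perturbation amplitude and lattice size are illustrative stand-ins for the values used in the figure, not the paper's exact setup. The linear-exact estimate, Eq. (13), stays exact while the standard estimate, Eq. (6), picks up a distribution-dependent error:

```python
import numpy as np

rng = np.random.default_rng(4)

def dw_m4(q):
    """q-derivative of the un-normalized M4 shape."""
    return np.where(q < 1.0, -3.0*q + 2.25*q**2,
           np.where(q < 2.0, -0.75*(2.0 - q)**2, 0.0))

# hexagonal lattice of spheres with radius r_s, then slightly perturbed
r_s = 0.5
pos = np.array([(2*r_s*i + r_s*(j % 2), np.sqrt(3.0)*r_s*j)
                for j in range(-20, 21) for i in range(-20, 21)])
a = 20*41 + 20                    # index of the particle at the origin
pos += 0.01*r_s*rng.uniform(-1.0, 1.0, pos.shape)  # assumed amplitude

area = 2.0*np.sqrt(3.0)*r_s**2    # particle volume (area) on the lattice
V = np.full(len(pos), area)
h = 1.3*np.sqrt(area)             # h = eta * V^(1/2), eta = 1.3
f = pos[:, 0]                     # linear function with exact gradient (1, 0)

d = pos - pos[a]                  # separations r_b - r_a
r = np.linalg.norm(d, axis=1)
sig = 10.0/(7.0*np.pi*h**2)
with np.errstate(invalid='ignore', divide='ignore'):
    fac = np.where(r > 0.0, sig*dw_m4(r/h)/(h*r), 0.0)
g = -d*fac[:, None]               # grad_a W_ab points from b towards a

grad_std = np.einsum('b,b,bi->i', V, f, g)   # Eq. (6)
M = np.einsum('b,bj,bi->ij', V, d, g)
b_vec = np.einsum('b,b,bi->i', V, f - f[a], g)
grad_le = np.linalg.solve(M, b_vec)          # Eq. (13)

print(np.abs(grad_std - [1, 0]).max(), np.abs(grad_le - [1, 0]).max())
```

The first printed number (standard estimate) reflects the lattice noise; the second (LE estimate) stays at machine precision.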

4 Kernel choice

4.1 Kernels

In Sect. 3 we have briefly collected some kernel interpolation basics. Traditionally, "bell-shaped" kernels with vanishing derivatives at the origin have been preferred as SPH kernels because they are rather insensitive to the exact positions of nearby particles and are therefore good density estimators (Monaghan, 1992). For most bell-shaped kernels, however, a "pairing instability" sets in once the smoothing length exceeds a critical value. When it sets in, particles start to form pairs and, in the worst case, two particles are effectively replaced by one. This has no dramatic effect other than halving the effective particle number, though still at the original computational expense. Recently, there have been several studies that (re-)proposed peaked kernels (Read et al., 2010; Valcke et al., 2010) as a possible cure for the pairing instability. Such kernels have the benefit of producing very regular particle distributions, but as the experiments below show, they require the summation over many neighboring particles for an acceptable density estimate. It has recently been pointed out by Dehnen & Aly (2012), however, that, contrary to what was previously thought, the pairing instability is not caused by the vanishing central derivative; instead, a necessary condition for stability against pairing is a non-negative Fourier transform of the kernel. For example, the Wendland kernel family (out of which we explore one member below) has a vanishing central derivative but does not fall prey to the pairing instability. This is, of course, advantageous for convergence studies since the kernel support size is not restricted.
In the following, we collect a number of kernels whose properties are explored below. We give the kernels in the form in which they are usually presented in the literature, but to ensure a fair comparison in the tests, we rescale them all to a support of $[0, 2]$, like that of the most commonly used SPH kernel, $M_4$, see below. So if a kernel has a normalization $\sigma$ for a support of $[0, 1]$, it has, in $D$ spatial dimensions, the normalization $\sigma/2^D$ when it is stretched to a support of $[0, 2]$.

Figure 2: Comparison of the standard spline kernels $M_4$ and $M_6$ with the kernel family $W_{H,n}$ (kernels left, their derivatives right). Note that for ease of comparison the $M_6$ kernel has been stretched to a support of $[0, 2]$. $W_{H,3}$ is a close approximation of $M_4$ and, if scaled to the same support, $W_{H,5}$ is similar to $M_6$. Also shown is the Wendland kernel $W_{3,3}$.


Figure 3: The peaked kernels LIQ and QCM: kernels on the left and their derivatives on the right. The QCM kernel has been constructed so that it hardly deviates from $M_6$, but has a non-vanishing central derivative. For comparison the $M_6$ kernel is also plotted (dashed lines).

4.1.1 Kernels with vanishing central derivatives

B-spline functions: $M_4$ and $M_6$ kernels
The most commonly used SPH kernels are the so-called B-spline functions (Schoenberg, 1946), $M_n$, which are generated as Fourier transforms:

$M_n(x, h) = \frac{1}{2\pi} \int_{-\infty}^{\infty} \left[\frac{\sin(kh/2)}{kh/2}\right]^n \cos(kx)\, dk$.   (27)

The smoothness of the functions increases with $n$: they are continuous up to the $(n-2)$-th derivative. Since SPH requires at the very least continuity in the first and second derivative, the cubic spline kernel $M_4$,

$w_4(q) = \begin{cases} \frac{1}{4}(2-q)^3 - (1-q)^3, & 0 \le q < 1 \\ \frac{1}{4}(2-q)^3, & 1 \le q < 2 \\ 0, & q \ge 2, \end{cases}$   (28)

is the lowest-order member of this kernel family that is a viable option. It is often considered the "standard choice" in SPH. The normalized SPH kernel then has the form

$W(|\vec{r} - \vec{r}\,'|, h) = \frac{\sigma}{h^D}\, w(q)$,   (29)

where $q = |\vec{r} - \vec{r}\,'|/h$. (We use the convention that $W$ refers to the full, normalized kernel while $w$ is the un-normalized shape of the kernel.) The normalizations $\sigma$ are obtained from the condition

$\int W(|\vec{r}|, h)\, dV = 1$,   (30)

where the integration extends over the kernel support, $q \le Q$, with $Q = 2$ for $M_4$. This yields values of $\sigma = 2/3$, $10/(7\pi)$ and $1/\pi$ in 1, 2 and 3 dimensions.
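The quoted normalizations can be checked numerically from the normalization condition; for instance, in 3D the condition reads $4\pi\, \sigma \int_0^2 q^2\, w_4(q)\, dq = 1$. A simple Riemann-sum sketch:

```python
import numpy as np

def w_m4(q):
    """Un-normalized M4 cubic spline shape, support q in [0, 2]."""
    return np.where(q < 1.0, 1.0 - 1.5*q**2 + 0.75*q**3,
           np.where(q < 2.0, 0.25*(2.0 - q)**3, 0.0))

# sigma = 1 / (4 pi int_0^2 q^2 w(q) dq) in 3D
q = np.linspace(0.0, 2.0, 200001)
dq = q[1] - q[0]
sigma_3d = 1.0/(4.0*np.pi*np.sum(q**2*w_m4(q))*dq)
print(sigma_3d*np.pi)   # -> 1: recovers sigma = 1/pi in 3D
```

(The shape is written here in the equivalent polynomial form $1 - \frac{3}{2}q^2 + \frac{3}{4}q^3$ for $q < 1$; replacing the radial weight $4\pi q^2$ by $2\pi q$ or $2$ gives the 2D and 1D constants.)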
The $M_6$ kernel (truncated at support $Q = 3$) is given as

$w_6(q) = \begin{cases} (3-q)^5 - 6(2-q)^5 + 15(1-q)^5, & 0 \le q < 1 \\ (3-q)^5 - 6(2-q)^5, & 1 \le q < 2 \\ (3-q)^5, & 2 \le q < 3 \\ 0, & q \ge 3, \end{cases}$   (31)

with normalizations of $\sigma = 1/120$, $7/(478\pi)$ and $1/(120\pi)$ in 1, 2 and 3 dimensions. Note that for a fair comparison the $M_6$ kernel is rescaled in all plots to a support of $[0, 2]$.

A parameterized family of kernels
More recently, a one-parameter family of kernels has been suggested (Cabezon et al., 2008),

$W_{H,n}(q) = \frac{\sigma_{H,n}}{h^D} \left[\frac{\sin\left(\frac{\pi}{2} q\right)}{\frac{\pi}{2} q}\right]^n \quad {\rm for} \quad 0 < q \le 2$,   (32)

where the index $n$ determines the smoothness and the shape of the kernel, see Fig. 2. The normalization of this kernel family can be expressed as a fifth-order polynomial in $n$ whose coefficients can be found in Tab. 2 of Cabezon et al. (2008). In this form, $n$ is allowed to vary continuously between 2 and 7. We use here exactly the form described in their paper; in particular, all kernels have a support of $[0, 2]$. The $W_{H,3}$ kernel is a very close approximation to $M_4$, see Fig. 2, while $W_{H,5}$ is very similar to $M_6$, provided they are stretched to have the same support. For practical calculations we use the kernels for $n$-values from 3 to 9; the corresponding normalizations are given in Table 1.

Normalization of the $W_{H,n}$ kernels

n= 3 n= 4 n= 5 n= 6 n= 7 n= 8 n= 9
1D 0.66020338 0.75221501 0.83435371 0.90920480 0.97840221 1.04305235 1.10394401
2D 0.45073324 0.58031218 0.71037946 0.84070999 0.97119717 1.10178466 1.23244006
3D 0.31787809 0.45891752 0.61701265 0.79044959 0.97794935 1.17851074 1.39132215
Table 1: Normalization constants of the $W_{H,n}$ kernels in one, two and three dimensions.
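Assuming the sinc form given above for the kernel shape, the tabulated constants can be reproduced by direct numerical integration of the normalization condition; the sketch below does this in 3D for a few odd members:

```python
import numpy as np

def wh_shape(q, n):
    """Un-normalized W_{H,n} shape: sinc^n(pi q / 2), support q in [0, 2]."""
    x = 0.5*np.pi*q
    out = np.ones_like(q)
    nz = x > 0.0
    out[nz] = (np.sin(x[nz])/x[nz])**n
    return np.where(q < 2.0, out, 0.0)

# 3D normalization: sigma = 1 / (4 pi int_0^2 q^2 w(q) dq)
q = np.linspace(0.0, 2.0, 400001)
dq = q[1] - q[0]
sigmas = {}
for n in (3, 5, 7):
    sigmas[n] = 1.0/(4.0*np.pi*np.sum(q**2*wh_shape(q, n))*dq)
print(sigmas)   # compare with the 3D row of Table 1
```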

Wendland kernels
An interesting class of kernels with compact support and positive Fourier transforms are the so-called Wendland functions (Wendland, 1995). In various areas of applied mathematics they have long been appreciated for their good interpolation properties, but they have not received much attention as SPH kernels. Recently, they have been discussed in some detail in Dehnen & Aly (2012), where it was found in particular that these kernels are not prone to the pairing instability, despite having a vanishing central derivative. We only experiment here with one particular example, the smooth $C^6$ kernel $W_{3,3}$,

$w_{3,3}(q) = (1-q)_+^8 \left(32 q^3 + 25 q^2 + 8 q + 1\right)$,   (33)

see e.g. Schaback & Wendland (2006), where the symbol $(\cdot)_+$ denotes the cutoff function $(x)_+ = \max(x, 0)$. The normalization is $78/(7\pi)$ and $1365/(64\pi)$ in 2 and 3 dimensions, respectively. As we will demonstrate in the benchmark tests below, this kernel has some very interesting properties; in particular, it maintains a highly ordered particle distribution in dynamical simulations and does not fall prey to the pairing instability.
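The 3D normalization quoted above can be verified numerically; the following sketch assumes the kernel's native support $[0, 1]$ (it is stretched to $[0, 2]$ in the actual comparisons):

```python
import numpy as np

def wendland_c6(q):
    """Un-normalized Wendland C^6 shape, support q in [0, 1]."""
    return np.where(q < 1.0,
                    (1.0 - q)**8*(32*q**3 + 25*q**2 + 8*q + 1.0), 0.0)

# check that sigma = 1365/(64 pi) normalizes the kernel in 3D
q = np.linspace(0.0, 1.0, 200001)
dq = q[1] - q[0]
integral = 4.0*np.pi*np.sum(q**2*wendland_c6(q))*dq
print(integral*1365.0/(64.0*np.pi))   # -> 1
```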

4.1.2 Kernels with non-vanishing central derivatives

Here, we briefly discuss two kernel functions with non-vanishing central derivatives. The first example is the "linear quartic" (LIQ) kernel that has been suggested (Valcke et al., 2010) to achieve a regular particle distribution and to improve SPH's performance in Kelvin-Helmholtz tests. The second example is shown mainly for pedagogical reasons: it illustrates how a very subtle change to the core of the well-appreciated $M_6$ kernel seriously compromises its accuracy.

Linear quartic kernel, LIQ
The centrally peaked "linear quartic" (LIQ) kernel (Valcke et al., 2010) is linear in $q$ out to a transition radius and quartic beyond it. Its normalization constant is 2.962 in 2D and 3.947 in 3D.

Quartic core $M_6$ kernel, QCM
We briefly explore a modification of the $M_6$ kernel so that it remains very smooth, but has a non-vanishing derivative in the center. This quartic core kernel (QCM) is constructed by replacing the second derivative of the $M_6$ kernel inside a transition radius $q_c$ by a parabola whose parameters have been chosen so that it fits the kernel smoothly and differentiably at $q_c$. The coefficients are determined from the continuity conditions for the kernel and its derivatives at the transition radius; the resulting numerical values are given in Table 2. Note that the QCM kernel is continuous everywhere up to the third derivative. The peaked kernels LIQ and QCM are compared in Fig. 3.

parameter numerical value
8.245880 E-3
4.649647 E-3
2.650839 E-3
Table 2: Parameters of the quartic-core-$M_6$ kernel (QCM).

4.2 Accuracy assessment

4.2.1 Kernel support size

We give in Tab. 3 the $\eta$ values that are used in the numerical experiments. They are chosen to be very large, but still small enough to avoid particle pairing. Also given is the value of $\eta$ at which the density estimate reaches its best accuracy, together with the corresponding neighbor number for a hexagonal lattice.

4.2.2 Density estimates

Figure 4: Accuracy of density estimates for different kernels. Particles are placed on a hexagonal lattice so that the density is uniform, and the SPH density estimate is compared with the theoretical value to calculate the relative density error, see left panel. The parameter $\eta$ determines the smoothing length, $h = \eta\, V^{1/D}$ (LIQ: linear quartic, QCM: quartic core $M_6$). To clarify the nature of the "dips" in the left panel, we plot in the right panel the logarithm of the relative density error for the case of the cubic spline kernel $M_4$. The dips occur when the density error changes sign (the exact value is indicated by the dashed red line).

To assess the density estimation accuracy, we perform a simple experiment: we place the SPH particles on a hexagonal lattice and assign all of them a constant value for the baryon number (in the Newtonian case this is equivalent to assigning a constant particle mass), . Since the effective area belonging to each particle with radius in a close-packed configuration is , the theoretical density value is . We now measure the average relative deviation between the estimated and the theoretical density value, ; the results are shown in the left panel of Fig. 4 as a function of the smoothing length parameter , see Eq. (26). For clarity, we only show the odd members of the W family. Note that we show the results up to large values of where, in a dynamical simulation, some of the kernels would already have become pairing unstable. The “dips” in the left panel occur where the density error changes sign. The right panel of Fig. 4 illustrates this for the case of the M kernel.
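This experiment is straightforward to reproduce. The following Python sketch measures the relative density error at the central particle of a 2D hexagonal lattice; the cubic spline kernel, lattice size and η values are illustrative choices, not the exact setup of the paper:

```python
import numpy as np

def w_cubic_2d(q, h):
    # M4 cubic spline in 2D: normalization 10/(7*pi*h^2), support r = 2h
    w = np.where(q < 1.0, 1.0 - 1.5*q**2 + 0.75*q**3,
        np.where(q < 2.0, 0.25*(2.0 - q)**3, 0.0))
    return 10.0 / (7.0 * np.pi * h**2) * w

def hex_lattice(n, a=1.0):
    # close-packed (hexagonal) lattice with nearest-neighbor spacing a
    j, i = np.mgrid[0:n, 0:n]
    x = (i + 0.5 * (j % 2)) * a
    y = j * a * np.sqrt(3.0) / 2.0
    return x.ravel(), y.ravel()

def density_error(eta, nu=1.0, a=1.0, n=41):
    # relative error of the SPH density sum at the central lattice particle
    x, y = hex_lattice(n, a)
    b = np.argmin((x - x.mean())**2 + (y - y.mean())**2)
    h = eta * a
    q = np.hypot(x - x[b], y - y[b]) / h
    rho_sph = np.sum(nu * w_cubic_2d(q, h))
    rho_th = nu / (np.sqrt(3.0) / 2.0 * a**2)  # nu / (area per particle)
    return abs(rho_sph - rho_th) / rho_th
```

Even this minimal version reproduces the qualitative trend of Fig. 4: on a perfect lattice the cubic spline error shrinks as the smoothing length parameter grows.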
The “standard” cubic spline kernel M does not perform particularly well, and simply replacing it by, say, the M kernel already improves the accuracy of the density estimate by roughly two orders of magnitude. If larger smoothing lengths can be afforded, however, the density estimate can be improved substantially further. For example, the W kernel for achieves in this test approximately four orders of magnitude lower errors than what the M can achieve for (for larger values it becomes pairing unstable). The Wendland kernel achieves a continuous improvement of the density estimate with increasing and, as Dehnen & Aly (2012) argue, this protects the kernel against becoming pairing unstable. Although its results in this test are less accurate than those of the higher-order W kernels, we give preference to the Wendland kernel as our standard choice, since it allows only for very little noise, see below.
Both peaked kernels perform very poorly in this test, even for a very large support. It is particularly interesting to observe how the subtle change of the core of the M kernel (compare dashed black and the blue curve in Fig. 3) seriously compromises the density estimation ability (compare the dashed black curve with the blue triangles in Fig. 4).

M 0.667 1.500 28 1.2
M 0.506 1.976 49 1.6
0.661 1.512 28 1.2
0.567 1.765 39 1.5
0.504 1.984 49 1.6
0.458 2.183 59 1.7
0.423 2.364 70 1.8
0.395 2.531 80 1.9
0.372 2.690 90 2.2
0.430 2.323 n.a. 2.2
LIQ n.a. n.a. n.a. 2.2
QCM 0.506 1.976 49 2.2
Table 3: Values where the kernel derivative has the maximum, . is the neighbor number for a hexagonal lattice that corresponds to and is the value used in our experiments. The latter has been chosen so that it is as large as possible without becoming pairing unstable, or, in cases where no pairing occurs, as large as computational resources allowed.

Figure 5: Accuracy of gradient estimates (calculated via Eq. (6)) for different kernels. Particles are placed on a hexagonal lattice so that the density is uniform and the pressure linearly increasing. The parameter determines the smoothing length, ; LIQ: linear-quartic, QCM: quartic core M.

4.2.3 Gradient estimates

We repeat the experiment described in Sec. 3.6 with the “standard” SPH gradient, Eq. (6), for a number of different kernels. We set up particles on a 2D hexagonal lattice in , corresponding to a close-packed distribution of spheres with radius . All particles possess the same baryon numbers and are assigned pressures that rise linearly with the x-coordinate so that the slope is . In Fig. 5 we show the impact of the kernel choice on the gradient accuracy (again, the M is stretched to a support of 2h). The M kernel (dashed black) yields, roughly speaking, a two orders of magnitude improvement over the standard SPH kernel M (solid black). Provided one is willing to apply larger neighbor numbers, the accuracy can be further improved by using smoother kernels. Once more, the high-order members of the kernel family perform very well for large values. And, again, the Wendland kernel continuously improves the gradient estimate with increasing . None of the peaked kernels reaches an accuracy substantially below , not even for very large -values.
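A corresponding sketch for the gradient test: the plain volume-weighted gradient sum used below is an assumed stand-in for the paper's Eq. (6), and the cubic spline kernel and η value are again illustrative:

```python
import numpy as np

def hex_lattice(n, a=1.0):
    # close-packed (hexagonal) lattice with nearest-neighbor spacing a
    j, i = np.mgrid[0:n, 0:n]
    x = (i + 0.5 * (j % 2)) * a
    y = j * a * np.sqrt(3.0) / 2.0
    return x.ravel(), y.ravel()

def grad_w_cubic_2d(dx, dy, h):
    # gradient of the 2D M4 cubic spline with respect to its first argument
    r = np.hypot(dx, dy)
    q = r / h
    dwdq = np.where(q < 1.0, -3.0*q + 2.25*q**2,
            np.where(q < 2.0, -0.75*(2.0 - q)**2, 0.0))
    fac = 10.0 / (7.0 * np.pi * h**2) * dwdq / (h * np.where(r > 0, r, 1.0))
    return fac * dx, fac * dy

# pressure rising linearly in x with unit slope on a close-packed lattice
a = 1.0
x, y = hex_lattice(41, a)
b = np.argmin((x - x.mean())**2 + (y - y.mean())**2)
V = np.sqrt(3.0) / 2.0 * a**2          # effective area per particle
P = x                                   # slope dP/dx = 1
gx, gy = grad_w_cubic_2d(x[b] - x, y[b] - y, 1.2 * a)
grad_Px = np.sum(V * P * gx)            # SPH estimate of dP/dx at particle b
grad_err = abs(grad_Px - 1.0)
```

On the perfect lattice the constant part of the pressure cancels by symmetry, so the remaining error measures how well the discrete kernel-gradient quadrature reproduces a unit slope.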

5 Generalized volume elements

Here we discuss the volume elements that are needed for the kernel techniques described in Sec. 3. In the following discussion we use the baryon number and the computing frame baryon number density , but every relation straightforwardly translates into Newtonian SPH by simply replacing with the particle mass and with the mass density .
At contact discontinuities the pressure is continuous, but density and internal energy suffer a discontinuity. For a polytropic EOS, , where is the baryon number density in the local rest frame (as will be described in more detail below, we measure energies here in units of the baryon rest mass energy ; the Newtonian correspondence of the expression is, of course, ), the product of density and internal energy must be the same on both sides of the discontinuity to ensure a single value of at the discontinuity, i.e. . Here the subscripts label the two sides of the discontinuity. If this constraint is violated on a numerical level, say, because density and internal energy have a different smoothness across the discontinuity, spurious forces occur that act like a surface tension. This can occur in “standard” SPH formulations since the density is estimated by a kernel-weighted sum over neighboring particles and therefore is smooth, while the internal energy is a property assigned to each particle that enters “as is” (i.e. un-smoothed) into the SPH evolution equations. Such spurious surface tension forces can compromise the ability to accurately handle subtle fluid instabilities, see for example Agertz et al. (2007); Springel (2010b); Read et al. (2010). One may, however, question whether an unresolvably sharp transition in is a viable initial condition in the first place. Note that Godunov-type SPH formulations (Inutsuka, 2002; Cha & Whitworth, 2003; Cha et al., 2010; Murante et al., 2011; Puri & Ramachandran, 2014) do not seem to suffer from such surface tension problems.
The problem can be alleviated if the internal energy is also smoothed, for example by applying some artificial thermal conductivity. This approach has been shown to work well in the case of Kelvin-Helmholtz instabilities (Price, 2008; Valdarnini, 2012). It is, however, a non-trivial problem to design appropriate triggers that supply conductivity in the right amounts exclusively where needed, but not elsewhere. Artificial conductivity applied where it is undesired can have catastrophic consequences, say, by spuriously removing pressure gradients that are needed to maintain a hydrostatic equilibrium.
An alternative and probably more robust cure comes from using different volume elements in the SPH discretization process. Saitoh & Makino (2013) pointed out that SPH formulations that do not include density explicitly in the equations of motion avoid the pressure becoming multi-valued at contact discontinuities. Since the density usually enters the equation of motion via the choice of the volume element (or , respectively, in the Newtonian case), a different choice can possibly avoid the problem altogether. This observation is consistent with the findings of Heß & Springel (2010) who used a Voronoi tessellation to calculate particle volumes. In their approach no spurious surface tension effects have been observed. Closer to the original SPH spirit is the class of kernel-based particle volume estimates that have recently been suggested by Hopkins (2013) as a generalization of the Saitoh & Makino (2013) approach. In the following we will make use of these ideas for our relativistic SPH formulations.
We explore different ways to calculate kernel-based particle volume estimates from which the densities follow as

$$N_b = \frac{\nu_b}{V_b}.$$
An obvious possibility for a volume element is the inverse of the local SPH-particle number density (calculated in the computing frame), estimated by a kernel sum

$$V_b = \frac{1}{\sum_k W(|\vec{r}_b - \vec{r}_k|, h_b)}.$$
While this is a natural choice, one is in principle free to generalize this estimate by weighting each kernel with an additional quantity $X$,

$$V_b = \frac{X_b}{\sum_k X_k\, W(|\vec{r}_b - \vec{r}_k|, h_b)}.$$
There is a lot of freedom in the choice of the weight and we will here only explore a small set of (Lorentz invariant) weights and assess their suitability in numerical experiments. If a constant weight is chosen, one obviously recovers the volume element of Eq. (37) and the baryon number density is simply given by the number density estimate weighted with the particle’s own baryon number

$$N_b = \nu_b \sum_k W(|\vec{r}_b - \vec{r}_k|, h_b).$$
Since only the baryon number of the particle itself enters, this form in principle allows for sharp density transitions (say, via a uniform particle distribution with discontinuous -behavior). As confirmed by the experiments in Sec. 7.3, this removes spurious surface tension effects.
If instead the particle’s baryon number is chosen as the weight, one recovers the standard SPH density estimate

$$N_b = \sum_k \nu_k\, W(|\vec{r}_b - \vec{r}_k|, h_b).$$
Another choice is a power of the pressure, $X_b = P_b^{k}$, which yields

$$V_b = \frac{P_b^{k}}{\sum_c P_c^{k}\, W(|\vec{r}_b - \vec{r}_c|, h_b)}.$$
One may wonder whether the pressure in the denominator may not give an inappropriately large weight in case of substantial pressure differences between neighboring particles, say in a shock. Indeed, for the relativistic Sod-type shock, Sec. 7.6.1, with a pressure ratio of order , and the choice we have observed a small density “precursor” spike within one smoothing length of the shock. To avoid such artifacts, we choose a small value, , for which no anomalies have been observed. Obviously, for one recovers the previous case of . If the pressure is used as a weight in the volume/density estimate an iteration is required for self-consistent values of and . This is explained in detail in Sec. 6.1.3.
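To make the role of the weight concrete, here is a minimal 1D Python sketch comparing a constant weight with the baryon-number weight; the cubic spline kernel, unit lattice spacing and the factor-2 jump in baryon number are illustrative assumptions, not the setup of the paper:

```python
import numpy as np

def w1d(q, h):
    # M4 cubic spline in 1D: normalization 2/(3h), support 2h
    w = np.where(q < 1.0, 1.0 - 1.5*q**2 + 0.75*q**3,
        np.where(q < 2.0, 0.25*(2.0 - q)**3, 0.0))
    return 2.0 / (3.0 * h) * w

def volumes(x, X, h):
    # generalized particle volume: V_b = X_b / sum_k X_k W(|x_b - x_k|, h)
    q = np.abs(x[:, None] - x[None, :]) / h
    return X / (w1d(q, h) @ X)

# uniform 1D lattice with a factor-2 jump in baryon number nu at x = 0
x  = np.arange(-20.0, 20.0)
nu = np.where(x < 0, 2.0, 1.0)

N_std   = nu / volumes(x, nu, 1.5)               # nu-weight: smooth density
N_sharp = nu / volumes(x, np.ones_like(x), 1.5)  # constant weight: sharp jump
```

With the baryon-number weight the density estimate smears the jump over a few smoothing lengths, while the constant weight leaves the factor-2 contrast intact, since only the particle's own baryon number enters.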
To illustrate the impact that the choice of the volume element has on the resulting pressure across a contact discontinuity, we perform the following experiment. We place particles on a uniform hexagonal lattice, assign baryon numbers so that the densities for have value and for , and assign internal energies so as to reproduce a constant pressure on both sides, . Here labels the side and is the polytropic exponent. Once set up, we measure the densities according to Eq. (36) and from them calculate the pressure distribution across the discontinuity, once for each choice of .

Figure 6: Different ways to set up a contact discontinuity. In the upper panel, particles are placed on a uniform hexagonal lattice (“close packed”, CP) and masses/baryon numbers and internal energies are assigned so as to reproduce the constant states, with a sharp transition at . Since the subsequent density calculation (with ) produces a smooth density transition, the mismatch between a sharp internal energy and a smooth density leads to a “pressure blip” (orange oval) that gives rise to spurious surface tension forces at the interface. If for the same setup is used, both and have a sharp transition and the pressure blip disappears, see the middle panel. The lower panel shows an alternative setup with equal masses/baryon numbers where the density structure is encoded in the particle distribution (here two uniform hexagonal lattices). The corresponding density calculation (with ) also produces a smooth density transition, but the internal energy is consistently (and therefore smoothly) assigned from the calculated density and the condition of uniform pressure.

The standard SPH volume element corresponding to weight factor produces a smooth density transition (see the upper panel of Fig. 6), which, together with the sharp change in , causes the “pressure blip” (orange oval) that is also frequently seen in SPH shock tests and that is the reason behind the spurious surface tension forces discussed above. The same experiment, repeated with , does not show any noticeable deviation from the desired pressure value of unity. The phenomenon of spurious surface tension is further explored in numerical experiments in Sec. 7.3.
As an alternative numerical model for a contact discontinuity, we show in the last panel of Fig. 6 a setup for the case of constant particle masses/baryon numbers. In this case, the information about the density is encoded in the particle distribution, for which we use here two uniform hexagonal lattices. Also here () the density transition is smooth, but the internal energy is calculated (via an iteration) from the condition of constant pressure, so that the smoothness of and are consistent with each other.
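The pressure blip itself can be demonstrated in a few lines. The following 1D sketch uses assumed illustration values (Γ = 1.4, unit target pressure, factor-2 density jump, cubic spline kernel); it is a toy version of the experiment above, not the paper's 2D setup:

```python
import numpy as np

def w1d(q, h):
    # M4 cubic spline in 1D: normalization 2/(3h), support 2h
    w = np.where(q < 1.0, 1.0 - 1.5*q**2 + 0.75*q**3,
        np.where(q < 2.0, 0.25*(2.0 - q)**3, 0.0))
    return 2.0 / (3.0 * h) * w

# contact discontinuity: target density 2 for x < 0 and 1 for x >= 0,
# encoded in the baryon numbers on a uniform unit lattice; the internal
# energy is set for a constant pressure P0 across the interface
gamma, P0, h = 1.4, 1.0, 1.5
x  = np.arange(-20.0, 20.0)
nu = np.where(x < 0, 2.0, 1.0)
u  = P0 / ((gamma - 1.0) * nu)        # sharp internal-energy profile
W  = w1d(np.abs(x[:, None] - x[None, :]) / h, h)

N_std   = W @ nu                       # standard density sum: smooth transition
N_sharp = nu * (W @ np.ones_like(nu))  # constant-weight volume: sharp transition
P_std   = (gamma - 1.0) * N_std * u    # shows the "blip" near the interface
P_sharp = (gamma - 1.0) * N_sharp * u  # stays (nearly) constant
```

The smooth density paired with the sharp internal energy produces a pressure deviation of order ten per cent near the interface, whereas the sharp density keeps the pressure uniform to well below one per cent away from the domain boundaries.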

6 Special-relativistic SPH

We will now apply the techniques discussed in Secs. 3-5 to the case of ideal, special-relativistic fluid dynamics. Excellent descriptions of various aspects of the topic (mostly geared towards Eulerian methods) can be found in a number of recent textbooks (Alcubierre, 2008; Baumgarte & Shapiro, 2010; Goedbloed et al., 2010; Rezzolla & Zanotti, 2013). In a first step, see Sec. 6.1, we generalize the derivation from the Lagrangian of an ideal relativistic fluid to the case of the generalized volume elements as introduced in Eq. (38). This leads to a generalization of the kernel gradient based equations given in Rosswog (2010). In a second formulation, see Sec. 6.2, we use integral-based gradient estimates, see Eq. (25), again for the case of a general volume element.

6.1 Special-relativistic SPH with kernel derivatives

We assume a flat space-time metric, , with signature (−,+,+,+) and use units in which the speed of light is equal to unity, . We reserve Greek letters for space-time indices running from 0 to 3, with 0 being the temporal component; Latin indices refer to spatial components, and SPH particles are labeled by and .
Written in Einstein summation convention, the Lagrangian of a special-relativistic perfect fluid can be written as (Fock, 1964)


where the energy-momentum tensor of an ideal fluid reads

$$T^{\mu\nu} = (e + P)\, U^{\mu} U^{\nu} + P\, \eta^{\mu\nu},$$

with $e$ the energy density, $P$ the pressure and $U^{\mu}$ the fluid four-velocity.
We can write the energy density as a sum of a rest mass and an internal energy contribution


where, for now, the speed of light is shown explicitly. The baryon number density is measured in the local fluid rest frame and the average baryon mass is denoted by . With the conventions that all energies are written in units of and , we can use the normalization of the four-velocity, , to simplify the Lagrangian to


The number density as measured in the “computing frame” (CF), see Eq. (36), is, due to length contraction, increased by a Lorentz factor with respect to the number density in the local fluid rest frame:

$$N = \gamma\, n.$$
Therefore, the Lagrangian can be written as




where the (fixed) baryon number carried by particle , , has been introduced. To obtain the equations of motion from the Euler-Lagrange equations, we need and for which we use (see Eq. 38)


with the “grad-h” terms


The derivatives of the CF number densities then become




6.1.1 The general momentum equation

From straightforward differentiation of Eq. (48) using Eq. (46), the first law of thermodynamics, , and one finds the canonical momentum (for the explicit steps see Rosswog (2009))


and the evolution equation for the canonical momentum per baryon, , follows directly from the Euler-Lagrange equations (Eq. (155) in Rosswog (2009))


For the choice this reduces to the momentum equation given in Rosswog (2010).

6.1.2 The general energy equation

The energy derived from the Lagrangian is


where the canonical energy per baryon is




where we have used the specific relativistic enthalpy

$$\mathcal{E} = 1 + u + \frac{P}{n}.$$
The subsequent derivation is identical to the one in Rosswog (2009) up to their Eq. (165),


which, upon using Eqs. (53) and (55), yields the special-relativistic energy equation


Again, for this reduces to the energy equation given in Rosswog (2010).
The set of equations needs to be closed by an equation of state. In all of the tests presented below, we use a polytropic equation of state,

$$P = (\Gamma - 1)\, n\, u,$$
where $\Gamma$ is the polytropic exponent (keep in mind our convention of measuring energies in units of the baryon rest mass energy). The corresponding sound speed, expressed in terms of the specific relativistic enthalpy, is

$$c_s = \sqrt{\frac{(\Gamma - 1)(\mathcal{E} - 1)}{\mathcal{E}}}.$$
The choices of the variables , and are suggested by the Lagrangian derivation and they avoid problems that have plagued earlier relativistic SPH formulations. For a comparison with Eulerian approaches we refer to the literature, e.g. to Marti & Müller (2003) or Keppens et al. (2012).
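In code, this closure amounts to a few lines. The sketch below follows the conventions used here (energies per baryon in units of the baryon rest mass energy, c = 1) and writes the sound speed in its standard Γ-law form via the specific relativistic enthalpy; the function name is of course ours:

```python
import numpy as np

def polytropic_eos(n, u, gamma):
    """Gamma-law closure: pressure, specific relativistic enthalpy, sound speed.

    n: baryon number density in the local rest frame; u: specific internal
    energy per baryon in units of the baryon rest mass energy; c = 1.
    """
    P = (gamma - 1.0) * n * u
    enth = 1.0 + u + P / n                            # specific enthalpy
    cs = np.sqrt((gamma - 1.0) * (enth - 1.0) / enth)  # always < 1 (= c)
    return P, enth, cs
```

In the non-relativistic limit (u ≪ 1) the sound speed reduces to the familiar Newtonian value, c_s² ≈ Γ(Γ−1)u.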

6.1.3 Consistent values for smoothing lengths, densities and weights

The smoothing lengths, the CF density and possibly the weight depend on each other. If the weight depends on the density, say for the case , we first perform a few steps to find an accurate value of the new pressure: a) calculate new volumes according to Eq. (38) using the smoothing lengths and weights from the previous time step, b) from the resulting new value of the density update the pressure according to Eq. (62), c) update the smoothing length according to Eq. (26), d) once more update the volume, and e) update the pressure again. This pressure value is finally used to perform an iteration between the volume, Eq. (38), and the smoothing length, Eq. (26). Due to the previous steps the guess values at this stage are already very accurate, so that on average only one iteration is needed to meet our convergence criterion .
While this procedure requires a number of iterations, we find that it is worth the effort, since a careless update of the smoothing length can produce a fair amount of noise that can compromise the quality of a simulation. An inaccurate update of the smoothing lengths may be a largely overlooked source of noise in many SPH codes.
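The update cycle can be condensed into a fixed-point iteration. The following 1D, Newtonian-style Python sketch (cubic spline kernel, h = η ν/ρ as the 1D analog of Eq. (26), all numerical values illustrative) shows the basic structure for the simple case of a density-independent weight:

```python
import numpy as np

def w1d(q, h):
    # M4 cubic spline in 1D; h may be an (n,1) array for per-particle values
    w = np.where(q < 1.0, 1.0 - 1.5*q**2 + 0.75*q**3,
        np.where(q < 2.0, 0.25*(2.0 - q)**3, 0.0))
    return 2.0 / (3.0 * h) * w

def relax_h(x, nu, eta, h0, tol=1e-4, itmax=100):
    # iterate between density and smoothing length until self-consistent:
    # rho_b = sum_k nu_k W(|x_b - x_k|, h_b),  h_b = eta * (nu_b / rho_b)
    h = np.full(x.shape, h0)
    for _ in range(itmax):
        q = np.abs(x[:, None] - x[None, :]) / h[:, None]
        rho = w1d(q, h[:, None]) @ nu   # "gather" density with per-particle h
        h_new = eta * nu / rho          # 1D: particle volume is nu/rho
        if np.max(np.abs(h_new - h) / h) < tol:
            return h_new, rho
        h = h_new
    return h, rho
```

On a uniform unit-spacing lattice with ν = 1 this converges in a handful of iterations to h ≈ η away from the boundaries; a good initial guess (as provided by the preparatory steps described above) keeps the iteration count per time step close to one.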

6.2 Special-relativistic SPH based on integral approximations to derivatives

As an alternative, we suggest a relativistic SPH formulation that is based on the integral approximation of gradients given in Eq. (24). This generalizes the Newtonian formulation of Garcia-Senz et al. (2012). If we use the formal replacement, Eq. (25), we can write alternative, integral-based relativistic SPH equations as