An adaptively refined phase-space element method for cosmological simulations and collisionless dynamics
Abstract
N-body simulations are essential for understanding the formation and evolution of structure in the Universe. However, the discrete nature of these simulations affects their accuracy when modelling collisionless systems. We introduce a new approach to simulate the gravitational evolution of cold collisionless fluids by solving the Vlasov–Poisson equations in terms of adaptively refinable “Lagrangian phase-space elements”. These geometrical elements are piecewise smooth maps between Lagrangian space and Eulerian phase space and approximate the continuum structure of the distribution function. They allow for dynamical adaptive splitting to accurately follow the evolution even in regions of very strong mixing.
We discuss in detail various one-, two- and three-dimensional test problems to demonstrate the performance of our method. Its advantages compared to N-body algorithms are: i) explicit tracking of the fine-grained distribution function, ii) natural representation of caustics, iii) intrinsically smooth gravitational potential fields, and thus iv) elimination of the need for any type of ad hoc force softening.
We show the potential of our method by simulating structure formation in a warm dark matter scenario. We discuss how spurious collisionality and large-scale discreteness noise of N-body methods are both strongly suppressed, which eliminates the artificial fragmentation of filaments.
Therefore, we argue that our new approach improves on the N-body method when simulating self-gravitating cold and collisionless fluids, and is the first method that allows one to explicitly follow the fine-grained evolution in six-dimensional phase space.
keywords:
cosmology: dark matter – cosmology: large-scale structure of the Universe – cosmology: theory – galaxies: kinematics and dynamics – methods: numerical
1 Introduction
Numerical simulations lie at the very heart of contemporary cosmology. They are the only method that can accurately follow the growth of small primordial density fluctuations into the highly nonlinear objects that populate the low-redshift Universe (e.g. Davis et al., 1985; Efstathiou et al., 1985; Bertschinger, 1998; Springel et al., 2005; Angulo et al., 2012). As such, they have proven an indispensable tool in the formulation of our theory of cosmological structure formation and in the validation of the ΛCDM model.
Since most of the mass in the Universe appears to be in the form of dark matter (DM; a fundamental particle with a negligible non-gravitational interaction cross-section with both itself and baryonic matter), numerical simulations that only follow gravitational forces were the natural first tool employed by pioneering numerical cosmologists. Since the 1970s, these simulations have progressively increased in scope and accuracy, and nowadays span a huge dynamic range. State-of-the-art simulations employ trillions of particles to describe volumes comparable to the observable Universe, while resolving the collapsed DM structures that could host the faintest galaxies (see e.g. Heitmann et al., 2015; Skillman et al., 2014; Ishiyama et al., 2015, for recent examples).
A milestone in the history of gravity-only simulations was the establishment of a universal form for the density profile of collapsed dark matter haloes (Navarro et al., 1996, 1997). Other important results were the accurate characterisation of the hierarchy of nonlinear correlation functions and of the cosmic velocity field, precise estimates of nonlinear effects on the BAO peak, and the universal form for the abundance of dark matter haloes, which secure their use for cosmological parameter estimation (see the reviews of Kuhlen et al., 2012; Frenk & White, 2012, and references therein).
The central role of simulations in cosmology is in part due to the apparent robustness of the methods employed. Their results seem to hold even when extremely different time, mass and force resolutions are used. For instance, regarding the functional form and universality of halo density profiles, state-of-the-art simulations show remarkable agreement with those carried out years ago, despite employing orders of magnitude more particles and considerably better force resolution (e.g. Diemand et al., 2008; Springel et al., 2008; Gao et al., 2012). It is worth noting that, despite some recent hints (Pontzen & Governato, 2013; Ludlow et al., 2013, 2014), the fundamental origin of the halo density profile is still unknown, and even a purely numerical one has been suggested (e.g. Baushev, 2015).
1.1 The Collisionless Boltzmann equation
Dark matter (DM) in the Universe can be considered a collisionless fluid. Even in the case of a (heavy) 100 GeV DM particle, the mean comoving number density of particles is enormous. This implies that dark matter, just like baryons, can be treated as a fluid on cosmological and astrophysical scales, where the microphysical particle character is irrelevant. Additionally, DM appears to have an extremely weak interaction cross-section with itself and with ordinary matter. Therefore, the DM fluid can be regarded as collisionless. These properties hold for most particle dark matter candidates, even for warm ones such as e.g. sterile neutrinos.
A self-gravitating collisionless fluid is fully described by the evolution of its six-dimensional phase-space density $f(\mathbf{x}, \mathbf{p}, t)$, which is governed by the collisionless Boltzmann equation (CBE):
(1) \[ \frac{\partial f}{\partial t} + \frac{\mathbf{p}}{m a^2} \cdot \nabla_{\mathbf{x}} f - m\, \nabla_{\mathbf{x}} \phi \cdot \nabla_{\mathbf{p}} f = 0, \]
supplemented with Poisson’s equation for the Newtonian gravitational potential
(2) \[ \nabla_{\mathbf{x}}^2 \phi = \frac{4\pi G m}{a} \left( \int f \, \mathrm{d}^3 p - \bar{n} \right), \]
where $m$ is the (microphysical) particle mass, $\bar{n}$ is the mean number density of dark matter particles in the Universe, $G$ is the gravitational constant and $a$ is the cosmological scale factor, which itself obeys the first Friedmann equation. The system of the CBE with a self-coupling described by Poisson’s equation is commonly termed the Vlasov–Poisson system of equations. While we focus here on modelling the DM dynamics, we note that Vlasov–Poisson describes all collisionless fluids interacting through a potential, and thus the discussion presented in this paper also applies to those cases.
One can consider two limits for the initial distribution function, $f_0$. In the hot limit, the problem is manifestly six-dimensional, where the thermal width gives a natural scale in velocity space that one can aim to resolve. Quite in contrast, in the cold limit, i.e. when the thermal particle velocity dispersion is much smaller than the bulk velocities arising from gravitational instability (which is an excellent approximation for both cold and warm DM candidates^{1}; ^{1}In fact, for a warm dark matter particle, the RMS thermal velocity is already very small by the time structure formation starts to become nonlinear (Bode et al., 2001). This velocity decays further and is orders of magnitude smaller than the velocities arising from collapse. The signature of the thermal velocity is the truncation of the power spectrum, which represents the largest Jeans mass before the particles start to cool. At later times, the particles are never substantially heated again since they remain locally (on their stream) roughly at the cosmic mean density even inside collapsed structures (cf. Vogelsberger & White, 2011). To very good approximation, it thus suffices to consider all (non-relativistic) dark matter particles as a cold fluid.), the problem is only three-dimensional: the thermal width is zero and thus the fluid occupies only a three-dimensional submanifold of phase space (cf. e.g. Zel'dovich, 1970; Arnold et al., 1982). This means that the distribution function is a smooth (i.e. differentiable) three-dimensional hypersurface without holes in six-dimensional phase space. The Vlasov–Poisson equation guarantees that it remains smooth and that no holes emerge in phase space during its evolution. In particular, one can parameterise this submanifold in terms of Lagrangian coordinates $\mathbf{q}$ (so that it is customary to call it the “Lagrangian manifold”) and express the mapping between Lagrangian and Eulerian phase space as
(3) \[ \mathbf{q} \mapsto \left( \mathbf{x}(\mathbf{q}, t),\; \mathbf{p}(\mathbf{q}, t) \right), \]
whose evolution describes the full evolution of the fluid in both the monokinetic (single-stream) and the multi-stream regime. Despite the cold fine-grained distribution function, this three-dimensional hypersurface can become “dense” in phase space through gravitational evolution and mixing, which means that it will cover more and more of phase space over time and approach a state close to a macroscopically “hot” system. Once phase space is densely covered (meaning that the distance in phase space between different regions of the sheet becomes small compared to the extent of the system), one can expect that the system does not change any more on time scales on which particles move, which means that it will approach ergodicity in those regions.
Unfortunately, the solution of the set of equations in the hot limit without further approximations is extremely expensive computationally, and an explicit 6D solution has been obtained only for a handful of simplified cases (e.g. Yoshikawa et al., 2013, using a grid). The situation is in some sense even worse in the cold limit, since both the spatial and velocity resolution achievable with such a coarse-grained approach are completely insufficient to resolve the dynamics of a cold system without being dominated by diffusion in phase space. Additionally, because of the collisionless nature of the DM fluid, Boltzmann's H-theorem does not apply (in the sense that the system is not expected to approach a maximum entropy state described by a single “temperature” on short-enough time scales), and hence the CBE cannot be replaced by a low-order moment expansion in terms of fluid equations with a simple closure condition such as an effective pressure or effective viscosity (although such approaches are applied on mildly nonlinear scales, e.g. Carrasco et al., 2012; Hertzberg, 2014, the effective fluid properties have not been determined from first principles, and are set by virialisation on smaller scales rather than by conventional thermalisation).
To make progress, it has thus been customary to postpone the goal of following the exact evolution of the fine-grained distribution function and instead to solve the Vlasov–Poisson system using a Monte Carlo approach. The idea is to sample the distribution function at $N$ discrete locations $(\mathbf{x}_i, \mathbf{p}_i)$ with massive N-bodies (in what can be thought of as a “coarse-graining” of the initial conditions over macroscopic volume elements $V_i$, each containing a mass $m_i$ of many microscopic particles):
(4) \[ f(\mathbf{x}, \mathbf{p}, t) \approx \sum_{i=1}^{N} m_i\, \delta_{\rm D}\!\left(\mathbf{x} - \mathbf{x}_i(t)\right)\, \delta_{\rm D}\!\left(\mathbf{p} - \mathbf{p}_i(t)\right), \]
for which the Vlasov–Poisson system then reduces to the equations of motion for a Hamiltonian N-body system, i.e. the phase-space density is trivially conserved along the characteristics $(\mathbf{x}_i(t), \mathbf{p}_i(t))$, defined by:
(5) \[ \dot{\mathbf{x}}_i = \frac{\mathbf{p}_i}{m_i a^2}, \qquad \dot{\mathbf{p}}_i = -m_i\, \nabla_{\mathbf{x}} \phi(\mathbf{x}_i). \]
The equations of motion are thus easy to solve if the gravitational potential is known.
Inserting the distribution function (4) into Poisson's equation (2), and convolving with the Green's function of the Laplacian, the potential is given by the Newtonian potential of $N$ point masses. The initial coarse-graining however implied that the mass is not concentrated in one point, but spread out over the volume $V_i$, and so the unbounded two-body force is certainly a bad approximation to the continuous collisionless system. The evolution of the volume $V_i$ is non-trivial, and so an approximation is made by replacing the point-mass potential e.g. with a Plummer-softened potential
(6) \[ \phi_i(\mathbf{x}) = -\frac{G m_i}{\sqrt{\left\| \mathbf{x} - \mathbf{x}_i \right\|^2 + \epsilon^2}}, \]
where $\epsilon$ is the softening length (which may evolve over time). In principle, $\epsilon$ is initially of the order of the mean particle separation and becomes smaller in regions of collapse^{2}; ^{2}In fact, $\epsilon$ represents the phase-space volume associated with a particle and will remain of the order of the mean particle separation before shell-crossing. After shell-crossing, in regions of multi-streaming, it is still of the order of the mean separation, but the mean separation of particles on the same stream, not the mean separation after projection to configuration space as is assumed when adaptive softening is used (either through adaptive softening lengths or in AMR). It is thus hopeless to estimate $\epsilon$ more accurately using only configuration-space properties. Since its evolution is not followed, it is usually treated as a somewhat arbitrary parameter that softens the force field on a small fraction of the mean particle separation scale, seeking to suppress the discrete nature of the system while allowing for the highest force “resolution” possible. The validity of this approach is then typically inferred from convergence studies in which $N$ and $\epsilon$ are varied. In fact, the method we present in this paper can be viewed as following the exact evolution of the volume elements over time.
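As a concrete illustration of the Plummer softening in eq. (6), the following minimal Python sketch evaluates the softened potential of a set of point masses (the function name and the $G = 1$ unit convention are our own illustrative assumptions, not part of any production code):

```python
import numpy as np

def plummer_potential(x, x_i, m_i, eps, G=1.0):
    """Plummer-softened potential at position x due to point masses m_i at x_i.

    Sketch only: G = 1 units are an assumption for illustration.
    """
    # squared separations |x - x_i|^2 for every source particle
    r2 = np.sum((np.asarray(x) - np.asarray(x_i)) ** 2, axis=-1)
    # softened 1/r kernel: regular everywhere, Newtonian for r >> eps
    return -G * np.sum(np.asarray(m_i) / np.sqrt(r2 + eps ** 2))
```

For separations much larger than $\epsilon$ this reduces to the Newtonian point-mass potential, while at $\mathbf{x} = \mathbf{x}_i$ it remains finite at $-G m_i / \epsilon$, which is precisely the regularisation of the unbounded two-body force discussed above.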
The evolution of the N-body system is thus assumed to be analogous to the evolution of a macroscopic group of microphysical dark matter particles, and therefore it is assumed to be an actual solution of the CBE. This is clearly true only when $N$ goes to infinity and the force softening goes to zero. For a finite number of particles, there are collision terms that affect the details of the dynamics, such as chaotic and phase mixing, Landau damping and two-body interactions, among others. Furthermore, it is not possible to guarantee an upper bound on the error introduced. This can lead to solutions dominated by numerical artefacts.
Over the years, many authors have pointed out cosmological setups in which the discrete and collisional nature of the N-body solution can be appreciated (e.g. Centrella & Melott, 1983; Centrella et al., 1988; Melott & Shandarin, 1989; Diemand et al., 2004; Melott et al., 1997; Splinter et al., 1998; Wang et al., 2007; Melott, 2007; Marcos, 2008). There are two situations worth mentioning. The first is simulations that feature a small-scale cutoff in their primordial density power spectrum, in which the N-body method produces a large number of spurious clumps that resemble dark matter haloes (usually known as “artificial filament fragmentation”). Another example is when the gravitational interaction of two fluids that start with different power spectra is followed, e.g. baryons and dark matter in the early Universe. In this case, two-body interactions make the fluids collisional, resulting in an incorrect evolution of their relative spatial distribution (Yoshida et al., 2003; Angulo et al., 2013). In all these cases, problems are most severe if $\epsilon$ is smaller than the mean particle separation (which is, however, clearly the case in all production-run N-body simulations, where $\epsilon$ is typically a small fraction of the mean particle separation).
It is not yet clear how the problems described above affect standard single-fluid ΛCDM simulations. However, there is a reasonable amount of concern about the correctness and accuracy of N-body simulations. The importance that numerical simulations have for modern cosmology certainly warrants a level of scepticism and motivates the search for alternative methods that do not suffer from the problems of the N-body method.
1.2 A phase-space element method
In order to overcome the limitations of the traditional N-body approach, various improvements and alternative schemes have been proposed over the years.
Most N-body simulations employ a softening scale that is either constant or varies only in time. Several attempts have been made using spatially varying adaptive softening (e.g. Bagla & Khandai, 2009; Iannuzzi & Dolag, 2011), which suppresses two-body interactions more efficiently. Adaptive softening is, however, performed isotropically, which implies that it is not expected to dramatically improve the situation in regions of anisotropic collapse (which in cosmological simulations is basically everywhere except inside haloes).
Among the methodological alternatives to the N-body method are e.g. a reformulation of Vlasov–Poisson as a Schrödinger equation (e.g. Widrow & Kaiser, 1993; Schaller et al., 2014; Uhlemann et al., 2014, for classical and more recent examples), the 1D waterbag method (e.g. Colombi & Touma, 2014; Colombi, 2015), as well as the Lagrangian tessellation method of Hahn et al. (2013), hereafter HAK13.
The method proposed in HAK13 (referred to as “TetPM” by the authors) attempts to retain the continuous nature of the distribution function by performing a Delaunay tessellation of the N-body particles in Lagrangian space (based on the Lagrangian tessellation idea of Abel et al. 2012 and Shandarin et al. 2012). HAK13 demonstrated that by assigning the mass to tetrahedral volume elements, whose vertices are free-falling particles, anisotropic deformation can be followed much more accurately, and the obvious discreteness effects of the N-body method, such as two-body collisions and spurious fragmentation, can be overcome. However, a serious shortcoming of the method was also noted by HAK13: approximating the manifold by tetrahedra is accurate in the mildly nonlinear regime, but not in strongly nonlinear stages. In this regime, the volume of the phase-space sheet grows faster than what can be followed by the motion of the tetrahedra. This leads to a violation of the Hamiltonian structure of the equations, and thus implies a lack of energy conservation. This can manifest itself in an overestimation of the density in the centre of haloes (HAK13), and it also affects the location of material tidally disrupted from infalling dark matter substructure (Angulo et al., 2014).
In this article, we present a general class of “Lagrangian phase-space element methods” to solve the Vlasov–Poisson system with cold initial conditions. We will show that the approach of HAK13 is naturally included as the lowest order method that provides a continuous approximation to the distribution function. When higher order elements are employed, the evolution of phase-space elements can be captured much more accurately, and caustics can be explicitly represented. Furthermore, we will demonstrate that an adaptive refinement of phase-space elements allows one to fully track their distortion, even during late, highly nonlinear stages. This guarantees energy conservation and direct control of errors. Therefore, we will argue that our method offers an interesting alternative for solving the CBE while fully resolving the evolution of the full phase-space distribution function. Having the full fine-grained distribution function at hand offers exciting possibilities to study the behaviour of self-gravitating collisionless systems in much more detail than can be extracted from N-body data.
Our paper is structured as follows. In Section 2, we introduce the phase-space element method, discuss how we implemented the projection to configuration space for the mass deposit, discuss discretisation errors, and how they can be limited by the adaptive dynamic refinement technique that we also present. After having introduced all methodology, in Sections 3 and 4 we validate the performance of our method in various one-, two-, and three-dimensional test problems, first without and then with self-gravity. Finally, we apply our method to a cosmological simulation with a cutoff in the initial spectrum in Section 5, and demonstrate the advantages over previous approaches. We discuss our results and conclude in Section 6.
2 Method
In what follows, we will present the theoretical foundations, as well as a practical implementation, of a new general class of numerical methods to solve the CBE, referred to as “Lagrangian phase-space elements”.
We start in §2.1 by introducing the fundamental unit of our method: piecewise maps between Lagrangian space and Eulerian phase space. Then we discuss how this gives rise to piecewise density fields that are defined everywhere in space and can be made arbitrarily regular, but at the same time can contain caustics of various types. In §2.2 we focus on the errors associated with our scheme and how they propagate over time. In §2.3 we exploit this and show how the Lagrangian phase-space elements can be adaptively split to track the full phase-space evolution of a fluid to a given accuracy. In §2.4 we discuss how our method allows an accurate and efficient calculation of the evolution of a self-gravitating fluid. In the final section, §2.5, we further discuss the implementation of these ideas and parallelisation strategies for efficient execution on distributed-memory computers.
2.1 Lagrangian phase-space elements
2.1.1 The Lagrangian manifold
Let us begin by considering in more detail the evolution of the neighbourhood of a point with Lagrangian coordinate $\mathbf{q}$, which maps to Eulerian phase space as $(\mathbf{x}(\mathbf{q},t), \mathbf{p}(\mathbf{q},t))$ at some time $t$. An illustration of such a mapping in two dimensions is given in Fig. 1. Shown is a piece of Lagrangian space, a square in this case, on which nine points are put down on a regular lattice. In Lagrangian space the density is uniform. This square is now mapped into four-dimensional phase space (where it is still a two-dimensional surface) and then projected back into two-dimensional configuration space. In the case shown, the map was given by some biquadratic polynomial. In general, the projection can lead to several points in Lagrangian space mapping to the same point in Eulerian configuration space^{3}; ^{3}The map to configuration space is thus in general surjective but not injective, while the map to phase space is bijective. The projection becomes singular – corresponding to caustic subsets – where derivatives of the map to configuration space vanish (i.e. where $\partial \mathbf{x}/\partial \mathbf{q}$ has at least one vanishing eigenvalue). The singularities arising from such projections have been studied in catastrophe theory in great detail (e.g. Arnold et al., 1982; Hidding et al., 2014, for caustics of the Zel'dovich map in two and three dimensions). In what follows, we describe a procedure to produce a covering of the Lagrangian domain with such local maps.
The differentiable nature of the cold distribution function guarantees that the space tangent to the sheet at $(\mathbf{x}(\mathbf{q}), \mathbf{p}(\mathbf{q}))$ in phase space is uniquely defined as the span of the columns of
(7) \[ \mathsf{T}(\mathbf{q}, t) = \begin{pmatrix} \partial \mathbf{x} / \partial \mathbf{q} \\ \partial \mathbf{p} / \partial \mathbf{q} \end{pmatrix}. \]
This tangent space evolves under its own set of equations, sometimes called the “geodesic deviation equation” (GDE) because of the analogy to geodesics in curved spaces (Vogelsberger et al., 2008; White & Vogelsberger, 2009; Vogelsberger & White, 2011), and the tetrahedral approximation is a secant approximation to this tangent space^{4}; ^{4}For tetrahedra, the tangent-space derivatives are just given through the three directional finite differences that can be computed on a tetrahedron. Since the GDE gives us only infinitesimal information around a point, and the tetrahedral secants are very crude, a better local description of the manifold in the neighbourhood of $\mathbf{q}$ can be achieved by considering the evolution of a higher order expansion of the map in the neighbourhood of $\mathbf{q}$.
2.1.2 Lagrangian elements
The central idea of our approach is to consider a decomposition of the Lagrangian manifold into a finite number of Lagrangian volume elements. In the TetPM approach, these volume elements were tetrahedra, but in principle any volume decomposition is possible. In what follows, we will use cubical volume elements to decompose Lagrangian space. Each element carries a constant mass and defines a local map between Lagrangian space and Eulerian phase space. In particular, we will approximate this map using multivariate polynomials in what follows. We will show that the coefficients of these polynomials, each defined on its Lagrangian cubical element, can be represented by a number of supporting points (or flow tracer particles).
For a three-dimensional Lagrangian manifold, we will thus consider the space of 3-variate polynomials of order $n$ in $\mathbf{q}$, i.e. the set
(8) \[ \mathcal{P}_n := \mathrm{span}\left\{\, q_1^i\, q_2^j\, q_3^k \;:\; 0 \le i, j, k \le n \,\right\}, \]
and we note that $\dim \mathcal{P}_n = (n+1)^3$. The Lagrangian environment of our point shall be described by a unit cube centred on $\mathbf{q}$. Matching $\dim \mathcal{P}_n$, we can choose the following subset of points on a regular lattice,
(9) \[ \mathbf{q}_{ijk} = \mathbf{q} + \frac{1}{n}\left( i - \frac{n}{2},\; j - \frac{n}{2},\; k - \frac{n}{2} \right), \qquad 0 \le i, j, k \le n, \]
so that the polynomial coefficients are fully determined if the mapping is known at all lattice points $\mathbf{q}_{ijk}$. We also say that the element has $(n+1)^3$ degrees of freedom. Note that the requirement that the $\mathbf{q}_{ijk}$ form a regular lattice is not a necessity but a mere convenience when calculating the polynomial coefficients. Furthermore, the lattice is “force-free” and gives a trivial equal-mass decomposition.
We are then able to express the mapping of a point $\mathbf{q}$ and its environment between Lagrangian space and phase space through mappings of the unit cube centred on $\mathbf{q}$, i.e. specifically
(10) \[ \mathbf{x}(\mathbf{q} + \Delta\mathbf{q}, t) = \sum_{i,j,k} \boldsymbol{\alpha}_{ijk}(t)\, \Delta q_1^i\, \Delta q_2^j\, \Delta q_3^k, \qquad \mathbf{p}(\mathbf{q} + \Delta\mathbf{q}, t) = \sum_{i,j,k} \boldsymbol{\beta}_{ijk}(t)\, \Delta q_1^i\, \Delta q_2^j\, \Delta q_3^k, \]
with all $0 \le i, j, k \le n$ and $\Delta\mathbf{q} \in [-1/2, 1/2]^3$.
Knowing position and velocity at the point $\mathbf{q}$ alone, without any neighbour information in the unit cube, constrains only the zeroth-order polynomial ($n = 0$). This is comparable to the N-body case, where the continuous structure is also not retained. In the TetPM method, four points are used. A tetrahedron thus represents the map $\mathbf{q} \mapsto \mathsf{A}\mathbf{q} + \mathbf{b}$, where $\mathsf{A}$ is a rank-2 tensor and $\mathbf{b}$ a vector, i.e. it describes a simple linear transformation between three-dimensional Lagrangian space and six-dimensional phase space. The case of $n = 1$ corresponds to a trilinear element (made up of 8 points). In TetPM, however, the unit cube was decomposed into six tetrahedra, so that one obtains six unilinear elements per unit cube instead of one trilinear element, as would be the case at order $n = 1$ in the approach discussed here. In this paper, we will mainly focus on the triquadratic case ($n = 2$), which is fully determined by knowing position and velocity at 27 locations in the unit cube. Thus, for a system with a given number of degrees of freedom, fewer elements are needed when the order increases, since higher order elements have more internal degrees of freedom.
Finally, without loss of generality, the approach discussed above for a single unit cube can be extended to all of by tiling with such cubes and performing the mapping between Lagrangian space and phase space separately for every tile of Lagrangian space. For cosmological simulations with periodic boundary conditions one uses a finite tiling of the 3torus.
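To make the construction above concrete, the following Python sketch recovers the $(n+1)^3 = 27$ coefficients of one triquadratic map component from its values at the supporting points on the lattice $\{-1/2, 0, 1/2\}^3$, and evaluates the resulting polynomial (the function names and the tensor-product Vandermonde solve are our own illustrative choices, not the paper's implementation):

```python
import numpy as np

def triquadratic_coeffs(vals):
    """Coefficients c[a,b,c] of a tri-quadratic polynomial sum c[a,b,c] q1^a q2^b q3^c
    from its 27 values `vals` (shape (3,3,3)) at the lattice q in {-1/2, 0, 1/2}^3.
    Sketch only: a tensor-product Vandermonde inversion, one axis at a time."""
    nodes = np.array([-0.5, 0.0, 0.5])
    V = np.vander(nodes, 3, increasing=True)      # rows: [1, q, q^2] at each node
    Vinv = np.linalg.inv(V)
    # vals[i,j,k] = sum_{a,b,c} V[i,a] V[j,b] V[k,c] c[a,b,c]  ->  invert per axis
    return np.einsum('ai,bj,ck,ijk->abc', Vinv, Vinv, Vinv, vals)

def eval_triquadratic(coeffs, q):
    """Evaluate the tri-quadratic polynomial with the given coefficients at q."""
    p1, p2, p3 = (np.array([1.0, qi, qi * qi]) for qi in q)
    return np.einsum('abc,a,b,c->', coeffs, p1, p2, p3)
```

By construction, the interpolant reproduces any polynomial in $\mathcal{P}_2$ exactly at every point of the unit cube, not only at the 27 supporting points.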
2.1.3 Densities from Lagrangian elements
In general, the uniform density of a cubical element of Lagrangian space is mapped to a non-uniform density on a curved patch in configuration space. We note that the exact density of the mapped unit cube (of mass $m_{\rm c}$) is obtained directly from the Jacobian of the configuration-space part of the coordinate transformation (i.e. $\mathsf{J} = \partial \mathbf{x}/\partial \mathbf{q}$) as
(11) \[ \rho\left(\mathbf{x}(\mathbf{q})\right) = \frac{m_{\rm c}}{\left| \det \mathsf{J} \right|} = \frac{m_{\rm c}}{\sqrt{\det g}}, \]
where the last equality just highlights the connection to differential geometry, i.e. the curved Lagrangian elements define a metric space with the metric $g_{ij} = \sum_k (\partial x_k / \partial q_i)(\partial x_k / \partial q_j)$ and $\det g = (\det \mathsf{J})^2$, so that caustics correspond to $\det g = 0$. It is obvious then that tetrahedra possess a constant metric, while the higher order polynomial elements have a non-constant metric. Equivalently, the density is constant for the tetrahedron, and the reciprocal of a polynomial for $n \ge 1$, which means that elements with $n \ge 1$ can explicitly track caustics, while N-body and TetPM cannot (there, caustics arise only when a tetrahedron has vanishing volume).
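A hedged numerical sketch of eq. (11) — the central-difference Jacobian and the function names are our own illustrative choices, not the paper's mass-deposit scheme:

```python
import numpy as np

def element_density(xmap, q, mass=1.0, h=1e-6):
    """Density at Lagrangian coordinate q of an element of the given mass,
    computed as mass / |det(dx/dq)| with a central-difference Jacobian.
    `xmap` is any smooth map from Lagrangian to configuration space (R^3 -> R^3)."""
    J = np.empty((3, 3))
    for k in range(3):
        dq = np.zeros(3)
        dq[k] = h
        J[:, k] = (xmap(q + dq) - xmap(q - dq)) / (2.0 * h)  # column k of dx/dq
    return mass / abs(np.linalg.det(J))
```

For the uniform dilation $\mathbf{x}(\mathbf{q}) = 2\mathbf{q}$, the Jacobian determinant is 8 and a unit-mass element has density 1/8; as $\det \mathsf{J} \to 0$ the density diverges, which is exactly the caustic behaviour discussed above.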
In Figure 2, we illustrate how a Lagrangian unit square maps to configuration space, and how the inferred density depends on the adopted order of the polynomial. In this two-dimensional example, we represent the Lagrangian unit square by nine supporting points, and displace the Eulerian coordinates outwards proportionally to the distance from the centre of the square. In case a, we show the result of adopting bilinear polynomials on four subsquares tiling the full square. In case b, we show the result of adopting a biquadratic polynomial on the full square. Finally, in case c, we triangulate the full square into eight triangles. We see that the mapped shapes in the two linear methods, a and c, are identical, but the inferred density represents the radial gradient more accurately in the bilinear case than in the tessellation case. The biquadratic method leads to parabolic edges and a smooth density distribution in the interior. The resulting density field (before shell-crossing) is piecewise linear in one direction for the trilinear map, but generally not continuous. The triquadratic map remedies this by allowing gradients to change smoothly inside an element; it is always continuous but generally not continuously differentiable. Finally, triangular elements are piecewise constant, which leads to a density field where many elements are needed to faithfully describe gradients. Given that all three examples use identical data, the advantage of higher order methods is clearly obvious in this case.
We illustrate how polynomial elements trivially capture caustics in Fig. 3, where, by a simple linear map to Eulerian space, the top vertices of a square are flipped. In the left panel, a, the result for a trilinear map is shown. Here, the mapped square has the correct topology and an actual singularity at the centre. This is quite in contrast to what can be achieved with a triangulation, which is shown in the right panels, b. Here, a choice must be made as to how to triangulate the square (in 2D there are two possibilities). The linear map flipping the top vertices then maps each triangle again to a triangle of the same size, which leads to a final density field that has no singularity; moreover, the mapped square does not have the correct topology (nothing should have been mapped to the region of the dark grey triangle in the bottom right panel). Obviously, for N-body particles, the flipped and unflipped states are indistinguishable. This anisotropy of a simple triangulation of course exists also in three dimensions. In order to avoid the intrinsic anisotropy inherent in the Delaunay triangulation of the unit cube, HAK13 proposed an alternative decomposition of the cube into eight overlapping tetrahedra instead of six (the decomposition they employed for their TCM scheme, as opposed to the Delaunay decomposition used for T4PM; see Figure 3 in HAK13, where the TCM decomposition can be thought of as rearranged into one cube). In this paper, whenever we use tetrahedra, we thus employ the same non-tessellating double covering of the unit cube as in HAK13 to avoid an anisotropic decomposition with tetrahedra.
2.2 Time evolution, discretisation errors and refinement strategies
Next, we will consider the time evolution of the Lagrangian elements. We will demonstrate that the evolution equations for the tri-polynomial elements become separable evolution equations for the coefficients, which can be solved by simply evolving the supporting points as freely falling flow tracer particles. The representation by finite-order polynomials, however, introduces a truncation error that has to be bounded in order to maintain the Hamiltonian character of the system. This then leads to natural refinement criteria that attempt to keep the truncation error within reasonable limits and that are necessary to guarantee e.g. energy conservation in general.
2.2.1 Time evolution of elements
The phase-space density is, as in the N-body case, conserved along the characteristics given by the canonical equations of motion (cf. eq. 5), with the only difference that we are no longer concerned with a discrete set of characteristics; instead, they have an underlying manifold structure and thus generalise to
(12) \[ \frac{\partial \mathbf{x}(\mathbf{q},t)}{\partial t} = \frac{\mathbf{p}(\mathbf{q},t)}{m a^2}, \qquad \frac{\partial \mathbf{p}(\mathbf{q},t)}{\partial t} = -m\, \nabla_{\mathbf{x}} \phi\left(\mathbf{x}(\mathbf{q},t)\right), \]
along which $\mathrm{d}f/\mathrm{d}t = 0$.
The time evolution of the phase-space elements can be obtained from the time derivative of eq. (10), which contains the time derivatives of the six polynomials. Since the Lagrangian coordinates do not depend on time, we are left with the time derivatives of the coefficients $\boldsymbol{\alpha}_{ijk}$ and $\boldsymbol{\beta}_{ijk}$. Furthermore, consistency between the supporting points and the coefficients then requires that the coefficients themselves follow canonical equations, i.e.
(13) \[ \dot{\boldsymbol{\alpha}}_{ijk} = \frac{\boldsymbol{\beta}_{ijk}}{m a^2}, \qquad \dot{\boldsymbol{\beta}}_{ijk} = \mathbf{F}_{ijk}, \]
where $\mathsf{J} = \partial\mathbf{x}/\partial\mathbf{q}$ is the Jacobian and $\mathbf{F}_{ijk}$ are the expansion coefficients of the force $-m\,\nabla_{\mathbf{x}}\phi$ mapped to Lagrangian space. It is completely equivalent to evolve the coefficients using their own canonical equations, truncated at some order, or to use a number of freely falling test particles (the “flow tracers”, their number per element matching the degrees of freedom of the element) from which these coefficients can be calculated. This second possibility allows us to use a set of particles that are evolved like massless N-body particles to describe the evolution of each element.
2.2.2 Discretisation errors
The time evolution above is exact if the series expansion is not truncated (i.e. for ). When approximating the evolution equations above by piecewise polynomial expansions, a truncation error is however introduced since the force field across an element is only considered to the order of the polynomial expansion. The force across the element is in general however given as
(14) 
so that the elements correctly capture the evolution of the terms up to the order of the expansion, and the error in the momentum update is given by
(15) 
which implies that second-order error terms are sourced by a non-constant tidal field across the element, third-order terms by a nonlinearly varying tidal field, and so on. Momentum is therefore only conserved at the level of this truncation error. The error is expected to be small when the potential is smooth across elements, which is usually true when the elements are small, but certainly becomes an invalid assumption at later times. Estimating the magnitude of the truncated higher-order terms across elements thus naturally leads to a refinement criterion: when an element is split into smaller elements, the truncation error remains bounded. We elaborate on this possibility next.
Refining on force: As we have just discussed, errors arise when the variation of the gravitational force across the element is not accurately captured. To turn this error source into a criterion for adaptive refinement, we calculate the force at the supporting points using two different orders of the interpolating polynomial, i.e. we calculate
(16) 
With these two estimates, we can estimate the maximum relative force error and attempt to keep it bounded,
(17) 
where the bound may depend on the refinement level of the element. Specifically, in our implementation, we chose not to determine the true maximum but to approximate the problem by evaluating the error at half-points, i.e. at the locations where new supporting points would be placed if the element were to be refined (cf. Fig. 4). E.g., for the tri-quadratic element, we compute the tri-quartic interpolant by simply combining 8 tri-quadratic elements into a fundamental grid unit of supporting points, which provides us with the necessary number of points to compute the tri-quartic polynomial.
Refining on high-order derivatives: Alternatively, and similar in spirit to the refinement criterion based on second derivatives proposed by Löhner et al. (1987), we can keep errors arising from higher-order derivatives across our elements bounded by using a criterion that splits the Lagrangian cubes when the dimensionless ratio of derivatives of one order to derivatives of the next lower order exceeds a specified threshold. Since we focus on tri-quadratic elements in the remainder, we use the ratio between third and second derivatives of the form
where the indices run over the supporting points in Lagrangian space. Note that in three dimensions, we calculate the third and second derivatives along each dimension separately and then evaluate the quantity
(19) 
A Lagrangian element is selected for refinement into smaller elements when this ratio exceeds a chosen threshold, where lower threshold values lead to more aggressive refinement.
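A minimal one-dimensional sketch of such a derivative-ratio criterion (our own illustration; the function name, the five-point stencil, and the regulariser `eps` are assumptions, and the full method compares the ratios along each dimension):

```python
import numpy as np

# Flag an element when finite-difference third derivatives of the tracer
# velocities become comparable to the second derivatives across the element.
def needs_refinement(v, threshold=0.1, eps=1e-12):
    d2 = np.abs(np.diff(v, n=2)).max()   # second-difference magnitude
    d3 = np.abs(np.diff(v, n=3)).max()   # third-difference magnitude
    return d3 / (d2 + eps) > threshold

q = np.linspace(0.0, 1.0, 5)             # supporting points of one element
smooth = 1.0 + 2.0 * q + 3.0 * q**2      # representable by the element: keep
cubic = smooth + 4.0 * q**3              # third-order content: split
```

A quadratic velocity profile has vanishing third differences and is never flagged, while any cubic contribution drives the ratio above the threshold.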
Other refinement criteria are of course possible and might perform better than those considered in this paper. Which criterion is optimal, i.e. minimises the errors with the smallest number of flagged elements for a given problem, is an open question. It will thus be interesting to consider alternatives to the two presented above in the future.
2.3 Adaptive refinement of elements
We next discuss the adaptive splitting of Lagrangian elements. We adopt an oct-tree-based refinement strategy, meaning that a cubical element can be split into eight smaller cubical elements coextensive with the original element in Lagrangian space. Splitting this way in Lagrangian space automatically implies an equal-mass splitting and exact mass conservation.
In order to illustrate the procedure we follow, we discuss the steps involved using the Lagrangian element in two dimensions shown in Figure 4. The steps followed during a refinement step are as follows (and trivially generalise to three dimensions):

i) The refinement criterion is evaluated for each element and, if it is fulfilled, the element is flagged for subsequent refinement (cf. left panel of Fig. 4).

ii) A regularisation step is performed to ensure that neighbouring elements differ by at most one level of refinement. All elements that need to be refined to maintain regularity are also flagged.

iii) The element is split into eight elements of one eighth the mass of the original element using the interpolated mid-points (cf. right panel of Fig. 4).
The algorithm outlined above is guaranteed to provide a covering of Lagrangian space at all times (i.e. no holes will appear and mass is explicitly conserved), and changes in refinement level are constrained to be at most one between neighbouring cells. This procedure thus implements a tree-based AMR method in Lagrangian space. Since this AMR structure can be represented by an oct-tree, all operations on the tree can be encoded using integer arithmetic (we represent Lagrangian space coordinates by three 20-bit integer numbers that can be stored in one 64-bit integer), allowing for fast identification of neighbours without explicitly storing pointers.
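The integer encoding can be sketched as follows (the bit layout and helper names are our own assumptions; the paper only states that three 20-bit Lagrangian coordinates are stored in one 64-bit integer):

```python
# Pack three 20-bit Lagrangian grid coordinates into a single 64-bit key,
# so that tree operations and neighbour lookups reduce to integer arithmetic.
BITS = 20
MASK = (1 << BITS) - 1

def pack(i, j, k):
    return (i & MASK) | ((j & MASK) << BITS) | ((k & MASK) << (2 * BITS))

def unpack(key):
    return key & MASK, (key >> BITS) & MASK, (key >> (2 * BITS)) & MASK

def neighbour(key, di, dj, dk):
    """Neighbour identification by pure integer arithmetic (periodic wrap)."""
    i, j, k = unpack(key)
    return pack((i + di) & MASK, (j + dj) & MASK, (k + dk) & MASK)
```

The masking makes the lookup periodic in the Lagrangian box, so no pointer storage or tree walk is needed to find a neighbouring cell key.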
All flow tracers are allowed to move freely, except those that occupy the faces between elements of different resolution. At these coarse-fine boundaries, flow tracers are constrained to remain, in Lagrangian space, at the mid-points between coarser particles. This can be achieved by using an acceleration interpolated from the flow tracers shared between the coarse and the fine element to the half-points, instead of evaluating their actual acceleration from the gravitational force at their location. This also ensures that no holes appear when the Lagrangian manifold is mapped to Eulerian space.
2.4 Gravitational Evolution
We now describe how we compute the gravitational force field sourced by the Lagrangian elements and how we follow their time evolution. We first discuss how we compute the total density field as the sum of the density contributions of all Lagrangian elements. Having an estimate of the total density field, it is then possible to solve Poisson’s equation and compute the gravitational accelerations.
2.4.1 Approximating the projected mass distribution of Lagrangian phase space elements
A basic element consists, in general, of a curved hexahedron whose inner density field is described by a rational function. In principle, it may be possible in some simple cases to directly compute the force exerted by these geometrical objects. In practice, however, this would be computationally extremely expensive (if feasible at all). A more practical alternative is to project the Lagrangian elements onto a regular mesh in Eulerian configuration space and compute the forces by solving Poisson’s equation using Fourier or relaxation methods.
Choosing such an approach, the elements need to be deposited onto a grid. In general, there is no known method to integrate an arbitrary rational function over the intersection of curved hexahedra and cubes (i.e. grid cells). One alternative is to approximate them by polyhedra and then decompose these into a set of tetrahedra, for which an exact deposit is possible (Powell & Abel, 2015). Another alternative, which we adopt here, is to recursively decompose, in Lagrangian space, the mass of the hexahedra into equal-mass particles located at the centre of mass of each subvolume. This set of mass carriers is then mapped to Eulerian space using eq. (10) and deposited onto the mesh as point masses. We describe this approach in more detail below.
Our cubical Lagrangian elements are fully described by a set of supporting points, spanning 8 linear unit volumes or 1 quadratic unit volume (cf. Fig. 5, where the blue points indicate the supporting points, or “flow tracers”, and the shaded grey area the Lagrangian volume spanned by them). In contrast to the tetrahedral elements of TetPM (which have constant density), our trilinear and tri-quadratic (and all higher-order) mappings of these elements result in a non-uniform density. Fortunately, their projection can be trivially approximated with a simple oct-tree-based particle deposit scheme. Since the density in Lagrangian space is uniform, we perform an oct-tree subdivision of the Lagrangian unit cube up to some level that decomposes each element into discrete particles (or “mass carriers”), as illustrated in Fig. 5. For every element, we have the supporting points, cf. eq. (9), that fully determine the mapping, eq. (10). We then represent the element of total mass M at approximation level l through a space partitioning into 8^l octs of equal mass, with Lagrangian centres of mass at the locations
(20) 
The mapping to Eulerian space for all these points is then given by simply applying the map that has been determined from the supporting points. In practice, since the points at which the map needs to be evaluated are given by the simple oct-tree structure, the coefficients of the trivariate polynomials can be precalculated.
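The recursive decomposition into mass carriers can be sketched as follows (a simplified stand-alone version; the function name is illustrative, and while we test it with the identity map, in the actual scheme the mapping is the element's tri-quadratic polynomial):

```python
import numpy as np
from itertools import product

# Split the uniform Lagrangian unit cube into 8**level equal sub-octs, place
# one equal-mass carrier at each sub-oct's Lagrangian centre of mass, and map
# all carriers to Eulerian space with the element's polynomial mapping.
def mass_carriers(level, mapping, total_mass=1.0):
    n = 2 ** level
    centres = (np.array(list(product(range(n), repeat=3))) + 0.5) / n
    return mapping(centres), np.full(len(centres), total_mass / n**3)

points, masses = mass_carriers(2, lambda q: q)   # identity map for testing
```

By construction the decomposition conserves mass exactly at every level, and the carriers inherit the element's deformation entirely through the mapping.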
One apparent disadvantage of our approach is that the mapped particles retain a (deformed) grid regularity. For completeness, we mention three alternatives for depositing particles at the end of the recursive decomposition.

i) A very smooth deposition could be achieved by triangulating the final oct and depositing the resulting tetrahedra using an exact deposition algorithm (cf. Powell & Abel, 2015).

ii) A glass particle distribution (cf. White, 1996) can be used to map the elements to Eulerian space (i.e. the regular point set is replaced with a glass distribution in Lagrangian space).

iii) A random sampling is also possible, where the regular point set is replaced with an arbitrary number of uniformly distributed random points drawn from the Lagrangian subvolume. This approach is not suitable for sampling the mass distribution in time-evolving simulations, since a single random realisation is not force-free (in contrast to the glass and the hierarchical lattice). However, it is favourable for the creation of images, since Poisson sampling gives the visual impression of a smoother field than the other approaches for the same number of sampling points.

All three alternatives result in fewer or no symmetries, suppressing possible Moiré patterns between the density mesh and the deposited particles. Currently, we have successfully implemented ii) and iii) in our code. However, we have verified that the grid regularity does not affect our results in practice, and thus we employ the simple deformed-grid deposit in the remainder of the paper.
2.4.2 Force Calculation and Time Stepping
The mass carriers can be deposited using any of the known particle-mesh charge assignment schemes (cf. Hockney & Eastwood, 1981). We have adopted the cloud-in-cell (CIC) deposit in this paper for the pseudo-particles determined as described above. Once the density field is obtained, we solve Poisson’s equation using a standard Fourier method. We first perform a Fast Fourier Transform of the field, then compute the gravitational potential by multiplying with the Green’s function for a fourth-order finite-difference Laplacian, i.e.
(21) 
We then perform the backward Fourier transform and apply a fourth-order finite-difference gradient to compute the gravitational force field. We finally interpolate this field to the positions of the supporting points and advance their positions and velocities just as would be done for standard N-body particles.
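A minimal periodic PM Poisson solve in this spirit (our own sketch: it uses the exact spectral Green's function -1/k^2 rather than the fourth-order finite-difference kernel of eq. 21, and the grid size and units are illustrative):

```python
import numpy as np

def solve_poisson(delta, boxsize=1.0):
    """Return phi with nabla^2 phi = delta on a periodic grid (spectral)."""
    n = delta.shape[0]
    k = 2.0 * np.pi * np.fft.fftfreq(n, d=boxsize / n)
    kx, ky, kz = np.meshgrid(k, k, k, indexing="ij")
    k2 = kx**2 + ky**2 + kz**2
    k2[0, 0, 0] = 1.0              # avoid division by zero at k = 0
    greens = -1.0 / k2
    greens[0, 0, 0] = 0.0          # zero the mean (k = 0) mode
    return np.fft.ifftn(greens * np.fft.fftn(delta)).real
```

For a single Fourier mode the spectral solver is exact, which makes it easy to verify: a source cos(2πx) must yield the potential -cos(2πx)/(2π)².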
We note that our implementation of the force algorithm does not require all mass carriers to be concurrent in memory. This is a great advantage in keeping the memory footprint of our code low, since the ratio of flow tracers to mass carriers will be small.
Once forces are computed, we adopt a leapfrog kick-drift-kick time integration scheme. Since our forces are given by a PM algorithm, we employ a global time step set by the minimum of the time steps assigned to the flow tracers, which in turn are set by a Courant criterion:
(22) \(\Delta t = C\,\min_i\, \Delta x / \left|\mathbf{v}_i\right|\)
where \(\Delta x\) is the FFT cell size, \(\left|\mathbf{v}_i\right|\) is the magnitude of the velocity of the \(i\)-th particle, and \(C\) is a parameter which we usually set to somewhat less than unity.
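The integrator and the global step can be sketched as follows (function names and the Courant factor C = 0.5 are our own choices; `accel` stands in for the PM force interpolation):

```python
import numpy as np

def courant_dt(v, dx, C=0.5, vmin=1e-10):
    """Global step: the minimum over tracers of C * dx / |v_i| (eq. 22)."""
    speeds = np.maximum(np.linalg.norm(v, axis=1), vmin)
    return C * dx / speeds.max()

def kdk_step(x, v, accel, dt):
    """One kick-drift-kick leapfrog step for all flow tracers."""
    v_half = v + 0.5 * dt * accel(x)          # half kick
    x_new = x + dt * v_half                   # full drift
    v_new = v_half + 0.5 * dt * accel(x_new)  # half kick
    return x_new, v_new
```

The symplectic kick-drift-kick form keeps the long-term energy error bounded, which is essential for the conservation tests discussed later.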
For completeness, we mention that we have also implemented an oct-tree algorithm, which allows for a very high force resolution as well as individual time steps for every flow tracer. The idea is to build a full oct-tree down to some specified node length, and then use the flow tracers to refine the tree topology in a given simulation. The moments of the multipole expansion of the tree force of each node are then computed using the mass field given by all mass carriers. In the remainder of this paper, however, we only employ forces computed by the PM algorithm, to avoid possible artefacts due to spatially-correlated force errors introduced by the use of a finite number of terms in the multipole expansion of the force field.
2.5 Implementation and Parallelisation Strategies
Interesting problems in cosmological structure formation require a full exploitation of state-of-the-art supercomputing centres. This in turn requires algorithms capable of being executed efficiently on up to thousands of processor cores with a mixture of shared and distributed memory, and for billions of resolution elements. Thus, we have put particular emphasis on these aspects in our code.
We have implemented our algorithms in the distributed-memory parallel code L-Gadget3. The L-Gadget3 code is a memory-efficient version of P-Gadget3 (Springel, 2005; Springel et al., 2008), specially tailored for dark-matter-only simulations, that was successfully executed on 12,000 compute cores for the MXXL project (Angulo et al., 2012). Most of its parallel algorithms can be used directly for our Lagrangian element method, but two aspects are noteworthy.
The first is the one-time construction of Lagrangian patches from a set of simulation particles. For this, we start by performing a decomposition of the computational domain using Lagrangian (as opposed to Eulerian) coordinates. This ensures that most Lagrangian patches have all their supporting points locally available. For patches lying on the surface of the domain boundary this is not true, and they need to import the required set of vertices from other processors. This can be done very efficiently by exploiting the fact that there is a unique relation between a particle ID, its Lagrangian patch, and the MPI task hosting it.
The second aspect is that our fundamental data structures are Lagrangian patches, on which an Eulerian domain decomposition is performed. We keep a locally complete set of supporting points for these structures, which implies a slight overhead in terms of memory (since the boundary supporting points are duplicated) and a slightly suboptimal load balance. However, this choice has the huge advantage of allowing all quantities related to Lagrangian patches to be computed locally, without requiring any inter-node communication. This makes the creation of mass carriers embarrassingly parallel, while their subsequent deposit onto the mesh requires communication.
3 Testing Adaptive Refinement: Phase Mixing in a Fixed Potential
In a first test problem, we evolve a cube of cold collisionless material, without self-gravity, in a fixed potential. This allows us to study the general feasibility of adaptive refinement as well as the impact of the refinement criterion on the accuracy of the subsequent evolution. In the subsequent sections, we then turn to test problems with self-gravity in one and two dimensions.
As discussed in HAK13, the most important shortcoming of TetPM is the inability to follow the dark matter sheet when strong mixing occurs (e.g. in the inner parts of haloes). In this case, the surface of the sheet in phase space grows rapidly (see also below) by being continuously stretched, and a given small number of points is insufficient to describe its evolution over longer times. New points need to be inserted in order to follow the motion of the stretched regions accurately.
To provide a robust and simplified framework to test the performance in mixing situations, we study a cube of collisionless, cold, but not self-gravitating material orbiting in a non-harmonic potential. The cube is initially represented by particles spanning tetrahedral, trilinear, or tri-quadratic elements, respectively. For reference, we compare to a high-resolution N-body solution in which we sample the cube with a much larger number of particles. The cube is given an initial uniform velocity that is offset from the centre of the potential (i.e. every point in it has non-zero angular momentum). Specifically, we consider a static Plummer gravitational potential, for which the acceleration field is given by
(23) \(\mathbf{a}(\mathbf{r}) = -\,\frac{G M\,\mathbf{r}}{\left(r^2 + \epsilon^2\right)^{3/2}}\)
with some mass \(M\) and softening scale \(\epsilon\) whose specific values are irrelevant in our case, as we only want to study the qualitative behaviour. Since the potential is constant in time, the motion of any point in phase space is determined uniquely by its initial phase-space coordinates. Hence, the choice of the Lagrangian elements used to represent the orbiting cube has no impact on the dynamics of the vertices. The order of the elements will, however, impact the distribution of matter between the vertices, as well as where new vertices are inserted if adaptive refinement is used. This simple test problem will thus allow us to study a few key diagnostics: (1) how the order of the Lagrangian elements impacts where mass is deposited over time and how well energy is conserved for the elements, (2) which refinement criteria allow us to follow the mixing well and permit an energy-conserving scheme, and (3) which refinement criteria yield a better tracking of the evolving phase space sheet with the smallest number of newly inserted vertices.
3.1 Phase mixing without refinement
We first demonstrate once again that, in mixing situations, the Lagrangian element schemes without refinement lead to spurious mass deposit towards the centre of the potential, leading to a violation of energy conservation. In Figure 6, we show the sheared cube at two times (left and right columns, respectively) for all schemes: tetrahedral, trilinear, and tri-quadratic Lagrangian elements describing the initial cube, as well as the respective N-body solution sampling the initial cube (from top to bottom). We note that, as expected, higher-order schemes generally help to represent the true mass distribution more faithfully, with the tri-quadratic scheme providing a reasonably good description at the earlier time, where no mass is deposited onto disallowed orbits close to the centre. This is not the case for the linear schemes, where significant diffusion towards the centre has already occurred by this time. At the later time, the situation is even more dramatic, as all schemes now deposit mass into the very centre of the potential, with the tetrahedra even assigning most mass to the central region. This is exactly the situation that led to the biased central densities of haloes discussed by HAK13. While the curved elements help to some degree to faithfully describe the wrapping of the sheet, it is obvious that a refinement technique is needed to suppress diffusion towards the centres of potentials and thus maintain energy conservation.
To quantify the visual impression gained from Fig. 6, we calculate the total energy of the cube over time,
(24) \(E(t) = \sum_i m_i \left[\tfrac{1}{2}\left|\mathbf{v}_i\right|^2 + \Phi\!\left(\mathbf{x}_i\right)\right],\)
which can be particularly easily evaluated for an analytic potential and is conserved if the potential is static as in our case. Furthermore, the degree of sheet wrapping can be quantified by the total volume of the cube as given by
(25) \(V(t) = \int \left|\det\!\left(\partial \mathbf{x}/\partial \mathbf{q}\right)\right|\, \mathrm{d}^3 q.\)
In Figure 7, we show the evolution of the total energy and the sheet volume with respect to their initial values. We find that at early times the tetrahedral and trilinear elements lose energy at the same rate, while at later times the tetrahedra lose somewhat less energy than the trilinear elements. At all times, the tri-quadratic scheme conserves energy significantly better, but still loses energy. The sheet volume is identical for all methods at early times and disagrees thereafter, with the trilinear and tri-quadratic elements roughly following each other and the tetrahedra estimating a significantly larger volume (roughly a factor of two). This is, however, simply due to the fact that the tetrahedral decomposition is not an actual tessellation, in order to avoid intrinsic anisotropies (cf. the discussion in Section 2.1.3). It is obvious from these results that adaptive refinement of the elements will be required to both conserve energy and arrive at a converged answer for the volume of the sheet.
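The energy diagnostic is straightforward to evaluate for the mass carriers in the static Plummer potential; a sketch (GM and the softening `eps` are illustrative parameter choices, and the sum-over-carriers form is our discretisation of the energy integral):

```python
import numpy as np

def total_energy(x, v, m, GM=1.0, eps=0.1):
    """Kinetic plus potential energy of all mass carriers in a static
    Plummer potential; conserved exactly by the continuum dynamics."""
    r = np.linalg.norm(x, axis=1)
    phi = -GM / np.sqrt(r**2 + eps**2)     # Plummer potential
    return np.sum(m * (0.5 * np.sum(v**2, axis=1) + phi))
```

Any drift of this quantity over time directly measures the spurious mass transport of the discretised scheme, since the true dynamics in a static potential conserve it exactly.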
3.2 Phase mixing with refinement
We now allow for adaptive refinement using the refinement criterion based on limiting high-order polynomial contributions to the represented velocity field, as expressed in equation (19). We have also tested refining on forces, but found that a significantly larger amount of refinement was needed to arrive at a solution of comparable quality. This can be understood simply from the difference in the locations, relative to the potential centre, at which the respective criteria trigger refinement: the velocity refinement inserts new vertices preferentially at large radii, where large velocity differences across an element occur first. In contrast, the force refinement splits the innermost elements first, since they experience stronger force gradients; the latter proved to be less optimal in this case.
First, we investigate the impact of the order of the elements as well as of the threshold used in the refinement criterion on the evolution of the cube. In Figure 8, we show the distribution of mass at the later time (to be compared to the right column of Fig. 6) for the three Lagrangian element types and two refinement thresholds. We note that adaptive refinement has significantly improved the agreement with the density distribution of the N-body reference solution for all element types, with increasing agreement, as expected, for higher order and smaller thresholds. In fact, the tri-quadratic solution with threshold 0.05 outshines the reference solution, which is grainy due to the N-body sampling, while both use about two million points to represent the solution. The lower-order elements refined to at least this number of vertices while still suffering from slight density fluctuations. For the larger threshold, all element types refined to a few hundred thousand vertices to represent the solution. We complement this visual inspection with a quantitative study of the total energy and the sheet volume in Figure 9. In all cases (we also show results for an intermediate threshold in this figure), energy conservation is greatly improved, but remains at the sub-per-cent level at late times only for the tri-quadratic elements. Interestingly, the different elements seem to converge to different asymptotic curves for the evolution of the sheet volume when the threshold is reduced. The sheet volume is a very sensitive diagnostic, since all perturbations and numerical errors enhance the sheet growth rate. The increase in sheet volume is entirely due to the density inhomogeneities visible in Figure 8 for the tetrahedral and trilinear elements.
While it may be possible to find refinement criteria more suitable to the low-order elements, we conclude at this stage that the tri-quadratic elements significantly improve energy conservation and the quality of the solution, with a significantly smaller number of vertices that need to be inserted via adaptive refinement.
We push this observation even further and next use a tri-quartic, i.e. fourth-order, interpolant over 8 tri-quadratic elements when inserting new vertices, while still representing the elements themselves only by tri-quadratic maps. This improves the quality of the solution even further. In Figure 10, we show the mass distribution of the cube at twice the time considered before, by which time the inner regions have already mixed very substantially. In all cases we use a fixed refinement threshold. The left panel shows tri-quadratic elements with new vertices inserted using a second-order interpolant, while the middle panel uses the fourth-order interpolant and the right panel shows the N-body reference solution. Using the fourth-order interpolation improves energy conservation by a factor of 4 and once again improves the solution. Furthermore, at this time, second-order vertex insertion leads to substantially more vertices in total than fourth-order insertion, which requires only a small fraction of the number of particles of the N-body reference run. The fourth-order interpolation also significantly improves the evolution of the sheet volume (see Figure 11), where noise due to the second-order interpolant makes the sheet grow faster than it should at very late times.
We conclude from the analysis in this section that adaptive refinement, especially when combined with high-order schemes for mass assignment and refinement, can make the Lagrangian phase space element approach energy-conserving, while overcoming the inherent graininess of the N-body method with significantly fewer active particles.
4 Testing Selfgravitating Lagrangian Elements
We have seen in the previous section that adaptive refinement can exquisitely track the evolution of the phase space sheet even in situations of strong mixing. This test was, however, performed in an external potential, meaning that local errors in the solution did not feed back into the global solution. We therefore investigate in the following sections the performance of the adaptively refined elements when self-gravity is present.
4.1 Test 1: Onedimensional Plane Wave collapse (axisaligned)
We only briefly consider the axis-aligned one-dimensional plane wave collapse here, before moving on to computationally more challenging problems. This problem consists of solving the self-gravitating collapse, in an Einstein-de Sitter cosmology, of a one-dimensional potential perturbation
(26) 
where the wave vector is parallel to one of the Cartesian axes of the computational volume, and the amplitude is chosen so that shell crossing occurs at a chosen expansion factor.
For the plane wave, the solution (and thus particle position, velocity, and acceleration) can be calculated analytically before shell-crossing occurs (Zeldovich, 1970), but the analytic solution, of course, relies on “unsmoothed” gravity. In the limit that the force softening goes to zero (and thus the PM resolution to infinity), the analytic solution should be approached. We thus present, in Figure 12, a plot identical to the top panel of Figure 7 of HAK13: the position error of flow tracers at the time of shell-crossing as a function of the force resolution, while keeping the number of flow tracers constant. The solution should converge to a sharper and sharper caustic with increasing force resolution. As demonstrated by HAK13, N-body does not converge at fixed particle number. Here, we show the results for standard N-body and for tri-quadratic phase space elements without adaptive refinement. Unlike in HAK13, where no true convergence at fixed tracer particle number was seen, we find that the recursive mass deposit improves the situation substantially, leading to close to second-order convergence to the analytical solution with a constant number of tracer particles! For a given deposit recursion level, the error w.r.t. the analytical solution always asymptotes to a fixed value. To obtain true second-order convergence, either exact deposits have to be employed (which have been implemented, e.g., for tetrahedra by Powell & Abel 2015), or an optimally chosen deposit level for a given force resolution has to be used.
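The analytic reference is the Zel'dovich (1970) solution; a sketch for a single sinusoidal mode (our own normalisation, with growth factor D(a) = a for Einstein-de Sitter and the amplitude fixed so that shell crossing occurs at `a_cross`):

```python
import numpy as np

def zeldovich_plane_wave(q, a, a_cross=1.0, k=2.0 * np.pi):
    """Position x and Jacobian dx/dq of tracers with Lagrangian label q
    at expansion factor a; valid up to shell crossing (dx/dq = 0)."""
    A = 1.0 / (a_cross * k)            # shell crossing when a * A * k = 1
    x = q - a * A * np.sin(k * q)
    dxdq = 1.0 - a * A * k * np.cos(k * q)
    return x, dxdq
```

At a = a_cross the Jacobian first touches zero at q = 0: a caustic forms, and the sheet folds over itself for later times.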
4.2 Test 2: Onedimensional Plane Wave collapse (oblique)
We next consider a single plane wave that is not aligned with the Cartesian axes, but has a wave vector given in terms of the fundamental mode of the simulation box. This is a very hard problem for particle methods, since the one-dimensional wave motion has to be preserved in three dimensions (cf. Melott et al., 1997). The particular choice of wave vector leads to the particles or flow tracers being folded onto themselves at rational fractions of the box length. In Figure 13, we show the phase space along the wave vector for the new phase space element method (using trilinear and tri-quadratic elements without adaptive refinement) as well as for the standard N-body method. We employed a fixed force resolution and number of tracer/N-body particles. The mass distribution of the phase space elements was sampled at two deposit levels. We compare the results to a high-resolution one-dimensional solution at the same force resolution at two times, at shell crossing (top panel) and at a later time (bottom panel).
The results are consistent with what HAK13 found (the results in their Figure 11 are, however, shown at somewhat later times), in that the N-body solution differs widely from the true one-dimensional solution. Here we see that this discrepancy already exists at shell-crossing, but becomes most severe at later times, where substantial velocities outside of the plane of symmetry break momentum conservation inside the symmetry plane. The behaviour is quite different for the trilinear and tri-quadratic phase space elements. Using the same number of tracer particles as the N-body solution, they have substantially smaller errors at shell crossing and still trace the correct solution very well at much later times; in fact, the tri-quadratic solution lies right on top of the high-resolution solution. This is even more remarkable since the way we plot the solution makes it appear that the resolution is higher than it actually is: points that appear as neighbours in phase space in this plot are not actually neighbours in Lagrangian space. They are points at the respective phase of the wave but at rather different locations on the Lagrangian manifold, and are therefore spread out throughout the entire simulation box.
4.3 Test 3: twodimensional perturbed plane wave collapse
In a final test problem, we employ the same two-dimensional test as in Hahn et al. (2013): the antisymmetrically perturbed plane wave described by Valinia et al. (1997). Compared to the plane wave collapse above, this problem introduces anisotropic collapse as well as mixing, and is thus significantly harder. In this test problem, the Lagrangian gravitational potential is given by that of a plane wave in the x-dimension with a sinusoidal phase perturbation in the y-dimension,
(27) 
The initial particle positions and velocities have been obtained using the Zel’dovich approximation. We adopt perturbation parameters given in terms of the size of the simulation box (note that these differ from the parameters chosen in HAK13), with the amplitude chosen so that first shell crossing occurs at a specified expansion factor. We run this problem in an Einstein-de Sitter cosmology and discuss the results at a time well after first shell crossing. In all cases, unless otherwise specified, we use the same number of particles in the initial conditions and employ a fixed PM force resolution. All solutions are run in three dimensions, effectively replicating the problem along the third axis. The initial density fields for this test problem, using the particle representation as well as the elements of various orders, are shown in Figure 15. As already discussed in Section 2.1.3, the tetrahedral elements provide a piecewise-constant density field, while the tri-quadratic elements provide a smooth, continuous one.
In Figure 14, we present the solution for this test problem using a constant number of elements of increasing order, from top to bottom: N-body, tetrahedra, trilinear elements, and tri-quadratic elements, together with a reference high-resolution N-body run using many more particles at the same PM resolution. We see that all Lagrangian element methods recover a roughly identical structure that can only barely be inferred visually from the low-resolution N-body solution. Furthermore, we clearly see how the increasing order of the elements improves the smoothness of the resulting density field (as shown in Section 2.1.3). The most important results to be taken from this figure are the central density of the clump and the positions of the caustics. HAK13 found that the self-gravitating tetrahedra can lead to a biased central density and also to a bias in the positions of caustics. We see these shortcomings reproduced in this figure for both the tetrahedra and the trilinear elements; the only advantage of the trilinear elements is that the projected density is somewhat smoother. As expected, both the central density and the positions of the caustics (compared to the high-resolution N-body solution) are, at least visually, significantly improved when the tri-quadratic elements are used.
Clearly, the density field inferred from the triquadratic elements is significantly smoother than that from the linear element types. One might thus wonder whether the N-body method actually does well but merely looks bad in comparison. To illustrate that this is not the case, in Fig. 16 we show the density field using triquadratic elements determined a posteriori from the evolved low-resolution N-body solution (top half-panel), in comparison to the self-consistent triquadratic element solution (bottom half-panel). Although the quadratic elements provide a smooth field, the N-body estimate shows a very noisy solution, where "noisy" now means that the caustics are significantly perturbed compared to the much smoother self-consistent solution. The smooth field is thus clearly not an artefact of the rendering, but demonstrates convincingly the superiority of the Lagrangian elements over the N-body solution!
Naturally, the big remaining question is whether dynamical adaptive refinement in such a self-gravitating system with mixing converges to the right solution. We show the solution using refinement in Figure 17, comparing once more against the high-resolution N-body solution with particles at the same force resolution. We only consider the triquadratic elements in this case, although the linear elements also perform reasonably well. We started from the same initial conditions as in the fixed-resolution test shown in Figure 14, but now employed the force refinement criterion with a threshold of 0.1 to dynamically split elements if required (the results using velocity refinement are however not significantly different). The solution allowing for one additional level of refinement is shown in the top panel, the one for two levels in the middle panel, and the reference N-body solution at the bottom. Rather strikingly, the solutions quickly converge to the reference solution in the exact shape and position of the caustics. Already with one additional level, the central density of the clump is comparable to the reference solution. We do not perform a more quantitative analysis of these toy problems but let the images speak for themselves, and carry out a quantitative convergence study of refinement in the next section, where we apply the Lagrangian element method to cosmological structure formation.
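The splitting logic just described can be sketched as a simple recursion over cuboid Lagrangian domains. This is an illustrative stand-in, not the actual implementation: the tuple representation of an element and the generic `criterion` callback (standing in for the force or velocity criteria of the text) are our own assumptions.

```python
import numpy as np

def maybe_refine(element, criterion, threshold=0.1, max_level=2):
    """Recursively bisect a Lagrangian element whenever a refinement
    criterion exceeds `threshold`, up to `max_level` extra levels (sketch).

    `element` is (q_lo, q_hi, level): a cuboid domain in Lagrangian space.
    `criterion(q_lo, q_hi)` is a user-supplied scalar measure of how badly
    the element needs refinement.  Returns the list of leaf elements.
    """
    q_lo, q_hi, level = element
    if level >= max_level or criterion(q_lo, q_hi) <= threshold:
        return [element]
    leaves = []
    mid = 0.5 * (q_lo + q_hi)
    d = len(q_lo)
    # Bisect along every Lagrangian axis: 2**d children per split
    for corner in range(2 ** d):
        bits = [(corner >> i) & 1 for i in range(d)]
        lo = np.where(bits, mid, q_lo)
        hi = np.where(bits, q_hi, mid)
        leaves += maybe_refine((lo, hi, level + 1), criterion, threshold, max_level)
    return leaves
```

New vertices for each child element would then be obtained by evaluating the parent's polynomial map, so that refinement itself introduces no error in the represented sheet.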
5 A First Application: Cosmological Simulation of a Warm DM Universe
We now apply our Lagrangian phase-space element method to a cosmological problem. We simulate the gravitational evolution of an L= Mpc/h cube in a universe where the dark matter is made of warm particles of mass , leading to a small-scale cutoff in the density perturbation spectrum.
The cosmological parameters we employ correspond to those favoured by the WMAP7 data release measurements (Komatsu et al., 2011). Explicitly: , , , , and spectral index . We generate our initial conditions using the MUSIC code (Hahn & Abel, 2011), with an input linear density power spectrum given by the fitting formulae of Eisenstein & Hu (1999) and a small-scale cutoff that mimics the effects of a warm dark matter particle (see Bode et al., 2001; Angulo et al., 2013, for more details).
We follow the evolution of Lagrangian patches, each of mass , created from a set of particles. Gravitational forces are computed using a PM method with a grid of cells, which implies that forces are effectively softened below kpc/h. We note that the dark matter particle mass, cosmological parameters, simulation particle mass and force resolution match those of Angulo et al. (2013).
In the following, we compare the results of simulating the above configuration with different methods: i) a standard N-body method, with particles, ii) the tetrahedral method of Hahn et al. (2013) and Angulo et al. (2013), with a total of tetrahedra, iii) trilinear elements, iv) triquadratic elements, and v) triquadratic elements with 1 or 3 levels of dynamic adaptive refinement allowed. They all employ numerically identical initial conditions and simply group particles together as vertices to define the elements.
The simulations that feature our Lagrangian element method were run with levels of recursion in their mass deposit. This means that the density field associated with a given Lagrangian patch was represented by bodies, summing up to in the entire simulation volume. The simulation that was allowed to refine up to 3 levels had created new resolution elements by . At this redshift, we find only patches that have not triggered refinement, and , , and patches at refinement levels 1, 2, and 3, respectively. We recall that at the highest refinement level, particles have a mass times smaller, i.e. .
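The recursive mass deposit mentioned above can be illustrated with a one-dimensional toy version: subdivide an element's Lagrangian domain into equal-mass pseudo-particles, map them to Eulerian space through the element's polynomial map, and deposit them onto the PM grid with cloud-in-cell weights. This sketch is ours, not the paper's code; `x_of_q` stands in for the (unspecified) element map, and the function and argument names are illustrative.

```python
import numpy as np

def deposit_element_1d(x_of_q, q_lo, q_hi, mass, grid, box, n_levels=2):
    """Deposit one Lagrangian element onto a periodic PM grid (sketch).

    The Lagrangian domain [q_lo, q_hi] is subdivided into 2**n_levels
    equal-mass pseudo-particles; each is mapped to Eulerian space through
    x_of_q and deposited with cloud-in-cell (CIC) weights.
    """
    n = 2 ** n_levels
    q = q_lo + (np.arange(n) + 0.5) * (q_hi - q_lo) / n   # pseudo-particle q's
    x = np.mod(x_of_q(q), box)                            # Eulerian positions
    m = mass / n                                          # equal-mass carriers
    ng = len(grid)
    s = x / box * ng - 0.5                                # grid coordinates
    i0 = np.floor(s).astype(int)
    w1 = s - i0                                           # CIC weight of right cell
    np.add.at(grid, np.mod(i0, ng), m * (1.0 - w1))       # unbuffered scatter-add
    np.add.at(grid, np.mod(i0 + 1, ng), m * w1)
    return grid
```

Because the pseudo-particles are generated on the fly per element, the memory cost stays at the level of the vertex data rather than the (much larger) effective particle number.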
The evolution was simulated from to using global timesteps. As could have been expected, the computational costs of the different runs differ substantially. The standard N-body run took CPU mins. The tetrahedra, linear and quadratic runs took , , and CPU mins, respectively. The run with refinement took CPU hours. Of this time, was spent on the mapping of mass carriers onto the PM grid, and on the on-the-fly generation of mass carriers.
In Fig. 18, we illustrate the performance of our adaptively-refined Lagrangian method. In the left panel, we show the final density field in Eulerian coordinates at . The high level of detail that our method provides about the topology of the large-scale structure in WDM is readily apparent. One should keep in mind that this is achieved while evolving the equivalent of only points. We highlight that the image displayed is not a smoothed rendering of our patches, but the actual density field used inside our simulation code to gravitationally evolve the dark matter fluid. In the right-hand panel, we show a map of the highest refinement level reached by different regions in Lagrangian coordinates (shown as the maximum along the line of sight). It is clear that the refinement level correlates with the Lagrangian counterparts of dense filaments and haloes, and that the material infalling into the largest haloes leads to the largest amounts of refinement.
After this first impression of the full performance of our approach, we now qualitatively compare our different runs. In Fig. 19 we show the density field at for a Mpc/h region centred on the most massive haloes present in the box. We present the same image for all the cases mentioned above, as indicated by the legend. As in the previous image, this is not a "visualization-purpose" rendering of the particle distribution, but the actual density field that sources the gravitational potential through the Poisson equation in our gravitational evolution.
It is readily visible that all methods coincide in their predicted large-scale structure: filaments, halos and voids coincide in their location and extent. However, there are also clear differences among the methods. Perhaps the most striking is the evident granularity of the standard N-body method and how filaments are broken into clumps in this case. In contrast, all of our Lagrangian methods provide an extremely smooth field where filaments are continuous 1D structures. Furthermore, some of these filaments are embedded inside larger filaments which show very clear caustics; none of these features can even be distinguished in the standard N-body case. Again, we want to emphasise that the top four panels in this figure employ the exact same number of "particles".
When comparing the runs using the new method, we can see that they differ most strongly in the internal structure of the massive haloes present in the image. The cases employing tetrahedra, trilinear and even triquadratic elements all display very concentrated and round halos (with the triquadratic elements improving slightly). This is a consequence of the biases discussed extensively throughout this paper. The situation is clearly different when adaptive refinement is enabled: the halos appear less dense and more triaxial, very much in agreement with the standard N-body case.
In Fig. 20 we quantify these differences by comparing the density power spectrum among our runs. This figure shows three groups of curves, from bottom to top: , and , respectively. Within each set, coloured lines show the results of the six cases discussed above, as indicated by the figure legend. At high redshift, we see that all methods are almost indistinguishable on large scales. On small scales, the standard N-body case strongly deviates from the rest due to the dominating effect of the primordial lattice, which drives the measured power spectrum to the Poisson shot-noise level on scales smaller than the mean interparticle separation. At , the differences between methods start to appear at h/Mpc. However, the contribution of shot noise in the N-body case is not negligible on such scales. One can nevertheless already see that the power spectrum of the Lagrangian element methods without refinement is biased high, while those with refinement are consistent with the N-body solution.
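For reference, the Poisson shot-noise level invoked here follows from discreteness alone: N equal-mass particles in a periodic box of volume L³ contribute a constant P_shot = L³/N to the measured power spectrum. A trivial sketch (our own helper names, not code from this work):

```python
import numpy as np

def poisson_shot_noise(box_size, n_particles):
    """Constant shot-noise level P_shot = L^3 / N of an N-body
    density power spectrum in a periodic box of side box_size."""
    return box_size ** 3 / n_particles

def subtract_shot_noise(pk, box_size, n_particles):
    """Naive shot-noise correction of a measured power spectrum array.
    Only meaningful where pk is well above the shot-noise floor."""
    return np.asarray(pk) - poisson_shot_noise(box_size, n_particles)
```

The element methods carry no such floor, since their density field is a continuum projection rather than a sum of point masses.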
At , we can then clearly compare the performance of the different methods. In agreement with the qualitative picture provided by Fig. 19, tetrahedral and trilinear elements lead to a factor of more power than the N-body case at . Triquadratic elements and one level of adaptive refinement significantly reduce the excess power, but are still not sufficient to resolve the phase-space sheet evolution well enough. Clearly, allowing for more levels of dynamic refinement solves this problem: when three levels are allowed, the measured spectrum agrees perfectly with the standard N-body case for Mpc/h. An accurate comparison is not possible on smaller scales, since the N-body case becomes dominated again by the Poisson noise intrinsic to its discrete nature. The adaptively refined triquadratic elements, however, probe the density field to much smaller scales with no such noise term, as seen qualitatively in Figure 19.
6 Discussion and Conclusions
Numerical simulations of the evolution of the dark matter fluid have been and will be the backbone of physical cosmology in the late Universe. They are essential to our understanding of the formation of structures in the Universe – voids, filaments and clusters – as well as of the evolution and formation of galaxies and the distribution of dark matter around our Milky Way. In addition, they are indispensable in the correct and accurate interpretation of virtually all cosmological observables, and thus, they are fundamental to almost every aspect of dark energy probes and dark matter experiments.
Methodologically, however, all of these predictions rely on only a few techniques in the mildly nonlinear regime, namely perturbative approaches and effective field theories, and on only a single technique in the highly nonlinear regime in three dimensions: the N-body method. While it may appear to the outsider that there is a wide range of numerical tools available (PM, TreePM, fast-multipole, direct N-body, AMR), these approaches only differ in how they calculate the forces between the particles, not in how they discretise the dark matter fluid (see e.g. Dehnen & Read, 2011, for a recent review and discussion of the N-body method for both collisional and collisionless dynamics). In all these methods, the collisionless Vlasov-Poisson system with cold initial conditions is discretised in terms of the characteristics of massive particles. It can be shown that as , the N-body system approaches the Vlasov-Poisson system, but naturally these simulations are far from that limit. As we have argued at length in the Introduction, there are several known shortcomings of the N-body approach, most notably spurious fragmentation in simulations with a cutoff scale in the initial perturbation spectrum, as well as two-body scattering. Recently, Hahn et al. (2013) presented a promising alternative to N-body that was shown to avoid spurious fragmentation by performing a tessellation of the N-body particles in Lagrangian space and then assigning the mass to the tetrahedral elements (whose connectivity is preserved throughout the simulation) instead of to the vertices as in the N-body approach. At the same time, the method of Hahn et al. (2013) is limited by the strong constraint that the tetrahedra must describe the mass enclosed between the particles accurately. This constraint is, however, known to be violated in regions of strong mixing, manifesting itself as a density bias in the centres of halos.
In this paper we presented a more general discretisation of the phase-space sheet in terms of "Lagrangian phase space elements", which are a generalisation of both the N-body and the tetrahedral tessellation approaches. We can summarise our results as follows:

We propose a new "Lagrangian phase space element" method in Section 2 that describes a piecewise mapping between Lagrangian space and Eulerian phase space. We specifically discuss tripolynomial maps in this paper, but other functional forms are possible. The Jacobian of the map to configuration space defines a unique density for every point in the element, and the projection of all Jacobians provides a density field in Eulerian space of arbitrary regularity, determined only by the order of the chosen polynomials. Specifically, we show that the tetrahedral approximation of Hahn et al. (2013) is a low-order version implementing only a piecewise linear transformation between the two spaces. We consider explicitly trilinear and triquadratic elements here, the latter giving a continuous density field, and both being able to represent caustics on a single-element level.

We then discuss a simple pseudo-particle discretisation of the elements. This simplifies the problem of projecting high-order elements that may contain caustics onto a grid in order to solve for the gravitational field. The pseudo-particle approach also facilitates the interfacing of our method with existing N-body codes.

The phase-space elements map from cuboid domains in Lagrangian space to Eulerian space. We show that by subdividing these domains, one can construct a Lagrangian octree-based AMR scheme that allows for dynamic adaptive refinement of the elements.

In Sections 3.1 and 3.2, we demonstrate, in a simple phase-mixing test in a fixed potential, that dynamic adaptive refinement allows us to conserve energy and represent phase mixing exactly for the fine-grained distribution function by inserting new vertices. At the same time, new quantities not accessible to an N-body simulation can be calculated; we show, e.g., the growth of the total configuration-space volume of the distribution function over time.

Next, in Section 4.1, we show that we approach second-order convergence with the quadratic elements at fixed vertex number and increasing force resolution in the plane-wave collapse test. In the following sections, we provide further one- and two-dimensional tests that demonstrate the high accuracy of our approach and the excellent performance of the Lagrangian phase-space elements compared to N-body.

Finally, in Section 5, we argue that our method is mature and efficient enough for application to cosmological structure formation problems. As an example, we simulate the evolution of the dark matter fluid with a small-scale cutoff in its initial power spectrum. We explicitly show that our method solves the artificial fragmentation problem while providing an unbiased solution for the density field in regions where the Lagrangian elements are heavily distorted.
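The central idea summarised in the first item above, density from the Jacobian of the Lagrangian-to-Eulerian map, can be made concrete in one dimension. The following sketch is ours (function names and the quadratic Lagrange basis with an explicit midpoint vertex are our own illustrative choices); it shows why a linear element has constant density while a quadratic element, whose Jacobian varies and can pass through zero, represents a caustic within a single element.

```python
def density_1d_linear(q0, q1, x0, x1, rho_lagrangian=1.0):
    """Density of a 1D linear element: the Jacobian dx/dq is constant,
    and mass conservation gives rho = rho_L / |dx/dq|."""
    jac = (x1 - x0) / (q1 - q0)
    return rho_lagrangian / abs(jac)

def density_1d_quadratic(q, q0, q1, x_vertices, rho_lagrangian=1.0):
    """Density at Lagrangian coordinate q for a 1D quadratic element with
    vertex positions x_vertices = (x at q0, x at the midpoint, x at q1).

    Uses quadratic Lagrange interpolation on the unit coordinate t; the
    Jacobian varies across the element and can vanish: that zero is a
    caustic, where the projected density formally diverges."""
    x0, xm, x1 = x_vertices
    t = (q - q0) / (q1 - q0)                       # unit Lagrangian coordinate
    # derivative of the quadratic Lagrange basis (nodes t = 0, 1/2, 1)
    dxdt = (4 * t - 3) * x0 + (4 - 8 * t) * xm + (4 * t - 1) * x1
    jac = dxdt / (q1 - q0)
    return rho_lagrangian / max(abs(jac), 1e-12)   # capped at the caustic
```

In the actual method the same construction is done per axis of a tripolynomial map in three dimensions, with the determinant of the full 3x3 Jacobian taking the place of dx/dq.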
Clearly, this new approach will enable us to follow the full fine-grained distribution function throughout gravitational collapse in the formation history of a halo. We expect that knowledge of the distribution of dark matter in six-dimensional phase space will provide new insights into virialisation as well as relaxation processes in such collisionless self-gravitating systems.
Acknowledgements
We thank Tom Abel for the spark to consider adaptive refinement of tetrahedra, and him and Stéphane Colombi for organising various workshops on the Vlasov equation. We thank Tom Abel, Ralf Kaehler, Thierry Sousbie, Stéphane Colombi, Romain Teyssier, Francesco Miniati, Volker Springel, Simon White, Devon Powell and, last but not least, Sergei Shandarin for various discussions, inspirations and/or suggestions along the way. We are indebted to Volker Springel for his work on Gadget, on which we built our implementation.
O.H. acknowledges support from the Swiss National Science Foundation (SNSF) through the Ambizione fellowship.
References
 Abel et al. (2012) Abel T., Hahn O., Kaehler R., 2012, MNRAS, 427, 61
 Angulo et al. (2014) Angulo R. E., Chen R., Hilbert S., Abel T., 2014, MNRAS, 444, 2925
 Angulo et al. (2013) Angulo R. E., Hahn O., Abel T., 2013, MNRAS, 434, 1756
 Angulo et al. (2012) Angulo R. E., Springel V., White S. D. M., Jenkins A., Baugh C. M., Frenk C. S., 2012, MNRAS, 426, 2046
 Arnold et al. (1982) Arnold V. I., Shandarin S. F., Zeldovich I. B., 1982, Geophysical and Astrophysical Fluid Dynamics, 20, 111
 Bagla & Khandai (2009) Bagla J. S., Khandai N., 2009, MNRAS, 396, 2211
 Baushev (2015) Baushev A. N., 2015, Astroparticle Physics, 62, 47
 Bertschinger (1998) Bertschinger E., 1998, ARA&A, 36, 599
 Bode et al. (2001) Bode P., Ostriker J. P., Turok N., 2001, ApJ, 556, 93
 Carrasco et al. (2012) Carrasco J. J. M., Hertzberg M. P., Senatore L., 2012, Journal of High Energy Physics, 9, 82
 Centrella & Melott (1983) Centrella J., Melott A. L., 1983, Nature, 305, 196
 Centrella et al. (1988) Centrella J. M., Gallagher, III J. S., Melott A. L., Bushouse H. A., 1988, ApJ, 333, 24
 Colombi (2015) Colombi S., 2015, MNRAS, 446, 2902
 Colombi & Touma (2014) Colombi S., Touma J., 2014, MNRAS, 441, 2414
 Davis et al. (1985) Davis M., Efstathiou G., Frenk C. S., White S. D. M., 1985, ApJ, 292, 371
 Dehnen & Read (2011) Dehnen W., Read J. I., 2011, European Physical Journal Plus, 126, 55
 Diemand et al. (2008) Diemand J., Kuhlen M., Madau P., Zemp M., Moore B., Potter D., Stadel J., 2008, Nature, 454, 735
 Diemand et al. (2004) Diemand J., Moore B., Stadel J., 2004, MNRAS, 353, 624
 Efstathiou et al. (1985) Efstathiou G., Davis M., White S. D. M., Frenk C. S., 1985, ApJS, 57, 241
 Eisenstein & Hu (1999) Eisenstein D. J., Hu W., 1999, ApJ, 511, 5
 Frenk & White (2012) Frenk C. S., White S. D. M., 2012, Annalen der Physik, 524, 507
 Gao et al. (2012) Gao L., Navarro J. F., Frenk C. S., Jenkins A., Springel V., White S. D. M., 2012, MNRAS, 425, 2169
 Hahn & Abel (2011) Hahn O., Abel T., 2011, MNRAS, 415, 2101
 Hahn et al. (2013) Hahn O., Abel T., Kaehler R., 2013, MNRAS, 434, 1171
 Heitmann et al. (2015) Heitmann K. et al., 2015, ApJS, 219, 34
 Hertzberg (2014) Hertzberg M. P., 2014, Phys. Rev. D, 89, 043521
 Hidding et al. (2014) Hidding J., Shandarin S. F., van de Weygaert R., 2014, MNRAS, 437, 3442
 Hockney & Eastwood (1981) Hockney R. W., Eastwood J. W., 1981, Computer Simulation Using Particles. New York: McGraw-Hill
 Iannuzzi & Dolag (2011) Iannuzzi F., Dolag K., 2011, MNRAS, 417, 2846
 Ishiyama et al. (2015) Ishiyama T., Enoki M., Kobayashi M. A. R., Makiya R., Nagashima M., Oogi T., 2015, PASJ, 67, 61
 Komatsu et al. (2011) Komatsu E. et al., 2011, ApJS, 192, 18
 Kuhlen et al. (2012) Kuhlen M., Vogelsberger M., Angulo R., 2012, Physics of the Dark Universe, 1, 50
 Löhner et al. (1987) Löhner R., Morgan K., Peraire J., Vahdati M., 1987, International Journal for Numerical Methods in Fluids, 7, 1093
 Ludlow et al. (2014) Ludlow A. D., Navarro J. F., Angulo R. E., BoylanKolchin M., Springel V., Frenk C., White S. D. M., 2014, MNRAS, 441, 378
 Ludlow et al. (2013) Ludlow A. D. et al., 2013, MNRAS, 432, 1103
 Marcos (2008) Marcos B., 2008, Communications in Nonlinear Science and Numerical Simulations, 13, 119
 Melott (2007) Melott A. L., 2007, arXiv:0709.0745 [astro-ph]
 Melott & Shandarin (1989) Melott A. L., Shandarin S. F., 1989, ApJ, 343, 26
 Melott et al. (1997) Melott A. L., Shandarin S. F., Splinter R. J., Suto Y., 1997, ApJ, 479, L79
 Navarro et al. (1996) Navarro J. F., Frenk C. S., White S. D. M., 1996, ApJ, 462, 563
 Navarro et al. (1997) Navarro J. F., Frenk C. S., White S. D. M., 1997, ApJ, 490, 493
 Pontzen & Governato (2013) Pontzen A., Governato F., 2013, MNRAS, 430, 121
 Powell & Abel (2015) Powell D., Abel T., 2015, Journal of Computational Physics, 297, 340
 Schaller et al. (2014) Schaller M., Becker C., Ruchayskiy O., Boyarsky A., Shaposhnikov M., 2014, MNRAS, 442, 3073
 Shandarin et al. (2012) Shandarin S., Habib S., Heitmann K., 2012, Phys. Rev. D, 85, 083005
 Skillman et al. (2014) Skillman S. W., Warren M. S., Turk M. J., Wechsler R. H., Holz D. E., Sutter P. M., 2014, arXiv:1407.2600
 Splinter et al. (1998) Splinter R. J., Melott A. L., Shandarin S. F., Suto Y., 1998, ApJ, 497, 38
 Springel (2005) Springel V., 2005, MNRAS, 364, 1105
 Springel et al. (2008) Springel V. et al., 2008, MNRAS, 391, 1685
 Springel et al. (2005) Springel V. et al., 2005, Nature, 435, 629
 Uhlemann et al. (2014) Uhlemann C., Kopp M., Haugg T., 2014, Phys. Rev. D, 90, 023517
 Valinia et al. (1997) Valinia A., Shapiro P. R., Martel H., Vishniac E. T., 1997, ApJ, 479, 46
 Vogelsberger & White (2011) Vogelsberger M., White S. D. M., 2011, MNRAS, 413, 1419
 Vogelsberger et al. (2008) Vogelsberger M., White S. D. M., Helmi A., Springel V., 2008, MNRAS, 385, 236
 Wang et al. (2007) Wang H. Y., Mo H. J., Jing Y. P., 2007, MNRAS, 375, 633
 White (1996) White S. D. M., 1996, in Cosmology and Large-Scale Structure, Schaefer R., Silk J., Spiro M., Zinn-Justin J., eds., Dordrecht: Elsevier, astro-ph/9410043
 White & Vogelsberger (2009) White S. D. M., Vogelsberger M., 2009, MNRAS, 392, 281
 Widrow & Kaiser (1993) Widrow L. M., Kaiser N., 1993, ApJ, 416, L71
 Yoshida et al. (2003) Yoshida N., Sugiyama N., Hernquist L., 2003, MNRAS, 344, 481
 Yoshikawa et al. (2013) Yoshikawa K., Yoshida N., Umemura M., 2013, ApJ, 762, 116
 Zeldovich (1970) Zeldovich Y. B., 1970, A&A, 5, 84