Moving mesh cosmology: numerical techniques and global statistics
Abstract
We present the first hydrodynamical simulations of structure formation using the new moving mesh code AREPO and compare the results with GADGET simulations based on a traditional smoothed particle hydrodynamics (SPH) technique. The two codes share the same TreePM gravity solver and include identical subresolution physics for the treatment of star formation, but employ different methods to solve the equations of hydrodynamics. This allows us to assess the impact of hydrosolver uncertainties on the results of cosmological studies of galaxy formation. In this paper, we focus on predictions for global baryon statistics, such as the cosmic star formation rate density, after we introduce our simulation suite and numerical methods. Properties of individual galaxies and haloes are examined by Keres et al. (2011), while a third paper by Sijacki et al. (2011) uses idealised simulations to analyse in more detail the differences between the hydrodynamical schemes. We find that the global baryon statistics differ significantly between the two simulation approaches. AREPO shows systematically higher star formation rates at late times, lower mean temperatures averaged over the simulation volume, and different gas mass fractions in characteristic phases of the intergalactic medium, in particular a reduced amount of hot gas. Although both hydrodynamical codes use the same implementation of cooling and yield an identical dark matter halo mass function, more gas cools out of haloes in AREPO compared with GADGET towards low redshifts, which results in corresponding differences in the latetime star formation rates of galaxies. We show that this difference is caused by a higher heating rate with SPH in the outer parts of haloes, owing to viscous dissipation of SPH’s inherent sonic velocity noise and SPH’s efficient damping of subsonic turbulence injected in the halo infall region, and because of a higher efficiency of gas stripping in AREPO. As a result of such differences, AREPO leads also to more disklike morphologies in the moving mesh calculation compared to GADGET. Our results hence indicate that inaccuracies in hydrodynamic solvers can lead to comparatively large systematic differences even at the level of predictions for the global state of baryons in the universe.
keywords:
cosmology: galaxy formation – cosmology: dark matter – methods: numerical1 Introduction
Current theories of galaxy formation are based on the view that the dominant mass contribution to the Universe is in the form of cold dark matter (DM), which clusters under gravity and builds up the backbone of all cosmic structure. Together with the discovery of a large dark energy (DE) component (Riess et al., 1999; Perlmutter et al., 1999), this led to the emergence of the concordance cold dark matter (CDM) cosmogony. The third component to the energy density in this scenario, the baryons, condense via radiative cooling at the centres of a population of hierarchically merging DM haloes, forming galaxies (Silk, 1977; Rees & Ostriker, 1977; White & Rees, 1978; Blumenthal et al., 1984). Although the precise physical nature of DM and DE is not yet known (but see Bertone et al., 2005, for a review of possible DM candidates), largescale predictions of this CDM theory show good agreement with a wide range of observations, among them cosmic microwave background (CMB) fluctuations (Dunkley et al., 2009), largescale clustering of galaxies in the lowredshift universe (Percival et al., 2010), and the redshift power spectrum probed by the Lyman forest (Viel et al., 2009). It is however crucial to further constrain and test the CDM paradigm, especially on the smaller scales of galaxies, which remains a theoretical and observational frontier.
Structure formation is a highly nonlinear process and only the early stages can be described analytically using linear theory (Zel’Dovich, 1970). However, some statistical properties of the evolved DM field, in particular the halo mass function, can also be predicted analytically by combining linear theory and spherical collapse models, through the PressSchechter formalism (Press & Schechter, 1974) and subsequent extensions (e.g. Bond et al., 1991). These predictions ultimately rely on calibration through more accurate calculations, which typically can only be done numerically. Indeed, while PressSchechter theory has been quite successful in describing the abundances of haloes, once computer models reached high precision it became clear that the original analytic treatment was not sufficiently accurate for the quantitative work required in today’s era of precision cosmology (e.g. Jenkins et al., 2001).
Over the last two decades, numerical simulations have played a key role in guiding our knowledge of structure and galaxy formation. This is especially evident for the CDM component, where the relevant calculations have reached a high level of sophistication (e.g. Springel et al., 2005, 2008; Diemand et al., 2008; BoylanKolchin et al., 2010; Klypin et al., 2010). Many important insights have been gained from purely collisionless simulations, for example the nearly universal density profiles of DM haloes (Navarro et al., 1996, 1997, 2010), assembly bias (Gao et al., 2005), and the internal structure of haloes on small scales (Vogelsberger et al., 2009; Vogelsberger & White, 2011a). An important reason for today’s reliability of CDM predictions is that the initial conditions are unambiguously specified in the CDM model, with parameters that are tightly constrained by CMB observations (Komatsu et al., 2011). Moreover, the computational problem is wellposed and comparatively simple, with equations of motion that for DM involve only gravity. Efficient new algorithms and the rapid growth of computing power over the few last decades have allowed ever more detailed theoretical predictions for the DM distribution. It is worthwhile, however, to recall that initially such collisionless DM simulations suffered from numerical artifacts, leading to issues like the infamous “overmerging problem”, which resulted in featureless and smooth DM haloes primarily due to insufficient mass, force and timeresolution (Moore et al., 1998, 1999).
To understand properties of the observable universe one needs to relate the dark matter distribution to that of the baryonic material. This can be done in a number of different ways: (i) using socalled semianalytical modelling of galaxy formation (e.g. White & Frenk, 1991; Kauffmann et al., 1993; Baugh, 2006; Croton et al., 2006; Benson, 2010a, b; Guo et al., 2011) or the closely related approach of halo occupation distributions (Benson et al., 2000), or (ii) using direct hydrodynamical simulations (e.g Hernquist & Katz, 1989; Katz et al., 1992; Bryan & Norman, 1998; Davé et al., 1999; Springel et al., 2005; Governato et al., 2010). Approach (i) allows a fast exploration of a large parameter space of the underlying coarse description of galaxy formation physics, whereas (ii) makes possible the calculation of a selfconsistent model that requires far fewer assumptions about the gas dynamics. Nevertheless, both methods share the common problem that complicated physics on very small scales, like star formation and associated feedback processes, need to be accounted for in a rather crude way. This propagates into uncertainties in the interpretation of simulation results for the hydrodynamic sector. One possible strategy to get better control over this problem is to calibrate subresolution treatments in semianalytic models and hydrodynamical simulations observationally (e.g. Schaye et al., 2010; Guo et al., 2011). In addition, numerical artifacts in hydrodynamical simulation techniques need to be better understood, hopefully allowing one to reduce or completely eliminate them.
Stateoftheart cosmological hydrodynamical simulations include a prescription for star formation (e.g. Springel & Hernquist, 2003a), radiative cooling (Katz et al., 1996; Smith et al., 2008), chemical enrichment (e.g. Wiersma et al., 2009) and various feedback processes relevant for low and highmass systems (e.g. Sijacki et al., 2007; Scannapieco et al., 2008). Furthermore, some simulation codes can also include magnetic fields (Collins et al., 2010; Dolag et al., 2005), radiative transfer (Cantalupo & Porciani, 2011; Petkova & Springel, 2010), cosmic ray physics (Jubelgas et al., 2008), thermal conduction (Jubelgas et al., 2004; Dolag et al., 2004), or black hole physics (Di Matteo et al., 2005; Springel et al., 2005; Di Matteo et al., 2008).
Hydrodynamical simulations have been successfully used to study the Lymanforest (e.g. Hernquist et al., 1996; Katz et al., 1996; Viel et al., 2005) and the detailed physics of the intracluster medium, where cluster scaling relations were obtained that agreed with Xray observations (Puchwein et al., 2008). There have also been numerous studies that focused on the formation of a representative galaxy population (e.g. Pearce et al., 1999; Murali et al., 2002; Weinberg et al., 2004; Nagamine et al., 2005; Crain et al., 2009). While some successes have been achieved here as well, a general finding of these simulations is that it appears to be difficult to reproduce the observed shallow slope of the faintend of the galaxy luminosity function and the observed morphological mix of galaxies, with its high abundance of disklike galaxies. Although there has been some considerable progress over the last few years on the latter issue (e.g. Robertson et al., 2004; Governato et al., 2007; Scannapieco et al., 2008; Agertz et al., 2011; Governato et al., 2010; Guedes et al., 2011), one of the outstanding problems is to produce a statistical sample of galaxies that agrees reasonably well with observations and reproduces the Hubble sequence. But modifications in the detailed feedback and star formation prescriptions proved to be important and successful in forming more realistic late type spiral galaxies. We note that cosmological hydrodynamic simulations have also made it clear that a proper inclusion of the baryonic physics can lead to interesting back reactions on the DM distribution (Duffy et al., 2010; D’Onghia et al., 2010), potentially eliminating or reducing the central dark matter cusp (Governato et al., 2010), or forming a dark disk (Read et al., 2008). It is hence clear that it is ultimately not sufficient to work with darkmatter only simulations.
Most hydrodynamical simulations of galaxy formation carried out thus far have used the smoothed particle hydrodynamics (SPH) technique (Lucy, 1977; Gingold & Monaghan, 1977; Monaghan, 1992, 2005), where the gas is discretised into a set of particles for which appropriate equations of motion can be derived. The SPH method is wellsuited for cosmological applications owing to its pseudoLagrangian character, which automatically brings resolution elements to regions where they are needed most, i.e. collapsing objects like clusters and galaxies. Also, the conservation properties of SPH in terms of simultaneous conservation of energy, momentum, mass, entropy and angular momentum are excellent. A further convenient feature is that a particlebased gravity solver (needed for the DM component anyway) can be readily applied to SPH. These properties have made the method also popular for applications other than cosmic structure growth, for example for isolated merger simulations (Barnes & Hernquist, 1992, 1991, 1996; Mihos & Hernquist, 1994, 1996; Naab & Burkert, 2003; Cox et al., 2006).
However, different methods for solving the equations of hydrodynamics in a cosmological context have been employed as well, most of which are descendants of classic Eulerian methods (Stone & Norman, 1992) on regular meshes. The fixed Cartesian grids originally available in the corresponding codes are insufficient to capture the large dynamic ranges encountered in galaxy formation, but finite volume schemes incorporated in modern adaptive mesh refinement (AMR) codes (Berger & Colella, 1989; Teyssier, 2002; O’Shea et al., 2004) can alleviate this problem. Interestingly, these meshbased methods have led in some cases to significantly different results compared with SPH. A prominent example of these discrepancies is highlighted in the Santa Barbara cluster comparison project (Frenk et al., 1999), where it was found that the entropy profiles of clusters are significantly and systematically different between AMR and SPH codes, the former yielding a large entropy core which is absent in the SPH calculations. Only recently has some progress been made in identifying the cause for this difference (Mitchell et al., 2009a), but the issue is still not fully understood; hence we return to it in one of our companion papers (Sijacki et al., 2011).
It thus appears that there are at least two important challenges in the field of cosmological hydrodynamics simulations. One lies in an adequate treatment of all relevant physics of galaxy formation, in particular with respect to the reliability of the parameterisations of subresolution physics, and the other having to do with the accuracy and efficiency of the underlying hydrodynamics solver. It is of course crucial to strive for a more faithful treatment of the physics in the simulations, but it should also be evident that a proper modelling of the physics relies on a correct solution of the basic hydrodynamical equations in the first place. It is therefore problematic to tune the subresolution physics without first verifying in detail the quality of the hydro solver, because this risks incorrectly absorbing deficiencies of a numerical technique into distorted physics models. Here, in this initial study we shall primarily be concerned with an assessment of the extent to which uncertainties in numerical methodology are reflected in the predicted galaxy properties, for a fixed physics model. In future works, we will explore the consequences of various treatments of feedback effects, which are known to be crucial for the galaxy formation problem (Scannapieco et al., 2011).
We remark that it is not obvious whether SPH or AMR is more accurate in all simulation regimes, as both methods are known to have shortcomings. For example, SPH suffers from relatively poor shock resolution, noise on the scale of the smoothing kernel, and loworder accuracy for the treatment of contact discontinuities. Furthermore, some hydrodynamic instabilities like the KelvinHelmholtz instability can be suppressed in SPH (Agertz et al., 2007). Recently, Bauer & Springel (2011) also showed that conventional implementations of SPH do not properly properly subsonic turbulence, which is generically present in cosmological haloes. Astrophysical Eulerian methods on the other hand (usually realised as AMR finitevolume schemes) may suffer from overmixing due to advection errors in the presence of bulk flows. One of the principal differences between SPH and AMR lies in their handling of mixing at the level of individual fluid elements. This is absent in SPH by construction (unless added in crudely by hand somehow), while in AMR it occurs implicitly through the averaging of the evolved Riemann solutions over the scale of the gridcells at the end of each timestep. It is not always clear which scheme yields a more accurate result (Trac et al., 2007; Mitchell et al., 2009b).
Another issue with classical AMR codes is that their handling of the discrete equations of motion can lead to errors with the fluid is moving rapidly across the mesh (Springel, 2010a, hereafter S10). Because the truncation error of Eulerian codes depends on the fluid velocity relative to the grid, the results can thus be sensitive to the presence of bulk velocities (see also Tasker et al., 2008; Wadsley et al., 2008). Finally, a further challenge for AMR schemes in cosmic structure formation arises from the need to accurately follow the gravitational growth of even very small structures. The meshbased Poisson solvers typically employed by AMR codes have been shown to lack sufficient smallscale force accuracy and to produce too few low mass haloes (O’Shea et al., 2005; Heitmann et al., 2008) potentially corrupting the solution on even wellresolved scales. While this can be addressed with more aggressive refinement strategies, the discontinuous changes in gravity resolution brought about by such refinements introduce nonHamiltonian perturbations into the dark matter dynamics which are in principle undesirable.
Recently, S10 introduced a new movingmesh approach as embodied in the AREPO code. The principal goal is similar to the earlier implementations of Gnedin (1995) and Pen (1998), but these movingmesh algorithms suffered from grid distortions which limited their applicability. AREPO does not use coordinate transformations like previous moving mesh codes in cosmology, but instead employs an unstructured Voronoi tessellation of the computational domain. The meshgenerating points of this tessellation are allowed to move freely, offering significant flexibility for representing the geometry of the flow. As discussed in detail in S10, this technique avoids several of the weaknesses of SPH and AMR schemes while it retains their most important advantages. For example, if the mesh motion is tied to the gas flow, the results are Galileaninvariant (like in SPH), while at the same time a high accuracy for shocks and contact discontinuities is achieved (like in Eulerian schemes).
The aim of this paper is to compare the novel cosmological hydrodynamics code AREPO with the widely used SPH code GADGET (Springel, 2005). In this first paper of a series we give an overview of the numerical techniques used for our galaxy formation simulations, introduce our simulation set, present an analysis of the global baryon characteristics, and evaluate the performance and efficiency of the new simulation scheme. In one companion paper (Keres et al., 2011, hereafter Paper II), we discuss the properties and statistics of individual galaxies and haloes in the simulations. A further companion paper (Sijacki et al., 2011, hereafter Paper III) analyses idealised test problems to highlight various differences of the two simulation schemes and to elucidate how they affect galaxy formation.
The outline of this paper is as follows. In Section 2, we provide a discussion of the initial conditions and numerical details which are particularly important for the goal of our comparison project. There we also discuss some modifications of the AREPO code that were adopted during the course of this work. Section 3 presents first results and discusses the largescale structure and global statistics of the baryon content in the simulations at different resolutions. We turn to an analysis of the origin of the cooling difference we find between the codes in Section 4. In Section 5 we discuss some generic issues and problems of SPH. Finally, we give a summary of our results and our conclusions in Section 6. In an Appendix, we provide data on the mesh geometry, analyse the performance of the codes in different parts of the calculations and present scaling tests.
2 Numerical methods and simulation set
2.1 Implemented physics
Our GADGET and AREPO simulations follow the same physics, consisting of a collisionless dark matter fluid and an ideal baryonic gas. Both components act as sources of gravity and are evolved in a Newtonian approximation on top of an expanding FriedmanLemaitre background model. We describe the gas dynamics with the ordinary inviscid Euler equations, augmented with additional terms that account for “extra physics” related to radiative processes and star formation. In particular, we account for optically thin radiative cooling of a primordial mixture of helium and hydrogen, with a hydrogen mass fraction of . We also assume a spatially uniform, timedependent ionising UV background with an amplitude and time evolution according to FaucherGiguère et al. (2009).
As a result of cooling, gas can collapse to high density and turn into stars. Both simulation codes describe this process with the simple star formation and supernova feedback model introduced in Springel & Hernquist (2003a), which has been used in many SPH simulations in recent years. In this approach, every sufficiently dense gas particle/cell is treated as a representative region of the interstellar medium (ISM) and is assumed to exhibit a multiphase structure consisting of cold and hot gas in pressure equilibrium. Stars form out of the cold gas and return energy back into the medium though supernova explosions. This energy feedback pressurises the multiphase medium, the effect of which is described in terms of an effective equation of state that is imposed onto the gas once it has overcome the density threshold for star formation. The value of the density threshold we use in this work is , set such that isolated disk galaxies at are characterised with a relation between disk surface gas density and star formation density that agrees with the observed Kennicutt law (Kennicutt, 1998). To reach these densities gas needs to fall into dark matter haloes and dissipate energy via radiative cooling to eventually form galaxies (White & Rees, 1978; White & Frenk, 1991). Once the gas density is higher than the threshold value, particles/cells are eligible to form stars and can be converted into collisionless stellar particles. We treat this conversion as a stochastic process that generates one generation of stellar particles per hydrodynamic resolution element (SPH particle or Voronoi cell, respectively) at an average rate equal to the star formation rate predicted by the subresolution multiphase model.
We stress again that for the purposes of our study it is crucial to have an identical implementation of this extra physics in both codes in order to allow a straightforward comparison of the two hydrodynamical schemes. Fortunately, the quasiLagrangian nature of AREPO typically allows an easy adoption of methods originally developed for particlebased schemes like SPH. We could hence adopt most of the “extra physics” implemented in GADGET in a largely unmodified form in AREPO, helping to ensure that possible implementation differences for this physics do not affect our conclusions. We also emphasise that in our present context it is preferable to employ a relatively simple subresolution model like that of Springel & Hernquist (2003a) in order to facilitate a clear assessment of the differences between hydro solvers. In particular, for now we have not included the effects of strong feedback capable of inducing galactic winds and outflows. We are thus not attempting to solve the galaxy formation problem at this stage, but are instead concerned with identifying and understanding issues related to the treatment of the hydrodynamics.
2.2 Simulation Codes
In the following two subsections we highlight the most relevant features and parameter settings of the codes we used, as well as any particular changes we made for this project. We note that full details are presented in substantial depth in the corresponding code papers, so in the interest of brevity we here restrict ourselves only to those aspects directly pertinent to our study. We are also aided by the fact that the AREPO and GADGET codes share the same gravity solver, something that was not the case in previous comparison studies of SPH and AMR (e.g. Regan et al., 2007). Consequently, the evolution of the collisionless DM component is identical in both codes in the limit of vanishing hydrodynamical forces. This similar handling of selfgravity is important for isolating differences caused primarily by the hydro solvers.
2.2.1 Gadget
GADGET is a widely used and welltested SPHcode which employs a formulation of SPH that simultaneously conserves energy and entropy despite the use of fully adaptive smoothing lengths (Springel & Hernquist, 2002). The gravity calculation is split into short and longrange components, where shortrange forces are calculated with an hierarchical octtree algorithm (Barnes & Hut, 1986; Hernquist, 1987) and the longrange forces are evaluated with a particle mesh (PM) method (e.g. Hockney & Eastwood, 1981). Our GADGET runs are based on a default setting of the numerical SPH parameters, using 32 neighbours in smoothed estimates and an artificial viscosity parameter of , combined with Balsara’s switch (Balsara, 1995) to reduce the viscosity in the presence of strong shear.
2.2.2 Arepo
AREPO is a secondorder accurate finite volume method that solves the Euler equations using piecewise linear reconstruction and a calculation of hydrodynamical fluxes at each cell interface with an exact Riemann solver. The basic solution strategy of the code is that of the wellknown MUSCLHancock scheme. What makes AREPO unusual is that it employs an unstructured mesh based on a Voronoi tessellation using a set of meshgenerating points. Furthermore, the mesh is allowed to move freely as a function of time. In our runs, we tie the motion of the meshgenerating points to the hydrodynamical flow, which gives the scheme a quasiLagrangian character and makes the mesh automatically adaptive. As S10 demonstrated, such an approach offers a number of advantages compared with ordinary Eulerian or Lagrangian codes, including a reduction of numerical diffusivity and advection errors and an improved handling of contact discontinuities. In the following, we discuss some details and modifications of AREPO relevant for our study.
(i) Mesh regularisation: It is desirable to have a Voronoi mesh geometry where the geometric centre of each Voronoi cell lies reasonably close to the meshgenerating point of that cell. This reduces the size of errors in the linear reconstruction step of the MUSCLHancock scheme and also limits the rate at which mesh faces turn their orientation during the mesh motion. AREPO therefore incorporates a method to regularise the mesh, as described in S10. This scheme uses a modified Lloyd algorithm to drift the mesh a bit towards its centroidal configuration at each time step. To this end the motion of the meshgenerating points contains an additional velocity component for certain cells, designed to move a meshgenerating point closer towards the geometric centre of its cell. The default setup of AREPO uses the spatial offset between the actual location of a vertex and its cell’s geometric centre in units of the cell’s size to decide whether or not this corrective motion should be added to the cell’s vertex velocity.
However, a number of test simulations revealed that this regularisation scheme can introduce some unwanted side effects. Specifically, systematic differences in the average mass of gaseous cells can develop between the starforming gas and the lower density material in the surrounding halo, with the former tending to become systematically more massive than the latter. That such a tendency exists in principle is to be expected because the Lloyd algorithm would ultimately converge to a homogeneous Voronoi mesh with cells of equal volume if it applied to every cell centre with full intensity all the time. The mesh regularisation motions in the default scheme have hence the tendency to move the mesh away from the densest regions (but leaving the gas there – in AREPO, the gas motion and the mesh motion are independent), such that the mass per cell will tend to increase in the densest regions. This is somewhat undesirable because ideally we want to have the highest mass resolution in the central parts of galaxies and not in the surrounding low density gas.
There are at least two solutions to this problem (besides disabling the regularisation scheme altogether). One is to simply make the settings of the standard regularisation less aggressive, thereby reducing the systematic effects described above. Another possibility is to change the criterion that decides when such a mesh correction motion should be applied. We have found that the simple criterion originally implemented in AREPO, based on the displacement of a meshgenerating point from the geometric centre of its cell, is prone to invoke mesh regularisation drifts even when they are not needed. This happens especially in regions with strong density gradients, where such displacements naturally occur even if the mesh geometry is acceptable. As an alternative criterion we have therefore considered the maximum opening angle under which a cell face is seen from the meshgenerating point. This identifies problematic cell geometries more directly, which are in fact those where a point lies close to a wall, or equivalently, where this opening angle is large. We find that this opening angle criterion is more tolerant to rapid changes in spatial resolution, and is hence less prone to triggering unwanted meshcorrection motions in the presence of large density gradients. Because of this it virtually eliminates the mass segregation effect mentioned above.
For definiteness, in the simulations presented here, we invoke meshregularisation motions if the maximum face angle of a Voronoi cell (defined as the maximum of , where is the area of a face and its distance to the vertex) is larger than . In this case we add a velocity component up to half the sound speed of the corresponding cell to the vertex velocity, which moves the meshgenerating point closer towards the geometric centre of the Voronoi cell. Ideally, this scheme should then guarantee reasonably regular cells where the bulk of all cells will have a maximum face angle less than . Our tests confirm that this is the case, while at the same time the mass bias between star forming high density cells and the lower density cells in the surrounding halo is significantly reduced. The face angle scheme applies the meshcorrection motions more rarely than the original approach but still frequently enough to maintain a regular mesh. In the Appendix, we discuss some measurements of the mesh statistics, which also show that this new regularisation scheme still ensures that the offsets between meshgenerating points and the geometric centres of the corresponding cells remain small, as is desired to minimise errors in the linear reconstruction step.
(ii) Refinement and derefinement: AREPO is quasiLagrangian in the sense that the vertices of the Voronoi mesh follow the flow of the fluid such that an approximately constant mass resolution is automatically achieved. The code is not purely Lagrangian, however, because mass can be exchanged between cells in a manner consistent with the equations of motion, in order to prevent the mesh from becoming highly distorted. Moreover, the code can also make use of re and derefinement of individual cells in order to improve the local resolution beyond that offered by the dynamical motion of the mesh or to improve efficiency when high resolution is no longer required in certain regions. The method employed by AREPO is more general than that used by standard AMR methods, which typically employ subgrids to refine a certain section of the parent grid, thereby creating a hierarchy of overlapping grid patches. In AREPO, we are not required to use subgrids, but can split or merge individual cells, such that a single global mesh with a smooth transition from low to highresolution parts results. The criteria for when a change of the local resolution should be introduced are very flexible, just as in the AMR technique. Indeed, because the initial distribution of mesh generating points and the manner in which they are updated in time are arbitrary, AREPO offers the capability of running in an AMR mode.
In our cosmological galaxy formation simulations it is desirable to maintain a roughly constant mass resolution. This is because in runs with star formation whole gas cells can turn into collisionless stellar particles, and it is best to create them with approximately the same mass to avoid potential twobody heating effects among these star particles. A relatively homogeneous mass resolution in the gas phase is also warranted to guarantee a more proper comparison to the SPH runs, where a fixed mass resolution is present by construction. (We note, however, that this fixed mass resolution in SPH is a consequence of this method’s inability to correctly represent the flow on small scales, as we discuss in Section 5.3.)
We have therefore introduced a new re/derefinement scheme in AREPO that ensures that the mass resolution in the gas and the spawned star particles never deviates too much from the value of the SPH particle mass in the corresponding GADGET simulation. Specifically, we “force” cells to have a mass in the range , where we set the target gas mass equal to the mean cell mass assuming a homogeneous gas distribution in the simulation volume with equal volume cells. This corresponds then exactly to the mass of an individual SPH particle in the GADGET calculation at the same particle/cell number. We allow only quite “roundish” cells to be refined, specifically we only refine a cell if the maximum face angle is smaller than . This protects against introducing large local errors by splitting very irregular cells (which are usually cells that have just been split in the previous timestep). As a result of this constraint our simulations may contain at any given time a few cells which are more massive than . We have also done runs where only cells with a sufficiently shallow temperature gradient across them were allowed to be derefined, but this did not make any significant difference to our results.
We also slightly modified the star formation implementation to take this re/derefinement scheme into account. Specifically, if a star particle is formed in a cell with mass less than , it inherits the full mass and the cell is removed. Otherwise, a star particle is created with a mass equal to , leaving the rest of the mass in the (surviving) cell. Cells that do not contain at least do not produce stellar particles, which prevents very low mass collisionless particles from being formed. We note that such low mass cells are exceedingly rare, because cells with less than are normally all removed during the derefinement process. However, since two neighbouring cells will never be dissolved in the same timestep (see the discussion in S10), a cell can in rare cases briefly scatter below and potentially also attempt to spawn a stellar particle during this time.
This modification of the star formation implementation guarantees that star particle masses are bounded by the interval . Our GADGET and AREPO simulations both use only one generation of stellar particles, which implies that the total number of baryonic resolution elements in the GADGET simulations is strictly conserved, while in AREPO it can fluctuate slightly around the initial value.
(iii) Dual entropy formalism: As discussed in S10, finite volume hydro solvers can suffer from spurious heating in highly supersonic, cold parts of the flow, where the kinetic energy dominates over the thermal energy. Especially at high redshifts in cosmological simulations, this can lead to an incorrect temperature evolution in lowdensity gas as a result of discretisation errors in the advection of the kinetic energy. Owing to its quasiLagrangian character AREPO is somewhat less prone to this effect than fixed mesh techniques, but it can still be present at some level. To circumvent this problem, AREPO can optionally force the entropy to be conserved instead of the energy during a timestep. In this case, no spurious heating will be introduced. However, entropy is conserved only outside of shocks in adiabatic parts of the flow, so a criterion is required to decide on a cellbycell basis whether energy or entropy conservation should be given precedence for a cell’s evolution over the current timestep.
As S10 showed, this dual entropy formalism is especially useful for nonradiative simulations, because here a spurious heating cannot be compensated by radiative cooling. Since our simulations include radiative cooling due to a primordial gas mixture we expect this to be less problematic for our applications. Indeed, extensive test simulations showed that our results are not sensitive to whether or not the dual entropy formalism is used. We have therefore decided to exclusively use strict energy conservation in all our production calculations, without employing the dual entropy formalism.
(iv) Gravitational softening length: GADGET uses a global gravitational softening length for collisional and collisionless particles, i.e. for gas and stellar/dark matter particles. AREPO also uses a global gravitational softening length for collisionless particles (which we set equal to that used in our GADGET simulations). However, as the code is a finite volume method which uses gas cells of variable size to represent the fluid, we have used individual Plummerequivalent gravitational softening lengths according to for each gaseous cell, where is the cell’s fiducial radius assuming the cell volume is spread over a sphere, and is an overall scaling factor. In addition, we imposed a lower floor on the gravitational softening lengths equal to that of the SPH particles in the corresponding GADGET simulation. This was done in order to safeguard against a potentially higher gravitational resolution in AREPO in the densest gas, which may have distorted our comparison. However we note that we do not expect any significant difference due to the slightly different treatment of the gravitational softening length of the gas even if such a floor was not used. Indeed, as described in Appendix B, we have repeated lower resolution AREPO simulations with a fixed softening length for all gas cells and obtained nearly identical results. Also, we have repeated simulations with adaptive gravitational softening without a lower floor and again obtained equivalent results. We note that the presence of an effective equation of state for highly dense gas is ultimately responsible for this insensitivity. Without it, we would get fragmentation of very dense gas where the adaptive softening could make a difference (Bate & Burkert, 1997).
Name  Code  Boxsize  hydro elements  DM particles  []  

A_L20n512  AREPO  
G_L20n512  GADGET  
A_L20n256  AREPO  
G_L20n256  GADGET  
A_L20n128  AREPO  
G_L20n128  GADGET 
2.3 Code comparison strategy
In any comparison between codes as different as GADGET and AREPO, a number of different benchmarks can be used to gauge their relative performance, and various criteria can be defined to align runs that are deemed comparable with each other. In our study, we have chosen to compare GADGET and AREPO at matching mass resolution, using the same initial number of baryonic and dark matter resolution elements. This provides a welldefined comparison strategy and is advantageous for a number of reasons:
(1) First, we can use the same initial conditions for the simulations with both codes. This would not be possible if we performed the runs with different initial mass resolutions. (2) Second, our choice makes it straightforward to relate the same objects between the different runs (and they are resolved in dark matter in the same way). (3) Third, as we discuss below, for the same initial mass resolution, the CPU time requirements for GADGET and AREPO are similar, at least for cosmological simulations, with the moving mesh runs being only some more costly. In many cases, the limiting factor for the simulation size that can be achieved is the amount of computing time required, so our choice effectively enables a comparison for fixed computational resources.
Of course, in principle one may also attempt a comparison at equivalent spatial resolution in the gas between GADGET and AREPO, or at the same accuracy in the solution obtained. We have not attempted this for the following reasons. First, while the spatial resolution that can be achieved for a given number of resolution elements is expected to be worse in SPH compared to AREPO, the degree to which this matters is likely very problemdependent. For example, for our treatment of star formation, the mass resolution is much more important than a precise treatment of fluid discontinuities. Aligning the spatial resolutions such that discontinuities are broadened over the same scale in both codes would lead to drastically different mass resolutions in the starforming phases and hence likely to large systematic resolution effects in the smallest galaxies.
Second, and even more important in our view, is that the convergence properties of SPH are not well understood, making a comparison at fixed accuracy a highly problematic undertaking from the start. As we discuss at length in Section 5, formal convergence in SPH ultimately requires that the number of neighbouring particles used in smoothed estimates around a particular location be increased as the total number of particles is made larger. This has typically not been done in cosmological applications, and is also in practice not readily possible due to clumping instabilities (e.g. Schuessler & Schmitt, 1981).
Moreover, while SPH is often described as being “Lagrangian,” this is not formally correct because individual fluid elements, as represented by SPH particles, are not allowed to deform arbitrarily in response to e.g. shearing motions, as we illustrate in Section 5. SPH should, therefore, properly be referred to as “pseudoLagrangian.” The errors that this feature of SPH entails depend on the detailed properties of the flow, in addition to the local spatial resolution, and consequently cannot be assessed in general. In its most basic formulation, SPH may simply lack a formal convergence condition, making it unclear how a comparison at fixed accuracy should be defined.
Finally, in our study, we have chosen a particular implementation of “standard SPH”, as implemented by the GADGET code. Moreover, by choosing GADGET for our SPH simulations, we are able to use the same gravity solver and the same physics in our comparisons between SPH and AREPO, which might not be possible otherwise. This enables us to isolate differences in runs that owe primarily to the differences in the hydro solvers between GADGET and AREPO. As we discuss in Section 5, various modifications have been proposed to the basic structure of SPH in order to improve its reliability under some circumstances. It is, of course, possible that we would have obtained somewhat different results had we employed such formulations of SPH. However, we believe that many limitations of SPH are generic and are not specific to GADGET. Indeed, as we discuss in Section 5, it is difficult to see how SPH could be modified to entirely eliminate, for example, the sources of error that owe to its pseudoLagrangian nature without radical modifications to the underlying algorithm.
We also stress again that the primary goal of our comparison is not to arrive at the best fit to observational data, or to achieve the highest possible resolution for single galaxy models. While we acknowledge that important strides have been made recently in studying disk formation numerically using zoomin procedures in cosmological SPH simulations (e.g. Agertz et al., 2011; Guedes et al., 2011), we do not think these successes provide much evidence for the numerical reliability of SPH, although this misconception appears widespread. Our ultimate aim is not merely to model single objects, but entire populations of galaxies so that we can statistically constrain uncertain processes associated with star formation and feedback by using, e.g., observations of the distribution of galaxies with respect to their Hubble type. This requires the identification of accurate and computationally efficient techniques that are best suitable for the task. In particular, our strategy is designed to detect systematic inaccuracies in current simulation techniques that may easily be concealed by the adjustment of heuristic feedback parameters to match a certain observation, as it is commonly done, while at the same time these effects may have serious consequences in other places.
2.4 Simulation Suite
For our simulation set we adopt the cosmological parameters , , , , and (). These parameters are consistent with the most recent WMAP7 measurements (Komatsu et al., 2011) as well as with a host of other cosmological constraints. We create a realisation of this cosmology in a periodic box with a sidelength , at three different mass resolutions corresponding to , and particles/cells. A comoving gravitational softening length of , or , respectively, is used for the dark matter, and for the gas particles in the GADGET runs. For AREPO, we start initially with the same number of cells as there are particles in the corresponding SPH run, but we use an adaptive gravitational softening with a floor as described earlier. Note that due to re and derefinement the number of baryonic resolution elements (cells plus star particles) is not strictly conserved during the course of a simulation but can fluctuate slightly around the initial number. Initial conditions were generated at based on the power spectrum fit of Eisenstein & Hu (1999), with gas particles/cells added to the initial conditions by splitting each original particle into a dark matter and gas particle/cell pair, displacing them with respect to each other such that two interleaved grids are formed, keeping the centreofmass of each pair fixed.
All our simulations have been evolved until redshift even though the fundamental mode of our small box becomes mildly nonlinear at low redshift. However, our primary goal here is not to obtain quantitatively accurate largescale statistics, but rather to compare the properties of galaxies formed by two different hydrodynamical schemes. As we start all simulations with the same phases, this relative comparison is unaffected by boxsize effects down to . To easily refer to the different runs in our simulation suite we use the following naming convention. A_L20nX denotes AREPO runs (indicated by the leading ‘A’), where , , or encodes the numerical resolution and the tag ‘L20’ stands for the box size of . Likewise, G_L20nX refers to our GADGET runs (indicated by the ‘G’). In Table 1, we provide an overview of our simulation suite and list its most important parameters.
3 Results for global baryon statistics
3.1 Density and temperature maps
To obtain a first impression of the simulations, it is instructive to examine maps of the density and temperature fields. In Fig. 1, we show projections of the gas density and temperature distributions for our highest resolution simulations A_L20n512 and G_L20n512 at . The left panel gives the AREPO results and the right panel those obtained with GADGET. In both cases, the image on top depicts a large part of the full box, while the zoom images below show two successive zooms onto a disk galaxy (as indicated by the white squares). In these enlarged images, the left half depicts the temperature field while the right half gives the density field. All slices have a thickness of and their extent in the image plane is specified by the scalebars included in the images. Note that we did not centre the two image sequences individually onto the galaxies, so the zoom sequence also illustrates the typical magnitude of coordinate offsets that can develop between the different simulations.
Inspection of the maps in Fig. 1 demonstrates that both codes produce essentially identical results on large scales. This is reasonable since we do not expect largescale changes induced by a different hydrodynamical scheme. However, already at the intermediate zoom level one can identify some significant differences. The distribution of hot gas is clearly quite different, with GADGET showing a more extended hot atmosphere compared to AREPO. On still smaller scales, the differences between the two hydrodynamical codes become even more pronounced: whereas AREPO produces an extended disk, GADGET forms a significantly smaller and clumpier disk. The difference can be best appreciated in the temperature map, where the cold gas disk stands out clearly as a large blue region in the moving mesh calculation. AREPO forms a spiral disk with a central bar, which is visible in the density map. For a more detailed analysis of the structural properties of this galaxy we refer the reader to Paper II of this series. We note that although some spiral features are visible in the disk, these are partly seeded by Poisson noise in the potential due to the limited number of DM particles in the halo (see D’Onghia et al., 2011, for details of this process), and have hence only limited physical significance.
In order to demonstrate that differences like those seen in Fig. 1 do not appear only in unusual cases, we show in Fig. 2 gas density projections of the central galaxies of the 24 most massive haloes at of A_L20n512 (top panels), and G_L20n512 (bottom panels), along random projection directions. We did not impose any constraint on the galaxy selection other than halo mass; as a result, we can also have galaxies in our sample that are disturbed or undergo a merger, such as the system labelled ‘galaxyid 5’. The object ‘galaxyid 8’ of Fig. 2 corresponds to the galaxy shown in Fig. 1, but happens to be oriented edgeon here. The random orientations of the galaxy set give an impression of how a population of galaxies would look like in these simulations. Clearly, the gas distributions of most of the galaxies in Fig. 2 are more extended and more reminiscent of spiral galaxies in AREPO than in GADGET.
Inspection of animated sequences from the AREPO simulations reveals a population of interacting and merging systems. Many of them show thin bridges and tails, as expected if the galaxies are rotationally supported disks (e.g. Toomre & Toomre, 1972; D’Onghia et al., 2010)^{1}^{1}1Volumerendering movies of AREPO galaxies (at redshift and ) and highresolution images are available for download at the website http://www.cfa.harvard.edu/itc/research/movingmeshcosmology..
The differences in the gas properties are also reflected in the corresponding stellar distributions of these galaxies, which we show in Fig. 3. Although the stellar distributions are more similar between the two numerical schemes, in most cases one still notices that AREPO’s disks are slightly larger and appear typically more disky, an impression that is also supported by the quantitative analysis of galaxy properties that we present in Paper II. Furthermore, the stellar populations of the AREPO galaxies are bluer, i.e. younger, than those found in the GADGET run.
In Fig. 4 and Fig. 5 we show faceon projections of 32 other galaxies at with Milky Waylike DM halo masses in the mass range from to . The gas disk structure shown in Fig. 4 follows the trend already seen at , and demonstrates that AREPO produces significantly larger disks at all times and at all halo masses compared to the GADGET simulations. Fig. 5 demonstrates that the stellar disks are also slightly larger, but not as significant as the gas disks. But similar to the result, the AREPO stellar disk populations are typically younger than those found in GADGET.
A selfevident conclusion from these visual comparisons of Figs. 2, 3 and Figs. 4, 5 is that the morphology of forming galaxies in cosmological hydrodynamics simulations can be strongly affected by the underlying hydrodynamics solver. SPH has significantly more problems producing disklike galaxies than the moving mesh approach for a similar computational cost. Note however that we do not expect our simulations to produce a completely realistic galaxy population, because we here refrained from including strong feedback capable of producing galactic winds and outflows. However, since both simulations followed exactly the same physics, we expect this qualitative difference to be present in simulations with stronger feedback as well.
3.2 Mass functions
As we emphasised earlier, AREPO and GADGET use the same gravity solver based on a highresolution TreePM scheme. Because the gravitational field is largely dominated by the collisionless DM component, we expect that the DM distribution in both simulations should hence be similar, at least on scales where baryonic effects do not change the structure of DM haloes. We explicitly verify this expectation in the left column of Fig. 6, where we compare the DM halo mass function of friendsoffriends (FoF) groups with a linking length of of the mean particle separation down to a particle number limit of 32 particles at and . The codes show excellent agreement in the DM halo mass functions at both redshifts. Also, the convergence for the different resolutions is in both cases very good. We note that previous comparisons of hydrodynamical mesh codes and high resolution tree codes (O’Shea et al., 2005; Heitmann et al., 2008) have shown a deficit of lowmass haloes in the meshbased hydrodynamical codes, an effect that has its origin in their AMRbased gravity solver, which typically does not refine small DM haloes sufficiently early. As evidenced here, AREPO does not have this problem.
The other panels of Fig. 6 count haloes in terms of their baryonic content, separately for gas and stars. In order to relate baryons to dark matter haloes, we determine for each stellar and gaseous cell/particle the closest dark matter particle, and then associate the baryonic mass element with the FoF group this DM particle resides in (see Dolag et al., 2009, for more details). The middle column of Fig. 6 shows the cumulative halo gas mass functions at redshifts and . While both codes converge equally well for the gas mass functions as a function of resolution, they do not agree on the result. At , AREPO has more objects of a given gas mass than GADGET over nearly the full range of halo masses. This behaviour changes towards lower redshifts, where fewer very gasrich objects are found in AREPO. We will show below that this has to do with more efficient cooling and a higher star formation rate in the mesh code in massive haloes at late times.
The right column in Fig. 6 gives the mass functions for the stellar component of haloes, and here the situation is slightly different. At , we find fewer objects with low stellar mass in AREPO compared to GADGET, but the situation reverses at the high mass end where we find AREPO has more haloes at a given stellar mass. This behaviour changes somewhat towards , where the low mass end agrees now very well between the two codes. However, there are still more stellar systems of high mass in AREPO, and this difference becomes more pronounced and extends over a larger range of masses. Again, this can be attributed to the more efficient star formation at late times in massive systems in AREPO that we analyse in detail below. Since the baryonic mass contributes only a small amount to the total mass of the entire box, these differences in the baryonic mass functions are neither imprinted in the DM halo mass functions nor visible in the largescale structure shown in Fig. 1.
3.3 Gas phases
Because we use two completely different hydro solvers, we may expect some deviations in the breakdown of various gas phases. A first simple way to obtain an overview of the global state of the gas in the simulation volume is the construction of density–temperature phasespace diagrams. We show in Fig. 7 massweighted  phasespace diagrams of the simulations at the highest resolution for (top row) and (bottom row). The left and right columns show results for AREPO and GADGET, respectively. We can readily identify a number of wellknown features in the gas phasespace (Davé et al., 2001), which show up in both runs. The narrow ridge of gas with an upward slope and and consists of low density, highly photoionised gas in the intergalactic medium (IGM). The tight relation between temperature and density in this regime is maintained by the competition between adiabatic expansion cooling and photoionisation heating. The plume of gas with and is comprised on the other hand of shockheated gas in virialised haloes and, at the lower density end, in and around filaments. The narrow, downward sloping locus of gas at and represents radiatively cooled, dense gas in galaxies. The cooling times at these densities are short, so that gas remains close to its equilibrium temperature where photoionisation heating balances radiative cooling. This temperature is a slowly decreasing function of density and lies close to . Finally, once the gas becomes very dense and reaches the star formation threshold, we describe the gas by an effective equation of state representing the mean thermal energy density of a twophase medium of hot and cold gas. This effective equation of state results in the upward sloping high density line of gas, which represents the ISM in galaxies. The pressurisation by supernova feedback in our subresolution model prevents this gas from fragmenting under selfgravity.
As Fig. 7 demonstrates, AREPO and GADGET lead to a qualitatively very similar phasespace diagram. This demonstrates that the properties of the gas distributions are overall quite comparable in both codes. However, although there is general broad agreement between the simulations, there are also some striking differences. First of all, the distribution of hot gas extends to higher densities in the AREPO runs compared to the GADGET simulations; i.e. there is more hot gas at higher densities in the AREPO simulations. We note however that in terms of total mass this is not very significant. More important, GADGET appears to have more hot gas in general, which can be inferred from the slightly more extended yellow region in the phase diagrams. This is consistent with the temperature structure in the surroundings of the galaxy shown in Fig. 1. The AREPO runs also exhibit a more pronounced cooling feature around , which corresponds to a local minimum of the cooling curve between the two line peaks of hydrogen and helium. Although this feature is readily visible in the phasespace diagrams of the AREPO runs, the actual gas mass populating that feature is very small. The faint stripelike features seen in the topleft and bottomleft panels of Fig. 7 below the effective equation of state arise from numerical inaccuracies in the temperature evolution of some cells around starformation sites. Here the temperature can sometimes scatter to very low values, which is one incarnation of the accuracy problems associated with supersonic cold fluid motions. In our multiphase model, the temperature of these cells is then relaxed back to the effective equation of state temperature on a timescale given by equation 12 of of Springel & Hernquist (2003a). The simulation timestep of these cells is small against , such that the temperature of the cell at the end of a relaxation step is . Because (see Springel & Hernquist, 2003a) and due to the poweroftwo hierarchy of possible values for , a set of parallel stripes of cells spaced by a factor of 2 results. The mass in these cells is however negligibly small, so that this effect has no influence on the dynamics.
To quantify the gas mass content in the different regions of the – phasespace diagrams in Fig. 7, we introduce the following cuts in the  plane: (1) diffuse cool gas with and , (2) diffuse warm gas with and , (3) diffuse hot gas with and , and finally, (4) dense gas with . In Fig. 8, we first show the different gas mass fractions of each of these four phases. All runs show the behaviour we anticipated earlier; i.e. there is the general trend that the cold gas fraction decreases and the warm/hot gas fraction increases with time. The hot gas fraction (lower left panel) increases due to shock heating in haloes. Interestingly the details vary quite strongly between the different numerical schemes. The GADGET runs typically show more hot and warm gas, and accordingly produce less dense gas. Especially the mass fractions in the hot phase are very different between the two schemes: for most of the time the mass difference here is larger than one order of magnitude. We will show below that these two findings are also consistent with the results for the star formation histories of the different runs. We note that the same trends are also found for the lower resolution simulations, but the quantitative numbers differ slightly. Although not fully converged, there is more very dense gas (lower right panel) in AREPO than in GADGET at all resolutions, and also the amount of cold gas (upper left panel) is larger in AREPO. These results confirm the visual impression obtained from Fig. 1, where we also saw that the hot atmosphere is smaller in AREPO, but the cold gas disk is larger.
Next, we consider in Fig. 9 the corresponding volume fractions of these gas phases. The volume fractions of the GADGET runs are based on the specific volume of individual SPH particles: . However, SPH is not volumeconserving in the sense that the sum of the specific volumes of all SPH particles will not sum up to the total simulation volume. In finitevolume methods like the one used by AREPO, this does not occur; all the cell volumes add up to the correct simulation volume. For the comparisons of the volume fractions we therefore do not calculate volume fractions with respect to the total simulation box volume, but with respect to the sum of all individual particle/cell volumes. This does not influence the volume fractions of AREPO, but changes the SPH volume fractions of GADGET, which would otherwise not add up to unity. The general trends we find for the volume fractions in Fig. 9 are similar to those for the mass fractions in Fig. 8. However, we can now clearly see that the hot atmospheres are not only strongly reduced in mass in AREPO, but also that their volume fraction is significantly smaller compared to GADGET. This is in agreement with the qualitative results shown in Fig. 1 and Fig. 2. The same is true for the volume fraction of the cold gas, which is however not as wellconverged.
To focus more on the density structure, we marginalise Fig. 7 over temperature and express the densities as physical hydrogen number densities. We hence derive the hydrogen number density probability distribution functions for the full simulation volume, which are shown in Fig. 10. The left panel accounts for all gas particles/cells, whereas the right panel shows only those with active star formation. In our star formation model, the density threshold for the onset of star formation is set to , and therefore the right hand panel occupies only this high density region. Interestingly, GADGET shows at all resolutions a significantly reduced gas fraction slightly below the star formation density threshold. We checked that the reduction of the gas mass in this region is caused by the more extended disks in AREPO; i.e. the additional gas found in AREPO at these densities stems from disk material. Presumably, the ‘gap’ phenomenon found in SPH across strong contact discontinuities (Agertz et al., 2007), such as the one we have here, also contributes to the different breakdown in that density range. The other parts of the density distributions look rather similar for the two simulation methods, however. We verified that the gas slightly below the threshold is predominantly responsible for the more extended disks in AREPO. This gas effectively fills the gap which is present in SPH. The galaxies in GADGET typically lead to a peak in the density PDF above the star formation threshold instead of a more plateaulike feature due to the extended gas disks in AREPO. Therefore, the difference in the density PDF is a reflection of significantly different galaxy radii for the different hydroschemes (see also Paper II).
3.4 Cosmic star formation history
An important prediction of our simulations is the global star formation history (SFH), which encodes key information about the overall efficiency of the galaxy formation process. The SFHs of our different simulation runs are presented in Fig. 11. The top left panel shows the SFH as a function of lookback time, whereas the lower left panel plots it on a logarithmic scale as a function of redshift. The corresponding integrated star formation histories are shown in the right panels. The most obvious trend with increasing resolution is that more high star formation in small haloes is resolved (Springel & Hernquist, 2003b). This is simply due to the better mass and spatial resolution for smaller haloes, which are in part not even seen in the coarser simulations. This trend with resolution also shifts the peak in the star formation rate density to higher redshift with increasing resolution. Note however that there is little cosmic time at these high redshifts, so that the mass of stars formed there is small compared to the present stellar density. Once the simulations resolve all haloes with virial temperature of and above (necessitating a dark matter particle resolution of about ), we expect however that the trend with resolution will stop and all ordinary star formation will be resolved (apart from additional PopIII star formation, which is not included in our models).
After the highredshift peak of star formation, the star formation rates decline and converge particularly well at lower redshifts for the SPHbased GADGET simulations. In this regime, most of the star formation is contributed by higher mass haloes (Springel & Hernquist, 2003b), for which our subresolution model produces converged results already at moderate resolution. This excellent convergence does not quite carry over to the moving mesh calculations with AREPO, where all resolutions slightly vary in their late time star formation rates (for further explanation see Section 4.3 and Paper III). However, the meshbased results are still very close to one other, and more importantly, they lie significantly higher than the SPH simulations, which is clearly visible from the inset in the upper left panel of Fig. 11 showing the ratio of the individual SFHs to the mean. We emphasise once more that this systematic difference is a result only of the different hydrodynamical schemes, since both codes use the same star formation model and calculate identical star formation rates for a given gas density. There is a minor technical difference in how stellar particles are created, but this does not affect the outcomes, as we have explicitly checked in numerous test calculations of isolated haloes and galaxies (see Paper III). The latter do show very good agreement in the star formation rates between GADGET and AREPO, so the difference we find in Fig. 11 must have a cosmological origin.
To understand better whether this increased star formation rate affects all halo masses or whether it is dominated by a certain halo population, we plot in Fig. 12 the star formation rates as a function of FoF group mass for our highest resolution simulations. Solid lines in Fig. 12 represent the median values and shaded regions show the range of the distribution. The insets show the specific star formation rates. In both cases, and for all redshifts shown, the scatter around the median relation is rather small. At high redshift (), the differences between the SPH and movingmesh calculations mainly occur at the intermediate and low mass end, where GADGET produces more stars than AREPO. This is in agreement with the general trend found for the global SFH, where GADGET shows a slightly higher star formation rate at high redshift. We note that the higher specific star formation rates in AREPO at late times are due to a significantly more efficient cooling.
We point out that one reason why GADGET shows at high a larger star formation rate (SFR) at a given resolution than AREPO is related to the fact that already at this early time the disks in AREPO are slightly more extended and hence have a lower gas surface density (as demonstrated in Paper II). Therefore, the star formation rate is lower compared to the denser, more bloblike galaxies formed in the SPH calculations. We note that this discrepancy between GADGET and AREPO decreases once the resolution in SPH becomes high enough, but it still induces a noticeable delay in AREPO’s SFRpeak compared to GADGET, although the peak amplitudes agree quite well.
Towards lower redshifts, a gradual trend of a different nature appears: while AREPO and GADGET start to agree better at low masses, the schemes deviate more and more strongly towards the high mass end. At , there is a nearly perfect overlap of the star formation rates below , but FoF groups above this mass form significantly more stars in the AREPO simulations compared to GADGET runs. Interestingly, this difference in large haloes first starts to show up at around (upper right panel), which roughly coincides with the time when the global star formation rates start to deviate, as can be seen from the lower left panel of Fig. 11. This implies that the massive end of the halo population is mainly responsible for the higher star formation rate in the moving mesh calculations at late times. These massive systems form at late times and systematically create more stars in AREPO than the corresponding GADGET calculations, driving the global SFH to a different behaviour in the two codes.
The only way such a difference in the SFH can occur is a more efficient cooling behaviour in AREPO in haloes above , allowing more gas to accumulate at the bottom of halo potentials and above the density threshold for star formation. This interpretation also agrees with our findings for the mass fractions and volume fractions in the different gas phases, where GADGET shows at all resolutions significantly more hot gas than AREPO. Furthermore it agrees with the gas density PDFs in Fig. 10, which show that AREPO has a larger fraction of dense gas slightly below the star formation density threshold. As we will discuss in the next section, the culprit of the reduced cooling in GADGET lies in different hydrodynamical dissipation rates in the outer parts of haloes compared with AREPO. In Paper III of this series, we furthermore show that AREPO’s more accurate treatment of hydrodynamical fluid instabilities strongly affects the stripping of gas out of gas clumps that fall into haloes. This in turn leads to significant differences in how the thermodynamic structure of haloes is affected by the hierarchical merging process, contributing to the cooling differences.
We note that the different star formation histories also result in a different distribution of stellar ages in individual galaxies. This can already be seen in Fig. 3 and Fig. 5, where not only the morphology of the stellar distribution is different, but also the average age as can be inferred from the colour scale. The GADGET galaxies look clearly redder overall than the AREPO galaxies, which are on average bluer. To quantify this in more detail, we show in Fig. 13 distribution functions of the formation redshifts of the stellar populations at , and , for our two highest resolution simulations. This figure demonstrates that at high redshift () the distributions agree reasonably well in shape. But at , a clear trend is already visible, because the AREPO distribution becomes biased towards slightly younger stars. This has to do with the slight shift of the peak of the SFH and the different late time behaviour of the SFH found in Fig. 11. The difference in the distributions becomes even more pronounced towards lower redshift and is clearly evident at . At that time the AREPO distribution is quite distinct from GADGET and shows a significantly larger fraction of young stars. The reason for this lies in the larger star formation rate at late times, as discussed above.
4 Gas cooling
4.1 Cooling emission and temperature evolution
The differences between AREPO and GADGET discussed so far point towards a different cooling behaviour between the two codes. To demonstrate this more clearly, Fig. 14 shows the total net cooling rate (), where and are the cooling and heating rates per unit mass as a function of redshift, accounting for all gas in our simulations. In the case of net heating we set this rate to zero. The figure demonstrates that at late times, the cooling rates of all AREPO runs are higher than those of corresponding GADGET runs. We note that this is true for all three resolutions, yielding a clear separation between the set of blue curves (representing the SPH simulations) and the red curves (representing the moving mesh calculations). Interestingly, the difference partially reverses at high redshift, where the cooling rates of AREPO at the intermediate and the low resolution are lower than those of GADGET. This is a consequence of the more compact galaxies in GADGET at these high redshifts, which result in larger densities and therefore more cooling radiation. This high cooling rate difference decreases and nearly disappears once we reach the highest mass resolution. In that case the cooling rates of GADGET and AREPO agree reasonably well at high redshifts, as shown in Fig. 14. In contrast, the quite large and systematic difference at low redshifts persists independent of resolution.
Further insight into the origin of the cooling difference is provided by Fig. 15, which shows the mean massweighted net cooling rates () at redshifts (left panel) and (right panel) as a function of hydrogen number density. If the temperature structure of the gas at fixed density would agree between all simulations, then the mean massweighted cooling rates should also agree for a given hydrogen number density. But as Fig. 15 shows, this is not the case and some systematic differences between GADGET and AREPO are present. Interestingly, the curves in Fig. 15 differ however only over a limited density range corresponding to the diffuse gas in haloes, beginning at the star formation density threshold (marked by the vertical green line) and extending to lower densities. In that density range, AREPO shows a significantly larger mean cooling rate at all resolutions, which must be due to a slightly lower gas temperature in the mean at these densities. We note that the magnitude of the difference is however redshift dependent and becomes smaller as we go to higher redshifts, as seen by comparing the and panels in Fig. 15.
At other hydrogen densities, including the range not shown in Fig. 15, the SPH and movingmesh cooling rates agree well for all our runs. Above the threshold for star formation this is however not surprising, because here the effective equation of state produces a tight correlation between density and temperature (as seen in Fig. 7). This leads to a nearly onetoone mapping of density to temperature in this regime, and explains the twopeak structure of the curves in Fig. 15 at high densities, which simply reflects the primordial cooling curve with its characteristic hydrogen and helium peaks.
Further evidence for a different temperature structure in the two simulation runs is provided by Fig. 16, which plots the mean massweighted temperature in the whole simulation volume as a function of time. In the inset we divide each curve by the mean of all the curves to emphasise deviations. We find that the mean temperature of GADGET simulations is higher than that of the AREPO runs. The higher temperature can explain the reduction in cooling emission of the SPH simulations relative to our movingmesh calculations, and it is consistent with our above findings for the SFR evolution and the cooling emission. It thus appears that the origin of the low redshift discrepancy must lie in a different efficiency of nonadiabatic heating processes in the gas, as this is required to explain differences in the temperature distribution at a given density.
4.2 Dissipative heating in haloes
It is widely appreciated that the intracluster medium of galaxy clusters is partially supported by subsonic turbulence (Schuecker et al., 2004), which is created by curved accretion shocks around the haloes, and more importantly, by the hierarchical infall of structures. Indeed, a number of numerical studies have analysed turbulence in the intracluster medium (Dolag et al., 2005; Vazza et al., 2009; Valdarnini, 2011; Iapichino et al., 2011). A similar level of turbulence can also be expected in smaller haloes that are large enough to support quasihydrostatic atmospheres. It has further been shown that shock heating is not only important in strong shocks at the outer accretion radius, but is also significant in weaker flow shocks that occur in large parts of the halo volume (Pfrommer et al., 2006). Physically, the dissipation associated with shocks and with the decay of turbulence occurs through microphysics on very small scales. In our numerical approach, we neglect the physical viscosity (which is assumed to be effectively zero on all resolved scales; this is why we employ the Euler and not the NavierStokes equations; see e.g. Muñoz et al. (2012) for a NavierStokes implementation of AREPO) and account for the necessary dissipation through numerical viscosity. In SPH this is provided explicitly in terms of an artificial viscosity, whereas in AREPO it is introduced implicitly through the solutions of the Riemann solver and cellaveraging.
The good conservation properties of SPH allow it to capture onedimensional shock waves quite well, even though the postshock velocity field typically shows substantial noise (Springel, 2010b; Abel, 2011). This already hints that SPH may not be particularly accurate for subsonic flow phenomena, such as the turbulence we expect in the virialised gas of newly formed haloes. Indeed, Bauer & Springel (2011) have systematically compared driven isothermal subsonic turbulence for GADGET and AREPO, using the same versions of the codes we employ here. They find that SPH fails quite badly to account for a turbulent cascade in the subsonic case, whereas a Kolmogorov scaling is obtained for the movingmesh code. This happens despite identical driving fields in both cases, and is ultimately caused by inaccurate gradient estimates in SPH and different dissipation as a function of scale. The SPH simulations dissipate most of the energy already close to the driving scale in the subsonic case, whereas in AREPO efficient dissipation happens only on much smaller scales, so that a selfsimilar turbulent cascade can develop over some inertial range. In addition, Bauer & Springel (2011) find that SPH exhibits a second maximum in the dissipation on very small scales of order the mean interparticle separation. Here the subsonic noise of SPH is dissipated by the artificial viscosity. Interestingly, the total amount of dissipation on these scales is quite independent of the artificial viscosity value itself. While a higher viscosity reduces the amplitude of the smallscale SPH noise, the dissipation rate on these scales still remains roughly the same, suggesting that the cause of the smallscale noise, which originates in errors in the pressure gradient estimates, cannot be reduced effectively with a different viscosity setting.
There are good reasons to expect that these differences in the dissipation properties of the hydrodynamical schemes may also induce important effects for the thermodynamic structure of cosmological haloes, which in turn can easily give rise to variations of the amount of gas that cools out of these haloes. To examine this further, we have carried out two auxiliary nonradiative simulations at resolutions of and particles/cells. These simulations use the same boxsize and parameters as our galaxy formation runs, except that cooling and star formation were not included. The use of nonradiative simulations allows us to cleanly employ the method of Bauer & Springel (2011) for measuring the instantaneous dissipation rate of particles and cells, respectively. In the case of SPH, this can simply be done by measuring the work per unit time done by the artificial viscosity forces, which represents the sole source of entropy generation in this method. For the movingmesh code, we instead also advect the thermodynamic entropy between the cells, and then measure the rate of entropy generation by comparing the state of a cell at the end of a timestep computed adopting energy conservation with the state expected if the entropy would have remained constant. The inferred rate of entropy production can then also be converted into a fiducial heating rate.
In Fig. 17, we compare stacked profiles of the spherically averaged dissipative heating rate of haloes obtained in this way. For the plots, we have selected all haloes in the virial mass range , where refers to the mass inside a radius that encloses 200 times the critical density. We have then placed 20 logarithmic bins onto the radial range to , and produced massweighted averages of the dissipation rate per unit mass, , that we measured for each particle/cell. In order to allow a calculation of an average profile by stacking the haloes, we express the dissipation rate in dimensionless units by multiplying it with the Hubble time at the given redshift, and by normalising it to of the particular halo, where is the circular velocity at the virial radius. In the individual panels of Fig. 17, we show results for different redshifts, ranging from to , and we compare AREPO with GADGET at the two resolutions considered here.
We see that there is a clear systematic difference in the dissipative heating rates as a function of halocentric distance. AREPO produces more heating in the infall region directly outside the virial radius whereas GADGET shows more heating in the outer parts of virialised haloes, where most of the gas mass is located. This systematic difference is present at all redshifts in large haloes. We also note that the nature of the difference is robustly preserved at different resolutions, even though there seem to be small residual trends with spatial resolution. Interestingly, at fixed halo mass, the relative level of dissipation in the innermost parts of haloes increases in the meshcode with time, which may be related to the buildup of a turbulent entropy core in these haloes. In Fig. 18, we consider dissipation profiles as a function of halo mass at , which however does not reveal any clear trend with halo mass.
Our interpretation of the systematic difference revealed by Figs. 17 and 18 is that: (1) shocks are captured more efficiently by AREPO in the accretion regime of haloes, and (2) AREPO strips gas out of infalling objects more efficiently there; both effects contribute to a stronger dissipation in this region. In contrast, SPH stops infalling gas less efficiently, leading to more dissipation at smaller radii. A further contributor to the heating there lies in the dissipation of the subsonic noise in SPH, which is constantly recreated in this region by tapping free energy from the disturbances that impinge on the haloes from the outside. As a net result of this dissipation of subsonic noise in the outer parts of haloes, the SPH simulations end up hotter overall (see also Fig. 16). Furthermore, this difference in the heating occurs in a region where one expects the cooling radius of many haloes in cosmological radiative simulations, thereby directly affecting the amount of cold gas produced in the quasihydrostatic cooling flows in large haloes. We note that the measurement of the dissipation rate is difficult due to its high time variability. We use above stacked profiles of small sets of halos to mitigate this problem, but even then the inner logarithmic bins average over much smaller gas mass than the outer parts. This implies that the profiles in the inner of the virial radius are relatively noisy. But this inner part is largely irrelevant as the impact on the cooling rates comes from larger radii as we discussed above.
4.3 Mixing in haloes
Another reason why differences in the global star formation rate occur between AREPO and GADGET lies in the different accuracy with which hydrodynamical fluid instabilities are treated. In particular, we expect that for the movingmesh code cold gas can be stripped more efficiently from galaxies as they fall into larger haloes, and this gas is then mixed with the halo’s diffuse gas, which also lowers its temperature.
To demonstrate this difference more explicitly in our cosmological runs, we performed special test simulations of a box with SPH particles/cells. In Fig. 19, we show the sum of the total amount of starforming gas mass and stellar mass in this box as a function of redshift. We note that an SPH particle/cell is starforming if its density is higher than the threshold of the subresolution star formation model. The solid lines show the result of calculations with our standard star formation prescription, i.e. these simulations are equivalent to the main runs presented in the paper. However, for the dashed lines we turned off the creation of stellar particles, keeping everything else the same. In this case, the cold gas accumulates in the galaxies and is supported by the effective equation of state expected for gas at these densities, except that the gas is not depleted and does not turn into stars at the normal rate. For these runs, the mass in the figure therefore refers to the total amount of starforming gas in these simulations (no stars are present). Interestingly, the solid and dashed curves overlap well for GADGET (blue) in Fig. 19, showing that once gas has overcome the density threshold it will never return to lower density. Stripping of gas out of galaxies does not appear to happen in any significant way, otherwise we would expect that the run without star formation should end up with a smaller amount of collapsed baryons. Instead, the SPH result is consistent with no stripping at all, i.e. once a gas particle has cooled, it never returns to lower density even though the galaxy may be subject to substantial shear flows upon halo infall. This behaviour has also been found by Heß & Springel (2011, submitted) in a comparison of galaxywind interactions in SPH and the VPH technique (Heß & Springel, 2010).
For AREPO the situation is very different. As Fig. 19 shows, the sum of stellar mass and starforming gas mass is larger than the amount of starforming gas in the run without star particle creation. Here the relevant fluid instabilities for stripping (like the KelvinHelmholtz instability) are significantly better resolved and facilitate a substantial mass loss of infalling overdense blobs/subhaloes. Because of this, the sum of the mass of stars and of starforming gas is higher in AREPO for the run with star particle creation than for the fiducial run where this is suppressed. In the latter simulation, there is simply more gas around that can be stripped again, while in the run with star particle creation, baryons that have been converted to stars do of course not suffer from fluid instabilities anymore. Apart from affecting the cooling in haloes, this difference in the stripping and mixing efficiency also modifies the dynamical friction timescale of infalling gas clumps, as we examine in more detail in Paper III of this series.
5 Generic issues with SPH
Various modifications have been proposed to the conventional implementation of SPH as in GADGET to improve its reliability in certain cases. For example, changes to the computation of smoothed pressure gradients (Abel, 2011) or the addition of an artificial thermal conductivity to the equations of motion (e.g. Price, 2008) enable SPH to better handle shearing interfaces and the onset of KelvinHelmholtz instabilities. Also, a number of studies focused on the development of improved artificial viscosity parameterisations in SPH codes (Morris, 1997; Cullen & Dehnen, 2010), with the goal of reducing the numerical viscosity away from shocks.
Given the results presented in this paper, it is tempting to argue that one or several of these modifications of “standard SPH” may lead to much better agreement with the movingmesh results or even resolve the discrepancies. While we cannot exclude this possibility, we note the above refinements are ad hoc in the sense that they are designed to mitigate against particular problems in some circumstances, and may have unwanted sideeffects in other situations. While these sideeffects may often be benign, the proposed modifications of SPH do not address other, more generic problems with this technique. In this section, we summarise these issues in order to put our findings into context with the recent literature on SPH techniques.
One issue that has rarely been discussed in the cosmological community concerns the convergence properties of SPH. This is manifested in at least two ways, the first having to do with local smoothed estimates in SPH, and the second is related to the fact that SPH is not formally Lagrangian. Below, we first address the question of under what conditions convergence is expected in SPH and how this influences the relation between spatial resolution and the total particle number. Then, we discuss the pseudoLagrangian character of SPH and its implications for defining convergence with this method. Finally, we briefly suggest ways in which these issues might be dealt with. Incidentally, they all would have the effect of making SPH resemble a movingmesh scheme like AREPO.
5.1 Convergence in SPH and nearest neighbour number
For a real fluid, we define the density, , to be a continuous function of space, determined by averaging over many molecules locally around . In order to formulate a discrete SPH counterpart to this (and this applies to all fluid variables, as well as the equations of motion) a twostep procedure is applied.
First, we introduce a smoothed version of the original density field, , by convolving with a smoothing kernel according to
(1) 
where the integral is over all space and is the smoothing length.
Second, the continuous smoothed field is replaced by a discrete quantity, , so that it can be represented computationally. This is done by regarding as a mass element and partitioning the fluid into a set of discrete mass elements so that the above integral can be approximated by a discrete sum:
(2) 
where is the mass of fluid element , is its spatial location, and the expression allows different fluid locations/elements to have different smoothing lengths. In principle, the sum extends over all fluid elements. However, in order that the smoothed density approaches the continuum limit represented by , it is necessary that is spatially localised. So in practice a localised kernel with compact support is adopted, implying that the discrete sum extends not over all fluid elements but over a subset of those neighbouring a certain point in space.
Now, we can ask about the conditions that must be satisfied for the smoothed, discrete version of the density field to asymptote to the actual continuum limit. That is, under what circumstances is the following true?
(3) 
Consider the second of these requirements first. In order for we must have as , as implied by the definition of in the above integral. This also requires that , otherwise there exist not enough particles around a given spatial location to form smoothed averages.
Turning to the first requirement above, will be satisfied only if at the same time we enforce the condition . This has not been emphasised in the cosmological literature, but from the defining relation above for it is clear that the discrete approximation will not asymptote to the smoothed density field unless this limit is taken. Or, to put it another way, if the number of neighbours is held fixed as , there will be a constant source of error present on the most wellresolved scales that will not vanish as the resolution and number of particles is increased indefinitely (see also Rasio, 2000; Read et al., 2010). (For a discussion of this requirement in the context of elasticity, see e.g. Belytschko et al. (1998).)
To the best of our knowledge, the optimal way to enforce this requirement has not been established. However, there is little doubt that this must be addressed in order that SPH provides properly convergent solutions, as argued recently by Robinson & Monaghan (2011): “When performing a convergence study using SPH, it is important to vary both the number of particles … as well as the ratio of smoothing length to particle spacing [to increase the number of neighbours]. [Otherwise] the error due to the summation interpolant [remains] constant [as the total particle number is increased].” We emphasise that this requirement is not fundamentally an issue with SPH, but simply reflects the fact that a numerical approximation to an integral in the form of a discrete sum will not converge to the true answer unless the number of points where the integrand is sampled is made larger.
An increase in the number of neighbours is also warranted to reduce the gradient errors in SPH, which give in fact rise to a sizable “zeroth order error” (Read et al., 2010). These gradient errors seriously degrade the accuracy of SPH in subsonic flow problems (Springel, 2010b) and are a primary cause for problems in the representation of subsonic turbulence (Bauer & Springel, 2011). However, in practice, simply increasing the number of neighbours is met by a serious obstacle, because it invokes the socalled clumping instability in which particles located in the inner parts of the kernel are pushed together by the pressure forces of the surrounding particles, forming little clumps that frustrate attempts to reduce the error of the kernel sums in this way. Partially circumventing this problem and allowing a higher neighbour number requires different kernel shapes than are commonly employed (Read et al., 2010; Price, 2012), but such kernels merely have different stability bands that still impose severe restrictions on the number of neighbours that may be used. In essence, due to the clumping instability, establishing formally convergent SPH solutions is an unsolved problem because the path of increasing the neighbour number is blocked in practice.
5.2 Implications for SPH convergence rates and efficiency
The requirement that the number of neighbours should increase to reduce errors as the total number of particles is made larger has also serious implications for the efficiency of SPH. As an example, consider a uniform fluid in a cube of sidelength represented by SPH particles. (For more general circumstances the argument generalises if we think about small scales on which the fluid properties are nearly uniform.) The mean separation between the particles, , is
(4) 
If we take , as has been done conventionally in SPH codes used in cosmology, then the smoothing length will shrink in proportion to and, so, the number of neighbours will be nearly constant. In a sense, this maximises the spatial resolution, but leaves a fixed source of error in the discrete sums (Read et al., 2010; Robinson & Monaghan, 2011), so this path is in principle not appropriate for achieving proper convergence with SPH.
If, instead, we set then the fundamental relation between and becomes . What are the conditions on so that we can meaningfully construct a procedure that converges numerically? Clearly, , otherwise will not decrease more slowly than as increases. Also, we must have , as the choice would mean that the spatial resolution is constant, independent of . The optimal value for is undetermined. One physically appealing choice would be , because then would be the geometric mean between the mean interparticle separation, , and the size of the system, . In that case, the relation between and becomes , implying a steeply rising computational cost with resolution. It is possible that a slightly larger value of could be preferred in terms of CPU costs, but it is clear, however, that in order to extend the dynamic range in spatial scales resolved by SPH in a way that guarantees eventual convergence, it is absolutely necessary to increase the number of particles more aggressively than has been advocated previously. The dilemma, of course, is that the clumping instability makes this highly problematic.
Ignoring the clumping issue for the moment, the implications for the convergence rate of SPH are in any case important. When the common practice in the field is followed and the number of neighbours is held constant, Springel (2010b) reported a global L1 error for a vortex flow (the Gresho test) that scales as with resolution, as opposed to for AREPO. We can then ask how the ratio of the CPUtime cost of simulations with the two schemes varies with the size of the error. For simulations in 3D, the computational cost scales for both codes as , where three powers of come from the spatial dimensions, and one additional power enters due to the reduction of timesteps for better resolution. This then implies that the cost ratio scales as with the error of the calculation. Reducing the error by a factor of 10 requires therefore about a times higher effort in SPH than in AREPO. This conclusion is problem dependent and can also be affected by specific details of the SPH algorithm, like the kernel shape. However, the question of computational efficiency of different numerical schemes is best phrased in terms of such convergence rate comparisons. Here the Monte Carlo character in the approximation of the SPH kernel sum invariably introduces a serious disadvantage for SPH compared with the moving mesh approach, as outlined above.
A related conclusion, but focusing on the efficiency impact of the artificial viscosity parameterisation in SPH, was reached by Bauer & Springel (2011). If the goal is a representation of subsonic turbulence with a certain Reynolds number , they showed that the computational cost of SPH scales at least as . In contrast, in the meshbased treatment of AREPO, the dissipation scale is tied to the mesh resolution, implying a computational cost for reaching a certain Reynolds number that scales as . Given that our cosmological simulations with AREPO are only about slower than GADGET for simulations involving selfgravity, for the same number of resolution elements, this means that at a given computational cost, AREPO resolves much larger Reynolds numbers than SPH. In fact, to reach the same Reynolds numbers as in our present movingmesh simulations, it appears plausible that a factor more SPH particles than AREPO cells would be required.
5.3 The pseudoLagrangian nature of SPH
SPH is often referred to as a Lagrangian algorithm, but this is not formally correct and we suggest it is perhaps better to characterise it as a “pseudoLagrangian” technique. To appreciate the reasons for this, consider the following example, which also highlights that this distinction is intimately related to issues of (suppressed) mixing in SPH.
Suppose gas is revolving in a thin disk on circular orbits, and imagine that the gas is partitioned at some instant into regions that are then represented by SPH particles, with the smoothing done over a circular area. Each SPH particle will comprise gas from an area around its centre, extending outwards in radius of order the local smoothing length, as shown for one such SPH particle in the left panels of Fig. 20 at time zero. Consider now how the system will look like a short time later, , after the disk has rotated by a small angle.
If the disk is in solid body rotation, as in the upper panels of Fig. 20, the fluid initially contributing to the SPH particle will occupy an area identical to the smoothing region of the SPH particle at time , as indicated by comparing the “true” situation in the upper middle panel with the SPH version in the upper right panel. In this case, the SPH representation is faithful to the actual equations of motion because the fluid contributing to the SPH particle at time zero is the same as that at , and the region bounded initially by the SPH smoothing volume has not changed its shape.
However, suppose instead that the disk is rotating differentially, turning around more rapidly in the inner parts than in the outer ones (lower panels of Fig. 20). The “true” situation shown in the middle panel tells us that the gas contributing to the area bounded by the SPH smoothing volume at time zero will be stretched out owing to shear. However, in the SPH formulation, shown in the lower right panel, the material initially in the SPH smoothing volume is forced to remain tied to this area and is not allowed to shear, inconsistent with the equations of motion of the real fluid.
A true Lagrangian picture would allow fluid elements to be deformed in the presence of shear, but this is inhibited by the SPH approach on scales smaller than those set by the smoothing procedure. Hence, while SPH retains some characteristics of a Lagrangian method, it does not evolve the fluid entirely in a manner consistent with the equations of motion and should, in this sense, be termed “pseudoLagrangian.”
Unfortunately, the error incurred by the approximations inherent to the SPH formalism is highly problem dependent and cannot be estimated based solely on the discretisation procedure. For example, there is no error made in the case of a disk in solid body rotation, or for a uniform disk rotating differentially, or for gas motions that are limited to expansion or contraction in threedimensions. However, this is not true for shearing flows involving fluids with distinct internal properties; here the implicit “remapping” involved in enforcing a spherical shape for the SPH smoothing volume, which would change the local composition of the internal properties, is ignored. This limitation is ultimately the reason why SPH does not handle mixing accurately, as we illustrate in the bottom, middle panel of Fig. 20. Suppose at the updated time we were to repartition the gas into new SPH particles, as indicated by the dashed circle in this frame. The gas associated with this particle should include gas from the original particle, but also gas from the surrounding flow. This would allow mixing to occur between the gas in the original particle and nearby parts of the flow, but is not allowed to occur in SPH, by construction.
We note that AREPO does not suffer from this defect. In this code, cells are not allowed to become arbitrarily distorted, in the interests of efficiency and accuracy, and so AREPO is also not formally Lagrangian. However, it still evolves fluids correctly when deviations from strict Lagrangian behaviour occur by allowing for mass exchange between cells, in a manner consistent with the equations of motion. Therefore, this method should be termed “quasiLagrangian.”
A related issue arises in the context of Nbody simulations of collisionless systems. There, a sixdimensional phase fluid is partitioned into particles of fixed size that move through space in a manner determined by the equations of motion. Because the phase fluid initially associated with a particular particle is always tied to that particle, the smallscale dynamics of the system will not be represented properly. However, in this case, unlike for the example discussed above, forces are strictly longrange and are less prone to inaccuracies in the smallscale distribution of the material than pressure gradients or viscosity. In that sense, the local distortions in the phase space fluid have fewer dynamical consequences than for a hydrodynamical fluid in three dimensions. Nevertheless, if an accurate description of the smallscale structure of collisionless systems is essential, these effects must be accounted for at some level (e.g. Vogelsberger et al., 2008; Vogelsberger & White, 2011b; Abel et al., 2011).
5.4 Discussion
We should emphasise that whether the limitations inherent to SPH lead to significant inaccuracies in the solution depend in detail on the circumstances. SPH can be expected to provide reliable results for flows that are kinetically dominated, or if the motions are controlled mainly by longrange forces. In these situations, errors in the local quantities are subdominant. For example, Bauer & Springel (2011) show that while SPH does not accurately describe subsonic turbulence, in the supersonic regime, when the flow energy is mainly kinetic, GADGET and AREPO give similar answers. Also, comparisons between SPH and AMR codes have yielded compatible results for the structure of the intergalactic medium, as reflected in the properties of the Lymanalpha forest (e.g. Regan et al., 2007). At these low densities, gravity dominates over internal energy and, moreover, the fluid is not subject to strong shear, so the above considerations are not critical. Even in some cases where shear is present, the approximations underlying SPH will not be significant, provided that gravitational forces are more important than local ones and the relevant evolution in flow properties occurs rapidly. Hayward et al. (2012, in preparation) have demonstrate this explicitly by simulating galaxy mergers using both GADGET and AREPO. They find that when the gas is treated using an effective equation of state, star formation rates during the course of a merger are nearly identical between these two codes. Thus, conclusions drawn from studies of galaxy collisions with SPH, such as the stellar profiles of merger remnants and their evolution (e.g. Hopkins et al., 2008, 2009, 2009, 2009) or the survivability of disks during mergers (Springel & Hernquist, 2005), are likely robust to variations in the hydro solver.
Unfortunately, this good level of agreement does not extend to the more complex flows associated with halos in cosmological environments, as we highlight in this paper. There, various phases of gas will be present in close proximity, often shearing relative to one another, leading to errors in simulations done with SPH like those we have identified here. Also, in the hydrostatic atmospheres of halos the hydrodynamic forces are comparable to gravity forces, and the subsonic turbulence present in these regions is affected by gradient errors in SPH. Without a formal convergence criterion, the consequences of these errors are difficult to assess, because even if a solution may plateau as the number of SPH particles is increased, this by no means guarantees that the correct answer is being obtained. In particular, the solution may exhibit “false convergence,” appearing to reach a converged solution, while fixed sources of error from e.g. summation interpolant are still present (Robinson & Monaghan, 2011). A case in point is the longstanding discrepancy identified for the central cluster entropy predicted by nonradiative SPH and meshbased simulations, first identified in the Santa Barbara cluster comparison project (Frenk et al., 1999).
The above considerations motivate thinking about ways to cure the defects inherent to SPH so that it can provide solutions of comparable accuracy and resolution to those obtained with a moving mesh approach like AREPO. For example, the artificial viscosity typically used in SPH codes to handle shocks could, in principle, be eliminated by incorporating a Riemann solver into SPH. Pioneering efforts along these lines have been made by Inutsuka (2002) and Murante et al. (2011). However, these implementations solve the Riemann problem for each particleneighbour pair, and this approach will become prohibitively expensive as long as the smoothing procedure requires averaging over an increasingly large number of neighbours to achieve convergence as the total particle number is increased. We note that alternatively a grid could be introduced into SPH just for the purposes of solving the Riemann problem around each particle. This grid would necessarily have to be adaptive, and hence resemble the Voronoi tessellation used in AREPO.
In order to eliminate the pseudoLagrangian character of SPH and allow for fluid elements to become distorted on scales small compared to a few smoothing lengths, as should occur in e.g. shearing flows, it is necessary to allow for mass exchange between particles in a manner consistent with the equations of motion. This would have the added benefit of eliminating the mixing problem in SPH. First suggestions in this direction have recently been made (Wadsley et al., 2008; Price, 2008; Read et al., 2010; Read & Hayfield, 2011), but the best way to formulate such mixing terms is not clear. One possibility to unambiguously calculate the required mass flux between particles would be to utilise some kind of grid. Again, this grid would need to be adaptive, like the unstructured mesh in AREPO, reinforcing the notion that our new movingmesh approach quite naturally addresses several of the generic issues with SPH.
6 Conclusions
In this study, we have presented the first cosmological hydrodynamical simulations of structure formation carried out with the novel AREPO movingmesh code. Compared to the widely employed SPH technique, this method promises an important gain in accuracy in the numerical treatment of gas dynamics for similar computational costs. In order to understand the implications of these differences for galaxy formation simulations, we have evolved the same initial conditions both with AREPO and the wellestablished SPH code GADGET. Both codes share the same highresolution gravity solver, and incorporate an identical radiative cooling and star formation model. Differences in the outcome are hence tied to systematics of the hydrodynamical solvers that are used. It is the primary goal of our paper to identify the main differences and their magnitude, as well as providing evidence for the primary sources of potential discrepancies.
To achieve this goal, we have considered a primary set of simulations in boxes of that include a basic treatment of galaxy formation physics, at various resolutions ranging from to particles/cells. In addition, we have carried out a few auxiliary simulations either in smaller boxes, or of nonradiative type, in order to more clearly measure particular numerical effects. For reasons discussed in Section 2.3, we have chosen to perform the comparisons between the codes at the same initial mass resolution, because then the computational costs are similar.
Our principle findings may be summarised as follows:

The overall distribution of gas in density and temperature is broadly in agreement between SPH and AREPO, but there are some subtle and important differences. In particular, the mean massweighted temperature in AREPO is slightly smaller than in GADGET. Also, the hot gas extends to somewhat lower density in GADGET and hence occupies a larger volume fraction.

The cosmic star formation rate density peaks at similar values, but at slightly lower redshifts in AREPO. At high redshift and at high resolution, the SFRs between the two codes are in good agreement, but towards lower redshift there are significantly stronger cooling flows in AREPO, causing a correspondingly larger star formation rate. This difference occurs primarily in haloes with mass larger than .

We find that the two codes differ significantly in the dissipative heating rates within haloes, with AREPO producing more dissipation in the halo infall regions, whereas SPH produces higher dissipative heating throughout most of the outer regions of virialised haloes. This difference is mainly responsible for the higher temperatures found in the SPH simulations, and the correspondingly weaker cooling. We argue that this dissipation in SPH is likely to be of spurious nature, and is a combination of viscous damping of SPH’s inherent noise and the unphysical damping of subsonic turbulence injected into haloes in the infall regions.

Visual comparison of simulated galaxies shows considerably larger stellar disks and more extended and disky stellar distributions in AREPO compared with GADGET. Also, the halo gas surrounding the galaxies looks smoother and less clumpy in AREPO. Fig. 21 shows gas surface density disk scale length radii obtained from exponential surface density fits of more than galaxies at in the halo mass range to . This clearly demonstrates that the gas disks forming in the AREPO simulations are typically significantly larger than those in the corresponding SPH runs. We further note that the gas surface density of disk galaxies in the AREPO simulations follow very closely exponential profiles, which is typically not the case for the SPH runs. A more detailed analysis is presented in Torrey et al. (2011). We note that the large disks in the AREPO simulations result even without strong feedback in the form of galactic winds or modifications of the star formation prescription. This is also demonstrated in simplified test simulations in Paper III. The larger extent of disk galaxies is also demonstrated in Paper II, where we do not assume exponential profiles, but rather use a cut in surface density to determine a characteristic size. We also independently calculated halfmass radii, which show the same trends. All these results confirm the findings reported in Fig. 21, i.e. that AREPO produces in general larger disks than GAGDET.
The above findings clearly suggest that there are significant quantitative differences caused in galaxy formation simulations by the choice of hydrodynamical technique. It appears that the limited accuracy of SPH for subsonic flow phenomena such as KelvinHelmholtz instabilities or subsonic turbulence induces important differences in the predicted properties of galaxies at low redshift, affecting both their morphology and stellar mass. Because of the accuracy problems of SPH for certain subsonic test problems (Agertz et al., 2007; Springel, 2010b, Bauer & Springel 2011) it appears that SPH cannot be expected to reach highly accurate results in all regimes relevant for cosmic structure formation. At the same time, AREPO performs significantly better on these same problems, and yields generally a smaller error norm and higher convergence rate when scrutinised against problems with known analytic solutions (Springel, 2010a, b). We are hence confident that the AREPO results presented in this paper entail a more faithful treatment of cosmological hydrodynamics, especially in view of the discussion in Section 5.3.
It is also reassuring that the AREPO runs improve the predicted morphologies of simulated galaxies, as our preliminary analysis suggests. In Paper II of this series we will back up this finding with a detailed analysis of the galaxy properties. Finally, in Paper III we provide a more detailed analysis of the relevant numerical effects responsible for the differences in mixing and cooling. The body of these results makes it clear that AREPO simulations provide a significant opportunity for the development of next generation models of galaxy formation that promise to achieve a sofar unknown combination of accuracy, dynamic range, and faithfulness to the relevant physics.
Acknowledgements
The simulations in this paper were run on the Odyssey cluster supported by the FAS Science Division Research Computing Group at Harvard University. Scaling tests were done on the Ranger cluster at the Texas Advanced Computing Center (TACC). We thank Paul Torrey for allowing us to show Fig. 21 and Elena D’Onghia, Patrik Jonsson, Andrew MacFadyen, Diego J. Munoz for useful discussions. We also thank the anonymous referee for helpful comments. DS acknowledges NASA Hubble Fellowship through grant HSTHF51282.01A. DK acknowledges support from NASA through Hubble Fellowship grant HSTHF51276.01A.
References
 Abel (2011) Abel T., 2011, MNRAS, 413, 271
 Abel et al. (2011) Abel T., Hahn O., Kaehler R., 2011, ArXiv eprints
 Agertz et al. (2007) Agertz O., Moore B., Stadel J., Potter D., Miniati F., Read J., Mayer L., Gawryszczak A., Kravtsov A., Nordlund Å., Pearce F., Quilis V., Rudd D., Springel V., Stone J., Tasker E., Teyssier R., Wadsley J., Walder R., 2007, MNRAS, 380, 963
 Agertz et al. (2011) Agertz O., Teyssier R., Moore B., 2011, MNRAS, 410, 1391
 Balsara (1995) Balsara D. S., 1995, Journal of Computational Physics, 121, 357
 Barnes & Hut (1986) Barnes J., Hut P., 1986, Nature, 324, 446
 Barnes & Hernquist (1992) Barnes J. E., Hernquist L., 1992, ARA&A, 30, 705
 Barnes & Hernquist (1996) Barnes J. E., Hernquist L., 1996, ApJ, 471, 115
 Barnes & Hernquist (1991) Barnes J. E., Hernquist L. E., 1991, ApJ, 370, L65
 Bate & Burkert (1997) Bate M. R., Burkert A., 1997, MNRAS, 288, 1060
 Bauer & Springel (2011) Bauer A., Springel V., 2011, ArXiv eprints
 Baugh (2006) Baugh C. M., 2006, Reports on Progress in Physics, 69, 3101
 Belytschko et al. (1998) Belytschko T., Krongauz Y., Dolbow J., Gerlach C., 1998, International Journal for Numerical Methods in Engineering, 43, 785
 Benson (2010a) Benson A. J., 2010a, ArXiv eprints, 1008.1786
 Benson (2010b) Benson A. J., 2010b, Phys. Rep., 495, 33
 Benson et al. (2000) Benson A. J., Cole S., Frenk C. S., Baugh C. M., Lacey C. G., 2000, MNRAS, 311, 793
 Berger & Colella (1989) Berger M. J., Colella P., 1989, Journal of Computational Physics, 82, 64
 Bertone et al. (2005) Bertone G., Hooper D., Silk J., 2005, Phys. Rep., 405, 279
 Blumenthal et al. (1984) Blumenthal G. R., Faber S. M., Primack J. R., Rees M. J., 1984, Nature, 311, 517
 Bond et al. (1991) Bond J. R., Cole S., Efstathiou G., Kaiser N., 1991, ApJ, 379, 440
 BoylanKolchin et al. (2010) BoylanKolchin M., Springel V., White S. D. M., Jenkins A., 2010, MNRAS, 406, 896
 Bryan & Norman (1998) Bryan G. L., Norman M. L., 1998, ApJ, 495, 80
 Cantalupo & Porciani (2011) Cantalupo S., Porciani C., 2011, MNRAS, 411, 1678
 Collins et al. (2010) Collins D. C., Xu H., Norman M. L., Li H., Li S., 2010, ApJS, 186, 308
 Cox et al. (2006) Cox T. J., Dutta S. N., Di Matteo T., Hernquist L., Hopkins P. F., Robertson B., Springel V., 2006, ApJ, 650, 791
 Crain et al. (2009) Crain R. A., Theuns T., Dalla Vecchia C., Eke V. R., Frenk C. S., Jenkins A., Kay S. T., Peacock J. A., Pearce F. R., Schaye J., Springel V., Thomas P. A., White S. D. M., Wiersma R. P. C., 2009, MNRAS, 399, 1773
 Croton et al. (2006) Croton D. J., Springel V., White S. D. M., De Lucia G., Frenk C. S., Gao L., Jenkins A., Kauffmann G., Navarro J. F., Yoshida N., 2006, MNRAS, 365, 11
 Cullen & Dehnen (2010) Cullen L., Dehnen W., 2010, MNRAS, 408, 669
 Davé et al. (2001) Davé R., Cen R., Ostriker J. P., Bryan G. L., Hernquist L., Katz N., Weinberg D. H., Norman M. L., O’Shea B., 2001, ApJ, 552, 473
 Davé et al. (1999) Davé R., Hernquist L., Katz N., Weinberg D. H., 1999, ApJ, 511, 521
 Di Matteo et al. (2008) Di Matteo T., Colberg J., Springel V., Hernquist L., Sijacki D., 2008, ApJ, 676, 33
 Di Matteo et al. (2005) Di Matteo T., Springel V., Hernquist L., 2005, Nature, 433, 604
 Diemand et al. (2008) Diemand J., Kuhlen M., Madau P., Zemp M., Moore B., Potter D., Stadel J., 2008, Nature, 454, 735
 Dolag et al. (2009) Dolag K., Borgani S., Murante G., Springel V., 2009, MNRAS, 399, 497
 Dolag et al. (2005) Dolag K., Grasso D., Springel V., Tkachev I., 2005, J. Cosmol. Astropart. Phys., 1, 9
 Dolag et al. (2004) Dolag K., Jubelgas M., Springel V., Borgani S., Rasia E., 2004, ApJ, 606, L97
 Dolag et al. (2005) Dolag K., Vazza F., Brunetti G., Tormen G., 2005, MNRAS, 364, 753
 D’Onghia et al. (2010) D’Onghia E., Springel V., Hernquist L., Keres D., 2010, ApJ, 709, 1138
 D’Onghia et al. (2010) D’Onghia E., Vogelsberger M., FaucherGiguere C.A., Hernquist L., 2010, ApJ, 725, 353
 D’Onghia et al. (2011) D’Onghia E., Vogelsberger M., Hernquist L., 2011, in prep
 Duffy et al. (2010) Duffy A. R., Schaye J., Kay S. T., Dalla Vecchia C., Battye R. A., Booth C. M., 2010, MNRAS, 405, 2161
 Dunkley et al. (2009) Dunkley J., Komatsu E., Nolta M. R., Spergel D. N., Larson D., Hinshaw G., Page L., Bennett C. L., Gold B., Jarosik N., Weiland J. L., Halpern M., Hill R. S., Kogut A., Limon M., Meyer S. S., Tucker G. S., Wollack E., Wright E. L., 2009, ApJS, 180, 306
 Eisenstein & Hu (1999) Eisenstein D. J., Hu W., 1999, ApJ, 511, 5
 FaucherGiguère et al. (2009) FaucherGiguère C., Lidz A., Zaldarriaga M., Hernquist L., 2009, ApJ, 703, 1416
 Frenk et al. (1999) Frenk C. S., White S. D. M., Bode P., Bond J. R., Bryan G. L., Cen R., Couchman H. M. P., Evrard A. E., Gnedin N., Jenkins A., Khokhlov A. M., Klypin A., Navarro J. F., Norman M. L., et al., 1999, ApJ, 525, 554
 Gao et al. (2005) Gao L., Springel V., White S. D. M., 2005, MNRAS, 363, L66
 Gingold & Monaghan (1977) Gingold R. A., Monaghan J. J., 1977, MNRAS, 181, 375
 Gnedin (1995) Gnedin N. Y., 1995, ApJS, 97, 231
 Governato et al. (2010) Governato F., Brook C., Mayer L., Brooks A., Rhee G., Wadsley J., Jonsson P., Willman B., Stinson G., Quinn T., Madau P., 2010, Nature, 463, 203
 Governato et al. (2007) Governato F., Willman B., Mayer L., Brooks A., Stinson G., Valenzuela O., Wadsley J., Quinn T., 2007, MNRAS, 374, 1479
 Guedes et al. (2011) Guedes J., Callegari S., Madau P., Mayer L., 2011, ArXiv eprints
 Guo et al. (2011) Guo Q., White S., BoylanKolchin M., De Lucia G., Kauffmann G., Lemson G., Li C., Springel V., Weinmann S., 2011, MNRAS, p. 164
 Heitmann et al. (2008) Heitmann K., Lukić Z., Fasel P., Habib S., Warren M. S., White M., Ahrens J., Ankeny L., Armstrong R., O’Shea B., Ricker P. M., Springel V., Stadel J., Trac H., 2008, Computational Science and Discovery, 1, 015003
 Hernquist (1987) Hernquist L., 1987, ApJS, 64, 715
 Hernquist & Katz (1989) Hernquist L., Katz N., 1989, ApJS, 70, 419
 Hernquist et al. (1996) Hernquist L., Katz N., Weinberg D. H., MiraldaEscudé J., 1996, ApJ, 457, L51
 Heß & Springel (2010) Heß S., Springel V., 2010, MNRAS, 406, 2289
 Hockney & Eastwood (1981) Hockney R. W., Eastwood J. W., 1981, Computer Simulation Using Particles. McGrawHill
 Hopkins et al. (2009) Hopkins P. F., Cox T. J., Dutta S. N., Hernquist L., Kormendy J., Lauer T. R., 2009, ApJS, 181, 135
 Hopkins et al. (2008) Hopkins P. F., Hernquist L., Cox T. J., Dutta S. N., Rothberg B., 2008, ApJ, 679, 156
 Hopkins et al. (2009) Hopkins P. F., Hernquist L., Cox T. J., Keres D., Wuyts S., 2009, ApJ, 691, 1424
 Hopkins et al. (2009) Hopkins P. F., Lauer T. R., Cox T. J., Hernquist L., Kormendy J., 2009, ApJS, 181, 486
 Iapichino et al. (2011) Iapichino L., Schmidt W., Niemeyer J. C., Merklein J., 2011, MNRAS, 414, 2297
 Inutsuka (2002) Inutsuka S.I., 2002, Journal of Computational Physics, 179, 238
 Jenkins et al. (2001) Jenkins A., Frenk C. S., White S. D. M., Colberg J. M., Cole S., Evrard A. E., Couchman H. M. P., Yoshida N., 2001, MNRAS, 321, 372
 Jubelgas et al. (2004) Jubelgas M., Springel V., Dolag K., 2004, MNRAS, 351, 423
 Jubelgas et al. (2008) Jubelgas M., Springel V., Enßlin T., Pfrommer C., 2008, A&A, 481, 33
 Katz et al. (1992) Katz N., Hernquist L., Weinberg D. H., 1992, ApJ, 399, L109
 Katz et al. (1996) Katz N., Weinberg D. H., Hernquist L., 1996, ApJS, 105, 19
 Katz et al. (1996) Katz N., Weinberg D. H., Hernquist L., MiraldaEscude J., 1996, ApJ, 457, L57
 Kauffmann et al. (1993) Kauffmann G., White S. D. M., Guiderdoni B., 1993, MNRAS, 264, 201
 Kennicutt (1998) Kennicutt Jr. R. C., 1998, ApJ, 498, 541
 Keres et al. (2011) Keres D., Vogelsberger M., Sijacki D., Springel V., Hernquist L., 2011, ArXiv eprints
 Klypin et al. (2010) Klypin A., TrujilloGomez S., Primack J., 2010, ArXiv eprints
 Komatsu et al. (2011) Komatsu E., et al., 2011, ApJS, 192, 18
 Lucy (1977) Lucy L. B., 1977, AJ, 82, 1013
 Mihos & Hernquist (1994) Mihos J. C., Hernquist L., 1994, ApJ, 437, 611
 Mihos & Hernquist (1996) Mihos J. C., Hernquist L., 1996, ApJ, 464, 641
 Mitchell et al. (2009a) Mitchell N. L., McCarthy I. G., Bower R. G., Theuns T., Crain R. A., 2009a, MNRAS, 395, 180
 Mitchell et al. (2009b) Mitchell N. L., McCarthy I. G., Bower R. G., Theuns T., Crain R. A., 2009b, MNRAS, 395, 180
 Monaghan (1992) Monaghan J. J., 1992, ARA&A, 30, 543
 Monaghan (2005) Monaghan J. J., 2005, Reports on Progress in Physics, 68, 1703
 Moore et al. (1999) Moore B., Ghigna S., Governato F., Lake G., Quinn T., Stadel J., Tozzi P., 1999, ApJ, 524, L19
 Moore et al. (1998) Moore B., Governato F., Quinn T., Stadel J., Lake G., 1998, ApJ, 499, L5
 Morris (1997) Morris J., 1997, Journal of Computational Physics, 136, 41
 Muñoz et al. (2012) Muñoz D., Springel V., Marcus R., Vogelsberger M., Hernquist L., 2012, ArXiv eprints
 Murali et al. (2002) Murali C., Katz N., Hernquist L., Weinberg D. H., Davé R., 2002, ApJ, 571, 1
 Murante et al. (2011) Murante G., Borgani S., Brunino R., Cha S.H., 2011, MNRAS, 417, 136
 Naab & Burkert (2003) Naab T., Burkert A., 2003, ApJ, 597, 893
 Nagamine et al. (2005) Nagamine K., Cen R., Hernquist L., Ostriker J. P., Springel V., 2005, ApJ, 627, 608
 Navarro et al. (1996) Navarro J. F., Frenk C. S., White S. D. M., 1996, ApJ, 462, 563
 Navarro et al. (1997) Navarro J. F., Frenk C. S., White S. D. M., 1997, ApJ, 490, 493
 Navarro et al. (2010) Navarro J. F., Ludlow A., Springel V., Wang J., Vogelsberger M., White S. D. M., Jenkins A., Frenk C. S., Helmi A., 2010, MNRAS, 402, 21
 O’Shea et al. (2004) O’Shea B. W., Bryan G., Bordner J., Norman M. L., Abel T., Harkness R., Kritsuk A., 2004, ArXiv Astrophysics eprints
 O’Shea et al. (2005) O’Shea B. W., Nagamine K., Springel V., Hernquist L., Norman M. L., 2005, ApJS, 160, 1
 Pearce et al. (1999) Pearce F. R., Jenkins A., Frenk C. S., Colberg J. M., White S. D. M., Thomas P. A., Couchman H. M. P., Peacock J. A., Efstathiou G., The Virgo Consortium 1999, ApJ, 521, L99
 Pen (1998) Pen U., 1998, ApJS, 115, 19
 Percival et al. (2010) Percival W. J., et al., 2010, MNRAS, 401, 2148
 Perlmutter et al. (1999) Perlmutter S., et al., 1999, ApJ, 517, 565
 Petkova & Springel (2010) Petkova M., Springel V., 2010, MNRAS, p. 1851
 Pfrommer et al. (2006) Pfrommer C., Springel V., Enßlin T. A., Jubelgas M., 2006, MNRAS, 367, 113
 Press & Schechter (1974) Press W. H., Schechter P., 1974, ApJ, 187, 425
 Price (2008) Price D. J., 2008, Journal of Computational Physics, 227, 10040
 Price (2012) Price D. J., 2012, Journal of Computational Physics, 231, 759
 Puchwein et al. (2008) Puchwein E., Sijacki D., Springel V., 2008, ApJ, 687, L53
 Rasio (2000) Rasio F. A., 2000, Progress of Theoretical Physics Supplement, 138, 609
 Read & Hayfield (2011) Read J. I., Hayfield T., 2011, ArXiv eprints
 Read et al. (2010) Read J. I., Hayfield T., Agertz O., 2010, MNRAS, 405, 1513
 Read et al. (2008) Read J. I., Lake G., Agertz O., Debattista V. P., 2008, MNRAS, 389, 1041
 Rees & Ostriker (1977) Rees M. J., Ostriker J. P., 1977, MNRAS, 179, 541
 Regan et al. (2007) Regan J. A., Haehnelt M. G., Viel M., 2007, MNRAS, 374, 196
 Riess et al. (1999) Riess A. G., Filippenko A. V., Li W., Treffers R. R., Schmidt B. P., Qiu Y., Hu J., Armstrong M., Faranda C., Thouvenot E., Buil C., 1999, AJ, 118, 2675
 Robertson et al. (2004) Robertson B., Yoshida N., Springel V., Hernquist L., 2004, ApJ, 606, 32
 Robinson & Monaghan (2011) Robinson M., Monaghan J. J., 2011, ArXiv eprints
 Scannapieco et al. (2011) Scannapieco C., et al., 2011, ArXiv eprints
 Scannapieco et al. (2008) Scannapieco C., Tissera P. B., White S. D. M., Springel V., 2008, MNRAS, 389, 1137
 Schaye et al. (2010) Schaye J., Dalla Vecchia C., Booth C. M., Wiersma R. P. C., Theuns T., Haas M. R., Bertone S., Duffy A. R., McCarthy I. G., van de Voort F., 2010, MNRAS, 402, 1536
 Schuecker et al. (2004) Schuecker P., Finoguenov A., Miniati F., Böhringer H., Briel U. G., 2004, A&A, 426, 387
 Schuessler & Schmitt (1981) Schuessler I., Schmitt D., 1981, A&A, 97, 373
 Sijacki et al. (2007) Sijacki D., Springel V., Di Matteo T., Hernquist L., 2007, MNRAS, 380, 877
 Sijacki et al. (2011) Sijacki D., Vogelsberger M., Keres D., Springel V., Hernquist L., 2011, ArXiv eprints
 Silk (1977) Silk J., 1977, ApJ, 211, 638
 Smith et al. (2008) Smith B., Sigurdsson S., Abel T., 2008, MNRAS, 385, 1443
 Springel (2005) Springel V., 2005, MNRAS, 364, 1105
 Springel (2010a) Springel V., 2010a, MNRAS, 401, 791
 Springel (2010b) Springel V., 2010b, ARA&A, 48, 391
 Springel et al. (2005) Springel V., Di Matteo T., Hernquist L., 2005, MNRAS, 361, 776
 Springel & Hernquist (2002) Springel V., Hernquist L., 2002, MNRAS, 333, 649
 Springel & Hernquist (2003a) Springel V., Hernquist L., 2003a, MNRAS, 339, 289
 Springel & Hernquist (2003b) Springel V., Hernquist L., 2003b, MNRAS, 339, 312
 Springel & Hernquist (2005) Springel V., Hernquist L., 2005, ApJ, 622, L9
 Springel et al. (2008) Springel V., Wang J., Vogelsberger M., Ludlow A., Jenkins A., Helmi A., Navarro J. F., Frenk C. S., White S. D. M., 2008, MNRAS, 391, 1685
 Springel et al. (2005) Springel V., White S. D. M., Jenkins A., Frenk C. S., Yoshida N., Gao L., Navarro J., Thacker R., Croton D., Helly J., Peacock J. A., Cole S., Thomas P., Couchman H., Evrard A., Colberg J., Pearce F., 2005, Nature, 435, 629
 Stone & Norman (1992) Stone J. M., Norman M. L., 1992, ApJS, 80, 753
 Tasker et al. (2008) Tasker E. J., Brunino R., Mitchell N. L., Michielsen D., Hopton S., Pearce F. R., Bryan G. L., Theuns T., 2008, MNRAS, 390, 1267
 Teyssier (2002) Teyssier R., 2002, A&A, 385, 337
 Toomre & Toomre (1972) Toomre A., Toomre J., 1972, ApJ, 178, 623
 Torrey et al. (2011) Torrey P., Vogelsberger M., Sijacki D., Springel V., Hernquist L., 2011, ArXiv eprints
 Trac et al. (2007) Trac H., Sills A., Pen U., 2007, MNRAS, 377, 997
 Valdarnini (2011) Valdarnini R., 2011, A&A, 526, A158
 Vazza et al. (2009) Vazza F., Brunetti G., Kritsuk A., Wagner R., Gheller C., Norman M., 2009, A&A, 504, 33
 Viel et al. (2009) Viel M., Bolton J. S., Haehnelt M. G., 2009, MNRAS, 399, L39
 Viel et al. (2005) Viel M., Lesgourgues J., Haehnelt M. G., Matarrese S., Riotto A., 2005, Phys. Rev. D, 71, 063534
 Vogelsberger et al. (2009) Vogelsberger M., Helmi A., Springel V., White S. D. M., Wang J., Frenk C. S., Jenkins A., Ludlow A., Navarro J. F., 2009, MNRAS, 395, 797
 Vogelsberger & White (2011a) Vogelsberger M., White S. D. M., 2011a, MNRAS, p. 168
 Vogelsberger & White (2011b) Vogelsberger M., White S. D. M., 2011b, MNRAS, 413, 1419
 Vogelsberger et al. (2008) Vogelsberger M., White S. D. M., Helmi A., Springel V., 2008, MNRAS, 385, 236
 Wadsley et al. (2008) Wadsley J. W., Veeravalli G., Couchman H. M. P., 2008, MNRAS, 387, 427
 Weinberg et al. (2004) Weinberg D. H., Davé R., Katz N., Hernquist L., 2004, ApJ, 601, 1
 White & Frenk (1991) White S. D. M., Frenk C. S., 1991, ApJ, 379, 52
 White & Rees (1978) White S. D. M., Rees M. J., 1978, MNRAS, 183, 341
 Wiersma et al. (2009) Wiersma R. P. C., Schaye J., Theuns T., Dalla Vecchia C., Tornatore L., 2009, MNRAS, 399, 574
 Zel’Dovich (1970) Zel’Dovich Y. B., 1970, A&A, 5, 84
Appendix A Voronoi mesh statistics
Here we present some basic geometric properties of the Voronoi mesh in our cosmological simulations. As discussed in Section 2, it is beneficial for the accuracy of AREPO for the Voronoi mesh to remain as regular as possible throughout the simulation. To this end, the motion of the mesh vertices is slightly modified for highly distorted cells in order to obtain a mesh that has more “roundish” cells and is closer to a centroidal Voronoi tessellation where geometrical the centres of the cells coincide with the vertex points of the individual Voronoi cells.
We can quantify the quality of a mesh in a similar way as was done in S10. There, mesh quality indicators were calculated for a nonradiative simulation of a massive cluster (the “Santa Barbara Cluster”, Frenk et al., 1999), and it was demonstrated that the mesh indeed preserved its desired properties of reasonably round cells during the simulation. However, in this paper we have used a modified regularisation scheme based on the maximum faceangle, and we include cooling, star formation, and feedback, which drastically increases the dynamic range the simulations need to address. It is therefore important to verify whether our regularisation scheme performs sufficiently well in these more demanding simulations.
In Fig. 22, we first show in the top left panel the maximum face angle distribution at for all AREPO simulations. Our new regularisation scheme should guarantee that these angles do not get too large. We have set up the regularisation such that the mesh is steered towards face angles smaller than . As the upper left panel of Fig. 22 demonstrates, the mesh indeed avoids very large face angles with a maximum in the distribution around the desired value, i.e. the regularisation works as expected. The remaining three panels of Fig. 22 show distribution functions for the number of faces of the Voronoi cells, for the distance of meshgenerating points to the geometric centres of their cell in units of the fiducial cell radius of each cell, and finally, for the parameter which measures how “roundish” each cell is (here is the total surface area of a cell). The corresponding distributions for a random Poisson sample of vertex points are shown as black lines in all panels, for comparison. Note that we do not explicitly control the offset in our regularisation scheme, but the face angle criterion is of course correlated with this quantity. This allows us to obtain also quite small values for the distance of the geometric centre of the cell from the location of the mesh generating points. This is important for the numerical accuracy of the linear reconstruction. Overall, our Voronoi mesh is significantly “rounder” and much closer to a centroidal Voronoi tessellation than the mesh of a random point distribution.
The goal of our re and derefinement strategy is to keep all cell masses close to a predefined target cell mass, which we have chosen to be the total baryonic mass in the box divided by the total number of cells at the initial time. In this way, a narrow mass spectrum for the formed stellar particles is obtained, and a straightforward and evenhanded comparison to the SPH calculations done with GADGET becomes possible. As we have described in Section 2, we guarantee the approximately constant mass resolution by splitting cells that reach a mass larger than into two cells, while cells dropping in mass below are dissolved. We note that thanks to the Lagrangian mesh motion in AREPO, these re/derefinement operations are invoked only rarely. In the top panel of Fig. 23 we demonstrate that the re and derefinement scheme successfully keeps the cell masses in the desired mass range, yielding approximately a normal distribution around the target mass within the desired bounds.
The mass distribution of stellar particles at is shown in the bottom panel of Fig. 23. Since our star formation implementation does not allow star particles to be formed with a mass larger than , the stellar particle mass distribution is cut off at that value. On the other hand, the derefinement operations are constrained by the requirement that neighbouring cells cannot both be derefined in the same time step. As a result, it is possible that some cells can temporarily have masses below for a few time steps, and hence also some star particles with masses smaller than this value may form in principle. To protect against the formation of unreasonably low mass star particles, which may be prone to twobody effects, we however do not allow cells with mass less than to form any star particles, imposing a lower limit in the star particle mass distribution. As the bottom panel of Fig. 23 shows, star particles with masses between and make up only a tiny fraction, as desired, such that the suppression of the formation of extremely small star particles does not lead to any appreciable error. Note that starforming cells with a mass above will form stars with exactly , keeping the rest of the gas mass in the cell. This explains the small spike in the distribution at that mass value. Our analysis thus confirms that star particles with a narrow range of masses are formed, which helps to limit twobody relaxation effects and is well matched to the fixed gravitational softening we use for collisionless particles.
Appendix B Gravitational softening and regularisation scheme
The AREPO simulations presented in this paper use an adaptive gravitational softening length with a lower floor for the gas, whereas the SPH simulations done with GADGET employ a fixed comoving softening length equal to the lower floor. We checked that this does not bias our results in any way. To demonstrate this point we show in Fig. 24 the star formation history of the standard A_L20n128 simulation together with a run where we held the gravitational softening of the cells fixed at the same value as in the SPH calculations. Clearly this leads only to very minor differences in the star formation history. We also checked that disk half mass radii are not affected by this. As stated above this is due to the fact that star forming gas is pressurised by an effective EOS of the ISM and therefore gravitational softening effects do not play an important role in that regime.
The mesh regularisation in AREPO can be done in different ways and we have used a scheme that is based on the maximum face angle as described above. We have also done simulations with the original regularisation scheme presented in Springel (2010a), which is based on the displacement of the geometric centre of the cell. The resulting star formation history for the A_L20n128 is also shown in Fig. 24. Again this leads only to minor changes. We note that this difference becomes smaller with resolution and essentially vanishes. Regularisation scheme differences are only relevant for very low resolution simulations. Disk sizes are also not significantly affected by the details of the regularisation scheme. In Fig. 25, we show gas density maps of a matched object in A_L20n128 and the A_L20n128 run, which features an alternative regularisation scheme. We stress again that L20n128 is our lowest resolution setup, and therefore most sensitive to details of how exactly the mesh is treated. But even in this regime the differences in the gas density maps are very small, and the size and overall extent of the gas disk do not change. Note that due to the low resolution of L20n128, individual Voronoi cells are clearly visible in the map, highlighting the changes in mesh geometry and cell shapes induced by different regularisation schemes.
Appendix C Code performance
The total runtime of our AREPO simulations is in the worst case only about longer than the corresponding GADGET run at the same nominal resolution. Fig. 26 shows in which code parts most of this CPU time is spent. Obviously, the rather complex mesh construction and mesh update in AREPO takes up a significant amount of time. However, in both codes the TreePM based gravity calculation consumes a large amount of time, too, alleviating speed differences in the hydrodynamical solvers. In any case, we argue that AREPO is actually surprisingly fast given the extremely complex mesh management operations that are required. Given the small difference in raw speed, one can rightly describe the overall performance of both codes as similar. However, accounting for the fact that we here find that SPH leads to a systematic and significant offset in the cooling rates of haloes at late times, as well as in galaxy sizes, the discussion of CPU time requirements at the same nominal resolution is a bit moot. Ideally one would like to select the fastest method for a given desired accuracy, and as it appears, this can be reached with AREPO more efficiently, whereas standard SPH’s systematic bias may be difficult or even impossible to overcome. The white parts in Fig. 26 account for the time spent in the time integration, star formation, cooling, and other minor contributions. We note that the I/O contribution is a bit larger than typical, because we wrote out snapshots at a very high frequency.
Appendix D Strong and weak scaling
In Fig. 27, we show a strong scaling test of AREPO, from to cores, carried out on Ranger at the Texas Advanced Computing Center (TACC). Here, the simulation size has been kept constant at , and only the number of compute cores has been increased. The reported times have been averaged for three full timesteps at high redshift. In this regime, gravity accounts for of the computational time, meshconstruction and mesh bookkeeping consumes of the time, while the calculation of the hydrodynamical fluxes amounts to of the time. The remainder is needed for miscellaneous items such as domain decomposition, radiative cooling and star formation. In order to clearly show the scalability of the most important different parts of our code, we have included separate measurements in Fig. 27 for longrange gravity (by means of FFTs), shortrange gravity (done through a treewalk), the necessary tree construction, the mesh construction and management, the hydrodynamic flux calculations, and the domain decomposition. We see that the code shows quite good strong scaling, despite the tightly coupled nature of the system. Some losses are in particular apparent for the mesh construction. These arise because the more cores are used, the larger the number of spatial domains in which the simulation volume is cut. This enlarges the surface area of domain boundaries, and as a result the cumulative “ghost” region volume in which the mesh has to be constructed twice on two neighbouring processes to ensure seamless consistency across the domain boundary.
For largescale cosmological applications it is more important that the code parallelisation shows good weak scaling behaviour. GADGET has been used for many largescale simulations and we will demonstrate here that AREPO performs also well in weak scaling tests. For these tests the computational load per core is kept constant, and the simulation volume and particle/cell number is scaled accordingly. In Fig. 28, we show two different weak scaling series carried out on Ranger at the TACC. They differ in their particle and cell load per core. In the results shown in the top panel, this load was ‘maximal’, corresponding to million cells and particles per core, where the current version of AREPO requires up to 1600 MB memory consumption per core in the peak. Because some memory is needed for the operating system and the MPI communication library, not all of the physical memory is available for the application code on the Ranger platform. Unfortunately, the amount of memory consumed by the MPI subsystem increases with increasing size of the MPI partition. This in fact prevents us from running the “full load” configuration for partitions larger than 4096 cores. This also prompted us to create a second weak scaling series shown in the bottom panel of Fig. 28, where we reduced the load by almost a factor of 2. This allows us to scale AREPO up to cores with MPI tasks.
It is apparent that the shortrange gravitational tree calculation, the Voronoi mesh construction, and the hydrodynamical flux calculation show excellent weak scaling (which runs horizontally in the plots of Fig. 28). However, there are deviations from perfect scalability for the FFTbased longrange gravitational force calculation, the domain decomposition, and the tree construction. This is primarily because all three of these parts involve substantial alltoall communication. For larger processor counts, especially for cores, these communication costs start to affect the weak scalability of our code and lead to losses in efficiency. One reason for the suboptimal scaling in this regime lies in the FFT part of the code. The slabbased parallelisation of the FFTs we use for the longrange gravity calculation does not scale to core numbers larger than the size of the FFT, a regime we enter once more than 4096 cores are used (because the FFTsize reaches at this point and cannot be grown further due to memory constraints). We note that this can easily be optimised by either using a blockstructured FFT decomposition or by using a mixed approach for the parallelisation with either threads or OpenMP. Overall we find that AREPO shows good weak scaling behaviour and can readily be applied to largescale cosmological hydrodynamics simulations.