Genetically modified halos: towards controlled experiments in \LambdaCDM galaxy formation

Genetically modified halos: towards controlled experiments in CDM galaxy formation

Nina Roth, Andrew Pontzen, Hiranya V. Peiris
Department of Physics and Astronomy, University College London, Gower Street, London, WC1E 6BT, UK
Accepted —. Received —; in original form —

We propose a method to generate ‘genetically-modified’ (GM) initial conditions for high-resolution simulations of galaxy formation in a cosmological context. Building on the Hoffman-Ribak algorithm, we start from a reference simulation with fully random initial conditions, then make controlled changes to specific properties of a single halo (such as its mass and merger history). The algorithm demonstrably makes minimal changes to other properties of the halo and its environment, allowing us to isolate the impact of a given modification. As a significant improvement over previous work, we are able to calculate the abundance of the resulting objects relative to the reference simulation.

Our approach can be applied to a wide range of cosmic structures and epochs; here we study two problems as a proof-of-concept. First, we investigate the change in density profile and concentration as the collapse time of three individual halos are varied at fixed final mass, showing good agreement with previous statistical studies using large simulation suites. Second, we modify the mass of halos to show that our theoretical abundance calculations correctly recover the halo mass function. The results demonstrate that the technique is robust, opening the way to controlled experiments in galaxy formation using hydrodynamic zoom simulations.

galaxies: evolution — galaxies: formation — cosmology: dark matter

1 Introduction

Understanding galaxy formation requires us to take account of the variety of halo assembly histories that build today’s population. Many pressing questions — such as the origin of varying morphologies (e.g. van Dokkum et al., 2013; Papovich et al., 2015) and bulge sizes (Kormendy, 2015) — will be answered by understanding the interplay between complex, non-linear physics and the various histories for mass accretion. The fundamental difficulty is that these histories are in turn determined by the random initial conditions seeded in the early universe.

This paper is the first in a series to directly tackle that problem using a novel approach. The most typical solution is to simulate large numbers of galaxies in a representative volume (e.g. Genel et al., 2014; Schaye et al., 2015; Codis et al., 2015). However, this is computationally expensive and limits the resolution that can be achieved for any single object. Conversely, zoom-in simulations achieve the maximum level of physical detail for a given computational time. They have been used to establish that qualitatively different processes come into play at sub-kpc resolutions, where processes within the interstellar medium begin to be resolved (e.g. Governato et al., 2007; Guedes et al., 2011; Brook et al., 2011; Hopkins et al., 2013; Pontzen & Governato, 2014). But such approaches only sample over a small and potentially biased range of merger histories. A third tactic is to use isolated, idealised set-ups to test particular hypotheses (e.g. Naab et al., 1999; Robertson et al., 2006; Hopkins et al., 2009); but these by definition lack a full cosmological environment. Thus, it is difficult to quantitatively connect the results of isolated and zoom simulations to the observed galaxy population.

Our aim is to combine the best aspects of these three types of numerical study. We proceed by systematically changing aspects of individual galaxies’ histories (such as mass and merger history) within a cosmological simulation, while keeping track of the statistical likelihood of the changes to understand the relative abundance of objects of different types. This can be achieved by using the Hoffman-Ribak algorithm (Hoffman & Ribak 1991, hereafter HR91; see also Bardeen et al. 1986; Bertschinger 1987; van de Weygaert & Bertschinger 1996 for further theoretical background). A more common use for HR91 is to obtain simulations resembling the local universe by turning a given observational dataset (e. g. the local distribution of galaxies) into a prescription for the initial conditions of a numerical simulation (Bistolas & Hoffman, 1998; Mathis et al., 2002; Kravtsov et al., 2002; Klypin et al., 2003; Heß et al., 2013; Jasche & Wandelt, 2013; Sorce et al., 2014). There is a significant literature that uses this technique to study the formation history of the Local Group (Zavala et al., 2009; Klimentowski et al., 2010; Libeskind et al., 2010; Iliev et al., 2011; Forero-Romero et al., 2011; Kitaura, 2013; Doumler et al., 2013; Dayal et al., 2013; Nuza et al., 2014; Brook et al., 2014).

Instead, we propose to use constrained initial conditions as an experimentation toolkit for the formation of a particular halo embedded in a cosmological volume. This approach has precedent: for example, Frenk et al. (1999) used the HR91 method to create galaxy cluster initial conditions for a comparative study of numerical simulation codes. More recently, Romano-Díaz et al. (2006, 2007) and Hoffman et al. (2007) simulated a single dark matter object of with different substructures to understand the impact of quiescent and violent accretion phases on the inner properties of the halo, and the origin of the universal halo density profile. By including baryonic physics, Romano-Díaz et al. (2011a, b, 2014) studied galactic properties in extremely overdense regions which may host the early precursors of QSOs. In a similar vein, Dubois et al. (2012) use the numerical implementation from Prunet et al. (2008) to investigate the accretion of material in the cores of very massive halos to shed light on the formation of black holes at high redshifts.

In all of the above cases, a simulated object was created by constraining the properties of a region defined by an analytical profile (typically a Gaussian peak). Constrained properties included the height of the density peak at the origin and its first- and second-order derivatives (see e. g. Prunet et al. 2008 and the Appendix of Romano-Díaz et al. 2011a). This creates objects that are well-defined in a theoretical sense (e. g. one can predict their collapse time reasonably well), but that represent configurations which may or may not be common in fully random initial conditions (ICs).

What sets our work apart from these previous efforts is that we always start with a ‘reference’ halo from a simulation based on fully random ICs. We are able to impose constraints on volumes of completely arbitrary shape, using the particles that make up a single dark matter halo embedded in a cosmological volume. Once the constraints are applied, we re-run the simulation and compare the results to the original reference run.

This has two immediate benefits. First, we can fine-tune selected properties of the halo while demonstrably ensuring that the constrained object is as similar as possible to the reference run — a controlled ‘genetic modification’ (GM) of the halo. Second, we can calculate the change in the likelihood of the field after the modification; in other words, we can assess the relative abundance of the genetically-modified systems compared to the original. This will allow us to test whether connections between merger history and morphology quantitatively account for observed population statistics.

The current work provides a first illustration of both these aspects of the technique. Specifically, we study the properties of several halos as their total mass and merger history are systematically changed. We investigate the concentration at for different mass accretion histories and find overall excellent agreement of our constrained halos with relations derived from statistical averages over large simulations. There are many studies that connect the concentration parameter to other halo properties like the mass, collapse time or mass accretion history, halo shape and angular momentum (e. g. Bullock et al., 2001; van den Bosch, 2002; Wechsler et al., 2002; Zhao et al., 2003; Reed et al., 2005; Bett et al., 2007; Macciò et al., 2007; Neto et al., 2007; Duffy et al., 2008; Macciò et al., 2008; Zhao et al., 2009; Ragone-Figueroa et al., 2010; Prada et al., 2012; Ludlow et al., 2013, 2014; Klypin et al., 2014; Correa et al., 2014, 2015a, 2015b). Often, these studies operate by considering a statistical sample from a large volume simulation to find correlations and provide fitting functions. Even though the statistical power in recent simulations is excellent, there is still considerable scatter around the median relations. Since the density profile of dark matter halos is an important ingredient in theoretical models, it is important to understand these correlations and the scatter. Given the large number of parameters that could influence the evolution of a halo, principal component analysis has been used to investigate correlations between them (Skibba & Macciò, 2011; Jeeson-Daniel et al., 2011; Wong & Taylor, 2012). Our approach of designing ‘experiments’ in galaxy formation provides a complementary approach to computationally expensive statistical studies.

This paper is organised as follows: in Sec. 2 we give a brief outline of the HR91 technique and our specific implementation. Section 3 contains details of the numerical simulations that are used to obtain the results in the rest of the paper. In Sec. 4 we provide a brief illustration of some of the constraints we have applied to the reference initial density field, focusing on influencing a single halo traced by its particles. Next, we study the results of designing different merger histories for a set of halos in Sec. 5, focusing on their collapse-concentration relation. In Sec. 6 we discuss a method for assessing the relative abundance of the modified halos by defining a measure, and show that our results are consistent with the cosmological halo mass function. We summarise in Sec. 7. Finally, Appendices A and B contain the mathematical details of our reformulation of the HR91 technique including a translation between our notation and theirs.

2 Outline of the Method

We now present a brief outline of the mathematical technique by which initial conditions can be generated that satisfy certain constraints, while remaining consistent with a CDM power spectrum. This technique is described in a slightly different formulation by HR91. The full derivation can be found in Appendix A.

By assumption, the density field in the early universe is linearly perturbed around a background density , so that


where is a Gaussian random field with statistical properties specified by the CDM transfer function and inflationary tilt.

Generating ICs involves sampling the Gaussian random field at a list of discrete points , where the integer value decides which point we are discussing. In particular, when running a uniform-resolution cosmological simulation with particles on a box side, runs from to . The sampled field then consists of an -length vector , where an element is given by . The values of are drawn from a multivariate Gaussian probability distribution with mean and covariance matrix . Cosmological initial conditions have zero mean, , but the HR91 technique is not limited to this case.

A constrained field is defined by requiring for some constraint vector , which also contains elements. In general, is real, though the formalism also extends to the case where it is complex-valued. A simple example would be to fix the density contrast to 0 at position . This requires for (before normalisation, see below) and 0 otherwise, and . Throughout, we will use Greek indices in this way to denote values of either or at a specific grid position .

To actually create a field satisfying any given constraint, one could sample repeatedly from the underlying population until obtaining a realization that satisfies (or is close to satisfying) the requirement. However, such an accept-reject algorithm would be computationally expensive to implement in practice; instead, the HR91 technique makes a mathematical rearrangement that requires only one set of random numbers to be generated. As detailed in Appendix A, this rearrangement also shows that a constrained Gaussian random field remains Gaussian. This allows us to apply a large number of constraints independently, with the final result obeying for each where the Roman index denotes the different constraints. The properties of the constrained field are then determined by a new mean and covariance


provided that the have been orthonormalized111Note that this orthonormalization can always be arranged for any set of non-conflicting original constraints. in the sense that .

Name Halo Constraint Section
Reference N/A none 3
H24-MH Halo 24 4, 5
H24-MH* Halo 24 4, 5
H37-MH Halo 37 4, 5
H37-MH* Halo 37 4, 5
H40-MH Halo 40 4, 5
H40-MH* Halo 40 4, 5
H24-mass Halo 24 4, 6
H37-mass Halo 37 4, 6
H40-mass Halo 40 4, 6
Table 1: Overview of the simulations used in this paper. For more details on the individual runs, see the text in the sections mentioned in the last column. MH and MH* stand for the two different ways of constraining the merger history of the halo. The simulations marked in bold are not actually used because they are not in equilibrium at (see Sec. 5 for a discussion).

A realization of the constrained field could therefore be obtained by calculating and and drawing random numbers accordingly. However, in practice, dealing directly with from Eq. (3) becomes prohibitively expensive for large . The problem is that, whereas is the CDM power spectrum and therefore diagonal in Fourier space, is generally not sparse in either pixel or Fourier space. Instead, one can make the ansatz that a realization obeying constraints, , can be obtained starting from a realization of the unconstrained field, , via


where is a matrix that depends on and .

By requiring that obeys the correct statistics and, additionally, requiring that the changes made to the field are minimal, one can uniquely derive the HR91 solution for . The details are given in Appendix A, with the result that


where we have defined to represent the value of the constrained quantities in the unconstrained realization, and again require that the are orthonormalized. This reduces the actual calculation to a series of vector multiplications and summations in Fourier-space (since is diagonal there). The memory requirements are managable since we need only store vectors of length , instead of the matrix .

Consequently, a continuum of constrained realizations can be generated from a single realization of the original ensemble. We select a single dark matter halo from a reference run at , and then return to the initial conditions and place constraints on the particles that make up this object. In this way our constraint regions are defined directly via the halo particles, without requiring any assumptions about the properties of density peaks in the initial conditions, or any type of direct smoothing of the density field (only indirectly through the halo finding at ).

3 Simulation setup

All our simulations were run with P-Gadget-3 (Springel, 2005; Springel et al., 2008). The initial conditions have been set up at redshift and evolved to , saving 100 snapshots from equally spaced in scale factor. The cosmological model is , , , , , , corresponding to a WMAP5 cosmology (Dunkley et al., 2009). While these cosmological parameters have been revised in more recent datasets, they allow an easier comparison of our results with the literature. All simulations have a (comoving) box size of , and dark-matter particles, resulting in a particle mass of . The Plummer equivalent force softening length (which limits the smallest accessible scales) is kpc in comoving units.

We use the subfind code (Springel et al., 2001), which finds halos with the friends-of-friends (FoF) method. subfind also identifies subhalos inside the top level FoF groups, but we always use the whole group for particle tracking. Each FoF group is assigned a unique number, sorted in descending order by mass. subfind also provides a list of halo particle IDs, which allows us to track the halos between snapshots and across different simulations (by selecting those objects which have the most particles in common at ). We choose the standard FoF linking length of 0.2 times the mean interparticle distance.

Our analysis makes use of the Python module pynbody (Pontzen et al., 2013). We select halos with mass at , which are well-resolved but not the most massive (and therefore rare) objects in the box. Throughout this paper, refers to the halo mass contained in , the radius within which the mass density is 200 times the critical density of the universe at that time. We construct halo merger trees by tracing halo particles from backwards in time through each simulation snapshot. This allows us to determine the mass accretion history and other internal properties of each object as a function of time. We take advantage of the fact that the density field is first set up by assigning one particle to each grid node (the displacements are applied later). This means that any particle at can be traced back to a grid position in the initial conditions, and no additional interpolation is necessary to calculate .

Table 1 gives an overview of the different runs that are used in this paper; the last column provides the section where the constraints are described and the results are discussed. In order to show that the technique is robust, we have selected three halos of similar mass in the reference run, and constrained their properties in different ways. Therefore each simulation name contains a halo number and constraint type.

Figure 1: Left panel: The density of the reference ICs (black circles) and modified H40-MH-2 ICs (red crosses) for the early collapse constraint, where the density of the 10% innermost particles is increased by a factor of 2. The slice is 5 kpc wide in the - and -coordinates, to give an impression of the 3D structure. Each symbol corresponds to a single particle/initial grid point. The constrained density field maintains the complicated (sub-)structure that was present in the reference run. Right panels: The same two ICs as a 2D projection in the -plane. Only those particles that form each halo at are shown here; it is these particles that are used for generating the constraint in our algorithm. The higher central density is clearly visible in the constrained case. The results of these simulations will be discussed in detail in Sec. 5.


4 Illustration of constraints

We now present a simple illustration of the technique with which we generate a density constraint. We will discuss the results obtained from running simulations with these constraints in the following sections.

Our approach constrains the actual Lagrangian region that collapses into a halo at ; by contrast, in previous work the constraints were typically chosen to follow some analytical form, in order to connect the constraints to theoretical models such as Press-Schechter theory (e. g. van de Weygaert & Bertschinger 1996, Romano-Díaz et al. 2006). Since we know which particles are going to collapse in the reference run, we do not need to assume a specific form for the peak or a smoothing scale. The resulting constrained halo will be very similar to the reference object, unless the constraint radically changes the collapsing region, e. g. by introducing a large overdensity in a region which only forms an intermediate mass halo in the reference run.

Designing the constraints for a given modification to the final halo requires a physical understanding of the evolution. Ultimately a proposed constraint must be tested by trialling the changes and testing that they have the desired effect and that they are statistically consistent with the modified halo existing in the unconstrained universe. We will demonstrate both of these properties over the remainder of the paper.

Changing the mass can be achieved by changing the density contrast of the halo particles in the initial conditions. By creating a larger or smaller overdensity, we influence the final mass by increasing or decreasing the overall size of the region which has the average threshold density to collapse by a specified redshift (Press & Schechter, 1974). We term this a density constraint: in the initial conditions, we calculate the average mass overdensity of all particles in the reference halo


where we again use the fact that each particle corresponds to a grid position . Before orthonormalization, the value of the constraint vector is then for each particle which belongs to the halo, and otherwise. The density can now be increased or decreased by enforcing the value of . Results of simulations with different choices for that produce halos with higher or lower mass at will be used in Sec. 6.

More specifically, according to the Press-Schechter argument, the collapse time of a halo is related to its peak height , where is the variance of the density field smoothed on a scale . Therefore by fine-tuning the overdensity on different scales within the initial conditions we can modify the accretion history. In particular, by increasing (or decreasing) the density contrast in an inner region of the halo, while requiring that the overall density contrast of the halo particles stays the same, we are able to generate a halo with very similar mass at , but a faster (or slower) accretion history.

Figure 2: Left panel: Mass accretion history for early (red solid) and late collapse (blue dashed) runs, expressed by the FoF mass (all particles assigned by the halo finder). The black solid line with points shows the same halo in the reference run; each point is one snapshot, illustrating the time resolution of our simulations. Right panel: same but for the virial mass , which does not converge to a common value at late times because probes the inner regions of the halo (see Figure 3), which are affected by the collapse time.

Figure 1 illustrates an example of such a constraint acting on the initial conditions. In the left panel, we show a slice through the dark matter density field in the initial conditions, centred on the halo’s centre of mass. Each black circle corresponds to a density value in the reference run, and the red crosses show the same position in the constrained run. The slice is 5 kpc wide in the - and -coordinates, to give an impression of the 3D structure. Here, we show the constraint that will be used in the ‘early’ run (discussed in detail in Sec. 5), where we have increased the density of the innermost 10% of halo particles to be a factor of two higher, while keeping the overall density the same. An alternative approach is to actually identify substructures at an intermediate redshift and apply the ‘inner’ constraints to those specific particles. We have tried this second approach for the present work, selecting the particles which have already collapsed around by constructing a merger tree from the subfind output in the reference run. Table 1 contains an overview of all simulations used in this paper; we denote the first method of constraining the inner region with and the second by there. The two modification methods give near-identical results (Sec. 5), making the outcomes reassuringly insensitive to the intuition guiding the modifications. We have found that it is also possible to add further constraints to modify the build-up in different subhalos and so fine-tune the accretion history to any required degree.

As explained in Sec. 2, the modified field is constructed to follow the peaks and troughs of the underlying density field, thereby maintaining the same substructure as much as possible. In the right panel, we show the 2D projection (-plane) of the density of halo particles in the initial conditions, again for both the reference run and constrained run. In both cases, the density is calculated for all particles that are part of the halo at , which have been traced back to . The effect of increasing the density in the innermost region can be clearly seen in the constrained run (red border). In addition the second constraint, which keeps the overall mass the same, leads to a compensation effect in the initial conditions, removing some particles in the outer regions which fall into the reference halo but not the constrained one. We will discuss the results of simulations with these modified initial conditions in the next section.

5 Merger history and concentration-collapse relation

So far we have looked at how applying various constraints modifies the initial linear overdensity field. We will now consider the changes that result when the new initial conditions are used in a numerical simulation, starting with our modified merger history.

The mass accretion histories for simulations with the H40-MH-2 ‘early’ (circles) and H40-MH-0.5 ‘late’ collapse constraint (crosses) are shown in Fig. 2. Here, we chose the innermost 10% of particles and changed their overdensity by a factor of 2 (0.5) for the early (late) collapse cases. In the left panel we show the time evolution of the total mass (including all substructures) for the two constrained runs and the reference halo. It is clear that the accretion rates differ quite significantly at early times, but are compensated at late times, leaving the overall mass of the objects the same. In the right panel we show the time evolution of . This quantity only measures the mass up to instead of the total mass of linked FoF particles (which can extend out to several ). As in the former case, the accretion rate follows the expected behaviour in the early and late collapse cases. However, at late times, differs between the constrained runs and the reference runs. This is due to a change in the halo density profile related to the collapse time, as we will now explore.

The halo radial density profiles for these three simulations at are illustrated in Fig. 3, with inset panels showing projected density maps. As the collapse is delayed, the slope in the inner regions becomes less steep. The location of the virial radius is indicated by an arrow in each case; as expected given our discussion above, this is displaced inwards by the relative shallowness of the late collapse H40-MH-0.5 case.

The difference between the density profiles can be encapsulated in the concentration parameter


where is the virial radius of the halo (defined in Sec. 3), and is the scale radius in the NFW density profile (Navarro et al., 1997)


We fit an NFW profile to each of our halos at , after determining its centre using a shrinking-sphere method and estimating the density in 100 radial bins of equal size. In order not to contaminate the fits with numerical artefacts introduced by the finite particle resolution, we exclude from the fit the innermost regions (2 times the softening length, ) which are affected by the force softening (e. g. Power et al. 2003), and regions with , which may not be relaxed. We tested that the choice of the minimum and maximum radius has negligible impact on the estimate of . The measured values of the concentration are , and for late, reference and early simulations respectively.

Figure 3: Density profile of the reference halo (black dot-dashed) and the ‘early’ (blue dashed) and ‘late’ (red solid) constrained runs at . The leftmost arrow indicates the softening length of the simulation, and the other arrows indicate the virial radius of each halo. Inset panels: density projection (-plane) of the resulting halos at . All panels show a region 2.5 Mpc across, include only the FoF group particles, and use the same colour scale for the column density.

The GM method allows us to study the relationship between the collapse time of a halo (defined below) and its concentration as measured at . Using a large statistical sample, Wechsler et al. (2002) found that , the scale factor at collapse time. We follow their procedure to obtain the collapse scale factor by fitting the mass accretion history of each halo with


with and . Extensions to this simple function have been proposed by e. g. Tasitsiomi et al. (2004), McBride et al. (2009) and Correa et al. (2014), but for our purposes these are not necessary: the refined formulae are designed to accurately represent the median mass accretion histories for many halos, and the corrections are smaller than the scatter between individual halos.

There are some differences between the conventions of Wechsler et al. (2002) and the present work which we need to understand before proceeding. In the left panel of Fig. 4, the grey band and black dashed line show the average concentration and scatter from measuring the properties of halos of mass in our unmodified box. Wechsler et al. (2002) used a slightly different definition of halo mass from the one used in our study. Instead of defining with respect to the critical density of the universe, they define relative to the mean density.

To show how this affects the measured concentration, the green points show a sample of concentration parameters estimated using , using the same halos that were used to generate the grey shaded region. There is an overall upward offset of these points relative to the grey band because is correspondingly larger. The black solid line and the green band show the median relation and scatter predicted by Wechsler et al. (2002) (taken from their Fig. 7), corrected by a factor of 0.8 following Duffy et al. (2008) to account for their different (, instead of in our simulations).

In summary, once the differences in conventions and cosmological parameters are taken into account, we can reproduce the median results of Wechsler et al. (2002) in our unmodified boxes. The scatter from our simulation is also compatible with the much larger Wechsler et al. (2002) sample; outliers are likely due to the fact that we do not pre-select relaxed halos as they do. For consistency with the rest of the paper, we will use all quantities derived w. r. t. critical density in the following analysis.

Figure 4: Halo concentration parameter as a function of the collapse time. Left panel: Our reference simulation gives a volume to probe the relationship using the traditional statistical technique. Taking 120 halos from our simulation, this results in a scatter of points in the region of the grey band. To compare with existing literature, we need to redefine (and hence ) relative to the mean (rather than critical) density, after which these halos are represented by the green points with error bars. The black solid line and green band show the average relation and scatter as predicted by Wechsler et al. (2002), multiplied by a factor of 0.8 to account for their different choice of (Duffy et al., 2008). Right panel: Our three constrained halos (24, 37 and 40), showing fits to each constrained family individually (colours) and all of them together (black dashed). The grey band is the same as in the left panel. Together the panels establish that (left) halos in our reference volume recover the known relationship between concentration and collapse scale factor; and (right) relationships consistent with this relation are also recovered individually by each GM family. The scatter of slopes between different families is expected (see text).

We are now ready to see how this relationship emerges when using GM halos instead of a statistical sample. For this study, we have selected three different halos in the reference run, which are all of similar mass (). The right panel of Fig. 4 shows the result for 13 simulations (4 each for halos 24 & 37, and 5 for halo 40) with different collapse times. We call each set a ‘halo family’, including the reference run. The right panel of Fig. 4 shows the results for the three families illustrated by red diamonds, green circles and blue squares for family 24, 37 and 40 respectively.

The slopes of the three families appear to be consistent with, but scattered around, the population average (grey band). We find that each halo family is well-described by a linear relation


which contains the offset as an additional parameter compared to the fit used in Wechsler et al. (2002). These fits are shown in the right panel of Fig. 4 as coloured solid lines; we also fit all 13 points together (black dashed line). In addition, the grey band shows the scatter expected for halos in our selected mass bin, as in the left panel. The fit to all 13 simulations is very similar to the median relation from our unconstrained box which is consistent with the larger sample from Wechsler et al. (2002).

The shift of each family member along its line is dictated by the direction and amplitude of the density constraint in the initial conditions. Higher values of the density in the inner region shift a point towards the top left, and lower values to the bottom right w. r. t. the reference run. This gives us considerable insight into the kind of results that can be expected from GM compared to large population studies. The scatter of individual simulations within a GM family is very small — in other words, the concentration is highly predictable from a single variable. This is because, as we have previously emphasized, the history of each halo within a single family is as similar as possible to all the others. The normal scatter in the concentration–collapse relation is then seen to be due to factors that are not being constrained within a single family (such as more detailed aspects of the merger history or other variables such as halo spin). The GM technique allows for a detailed exploration of results from specific, precise changes.

For halo 24 (red diamonds), two of the results are nearly identical: the point with the highest concentration value is actually two points nearly on top of each other. These points have been obtained using the two methods for setting mass accretion history constraints discussed previously, emphasising that they can lead to very similar results. Indeed, for each halo we have performed constrained runs using both methods, and a mixture of the resulting datapoints are shown in Fig. 7. For a list of all the simulations used in this paper, see Table 1.

The results of four additional runs (one each for halos 24 and 37, two for halo 40) are excluded from this Figure. In each case, the estimated density profile was not well-described by an NFW profile due to the halo undergoing a merger or the presence of large substructures. This is in agreement with Zhao et al. (2003) who find that at least part of the scatter around the Wechsler et al. (2002) relation is due to poor fits to the NFW profiles and the mass accretion history.

6 Likelihood of the modified field

As explained previously, the HR91 algorithm constructs a constrained realization which is equivalent to (but much more efficient than) rejection sampling, i. e. repeatedly drawing from an ensemble of CDM universes until one obtains a realization that satisfies a given number of constraints. However, a naive choice of constraints can easily result in extreme configurations which are very unlikely to occur within the Hubble volume of the real universe. Depending on context, this could even be intentional (e. g. when investigating rare objects, Romano-Díaz et al. 2011a, b; Dubois et al. 2012; Romano-Díaz et al. 2014); but nevertheless it is important to understand how likely it is for a given constrained configuration to arise, relative to the reference realization. We now derive a general expression for evaluating this likelihood and show how it is related to the abundance of halos when changing the mass.

Figure 5: Left panel: The relationship between and the initial overdensity for different halo families (H24-mass, H37-mass and H40-mass; see Table 1). Lines show the theoretical prediction from Eq. (13), whereas points give the actual change measured from the IC generator output, confirming that the algorithm is operating as expected. Right panel: values (points) can be interpreted as giving the relative abundance of the halos within each genetically-modified family, and therefore should agree with estimates from a halo mass function (lines). The agreement is indeed good except for the H37-mass-1.5 point which appears to have too small a mass at compared to expectations. This is because the halo mass function is based on an average mass build-up rate, whereas in this specific case the mass will only be acquired after a major merger in around (see Fig. 6 and discussion in text).

We can compare the unconstrained and constrained fields with respect to the unmodified CDM covariance matrix by evaluating the change in , defined as


where is a field with constraints. This constrained field has a relative abundance in the universe of compared to the original, unconstrained field . Since this is only a relative abundance, applying a constraint to a halo that is rare in the reference simulation will in general also generate a rare object in the constrained run. We therefore modify several halos in a similar way, in order to reduce the impact that picking a rare object may have on any of our results.

One can calculate directly from the density field, but we can also expand the above equation analytically by inserting Eq. (5) and making use of . This leads to a series of cancellations, with the final result


where we again use to express the value of the constraints in the underlying realization.

Crucially, the details of the original realization have disappeared except in the initial values of the constrained quantities, . In other words, the relative likelihood of the constrained simulation compared to the unconstrained case is dependent only on the choice of constraints. It is therefore specifically related to properties of the individual halo, not to details of its surroundings. This is another very desirable property of the HR91 formalism and reflects the minimality of the changes made to the field going from to .

For a single constraint, Eq. (12) has a particularly transparent interpretation. Because of the normalization condition, the variance of for in unconstrained realizations is


Thus, for a single constraint, a change in of corresponds to a variation in the property measured in the population-at-large.

The left panel of Fig. 5 shows for a single constraint as a function of , the ratio between a halo’s average density contrast after the constraint and its value in the reference run (see Sec. 4). The values are calculated directly from the fields (points) and using Eq. (12) (lines); the two methods agree to within numerical accuracy, which is a useful verification of the algorithm. As before, the results for the three families are illustrated by red diamonds, green circles and blue squares for haloes 24, 37 and 40 respectively. The minimum at , as well as the symmetry, is expected for a zero-mean Gaussian random field.

6.1 Connecting initial conditions and non-linear structure

In this section, we work towards establishing a quantitative connection between the degree of change in the initial conditions and in the final, non-linear structure. Using a single constraint, we investigate how a change in density contrast in the initial conditions is related to the resulting halo mass at late times. Qualitatively, an increase in overdensity should lead to a more massive object at as explained in Sec. 4.

Quantitatively, the probability of finding a halo with mass at is given by the halo mass function, , which depends on the cosmological power spectrum and growth function. Given two halos of mass and , their relative abundance is given by the ratio of the halo mass function at those masses, . Assuming that the statistical properties of constrained and unconstrained simulations can be related by the change in mass of the target halo alone, this ratio also gives the relative probability of the structure in the two simulations.

Additionally, we can calculate the relative probability of the two initial conditions using Eq. (11); specifically


where is short-hand for the probability of a constrained field using one ‘density constraint’ with value . Since we only consider probability ratios and , any terms which only change the normalization of have dropped out.

Now we have two methods of calculating the relative probabilities, and we can check whether they agree. We rewrite in terms of halo mass using the conservation of probability


where is the halo mass at , and is the derivative of the constraint with respect to the mass, evaluated at . By combining Eqs. (14) and (15), we can find a relationship between the and ; namely


If our assumptions are correct, this expression for should be equal to the mass function ratio . We have used HMFcalc (Murray et al., 2013) to generate a halo mass function at for our cosmological model, and confirmed that it provides a good fit to our simulations.222This is preferable to obtaining a (noisy) estimate of the mass function directly from our limited volume; the fitting functions included in the HMFcalc tool were validated using detailed studies of large simulation suites (e. g. Tinker et al., 2008).

Evaluating Eq. (16) also requires an estimate of the Jacobian factors on the right hand side. This can be obtained either from a physical model underlying the mass function or by using an empirically calibrated relationship from the simulations. We chose the latter approach by fitting a power-law relation between and , which allows us to obtain values for at different halo masses separately for each halo family (24, 37 & 40; introduced in the previous section). This leaves us with a ‘semi-analytical’ prediction: theoretical halo mass function plus fit to the Jacobian.

Figure 6: Slices from the original simulation (upper panel) and H37-mass-1.5 simulation (lower panel) illustrate how the target halo is, at , seen at a time where it is about to undergo a major merger in the latter case. For that reason its mass undershoots the expectation from the halo mass function (Fig. 5) which averages over all the possible discrete realizations of the accretion history. Black circles show the size of the virial radius.

The right panel of Fig. 5 shows the results of the calculation (lines), as well as points evaluated directly from the simulations. Overall, most points show a good agreement: there is consistency between the population statistics and the abundance calculated from the GM values. The broad agreement justifies our set of assumptions for calculating abundances in this specific case of a single density constraint. However the individual halos do scatter around the relation and there is one point that clearly does not fit the expectations. This arises from our H37-mass-1.5 run where we increase the initial overdensity of the proto-halo 37 region by a factor of 1.5.

The mismatch can be understood by considering the discrete nature of merger histories. Specifically, Fig. 6 shows the projected density at the last output () in a region around halo 37 in the original run (upper panel) and the run (lower panel). In the latter case, a major merger (mass ratio ) will occur in around . After this merger, the anomalous point will shift significantly rightwards in Fig. 5 to the correct mass ratio () according to the derived from the halo mass function.333Note that the nearest massive halo in the original run corresponds to the same particles, but is considerably further away from halo 37 and is not on a trajectory that will lead to a merger within a Hubble time. We can frame this in another, more general way: the halo mass function is a statistical construction that corresponds to averaging over all possible histories, but the individual points in Fig. 5 represent modifications to specific halos which have a discretized accretion history. Therefore, they scatter away from the line, especially when seen at special times (such as shortly before a major merger).

The main conclusion from Fig. 5 is therefore that the changes in the give us a good quantitative handle on the relative abundance of halos of different types, at least in this case where we have only changed the mass. However, the complex non-linear connection between initial and final states means that will always need to be interpreted with care.

7 Discussion & Conclusions

In this paper we have demonstrated an extension of the HR91 technique to modify the initial conditions of a numerical simulation. For this modification, we selected regions of arbitrary shape, defined solely by the particles that form a halo in our reference simulation. This is a different approach than that used in previous works, which relied on imposing constraints of a given analytic profile. By applying our constraints only to the halo particles, we showed that we can ‘genetically modify’ a single object, changing its properties in a smooth and continuous way.

Using constraints on the density averaged over all halo particles controls the total mass, whereas adding additional constraints allows us to change the halo’s collapse time. This serves as a demonstration of the technique and is the basis of the creation of further constraint types to study the impact of other halo characteristics on a halo’s evolution.

Using the collapse time constraint, we investigated the density profiles of the resulting halos at and found that the distribution of their concentration parameter is consistent with the results of statistical analyses such as Wechsler et al. (2002). However, we also find that different halos occupy different regions in the parameter-space, and have different trajectories when their collapse time is changed. We plan to study this behaviour in future work, in order to determine which other halo parameters have changed. This should be complementary to the principal component analysis carried out by Skibba & Macciò (2011); Jeeson-Daniel et al. (2011); Wong & Taylor (2012), which revealed somewhat inconclusive correlations between additional internal halo parameters. With our constrained simulations, we will not only be able to find correlations but to explicitly test their significance. Since we can directly compare the constrained halo to its reference in the unconstrained run, we can establish exactly which changes in the halo parameters have a physical impact.

We have provided a way of quantifying the likelihood of modified initial conditions via a difference between the constrained and unconstrained fields. In general, this statistic can be used to assess how compatible the modified object is with the underlying cosmology. Similar expressions were obtained by van de Weygaert & Bertschinger (1996); however our orthonormalisation procedure allows for the derivation of the considerably simpler Eq. (12). The statistic can be used to quantify the rarity of genetically modified objects relative to the unconstrained realization. As an example we showed that modifying the mass produces abundance constraints that are quantitatively consistent with the traditionally-measured halo mass function at . Individual halos have discrete accretion histories and scatter around the mean relation predicted by the mass function; the strongest outlier in our study is about to undergo a merger at , which significantly lowers its current mass. In principle, the measure could also be used to specifically create objects that are ‘rare’ in a CDM universe (similar to e. g. Romano-Díaz et al., 2011a, b; Dubois et al., 2012; Romano-Díaz et al., 2014); we leave such a study to future work.

In this paper we have used a uniform resolution over a box size of 50 Mpc; having a sufficiently large box is important to ensure that halos are embedded in the correct large-scale environment. Our code also produces ICs for ‘zoom’ simulations with varying resolution (see also Prunet et al., 2008; Romano-Díaz et al., 2014); the only major difference when generating these is the extra computational complexity introduced in the transformation between real space and Fourier space on an irregularly-spaced grid, which has been tackled elsewhere in the literature (e.g. Bertschinger, 2001; Hahn & Abel, 2011). The real power of the approach to generate insight into a population from a handful of runs will become more apparent as we begin to use these zoom ICs in tandem with high-resolution baryonic physics.

While our focus here has been on basic properties such as the formation time and mass of a system, many other interesting aspects of evolution can be changed by constraining different properties. One example with which we are experimenting is the specific angular momentum, which can be controlled because tidal torque theory describes the connection between ICs and final spin (e.g. White, 1984; Catelan & Theuns, 1996; Porciani et al., 2002a, b); the internal properties of the galaxy forming inside the dark matter halo will naturally depend on the spin parameter of the halo. We are able to generate initial conditions that modify the spin parameter of the halo, but leave the mass and merger history untouched. Studies of halo spin constraints for dark matter and hydrodynamic simulations will be presented in future work.


We thank Volker Springel for allowing access to P-Gadget-3 and subfind. NR thanks Emilio Romano-Díaz and Cristiano Porciani for useful discussions. AP acknowledges helpful conversations with Fabio Governato. NR and HVP are supported by STFC and the European Research Council under the European Community’s Seventh Framework Programme (FP7/2007-2013) / ERC grant agreement no 306478-CosmicDawn. AP is supported by a Royal Society University Research Fellowship. This work used the DiRAC Complexity system, operated by the University of Leicester IT Services, which forms part of the STFC DiRAC HPC Facility ( This equipment is funded by BIS National E-Infrastructure capital grant ST/K000373/1 and STFC DiRAC Operations grant ST/K0003259/1. DiRAC is part of the National E-Infrastructure.


  • Bardeen et al. (1986) Bardeen J. M., Bond J. R., Kaiser N., Szalay A. S., 1986, ApJ, 304, 15
  • Bertschinger (1987) Bertschinger E., 1987, ApJ, 323, L103
  • Bertschinger (2001) Bertschinger E., 2001, ApJS, 137, 1
  • Bett et al. (2007) Bett P., Eke V., Frenk C. S., Jenkins A., Helly J., Navarro J., 2007, MNRAS, 376, 215
  • Bistolas & Hoffman (1998) Bistolas V., Hoffman Y., 1998, ApJ, 492, 439
  • Brook et al. (2011) Brook C. B., et al., 2011, MNRAS, 415, 1051
  • Brook et al. (2014) Brook C. B., Di Cintio A., Knebe A., Gottlöber S., Hoffman Y., Yepes G., Garrison-Kimmel S., 2014, ApJ, 784, L14
  • Bullock et al. (2001) Bullock J. S., Kolatt T. S., Sigad Y., Somerville R. S., Kravtsov A. V., Klypin A. A., Primack J. R., Dekel A., 2001, MNRAS, 321, 559
  • Catelan & Theuns (1996) Catelan P., Theuns T., 1996, MNRAS, 282, 436
  • Codis et al. (2015) Codis S., et al., 2015, MNRAS, 448, 3391
  • Correa et al. (2014) Correa C. A., Wyithe J. S. B., Schaye J., Duffy A. R., 2014, preprint, (arXiv:1409.5228)
  • Correa et al. (2015a) Correa C. A., Wyithe J. S. B., Schaye J., Duffy A. R., 2015a, preprint, (arXiv:1501.04382)
  • Correa et al. (2015b) Correa C. A., Wyithe J. S. B., Schaye J., Duffy A. R., 2015b, preprint, (arXiv:1502.00391)
  • Dayal et al. (2013) Dayal P., Libeskind N. I., Dunlop J. S., 2013, MNRAS, 431, 3618
  • Doumler et al. (2013) Doumler T., Gottlöber S., Hoffman Y., Courtois H., 2013, MNRAS, 430, 912
  • Dubois et al. (2012) Dubois Y., Pichon C., Haehnelt M., Kimm T., Slyz A., Devriendt J., Pogosyan D., 2012, MNRAS, 423, 3616
  • Duffy et al. (2008) Duffy A. R., Schaye J., Kay S. T., Dalla Vecchia C., 2008, MNRAS, 390, L64
  • Dunkley et al. (2009) Dunkley J., et al., 2009, ApJS, 180, 306
  • Forero-Romero et al. (2011) Forero-Romero J. E., Hoffman Y., Yepes G., Gottlöber S., Piontek R., Klypin A., Steinmetz M., 2011, MNRAS, 417, 1434
  • Frenk et al. (1999) Frenk C. S., et al., 1999, ApJ, 525, 554
  • Genel et al. (2014) Genel S., et al., 2014, MNRAS, 445, 175
  • Governato et al. (2007) Governato F., Willman B., Mayer L., Brooks A., Stinson G., Valenzuela O., Wadsley J., Quinn T., 2007, MNRAS, 374, 1479
  • Guedes et al. (2011) Guedes J., Callegari S., Madau P., Mayer L., 2011, ApJ, 742, 76
  • Hahn & Abel (2011) Hahn O., Abel T., 2011, MNRAS, 415, 2101
  • Heß et al. (2013) Heß S., Kitaura F.-S., Gottlöber S., 2013, MNRAS, 435, 2065
  • Hoffman & Ribak (1991) Hoffman Y., Ribak E., 1991, ApJ, 380, L5
  • Hoffman et al. (2007) Hoffman Y., Romano-Díaz E., Shlosman I., Heller C., 2007, ApJ, 671, 1108
  • Hopkins et al. (2009) Hopkins P. F., Cox T. J., Younger J. D., Hernquist L., 2009, ApJ, 691, 1168
  • Hopkins et al. (2013) Hopkins P. F., Cox T. J., Hernquist L., Narayanan D., Hayward C. C., Murray N., 2013, MNRAS, 430, 1901
  • Iliev et al. (2011) Iliev I. T., Moore B., Gottlöber S., Yepes G., Hoffman Y., Mellema G., 2011, MNRAS, 413, 2093
  • Jasche & Wandelt (2013) Jasche J., Wandelt B. D., 2013, MNRAS, 432, 894
  • Jeeson-Daniel et al. (2011) Jeeson-Daniel A., Dalla Vecchia C., Haas M. R., Schaye J., 2011, MNRAS, 415, L69
  • Kitaura (2013) Kitaura F.-S., 2013, MNRAS, 429, L84
  • Klimentowski et al. (2010) Klimentowski J., Łokas E. L., Knebe A., Gottlöber S., Martinez-Vaquero L. A., Yepes G., Hoffman Y., 2010, MNRAS, 402, 1899
  • Klypin et al. (2003) Klypin A., Hoffman Y., Kravtsov A. V., Gottlöber S., 2003, ApJ, 596, 19
  • Klypin et al. (2014) Klypin A., Yepes G., Gottlober S., Prada F., Hess S., 2014, preprint, (arXiv:1411.4001)
  • Kormendy (2015) Kormendy J., 2015, in Laurikainen E., ed., , Galactic Bulges. Springer, New York (arXiv:1504.03330)
  • Kravtsov et al. (2002) Kravtsov A. V., Klypin A., Hoffman Y., 2002, ApJ, 571, 563
  • Libeskind et al. (2010) Libeskind N. I., Yepes G., Knebe A., Gottlöber S., Hoffman Y., Knollmann S. R., 2010, MNRAS, 401, 1889
  • Ludlow et al. (2013) Ludlow A. D., et al., 2013, MNRAS, 432, 1103
  • Ludlow et al. (2014) Ludlow A. D., Navarro J. F., Angulo R. E., Boylan-Kolchin M., Springel V., Frenk C., White S. D. M., 2014, MNRAS, 441, 378
  • Macciò et al. (2007) Macciò A. V., Dutton A. A., van den Bosch F. C., Moore B., Potter D., Stadel J., 2007, MNRAS, 378, 55
  • Macciò et al. (2008) Macciò A. V., Dutton A. A., van den Bosch F. C., 2008, MNRAS, 391, 1940
  • Mathis et al. (2002) Mathis H., Lemson G., Springel V., Kauffmann G., White S. D. M., Eldar A., Dekel A., 2002, MNRAS, 333, 739
  • McBride et al. (2009) McBride J., Fakhouri O., Ma C.-P., 2009, MNRAS, 398, 1858
  • Murray et al. (2013) Murray S. G., Power C., Robotham A. S. G., 2013, Astronomy and Computing, 3, 23
  • Naab et al. (1999) Naab T., Burkert A., Hernquist L., 1999, ApJ, 523, L133
  • Navarro et al. (1997) Navarro J. F., Frenk C. S., White S. D. M., 1997, ApJ, 490, 493
  • Neto et al. (2007) Neto A. F., et al., 2007, MNRAS, 381, 1450
  • Nuza et al. (2014) Nuza S. E., Parisi F., Scannapieco C., Richter P., Gottlöber S., Steinmetz M., 2014, MNRAS, 441, 2593
  • Papovich et al. (2015) Papovich C., et al., 2015, ApJ, 803, 26
  • Pontzen & Governato (2014) Pontzen A., Governato F., 2014, Nature, 506, 171
  • Pontzen et al. (2013) Pontzen A., Roškar R., Stinson G., Woods R., 2013, pynbody: N-Body/SPH analysis for python, Astrophysics Source Code Library (ascl:1305.002)
  • Porciani et al. (2002a) Porciani C., Dekel A., Hoffman Y., 2002a, MNRAS, 332, 325
  • Porciani et al. (2002b) Porciani C., Dekel A., Hoffman Y., 2002b, MNRAS, 332, 339
  • Power et al. (2003) Power C., Navarro J. F., Jenkins A., Frenk C. S., White S. D. M., Springel V., Stadel J., Quinn T., 2003, MNRAS, 338, 14
  • Prada et al. (2012) Prada F., Klypin A. A., Cuesta A. J., Betancort-Rijo J. E., Primack J., 2012, MNRAS, 423, 3018
  • Press & Schechter (1974) Press W. H., Schechter P., 1974, ApJ, 187, 425
  • Prunet et al. (2008) Prunet S., Pichon C., Aubert D., Pogosyan D., Teyssier R., Gottloeber S., 2008, ApJS, 178, 179
  • Ragone-Figueroa et al. (2010) Ragone-Figueroa C., Plionis M., Merchán M., Gottlöber S., Yepes G., 2010, MNRAS, 407, 581
  • Reed et al. (2005) Reed D., Governato F., Verde L., Gardner J., Quinn T., Stadel J., Merritt D., Lake G., 2005, MNRAS, 357, 82
  • Robertson et al. (2006) Robertson B., Bullock J. S., Cox T. J., Di Matteo T., Hernquist L., Springel V., Yoshida N., 2006, ApJ, 645, 986
  • Romano-Díaz et al. (2006) Romano-Díaz E., Faltenbacher A., Jones D., Heller C., Hoffman Y., Shlosman I., 2006, ApJ, 637, L93
  • Romano-Díaz et al. (2007) Romano-Díaz E., Hoffman Y., Heller C., Faltenbacher A., Jones D., Shlosman I., 2007, ApJ, 657, 56
  • Romano-Díaz et al. (2011a) Romano-Díaz E., Shlosman I., Trenti M., Hoffman Y., 2011a, ApJ, 736, 66
  • Romano-Díaz et al. (2011b) Romano-Díaz E., Choi J.-H., Shlosman I., Trenti M., 2011b, ApJ, 738, L19
  • Romano-Díaz et al. (2014) Romano-Díaz E., Shlosman I., Choi J.-H., Sadoun R., 2014, ApJ, 790, L32
  • Schaye et al. (2015) Schaye J., et al., 2015, MNRAS, 446, 521
  • Sherman & Morrison (1950) Sherman J., Morrison W. J., 1950, Ann. Math. Statist., 21, 124
  • Skibba & Macciò (2011) Skibba R. A., Macciò A. V., 2011, MNRAS, 416, 2388
  • Sorce et al. (2014) Sorce J. G., Courtois H. M., Gottlöber S., Hoffman Y., Tully R. B., 2014, MNRAS, 437, 3586
  • Springel (2005) Springel V., 2005, MNRAS, 364, 1105
  • Springel et al. (2001) Springel V., White S. D. M., Tormen G., Kauffmann G., 2001, MNRAS, 328, 726
  • Springel et al. (2008) Springel V., et al., 2008, MNRAS, 391, 1685
  • Tasitsiomi et al. (2004) Tasitsiomi A., Kravtsov A. V., Gottlöber S., Klypin A. A., 2004, ApJ, 607, 125
  • Tinker et al. (2008) Tinker J., Kravtsov A. V., Klypin A., Abazajian K., Warren M., Yepes G., Gottlöber S., Holz D. E., 2008, ApJ, 688, 709
  • Wechsler et al. (2002) Wechsler R. H., Bullock J. S., Primack J. R., Kravtsov A. V., Dekel A., 2002, ApJ, 568, 52
  • White (1984) White S. D. M., 1984, ApJ, 286, 38
  • Wong & Taylor (2012) Wong A. W. C., Taylor J. E., 2012, ApJ, 757, 102
  • Zaroubi et al. (1995) Zaroubi S., Hoffman Y., Fisher K. B., Lahav O., 1995, ApJ, 449, 446
  • Zavala et al. (2009) Zavala J., Jing Y. P., Faltenbacher A., Yepes G., Hoffman Y., Gottlöber S., Catinella B., 2009, ApJ, 700, 1779
  • Zhao et al. (2003) Zhao D. H., Mo H. J., Jing Y. P., Börner G., 2003, MNRAS, 339, 12
  • Zhao et al. (2009) Zhao D. H., Jing Y. P., Mo H. J., Börner G., 2009, ApJ, 707, 354
  • van Dokkum et al. (2013) van Dokkum P. G., et al., 2013, ApJ, 771, L35
  • van de Weygaert & Bertschinger (1996) van de Weygaert R., Bertschinger E., 1996, MNRAS, 281, 84
  • van den Bosch (2002) van den Bosch F. C., 2002, MNRAS, 331, 98

Appendix A Technique

Here, we will give a detailed description of the derivation of the HR91 operator and our Eq. (5).

The values of the three-dimensional overdensity field in the initial conditions of our cosmological simulations are distributed according to a multivariate Gaussian


with mean and (the covariance, or the power spectrum in Fourier space). Cosmological initial conditions have zero mean, , but we will consider the fully general case.

We will build the general procedure by induction. Suppose we have a that describes the probability distribution function for constraints; we now want to add the th constraint, ensuring that for some constraint vector and constant . To gain samples from the constrained distribution one could sample from the original distribution and reject all those trials which lie too far away from . Mathematically this can be expressed by multiplying the original probability distribution by a penalty function, e. g.


where the constant of proportionality renormalizes the probability distribution function and is dependent on . In the limit , the penalty function becomes a Dirac-Delta distribution and the constraint is satisfied exactly.

Under the assumption that is Gaussian, the new probability function is the product of two Gaussians, and so remains Gaussian itself; consequently after imposing constraints we must be able to write


for some mean and covariance which we will now derive. By multiplying out Eq. (18) we obtain


where we have already thrown away several terms which are zero-order in since they just change the normalization. By comparing terms in Eqs. (19) and (20) we can first read off . We will also need a normalization for the , which conveniently can be chosen444Unless is a null direction of , but then there would be zero probability of our constraint in the original distribution. as


Next, we apply the Sherman-Morrison formula (Sherman & Morrison, 1950),


where we have used and the normalization condition (21) in the second step.

The terms in the second line of Eq. (20) have to cancel exactly. Plugging Eq. (22) into this expression leads to


and finally taking the limit in Eq. (22) yields


This result allows us to apply as many constraints as desired analytically – by looping over the constraints and updating the covariance matrix and mean at each step, then drawing a constrained realization – but this would be computationally expensive. Instead, the constrained realization can be constructed from the unconstrained field using a projection operator, which we will now derive.

For notational simplicity, in addition to normalizing the constraints, it is also helpful to make them orthogonal (e. g. through a Gram-Schmidt procedure) in the sense that for . One can then verify by substitution (see Sec. A.1) that the constrained field has mean


and covariance


for orthonormalized .

Efficiently drawing from the distribution implied by the above mean and covariance is made possible by any operator that takes a realization from the unconstrained field and forms a new realization under constraints via the ansatz


where to gain the correct covariance one must demand


There are an infinity of operators with this property: Given any specific one can form where , and the new satisfies the required identity (28). To obtain the unique HR91 operator, we additionally require to make minimal changes to the field. This implies — in other words, that no changes are made if the field already satisfies the constraints. Using Eq. (27), it immediately follows that and . The first of these conditions implies that all eigenvalues of are either or .

One can verify by substitution that all these requirements are satisfied by


Note that the HR91 form given in their Eqs. (2) – (4) builds the orthonormalization procedure into the projection operator (appearing as in their notation). However, as stated above we found it notationally simpler to pre-condition the constraints into orthonormal form using the Gram-Schmidt procedure. Both formulations are mathematically equivalent (see Appendix B).

Inserting Eqs. (29) and (25) into (27) then leads to the final expression


where .

In practice, most of the necessary calculations are performed in Fourier-space, because there is the CDM power spectrum which is diagonal. Any constraint vector and density field can be easily converted using numerical Fast Fourier transformations.

Note that the algorithm in its current form only takes into account the contribution from the power spectrum. If one wanted to generate constrained initial conditions based on an observational dataset, the associated uncertainties would introduce extra contributions in the new covariance matrix (e. g. Zaroubi et al., 1995; van de Weygaert & Bertschinger, 1996), which is not included in the current implementation.

a.1 Comments on normalisation

Throughout this paper, we use the same notation for the normalised and unnormalised constraints (expressed by ). In practice, these quantities are affected by the normalisation condition in the following way: if before normalisation, then we immediately find to satisfy . Accordingly, the constant transforms as as well, such that is still obeyed after normalisation.
The consistency of the Gram-Schmidt condition and our normalisation can be shown as follows: Multiplying (26) by on both sides yields