# N-point Statistics of Large-Scale Structure in the Zel’dovich Approximation

###### Abstract

Motivated by the results presented in a companion paper, here we give a simple analytical expression for the matter -point functions in the Zel’dovich approximation (ZA) both in real and in redshift space (including the angular case). We present numerical results for the 2-dimensional redshift-space correlation function, as well as for the equilateral configuration for the real-space 3-point function. We compare those to the tree-level results. Our analysis is easily extendable to include Lagrangian bias, as well as higher-order perturbative corrections to the ZA. The results should be especially useful for modelling probes of large-scale structure in the linear regime, such as the Baryon Acoustic Oscillations. We make the numerical code used in this paper freely available.

Department of Astrophysical Sciences, Princeton University, 4 Ivy Lane, Princeton,

NJ 08544, USA

## 1 Introduction

Probes of Large-Scale Structure (LSS) have the potential to give powerful constraints on dark energy and dark matter (e.g. [1]). One such measurement is the accurate determination of the Baryon Acoustic Oscillations (BAO) scale. However, future surveys such as WFIRST, Euclid and LSST will provide precise measurements of the acoustic peak in the matter 2-point correlation function at a level which our current theoretical understanding of LSS does not yet completely match.

Using N-body simulations to model the acoustic peak, and especially the uncertainties behind those measurements, requires running hundreds and even thousands of simulations [2]. Therefore, theoretical models of the BAO signal and its covariance can still play an essential role for extracting cosmological data from observations.

In a companion paper [3], we demonstrated that models of LSS can be improved beyond the level of accuracy required for future surveys, as long as one avoids Fourier space and builds a perturbative model around the Zel’dovich Approximation (ZA) [4]. However, this result would be of little value if even the simplest of such models – the ZA itself – cannot provide analytical^{1}^{1}1Meaning, predictions which do not require numerical averaging over realizations. predictions beyond the 2-pt correlation function, such as its covariance. Indeed, until now the ZA has had a very serious drawback – analytical solutions existed only for the 2-point correlation function (e.g. [5]). Thus, the ZA has mostly been regarded as a cheap way of performing crude N-body simulations (e.g. [2]), and rarely as an analytical power horse.

In an attempt to change that, in this paper we give a simple analytical expression for the matter -point functions in the ZA both in real and redshift space.^{2}^{2}2Following the discussion we present in [3], we avoid Fourier space at all cost.
Thus, we show that for models perturbing around the ZA solution, the lowest order (the ZA itself) gives well defined analytical predictions for the statistics of Cold Dark Matter (CDM).

We implement our result numerically in an open-source code, called Zel’dovich Calculator (ZelCa^{3}^{3}3In case the reader is curious, the word means cabbage in Bulgarian. The code is freely available at the following URL: https://bitbucket.org/tassev/zelca/). The code is capable of calculating the real and redshift space 2-pt functions, as well as the real-space 3-pt function in the ZA.
We include results for those quantities in this paper.

In Section 2 we give a quick overview of the Zel’dovich approximation and introduce our notation. We use Section 3 to warm up with the standard calculation for the 2-pt function, and then in Section 4 we derive our expression for the matter -point functions in real and redshift space. In Section 5 we describe the numerical implementation of our results in ZelCa and discuss current numerical limitations. In Section 6 we present our numerical results. We summarize in Section 7, where we discuss how our results can be extended to include Lagrangian bias as well as higher-order corrections (including corrections arising when the theory is renormalized, see [3]).

## 2 The Zel’dovich approximation

Let us introduce some notation by doing a quick overview of the ZA. A CDM particle with an initial (Lagrangian) position ends up at a final (Eulerian) position after time according to:

(1) |

with being the so-called displacement field. The above equation is valid in the general case. The ZA boils down to using the linear result for the displacement field given by , where is the linear growth factor. The displacement field in the ZA is a Gaussian random variable with zero mean. Its variance is given by

(2) |

where is the linear power spectrum at time . It is given by , where is defined by . After some algebra, the above expression can be written as (we drop the time arguments for brevity)

(3) |

where

(4) |

Note that

(5) |

where

(6) |

### 2.1 Real space

The CDM overdensity field, , is given by

(7) |

which can also be rewritten as

(8) |

From now on we drop the arguments of the functions above for brevity.

### 2.2 Redshift space

In redshift space, the density in the ZA is given by

(9) |

where is the dimensionless linear growth rate, , being the scale factor. Subscript denotes redshift space. Above we used , where is the direction along the line-of-sight. Thus, implicitly depends on the for which we evaluate .

In that case, we can redo the algebra of the previous section to find that nothing changes, except now has to be replaced with its redshift-space counterpart, , which can be easily seen to be equal to:

(10) |

where the superscripts keep track of the towards which the line-of-sight vector is pointing. Note that the above expression is valid in both the plane-parallel (for which is constant, and the superscripts can be dropped as well as the dependence) and in the angular case of redshift-space distortions.

## 3 The 2-point function in real and redshift space

Let us warm up with deriving the 2-pt function, , which is well-known in the literature (e.g. [5]). We will go step by step, because deriving the -point functions will follow the same logic.

Let us first start with real space. Using (8), the 2pt function in the ZA can be written as:

where (and in general, vector labels in this paper will appear as superscripts, whereas vector indices – as subscripts). In writing the last line of the above equation, we used that for any Gaussian random variable, , with zero mean, the cumulant expansion gives us:

(12) |

Plugging (2,5) in (15) we find

(13) | |||||

where repeated subscripts are summed over. Now let’s change variables to . We can then do the integral in which gives a delta function, removing one of the integrals. We then find

(14) |

Now we perform the Gaussian integral in to obtain

(15) |

where and

(16) |

with given by (3).

Fixing the to be our axis, then the azimuthal angular integral of trivially gives , and we are left with a 2-dimensional integral to evaluate numerically^{4}^{4}4Evaluating the polar angle integral analytically is also possible, but then the expression can no longer be easily compared to the general -point function result.: in and in the polar angle, the one between and .

In redshift space, in the angular case is a function of both and and not only their difference as translation invariance is broken; while in the plane-parallel case one has depending on the direction of as isotropy is broken. Keeping in mind those two things, one can see that the above calculation is followed through transparently, except that now one has to use the redshift space in (15) defined through:

The above equation is again valid both for the plane-parallel as well as for the angular case. Note that appears alone (i.e. not as the difference ) only in through , which in turn depends on . Moreover, the dimensionality of the integral remains unchanged in redshift space.

## 4 General -point functions in real and redshift space

In this section we will give a general expression for the -point functions of . We start by noting that the -point function of equals the -point function of , since a shift in the mean value of a random variable does not change its cumulants. Any -point function can therefore be expressed through a linear combination of the moments (which include connected and disconnected pieces) , with . Thus, if we know , we can easily obtain the desired -point function of .

So, in this section we will calculate the moments , or more explicitly:

(18) |

where numerical superscripts allow us to distinguish the different ’s. Note that even though we could set without loss of generality in real space, we would like to keep the discussion applicable in the angular case in redshift space. So, we keep in . However, note that even in the angular case in redshift space, we still can use translation invariance in Lagrangian space – a fact which will simplify greatly our final result.

To make the notation more compact, let us construct the following column vectors of length by stacking , , etc.:

(19) |

where as before , and a superscript stands for the matrix transpose. Note that here stands either for the real-space or for its redshift-space counterpart, . The role of the tildes will become apparent below. As an example, if , then , where again a superscript denotes a vector label, while subscript – the vector index. Therefore, as an example, the fourth element of is given by .

With the above notation, we can automatically write down in the same way we calculated the 2-pt function:

(20) |

where the subscripts of , , etc. run over both the subscripts and superscripts of , , etc. as per their definition (4).

Let us define the displacement covariance:

(21) |

Explicitly in real space it is given by the following block form:

(22) |

where each block is a 3-by-3 matrix. We defined . Implicitly each is multiplied by , the identity matrix. The 3-by-3 matrices are given by (3). Note that we can make translation invariance in Lagrangian space explicit by using , and therefore can be safely set to zero in . In redshift space (22) still holds after replacing and . This is valid for the plane-parallel case where is a constant not depending on . For the angular case, one has to keep track towards which the ’s are pointing. We will make this dependence explicit only in the next section.

We can do the Gaussian integral above over in (20) to find

(23) |

Clearly, we could simply stop here and claim victory. However, the expression above is not explicitly translation invariant in Lagrangian space as it depends on . This results in the fact that for , the expression above involves a 6-dimensional integral, while our previous expression for , (15), involved just a 3-dimensional integral. So, let us do some more work and make translation invariance in Lagrangian space explicit. To do that we need to undo the integral in and work with (20).

### 4.1 Making translation invariance in Lagrangian space explicit

Let us subtract from all ’s, and from all ’s, and split in two pieces, and denote the resulting vectors (of total length ) as

(24) |

Comparing with (4), one can write:

(25) |

where by inspection we can read off to be a 3-by- matrix defined by stacking (3-by-3) identity matrices ():

(26) |

We defined to be a 3-by- matrix defined by stacking (3-by-3) identity matrix with (n-1) null matrices of size 3-by-3 :

(27) |

We also find to be a -by- matrix defined by

(28) |

Making the substitutions given by (4.1) in (20), and changing integration variables according to (the Jacobian of the transformation is 1), we can do the integral in by invoking translation invariance in Lagrangian space, which tells us that is really a function of only, and not of . The integral gives a delta function setting . This looks obscure until we plug in the values of and to find

which is nothing but the standard “the sum of ’s should equal zero” rule.

Using this value of , one can check that , which goes in the exponent of (20) through , drops out. Thus, in the angular case of redshift space, appears only in the covariance through in . Remember that we obtained the same result for the 2-point function in redshift space as well.

Having eliminated the and integrals, we are left with a Gaussian integral in . Performing it and using the expressions for , and one can show after a bit of linear algebra that is given by:

(29) | |||||

is the -vector given by (4.1), which we copy here in its explicit form:

(30) |

Before we write down , let us redefine the -vector so as not to make a reference to the irrelevant :

(31) |

where each is a column vector of length 3. The -by- covariance matrix then consists of 3-by-3 blocks given by

(32) |

where and correspond to the block of at position . Thus, and go between 1 and .

In real space, we can safely set above. In redshift space we recover (29) but with the replacement , where as one can easily guess:

(33) | |||||

### 4.2 Summary

To tersely summarize, our results are given by (29) with the definitions (30) and (31). For real space one should use the real-space displacement covariance, given by (32), while for redshift space one should instead substitute in (29) and use (33). The results are valid for both the plane-parallel as well as for the angular case in redshift space, with the line-of-sight vector entering through defined through (2.2). The fact that translation invariance is broken in the angular case is captured completely by entering through .

Let us compare our final result, (29), with our initial much more easily derived result, (23). We see that (29) explicitly shows translation invariance in Lagrangian space and involves an integral with dimensions smaller by 3. Reducing the dimensionality of the integral by that much results in an enormous speed-up when evaluating it numerically, thus justifying the algebra of the previous section.

## 5 Numerical implementation

In the previous section we gave the explicit expression for the moments – equation (29). The integral in (29) is over a positive-definite function and so can be done easily numerically. However, extracting the connected -point functions knowing the moments (with ), albeit trivial to write down, involves subtracting comparable quantities all of which are to obtain a quantity which is .

So, for the 2-point function, for example, (15) gives as an integral over a positive definite function. However, to get one needs to subtract from that integral, which is a recipe for a numerical disaster when is small and comparable to round-off errors – e.g. at large separations or high redshift.

This problem is especially exacerbated when one goes to higher and higher -points functions outside the non-linear regime, when applying the ZA makes sense. We can trace the problem to (7), which tells us that unlike Eulerian perturbation theory, Lagrangian perturbation theory gives us the total density, from which we need to subtract the mean to get the overdensity.

This paper does not solve this problem. Instead it ameliorates it by using the following trick. Let us first focus on in (15). The (i.e. the disconnected piece) can be plugged back into the integral by rewriting it as the right hand side of (15) but with an given by times the identity matrix. So, the disconnected piece is given by the rhs of (15) after setting the cross term, , to zero, thus making the two points ( and ) uncorrelated, i.e. disconnected. This comes at a cost. The integrand in (15) will then no longer be positive definite. However, at large , when , the integrand will quickly fall to zero.

The same trick can be easily applied for extracting higher order -point functions. For each disconnected piece, one has to use the same integrand as the one in (29) but after setting the ’s between the disconnected points to zero.

The above trick is implemented in ZelCa both for the 2-point and for the 3-point functions. ZelCa uses the Cuba^{5}^{5}5http://www.feynarts.de/cuba/ [6] library for multidimensional numerical integration, as well as the Eigen^{6}^{6}6http://eigen.tuxfamily.org C++ template library for linear algebra, which makes the ZA part of the code especially readable and easy to understand and modify as needed. We also provide a Sage^{7}^{7}7http://www.sagemath.org/ code for displaying the results, and also for calculating the tree-level 3-point function used in the next section.

## 6 Numerical results

### 6.1 The 2-dimensional redshift-space 2-pt correlation function

Results from ZelCa for the real-space 2-point function, as well as for the redshift-space monopole, quadrupole and hexadecapole, were already included in Figure 3 of [3]. By comparing them to the N-body results from [5], there we showed that the ZA and the N-body results show excellent agreement at large scales (Mpc) relevant for the BAO (see [3] for further details).

In this paper, we show the first results for the 2-dimensional redshift-space 2-point correlation function – the right panel of Figure 1. In the left panel of that figure we show the result of linear Standard Perturbation Theory (SPT) (e.g. [8]). The most notable difference between the two is the fact that the ZA produces a relatively smoother 2-pt function – a direct consequence of the smearing effects of the bulk flows, which the ZA captures, while linear theory does not.

One can also compare the figure to the upper-right panel of Figure 4 in [7], which shows results obtained using the LasDamas (Large suit of Dark matter simulations) mock catalogs^{8}^{8}8http://lss.phy.vanderbilt.edu/lasdamas/. We do not include Lagrangian bias in this calculation, and therefore a detailed quantitative comparison is not possible. Yet, even by eye one can see that the features in the mock catalog calculation are smoother than the linear theory result, and on par with the ZA prediction.

This should not come as a surprise – as already discussed at length in the companion to this paper, [3], we already expected that at the scales of the BAO, linear theory should receive corrections in real space, while the ZA should be correct to within for the matter density (again in real space). The discussion in Section 4.2 of that paper, shows that the ZA performs extremely well compared to linear theory for redshift space as well. Thus, Figure 1 serves as yet another confirmation of those results.

### 6.2 The real space 3-pt function

Even though we only focus on the 2-pt function in [3], it is clear that a similar analysis should apply to higher-order statistics – with the ZA capturing the main effects of the large-scale coherent motions in the universe. To check that intuition, in Figure 2 we show results for the real-space 3-pt function, , in the equilateral configuration. One can clearly see the effects of the bulk flows smearing the BAO peak when compared to the tree-level prediction (as given in e.g. [8]).

One can immediately object, however, that the ZA does not capture the tree-level result for , given that the overdensity it predicts matches SPT only at the linear level. Therefore, we also calculate the following quantity:

(34) |

which includes the piece of the tree-level, which the ZA misses. We plot in Figure 2 as well (line denoted “ZA + tree-level residual”). It is quite interesting to note how close is to (within 10%), i.e. how small the piece of the tree-level result that the ZA misses is.^{9}^{9}9The tree-level prediction for in SPT is given by eq.(157) of [8]. It is obtained using the standard perturbation theory kernel, given in e.g. [9]. Note that [9] also gives the kernel in the ZA, . Comparing the two, we see that the ZA captures the coefficients of the four terms of the tree-level , appearing in eq.(157) of [8], with errors of 30%, 0, 0 and 75%, respectively. At the same time, the difference between the tree-level and ZA results is at around the BAO peak, in direct analogy with the 2-pt correlation function results [3].

The difference between and can be thought of as the typical error one makes by using the ZA. That difference being small, in what follows, we will assume that is close to the true result. Whether that is indeed the case remains to be seen, but we consider it our best guess in light of [3].

Then the fact that is so close to the truth (as approximated by ) can be qualitatively understood by remembering that CDM particle trajectories are extremely well-captured by the ZA (see [3], and in particular their Section 8). Therefore, the predicted density field using the full unexpanded ZA is quite close to the true density field (e.g. Fig. 1 in [3]). Indeed, one finds that the cross-correlation coefficient between the true and ZA-predicted density fields is close to 1 well into the non-linear regime, unlike the SPT prediction, which decorrelates from the truth at relatively large scales [10, 11]. A good cross-correlation coefficient has already been qualitatively shown to imply well-recovered -point functions [12, 3]. So, it is not surprising that if one keeps the ZA unexpanded (as emphasied in [3]), one recovers a very good approximation for the density 3-pt statistics at large scales.

## 7 Summary

In this paper we presented an expression for the matter -point functions in the Zel’dovich approximation in both real and redshift space, with the results for the latter valid both for the plane-parallel limit, as well as for the angular case. We also discussed the numerical implementation of our results and provide the code, called ZelCa, generating the results of this paper.

Using that code we obtained numerical results for the 2-dimensional redshift-space 2-pt correlation function. We find a qualitative match (as we do not include bias) between the ZA prediction and the result obtained from mock catalogs, while the tree-level result gives a BAO feature which is too sharp, as already expected from the results for the redshift-space multipoles [3].

We also show results for the equilateral configuration for the 3-pt function around the acoustic peak and demonstrate the expected smearing of the BAO due to bulk motions, similar to the smearing observed for the 2-pt function. We calculate the tree-level correction to the ZA result and show it to contribute only at , reassuringly in line with our analysis in [3].

The road from here on involves adding more realism to the ZA. Fortunately, a lot of progress in that direction was already done for the 2-pt function in [5]. Those authors showed how one can include Lagrangian bias and higher order corrections to the 2-pt function in the ZA. Their analysis can be transparently followed for the higher -point functions as well. However, one should always keep in mind that short scales may introduce non-negligible corrections to the analytical perturbative results, which need to be calibrated from simulations through the effective field theory formalism (see [13]).

## References

- [1] D. H. Weinberg, M. J. Mortonson, D. J. Eisenstein, C. Hirata, A. G. Riess, and E. Rozo, Observational probes of cosmic acceleration, \physrep 530 (Sept., 2013) 87–255, [arXiv:1201.2434].
- [2] M. Manera, R. Scoccimarro, W. J. Percival, L. Samushia, C. K. McBride, A. J. Ross, R. K. Sheth, M. White, B. A. Reid, A. G. Sánchez, R. de Putter, X. Xu, A. A. Berlind, J. Brinkmann, C. Maraston, B. Nichol, F. Montesano, N. Padmanabhan, R. A. Skibba, R. Tojeiro, and B. A. Weaver, The clustering of galaxies in the SDSS-III Baryon Oscillation Spectroscopic Survey: a large sample of mock galaxy catalogues, \mnras 428 (Jan., 2013) 1036–1054, [arXiv:1203.6609].
- [3] S. Tassev, Lagrangian or Eulerian; Real or Fourier? Not All Approaches to Large-Scale Structure Are Created Equal, ArXiv e-prints (Nov., 2013) [arXiv:1311.4884].
- [4] Y. B. Zel’dovich, Gravitational instability: An approximate theory for large density perturbations., \aap 5 (Mar., 1970) 84–89.
- [5] J. Carlson, B. Reid, and M. White, Convolution Lagrangian perturbation theory for biased tracers, \mnras 429 (Feb., 2013) 1674–1685, [arXiv:1209.0780].
- [6] T. Hahn, CUBA-a library for multidimensional numerical integration, Computer Physics Communications 168 (June, 2005) 78–95, [hep-ph/0404043].
- [7] N. Padmanabhan, X. Xu, D. J. Eisenstein, R. Scalzo, A. J. Cuesta, K. T. Mehta, and E. Kazin, A 2 per cent distance to z = 0.35 by reconstructing baryon acoustic oscillations - I. Methods and application to the Sloan Digital Sky Survey, \mnras 427 (Dec., 2012) 2132–2145, [arXiv:1202.0090].
- [8] F. Bernardeau, S. Colombi, E. Gaztañaga, and R. Scoccimarro, Large-scale structure of the Universe and cosmological perturbation theory, \physrep 367 (Sept., 2002) 1–248, [astro-ph/0112551].
- [9] R. Scoccimarro and J. Frieman, Loop Corrections in Nonlinear Cosmological Perturbation Theory, \apjs 105 (July, 1996) 37, [astro-ph/9509047].
- [10] S. Tassev and M. Zaldarriaga, The mildly non-linear regime of structure formation, \jcap 4 (Apr., 2012) 13, [arXiv:1109.4939].
- [11] P. Coles, A. L. Melott, and S. F. Shandarin, Testing approximations for non-linear gravitational clustering, \mnras 260 (Feb., 1993) 765–776.
- [12] S. Tassev, M. Zaldarriaga, and D. J. Eisenstein, Solving large scale structure in ten easy steps with COLA, \jcap 6 (June, 2013) 36, [arXiv:1301.0322].
- [13] R. A. Porto, L. Senatore, and M. Zaldarriaga, The Lagrangian-space Effective Field Theory of Large Scale Structures, ArXiv e-prints (Nov., 2013) [arXiv:1311.2168].