# Measuring the Universe with galaxy redshift surveys

## Abstract

Galaxy redshift surveys are one of the pillars of the current standard cosmological model and remain a key tool in the experimental effort to understand the origin of cosmic acceleration. To this end, the next generation of surveys aims at achieving sub-percent precision in the measurement of the equation of state of dark energy $w$ and the growth rate of structure $f$. This, however, requires comparable control over systematic errors, stressing the need for improved modelling methods. In this contribution we review, at an introductory level, some highlights of the work done in this direction by the Darklight project^{1}^{2}.

## 1 Introduction

A major achievement in cosmology over the 20th century has been the detailed reconstruction of the large-scale structure of the Universe around us. Started in the 1970s, these studies developed over the following decades into the industry of redshift surveys, beautifully exemplified by the Sloan Digital Sky Survey (SDSS) in its various incarnations (e.g. [1]). These maps have covered in detail our “local” Universe (i.e. redshifts ) and only recently have we started exploring comparable volumes at larger redshifts, where the evolution of galaxies and structure over time can be detected (see e.g. [2]). Fig. 1 shows a montage using data from some of these surveys, providing a visual impression of the now well-established sponge-like topology of the large-scale galaxy distribution and how it stretches back into the younger Universe.

In addition to their purely cartographic beauty, these maps provide a quantitative test of the theories of structure formation and of the composition of the Universe. Statistical measurements of the observed galaxy distribution are in fact one of the experimental pillars upon which the current “standard” model of cosmology is built. Let us define the matter over-density (or fluctuation) field, with respect to the mean density $\bar{\rho}$, as $\delta(\mathbf{x}) = [\rho(\mathbf{x}) - \bar{\rho}]/\bar{\rho}$; this can be described in terms of Fourier harmonic components as

$$\delta(\mathbf{x}) = \frac{V}{(2\pi)^3} \int \delta_{\mathbf{k}}\, e^{-i\mathbf{k}\cdot\mathbf{x}}\, d^3k \tag{1}$$

where $V$ is the volume considered. The power spectrum is then defined by the variance of the Fourier modes:

$$P(k) = \langle |\delta_{\mathbf{k}}|^2 \rangle \tag{2}$$
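As a minimal illustration of the statement that the power spectrum is the variance of the Fourier modes, the sketch below computes $|\delta_k|^2$ for a one-dimensional periodic toy field made of a single cosine wave. The discrete convention $\delta_k = N^{-1}\sum_j \delta(x_j)\,e^{+ikx_j}$, the grid size and the wave amplitude are assumptions chosen for the example; a real estimate would average $|\delta_k|^2$ over many modes in three dimensions and correct for shot noise and the survey window.

```python
import cmath
import math


def fourier_modes(delta):
    """Direct discrete Fourier transform of a periodic 1D field.

    Assumed convention: delta_k = (1/N) * sum_j delta(x_j) * exp(+i k x_j),
    so that delta(x) = sum_k delta_k * exp(-i k x).
    """
    n = len(delta)
    modes = []
    for m in range(n):
        total = 0j
        for j, value in enumerate(delta):
            total += value * cmath.exp(2j * math.pi * m * j / n)
        modes.append(total / n)
    return modes


def power(modes):
    """Squared modulus of each mode: a single-realisation estimate of P(k)."""
    return [abs(dk) ** 2 for dk in modes]


# Toy field: one cosine wave of small amplitude (illustrative values only).
n_cells = 64
amplitude = 0.1
wave_number = 3
delta = [
    amplitude * math.cos(2 * math.pi * wave_number * j / n_cells)
    for j in range(n_cells)
]
pk = power(fourier_modes(delta))
```

For this toy field all the power sits in the two modes $\pm k$ of the input wave, each carrying $A^2/4$, where $A$ is the wave amplitude.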

The observed number density of galaxies $n_g(\mathbf{x})$ is related to the matter fluctuation field through the bias parameter $b$ by

$$n_g(\mathbf{x}) = \bar{n}_g\,[1 + b\,\delta(\mathbf{x})] \tag{3}$$

which corresponds to assuming that $\delta_g = b\,\delta$, where $\delta_g$ is the galaxy over-density. This linear and scale-independent relation provides an accurate description of galaxy clustering at large scales, although it breaks down in the quasi-linear regime below scales of [8]. In general, $b$ depends on galaxy properties, as we shall discuss in more detail in Sect. 3. From the hypothesis of linear bias, it follows that $P_{gg}(k) = b^2 P(k)$, where $P_{gg}(k)$ is the observed galaxy-galaxy power spectrum. This connection allows us to use measurements of $P_{gg}(k)$ to constrain the values of cosmological parameters that regulate the shape of $P(k)$.

Fig. 2 [9] shows an example of such measurements: the left panel plots four estimates of the power spectrum (more precisely, its monopole, i.e. the average of $P(\mathbf{k})$ over spherical shells) obtained at from the VIPERS survey data of Fig. 1 (see also Sect. 2.2). In the central and right panels, we show the posterior distribution of the mean density of matter $\Omega_M$ and the baryon fraction $f_B$ from a combined likelihood analysis of the four measurements; these are compared to similar estimates from other surveys and from the Planck CMB anisotropy constraints [10]. More precisely, the galaxy power spectrum shape on large scales probes the combination $\Omega_M h$, where $h = H_0/(100\ \mathrm{km\,s^{-1}\,Mpc^{-1}})$. Such comparisons provide us with important tests of the ΛCDM model, with the estimate from VIPERS straddling Planck and local measurements.

If one goes beyond the simple shape of angle-averaged quantities, two-point statistics of the galaxy distribution contain further powerful information, which is key to understanding the origin of the mysterious acceleration of cosmic expansion discovered less than twenty years ago [11, 12]. First, tiny “baryonic wiggles” in the shape of the power spectrum define a specific, well-known comoving spatial scale, corresponding to the sound horizon scale at the epoch when baryons were dragged into the pre-existing dark-matter potential wells. In fact, it turns out that there are enough baryons in the cosmic mixture to influence the dominant dark-matter fluctuations [13, 7] and leave in the galaxy distribution a visible signature of the pre-recombination acoustic oscillations in the baryon-radiation plasma. Known as Baryonic Acoustic Oscillations (BAO), these features provide us with a formidable standard ruler to measure the expansion history of the Universe $H(z)$, complementary to what can be done using Type Ia supernovae as standard candles (see e.g. [14] for the latest measurements from the SDSS-BOSS sample).

Secondly, the observed redshift maps are distorted by the contribution of peculiar velocities, which cannot be separated from the cosmological redshift. This introduces a measurable anisotropy in our clustering statistics, known as Redshift Space Distortions (RSD), an effect that provides us with a powerful way to probe the growth rate of structure $f$. This key information can break the degeneracy on whether the observed expansion history is due to the presence of the extra contribution of a cosmological constant (or dark energy) in Einstein’s equations or rather requires a more radical modification of gravity theory. While RSD were first described in the 1980s [15, 16], their potential in the context of understanding the origin of cosmic acceleration was fully recognized only recently [17]; nowadays they are considered one of the potentially most powerful “dark energy tests” expected from the next generation of cosmological surveys, in particular the ESA mission Euclid [18], of which the Milan group is one of the original founders.

## 2 Measuring the growth rate of structure from RSD

### 2.1 Improved models of redshift-space distortions

Translating galaxy clustering observations into precise and accurate measurements of the key cosmological parameters, however, requires modelling the effects of non-linear evolution, galaxy bias (i.e. how galaxies trace mass) and redshift-space distortions themselves. The interest in RSD precision measurements stimulated work to verify the accuracy of these measurements [19, 20]. Early estimates – focused essentially on measuring , given that in the context of General Relativity (e.g. [21]) – adopted empirical non-linear corrections to the original linear theory by Kaiser; this is the case of the so-called “dispersion model” [22], which in terms of the power spectrum of density fluctuations is expressed as

$$P^s(k,\mu) = D(k\mu\sigma_{12})\,(1+\beta\mu^2)^2\, b^2 P(k) \tag{4}$$

where $P^s(k,\mu)$ is the redshift-space power spectrum, which depends both on the amplitude $k$ and the orientation $\mu$ of the Fourier mode with respect to the line of sight, $P(k)$ is the real-space (isotropic) power spectrum of the matter fluctuation field and $\beta = f/b$, with $f$ being the growth rate of structure and $b$ the linear bias of the specific population of halos (or galaxies) used. The latter is defined as the ratio of the rms clustering amplitude of galaxies to that of the matter, conventionally measured in spheres of $8\,h^{-1}$ Mpc radius, $b = \sigma_8^{\rm gal}/\sigma_8$. For what will follow later, it is useful to note that

$$\beta = \frac{f}{b} \tag{5}$$

can be recast as

$$\beta\,\sigma_8^{\rm gal} = f\,\sigma_8 \tag{6}$$

which combines two directly measurable quantities to the left, showing that what we actually measure is the combination of the growth rate and the rms amplitude of clustering, $f\sigma_8$. This is what nowadays is customarily plotted when presenting measurements of the growth rate from redshift surveys (e.g. Fig. 8).

Going back to eq. (4), the term $D(k\mu\sigma_{12})$ is usually either a Lorentzian or a Gaussian function, empirically introducing a nonlinear damping to the Kaiser linear amplification, with the Lorentzian (corresponding to an exponential in configuration space) normally providing a better fit to the galaxy data [23]. This term is regulated by a second free parameter, $\sigma_{12}$, which corresponds to an effective (scale-independent) line-of-sight pairwise velocity dispersion. Fig. 3 (from [20]) shows how estimates of $\beta$ using the dispersion model can be plagued by systematic errors as large as 10%, depending on the kind of galaxies (here dark matter halos) used. With the next generation of surveys aiming at 1% precision by collecting several tens of millions of redshifts, such a level of systematic error is clearly unacceptable.
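The dispersion model can be sketched in a few lines of code. The example below multiplies the Kaiser amplification $(1+\beta\mu^2)^2$ by a damping term and integrates over $\mu$ to obtain the monopole. The specific normalisations of the Lorentzian and Gaussian damping, $[1 + (k\mu\sigma_{12})^2/2]^{-1}$ and $\exp[-(k\mu\sigma_{12})^2/2]$, are one common convention assumed here (others appear in the literature), $\sigma_{12}$ is taken in length units, and all function names and numerical values are illustrative.

```python
import math


def damping(k, mu, sigma12, kind="lorentzian"):
    """Empirical small-scale damping D(k*mu*sigma12); normalisation is assumed."""
    x = k * mu * sigma12
    if kind == "lorentzian":
        return 1.0 / (1.0 + 0.5 * x * x)
    if kind == "gaussian":
        return math.exp(-0.5 * x * x)
    raise ValueError("kind must be 'lorentzian' or 'gaussian'")


def dispersion_model(k, mu, p_real, beta, bias, sigma12, kind="lorentzian"):
    """Redshift-space power: damping times Kaiser factor times b^2 P(k)."""
    kaiser = (1.0 + beta * mu * mu) ** 2
    return damping(k, mu, sigma12, kind) * kaiser * bias ** 2 * p_real


def monopole(k, p_real, beta, bias, sigma12, kind="lorentzian", n_steps=2000):
    """Angle average over mu with the midpoint rule (integrand is even in mu)."""
    step = 1.0 / n_steps
    total = 0.0
    for i in range(n_steps):
        mu = (i + 0.5) * step
        total += dispersion_model(k, mu, p_real, beta, bias, sigma12, kind)
    return total * step


def kaiser_monopole(p_real, beta, bias):
    """Analytic linear-theory monopole, recovered when sigma12 = 0."""
    return (1.0 + 2.0 * beta / 3.0 + beta ** 2 / 5.0) * bias ** 2 * p_real
```

With $\sigma_{12} = 0$ the numerical monopole reduces to the linear Kaiser result, while any non-zero dispersion lowers it, which is the degeneracy between $\beta$ and $\sigma_{12}$ that a fit has to disentangle.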

Exploring how to achieve this overall goal by optimising measurements of galaxy clustering and RSD has been one of the main goals of the Darklight project, supported by an ERC Advanced Grant awarded in 2012. Darklight focused on developing new techniques, testing them on simulated samples, and then applying them to the new data from the VIMOS Public Extragalactic Redshift Survey (VIPERS), which was built in parallel.

After assessing the limitations of existing RSD models [20, 24], the first goal of Darklight has been to develop refined theoretical descriptions. This work followed two branches: one, starting from first principles, was based on revisiting the so-called streaming model approach; the second, more pragmatic, aimed at refining the application to real data of the best models available at the time, in particular the “TNS” model [25]. This more data-oriented line of development also included exploring the advantages of specific tracers of large-scale structure in reducing the impact of non-linear effects.

The first approach [26] focused on the so-called streaming model [27], which in the more general formulation by Scoccimarro [28] (see also [29]) describes the two-point correlation function in redshift space $\xi_S$ as a function of its real-space counterpart $\xi_R$:

$$1 + \xi_S(s_\perp, s_\parallel) = \int dr_\parallel\, [1 + \xi_R(r)]\, \mathcal{P}(r_\parallel - s_\parallel \,|\, \mathbf{r}) \tag{7}$$

Here quantities noted with $\perp$ and $\parallel$ correspond to the components of the pair separation (in redshift or real space) respectively perpendicular and parallel to the line of sight, with $r^2 = r_\parallel^2 + r_\perp^2$ and $r_\perp = s_\perp$. The interest in the streaming model is that this expression is exact: knowing the form of the pairwise velocity distribution function $\mathcal{P}$ at any separation $\mathbf{r}$, a full mapping of real- to redshift-space correlations is provided. The problem is that this is a virtually infinite family of distribution functions.

The essential question addressed in [26] has been whether a sufficiently accurate description of this family (and thus of RSD) is still possible with a reduced number of degrees of freedom. It is found that, at a given galaxy separation $\mathbf{r}$, they can be described as a superposition of virtually infinite Gaussian functions, whose mean and dispersion are in turn distributed according to a bivariate Gaussian, with its own mean and covariance matrix. A recent extension of this work [30] shows that such a “Gaussian-Gaussian” model cannot fully match the level of skewness observed at small separations, in particular when applied to catalogues of dark matter halos. The authors thus generalize the model by allowing for the presence of a small amount of local skewness, meaning that the velocity distribution is obtained as a superposition of quasi-Gaussian functions. In its simplest formulation, this improved model takes as input the real-space correlation function and the first three velocity moments (plus two well-defined nuisance parameters) and returns an accurate description of the anisotropic redshift-space two-point correlation function down to very small scales ( for dark matter particles and virtually zero for halos). To be applied to real data to estimate the growth rate of structure $f$, the model still needs a better theoretical and/or numerical understanding of how the velocity moments depend on $f$ on small scales, as well as tests on mock catalogues including realistic galaxies.
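To make the real- to redshift-space mapping of the streaming model concrete, the sketch below evaluates the line-of-sight integral numerically for a deliberately simplified case: a single Gaussian pairwise velocity distribution with zero mean and a fixed, scale-independent dispersion, applied to a power-law real-space correlation function. These choices, and all parameter values, are assumptions made for illustration; this is not the Gaussian-Gaussian model of [26], in which mean and dispersion are themselves distributed and depend on separation.

```python
import math


def xi_real(r, r0=5.0, gamma=1.8):
    """Toy power-law real-space correlation function (illustrative parameters)."""
    return (r / r0) ** (-gamma)


def gaussian_pdf(v, mean, sigma):
    """Normalised Gaussian distribution of line-of-sight displacements."""
    return math.exp(-0.5 * ((v - mean) / sigma) ** 2) / (sigma * math.sqrt(2.0 * math.pi))


def xi_redshift(s_perp, s_par, xi_r=xi_real, sigma=3.0, half_width=60.0, n_steps=6000):
    """Streaming integral over r_parallel with the midpoint rule.

    Displacements are in length units (velocity divided by the Hubble rate).
    Requires s_perp > 0 so that the power law is never evaluated at r = 0.
    """
    step = 2.0 * half_width / n_steps
    total = 0.0
    for i in range(n_steps):
        r_par = s_par - half_width + (i + 0.5) * step
        r = math.hypot(s_perp, r_par)
        total += (1.0 + xi_r(r)) * gaussian_pdf(r_par - s_par, 0.0, sigma)
    return total * step - 1.0
```

In this zero-mean toy case only the small-scale smearing along the line of sight is reproduced: the clustering amplitude at small $s_\perp$ and $s_\parallel = 0$ is reduced with respect to real space, while the large-scale flattening would require a non-zero, scale-dependent mean infall velocity.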

The second, parallel approach followed in Darklight was to work on the “best” models existing in the literature, optimising their application to real data. The natural extensions to the dispersion model (4) start from the Scoccimarro [28] expression

$$P^s(k,\mu) = D(k\mu\sigma_{12})\left[b^2 P_{\delta\delta}(k) + 2 f b\,\mu^2 P_{\delta\theta}(k) + f^2\mu^4 P_{\theta\theta}(k)\right] \tag{8}$$

where $P_{\delta\theta}$ and $P_{\theta\theta}$ are respectively the so-called density-velocity divergence cross-spectrum and the velocity divergence auto-spectrum, while $P_{\delta\delta}$ is the usual matter power spectrum. If one then also accounts for the non-linear mode coupling between the density and velocity-divergence fields, two more terms arise inside the parenthesis, named $C_A$ and $C_B$, leading to the TNS model by Taruya and collaborators [25].
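A short sketch may help to see how the Scoccimarro expression generalises the dispersion model. It assumes the usual form of the model, in which the density, cross and velocity-divergence spectra are weighted by $b^2$, $2fb\mu^2$ and $f^2\mu^4$ respectively; the function names and the numerical values used in the check are illustrative.

```python
def scoccimarro_model(mu, p_dd, p_dt, p_tt, growth_rate, bias, damping=1.0):
    """Redshift-space power from the three real-space spectra.

    p_dd, p_dt, p_tt: density auto-, density-velocity cross- and
    velocity-divergence auto-spectrum evaluated at the same k.
    damping: value of the small-scale damping term at this (k, mu).
    """
    return damping * (
        bias ** 2 * p_dd
        + 2.0 * growth_rate * bias * mu ** 2 * p_dt
        + growth_rate ** 2 * mu ** 4 * p_tt
    )


def kaiser_model(mu, p_dd, growth_rate, bias):
    """Linear Kaiser limit, with beta = f / b."""
    beta = growth_rate / bias
    return (1.0 + beta * mu ** 2) ** 2 * bias ** 2 * p_dd
```

When the three spectra coincide, as in the linear regime, and the damping is switched off, the expression collapses to the Kaiser factor times $b^2 P_{\delta\delta}$; the extra freedom only matters where non-linear evolution makes the velocity spectra depart from the density one.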

A practical problem in the application of either of these two models is that the values of $P_{\delta\theta}$ and $P_{\theta\theta}$ cannot be measured from the data. As such, they require empirical fitting functions to be calibrated using numerical simulations [31]. As part of the Darklight work, we used the DEMNUni simulations (see Sect. 4) to derive improved fitting functions in different cosmologies [32]:

$$P_{\delta\theta}(k) = \left[P_{\delta\delta}(k)\,P^{\rm lin}(k)\,e^{-k/k^*}\right]^{1/2} \tag{9}$$

$$P_{\theta\theta}(k) = P^{\rm lin}(k)\,e^{-k/k^*} \tag{10}$$

where $P^{\rm lin}(k)$ is the linear matter power spectrum and $k^*$ is a parameter representing the typical damping scale of the velocity power spectra, which is well described as $1/k^* = p_1\sigma_8^{p_2}$, where $p_1, p_2$ are the only two parameters that need to be calibrated from the simulations. These forms for $P_{\delta\theta}$ and $P_{\theta\theta}$ have valuable, physically motivated properties: they naturally converge to $P_{\delta\delta}(k)$ in the linear regime, including a dependence on redshift through $\sigma_8(z)$. They represent a significant improvement over previous implementations of the Scoccimarro and TNS models and allowed us to extend their application to smaller scales and to the high redshifts covered by VIPERS.
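The behaviour of such fitting functions can be explored with a small sketch. It assumes exponentially damped forms, $P_{\theta\theta} = P^{\rm lin} e^{-k/k^*}$ and $P_{\delta\theta} = (P_{\delta\delta} P^{\rm lin} e^{-k/k^*})^{1/2}$, with a damping scale set by $1/k^* = p_1\sigma_8^{p_2}$. The values of `p1` and `p2` used in the check are placeholders, not the calibrated ones, which must be taken from the simulations.

```python
import math


def inverse_damping_scale(sigma8, p1, p2):
    """1/k* as a power law of sigma8; p1 and p2 are calibrated on simulations."""
    return p1 * sigma8 ** p2


def p_theta_theta(k, p_lin, sigma8, p1, p2):
    """Velocity-divergence auto-spectrum: damped linear spectrum (assumed form)."""
    return p_lin * math.exp(-k * inverse_damping_scale(sigma8, p1, p2))


def p_delta_theta(k, p_dd, p_lin, sigma8, p1, p2):
    """Density-velocity cross-spectrum: geometric mean of the density
    spectrum and the damped linear spectrum (assumed form)."""
    damped_linear = p_lin * math.exp(-k * inverse_damping_scale(sigma8, p1, p2))
    return math.sqrt(p_dd * damped_linear)
```

Because the redshift dependence enters only through $\sigma_8(z)$, a lower clustering amplitude at higher redshift gives a larger $k^*$ and hence weaker damping at fixed $k$, provided `p1` and `p2` are positive.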

### 2.2 Application to real data: optimising the samples

The performance, in terms of systematic error, of any RSD model when applied to real data does not depend only on the quality of the model itself. The kind of tracers of the density and velocity fields that are used can significantly enhance or reduce some of the effects we are trying to model and correct. This means that, in principle, we may be able to identify specific sub-samples of galaxies for which the needed non-linear corrections to RSD models are intrinsically smaller. This could be an alternative to making our models more and more complex, as happens for the full galaxy population.

Such an approach becomes feasible if the available galaxy survey was constructed with a broad selection function and supplemented by extensive ancillary information (e.g. multi-band photometry, from which spectral energy distributions, colours, stellar masses, etc. can be obtained). This allows a wide space in galaxy physical properties to be explored, experimenting with clustering and RSD measurements using different classes of tracers (and their combination), as e.g. red vs. blue galaxies, groups, clusters. This is the case, for example, of the Sloan Digital Sky Survey main sample [6]. The VIMOS Public Extragalactic Redshift Survey (VIPERS) [3] was designed with the idea of extending this concept to , i.e. when the Universe was around half its current age, providing Darklight with a state-of-the-art playground.

VIPERS is a new statistically complete redshift survey, constructed between 2008 and 2016 as one of the “ESO Large Programmes”, exploiting the unique capabilities of the VIMOS multi-object spectrograph at the Very Large Telescope (VLT) [5]. It has secured redshifts for galaxies with magnitude (out of spectra) over a total area of square degrees, tiled with a mosaic of 288 VIMOS pointings. Target galaxies were selected from the two fields (W1 and W4) of the Canada-France-Hawaii Telescope Legacy Survey Wide catalogue (CFHTLS–Wide), benefiting from its excellent image quality and photometry in five bands ()^{3}.

VIPERS released publicly its final catalogue and a series of new scientific results in November 2016. More details on the survey construction and the properties of the sample can be found in [5, 4, 3].

Fig. 5 shows two measurements of the anisotropic two-point correlation function in redshift space (i.e. what is called in eq. (7); here and ), using the VIPERS data. In this case the sample has been split into two classes, i.e. blue and red galaxies, defined on the basis of their rest-frame photometric colour (see [34] for details). The signature of the linear streaming motions produced by the growth of structure is evident in the overall flattening of the contours along the line-of-sight direction (). These plots also show how blue galaxies (left) are less affected by small-scale nonlinear motions, i.e. those of high-velocity pairs within virialised structures. These produce the small-scale stretching of the contours along (vertical direction), which is instead evident in the central part of the red galaxy plot on the right. For this reason, blue galaxies turn out to be better tracers of RSD, for which a simpler model is sufficient, as shown in Fig. 6. When using the full galaxy population, the best performing model is the TNS by Taruya et al. [25] (left panel), while when we limit the sample to luminous blue galaxies only, it is sufficient to use the simpler nonlinear corrections by Scoccimarro [28] (filled circles, right panel); open circles correspond to the simplest model, i.e. the standard dispersion model [22], which is not sufficient even in this case. See [34] for details.

### 2.3 RSD from galaxy outflows in cosmic voids

Cosmic voids, i.e. the large under-dense regions visible also in Fig. 1, represent an interesting new way to look at the data from galaxy redshift surveys. As loose as they may appear, over the past few years they have proved to be able to yield quantitative cosmological constraints on the growth of structure. Indeed, growth-induced galaxy peculiar velocities tend to outflow radially from voids, which leaves a specific mark in the observed void-galaxy cross-correlation function (see e.g. [35]). The dense sampling of VIPERS makes it excellent for looking for cosmic voids at high redshift. Fig. 7 shows an example of how a catalogue of voids was constructed from these data [36].

The Darklight contribution to this new research path has been presented recently [37]. By modelling the void-galaxy cross-correlation function of VIPERS, a further complementary measurement of the growth rate of structure has been obtained. This value is plotted in Fig. 8, which provides a summary of all VIPERS estimates, plotted in the customary form $f\sigma_8$ (see Sect. 2.1 for details). The figure also includes one further measurement, based on a joint analysis of RSD and galaxy-galaxy lensing [38], which has not been discussed here. In addition, one more analysis is in progress, based on the linearisation technique called “clipping” [39].

Such a multifaceted approach to estimating the growth rate of structure clearly represents an important cross-check of residual systematic errors in each single technique. We stress again how this has been made possible thanks to the broad “information content” of the VIPERS survey, which provides us with an optimal compromise (for these redshifts) between a large volume, a high sampling rate and extensive information on galaxy physical properties.

## 3 Optimal methods to derive cosmological parameters

The cosmological information we are interested in is encoded in the two-point statistics of the matter density field, i.e. its correlation function $\xi(r)$ or, in Fourier space, its power spectrum $P(k)$. As we have seen in the Introduction, this is connected to the observed galaxy fluctuations as $P_{gg}(k) = b^2 P(k)$, with $\delta_g = b\,\delta$. The galaxy bias depends in general on the galaxy properties, such as their luminosity and morphology, as well as the environment in which they are found (in groups or in isolation). Thus, in this context the bias terms are nuisance parameters that are marginalized over in the analysis. However, the precision with which the measurement can be made depends very much on these parameters, as they set the amplitude of the power spectrum and the effective signal-to-noise ratio.

Going beyond the standard approach to estimate cosmological parameters, as e.g. used in the analysis of Fig. 2, in Darklight we have investigated and applied optimal methods given the observed constraints (luminosity function and bias). We can formulate this as a forward modelling problem through Bayes’ theorem, which tells us how the measurements relate to the model:

$$p(\delta, P(k), b, \bar{n} \,|\, d) \propto p(d \,|\, \delta, P(k), b, \bar{n})\; p(\delta, P(k), b, \bar{n}) \tag{11}$$

On the left-hand side, the posterior describes the joint distribution of the model parameters given the data $d$, here explicitly written as the density field $\delta$, its power spectrum $P(k)$, the galaxy bias $b$ and the mean number density $\bar{n}$, but we can generalize to the underlying cosmological parameters. The posterior is factored into the likelihood and prior terms on the right-hand side. To evaluate the posterior we must assume forms for these functions. We begin by assuming multi-variate Gaussian distributions for the likelihood and priors since these forms fully encode the information contained in the power spectrum or correlation function statistics. In this limit the maximum-likelihood solution is given by the Wiener filter. In [45] we demonstrate that in this limit the solution is optimal in the sense that it minimizes the variance on the density field and power spectrum.
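The Wiener filter mentioned here has a particularly simple form when signal and noise covariances are both diagonal in Fourier space, which is the assumption made in the sketch below. Each observed galaxy mode $d = b\,\delta + \epsilon$ is weighted by $bS/(b^2S + N)$, with $S$ the prior power of the matter mode and $N$ the noise power (for instance shot noise). A real survey analysis such as [45] must handle the mask and selection function, which couple the modes and are ignored in this toy version; all names are illustrative.

```python
def wiener_weight(signal_power, noise_power, bias=1.0):
    """Weight applied to an observed galaxy mode to estimate the matter mode."""
    return bias * signal_power / (bias ** 2 * signal_power + noise_power)


def wiener_filter(data_modes, signal_powers, noise_powers, bias=1.0):
    """Mode-by-mode Wiener estimate of the matter field from galaxy modes."""
    return [
        wiener_weight(s, n, bias) * d
        for d, s, n in zip(data_modes, signal_powers, noise_powers)
    ]


def residual_variance(signal_power, noise_power, bias=1.0):
    """Posterior variance of the matter mode after filtering."""
    return signal_power * noise_power / (bias ** 2 * signal_power + noise_power)
```

Modes with high signal-to-noise are passed almost unchanged (after dividing out the bias), while noise-dominated modes are suppressed towards zero, and the residual variance is always smaller than the prior power $S$.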

Fig. 9 shows one possible reconstruction of the VIPERS density field. It represents a single step in the Monte Carlo chain used to sample the full posterior distribution as presented in [45]. In this work we characterized the full joint posterior distribution of the density field, the matter power spectrum, RSD parameters, linear bias and luminosity function. These terms, particularly since they are estimated from a single set of observations, are correlated and the analysis naturally reveals these correlations.

A notable aspect of this analysis is that we optimally use diverse information including the luminosity function, density field and power spectrum to infer cosmological parameters and it becomes even more interesting with additional observables. We can envision simultaneous inference using cluster counts or cosmic shear. Generalizing requires putting a full dynamical model for large-scale structure in the likelihood term effectively moving the likelihood analysis to the initial conditions. Observational systematics may be naturally included as well.

## 4 A new kid in town: massive neutrinos

The non-vanishing neutrino mass, implied by the discovery of neutrino flavour oscillations, has important consequences for our analysis of the large-scale structure in the Universe. Even if sub-dominant, the neutrino contribution suppresses to some extent the growth of fluctuations on specific scales, producing a deformation of the shape of the total matter power spectrum. Given current upper limits on the sum of the masses ( eV at % confidence [14]), the expected effect corresponds to a few percent change in the amplitude of total matter clustering. In the era of precision cosmology, neutrinos are an ingredient that cannot be neglected anymore. Conversely, future surveys like Euclid may eventually be able to obtain an estimate of the total mass of neutrinos with a precision that surpasses ground-based experiments [46]. To achieve this goal, we must be able to: (a) describe how these effects are mapped from the matter to the galaxy power spectrum, i.e. what we measure; (b) distinguish these spectral deviations from those due to non-linear clustering, and to the presence of other possible contributions, e.g. forms of dark energy beyond the cosmological constant, like quintessence or in general an evolving equation of state of dark energy $w(a)$.
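The order of magnitude of the effect can be checked with a short calculation. The sketch below is not taken from the analyses discussed here: it uses the standard relation $\Omega_\nu h^2 \simeq \sum m_\nu / 93.14\ \mathrm{eV}$ and the well-known linear-theory rule of thumb $\Delta P/P \simeq -8 f_\nu$ for the small-scale suppression of the matter power spectrum, with $f_\nu = \Omega_\nu/\Omega_M$. The cosmological parameters and the total mass used in the example are illustrative assumptions.

```python
def neutrino_fraction(sum_masses_ev, omega_m, h):
    """Fraction f_nu = Omega_nu / Omega_M for a given total neutrino mass in eV."""
    omega_nu = sum_masses_ev / (93.14 * h * h)
    return omega_nu / omega_m


def small_scale_suppression(f_nu):
    """Linear-theory rule of thumb: fractional change in P(k) on small scales."""
    return -8.0 * f_nu


# Illustrative values only: minimal-mass scenario in a Planck-like cosmology.
f_nu = neutrino_fraction(0.06, omega_m=0.32, h=0.67)
delta_p_over_p = small_scale_suppression(f_nu)
```

For these assumed inputs the suppression is at the level of a few percent, consistent with the statement above; the rule of thumb is only a linear approximation, and the full scale and redshift dependence requires simulations such as those described below.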

This has been addressed in Darklight through the “Dark Energy and Massive Neutrino Universe” (DEMNUni) simulations, a suite of fourteen large N-body runs including massive neutrinos (besides cold dark matter), which have been recently completed [47]. They explore the impact on the evolution of structure of a neutrino component with three different total masses ( eV), including scenarios with evolving $w(a)$, according to the phenomenological form $w(a) = w_0 + (1-a)\,w_a$.

Running these simulations required developing new techniques to account for the evolving hot dark matter component represented by neutrinos [48]. Early analyses of the whole suite show that the effects of massive neutrinos and evolving dark energy are highly degenerate (less than % difference) with a pure ΛCDM model, when one considers the clustering of galaxies or weak lensing observations. Disentangling these different effects will therefore represent a challenge for future galaxy surveys such as Euclid and needs to be carefully addressed.

Fig. 10 gives an example of physical effects that can be explored using these numerical experiments, showing weak-lensing maps (in terms of the amplitude of the resulting deflection angle) built via ray-tracing through the matter particle distribution of the simulations, for sources placed at redshift . The middle panel shows the difference between a pure ΛCDM scenario and a model with eV. More quantitatively, in terms of angular power spectra of the deflection field, massive neutrinos produce a scale-dependent suppression with respect to the ΛCDM case, which, on small scales, asymptotically tends towards a constant value of about %, %, % for eV, respectively.

#### Acknowledgments.

Many of the results presented here would not have been possible without the outstanding effort of the VIPERS team to build such a unique galaxy sample. We are particularly grateful to S. de la Torre and J.A. Peacock for their insight and crucial contribution to the cosmological analyses discussed in this paper. Scientific discussions and general support in the development of Darklight by J. Dossett, J. He, and J. Koda are also warmly acknowledged.

### Footnotes

1. http://darklight.fisica.unimi.it
2. Review to appear in Towards a Science Campus in Milan: A snapshot of current research at Physics Department ’Aldo Pontremoli’ (2018, Springer, Berlin, in press)
3. http://www.cadc-ccda.hia-iha.nrc-cnrc.gc.ca/en/cfht

### References

1. Eisenstein, D. J., et al. AJ 142, 72 (2011).
2. Guzzo, L., VIPERS Team. The Messenger 168, 40 (2017).
3. Guzzo, L., et al. A&A 566, A108 (2014).
4. Garilli, B., et al. A&A 562, A23 (2014).
5. Scodeggio, M., et al. A&A, in press, arXiv:1611.07048 (2017).
6. York, D. G., et al. AJ 120, 1579 (2000).
7. Eisenstein, D. J., et al. ApJ 633, 560 (2005).
8. Di Porto, C., et al. A&A 594, A62 (2016).
9. Rota, S., et al. A&A 601, A144 (2017).
10. Planck Collaboration, et al. arXiv:1502.01589 (2015).
11. Riess, A. G., et al. AJ 116, 1009 (1998).
12. Perlmutter, S., et al. ApJ 517, 565 (1999).
13. Cole, S., et al. MNRAS 362, 505 (2005).
14. Alam, S., et al. MNRAS 470, 2617 (2017).
15. Davis, M., Peebles, P. J. E. ApJ 267, 465 (1983).
16. Kaiser, N. MNRAS 227, 1 (1987).
17. Guzzo, L., et al. Nature 451, 541 (2008).
18. Laureijs, R., et al. arXiv:1110.3193 (2011).
19. Okumura, T., Jing, Y. P. ApJ 726, 5 (2011).
20. Bianchi, D., et al. MNRAS 427, 2420 (2012).
21. Peacock, J. A., et al. Nature 410, 169 (2001).
22. Peacock, J. A., Dodds, S. J. MNRAS 267, 1020 (1994).
23. Pezzotta, A., et al. A&A 604, A33 (2017).
24. de la Torre, S., Guzzo, L. MNRAS 427, 327 (2012).
25. Taruya, A., Nishimichi, T., Saito, S. Phys. Rev. D 82, 6, 063522 (2010).
26. Bianchi, D., Chiesa, M., Guzzo, L. MNRAS 446, 75 (2015).
27. Fisher, K. B. ApJ 448, 494 (1995).
28. Scoccimarro, R. Phys. Rev. D 70, 8, 083007 (2004).
29. Reid, B. A., et al. MNRAS 426, 2719 (2012).
30. Bianchi, D., Percival, W. J., Bel, J. MNRAS 463, 3783 (2016).
31. Jennings, E., Baugh, C. M., Pascoli, S. MNRAS 410, 2081 (2011).
32. Bel, J., et al. in preparation (2017).
33. Blake, C., et al. MNRAS 406, 803 (2010).
34. Mohammad, F. G., et al. arXiv e-print (2017).
35. Hamaus, N., et al. Phys. Rev. Lett. 117, 9, 091302 (2016).
36. Micheletti, D., et al. A&A 570, A106 (2014).
37. Hawken, A. J., et al. A&A, in press, arXiv:1611.07046 (2017).
38. de la Torre, S., et al. submitted to A&A, arXiv:1612.05647 (2017).
39. Wilson, M., et al. in preparation (2017).
40. Blake, C., et al. MNRAS 425, 405 (2012).
41. Beutler, F., et al. MNRAS 466, 2242 (2017).
42. Beutler, F., et al. MNRAS 423, 3430 (2012).
43. Howlett, C., et al. MNRAS 449, 848 (2015).
44. Okumura, T., et al. PASJ 68, 38 (2016).
45. Granett, B. R., et al. A&A 583, A61 (2015).
46. Carbone, C., et al. JCAP 3, 030 (2011).
47. Carbone, C., Petkova, M., Dolag, K. JCAP 7, 034 (2016).
48. Zennaro, M., et al. MNRAS 466, 3244 (2017).