The phase-space structure of nearby dark matter as constrained by the SDSS

The phase-space structure of nearby dark matter as constrained by the SDSS

Florent Leclercq Institute of Cosmology and Gravitation (ICG), University of Portsmouth,
Dennis Sciama Building, Burnaby Road, Portsmouth PO1 3FX, United Kingdom
   Jens Jasche Excellence Cluster Universe, Technische Universität München,
Boltzmannstrasse 2, D-85748 Garching, Germany
   Guilhem Lavaux Institut d’Astrophysique de Paris (IAP), UMR 7095, CNRS – UPMC Université Paris 6, Sorbonne Universités, 98bis boulevard Arago, F-75014 Paris, France Institut Lagrange de Paris (ILP), Sorbonne Universités,
98bis boulevard Arago, F-75014 Paris, France
   Benjamin Wandelt Institut d’Astrophysique de Paris (IAP), UMR 7095, CNRS – UPMC Université Paris 6, Sorbonne Universités, 98bis boulevard Arago, F-75014 Paris, France Institut Lagrange de Paris (ILP), Sorbonne Universités,
98bis boulevard Arago, F-75014 Paris, France
Department of Physics, University of Illinois at Urbana-Champaign,
1110 West Green Street, Urbana, IL 61801, USA
Department of Astronomy, University of Illinois at Urbana-Champaign,
1002 West Green Street, Urbana, IL 61801, USA
   Will Percival Institute of Cosmology and Gravitation (ICG), University of Portsmouth,
Dennis Sciama Building, Burnaby Road, Portsmouth PO1 3FX, United Kingdom
July 9, 2019

Previous studies using numerical simulations have demonstrated that the shape of the cosmic web can be described by studying the Lagrangian displacement field. We extend these analyses, showing that it is now possible to perform a Lagrangian description of cosmic structure in the nearby Universe based on large-scale structure observations. Building upon recent Bayesian large-scale inference of initial conditions, we present a cosmographic analysis of the dark matter distribution and its evolution, referred to as the dark matter phase-space sheet, in the nearby universe as probed by the Sloan Digital Sky Survey main galaxy sample. We consider its stretchings and foldings using a tetrahedral tessellation of the Lagrangian lattice. The method provides extremely accurate estimates of nearby density and velocity fields, even in regions of low galaxy density. It also measures the number of matter streams, and the deformation and parity reversals of fluid elements, which were previously thought inaccessible using observations. We illustrate the approach by showing the phase-space structure of known objects of the nearby Universe such as the Sloan Great Wall, the Coma cluster and the Boötes void. We dissect cosmic structures into four distinct components (voids, sheets, filaments, and clusters), using the Lagrangian classifiers diva, origami, and a new scheme which we introduce and call lich. Because these classifiers use information other than the sheer local density, identified structures explicitly carry physical information about their formation history. Accessing the phase-space structure of dark matter in galaxy surveys opens the way for new confrontations of observational data and theoretical models. We have made our data products publicly available.


I Introduction

The accurate description of large-scale structure formation, from which the cosmic web originates (Bond, Kofman & Pogosyan, 1996), is one of the major goals of modern cosmology. Traditionally, the Lagrangian and Eulerian descriptions of a discretized fluid of collisionless dark matter are comparatively discussed. In the Eulerian picture, quantities are tracked at fixed positions as particles move. In the Lagrangian framework, the motion and evolution of individual elements is followed. Therefore, particles can be thought of not only as mass tracers, but also as elements of a three-dimensional dark matter sheet, which stretches, folds and distorts in a six-dimensional velocity-position phase space.

Using cosmological simulations, several recent papers have demonstrated the power of working within the dark matter sheet. Tessellations of the initial Lagrangian space allow the identification of stream crossings and improve density measurements (Abel, Hahn & Kaehler, 2012; Shandarin, Habib & Heitmann, 2012), provide accurate estimates of cosmic velocity fields (Hahn, Angulo & Abel, 2015), and yield a new approach for simulating dark matter (Hahn, Abel & Kaehler, 2013). In addition, the study of caustics (Arnold, Shandarin & Zeldovich, 1982; White & Vogelsberger, 2009; Vogelsberger & White, 2011; Hidding, Shandarin & van de Weygaert, 2014; Feldbrugge, Hidding & van de Weygaert, 2014), the two-dimensional edges along which the dark matter sheet folds, brings valuable insights into structure formation (Neyrinck, 2012). Strong observational motivations exist for a Lagrangian understanding of the cosmic web. These include studying the dependence of galaxy properties on their evolving environment (e.g. Blanton et al., 2005; Libeskind et al., 2015), probing the effect of the dynamic large-scale structure on the cosmic microwave background (e.g. Lavaux, Afshordi & Hudson, 2013; Planck Collaboration, 2014), testing the standard general-relativistic picture of gravitational instability (Falck, Koyama & Zhao, 2015), and identifying the most promising regions to observe a possible dark matter annihilation signal (e.g. Gao et al., 2012).

Unfortunately, the Lagrangian picture cannot be readily applied to observational data. Difficulties are of two sorts: (i) the need for reconstructions of the dark matter density from galaxies, accounting for typical observational effects such as biases, survey mask and selection functions; and (ii) the need for corresponding initial conditions, which requires the inclusion of a physical picture of structure formation into the data model. For these reasons, most cosmic web analyses have been limited so far to simulations (see e.g. Cautun et al., 2014, for a recent study). However, recently, Jasche et al. (2010) and Leclercq, Jasche & Wandelt (2015b, hereafter borg sdssT-web) addressed the first issue. Using real data, they presented Eulerian classifications of cosmic environments in constrained realizations of the large-scale structure. Importantly, they demonstrated capability of propagating observational uncertainties to cosmic web classification. Taking advantage of the inclusion of a physical model within the inference process, borg sdssT-web also presented the first probabilistic analysis of proto-structures present in the initial density field. However, a complete investigation of the problem, including a probabilistic description of phase-space properties of dark matter based on real data, has not yet been performed in the literature.

This paper describes the first analysis of the Lagrangian dark matter sheet in the nearby Universe, as probed by galaxies of the northern galactic cap of the Sloan Digital Sky Survey (SDSS) main sample. Capitalizing on the full phase-space information, we produce highly-detailed and accurate maps of cosmic density and velocity fields, even in regions only poorly sampled by galaxies. We also analyze additional information specific to collisionless fluids, which was so far inaccessible in observations: the number of matter streams, the deformation and parity of fluid elements, and the presence of caustics. We present the first application of the Lagrangian cosmic web classifiers diva and origami to the real Universe, as well as a new algorithm, lich, which we design to take into account the heterogeneous (potential and vortical) nature of flows.

This chrono-cosmography project builds upon the inference of the past and present cosmic structure in the SDSS (Jasche, Leclercq & Wandelt, 2015, hereafter borg sdss–inference), performed using the borg algorithm (Jasche & Wandelt, 2013). It also relies on a large set of constrained realizations of the Universe, produced using the cola method (Tassev, Zaldarriaga & Eisenstein, 2013). These necessarily introduce some degree of extrapolation with respect to the original inference results, since they contain aspects of the physical model that have not been constrained by the data (in particular, in this work, vortical matter flows). However, these extrapolations are not arbitrary, but are methodical predictions that need to be consistent with the observational constraints. Indeed, a complex forward model naturally introduces correlations between its constrained and unconstrained aspects, which may yield reliable predictions for the physics that has not been captured by the simpler data model. Our methodology therefore predicts the properties of the Lagrangian dark matter sheet based on fusing prior information from simulations and data constraints. Such an approach has been previously demonstrated to accurately recover the properties of voids at the level of the dark matter distribution (Leclercq et al., 2015). Importantly, all of our predictions are fully probabilistic, which means that their degree of speculativeness is controlled and can be checked against unconstrained simulations. From a Bayesian perspective, all the results presented in this work can then be used as prior information for updating our knowledge in light of additional observations, or for making optimal decisions in the presence of uncertainty (see Leclercq, Jasche & Wandelt, 2015a).

The structure of this paper is as follows. We start by describing our tools for analyzing the dark matter sheet in section II. This includes a review of the Lagrangian displacement field, the description of the existing Lagrangian classifiers diva and origami and the introduction of a new one that we call lich, and the definition of estimators based on tessellating the Lagrangian particle grid. In section III, we then infer, characterize and classify the SDSS volume in Lagrangian space. Finally, in section IV, we show the consequences of these results in Eulerian coordinates and demonstrate that resulting maps of the redshift-zero cosmic web have an interesting level of predictive power. We summarize our study and conclude in section V.

Ii Description of the Lagrangian sheet

ii.1 The Lagrangian displacement field

The fundamental object studied in this paper is the displacement field , which maps the initial (Lagrangian) position of particles q to their final (Eulerian) position (see e.g. Bernardeau et al., 2002, for a classic review):


As any smooth vector field, it is possible to split into a scalar and a rotational part (the Helmholtz decomposition, Chan, 2014),


where is the scalar potential and is the vector potential, which obey the Poisson equations:


From analytic approaches and numerical simulations, it has been known for some time that the displacement field is almost fully potential; in particular, contains all the information up to second order in Lagrangian perturbation theory (2LPT). In this paper, we estimate and by computing respectively and , where is a fast Fourier transform (FFT) on the Lagrangian mesh on which the displacement field is defined. For later use, we define .

We denote by the Jacobian determinant of the transformation between Lagrangian and Eulerian coordinates,


where the deformation tensor can be written as the identity tensor plus the shear of the displacement field . The characteristic polynomial of is written as


where the are the three eigenvalues (real or complex) of (i.e. the solutions of ). The Jacobian is then .

The characteristic equation can be written


where we call the the Lagrangian invariants (in analogy to the rotational invariants introduced by Wang et al., 2014, who use the shear of the Eulerian velocity field instead of the Lagrangian displacement field). They are given in terms of the eigenvalues by (Chong, Perry & Cantwell, 1990)


from which one can derive the explicit formulation in terms of the coefficients of (summation over repeated indices is implied):


In the previous equations, we have used the decomposition of into a symmetric part (the rate of deformation tensor) and an anti-symmetric part (the spin tensor),




Physically, denotes the rate of deformation of a Lagrangian fluid element and the rate of rotation.

The surface in Lagrangian invariants space that divides real and complex solutions for is given by the equation


Using the discriminant of this quadratic equation, one finds two real solutions and () if and only if . Therefore, at fixed , the regions where all the eigenvalues of are real correspond to


In this case, the displacement field is purely potential and we denote the eigenvalues by . We also define the ellipticity and prolateness (e.g. Peacock & Heavens, 1985; Bardeen et al., 1986) as


Ellipsoids are prolate-like if or oblate-like if . The limiting cases are for prolate spheroids and for oblate spheroids. In the following, we will also consider and , for which the boundary conditions are still valid.

After orbit-crossing, an anti-symmetric part of is generated and the eigenvalues of are no longer all real. We parametrize them as and where are real. The invariants then simply read (see equations (8)–(10))


As we will show in section II.2, the Lagrangian invariants allow a complete classification of the cosmic web, including both potential and vortical flow patterns (as with the rotational invariants, see Chong, Perry & Cantwell, 1990; Wang et al., 2014).

Finally, we note that it is possible to translate the Lagrangian invariants to the eigenvalues , using Vieta’s formulas for cubic equations:


and for ,


ii.2 Lagrangian web-type classifications

In our earlier analysis of the volume covered by the SDSS main sample galaxies (borg sdssT-web), we adopted the Eulerian cosmic web classifier known as the T-web (Hahn et al., 2007; Forero-Romero et al., 2009). In this framework, structures are classified according to the sign of the eigenvalues of the tidal field tensor , defined as the Hessian of the rescaled gravitational potential :


where obeys the reduced Poisson equation


being the local density contrast. A void point corresponds to no positive eigenvalue, a sheet to one, a filament to two, and a cluster to three positive eigenvalues.

Related Hessian-based techniques are the V-web (Hoffman et al., 2012) and classic (Kitaura & Angulo, 2012). More broadly, Eulerian cosmic web algorithms include topological (Aragón-Calvo et al., 2010; Sousbie, 2011), stochastic (González & Padilla, 2010; Tempel et al., 2014), and multiscale field (Aragón-Calvo et al., 2007; Cautun, van de Weygaert & Jones, 2013; Aragon-Calvo & Yang, 2014) methods, all of which yield different physical insights from the phase-space analysis subject of this paper.

In this section, we discuss three alternative methods that can be used for producing Lagrangian classifications of the cosmic web: diva, lich and origami.

ii.2.1 Diva

In analogy with the Zel’dovich (1970) “pancake” theory, Lavaux & Wandelt (2010) propose to define cosmic structures using , the eigenvalues of the shear of the displacement field, . In the diva (DynamIcal Void Analysis) scheme, a particle is defined as belonging to a Lagrangian void, sheet, filament or cluster, if, respectively, zero, one, two or three of the are negative. In order to deal only with real eigenvalues, diva assumes that the flow is potential, i.e. that is equal to its symmetric part .

The diva and T-web methods share strong similarity in their formalism. However, a major difference is that the T-web operates on voxels of the discretized domain, whereas diva operates on Lagrangian patches (numerically approximated by particles). As shown in borg sdssT-web (sections III and IV), the T-web can be used to classify both early-time and late-time structures; however the T-web classification of early-time structures is performed at the level of voxels of the discretized initial density field, whereas the diva classification of Lagrangian structures is performed at the level of particles of the dark matter sheet.

ii.2.2 Lich

Figure 1: Illustration of structure type classification in Lagrangian invariants space, according to the lich procedure (see also figure 3 in Wang et al., 2014). We show the plane for (left panel) and (right panel). The Lagrangian structure type corresponding to each type of potential or vortical flow is given in the legend.

This section introduces a new cosmic-web classification procedure based on the Lagrangian invariants, which we call lich (Lagrangian Invariants Classification of Heterogeneous flows). As we will show, lich is a generalization of diva to the case of heteregenous (i.e. potential and vortical) flows. lich uses for the Lagrangian invariants the same formalism as introduced by Wang et al. (2014) for the rotational invariants; the reader is therefore referred to this paper for more details.

Let us first consider the case of a potential flow. The real solution condition, equation (16), must be fulfilled. The case where all the are negative (a diva cluster) further corresponds to for all (see equations (8)–(10)). A lich potential cluster is therefore defined by


Similarly, a lich potential void, which generalizes a diva void (all positive), corresponds to


Two regions in Lagrangian invariants space correspond to a diva sheet (, ):


They define the lich potential sheets. Two regions also correspond to a diva filament (, ):


They define the lich potential filaments.

The case of a vortical flow corresponds to either (in which case there is no solution to equation (15)), either . In these regions in Lagrangian invariants space, an additional frontier is important: (i.e. ), which corresponds to the plane


The remaining regions can then be labeled as vortical voids, vortical sheets, vortical filaments or vortical clusters. Figure 1 summarizes the lich classification procedure in Lagrangian invariants space. For the detailed criteria defining all the regions there, the reader is referred to Appendix A in Wang et al. (2014).

In section III.1.2, we will also use a simplified version of the full lich procedure, where particles fall into one of the usual four categories (void, sheet, filament or cluster) by simply ignoring the potential/vortical subdivision. Note that this simplified lich procedure is different from diva, since it does not use the initial assumption that the flow is potential.

ii.2.3 Origami

Another way of classifying particles is to consider them as vertices of an initially regular grid, distorted by gravity as the large-scale structure forms. Shell-crossing happens when Lagrangian cells collapse or invert. A particle’s origami (Order-ReversIng Gravity, Apprehended Mangling Indices) morphology is defined by the number of orthogonal axes along which shell-crossing has occurred (Knebe et al., 2011; Falck, Neyrinck & Szalay, 2012). More precisely, void, sheet, filament, and cluster particles are defined as particles that have been crossed along zero, one, two, and three orthogonal axes, respectively.

In practice, the computer implementation of origami works as follows: a particle has been crossed along an axis if there exists a particle in the same row of the Lagrangian grid, such that and have opposite sign. Such particle-crossing are checked along the three orthogonal axes of the original Cartesian grid, as well as three triplets of orthogonal axes, obtained by rotating the original grid by 45 along one of its axes. For more details, the reader is referred to section 2.1 in Falck, Neyrinck & Szalay (2012).

ii.3 Tessellation of the Lagrangian grid

Figure 2: Tessellation of the unit cube into six tetrahedra. The blue points represent the usual dark matters particles (the flow tracers), initially placed at the vertices of a uniform Lagrangian lattice. Each of the six tetrahedra can be approximated by four pseudo-particles shown in red (the mass tracers), which carry 1/24 of the mass of one flow tracer.

Abel, Hahn & Kaehler (2012) and Shandarin, Habib & Heitmann (2012) proposed new techniques to trace the dark matter phase-space sheet and demonstrated that they give a wealth of information previously unavailable in simulations. In this paper, we will demonstrate that these techniques can now be used in data-constrained realizations of the nearby Universe. The core idea is to use a tessellation of the uniform Lagrangian lattice of particles. Each fundamental cube is decomposed into tetrahedral cells that use the particles as their vertices. These cells are tracked in the final conditions, using the particle IDs which are set-up so that they encode the initial position of the particle.

Several possible decompositions of the unit cube exist: for example, Shandarin, Habib & Heitmann (2012) use five tetrahedra of two different volumes. In this work, however, we will follow Abel, Hahn & Kaehler (2012); Hahn, Angulo & Abel (2015) and use the Delaunay decomposition into six identical tetrahedra (see figure 2). Specifically, the connectivity list is , , , , and where the unit cube vertices are labeled as in figure 2.

Initially, the volume of each tetrahedron is where is the side length of the fundamental cube. In the final state, the same tetrahedron has a volume given by a determinant (or cross-product)


where A, B, C, D label the final positions of the tetrahedron vertices. Note that the volume can be negative if the tetrahedron has been turned inside-out. In fact, the number of sign reversals of an elementary Lagrangian volume defines the flip-flop field (e.g. Neyrinck, 2012; Shandarin & Medvedev, 2016). Each tetrahedron carries 1/6 of the mass of a particle , the mass density associated is therefore . At any particle position, there are 24 tetrahedra that share this vertex (, see figure 2).

Tessellating the dark matter phase-space sheet allows one to define a new quantity on the Lagrangian grid, the primordial stream density,


where the summation runs on the 24 tetrahedra that share the particle’s initial position q as one of their vertices. Altogether, these tetrahedra have an initial volume of .

In addition to the primordial stream density, tracking the final configuration of the tessellated dark matter phase-space sheet allows one to perform the Lagrangian transport of any quantity defined on a particle basis, which can be used to define a variety of estimators on the Eulerian grid. This is achieved by finding all the tetrahedra that contain the point x at which properties are to be determined. Each such tetrahedron is given four coefficients , , , for the four particles that constitute its vertices (A, B, C, D). Fields are interpolated inside each tetrahedron using Shepard’s method (inverse distance-squared weighting); namely, each tetrahedron deposits to the point x the quantity


where and .

In this work, we consider the following coefficients and associated estimators:

  • simply yields the number of tetrahedra enclosing point x. Since we start from a complete tessellation of the considered volume, this is also the number of matter streams contributing at point x, which we call the Eulerian secondary stream density, following Abel, Hahn & Kaehler (2012).

  • yields an estimator of the final Eulerian density field (see Abel, Hahn & Kaehler, 2012), which we note .

  • if particle P is of type , 0 otherwise yields an estimator of the density contained in structure of type , i.e. . The field represents the probability for a final Eulerian point to belong to structure type . This method can therefore be used to translate Lagrangian classifications into Eulerian coordinates. As it is computationally expensive, we will introduce a simplified alternative in section IV.2.

  • where is the velocity of particle P along the direction yields an estimator of the component of the mass-weighted velocity field. The resulting field divided by yields the velocity field estimated from the dark matter phase-space sheet (see Hahn, Angulo & Abel, 2015).

Instead of explicitly projecting all the tetrahedra to the Eulerian grid, it is possible to approximate each of them by four pseudo-particles (in red in figure 2; see the discussion in Hahn, Abel & Kaehler, 2013, section 2.4, for where to place these pseudo-particles). In this context, the original particles are called the flow tracers and the pseudo-particles are called the mass tracers. Flow tracers uniquely determine the position of mass tracers, which can be deposited to the Eulerian grid by standard interpolation techniques such as the cloud-in-cell (CiC, Hockney & Eastwood, 1981) algorithm.

Iii Lagrangian analysis of the SDSS volume

As mentioned in the introduction, this paper is part of a series following the recent application of the Bayesian inference framework borg to the SDSS main sample galaxies (borg sdss–inference). We further rely on the set of 1,097 constrained simulations presented in borg sdssT-web. These realizations are obtained, starting from borg initial conditions, by non-linear filtering with cola (Tassev, Zaldarriaga & Eisenstein, 2013). The initial density field, defined on a -voxel grid with side length of  Mpc/, is populated by dark matter particles placed on a regular Lagrangian lattice. Particles are evolved with 2LPT to the redshift of then with 30 cola timesteps from to . The displacement field is obtained on the Lagrangian lattice by subtracting the final position of particles from their initial position, and any necessary quantity involving derivatives is computed using FFTs on the same lattice.

In this section, we first analyze the Lagrangian properties of individual samples (section III.1), then show how the full set of constrained samples can be used to propagate uncertainties to all inferred quantities (section III.2).

iii.1 Analysis of individual samples

Figure 3: Slices through one sample of the observed nearby Lagrangian dark matter sheet, as constrained by the SDSS main galaxy sample. A region of the celestial equatorial plane encompassing the Sloan Great Wall in shown. Each pixel corresponds to one particle of the Lagrangian lattice. The first row shows the divergence of the displacement field (left panel), the norm of the vector part of the displacement field (middle panel), and the primordial stream density, defined by equation (36) (right panel). The second row shows Lagrangian classifications of particles: diva (left panel), lich (middle panel) and origami (right panel). Blue, green, yellow, and red correspond to void, sheet, filament, and cluster particles, respectively. For lich, dark colors correspond to potential structures and light colors to vortical structures, as in figure 1.

iii.1.1 Stream properties

The first row of figure 3 shows the stream properties in one of the samples of our set. The slices show a region of the celestial equatorial plane encompassing the Sloan Great Wall (Gott et al., 2005; Einasto et al., 2011).111In the plots of this paper, we kept the coordinate system of borg sdss–inference. The flow is characterized in terms of the divergence of the displacement field , the norm of its vector part , and the primordial stream density (see equation (36)). quantifies the angle-average stretching and distortion of the dark matter sheet (Neyrinck, 2013) and contains all the information on as long as the flow remains potential. can be used to quantify how much vorticity is generated when structure formation enters the multi-stream regime, and the primordial stream density counts these streams.

It is important to note that, since the displacement field is curl-free up to second order in Lagrangian perturbation theory (see e.g. Bernardeau et al., 2002), and since borg relies on 2LPT as a proxy for gravitational dynamics (Jasche & Wandelt, 2013), any vorticity present in our large-scale structure realizations is not explicitly constrained by the data. The field therefore corresponds one plausible realization, compatible both with the constrained initial conditions and with the non-linear structure formation model (cola).

Our constrained samples exhibit the same behavior as previously observed in simulations. Specifically, regions of high primordial stream density nicely correspond to Lagrangian clusters, the “lakes” where . This is also where a rotational part of the displacement field is generated. Conversely, single-stream regions correspond to the “mountains” where . From the behavior of the field (limited at most to  Mpc/ in the regions of highest stream density), we also note that we can take advantage of the absence of substantial vorticity on large scales. This is consistent with the results of Chan (2014): at and up to  Mpc/, the contribution to the power spectrum of coming from its curl component remains negligible compared to the non-linear evolution of its scalar part. Therefore, for the sake of our analysis (except when explicitly measuring vorticity), it will be safe to assume that all the information on is contained in its divergence , or equivalently, that .

iii.1.2 Lagrangian classifications

We then assign a Lagrangian structure type to each particle, according to the diva, lich, and origami procedures. Lavaux & Wandelt (2010) discuss the possibility of smoothing the displacement field at different Lagrangian scales, in order to probe the hierarchical cosmic web. This allows the description of structure formation at different dynamical epochs, each Lagrangian scale corresponding to a different collapse time. In contrast, explicit Eulerian multiscale formalisms (Aragón-Calvo et al., 2007; Cautun, van de Weygaert & Jones, 2013; Aragón-Calvo et al., 2010; Aragon-Calvo & Szalay, 2013) probe the hierarchy of structures at a given time. The adhesion approximation (Gurbatov, Saichev & Shandarin, 1989; Weinberg & Gunn, 1990; Kofman, Pogosian & Shandarin, 1990; Kofman et al., 1992; Gurbatov, Saichev & Shandarin, 2012; Hidding, van de Weygaert & Shandarin, 2016) provides a Lagrangian description of the hierarchical evolution of structures, but it is only correct in the linear and weakly non-linear regime, contrary to the numerical cola model used in this work. The main effect of smoothing the Lagrangian grid is to change the unconditional number of particles in different structures; data constraints do not dramatically alter this result (Lavaux & Wandelt, 2010). Therefore, while this would be perfectly feasible in our framework, we choose to focus our analysis on the scale defined by our setup, without an additional smoothing step. This choice corresponds to a resolution of  Mpc/ for the Lagrangian grid ( particles in a  Mpc/ cubic box) and  Mpc/ for discretized Eulerian maps (on a -voxel grid).

The second row of figure 3 represents Lagrangian structures on the grid of particles, in one of our constrained samples. From left to right, the panels show the diva, lich, and origami morphologies of the dark matter sheet. As expected, all the classifications correlate well with the stream properties shown in the first panel (in particular with ). Clusters (in red) correspond to the “lakes” of high primordial stream density where , while less complex structures are regions with fewer streams and higher values of . Single-stream regions, which correspond to origami void particles, form a percolating system, as noted by Falck & Neyrinck (2015).

The difference between diva and origami classifications (notable primarily for voids, then for sheets and filaments) is caused by the different phase-space criterion used for defining structures: particle-crossings that have already happened for origami, versus those non-locally predicted from the deformation tensor for diva. A large single-stream void percolates with origami, while filament and sheet regions are limited: this reflects the current configuration of the cosmic web. Conversely, with diva, particles around filaments and sheets typically get the signature of these structures, on the basis of the dynamical trend.

As shown in section II.2.2, lich is a generalization of diva to the case of heterogeneous (potential and vortical) flows — applying lich under the assumption of a potential flow yields diva. Thus, the comparison between diva and lich allows to check this assumption. Comparing the diva, lich, and panels shows that the two classifications differ in the regions where vorticity is generated in . Specifically, with lich, most cluster regions are vortical ( of cluster particles), except the smallest ones, and surrounded by a filamentary shell. of filament particles are vortical, in the surroundings clusters; the rest are potential. Sheet regions are potential ( of particles). Void regions are mostly potential ( of particles), but with some spinning regions. When ignoring the subclassification into potential and vortical structures, but without assuming a potential flow, the simplified lich procedure agrees with diva for of particles, which is an additional argument for the accuracy of the potential flow approximation. The difference is mostly driven by clusters and voids, with of lich cluster and of lich void particles misidentified by diva (almost all of which as filaments and sheets, respectively).

iii.2 Uncertainty quantification

Figure 4: Uncertainty quantification on the observed Lagrangian dark matter sheet. The first row shows slices through the posterior mean of the eigenvalues of the rate of deformation tensor, used by diva. The second row shows the ellipticity (left panel) and prolateness (middle panel) of the Lagrangian potential. The right panel shows the mean of the Jacobian , with particles colored according to their Eulerian volume times their parity. The third row shows the posterior mean of the Lagrangian invariants, used by lich.

As in our previous work (Jasche, Leclercq & Wandelt, 2015; Leclercq, Jasche & Wandelt, 2015b; Lavaux & Jasche, 2016), the variation among constrained samples constitutes a Bayesian quantification of uncertainties (coming in particular from the survey mask and selection effects). This is illustrated in figure 4, where we show the particle-wise posterior mean of different quantities on the Lagrangian grid. The first row shows the eigenvalues of the rate of deformation tensor (the symmetric part of the shear of the displacement field ). These are used by diva to classify structures. The second row shows derived quantities: the (reduced) ellipticity and prolateness , and the Jacobian of the transformation from Lagrangian to Eulerian coordinates, . For each particle, it is the volume of the associated fluid element times its parity (Neyrinck, 2012; Shandarin & Medvedev, 2016). White/blue particles have the same parity as in the initial conditions, while black/orange particles have swapped parity. Caustics are clearly visible wherever blue and orange regions are juxtaposed, for example in the patches that collapse to form halos and in the shell-crossed walls surrounding voids. In some clusters, several parity inversions can be observed. The last row shows the posterior mean for the Lagrangian invariants , , , which are used by lich to classify structures. These cosmographic results constitute the first characterization of the nearby Lagrangian dark matter sheet, and contain a wealth of information on the dynamic large-scale environment of SDSS main sample galaxies. They come as fully probabilistic, which means that uncertainties are quantified at the level of the map.




Figure 5: Slices through the posterior probabilities for different structure types (from left to right: void, sheet, filament, and cluster), in the primordial large-scale structure in the Sloan volume (). Structures are defined on the Lagrangian grid of particles with diva (first row), lich (second and third rows), or origami (fourth row). For lich, potential and vortical structures are distinguished. For each classifier, the four or eight three-dimensional probabilities sum up to one at each location. For comparison, the reader is referred to figure 6 in Leclercq, Jasche & Wandelt (2015b), where the primordial density field, defined on a -voxel grid, is classified according to the T-web procedure.

Uncertainty quantification can also be self-consistently propagated to Lagrangian structure type classification, as we now show. Let us denote by one of the Lagrangian classifiers (diva, lich, or origami). In a specific realization, uniquely characterizes the Lagrangian grid, meaning that it provides particle-wise scalar fields that obey the following conditions for each particle :


where for diva and origami ( void, sheet, filament, cluster); for lich ( potential void, potential sheet, potential filament, potential cluster, vortical void, vortical sheet, vortical filament, vortical cluster). By applying to the complete set of borg-cola realizations, we can quantify the degree of belief in structure type classification in terms of particle-wise scalar fields that obey the following conditions for each particle :


Here, are the posterior probability distribution functions (pdfs) for particle to belong to structure , as defined by classifier , conditional on the data . These are estimated by counting the relative frequencies at each individual particle position within the set of -realizations (see also Jasche et al., 2010; borg sdssT-web). Therefore, the cosmic web-type pdf mean on the Lagrangian grid is given by


where labels one of the samples, is the result of classifier on the -th sample at particle position (i.e. a unit -vector containing zeros except for one component, which indicates the structure type), and is a Kronecker symbol.

Using the above definitions, we obtain probabilistic maps of structures on the Lagrangian grid describing the early Universe in the volume probed by SDSS main sample galaxies. More precisely, for each classifier , we obtain a probability mass function (pmf) at each particle position on the Lagrangian grid, indicating the possibility to encounter a specific structure type at that position. These pmfs consist of numbers in the range , , that sum up to one for each particle. They are approximated by the mean of each structure pdf (see equation (40)).

In figure 5, we show slices through these web-type posterior probabilities. The rows represent the structures inferred using diva, lich, and origami, respectively. For comparison, the reader is referred to figure 6 in borg sdssT-web, which shows structures inferred in the initial density field using the T-web. Note that the present results are defined on the Lagrangian grid of particles, whereas the T-web result is defined on the -voxel grid on which initial density fields are discretized. The plot shows the anticipated behavior, with values close to certainty (i.e. zero or one) in regions covered by data, while the unobserved regions (at high redshift or out of the survey boundary) approach a uniform value corresponding to the prior. It is worth noting that different classifiers can have very different prior probabilities for the same structure types (these are given in table II in Leclercq et al., 2016). The good agreement between diva and lich at the level of individual samples, coming from the absence of a substantial vector part in , is also correctly propagated to the posterior mean.

A voxel-wise quantification of the information gain due to SDSS galaxies is provided by the relative entropy of corresponding posteriors and priors (borg sdssT-web; Leclercq et al., 2016). We applied this technique and found that the Lagrangian web-type posteriors presented in figure 5 yield much higher information gain than the application of the T-web to initial density fields. Averaged over the constrained volume, this number is  Sh for diva,  Sh for lich, and  Sh for origami, as compared to  Sh for the T-web. This consideration substantiates the non-local transport of information along Lagrangian trajectories operated by the structure formation model, and demonstrates that a probabilistic description of the Lagrangian dark matter sheet yields a much higher information content for structure type classification than previous methods.

Iv Consequences in Eulerian coordinates

In this section, we describe potentially observable consequences of our inference of the nearby Lagrangian dark matter phase-space sheet. To do so, we analyze the properties of our constrained simulations in Eulerian coordinates in light of previous Lagrangian results. In other words, we demonstrate that our analysis of dark matter phase-space properties has an interesting degree of predictive power.

Formally, the problem can be expressed as follows. Before the data are considered, the distribution of an unknown observable is its prior predictive distribution,


where are the three-dimensional initial conditions. After the data have been observed, the distribution for is the posterior predictive distribution,


In the above equations, we have used the borg inference of initial conditions, i.e. where is one of the samples. We have also assumed that does not further depend on the data once the initial conditions are known, and that is a deterministic function of the initial conditions, i.e. . Therefore, under the assumption that inferred initial conditions contain all the information on , our procedure generates samples of .

In the following, we start by looking at phase-space properties traced at Eulerian coordinates, which yields improved estimators for density fields (section IV.1), then we show how to translate Lagrangian classifications to Eulerian coordinates (section IV.2), discuss the properties of cosmic velocity fields (section IV.3), and finally present some cosmographic results (section IV.4). For all results, we also show that observational uncertainties are propagated in a non-linear and non-Gaussian fashion by the cola forward model, from which we get a feeling for how precise the predictions are.

iv.1 Phase-space properties traced at Eulerian coordinates

iv.1.1 Eulerian secondary stream density

Figure 6: Slice through the posterior mean for the Eulerian secondary stream density, defined on a cubic grid of voxels. Galaxies of the SDSS main galaxy sample are overplotted. The slice is  Mpc/ thick around the celestial equator.

The simplest phase-space property that can be translated into Eulerian coordinates is the number of matter streams. As discussed in section II.3, this is simply achieved by counting, in the final configuration of the tessellated Lagrangian dark matter sheet, the number of tetrahedra that contain any given position. The resulting field is called the secondary stream density. Since the dark matter phase-space sheet is discretized using a finite number of particles (and hence of tetrahedra), the secondary stream density is only a lower bound on the actual number of dark matter streams at any single location.

We applied this estimator to our full set of constrained realizations. Figure 6 shows the resulting posterior mean for the Eulerian secondary stream density. A slice of  3 Mpc/ (one voxel) thickness around the celestial equator is shown, and the galaxies of the northern cap of the SDSS main galaxy sample that are contained in the same volume are overplotted.222The transformation between our set of Cartesian coordinates and equatorial J2000.0 spherical coordinates can be found in appendix B of borg sdss–inference.

As expected from simulations (e.g. Ramachandra & Shandarin, 2015), the number of streams correlates well with the galaxy density field: regions of high galaxy density are the ones where several matter flows contribute. For example, in the inner regions of the Sloan Great Wall, which is clearly visible in the middle of the slice, at least 6 matter streams are converging. At large distances ( Mpc/), comparing the posterior mean for the Eulerian secondary stream density and galaxy number counts reveals that the predictions remain accurate even in regions poorly sampled by galaxies. This comes from the capacity of the borg algorithm to accurately recover structures even in regimes of low signal-to-noise (see the discussion in borg sdss–inference, ).

Our result readily provides a probability distribution for the number of large-scale matter streams at any galaxy position. This information, which is not available from raw survey data, is of potential interest for various cosmological and astrophysical analyses. It is expected to be crucially linked to the formation history of galaxies. If dark matter annihilates, then clear signals are predicted to come from nearby rich galaxy clusters (Gao et al., 2012). In fact, if dark matter is absolutely cold, its mass density (and therefore the annihilation luminosity) should be infinite at caustics (White & Vogelsberger, 2009). Galaxies living in stream-crossed regions are also the ones for which the “reconstruction” procedure used to correct for large-scale flows in baryon acoustic oscillations analyses (Eisenstein et al., 2007; Padmanabhan et al., 2012; Burden, Percival & Howlett, 2015) will fail, as it relies on the single-stream assumption and on the Zel’dovich approximation.

iv.1.2 Density estimators

Figure 7: The influence of different structure formation models and density estimators on borg samples. Slices through the final density field in one sample (first row), the posterior mean (second row) and the posterior standard deviation (third row) are shown. From left to right, the columns use: 2LPT and CiC of flow tracers (as borg), cola and CiC of flow tracers, cola and CiC of mass tracers, cola and explicit projection of all tetrahedra of the tessellated Lagrangian grid. For the last column only and for visualization purposes, densities are clipped at 800 times the mean density. The improvement of the density estimator (third and fourth columns) constitute an extension of the non-linear filtering procedure (going from the first to the second column).

As shown by Abel, Hahn & Kaehler (2012) and Hahn, Abel & Kaehler (2013) in simulations, instead of simply assigning dark matter particles to the grid via CiC deposit, knowledge of their initial positions can be exploited to give better density estimators. Specifically, the tessellated dark matter phase-space sheet yields a density estimator when the four vertices of each tetrahedron are given a weight of (see section II.3). As a simpler alternative, tetrahedral mass elements can be approximated by pseudo-particles (the mass tracers) which are assigned to the grid by standard CiC.

We applied these two new density estimators to our set of constrained simulations. The results are shown in figure 7 in comparison to earlier results. There, the first column represents the raw borg output, for which the structure formation model is 2LPT (borg sdss–inference). The second column represents the standard CiC assignment of usual dark matter particles (flow tracers) in our borg-cola realizations, as obtained in borg sdssT-web. The third and fourth columns are the new density estimators: the third raw corresponds to the CiC assignment of mass tracers, and the fourth row to the explicit projection of all tetrahedra. For each density estimator, we show one sample (first row), the posterior mean (second row) and the posterior standard deviation (third row). Tetrahedra can be projected to an arbitrarily fine grid. Similarly, there are 24 times more mass tracers than flow tracers, but as argued by Hahn, Abel & Kaehler (2013), the pseudo-particle approximation technique yields more than a factor 24 increase in effective mass resolution, when comparing CiC deposit of flow tracers versus mass tracers. For these reasons, we have been able to double the resolution of the Eulerian grid: the third and fourth columns show a grid of voxels versus voxels for the first and second columns.

In individual samples, the plot shows the anticipated behavior, with visually indistinguishable observed and unobserved regions. This indicates the joint performance of borg, cola, and phase-space estimators to augment unobserved regions with physically and statistically correct information. The ensemble of data-constrained realizations also permits to correctly propagate observational uncertainties. As expected, all mean density fields exhibit highly-detailed structures where data constraints are available, and approach the cosmic mean value in unobserved regions. The standard deviation reveals a high degree of correlation between signal and noise, which is typical of non-linear inference problems. In particular, in cola realizations, the densest clusters which cannot be reproduced by 2LPT correspond to high variance regions. In some small patches away from the well-constrained regions, the density estimated from explicit projection of tetrahedra has high posterior mean and variance. These are numerical artifacts driven by one or a few samples, and due to a bias of the estimator in the densest regions (the linearly-interpolated phase-space sheet no longer tracks the true distribution function there, see Abel, Hahn & Kaehler, 2012).

The two new density estimators extend the non-linear filtering procedure originally introduced by Leclercq et al. (2015) and reviewed in Leclercq (2015, chapter 7). Starting from inferred density fields, we go forward in time using an arbitrary structure formation model, , combined with a density estimator, . The result is a non-linear function, noted , which maps the initial density field into a final density field at a scale factor :


The borg data model uses  2LPT and  {CiC of flow tracers} (Jasche & Wandelt, 2013). In Leclercq et al. (2015) and borg sdssT-web, we improved the physical model over 2LPT by using respectively  Gadget-2 (Springel, 2005) and  cola (Tassev, Zaldarriaga & Eisenstein, 2013); but we kept  {CiC of flow tracers}. Here, we keep the physical model fixed,  cola, but we improve instead the density estimator, using  {CiC of mass tracers} or  {projection of tetrahedra}. As shown, this alternative approach also yields improvements of the composite function .

iv.2 Translation of Lagrangian classifications to Eulerian coordinates




Figure 8: Slice through the posterior probabilities for different structure types (from left to right: void, sheet, filament, and cluster), in the late-time large-scale structure in the Sloan volume (). Structure types are classified using diva (first row), lich (second and third rows), or origami (fourth row). For lich, potential and vortical structures are distinguished. For each classifier, the four or eight three-dimensional probabilities, defined on a -voxel grid, sum up to one on a voxel basis. For comparison, the reader is referred to figure 3 in Leclercq, Jasche & Wandelt (2015b), where structures are classified according to the T-web procedure.

In section III.1.2, we have considered classifications of the dark matter sheet, discretized by the simulation particles. In order to classify structures at redshift zero, it is possible to translate the results of Lagrangian classifiers into Eulerian coordinates. As mentioned in section II.3, a natural way to do this is to use the tessellation of the Lagrangian lattice into tetrahedra, which allows the translation of any particle-wise quantity to the Eulerian grid. However, since this is rather computationally expensive, we introduce below a simplified procedure to translate Lagrangian classifications into Eulerian coordinates. The latter is less precise at the level of individual samples, but we have checked that it yields negligible differences with the respect to the tessellation approach, as far as the posterior mean (figure 8) is concerned.

The simplified approach to translate Lagrangian classifications into Eulerian coordinates works as follows. We consider particles at their final positions and assign them to the Eulerian grid (of voxels in our case) using the CiC scheme. For structure type , each particle gets a weight of if its Lagrangian classification is , and 0 otherwise. In each voxel, we then count the relative frequencies for each mass-weighted structure type. In this fashion, in each realization , we obtain probabilities, in the range , for a final Eulerian voxel to belong to each of the structure types. We denote them by . In general, we have ; this sum is if and only if, in realization , no particle was assigned to the cell by the CiC scheme (i.e. it is empty as well as its eight neighbors). These voxels are flagged as noisy and discarded.

It is then possible to go from these per-sample pdfs to a global pdf for the entire analysis, in a similar way as before: we count the relative frequencies of the probabilities at each Eulerian spatial position within the set of samples. With this definition, the web-type posterior mean is given by


where labels one of the samples, and the normalization in voxel is


In figure 8, we show slices through the redshift-zero web-type posterior mean for the different structures as defined by diva, lich, and origami. As before, the plot demonstrates that we are able to propagate typical observational uncertainty to structure type classification. Consistently with the results of section III.1.1, where we argued for the accuracy of the approximation of to a potential field, we find only small differences between diva voids, sheets, and filaments, and the corresponding lich potential structures. However, unsurprisingly, lich allows a more physical detection and characterization of clusters, where the potential flow assumption breaks down. At this point, the reader is referred to figure 3 in borg sdssT-web for our map of the same volume, inferred using the T-web definition. The T-web and diva maps are visually similar, with an overall smoother structure for the voids defined by diva, which are sharply separated by walls. diva and lich also visually excel in highlighting the filamentary structure of the cosmic web. In contrast, with origami, most of the volume is filled by voids and more complex, shell-crossed structures are rarely identified.

iv.3 Cosmic velocity fields in the SDSS volume

Figure 9: Slices through the posterior mean for the radial (upper panel), polar (middle panel) and azimuthal (lower panel) velocity components, defined on a cubic grid of voxels. Velocity fields are obtained from the final configuration of a tetrahedral tessellation of the Lagrangian particle grid. Galaxies of the SDSS main galaxy sample are overplotted. The slice is  Mpc/ thick around the celestial equator. The reader is invited to compare with figure 7 in Jasche, Leclercq & Wandelt (2015).
Figure 10: Power spectra of the velocity divergence (, top panel), the vorticity (, second from top), the cross-correlation between velocity divergence and density contrast (, third from top) and the density power spectrum (, bottom). The particle distributions are determined using constrained (“posterior”) and unconstrained (“prior”) realizations. The solid lines correspond to the mean among all realizations used in this work, and the shaded regions correspond to the 2- credible interval estimated from the standard error of the mean. In the bottom panel, the dashed black curve represents , the theoretical power spectrum expected at from high-resolution -body simulations. In spite of the approximation made in the inference with 2LPT, cola filtering regenerates most of the final vorticity, so that the power spectrum is only off by 35% with respect to prior expectations.

Understanding large-scale velocity fields is of central importance for modern cosmology: it is linked to galaxy and large-scale structure formation (Pichon & Bernardeau, 1999), is crucial in modeling redshift-space distortions (Percival & White, 2009) and in measuring the kinetic Sunyaev-Zel’dovich effect (Shao et al., 2011; Lavaux, Afshordi & Hudson, 2013). Unfortunately, determining the continuous velocity field is an intrinsically complicated problem, even in simulations, because (i) it is only sampled at discrete locations, with poor resolution in low-density regions; and (ii) it is multi-valued in regions where multiple streams contribute. Typical observational uncertainties (mask, selection effects, galaxy biases) add an additional layer of complexity. Recently, Hahn, Angulo & Abel (2015) showed that the first issues can be overcome in simulations by using the Lagrangian phase-space sheet. We apply this technique to our set of constrained simulations and demonstrate that we can also circumvent the observational complications. In doing so, we provide unprecedentedly accurate reconstruction of cosmic velocity fields in the volume of the nearby Universe covered by the northern galactic cap of the SDSS main galaxy sample and its surroundings.

We applied the estimators described in section II.3 to our set of constrained simulations, and obtained the components of the velocity field , , , on Eulerian grids of and voxels. As shown in the beginning of section IV, under the assumption that velocities are conditionally independent of the data once initial conditions are known, these samples are drawn from . As before, each of them gives one plausible realization of the velocity field, and the variation between samples quantifies the remaining uncertainty. In figure 9, we show the posterior mean for the three spherical components of the velocity field, in the celestial equator. The interested reader may want to compare with figure 7 in borg sdss–inference, which showed the velocity field as probed by 2LPT particles in one particular sample. The present work constitutes a substantial improvement over these results, since the velocity field is now predicted by a fully non-linear structure formation model instead of 2LPT, well-sampled even in low-density regions, and with more straightforward uncertainty quantification. The recovered velocity field is fully compatible with the known structure of the nearby Universe; showing, for example, the infall of matter onto the Sloan Great Wall.

As noted in borg sdss–inference, velocity fields are derived a posteriori with borg, i.e. they are a prediction of physical models given the inferred initial density field.333In contrast, algorithms like virbius (Lavaux, 2016) explicitly exploit velocity information contained in the data to constrain the result, and reconstructions based on surveys of peculiar velocities have produced highly appealing results (e.g. Courtois et al., 2013; Tully et al., 2014). However, since velocity fields are predominantly governed by large-scale modes, which are accurately recovered by borg (borg sdss–inference), any improvement in the inference scheme is not expected to crucially change our results.

To characterize the accuracy at which our velocity fields are predicted, we analyzed their spectral properties against that of reference simulations produced with the same setup (presented in borg sdssT-web, ). From the Cartesian velocity components,444We do not use the divergence and curl directly computed using Lagrangian transport, but rather from the velocity field itself, for the reasons discussed in Hahn, Angulo & Abel (2015, section 5.1). we computed the divergence and the curl,


using FFTs on the Eulerian grid of side length 750 Mpc/ and voxels. We computed their respective power spectra, given by


To test the proportionality between and the density contrast , valid at first order in Lagrangian perturbation theory, we also computed the cross- and auto-power spectra


In this case, is estimated by standard CiC deposit of dark matter particles to the grid, and we apply a correction factor for aliasing effects and suppression of small-scale power close to the Nyquist wavenumber (Jing, 2005).555We here rely on the CiC scheme and not on the phase-space density fields estimators, following the arguments of Hahn, Angulo & Abel (2015, section 5.2).

In figure 10, we show , , and as a function of the wavenumber . The measurements from our simulations agree with the results of Hahn, Angulo & Abel (2015). The plots confirm the agreement between unconstrained and constrained density fields () at all scales (already discussed in borg sdssT-web).666We note that a lack of small scale power with respect to theoretical predictions, for , is observed in both unconstrained and constrained realizations. This is a gridding artifact due to the finite mesh size used to compute power spectra. This scale corresponds to around one quarter of the Nyquist wavenumber. Remarkable agreement with simulations can also be observed for and correlations. Deviations can only be observed for . Below this scale, structure formation becomes notably non-linear, meaning that the 2LPT model used in the borg inference process breaks down, which explains the lack of small-scale power when the constrained initial conditions are run forward with a fully non-linear code. The lack of and correlations does not exceed  30% for . Since the velocity field is curl-free in 2LPT, the vorticity is a unique prediction of the physical model accounting for full gravity. Strikingly, the prediction for still agrees well with the result of unconstrained simulations, meaning that cola filtering reinstates most of the final vorticity, and does so consistently with the inferred phases (see the panel in figure 3). Discrepancies are limited to an almost scale-independent negative offset of  35%, which can be explained by known limitations of the borg data model: use of 2LPT instead of fully non-linear gravity, only partial fitting of galaxy bias, absence of redshift-space distortions and lightcone effects.

Summing up this section, at the demonstrated degree of accuracy, our results go beyond the state of the art for the characterization of cosmic velocity fields in the SDSS volume, including its low galaxy-density regions. They include the first quantitative prediction of the vorticity field, and come with the possibility of uncertainty quantification at the level of fields and summary statistics.

iv.4 Cosmography

Figure 11: Slices through the supergalactic plane, with a thickness of  Mpc/ for galaxies. The fields shown are the density (upper panel), the secondary stream density (middle panel) and the radial velocity (lower panel), as a function of distance and supergalactic longitude . In the upper panel, some structures of the Local Universe are identified in orange: the Coma cluster, the Ursa-Majoris supercluster, and two farther clusters in the direction of Ursa-Major and Ursa-Major/Coma-Berenices. The Virgo-Boötes-Hercules (VBH) filament is also labeled. In purple, two nearby voids in the directions of Ursa-Major and Coma-Berenices, as well as the Boötes void. The centre of superclusters and voids is projected onto the supergalactic plane and denoted as stars.

For illustration purposes, as well as to demonstrate the power of inferring the phase-space structure of dark matter, we discuss in this section some cosmographic results concerning well-known objects of the Local Universe. We have already discussed the Sloan Great Wall (Gott et al., 2005; Einasto et al., 2011), at celestial equatorial coordinates , ,  Mpc/, visible in figures 3, 6 and 9.

In figure 11, we show slices through the supergalactic plane (SGP). The supergalactic coordinate system is defined by the position of its North Pole () in the direction of Galactic coordinates (; , Lahav et al., 2000). The portion of the SGP () which is constrained in our analysis corresponds to , the rest is not covered by the SDSS main galaxy sample. The different panels show the posterior mean for: the density estimated from mass tracers (see section IV.1.2), the secondary stream density (see section IV.1.1), and the radial velocity field (see section IV.3). Galaxies in a  Mpc/-thick slice are overplotted. Some major structures of the Local Universe are clearly visible: the Coma cluster (,  Mpc/), the Ursa-Majoris supercluster (,  Mpc/), and two farther clusters in the direction of Ursa-Major (,  Mpc/) and Ursa-Major/Coma-Berenices (,  Mpc/) (e.g. Abell, Corwin & Olowin, 1989). One can also notice part of Leo supercluster (,  Mpc/), which is mostly below the SGP (at ). Three voids intersect the SGP: the well-known Boötes void (Kirshner et al., 1981), whose center is above the SGP (, ,  Mpc/) and which is around  Mpc/ in radius (Batuski & Burns, 1985), a void in the direction of Ursa-Major (,  Mpc/), and one in the direction of Coma-Berenices (,  Mpc/) (e.g. Fairall, 1998). Part of the Virgo-Boötes-Hercules (VBH) filament, which extends from to at  Mpc/ (Lavaux & Jasche, 2016) is also visible. For comparison, see also figure 4 in Carrick et al. (2015) and figure 7 in Lavaux & Jasche (2016).

As expected, structures observed in galaxies correlate well with the reconstructed dark matter density. At this mass resolution, up to 5 matter streams converge in the inner regions of Coma, the Ursa-Majoris supercluster, and the VBH filament. The voids in Coma-Berenices and in Ursa-Major are single-stream regions, as well as most voids of the Local Universe. However, our analysis reveals some substructure of the Boötes void. Reconstructed velocity fields gives further dynamical insight into these structures. Superclusters are still accreting matter, as seen in radial velocity “dipoles”. The ongoing emptying of voids is also clearly observable; for example, dark matter inside the Boötes void is streaming out towards its surrounding overdensities, the Coma cluster and the VBH filament.

V Summary and conclusions

This work discusses a fully probabilistic, chrono-cosmographic analysis of the Lagrangian dark matter sheet in a volume covered by the northern cap of the SDSS main galaxy sample. It is a data application of advanced tools, the application of which was so far limited to post-processing simulations and deemed unfeasible in observations. The result is a wealth of new information about the dark matter underlying the observed galaxy distribution. Importantly, all these results are probabilistic, i.e. come with a quantification of uncertainty at the level of maps.

Specifically, we characterized the number of streams and the flow properties in Lagrangian and Eulerian space; we explained how to detect caustics; we improved estimates of the density field and showed that this can be thought of as an extended non-linear filtering; and we obtained extremely accurate reconstructions of the velocity field and its derivatives (velocity divergence and vorticity). We introduced the Lagrangian classifier lich, a generalization of diva to the case of heterogeneous flows. lich is a more precise assignment of web-types — distinguishing potential and vortical structures — that we have used to check the accuracy of the potential flow approximation. Using diva, lich, and origami, we presented the first maps of nearby Lagrangian cosmic web elements, and showed how to translate the result into redshift-zero Eulerian maps. With these definitions, identified structures have a direct physical interpretation in terms of the entire history of matter flows toward and inside them.

All the results shown in this paper are predictions based on fusing data constraints imprinted in the inference products (the initial conditions) and physical information expressed in the numerical structure formation model. At various places, we have checked the accuracy of these predictions. All posterior results are consistent with the respective priors, and deviate only in expected ways where the data model used for inference of the initial conditions has known shortcomings. This analysis therefore supports the standard paradigm of large-scale structure formation. If our tests suggest that data models are not yet accurate enough to allow inference of cosmological model parameters, they reinforce the concept underlying borg, i.e. the general picture of including physical structure formation within survey data analysis. In particular, our first prediction of vorticity matches expectations at the % level, which is remarkable given that the inference model only relies on the assumption of a potential flow. A measure of the information content of Lagrangian cosmic web maps shows that it is vastly enhanced with respect to Eulerian ones, due to information transport along Lagrangian trajectories. The natural subsequent question of choosing the optimal cosmic web classifier for certain analyses is not addressed in this work; the required information-theoretic framework is introduced in Leclercq et al. (2016).

This work constitutes the first three-dimensional mapping of the nearby Lagrangian dark matter sheet, as constrained by the SDSS main galaxy sample. The nature of these maps is unprecedented, as they are data-constrained with a high degree of control on uncertainties. The understanding of the phase-space properties of dark matter is a pathway to treating it in a more physical fashion at cosmological scales, beyond the phenomenological description of the CDM model. In particular, density fields and locations of caustics can be used for dark matter indirect searches in -ray data from the Fermi satellite (see Gao et al., 2012); velocity fields can be used to account for peculiar velocity effects in supernova cosmology (see Davis et al., 2011) and to characterize the kinetic Sunyaev-Zel’dovich effect (see Shao et al., 2011); tidal and vorticity fields can be used to gain better control on intrinsic alignments in the context of weak lensing (Codis et al., 2015) and CMB lensing (Larsen & Challinor, 2016). In addition to these specific examples, we believe that vast scientific opportunities to use our maps may emerge from the community. These data products, complementing to our earlier release of Eulerian density fields (borg sdss–inference) and T-web maps (borg sdssT-web), are provided as on-line material along with this paper. They are available from the first author’s website, currently hosted at

Beyond the specific region probed in this work, our study demonstrates the possibility of mapping the dark matter sheet in the real Universe. This possibility, so far limited to simulations, is uniquely offered by Bayesian physical inference of the initial conditions from large-scale structure surveys, which allows a self-consistent treatment of the necessary aspects: handling of observational effects and description of structure formation. Thus, our result may have important implications for large-scale structure data analyses, as they allow new confrontations of observational data and theoretical models, and therefore novel tests of the standard paradigm of cosmic web formation and evolution.

Note added: After submission of this paper, several related pieces of work appeared in the literature. These include a new cosmic web classifier using the eigenvalues of the Hessian of the multistream field (Ramachandra & Shandarin, 2017), a new derivation of the theory of caustics (Feldbrugge et al., 2017) and a comparison project of different cosmic web algorithms (Libeskind et al., 2017).

Statement of contribution

FL implemented cola and the required cosmic web analysis tools, performed the study, produced the maps and wrote the paper. JJ developed borg and lead the SDSS analysis; GL developed diva; JJ and GL contributed to the development of computational tools. BW was involved in the conception and design of Bayesian large-scale structure inference. BW and WP contributed to the interpretation of results. All the authors read and approved the final manuscript.

FL would like to thank Oliver Hahn, Mark Neyrinck and Rien van de Weygaert for discussions that stimulated this analysis. The manuscript benefited from constructive comments and useful suggestions from the reviewer. We are grateful to Bridget Falck and Mark Neyrinck for making publicly available their origami code, which was used in this work. Numerical computations were done on the Sciama High Performance Compute (HPC) cluster which is supported by the ICG, SEPNet and the University of Portsmouth. This work has also made use of the Horizon Cluster hosted by the Institut d’Astrophysique de Paris; we thank Stéphane Rouberol for running smoothly this cluster for us. FL and WP acknowledge funding from the European Research Council through grant 614030, Darksurvey. JJ is partially supported by a Feodor Lynen Fellowship by the Alexander von Humboldt foundation. BW acknowledges funding from an ANR Chaire d’Excellence (ANR-10-CEXC-004-01) and the UPMC Chaire Internationale in Theoretical Cosmology. This work has been done within the Labex Institut Lagrange de Paris (reference ANR-10-LABX-63) part of the Idex SUPER, and received financial state aid managed by the Agence Nationale de la Recherche, as part of the programme Investissements d’avenir under the reference ANR-11-IDEX-0004-02. This research was supported by the DFG cluster of excellence “Origin and Structure of the Universe”.