Cosmic Bulk Flow

# Cosmic bulk flows on 50 {h}^{-1}Mpc scales: A Bayesian hyper-parameter method and multishells likelihood analysis

## Abstract

It has been argued recently that the galaxy peculiar velocity field provides evidence of excessive power on scales of , which seems to be inconsistent with the standard CDM cosmological model. We discuss several assumptions and conventions used in studies of the large-scale bulk flow to check whether this claim is robust under a variety of conditions. Rather than using a composite catalogue we select samples from the SN, ENEAR, SFI++ and A1SN catalogues, and correct for Malmquist bias in each according to the IRAS PSCz density field. We also use slightly different assumptions about the small-scale velocity dispersion and the parameterisation of the matter power spectrum when calculating the variance of the bulk flow. By combining the likelihood of individual catalogues using a Bayesian hyper-parameter method, we find that the joint likelihood of the amplitude parameter gives (68 per cent confidence region), which is entirely consistent with the CDM model. In addition, the bulk flow magnitude, , and direction, , found by each of the catalogues are all consistent with each other, and with the bulk flow results from most previous studies. Furthermore, the bulk flow velocities in different shells of the surveys constrain (, ) to be (), for SFI++ and () for ENEAR, which are consistent with WMAP 7-year best-fit values. We finally discuss the differences between our conclusions and those of the studies claiming the largest bulk flows.

###### keywords:
methods: statistical–galaxies: kinematics and dynamics –distance scale–large-scale structure of Universe

## 1 Introduction

The cosmic bulk flow is the streaming motion of the galaxies surrounding our Milky Way system, due to the gravitational pull of cosmic structure on large scales. In the gravitational instability paradigm, for a galaxy at position , the peculiar velocity of an individual galaxy at time is given by (Peebles, 1993)

 v(r,t)=Ω0.55mH04π∫d3r′δm(r′,t)r−r′|r−r′|3, (1)

where is the density contrast at position , is the fractional matter density, and is the Hubble constant. The bulk flow is normally considered as an average over a sufficiently large volume, with some window function , so that the above linear perturbation theory is applicable. This average is defined as (Juszkiewicz et al., 1990; Nusser & Davis, 2011)

 Extra open brace or missing close brace (2)

where is the 3-D peculiar velocity field at time , defined in Eq. (1). A complete investigation of bulk flows of nearby galaxies should measure the individual velocities of galaxies all over the observed volume. However, realistic observational techniques, such as the Tully-Fisher relation, only allow us to probe the radial component of the peculiar velocities of galaxies. In addition, most of the current observations can only cover a patch of sky with limited depth, leading to large uncertainties when interpreting the results.

Of course, none of these considerations are new. There is already a large literature on the study of the peculiar velocity field, with particularly intense activity in the early 1990s (see overviews in Burstein, 1990; Courteau et al., 1993; Latham & da Costa, 1991; Bouchet & Lachièze-Rey M., 1993; Strauss & Willick, 1995; Courteau & Willick, 2000). Investigating the relationship between velocities and densities has great potential for constraining cosmological parameters, and testing theories of gravity on large scales. However, it has long been realised that the construction of appropriate catalogues is difficult, and that systematic effects can easily overwhelm statistical noise.

In attempting to overcome these observational limitations, there have been significant recent efforts in the community to reconstruct bulk flow moments from the limited data available, and to test their consistency with the CDM cosmology. One of the important issues lies in determining the proper weighting for individual galaxy velocities in a catalogue in order to obtain streaming motions. Some of the published studies, such as Sarkar et al. (2007) and Abate & Erdogdu (2009), focus on a weighting scheme that produces the maximum likelihood estimate of the bulk flow (see also Watkins et al., 2009), which can minimise the measurement noise. However, this weighting depends on the particular survey geometry and statistical properties, which leads to a large uncertainty when interpreting the constraints from combined data sets.

Watkins et al. (2009) proposed another method of estimating the bulk flow of galaxy peculiar velocities. They focused on the problem of how realistic surveys can be used to reconstruct the bulk flow at a given depth. They developed a minimum variance weighting method (Watkins et al., 2009; Feldman et al., 2010), which minimises the variance between the real data catalogue and the ideal survey, and they applied it to combined catalogues of peculiar velocity surveys. Surprisingly, they found a very large bulk flow on scales () towards , , which prefers a large amplitude of fluctuations (), inconsistent with the WMAP 5-year results (Komatsu et al., 2009). Subsequent work has discussed a possible explanation for this large bulk flow related to pre-inflationary isocurvature perturbations (Ma et al., 2011).

Contradicting the claim in Watkins et al. (2009), Nusser & Davis (2011) developed a method termed the ASCE (All Space Constrained Estimate) which reconstructs the bulk flow from an all-space 3-D velocity field to match the inverse Tully-Fisher relation. By applying this method, as well as the Maximum likelihood method (Abate & Erdogdu, 2009), to the Spiral Field -band Survey (SFI++ survey, Springob et al. 2007) catalogue, Nusser & Davis (2011) found the bulk flow on a sphere of radius to be , towards ()=(), which is close to the results from the maximum likelihood method. The estimated cosmological parameters, i.e. , are consistent with the CDM model. However, since Nusser & Davis (2011) only used the SFI++ data set, it is still not clear whether it is the other data sets used in Watkins et al. (2009) which led to the significantly different results.

Any analysis which claims to strongly rule out the simple inflationary CDM model deserves careful scrutiny, since a confirmed discordance would have profound consequences for our understanding of the large-scale Universe. We can identify four potential problems in Watkins et al. (2009) which may potentially skew the likelihood and bias the results. Firstly, the inhomogeneous Malmquist bias is not corrected for in most catalogues, for example: ENEAR (da Costa et al., 2000; Bernardi et al., 2002; Wenger et al., 2003); SN (Tonry et al., 2003); SC (Giovanelli et al., 1998); EFAR (Colless et al., 2001); and Willick (Willick, 1999). This deficiency can significantly bias the distance estimates. Secondly, the distance errors from the Tully-Fisher and Fundamental Plane methods can be comparable to the measured velocities as the surveys go deeper, and moreover a simple model of Gaussian errors is almost certainly inappropriate as systematics come to dominate the distance estimation. Therefore the velocity data beyond become both very noisy and unreliable in assessing the bulk flow. Thirdly, directly combining various catalogues with different calibration methods can also induce systematic errors and a spurious flow. Finally, the assumption of a unique small scale velocity dispersion may be too small for some of the surveys (e.g. SFI++ prefers , Ma et al. 2011), perhaps skewing the constraints on the cosmological parameters and . The purpose of this paper is to investigate carefully the analysis presented in Watkins et al. (2009), and to combine each catalogue with a Bayesian hyper-parameter method to test for consistency with the usual CDM perturbation theory. As we have seen from Nusser & Davis (2011), different statistical methods should not dramatically alter the results, so we will focus on the ‘minimal variance’ scheme (Watkins et al., 2009; Feldman et al., 2010).

A further motivation for this paper is as an extension to the velocity-gravity comparison work we have already carried out in Ma et al. (2012b). In that paper we compare the observational peculiar velocity data with the reconstructed velocity field from the IRAS PSCz catalogue, and fit the linear growth rate parameter, . In this new paper, we do not discuss the small-scale modes, but will reconstruct the bulk motion of galaxies on distances Mpc. We will perform a direct comparison between observational data and CDM model predictions for the bulk flow velocity, and constrain the cosmological parameters and . In addition, we will extend the minimal variance scheme suggested in Watkins et al. (2009) and Feldman et al. (2010) to a multishell likelihood method. Furthermore, we will directly investigate the reason for the apparently large flows found in Watkins et al. (2009) and Feldman et al. (2010).

This paper is organised as follows. We first list the data sets used in Section 2.1, and then discuss the data selection criterion in Section 2.2. For the selected data, we correct the inhomogeneous Malmquist bias for the distance estimate (Section 2.3). In Section 3, we first illustrate how to quantify the variance of the bulk flow at any particular depth (Section 3.1), then we review the minimum variance weighting scheme proposed in Watkins et al. (2009) to measure the bulk flow at a given depth (Section 3.2), and furthermore we present the likelihood function for each individual catalogue (Section 3.3) and the hyper-parameter approach used to combine different data sets (Section 3.4). Then in Section 4 we compare our findings with those in Watkins et al. (2009). We first confirm that we can accurately reproduce the results in Watkins et al. (2009) by adopting the same conventions; then in Section 4.1 we show our constraints on bulk flow moments by performing the full likelihood analysis for each individual catalogue rather than the combined catalogue. In Section 4.2, we apply the Bayesian hyper-parameter method to combine the likelihoods of different catalogues, in order to avoid the systematics that may affect the constraints. This allows us to assess the consistency of each individual catalogue, and to work out the cosmological parameters in the combined likelihood. In Section 4.3, we extend our likelihood analysis to consider bulk flows in multiple shells in a survey, and their covariance matrix, and we compare our findings with WMAP 7-year best-fit values and the results from Nusser & Davis (2011). Our discussion and conclusion are summarised in Section 5.

Note that although () is now determined with reasonable accuracy, throughout this paper we continue to adopt the convention of giving distances in units of for ease of comparison with previous results.

## 2 Data

### 2.1 Catalogues

We will use four different samples coming from recent peculiar velocity surveys to reconstruct the bulk flow. These samples are listed from the nearest to the most distant (see also Watkins et al. 2009; Feldman et al. 2010; Turnbull et al. 2012). Our four samples consist of the ENEAR catalogue (da Costa et al., 2000; Bernardi et al., 2002; Wenger et al., 2003; Hudson, 1994), the SN catalogue (Tonry et al., 2003), the SFI++ catalogue (Springob et al., 2007) and the A1SN catalogue (Turnbull et al., 2012; Jha et al., 2007; Hicken et al., 2009; Folatelli et al., 2010). For detailed discussion and analyses of these four samples, including their characteristic depths, typical distance errors and data compilation, we refer readers to Section 3 of Ma et al. 2012b.

We should mention that in Watkins et al. (2009) and Feldman et al. (2010) five other catalogues, namely SBF (Tonry et al., 2001), SC (Giovanelli et al., 1998; Dale et al., 1999), SMAC (Hudson, 1999; Hudson et al., 2004), EFAR (Colless et al., 2001) and Willick (Willick, 1999), were used to reconstruct the bulk flow of galaxies. In contrast to the previously described four catalogues, these samples are either very distant and therefore have large errors, or very sparse in which case the survey geometry is complicated. Watkins et al. (2009) combined these five low-quality catalogues with the previous four higher quality catalogues to form a larger ‘COMPOSITE’ catalogue, and found an excess power of flow on scales of . However, there is potential danger in combining various catalogues with different calibration schemes. One concern is that the very distant samples with large systematics may be inducing a spurious large-scale flow.

In order to investigate this we tried to reproduce the ‘excess flow’ effect by using the suspicious COMPOSITE catalogue (see Section 3). However, in the subsequent more careful analysis, we will only use the ENEAR, SN, A1SN and SFI++ catalogues, with the following data selection criterion and Malmquist bias correction.

### 2.2 Data selection

We listed four different peculiar velocity catalogues in Section 2.1. In these catalogues, the samples beyond the scale are also quite sparse and suffer from large errors due to uncertainties in the distance indicators; therefore we trim the data sets at in order to reconstruct the bulk flow moments accurately on the scale.

In addition, since some of the samples in the SFI++ catalogue with are affected by localised non-linear structures, giving very large velocities (Ma et al., 2012b), we excluded these high velocity samples () from the SFI++ catalogue. The classification of the data in each catalogue is listed in Table 1.

### 2.3 Malmquist bias correction

In the above description of velocity catalogues, two different distance indicators, the Tully-Fisher relation and the Fundamental Plane method, have been used for determining the SFI++ and ENEAR distances. In addition, supernova luminosities are used in calibrating the distance of the SN and A1SN catalogues.

The large scatter of distance indicators suggests that objects with inferred distance , may come from a wide range of possible true distances. The effect usually referred to as Malmquist bias (Malmquist, 1920; Hendry et al., 1993) is related to the probability distribution of true distance , given the measured distance with its measurement error. The desired function is (Lynden-Bell et al., 1988a; Strauss & Willick, 1995)

 P(r|d)=r2n(r)exp(−[ln(r/d)]22Δ2)∫∞0drr2n(r)exp(−[ln(r/d)]22Δ2), (3)

where is the radial density distribution, and is the fractional distance uncertainty of distance indicators. Note that for the Tully-Fisher and Fundamental Plane methods, the typical errors are around per cent, and for Type Ia supernovae, the typical error is around per cent.

The simplest case is homogeneous Malmquist bias (Strauss & Willick, 1995; Hudson, 1994), in which, the number density is constant, so that Eq. (3) becomes independent of density, with

 P(r|d)=1√2π(dΔ)exp(−92Δ2)[x2exp(−[lnx]22Δ2)], (4)

where is the ratio between the true distance and the measured distance. One can verify that the expectation of , . This means that even for a constant density distribution of galaxies, the distance indicator is still generally biased. This is due to the fact that, near the measured distance , there are more galaxies in shells of larger distance than smaller distance, so it is more probable that the true distance is greater than the measured distance i.e. .

However, in the more general case, the gradient of the number density is not negligible, and this either reinforces or works against the volume effect – inhomogeneous Malmquist bias. If the gradient of the number density is positive, there will be even more galaxies at the larger distances than in the constant density case, i.e. the inhomogeneity reinforces the homogeneous Malmquist bias; on the other hand, if the gradient is negative, then the effective is opposite, and the inhomogeneous Malmquist bias partially cancels the homogeneous Malmquist bias effect.

To quantify the inhomogeneous Malmquist bias correctly, we use the real-space reconstructed positions of the PSCz galaxies as mass tracers to interpolate the mass density field on a cubic grid of Length and mesh size , smoothed with a Gaussian filter of . The field on the lattice is then interpolated along the line of sight to each object in the catalogue. The value of along the line of sight is specified at the position of equally-spaced points, with a binning of . Finally, Eq. (3) is used to predict from using a Monte Carlo rejection procedure.

We re-examine the catalogues described in Section 2.1, and correct for Malmquist bias according to Eq. (3) for the ENEAR, SN and A1SN samples1. We plot the measured distance (before Malmquist bias correction) and corresponding true distance (after Malmquist bias correction) in the left panel of Fig. 1. One can see that removing the bias tends to place galaxies at larger distances, although the shift is not very significant. The right panel of Fig. 1 shows the comparison between the uncorrected and corrected line-of-sight peculiar velocities.

## 3 Measuring the bulk flow

For linear perturbation theory within the CDM paradigm, the velocity field at any spatial point is directly related to the underlying density field through Eq. (1). What we are interested in is the bulk flow moment of the velocity field in a spherical region. Therefore in this section, we first calculate the mean-squared variance of the bulk flow (in Eq. (2)) which can be used to quantify the amplitude of the flow at different depths. Then we will review the ‘minimum variance’ method for weighting the data sets on different scales. Finally we will present the likelihood function that can be used to constrain cosmology with the measured bulk flow.

### 3.1 Mean-squared velocity in a top-hat region

Real surveys can only observe galaxies out to a particular depth , which means that the ‘window function’ has a sharp cut-off:

 w(∣∣r′−r∣∣,R) = 0, if ∣∣r′−r∣∣>R; (5) = 1, if ∣∣r′−r∣∣⩽R.

Therefore, by measuring only a galaxy sample within this sphere of radius one can calculate the ‘streaming motion’ through Eq. (2) by averaging the velocity within the sphere. The mean-squared velocity of the spherical region within radius is therefore (see also Ma et al. 2012a)

 ⟨|vbulk(t)|2R⟩ = ⟨vbulk(→x,t)R⋅vbulk(→x,t)R⟩→x all space (6) = (34πR3)2(HΩ0.55ma(t))2(2π)3∫d3→k˜w2(k,R)P(k)k2 = Missing or unrecognized delimiter for \left

At the present epoch and , so today

 ⟨|vbulk|2R⟩=(3H0Ω0.55m)22π2∫P(k)(j1(kR)kR)2dk. (7)

We expect Eq. (7) to be useful for quantifying the non-zero velocity fluctuations of our local surroundings. For the WMAP 7-year cosmological parameters (Komatsu et al., 2011), the typical bulk flow magnitude on a scale of from Eq. (7) is . We will compare this theoretical value with the measured velocity catalogues.

### 3.2 Minimum variance scheme

Bulk flow estimates are essentially weighted averages of the individual velocities in a galaxy survey (Watkins et al., 2009). Previous work, such as Abate & Erdogdu (2009) and Sarkar et al. (2007), focused on the estimate that minimises the uncertainties due to measurement noise, i.e. the maximum likelihood estimation scheme, but did not make any correction for the survey geometry. Thus the maximum likelihood bulk flow is obviously dependent on a given survey’s particular geometry and statistical properties. On the other hand, Watkins et al. (2009) and Feldman et al. (2010) instead addressed the question of how peculiar velocity data can be used to statistically estimate a more specialised quantity, the bulk flow of an ideal, densely-sampled survey with a given depth. They developed a ‘minimal variance’ weighting scheme which produces an estimate of the bulk flow at any particular depth. They found an excess in the power of the bulk flow on scales of , which seems to exceed the CDM predictions at the level. In the following, we will first review the minimum variance weighting scheme developed in Watkins et al. (2009) and Feldman et al. (2010) and then present the likelihood function for cosmological parameters.

A realistic survey consists of objects on the sky having position and measured line-of-sight velocity , with measurement error . The measured line-of-sight velocity is assumed to have the form , where is the galaxy line-of-sight velocity in the matter rest frame, and is a superimposed Gaussian random motion with variance , where accounts for the 1-D small-scale velocity dispersion.

Given an idealised survey with bulk flow velocity () at a particular depth , we need to determine the weight which makes the ‘linear compression’

 up=N∑i=1wp,iSi, (8)

give the closest approximation of (Feldman et al., 2010). At the same time, the line-of-sight velocity at position should take the form . In order for the estimator to give the correct amplitude of the velocity , i.e. , the weight function has to satisfy the following constraint:

 ∑iwp,i(xq⋅ri)=δpq. (9)

We can apply the Lagrange multiplier approach to minimise the average variance , i.e. minimise the following quantity (Feldman et al., 2010)

 ⟨(up−Up)2⟩+∑q(∑iwp,i(xq⋅ri)−δpq). (10)

By plugging in Eq. (8), one can expand the first term and obtain

 ⟨U2p⟩−2∑iwp,i⟨SiUp⟩+∑i,jwp,iwp,j⟨SiSj⟩ +∑qλpq(∑iwp,i(xq⋅ri)−δpq). (11)

In order to find the weight function that can minimise the variance, we take the derivative of the above equation and equate it to zero:

 −2⟨SiUp⟩+2∑jwp,j⟨SiSj⟩+∑qλpq(xq⋅ri)=0. (12)

From Eq. (12), one can solve for the weight function as

 wp,i=∑j(G−1)ij[⟨SjUp⟩−12λpq(xq⋅rj)], (13)

where is the covariance matrix for the measured velocity. Since as described above, one can write the covariance matrix as

 Gij = ⟨vivj⟩+δij(σ2∗+σ2i) (14) = ⟨(^ri⋅v(ri))(^rj⋅v(rj))⟩+δij(σ2∗+σ2i),

since and are not correlated. The first term is the real space velocity correlation function, which is related to the matter power spectrum in Fourier space,

 Missing or unrecognized delimiter for \bigr (15)

where the window function,

 Fij(k)=∫ d2^k4π(^ri⋅^k)(^rj⋅^k)×exp(ik^k⋅(ri−rj)), (16)

can be calculated analytically (Ma et al., 2011).

The correlation term is the average product of measured velocity with the ideal bulk flow moment . The ideal bulk flow moment is the average of the random velocities in an isotropic survey region. We assume that the survey is a spherical region with radius , therefore we generate random velocities in the top-hat region and calculate as

 ⟨SjUp⟩=1N′N′∑n′=1(xp⋅rn′)⟨vn′vj⟩, (17)

where the line-of-sight velocity correlation can be calculated in the same manner as Eq. (15).

Therefore, the only unknown in Eq. (13) is the Lagrange multiplier matrix . We can plug Eq. (13) into the constraint equation (9) to solve for :

 λpq=∑l[∑ij⟨SjUp⟩G−1ijgl(^rj)−δpl]M−1lq, (18)

where the matrix is given by

 Mpq=12∑i,jG−1ij(xp⋅ri)(xq⋅rj). (19)

To summarise, by simulating an ideal survey at depth and using Eqs. (14), (17), (18) and (19), one can calculate the weight function in Eq. (13) and hence obtain the bulk flow Moment at depth according to Eq. (8).

Once we obtain the bulk flow moment from the minimum variance weighting scheme, we can calculate the covariance matrix and therefore perform a full statistical analysis. The covariance matrix of becomes

 Cpq = ⟨upuq⟩ (20) = ∑i,jwp,iwq,j⟨SiSj⟩ = ∑i,jwp,iwq,jGij,

which can be broken down into an instrumental noise term

 Cnpq=∑iwp,iwq,i(σ2i+σ2∗), (21)

and a cosmic variance term

 Cvpq=Ω1.1mH202π2∫dk P(k) W2pq(k), (22)

where the angle-averaged window function is

 W2pq(k)=∑i,jwp,iwq,jFij(k). (23)

This window function describes the scale in -space that the catalogue actually probes. As an example, we plot this window function for the ENEAR catalogue at depth in Fig. 2a. As expected, it matches the window function of ENEAR as shown in figure 3 in Watkins et al. (2009) perfectly well. The shape of the curves reveal that the window function decays rapidly for beyond , therefore the non-linear regime of the matter power spectrum does not contribute to the bulk flow moment – bulk flow moments reflect perturbations on large scales, .

### 3.3 Likelihood of Individual catalogues

Given the reconstructed bulk flow moment at some depth and its covariance matrix (20), the likelihood of cosmological parameters is

 L(θ)=1(2π)32det(C(θ))12exp(−123∑p,q=1upC(θ)−1pquq). (24)

Since the major effect of the constraints is on the amplitude and shape of the matter power spectrum, in the following analysis we will only vary and , while fixing the other parameters at the WMAP values. We compute the bi-variate likelihood and marginalise one parameter to obtain the 1-D posteriori distribution of the other parameter.

In order to demonstrate the accurate reproduction of the results in Watkins et al. (2009), we use the same set of WMAP 5-year best-fit parameters (Komatsu et al., 2009): ; ; ; ; ; plus small-scale velocity dispersion . We also use the approximation to calculate the matter power spectrum (see Appendix A for comparison with the numerical result), as is used in Watkins et al. (2009). In Fig. 2 we show that we can accurately reproduce the window function (see Eq. (23)) and marginalised distribution of , as shown by the black line in Fig. 2b. These curves are very close to those shown as figures 3 and 7 of Watkins et al. (2009).

In addition, since we will not use the SBF, SC, SMAC, EFAR and Willick catalogues here, we need to test whether this ‘removal’ of data can substantially change the result. To do this, we use the rest of the data in the COMPOSITE catalogue, i.e. the combination of the SN, ENEAR and SFI++ catalogues (4256 samples in total), and keep all other conventions the same as in Watkins et al. (2009) to calculate the likelihood of , and we obtain the red line in Fig. 2b. Comparing with the black line, one can see that the removal of these sparse and distant samples can move the peak of the likelihood towards lower values, but a very high value of is still preferred compared with the WMAP constraint (dashed line). Therefore, the excessive power in the bulk flow on scales is not completely driven by the inclusion of five sparse and fairly noisy samples – we need to investigate further to understand the reasons.

We make the following adjustments to the model parameters in order to precisely compare the velocity field prediction with the observational data:

• we use WMAP 7-year best-fit cosmological parameters (Komatsu et al., 2011) to compute our prediction, i.e. , , , and ;

• we use the value for the intrinsic velocity dispersion,2 in order to compute the covariance matrix (Eq. (21)), since Ma et al. (2011) showed that this value is preferred for most catalogues;

• in the formula for , we use the numerical result from CAMB (Lewis, Challinor, & Lasenby, 2000) instead of to compute the power spectrum, with the difference between the numerical result and the parameterisation being shown in Appendix A.

### 3.4 Combining catalogues: Bayesian hyper-parameter method

In Watkins et al. (2009), the combined catalogue, referred to as ‘COMPOSITE’ is used to reconstruct the bulk flow and constrain the cosmological parameters and . They found an excess power in the bulk flow on a scale of , which suggests a high value of compared with the constraints from WMAP 5-year results (Komatsu et al., 2009) – see Fig. 2b.

Directly combining a variety of catalogues with different calibration methods and systematics may not be a precise way of exploring the combined constraints. Another way of carrying out the combination is to first compute the likelihood of individual data sets, and then directly combine them by multiplication, i.e.

 Ljoint(θ)=N∏kLk(θ), (25)

where is the number of data sets. Such a procedure assumes that the quoted observational random errors can be trusted, and that the two (or more) statistics have equal weights, so that

 χ2joint=∑kχ2k. (26)

However, when combing different data sets, one often wants to assign different weights to them. Lahav et al. (2000) describe an approach (see also Press, 1997, for an earlier application of the same idea in astrophysics) using

 χ2joint=∑kαkχ2k, (27)

where the are ‘hyper-parameters’, which are to be evaluated in a Bayesian way. Here for each data set is

 χ2=∑i[xobsi−xthei(θ)]2σ2i, (28)

where the summation is over measurements and is the error for each data point. By multiplying by , each error effectively becomes and therefore if an experiment underestimates (or overestimates) the systematic errors, the hyper-parameter can scale the error by using relative weights. Indeed, the hyper-parameters are useful in assessing the relative weight for each different experiment. This procedure gives an objective diagnostic for revealing experiments with problematic error estimates and which therefore deserve further investigation of their systematic or random errors.

It is worth clarifying that, in principle, the systematic effects could have two different components, either an overall multiplicative factor for the velocities, or else an extra contribution to the measurement noise. Although the former kind of systematic effect would be appropriate for some other kinds of data (particularly where the dominant uncertainty is a linear calibration factor), the effect on the peculiar velocity field is more complicated than this. We therefore focus our attention on modelling the systematics as an additional source of noise, effectively giving a different weighting of the signal-to-noise of each data set. For a discussion of related issues in other branches of astrophysics see for example Gull (1989) and Stompor et al. (2009).

In the same spirit of assigning weights to each data set, Hobson et al. (2002) calculated the joint distribution of cosmological parameters for multiple data sets, in which the weight assigned to each is determined directly by its own statistical properties. The weights are considered in a Bayesian context as a set of hyper-parameters, which are then marginalised over in order to recover the posterior distribution as a function only of the cosmological parameters of interest. In the case of a Gaussian likelihood function, this marginalisation can be calculated analytically, and it is shown that the joint probability distribution, , when applying the hyper-parameter approach is

 Ljoint(θ)=N∏k=12Γ(nk2+1)πnk/2|Vk|1/2(χ2k+2)−(nk2+1), (29)

where and are the number of data, covariance matrix and for the th data set. In our approach, flat priors on and are assumed. We will use Eq. (29) to explore the use of different catalogues to constrain cosmological parameters.

Once the distribution of Eq. (29) is calculated, the hyper-parameter is already marginalised over, which means that it automatically incorporates the relative weights between each data set and combines them in an objective way. Therefore, rather than using the COMPOSITE catalogue to constrain cosmology, we will first investigate the individual likelihoods for each data set, and use the joint distribution in Eq. (29) to combine these data sets.

## 4 Results

In this section, we will perform two different analyses separately. The first one, in Section 4.1 and 4.2, will focus on reconstructing on scales, and use it to constrain cosmological parameters. In the second analysis, we will extend the ‘minimal variance’ scheme to consider the cumulative bulk flow at different radii, and to explore the joint constraints on cosmology from all the bulk flows in different shells.

### 4.1 Bulk flow moments

We now present our results on reconstructing bulk flow moments using the minimum variance method on scales (Eq. (8)). In Fig. 3a, the magnitude of the bulk flow is plotted for the four different catalogues. Theoretically, the magnitude of bulk flow follows the Maxwellian distribution, i.e.

 p(V)dV=√54π(VσV)2exp[−32(VσV)2]dVσV, (30)

where is the velocity dispersion parameter (Coles & Lucchin, 2002). We calculate it as , where is the covariance matrix (Eq. (20)). One can see that SFI++ provides the tightest constraint on the bulk flow magnitude; this is because it is the largest, densest and closest to full-sky survey available at the moment. In addition, ENEAR (669 samples) and A1SN (153 samples) provide roughly similar constraints on the bulk flow. This is due to the fact that although the A1SN (First Amendment Supernovae catalogue) has less data than ENEAR, its errors (calibrated by luminosity distance) are much smaller than for the Fundamental Plane distance estimates.

In Fig. 3a, one can also see that, although there are offsets between the peaks of the likelihood for each individual catalogue, they are all quite consistent with the theoretical prediction, which is the mean-squared velocity of a spherical region of (see Section 3.1). Therefore by correcting the inhomogeneous Malmquist bias and properly selecting the samples, the four catalogues show a coherent flow on scales of about .

In addition, we plot the constraint on the direction of the bulk flow in Fig. 3b, and compare these directions with those found in other studies. From the figure, we can see that SFI++ provides the tightest constraint on the bulk flow direction, and the constraints on the direction of the bulk flows are consistent with each other across all catalogues. We also mark the preferred direction of the bulk flow from other published estimates. We can see that our constraints are consistent with the directions obtained from Ma et al. 2011, Nusser & Davis 2011, Watkins et al. 2009, and Macaulay et al. 2012, but Kitaura et al. (2012) prefer a slightly larger value for Galactic latitude.

The quantitative results for the four catalogues are listed in Table 2.

### 4.2 Cosmological parameters

We now turn to cosmological parameter estimation. We first apply the likelihood function (Eq. (24)) to to each individual catalogue to calculate , assuming a flat prior, and then combine different catalogues by using the hyper-parameter joint likelihood function of Eq. (29). We show our results in Fig. 4.

In Fig. 4a, one can see that the posterior distribution for is highly skewed and has a fairly long tail out to large amplitudes, which suggests that the peculiar velocity data available at the moment still cannot rule out flows with large amplitude. In addition, we can see that the SN catalogue peaks near the WMAP 7-year value (), while the SFI++ catalogue prefers a slightly higher value and A1SN and ENEAR prefer smaller ones. However, within the errors they are all quite consistent with each other, and none of them are inconsistent with the WMAP value of .

In Fig. 4b, we plot the constraints on the plane by using the hyper-parameter likelihood function of Eq. (29). One can see that the WMAP best-fit value is located close to the 68% contour in the plane, and therefore the hyper-parameter results are consistent with the expectation from CDM. Comparing Fig. 4b with figure 6 in Watkins et al. (2009), one can see that our contour prefers a much lower value of , and it is also closer to the WMAP value of .

### 4.3 Multi-shells likelihood method

In the above approach, we use the reconstructed 3-D bulk flow velocity as the ‘observational data’ to constrain cosmology. We have shown that this likelihood (Eq. (24)) can provide a fairly strong constraint on , but the constraint on is rather weak and thus the 2-D contours of do not close. This is because we use only one velocity vector (at ) as a constraint and this information is not enough to provide a tight limit (see also figure in Watkins et al. 2009).

Based on the ‘minimal variance’ scheme, here we propose another method to consider the bulk flow velocities for all of the shells within a certain radius. Since bulk flow velocities on different shells are highly correlated, one needs to calculate the full-covariance matrix of those bulk flow velocities. Note that we will apply this method to each individual data set to assess its validity for constraining cosmological parameters.

In the step of calculating (Eq. (17)), we need to simulate random velocities in the top-hat region and therefore obtain the weighting function for a certain shell . Let us assume we can sample multiple shells with this method, and therefore obtain the weighting function and bulk flow velocity for each shell . Now we can calculate the covariance matrix of s as ( are two shells)

 CRR′pq = ⟨uRpuR′q⟩ (31) = ∑i,jwRp,iwR′q,j⟨SiSj⟩ = ∑i,jwRp,iwR′q,jGij,

which can be broken down into an instrumental noise term

 Cn,RR′pq=∑iwRp,iwR′q,i(σ2i+σ2∗), (32)

and a cosmic variance term

 Cv,RR′pq=Ω1.1mH202π2∫dk P(k) (WRR′pq)2(k), (33)

where the angle-averaged window function is

 (WRR′pq)2(k)=∑i,jwRp,iwR′q,jFij(k). (34)

Note that all of the shells are correlated. Suppose we have shells, and now we arrange the bulk flow velocities of each shell into a velocity vector , where runs from to . For instance, the -direction of bulk flow in shell is now at the position in the vector. We do the same thing for the covariance matrix, and therefore we turn the covariance matrix into a covariance matrix , where the part contains the cosmological parameters and includes measurement errors. Now the likelihood function for multiple shells becomes

 L(θ)=1(2π)32det(Σ(θ))12exp(−123M∑i,j=1uiΣ(θ)−1ijuj). (35)

We apply this likelihood function to the SFI++ and ENEAR catalogues, since they are the deeper catalogues with the broader sky-coverage. The SFI++ catalogue has a mean distance of and extends out to , whereas the ENEAR catalogue has a mean distance and goes out to . We first trim both data sets out to , which leaves (SFI++) and (ENEAR) samples. Then we calculate the weighting functions and bulk flow velocities for different shells of distances , each with separation. The reason we use the bulk flows only on shells with distances greater than is that we would like to avoid non-linear structures on small scales. In addition, more distant objects are not very well sampled and therefore are very sparse, so we restrict our bulk flows to within the shell of . During the process of computation, we stick to the same conventions as listed in Section 3.3. Then we calculate the covariance matrix (Eq. (31)) and the likelihood function (Eq. (35)) for the shells, and we obtain the marginalised distribution of and , as shown in Figs. 5a and 5b. The joint distribution of is shown in Fig. 5c.

From Fig. 5a and Fig. 5c, one can see that the constraint on becomes tighter than the previous single bulk flow constraint and its contour is now closed. Therefore, by just using bulk flow data, one can obtain an independent constraint on the cosmological parameters. The best-fit value of WMAP 7-year results, as well as the constraints obtained from Nusser & Davis (2011) are all well within the confidence region of the parameter space. The reason that the likelihood function for multiple shells can give a reasonably good constraint on , while the bulk flow at does not, is that the the dependence of on is a function of scale, and therefore by incorporating multiple shells, one can gain more information on perturbations at different depths.

We would also like to point out that since the current peculiar velocity data are no deeper than , and the data beyond are very noisy and sparse, the shell bulk flows at distances of to are really the maximal information we can obtain with these catalogues. We have carefully checked that, for the data within , splitting into more shells of bulk flows does not improve the constraints, since these shells are highly correlated and we already have enough shells to effectively capture the scale dependence.

In addition, we should note that there is another statistical approach, the multiple moment method (Jaffe & Kaiser, 1995; Feldman et al., 2010; Macaulay et al., 2011, 2012), which has been proposed to reconstruct the bulk flow, shear, and octupole moments of perturbations. This can be considered as an alternative method to our multishell likelihood approach. The multiple moment method used in Feldman et al. (2010) and Macaulay et al. (2011) considers perturbations only on scales, but includes all moments, and they find that there is excessive power for the bulk flow, but not for the other moments. In contrast, our multishell likelihood function focuses just on the bulk flow, and it reconstructs this for shells of different distance, quantifying the full covariance matrix by calculating the correlations between shells. Our multishell likelihood shows that the bulk flow is not excessive compared with CDM predictions, and that one can obtain reliable constraints on cosmological parameters by applying the method to various peculiar velocity catalogues.

We list the numerical results of our cosmological parameter constraints in Table 3.

## 5 Discussion and Conclusions

In this paper, we have been investigating bulk flow measurements using various catalogues. We find results which are different to those given by Watkins et al. (2009), who claimed evidence for a surprisingly large bulk flow on scales, apparently discrepant with the CDM prediction. In contrast, by carefully considering four selected catalogues, we find a coherent flow of about on a scale of , entirely consistent with the value expected given the WMAP 7-year cosmological parameters.

By employing the same weighting scheme and the same conventions, we are able to accurately reproduce the results in Watkins et al. (2009), as shown in Fig. 2. Since we focus on the SN, SFI++ and ENEAR catalogues, we removed the other sub-catalogues from the COMPOSITE catalogue, and found a slightly lower value of (red line in Fig. 2b), but still higher than the WMAP constraint. This indicates that the high value of inferred from the COMPOSITE catalogue is not completely driven by the five deep and sparse catalogues included (SMAC, SBF, SC, EFAR and Willick).

To summarise the various other issues which could be responsible for the discrepancy, in Table 4 we list several technical points which lead to quantitatively different results.

The first issue is the assumption of small-scale velocity dispersion, which goes into the calculation of the covariance matrix (Eq. (21)). Watkins et al. (2009) assumed a value of , which is too small compared to the constraint obtained by Ma et al. (2011, 2012a), which was closer to the we chose here. Besides this, Watkins et al. (2009) used an inaccurate approximation for the matter power spectrum. From Fig. 6, one can see that although this is a small effect, it has the same sign, yielding smaller flows. Thus, by fitting to the observed flows, this tends to further increase the normalisation parameter .

The second major difference lies in the inhomogeneous Malmquist bias correction. In Watkins et al. (2009), only the SFI++ and SMAC catalogues were corrected for this effect. In our approach, we used the full-sky density field from the PSCz catalogue to extrapolate the density at any spatial position, and calculate the probability of the true distance given the measured distance (Eq. (3)). The comparison between the measured distance/velocity and true distance/velocity in Fig. 1, shows that the bias tends to move galaxies to smaller distances.

Another difference is that we only keep the high quality samples SN, SFI++ and ENEAR from the Watkins et al. (2009) compilation, and we further include the recent compilation of supernovae data, i.e. the A1SN catalogue. To remove any possible bias from the distant and sparsely sampled region, we restricted our attention to , and to avoid the results being driven by outliers, we also limited our samples to .

Furthermore, rather than using the COMPOSITE catalogue, we combined individual sample likelihoods using the Bayesian hyper-parameter technique. This should avoid the possibility that inconsistent data sets may bias the result if they are assigned equal weight. From the hyper-parameter likelihood, we find the best-fit value . This is somewhat low and hence inconsistent with a large bulk flow. However, the uncertainty is so large that this result is still consistent with standard CDM expectations.

Finally, we proposed a multishell likelihood method, which calculates the bulk flows in all shells within a certain radius together with their covariance matrix. This multishell likelihood takes into account the scale-dependence of the matter power spectrum on the parameter, and therefore maximises the constraining power one can obtain from a data set. By applying this likelihood to the SFI++ and ENEAR catalogues, we showed that they can provide much stronger constraints on and than the single shell () constraint. Our result also shows consistency with WMAP 7-year best-fits and results from Nusser & Davis (2011).

We conclude that the apparently large bulk flow on scales found by Watkins et al. (2009) may not be a genuine flow. By correcting for Malmquist bias, carefully selecting samples and examining assumptions, one finds that the current peculiar velocity field catalogues are consistent with the CDM model. On the other hand, any claimed discrepancy is not due to the ‘minimal variance’ scheme proposed by Watkins et al. (2009) and Feldman et al. (2010), since in our tests, we have shown that this scheme gives consistent results. In addition, our conclusions also agree with several other independent searches for bulk flows, such as the ASCE method with the SFI++ catalogue (Nusser & Davis, 2011), the minimal variance method with the Type-Ia SN data (Turnbull et al., 2012), and the luminosity function method with the 2MRS samples (Branchini, Davis, & Nusser, 2012). It should also be pointed out that the lack of evidence for a bulk flow on removes some of the support for an excessive flow on even deeper scales (Kashlinsky et al., 2008).

It seems clear that, despite extensive effort for decades, peculiar velocity catalogues remain systematics dominated. By applying different, but apparently reasonable, assumptions and statistical approaches, it is possible to find quite discrepant results using essentially the same data sets. This means that the realistic error bars are probably larger than given in many of the published studies. In addition, one should notice that there are many other methods developed to compute bulk flows that do not rely on distance indicators, such as luminosity fluctuations and fluctuations in the galaxy number density (Branchini, Davis, & Nusser, 2012), as well as the use of the kinetic Sunyaev-Zeldovich effect (e.g. Osborne et al. 2011). Although they also suffer from systematic effects, these will be of a different nature and therefore such approaches can be regarded as complementary to the method discussed here. Large-scale bulk flows still offer promise for constraining cosmological models, but fully realising that promise will require further improvements in the construction of catalogues, and in the control of the systematic effects which continue to plague this field.

Acknowledgments: We would like to thank Michael Hudson and Stephen Turnbull for sharing with us the First Amendment Supernovae compilation, and Enzo Branchini and George Efstathiou for helpful discussions. This research was supported by the Natural Sciences and Engineering Research Council of Canada. YZM is supported by a CITA National Fellowship.

## Appendix A Power spectrum Semi-analytic formula analysis

To sample the parameter space, we can apply a formula to generate the matter power spectrum . We use the following semi-analytic equation as presented in Eisenstein & Hu (1998):

 P(k)=σ28knsT(k)2, (36)

where

 T(k)=L0L0+C0(k,Γ)(k/Γ)2, L0=ln(2e+1.8(k/Γ)), C0(k,Γ)=14.2+7311+62.5(k/Γ). (37)

The quantity here is called the power spectrum ‘shape parameter’. Watkins et al. (2009) and Feldman et al. (2010) used as an approximation on large scales. However, in Fig. 6, we can see that this approximation () still has relatively large deviations from the numerical result from camb.

A more accurate parameterisation of the shape parameter is , as advocated in Eisenstein & Hu (1998). This is much closer to the numerical , due to the additional exponential factor.

In order to demonstrate the successful reproduction of results in Watkins et al. (2009), we use the approximation in Section 3. However, since in our subsequent analysis, we need to carefully compare the numerical value of the reconstructed bulk flow moment with the expectation based on cosmological parameters, we switch to the numerical result of the from CAMB (Lewis, Challinor, & Lasenby, 2000) in our determination of the bulk flow moments in each individual catalogue and subsequent hyper-parameter analysis.

### Footnotes

1. The SFI++ catalogue (Springob et al., 2007) was already corrected for Malmquist bias.
2. Turnbull et al. (2012) used a thermal noise .

### References

1. Abate A., Erdogdu P., 2009, MNRAS, 400, 1541
2. Bernardi M. et al., 2002, AJ, 123, 2990
3. Bouchet F.R., Lachièze-Rey M., eds., 1993, ‘Cosmic Velocity Fields’, Editions Frontieres, Gif-sur-Yvette
4. Branchini E., Davis M., Nusser A., 2012, MNRAS, 424, 472
5. Burstein D., 1990, Rep. Prog. Phys., 53, 421
6. Courteau S., Faber S. M., Dressler A., Willick J. A., 1993, ApJ, 412, 51
7. Coles P., Lucchin F., Cosmology– The Origin and Evolution of Cosmic Structure. John Wiley & Sons, LTD, 2002
8. Colless M. et al., 2001, MNRAS, 321, 277
9. Courteau S., Willick J., eds., ‘Cosmic Flows Workshop’, 2000, ASP Conf. Ser., Vol. 201, Astronomical Society of the Pacific, San Francisco
10. da Costa L.N. et al., 2000, AJ, 120, 95
11. Dale D.A. et al., 1999, AJ, 118, 1489
12. Eisenstein D, Hu W., 1998, ApJ, 496, 605
13. Feldman H., Watkins R., Hudson M.J., 2010, MNRAS, 407, 2328
14. Folatelli G. et al., 2010, AJ, 139, 120
15. Giovanelli R. et al., 1998, AJ, 116, 2632.
16. Gull S.F., 1989, in ‘Maximum Entropy and Bayesian Methods’, ed. J. Skilling, Kluwer Academic Publishers, Dordrecht, p. 53
17. Hendry M.A., Simmons J.F.L., Newsam A.M., 1993, in ‘Cosmic Velocity Fields’, ed. M. Lachièze-Rey, F. Bouchet, Editions Frontieres, Gif-sur-Yvette, p. 23
18. Hicken M. et al., 2009, ApJ, 700, 1097
19. Hobson M.P., Bridle S.L., Lahav O., 2002, MNRAS, 335, 377
20. Hudson M.J., 1994, MNRAS, 266, 468
21. Hudson M.J., 1999, PASP, 111, 57
22. Hudson M.J., Smith R.J., Lucey J.R., Branchini E., 2004, MNRAS, 352, 61
23. Jaffe A.H., & Kaiser N., 1995, ApJ, 455, 26
24. Jha S., Riess A.G., Kirshner R.P., 2007, ApJ, 659, 122
25. Juszkiewicz R., Vittorio N., Wyse R.F.G, 1990, ApJ, 349, 408
26. Kashlinsky A., Atrio-Barandela F., Kocevski D., Ebeling, H., 2008, ApJ, 686, 49
27. Kitaura F.-S. et al., 2012, 1205.5560 [arXiv:astro-ph]
28. Komatsu E. et al., 2009, ApJS, 180, 330
29. Komatsu E. et al., 2011, ApJS, 192, 18
30. Lahav O. et al., 2000, MNRAS, 315, 45
31. Latham D.W., da Costa L.A.N., eds., ‘Large Scale Structures and Peculiar Motions in the Universe’, ASP Conf. Ser., Vol. 15, astronomical Society of the Pacific, San Francisco
32. Lewis A., Challinor A., Lasenby A., 2000, ApJ, 538, 473
33. Lynden-Bell D., Faber S.M., Burstein D., Davis R.L., Dressler A., Terlevich R.J., Wegner G., 1988, ApJ, 326, 19
34. Ma Y.Z., Gordon C., Feldman H., 2011, Phys. Rev., D83, 103002
35. Ma Y.Z., Ostriker J., Zhao G.B., 2012a, JCAP, 06, 026
36. Ma Y.Z., Branchini E., Scott D., 2012b, MNRAS, 425, 2880
37. Macaulay E., Feldman, H., Ferreira, P.G., Hudson, M.J., Watkins, R., 2011, MNRAS, 414, 621
38. Macaulay E. et al., 2012, MNRAS, 425, 1709
39. Malmquist K., 1920, Medd. Lund. Astron. Obs., 22, 1
40. Nusser A., Davis M., 2011, ApJ, 736, 93
41. Osborne S.J., Mak D.S.Y., Church S.E., Pierpaoli, E., 2011, ApJ, 737, 98
42. Peebles P.J.E., Principles of Physical Cosmology. Princeton University Press, 1993
43. Press W.H., 1997, in ‘Unsolved Problems in Astrophysics’, ed. Bahcall J.N., Ostriker J.P., Princeton University Press, p. 49
44. Sarkar D., Feldman H.A., Watkins R., 2007, MNRAS, 375, 691
45. Springob C.M. et al., 2007, ApJS, 172, 599
46. Stompor R., Leach S., Stivoli F., Baccigalupi C., 2009, MNRAS, 392, 216
47. Strauss M.A., Willick J.A., 1995, Phys. Rep., 261, 271
48. Tonry J.L. et al., 2001, ApJ, 546 681
49. Tonry J.L. et al., 2003, ApJ, 594, 1
50. Turnbull S.J., Hudson M.J., Feldman H.A., Hicken M., Kirshner R.P., Watkins R., 2012, MNRAS, 420, 447
51. Watkins R., Feldman H.A., Hudson M.J., 2009, MNRAS, 392, 743
52. Wenger G. et al., 2003, AJ, 126, 2268
53. Willick J.A., 1999, ApJ, 522, 647
You are adding the first comment!
How to quickly get a good reply:
• Give credit where it’s due by listing out the positive aspects of a paper before getting into which changes should be made.
• Be specific in your critique, and provide supporting evidence with appropriate references to substantiate general statements.
• Your comment should inspire ideas to flow and help the author improves the paper.

The better we are at sharing our knowledge with each other, the faster we move forward.
The feedback must be of minimum 40 characters and the title a minimum of 5 characters