Three-Point Correlation Functions of SDSS Galaxies: Constraining Galaxy-Mass Bias

Cameron K. McBride (1, 2), Andrew J. Connolly (3), Jeffrey P. Gardner (4), Ryan Scranton (5), Román Scoccimarro (6), Andreas A. Berlind (2), Felipe Marín (7, 8), Donald P. Schneider (9)

(1) Department of Physics and Astronomy, University of Pittsburgh, Pittsburgh, PA 15260
(2) Department of Physics and Astronomy, Vanderbilt University, Nashville, TN 37235
(3) Department of Astronomy, University of Washington, Seattle, WA 98195-1580
(4) Department of Physics, University of Washington, Seattle, WA 98195-1560
(5) Department of Physics, University of California, Davis, CA 95616
(6) Center for Cosmology and Particle Physics, New York University, New York, NY 10003, USA
(7) Department of Astronomy & Astrophysics, Kavli Institute for Cosmological Physics, The University of Chicago, Chicago, IL 60637 USA
(8) Centre for Astrophysics & Supercomputing, Swinburne University of Technology, Hawthorn, VIC 3122, Australia
(9) Department of Astronomy & Astrophysics, Pennsylvania State University, University Park, PA 16802

We constrain the linear and quadratic bias parameters from the configuration dependence of the three-point correlation function (3PCF) in both redshift and projected space, utilizing measurements of spectroscopic galaxies in the Sloan Digital Sky Survey (SDSS) Main Galaxy Sample. We show that bright galaxies () are biased tracers of mass, measured at a significance of in redshift space and in projected space by using a thorough error analysis in the quasi-linear regime (). Measurements on a fainter galaxy sample are consistent with an unbiased model. We demonstrate that a linear bias model appears sufficient to explain the galaxy-mass bias of our samples, although a model using both linear and quadratic terms results in a better fit. In contrast, the bias values obtained from the linear model appear in better agreement with the data by inspection of the relative bias, and yield implied values of that are more consistent with current constraints. We investigate the covariance of the 3PCF, which itself is a measurement of galaxy clustering. We assess the accuracy of our error estimates by comparing results from mock galaxy catalogs to jackknife re-sampling methods. We identify significant differences in the structure of the covariance. However, the impact of these discrepancies appears to be mitigated by an eigenmode analysis that can account for the noisy, unresolved modes. Our results demonstrate that using this technique is sufficient to remove potential systematics even when using less-than-ideal methods to estimate errors.

Subject headings:
large-scale structure of universe – galaxies: statistics – cosmology: observations

1. Introduction

Studying the statistical properties of the galaxy distribution allows one to probe the structure of overdense regions today, learning about galaxy formation and cosmology. We observe significant clumping in this large-scale structure (LSS), which is commonly characterized by a series of $n$-point correlation functions (reviewed in Peebles 1980). Observational evidence is in line with predictions of a dark-energy dominated cold dark matter ($\Lambda$CDM) model (Komatsu et al. 2009; Sánchez et al. 2009; Reid et al. 2010). However, there is a large conceptual hurdle between following the evolution of mass densities in gravitational collapse (e.g. Bernardeau et al. 2002) and that realized by galaxy positions. A priori, there is little reason to believe a one-to-one correspondence exists between mass overdensities and galaxy positions; complex galaxy formation processes such as merging and feedback should have significant contributions. For example, recent results from the Sloan Digital Sky Survey (SDSS; York et al. 2000) in Zehavi et al. (2005, 2010) show that clustering varies with galaxy luminosity and color. This discrepancy between the observed “light” in galaxies relative to the predicted “mass” clustering is often described as galaxy-mass bias.

The parameterization of galaxy-mass bias enables a two-pronged approach to probe both cosmology and galaxy formation. On one side, we map the clustering of galaxies to that of the underlying mass distribution, allowing us to understand and constrain cosmology. Alternatively, the parameterization of the bias itself encodes useful information concerning galaxy formation processes. This approach distills observational data from hundreds of thousands of galaxies available in modern surveys, such as the two-degree field galaxy redshift survey (2dFGRS; Colless et al. 2001) and the SDSS, into a significantly smaller and more manageable form.

Most observational evidence exploits the two-point correlation function (2PCF), the first in the series of $n$-point functions (or equivalently, the power spectrum in Fourier space). However, the 2PCF represents only a portion of the available information. Measurements of higher order moments, such as the three-point correlation function (3PCF), allow a more complete picture of the galaxy distribution. The statistical strength of higher order information might rival that of two-point statistics (Sefusatti & Scoccimarro 2005), as well as break model degeneracies describing cosmology and galaxy bias (Zheng & Weinberg 2007; Kulkarni et al. 2007).

Previous analyses have estimated the 3PCF from modern galaxy redshift surveys, including work on the 2dFGRS (Jing & Börner 2004; Wang et al. 2004; Gaztañaga et al. 2005) and results from SDSS data (Kayo et al. 2004; Nichol et al. 2006; Kulkarni et al. 2007; Gaztañaga et al. 2009; Marín 2010). Related higher order statistics have also been measured for these datasets (Verde et al. 2002; Pan & Szapudi 2005; Hikage et al. 2005; Nishimichi et al. 2007).

This work is the second of two papers analyzing the reduced 3PCF on SDSS galaxy samples. The first paper (McBride et al. 2010) focused on the details of the measurements we analyze here, as well as clustering differences due to galaxy luminosity and color. This paper utilizes the configuration dependence to constrain non-linear galaxy-mass bias parameters in the local bias model (Fry & Gaztanaga 1993), and examines the properties of the errors necessary for quantitative analyses.

The local bias model is a simple approach to characterize galaxy-mass bias. Alternative descriptions exist based on the halo model (reviewed in Cooray & Sheth 2002), which form phenomenological models with a wider range of parameters. Two well used formulations include the halo occupation distribution (HOD; Berlind & Weinberg 2002) and the conditional luminosity function (CLF; Yang et al. 2003; van den Bosch et al. 2003). There are halo model formulations for the 3PCF; however, the accuracy of the model predictions is not as well determined as for the 2PCF when compared with data (see Takada & Jain 2003; Wang et al. 2004; Fosalba et al. 2005). A significant advantage of HOD modeling is the ability to use well determined measurements of the small scales for constraints (the non-linear regime in gravitational perturbation theory). Understanding the projected 3PCF, $Q_{proj}$, a major component of this work, provides a critical link to obtain reliable measurements at these smaller scales from observational galaxy samples.

However, by using this simple prescription for galaxy-mass bias, we investigate the effects of binning and covariance resolution in a quantitative analysis with a clear and simple model where the implications for bias and cosmology are better studied for higher order moments. An important part of our analysis is comparing results from the projected 3PCF with the more commonly used redshift space measurements.

This paper is organized as follows. We discuss the SDSS data, simulations, and mock galaxy catalogs in §2. We review the theory and methods of our analysis in §3. We constrain the non-linear galaxy mass bias parameters in §4. In §5, we investigate clustering properties contained in the eigenvectors of the 3PCF covariance matrix. We perform a detailed examination of the quality of error estimation in §6. We discuss our results and compare to related analyses in §7. Finally, we review our main conclusions in §8. Unless otherwise specified, we assume a flat cosmology where , , and , used to convert redshift to physical distances.

2. Data

2.1. SDSS Galaxy Samples

Specifics of SDSS galaxy samples
Sample | Absolute Magnitude | Redshift | Volume | Number of Galaxies | Density
Table 1 The magnitude range, redshift limits, volume, total number of galaxies, and completeness corrected number density are shown for the galaxy samples constructed from the SDSS DR6 spectroscopic catalog. We selected these samples by cuts in redshift and corrected (K-correction and passive evolution) absolute $r$-band magnitude to create volume-limited selections. See details in McBride et al. (2010).

The SDSS has revolutionized many fields in astronomy, obtaining images and spectra covering nearly a quarter of the sky by utilizing a dedicated 2.5 meter telescope at Apache Point Observatory in New Mexico (Gunn et al. 1998, 2006; York et al. 2000; Stoughton et al. 2002).

Our galaxy samples and details of the measurements are fully described in a companion paper (McBride et al. 2010). Briefly, we use galaxy data with spectroscopically determined redshifts, defined as the Main galaxy sample (Strauss et al. 2002). We conduct our analysis of clustering measurements using galaxies from DR6 (Adelman-McCarthy et al. 2008), and define samples from a refined parent catalog: the New York University Value-Added Galaxy Catalog (NYU-VAGC; Blanton et al. 2005). We analyze two samples: a BRIGHT sample where and LSTAR with . We do not analyze the FAINT sample presented in the companion paper (McBride et al. 2010), as the errors suffer from small volume effects. We tabulate properties, such as the redshift range, number of objects, volume and completeness corrected number density in Table 1.

Our absolute $r$-band magnitudes use the NYU-VAGC convention defined to represent values at (see details in Blanton et al. 2005). We note these as for simplicity, which refer to . Radial distances and absolute magnitudes are calculated using a flat $\Lambda$CDM cosmology with and .

2.2. Hubble Volume Simulation

To estimate the clustering of mass in the late time $\Lambda$CDM cosmology, we analyze cosmological $N$-body simulations. We use the Hubble Volume (HV) simulations (Colberg et al. 2000; Evrard et al. 2002) that were completed by the Virgo Consortium. We utilize the lightcone output with $\Lambda$CDM cosmology: (, , , ), where . The HV simulation consists of particles in a box of volume, resulting in a particle mass of . The particles start from an initial redshift of , and are evolved to the current time using a Plummer softened gravitational potential with a softening length of .

We use the same simulation output as presented in our companion paper (McBride et al. 2010). Here, we briefly review our postprocessing of the simulation data for completeness. We include redshift distortions in the mass field by distorting the position according to the peculiar velocity of the dark matter particle. We trim particles to match the identical volume of the corresponding SDSS samples, including the non-trivial angular geometry of SDSS data. Finally, we randomly downsample the number of dark matter particles to make the computational time of the analysis more manageable (discussed further in McBride et al. 2010).
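The redshift distortion step can be illustrated with a minimal sketch. This is not the pipeline used for the HV lightcone: it assumes an observer at the origin, low redshift, positions in $h^{-1}$ Mpc and velocities in km/s, so that the Hubble constant is 100 km/s per $h^{-1}$ Mpc. The function name `to_redshift_space` and the example values are hypothetical.

```python
import math

H0 = 100.0  # km/s per (h^-1 Mpc); assumes distances are in h^-1 Mpc


def to_redshift_space(pos, vel, h0=H0):
    """Shift a particle along the line of sight by its radial peculiar velocity.

    pos: (x, y, z) real-space position relative to the observer [h^-1 Mpc]
    vel: (vx, vy, vz) peculiar velocity [km/s]
    Low-redshift approximation: s = r + v_los / H0 along the radial direction.
    """
    r = math.sqrt(sum(x * x for x in pos))
    if r == 0.0:
        raise ValueError("particle sits on the observer; line of sight undefined")
    unit = [x / r for x in pos]
    v_los = sum(v * u for v, u in zip(vel, unit))  # radial velocity component
    s = r + v_los / h0
    return [s * u for u in unit]


# Made-up particle: only the radial 300 km/s component moves it (by 3 h^-1 Mpc).
shifted = to_redshift_space([100.0, 0.0, 0.0], [300.0, 500.0, 0.0])
```

The transverse velocity component leaves the position unchanged, which is why the distortion is confined to the line-of-sight coordinate.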

2.3. Mock Galaxy Catalogs

We analyze mock galaxy catalogs created to match some of the SDSS galaxy data. These mock catalogs were constructed from 49 independent $N$-body simulations, initiated with different random phases and evolved from a single cosmology: (). While these differ slightly from our assumed cosmology for the data, we expect the differences to be minor, with no significant implications on our analysis. Each of the 49 realizations had randomized phases where the initial conditions were generated from 2nd order Lagrangian perturbation theory (Scoccimarro 1998; Crocce et al. 2006). These simulations each consist of particles that we evolved using Gadget2 (Springel 2005) from an initial redshift of to the present epoch. The box side-length of contained enough volume to exactly match the brightest galaxy sample after applying the SDSS geometry. These simulations have been used in various other studies (e.g. Tinker et al. 2008; Manera et al. 2010).

The galaxy mocks were created by populating dark matter halos with galaxies by applying the HOD model in Tinker et al. (2005) with parameter values defined to represent and . The halos were identified using a friends-of-friends algorithm (Davis et al. 1985) applying a linking length of in units of the mean interparticle separation. The least massive halos contained particles, a minimum mass capable of hosting the faintest galaxies in the BRIGHT galaxy sample. Given the mass resolution of these simulations, less massive halos necessary to host galaxies in the LSTAR galaxy sample could not be identified. Therefore, we could only obtain reliable mock galaxy catalogs corresponding to the BRIGHT sample.

3. Theory & Methods

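The $n$-point correlation functions remain a standard description of the complexity seen in large-scale structure (LSS; Peebles 1980). In terms of the fractional overdensity ($\delta$) about the mean density ($\bar{\rho}$),

\begin{equation}
\delta(\vec{x}) = \frac{\rho(\vec{x})}{\bar{\rho}} - 1 , \tag{1}
\end{equation}

we characterize the two-point correlation function (2PCF) and three-point correlation function (3PCF) as:

\begin{equation}
\xi(r_{12}) = \langle \delta(\vec{x}_1) \, \delta(\vec{x}_2) \rangle , \tag{2}
\end{equation}
\begin{equation}
\zeta(r_{12}, r_{23}, r_{31}) = \langle \delta(\vec{x}_1) \, \delta(\vec{x}_2) \, \delta(\vec{x}_3) \rangle . \tag{3}
\end{equation}

We make the standard assumption of a homogeneous and isotropic distribution, and report clustering amplitudes dependent on the magnitude of the separation vector, e.g. $r_{12} = |\vec{x}_1 - \vec{x}_2|$.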

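Motivated by the hierarchical ansatz (Peebles 1980) and gravitational perturbation theory (Bernardeau et al. 2002) we use the reduced 3PCF:

\begin{equation}
Q(r_{12}, r_{23}, r_{31}) = \frac{\zeta(r_{12}, r_{23}, r_{31})}{\xi(r_{12})\xi(r_{23}) + \xi(r_{23})\xi(r_{31}) + \xi(r_{31})\xi(r_{12})} . \tag{4}
\end{equation}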

This “ratio statistic” remains close to unity at all scales, and to leading order is insensitive to both time evolution and cosmology (reviewed in Bernardeau et al. 2002).
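A short numerical sketch of this ratio, under stated assumptions, is given here. The amplitudes are made up for illustration and are not measurements. The second evaluation uses the leading-order perturbation theory scaling, in which the 2PCF grows as the square and the 3PCF as the fourth power of the linear growth factor, to show why the ratio does not evolve at that order.

```python
def reduced_3pcf(zeta, xi12, xi23, xi31):
    """Reduced 3PCF: zeta divided by the cyclic sum of 2PCF products."""
    denom = xi12 * xi23 + xi23 * xi31 + xi31 * xi12
    if denom == 0.0:
        raise ZeroDivisionError("hierarchical denominator vanishes")
    return zeta / denom


# Illustrative (made-up) amplitudes for one triangle.
zeta, xi12, xi23, xi31 = 0.03, 0.2, 0.1, 0.05
q_now = reduced_3pcf(zeta, xi12, xi23, xi31)

# Leading-order scaling with a hypothetical linear growth factor D:
# xi -> D^2 xi and zeta -> D^4 zeta, so D cancels in the ratio.
D = 0.8
q_earlier = reduced_3pcf(D**4 * zeta, D**2 * xi12, D**2 * xi23, D**2 * xi31)
```

Both calls return the same value, since numerator and denominator each carry four powers of the growth factor.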

Redshift distortions impact measurements of clustering by altering the line-of-sight radial distance estimate, as we are unable to distinguish the galaxy’s peculiar velocity from the Hubble flow (reviewed in Hamilton 1998). We refer to the theoretical non-distorted distances as real space, commonly denoted with $r$. Distances that include the redshift distortion (e.g. observational distances) are in redshift space, denoted with $s$. We decompose the redshift space distance into line-of-sight ($\pi$) and projected separation ($r_p$) such that $s = \sqrt{\pi^2 + r_p^2}$. With this separation, the anisotropic distortion is primarily contained in the $\pi$ coordinate.

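We minimize the impact of redshift distortions by estimating the correlation function binned in both $r_p$ and $\pi$ and integrating along the line-of-sight, resulting in the projected correlation function (Davis & Peebles 1983):

\begin{equation}
w_p(r_p) = 2 \int_0^{\pi_{max}} \xi(r_p, \pi) \, d\pi . \tag{5}
\end{equation}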

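The projected 3PCF and its reduced form have analogous definitions:

\begin{equation}
\zeta_{proj}(r_{p12}, r_{p23}, r_{p31}) = \int\!\!\int \zeta(r_{p12}, r_{p23}, r_{p31}, \pi_{12}, \pi_{23}) \, d\pi_{12} \, d\pi_{23} , \tag{6}
\end{equation}
\begin{equation}
Q_{proj}(r_{p12}, r_{p23}, r_{p31}) = \frac{\zeta_{proj}(r_{p12}, r_{p23}, r_{p31})}{w_p(r_{p12}) w_p(r_{p23}) + w_p(r_{p23}) w_p(r_{p31}) + w_p(r_{p31}) w_p(r_{p12})} . \tag{7}
\end{equation}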

The measurements we analyze set . We find this sufficiently deep to recover correlated structure to minimize redshift distortions, but not overly expensive to calculate (see detailed discussion in appendix of McBride et al. 2010).
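As a rough sketch of the line-of-sight integration, the projected correlation function for a single $r_p$ bin can be approximated by a sum over line-of-sight bins, using the Davis & Peebles (1983) form of twice the integral from zero to the maximum line-of-sight separation. The function name `projected_xi`, the bin edges, and the amplitudes below are hypothetical and chosen only to make the arithmetic easy to follow.

```python
def projected_xi(xi_of_pi, pi_edges):
    """Approximate w_p(r_p) = 2 * integral of xi(r_p, pi) d(pi) by a bin sum.

    xi_of_pi: xi(r_p, pi) measured in consecutive line-of-sight bins
    pi_edges: bin edges from 0 to pi_max, one more entry than xi_of_pi
    """
    if len(pi_edges) != len(xi_of_pi) + 1:
        raise ValueError("need exactly one more edge than bins")
    total = 0.0
    for xi, lo, hi in zip(xi_of_pi, pi_edges[:-1], pi_edges[1:]):
        total += xi * (hi - lo)
    return 2.0 * total  # factor 2: the integral is symmetric about pi = 0


# Made-up example: four unit-width bins, so pi_max = 4 in the same units.
w_p = projected_xi([0.5, 0.5, 0.5, 0.5], [0.0, 1.0, 2.0, 3.0, 4.0])
```

Truncating the sum at a finite maximum separation is the trade-off described in the text: a deeper integration recovers more of the correlated structure but costs more to compute and adds noise from weakly correlated pairs.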

The full 3PCF is a function of three variables that characterize both the size and shape of triplets. We parameterize the 3PCF by ($r_1$, $r_2$, $\theta$), where $r_1$ and $r_2$ represent two sides of a triangle (simplified notation from $r_{12}$ and $r_{23}$), and $\theta$ defines the opening angle between these sides. However, our measurements are estimated in bins defined by ($r_1$, $r_2$, $r_3$). We convert $r_3$ to $\theta$ using the cosine rule (as detailed in McBride et al. 2010). The 3PCF remains sensitive to the exact choice of binning scheme, which can mask or distort the expected signal (Gaztañaga & Scoccimarro 2005; Marín et al. 2008; McBride et al. 2010). We choose a bin-width as a fraction, $f$, of the measured scale, $r$, such that $\Delta_r = f \times r$ and a bin at $r$ represents $(r - \Delta_r/2, \; r + \Delta_r/2)$.

We always use the reduced 3PCF as a function of three variables, $Q(r_1, r_2, \theta)$, but we simplify our notation by sometimes referring to it as $Q(\theta)$ or even $Q$. If the amplitude of $Q$ varies significantly with $\theta$, we refer to this as strong configuration dependence, in contrast to little or no variation for a weak configuration dependence. We define the scale of triangles by $r_1$, and choose configurations such that $r_2 = 2 r_1$. This results in $r_3$ varying in size from $r_3 = r_1$ when $\theta = 0$ to $r_3 = 3 r_1$ when $\theta = \pi$.
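The triangle parameterization can be sketched as follows. The conversion uses the cosine rule with the opening angle measured between the two sides that share a vertex, which reproduces the limits quoted above for configurations where the second side is twice the first. The fractional bin definition (width equal to a fraction of the scale, centered on it) is written here as an assumption, and all function names and numbers are illustrative.

```python
import math


def third_side(r1, r2, theta):
    """Cosine rule: length of the side opposite the opening angle theta."""
    return math.sqrt(r1 * r1 + r2 * r2 - 2.0 * r1 * r2 * math.cos(theta))


def opening_angle(r1, r2, r3):
    """Inverse cosine rule: recover theta from the three side lengths."""
    cos_theta = (r1 * r1 + r2 * r2 - r3 * r3) / (2.0 * r1 * r2)
    # Clamp to guard against round-off just outside [-1, 1].
    return math.acos(max(-1.0, min(1.0, cos_theta)))


def bin_edges(r, f):
    """Assumed fractional binning: a bin of width f * r centered on r."""
    half_width = 0.5 * f * r
    return (r - half_width, r + half_width)


# Example with r2 = 2 * r1 (units arbitrary, values illustrative).
r1, r2 = 6.0, 12.0
collapsed = third_side(r1, r2, 0.0)      # theta = 0: sides overlap
elongated = third_side(r1, r2, math.pi)  # theta = pi: sides end to end
```

Because the bin width scales with the separation, a wider fractional bin averages over a larger range of triangle shapes, which is one way the binning can smooth the configuration dependence.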

3.1. Galaxy-Mass Bias

We can consider galaxies to be a biased realization of the $\Lambda$CDM mass field. In the local bias model (Fry & Gaztanaga 1993), the galaxy over-density, $\delta_g$, can be connected to the mass over-density, $\delta_m$, by a non-linear Taylor series expansion:

\begin{equation}
\delta_g = \sum_k \frac{b_k}{k!} \delta_m^k \approx b_1 \delta_m + \frac{b_2}{2} \delta_m^2 . \tag{8}
\end{equation}

This relation describes the mapping between galaxy and mass by simple scalar values, to second order: the linear ($b_1$) and quadratic ($b_2$) bias.

With measurements of galaxy $n$-point correlation functions, the clustering of galaxies is linked to mass clustering via the bias parameters. The 2PCF can be used to constrain the linear bias by equating the correlation function between galaxies, $\xi_g$, to that of dark matter, $\xi_m$, such that

\begin{equation}
\xi_g(r) = b_1^2 \, \xi_m(r) . \tag{9}
\end{equation}

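The 3PCF is the lowest order correlation function that shows leading order sensitivity to the quadratic bias term. The analog to (9) for the connected 3PCF is written

\begin{equation}
\zeta_g(r_{12}, r_{23}, r_{31}) = b_1^3 \, \zeta_m(r_{12}, r_{23}, r_{31}) + b_1^2 b_2 \left[ \xi_{12}\xi_{23} + \xi_{23}\xi_{31} + \xi_{31}\xi_{12} \right] , \tag{10}
\end{equation}

where $\xi_{12} = \xi_m(r_{12})$, etc. This simplifies for the reduced 3PCF where we denote the bias parameters as $B = b_1$ and $C = b_2 / b_1$:

\begin{equation}
Q_g(r_1, r_2, \theta) = \frac{1}{B} \left[ Q_m(r_1, r_2, \theta) + C \right] . \tag{11}
\end{equation}

We have changed notation slightly in (11), replacing $r_3$ with $\theta$, the opening angle between the two sides $r_1$ and $r_2$, as we discussed above.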
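For completeness, the step from the connected 3PCF to the reduced 3PCF can be written out. The sketch below assumes the leading-order local bias expressions, namely a galaxy 2PCF equal to $b_1^2$ times the mass 2PCF and a galaxy 3PCF equal to $b_1^3$ times the mass 3PCF plus $b_1^2 b_2$ times the cyclic sum of 2PCF products. The shorthand $S_m$ for that cyclic sum is introduced here only for this derivation.

```latex
\begin{align}
S_m &\equiv \xi_m(r_{12})\xi_m(r_{23}) + \xi_m(r_{23})\xi_m(r_{31}) + \xi_m(r_{31})\xi_m(r_{12}) , \\
Q_g &= \frac{\zeta_g}{b_1^4 \, S_m}
     = \frac{b_1^3 \, \zeta_m + b_1^2 b_2 \, S_m}{b_1^4 \, S_m} \\
    &= \frac{1}{b_1} \left[ \frac{\zeta_m}{S_m} + \frac{b_2}{b_1} \right]
     = \frac{1}{b_1} \left[ Q_m + \frac{b_2}{b_1} \right] .
\end{align}
```

Reading off the two combinations gives a multiplicative parameter equal to $b_1$ and an additive parameter equal to $b_2 / b_1$, so a measurement of the reduced 3PCF constrains the quadratic bias only through its ratio to the linear bias.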

A multiplicative factor such as $B$ can dampen ($B > 1$) or enhance ($B < 1$) the configuration dependence of $Q$ as seen from the galaxy distribution, whereas the value of $C$ will produce an offset. We see that $B$ and $C$ are partially degenerate in this model. If $Q$ shows no configuration dependence, two parameters are used to describe a shift in amplitude. However, this degeneracy can be removed when the 3PCF exhibits a shape dependence (Fry 1994). Even with the degeneracy broken, the values of $B$ and $C$ could show a strong correlation.
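The roles of the two parameters can be illustrated with a small sketch. It assumes the bias relation takes the form of the mass measurement plus an offset, all divided by the multiplicative parameter, which matches the behavior described above. The mass values are made up and the helper names are hypothetical.

```python
def biased_q(q_mass, B, C):
    """Scale a mass reduced 3PCF to a galaxy model: (Q_m + C) / B per bin."""
    return [(q + C) / B for q in q_mass]


def amplitude(q):
    """Peak-to-trough variation, a crude measure of configuration dependence."""
    return max(q) - min(q)


# Made-up mass measurement with a clear configuration dependence.
q_mass = [1.2, 0.8, 0.5, 0.7, 1.1]

damped = biased_q(q_mass, B=2.0, C=0.0)   # B > 1 halves the variation
shifted = biased_q(q_mass, B=1.0, C=0.3)  # C only offsets every bin

# With a flat Q the two parameters cannot be separated:
flat_a = biased_q([1.0, 1.0, 1.0], B=2.0, C=1.0)
flat_b = biased_q([1.0, 1.0, 1.0], B=1.0, C=0.0)
```

The last two calls give identical models, which is the degeneracy for a shape-independent signal. Only the variation with opening angle allows the multiplicative and additive parameters to be constrained separately.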

3.2. Estimating the Covariance Matrix

We measure the correlation between measurements by empirically calculating the covariance matrix. Given a number of realizations, $N$, a fractional error on $Q$ can be written as

\begin{equation}
\Delta_i^k = \frac{Q_i^k - \bar{Q}_i}{\sigma_i} , \tag{12}
\end{equation}

for each realization ($k$) and bin ($i$) given a mean value ($\bar{Q}_i$) and variance ($\sigma_i^2$) for each bin over all realizations. We use $Q$ as a general placeholder for any measured statistic (2PCF, 3PCF, etc).

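We construct the normalized covariance matrix using the standard unbiased estimator:

\begin{equation}
\mathcal{C}_{ij} = \frac{1}{N - 1} \sum_{k=1}^{N} \Delta_i^k \, \Delta_j^k . \tag{13}
\end{equation}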

Equation 13 assumes that each realization is independent. In practice, a number of mock galaxy catalogs can be used to make this a tractable approach. If mock catalogs appropriate to the galaxy sample are not available, a covariance matrix can be estimated from the data itself, such as with the commonly employed jackknife re-sampling (Lupton et al. 2001). Since jackknife samples are not independent realizations, we compute the covariance by:

\begin{equation}
\mathcal{C}_{ij}^{(jack)} = \frac{(N - 1)^2}{N} \, \mathcal{C}_{ij} , \tag{14}
\end{equation}

where $\mathcal{C}_{ij}$ denotes the typical unbiased estimator of the covariance when computed on jackknife samples.
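A minimal sketch of the covariance estimate, using NumPy, is shown below. It is not the code used for the measurements. It assumes the standard jackknife rescaling of the unbiased estimator by $(N-1)^2/N$, applies it to the raw covariance, and only then normalizes so that the diagonal is unity. The function name `covariance` and the random test data are hypothetical.

```python
import numpy as np


def covariance(samples, jackknife=False):
    """Return (sigma, normalized covariance) from an (N, n_bins) array.

    Each row is one realization (a mock catalog or a jackknife sample).
    """
    q = np.asarray(samples, dtype=float)
    n = q.shape[0]
    resid = q - q.mean(axis=0)
    cov = resid.T @ resid / (n - 1)       # standard unbiased estimator
    if jackknife:
        # Assumed jackknife rescaling: samples overlap, so the scatter
        # between them underestimates the true variance.
        cov = cov * (n - 1) ** 2 / n
    sigma = np.sqrt(np.diag(cov))
    return sigma, cov / np.outer(sigma, sigma)


# Made-up data: 30 realizations of a 4-bin measurement.
rng = np.random.default_rng(0)
samples = rng.normal(size=(30, 4))
sigma, corr = covariance(samples)
sigma_jk, corr_jk = covariance(samples, jackknife=True)
```

In this sketch the jackknife factor changes the absolute errors but cancels in the normalized matrix, so the correlation structure passed to an eigenmode analysis is the same in both cases.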

3.3. Eigenmode Analysis

We constrain galaxy-mass bias parameters using the information in the full covariance matrix. We utilize an eigenmode analysis (Scoccimarro 2000), an equivalent method to a principal component analysis (PCA) on the measurement covariance matrix. This method was tested in detail for the galaxy-mass bias of the 3PCF using simulated data in Gaztañaga & Scoccimarro (2005).

The basic idea is to isolate the primary contributing eigenmodes of the reduced 3PCF based on the structure of the normalized covariance matrix. This allows one to trim unresolved modes and perform a fit in a basis which minimizes the non-Gaussianity of the residuals. To summarize, the covariance matrix can be cast in terms of a singular value decomposition (SVD),

\begin{equation}
\mathcal{C} = U \, \Sigma \, V^T , \qquad \Sigma_{ij} = \lambda_i^2 \, \delta_{ij} , \tag{15}
\end{equation}

where $\delta_{ij}$ is the Kronecker delta function making $\Sigma$ a diagonal matrix containing the singular values, $\lambda_i^2$. The matrices $U$ and $V$ are orthogonal rotations to diagonalize the covariance into $\Sigma$ where $V^T$ denotes the transpose of $V$.

Applying the SVD to the covariance matrix yields a rotation into a basis where the eigenmodes are uncorrelated (i.e. the covariance matrix becomes diagonal). The resulting rotation matrix can be directly applied to our signal forming the $Q$-eigenmodes,

\begin{equation}
\hat{Q}_i = \sum_j U_{ji} \, \frac{Q_j}{\sigma_j} . \tag{16}
\end{equation}

The singular values provide a weight on the importance of each eigenvector. Specifically, a multiplicative factor of $1/\lambda_i^2$ is applied when $\mathcal{C}$ is inverted. With this feature in mind, we define the signal-to-noise ratio as

\begin{equation}
\left( \frac{S}{N} \right)_i = \left| \frac{\hat{Q}_i}{\lambda_i} \right| . \tag{17}
\end{equation}

We note that this estimate is a lower bound on the true $S/N$ due to the SVD. To remove noise and avoid numerical instabilities, we trim eigenmodes corresponding to low singular values. Gaztañaga & Scoccimarro (2005) suggest keeping eigenmodes resolved better than the sampling error in the covariance matrix. Since our covariance matrices are normalized (i.e. the diagonal elements are set to one), the singular values are directly related to sampling error, and we require the so-called “dominant modes” (Gaztañaga & Scoccimarro 2005) to satisfy:

\begin{equation}
\lambda_i^2 > \sqrt{2 / N} , \tag{18}
\end{equation}

where $N$ refers to the number of samples used to estimate the covariance matrix.

The advantage of using this eigenmode analysis for fitting is threefold. First, it correctly incorporates the correlation between measurement bins. Second, by performing the fit in the rotated basis of the eigenmodes, the residuals of the fit are more Gaussian and the degrees of freedom are properly addressed (e.g. a fit using 3 eigenmodes really only fits over 3 numbers). Finally, using only dominant modes removes artifacts due to noise in the estimated covariance matrices. For example, when using the full covariance but not trimming any modes, noise can cause a fit to converge on incorrect values with artificially small errors (and falsely high $S/N$). This effect becomes worse as the covariance becomes less resolved. Conversely, fitting over dominant eigenmodes helps to eliminate any problems from unresolved parts of the error estimation (Gaztañaga & Scoccimarro 2005, Figure 13), and has the benefit of dealing with singular covariance matrices.
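The eigenmode procedure can be sketched with NumPy as below. This is an illustration under stated assumptions rather than the implementation used for the results: the singular values of the normalized covariance are treated as the squared mode weights, the signal-to-noise of a mode is taken as the rotated signal divided by the square root of its singular value, and the dominant-mode cut is assumed to be a singular value larger than the square root of 2 over the number of samples (our reading of the Gaztañaga & Scoccimarro 2005 criterion). The function name and the toy covariance are hypothetical.

```python
import numpy as np


def dominant_eigenmodes(corr, q_over_sigma, n_samples):
    """Rotate a normalized signal into the eigenmode basis and trim noisy modes.

    corr:         normalized covariance matrix (unit diagonal)
    q_over_sigma: measured Q divided by its 1-sigma error, per bin
    n_samples:    number of samples used to estimate the covariance
    """
    u, sv, _ = np.linalg.svd(corr)           # corr = U diag(sv) V^T
    q_hat = u.T @ q_over_sigma               # signal in the eigenmode basis
    snr = np.abs(q_hat) / np.sqrt(sv)        # per-mode signal-to-noise
    keep = sv > np.sqrt(2.0 / n_samples)     # assumed dominant-mode criterion
    return q_hat[keep], sv[keep], u[:, keep], snr[keep]


# Toy example: two strongly correlated bins (singular values 1.9 and 0.1).
corr = np.array([[1.0, 0.9], [0.9, 1.0]])
q = np.array([1.0, 1.0])
q_hat, sv, modes, snr = dominant_eigenmodes(corr, q, n_samples=50)
```

With 50 samples the cut sits at 0.2, so only the mode with singular value 1.9 survives. With a better resolved covariance (more samples) the threshold drops and the second mode would be retained as well.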

4. Galaxy-Mass Bias

Figure 1.— Constraints on the galaxy-mass bias parameters using the galaxy sample and the HV simulation for mass estimates. The left column corresponds to fits using (redshift space) with the right column fit using (projected space). The top and bottom panels represent individual fits with triangles of and as indicated. The middle panels are a joint fit using both triangles. There are two points of comparison marked: an unbiased result with and only non-linear bias . The contours denote the and levels from the distribution of two parameters.
Figure 2.— Analogous to Figure 1, but for galaxies.
Figure 3.— The reduced 3PCF for the sample showing the mass scaled to the “best fit” galaxy-mass bias parameters. The top two panels correspond to redshift space, and the bottom two to projected space. From left to right, the scale of the triangle increases as noted. The red (dashed) line represents an individual fit only to that triangle scale, and the blue (dotted) line shows a joint fit between both scales.
Figure 4.— Like Figure 3 but for the sample. The reduced 3PCF showing the mass scaled to the “best fit” galaxy-mass bias parameters. The top two panels correspond to redshift space, and the bottom two to projected space. From left to right, the scale of the triangle increases as noted. The red (dashed) line represents an individual fit only to that triangle scale, and the blue (dotted) line shows a joint fit between both scales.

We want to constrain the galaxy-mass bias described by (8) using the full configuration dependence of the reduced 3PCF in the quasi-linear regime. For the galaxy data, we use measurements in both redshift space, $Q_z(\theta)$, and projected space, $Q_{proj}(\theta)$, as presented in McBride et al. (2010). We estimate the bias parameters by comparing to mass estimates obtained from dark matter particles in the HV simulation (see §2.2). We expect redshift distortions to affect the bias relation, which we partially neglect (Scoccimarro et al. 1999). In particular, we account for the effects of redshift distortions by applying a distortion distance to the dark matter particles based on their velocities for our mass measurement. However, this is not completely sufficient as redshift distortions alter the bias relation in (11), especially for (Scoccimarro et al. 1999, 2001). We expect that $Q_{proj}$ will be predominantly unaffected and roughly equivalent to real space measurements for this parameterization (see e.g. Zheng 2004).

We restrict our analysis to scales above , corresponding to with and with configurations having . We let vary between and using bins, as detailed in McBride et al. (2010). We investigate galaxy-mass bias in two samples where the covariance is well determined: BRIGHT () and LSTAR () as listed in Table 1. We remove the least significant eigenmodes during the fit by applying the criterion in (18). We discuss possible effects of using a different number of modes in §6. For each galaxy sample, we perform six independent fits: a series of three different scales for measurements in both redshift and projected space. We use the full configuration dependence for triangles with and , as well as a joint fit using both scales. For the joint fit, we estimate the full combined covariance matrix to correctly account for overlap and correlation and use the same eigenmode analysis. This changes the number of available modes from in the individual fits to modes for the combined joint fit. We estimate the covariance for these samples using jackknife samples, where our jackknife regions have equal unmasked area on the sky and use the full redshift distribution of the observational galaxy sample (McBride et al. 2010).

4.1. Constraining Non-Linear Local Bias

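We constrain the galaxy-mass bias using a maximum likelihood approach by calculating a simple $\chi^2$ statistic where the likelihood $\mathcal{L} \propto \exp(-\chi^2 / 2)$ and

\begin{equation}
\chi^2 = \vec{\Delta}^T \, \mathcal{C}^{-1} \, \vec{\Delta} , \qquad \Delta_i = \frac{Q_i - Q_i^{(t)}}{\sigma_i} . \tag{19}
\end{equation}

We determine the theoretical model, $Q^{(t)}$, by scaling the mass measurement from the HV simulation, $Q_m$, with bias parameters $B$ and $C$ as per (11). We evaluate $\mathcal{L}$ on a grid using the ranges: and with a step-size of . We tested for discrepancies using a factor of finer spacing between grid elements with no significant differences to the fitted results.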
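The grid evaluation can be sketched as follows, again with NumPy and under stated assumptions. The model is the mass measurement plus an offset divided by the multiplicative parameter, the residuals are rotated into the eigenmode basis of the normalized covariance, and only modes passing an assumed dominant-mode cut contribute. The grid ranges, step size, error bars, and input values below are made up, since they only serve to show the mechanics.

```python
import numpy as np


def chi2_grid(q_gal, q_mass, sigma, corr, n_samples, b_grid, c_grid):
    """Chi-square on a (B, C) grid using the dominant eigenmodes only."""
    u, sv, _ = np.linalg.svd(corr)
    keep = sv > np.sqrt(2.0 / n_samples)      # assumed dominant-mode criterion
    u, sv = u[:, keep], sv[keep]
    chi2 = np.empty((len(b_grid), len(c_grid)))
    for i, b in enumerate(b_grid):
        for j, c in enumerate(c_grid):
            model = (q_mass + c) / b          # galaxy model from mass 3PCF
            resid = u.T @ ((q_gal - model) / sigma)
            chi2[i, j] = np.sum(resid**2 / sv)
    return chi2


# Made-up inputs: a mass 3PCF and a noiseless "galaxy" 3PCF with B=1.5, C=0.3.
q_mass = np.array([1.2, 0.8, 0.5, 0.7, 1.1])
q_gal = (q_mass + 0.3) / 1.5
sigma = np.full(5, 0.1)
corr = np.eye(5)                              # uncorrelated bins for simplicity
b_grid = np.linspace(0.5, 2.5, 21)            # hypothetical range, step 0.1
c_grid = np.linspace(-1.0, 1.0, 21)

chi2 = chi2_grid(q_gal, q_mass, sigma, corr, 100, b_grid, c_grid)
i_best, j_best = np.unravel_index(np.argmin(chi2), chi2.shape)
best_b, best_c = b_grid[i_best], c_grid[j_best]
likelihood = np.exp(-0.5 * (chi2 - chi2.min()))  # relative likelihood surface
```

Because the toy mass signal varies with configuration, the minimum is unique and recovers the input parameters. Replacing the mass signal with a constant would turn the minimum into a line of equally good fits, the degeneracy discussed in §3.1.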

We first examine the BRIGHT sample (), with the likelihood space of the six 2-parameter fits displayed in Figure 1. We include contours for Gaussian $1\sigma$, $2\sigma$, and $3\sigma$ levels which identify regions of probabilities for 68.3, 95.5, and 99.7%. We calculate these from the $\Delta\chi^2$ distribution for a 2-parameter fit (i.e. two degrees-of-freedom), with corresponding values of 2.3, 6.2, and 11.8 from the best fit value. We include two reference points for comparison, the unbiased result where $B = 1$ and $C = 0$, along with a potential negative quadratic bias term accounting for the entire galaxy bias (similar comparison to Figure 5 in Gaztañaga et al. 2005).
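The contour levels quoted above can be checked with a few lines of standard-library Python. For two degrees of freedom the chi-square distribution has a closed-form tail, so the threshold enclosing a given probability is minus twice the logarithm of the excluded probability. The function name is illustrative.

```python
import math


def delta_chi2_2dof(n_sigma):
    """Delta chi-square enclosing the same probability as a Gaussian n-sigma.

    For 2 degrees of freedom the chi-square CDF is 1 - exp(-x / 2), so the
    threshold is x = -2 ln(1 - p), with 1 - p = erfc(n_sigma / sqrt(2)).
    """
    excluded = math.erfc(n_sigma / math.sqrt(2.0))
    return -2.0 * math.log(excluded)


levels = [round(delta_chi2_2dof(n), 1) for n in (1, 2, 3)]
# levels == [2.3, 6.2, 11.8]
```

The same thresholds would not apply to a one-parameter fit, where a Gaussian $n$-sigma interval corresponds to a change in chi-square of $n^2$.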

We can clearly see the degeneracy between $B$ and $C$ in Figure 1, visible as the elongated diagonal contour. Larger values of $B$ remain likely with larger values of $C$, consistent with our expectation of degeneracy by inspecting the bias relation in (11). The errors are notably larger for projected space measurements, and the overall $\chi^2_\nu$ values are lower. This results from the larger uncertainties in the projected measurements (McBride et al. 2010). Since the $r_p$ scale represents a projection that incorporates larger scales (determined by the line-of-sight integration $\pi_{max}$), projected measurements are more sensitive to the dominant uncertainty from cosmic variance that increases with scale. In all cases, the unbiased model is excluded at greater than a level. To see the success of the fit “by eye”, we plot the 3PCF for dark matter, galaxies and best fit scaled model for this sample in Figure 3. Both the “individual” fits and “combined” joint fit produce models that match the data well.

Next, we fit the galaxy-mass bias parameters using the LSTAR sample (). This sample spans a unit bin in magnitude, and consists only of galaxies fainter than the previous bright sample. The results of the fit with likelihood contours are shown in Figure 2. The uncertainties appear reduced in size – a striking difference with respect to the BRIGHT sample in Figure 1. In addition, the slope of the “line of degeneracy” between $B$ and $C$ has shifted. We reason that this is in part due to the increased statistical significance of the larger sample, as both the measurements and covariance are better resolved. Due to the higher number density of galaxies, we re-measured the 3PCF using a finer binning scheme (fractional bin-width of as opposed to , see comparison in Appendix A). With the finer binning, we see a stronger configuration dependence, which will alter the degeneracy between $B$ and $C$. We note that many of the best fit values appear smaller, which we expect for a fainter sample (Zehavi et al. 2005, 2010). The same line of reasoning suggests that the “unbiased” model should be more likely to fit.

As before, we plot the respective best fit model in comparison with the dark matter and galaxy 3PCF in Figure 4. There is a smaller difference between HV (mass) and galaxy measurements, as this sample is fainter. We notice some noise of the HV measurement for , making the model not quite as smooth. We note that by eye, on larger scales indicates a slight bias for the combined fit, with the model undershooting the data and uncertainties. Significant off-diagonal structure in the covariance matrix can produce a fit where “chi-by-eye” suggests a poor fit. Since the measurements in have much smaller errors, these scales drive the fit making measurements with appear a poor match to the “best fit” model.

We summarize the results of our two parameter constraints for the BRIGHT and LSTAR samples in Table 2. The BRIGHT sample () represents galaxies with $r$-band magnitudes significantly brighter than where (Blanton et al. 2003). We typically consider galaxies to have a linear bias, i.e. where , and we might expect this brighter sample to have a larger value. The constraints from projected measurements appear to follow this logic; the best fit values on in the fainter LSTAR sample () are lower with . Redshift space measurements, , appear consistent with for all fits, but at the same time values of are lower, reflecting the degeneracy of the $B$ and $C$ parameters. The reduced $\chi^2_\nu$ values show an acceptable fit in almost all cases; the exceptions are the two fits using the triangles for the LSTAR sample. Consequently, the joint fit appears to be the poorest match in Figure 4. The $\Delta\chi^2$ in Table 2 quantifies how far an unbiased model lies from the best fit parameters. We find an unbiased model is ruled out for the BRIGHT galaxy sample at greater than in redshift space and in projected space. We cannot conclude the same for the LSTAR sample, which is largely consistent with an unbiased model. We generally consider bright galaxies to be more biased (Zehavi et al. 2005, 2010). The LSTAR sample is a magnitude bin around and fainter than the BRIGHT sample, and we expect a better consistency with the unbiased model.

Galaxy-Mass Bias Parameters from SDSS
Measurement | Scales ($h^{-1}$ Mpc) | B | C | $\chi^2_\nu$ | D.o.F. | $\Delta\chi^2$ unbiased
BRIGHT-z | 6-18 | | | 1.48 | 6-2 | ()
BRIGHT-proj | 6-18 | | | 0.78 | 6-2 | ()
BRIGHT-z | 6-27 | | | 0.83 | 9-2 | ()
BRIGHT-proj | 6-27 | | | 0.45 | 10-2 | ()
BRIGHT-z | 9-27 | | | 0.60 | 4-2 | ()
BRIGHT-proj | 9-27 | | | 0.34 | 5-2 | ()
LSTAR-z | 6-18 | | | 13.47 | 3-2 | ()
LSTAR-proj | 6-18 | | | 0.85 | 4-2 | ()
LSTAR-z | 6-27 | | | 3.22 | 5-2 | ()
LSTAR-proj | 6-27 | | | 1.07 | 7-2 | ()
LSTAR-z | 9-27 | | | 0.07 | 3-2 | ()
LSTAR-proj | 9-27 | | | 1.75 | 4-2 | ()
Table 2 The two-parameter best fit galaxy-mass bias parameters, using (11) with the configuration dependence in the reduced 3PCF from SDSS DR6 galaxy samples in comparison with dark matter clustering from the Hubble volume simulation. The fits are performed separately on two galaxy samples BRIGHT () and LSTAR () using measurements in redshift space (denoted with “z”) as well as projected space (“proj”). The second column lists the range of scales used for the respective fit. The errors are marginalized bounds calculated by the range within from the best fit value. The quality of the best fit value is stated with the reduced chi-square . The degrees of freedom (D.o.F.) correspond to the number of eigenmodes used minus the number of parameters (). The last column lists the value to quantify the likelihood of an “unbiased” model matching the data, i.e. , with a likelihood expressed in the number of from by the standard Gaussian assumption for the distribution.
Galaxy-Mass Bias without Quadratic Term
Measurement Scales () B reduced chi-square D.o.F. from best fit
BRIGHT-z 6-18 2.42 6-1 ()
BRIGHT-proj 6-18 0.63 6-1 ()
BRIGHT-z 6-27 2.23 9-1 ()
BRIGHT-proj 6-27 0.42 10-1 ()
BRIGHT-z 9-27 1.85 4-1 ()
BRIGHT-proj 9-27 0.26 5-1 ()
LSTAR-z 6-18 8.68 3-1 ()
LSTAR-proj 6-18 0.57 4-1 ()
LSTAR-z 6-27 4.59 5-1 ()
LSTAR-proj 6-27 1.02 7-1 ()
LSTAR-z 9-27 0.14 3-1 ()
LSTAR-proj 9-27 1.25 4-1 ()
Table 3 The single-parameter best fits for galaxy-mass bias using (11) where we constrain . We fit the configuration dependence of the reduced 3PCF from SDSS DR6 galaxy samples in comparison with dark matter clustering from the Hubble volume simulation. Fits are performed separately on two galaxy samples BRIGHT () and LSTAR () using measurements in redshift space (denoted with “z”) as well as projected space (“proj”). The second column lists the range of scales used for the respective fit. The errors are marginalized bounds calculated by the range within from the best fit value. The quality of the best fit value is stated with the reduced chi-square . The degrees of freedom (D.o.F.) correspond to the number of eigenmodes used minus the number of parameters (in this case, just one). The last column lists the value to quantify the difference in likelihood of this model with compared with the best fit of a two-parameter fit (i.e. Table 2).

4.2. Non-zero Quadratic Bias?

With our two-parameter likelihood space, we can investigate the statistical significance of a non-zero quadratic bias term ( which is encapsulated in ). We use the same configuration dependence of the 3PCF, and the measured covariance, but restrict the two-parameter fit such that . We evaluate the best fit , the quality of the fit (via the reduced ) as well as the for the best two-parameter fit, which we present in Table 3. For the BRIGHT sample, we notice the values are equivalent across both and for the same scales on the same sample. Since we removed the degeneracy (as is zero), this behavior makes sense and is in agreement with the measurements. We note that the typical values are larger for the BRIGHT sample, and lower for the fainter LSTAR sample. Our constraints show little statistical significance for a non-zero quadratic bias term; the likelihood difference is small and a linear bias term is sufficient to quantify the bias for both the BRIGHT and LSTAR samples. Overall, measurements in redshift space () more strongly suggest that , especially when using the smaller scale triangles ().
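The comparison between the restricted and the two-parameter fit can be illustrated with a short sketch. It assumes that (11) has the standard local bias form, Q_gal = (Q_mass + C) / B, which is not restated in this section. The function names, the toy data, and the diagonal covariance are hypothetical; the actual analysis uses the measured covariance and its dominant eigenmodes.

```python
import numpy as np

def chi2(B, C, q_gal, q_mass, inv_cov):
    """Chi-square for the assumed model Q_gal = (Q_mass + C) / B."""
    resid = q_gal - (q_mass + C) / B
    return float(resid @ inv_cov @ resid)

def best_fit(q_gal, q_mass, inv_cov, B_grid, C_grid):
    """Brute-force grid search; returns (chi2_min, B, C)."""
    trials = [(chi2(B, C, q_gal, q_mass, inv_cov), B, C)
              for B in B_grid for C in C_grid]
    return min(trials, key=lambda t: t[0])

# Toy "measurements" (hypothetical numbers, not SDSS or HV data).
theta = np.linspace(0.05, 0.95, 10)
q_mass = 0.8 + 0.6 * np.cos(2.0 * np.pi * theta)
q_gal = (q_mass - 0.1) / 1.2              # generated with B = 1.2, C = -0.1
inv_cov = np.eye(10) / 0.05 ** 2          # uncorrelated errors of 0.05

B_grid = np.linspace(0.5, 2.0, 151)
C_grid = np.linspace(-1.0, 1.0, 201)

# Two-parameter fit, then the restricted fit with C fixed at zero.
chi2_two, B_two, C_two = best_fit(q_gal, q_mass, inv_cov, B_grid, C_grid)
chi2_one, B_one, C_one = best_fit(q_gal, q_mass, inv_cov, B_grid, np.array([0.0]))

# Likelihood difference between the nested models.
delta_chi2 = chi2_one - chi2_two
```

A small `delta_chi2` indicates that fixing the quadratic term at zero costs little in fit quality, which is the sense in which a linear bias term can be called sufficient.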

4.3. Relative Bias

Figure 5.— The relative bias using measurements in redshift (red ’x’ symbols) and projected space (blue diamonds). We calculate the uncertainties by propagating 1 values from the 2PCF. The dotted and dashed lines display results from the best fit bias terms at the largest scales (). The bold lines indicate values from the two-parameter fit (Table 2), and the faint lines show the best linear fit (Table 3).
Figure 6.— Analogous to Figure 5 but for the 3PCF. The relative bias using measurements of equilateral 3PCF in redshift (red ’x’ symbols) and projected space (blue diamonds). We calculate the uncertainties by propagating 1 values from the 3PCF. The dotted and dashed lines display results using the best fit bias terms at the largest scales (). The bold lines indicate values from the two-parameter fit (Table 2), and the faint lines show the best linear fit (quadratic bias constrained to be zero; Table 3).

The relative bias characterizes the relative clustering strength between different galaxy samples – an alternative to the “absolute” galaxy-mass bias constrained previously. Relative bias is insensitive to cosmology and does not require assumptions to determine mass clustering. We can use the relative bias to check consistency with linear and quadratic bias parameters obtained above in §4. For the 2PCF, the relative bias is simply:


where can refer to redshift or projected space measurements.
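As a minimal sketch of this estimate, the snippet below assumes the usual definition of relative bias for the 2PCF, the square root of the ratio of the two correlation functions with the brighter sample in the numerator, and first-order propagation of independent errors. The function and argument names are hypothetical.

```python
import numpy as np

def relative_bias_2pcf(xi_bright, sig_bright, xi_faint, sig_faint):
    """Relative bias sqrt(xi_bright / xi_faint) with propagated 1-sigma errors.

    Assumes the two 2PCF measurements have independent errors.
    """
    xi_bright, sig_bright = np.asarray(xi_bright, float), np.asarray(sig_bright, float)
    xi_faint, sig_faint = np.asarray(xi_faint, float), np.asarray(sig_faint, float)
    b_rel = np.sqrt(xi_bright / xi_faint)
    # d(b)/b = 0.5 * sqrt((d(xi_a)/xi_a)^2 + (d(xi_b)/xi_b)^2)
    frac_err = 0.5 * np.sqrt((sig_bright / xi_bright) ** 2
                             + (sig_faint / xi_faint) ** 2)
    return b_rel, b_rel * frac_err
```

Applied bin by bin to the same separation in both samples, a flat result across scales would correspond to a scale-independent relative bias.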

We show from the 2PCF in Figure 5, using the linear bias parameters obtained from the best two-parameter fit (i.e. Table 2). Both redshift space and projected measurements agree and produce a flat relative bias, even at non-linear scales below a few . Two obvious discrepancies arise when comparing observational data to “best fit” values. First, neither redshift nor projected space fits appear to match the data. Earlier we noted a substantial degeneracy between the linear and quadratic bias terms. The quadratic bias term is accounting for more of the clustering bias when we constrain with , which is not noticeable in the 2PCF. This suggests we underpredict values of linear bias, either just for the BRIGHT sample or in unequal portions for both. Second, there is a significant difference between these two estimates given the same galaxy samples, although the projected measurement appears closer to agreement.

Let us consider the relative bias of the reduced 3PCF. Since is proportional to , we define the relation


Figure 6 presents the relative bias of our DR6 galaxies for the equilateral 3PCF () (McBride et al. 2010). We note that is related but not identical to the configuration dependence measurements of used for constraining and . Looking at Figure 6, we see an obvious difference with respect to the 2PCF: the much larger uncertainties. However, the predicted from the galaxy-mass bias constraints appear much more consistent with the measurements, as opposed to the 2PCF. The 3PCF results agree with the observational data, and show a much smaller discrepancy between redshift and projected space. The quadratic bias term () can properly account for the clustering difference that was missing in the relative bias of the 2PCF.
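The following sketch illustrates how a relative bias for the reduced 3PCF could be measured and predicted. It assumes that the ratio is taken as the fainter sample over the brighter one (the inverse of the 2PCF ordering, because the reduced 3PCF scales inversely with linear bias) and that (11) has the local bias form Q_gal = (Q_mass + C) / B. Both are assumptions made for illustration, and the function names are hypothetical.

```python
import numpy as np

def relative_bias_3pcf(q_bright, q_faint):
    """Measured relative bias from the reduced 3PCF (assumed: Q_faint / Q_bright)."""
    return np.asarray(q_faint, float) / np.asarray(q_bright, float)

def predicted_relative_bias_3pcf(q_mass, B_bright, C_bright, B_faint, C_faint):
    """Prediction from bias parameters under the assumed model Q_gal = (Q_mass + C) / B."""
    q_mass = np.asarray(q_mass, float)
    q_bright = (q_mass + C_bright) / B_bright
    q_faint = (q_mass + C_faint) / B_faint
    return q_faint / q_bright
```

With both quadratic terms set to zero the prediction reduces to the ratio of the two linear bias values, so any departure from that ratio in the two-parameter prediction comes from the quadratic terms.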

4.4. Implications for Cosmology:

Implied values of
Measurement Scales () B
BRIGHT-z 9-27 0.96-1.13
BRIGHT-proj 9-27 1.02-1.12
LSTAR-z 9-27 0.88-0.97
LSTAR-proj 9-27 0.83-0.97
Table 4 We use galaxy-mass bias constraints from the configuration dependence of the 3PCF, , with measurements of the 2PCF, to estimate the implied values of via (23). We use the largest triangle configurations for our two samples, and the one-parameter constraints on . The range of does not represent formal uncertainties; we calculate values from the range of uncertainties stated in , neglecting additional errors from the 2PCF. For reference, WMAP-5 (with SN and BAO) suggests (Komatsu et al. 2009).

Better understanding galaxy-mass bias, or at the least accurately parameterizing it, allows one to “calibrate out” the effects of galaxies and infer properties of the underlying mass distribution to constrain cosmology. We can use our estimates of bias to probe the mass variance in spheres of radius, a common normalization of the amplitude of the matter power spectrum, . The theoretical is linearly extrapolated from a very early epoch until today,


where is a top-hat window function in Fourier space for mode and smoothing radius and is the linear power spectrum.
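A numerical version of this integral can be sketched as follows, assuming the standard form in which the variance is the integral of k^2 P(k) W^2(kR) / (2 pi^2) over k, with W the Fourier transform of a real-space top-hat. The power spectrum used here is a toy function chosen only so that the integral converges; it is not a linear power spectrum from any cosmological model, and all names are hypothetical.

```python
import numpy as np

def tophat_window(x):
    """Fourier-space top-hat window, W(x) = 3 (sin x - x cos x) / x^3."""
    x = np.asarray(x, dtype=float)
    small = np.abs(x) < 1e-3
    safe = np.where(small, 1.0, x)        # avoid 0/0; masked out below
    full = 3.0 * (np.sin(safe) - safe * np.cos(safe)) / safe ** 3
    return np.where(small, 1.0 - x ** 2 / 10.0, full)  # series for small x

def sigma_R(power, R=8.0, kmin=1e-4, kmax=1e2, n=4000):
    """RMS mass fluctuation in spheres of radius R for a callable P(k)."""
    lnk = np.linspace(np.log(kmin), np.log(kmax), n)
    k = np.exp(lnk)
    # Integrate in ln k: dk = k dlnk, so the integrand carries k^3.
    integrand = k ** 3 * power(k) * tophat_window(k * R) ** 2 / (2.0 * np.pi ** 2)
    variance = np.sum(0.5 * (integrand[1:] + integrand[:-1]) * np.diff(lnk))
    return np.sqrt(variance)

def toy_power(k, amp=1.0e4, k0=0.02):
    """Toy spectrum (assumption): rises as k, falls as k^-3 at high k."""
    return amp * k / (1.0 + (k / k0) ** 2) ** 2

sigma_toy = sigma_R(toy_power)
```

Because the variance is linear in the power spectrum, multiplying P(k) by a constant rescales the result by the square root of that constant, which is the scaling exploited when comparing different normalizations.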

In terms of our fitting formula on the 3PCF in (11), we expand the bias relation for the 2PCF to highlight its dependence on


Formally, the mass 2PCF already encodes a value of . As scales linearly with a change in the square of , we include an explicit scaling factor to account for a difference in between the underlying mass of the observed galaxy distribution and that assumed in our estimate of mass clustering from -body results. In our case, we use dark matter from the HV simulation where , explaining the denominator on the right hand side of (23). We can see that an incorrect assumption of in the estimate of mass will directly translate into a different value of the best describing galaxies. Even if we use the above relation, (23), and are completely degenerate when solely considering the 2PCF.

By using the additional information available in the configuration dependence of the reduced 3PCF, we obtain a value of that is independent of , and breaks the degeneracy between the two parameters. Formally, this is only true to leading order, as loop corrections in will add cosmological dependence which we neglect in this analysis.

With an independent value of from (11), we estimate by utilizing the 2PCF in (23). Ideally, we could construct a three-parameter fit to jointly constrain , and (e.g. Pan & Szapudi 2005, on 2dFGRS data). Or, as a further extension, we could jointly fit over several samples, since they each have the same underlying . However, this additional complexity is beyond the scope of this analysis as our current uncertainties would yield poor constraints on . We simply estimate the value of implied by best fit bias parameters. We restrict this estimate to the largest scale triangles () to ensure we approach the linear regime (i.e. the scales we are most confident in using the local bias model). Given our analysis of the relative bias, we use the larger values where we constrain . We present these estimates in Table 4.
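The arithmetic behind such an estimate can be sketched as below. The snippet assumes that (23) relates the galaxy and mass 2PCF through the square of the linear bias times the squared ratio of the true normalization to the one used in the simulation, and it assumes a simulation normalization of 0.9. Function names and the example numbers are hypothetical, not values from Table 4.

```python
import math

def implied_sigma8(xi_gal, xi_mass, B, sigma8_sim=0.9):
    """Invert xi_gal = B^2 (sigma8 / sigma8_sim)^2 xi_mass for sigma8.

    sigma8_sim = 0.9 is an assumed simulation normalization.
    """
    return sigma8_sim * math.sqrt(xi_gal / xi_mass) / B

def implied_sigma8_range(xi_gal, xi_mass, B_low, B_high, sigma8_sim=0.9):
    """Range of implied values spanned by the bounds on the linear bias.

    A larger bias implies a smaller normalization, so the bounds swap.
    """
    return (implied_sigma8(xi_gal, xi_mass, B_high, sigma8_sim),
            implied_sigma8(xi_gal, xi_mass, B_low, sigma8_sim))

# Hypothetical example: galaxies cluster 1.44 times more strongly than the mass.
low, high = implied_sigma8_range(1.44, 1.0, B_low=1.1, B_high=1.3)
```

The inverse dependence on the linear bias is the reason an underestimated bias translates directly into an overestimated normalization.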

5. Eigenvectors of the 3PCF Covariance Matrix

Figure 7.— Top three eigenvectors (EVs) chosen from the normalized covariance matrix in the galaxy sample. The sign of the EV is arbitrary. The first EV (solid black) shows equal weights for all bins. The second (dashed red) and third (dotted blue) EV display the configuration difference between perpendicular and co-linear triangles as well as the scale variation as the scale of the third side increases.
Figure 8.— Like Figure 7 but for the galaxy sample. The top three eigenvectors (EVs) chosen from the normalized covariance matrix. The sign of the EV is arbitrary. The first EV (solid black) shows equal weights for all bins. The second (dashed red) and third (dotted blue) EV display the configuration difference between perpendicular and co-linear triangles as well as the scale variation as the scale of the third side increases.

A point that is often overlooked is that the covariance matrix itself is a measurement of clustering rather than simply a means of quantifying uncertainty. It exhibits increased sensitivity to higher order terms (for a concise review see Szapudi 2009) with the covariance of the 2PCF being leading order sensitive up to fourth order, and the 3PCF up to sixth order.

We investigate the structure of the normalized covariance matrices by examining the eigenvectors (EVs), or principal components, obtained by a singular value decomposition. The EVs are contained in the and matrices from (15) and (16). The first EV is associated with the largest singular value (SV), and accounts for the largest variance in the normalized covariance matrix (i.e. most of the observed structure); the second EV corresponds to the next largest SV, and so on. If the covariance matrix resolves predominantly “true” signal, the first EVs should characterize this structure whereas the lower ranked EVs encapsulate noise. While the amplitudes of the EVs are not significant without the corresponding SV, they do represent the variation between bins in an orthogonal basis where the full covariance is a simple linear combination.
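The decomposition can be sketched with NumPy as follows. The toy covariance is hypothetical: it has unequal variances and a uniform correlation of 0.5 between bins, chosen so that the leading mode is known analytically. The function names are illustrative only.

```python
import numpy as np

def normalized_covariance(cov):
    """Divide out the diagonal errors so that the diagonal becomes unity."""
    sig = np.sqrt(np.diag(cov))
    return cov / np.outer(sig, sig)

def principal_modes(cov):
    """Return (EVs as columns, SVs), ordered from largest to smallest SV."""
    U, s, _ = np.linalg.svd(normalized_covariance(cov))
    return U, s

# Toy covariance: four bins, different variances, uniform correlation 0.5.
sig = np.array([1.0, 2.0, 3.0, 4.0])
corr = 0.5 * np.ones((4, 4)) + 0.5 * np.eye(4)
cov = corr * np.outer(sig, sig)

U, s = principal_modes(cov)
first_ev = U[:, 0]   # equal weight on every bin, up to an overall sign
```

In this toy case the first EV weights all bins equally regardless of their individual variances, which are removed by the normalization; the remaining SVs are degenerate, so the corresponding EVs are defined only up to a rotation among themselves.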

We show the top three EVs for the BRIGHT and LSTAR galaxy samples in Figures 7 and 8, respectively. In all cases there appear to be consistent features in the eigenmodes. The first EV represents weighting all bins equally. Typically, the second EV highlights the difference between “perpendicular” and “co-linear” configurations. Finally, the third EV tracks a roughly monotonic change from small to large , possibly accounting for the scale difference of the continually increasing third side of the triangle. We point out that this structure evident in observational galaxy samples agrees well with theoretical predictions from simulations in Gaztañaga & Scoccimarro (2005). Remember, these EVs are obtained by deconstructing just the normalized covariance matrix.

We do not always see a clear separation between the second and third EVs. As the full structure is a linear combination of all modes, the configuration dependence and scale variation effects could be combined. The SVs of the two effects are essentially equivalent for our measurements, making their numerical distinction in the SVD somewhat arbitrary. This is not a concern, as it appears that the linear combination of these two EVs is consistent with our interpretation (even if they are mixed). Less significant EVs show less coherent structure, consistent with noisy modes in the covariance (as we would expect).

By examining the EVs of the covariance matrices, we note structure consistent with measurements of the reduced 3PCF (see Figures 3 and 4). Observing this structure provides supporting evidence that we have signal dominated estimates of the covariance matrix. This justifies our approach of using a combination of the most significant eigenmodes in a quantitative comparison to galaxy-mass bias models, as we did in §4.

Figure 9.— We compare the absolute (diagonal) errors of the reduced 3PCF obtained by using different methods of estimation: independent mock catalogs or jackknife resampling as denoted. These measurements correspond to the BRIGHT galaxy sample.

6. Quality of Error Estimation

Figure 10.— We present the normalized covariance matrices and residuals of the error estimation for large triangles (). The left and right columns pertain to redshift and projected space respectively. We estimate errors using , , , and jackknife regions and compare with results from mocks from independent -body simulations. The solid, dashed and dotted contours in the normalized covariance correspond to values of , , and , respectively.
Figure 11.— Top three eigenvectors (EVs) chosen from the normalized covariance matrix for the different error estimates for and . The sign of the EV is arbitrary. The first EV (left panels) shows equal weights between all bins. The second (middle) and third (right) EV display the configuration difference between perpendicular and co-linear triangles, as well as the scale variation as the scale of the third side increases.
Figure 12.— The singular values (SV), or eigenvalues, obtained from the singular value decomposition (SVD) of the normalized covariance matrix for each of our error estimates. Larger values of the SV correspond to more statistically significant eigenmodes in the structure of the covariance.
Figure 13.— The signal-to-noise ratio for each eigenmode, ordered in terms of importance. The total signal-to-noise of a measurement is calculated by adding each individual eigenmode in quadrature.
Figure 14.— The cumulative signal-to-noise ratio for each eigenmode, ordered in terms of importance. The cumulative total is calculated by summing in quadrature the more significant modes.
Figure 15.— We compute the compatibility of the subspace contained in a series of eigenvectors such that the y-axis can be interpreted as the fractional “match” between the spaces the eigenvectors probe. A value of means no mismatch in the space they probe, and means no overlap (i.e. orthogonal). The left panel uses the eigenvectors of the redshift space covariance matrix determined by the mocks as the reference value. The right plot does the same as the left, but uses measurements in projected space. The comparison is cumulative (eigenmode 3 means the sum of the first 3 modes).
Figure 16.— We use the subspace comparison of eigenvectors to estimate the difference in space probed between similar numbers of eigenmodes in redshift and projected covariance matrices for each of the error estimates.

We rely heavily on the structure of the error covariance matrix for constraints on galaxy-mass bias. We noticed the observed structure in the covariance is qualitatively similar to clustering measurements (in §5), but it remains unclear if this structure is affected by our jackknife re-sampling estimation of the covariance. Higher orders add complexity and increased sensitivity to systematics, even with a “ratio statistic” such as the reduced 3PCF where the error sensitivity is canceled to first order. We must investigate the error resolution of jackknife resampling on the 3PCF, as tests on the 2PCF in angular correlation functions (Scranton et al. 2002) or redshift space SDSS (Zehavi et al. 2002, 2005, 2010) cannot be assumed to be sufficient.

An alternative method to estimate errors uses a series of independent realizations of artificial galaxies, ideally created to match observational limitations such as the volume and geometry of the SDSS galaxy samples. We created 49 independent galaxy mock catalogs based on independent -body simulations that have appropriate resolution to match the BRIGHT SDSS sample (). We use these independent mocks to estimate errors and compare with those obtained from jackknife re-sampling of the data. This exercise should provide an idea of how effective jackknife re-sampling is for resolving the errors on the 3PCF. The BRIGHT galaxy sample has the lowest number density with the fewest galaxies over the largest volume. To help protect against undersampled measurements due to low bin counts (see discussion in McBride et al. 2010), we restrict the comparison to the configuration dependence of the larger triangles ( sides).

We estimate the covariance matrix for the BRIGHT sample using different numbers of jackknife samples on the data, specifically using , , and jackknife regions. Again, these jackknife samples are created from the observational data and not from the mock galaxy catalogs, where each jackknife region is selected to maintain equal unmasked area (same method as detailed in McBride et al. 2010). Since we measure bins for , we require at least jackknife samples to prevent a singular covariance matrix. We use twice this number () and then use the same number as the number of mocks (). The final value corresponds to the number of unique elements in the symmetric covariance matrix: . We caution that as we increase the number of jackknife samples, we decrease the respective volume of each jackknife region which might subtly bias jackknife estimates (e.g. underestimate cosmic variance).
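The link between the number of jackknife samples and a singular covariance matrix can be seen in a small sketch. It assumes the standard delete-one jackknife normalization, (N - 1) / N times the sum of outer products of deviations from the jackknife mean, and uses random numbers in place of real 3PCF measurements; names are hypothetical.

```python
import numpy as np

def jackknife_covariance(samples):
    """Covariance from delete-one jackknife measurements.

    samples: array of shape (n_jack, n_bins); row i is the statistic
    measured with jackknife region i omitted.
    """
    samples = np.asarray(samples, dtype=float)
    n_jack = samples.shape[0]
    diff = samples - samples.mean(axis=0)
    return (n_jack - 1.0) / n_jack * diff.T @ diff

# Random stand-ins for measurements in 6 bins (not real data).
rng = np.random.default_rng(0)
few = jackknife_covariance(rng.normal(size=(4, 6)))    # fewer samples than bins
many = jackknife_covariance(rng.normal(size=(12, 6)))  # twice as many as bins

rank_few = np.linalg.matrix_rank(few)
rank_many = np.linalg.matrix_rank(many)
```

Subtracting the mean removes one degree of freedom, so an estimate built from N samples has rank at most N - 1; with too few samples the matrix cannot be inverted without discarding modes.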

First, we investigate the magnitude of the absolute (diagonal) errors of the reduced 3PCF. Since we use normalized covariance matrices, differences in the absolute errors might not be noticeable in the covariance structure. The absolute errors are shown in Figure 9. We see little difference between any of the methods, and the uncertainty typically ranges between and .

For each of our methods, we estimate the normalized covariance in both redshift and projected space, as depicted in Figure 10, and include the distribution of residuals. We note that jackknife re-sampling methods appear to underestimate the correlation in all cases, but the general structure looks comparable. More samples generally produce a smoother, and more correlated, covariance matrix. However, not even the jackknife sample estimate reproduces the correlation in mocks. We consider the distribution of residuals an important metric in evaluating the reliability of the resulting covariance, which we include in Figure 10. Ideally, the covariance matrix accounts for all “connections” between bins only if the residuals are reasonably Gaussian. We notice a skew in several of the jackknife re-samplings, with a tail extending to lower values. As discussed in McBride et al. (2010), this is a consequence of cosmic variance within jackknife samples. A few rare structures affect the 3PCF; when they are excluded by a jackknife region the of the entire sample drops. The mock estimate shows a slight skew in the positive direction from the same effect. In mocks, when a rare structure exists in the probed volume then the 3PCF rises producing a rare high measurement.

The eigenmode analysis we utilize relies on signal being the dominant contribution to the structure of the covariance matrix (as opposed to noise). Noise is commonly expected to be an independent or diagonal contribution. Similar to §5, we examine the eigenvectors (EVs) of the covariance matrix to provide insight into the structure. By using the singular value decomposition (SVD), the eigenvectors are ordered by largest to least amount of variance explained in the covariance matrix.

The first three EVs are shown in Figure 11 for both redshift and projected space. Similar structure appears in each of them, which we interpret as follows. The first EV represents the general measurement, with all bins equally weighted. The second EV shows the difference between “collapsed” and “perpendicular” configurations. Finally, the third EV represents a scale dependence as the third side of the triplet ranges between at to at . In some of the estimates, the shapes of the second and third EVs appear either combined or transposed. Since the full measurement is a linear combination of all EVs, this lack of separation makes sense. In these cases, the statistical significance of the two EVs remains similar. This interpretation of the structure follows the analysis by Gaztañaga & Scoccimarro (2005) for -body simulations. The less significant eigenvectors (which we do not show) appear random, with the lowest being contributions from noise or numerical instabilities. We identify the significance of the eigenmodes by inspecting the singular values (SVs) shown in Figure 12. The SV can be understood as an “importance weighting” of each eigenmode, and the figure shows a rapid decline of significance for each eigenmode. The first three eigenvectors cumulatively account for over of the variance in the normalized covariance matrix.

The signal-to-noise ratio () of each eigenmode is shown in Figure 13, as calculated by (17). The mocks in both redshift and projected space depict a slow decline in over the first few eigenmodes, supportive of our interpretation of relative significance. This trend is not as clear in the jackknife estimates for redshift space, although it appears consistent in projected space. We see the first half of the modes appear resolved, with well behaved . For the least significant eigenmodes, the noisier error estimates using fewest jackknife samples show unrealistically high ratios (especially in the case of jackknife regions). The total would increase dramatically and artificially if we included these noise dominated modes. In these cases, using the full covariance (i.e. including all modes) would be a mistake. To make the point clearer, we examine the cumulative ratio in Figure 14 where we identify rapid upturns in the total as an artificial consequence of noise. Several curves in Figure 14 do not appear problematic with this test, and show steady behavior across all modes. The amplitude of the ratio between mocks and the jackknife samples shows consistency, but the ratio does not appear to be a monotonic change with the number of jackknife samples which suggests a complex relationship between the best and an optimal number of jackknife regions.
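A per-mode and cumulative signal-to-noise calculation of this kind might look like the sketch below. Since (17) is not restated here, the snippet assumes that the signal-to-noise of a mode is the projection of the error-normalized measurement onto the EV, divided by the square root of the corresponding SV of the normalized covariance. The data are random stand-ins and the names are hypothetical.

```python
import numpy as np

def eigenmode_snr(q, cov):
    """Signal-to-noise per eigenmode, ordered from largest to smallest SV."""
    sig = np.sqrt(np.diag(cov))
    U, s, _ = np.linalg.svd(cov / np.outer(sig, sig))
    proj = U.T @ (q / sig)              # measurement in the eigenmode basis
    return np.abs(proj) / np.sqrt(s)

def cumulative_snr(snr):
    """Running total obtained by adding modes in quadrature."""
    return np.sqrt(np.cumsum(snr ** 2))

# Random stand-ins for a measurement and its covariance (not real data).
rng = np.random.default_rng(1)
A = rng.normal(size=(6, 6))
cov = A @ A.T + 6.0 * np.eye(6)         # symmetric and positive definite
q = rng.normal(loc=1.0, scale=0.2, size=6)

snr = eigenmode_snr(q, cov)
total = cumulative_snr(snr)[-1]
```

When every mode is kept, the squared total equals the usual quadratic form of the measurement with the full inverse covariance, so a poorly determined small SV inflates the total in exactly the way a noisy inverse would; truncating the sum is what protects against that.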

We can compare the subspace that a set of eigenvectors probe between two error estimates. The formalism is the same as discussed in Yip et al. (2004, see section 4), which results in a fractional “compatibility” between a collection of eigenvectors. Intuitively, this is the matrix equivalent of the vector dot product, where two orthogonal unit vectors would have a vector subspace of (no compatibility) and two identical unit vectors would result in . We use the covariance of the mocks as “truth”, and test the fractional compatibility of the jackknife estimates for covariance in and shown in Figure 15. When all the eigenmodes are considered, the subspace becomes the full space and the comparison yields unity by construction. We notice the projected measurements never appear more discrepant than . After the first few eigenmodes, redshift space shows a similar agreement. With the exception of the jackknife sample estimate, the 3 eigenmode mark appears compatible or better in all cases. This quantifies our argument of the top three EVs in Figure 11, where the second and third eigenvectors appear different (predominantly in redshift space), but their linear combination remains consistent with each other. Remember, this comparison only considers the compatibility of the direction of each eigenvector, and not their relative strengths (i.e. SVs).
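One way to compute such a fractional compatibility is sketched below. It assumes the measure is the trace of the product of the two projection operators onto the first k eigenvectors, divided by k; the exact normalization in Yip et al. (2004) is not restated here, so this form should be read as an assumption. The function name and the toy bases are hypothetical.

```python
import numpy as np

def subspace_compatibility(evs_a, evs_b, k):
    """Fractional overlap of the subspaces spanned by the first k EVs.

    evs_a, evs_b: arrays with orthonormal eigenvectors as columns.
    Equals Tr(P_a P_b) / k, where P = E E^T projects onto k columns.
    """
    a, b = evs_a[:, :k], evs_b[:, :k]
    return float(np.linalg.norm(a.T @ b, "fro") ** 2 / k)

# Reference basis, and a basis whose second and third vectors are mixed.
s = 1.0 / np.sqrt(2.0)
ref = np.eye(3)
mixed = np.array([[1.0, 0.0, 0.0],
                  [0.0, s, s],
                  [0.0, s, -s]])       # columns: e1, (e2+e3)/sqrt2, (e2-e3)/sqrt2

match_1 = subspace_compatibility(ref, mixed, 1)
match_2 = subspace_compatibility(ref, mixed, 2)
match_3 = subspace_compatibility(ref, mixed, 3)
```

In the toy case the second vectors differ, which lowers the two-mode match, but the three-mode match returns to unity because the mixed pair spans the same space as the reference pair; only directions enter, not the SVs.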

We evaluate the subspace compatibility on the normalized covariance matrix between redshift and projected space estimates. For each method, we show the fractional comparison in Figure 16. The mock estimates show the most compatibility across all eigenmodes, showing agreement at or better. With all estimates, we find that the combination of the first three eigenmodes remains a compatible subspace, above , if we again exempt the -jackknife sample estimate (which shows less than compatibility).

We caution that the resolution of errors and the choice of binning scheme relate in a non-trivial manner, which is discussed in additional detail by McBride et al. (2010). We chose “large” bins (fiducial scheme with ) to ensure a smooth, signal dominant structure in the covariance matrix. Overall, this error comparison supports our claim that accurate results can still be obtained even with less-than-optimal error estimation such as jackknife re-sampling.

7. Discussion

We utilize the configuration dependence of the 3PCF in redshift and projected space to constrain galaxy-mass bias parameters in the local bias model. We find that galaxies are biased tracers of mass, with brighter galaxies corresponding to increased bias. These results are consistent with detailed analysis of SDSS galaxies from the 2PCF (Zehavi et al. 2005, 2010) which quantifies how bias increases clustering for brighter galaxy samples. Our results indicate that a linear bias model yields reasonable approximations to the observations, in agreement with Hikage et al. (2005). However, a non-linear bias model produces slightly better agreement, and yields lower reduced chi-square values ( in Tables 2 and 3). We notice a strong correlation between linear and quadratic bias, as expected from inspection of (11), and consistent with measurements of SDSS galaxies using the bispectrum (Nishimichi et al. 2007). We find that our redshift space measurements predict significantly negative quadratic bias with a linear bias near one. This effect was seen in a similar analysis conducted on 2dFGRS galaxies (Gaztañaga et al. 2005). Interestingly, we find projected measurements suggest a larger linear bias with near zero quadratic bias for the same samples, suggesting a possible systematic effect from redshift distortions in this simple bias model.

We examined the relative bias in §4.3. We find supporting evidence that the brighter galaxy sample is a more biased realization using both the 2PCF and 3PCF, consistent with other analyses of SDSS data (Zehavi et al. 2002, 2005, 2010). Relative bias provides a consistency check on the “absolute” galaxy-mass bias parameters we constrain, suggesting a combination of linear and quadratic bias terms is consistent with observations. However, the relative bias of the 2PCF suggests that our two-parameter bias model fits underpredict the value of linear bias necessary to explain the observations. Again, we see a hint that constraints from projected measurements appear to be less affected – although we caution that this trend has weak statistical significance given the larger uncertainties in projected space.

We obtain reasonable projections for by using our linear bias values from fits on in conjunction with the 2PCF. We estimate the values of to be between and based on the BRIGHT () and LSTAR () galaxy samples. The values we obtain are contingent on a specific model of mass clustering, where we have chosen to use -body simulations (specifically the Hubble Volume  results), and redshift distortions (which we include through velocity information to distort particle positions in the HV simulation). For comparison, constraints of from a joint analysis of the cosmic microwave background (CMB), supernova data (SN) and baryon acoustic oscillations (BAO) find (Komatsu et al. 2009). Our lower values are in good agreement with these constraints. Our high end values appear too large, but our results are in reasonable agreement with an analysis of a related statistic, the monopole moment of the 3PCF, where they find best fit values between and (see Table 3 in Pan & Szapudi 2005) using 2dFGRS galaxies (Colless et al. 2001). Although the value of we obtain is comparable with results from 2dFGRS, the specific bias values will not be, as the 2dF targets a different galaxy selection than our SDSS samples. If we underestimate the value of linear bias, effectively here, (23) shows that the implied value of will be overestimated. This might explain the larger values of our estimates in comparison to WMAP analyses. Our projections for use clustering measurements between and exploit only the configuration dependence of . This is a much smaller slice of data that is significantly different than either the monopole measurement (which utilizes a larger range of scales without configuration dependence) or WMAP results (which combine an immense amount of data from both CMB and LSS analyses). We do not intend this analysis to compete with these constraints, but rather to help illuminate the role of galaxy-mass bias in future constraints of using the 3PCF.

Understanding the properties of measurement errors and the impact of empirical methods of estimating the covariance is a critical component necessary for quantitative constraints. Recent results have done comparisons on lower order statistics, such as the work by Norberg et al. (2009). We compared several properties of 3PCF covariance matrices estimated from jackknife re-sampling to those constructed from many realizations of independent galaxy mock catalogs. While we noted some concerning discrepancies, we found these typically affected only the least significant eigenmodes. We found many similarities between the covariance estimates, including physical descriptions for the first three eigenmodes which account for an overwhelming majority of the variance. We established the need to trim noisy, unresolved modes from the covariance. When trimmed, and the eigenmode analysis is properly utilized, we noted only a few significant differences, mostly in the case of jackknife samples. We conclude that our use of jackknife samples does not significantly affect our analysis.

8. Summary

We analyze measurements of the configuration dependence of the reduced 3PCF for two SDSS galaxy samples that were first presented in McBride et al. (2010). In both redshift and projected space, we characterize the galaxy clustering differences with those predicted by the non-linear mass evolution in the  Hubble Volume simulation. Here, we summarize our main results:

  • We demonstrate that brighter galaxies remain a more biased tracer of the mass field by constraining the linear and quadratic galaxy-mass bias parameters using a maximum likelihood analysis on scales between and . Conservatively using scales above , the BRIGHT sample is biased at greater than and the fainter LSTAR shows no significant bias, in general agreement with expectations from previous analyses of SDSS galaxies (Zehavi et al. 2005, 2010). The bias parameters and their significance are summarized in Table 2.

  • We resolve the degeneracy between the linear and quadratic bias terms, which helps to explain the weak luminosity dependence observed in the reduced 3PCF.

  • We find a linear bias model appears sufficient to explain the measurements of the 3PCF by re-fitting the linear bias while constraining the quadratic bias at zero (results reported in Table 3). However, we find the two parameter fit is preferred in our likelihood analysis, as it yields a lower chi-square in the best fit value.

  • The relative bias between samples of different luminosities (which is independent of the mass predictions), as well as the cosmological implications for values of , show general consistency with previous analyses. Inspection of our results suggests that the linear bias values obtained without a quadratic bias term are preferred. This suggests that two-parameter bias constraints might underpredict the linear bias.

  • We decompose the structure of the normalized covariance matrix as an alternative view into clustering properties of our samples. The eigenvectors of the first three dominant modes show coherent structure consistent with variations seen in the measurements, supporting our claim that the covariance is signal dominated and sufficiently resolved.

  • We find that jackknife re-sampling methods cannot reproduce the correlation seen in a 3PCF covariance matrix estimated from many realizations of mock galaxy catalogs. By performing a detailed comparison of the properties and structure of the errors, we identify that noisy, unresolved modes introduce significant discrepancies. We find that using an eigenmode analysis can mitigate the differences and conclude that our analysis should not be significantly affected by less-than-ideal methods of error estimation.

  • Comparing results between redshift space and projected measurements implies a potential systematic bias on values from the redshift space analysis when scales below are included, which have been utilized in other comparable analyses. Since the small scale measurements contain more constraining power than larger scales, they drive the likelihood analysis even when larger scales are considered.

  • On scales above , the statistical significance of constraints from redshift space analyses appears stronger than that found in analyses of projected measurements. We attribute this result to the increased uncertainties of the projected 3PCF, which mixes in larger scales (with larger errors) due to the line-of-sight projection. McBride et al. (2010) find that the projected 3PCF recovers configuration dependence at small scales that is lost in redshift space. Taken together, these results suggest that a redshift space analysis at large scales and projected measurements at small scales would complement each other in future analyses.

We thank many in the SDSS collaboration, where active discussion helped to refine this work. We would like to specifically acknowledge valuable input from István Szapudi, David H. Weinberg, Zheng Zheng, Robert Nichol, Robert E. Smith, Andrew Zentner, and the detailed discussions on error estimates with Idit Zehavi. We thank August Evrard and Jörg Colberg for kindly providing data and assistance with the Hubble Volume (HV) simulation. The HV simulation was carried out by the Virgo Supercomputing Consortium using computers based at the Computing Centre of the Max-Planck Society in Garching and at the Edinburgh Parallel Computing Centre. J. G. and the development of Ntropy were funded by NASA Advanced Information Systems Research Program grant NNG05GA60G. A. J. C. acknowledges partial support from DOE grant DE-SC0002607, NSF grant AST 0709394, and parallel application development under NSF IIS-0844580. This research was supported in part by the National Science Foundation through TeraGrid resources provided by NCSA (Mercury) and the PSC (BigBen) under grant numbers TG-AST060027N and TG-AST060028N. Funding for the SDSS and SDSS-II has been provided by the Alfred P. Sloan Foundation, the Participating Institutions, the National Science Foundation, the U.S. Department of Energy, the National Aeronautics and Space Administration, the Japanese Monbukagakusho, the Max Planck Society, and the Higher Education Funding Council for England. The SDSS Web Site is The SDSS is managed by the Astrophysical Research Consortium for the Participating Institutions.
The Participating Institutions are the American Museum of Natural History, Astrophysical Institute Potsdam, University of Basel, University of Cambridge, Case Western Reserve University, University of Chicago, Drexel University, Fermilab, the Institute for Advanced Study, the Japan Participation Group, Johns Hopkins University, the Joint Institute for Nuclear Astrophysics, the Kavli Institute for Particle Astrophysics and Cosmology, the Korean Scientist Group, the Chinese Academy of Sciences (LAMOST), Los Alamos National Laboratory, the Max-Planck-Institute for Astronomy (MPIA), the Max-Planck-Institute for Astrophysics (MPA), New Mexico State University, Ohio State University, University of Pittsburgh, University of Portsmouth, Princeton University, the United States Naval Observatory, and the University of Washington.

Appendix A Effects of Binning

As an example of the effects of bin-size on galaxy-mass bias constraints, we re-analyze the LSTAR galaxy sample () using two fractional bin-widths: and . First, we ignore the structure of the covariance matrix and show constraints using all the bins assuming perfect independence, as shown in the left panel of Figure 17. While unphysical, this illustration allows one to probe the effect of shape differences in the 3PCF measurements without considering the resolution of the covariance matrix. Since larger bins smooth the configuration dependence, we expect a larger degeneracy between and , which is apparent. We see the best fit values (symbols) stay within the respective contours, but just barely. Recall that this approach uses all modes and the exact same input data, suggesting that binning can result in a systematic bias.
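Why smoothing should increase the degeneracy can be seen in a toy calculation. The sketch below assumes the standard local bias relation for the reduced 3PCF (e.g., Fry & Gaztanaga 1993), in which the galaxy measurement is the mass prediction scaled by the inverse linear bias plus an offset set by the quadratic term, so the fit is linear in a slope and an intercept. The cosine-shaped mass curves, the error value, and the function name are all hypothetical and are not taken from the measurements.

```python
import numpy as np

def slope_offset_correlation(q_mass, sigma):
    """Correlation coefficient between the slope and offset of Q_gal = a * Q_mass + c.

    Assumes independent (diagonal) errors `sigma`, either a scalar or one value per bin.
    Here a plays the role of the inverse linear bias and c the scaled quadratic term.
    """
    q_mass = np.asarray(q_mass, dtype=float)
    sigma = np.broadcast_to(np.asarray(sigma, dtype=float), q_mass.shape)
    # Weighted design matrix: one column for the slope, one for the offset.
    design = np.column_stack([q_mass, np.ones_like(q_mass)]) / sigma[:, None]
    param_cov = np.linalg.inv(design.T @ design)
    return param_cov[0, 1] / np.sqrt(param_cov[0, 0] * param_cov[1, 1])

theta = np.linspace(0.0, np.pi, 15)           # 15 angular bins, as a toy example
q_sharp = 0.8 + 0.4 * np.cos(2.0 * theta)     # strong configuration dependence (toy)
q_smooth = 0.8 + 0.1 * np.cos(2.0 * theta)    # same curve after heavy smoothing (toy)

r_sharp = slope_offset_correlation(q_sharp, 0.1)
r_smooth = slope_offset_correlation(q_smooth, 0.1)
print(f"sharp: {r_sharp:.3f}  smooth: {r_smooth:.3f}")
```

In this toy setup the smoothed curve yields a correlation closer to -1: as the mass prediction flattens, the slope and offset columns become nearly collinear and the two parameters can trade off against each other more freely.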

Figure 17.— Analogous to the galaxy-mass bias constraints in §4, we show the constraints on and using the same data for our fiducial binning scheme with fractional bin-size of (solid red contours) and larger (black dotted). On the left, we neglect any overlap as well as the covariance, assume independent diagonal errors, and use the full 15 bins. On the right, we utilize the full covariance and only fit the dominant modes in an eigenmode analysis. The contours correspond to the and confidence levels from the surface. We use the LSTAR galaxy sample.

For the right panel of Figure 17, we consider the full covariance as well as improvements obtained by using the eigenmode analysis in the galaxy-mass bias constraints. First, we notice the error contours appear less stretched, in accord with our expectations of using a non-diagonal covariance matrix. In most cases (except for ), the area of the contours appears of equal size or even decreased for the larger measurements in contrast to the diagonal case. This makes sense, as the lower variance measurements of appear better resolved as long as there are enough remaining modes to constrain two parameters. The best fit values appear discrepant, especially at the lower scales () where they disagree at more than a significance. While this causes some concern, it is not as drastic as the diagonal case. As the eigenmode analysis trims modes, it excludes information and the same input data produces a different statistical representation. In light of this effect, a difference reflects a statistical difference between analyses rather than a significant systematic effect.

In summary, we find lower galaxy-mass bias parameters with larger bin-widths, which suggests a potential artificial bias on the galaxy-mass parameters due to over-smoothing. Since we gain very little additional constraining power with the bin-width, we argue the bin-width represents the more conservative choice. Although the scheme represents smaller bins, they are still quite large and adequately resolve structure in the covariance.


  • Adelman-McCarthy et al. (2008) Adelman-McCarthy, J. K., et al. 2008, ApJS, 175, 297
  • Berlind & Weinberg (2002) Berlind, A. A., & Weinberg, D. H. 2002, ApJ, 575, 587
  • Bernardeau et al. (2002) Bernardeau, F., Colombi, S., Gaztañaga, E., & Scoccimarro, R. 2002, Phys. Rep., 367, 1
  • Blanton et al. (2003) Blanton, M. R., Lin, H., Lupton, R. H., Maley, F. M., Young, N., Zehavi, I., & Loveday, J. 2003, AJ, 125, 2276
  • Blanton et al. (2005) Blanton, M. R., et al. 2005, AJ, 129, 2562
  • Colberg et al. (2000) Colberg, J. M., et al. 2000, MNRAS, 319, 209
  • Colless et al. (2001) Colless, M., et al. 2001, MNRAS, 328, 1039
  • Cooray & Sheth (2002) Cooray, A., & Sheth, R. 2002, Phys. Rep., 372, 1
  • Crocce et al. (2006) Crocce, M., Pueblas, S., & Scoccimarro, R. 2006, MNRAS, 373, 369
  • Davis et al. (1985) Davis, M., Efstathiou, G., Frenk, C. S., & White, S. D. M. 1985, ApJ, 292, 371
  • Davis & Peebles (1983) Davis, M., & Peebles, P. J. E. 1983, ApJ, 267, 465
  • Evrard et al. (2002) Evrard, A. E., et al. 2002, ApJ, 573, 7
  • Fosalba et al. (2005) Fosalba, P., Pan, J., & Szapudi, I. 2005, ApJ, 632, 29
  • Fry (1994) Fry, J. N. 1994, Physical Review Letters, 73, 215
  • Fry & Gaztanaga (1993) Fry, J. N., & Gaztanaga, E. 1993, ApJ, 413, 447
  • Gaztañaga et al. (2009) Gaztañaga, E., Cabré, A., Castander, F., Crocce, M., & Fosalba, P. 2009, MNRAS, 399, 801
  • Gaztañaga et al. (2005) Gaztañaga, E., Norberg, P., Baugh, C. M., & Croton, D. J. 2005, MNRAS, 364, 620
  • Gaztañaga & Scoccimarro (2005) Gaztañaga, E., & Scoccimarro, R. 2005, MNRAS, 361, 824
  • Gunn et al. (1998) Gunn, J. E., et al. 1998, AJ, 116, 3040
  • Gunn et al. (2006) —. 2006, AJ, 131, 2332
  • Hamilton (1998) Hamilton, A. J. S. 1998, in Astrophysics and Space Science Library, Vol. 231, The Evolving Universe, ed. D. Hamilton, 185–+
  • Hikage et al. (2005) Hikage, C., Matsubara, T., Suto, Y., Park, C., Szalay, A. S., & Brinkmann, J. 2005, PASJ, 57, 709
  • Jing & Börner (2004) Jing, Y. P., & Börner, G. 2004, ApJ, 607, 140
  • Kayo et al. (2004) Kayo, I., et al. 2004, PASJ, 56, 415
  • Komatsu et al. (2009) Komatsu, E., et al. 2009, ApJS, 180, 330
  • Kulkarni et al. (2007) Kulkarni, G. V., Nichol, R. C., Sheth, R. K., Seo, H., Eisenstein, D. J., & Gray, A. 2007, MNRAS, 378, 1196
  • Lupton et al. (2001) Lupton, R., Gunn, J. E., Ivezić, Z., Knapp, G. R., & Kent, S. 2001, in Astronomical Society of the Pacific Conference Series, Vol. 238, Astronomical Data Analysis Software and Systems X, ed. F. R. Harnden, Jr., F. A. Primini, & H. E. Payne, 269–+
  • Manera et al. (2010) Manera, M., Sheth, R. K., & Scoccimarro, R. 2010, MNRAS, 402, 589
  • Marin (2010) Marin, F. 2010, ArXiv e-prints
  • Marín et al. (2008) Marín, F. A., Wechsler, R. H., Frieman, J. A., & Nichol, R. C. 2008, ApJ, 672, 849
  • McBride et al. (2010) McBride, C. K., Connolly, A. J., Gardner, J. P., Scranton, R., Newman, J. A., Scoccimarro, R., Zehavi, I., & Schneider, D. P. 2010, ArXiv e-prints
  • Nichol et al. (2006) Nichol, R. C., et al. 2006, MNRAS, 368, 1507
  • Nishimichi et al. (2007) Nishimichi, T., Kayo, I., Hikage, C., Yahata, K., Taruya, A., Jing, Y. P., Sheth, R. K., & Suto, Y. 2007, PASJ, 59, 93
  • Norberg et al. (2009) Norberg, P., Baugh, C. M., Gaztañaga, E., & Croton, D. J. 2009, MNRAS, 396, 19
  • Pan & Szapudi (2005) Pan, J., & Szapudi, I. 2005, MNRAS, 362, 1363
  • Peebles (1980) Peebles, P. J. E. 1980, The large-scale structure of the universe, ed. P. J. E. Peebles
  • Reid et al. (2010) Reid, B. A., et al. 2010, MNRAS, 404, 60
  • Sánchez et al. (2009) Sánchez, A. G., Crocce, M., Cabré, A., Baugh, C. M., & Gaztañaga, E. 2009, MNRAS, 400, 1643
  • Scoccimarro (1998) Scoccimarro, R. 1998, MNRAS, 299, 1097
  • Scoccimarro (2000) —. 2000, ApJ, 544, 597
  • Scoccimarro et al. (1999) Scoccimarro, R., Couchman, H. M. P., & Frieman, J. A. 1999, ApJ, 517, 531
  • Scoccimarro et al. (2001) Scoccimarro, R., Feldman, H. A., Fry, J. N., & Frieman, J. A. 2001, ApJ, 546, 652
  • Scranton et al. (2002) Scranton, R., et al. 2002, ApJ, 579, 48
  • Sefusatti & Scoccimarro (2005) Sefusatti, E., & Scoccimarro, R. 2005, Phys. Rev. D, 71, 063001
  • Springel (2005) Springel, V. 2005, MNRAS, 364, 1105
  • Stoughton et al. (2002) Stoughton, C., et al. 2002, AJ, 123, 485
  • Strauss et al. (2002) Strauss, M. A., et al. 2002, AJ, 124, 1810
  • Szapudi (2009) Szapudi, I. 2009, in Lecture Notes in Physics, Berlin Springer Verlag, Vol. 665, Data Analysis in Cosmology, ed. V. J. Martinez, E. Saar, E. M. Gonzales, & M. J. Pons-Borderia, 457–+
  • Takada & Jain (2003) Takada, M., & Jain, B. 2003, MNRAS, 340, 580
  • Tinker et al. (2008) Tinker, J., Kravtsov, A. V., Klypin, A., Abazajian, K., Warren, M., Yepes, G., Gottlöber, S., & Holz, D. E. 2008, ApJ, 688, 709
  • Tinker et al. (2005) Tinker, J. L., Weinberg, D. H., Zheng, Z., & Zehavi, I. 2005, ApJ, 631, 41
  • van den Bosch et al. (2003) van den Bosch, F. C., Yang, X., & Mo, H. J. 2003, MNRAS, 340, 771
  • Verde et al. (2002) Verde, L., et al. 2002, MNRAS, 335, 432
  • Wang et al. (2004) Wang, Y., Yang, X., Mo, H. J., van den Bosch, F. C., & Chu, Y. 2004, MNRAS, 353, 287
  • Yang et al. (2003) Yang, X., Mo, H. J., & van den Bosch, F. C. 2003, MNRAS, 339, 1057
  • Yip et al. (2004) Yip, C. W., et al. 2004, AJ, 128, 585
  • York et al. (2000) York, D. G., et al. 2000, AJ, 120, 1579
  • Zehavi et al. (2010) Zehavi, I., Zheng, Z., Weinberg, D. H., Blanton, M. R., & the SDSS collaboration. 2010, ArXiv e-prints
  • Zehavi et al. (2002) Zehavi, I., et al. 2002, ApJ, 571, 172
  • Zehavi et al. (2005) —. 2005, ApJ, 630, 1
  • Zheng (2004) Zheng, Z. 2004, ApJ, 614, 527
  • Zheng & Weinberg (2007) Zheng, Z., & Weinberg, D. H. 2007, ApJ, 659, 1