Probing Cosmology and Galaxy Cluster Structure with the Sunyaev–Zel’dovich Decrement vs. X–ray Temperature Scaling Relation
Scaling relations among galaxy cluster observables, which will become available in large future samples of galaxy clusters, could be used to constrain not only cluster structure, but also cosmology. We study the utility of this approach, employing a physically motivated parametric model to describe cluster structure, and applying it to the expected relation between the Sunyaev-Zel’dovich decrement and the emission–weighted X–ray temperature. The slope and normalization of the entropy profile, the concentration of the dark matter potential, the pressure at the virial radius, and the level of non-thermal pressure support, as well as the mass– and redshift–dependence of these quantities, are described by free parameters. With a suitable choice of fiducial parameter values, the cluster model satisfies several existing observational constraints. We employ a Fisher matrix approach to estimate the joint errors on cosmological and cluster structure parameters from a measurement of this scaling relation in a future survey. We find that different cosmological parameters affect the scaling relation differently: predominantly through the baryon fraction ($\Omega_b$ and $\Omega_m$), the virial over–density ( and for low–$z$ clusters) and the angular diameter distance (, for high–$z$ clusters; and ). We find that the cosmology constraints from the scaling relation are comparable to those expected from the number counts of the same clusters. The scaling relation approach is relatively insensitive to selection effects and it offers a valuable consistency check; combining the information from the scaling relation and the number counts is also useful to break parameter degeneracies and help disentangle cluster physics from cosmology. Our work suggests that scaling relations should be a useful component in extracting cosmological information from large future cluster surveys.
Keywords: cosmological parameters – cosmology: theory – galaxies: clusters: general
This work is motivated by large upcoming cluster surveys that utilize the Sunyaev-Zel’dovich (SZ) effect (Sunyaev & Zeldovich 1972), such as ACT (http://www.physics.princeton.edu/act/), APEX (http://bolo.berkeley.edu/apexsz/), Planck (http://www.rssd.esa.int/index.php?project=Planck), and SPT (http://pole.uchicago.edu/). As is well known, the SZ signal is nearly redshift independent, so these surveys are expected to be especially efficient in detecting high–redshift clusters. The expected catalogs will be sensitive probes of dark energy, and also useful in breaking degeneracies in local cluster surveys (for example, the degeneracy between $\sigma_8$ and $\Omega_m$). The planned and on–going surveys will cover thousands of square degrees of sky, and detect on the order of 10,000 clusters with masses over a few $\times 10^{14}\,M_\odot$. For example, the SPT survey will cover 4,000 square degrees of sky in 4 frequency channels (90, 150, 220, 270 GHz), and Planck aims to cover the whole sky in 9 frequency channels. These cluster samples will contain a significant amount of cosmological information.
Importantly, cosmological information can be extracted from large galaxy cluster catalogs in several complementary ways. For example, the cluster abundance is exponentially sensitive to the amplitude of matter density fluctuations, and the X-ray temperature function obtained from local cluster samples has been used to constrain $\sigma_8$ and $\Omega_m$ (e.g., Henry 2000; Ikebe et al. 2002). The redshift evolution of the abundance will be useful in constraining dark energy parameters, with statistical errors competitive with those in most other methods (e.g., Haiman et al. 2001; Albrecht et al. 2006). Small existing samples of tens of X–ray clusters already provide interesting constraints on the dark energy density and equation of state parameter (Henry 2004; Mantz et al. 2008; Vikhlinin et al. 2008). The cluster power spectrum also contains information on cosmology (Hu & Haiman 2003), both through the growth of fluctuations (Refregier et al. 2002) and through baryon acoustic features (Hu & Haiman 2003; Blake & Glazebrook 2003; Seo & Eisenstein 2003; Linder 2003). Combining the number counts and the power spectrum provides a cross-check and can allow a “self–calibration” to contain systematic errors in the mass–observable relation (Majumdar & Mohr 2004; Wang et al. 2004). In addition to the above, clusters could also be used as “standard rulers”. The measured gas fraction $f_{\rm gas}$, which is derived from the observed X-ray temperature and density profiles, depends on the angular diameter distance as $f_{\rm gas} \propto d_A^{3/2}$ (e.g., Allen et al. 2008). To the extent that the gas fraction is predictable ab–initio from numerical simulations, this provides a measurement of $d_A$. A complementary measurement of $d_A$ can be provided by combining SZ and X-ray signals, under the assumption that clusters are at least statistically spherical (e.g., Bonamente et al. 2006).
The gravitational potential of clusters is dominated by dark matter, whose behavior is determined by gravity alone, and is therefore robustly predictable (Navarro et al. 1997; hereafter NFW). If astrophysical processes in the gas were unimportant, the intracluster gas would evolve adiabatically, tracing the self–similar dark matter profile, and its global properties would obey simple scaling relations (e.g., Kaiser 1986). Observed clusters do exhibit tight scaling relations, but these deviate significantly from the self–similar expectation. For example, the relation between X–ray luminosity ($L_X$) and temperature ($T$) is observed to be close to $L_X \propto T^3$, significantly steeper than the $L_X \propto T^2$ power law expected in self-similar, adiabatic models. These observations could be explained by preferentially increasing the specific entropy of the cluster gas in low–mass clusters. Many variants of such models have been developed, based either on heat input from stars or nuclear black holes, or preferential elimination of the low–entropy gas by star–formation (see, e.g., a review by Voit 2005 and references therein). Our present study is motivated by the fact that in any such model, the predicted scaling relations will generically depend on the background cosmology. Using simple toy models, Verde et al. (2002; hereafter VHS02) showed that the cosmological parameters indeed affect cluster scaling relations, i.e. relations among temperature, cluster size and SZ decrement. In small cluster samples (e.g., Morandi et al. 2007), the subtle cosmology–dependencies will be masked by the larger uncertainties in the physical modeling of cluster structure. However, given a sufficiently accurate measurement of the scaling relations, using thousands of clusters, it should become possible to place useful constraints simultaneously on cosmological parameters and on the parameters of any given cluster physics model.
VHS02 argued that combining SZ and X–ray data will be particularly useful, because the SZ and X–ray signatures depend on cosmological parameters differently, and singled out the relation between the Sunyaev-Zel’dovich decrement and the X–ray temperature as a promising probe of both cluster structure and cosmology. Afshordi (2008) showed that the measured relation between SZ decrement and angular half–light radius, which does not require X–ray data, may already help reduce the errors in cluster mass estimates. Younger et al. (2006) showed that combining number counts from SZ and X-ray surveys delivers constraints that are tighter than adding two independent measurements in quadrature; this synergy again arises because the SZ decrement and X–ray flux depend differently on the background cosmology. Finally, Aghanim et al. (2008) recently used hydrodynamical simulations to study how different values of the dark energy equation of state affect SZ vs. X–ray scaling relations. They found relatively little direct sensitivity to $w$ (which is consistent with our own findings; see discussion in § 3.1 below).
Despite the above few works, the utility of the scaling relations in probing cosmology remains relatively unexplored. We believe it deserves more investigation, for the following two reasons. First, data on the scaling relations will be automatically available (at least for a subset of clusters) once the planned SZ surveys are performed. Large catalogs of cluster temperatures (hundreds of clusters) already exist, and new, much deeper X-ray surveys are being proposed and planned, such as eROSITA and IXO (see http://www.mpe.mpg.de/erosita/MDD-6.pdf and http://ixo.gsfc.nasa.gov, respectively). Compared to the number counts, the scaling relation technique should be relatively less sensitive both to selection effects and to the relation between the observables and cluster mass. Second, as we will discuss below in detail, scaling relations derive cosmological information from a different combination of geometrical distances and non–linear growth than the other cluster observables. For this reason, they could not only be combined with other techniques to tighten constraints, but can also serve as useful consistency checks.
In this paper, we follow VHS02, and we focus on the relation between the total SZ flux decrement, encoded in the integrated Compton parameter, and the X-ray emission–weighted temperature. Other physical quantities could be used as well, such as the X-ray luminosity or the central SZ decrement. These quantities are especially sensitive to the properties of the cluster core, where cooling, star–formation, and feedback processes are most effective, and which is therefore the most difficult region of the cluster to model. The scatter in these quantities is known to be large, which will limit their utility for constraining cosmology. In contrast, the integrated Compton parameter and the mean emission–weighted temperature are robust to the above uncertainties (Reid & Spergel 2006 and Kravtsov, Vikhlinin & Nagai 2006 showed that similarly robust observables can be constructed from X–ray data, as well). An additional virtue of these two quantities is that they are relatively easy to measure, i.e. they do not require a detailed measurement of radial profiles. (Footnote: We note, however, that the cluster core needs to be excised in cooling core clusters, in order not to affect the emission–weighted ICM temperature measurement. While it is possible to extract both the core temperature and the ICM temperature from a single spectrum, this will inevitably introduce uncertainties, which will be discussed in § 4 below.) The main improvements of the present study over the analysis of VHS02 are the following: (i) we include a full set of 8 cosmological parameters, representing the matter density $\Omega_m$, the dark energy density $\Omega_{\rm DE}$, the baryon density $\Omega_b$, the Hubble constant, two dark energy equation of state parameters $w_0$ and $w_a$ (see detailed definition in the next section), the normalization of the matter density power spectrum $\sigma_8$ and the “tilt” of the primordial power spectrum $n_s$. Note that we do not assume spatial flatness, so the 8 parameters are independent. VHS02 only included , and as free parameters.
(ii) VHS02 adopted a simple spherical toy model for cluster structure, based on the virial theorem, to predict relations between different observable quantities. This approach has the virtue of simplicity, and makes it easier to interpret the results; however, such a simple cluster model is already in contradiction with existing data. Here we use a more elaborate, and more realistic phenomenological cluster model, with many free parameters. We explicitly require the model to satisfy existing observational constraints and we explore the impact of various cluster structure uncertainties on the final conclusions. (iii) We employ a Fisher matrix technique, instead of a Kolmogorov-Smirnov test as in VHS02. The Fisher matrix technique is a fast way of estimating joint parameter uncertainties in a multi–dimensional parameter space, and allows us to understand parameter degeneracies. (iv) Finally, we also study constraints from the number counts (including the effects of cluster structure uncertainty, mass-observable scatter and incompleteness), and we forecast the combined constraints from the scaling relations and the number counts.
The rest of this paper is organized as follows. In § 2, we describe the Fisher matrix technique and the physically motivated, phenomenological cluster model we adopt. The cluster model is compared against observations and simulations. We also explain our choice of fiducial values for cosmological parameters, cluster parameters and survey parameters. In § 3, we present our main results, i.e. the constraints on cosmological and cluster structure parameters. Proceeding pedagogically, we first include only the 8 cosmological parameters, then add increasing uncertainties from the cluster structure parameters to our analysis. We also explain in detail where the cosmological constraints from the scaling relations come from. In § 4, we compare the scaling relation technique with constraints from the number counts, and discuss various caveats and possible improvements to our results. We summarize our results and offer our conclusions in § 5.
2 Cluster model and Fisher matrix technique
2.1 Fisher matrix technique
We employ the Fisher matrix technique to forecast cosmological constraints from future surveys. The Fisher matrix is a quick way to estimate joint parameter uncertainties in a multi-parameter fit (Fisher 1935; Tegmark et al. 1997). It is defined as
$$F_{ij} = -\left\langle \frac{\partial^2 \ln \mathcal{L}}{\partial p_i \, \partial p_j} \right\rangle , \qquad (1)$$
where $\mathcal{L}$ is the likelihood for a certain observable, and $\{p_i\}$ is the parameter set (including both cosmological and cluster structure parameters in our case). The best attainable covariance matrix $C$ is simply the inverse of the Fisher matrix,
$$C_{ij} = (F^{-1})_{ij} , \qquad (2)$$
and the constraint on any individual parameter $p_i$, marginalized over all other parameters, is $\sigma_i = \sqrt{(F^{-1})_{ii}}$. Another advantage of the Fisher matrix technique is that it is easy to obtain joint constraints from several data sets or methods: the total Fisher matrix is just the sum of the individual Fisher matrices, as long as they are uncorrelated. In this paper, we assume that the different Fisher matrices are indeed independent; we justify this assumption in § 4.
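As an illustration of the two properties just described (marginalized errors from the inverse Fisher matrix, and joint constraints from adding independent Fisher matrices), here is a minimal Python sketch. The two 2×2 matrices are made-up numbers chosen only to have opposite degeneracy directions; they are not taken from this work, and the function names are hypothetical.

```python
import numpy as np

def marginalized_errors(fisher):
    """1-sigma errors marginalized over all other parameters: sqrt(diag(F^-1))."""
    covariance = np.linalg.inv(fisher)
    return np.sqrt(np.diag(covariance))

def unmarginalized_errors(fisher):
    """1-sigma errors with all other parameters held fixed: 1/sqrt(F_ii)."""
    return 1.0 / np.sqrt(np.diag(fisher))

# Hypothetical Fisher matrices for two independent probes of the same two
# parameters; the values are invented to give opposite degeneracy directions.
fisher_a = np.array([[4.0, 1.9], [1.9, 1.0]])
fisher_b = np.array([[1.0, -1.9], [-1.9, 4.0]])

# Uncorrelated data sets combine by simple addition of their Fisher matrices.
fisher_total = fisher_a + fisher_b

errors_a = marginalized_errors(fisher_a)
errors_total = marginalized_errors(fisher_total)
```

In this toy case each probe alone is strongly degenerate, while the sum is diagonal, so the combined marginalized errors are several times smaller than those of either probe.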
The Fisher matrix approach makes the underlying assumption that the likelihood surface for the parameters is a multi-variate Gaussian. This is indeed the case if experimental errors are Gaussian–distributed and the model depends linearly on the parameters, but in general, this assumption does not hold, and is instead justified by invoking the central limit theorem in the presence of a large number of independent data points. The classical example is the CMB likelihood, which is very close to Gaussian for the so-called “normal” or “physical” parameters (Kosowsky et al. 2002), but not necessarily for the standard cosmological parameters. However, for most cosmological models and future CMB data sets, especially if combined with external datasets or a weak prior on , the CMB likelihood is very close to Gaussian even for the standard cosmological parameters (Komatsu et al. 2009). For degeneracies in parameter space that are described by non-linear parameter combinations, the Fisher matrix approach tends to under-estimate the error bars. Even with these limitations, the Fisher matrix approach is invaluable for estimating degeneracies among parameters and assessing which combination of data sets can lift them.
2.2 Cluster model
Galaxy clusters are the largest gravitationally bound structures in the universe, and the properties of their dark matter halos should be relatively insensitive to astrophysical processes, which typically operate on scales much smaller than the size (i.e. virial radius) of a massive cluster. However, processes such as radiative cooling, star formation, heating and radiative feedback from active galactic nuclei, turbulence, and non–thermal pressure support from energetic particles accelerated in large–scale shocks, can all have significant impact on the thermal state and spatial distribution of gas in the intra-cluster medium (ICM), especially near the center of the cluster. Many aspects of the ICM remain poorly understood, despite extensive theoretical work, numerical simulations, and high–resolution observations.
There have been many approaches to building simplified models for cluster structure. Some are purely phenomenological formulae for the radial profiles, such as the simple 3–parameter “beta–model” (Cavaliere & Fusco-Femiano 1976), or the 17–parameter generalized NFW model proposed more recently by Vikhlinin et al. (2006), which provides excellent fits to the range of observed X–ray profiles. Many studies have based the models on physical ingredients, generally assuming hydrostatic equilibrium, and parameterizing the processes listed above (see, e.g., Komatsu & Seljak 2001; Voit et al. 2002; Ostriker et al. 2005; Reid & Spergel 2006; Fang & Haiman 2008; Ascasibar & Diego 2008, and references therein).
In this paper, we do not attempt to build another ab initio physical cluster model. Instead, we use a “hybrid” phenomenological model, with physically motivated free parameters, similar to that proposed in Reid and Spergel (2006) and Fang and Haiman (2008). As we will show below, this model can satisfy most available observational constraints, and has the flexibility to include parameter variations. The ICM properties are assumed to be spherically symmetric, on average, and are determined by four factors: the radial entropy profile, the profile of the gravitational potential, the equations of hydrostatic equilibrium, and boundary conditions. Below, we describe how we incorporate these four factors into our cluster model.
Entropy profile. The radial entropy profile is parameterized by a power law,
where and are density and temperature of the ICM gas, is the adiabatic index, which we choose to be 5/3, appropriate for ideal monatomic gas, and is the dimensionless radius, normalized by the virial radius of the cluster. The virial radius is defined to be the radius within which the mean density is equal to the virial density determined from numerical simulations (see equation 7 below). is the dimensionless entropy at the virial radius, and quantifies the logarithmic slope of the entropy with radius. Note that convective stability requires the entropy to be a monotonically increasing function of radius (Voit et al. 2002), so we require . The natural choice for the entropy scale is its value estimated using virial theorem, , where . In above definitions, is the baryon fraction of the universe (), is the mass of a proton, and is the mean molecular weight of the ICM (we adopt 0.59 as its value, appropriate for a fully ionized H-He plasma with helium mass fraction equal to 0.25). Note that is the characteristic entropy of a cluster in absence of non-gravitational forces rather than entropy at virial radius; in particular, even without any feedback processes. Using the fitting formula given by Younger & Bryan (2007) and a slightly modified version of our current cluster model code, we estimate that is equal to 1.5 in the self-similar case.
Dark matter halo gravitational potential. We assume that the dark matter halos are spherically symmetric, and their density profiles are described by the NFW shape. The assumption of spherical symmetry can obviously be very inaccurate for individual clusters. We assume, however, that the main effect of the asymmetries is to introduce a scatter in the global scaling relations, rather than to change their mean (the accuracy of this assumption should be assessed in the future in three–dimensional simulations of a large sample of clusters). The NFW profile is expressed as
$$\rho_{\rm DM}(r) = \frac{\delta_c \, \rho_{\rm crit}}{(r/r_s)\,(1 + r/r_s)^2} , \qquad (4)$$
where $r_s$ is the scale radius and $\rho_{\rm crit}$ is the critical density of the universe. For a halo of DM mass $M$, the two parameters $\delta_c$ and $r_s$ are determined from the concentration parameter and the virial density,
Here , with . We adopt a fitting formula from the numerical spherical collapse model calculation by Kuhlen et al. (2005) for ,
where , , and , with the dark energy equation of state. Note that this formula differs slightly from other expressions for the virial overdensity that are commonly used in the literature (e.g., Bryan & Norman 1998), and it includes an explicit dependence on . This feature is important to us, since we will constrain , and, as we will find below, this dependence drives the constraints on at low redshift.
It is important to note that Equation (7) was obtained from spherical collapse calculations that assumed a constant . In particular, one may wonder whether this fitting formula is still accurate when is redshift dependent. To check the goodness of fit of Equation (7), we performed numerical calculations of the virial overdensity, using the spherical collapse model described by Kuhlen et al. We have found that Equation (7) is accurate to within 10% for time–varying models in the range of -1.5-0.5, and . More importantly, the virial density computed with the numerical method is systematically more sensitive to than Equation (7) predicts. For example, we find that at redshifts , 0.2, and 0.4, the fractional change in , when is changed from to (i.e., by its value; see below), and all other parameters are held fixed, is a factor of 3, 2, and 1.4 larger in the numerical calculation than predicted by the fitting formula. The higher sensitivity is easy to understand: at depends on , which at higher redshifts differs increasingly from the constant value . We therefore conclude that using the Kuhlen et al. formula makes our constraints on below conservative.
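The way an NFW halo is fixed by its mass, concentration and virial overdensity can be sketched in a few lines of Python. This is a minimal illustration assuming the standard textbook relations (virial radius from the mean enclosed density, scale radius equal to the virial radius divided by the concentration, and characteristic overdensity from requiring the enclosed mass to match); it treats the whole mass as following the NFW shape and is not necessarily identical to equations (5) and (6) of this paper. The function names and the numerical inputs are hypothetical.

```python
import numpy as np

def nfw_m(x):
    """Dimensionless NFW enclosed-mass function, ln(1+x) - x/(1+x)."""
    return np.log(1.0 + x) - x / (1.0 + x)

def nfw_params(mass, concentration, delta_vir, rho_crit):
    """Return (r_vir, r_s, delta_c) for a halo of the given virial mass.

    Assumes the mean density inside r_vir equals delta_vir * rho_crit.
    """
    r_vir = (3.0 * mass / (4.0 * np.pi * delta_vir * rho_crit)) ** (1.0 / 3.0)
    r_s = r_vir / concentration
    delta_c = (delta_vir / 3.0) * concentration**3 / nfw_m(concentration)
    return r_vir, r_s, delta_c

def nfw_enclosed_mass(r, r_s, delta_c, rho_crit):
    """Mass inside radius r, from integrating the NFW density profile."""
    return 4.0 * np.pi * delta_c * rho_crit * r_s**3 * nfw_m(r / r_s)

# Illustrative inputs only (Msun and Mpc units); not the fiducial values of the paper.
mass, conc, delta_vir, rho_crit = 1.0e15, 5.0, 100.0, 1.36e11
r_vir, r_s, delta_c = nfw_params(mass, conc, delta_vir, rho_crit)
mass_check = nfw_enclosed_mass(r_vir, r_s, delta_c, rho_crit)
```

By construction, the mass enclosed within the virial radius reproduces the input mass, which is a useful sanity check when converting between mass definitions.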
Equations of hydrostatic equilibrium. Below are the equations that we solve to obtain the cluster gas density and pressure profiles , . The first is from the hydrostatic equilibrium condition, the second is from mass conservation,
where is the total mass within radius , including both dark matter and baryons, and is the total gas mass enclosed within radius . The gas fraction , which we will use below, is defined to be . Finally, the parameter is introduced to allow for deviations from strict hydrostatic equilibrium. Physically, deviations from could represent any non–thermal pressure support (e.g., from cosmic rays and/or turbulence), and also lack of full virialization. In fact, allowing for turbulent support in the analytical model is known to be necessary in order to reproduce the density and temperature profiles for the ICM gas in simulations that include non–gravitational pre–heating (Younger & Bryan 2007).
Boundary conditions. The boundary condition at is specified by requiring that is zero at the cluster center. The boundary condition at the virial radius is imposed by requiring that the gas pressure matches the momentum flux of the infalling gas,
where is the free–fall velocity from the turnaround radius. Assuming the turnaround radius is twice the virial radius, as in the spherical collapse model, we have . We follow Reid & Spergel (2006) and introduce a free parameter to allow for an uncertainty in this condition.
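The free–fall velocity in this boundary condition can be illustrated with a short Python sketch. It assumes that the gas falls from rest at the turnaround radius and that the enclosed mass acts as a point mass, so that energy conservation gives a squared velocity of 2GM(1/r_vir − 1/r_ta), which reduces to GM/r_vir when the turnaround radius is twice the virial radius. The form of the pressure condition used here (a free factor multiplying gas density times velocity squared) is an assumption made for illustration; the function names and inputs are hypothetical.

```python
import math

# Gravitational constant in Mpc (km/s)^2 / Msun.
G_NEWTON = 4.30091e-9

def free_fall_velocity(mass, r_vir, turnaround_ratio=2.0):
    """Infall speed (km/s) at r_vir for gas starting at rest at turnaround.

    Assumes a point-mass potential for the mass enclosed within r_vir.
    """
    r_ta = turnaround_ratio * r_vir
    return math.sqrt(2.0 * G_NEWTON * mass * (1.0 / r_vir - 1.0 / r_ta))

def boundary_pressure(rho_gas, v_ff, b=1.0):
    """Assumed boundary condition: pressure = b * (momentum flux of infalling gas)."""
    return b * rho_gas * v_ff**2

# Illustrative cluster: 1e15 Msun with a 2 Mpc virial radius.
v_ff = free_fall_velocity(1.0e15, 2.0)
```

For these illustrative inputs the infall speed is roughly 1500 km/s.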
The cluster model described above has 5 free parameters that capture uncertainties about cluster structure and evolution: , , , and . All of these quantities could additionally depend on both mass and redshift ( and could also explicitly depend on radius, which, however, we ignore here). We use a power–law parameterization to allow for these dependencies,
where could represent any of the 5 quantities. In equation 11, each function is described by 3 parameters, one for normalization, one for mass dependence and another for redshift dependence. We choose to be (this choice is not essential, since changes to can be compensated by changes in the normalization).
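A power–law parameterization of this kind can be sketched as follows. The exact functional form is an assumption made for illustration (a normalization times a power of mass over a pivot mass times a power of 1+z), and the pivot mass used below is a placeholder rather than the value adopted in this paper. The sketch also demonstrates the remark above that a change of pivot mass can be absorbed into the normalization.

```python
def power_law_parameter(norm, mass_slope, z_slope, mass, z, pivot_mass=1.0e14):
    """Assumed form: norm * (M / M_pivot)**mass_slope * (1 + z)**z_slope."""
    return norm * (mass / pivot_mass) ** mass_slope * (1.0 + z) ** z_slope

# Hypothetical values for one structure parameter.
norm, mass_slope, z_slope = 1.2, 0.1, -0.3
mass, z = 5.0e14, 0.5

value_a = power_law_parameter(norm, mass_slope, z_slope, mass, z, pivot_mass=1.0e14)

# Moving the pivot to 1e15 Msun is compensated by rescaling the normalization.
rescaled_norm = norm * (1.0e15 / 1.0e14) ** mass_slope
value_b = power_law_parameter(rescaled_norm, mass_slope, z_slope, mass, z, pivot_mass=1.0e15)
```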
Several cautionary remarks about the above model are in order. First, the assumption of power–law mass and –dependence is likely valid only when the variations over the observed mass and redshift range are small; the real dependence could be more complicated, especially if a wide mass or redshift range is considered. This could mean that actual data will not be fit adequately by such power–laws; in this case, additional parameters will likely have to be introduced (this possibility is addressed more quantitatively below).
Second, unlike Reid & Spergel (2006), we did not include additional modeling of the cluster cores. The reason for this is that core properties are known to vary significantly from cluster to cluster, and it is difficult to capture this variation with a universal parametrization. Fortunately, neglecting the core makes relatively little difference in our results. The two observables we focus on are temperature and integrated Compton parameter . We checked explicitly that introducing a flat entropy core within 0.1 changes the value of by less than 2 percent (this result is consistent with Reid & Spergel 2006). When computing the emission–weighted temperature , we excise the innermost regions (see eq. 17 below), which makes our temperature–observable similarly robust to core properties. This cut mimics the common procedures in existing observations, in which the ambient gas temperature is inferred either by excising the core region, or using a model (such as a cooling flow model) to eliminate the contribution from core regions. In order to minimize potential biases from the model–dependence of such cuts, we use a simple definition below.
Third, in reality, the cluster structure parameters will clearly have cluster–to–cluster variations: each of the parameters appearing in Equation (11) should therefore represent only a mean value. A scatter in any cluster structure parameter will induce a scatter in the value of the observables we predict. Below, we will derive constraints only from the mean observables (i.e. our signal is the mean scaling relation; the finite distribution of the SZ signal at fixed temperature is considered conservatively to be pure noise). An underlying scatter in a structure parameter can therefore have two effects on our results. First, the measurement error of the mean is increased, which will correspondingly weaken our statistical constraints. Second, the mean inferred value of the SZ signal can be biased, if the scatter in a parameter introduces a skewed distribution of the SZ signal, and/or if the scatter is large and introduces Malmquist bias (i.e. the low–signal tail of the clusters at fixed temperature could be preferentially missing from the sample). The first of these effects will be addressed below by allowing for a scatter in the SZ signal itself; the possible biases from the second effect are discussed in § 3.3.
In summary, our adopted baseline cluster model has $5 \times 3 = 15$ parameters; given these parameters, we can numerically solve for the density and pressure profiles, and deduce all other ICM quantities. Below in § 2.5, we will use existing observational data and simulation results to determine the fiducial values of these parameters.
2.3 Fisher matrix for the scaling relation
Our first way of constraining cosmological parameters is to use the relation between the SZ flux (measured through the integrated Compton parameter, ) and the X–ray emission weighted temperature . The electrons in the hot ICM gas scatter the cosmic microwave background (CMB) photons, which distorts the CMB spectrum by the amount (e.g., Birkinshaw 1999; Carlstrom et al. 2002)
where is a known function of frequency,
Here $x \equiv h\nu/(k_B T_{\rm CMB})$, $T_{\rm CMB}$ is the CMB temperature, $h$ is the Planck constant, $k_B$ is the Boltzmann constant, and $c$ is the speed of light. The spectral function is positive at high frequency, negative at low frequency, and has a null at $\nu \approx 217$ GHz. Physically, this means that low–energy photons are Compton scattered by the hot electrons to higher energies, reducing the flux at low frequency and increasing it at high frequency.
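The sign change of the SZ spectrum can be checked numerically. The sketch below assumes the standard non-relativistic spectral shape in thermodynamic temperature units, x coth(x/2) − 4, which shares its zero crossing with the corresponding intensity form; it is not necessarily written in the same form as equation (13). The CMB temperature of 2.725 K and the function names are assumptions of this example.

```python
import math

H_PLANCK = 6.62607015e-34   # J s
K_BOLTZ = 1.380649e-23      # J / K
T_CMB = 2.725               # K (assumed value)

def sz_spectral_shape(x):
    """Non-relativistic thermal SZ shape, x*coth(x/2) - 4, with x = h nu / (k_B T_CMB)."""
    return x / math.tanh(x / 2.0) - 4.0

def find_null(lo=1.0, hi=10.0, tol=1e-12):
    """Bisection search for the zero crossing of the spectral shape."""
    f_lo = sz_spectral_shape(lo)
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        f_mid = sz_spectral_shape(mid)
        if (f_mid < 0.0) == (f_lo < 0.0):
            lo, f_lo = mid, f_mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

x_null = find_null()
nu_null_ghz = x_null * K_BOLTZ * T_CMB / H_PLANCK / 1.0e9
```

With these assumptions the null falls at a dimensionless frequency of about 3.83, i.e. between 217 and 218 GHz.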
The ICM properties are all encoded in . The total distortion within a fixed solid angle is given by
where is the Compton parameter along a given line of sight,
and $\sigma_T$ is the Thomson cross section. In this work, we use the value of integrated over the whole cluster, i.e. in equation (14) we set , with $d_A$ the angular diameter distance. Combining equations (14) and (15), we get,
where the integration is over the volume of the cluster. Equation (16) clearly shows that the integrated SZ flux directly probes the thermal energy of the ICM.
The emission–weighted temperature is calculated as
where is the cooling function, calculated by a Raymond-Smith code (Raymond & Smith 1977) with metallicity , and is the radius with a mean enclosed overdensity of 500 relative to the critical density. Note that we do not integrate over the whole cluster – instead, we excise the inner region. This mimics the temperature measurements in X-ray observations, which either excise the cooling flow regions, or model and subtract their contribution. Since the inner radius (=) has to be estimated from the data itself, this can introduce uncertainties or biases in the inferred . As we show in § 4 below, in order not to degrade the constraints we obtain below, the inner radius has to be accurate (statistically) to within , or the mass of the cluster to within .
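The core–excised, emission–weighted temperature can be sketched numerically as a weighted average with weight proportional to density squared times the cooling function, integrated over a spherical shell. The sketch below is illustrative only: it replaces the Raymond-Smith cooling function with a simple bremsstrahlung-like square-root scaling, uses toy density and temperature profiles, and adopts placeholder inner and outer radii (in arbitrary units) rather than the cuts used in this paper.

```python
import numpy as np

def _trapezoid(y, x):
    """Trapezoidal rule, written out to avoid NumPy version differences."""
    return np.sum(0.5 * (y[1:] + y[:-1]) * np.diff(x))

def emission_weighted_temperature(r, density, temperature, cooling, r_in, r_out):
    """Average T weighted by n^2 * Lambda(T) over the shell r_in <= r <= r_out."""
    shell = (r >= r_in) & (r <= r_out)
    rr, nn, tt = r[shell], density[shell], temperature[shell]
    weight = nn**2 * cooling(tt) * 4.0 * np.pi * rr**2
    return _trapezoid(weight * tt, rr) / _trapezoid(weight, rr)

def toy_cooling(t):
    """Assumed bremsstrahlung-like scaling, Lambda proportional to sqrt(T)."""
    return np.sqrt(t)

# Toy profiles: a beta-model-like density and a gently declining temperature.
r = np.linspace(0.01, 1.0, 2000)
density = (1.0 + (r / 0.1) ** 2) ** -1.0
temperature = 5.0 * (1.0 + r) ** -0.5

t_full = emission_weighted_temperature(r, density, temperature, toy_cooling, 0.01, 1.0)
t_excised = emission_weighted_temperature(r, density, temperature, toy_cooling, 0.15, 1.0)
```

Because the emission weight is strongly peaked toward the dense center, the result depends on where the inner cut is placed, which is why the accuracy of the excision radius matters.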
Given a relation, we can construct the scaling–relation Fisher matrix for an individual cluster as
where and are parameters to be constrained, is the total statistical uncertainty on the value of , including both the intrinsic scatter in at fixed , and the measurement uncertainty in , , . Note that the partial derivative is taken at a fixed temperature, not at given cluster mass, since we are studying the relation of vs. not vs. .
For a sample of clusters for which and are both measured, the total Fisher matrix is the sum of the individual single–cluster Fisher matrices. We approximate this sum by an integration,
where , and are the solid angle, and the minimum and maximum redshifts covered by the survey, is the mass of smallest detectable cluster at each redshift, is the comoving volume element, and is the halo mass function. The form of this Fisher matrix is similar to the “follow-up” Fisher matrix used in Majumdar & Mohr (2003), except that their observable is the cluster mass itself (or a mass–like quantity), whereas our observable here is . We used the fitting formula by Jenkins et al. (2001) for the mass function (their smoothed mass function, equation 9). In the fitting formula in Jenkins et al. (2001), the cluster mass is defined to be the mass enclosed within a spherical region with overdensity of 180 with respect to the mean background matter density, whereas we defined clusters based on their virial overdensity with respect to the critical density (eq. 7 above). We used the NFW profile to convert the Jenkins et al. mass function to be consistent with our mass definition.
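The structure of a scaling–relation Fisher matrix, a sum over clusters of products of parameter derivatives taken at fixed temperature and divided by the squared uncertainty, can be sketched as follows. This is a toy illustration under stated assumptions: the signal model is a simple power law in temperature with two hypothetical parameters, the uncertainty is a constant scatter in the logarithm of the signal, derivatives are taken by central finite differences, and the sum runs over a discrete list of cluster temperatures instead of the integral over the mass function used in the text.

```python
import numpy as np

def toy_log_signal(temperature, params):
    """Hypothetical scaling relation: ln(signal) = log_norm + slope * ln(T)."""
    log_norm, slope = params
    return log_norm + slope * np.log(temperature)

def scaling_relation_fisher(model, temperatures, params, sigma_log, step=1e-4):
    """Sum over clusters of (dlnS/dp_i)(dlnS/dp_j) / sigma^2 at fixed temperature."""
    params = np.asarray(params, dtype=float)
    gradients = []
    for i in range(len(params)):
        shift = np.zeros_like(params)
        shift[i] = step
        upper = model(temperatures, params + shift)
        lower = model(temperatures, params - shift)
        gradients.append((upper - lower) / (2.0 * step))
    gradients = np.array(gradients)          # shape (n_params, n_clusters)
    return gradients @ gradients.T / sigma_log**2

temperatures = np.array([2.0, 4.0, 6.0, 8.0])   # keV, illustrative
fisher = scaling_relation_fisher(toy_log_signal, temperatures, [0.0, 1.6], sigma_log=0.1)
```

In a real forecast the model function would be the full cluster model evaluated at fixed temperature, and the parameter vector would include both cosmological and cluster structure parameters.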
2.4 Fisher matrix for number counts
Another way of constraining cosmological parameters is through the cluster abundance. The observable in this case is the number of clusters in a given range of redshift and ,
where is a minimum mass we impose by hand (representing a sharp survey selection threshold; see discussion in § 3.3 below for allowing uncertainties in the selection), and is the probability that a cluster with mass has a value of within the range of the th –bin. In this paper, for simplicity, we assume Gaussian scatter between and , so that has an analytical form. Suppose bin is specified by its minimum and maximum , and for a given mass , has a mean and r.m.s. of . In this case,
Assuming Poisson errors dominate in the number counts (), and summing over all redshift– and –bins, the total Fisher matrix for the cluster abundance (Holder et al. 2001) is given by
where and are the number of redshift bins and bins, respectively. This expression ignores sample variance, whose effect on the cluster abundance constraints has been considered in detail in previous works (Hu & Kravtsov 2003; Lima & Hu 2004; Fang & Haiman 2006), and has been found to be modest, especially if the survey is sub–divided into many angular cells and the variance is considered as signal rather than noise (Lima & Hu 2004). Likewise, Holder et al. (2001) explored the validity of the Fisher matrix approach for forecasting cluster count constraints, and found it to be a good approximation, with the exception of the constraints for .
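The two ingredients of the number–count forecast described above can be sketched in Python: the probability that a cluster falls in an observable bin under Gaussian scatter, and the Poisson Fisher matrix summed over bins. The expressions used are the standard ones (an error-function difference for the bin probability, and products of count derivatives divided by the expected counts for the Fisher matrix); whether the scatter is Gaussian in the observable or in its logarithm is left open here, and all names and numbers are hypothetical.

```python
import math
import numpy as np

def bin_probability(y_min, y_max, y_mean, sigma):
    """Probability that a Gaussian-scattered observable lands in [y_min, y_max]."""
    upper = (y_max - y_mean) / (math.sqrt(2.0) * sigma)
    lower = (y_min - y_mean) / (math.sqrt(2.0) * sigma)
    return 0.5 * (math.erf(upper) - math.erf(lower))

def counts_fisher(counts, dcounts_dparams):
    """Poisson Fisher matrix: sum over bins of (dN/dp_i)(dN/dp_j) / N.

    counts has shape (n_bins,); dcounts_dparams has shape (n_params, n_bins).
    """
    counts = np.asarray(counts, dtype=float)
    derivs = np.asarray(dcounts_dparams, dtype=float)
    return (derivs / counts) @ derivs.T

# Illustrative numbers: two bins and a single parameter.
p_one_sigma = bin_probability(-1.0, 1.0, 0.0, 1.0)
fisher_counts = counts_fisher([100.0, 50.0], [[10.0, 5.0]])
```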
2.5 Fiducial parameter values
Here we summarize all the parameters in this work, and explain our choice of their fiducial values. Overall, the model parameters can be grouped into three categories: cosmological parameters, cluster model parameters and survey parameters.
Cosmological parameters. We include the following 8 standard cosmological parameters, with the 3–year results from the Wilkinson Microwave Anisotropy Probe (WMAP) experiment as their fiducial values (Spergel et al. 2007): , , , , , , and . Here and parametrize the dark energy equation of state,
We do not assume a spatially flat universe, so the 8 cosmological parameters are independent. The more recent 5–year results from WMAP are consistent with the 3–year results. is slightly higher ( from the combination of the WMAP result with baryon acoustic oscillations and supernovae; Dunkley et al. 2008). Adopting this new value would increase the number of detectable clusters, and tighten our constraints below. We emphasize that the number counts constrain all 8 parameters directly, while the scaling relation alone can only constrain 6 of them ( and do not affect the scaling relation). Nevertheless, when combined with the number counts, the information from the scaling relation can indirectly help constrain and , by breaking degeneracies, as we will demonstrate in § 3 below.
Cluster model parameters. The cluster model described in § 2.2 above has 15 parameters, describing the normalization and logarithmic slope of the gas entropy profile, the concentration parameter for the dark matter halo profile, any contributions from non–thermal pressure , and the gas pressure at the virial radius , each with a normalization, redshift dependence and mass dependence. We set their fiducial values using results from simulations and observations.
The self-similar collapse model that invokes only gravity and shock heating predicts a universal entropy profile (Tozzi & Norman 2001; Borgani et al. 2001), which is in general agreement with observations outside the core (Ponman et al. 2003; Pratt et al. 2006). We therefore adopt this fiducial value for , with no dependence on mass or redshift ().
The difference in the cluster mass inferred from weak lensing and X-ray measurements suggests that non–thermal pressure contributes about 10% to the total gas pressure (Zhang et al. 2008); similar amounts have also been seen in simulations (Rasia et al. 2004; Kay et al. 2004; Faltenbacher et al. 2005). We therefore adopt with no mass and redshift dependence.
For the concentration parameter , we adopt the fiducial value that is directly computed from cosmological simulations in Voit et al. (2003),
Note that unlike our other model parameters, in principle, can be accurately computed ab–initio, using three–dimensional N–body or hydro simulations. However, the uncertainties are still significant, and the current results are in tension with X–ray observations (e.g., Duffy et al. 2008); for completeness, we therefore include it as a free parameter in our baseline model. Below, we will investigate the benefits of placing tight priors on this parameter (and we find the benefits to be small). For simplicity, we set , which corresponds to the condition that all kinetic energy is transformed into thermal energy at the virial radius. Molnar et al. (2008) recently studied in detail the morphology and properties of virial shocks around galaxy clusters in smoothed particle hydrodynamics (SPH) and adaptive mesh refinement (AMR) simulations of a sample of individual clusters. Although virial shocks are often preceded by external shocks farther out, closer to , a significant fraction of the clusters’ surface area is covered by strong virial shocks located at , for which should be a good approximation. While detailed simulations could help refine the best fiducial choice for the mean boundary pressure, we do not expect the choice of this fiducial value to have a significant impact on our forecasts.
The fiducial value of is fixed by fitting the relation. In Figure 1, we compare the observed relation to the predicted values for the best-fit . The data points are from the HIFLUGCS cluster sample (Reiprich & Böhringer 2002). The bolometric luminosity is computed from
where and are electron and proton number densities, respectively, and we assume a helium mass fraction of 25%. Unlike , is sensitive to the core properties, and the HIFLUGCS sample includes both cooling–core and non–cooling–core clusters. To account for this mixing, we modify the entropy profile in our model clusters, and add a flat entropy core of size . The best–fit value of is found, by minimizing using the data from Reiprich & Böhringer, to be .
This is in accordance with previous results on cluster formation and evolution. In particular, we find that the entropy is elevated (), as expected from feedback processes, and also that the increase in entropy is more significant for lower–mass clusters, breaking the self–similar relation. For comparison, in the same figure we also plot the curves expected for (dashed line, and are kept at their fiducial values) and (dotted line, and are kept at their fiducial values). Lowering raises at a given temperature, while lowering increases the slope.
We emphasize that , as mentioned above, does not correspond to the gravitational-heating only case, and that, in agreement with previous results, the most massive clusters in Figure 1 fall on the observed relation even without any non-gravitational heating. To verify this, we followed Fang & Haiman (2008), and used the fitting formula given in Younger & Bryan (2007) to compute the maximum entropy in the gravitational-heating only case. This entropy is found to be at the virial radius. With our adopted set of cosmological parameters, a 10keV cluster has mass of , and ; the difference between our fiducial and that of the simulation is only 0.1 around keV (this difference is due to the fact that our fiducial entropy profile is steeper than found in adiabatic simulations).
Though a successful fit in general, at the lowest temperatures shown in Figure 2 (keV), the best–fit relation has a slope that flattens slightly, in contrast with observations that indicate a steepening at these temperatures (e.g., Helsdon & Ponman 2000). This shows that our particular power–law parameterization (eq. 11) is insufficient to capture the increase in the mean entropy at low temperatures. Correcting this deficiency would be possible by using different parameterizations (e.g., parameterizations that preferentially affect the cores of low–temperature clusters, such as the “entropy–floor” models; e.g., Fang & Haiman 2008). However, in this paper, we focus on clusters detectable in SZ experiments. These have a mass (or equivalently, a mean temperature of ), and the power–law models provide a good fit to the relation of these clusters (Fig. 2).
We also note that we do not find a need for , as expressed in the form of equation (11), to evolve with redshift. In Figure 2, we show the relation predicted in our model, assuming , at redshift , together with data from the high–redshift cluster sample from the Wide Angle ROSAT Pointed Survey (WARPS; with average redshift of ). As the figure shows, the model provides an excellent match to the data. Although not immediately obvious, this conclusion is qualitatively in agreement with the result in Fang & Haiman (2008), who found that if a fixed entropy floor is assumed to exist in all clusters at a given redshift, then this floor value has to decrease toward higher redshifts. The entropy floor is the difference between the total entropy and the baseline value (without non–gravitational heating). Our result indicates that the ratio of total entropy to baseline entropy is the same for low and high redshifts. This means that the difference is smaller for high–redshift clusters, since they have a higher density and a smaller baseline entropy. (For more details on this apparent coincidence, see Figure 6 and the related discussion in Fang & Haiman 2008).
Our parameterization allows us to vary the slope of the entropy profile , which is necessary to fit temperature profiles measured in X–ray observations. This, in fact, is the main advantage of our parameterization compared to similar models that include a constant entropy floor, since the latter approach generically fails to match radial profiles (e.g., Younger & Bryan 2007 and references therein). In Figure 3, we show the temperature profile in our fiducial cluster model with , compared to the mean profile recently inferred from XMM-Newton observations (Leccardi & Molendi 2008). In this plot, we adopt , which is approximately the median redshift of the XMM-Newton cluster sample, and , which corresponds to a mean temperature about 6 keV, approximately the temperature of a typical cluster in the sample. We follow Leccardi & Molendi (2008) to compute as
As shown in Figure 3, the model temperature profile is in good agreement with observations out to the radius of where data is available.
Finally, we compare the relation in our fiducial model with predictions from simulations. More specifically, in Figure 4, we show the relation in our fiducial model, together with the predictions for the same quantity in Nagai (2006) and Sehgal et al. (2007), who respectively use high–resolution hydrodynamical simulations, and N-body simulations of dark matter halos and a prescription for the corresponding gas distribution. Here and are the SZ Compton parameter and the total mass within the radius . We note that Sehgal et al. integrate over a cylindrical region extending to an angular radius corresponding to , while Nagai integrates over a sphere of radius . For a fair comparison, we compute both ways. The upper solid [red] curve in Figure 4 is in a cylindrical region, while the lower solid [red] curve is that in a sphere. The redshift is set to be , to match Nagai’s simulation. We use the fitting parameters in row 2 of Table 2 in Sehgal et al. (2007), since we focus on clusters above the SZ detection threshold (about at intermediate redshifts). Sehgal et al. find a slightly steeper slope than Nagai, which they attribute to the effect of AGN heating. The slope of Nagai roughly agrees with the self-similar expectation. Our slope is closer to that of Sehgal et al., which includes star formation, and AGN feedback, but no cooling. Overall, the slopes, however, are close to one another, and our normalization falls between the predictions of the two simulations.
In addition to the 15 parameters fixed above, we also allow for independent scatter in the and the relations. We chose a fiducial value of 10% for both (i.e. in eq. 18 and in eq. 2.4 are both ), motivated by the simulations of Nagai (2006), who finds an r.m.s. scatter between and of 10-15%. The effect of a non–zero scatter on our results is two–fold. Scatter increases the number of detected clusters at a given flux threshold (because of the steep slope of the mass function, more clusters scatter from below the threshold to above it than vice-versa), which is helpful in constraining cosmology. On the other hand, scatter flattens the effective mass function, which degrades the information derivable from the shape of the mass function (as will be demonstrated in § 3 below, we find that the second effect dominates in the constraints from the cluster counts).
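The first effect of scatter can be illustrated with a toy calculation. The sketch below is not the model of this paper: it assumes a pure power–law abundance per logarithmic interval of the observable, with a hypothetical slope of 3, and Gaussian scatter in the logarithm of the observable. Under those assumptions the counts above a threshold are boosted by the factor exp(slope² × scatter² / 2), which the numerical integral reproduces.

```python
import math

def counts_above_threshold(threshold, alpha, sigma, n_grid=20001, span=12.0):
    """Counts observed above `threshold` for a toy abundance dn/dlnY ~ Y**(-alpha),
    when the observed ln Y scatters around the true ln Y with r.m.s. `sigma`."""
    ln_t = math.log(threshold)
    lo = ln_t - span
    dx = 2.0 * span / (n_grid - 1)
    total = 0.0
    for k in range(n_grid):
        x = lo + k * dx                                  # true ln Y
        # Probability that the scattered value lands above the threshold.
        p_detect = 0.5 * math.erfc((ln_t - x) / (math.sqrt(2.0) * sigma))
        weight = 0.5 if k in (0, n_grid - 1) else 1.0    # trapezoid rule
        total += weight * math.exp(-alpha * x) * p_detect * dx
    return total

ALPHA, SIGMA = 3.0, 0.10            # hypothetical slope; 10% scatter
no_scatter = 1.0 ** (-ALPHA) / ALPHA  # analytic counts above threshold 1 without scatter
boost = counts_above_threshold(1.0, ALPHA, SIGMA) / no_scatter
```

The steeper the assumed abundance, the larger the boost, since more objects sit just below the threshold than just above it.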
In summary, we conclude that our fiducial cluster structure model matches existing observations and simulations reasonably well, at least at low redshifts, and for the clusters above the expected detection threshold of future SZ surveys. This gives us confidence to use our adopted model parameterization to forecast cosmological constraints, and to study the effect of cluster structure uncertainties. Of course, in the future, as better data becomes available, it is possible (indeed likely) that modifying the parameterization of the cluster structure model will become necessary.
Survey parameters. Survey parameters include the following: sky coverage , frequency , measurement noise , redshift range and related parameters, such as redshift bin size and bin size. We adopt typical values relevant to upcoming SZ surveys for these parameters.
The sky coverage is set to be 4,000 which is the solid angle covered by SPT in 2 years. The frequency is chosen to be , where the SZ signal reaches maximum decrement. Most surveys will observe in bands at multiple frequencies, in order to separate the SZ signal from other CMB secondary anisotropies and from foregrounds. However, almost all planned surveys have a frequency band around . The detector noise is set to be , which represents a typical value for the total SZ flux of the smallest detectable cluster in upcoming surveys (such as in SPT). At a frequency of , this corresponds to sr. We set the cluster detection threshold to be . At low redshift, surface brightness becomes an important additional detection criterion, and a simple flux limit becomes inadequate, so we impose a floor on the mass limit (eq. 20). We will discuss how this choice affects our results in § 4.
We further assume the SZ survey covers the redshift range , and we divide this range into 40 uniform bins (). This of course requires that the cluster redshift measurement uncertainty be better than 0.05. This accuracy should be achievable by follow–up surveys designed for this purpose; for example, the Dark Energy Survey can determine cluster photometric redshifts to an accuracy of 0.02 or better out to (Abbott et al. 2005).
Finally, we divide the range into 8 bins, which we allocate so that each bin contains a similar number of clusters. This requirement led us to adopt the following boundaries of the –bins, in units of the detection threshold: , , , , , , , .
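The binning scheme described above (uniform redshift bins, and observable bins chosen to hold similar numbers of clusters) can be sketched as follows. The sample here is synthetic and purely illustrative: the Pareto-distributed signal, the sample size, and the redshift limits of 0 and 2 (chosen only so that 40 bins have width 0.05) are assumptions, not values from this work.

```python
import numpy as np

def uniform_edges(z_min, z_max, n_bins):
    """Edges of n_bins equal-width bins between z_min and z_max."""
    return np.linspace(z_min, z_max, n_bins + 1)

def equal_count_edges(values, n_bins):
    """Bin edges placed at quantiles, so each bin holds a similar number of entries."""
    return np.quantile(values, np.linspace(0.0, 1.0, n_bins + 1))

rng = np.random.default_rng(0)
# Hypothetical sample: signal in units of the detection threshold, steeply falling.
y_over_threshold = 1.0 + rng.pareto(2.0, size=6800)

z_edges = uniform_edges(0.0, 2.0, 40)                  # assumed range, width 0.05
y_edges = equal_count_edges(y_over_threshold, 8)
counts, _ = np.histogram(y_over_threshold, bins=y_edges)
```

Quantile-based edges become strongly non-uniform for a steeply falling distribution: the lowest bins are narrow and the highest bin is very wide.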
In Figure 5, we show the minimum mass detectable by the SZ survey as a function of redshift, neglecting scatter between and . The flat portion of the curve at low redshift is the limit imposed by hand (). The total number of detected clusters is about 6,800 for our adopted set of fiducial cosmological parameters (for reference, we note that the total number would be a factor of higher, , if we instead adopted the 1st year WMAP cosmological parameters; the difference is attributable mostly to the change in from 0.90 to 0.76).
In our fiducial calculation, we assume that temperature measurements are available for all clusters detected by the SZ survey. This is usually taken to require X–ray spectroscopic data for each cluster. Mroczkowski et al. (2008) find that a joint analysis of SZ and X-ray imaging data, for 3 clusters detected with the SZA instrument, yields temperature profiles that are in good agreement with spectroscopic X-ray measurements. This may ease the requirement on the depth of the X–ray survey. The availability of temperatures for a large fraction of the SZ clusters may, however, still be an optimistic assumption, since the X–ray flux drops much faster than SZ flux at high redshift. We discuss the effect of partial followup in § 4.
With the cluster model and the Fisher matrix technique described above, we are now ready to forecast constraints from upcoming SZ and X–ray surveys. As an academic exercise, in § 3.1 we first consider an “idealized case”, in which cluster parameters are precisely known, and only cosmological parameters are constrained. This exercise serves two purposes: (i) it allows us to understand where the cosmology sensitivity comes from, and (ii) it will clarify the amount of the degradation in the constraints, once the cluster model parameter uncertainties are included. In § 3.2, we relax the assumption that cluster structure parameters are known, and simultaneously constrain cosmological and cluster parameters. Finally, in § 3.3, we investigate the effects of the additional uncertainties in the scatter and incompleteness.
3.1 Constraints with cosmological parameters alone
Table 1 lists the marginalized 1 errors on cosmological parameters. In this and in the other tables below, “SR” stands for “scaling relations”, and “NC” stands for “number counts”. In addition to the marginalized errors (computed as ), in Table 1 we also list the single–parameter errors , and the degeneracy parameter , which we define as . In the limit of no degeneracies, 1, while large indicates significant degeneracy.
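The three quantities just introduced can be computed from any Fisher matrix as sketched below. This is an illustration under an explicit assumption: the degeneracy parameter is taken to be the ratio of the marginalized error to the single–parameter error, which is consistent with its stated limit of 1 in the absence of degeneracies. The example matrix is hypothetical.

```python
import numpy as np

def fisher_errors(fisher):
    """Return (marginalized errors, single-parameter errors, their ratio)."""
    fisher = np.asarray(fisher, dtype=float)
    cov = np.linalg.inv(fisher)
    marginalized = np.sqrt(np.diag(cov))         # all other parameters left free
    single = 1.0 / np.sqrt(np.diag(fisher))      # all other parameters held fixed
    return marginalized, single, marginalized / single

# Hypothetical two-parameter Fisher matrix with a strong correlation.
marg, single, degeneracy = fisher_errors([[1.0, 0.9],
                                          [0.9, 1.0]])
```

A diagonal Fisher matrix returns a ratio of exactly 1 for every parameter, while the correlated example above returns a ratio of about 2.3.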
“Complementarity” parameter, which quantifies the level of degeneracy breaking when different measurements are combined; see eq. 31 and Fang & Haiman (2008) for the formal definition.
As Table 1 shows, the scaling relation in general has a constraining power comparable to the number counts. For some of the cosmological parameters, and especially for and , the SR is even more powerful than the NC approach. This might be surprising, since the cluster abundance is known to be exponentially sensitive to cosmological parameters, while the SR depends on these parameters more–or–less “linearly”. However, the scaling relation approach has its own advantages. We compare these two approaches in more detail in § 4. Let us first see where the constraints come from in the scaling relation approach.
It is easy to see from equation (16) above that
Since we are studying the relation, we eliminate the mass from this equation by converting to and using the virial theorem and mass conservation,
Combining the above three equations, we find
Equation (30) indicates that the dependence on cosmological parameters can arise through three terms: the angular diameter distance , the gas fraction (defined in § 2.2 above), and the virial overdensity . Here and are both related to cluster properties; , on the other hand, is a direct property of space–time geometry. Below, we first study the dependence of through the grouped combination of and through . This grouping is useful because is a pure geometrical quantity, and the cosmology–dependence that arises through this quantity is likely to be quite robust. On the other hand, predicting requires a structure formation model, and the cosmology dependence through this quantity will be necessarily model dependent. In particular, while depends only on the details of nonlinear gravitational collapse, also depends on gas physics – in particular, in our case, on our assumption of hydrostatic equilibrium. For each cosmological parameter, we want to know whether these two dependencies work in the same direction, or whether they tend to cancel each other – and, in either case, it is useful to know which dependence dominates the constraints.
To answer this question, we computed separately from each cosmology–dependent term, since the Fisher matrix element (eq. 18) is proportional to this derivative. First, we allow cosmological parameters to vary when we compute , but artificially keep them at their fiducial values in the computation of and . The resulting quantifies the dependence through alone. We next allow cosmological parameters to vary in and , but we keep them fixed in ; this yields the dependence through alone. (Although we will keep referring to the combination , it is useful to clarify that for , the dependence through is always much stronger than the mild cosmology–dependence through arising from eq. (9). On the other hand, we find that is much more sensitive to and than .) The overall dependence is simply the sum of these two. We compute the above derivatives at and and the mass is set to (we did not find a strong mass dependence in the derivatives, so these numbers are typical for clusters in the whole mass range of interest).
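The procedure of varying a parameter in one term while freezing it in the others can be sketched with a toy model. Everything in the snippet is hypothetical: the two factor functions are arbitrary stand-ins for a geometric term and a cluster-structure term, the fiducial value is illustrative, and logarithmic derivatives are used for convenience. The point is only that the separately computed derivatives add up to the total.

```python
import math

P_FID = 0.3   # hypothetical fiducial value of a single cosmological parameter

def geometry(p):
    """Hypothetical stand-in for the purely geometric (distance) term."""
    return p ** -0.4

def structure(p):
    """Hypothetical stand-in for the cluster-structure terms."""
    return math.exp(1.5 * (p - P_FID))

def observable(p_geom, p_struct):
    """Toy observable: product of the two terms, each with its own copy of p."""
    return geometry(p_geom) * structure(p_struct)

def log_derivative(func, p, rel_step=1e-4):
    """Central-difference estimate of d ln(func) / d ln(p)."""
    h = rel_step * p
    return (math.log(func(p + h)) - math.log(func(p - h))) / (
        math.log(p + h) - math.log(p - h))

# Vary the parameter in one term only, holding it at the fiducial value elsewhere.
d_geom = log_derivative(lambda p: observable(p, P_FID), P_FID)
d_struct = log_derivative(lambda p: observable(P_FID, p), P_FID)
d_total = log_derivative(lambda p: observable(p, p), P_FID)
```

In this toy case the two contributions have opposite signs (about -0.40 and +0.45), so the total derivative is much smaller in magnitude than either term.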
The results of the above exercise are listed in Tables 2 and 3. As we can see from these two Tables, for each of the cosmological parameters, the derivatives through and have opposite signs (except for , to which has no sensitivity). This, unfortunately, means that the dependence from always partially cancels the dependence from . For and , the overall derivative is driven overwhelmingly by , and correspondingly the constraints come through . For and , there are significant cancellations, and the overall derivative has the same sign as that from , so the constraints come predominantly through . For and , there are again significant cancellations, and the overall constraints come predominantly through at low redshift, but through at high redshift.
Although each parameter has a dependence through , the situation is different for (, ) and for (, ). and are both directly related to , while they have a smaller or no effect on . The gas fraction is roughly proportional to the global baryon fraction . This direct dependence is the strongest among all dependencies of on cosmological parameters. The dark energy equation–of–state parameters and , on the other hand, have no direct effect on , and they mainly come into play through . A higher induces a higher , because clusters collapse earlier in such a universe (Kuhlen et al. 2005). Higher density means higher temperature for given cluster mass (eqs. 28 and 29), or conversely, lower mass for a given temperature. As a result, is reduced for a given (eq. 27). This reduction is further enhanced by the indirect effect of on through the pressure boundary condition. This can be understood by recalling that the ICM is assumed to be confined by the external pressure of the infalling gas. This pressure is proportional to (eqs. 10 and 28). When the virial density is raised, the external pressure increases roughly linearly with the temperature, which implies that the gas density is approximately kept constant (). The gas fraction, which is roughly proportional to the ratio of gas density to the virial overdensity, is therefore reduced.
Among the cosmological parameters, the relation is most sensitive to and , and constraints on these two parameters are therefore the tightest. However, they also suffer the most from a severe mutual degeneracy (see column in Table 1). This is because both parameters affect the relation in an approximately uniform way, insensitive to redshift and cluster mass. This also means that a prior on one of these two parameters from another measurement could greatly help constrain the other. For example, if we apply a prior of from the result of WMAP+BAO+SN on (Dunkley et al. 2008), is reduced by more than a factor of 3, while the constraints on other parameters change only mildly. From the simple argument that both and are constrained through and is roughly proportional to the baryon fraction , one could conclude that it is the combination that is being constrained. This conclusion is borne out by our numerical results, which show that the direction of the Fisher eigenvector in the subspace is in the direction ; i.e., the scaling relation indeed best constrains this combination. Other parameters have a comparatively smaller effect on the relation (see the “Overall” column in Tables 2 and 3 and the corresponding column in Table 1). But because the sensitivity to each parameter has a different redshift dependence, they suffer much weaker degeneracies. The parameter with the lowest degeneracy (smallest parameter) is , with , less than 1/50th of the degeneracy between and .
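Both operations used in this paragraph, finding the best–constrained parameter combination and adding an external prior, are simple manipulations of the Fisher matrix. The sketch below is illustrative only: it assumes the best–constrained direction is taken from the marginalized covariance restricted to the two parameters of interest, it treats the prior as Gaussian and independent, and the example matrix and prior width are hypothetical.

```python
import numpy as np

def best_constrained_direction(fisher, i, j):
    """Unit vector in the (i, j) parameter plane with the smallest marginalized
    error, and that error (other parameters are marginalized over)."""
    cov = np.linalg.inv(np.asarray(fisher, dtype=float))
    sub_cov = cov[np.ix_([i, j], [i, j])]
    eigvals, eigvecs = np.linalg.eigh(sub_cov)      # eigenvalues in ascending order
    return eigvecs[:, 0], np.sqrt(eigvals[0])

def add_gaussian_prior(fisher, i, sigma_prior):
    """Add an independent Gaussian prior of width sigma_prior on parameter i."""
    out = np.array(fisher, dtype=float)
    out[i, i] += 1.0 / sigma_prior**2
    return out

# Hypothetical Fisher matrix for two nearly degenerate parameters.
fisher = np.array([[1.0, 0.9],
                   [0.9, 1.0]])
direction, sigma_best = best_constrained_direction(fisher, 0, 1)

err_before = np.sqrt(np.linalg.inv(fisher)[0, 0])
err_after = np.sqrt(np.linalg.inv(add_gaussian_prior(fisher, 1, 0.1))[0, 0])
```

In this example a tight prior on the second parameter shrinks the marginalized error on the first from about 2.3 to about 1.0, illustrating how a prior on one member of a degenerate pair helps constrain the other.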
[Table: parameter constraints; columns: Fiducial, 1 bin, No Scatter]
Analogous to the discussion above, previous works have clarified the cosmology–dependence of the cluster number counts. We refer the reader to, e.g., Haiman et al. (2001) for a detailed discussion; here we just emphasize two points. First, the number counts also include a cosmology–dependence from the relation, through the selection function . The number–count constraint on is driven through this dependence, but the constraints on other parameters are dominated by either the cosmological volume element or the growth function (except the dependence, which is dominated by the explicit linear scaling of the cluster mass function with ). Second, we not only have multiple redshift bins, but also multiple –bins. This helps significantly in constraining cosmological parameters (analogously to the shape of the cluster mass function being helpful; e.g., Hu 2003). In Table 4 below, we present constraints both with and without binning in . Comparing these two cases, we see that the parameter changes significantly, while has only a mild change. This shows that the tightening of the constraints in the case when 8 bins are used is achieved mainly via breaking degeneracies between different parameters. In the same table, we also list constraints when the scatter between and is set to zero. The constraints become more stringent, which highlights the degrading effect of the scatter through flattening the mass function. A small scatter in the mass–observable relation (10% as assumed in this work) is seen to degrade constraints by a factor of up to 4 (the parameter ). Again, we see that this is mainly due to higher degeneracies in the presence of the scatter.
3.2 Constraints with cosmological and cluster parameters
Below, we consider the more “realistic” case, in which we take into account uncertainties in cluster structure and its evolution. In § 2.2, we have parameterized cluster structure and evolution with 15 parameters, characterizing various aspects of ICM physics, namely the shape of the gravitational potential, the gas entropy, non-thermal pressure, and boundary condition, as well as the mass– and redshift–dependence of these parametrized quantities. We repeat the above analysis, but now also include these cluster structure parameters, which means we constrain 23 parameters simultaneously. The results of this exercise are shown in Table 5. Within the parentheses next to the errors on the cosmological parameters, we list the ratio of errors, =, where is the idealized constraint shown in Table 1, and is the new value in Table 5. This ratio therefore quantifies the degradation of the constraint introduced by the cluster parameter uncertainties. For the majority of parameters, the degradation is less than a factor of 2, and the constraints remain tight, despite the large increase in parameter space (this is true for both the scaling relation and number counts approaches). We emphasize again that we simplified things by assuming a particular form (eq. 11) for the mass and redshift dependence of cluster parameters; nevertheless, these results highlight the ability of upcoming surveys to constrain a large number of parameters, which is due essentially to the large number of clusters and therefore small statistical errors.
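The degradation ratio described above can be computed directly from a joint Fisher matrix, as in the following sketch. It assumes the cosmological parameters occupy the first rows and columns, and it treats "idealized" as holding the remaining (cluster) parameters fixed, which corresponds to using only the cosmological sub-block. The 3×3 matrix is a hypothetical example, not a result of this work.

```python
import numpy as np

def degradation(fisher_full, n_cosmo):
    """Ratio of marginalized errors on the first n_cosmo parameters when the
    remaining (nuisance) parameters are left free versus held fixed."""
    fisher_full = np.asarray(fisher_full, dtype=float)
    free = np.sqrt(np.diag(np.linalg.inv(fisher_full))[:n_cosmo])
    fixed = np.sqrt(np.diag(np.linalg.inv(fisher_full[:n_cosmo, :n_cosmo])))
    return free / fixed

# Hypothetical joint matrix: two cosmological parameters plus one cluster parameter.
fisher_full = np.array([[4.0, 1.0, 2.0],
                        [1.0, 3.0, 0.0],
                        [2.0, 0.0, 5.0]])
ratios = degradation(fisher_full, n_cosmo=2)
```

The ratio is never below 1, and it equals 1 exactly when the cosmological and nuisance blocks are uncorrelated.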
For the number counts approach, the largest degradation is on . This is understandable, because unlike for other cosmological parameters, the constraint on , as we have mentioned, is largely from the relation in eq. (2.4), not from the mass function itself, and all of the 15 cluster parameters affect the relation. This could account for the large degeneracy between and the cluster parameters. Of course, is measured accurately by other methods (Kirkman et al. 2003; Dunkley et al. 2008), and this degradation is not a concern. In the scaling relations, however, the largest degradation is suffered by . This is because the simple power–law parameterization in eq. 11 happens to be close to the way affects the evolution of the scaling relation. This large degeneracy between the DE equation–of–state parameters and cluster parameters suggests that the cluster constraints on and will be especially useful when combined with independent measurements of these parameters using other probes.
Table 5 shows further that the constraints on cluster parameters are, in general, quite weak from both approaches, with most constraints at the order–unity level. This, conversely, indicates that the parameter is relatively insensitive to cluster parameter variations. We see from Table 5 that the single–parameter errors for the cluster parameters are very low, showing that the weakness of these constraints is due to strong degeneracies among the cluster model parameters. One likely reason for this strong degeneracy is that we adopted the same power–law form for the mass– and redshift–evolution for each of the cluster parameters. Indeed, we find that introducing even unrealistically tight priors on the cosmological parameters does not improve the cluster parameter constraints significantly, indicating that the degeneracies are among the cluster parameters themselves. We thus conclude that the relation by itself is not a good way of placing precise constraints on individual cluster parameters, unless the mass–dependence and redshift–evolution of the physical parameters can be understood a–priori, and they differ significantly from the power–law forms assumed here. Of course, the relation still delivers tight constraints on cluster–parameter combinations, so it should be useful when combined with other cluster observables.
3.3 Effects of scatter and completeness uncertainties
In addition to the uncertainties in cluster structure, there are also uncertainties in scatter and completeness. The scaling relation test is affected by scatter and incompleteness only indirectly through Malmquist bias. This bias is the increase in the mean value of at fixed , because the lowest– clusters that are scattered below the detection threshold are missing from the sample. We find that the scatter changes the number of detectable clusters in each redshift bin by less than 1% (except in the three bins beyond , where the changes are between 1-2%), and the mean is changed by a similar amount. This is comparable to the change in caused by variations in our parameters within their marginalized uncertainties (see Tables 2, 3 and 1). However, since flux–limited surveys can apply a correction that should eliminate the bulk of the Malmquist bias, we believe this will not be a major limitation of the constraints.
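The size of the Malmquist bias can be illustrated with the mean of a truncated Gaussian. The sketch below is a toy calculation, not the estimate quoted above: it assumes Gaussian scatter in the observable at fixed temperature and a sharp threshold, and the numbers (unit mean, 10% scatter, threshold at 0.8) are hypothetical.

```python
import math

def truncated_mean(mu, sigma, threshold):
    """Mean of a Gaussian(mu, sigma) variable restricted to values above
    `threshold`, together with the fraction of objects retained."""
    a = (threshold - mu) / sigma
    pdf = math.exp(-0.5 * a * a) / math.sqrt(2.0 * math.pi)
    retained = 0.5 * math.erfc(a / math.sqrt(2.0))
    return mu + sigma * pdf / retained, retained

# Hypothetical case: unit mean signal, 10% scatter, threshold two sigma below the mean.
biased_mean, retained = truncated_mean(1.0, 0.10, 0.80)
fractional_bias = biased_mean - 1.0
```

With these numbers about 2% of the objects are lost and the mean of the surviving sample rises by roughly 0.6%; the bias vanishes as the threshold moves far below the mean.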
The effects of scatter and completeness uncertainties on the number counts constraints are somewhat more subtle. In this section, we allow both the scatter and the completeness to vary, together with the cosmological and cluster parameters. The scatter between and is parameterized using the same power–law form as the cluster parameters (eq. 11). The completeness , defined as the fraction of clusters at a given at redshift that are detected, is taken to be given by a similar power–law, except is replaced by in equation (11), where is chosen to be mJy.
The fiducial scatter is assumed to be 10%, and the fiducial completeness is set to be 100% (both independent of redshift and mass, except completeness is set to zero below ). Table 6 shows the results when scatter is included and when both scatter and incompleteness are included. For comparison, we also list the result when neither uncertainty is taken into account (repeating the NC column from Table 5). The degradations are relatively small for most parameters. The two exceptions are and , for which the number-count constraints degrade by a factor of when the completeness uncertainty is included. The impact of the completeness uncertainty is relatively modest, because our treatment assumes that we know the form of the dependence of on and (i.e. power–laws in our case). At the opposite extreme, if one allows completeness to be an arbitrary function of mass and redshift, then of course no constraint can be derived on any model parameter. The fact that we still find interesting constraints shows that a reliable parameterization of the completeness as a function of and will be very important. We also find that all of the constraints would recover their values (to within 30%) in the fixed 1 case when a prior of 15% is applied to . Finally, in Table 6, we also list the combined constraints from the scaling relations and the number counts. Comparing these values with the combined constraints listed in Table 5, we find that these constraints are less affected by incompleteness than those from the number counts approach alone. Except for , which degrades by a factor of , the constraints all degrade by factors of .