
# Detecting the polarization induced by scattering of the microwave background quadrupole in galaxy clusters

## Abstract

We analyse the feasibility of detecting the polarization of the CMB caused by scattering of the remote temperature quadrupole by galaxy clusters with forthcoming CMB polarization surveys. For low-redshift clusters, the signal is strongly correlated with the local large-scale temperature and polarization anisotropies, and the best prospect for detecting the cluster signal is via cross-correlation. For high-redshift clusters, the correlation with the local temperature is weaker and the power in the uncorrelated component of the cluster polarization can be used to enhance detection. We derive linear and quadratic maximum-likelihood estimators for these cases, and forecast signal-to-noise values for the SZ surveys of a Planck-like mission and SPTPol. Our estimators represent an optimal ‘stacking’ analysis of the polarization from clusters. We find that the detectability of the effect is sensitive to the cluster gas density distribution, as well as the telescope resolution, cluster redshift distribution, and sky coverage. We find that the effect is too small to be detected in current and near-future SZ surveys without dedicated polarization follow-up, and that an r.m.s. noise on the Stokes parameters of roughly 1 μK-arcmin for each cluster field is required for a 2σ detection, assuming roughly 550 clusters are observed. We show that ACTPol is in a better position to observe the effect than SPTPol due to its advantageous survey location on the sky, and we discuss the novel spatial dependence of the signal. We discuss and quantify potential biases from the kinetic part of the signal caused by the relative motion of the cluster with respect to the CMB, and from the background CMB polarization behind the cluster, discussing ways in which these biases might be mitigated. Our formalism should be important for next-generation CMB polarization missions, which we argue will be able to measure this effect with high signal-to-noise. This will allow for an important consistency test of the ΛCDM model on scales that are inaccessible to other probes.

## I Introduction

Inverse-Compton scattering of cosmic microwave background (CMB) photons by free electrons in galaxy clusters gives rise to spectral distortions, the Sunyaev-Zeldovich (SZ) effect (Sunyaev and Zeldovich, 1972). This effect has been used extensively to identify clusters from maps of the CMB temperature anisotropies, with current surveys able to detect hundreds of clusters (Hasselfield et al., 2013; Reichardt et al., 2013; Planck Collaboration et al., 2013).

Scattering also imparts linear polarization into the CMB, with an amplitude proportional to the radiation quadrupole in the rest-frame of the scattering electron (Zeldovich and Sunyaev, 1980; Sunyaev and Zeldovich, 1980, 1981). The effect in clusters has not yet been observed, but its detection could potentially allow one to measure the CMB quadrupole at the spacetime location of the cluster, providing an indirect snapshot of the Universe within our past light-cone (Kamionkowski and Loeb, 1997). We know that our local quadrupole is low compared to the best-fit ΛCDM model, so this technique could be used to compare our value with the value in other parts of the Universe. It can also be used to make measurements of the large-scale density field that are independent of those provided by other cosmological probes (Bunn, 2006).

With redshift information from optical observations, the cluster polarization could be used to measure the time-evolution of the quadrupole, which is sensitive to late-time structure formation through the integrated Sachs-Wolfe (ISW) effect, hence providing a window on dark energy (Cooray and Baumann, 2003; Cooray et al., 2005).

In this work, we discuss how this signal can be first detected in the optimal way. Current and near-future CMB polarization experiments such as Planck, ACTPol and SPTPol, as well as the proposed next-generation mission PRISM (PRISM Collaboration et al., 2013), will provide high-quality measurements of polarization in the microwave sky, so a detailed study appears timely. We construct optimal estimators for the effect, quantify the impact of the major systematic errors and propose techniques to mitigate them, and forecast signal-to-noise estimates for current and future surveys. Our analysis differs from previous studies in that we focus on the detection of the signal rather than its statistics, which are well understood on the cosmological scales relevant for CMB experiments (Portsmouth, 2004; Amblard and White, 2005; Bunn, 2006; Ramos et al., 2012). Detection of the cluster polarization signal with the expected amplitude would be a non-trivial test of the cosmological model on the largest observable scales.

However, there are several complications that must be overcome before the primordial quadrupole can be measured. Firstly, if the cluster is in relative motion with respect to the CMB rest-frame, Doppler and aberration effects will produce polarization even in the absence of a primordial quadrupole (Sunyaev and Zeldovich, 1980; Sazonov and Sunyaev, 1999; Challinor et al., 2000). The dominant effect for typical cluster velocities and gas temperatures is the transformation of the CMB rest-frame monopole into a quadrupole in the electron rest-frame. For unpolarized incident radiation, this gives linear polarization in the plane defined by the quadrupole, and hence the observed polarization is proportional (at leading order) to the square of the transverse velocity of the cluster with respect to the line of sight. We shall refer to this effect as the kinetic part of the signal.

The kinetic part has a unique frequency-dependence, a consequence of the Lorentz transformation between the electron rest-frame and the CMB rest-frame. In contrast, the primordial part has the same frequency-dependence as the temperature quadrupole, i.e., the first derivative of a blackbody. This is a consequence of Thomson scattering not altering the energy of the incident photon (for higher-order and relativistic corrections, see (Challinor et al., 2000; Itoh et al., 2000)). The opposite is true for the usual SZ effect, where scattering of the primordial CMB by the distribution of electrons in the intra-cluster medium (ICM; the thermal SZ effect) does give rise to a spectral distortion, which is how the cluster may be identified. However, the part of the signal due to the cluster’s bulk velocity (the kinetic SZ effect) has the same frequency-dependence as the incident radiation. Moreover, the amplitude of this effect is proportional to the radial cluster bulk velocity. Detections of this effect have recently been made statistically by the Atacama Cosmology Telescope (Hand et al., 2012), and directly towards a massive cluster with Bolocam measurements (Sayers et al., 2013).

Fortunately, for typical cluster velocities ( ) the amplitude of the kinetic polarization signal is roughly a factor of 10 smaller than the primordial part at typical observing frequencies. In addition, the unique frequency dependence of this effect should allow it to be distinguished from the primordial signal if multi-frequency data are available.

An additional systematic comes from the background CMB polarization, which has the same frequency-dependence as the cluster quadrupole signal, but a different correlation structure. This could potentially bias the measurement if not properly accounted for. This background is also correlated with the temperature quadrupole that generates the signal itself, which further complicates the analysis. Compton scattering also has a dependence on the incident polarization leading to an additional correlation of the signal with the background CMB.

Since clusters are optically thin to Compton scattering, the generated polarization is linearly proportional to the optical depth through the cluster. To estimate the amplitude of the primordial signal it is necessary to know the cluster optical depth profile. This may be extracted from X-ray surface brightness observations if the temperature profile is known, which requires high-resolution spectroscopic X-ray measurements (see e.g. Vikhlinin et al. (2006)). However, X-ray measurements are only available beyond the virial radius for a few systems, leading to uncertainty in the gas profile and hence optical depths in the cluster outskirts. At larger radii we may appeal to simulations, but these show a high degree of scatter in the gas profile of the outer region, dependent on cluster mass (Sijacki et al., 2007). Thus our inferences of optical depths beyond a few virial radii are likely to be unreliable. In principle, relativistic corrections to the thermal SZ effect could help to constrain optical depths without the use of X-ray information, but these corrections are likely to be small for most of the clusters we consider (Challinor and Lasenby, 1998; Itoh et al., 1998).

Further potential systematics are provided by polarized foreground contamination, as in conventional CMB polarization analyses, and polarized emission intrinsic to the cluster itself. We will not attempt to address foregrounds in this work, since the techniques required to remove them are the same as in conventional analyses of small-scale CMB anisotropies (e.g., Dunkley et al. (2009)). We briefly discuss the impact of intrinsic polarization in Sec. VIII.

Despite these obstacles, the signal we are trying to detect has a well-understood scale-dependence. At low redshift, the signal is proportional to the local quadrupole, and hence is correlated over very large angular scales. The signal from high-redshift clusters contains more angular structure when observed today due to free-streaming, but for typical survey depths the large-scale nature of the signal is preserved. An SZ polarization survey then amounts to extracting noisy, sparse samples from a 3D field with a large correlation length. In addition, the signal is correlated with the local temperature and polarization anisotropies. These properties must be used in an optimal way to overcome the problems alluded to above.

The plan of this paper is as follows. In Sec. II we discuss the physics of the signal we seek to measure, and discuss its statistics. In Sec. III we write down the likelihood function for the signal and derive the maximum-likelihood estimator for the amplitude of the signal for low-redshift cluster surveys. In Sec. IV we discuss our survey and cluster catalogue assumptions, and forecast estimates. We discuss potential biases in Secs. V and VI. In Sec. VII we generalize our estimator to high-redshift cluster surveys, and we conclude in Sec. VIII. We derive the harmonic-space representations of the cluster quadrupole signal in Appendix A and the kinetic signal in Appendix B.

The fiducial model we assume in calculating the statistics of the signal is a flat ΛCDM universe with parameters {} given by {}. We neglect the effects of massive neutrinos and dynamical dark energy, and consider only scalar adiabatic initial conditions evolved with linear perturbation theory.

## II Polarization observed towards galaxy clusters

### II.1 Thomson scattering of the CMB

In this section, we review the calculation of the generation of polarization from Thomson scattering (see, e.g., (Hu and White, 1997)). We will consider the single scattering of a CMB photon off a free electron in the electron’s rest-frame, neglecting the effects of electron recoil.

Polarization is most easily handled by use of the Stokes parameters $Q$ and $U$. These are elements of a symmetric trace-free tensor $P_{ab}$, which is defined in the two-space orthogonal to both the propagation direction and the observer’s four-velocity. Specifically we have

$$P_{11} = Q/2, \qquad P_{12} = P_{21} = U/2, \qquad P_{22} = -Q/2, \tag{1}$$

where the indices refer to components on an orthonormal tetrad with 3-direction aligned with the propagation direction.

We will simplify the scattering geometry by setting up a spherical coordinate system centred on the scattering electron. We can define a local orthonormal right-handed basis on the sphere with the standard basis vectors , where is the radial unit vector giving the direction of photon propagation.

A quantity has spin if, under the transformation

$$\hat{\theta} + i\hat{\phi} \rightarrow (\hat{\theta} + i\hat{\phi})\, e^{i\gamma}, \tag{2}$$

we have . In our coordinate system, the complex polarization thus has spin , and has spin .
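As a quick numerical illustration of this spin property, the sketch below builds the tensor of Eq. (1) in components, contracts it twice with the complex vector $\hat{\theta} + i\hat{\phi}$, and applies the basis change of Eq. (2). It assumes NumPy is available and adopts the convention that a spin-$s$ quantity acquires a factor $e^{is\gamma}$ under Eq. (2); the function names and the numerical values are illustrative only.

```python
import numpy as np

def stokes_tensor(Q, U):
    """Symmetric trace-free polarization tensor P_ab of Eq. (1)."""
    return 0.5 * np.array([[Q, U], [U, -Q]])

def complex_polarization(P, m):
    """Contract P_ab twice with the complex basis vector m (no conjugation)."""
    return m @ P @ m

# Illustrative values only.
Q, U, gamma = 0.7, -0.3, 0.4
P = stokes_tensor(Q, U)

m = np.array([1.0, 1.0j])           # components of theta_hat + i phi_hat
m_rot = m * np.exp(1j * gamma)      # basis change of Eq. (2)

p = complex_polarization(P, m)          # equals Q + iU
p_rot = complex_polarization(P, m_rot)  # equals exp(2 i gamma) (Q + iU)
```

With this contraction the combination $Q + iU$ picks up a phase $e^{2i\gamma}$, while its modulus, the polarized amplitude, is unchanged by the choice of basis.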

The photon is incident along a direction $\mathbf{e}_1$, is scattered through an angle $\beta$ and leaves along a direction $\mathbf{e}_2$, with $\cos\beta = \mathbf{e}_1 \cdot \mathbf{e}_2$. We denote the increment of physical distance along the line of sight by $dl$. We set up polarization bases at $\mathbf{e}_1$ and $\mathbf{e}_2$ that have their local $x$-axes in the scattering plane and local $y$-axes perpendicular to it. We denote with an overbar the polarization defined with respect to these bases. For unpolarized incident radiation of temperature $T(\mathbf{e}_1)$, the outgoing photon has $\bar{U} = 0$ by symmetry, and

$$d\bar{Q}(\mathbf{e}_2) = -\frac{3}{16\pi}\, n_e \sigma_T\, dl\, \sin^2\beta\, T(\mathbf{e}_1)\, d\mathbf{e}_1, \tag{3}$$

where $n_e$ is the free electron density of the medium, and $\sigma_T$ is the Thomson cross-section (e.g., Hu and White, 1997).

Define the angle as the angle required to rotate (in a right-handed sense) the basis at onto the scattering-plane basis there. Let the angle be the corresponding quantity at . The angles form a set of Euler angles that rotate the polar basis at onto that at .

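Performing the reverse rotation at $\mathbf{e}_2$ from the scattering plane onto the spherical coordinate basis, we have

$$d(Q \pm iU) = -\frac{3}{16\pi}\, n_e \sigma_T\, dl\, \sin^2\beta\, e^{\pm 2i\gamma_2}\, T(\mathbf{e}_1)\, d\mathbf{e}_1. \tag{4}$$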

We can express $\sin^2\beta$ in terms of the elements of a Wigner $D$-matrix (Varshalovich et al., 1988)

$$\sin^2\beta = \sqrt{\frac{8}{3}}\, D^2_{0\,\pm 2}(\gamma_1, \beta, -\gamma_2)\, e^{\mp 2i\gamma_2}. \tag{5}$$

We now make use of the addition theorem for spin-weighted spherical harmonics,

$$D^l_{s s'}(\gamma_1, \beta, -\gamma_2) = \sum_m \frac{4\pi}{2l+1}\, {}_{s}Y^*_{lm}(\mathbf{e}_1)\, {}_{s'}Y_{lm}(\mathbf{e}_2). \tag{6}$$

Using this, and the relation between Wigner matrix elements and spin harmonics

$$D^l_{-m\,s}(\phi, \theta, 0) = (-1)^m \sqrt{\frac{4\pi}{2l+1}}\, {}_{s}Y_{lm}(\mathbf{e}), \tag{7}$$

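we have for the outgoing polarization along $\mathbf{e}_2$

$$d(Q \pm iU) = -\frac{3}{16\pi}\, n_e \sigma_T\, dl\, T(\mathbf{e}_1)\, d\mathbf{e}_1\, \sqrt{\frac{8}{3}} \sum_m \frac{4\pi}{5}\, Y^*_{2m}(\mathbf{e}_1)\, {}_{\pm 2}Y_{2m}(\mathbf{e}_2). \tag{8}$$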

This has the form of a spin $\pm 2$ expansion, confirming that $Q \pm iU$ is spin $\pm 2$ with our current conventions. Integrating over all incoming directions we have

$$d(Q \pm iU) = -\frac{1}{10}\, d\tau \sum_m \sqrt{6}\, a_{2m}\, {}_{\pm 2}Y_{2m}(\mathbf{e}_2), \tag{9}$$

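where we have expanded the temperature of the incident distribution in spherical harmonics, $T(\mathbf{e}_1) = \sum_{lm} a_{lm} Y_{lm}(\mathbf{e}_1)$, and identified $n_e \sigma_T\, dl$ with an increment in the optical depth $d\tau$. Equation (9) should be compared with the general expansion of the polarization in spin-$\pm 2$ harmonics

$$(Q \pm iU)(\mathbf{e}) = \sum_{lm} (E_{lm} \pm iB_{lm})\, {}_{\pm 2}Y_{lm}(\mathbf{e}). \tag{10}$$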

Under a parity transformation, , so has (electric) parity and has (magnetic) parity. The temperature transforms as , so has electric parity. Therefore the polarization is locally of the purely electric quadrupole type. Equation (9) also makes it clear that polarization from Thomson scattering is generated by quadrupolar anisotropies in the incident radiation field (Chandrasekhar, 1950).
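The numerical prefactor of Eq. (9) follows from combining the factors in Eqs. (3), (5) and (6) for $l = 2$. The short check below is a sketch of that arithmetic only; the variable names are illustrative, and the closed form used for the reduced Wigner element, $d^2_{0\,\pm2}(\beta) = \sqrt{3/8}\,\sin^2\beta$, is the standard result rather than something taken from this section.

```python
import math

# Factors appearing in the derivation of Eq. (9).
prefactor = -3.0 / (16.0 * math.pi)        # Thomson factor, Eq. (3)
wigner_norm = math.sqrt(8.0 / 3.0)         # from Eq. (5)
addition_theorem = 4.0 * math.pi / 5.0     # 4*pi/(2l+1) with l = 2, Eq. (6)

coefficient = prefactor * wigner_norm * addition_theorem
expected = -math.sqrt(6.0) / 10.0          # coefficient quoted in Eq. (9)

def d2_02(beta):
    """Reduced Wigner element d^2_{0,+-2}(beta) = sqrt(3/8) sin^2(beta)."""
    return math.sqrt(3.0 / 8.0) * math.sin(beta) ** 2

beta = 0.9  # arbitrary scattering angle in radians
sin2_from_wigner = wigner_norm * d2_02(beta)   # modulus of Eq. (5)
```

Both `coefficient` and `expected` evaluate to $-\sqrt{6}/10$, and `sin2_from_wigner` reproduces $\sin^2\beta$, which is the content of Eq. (5) once the phase factors cancel.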

We now need to relate the polarization carried away by the photon in direction to the polarization measured by the observer. We will set up a new spherical coordinate system centred on the observer, who observes the radiation along a line of sight . A local right-handed basis can be set up on the sphere with the 3-direction aligned with the propagation direction if we choose the local basis as . However, some care must be taken as on this new basis is spin (Chon et al., 2004). The observed polarization is obtained by performing a parity transformation on Eq. (9), since the new basis has had its local 3-direction reversed.

Since the cluster is optically thin, we can neglect any further scattering of the photon. Integrating through the cluster gives the observed complex polarization

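$$(Q \pm iU)(\hat{\mathbf{n}}) = -\frac{\sqrt{6}}{10}\, \tau(\hat{\mathbf{n}}) \sum_{m=-2}^{2} a_{2m}(r)\, {}_{\mp 2}Y_{2m}(\hat{\mathbf{n}}), \tag{11}$$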

where denotes the CMB quadrupole at the location of the cluster. This expression matches Eq. (7) of (Bunn, 2006), up to an arbitrary overall sign. Note that for polarized incident radiation, in the above expression should be replaced by where is the -mode quadrupole. However, the correction is very small for scattering well after recombination.

Equation (11) tells us that the observed polarization is purely quadrupolar for a very low-redshift cluster. However, to calculate the angular structure of the signal due to scattering at higher redshift, we must expand in spin spherical harmonics. The calculation appears in (Hu and White, 1997; Bunn, 2006), so we will not reproduce it here. The result is

$$(Q \pm iU)(\hat{\mathbf{n}})\, \tau^{-1}(\hat{\mathbf{n}}) = \sum_{lm} p_{lm}(r)\, {}_{\mp 2}Y_{lm}(\hat{\mathbf{n}}), \tag{12}$$

where and the coefficients are given by

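$$p_{lm}(r) = -i^l\, 3\pi \sqrt{\frac{(l+2)!}{(l-2)!}} \int \frac{d^3k}{(2\pi)^{3/2}}\, \frac{j_l(kr)}{(kr)^2}\, \Delta_2(k; r)\, \Phi(\mathbf{k})\, Y^*_{lm}(\hat{\mathbf{k}}). \tag{13}$$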

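Here, $\Delta_2(k; r)$ is the transfer function relating the gravitational potential in matter-domination, $\Phi(\mathbf{k})$, to the temperature quadrupole at conformal time $\eta$, given by

$$\Delta_2(k; r) = \frac{1}{3}\, j_2[k(\eta - \eta_*)] + 2 \int_{\eta_*}^{\eta} d\eta'\, j_2[k(\eta - \eta')]\, \frac{\partial}{\partial \eta'} \left[ \frac{D(\eta')}{a(\eta')} \right]. \tag{14}$$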

The conformal time today (where ) is and is the conformal time at last scattering. The growth factor of the comoving-gauge matter perturbation, , is normalized to unity at high redshift in matter domination. We have assumed that contains contributions from the Sachs-Wolfe effect and the ISW effect only, i.e., we neglect the Doppler term whose contribution to the quadrupole is sub-dominant. It is easy to check from Eq. (13) that , and that the projection of the quadrupole preserves the -mode nature of the signal.

### II.2 Correlation functions

We assume that the temperature quadrupole can be calculated within linear theory, which should be a very accurate approximation. We also assume that the initial fluctuations are Gaussian and that temperature anisotropies are generated by scalar curvature perturbations.

In harmonic space, we can define the angular power spectra of the polarization field , as well as its cross-correlation with the local CMB temperature anisotropies and background CMB polarization anisotropies :

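$$\langle p_{lm}(r)\, p^*_{l'm'}(r') \rangle = \delta_{ll'} \delta_{mm'}\, \xi_l(r, r'), \tag{15}$$
$$\langle p_{lm}(r)\, a^*_{l'm'} \rangle = \delta_{ll'} \delta_{mm'}\, \zeta^T_l(r), \tag{16}$$
$$\langle p_{lm}(r)\, E^*_{l'm'} \rangle = \delta_{ll'} \delta_{mm'}\, \zeta^E_l(r), \tag{17}$$
$$\langle a_{lm}\, a^*_{l'm'} \rangle = \delta_{ll'} \delta_{mm'}\, C^{TT}_l, \tag{18}$$
$$\langle E_{lm}\, E^*_{l'm'} \rangle = \delta_{ll'} \delta_{mm'}\, C^{EE}_l, \tag{19}$$
$$\langle a_{lm}\, E^*_{l'm'} \rangle = \delta_{ll'} \delta_{mm'}\, C^{TE}_l. \tag{20}$$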

Note that parity invariance in the mean ensures vanishing correlation of the cluster polarization signal with $B$-mode polarization.

In real space, polarization correlation functions can be defined in a way that makes them independent of the coordinate system used to define the Stokes parameters. Recall that an overbar denotes a quantity evaluated on the ‘physical’ basis defined by the geodesic connecting the two points of interest on the sphere. We may then define the correlation functions (Kamionkowski et al., 1997; Ng and Liu, 1999)

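$$\Xi_-(\beta, r, r') \equiv \langle \bar{p}(\hat{\mathbf{n}}_1; r)\, \bar{p}(\hat{\mathbf{n}}_2; r') \rangle = \sum_l \frac{2l+1}{4\pi}\, \xi_l(r, r')\, d^l_{2\,-2}(\beta), \tag{21}$$
$$\Xi_+(\beta, r, r') \equiv \langle \bar{p}^*(\hat{\mathbf{n}}_1; r)\, \bar{p}(\hat{\mathbf{n}}_2; r') \rangle = \sum_l \frac{2l+1}{4\pi}\, \xi_l(r, r')\, d^l_{2\,2}(\beta), \tag{22}$$
$$\langle \bar{p}(\hat{\mathbf{n}}_1; r)\, T(\hat{\mathbf{n}}_2) \rangle = \sum_l \frac{2l+1}{4\pi}\, \zeta^T_l(r)\, d^l_{2\,0}(\beta), \tag{23}$$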

where $d^l_{mm'}(\beta)$ are elements of Wigner reduced matrices, and we have defined . Equations (21) and (22) correct Eqs. (21) and (22) of (Bunn, 2006), which did not properly account for the spin-2 nature of polarization. The measured polarizations are given by the rotated quantities

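$$p(\hat{\mathbf{n}}_1) = e^{-2i\gamma_1}\, \bar{p}(\hat{\mathbf{n}}_1), \tag{24}$$
$$p(\hat{\mathbf{n}}_2) = e^{-2i\gamma_2}\, \bar{p}(\hat{\mathbf{n}}_2). \tag{25}$$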

The final ingredient required to calculate these two-point functions is the primordial power spectrum of $\Phi$. This is given by

$$\langle \Phi(\mathbf{k})\, \Phi^*(\mathbf{k}') \rangle = P_\Phi(k)\, \delta^{(3)}(\mathbf{k} - \mathbf{k}'), \tag{26}$$

with where the scalar spectral index .

In Fig. 1 we plot the auto-spectrum of the cluster polarization signal at three different redshifts, and in Figs. 2 and 3 we plot the cross-spectrum with the local () CMB temperature and -mode anisotropies. For plots of the cross-correlation between different redshifts, see (Bunn, 2006). We also plot the correlation coefficient, defined as , where . The dependence of the spectra in these plots on and is a combination of several effects. Firstly, at low redshift, the signal is almost entirely quadrupolar, with significant suppression of all the spectra at higher . At higher redshift, the coherence scale of the quadrupole at the cluster redshift subtends smaller angular scales on the sky today, which boosts power in the higher multipoles, a consequence of free-streaming. Secondly, the power in the quadrupole itself depends on redshift through the ISW term (and, weakly, through non scale-invariance of the primordial power spectrum). The onset of dark energy domination at causes potentials to decay, sourcing the ISW effect. The quadrupole power is therefore larger at lower redshifts, which explains why and decrease with increasing redshift. We can see this more clearly in Fig. 4, where we plot the correlation function , defined in Eq. (22), at zero lag and equal redshift. Note that this is given by , and hence is closely related to the integral of , c.f. Fig. 1. This quantity clearly rises at low redshift, due to the time-dependence of the quadrupole.

The cluster polarization signal also has a correlation with the local -mode through . At large angular scales, the CMB -mode is sourced by the scattering of the temperature quadrupole at reionization. The physics is exactly the same as that described above for galaxy clusters, except that the scattering probability (the visibility function of Thomson scattering) decays more slowly with redshift rather than being essentially a delta function in the cluster core. Since reionization occurs at high redshift (), the quadrupole coherence scale at reionization subtends smaller angular scales than at cluster redshifts. The greatest correlation will therefore be with polarization from high-redshift clusters. This behaviour can be clearly discerned in Fig. 3, where the trend (for ) is for the correlation coefficient to be greater for clusters at higher redshift.

## III Likelihood Analysis and Optimal Estimator

In this section we construct the maximum-likelihood estimator for the cluster polarization signal, and discuss its statistics. We should begin by justifying the claim implicit in this section, viz. that a statistical treatment of the signal is necessary for detecting the quadrupole signal in forthcoming CMB polarization surveys. The magnitude of the local temperature quadrupole is roughly (Planck collaboration et al., 2013), so a cluster at would source polarization of order in a pixel with optical depth . Optical depths through the centres of clusters can be as high as , so the polarization signal from these regions would be around . A typical low-redshift cluster has an angular radius (taken as where is the radius at which the enclosed density is times the critical density at the cluster redshift) of roughly 10 arcmin. Ignoring the rapid fall-off of the optical depth with angular radius, and the effects of finite instrument resolution, a polarization survey with sensitivity of - for and would give a signal-to-noise () around unity for a single cluster. Note that the polarization noise level of Planck at 143 GHz for the full mission is expected to be around -, and around - for the SPTPol survey at 150 GHz (Austermann et al., 2012). We have also assumed that the cluster is at a ‘typical’ point on the sphere, whereas in fact the magnitude of the polarization varies with angular position. As we shall see, even this estimate is optimistic due to the assumption of constant optical depth across the cluster.

This back-of-the-envelope calculation tells us that the effect will not be detectable with forthcoming surveys in individual clusters, and so we will have to approach the problem statistically by stacking clusters appropriately. However, it is not immediately obvious as to how to do this optimally, as the signal is correlated over the sky, and we have to account for the relative orientations of the polarization basis at each point on the sphere. We will attempt to address these issues in the rest of this section.

### III.1 The conditional likelihood function

An obvious first step in the programme to use cluster polarization as a cosmological observable is to check that the signal is present with the expected amplitude. Our main objective in this paper is to assess the significance of this test with forthcoming surveys. To this end, we parameterise the signal with an amplitude parameter $\alpha$, such that the observed polarization in a direction $\hat{\mathbf{n}}$ is modelled as

$$d(\hat{\mathbf{n}}) = \alpha\, \tau(\hat{\mathbf{n}})\, p(\hat{\mathbf{n}}) + n(\hat{\mathbf{n}}), \tag{27}$$

where is given by Eq. (12) and is a Gaussian instrumental noise contribution with zero mean and covariance . The parameter takes a fiducial value of unity. Estimating then requires us to write down a likelihood for the data. But since we do not know the values of , we will have to marginalise over it. The resultant distribution will be Gaussian with a variance which is the sum of the noise variance and the cosmic variance given by .

However this cosmic variance is potentially very large, since the signal is on large scales. The variance on the measured is correspondingly large and the test of concordance with the expected value is weak. The situation is analogous to that of the large-scale CMB. We can make very accurate measurements of our temperature quadrupole, and yet our measurement of the true (ensemble-averaged) quadrupole, given by , is very poor due to cosmic variance.

We can make a more powerful test by exploiting the correlation between the cluster signal and the local large-angle CMB anisotropies. Where the correlation is weak, we have to use the power observed in to constrain . However, at low redshift most of the signal is correlated with the local temperature quadrupole and the first few higher-order moments (see Fig. 2), and so the expected signal can be accurately predicted. By cross-correlating the measured cluster signal with the large-angle local anisotropies, our test of will be limited only by noise in the cluster measurement and the fluctuations in the uncorrelated component of the cluster signal.

We split the observed polarization as

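$$d(\hat{\mathbf{n}}) = \alpha\, \tau(\hat{\mathbf{n}}) \left[ p_c(\hat{\mathbf{n}}) + p_u(\hat{\mathbf{n}}) \right] + n(\hat{\mathbf{n}}), \tag{28}$$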

where the part of the signal correlated with the local temperature is

$$p_c(\hat{\mathbf{n}}) = \sum_{lm} \frac{\zeta^T_l(z)}{C^{TT}_l}\, a_{lm}\, {}_{-2}Y_{lm}(\hat{\mathbf{n}}), \tag{29}$$

(we ignore correlation with local large-angle polarization, although this could be included straightforwardly) and the uncorrelated part is given by

$$p_u(\hat{\mathbf{n}}) = p(\hat{\mathbf{n}}) - p_c(\hat{\mathbf{n}}). \tag{30}$$
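In harmonic space, the split into correlated and uncorrelated parts amounts to a per-multipole regression of $p_{lm}$ on $a_{lm}$. The sketch below illustrates this under Eqs. (15), (16) and (18), taken at a single cluster distance; it assumes NumPy, the spectra are made-up toy numbers rather than values from this paper, and the correlation coefficient is assumed to have the usual form $r_l = \zeta^T_l / \sqrt{\xi_l C^{TT}_l}$.

```python
import numpy as np

def split_spectra(xi_l, zeta_l, cl_tt):
    """Per-multipole split of the cluster signal into correlated and
    uncorrelated power, given xi_l(r, r), zeta^T_l(r) and C^TT_l."""
    weight = zeta_l / cl_tt                  # multiplies a_lm in Eq. (29)
    corr_power = zeta_l**2 / cl_tt           # variance of the predicted part
    uncorr_power = xi_l - corr_power         # variance of p_u, Eq. (30)
    r_l = zeta_l / np.sqrt(xi_l * cl_tt)     # assumed correlation coefficient
    return weight, corr_power, uncorr_power, r_l

# Toy spectra for two multipoles (arbitrary units, illustration only).
xi_l = np.array([1.0, 0.5])
zeta_l = np.array([1.8, 0.3])
cl_tt = np.array([4.0, 2.0])

weight, corr_power, uncorr_power, r_l = split_spectra(xi_l, zeta_l, cl_tt)
```

The uncorrelated power equals $\xi_l (1 - r_l^2)$, so multipoles with $r_l$ close to unity contribute almost nothing to the residual variance that limits the cross-correlation test.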

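The likelihood conditional on the large-angle temperature anisotropies (ignoring noise in their measurement) is, up to an $\alpha$-independent constant,

$$-2 \ln L(\mathbf{d} | \alpha, p_c) = (\mathbf{d} - \alpha \mathbf{P}_c)^\dagger (\mathbf{N} + \alpha^2 \mathbf{C}_u)^{-1} (\mathbf{d} - \alpha \mathbf{P}_c) + \ln \det (\mathbf{N} + \alpha^2 \mathbf{C}_u), \tag{31}$$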

where , and we have defined . The covariance matrix of is . Note that these covariance matrices are defined with respect to the local basis at each cluster, and so each element requires the appropriate rotation factors of Eqs. (24) and (25) to be related to the physically-defined correlation functions of Eqs. (15)–(17).

We shall derive an approximate maximum-likelihood estimator for based on the conditional likelihood of Eq. (31) in Sec. VII. The factors multiplying complicate the construction of a maximum-likelihood estimator. However, this dependence of the likelihood on is only important for high-redshift clusters — it represents the additional information that can be obtained on the amplitude from the power in . For the near-future surveys we consider, instrument noise and the moderately low-redshift cluster samples mean that we can ignore the dependence of the likelihood through the sub-dominant , leaving

$$-2 \ln L(\mathbf{d} | \alpha, p_c) \approx (\mathbf{d} - \alpha \mathbf{P}_c)^\dagger (\mathbf{N} + \mathbf{C}_u)^{-1} (\mathbf{d} - \alpha \mathbf{P}_c), \tag{32}$$

up to an additive constant. The maximum-likelihood estimate of $\alpha$ (see below) is now linear in the data and is an optimally-weighted measure of the cross-correlation between the cluster polarization and the large-angle temperature anisotropies across angular scales and redshifts.

We shall assume that the noise contribution to each pixel is constant and uncorrelated between pixels, such that has diagonal elements given by , with the variance on the measured (or ). The vector has length , containing the measured and values in each pixel, with the the average number of pixels in a cluster, and the total number of clusters.

### III.2 Demodulating the optical depths

The correlation length of the CMB quadrupole is much larger than the extent of individual clusters, and so the quadrupole may be treated as constant across the cluster. [This assumption was already made in Eq. (12).] The observed polarization is therefore the signal of interest multiplied by the beam-convolved optical depth. For each cluster we can compress the measurements into a single estimate of the polarization for that cluster (where bracketed indices here refer to clusters):

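$$\hat{p}_{(j)} = \frac{\boldsymbol{\tau}_{(j)}^\top \mathbf{N}^{-1} \mathbf{d}_{(j)}}{\boldsymbol{\tau}_{(j)}^\top \mathbf{N}^{-1} \boldsymbol{\tau}_{(j)}} = \frac{\sum_i \tau^{(j)}_i d_i}{\sum_i (\tau^{(j)}_i)^2}, \tag{33}$$

where, temporarily, bold-faced notation refers to pixels within a cluster, and $\boldsymbol{\tau}_{(j)}$ is the beam-convolved optical depth for cluster $j$. There is no sum over $j$ implied. The variance on this estimator is

$$\mathrm{var}(\hat{p}_{(j)}) = \left( \boldsymbol{\tau}_{(j)}^\top \mathbf{N}^{-1} \boldsymbol{\tau}_{(j)} \right)^{-1} = \frac{\sigma_N^2}{\sum_i (\tau^{(j)}_i)^2}. \tag{34}$$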
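The per-cluster compression of Eqs. (33) and (34) can be sketched in a few lines. The example assumes NumPy, white noise of equal variance in every pixel, and a complex data vector holding $Q + iU$ per pixel; the function name `demodulate` and all numerical values are hypothetical and chosen only for illustration.

```python
import numpy as np

def demodulate(d, tau, sigma_n):
    """Compress the pixels of one cluster into a single polarization
    estimate, following Eqs. (33)-(34) for uniform white noise.

    d       : data per pixel (complex Q + iU)
    tau     : beam-convolved optical depth per pixel
    sigma_n : r.m.s. noise per pixel on Q (or U)
    """
    norm = np.sum(tau**2)
    p_hat = np.sum(tau * d) / norm   # Eq. (33)
    var = sigma_n**2 / norm          # Eq. (34)
    return p_hat, var

# Toy cluster: three pixels with falling optical depth (illustrative values).
tau = np.array([1.0e-2, 5.0e-3, 2.0e-3])
p_true = 2.0 - 1.0j                  # polarization per unit optical depth
d = tau * p_true                     # noise-free data for the check

p_hat, var = demodulate(d, tau, sigma_n=0.5)
```

For noise-free data the estimate returns the input polarization exactly. The variance scales as $1/\sum_i \tau_i^2$, so the pixels near the cluster core, where the optical depth is largest, carry most of the weight.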

The approximate likelihood (32) becomes for the demodulated signal

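$$-2 \ln L(\hat{\mathbf{p}} | \alpha, p_c) = (\hat{\mathbf{p}} - \alpha \mathbf{p}_c)^\dagger (\mathbf{C}_{\hat{p}} + \mathbf{C}_u)^{-1} (\hat{\mathbf{p}} - \alpha \mathbf{p}_c), \tag{35}$$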

where bold-faced notation now refers to clusters, and the thermal noise variance now includes the optical depth weighting . This procedure has significantly reduced the dimensionality of the matrix inversions; the distribution in Eq. (35) now has degrees of freedom.

Finally we note that the above demodulation can also be done in harmonic space, which should be very accurate for a survey with full-sky coverage. We outline this procedure in Appendix A.

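### III.3 Maximum-likelihood estimator

Using Bayes’ theorem and assuming a flat prior on the parameter $\alpha$, the likelihood of Eq. (35) is proportional to the posterior distribution of $\alpha$ given the data. We may then maximize this distribution with respect to $\alpha$ to find the maximum-likelihood estimator

$$\hat{\alpha} = \frac{\mathrm{Re}\left[ \mathbf{p}_c^\dagger (\mathbf{C}_{\hat{p}} + \mathbf{C}_u)^{-1} \hat{\mathbf{p}} \right]}{\mathbf{p}_c^\dagger (\mathbf{C}_{\hat{p}} + \mathbf{C}_u)^{-1} \mathbf{p}_c}, \tag{36}$$

which has variance

$$\mathrm{var}(\hat{\alpha}) = \left( \mathbf{p}_c^\dagger (\mathbf{C}_{\hat{p}} + \mathbf{C}_u)^{-1} \mathbf{p}_c \right)^{-1}. \tag{37}$$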

Reassuringly, and its variance are independent of the polarization basis used to define and . To see this, note that a change of basis is effected by a unitary (here diagonal) transformation that cancels in forming the scalar .

The estimator, Eq. (36), has a simple physical interpretation. In the limit that , the numerator of Eq. (36) takes the form of a correlation between the inverse-variance weighted measured polarization and . The latter is the predicted signal given measurements of our local temperature field and knowledge of its correlation with the cluster polarization. This prediction should be precisely the underlying signal in the limit, so the best estimate of the signal amplitude is just the inverse-variance-weighted correlation, with a normalisation given by the denominator of Eq. (36). For clusters at higher redshift, we must include the cosmic variance of the uncorrelated part of the signal, and compare this inverse-variance-weighted polarization with the local prediction.
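A minimal numerical sketch of Eqs. (36) and (37) is given below. It assumes NumPy and takes the total covariance $\mathbf{C}_{\hat{p}} + \mathbf{C}_u$ as a given Hermitian positive-definite matrix; the random covariance, the template `p_c` and the input amplitude of 1.3 are toy values, not quantities from this paper. The last lines illustrate the basis independence noted above by applying a diagonal unitary transformation to every ingredient.

```python
import numpy as np

def alpha_ml(p_hat, p_c, cov):
    """Maximum-likelihood amplitude and its variance, Eqs. (36)-(37).
    cov is the total covariance C_phat + C_u (Hermitian, positive definite)."""
    w_data = np.linalg.solve(cov, p_hat)      # cov^{-1} p_hat
    w_temp = np.linalg.solve(cov, p_c)        # cov^{-1} p_c
    denom = np.real(np.vdot(p_c, w_temp))     # p_c^dagger cov^{-1} p_c
    alpha = np.real(np.vdot(p_c, w_data)) / denom
    return alpha, 1.0 / denom

rng = np.random.default_rng(0)
n_clusters = 6

# Toy Hermitian positive-definite covariance and a toy predicted signal.
a = rng.normal(size=(n_clusters, n_clusters)) \
    + 1j * rng.normal(size=(n_clusters, n_clusters))
cov = a @ a.conj().T + n_clusters * np.eye(n_clusters)
p_c = rng.normal(size=n_clusters) + 1j * rng.normal(size=n_clusters)

p_hat = 1.3 * p_c                             # noise-free data with alpha = 1.3
alpha, var_alpha = alpha_ml(p_hat, p_c, cov)

# Change of polarization basis at each cluster: a diagonal unitary transform.
phases = np.exp(-2j * rng.uniform(0.0, 2.0 * np.pi, size=n_clusters))
cov_rot = phases[:, None] * cov * phases.conj()[None, :]
alpha_rot, var_rot = alpha_ml(phases * p_hat, phases * p_c, cov_rot)
```

Solving the linear systems with `np.linalg.solve` avoids forming the inverse covariance explicitly, which is a numerical convenience rather than part of the estimator itself.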

In formulating the estimator, Eq. (36), we have ignored the fact that we are not randomly sampling the polarization field , since clusters are biased tracers of large-scale structure. Essentially, our observable is the polarization field weighted by the cluster number density (and survey selection function), and complications arise from the correlations between the weight function and the local temperature anisotropies and the cluster polarization field . For example, the average of Eq. (36) conditioned on the local temperature anisotropies and the cluster number density in our universe is no longer unity. This is because the (doubly) conditional average of is not simply , but has an additional term from the part of that is uncorrelated with the local temperature anisotropies but is correlated with the cluster number density. However, such a bias is expected to be small for low-redshift cluster surveys since very little of is uncorrelated with the local temperature anisotropies. Furthermore, since we are dealing with very large-scale fields, it is only the cluster number density smoothed on scales approaching those of the cluster survey itself that are relevant, further reducing its impact. A detailed analysis of these issues may potentially be important for higher redshift surveys, and would best be addressed through simulations.

## IV Signal-to-noise forecasts

In this section we forecast the S/N for the estimator in Eq. (36) for present and future SZ polarization surveys. We begin by detailing our assumptions about these surveys and the cluster samples they are expected to observe.

### IV.1 Cluster sample assumptions

To forecast future constraints, we need to assume optical depth profiles for each cluster in the survey. The optical depths are defined as integrals of the free electron density through each cluster. Assuming complete ionization in the ICM, this is equivalent to an integral over the gas density .

We consider two different phenomenological models for the gas density profile: the common β-model parameterisation, and a more realistic cored Navarro-Frenk-White (cNFW) profile.

#### β-model

In the isothermal β-model, the gas density is spherically symmetric, with radial profile

$$\rho_g(r;z) = \rho_g^0(z)\left(1 + \frac{r^2}{r_s^2}\right)^{-3\beta/2}, \tag{38}$$

where $\rho_g^0(z)$ is a (redshift-dependent) normalization factor. In keeping with a conventional choice, we take .

To fix the normalization, we assume that the ratio of gas mass to total mass within the virial radius (taken to be ) is equal to the cosmological value. To calculate the dark matter mass within the virial radius, we assume an NFW profile

$$\rho_m(r;z) = \frac{\rho_m^0(z)}{(r/r_s)\,(1 + r/r_s)^2}. \tag{39}$$

The normalization can be expressed in terms of the critical density at the cluster redshift and the concentration parameter , which is related to the scale radius by . We fix , which is the best-fit value for the universal pressure profile of (Arnaud et al., 2010), derived from X-ray observations. This model thus has one free parameter, given by .

#### Cored NFW model

The β-model is known to be a rather poor description of the gas density in a typical cluster, particularly in the outer regions (Vikhlinin et al., 2006). Instead, a combination of X-ray measurements and hydrodynamic N-body simulations suggests that the gas density falls more steeply than Eq. (38) at large radii, tracing the underlying dark matter halo quite faithfully for high-mass systems (Arnaud et al., 2010; Sijacki et al., 2007). Motivated by this, we consider a second model, an NFW profile with a core, given by

$$\rho_g(r;z) = \frac{\rho_g^0(z)}{\left[(r + r_0)/r_s\right](1 + r/r_s)^2}, \tag{40}$$

where $r_0$ is the core radius. We assume that is fixed for all clusters as in Sijacki and Springel (2006), although this choice does not affect our results significantly. We again fix through , leaving as a free parameter, and fix the normalization in the same way as for the β-model.
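As a rough illustration of how the two gas profiles differ, the following sketch evaluates the shapes of Eqs. (38) and (40) with unit normalisation. The parameter values (`beta = 2/3`, `r_s = 1`, `r_0 = 0.1`, arbitrary length units) are assumptions chosen for illustration only, not the values used in the forecasts.

```python
def beta_model(r, r_s, beta, rho0=1.0):
    """Isothermal beta-model gas density, Eq. (38)."""
    return rho0 * (1.0 + (r / r_s) ** 2) ** (-1.5 * beta)


def cored_nfw(r, r_s, r_0, rho0=1.0):
    """NFW profile with a core of radius r_0, Eq. (40)."""
    return rho0 / (((r + r_0) / r_s) * (1.0 + r / r_s) ** 2)


# Illustrative (assumed) parameters in arbitrary length units.
r_s, r_0, beta = 1.0, 0.1, 2.0 / 3.0

for r in (0.0, 1.0, 10.0, 100.0):
    print(f"r = {r:6.1f}  beta-model = {beta_model(r, r_s, beta):.3e}  "
          f"cNFW = {cored_nfw(r, r_s, r_0):.3e}")
```

With these assumed parameters the cored NFW shape has the higher central value and falls off faster at large radius, which is the qualitative behaviour the text attributes to the two models.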

#### Optical depth profiles

The optical depth through the cluster at angular radius $\theta$ from the centre is given by

$$\tau(\theta;z) = \sigma_T \int_{-L}^{L} n_e\big(r(l);z\big)\,\mathrm{d}l, \tag{41}$$

where , is a cut-off (in principle -dependent, see below), and is the angular diameter distance. The free electron density is related to the gas density through where is the hydrogen mass fraction (taken to be 0.76) and is the mass of the proton.

The simple analytic gas profiles we have considered are not expected to hold in the cluster outskirts, particularly for low-mass systems () where accretion could be important. We therefore choose to truncate the profiles at , which corresponds to roughly 3.1 virial radii. This may be an optimistic choice given the current deficit of observations beyond the virial radius, but it should not affect the results too much due to the low gas density in the outskirts.

When computing the integral in Eq. (41), there is some freedom as to the choice of cut-off $L$. Given the uncertainty in the gas models at large physical cluster radii, large angular radii can be excluded from the analysis, but we will always have to deal with the unmodelled region along the line of sight. Simply excluding this region by truncating the integral at radii larger than, say, will underestimate the optical depth at all angular positions across the cluster. In this work, we set $L$ large enough so that all line-of-sight gas is included in the optical depth calculation. Other reasonable choices of cut-off lead to roughly variations in our final estimates.
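The effect of the line-of-sight cut-off can be illustrated with a toy calculation. The sketch below integrates the β-model shape of Eq. (38) along the line of sight for two cut-offs and compares with the closed-form result for an infinite cut-off. It assumes $\beta = 2/3$, lengths in units of $r_s$, a projected radius `R` standing in for the angular radius times the angular diameter distance, and the usual geometry $r^2 = R^2 + l^2$. The physical optical depth would follow on multiplying by $\sigma_T$ and the central electron density, which are omitted here.

```python
import math


def beta_profile(r, r_s=1.0, beta=2.0 / 3.0):
    """Beta-model shape of Eq. (38) with unit normalisation (assumed beta)."""
    return (1.0 + (r / r_s) ** 2) ** (-1.5 * beta)


def column(R, L, n=100_000):
    """Midpoint-rule integral of the profile along the line of sight, |l| < L."""
    dl = 2.0 * L / n
    total = 0.0
    for i in range(n):
        l = -L + (i + 0.5) * dl
        total += beta_profile(math.hypot(R, l))  # r = sqrt(R^2 + l^2)
    return total * dl


R = 0.5                                        # projected radius in units of r_s
exact = math.pi / math.sqrt(1.0 + R ** 2)      # L -> infinity limit for beta = 2/3
wide = column(R, L=1000.0)                     # cut-off far outside the cluster
narrow = column(R, L=3.0)                      # cut-off at a few scale radii

print("wide / exact   =", wide / exact)
print("narrow / exact =", narrow / exact)
```

For this slowly falling profile, a cut-off at a few scale radii misses a noticeable fraction of the column, whereas a very large cut-off recovers the analytic limit.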

With the above assumptions, typical central optical depths for our cluster sample are 0.01 and 0.004 at and 0.04 and 0.01 at for the cNFW and β-models respectively. The β-model predicts a higher gas density in the outskirts, but this is more than compensated for by the higher central density in the cNFW model, which explains the differences in central optical depths. Note that the normalization of both models implies and hence clusters of a given physical size have higher optical depths at higher redshifts, having formed in denser environments.

In Fig. 5 we plot the redshift-independent quantity , where is the dimensionless Hubble parameter, assuming .

Finally, we note that it is the beam-convolved optical depth that is relevant for observations of cluster polarization. In our forecasts, we explicitly perform this convolution, assuming a Gaussian beam function with a full-width-at-half-maximum (FWHM) given by .

### IV.2 Survey assumptions

We forecast constraints on $\alpha$ for a full-sky mission similar to Planck and for the 150 GHz channel of the South Pole Telescope polarization experiment (SPTPol; Austermann et al., 2012). For the Planck-like survey, we use a single frequency of 143 GHz with specifications taken from the Planck blue book: thermal noise on (or ) of 11.5 in a square pixel of side length  arcmin. (We would expect that combining other frequency channels on Planck would increase our forecasts by less than .) For SPTPol, the 150 GHz channel was chosen for its superior sensitivity and resolution over the 90 GHz channel. We take specifications from the SPTPol overview paper (Austermann et al., 2012): polarization noise of for a arcmin pixel.

The Atacama Cosmology Telescope is also performing polarization surveys (Niemack et al., 2010; Naess et al., 2014). Its final forecast sensitivities are similar to those of SPTPol, so our SPTPol forecasts should apply approximately to ACTPol as well, up to the different field positions that we discuss later.

For our cluster distribution, we assume that all the clusters used in the polarization survey have already been discovered or will soon be discovered by the SZ surveys of Planck and SPT. For Planck, we take the redshift distribution of the 813 clusters with confirmed redshifts from the Planck 2013 SZ catalogue (Planck Collaboration et al., 2013), which consists of 20 redshift bins equally spaced between central values of 0.025 and 0.975. This distribution is plotted in Fig. 6. We then use the X-ray angular sizes of clusters in the smaller Planck Early SZ catalogue (ESZ; Planck Collaboration et al., 2011) to deduce typical angular sizes in the redshift bins of the full distribution, by re-binning the ESZ samples assuming constant sizes in each bin. This translates to a roughly constant physical cluster size of in each bin. In the lowest redshift bin, we use the 1-$\sigma$ lower limit of the mean size in the re-binned ESZ catalogue, to avoid including artificially large low-redshift clusters that were not present in the ESZ sample. Note that the approximate constancy of physical cluster radii is a statement about the selection function of the survey rather than the whole cluster population, for which the sizes are definitely not constant at different redshifts.

For the SPTPol survey, we take the redshift distribution of Reichardt et al. (2013), supplemented with an extra 400 clusters expected from the full SPT survey of 2500 deg$^2$ (Reichardt et al., 2013). This amounts to 558 clusters in total, distributed in 14 redshift bins plotted in Fig. 7. We assume a constant cluster size of to calculate angular sizes in each bin.$^{2}$

Typical noise variances, defined by Eq. (34), are then for the Planck-like survey at in the cNFW model, rising to at . In the β-model, the corresponding values are and respectively. For SPTPol, in the cNFW model the noise variance is at and at . In the β-model the corresponding values are and .

We position clusters on the sky at random in each redshift bin, assuming full sky coverage, , for the Planck-like survey, and a 625 deg$^2$ field centred at in celestial coordinates for SPTPol (Austermann et al., 2012). Although the SPTPol survey covers less area than the full SPT-SZ survey, the number of expected clusters is roughly the same, since the mass threshold of SPTPol is lower (Austermann et al., 2012).

Finally, we extract the first few temperature multipole coefficients from the 2013 Planck SMICA map (Planck Collaboration et al., 2013), using the HEALPix3 package, for use in calculating the correlated polarization from Eq. (29). We find that only the are required for convergence in our signal-to-noise calculations. In Fig. 8 we show the resultant correlated polarization magnitude for and . As discussed in Sec. II, higher redshift sources have greater angular structure when observed today due to free-streaming.

### IV.3 Signal-to-noise forecasts

After convolving the optical depth profile of each cluster with the telescope beam profile, we can evaluate the S/N on $\alpha$ from Eq. (37). Note that the $\tau$-weighted noise variance has a simple form in Fourier space due to Parseval’s theorem, allowing for rapid computation of .
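The role of Parseval's theorem can be seen in a one-dimensional toy example. The sketch below checks that the sum of squares of a beam-convolved profile, computed directly in real space, equals the corresponding sum over Fourier modes of the product of the profile and beam transforms. The profile shape, grid size and beam width are arbitrary assumptions, and this is only an illustration of the identity, not the noise-variance expression of Eq. (34).

```python
import cmath
import math


def dft(x):
    """Plain O(n^2) discrete Fourier transform, adequate for a small toy grid."""
    n = len(x)
    return [sum(x[j] * cmath.exp(-2j * math.pi * j * k / n) for j in range(n))
            for k in range(n)]


def circ_convolve(x, h):
    """Circular convolution of two equal-length sequences."""
    n = len(x)
    return [sum(x[j] * h[(i - j) % n] for j in range(n)) for i in range(n)]


n = 64
# Toy 1D optical-depth profile centred on the grid (illustrative shape).
tau = [1.0 / (1.0 + ((i - n // 2) / 4.0) ** 2) for i in range(n)]

# Unit-sum Gaussian beam, wrapped periodically (illustrative width in pixels).
sigma = 2.0
beam = [math.exp(-0.5 * (min(i, n - i) / sigma) ** 2) for i in range(n)]
norm = sum(beam)
beam = [b / norm for b in beam]

# Sum of squares of the beam-convolved profile, two ways.
tau_b = circ_convolve(tau, beam)
real_space = sum(t * t for t in tau_b)
fourier_space = sum(abs(a * b) ** 2 for a, b in zip(dft(tau), dft(beam))) / n

print(real_space, fourier_space)
```

In practice a fast Fourier transform would replace the explicit sums, so the beam convolution and the weighted sum reduce to a single multiplication per mode.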

For the Planck-like survey, we forecast an S/N on $\alpha$ of 0.03 for the cNFW model, and 0.06 for the β-model. The higher value for the β-model is due to its higher gas density in the cluster outskirts, which carry more weight in the calculation of due to these regions making up most of the angular extent. By comparison, the higher central density of the cNFW profile is diluted somewhat by the poor resolution of Planck.

For SPTPol, we forecast an S/N of 0.28 for the cNFW model and 0.48 for the β-model. The superior resolution of SPT allows much more independent information to be extracted from the clusters, and is particularly helpful for highly concentrated clusters since this is where the signal is greatest.

Note that in both experiments, the cluster redshifts are sufficiently low that the noise variance term dominates over the cosmic variance part . Given enough clusters, the S/N would eventually be limited by , but one can always get around this by including clusters at higher redshift, which provide independent realizations of the uncorrelated polarization signal.

The cosmic-variance-limited S/N (obtained by switching off instrumental noise) is 35.6 for the Planck clusters and 27.4 for SPTPol, in the cNFW model. In this case, the variance from the uncorrelated polarization, , limits the S/N. Since Planck has superior sky coverage to SPT, its clusters cover more uncorrelated patches of and hence beat down the cosmic variance to a greater extent. For SPT, this is partly mitigated by the fact that the survey is deep, and high-redshift clusters contribute to higher multipole moments in the signal power spectrum, which can be probed by a smaller field of view.

Because of its finite sky coverage, the signal that SPT can measure will either benefit or suffer from the variation of the polarization field over the sky. To understand this variation, consider a cluster located at . The polarization generated is then a projection of the local temperature quadrupole. We may represent the quadrupole using a symmetric trace-free tensor , such that . Using the spherical-harmonic expansion of this field, we can express the quadrupole coefficients in terms of the elements of . Using the spin-weighted spherical harmonic expansion of Eq. (11), we can express the polarization field in terms of 4. Following this procedure, it is straightforward to find that the magnitude of the polarization field has two maxima on opposite sides of the sky (the quadrupole has even parity), aligned with the normal to the plane containing the temperature quadrupole maxima and minima (we assume that all the eigenvalues of are different and non-zero, which is true for our Universe). For explicit forms of the quadrupole polarization, see (Sazonov and Sunyaev, 1999). The direction of maximal polarization is an eigenvector of , and intersects the unit sphere at and in Galactic coordinates, using the coefficients from the Planck SMICA map. For clusters at , these would be the ‘sweet-spots’ of the polarization field. These sweet-spots are discernible for the low-redshift source in Fig. 8, although free-streaming has moved the maxima somewhat from their pure-quadrupole locations.
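The geometric statement about the sweet-spots can be checked with a small numerical sketch. It assumes a toy symmetric trace-free tensor written in its eigenbasis (the eigenvalues below are arbitrary, not the measured quadrupole) and takes the polarization magnitude in a direction to be proportional to the trace-free part of the tensor projected onto the plane perpendicular to that direction, with all normalisation factors dropped. The helper names are hypothetical.

```python
import math
import random


def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def unit(a):
    norm = math.sqrt(sum(x * x for x in a))
    return tuple(x / norm for x in a)


def quad_form(Q, a, b):
    return sum(a[i] * Q[i][j] * b[j] for i in range(3) for j in range(3))


def pol_magnitude(Q, n):
    """|Q_11 - Q_22 + 2i Q_12| in an orthonormal basis perpendicular to n."""
    n = unit(n)
    helper = (1.0, 0.0, 0.0) if abs(n[0]) < 0.9 else (0.0, 1.0, 0.0)
    e1 = unit(cross(n, helper))
    e2 = cross(n, e1)
    q = quad_form(Q, e1, e1) - quad_form(Q, e2, e2)
    u = 2.0 * quad_form(Q, e1, e2)
    return math.hypot(q, u)


# Toy trace-free tensor, diagonal in its eigenbasis (assumed eigenvalues).
eigenvalues = (1.0, 0.2, -1.2)
Q = [[eigenvalues[i] if i == j else 0.0 for j in range(3)] for i in range(3)]

# Temperature extrema lie along x (maximum) and z (minimum); the normal to
# the plane containing them is the y axis, the intermediate eigenvector.
sweet_spot = (0.0, 1.0, 0.0)
peak = pol_magnitude(Q, sweet_spot)

random.seed(1)
samples = [pol_magnitude(Q, [random.gauss(0.0, 1.0) for _ in range(3)])
           for _ in range(2000)]

print("magnitude at sweet-spot:", peak)
print("largest magnitude over random directions:", max(samples))
```

In this toy setup the magnitude at the intermediate eigenvector equals the difference between the largest and smallest eigenvalues, and no sampled direction exceeds it. The antipodal direction gives the same value, consistent with the even parity of the quadrupole.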

We have seen that scattering at higher redshift transfers power to higher multipoles, as well as generating polarization uncorrelated with the local temperature. By varying the location of the centre of the SPT field across the sky, and keeping the redshift distribution fixed, we find that the maximum of the S/N is at in Galactic coordinates. In celestial coordinates the maximum is at . Note that there is only one maximum, due to the contribution of odd multipoles that spoils the even parity of the quadrupole field. Centering the SPT field on this direction gives an S/N of 0.32, assuming all other SPT survey parameters are fixed and using the cNFW model. The SPTPol field location therefore provides roughly 90% of the maximum S/N achievable with the SPT cluster redshift distribution and survey size.

How would the S/N change if we centred on the location of the ACTPol field? Using the ACT sky coverage map shown in Fig. 3 of Niemack et al. (2010), the maximum-S/N spot should be observable, and lies in a part of the sky already observed with ACT in 2008. We would therefore expect ACT to be able to capture all the available S/N if its redshift distribution is similar to that of SPT. Note that neither the ACT deep fields (D1–D6, used in the recent power spectrum analysis of Naess et al. (2014), which are all equatorial) nor the proposed wide field (which overlaps with BOSS) actually includes the polarization ‘sweet-spot’, although it is observable with ACT (and at higher elevation than for SPT).

We can also investigate the redshift-dependence of our constraints. In Fig. 9 we plot the S/N for each experiment and each cluster model, assuming that all clusters in each survey are located in a single redshift bin. The curves in this plot represent a trade-off between the size of the signal and the resolution of the survey. All the clusters in the surveys are approximately the same physical size, so those at lower redshift appear larger and are hence better resolved. For this reason, the S/N increases when all the clusters are at low redshift. At higher redshift, the noise variance increases due to the finite resolution of the beam. This is particularly acute for the Planck-like survey, with its resolution, where the S/N decreases significantly with redshift. However, SPT can still resolve these high-redshift clusters quite well, and so the S/N actually increases due to the increasing gas profile normalization, which scales as . Figure 9 makes it clear that SPT could measure the quadrupole polarization with some significance if all the clusters were at low redshift.

Finally, we can consider the improvement made when local E-mode CMB polarization is included as a prior in addition to the local temperature. Since the signal is large-scale, we are essentially conditioning our analysis on measurement of the large-scale polarization. We optimistically ignore Galactic foreground emission, assuming this can be cleaned with multi-frequency observations. Using ensemble averages for the local temperature and E-mode covariances when calculating the S/N, in the cNFW model, the S/N for the Planck-like survey improves by 7%, and for SPTPol the improvement is 20%. In the β-model, the S/N improves by 9% and 22% for Planck and SPTPol respectively. The gains are marginal because much of the correlation of the signal with the local E-mode comes from high-redshift clusters not present in the surveys (see Fig. 3). Note that in obtaining these results, we have neglected the contaminating contribution of the E-mode to the observed polarization towards clusters. Hence, we do not include large-scale E-mode measurements in the calculation at this stage, since this systematic must be properly accounted for (see Sec. VI).

## V Bias and variance from the kinetic polarization

In this section we investigate the consequences of ignoring the kinetic part of the polarization signal, sourced by the relative motion of the cluster with respect to the CMB rest-frame. All numerical results in this section assume a cNFW gas profile for the clusters.

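Suppose the cluster has transverse velocity components $V_\theta$ and $V_\phi$ with respect to the line of sight.$^{5}$ The polarization due to the kinetic term in a direction $\hat{n}$ observed at frequency $\nu$ is

$$(Q \pm iU)(\hat{n}) = -\frac{\tau(\hat{n})}{10}\, T_0 \left(\frac{x}{2}\coth\frac{x}{2}\right) (V_\theta \mp iV_\phi)^2, \tag{42}$$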

where $x$ is the dimensionless frequency, $T_0$ is the CMB temperature, and we have set the speed of light $c = 1$. Note the distinctive frequency dependence, which can allow this term to be separated from the quadrupole signal.
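To give a feel for the frequency dependence in Eq. (42), the sketch below evaluates the factor $(x/2)\coth(x/2)$ at a few observing frequencies and the resulting polarization magnitude for one illustrative cluster. It assumes the usual definition $x = h\nu/(k_B T_0)$ with $T_0 = 2.7255$ K, and the optical depth and transverse speed in the example are placeholder values, not numbers from the forecasts.

```python
import math

H_PLANCK = 6.62607015e-34   # Planck constant in J s
K_BOLTZ = 1.380649e-23      # Boltzmann constant in J / K
T_CMB = 2.7255              # assumed CMB monopole temperature in K


def dimensionless_frequency(nu_ghz, t0=T_CMB):
    """x = h nu / (k_B T_0), the assumed definition of the dimensionless frequency."""
    return H_PLANCK * nu_ghz * 1e9 / (K_BOLTZ * t0)


def kinetic_spectral_factor(x):
    """Frequency factor (x/2) coth(x/2) appearing in Eq. (42)."""
    half = 0.5 * x
    return half / math.tanh(half)


def kinetic_pol_amplitude(tau, v_t, nu_ghz, t0=T_CMB):
    """|Q + iU| from Eq. (42) in kelvin; v_t is the transverse speed in units of c."""
    x = dimensionless_frequency(nu_ghz, t0)
    return 0.1 * tau * t0 * kinetic_spectral_factor(x) * v_t ** 2


for nu in (90.0, 143.0, 150.0):
    x = dimensionless_frequency(nu)
    print(f"{nu:5.0f} GHz: x = {x:.3f}, factor = {kinetic_spectral_factor(x):.3f}")

# Illustrative cluster: tau = 0.01 and a transverse speed of 1000 km/s (assumed).
print("amplitude [K]:", kinetic_pol_amplitude(0.01, 1000.0 / 299792.458, 150.0))
```

The factor tends to unity at low frequency and grows with frequency, which is the handle that multi-frequency data would use to separate the kinetic term from the frequency-independent quadrupole signal.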

If we neglect the contribution of the kinetic term, our estimate of $\alpha$ from each cluster will be biased. However, since we are averaging over many clusters, these biases cancel to a degree.

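Defining $\mathbf{p}_K$ as the vector containing the $\tau$-demodulated kinetic polarization at each cluster location, the contribution to $\hat{\alpha}$ from the kinetic polarization is given by

$$\hat{\alpha}_K = \frac{\mathrm{Re}\left[\mathbf{p}_c^\dagger \left(\mathbf{C}_{\hat{p}} + \mathbf{C}_u\right)^{-1} \mathbf{p}_K\right]}{\mathbf{p}_c^\dagger \left(\mathbf{C}_{\hat{p}} + \mathbf{C}_u\right)^{-1} \mathbf{p}_c}. \tag{43}$$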

Of course, we do not know the transverse velocities of our cluster sample a priori. If we were to average over realisations of the velocity field, what would be the mean bias in our estimator? Naively, this is given by the ensemble average of , which vanishes upon inspection of Eq. (42). However, we must also take into account that the average is conditioned on both the value of our local temperature anisotropies and the cluster number density (see discussion in Sec. III.3).

Firstly, we consider the expected bias conditional on the local temperature anisotropies, neglecting for now the fact that clusters form in overdensities. We can quantify the magnitude of the bias in a straightforward manner assuming linear perturbation theory.$^{6}$ To proceed, it will prove useful to expand the spin-$\pm 1$ quantity $V_\theta \pm iV_\phi$ in spin-$\pm 1$ spherical harmonics:

$$(V_\theta \pm iV_\phi)(\hat{n}) = \sum_{lm} {}_{\pm}V_{lm}(r)\; {}_{\pm 1}Y_{lm}(\hat{n}). \tag{44}$$

We derive the multipole coefficients in Appendix B; the answer is

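$${}_{\pm}V_{lm}(r) = \pm 4\pi i^l \sqrt{l(l+1)} \int \frac{\mathrm{d}^3k}{(2\pi)^{3/2}}\, v(\mathbf{k}, r)\, \frac{j_l(kr)}{kr}\, Y^*_{lm}(\hat{k}), \tag{45}$$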

where we have written the three-dimensional velocity field as a pure gradient , since we only consider scalar perturbations. This ensures that the multipole coefficients have electric parity. Note that this is a pure dipole at . Linear perturbation theory may now be used to compute the velocity potential from the initial curvature fluctuation.

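We can now calculate the mean values of each multipole coefficient given knowledge of the local CMB temperature anisotropies. This is simply given by the part of ${}_{\pm}V_{lm}$ correlated with the CMB:

$$\left\langle {}_{\pm}V_{lm} \,\middle|\, a_{lm} \right\rangle = \frac{\left\langle {}_{\pm}V_{lm}\, a^*_{lm} \right\rangle}{C_l^{TT}}\, a_{lm}. \tag{46}$$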
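Equation (46) is the standard conditional mean of one zero-mean Gaussian variable given another. The following Monte Carlo sketch illustrates this for a single pair of real variables standing in for one velocity multipole and one temperature multipole. The real-valued simplification and the variances and covariance are assumptions made purely for illustration.

```python
import math
import random

random.seed(0)

# Assumed toy statistics: Var(a) plays the role of C_l^TT and Cov(V, a)
# the role of the cross-correlation in Eq. (46).
var_a, var_v, cov_va = 2.0, 1.0, 0.3
slope = cov_va / var_a          # coefficient multiplying a in Eq. (46)

n = 50_000
sum_va = sum_aa = sum_resid_a = 0.0
for _ in range(n):
    a = math.sqrt(var_a) * random.gauss(0.0, 1.0)
    # V = correlated part + independent part with the remaining variance.
    v = slope * a + math.sqrt(var_v - cov_va ** 2 / var_a) * random.gauss(0.0, 1.0)
    sum_va += v * a
    sum_aa += a * a
    sum_resid_a += (v - slope * a) * a

slope_est = sum_va / sum_aa          # regression of V on a from the samples
residual_cov = sum_resid_a / n       # covariance of the uncorrelated part with a

print("expected slope :", slope)
print("estimated slope:", slope_est)
print("residual covariance with a:", residual_cov)
```

The residual, V minus its conditional mean, is uncorrelated with the conditioning variable, which is why only the correlated component can source a bias once the local temperature is held fixed.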

Note that averaging over realizations of our local CMB results in zero bias, as can be seen from the form of Eq. (43). Using Eq. (46), we can compute the conditional velocity average, and then use Eq. (43) to compute the bias. There is no bias from the uncorrelated component of the velocity by statistical isotropy, although it does add additional variance (see below). For both Planck and SPT, we find that the result is small, of order . This is at least partly due to the smallness of the correlation between the CMB and the , which we find to be roughly 5% at at , rising to 30% at .

There are two reasons for the smallness of the CMB-velocity correlation. Firstly, as mentioned above, the velocity field is a pure dipole at , whereas the CMB scales that contribute to the sum in Eq. (43) are purely quadrupolar at , giving zero cross-correlation. Thus the bias is suppressed for low-redshift clusters. Secondly, the dominant contribution is from the ISW effect. This contributes most power to at wavenumbers between and for . In contrast, the velocity power spectrum peaks at smaller scales, roughly at , such that the integral over wavenumber is suppressed. The Bessel function in Eq. (45) ensures a greater overlap in scales at high redshift for low , which explains why the correlation is higher at than at , but this is somewhat offset by the sharp drop in the velocity power spectrum at these scales.

Finally, we note that a marginally better estimate of the velocity field could be made by using all the information contained in the CMB rather than just the largest angular scales. Small angular scales in can survive the correlation with the large-scale polarization term in Eq. (43) since the kinetic polarization is proportional to the velocity squared, which results in mode coupling. However, the CMB-velocity correlation has most power on large angular scales for the redshift range under consideration, so we lose little by conditioning on only the large-scale CMB temperature anisotropies.

We now briefly return to the issue that we only sample at cluster locations, which are a biased tracer of large-scale structure. We shall simply neglect this effect since to generate a significant kinetic bias from further conditioning on the cluster number density would require a large asymmetry between the components of the velocity field as a result of our given realisation of the density field. To the extent that such an asymmetry is small, we can neglect the effect. In summary, we expect that neglecting the kinetic term should add negligible bias to our estimator.

We can also calculate the extra variance in the estimated $\alpha$ due to the kinetic polarization. This amounts to calculating the variance of Eq. (43) assuming its mean vanishes. This procedure is straightforward if we assume that the velocity field is Gaussian distributed, since we can then use Wick’s theorem to express the variance in terms of the 2-point function of . The correlation function should be conditioned on our local temperature anisotropies, such that only the uncorrelated velocity component adds variance. However, as discussed above, the correlation is weak at the relevant redshifts, so we approximate by using the correlation function of the total velocity field. This is given by

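$$\left\langle (\bar{V}_\theta + i\bar{V}_\phi)(\hat{n}_1; r)\, (\bar{V}_\theta \mp i\bar{V}_\phi)(\hat{n}_2; r') \right\rangle = \pm \sum_l \frac{2l+1}{4\pi}\, \xi^V_l(r, r')\, d^l_{1\,\pm 1}(\beta), \tag{47}$$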

where , and an overbar denotes the quantity rotated onto the geodesic connecting the two points, as described in Sec. II. For reference, we find that the r.m.s. three-dimensional velocity at is roughly . We calculate the extra variance in the estimate of to be roughly for the Planck-like survey and for SPT, i.e., completely negligible. This may be understood as being due to the disparity in the relevant scales: it is difficult for the velocity field of the cluster sample to mimic a very large scale almost-quadrupolar polarization signal. This is likely the reason why the variance is higher for SPT, since the field of view is smaller such that the coherence scale of the velocity field is a greater fraction of the total observable area. Indeed, when we switch off correlations between cluster velocities, the variance due to the kinetic term falls by 70% for SPT as compared to 30% for Planck, suggesting large-scale coherent flows are most detrimental to measuring the quadrupole polarization.

Finally, we note that any residual bias and variance from the kinetic effect may be mitigated by making use of the distinctive frequency dependence of this signal. It may even be possible to separate out the kinetic term and hence measure the transverse velocity field. Recently, Roebber and Holder (2013) proposed measuring the kinetic polarization signal from local bulk flows, finding that the effect may be measurable for large enough velocities with near-term experiments. We leave these considerations to future work.

## VI Bias and variance from the background CMB

A further complication to our analysis comes from the fact that we observe the total polarization towards a cluster, which includes a contribution from the background CMB itself. This term cannot be separated with multi-frequency data since it has the same spectrum as the quadrupole signal.

The background CMB polarization may be expressed in terms of large-scale E-modes that are coherent over the cluster, and sub-cluster scale modes. Both of these will bias our measurements if not properly accounted for.

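Firstly, consider the large-scale E-modes. The bias can be quantified in a straightforward manner, and is given by

$$B(\hat{\alpha}) = \left\langle \frac{\mathrm{Re}\left[\mathbf{p}_c^\dagger \left(\mathbf{C}_{\hat{p}} + \mathbf{C}_u\right)^{-1} \hat{\mathbf{p}}_E\right]}{\mathbf{p}_c^\dagger \left(\mathbf{C}_{\hat{p}} + \mathbf{C}_u\right)^{-1} \mathbf{p}_c} \right\rangle, \tag{48}$$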

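where the average is conditional on the local temperature anisotropies. The vector $\hat{\mathbf{p}}_E$ contains the output of the $\tau$-demodulation procedure on the background E-mode for each cluster, with two entries per cluster (). For the $j$th cluster, this is given by

$$\hat{p}_E^{(j)} = \frac{\boldsymbol{\tau}_{(j)}^\top \mathbf{N}^{-1} \mathbf{E}_{(j)}}{\boldsymbol{\tau}_{(j)}^\top \mathbf{N}^{-1} \boldsymbol{\tau}_{(j)}} = E^{(j)}\, \frac{\sum_i \tau_i}{\sum_i \tau_i^2}, \tag{49}$$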

where $E^{(j)}$ is the background CMB polarization towards the $j$th cluster. We have assumed that these large-scale modes are constant over the cluster such that the vector , whose elements are the background polarization in each cluster, is given simply by , where is a vector with each component equal to unity. Note that the expectation value of does not vanish when conditioned on the local CMB temperature.
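The second equality in Eq. (49), and the size of the resulting contamination, can be illustrated with a toy cluster. The sketch below assumes white pixel noise (so the noise covariance is proportional to the identity), a data model in which each pixel holds the signal amplitude times the optical depth plus a constant background, and arbitrary illustrative values for the profile, the amplitude and the background. The function name `demodulate` is hypothetical.

```python
def demodulate(tau, d, noise_var=1.0):
    """tau^T N^-1 d / (tau^T N^-1 tau) for N = noise_var * identity (assumed)."""
    num = sum(t * x for t, x in zip(tau, d)) / noise_var
    den = sum(t * t for t in tau) / noise_var
    return num / den


# Toy optical-depth profile across the cluster pixels (illustrative values).
tau = [0.01 / (1.0 + (i / 5.0) ** 2) for i in range(-20, 21)]

p_true = 2.0   # projected quadrupole amplitude at the cluster, arbitrary units
e_bg = 0.5     # constant background polarization, same units

# Data = signal modulated by tau, plus a background that is constant over the cluster.
d = [p_true * t + e_bg for t in tau]

p_hat = demodulate(tau, d)
bias = e_bg * sum(tau) / sum(t * t for t in tau)   # right-hand side of Eq. (49)

print("estimate:", p_hat)
print("truth + predicted bias:", p_true + bias)
```

Because the optical depths are much less than unity, the ratio of the two sums is large, so even a background comparable to the signal amplitude is strongly amplified by the demodulation.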

Evaluating Eq. (48) shows that the bias from these modes is significant, equal to 12.8 for the Planck-like survey and 20.7 for SPT, assuming the cNFW model. This could have been anticipated, since the background E-mode is sourced by scattering of the quadrupole at reionization on these scales. The power in the quadrupole has not changed significantly between and cluster redshifts (see Fig. 4), so the magnitude of the bias is of order the ratio of the optical depths to reionization and cluster scattering, i.e., of order . As well as adding a bias, large-scale E-modes increase the variance on $\hat{\alpha}$, which is given approximately by the variance of the quantity inside the angle brackets in Eq. (48). This degrades the S/N by roughly 20% for Planck, and almost 90% for SPT. Note that there is also a small correlation between the signal and the E-mode background, which is proportional to . This should further reduce the extra scatter from the background, since upward fluctuations in background polarization are accompanied by upward fluctuations in the signal since is positive.

Fortunately, the large-scale E-mode background may be removed from the analysis by the following simple construction. When demodulating the optical depth across each cluster to construct the cluster polarization signal, we additionally marginalise over a constant background. This is equivalent to replacing the inverse of the pixel noise covariance matrix according to where . The estimator of the projected quadrupole at the cluster is then

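$$\hat{p} = \frac{\boldsymbol{\tau}^\top \left(\mathbf{I} - \hat{\mathbf{t}}\hat{\mathbf{t}}^\top\right) \mathbf{d}}{\boldsymbol{\tau}^\top \left(\mathbf{I} - \hat{\mathbf{t}}\hat{\mathbf{t}}^\top\right) \boldsymbol{\tau}} = \frac{1}{N_{\rm pix}} \left[\frac{\sum_i (\tau_i - \bar{\tau})\, d_i}{\overline{\tau^2} - \bar{\tau}^2}\right], \tag{50}$$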

where and