5\times2pt Methodology

# Dark Energy Survey Year 1 Results: Methodology and Projections for Joint Analysis of Galaxy Clustering, Galaxy Lensing, and CMB Lensing Two-point Functions

## Abstract

Optical imaging surveys measure both the galaxy density and the gravitational lensing-induced shear fields across the sky. Recently, the Dark Energy Survey (DES) collaboration used a joint fit to two-point correlations between these observables to place tight constraints on cosmology (DES Collaboration et al., 2017). In this work, we develop the methodology to extend the DES Collaboration et al. (2017) analysis to include cross-correlations of the optical survey observables with gravitational lensing of the cosmic microwave background (CMB) as measured by the South Pole Telescope (SPT) and Planck. Using simulated analyses, we show how the resulting set of five two-point functions increases the robustness of the cosmological constraints to systematic errors in galaxy lensing shear calibration. Additionally, we show that contamination of the SPT+Planck CMB lensing map by the thermal Sunyaev-Zel’dovich effect is a potentially large source of systematic error for two-point function analyses, but show that it can be reduced to acceptable levels in our analysis by masking clusters of galaxies and imposing angular scale cuts on the two-point functions. The methodology developed here will be applied to the analysis of data from the DES, the SPT, and Planck in a companion work.

12

Last updated March 6, 2018

## 1 Introduction

Modern optical imaging surveys measure the positions and gravitational lensing-induced shears of millions of galaxies. From these measurements, one can compute two fields on the sky: the spin-0 galaxy overdensity field, , and the spin-2 weak lensing shear field, . Two-point cross-correlations between these fields are powerful cosmological probes, as they are sensitive to both the geometry of the Universe and the growth of structure. Joint fits to multiple two-point correlations — such as and  — offer the possibility of breaking degeneracies between cosmological and nuisance parameters, as well as significantly improving cosmological constraints (e.g. Hu & Jain, 2004).3 Such joint fits have recently been demonstrated in several works (Nicola et al., 2016; Kwan et al., 2017; van Uitert et al., 2017; DES Collaboration et al., 2017). We refer to the set of three two-point functions that can be formed from and  — namely , , and  — as 32pt. The 32pt analysis of the DES Collaboration et al. (2017) presented the tightest cosmological constraints to date on and from a single galaxy survey data set, demonstrating the power of such joint two-point correlation analyses.

High resolution, low-noise observations of the cosmic microwave background (CMB) have recently enabled mapping of gravitational lensing of the CMB, typically quantified via the lensing convergence, . While it is possible to convert a map of the convergence to shear, doing so is not necessary for this analysis. Two-point functions that correlate with the and fields also contain cosmological information (Hand et al., 2015; Liu et al., 2016; Kirk et al., 2016; Harnois-Déraps et al., 2017; Giannantonio et al., 2016). Jointly fitting and with the 32pt cross-correlations serves several purposes. First, the joint fit helps improve cosmological constraints by breaking degeneracies with galaxy bias (e.g. Baxter et al., 2016). Second, the joint fit can constrain nuisance parameters associated with sources of systematic error in galaxy lensing measurements (e.g. Vallinotto, 2012; Das et al., 2013; Baxter et al., 2016; Schaan et al., 2017). This is possible because the sources of systematic error that affect the measurement of are generally different from those impacting the measurement of . Finally, because the CMB originates from high redshift, including the correlations extends the redshift lever arm of the two-point function measurements (e.g. Das & Spergel, 2009).

The South Pole Telescope (SPT) (Carlstrom et al., 2011) and Planck (Tauber et al., 2010; Planck Collaboration et al., 2011) provide high signal-to-noise maps of the CMB overlapping with the DES survey, allowing for the joint measurement of all six of the two-point functions that can be formed from , and . We will refer to the combination of all six two-point functions as 62pt, and the combination of all two-point functions except for as 52pt.

In this work, we develop the methodology for jointly analyzing the 52pt set of correlation functions. This methodology will be applied to measurements of the 52pt two-point functions using data from DES, SPT and Planck in a companion paper, extending the 32pt analysis of DES Collaboration et al. (2017). We do not include in the analysis presented here because the current highest signal-to-noise measurement of comes from Planck Collaboration et al. (2016a). Since the Planck  map covers the full sky, the covariance between measured by Planck and set of 52pt correlations involving current SPT and DES Y1 data (which overlap over roughly 1300 sq. deg. on the sky) is negligible. Therefore, cosmological constraints from the Planck measurement of can be trivially combined with those from the 52pt analysis by taking the product of the corresponding posteriors. For future DES and SPT data, the improved signal-to-noise of the measurements may necessitate revisiting the approximation of negligible covariance between the Planck measurement of and the DES and SPT measurements of 52pt.

The analysis presented here builds on the methodology presented in Krause et al. (2017) (hereafter K17) for analyzing the 32pt data vector. The most significant difference between this work and that of K17 is that we must account for sources of systematic error that are specific to the cross-correlations with . Of these systematics, the most problematic is contamination of by the thermal Sunyaev-Zel’dovich effect (tSZ). In the context of measurements of the CMB lensing autospectrum, the effect of tSZ (and other potential contaminants) on has been investigated previously by e.g. van Engelen et al. (2014). We develop an approach for estimating the effects of such contamination on and , and use these estimates to determine an appropriate choice of angular scale cuts to apply to the two-point function measurements to minimize tSZ-induced bias.

After developing the methodology for analyzing the 52pt data vector, we use simulated likelihood analyses to demonstrate how adding the cross-correlations with to the 32pt analysis can improve cosmological constraints and can potentially allow for the self-calibration of nuisance parameters that are degenerate with cosmology in the 32pt analysis. While the currently low signal-to-noise of the and correlation functions limits their cosmological constraining power, we show that including them in the joint analysis can make the cosmological constraints more robust to multiplicative shear biases.

This work builds on several recent DES collaboration papers that analyze two-point functions of DES observables. These include the analysis of cosmic shear (Troxel et al., 2017), the analysis of galaxy clustering (Elvin-Poole et al., 2017), the analysis of galaxy-galaxy lensing (Prat et al., 2017), and the joint analysis of all three two-point functions in DES Collaboration et al. (2017).

The layout of the paper is as follows. In §2 we describe the datasets used in this work; in §3 we describe the modeling steps required to compute a likelihood for the observed two-point functions given a cosmological model; in §4 we describe our procedure for characterizing systematic biases in and that are specific to the maps; in §5 we describe the motivation for our choice of angular scale cuts. We present results from simulated analyses in §6 and conclude in §7.

## 2 Data

This work presents the methodology for analyzing the two-point functions formed between , and . For the most part, developing this methodology does not rely on analyzing any actual data. However, in §4, we will take a data-driven approach to characterizing biases in and due to contamination of the maps. For that part of the analysis, we rely on exactly the same galaxy and shear catalogs used in the DES 32pt analysis (DES Collaboration et al., 2017). Below, we briefly describe these catalogs and refer readers to the listed references for more details.

We consider measurements of and in configuration space, i.e. as a function of the angle between the two points being correlated. Measuring and requires two sets of galaxies, which we refer to as ‘lenses’ and ‘sources.’ Lenses are treated as tracers of the matter density field and are used to measure ; images of the source galaxies are used to measure the gravitational lensing-induced shears, . The lens and source galaxies are in turn divided into multiple redshift bins.

### 2.1 Galaxy catalog

For the purposes of measuring , we use a subset of the DES Y1 ‘Gold’ catalog (Drlica-Wagner et al., 2017) referred to as redMaGiC (Rozo et al., 2016). The redMaGiC galaxies are a set of luminous red galaxies (LRGs) selected based on their match to a red sequence template, which is calibrated via the redMaPPer galaxy-cluster-finding algorithm (Rykoff et al., 2014; Rozo et al., 2016; Rykoff et al., 2016). The redMaGiC galaxies are designed to have very well understood photometric redshift estimates, with a scatter of (Elvin-Poole et al., 2017). As in K17, the redMaGiC galaxies are divided into 5 redshift bins at , where the three lower redshift bins have a luminosity threshold of and the two higher redshift bins have luminosity thresholds and , respectively. For a more detailed description of the galaxy sample, see also Prat et al. (2017); Elvin-Poole et al. (2017).

### 2.2 Shear catalog

For the purposes of measuring , we use the same shear catalogs used in the 32pt analysis. Two shear measurement algorithms – MetaCalibration (Huff & Mandelbaum, 2017; Sheldon & Huff, 2017) and Im3shape (Zuntz et al., 2013) – were used to generate the galaxy shear catalogs that were used in the 32pt analysis, while the MetaCalibration catalog was used as the fiducial catalog due to its higher signal-to-noise. MetaCalibration uses the data itself to calibrate the bias in shear estimation by artificially shearing the galaxy images and re-measuring the shear. Im3shape, on the other hand, invokes a large number of sophisticated image simulations to calibrate the bias in shear estimates. As in K17, the shear catalogs were divided into four redshift bins between and 1.3. For a detailed description of both shear catalogs, see Zuntz et al. (2017). For details of the photo-z catalog associated with the shear catalogs, see Hoyle et al. (2017). The analysis presented in this work adopts noise estimates and redshift distributions corresponding to the MetaCalibration catalog.

### 2.3 CMB lensing map

The methodology presented here is general and could be applied to any map of . However, in order to accurately characterize the magnitude of biases in , we tailor our analysis to the maps that will be used in the companion paper that presents cosmological constraints obtained from analysis of the 52pt data vector. That work will use the maps from Omori et al. (2017) (henceforth O17) and so we briefly describe those maps here.

The map generated in O17 is computed by applying the quadratic lensing estimator of Hu & Okamoto (2002) to an inverse variance weighted combination of 150 GHz SPT and 143 GHz Planck temperature maps. The quadratic estimator of Hu & Okamoto (2002) exploits the fact that gravitational lensing induces a correlation between the gradient of the CMB temperature field and small-scale fluctuations in this field. A suitably normalized quadratic combination of filtered CMB temperature maps then provides an estimate of . The SPT maps used for this purpose are from the SPT-SZ survey (Story et al., 2013). The combined map produced from the SPT and Planck datasets is sensitive to a greater range of angular modes on the sky than either experiment alone: Planck cannot measure small scale modes because of its 7’ beam (at 143 GHz), while SPT cannot measure large scale modes because of time domain filtering that is used to remove atmospheric contamination.

The map from Omori et al. (2017) is restricted to the area of sky that is observed by both SPT and Planck . The overlap of this region with the DES Y1 survey region is approximately 1300 sq. deg.

## 3 Modeling the two-point functions

### 3.1 Formalism

We begin by describing the formalism used to model the 52pt set of correlation functions. This methodology closely follows that described in K17 to model the 32pt data vector. We consider exactly the same galaxy selections, and make many of the same modeling assumptions. To minimize repetition, in this work, we focus only on describing the modeling of those correlations that involve (i.e. and ); for a complete description of the modeling of the other two-point functions (i.e. , , and ), we refer readers to K17.

When measuring the correlation between DES shears and , we consider only the component of the shear that is oriented orthogonally to the line connecting the two points being correlated, i.e. the tangential shear, . In the weak shear limit, this tangential component contains all the lensing signal (Bartelmann & Schneider, 2001). Using has the advantage of reducing contamination from additive systematics in the shear estimation and avoiding mask effects during the conversion from to (Harnois-Déraps et al., 2016). Henceforth, we will denote this correlation as .

We begin by computing the cross-spectra between the relevant fields in harmonic space using the Limber approximation (Limber, 1953). For computing , it is convenient to first express this cross-correlation in terms of lensing convergence, rather than shear, and then transform to shear when expressing the correlation function in configuration space. The lensing convergence, , in some direction specified by , is defined by

 κ(^θ,χs)=3ΩmH202c2∫χs0dχ′χ′(χ−χ′)χδ(^θ,χ′)a(χ′), (1)

where is comoving distance (with being the comoving distance to the source plane), is the Hubble constant today, is the matter density today, is the matter overdensity, and is the scale factor (Bartelmann & Schneider, 2001). We refer to the lensing convergence defined for the source galaxies as (in contrast to the CMB-derived lensing convergence, ). For galaxy lensing, the sources are distributed across a broad range of redshift and the convergence must be averaged across this distribution. In this case, the convergence for source galaxies in the th redshift bin becomes

 κis(^θ)=∫∞0dχ′qiκs(χ′)δ(^θ,χ′), (2)

where we have defined the lensing weight as

 qiκs(χ)=3ΩmH202c2χa(χ)∫∞χdχ′nis(z(χ′))dzdχ′¯nisχ′−χχ′, (3)

where the number density of the source galaxies as a function of redshift, and is the average of that quantity over redshift. Since the CMB originates from a very narrow range of comoving distance, we can approximate the source redshift distribution of the CMB as a Dirac -function centered on the comoving distance to the last scattering surface, . In this case, the lensing weight function for CMB lensing becomes

 qκCMB(χ)=3ΩmH202c2χa(χ)χ∗−χχ∗. (4)

The overdensity of galaxies on the sky in the th redshift bin can also be related to an integral along the line of sight of the matter overdensity, assuming the galaxy bias is known. Following DES Collaboration et al. (2017), we restrict our analysis to the linear bias regime, where the galaxy overdensity can be expressed as , where is the galaxy bias. In this case, the projected overdensity of galaxies on the sky is

 δig(^θ)=∫dχ′qiδg(χ′)δ(^θ,χ′), (5)

where we have defined the lens galaxy weight function as

 qiδg(χ)=big(χ)nig(z(χ))¯nigdzdχ, (6)

where is the number density of the lens galaxies in the th bin as a function of redshift. We will further simplify the bias modeling such that the bias for each galaxy redshift bin is assumed to be a constant, . In reality, the linear bias model is known to break down at small scales (Zehavi et al., 2005; Blanton et al., 2006; Cresswell & Percival, 2009). We will show in §5 that for our choice of angular scale cuts, the assumption of linear bias does not bias our parameter constraints.

Using the Limber approximation, we have

 CκsκCMB(ℓ)=∫dχqiκs(χ)qκCMB(χ)χ2PNL(ℓ+1/2χ,z(χ)), (7)

and

 CδgκCMB(ℓ)=∫dχqiδg(χ)qκCMB(χ)χ2PNL(ℓ+1/2χ,z(χ)), (8)

where labels the redshift bin (of either the lens or source galaxies) and is the nonlinear matter power spectrum. We compute the nonlinear power spectrum using the Boltzmann code CAMB4 (Lewis et al., 2000; Howlett et al., 2012) with the Halofit extension to nonlinear scales (Smith et al., 2003; Takahashi et al., 2012) and the Bird et al. (2012) neutrino extension.

SPT and Planck observe the CMB with finite-size beams. When generating the map, this beam is deconvolved, exponentially increasing noise at small scales. Unfortunately, the presence of small-scale noise in will make the real-space covariance diverge. To prevent this divergence, we apply a smoothing function to the maps. We convolve the maps with a Gaussian beam having full width at half maximum of . In harmonic space, this corresponds to multiplication of the maps by

 B(ℓ)=exp(−ℓ(ℓ+1)/ℓ2beam), (9)

where . Additionally, we filter out modes in the map with and , where the lower bound is to avoid the potential contamination coming from the mean-field calibration in the lensing map (Omori et al., 2017) and the upper limit is imposed to remove potential biases due to foregrounds in the map. The impact of this filtering can be seen in Fig. 1.

Converting the above expressions to configuration-space correlation functions via a Legendre transform yields

 wγitκCMB(θ) = ∫dℓℓ2πF(ℓ)J2(ℓθ)CκsκCMB(ℓ), (10) wδigκCMB(θ) = ∑2ℓ+14πF(ℓ)Pℓ(cos(θ))CδigκCMB(ℓ), (11)

where is the second order Bessel function of the first kind and is the th order Legendre polynomial. The appearance of in Eq. 10 is a consequence of our decision to measure the correlation of with tangential shear. The function , where is a step function, describes the filtering that is applied to the map. Henceforth, for notational convenience, we will suppress the redshift bin labels on the correlation functions. We show the model and corresponding to the best-fit Planck cosmological parameters in Fig. 1.

### 3.2 Modeling systematics affecting δg and γ

There are several sources of systematic uncertainty that affect the and observables. These systematics will propagate into the and measurements. We model these sources of systematic error exactly as described in K17, and so provide only a brief description here. We will consider sources of systematic error that can affect the map in more detail in §4.

#### Shear calibration bias

The inference of from an image of a galaxy is subject to sources of systematic error. Such errors are commonly parameterized in terms of a multiplicative bias, , such that the observed shear is related to the true shear by (e.g. Zuntz et al., 2017). While additive biases may also be present in shear calibration, these are typically tightly constrained by the data itself (and are minimized by our decision to use the tangential shear component).

Following K17 and other literature (Abbott et al., 2016; Joudaki et al., 2017; Hildebrandt et al., 2017), we adopt a separate multiplicative bias parameter, , for the th source galaxy redshift bin. The model for (Eq. 10) is then scaled by . Note, however, that does not depend on the estimated shears and is therefore unaffected by .

#### Intrinsic alignment

In addition to the coherent alignment of galaxy shapes caused by gravitational lensing, galaxy shapes can also be intrinsically aligned as a result of e.g. tidal fields (Heavens et al., 2000; Catelan & Porciani, 2001; Crittenden et al., 2001). Such intrinsic alignments constitute a potential systematic for the measurement of gravitational lensing from galaxy shapes. Intrinsic alignments of galaxies will also affect (Hall & Taylor, 2014; Troxel & Ishak, 2014). To see this, consider a galaxy that is stretched by the tidal field of nearby large scale structure; the same large scale structure that causes this intrinsic alignment will also lens the CMB, leading to a correlation between the intrinsic galaxy shapes and . This effect is analogous to the usual gravitational-intrinsic (GI) term affecting (Hirata & Seljak, 2004). Following K17, we parameterize the effects of intrinsic alignments using the nonlinear linear alignment (NLA) model (Bridle & King, 2007). This model impacts for the source galaxies as described in K17.

Briefly, we perform the replacement

 qiκs(χ)→qiκs(χ)−A(z(χ))nis(z(χ))¯nisdzdχ, (12)

where

 A(z)=AIA,0(1+z1+z0)αIA0.0139ΩmD(z), (13)

where is the linear growth factor and we set . The normalization and power law scaling with redshift, are treated as free parameters of the model.

#### Photometric redshift errors

DES uses multiband optical photometry to infer the redshift distributions of the galaxy samples (it is these distributions that are necessary for modeling the 52pt set of correlation functions). This inference is potentially subject to sources of systematic error, which can result in biases to and . Following K17 and other literature (Abbott et al., 2016; Joudaki et al., 2017; Hildebrandt et al., 2017), we parameterize such biases in terms of the shift parameters, , such that the estimated redshift distribution, is related to the true redshift distribution, , via . We consider separate shift parameters for each lens and source galaxy redshift bin, and , respectively, where the superscript labels the redshift bin.

### 3.3 Covariance

The DES 32pt analysis uses a halo model covariance, as described and validated in K17. We now describe the extension of this formalism to the CMB lensing cross-correlations and . For notational convenience, we will use and to generically represent one of these two-point functions in configuration and harmonic space, respectively; we will use and to represent one of the 32pt correlation functions (i.e. , , and ) in configuration and harmonic space, respectively. We calculate the covariance of the harmonic space correlation functions, as the sum of a Gaussian covariance and non-Gaussian covariance , which includes super-sample variance (Takada & Hu, 2013), as detailed in Krause & Eifler (2017) and Schaan et al. (2017), using the halo model to compute the higher-order matter correlation functions. The covariance of the and is then

 Cov(Σi(θ),Σk(θ′))=∫dℓℓ2πJn(Σi)(ℓθ)F(ℓ)∫dℓ′ℓ′2πJn(Σk)(ℓ′θ′)F(ℓ′)[CovG(Σi(ℓ),Σk(ℓ′))+CovNG(Σi(ℓ),Σk(ℓ′))], (14)

where is the th-order Bessel function of the first kind, and is the function that describes the filtering that is applied to the map. The cross-covariance between and with one of the DES 32pt correlation functions is given by

 Cov(Σi(θ),Ξk(θ′))=∫dℓℓ2πJn(Σ)(ℓθ)∫dℓ′ℓ′2πJn(Ξ)(ℓ′θ′)[CovG(Σi(ℓ),Ξk(ℓ′))+CovNG(Σi(ℓ),Ξk(ℓ′))], (15)

where the order of the Bessel function is given by for , , and , by for and , and by for .

### 3.4 Likelihood analysis

We now build the likelihood of the data given the model described in §3.1 and the covariance described in §3.3. The model includes parameters describing cosmology, galaxy bias, intrinsic alignment, and shear and photo-z systematics. The cosmological model considered in this analysis is flat CDM. The cosmological parameters varied are the present day matter density parameter, , the normalization of the primordial power spectrum, , the spectral index of the primordial power spectrum, , the present day baryon density parameter, , and the Hubble parameter today, . The complete set of model parameters is summarized in Table 1. For the simulated likelihood analyses described below, we generate a data vector at a fiducial set of model parameters given by the middle column of Table 1. The priors imposed in our fiducial likelihood analysis are given in the third column of Table 1; these priors are identical to those of the 32pt analysis of DES Collaboration et al. (2017).

For the purposes of this analysis, we keep the cosmological density of neutrinos fixed to , corresponding to a total neutrino mass of 0.06 eV. This choice is reasonable since the DES Collaboration et al. (2017) analysis only weakly constrains the neutrino mass, and the 52pt analysis does not significantly improve on these constraints.

Given a point in parameter space, , we consider a Gaussian likelihood for the 52pt observable, :

 L(d|p)∝exp[−12∑ij(di−mi(p))[C−1]ij(dj−mj(p))], (16)

where is the model vector, the sum runs over all elements of the data vector, and is the covariance matrix described in §3.3. As in K17, we keep the covariance matrix fixed as a function of cosmological parameters. This ignores the cosmology-dependence of the covariance matrix (Morrison & Schneider, 2013; Eifler et al., 2009), which is negligible compared to the noise level in the DES Y1 and SPT data.

The computation of the model vector and the likelihood analysis is performed in CosmoSIS (Zuntz et al., 2015). We sample parameter space using the multinest algorithm (Feroz et al., 2009). The multinest sampler has been tested in K17 to yield results consistent those of another sampler, emcee (Foreman-Mackey et al., 2013), which relies on the algorithm of Goodman & Weare (2010).

## 4 Biases in the κCMB maps

### 4.1 Overview

While the systematics considered in §3.2 affect both the 32pt data vector and the 52pt data vector, there are also sources of systematic error that impact only and . In this section, we attempt to quantify biases in the maps that will affect the measurement of these two correlation functions.

We write the observed signal on the sky, , as the sum of the true CMB lensing signal, , and some contaminating field, , i.e. . The observed correlation functions and then differ from the correlation functions with the true by and . To determine these biases, we will form an estimate of and then use the true galaxy and shear catalogs described in §2 to calculate and . However, given the large uncertainties associated with our estimates of , we will not attempt to model or correct for such biases in our analysis. Instead, we will choose angular scale cuts such that biases to the inferred posteriors on the model parameters are below 50% of the statistical errors (see discussion in §5).

The dominant sources of bias that contribute to will depend on the methods and data used to estimate . For instance, a map created from maps of CMB temperature will be affected by bias due to the tSZ effect, while this is not the case for maps constructed from maps of CMB polarization. Here we tailor our analysis to those systematics that are expected to be dominant for the cross-correlation of DES galaxies and shears with the maps generated in O17, since it is these maps that will be used in the forthcoming 52pt results paper.

Both the SPT 150 GHz maps and Planck 143 GHz maps used to construct the maps in O17 receive contributions from sources other than primary CMB. In particular, these maps receive significant contributions from the tSZ effect and from radio and thermal dust emission from distant galaxies. The tSZ effect is caused by inverse Compton scattering of CMB photons with hot electrons. At frequencies near 150 GHz, this results in a decrement in the observed CMB temperature. Unresolved galaxies, which together constitute the cosmic infrared background (CIB), on the other hand, appear as a diffuse background in the observed maps. The tSZ and CIB signals on the sky will propagate through the quadratic estimator into the maps of O17. Since both non-Gaussian sources of contamination are correlated with the matter density, we also expect to be correlated with the matter density. Consequently, these biases will not average to zero in the and correlations, and we must carefully quantify their impact on our analysis. Note that contamination from the kinematic Sunyaev-Zel’dovich (kSZ) effect is also expected to be present in the maps. However, since the kSZ signal has a similar morphology to the tSZ signal, but an amplitude that is a factor of smaller, by ensuring that the tSZ effect does not bias our results, we ensure that the kSZ effect also does not lead to a significant bias.

Our approach to estimating due to both tSZ and CIB is to estimate the contributions to the SPT+Planck temperature maps from these signals, and to then pass these estimated temperature maps through the quadratic estimator pipeline of O17. To see that this procedure works, consider the total temperature at some multipole, , as the sum of the lensed CMB and the contaminating signal: . The quadratic estimator for the lensing potential is then , where . Under the gradient approximation, , where the tilde denotes the unlensed field. In the case of both tSZ and CIB bias, terms of the form average to zero because the unlensed gradient field is uncorrelated with these biases. Therefore, we have , where is the “lensing” potential associated with the contaminating temperature field.

As we will see below, biases in and due to the tSZ effect can be quite large, and dominate over all other biases considered. Since massive galaxy clusters are the largest contributors to the tSZ effect on the sky, the level of tSZ bias in the maps can be reduced by masking these objects. Indeed, O17 masked clusters detected in the SPT maps at high significance via their tSZ decrement before applying the quadratic estimator to the SPT+Planck temperature maps. Although masking regions of high tSZ signal reduces the tSZ-induced bias, it has the undesirable consequence of inducing another bias in the correlation functions, since the regions of high tSZ signal are also regions of high . We will argue below that this bias is negligible given our masking choices.

We emphasize that the approach taken in this section to characterizing biases in the map is quite general, and could be applied to characterize biases present in maps other than that of O17. However, the values of the biases obtained here (in particular the measurement of bias due to tSZ contamination) apply only to the maps of Omori et al. (2017). Maps of generated from other data sets or using different techniques could have significantly different levels of bias.

### 4.2 Estimate of bias due to the tSZ effect

#### Construction of simulated y map

As described above, we estimate tSZ-induced bias in and by correlating the true galaxy and shear catalogs with an estimate of the bias in the map due to tSZ signal, which we refer to as . We estimate by applying the quadratic lensing estimator to an estimated map of the tSZ temperature signal in the SPT+Planck sky maps. In principle, the tSZ temperature signal could be computed directly from the multi-frequency SPT and Planck sky maps. Instead, we take the approach of constructing a simulated map of the tSZ signal by placing mock tSZ profiles at the locations of massive galaxy clusters on the sky. One advantage of using a simulated tSZ map instead of generating one from SPT or Planck temperature maps is that the simulated map will not be affected by noise in the temperature maps, making it possible to characterize the bias with high statistical accuracy. On the other hand, this approach carries some associated modeling uncertainty, which we will attempt to constrain below.

The cluster sample used to generate the simulated tSZ map combines the redMaPPer (Rykoff et al., 2014) cluster catalog from DES Y1 data with samples of tSZ-detected clusters from SPT and Planck. We use redMaPPer clusters with richness , SPT clusters with detection significance (Bleem et al., 2015) and the entire Planck tSZ-detected cluster sample (Planck Collaboration et al., 2016d). Each of these samples probes a different range of mass and redshift. The redMaPPer sample captures low mass clusters, but only over the redshift range of DES. The SPT cluster sample captures only very massive clusters, but out to high redshift. The Planck cluster sample, on the other hand, captures very massive clusters at low redshift which are missed by both SPT and DES.

Of course, there are halos in the Universe that are not detected by redMaPPer, SPT or Planck , but nonetheless contribute to the tSZ signal on the sky. However, halos outside of the DES survey region or at redshifts beyond those probed by DES, will not correlate with DES galaxies and shears, and will therefore not bias the inferred correlation functions (although this tSZ contribution will contribute as noise to the measurements). There are also halos within the DES survey region and redshift range that are not detected by any of these three surveys because their corresponding observables are below the detection limit. The lowest mass halos in our sample come from the redMaPPer catalog. The limiting richness threshold of the redMaPPer catalog that we employ is , corresponding roughly to a mass of assuming the mass-richness relation of Melchior et al. (2017). Using simulations, Battaglia et al. (2012) found that halos with masses contribute half the tSZ power at , with that fraction decreasing towards lower . Consequently, for (the range used to construct the maps from O17), we expect our simulated map to capture the majority of the tSZ power on the sky. There may also be tSZ signal on the sky that is not due to gas in massive halos, i.e. tSZ signal due to diffuse gas. However, again, this contribution is expected to be subdominant to the contribution of the massive halos and would therefore not significantly change the estimated bias in .

To assign tSZ profiles to the redMaPPer and Planck clusters, we first estimate their masses, and then use a model to compute expected tSZ profiles given the estimated masses. For the redMaPPer clusters, the masses are assigned using the mean mass-richness relation of Melchior et al. (2017). For the Planck clusters, the masses are assigned using the estimates constructed by Planck Collaboration et al. (2016d) from the observed cluster tSZ signals. In our fiducial analysis we set the hydrostatic bias parameter to when computing the masses of the Planck clusters. Given the mass estimates for the redMaPPer and Planck clusters, we compute corresponding pressure profiles using the fits from Battaglia et al. (2012). In particular, the thermal pressure profile is written as

 Pth(x)=P200P0(x/xc)γ[1+(x/xc)α]−β, (17)

where and is the radius from the cluster at which the enclosed mass is and the corresponding mean density is . The normalization, is given by

 P200=200GM200cρcrit(z)fb2R200c, (18)

where . The parameters , , , , and in Eq. 17 are related to the cluster mass, , and redshift as described in Battaglia et al. (2012). The pressure profile is then converted to a Compton- profile by integrating along the line of sight,

 y(θ,M200c,z)=σTmec2∫dlPe(√l2+d2Aθ2,M200c,z), (19)

where is the Thomson cross-section, is the electron mass, and the term in the integral is the electron pressure ( is the line of sight distance, is the angular diameter distance and is the angular separation relative to the cluster center). We assume that the electron pressure, , is given by . This relation holds when the hydrogen and helium are fully ionized, and the helium mass fraction is .

Finally, the tSZ temperature signal at frequency is related to via

 ΔT(ν)TCMB=g(hνkBTCMB)y, (20)

where in the limit that the gas is non-relativistic (e.g. Carlstrom et al., 2002).

In contrast to the Planck and DES-detected clusters, for the SPT clusters we have a direct measurement of their tSZ profiles, and so use these measurements rather than modeling the profile through an estimate of the cluster masses. Bleem et al. (2015) performed fits to the observed profiles using the isothermal model (Cavaliere & Fusco-Femiano, 1976), with :

 ΔT(θ)=ΔT0(1+θ/θc)−1, (21)

where is the angular distance to the cluster and and are parameters of the fit. For the SPT-detected clusters, we use these -profile fits to estimate their contribution to the signal on the sky. For any SPT-detected cluster that is also detected by Planck or redMaPPer, we use the SPT measurement of its tSZ profile.

As a test of our simulated tSZ map, the left panel of Fig. 2 shows a comparison of the estimated tSZ temperature profiles around the SPT, redMaPPer and Planck clusters used to generate the tSZ map. For those SPT-detected clusters that are also detected in the redMaPPer and Planck catalogs, we plot the amplitude of the -profile fits at one arcminute from the cluster center against the corresponding amplitudes of the estimated profiles from Eq. 17. We choose to evaluate the profiles at one arcminute because this is roughly the beam scale of the SPT, so we do not expect the -profiles to be well constrained below this scale. The left panel of Fig. 2 makes it clear that the estimated tSZ temperature profiles from Eq. 17 agree well with the direct -profile fits to the observed tSZ signals of the clusters. This agreement is non-trivial: it provides a test of both of the profile model for the simulated tSZ map as well as the mass estimates for both the redMaPPer and Planck clusters.

As another check on the model -profiles, we integrate the simulated profiles for the redMaPPer clusters out to to obtain , and compare these values to the direct measurement of around redMaPPer clusters from Saro et al. (2017). Saro et al. (2017) used a matched filter approach to estimate for redMaPPer clusters detected in DES Science Verification data. We find no evidence for a bias between the simulated and directly estimated for richness . At richness , we find that our model tends to yield higher values, meaning that our model may be somewhat overestimating the effects of tSZ contamination. Note that a similar discrepancy between the measured and predicted profiles was also found by Saro et al. (2017). In that work, it was found that the measured values for clusters with were smaller than predicted based on assumed scaling relations from Arnaud et al. (2010).

As a further test of our simulated tSZ map, we compute the power spectrum of the map and compare the result to measurements of the power spectrum from George et al. (2015) and Planck Collaboration et al. (2016b). This comparison is shown in the right panel of Fig. 2. At , our model yields a tSZ power spectrum that is in excellent agreement with that measured by George et al. (2015). At , we expect the signal on the sky to receive significant contributions from low mass () and high redshift halos () halos. The fact that our simulated tSZ map does not include low-mass, high-redshift halos yet has power at that is as large as the George et al. (2015) measurement suggests we may have somewhat overestimated the contribution to the -signal from the redMaPPer clusters. This explanation is consistent with the finding that our model predicts larger values than measured by Saro et al. (2017) for low richness clusters.

For , the tSZ power spectrum receives a significant contribution from clusters that are detected by Planck, and not by SPT or DES, i.e. high-mass, very low redshift clusters. This can be seen from the fact that when we vary the hydrostatic mass bias parameter used to calculate masses for the Planck clusters, the amplitude of the tSZ power spectrum at low changes significantly. For our fiducial choice of , we somewhat underpredict the tSZ power at low ; for , we somewhat overpredict the tSZ power at low , since this effectively assigns the Planck clusters larger masses, and thus larger tSZ signals. Although Planck Collaboration et al. (2016c) find evidence for , this choice is not well motivated here since we are attempting to invert the SZ-derived masses to obtain an estimate of the corresponding SZ profiles. Consequently, we keep as the fiducial choice for the estimated tSZ map. Note, though, that the amplitude of the inferred bias in and is almost completely insensitive to the value of that is assumed because the clusters that are only detected by Planck are at very low redshift, and hence do not have strong correlations with DES galaxies or shears.

#### Masking clusters to reduce tSZ-induced bias

Since galaxy clusters are sources of large tSZ signals, tSZ contamination of the maps can be reduced by masking these objects. O17 masked clusters detected by SPT with signal-to-noise when applying the quadratic lensing estimator to the SPT+Planck CMB temperature maps. Applying a more aggressive mask prior to the application of the quadratic estimator is problematic because a complicated mask will lead to difficulties with mode coupling.

In tests on the simulated -map, we find that tSZ bias of the map can be further suppressed by masking additional clusters after the reconstruction. This approach works because the application of the quadratic estimator with the filters defined in O17 to a localized tSZ source results in a somewhat-localized signal. Masking clusters post- reconstruction, then, can be used to reduce high- bias in the maps.

Ultimately, the choice of clusters used for masking is set by the two competing desires to (a) reduce bias in and due to tSZ, while (b) ensuring that the bias induced by masking regions of high remains very small (see §4.4 for more discussion of this bias). In tests on the simulated -maps, we find that masking SPT-detected clusters with and redMaPPer-detected clusters with post- reconstruction can reduce the impact of tSZ bias while inducing an acceptable level of bias due to masking. For all masked clusters, the mask radius employed is 5 arcminutes. This choice of masking radius was found to significantly suppress the high bias from the tSZ in tests on simulations, while simultaneously preserving most of the sky area. The masking threshold corresponds roughly to removing clusters with mass (Bleem et al., 2015). The threshold corresponds roughly to removing clusters with assuming the relation from Melchior et al. (2017). The fraction of sky area covered by the cluster mask is less than 1%.

#### Calculation of bias due to tSZ

To estimate , we pass the simulated tSZ temperature map through the estimation pipeline of Omori et al. (2017). We then correlate with the redMaGiC and shear catalogs described in §2.1 and §2.2 to estimate the biases in and .

We measure and in harmonic space using PolSpice5. Fig. 3 shows these bias functions relative to the theoretical expectation for and assuming the fiducial cosmological model shown in Table 1. Although the exact values of the estimated biases are cosmology dependent, we are only attempting to determine the scales over which the tSZ bias is significant. The change in these scales is negligible over the range of cosmological models allowed by the data. The tSZ bias is well described by a multiplicative factor that is a smooth function of multipole, and which exhibits mild redshift dependence. The bias in is negative at scales of , and positive for . The amplitudes of these biases can be quite large, reaching a maximum of roughly 25% for , and even higher for . The tSZ bias in does not exhibit as strong a peak at small scales as , but reaches similar levels of magnitude below .

Since the redMaPPer catalog is complete to only , we expect our estimate of the tSZ- induced bias in the last two redshift bins of and to be incomplete, since these bins receive contributions from structure at . We therefore apply our bias measurements for the third-to-last redshift bin to the higher redshift bins. We expect this approximation to be conservative, since the tSZ bias apparently decreases as a function of increasing redshift, as seen in Fig. 3. This decrease is apparently physical, since the completeness of the redMaPPer and SPT catalogs does not evolve significantly over the redshift range .

We fit the measured biases with smooth functions to make incorporation into our simulated analyses easier. For the ratio of /, we find that the functions defined below provide a good fit:

 y(ℓ)=a(|(ℓ−b)/c|)p×10−8+d, (22)

where , , , , and are free parameters for each redshift bin. Similarly, for /, we use a function of the form:

 y(ℓ)=−aexp(−(ℓ/b))1.2×10−4+c. (23)

The results of these fits are shown as the solid curves in Fig. 3. Given these parameterized fits, we can transform the biases measured in multipole space into biases in angular space (where and are measured).

Clearly, the biases due to tSZ leakage into are significant. In §5 we will assess the impact of these biases on the inferred cosmological constraints, and will choose scale cuts to mitigate their impact.

### 4.3 Estimate of CIB bias

To estimate the effects of CIB contamination of the maps on and , we follow a procedure similar to that used to estimate the tSZ bias. However, rather than generating a simulated CIB map, we instead rely on Planck observations. To this end, we use the Planck GNILC 545 GHz CIB map (Planck Collaboration et al., 2016e) as a proxy for the true CIB emission on the sky. We first calculate the -dependent cross-correlation between the combined SPT+Planck map and the Planck GNILC GHz maps; this correlation provides an estimate of the amount of CIB contamination in the SPT+Planck map. The GNILC GHz map is then convolved with the dependent scaling function:

 η(ℓ)=CGNILC×SPℓCGNILC×GNILCℓ, (24)

where refers to the SPT+Planck map. The result is a map of the estimated CIB leakage into the SPT+Planck temperature map.

Next, the quadratic estimator is applied to the estimated CIB leakage map to produce , an estimate of the leakage of CIB into the map. As with , we cross-correlate with the true DES galaxy and shear catalogs to form estimates of the bias in and due to CIB leakage. These cross-correlations are shown in Fig. 3. From the figure, it is apparent our estimate of the CIB bias is consistent with there being no bias, and we will henceforth ignore CIB as a potential source of contamination in our analysis.

### 4.4 Biases due to masking clusters

As mentioned in §4.2.2, massive galaxy clusters are masked to reduce contamination of by tSZ leakage. However, clusters are also associated with regions of high . Consequently, by masking these objects, we expect to reduce the amplitude of and