Dark Energy Survey Year 1 Results: Cross-Correlation Redshifts - Methods and Systematics Characterization
Abstract
We use numerical simulations to characterize the performance of a clustering-based method to calibrate photometric redshift biases. In particular, we cross-correlate the weak lensing (WL) source galaxies from the Dark Energy Survey Year 1 (DES Y1) sample with redMaGiC galaxies (luminous red galaxies with secure photometric redshifts) to estimate the redshift distribution of the former sample. The recovered redshift distributions are used to calibrate the photometric redshift bias of standard photo-z methods applied to the same source galaxy sample. We apply the method to three photo-z codes run on our simulated data: Bayesian Photometric Redshift (BPZ), Directional Neighborhood Fitting (DNF), and Random Forest-based photo-z (RF). We characterize the systematic uncertainties of our calibration procedure, and find that these systematic uncertainties dominate our error budget. The dominant systematics are due to our assumption of unevolving bias and clustering across each redshift bin, and to differences between the shapes of the redshift distributions derived by clustering and by photo-z methods. The systematic uncertainty in the mean redshift bias of the source galaxy sample is , though the precise value depends on the redshift bin under consideration. We discuss possible ways to mitigate the impact of our dominant systematics in future analyses.
Email: mgatti@ifae.es. Email: pvielzeuf@ifae.es. Einstein fellow. Affiliations are listed at the end of the paper.
1 Introduction
Current and future large photometric galaxy surveys like DES (https://www.darkenergysurvey.org/; The Dark Energy Survey Collaboration, 2005), KiDS (http://kids.strw.leidenuniv.nl/; de Jong et al., 2013), HSC (https://subarutelescope.org/Projects/HSC/; Aihara et al., 2017), LSST (https://www.lsst.org/; Tyson et al., 2003), Euclid (http://sci.esa.int/euclid/; Laureijs et al., 2011), and WFIRST (https://wfirst.gsfc.nasa.gov/; Spergel et al., 2013) will map large volumes of the Universe, measuring the angular positions and shapes of hundreds of millions (or billions) of galaxies. This will allow cosmological measurements with an unprecedented level of precision, leading to a considerable step forward in our understanding of cosmology and particularly of the nature of dark energy. To capitalize on their statistical constraining power, these surveys require accurate characterization of the redshift distributions of selected galaxies, which presents a considerable challenge in the absence of complete spectroscopic coverage.
Given the large amount of forthcoming photometric data, obtaining a spectroscopic redshift for every individual source is unfeasible: spectroscopy of large samples is time-consuming and expensive, and it is usually restricted to the brightest objects of any given sample. Because of this limitation, photometric surveys provide redshift estimates for each galaxy based on that galaxy’s multiband photometry, a technique called photometric redshift, or photo-z. There exists a large variety of photo-z methods (e.g. Hildebrandt et al., 2010; Sánchez et al., 2014). However, unrealistic SED templates, degeneracies between colors and redshift, and unrepresentative spectroscopic samples for both training and calibration ultimately limit the performance of photo-z methods (Lima et al., 2008; Cunha et al., 2009; Newman et al., 2015; Bezanson et al., 2016; Masters et al., 2017).
Clustering-based redshift estimation methods (Newman, 2008; Matthews & Newman, 2010; Ménard et al., 2013; Schmidt et al., 2013) constitute an interesting alternative to infer redshift distributions, since they are more general and do not suffer from the above limitations. Briefly, one uses the fact that the correlation amplitude between a sample with unknown redshifts and a reference sample with known redshifts in some narrow redshift bin can be related to the fraction of galaxies in the unknown sample that lie within the redshift range of the reference sample.
Clustering-based redshift estimators have been studied and applied both to simulations and to data (e.g. Ménard et al., 2013; McQuinn & White, 2013; Schmidt et al., 2013; Scottez et al., 2016; Scottez et al., 2017; Hildebrandt et al., 2017; van Daalen & White, 2017; Davis et al., 2017a). Hildebrandt et al. (2017) cross-correlated the source galaxies used in the KiDS cosmological analysis with galaxies from zCOSMOS (Lilly et al., 2009) and DEEP2 (Newman et al., 2013). Unfortunately, the small area covered by these reference galaxy samples severely limited the usefulness of the resulting cross-correlation analyses. Consequently, Hildebrandt et al. (2017) ultimately chose to rely on traditional photo-z methods in deriving the KiDS cosmological constraints.
The DES Y1 cosmological analyses rely on a different strategy. Instead of using a small spectroscopic sample as reference, we use red-sequence galaxies from the DES Y1 red-sequence Matched-filter Galaxy Catalog (redMaGiC, Rozo et al., 2016). The redMaGiC algorithm is designed to select galaxies with high quality photometric redshift estimates. While the reliance on redMaGiC photometric redshifts may be a source of concern for the cross-correlation program, the vastly superior statistical power of the sample renders the resulting cross-correlation constraints competitive with traditional photo-z methods.
The DES Y1 analysis attempts to combine traditional photo-z methods with cross-correlation techniques. In particular, motivated by the fact that the DES Y1 cosmological analyses are primarily sensitive to an overall redshift bias in the photometric redshift estimates (Hoyle et al., 2017; Troxel et al., 2017; DES Collaboration, 2017), we have sought to use cross-correlation methods to verify and calibrate the redshift bias of traditional photo-z methods. By combining these two techniques we benefit from the strengths of both methods, while ameliorating their respective weaknesses. This calibration strategy is fully implemented in the DES Y1 cosmic shear and combined two-point function analysis (Krause et al., 2017; Troxel et al., 2017; DES Collaboration, 2017).
This paper characterizes the performance and systematic uncertainties of our method for calibrating photometric redshift biases in the DES Y1 source galaxy sample via cross-correlation with redMaGiC galaxies. Specifically, we implement our method on simulated data, introducing sources of systematic uncertainty one at a time to arrive at a quantitative characterization of the reliability and accuracy of our method. A companion paper (Davis et al., 2017b) implements the photometric calibration method developed here to enable DES Y1 cosmology analyses, while a second companion paper (Cawthon et al., 2017) uses cross-correlations to validate the photometric redshift performance of the redMaGiC galaxies themselves.
The paper is organized as follows. In Section 2 we present the methodology we use to calibrate photo-z posteriors using clustering-based redshift estimation. The simulations and the samples used are described in Section 3. Section 4 is devoted to the study and quantification of the systematic error of our method. In Section 5 we further discuss some aspects of clustering-based redshift estimation techniques and how our method could be improved upon in the future. Finally, we present our conclusions in Section 6.
2 Methodology
In DES Y1 we will use clustering-based redshift estimates to correct the photo-z posterior distributions of a given science sample. We defer the description (and the binning) of the particular samples (unknown and reference) adopted in this work to Section 3, while keeping the description of the methodology as general as possible. Here, “unknown” always refers to the photometric galaxy sample for which we wish to calibrate photometric redshift biases, while “reference” refers to the galaxy sample with known, highly accurate redshifts (be they spectroscopic or photometric).
Our methodology divides into two steps:

We first estimate the redshift distribution of the unknown galaxy sample by cross-correlating with the reference sample. Note that the reference sample does not necessarily have to span the full redshift interval of the unknown sample.

We then use the recovered redshift distribution to calibrate the redshift bias of the unknown galaxy population by finding the shift which brings the photo-z posterior into better agreement with the redshift distribution obtained through cross-correlations.
2.1 First step: clustering-based redshift estimates
In the literature a variety of methods to recover redshift distributions based on cross-correlation have been discussed (Newman, 2008; Ménard et al., 2010; Schmidt et al., 2013; McQuinn & White, 2013). The underlying idea shared by all methods is that the spatial cross-correlation between two samples of objects is nonzero only in case of 3D overlap. Let us now consider two galaxy samples:

An unknown sample, whose redshift distribution has to be recovered.

A reference sample, whose redshift distribution is known (either from spectroscopic redshifts or from high-precision photometric redshifts). The reference sample is divided into narrow redshift bins.
To calibrate the redshift distribution of the unknown sample we bin the reference sample in narrow redshift bins, and then compute the angular cross-correlation signal between the unknown sample and each of these reference redshift bins. Under the assumption of linear biasing, we find
(1) $w_{ur}(\theta) = \int dz'\, n_u(z')\, n_r(z')\, b_u(z')\, b_r(z')\, w_{\rm DM}(\theta, z')$
where $n_u(z)$ and $n_r(z)$ are the unknown and reference sample redshift distributions (normalized to unity over the full redshift interval), $b_u(z)$ and $b_r(z)$ are the biases of the two samples, and $w_{\rm DM}(\theta, z)$ is the dark matter 2-point correlation function.
In this paper we implement three different clustering-based methods: Schmidt’s method (Schmidt et al., 2013), Ménard’s method (Ménard et al., 2013; Scottez et al., 2016), and Newman’s method (Newman, 2008; Matthews & Newman, 2010). We briefly describe each of the three methods. A comparison of the three methods is presented in Section 4.6. In the end, we opted to use Schmidt’s method for our fiducial analysis.
Schmidt’s method: In Schmidt et al. (2013), the authors use a “1-angular-bin” estimate of the cross-correlation signal. This is achieved by computing the number of sources of the unknown sample in a physical annulus around each individual object of the reference sample, from a minimum comoving distance $R_{\min}$ to a maximum distance $R_{\max}$. Our fiducial choice for the scales is from 500 kpc to 1500 kpc. (Even though these scales are clearly nonlinear, these nonlinearities do not have a significant impact on the methodology, as demonstrated in Schmidt et al. (2013) and in this paper. See Section 4.6 for a discussion concerning the choice of scales.) In addition, each object of the unknown sample is weighted by the inverse of the distance from the reference object, which has been shown to increase the S/N ratio of the measurement (Schmidt et al., 2013). We use the Davis & Peebles (Davis & Peebles, 1983) estimator of the cross-correlation signal,
(2) $\bar{w}_{ur} = \frac{N_{Rr}}{N_{Dr}}\, \frac{\int_{R_{\min}}^{R_{\max}} dR\, W(R)\, [D_u D_r(R)]}{\int_{R_{\min}}^{R_{\max}} dR\, W(R)\, [D_u R_r(R)]} - 1$
where $D_u D_r(R)$ and $D_u R_r(R)$ are respectively data-data and data-random pairs, and $W(R) \propto 1/R$ the weight function. The pairs are properly normalized through $N_{Dr}$ and $N_{Rr}$, corresponding to the total number of galaxies in the reference sample and in the reference random catalog.
The Davis & Peebles estimator is less immune to window function contamination than the Landy & Szalay estimator (Landy & Szalay, 1993), since it involves using a catalog of random points for just one of the two samples. We choose to use the Davis & Peebles estimator so as to avoid creating high-fidelity random catalogs for the DES Y1 source galaxy sample (our unknown sample): the selection function depends on local seeing and imaging depth, resulting in a complex spatial selection function. We therefore decide to use a catalog of random points only for the reference sample, whose selection function and mask are well understood (Elvin-Poole et al., 2017).
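As a rough illustration of the one-bin, inverse-distance-weighted Davis & Peebles estimate described above, the sketch below assumes that the pair separations have already been produced by some pair-finding step (not shown). The function name, the argument names, and the choice of Mpc units are illustrative assumptions and are not taken from the DES pipeline.

```python
def davis_peebles_weighted(dd_sep, dr_sep, n_ref, n_rand, r_min=0.5, r_max=1.5):
    """One-bin Davis & Peebles estimate with 1/R pair weighting.

    dd_sep : separations (Mpc) of unknown-reference data pairs
    dr_sep : separations (Mpc) of unknown-random pairs
    n_ref  : number of galaxies in the reference sample
    n_rand : number of points in the reference random catalog
    r_min, r_max : annulus limits in Mpc (0.5 and 1.5 mirror the fiducial scales)
    """
    def weighted_count(seps):
        # Each pair inside the annulus contributes 1/R instead of 1.
        return sum(1.0 / r for r in seps if r_min <= r <= r_max)

    dd = weighted_count(dd_sep)
    dr = weighted_count(dr_sep)
    if dr == 0.0:
        raise ValueError("no data-random pairs inside the annulus")
    # N_rand / N_ref rescales the random pair counts to the data density.
    return (n_rand / n_ref) * dd / dr - 1.0
```

With four times as many randoms as reference galaxies, equal weighted pair counts per reference object give a vanishing signal, while an excess of data pairs gives a positive one.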
Assuming the reference sample is divided into sufficiently narrow bins centered at $z_i$, we can approximate $n_r(z) \approx N_r\, \delta_D(z - z_i)$ (with $\delta_D$ being Dirac’s delta distribution, and $N_r$ being the number of galaxies in the reference bin) and invert eq. (1) to obtain the redshift distribution of the unknown sample:
(3) $n_u(z_i) \propto \bar{w}_{ur}(z_i)\, \frac{1}{b_u(z_i)}\, \frac{1}{b_r(z_i)}\, \frac{1}{\bar{w}_{\rm DM}(z_i)}$
where barred quantities indicate they have been “averaged” over angular scales, reflecting the fact that we are using 1-angular-bin estimates of the correlation while weighting pairs by their inverse separation. The proportionality constant is obtained from the requirement that $n_u(z)$ has to be properly normalized.
In principle, the redshift evolution of the galaxy-matter biases and of $\bar{w}_{\rm DM}$ could be estimated by measuring the 1-bin autocorrelation function of both samples as a function of redshift:
(4) $\bar{w}_{rr}(z_i) = \int dz'\, [b_r(z')\, n_{r,i}(z')]^2\, \bar{w}_{\rm DM}(z')$
(5) $\bar{w}_{uu}(z_i) = \int dz'\, [b_u(z')\, n_{u,i}(z')]^2\, \bar{w}_{\rm DM}(z')$
where $n_{r,i}(z)$ and $n_{u,i}(z)$ are the redshift distributions of the reference and unknown samples binned into narrow bins centered at $z_i$. If the bins are sufficiently narrow so as to consider the biases and $\bar{w}_{\rm DM}$ to be constant over the distributions, they can be pulled out of the above integrals. Knowledge of the redshift distributions of the narrow bins is then required to use eqs. (4) and (5) to estimate $b_r$, $b_u$ and $\bar{w}_{\rm DM}$.
In our fiducial analyses we do not attempt to correct for the redshift evolution of the galaxy-matter bias and of the dark matter density field. Rather, we assume $b_r$, $b_u$ and $\bar{w}_{\rm DM}$ to be constant within each photo-z bin, and use the simulations to estimate the systematic error induced by this assumption. This choice is motivated and discussed in more detail in Section 5.1 and Appendix B. Under this assumption, eq. (3) reduces to:
(6) $n_u(z_i) \propto \bar{w}_{ur}(z_i)$
where the proportionality constant is again obtained by requiring a proper normalization for $n_u(z)$.
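Under the constant-bias assumption, as we read it, the recovered distribution is simply proportional to the measured one-bin cross-correlation amplitudes, so the only remaining step is the normalization. The sketch below shows that step for uniformly spaced reference bins; the function name and the rectangle-rule normalization are illustrative assumptions.

```python
def normalize_nz(z_centers, w_ur):
    """Rescale one-bin cross-correlation amplitudes so that sum(n * dz) = 1.

    z_centers : centers of the (assumed uniformly spaced) reference bins
    w_ur      : cross-correlation amplitude measured in each reference bin
    """
    if len(z_centers) != len(w_ur) or len(z_centers) < 2:
        raise ValueError("need matching sequences with at least two bins")
    dz = (z_centers[-1] - z_centers[0]) / (len(z_centers) - 1)
    norm = sum(w_ur) * dz
    if norm <= 0.0:
        raise ValueError("cannot normalize a non-positive total signal")
    # Noisy bins may be negative; they are rescaled but not clipped here.
    return [w / norm for w in w_ur]
```

Negative amplitudes from noise are deliberately kept, since clipping them would bias the mean of the recovered distribution.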
Ménard’s method (Ménard et al., 2013): the method of Ménard differs from that of Schmidt in how the 1-angular-bin estimate of the cross-correlation signal is measured. In particular, for Ménard’s method the correlation function is measured as a function of angle, and the recovered correlation function is averaged over angular scales via
(7) $\bar{w}_{ur} = \int_{\theta_{\min}}^{\theta_{\max}} d\theta\, W(\theta)\, w_{ur}(\theta)$
where $W(\theta)$ is a weighting function. We assume $W(\theta) \propto \theta^{-1}$ to increase the S/N. The integration limits in the integral in eq. (7) correspond to fixed physical scales (500 kpc to 1500 kpc). As can be seen, the primary difference between Ménard’s method and Schmidt’s method is whether one computes the angular correlation function first followed by a weighted integral over angular scales, or whether one performs a weighted integral of pairs first, and then computes the angular correlation function.
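To make the contrast with the pair-weighting approach concrete, the sketch below averages an already-measured angular correlation function with an inverse-angle weight. Normalizing the weight to unit integral is our assumption (it makes a constant signal average to itself), and the trapezoidal rule and function names are illustrative choices.

```python
def trapezoid(x, y):
    """Trapezoidal integral of samples y over abscissae x."""
    return sum(0.5 * (y[i] + y[i + 1]) * (x[i + 1] - x[i])
               for i in range(len(x) - 1))


def menard_average(theta, w_theta):
    """Average w(theta) over angle with a weight proportional to 1/theta.

    theta   : increasing angles spanning the chosen physical scales
    w_theta : cross-correlation function measured at those angles
    """
    weights = [1.0 / t for t in theta]
    weighted = [wt * w for wt, w in zip(weights, w_theta)]
    # Dividing by the integral of the weight normalizes W to unit area.
    return trapezoid(theta, weighted) / trapezoid(theta, weights)
```

Converting the fixed physical scales to angular limits at each reference redshift requires a cosmological distance calculation, which is left out of this sketch.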
Newman’s method (Newman, 2008; Matthews & Newman, 2010): this method assumes that all the correlation functions can be described by power laws $\xi(r) = (r/r_0)^{-\gamma}$. Adopting a linear bias model, this allows one to relate the measured cross-correlation signal to $n_u(z)$ and to quantities computable from a given cosmological model. Specifically, one has
(8) 
Here corresponds to the power law slope of the correlation function, while . and are respectively the angular size distance and the comoving distance at a given redshift.
We fit the observed cross-correlation signal using a function of the form . With respect to Schmidt’s and Ménard’s methods, we note that Newman’s implementation introduces two extra degrees of freedom ( and ). The index is estimated from the arithmetic mean of the indexes of the unknown and reference autocorrelation functions (see below). The parameters , and are obtained through chi-square minimization; we estimate the covariance needed for the fit through jackknife resampling. Setting our two expressions for equal to each other, and solving for the redshift distribution, we arrive at
(9) 
Under the assumption of linear bias, both the index of the cross-correlation function and its correlation length can be calculated from the unknown and reference autocorrelations. One has and . (We note that if we assume constant (scale-independent) bias, then . Nonetheless, we compute as the arithmetic mean of and to follow Matthews & Newman’s original recipe.) A first guess value for can be inserted in eq. (9) to estimate the redshift distribution, which can be inserted back in eq. (8) to refine the value of . The whole procedure is repeated until convergence.
2.2 Second step: correcting the photo-z posterior
Given an unknown galaxy sample, one can readily use photo-z techniques to estimate the corresponding redshift distribution. Here, we seek to use the redshift distribution recovered via cross-correlation to calibrate the photometric redshift bias of the photo-z posterior. We investigated two approaches:

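Shape matching criterion. Let $n_{\rm pz}(z)$ be the photo-z posterior for the unknown galaxy sample and $n_{\rm wz}(z)$ the redshift distribution recovered via cross-correlations. The corrected photo-z posterior is defined as $n_{\rm pz}(z - \Delta z)$, where $\Delta z$ is the photometric redshift bias. The photo-z bias is calibrated by matching the shapes of $n_{\rm pz}(z - \Delta z)$ and $n_{\rm wz}(z)$ within the redshift interval covered by the reference sample.

Mean matching criterion. Let $\langle z \rangle_{\rm wz}$ be the mean of $n_{\rm wz}(z)$ and $\langle z \rangle_{\rm pz}$ the mean of $n_{\rm pz}(z - \Delta z)$. The photo-z bias is calibrated by requiring $\langle z \rangle_{\rm wz}$ and $\langle z \rangle_{\rm pz}$ to match. Note that the means have to be estimated over the same redshift range.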
Quantitatively, the matching is done using a likelihood function to solve for the photometric redshift bias of the photo-z posterior. We recall that we do not attempt to debias higher order moments of the photo-z posterior, as the cosmological probes in the accompanying DES Y1 analysis are primarily sensitive to the mean of this distribution (DES Collaboration, 2017; Hoyle et al., 2017; Krause et al., 2017; Troxel et al., 2017). The log-likelihoods for the photometric redshift bias parameter under each of the two matching criteria are defined via
(10) 
and
(11) 
Note these likelihoods can account for the existence of a priori estimate of the photometric redshift bias . In the above equations, is a relative normalization factor that rescales , which is properly normalized to unity over the full redshift interval, to a distribution that is normalized to unity over the range of .
The quantity $\hat{\Sigma}$ in each of the likelihoods is the appropriate covariance matrix from the cross-correlation analysis. The covariance matrices are estimated from simulated data through a jackknife (JK) approach, using the following expression (Norberg et al., 2009):
(12) $\hat{\Sigma}(x_i, x_j) = \frac{N_{\rm JK} - 1}{N_{\rm JK}} \sum_{k=1}^{N_{\rm JK}} (x_i^k - \bar{x}_i)(x_j^k - \bar{x}_j)$
where the sample is divided into $N_{\rm JK}$ subregions of roughly equal area, $x_i^k$ is a measure of the statistic of interest in the $i$th bin of the $k$th sample, and $\bar{x}_i$ is the mean of our resamplings. The jackknife regions are safely larger than the maximum scale considered in our clustering analysis. The Hartlap correction (Hartlap et al., 2007) is used to compute the inverse covariance.
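A minimal sketch of the jackknife covariance and of the Hartlap factor follows, assuming the standard delete-one jackknife prefactor $(N_{\rm JK} - 1)/N_{\rm JK}$ of Norberg et al. (2009) and the usual Hartlap factor $(N_{\rm JK} - p - 2)/(N_{\rm JK} - 1)$ for $p$ data points. The function names and the list-of-lists layout are illustrative assumptions.

```python
def jackknife_covariance(samples):
    """Delete-one jackknife covariance matrix.

    samples[k][i] is the statistic in bin i measured on the k-th
    jackknife resampling (the full sample minus subregion k).
    """
    n_jk = len(samples)
    n_bins = len(samples[0])
    mean = [sum(s[i] for s in samples) / n_jk for i in range(n_bins)]
    prefactor = (n_jk - 1.0) / n_jk
    return [[prefactor * sum((s[i] - mean[i]) * (s[j] - mean[j])
                             for s in samples)
             for j in range(n_bins)]
            for i in range(n_bins)]


def hartlap_factor(n_jk, n_bins):
    """Factor multiplying the inverse of the estimated covariance."""
    if n_jk <= n_bins + 2:
        raise ValueError("too few resamplings for this many bins")
    return (n_jk - n_bins - 2.0) / (n_jk - 1.0)
```

The factor approaches unity when the number of resamplings greatly exceeds the number of bins, which is why many small jackknife regions are preferable as long as they stay larger than the clustering scales.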
Finally, despite the fact that our reference sample (redMaGiC galaxies) spans the redshift interval in our simulations, in practice, in the mean matching criterion, we restrict ourselves to a narrower redshift range, defined by the intersection of and , where is the root mean square of the redshift distribution . We have found that this choice increases the accuracy and robustness of our method by minimizing systematics (e.g. lensing magnification) associated with regions in which there is little intrinsic clustering signal. Section 4.4 quantifies the impact of this choice on our results. We do not shrink the interval used for matching under the shape matching criterion, as this procedure is inherently less sensitive to noise and biases in the tails.
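The mean matching criterion restricted to a redshift window can be sketched as follows. This is a simplified stand-in, not the likelihood analysis of the paper: it scans a grid of trial shifts and keeps the one whose shifted photo-z posterior has the windowed mean closest to that of the clustering-based distribution. The convention that the corrected posterior is the original one evaluated at z minus the shift, the linear interpolation, and all names are assumptions made for illustration.

```python
from bisect import bisect_right


def interp_zero_outside(z, n, x):
    """Linear interpolation of n(z) at x, returning 0 outside the grid."""
    if x < z[0] or x > z[-1]:
        return 0.0
    j = bisect_right(z, x)
    if j >= len(z):  # x coincides with the last grid point
        return n[-1]
    t = (x - z[j - 1]) / (z[j] - z[j - 1])
    return n[j - 1] + t * (n[j] - n[j - 1])


def window_mean(z, n, z_lo, z_hi):
    """Mean redshift of n(z) restricted to z_lo <= z <= z_hi."""
    num = den = 0.0
    for zi, ni in zip(z, n):
        if z_lo <= zi <= z_hi:
            num += zi * ni
            den += ni
    if den == 0.0:
        raise ValueError("distribution vanishes inside the window")
    return num / den


def mean_matching_shift(z, n_pz, n_wz, z_lo, z_hi, shifts):
    """Trial shift whose shifted n_pz best matches the windowed mean of n_wz."""
    target = window_mean(z, n_wz, z_lo, z_hi)

    def mismatch(dz):
        shifted = [interp_zero_outside(z, n_pz, zi - dz) for zi in z]
        return abs(window_mean(z, shifted, z_lo, z_hi) - target)

    return min(shifts, key=mismatch)
```

Both means are computed over the same window, as the criterion requires; a full analysis would replace the grid scan with the likelihood and covariance described in the text.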
One important feature of our analysis is that, when treating multiple WL source redshift bins, we treat each bin independently. In practice, there are clear statistical correlations between bins, as revealed by significant off-diagonal elements in the jackknife covariance matrix. However, as we demonstrate below, our analysis is easily systematics dominated. This has an important consequence, as we have found that attempting a simultaneous fit to all WL source redshift bins clearly produces incorrect results: systematic biases in one bin get propagated into a different bin via these off-diagonal terms, throwing off the best fit models for the ensemble. By treating each bin independently, we find that we can consistently recover numerically stable, accurate (though systematics dominated) estimates of the photometric redshift bias. (In principle, neglecting correlations between different bins should result in an underestimation of the statistical uncertainty. In practice, this effect is negligible.)
We sampled the likelihood in eqs. (10) and (11) using the affine-invariant Markov Chain Monte Carlo ensemble sampler emcee (Foreman-Mackey et al., 2013; http://dan.iel.fm/emcee). We assume non-informative flat priors for the free parameters of the fit.
3 Simulated Data
3.1 Buzzard simulations
We test our calibration procedure on the Buzzard-v1.1 simulation, a mock DES Y1 survey created from a set of dark-matter-only simulations. The simulation and creation of the mock survey data is detailed in DeRose et al. (2017); Wechsler et al. (2017); MacCrann et al. (2017), so we provide only a brief summary of both. Buzzard-v1.1 is constructed from a set of 3 N-body simulations run using L-GADGET2, a version of GADGET2 (Springel, 2005) modified for memory efficiency. The simulation boxes ranged from Gpc/h. Lightcones from each box were constructed on the fly. Halos were identified using ROCKSTAR (Behroozi et al., 2013), and galaxies were added to the simulations using the Adding Density Dependent GAlaxies to Lightcone Simulations algorithm (ADDGALS, Wechsler et al. 2017). ADDGALS uses the large scale dark matter density field to place galaxies in the simulation based on the probabilistic relation between density and galaxy magnitude. The latter is calibrated from subhalo abundance matching in high-resolution N-body simulations. Spectral energy distributions (SEDs) are assigned to the galaxies from a training set of spectroscopic data from SDSS DR7 (Cooper et al., 2011) based on local environmental density. The SEDs are integrated in the DES pass bands to generate griz magnitudes. Galaxy sizes and ellipticities are drawn from distributions fit to deep SuprimeCam band data. The galaxy positions, shapes and magnitudes are then lensed using the multiple-plane ray-tracing code Curved-sky grAvitational Lensing for Cosmological Light conE simulatioNS (CALCLENS, Becker 2013). Finally, the catalog is cut to the DES Y1 footprint, and photometric errors are added using the DES Y1 depth map (Rykoff et al., 2015).
3.2 Unknown sample in simulations - weak lensing source sample
We seek to mimic the selection and redshift distribution of the weak lensing source galaxies included in the DES Y1 METACALIBRATION shear catalog described in Zuntz et al. (2017). To do so, we apply flux and size cuts to the simulated galaxies that mimic the DES Y1 source selection thresholds. Each source has its redshift estimated, and is assigned a photometric redshift equal to the mean of the posterior redshift distribution. These redshifts are used to divide the source galaxies into four redshift bins corresponding to [(0.2-0.43), (0.43-0.63), (0.63-0.9), (0.9-1.3)]. Due to the limited redshift coverage of the redMaGiC reference sample, we only apply our method to the first three redshift bins. The number densities of the weak lensing sample in the simulations are for these source bins. The corresponding values of the DES Y1 shear catalog are . The lower number densities in the simulation have a negligible impact on the recovered statistical uncertainty, as the latter is dominated by the shot noise of the reference sample. Importantly, the shape of the redshift distribution as estimated by the photo-z codes in the simulations matches the data with good fidelity.
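The assignment of sources to redshift bins by the mean of their photo-z posterior can be sketched as below, reading the bin edges as 0.2, 0.43, 0.63, 0.9 and 1.3. The half-open interval convention (lower edge included, upper edge excluded) and the function name are our assumptions; the text does not specify how edge values are treated.

```python
from bisect import bisect_right

# Bin edges as read from the text; four bins in total.
BIN_EDGES = [0.2, 0.43, 0.63, 0.9, 1.3]


def source_bin(mean_photoz, edges=BIN_EDGES):
    """Return the 0-based bin index for a mean photo-z, or None if outside."""
    if mean_photoz < edges[0] or mean_photoz >= edges[-1]:
        return None
    # bisect_right counts the edges that are <= mean_photoz.
    return bisect_right(edges, mean_photoz) - 1


# Only the first three bins overlap the reference sample in redshift.
CALIBRATED_BINS = (0, 1, 2)
```

A source with mean photo-z of 1.0 lands in the fourth bin (index 3), which is then excluded from the calibration.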
Three different photo-z codes have been run on the simulated WL source samples:

Directional Neighborhood Fitting (DNF) (De Vicente et al., 2016). DNF is a machine learning algorithm for galaxy photometric redshift estimation. Based on a training sample, DNF constructs the prediction hyperplane that best fits the neighborhood of each target galaxy in multiband flux space. It then uses this hyperplane to predict the redshift of the target galaxy, which is used to divide the sample into bins. The key feature of DNF is the definition of a new neighborhood, the Directional Neighborhood. Under this definition two galaxies are neighbors not only when they are close in the Euclidean multiband flux space, but also when they have similar relative flux in different bands, i.e. colors. A random sample from the photo-z posterior of an object is approximated in the DNF method by the redshift of the nearest neighbor within the training sample, and it is used for the reconstruction.

Random Forest (RF) (e.g. Breiman, 2001; Carrasco Kind & Brunner, 2013; Rau et al., 2015). RF is a machine learning method that generates an ensemble of decision trees from bootstrap realizations of the training data. When a new galaxy is queried down each tree in the ensemble, the decision trees vote on the galaxy redshift. The final prediction of the Random Forest is generated from the average of all decision trees’ results. Both a mean redshift and a full probability distribution are generated for each galaxy.
Both the DNF and RF photo-z codes require us to define a training/validation sample. The sample is first defined in data. A catalog is built by collecting high quality spectra from more than 30 spectroscopic surveys overlapping the DES Y1 footprint and matching them to DES Y1 galaxies (Hoyle et al., 2017; Gschwend et al., 2017). This catalog is then used to define the training/validation sample in simulations, by selecting the nearest neighbors in magnitude and redshift space. The selection algorithm is applied in HEALPix pixels with resolution Nside=128 (0.2 deg$^2$), so as to mimic the geometry and selection effects of the spectroscopic surveys.
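HEALPix divides the sphere into 12 × Nside² equal-area pixels, so the pixel area quoted above can be checked with a few lines of arithmetic. The function name below is illustrative.

```python
import math


def healpix_pixel_area_deg2(nside):
    """Area in square degrees of one HEALPix pixel at resolution nside."""
    n_pix = 12 * nside ** 2
    # Full sky: 4*pi steradians converted to square degrees.
    full_sky_deg2 = 4.0 * math.pi * (180.0 / math.pi) ** 2
    return full_sky_deg2 / n_pix


area = healpix_pixel_area_deg2(128)  # roughly 0.21 square degrees
side = math.sqrt(area)               # equivalent square side, roughly 0.46 deg
```

The value of about 0.2 is therefore an area in square degrees, with a characteristic pixel side slightly under half a degree.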
The true redshift distributions of the sources binned according to each of the three photo-z codes are presented in Figure 1. In what follows, we will show quantitative results for BPZ and for only one of the two machine learning codes, namely DNF, as the RF code does not provide results significantly different from DNF.
3.3 Reference sample in simulations - redMaGiC galaxies
We use redMaGiC galaxies for our reference samples. These are luminous red galaxies selected as described in Rozo et al. (2016). The redMaGiC algorithm is designed to select galaxies with high quality photometric redshift estimates. This is achieved by using the red-sequence model that is iteratively self-trained by the redMaPPer cluster finding algorithm (Rykoff et al., 2014). redMaGiC imposes strict color cuts around this model to produce a luminosity-thresholded galaxy sample of constant comoving density. The algorithm has only two free parameters: the desired comoving density of the sample, and the minimum luminosity of the selected galaxies. The result is a pure sample of red-sequence galaxies with nearly Gaussian photometric redshift estimates that are both accurate and precise.
For this work we selected redMaGiC galaxies in the redshift interval 0.15 < z < 0.85, applying the luminosity cut of ; the resulting redshift distribution is shown in fig. 2. The reference sample is further split into 25 uniform redshift bins. In our simulation, the mask of the redMaGiC sample includes all the survey regions that reach sufficient depth to render the sample volume limited up to z = 0.85. Due to small differences in the evolution of the red sequence between the simulation and the data, the simulated redMaGiC sample has fewer galaxies than the data, reaching a maximum redshift of (instead of ). We expect statistical errors in this work to be overestimated by with respect to data (as our methodology is systematics dominated, this has a negligible impact on the results drawn in this paper). We note that to be consistent with the redshift interval considered here, the analysis in data has been performed cutting the redMaGiC sample at (Davis et al., 2017b).
The characteristic scatter and bias of the redMaGiC photometric redshifts found in the data are very closely reproduced by the simulations as can be seen in Figure 3. It should be noted that in the simulation we have the true redshifts of all redMaGiC galaxies, and thus can calculate the aforementioned statistics using the full sample, whereas in the data we only have an incomplete spectroscopic training set with which to make these measurements. Cawthon et al. (2017) discusses further validation of the robustness of these estimates in the data.
We also generate a catalog of random points for redMaGiC galaxies. redMaGiC randoms are generated uniformly over the footprint, as observational systematics (e.g. airmass, seeing) are not included in the simulation and for the simulated redMaGiC sample used in this analysis, number density does not correlate with variation in the limiting magnitude of the galaxy catalogs.
4 Systematic characterization
In this section we test our clustering calibration of DES Y1 photo-z redshift distributions. To assess the accuracy of the methodology, we consider the mean of the redshift distributions, computed over the full redshift interval (i.e., without restricting to the matching interval where we have reference coverage). Any residual difference in the mean between the calibrated photo-z posterior and the true distribution is interpreted as a systematic uncertainty, which is quantified through the metric
(13) 
We will refer to this metric as the “residual difference in the mean”. We recall that in the above equation is the mean of the photo-z posterior once the photo-z bias has been calibrated.
Systematic errors can arise if the clustering-based redshift distribution differs from the truth, owing to the fact that:

we are neglecting the redshift evolution of the galaxy-matter biases of both the weak lensing and redMaGiC samples (and of the dark matter density field); hereafter we will refer to this systematic as the bias evolution systematic;

we are using photo-z as opposed to true redshift to bin the reference sample, hereafter referred to as the redMaGiC photo-z systematic.
Moreover, when we correct the photo-z posterior using the clustering-based redshift distribution:

if the “shape matching” criterion is used, differences between the shapes of and could impact the recovered photometric redshift bias, as the criterion does not impose any requirement on the mean of . An incorrect shape of the photo-z posterior could also affect the “mean matching” criterion, as the matching is performed within of , and the photo-z posterior outside this interval cannot be calibrated. Hereafter we will refer to this systematic as the shape systematic.
Below we introduce each of these systematics one at a time, computing each of their contributions to the total systematic error budget in our method. We will make the ansatz that our systematics can be treated as independent. We will come back to this assumption later in Section 4.5.
4.1 Bias evolution systematic
We first estimate errors due to bias evolution and evolution in the clustering amplitude of the density field. We apply our method to a nearly ideal scenario in which the source galaxy distribution is binned in redshift bins according to the mean of the photo-z posterior as estimated by each of the photo-z codes we consider. We use the true redshifts of the redMaGiC reference sample when applying our cross-correlation method. We also assume that the photo-z posterior $n_{\rm pz}(z)$ of each redshift bin is identical to the true redshift distribution.
Our results are shown in the upper panels of fig. 4, labeled “scenario A”, and the residual shifts in the mean after the calibration are summarized in the first row of tables 2 and 2. If the calibration procedure were not affected by systematic errors, we should recover residual shifts in the mean compatible with zero. However, the values in tables 2 and 2 are as large as , owing to an incorrect estimate. All the residual shifts are substantially larger than the typical statistical uncertainty of the measurement. The specific values of the shifts vary depending on the photo-z code (as the codes can select slightly different populations of galaxies) and on the redshift bin. The two matching criteria do not always lead to the same residual shifts: in our calibration procedure, matching the shapes of two different distributions is not expected to give the same photo-z bias obtained by matching their means.
We demonstrate in 5.1 that correcting for the redshift evolution of the biases and of the clustering amplitude of the density field accounts for the observed residual shifts . This evolution can be readily estimated in our simulated data, but is difficult to account for in the real world. Therefore, the residual shifts reported in tables 2 and 2 represent the systematic error on the photo bias calibration due to the bias evolution systematic.
Lastly, we note that in fig. 4 the clusteringbased estimate recovers a spurious signal (in the form of a positive tail at high redshift) for the first redshift bin, which may potentially be explained by lensing magnification effects (see 5.2). We note, however, that the shape matching procedure is quite insensitive to biases in the tails, as the photo posterior goes to zero. Likewise, the mean matching within of is insensitive to the tails.
4.2 redMaGiC photo-z systematic
Next, we relax the assumption that we have true redshifts for the reference redMaGiC sample. Naively, we expect any photometric redshift biases in redMaGiC to imprint themselves on the clustering result. We repeat the same analysis as before, only now using photometric rather than true redshifts for the redMaGiC galaxies. Since this run is still affected by bias evolution, we are interested in the change of the residual shifts relative to those of the previous section.
Results are shown in the central panels of fig. 4, labeled “scenario B”, while the changes in the residual shifts () are summarized in the second row of tables 2 and 2. These changes correspond to the values of the redMaGiC photo systematic. Note that we do not show the statistical uncertainty for this systematic: as the residual shifts for scenario A and B are highly correlated (since they are estimated using similar data covariances), the statistical uncertainty of their difference is close to zero.
A comparison with the values obtained for the bias evolution systematic shows that the redMaGiC photo-z systematic is subdominant.
4.3 Shape systematic
Relative to the previous run, in which the photo posterior was assumed to be the true redshift distribution of the source galaxies, we now replace the shape of the photo posterior by the photometrically estimated from each of the photo codes we consider. This constitutes our most realistic scenario, as it suffers from all three systematics identified in this paper: bias evolution, redMaGiC photo, and shape systematic. Our results are shown in the lower panels of fig. 4, labeled “scenario C”. To disentangle the shape systematic from the other two, we compute the change of the residual shift in the mean relative to that obtained in the previous section. The changes in the residual shifts () are summarized in the third row of tables 2 and 2. As for the case of redMaGiC photo systematic, we do not show the statistical uncertainty, which should be close to zero.
We see from tables 2 and 2 that the shape systematic has a much stronger impact on the shape matching criterion than on the mean matching criterion. This is particularly evident in the second redshift bin, where the differences in shape between the photo-z posterior and the true/cross-correlation redshift distributions are especially pronounced. Note in particular the absence of a secondary low-redshift peak in the photo-z posteriors. Given the smaller shape systematic associated with the mean matching criterion, we adopt it as our fiducial matching criterion for the DES Y1 analysis.
Given that this last run (scenario C) includes all systematic uncertainties, we also report in tables 2 and 2 (fifth and sixth rows) the residual shift in the mean of the photo-z posterior before and after the calibration. Error bars account only for statistical uncertainty. In almost all cases the calibration procedure greatly reduces the residual shifts in the mean. In particular, for many of the bins the corrected redshift distributions are consistent with zero photometric redshift bias. We note that in the second redshift bin, although it might seem by eye (fig. 4) that the calibrated distribution differs from the true distribution, their means are correctly matched.
4.4 Dependence of the mean-matching criterion on the choice of redshift interval
We briefly discuss here our choice to apply the mean matching criteria only in the interval . The interval has been chosen as it roughly covers most of the range sampled by the true distribution, minimizing the impact of possible systematics affecting the tails of the recovered distribution.
We estimate the values of each systematic for different interval choices, namely , , as well as a run using all reference redshift bins. The results are shown in fig. 5. Variations in the values of the systematics are typically smaller than . However, there are two exceptions. In the first WL redshift bin, large intervals () include in the analysis the positive tail that appears in the clusteringbased estimate at high redshift. This substantially affects the bias evolution systematic. In the second WL redshift bin, the redshift interval is narrow enough that the secondary peak in the redshift distribution is not included in the analysis. This omission introduces a larger shape systematic.
To accommodate the impact of the choice of interval in the crosscorrelation measurement into our systematic error budget, for each weak lensing source bin we have opted to estimate the systematic using both our and cuts, always adopting the largest of the two systematic error estimates.
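As a minimal sketch of how a mean-matching shift can be evaluated inside a restricted redshift window, the toy example below uses two Gaussian distributions. Everything in it is an assumption made for illustration: the function names, the brute-force grid search, the convention that the photo-z posterior is shifted as n(z - dz), and the window edges, none of which are taken from the DES Y1 pipeline.

```python
import numpy as np

def windowed_mean(z, nz, z_lo, z_hi):
    """Mean redshift of n(z), using only grid points with z_lo <= z <= z_hi."""
    mask = (z >= z_lo) & (z <= z_hi)
    return np.sum(z[mask] * nz[mask]) / np.sum(nz[mask])

def mean_matching_shift(z, n_pz, n_wz, z_lo, z_hi, trial_shifts):
    """Trial shift dz for which n_pz(z - dz) best matches the windowed mean
    of n_wz. Brute-force grid search, for illustration only."""
    target = windowed_mean(z, n_wz, z_lo, z_hi)
    best_dz, best_diff = None, np.inf
    for dz in trial_shifts:
        # Evaluate the shifted posterior n_pz(z - dz) on the common grid.
        shifted = np.interp(z - dz, z, n_pz, left=0.0, right=0.0)
        diff = abs(windowed_mean(z, shifted, z_lo, z_hi) - target)
        if diff < best_diff:
            best_dz, best_diff = dz, diff
    return best_dz

# Toy inputs (hypothetical): the photo-z posterior sits 0.05 too low.
z = np.linspace(0.0, 1.5, 1501)
n_wz = np.exp(-0.5 * ((z - 0.50) / 0.10) ** 2)  # stand-in clustering-based n(z)
n_pz = np.exp(-0.5 * ((z - 0.45) / 0.10) ** 2)  # stand-in photo-z posterior
trial_shifts = np.arange(-0.10, 0.1001, 0.001)

# Two interval choices; their difference probes the sensitivity to the cut.
shift_narrow = mean_matching_shift(z, n_pz, n_wz, 0.30, 0.70, trial_shifts)
shift_wide = mean_matching_shift(z, n_pz, n_wz, 0.20, 0.80, trial_shifts)
```

In this idealized case both windows return essentially the same shift. With a spurious tail or a secondary peak in one of the distributions the two would differ, which is the kind of sensitivity the interval test above is designed to expose.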
4.5 Total systematic error budget
We choose not to correct for the biases found in Sections 4.1, 4.2 and 4.3, thereby not taking advantage of the fortuitous cancellations measured in the simulations. Instead, we treat each of the shifts reported in tables 2 and 2 as a systematic error, and add them in quadrature to produce our final systematic error estimate. This assumes that the three sources of systematic error are independent. We do not expect any correlation between the redMaGiC photo-z errors and either the WL galaxy–matter bias or the WL photo-z posterior. There might be slight (anti-)correlations between the bias evolution systematic and the shape systematic, if the photo-z code misplaces a population of galaxies with a given bias outside the matching interval. However, assuming a correlation coefficient of (or ) between these two systematics has a negligible impact on the total error budget, so we ignore this effect.
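The quadrature sum described above, together with the option of a non-zero correlation between the bias evolution and shape terms, can be sketched as follows. The function names and the numerical inputs are hypothetical and are not values from our tables; the sketch only illustrates the arithmetic.

```python
import math

def largest_across_cuts(*estimates):
    """Largest absolute systematic among estimates from different interval cuts."""
    return max(abs(x) for x in estimates)

def total_systematic(bias_evolution, redmagic_photoz, shape, rho=0.0):
    """Combine three systematic errors in quadrature.

    rho is an assumed correlation coefficient between the bias evolution
    and shape systematics; rho = 0 corresponds to independent errors.
    """
    a, b, c = abs(bias_evolution), abs(redmagic_photoz), abs(shape)
    variance = a * a + b * b + c * c + 2.0 * rho * a * c
    return math.sqrt(variance)

# Illustrative numbers only (not taken from the paper).
bias_evol = largest_across_cuts(0.002, 0.003)  # keep the larger of two cuts
independent = total_systematic(bias_evol, 0.0, 0.004)
correlated = total_systematic(bias_evol, 0.0, 0.004, rho=0.5)
```

For |rho| <= 1 the variance is never negative, so the square root is always defined.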
4.6 Choice of method and angular scales
Throughout this paper we have adopted as our fiducial clustering-based method the one introduced by Schmidt et al. (2013), and considered physical scales between 500 kpc and 1500 kpc. We test the impact of these choices by recomputing the residual shifts in the mean for one of the dominant systematics (the bias evolution systematic) with different choices of physical scales and methods.
Figure 6 shows the residual shifts using the Schmidt, Ménard and Newman methods (outlined in Section 2.1), for the following scales: 200–1250 kpc (i.e., small scales only), 1250–8000 kpc (i.e., large scales only), 200–8000 kpc, and 500–1500 kpc (our fiducial choice for this work).
We find that the Ménard and Schmidt methods perform similarly. Small differences arise because of how the two methods average over angular scales. We also find that the Newman method results in the noisiest estimates, as its implementation introduces two new degrees of freedom that the Schmidt/Ménard approaches do not have ( and , see Section 2.1). For the largest angular scales (1250–8000 kpc), the reconstruction is so noisy that it fails to provide corrections to the photo-z posteriors, as the MCMC chains fail to converge. The noisier estimates are due to the power-law fits: some bins can have degeneracies among power-law parameters (especially in the tails of the ), and in some cases the cross-correlation signal deviates from a pure power-law shape.
Results are compatible within statistical errors for different choices of angular scales. Only the shifts obtained using the Schmidt and Ménard methods over our nominal scales in the third photo-z bin differ significantly from the rest. However, the differences are safely within the bounds of our total systematic error. Fig. 6 suggests that nonlinear galaxy–matter bias (e.g. Smith et al., 2007) has a negligible impact on our methodology. At large scales the S/N deteriorates, so we have chosen not to use the largest scales (>1500 kpc) in our fiducial analysis.
Fig. 6 suggests that using scales as small as 200 kpc improves the statistical and systematic uncertainties of our method relative to the 500 kpc inner scale radius. However, the differences are small. We have opted to use the larger 500 kpc radius to avoid possible systematic uncertainties that may arise in the data but not in our simulation. In particular, photometric contamination from nearby galaxies becomes more important as the inner scale radius is reduced (see Applegate et al., 2014; Choi et al., 2016; Morrison & Hildebrandt, 2015; Simet & Mandelbaum, 2015). Our choice of a 500 kpc inner radius is meant to safeguard our results against any such neighboring biases.
5 Discussion
5.1 Impact of galaxy–matter biases on clustering-based methods
We now demonstrate our previous claim that the bias evolution is responsible for the systematic shifts we observed in 4.1, when we used true redshifts for the reference sample and the true redshift distribution as .
In our standard methodology we have chosen not to correct for the redshift evolution of the galaxy–matter bias of both samples (nor for the evolution of the dark matter density field, although the latter effect is generally subdominant; Ménard et al. 2013). This approximation holds as long as the biases evolve on scales larger than the typical width of the unknown distribution. In this sense, it is clear that binning the unknown sample into narrow bins through photo-z or color selection helps to reduce the impact of bias evolution (Ménard et al., 2013; Schmidt et al., 2013; Rahman et al., 2015). If the bins are not sufficiently narrow, neglecting the bias evolution leads to systematic shifts.
We estimate the redshift evolution of the galaxy–matter bias and dark matter density field from the autocorrelation functions of the reference and unknown samples, as suggested in Ménard et al. (2013) and explained in Section 2.1. To this end, we bin both samples in 25 equally spaced thin bins from z=0.15 to z=0.85 using their true redshifts, and then measure the 1-bin autocorrelation functions of the samples. If the bins are sufficiently narrow, and each bin has a top-hat shape, the autocorrelation functions are simply proportional to , as shown in Section 2.1. We caution that we use uniform randoms to compute the WL autocorrelation functions: although the WL sample selection function used in the simulation only roughly mimics the one applied to the data, using uniform randoms is only approximately correct. With this caveat in mind, we estimate the redshift distribution using the following estimator:
(14) 
By using the new estimator, we can obtain residual shifts which are compatible with zero (see the values reported in table 3 and fig. 7) within statistical uncertainty. The correction induced by including in the estimator the term accounts for most of the bias evolution systematic, indicating that the major contribution to the systematic is due to the WL sample. The correction induced by the term is negligible.
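To illustrate why dividing out the 1-bin autocorrelations can remove the bias evolution, the sketch below assumes linear, scale-independent biases, so that in each thin bin the cross-correlation scales as n_u b_u b_r w_DM and the autocorrelations as b_r^2 w_DM and b_u^2 w_DM. Under that assumption the ratio w_ur / sqrt(w_rr w_uu) is proportional to n_u. This is a commonly used form and may differ in detail from eq. (14); the function name, the toy bias models, and all numbers are hypothetical.

```python
import numpy as np

def bias_corrected_nz(w_ur, w_rr, w_uu, z_edges):
    """Redshift distribution from cross- and 1-bin autocorrelation amplitudes.

    Assumes w_ur ~ n_u * b_u * b_r * w_dm, w_rr ~ b_r**2 * w_dm and
    w_uu ~ b_u**2 * w_dm in each thin bin (linear bias, equal bin widths).
    """
    raw = np.asarray(w_ur) / np.sqrt(np.asarray(w_rr) * np.asarray(w_uu))
    return raw / np.sum(raw * np.diff(z_edges))  # normalize to unit integral

# 25 equally spaced bins between z = 0.15 and z = 0.85, as in the text.
z_edges = np.linspace(0.15, 0.85, 26)
zc = 0.5 * (z_edges[:-1] + z_edges[1:])
dz = np.diff(z_edges)

# Toy truth and toy evolution models (illustrative only).
n_true = np.exp(-0.5 * ((zc - 0.5) / 0.1) ** 2)
n_true /= np.sum(n_true * dz)
b_u = 0.8 + 1.5 * zc       # unknown (WL) sample bias
b_r = 1.5 + 0.5 * zc       # reference sample bias
w_dm = 1.0 / (1.0 + zc)    # stand-in for dark matter clustering amplitude

w_ur = n_true * b_u * b_r * w_dm
w_rr = b_r ** 2 * w_dm
w_uu = b_u ** 2 * w_dm

n_corrected = bias_corrected_nz(w_ur, w_rr, w_uu, z_edges)
n_naive = w_ur / np.sum(w_ur * dz)  # estimator that ignores bias evolution

mean_true = np.sum(zc * n_true * dz)
shift_naive = np.sum(zc * n_naive * dz) - mean_true
shift_corrected = np.sum(zc * n_corrected * dz) - mean_true
```

In this toy setup the naive estimator is shifted in the mean because the product b_u b_r w_dm grows with redshift, whereas the corrected estimator recovers the input distribution by construction.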
We emphasize that this estimator can be implemented only in simulations, since in data we do not have access to the true redshifts needed to bin the samples. In Appendix B we show an alternative correction obtained by binning the samples using their photo-z. This correction only works for the redMaGiC bias, and we decided not to implement it as its impact is negligible.
The bias evolution of the unknown sample constitutes one of the major issues for clustering-based methods, and it is one of the dominant sources of systematic error in our work. It is worth noting that in our simulation the bias evolution can be complex (as can be inferred from the middle panels of fig. 7), and is therefore not especially suited to correction using simple parametric approaches (e.g. Matthews & Newman, 2010; Schmidt et al., 2013; Davis et al., 2017a).
As the clustering amplitudes of galaxies have been found to depend on galaxy type, color, and luminosity (Zehavi et al., 2002; Hogg et al., 2003; Blanton et al., 2006; Coil et al., 2006; Cresswell & Percival, 2009; Marulli et al., 2013; Skibba et al., 2014), further splitting the unknown sample into smaller subsamples with similar color/luminosity properties (together with thinner binning in redshift space) should alleviate the impact of bias evolution (Scottez et al., 2017; van Daalen & White, 2017). We also note that one could break the degeneracy between galaxy bias and redshift distribution using other probes, such as galaxy-galaxy lensing (Prat et al., 2017).
5.2 Impact of lensing magnification on clustering-based methods
It is well known that lensing magnification (Narayan, 1989; Villumsen et al., 1997; Moessner & Jain, 1998) can lead to a change in the observed spatial density of galaxies. The enhancement in the flux of magnified galaxies can locally increase the number density, as more galaxies pass the selection cuts/detection threshold of the sample; at the same time, the same volume of space appears to cover a different solid angle on the sky, generally causing the observed number density to decrease. The net effect is driven by the slope of the luminosity function, and it has an impact on the measured clustering signal (Bartelmann & Schneider, 2001; Scranton et al., 2005; Ménard et al., 2010; Morrison et al., 2012). For the WL samples, size bias can also be important (Schmidt et al., 2009), but it will not be considered here.
In the context of clustering-based redshift estimates, lensing magnification has generally been ignored (Matthews & Newman 2010; Johnson et al. 2017; van Daalen & White 2017, but see McQuinn & White 2013; Choi et al. 2016). Ménard et al. (2013) state that the amplitude of the magnification effect on arcminute scales is generally negligible compared to the clustering signal of overlapping samples, and that it depends only mildly on redshift. However, magnification may become dominant in the regimes where the unknown and reference samples do not overlap (i.e., in the tails).
We try here to estimate the impact of lensing magnification on the recovered clusteringbased . The impact of lensing magnification on the galaxy overdensity is
(15) 
where is the galaxy overdensity, while is the overdensity induced by lensing magnification effects. The crosscorrelation signal between two different samples is therefore
(16) 
The first term on the right side of eq. (16) is associated with the clustering due to gravitational interactions, and disappears in the case of no redshift overlap between the unknown and reference samples. All of our methodology described in Section 2.1 assumes this term to be the dominant one. The second and third terms correspond to the lensing magnification contribution. The fourth term is generally small and can be neglected (Duncan et al., 2014).
Using the Limber and flat-sky approximations (e.g. Hui et al., 2007; Loverde et al., 2008; Choi et al., 2016), the first clustering term in the above expression can be modeled via:
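For orientation, the decomposition described in the text can be written schematically as below. The symbols are assumed notation chosen for this sketch (the subscripts "obs", "g" and "μ" denote the observed, intrinsic clustering, and magnification-induced overdensities, and 1, 2 label the two samples); they are not necessarily those used in eqs. (15) and (16).

```latex
\begin{align}
  \delta_{\rm obs} &= \delta_{g} + \delta_{\mu}, \\
  \langle \delta_{{\rm obs},1}\, \delta_{{\rm obs},2} \rangle
    &= \langle \delta_{g,1}\, \delta_{g,2} \rangle
     + \langle \delta_{g,1}\, \delta_{\mu,2} \rangle
     + \langle \delta_{\mu,1}\, \delta_{g,2} \rangle
     + \langle \delta_{\mu,1}\, \delta_{\mu,2} \rangle .
\end{align}
```

In this ordering the first term is the intrinsic clustering, the second and third are the clustering-magnification cross terms, and the fourth is the magnification-magnification term.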
(17) 
The terms and indicate the galaxy–matter bias of the two samples; is the comoving distance, is the Hubble expansion rate at redshift . is the zeroth order Bessel function. is the 3D matter power spectrum at wavenumber k (which, in the Limber approximation, is set equal to ) and at the cosmic time associated with redshift z.
Under the approximation of weak gravitational lensing, the terms due to lensing magnification in eq. (16) can be written as
(18) 
The subscripts 1 and 2 are such that eq. (18) can refer either to the term or to . The term is the slope of the cumulative number counts evaluated at flux limit of the sample “2”. The slope of the cumulative number counts is formally defined for a flux limited sample as
(19) 
where is the cumulative number counts as a function of magnitude , and is to be evaluated at the flux limit of the sample. The term is the lensing redshift weight function defined as:
(20) 
and are respectively the Hubble constant today and the scale factor.
Knowing the redshift distribution, the bias evolution and the slope of the cumulative number counts for the two samples, theoretical predictions for the expected clustering-based signal can be made through eq. (17) and eq. (18) and compared to the signal measured in simulations.
The true redshift distribution of the two samples is obtainable from the simulations. For the bias evolution, we make use of the 1-point estimate measured in Section 5.1, appropriately corrected for the contribution due to the dark matter density field. For the sake of simplicity, we do not propagate to the theoretical predictions the statistical uncertainty of the 1-point estimates of the two samples' biases.
Concerning the slope of the cumulative number counts, redMaGiC galaxies are in principle not a flux-limited sample (the sample is volume-limited up to z=0.85 and, in addition, galaxies are required to belong to the red sequence and to have a luminosity greater than a fixed threshold value; see Rozo et al., 2016). However, redMaGiC galaxies are binned in thin redshift bins; within each bin, the sample can be well approximated as flux limited. The thinner the bins, the better the approximation: this should be reflected in a sharp drop in the number counts as a function of magnitude. Therefore, for each bin, we evaluate the slope of the cumulative number counts using eq. (19) at the magnitude where the number counts start to drop.
For the weak lensing sample the selection is considerably more complex and eq. (19) cannot be directly applied. Fully characterizing the selection function for the weak lensing sample goes beyond the scope of this paper. We consider the predicted lensing signal for two characteristic values of the amplitude parameter , namely .
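A minimal sketch of how such a slope can be measured from a magnitude catalogue is given below. It assumes the commonly used convention α(m) = 2.5 d log10 N(<m) / dm, which may differ in notation from eq. (19), and evaluates the derivative by a central finite difference. The function name, the step size, and the synthetic catalogue are all hypothetical.

```python
import numpy as np

def number_count_slope(mags, m_lim, dm=0.2):
    """Slope alpha = 2.5 * d log10 N(<m) / dm at m_lim (central difference)."""
    mags = np.asarray(mags)
    n_hi = np.count_nonzero(mags < m_lim + 0.5 * dm)
    n_lo = np.count_nonzero(mags < m_lim - 0.5 * dm)
    if n_lo == 0:
        raise ValueError("no objects brighter than m_lim - dm/2")
    return 2.5 * (np.log10(n_hi) - np.log10(n_lo)) / dm

# Synthetic catalogue with N(<m) proportional to 10**(s * m), truncated at m_max.
s, m_max = 0.4, 22.0
u = (np.arange(200_000) + 0.5) / 200_000   # deterministic quantiles in (0, 1)
mags = m_max + np.log10(u) / s             # inverse-CDF sampling of the counts

# For pure power-law counts the estimate should be close to 2.5 * s.
alpha = number_count_slope(mags, m_lim=21.0)
```

In practice m_lim would be set, bin by bin, to the magnitude at which the number counts start to drop, as described above.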
The results of this procedure are shown in fig. 8. We see that the predicted magnification signal is qualitatively similar to the excess clustering observed in the simulations, suggesting that the excess shown at high redshift in the top-left panel of fig. 4 is indeed due to magnification induced by redMaGiC galaxies at high redshift. Magnification due to the WL sample acting as a source produces a noticeable effect only in the third bin, and the effect depends on the exact value of the amplitude parameter .
The result of this test shows that lensing magnification can have a non-negligible impact on the clustering-based estimate, mostly on the tails of the recovered distribution. It is worth stressing that the procedure presented in this paper is little affected by lensing magnification, as we cut out the tails from our analysis. We leave properly incorporating weak-lensing magnification effects into the analysis to future work.
6 Conclusions
Using numerical simulations, we characterize the performance of the clustering-based calibration of the Dark Energy Survey Year 1 (DES Y1) redshift distributions. Our standard calibration procedure is divided into two steps: first, the redshift distribution of a given science sample is estimated using a clustering-based method; second, this estimated redshift distribution is used to correct for an overall photometric redshift bias in the posterior of traditional photo-z algorithms.
We use redMaGiC galaxies as the reference sample for the clustering-based estimate. We show that our procedure can be applied in the case of partial overlap in redshift space between the reference sample and the science sample. As the science sample, we consider a simulated version of the DES Y1 weak lensing source galaxies, divided into three redshift bins. We present the results for the photo-z posteriors of two different photo-z codes (a template-based code, BPZ, and a machine learning code, DNF). The photo-z codes are also used to assign the weak lensing source galaxies to redshift bins, using their mean photo-z redshift.
We identify and characterize three main sources of systematic error in our methodology:

bias evolution systematic: systematic error induced by neglecting the redshift evolution of the galaxy–matter biases of the WL and redMaGiC samples and the evolution of the dark matter density field;

redMaGiC photo-z systematic: systematic caused by not using a spectroscopic sample as a reference;

shape systematic: systematic due to an incorrect shape of the photo-z posterior. This systematic is exacerbated if there is only a partial overlap between the redMaGiC and WL samples.
We find the bias evolution systematic (particularly, the effect due to the bias evolution of the WL sample) and the shape systematic to dominate the total error budget. We also find statistical uncertainties in our procedure to be subdominant with respect to systematic errors. Total systematic errors for our calibration procedure, as a function of WL source redshift bin and photo-z code, are provided in Section 4.5, and stand at the level of .
We further address the impact of changing our fiducial choices concerning the angular scales and method used for the clustering-based estimate, and discuss how our methodology could be improved. In particular, future work will have to deal efficiently with the redshift evolution of the galaxy–matter bias of the science sample. This could be achieved by further splitting the science sample into luminosity/color cells. Other probes, such as galaxy-galaxy lensing, could also be used to break the degeneracy between galaxy bias and redshift distribution. Lensing magnification, whose impact is marginal in this study, might no longer be negligible as survey requirements become more stringent. Lastly, we note that as clustering-based methods improve and systematic errors become subdominant with respect to statistical errors, full modeling of the cross-covariance between clustering-based and other 2-point correlation functions will be required so as not to bias the cosmological analysis.
The calibration strategy presented in this paper is fully implemented in the DES Y1 cosmic shear and combined two-point function analysis (Troxel et al., 2017; DES Collaboration, 2017). Its direct application to DES Y1 data is discussed in two companion papers (Davis et al., 2017b; Cawthon et al., 2017). Even though we show systematic errors to dominate over statistical uncertainties for this calibration procedure, this does not have negative implications for the DES Y1 cosmological analysis, which remains statistically dominated.
Acknowledgements
This paper has gone through internal review by the DES collaboration. It has been assigned DES paper id DES-2017-0261 and FermiLab Preprint number PUB-17-317-A-AE.
Support for DG was provided by NASA through Einstein Postdoctoral Fellowship grant number PF5-160138 awarded by the Chandra X-ray Center, which is operated by the Smithsonian Astrophysical Observatory for NASA under contract NAS8-03060. ER acknowledges support by the DOE Early Career Program, DOE grant DE-SC0015975, and the Sloan Foundation, grant FG-2016-6443.
Funding for the DES Projects has been provided by the U.S. Department of Energy, the U.S. National Science Foundation, the Ministry of Science and Education of Spain, the Science and Technology Facilities Council of the United Kingdom, the Higher Education Funding Council for England, the National Center for Supercomputing Applications at the University of Illinois at Urbana-Champaign, the Kavli Institute of Cosmological Physics at the University of Chicago, the Center for Cosmology and Astro-Particle Physics at the Ohio State University, the Mitchell Institute for Fundamental Physics and Astronomy at Texas A&M University, Financiadora de Estudos e Projetos, Fundação Carlos Chagas Filho de Amparo à Pesquisa do Estado do Rio de Janeiro, Conselho Nacional de Desenvolvimento Científico e Tecnológico and the Ministério da Ciência, Tecnologia e Inovação, the Deutsche Forschungsgemeinschaft and the Collaborating Institutions in the Dark Energy Survey.
The Collaborating Institutions are Argonne National Laboratory, the University of California at Santa Cruz, the University of Cambridge, Centro de Investigaciones Energéticas, Medioambientales y Tecnológicas-Madrid, the University of Chicago, University College London, the DES-Brazil Consortium, the University of Edinburgh, the Eidgenössische Technische Hochschule (ETH) Zürich, Fermi National Accelerator Laboratory, the University of Illinois at Urbana-Champaign, the Institut de Ciències de l’Espai (IEEC/CSIC), the Institut de Física d’Altes Energies, Lawrence Berkeley National Laboratory, the Ludwig-Maximilians Universität München and the associated Excellence Cluster Universe, the University of Michigan, the National Optical Astronomy Observatory, the University of Nottingham, The Ohio State University, the University of Pennsylvania, the University of Portsmouth, SLAC National Accelerator Laboratory, Stanford University, the University of Sussex, Texas A&M University, and the OzDES Membership Consortium.
Based in part on observations at Cerro Tololo Inter-American Observatory, National Optical Astronomy Observatory, which is operated by the Association of Universities for Research in Astronomy (AURA) under a cooperative agreement with the National Science Foundation.
The DES data management system is supported by the National Science Foundation under Grant Numbers AST-1138766 and AST-1536171. The DES participants from Spanish institutions are partially supported by MINECO under grants AYA2015-71825, ESP2015-88861, FPA2015-68048, SEV-2012-0234, SEV-2016-0597, and MDM-2015-0509, some of which include ERDF funds from the European Union. IFAE is partially funded by the CERCA program of the Generalitat de Catalunya. Research leading to these results has received funding from the European Research Council under the European Union’s Seventh Framework Program (FP7/2007-2013) including ERC grant agreements 240672, 291329, and 306478. We acknowledge support from the Australian Research Council Centre of Excellence for All-sky Astrophysics (CAASTRO), through project number CE110001020.
This manuscript has been authored by Fermi Research Alliance, LLC under Contract No. DE-AC02-07CH11359 with the U.S. Department of Energy, Office of Science, Office of High Energy Physics. The United States Government retains and the publisher, by accepting the article for publication, acknowledges that the United States Government retains a non-exclusive, paid-up, irrevocable, world-wide license to publish or reproduce the published form of this manuscript, or allow others to do so, for United States Government purposes.
This research used computing resources at SLAC National Accelerator Laboratory, and at the National Energy Research Scientific Computing Center, a DOE Office of Science User Facility supported by the Office of Science of the U.S. Department of Energy under Contract No. DE-AC02-05CH11231. This research was funded partially by the Australian Government through the Australian Research Council through project DP160100930.
References
 Aihara et al. (2017) Aihara H., et al., 2017, preprint, (arXiv:1702.08449)
 Applegate et al. (2014) Applegate D. E., et al., 2014, MNRAS, 439, 48
 Bartelmann & Schneider (2001) Bartelmann M., Schneider P., 2001, Phys. Rep., 340, 291
 Becker (2013) Becker M. R., 2013, MNRAS, 435, 115
 Behroozi et al. (2013) Behroozi P. S., Wechsler R. H., Wu H.-Y., 2013, ApJ, 762, 109
 Benítez (2000) Benítez N., 2000, ApJ, 536, 571
 Bezanson et al. (2016) Bezanson R., et al., 2016, ApJ, 822, 30
 Blanton et al. (2006) Blanton M. R., Eisenstein D., Hogg D. W., Zehavi I., 2006, ApJ, 645, 977
 Breiman (2001) Breiman L., 2001, Mach. Learn., 45, 5
 Carrasco Kind & Brunner (2013) Carrasco Kind M., Brunner R. J., 2013, MNRAS, 432, 1483
 Cawthon et al. (2017) Cawthon R., Davis C., Gatti M., Vielzeuf P., et al., 2017, to be submitted to PRD
 Choi et al. (2016) Choi A., et al., 2016, MNRAS, 463, 3737
 Coe et al. (2006) Coe D., Benítez N., Sánchez S. F., Jee M., Bouwens R., Ford H., 2006, AJ, 132, 926
 Coil et al. (2006) Coil A. L., Newman J. A., Cooper M. C., Davis M., Faber S. M., Koo D. C., Willmer C. N. A., 2006, ApJ, 644, 671
 Cooper et al. (2011) Cooper M. C., et al., 2011, ApJS, 193, 14
 Cresswell & Percival (2009) Cresswell J. G., Percival W. J., 2009, MNRAS, 392, 682
 Cunha et al. (2009) Cunha C. E., Lima M., Oyaizu H., Frieman J., Lin H., 2009, MNRAS, 396, 2379
 DES Collaboration (2017) DES Collaboration 2017, preprint (arXiv:1708.01530)
 Davis & Peebles (1983) Davis M., Peebles P. J. E., 1983, ApJ, 267, 465
 Davis et al. (2017a) Davis C., et al., 2017a, preprint, (arXiv:1707.08256)
 Davis et al. (2017b) Davis C., Gatti M., Vielzeuf P., Cawthon R., et al., 2017b, to be submitted to PRD
 De Vicente et al. (2016) De Vicente J., Sánchez E., Sevilla-Noarbe I., 2016, MNRAS, 459, 3078
 DeRose et al. (2017) DeRose J., Wechsler R., Rykoff E., et al., 2017, in prep.
 Duncan et al. (2014) Duncan C. A. J., Joachimi B., Heavens A. F., Heymans C., Hildebrandt H., 2014, MNRAS, 437, 2471
 Elvin-Poole et al. (2017) Elvin-Poole J., et al., 2017, preprint (arXiv:1708.01536)
 Foreman-Mackey et al. (2013) Foreman-Mackey D., Hogg D. W., Lang D., Goodman J., 2013, PASP, 125, 306
 Gschwend et al. (2017) Gschwend J., et al., 2017, in prep
 Hartlap et al. (2007) Hartlap J., Simon P., Schneider P., 2007, A&A, 464, 399
 Hildebrandt et al. (2010) Hildebrandt H., et al., 2010, A&A, 523, A31
 Hildebrandt et al. (2017) Hildebrandt H., et al., 2017, MNRAS, 465, 1454
 Hogg et al. (2003) Hogg D. W., et al., 2003, ApJ, 585, L5
 Hoyle et al. (2017) Hoyle B., et al., 2017, preprint (arXiv:1708.01532)
 Hui et al. (2007) Hui L., Gaztañaga E., Loverde M., 2007, Phys. Rev. D, 76, 103502
 Johnson et al. (2017) Johnson A., et al., 2017, MNRAS, 465, 4118
 Krause et al. (2017) Krause E., Eifler E., Zuntz J., Friedrich O., Troxel M., et al., 2017, preprint (arXiv:1706.09359)
 Landy & Szalay (1993) Landy S. D., Szalay A. S., 1993, ApJ, 412, 64
 Laureijs et al. (2011) Laureijs R., et al., 2011, preprint, (arXiv:1110.3193)
 Lilly et al. (2009) Lilly S. J., et al., 2009, ApJS, 184, 218
 Lima et al. (2008) Lima M., Cunha C. E., Oyaizu H., Frieman J., Lin H., Sheldon E. S., 2008, MNRAS, 390, 118
 Loverde et al. (2008) Loverde M., Hui L., Gaztañaga E., 2008, Phys. Rev. D, 77, 023512
 MacCrann et al. (2017) MacCrann N., DeRose J., Wechsler R., et al., 2017, to be submitted to PRD
 Marulli et al. (2013) Marulli F., et al., 2013, A&A, 557, A17
 Masters et al. (2017) Masters D. C., Stern D. K., Cohen J. G., Capak P. L., Rhodes J. D., Castander F. J., Paltani S., 2017, ApJ, 841, 111
 Matthews & Newman (2010) Matthews D. J., Newman J. A., 2010, ApJ, 721, 456
 McQuinn & White (2013) McQuinn M., White M., 2013, MNRAS, 433, 2857
 Ménard et al. (2010) Ménard B., Scranton R., Fukugita M., Richards G., 2010, MNRAS, 405, 1025
 Ménard et al. (2013) Ménard B., Scranton R., Schmidt S., Morrison C., Jeong D., Budavari T., Rahman M., 2013, preprint, (arXiv:1303.4722)
 Moessner & Jain (1998) Moessner R., Jain B., 1998, MNRAS, 294, L18
 Morrison & Hildebrandt (2015) Morrison C. B., Hildebrandt H., 2015, MNRAS, 454, 3121
 Morrison et al. (2012) Morrison C. B., Scranton R., Ménard B., Schmidt S. J., Tyson J. A., Ryan R., Choi A., Wittman D. M., 2012, MNRAS, 426, 2489
 Narayan (1989) Narayan R., 1989, ApJ, 339, L53
 Newman (2008) Newman J. A., 2008, ApJ, 684, 88
 Newman et al. (2013) Newman J. A., et al., 2013, ApJS, 208, 5
 Newman et al. (2015) Newman J. A., et al., 2015, Astroparticle Physics, 63, 81
 Norberg et al. (2009) Norberg P., Baugh C. M., Gaztañaga E., Croton D. J., 2009, MNRAS, 396, 19
 Prat et al. (2017) Prat J., et al., 2017, preprint (arXiv:1708.01537)
 Rahman et al. (2015) Rahman M., Ménard B., Scranton R., Schmidt S. J., Morrison C. B., 2015, MNRAS, 447, 3500
 Rau et al. (2015) Rau M. M., Seitz S., Brimioulle F., Frank E., Friedrich O., Gruen D., Hoyle B., 2015, MNRAS, 452, 3710
 Rozo et al. (2016) Rozo E., et al., 2016, MNRAS, 461, 1431
 Rykoff et al. (2014) Rykoff E. S., et al., 2014, ApJ, 785, 104
 Rykoff et al. (2015) Rykoff E. S., Rozo E., Keisler R., 2015, preprint (arXiv:1509.00870)
 Sánchez et al. (2014) Sánchez C., et al., 2014, MNRAS, 445, 1482
 Schmidt et al. (2009) Schmidt F., Rozo E., Dodelson S., Hui L., Sheldon E., 2009, Physical Review Letters, 103, 051301
 Schmidt et al. (2013) Schmidt S. J., Ménard B., Scranton R., Morrison C., McBride C. K., 2013, MNRAS, 431, 3307
 Scottez et al. (2016) Scottez V., et al., 2016, MNRAS, 462, 1683
 Scottez et al. (2017) Scottez V., Benoit-Lévy A., Coupon J., Ilbert O., Mellier Y., 2017, preprint (arXiv:1705.02629)
 Scranton et al. (2005) Scranton R., et al., 2005, ApJ, 633, 589
 Simet & Mandelbaum (2015) Simet M., Mandelbaum R., 2015, MNRAS, 449, 1259
 Skibba et al. (2014) Skibba R. A., et al., 2014, ApJ, 784, 128
 Smith et al. (2007) Smith R. E., Scoccimarro R., Sheth R. K., 2007, Phys. Rev. D, 75, 063512
 Spergel et al. (2013) Spergel D., et al., 2013, preprint (arXiv:1305.5422)
 Springel (2005) Springel V., 2005, MNRAS, 364, 1105
 The Dark Energy Survey Collaboration (2005) The Dark Energy Survey Collaboration 2005, ArXiv Astrophysics e-prints
 Troxel et al. (2017) Troxel M., et al., 2017, preprint (arXiv:1708.01538)
 Tyson et al. (2003) Tyson J. A., Wittman D. M., Hennawi J. F., Spergel D. N., 2003, Nuclear Physics B Proceedings Supplements, 124, 21
 Villumsen et al. (1997) Villumsen J. V., Freudling W., da Costa L. N., 1997, ApJ, 481, 578
 Wechsler et al. (2017) Wechsler R., DeRose J., Busha 2017, in prep.
 Zehavi et al. (2002) Zehavi I., et al., 2002, ApJ, 571, 172
 Zuntz et al. (2017) Zuntz J., Sheldon E., Samuroff S., et al., 2017, preprint (arXiv:1708.01533)
 de Jong et al. (2013) de Jong J. T. A., et al., 2013, The Messenger, 154, 44
 van Daalen & White (2017) van Daalen M. P., White M., 2017, preprint (arXiv:1703.05326)
Appendix A Results for a different redMaGiC galaxy sample
In this paper we have adopted redMaGiC galaxies as a reference sample, as opposed to the more standard choice of using spectroscopic samples (e.g. Ménard et al., 2013; Schmidt et al., 2013; Choi et al., 2016; Hildebrandt et al., 2017). This choice was driven mainly by the need to reduce the shot noise and cosmic variance that the use of a small spectroscopic sample would have implied. We showed in Section 4.2 that the systematic error induced by the redMaGiC photo-z is small compared to other sources of systematic uncertainty.
Although the statistical uncertainty is subdominant with respect to the systematic errors, we note that the constant comoving density cut (together with the luminosity threshold) used to select redMaGiC galaxies leads to large shot noise in the lowest redshift bins. We could instead select the redMaGiC sample by imposing a lower luminosity threshold and a higher comoving density, so as to reduce shot noise. We therefore create a combined redMaGiC galaxy sample, made of three subsamples selected as follows: 1) high density sample, 0.15<z<0.6, ; 2) high luminosity sample, 0.6<z<0.75, ; 3) higher luminosity sample, 0.75<z<0.85, . The latter corresponds to the sample used in the main analysis, but restricted to a smaller redshift interval.
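The construction of the combined sample can be sketched as follows. This is a minimal illustration, not the selection code used in the analysis: it assumes each subsample is already available as an array of redMaGiC redshifts selected with its own luminosity threshold and comoving density. The function name, the array inputs and the returned labels are hypothetical, and the upper edge of the third interval is taken to be 0.85.

```python
import numpy as np

def combine_subsamples(catalog_redshifts, z_ranges):
    """Keep, from each subsample, only the galaxies inside its redshift range.

    catalog_redshifts : list of 1-D arrays, one per subsample (hypothetical inputs)
    z_ranges          : list of (z_min, z_max) tuples, one per subsample
    Returns the concatenated redshifts and the index of the subsample
    each retained galaxy came from.
    """
    kept_z, kept_label = [], []
    for label, (z, (z_min, z_max)) in enumerate(zip(catalog_redshifts, z_ranges)):
        z = np.asarray(z, dtype=float)
        inside = (z > z_min) & (z < z_max)  # strict inequalities, as in the text
        kept_z.append(z[inside])
        kept_label.append(np.full(inside.sum(), label))
    return np.concatenate(kept_z), np.concatenate(kept_label)

# Redshift intervals quoted in the text for the three subsamples.
Z_RANGES = [(0.15, 0.60), (0.60, 0.75), (0.75, 0.85)]

# Toy redshifts standing in for the three redMaGiC selections.
toy_catalogs = [
    np.array([0.10, 0.20, 0.70]),  # high density selection
    np.array([0.50, 0.65, 0.80]),  # high luminosity selection
    np.array([0.70, 0.80, 0.90]),  # higher luminosity selection
]
combined_z, combined_label = combine_subsamples(toy_catalogs, Z_RANGES)
```

Keeping the subsample label alongside each redshift makes it straightforward to check afterwards how each source bin overlaps with each subsample.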
We repeated the full analysis for this new combined redMaGiC sample: results for the total systematic are summarized in Table 4. As compared to our fiducial analysis (Tables 2 and 2), we find larger systematics for the first WL source redshift bin and slightly smaller ones for the third bin. In general, lowering the luminosity threshold of the redMaGiC algorithm allows us to select more galaxies but, at the same time, increases the photometric error (and the redMaGiC photo-z systematic). Moreover, since the sample is now made of three subsamples, each characterized by a different luminosity, we might expect a non-negligible bias evolution for the reference sample. The increase in the photometric error particularly affects the first bin (as it overlaps mainly with the high density sample) and, together with the stronger bias evolution, leads to a larger total systematic error. As for the third bin, the stronger bias evolution of redMaGiC partly cancels the bias evolution of the weak lensing sample, reducing the bias evolution systematic and the total systematic error.
Given the larger impact of redMaGiC sample bias evolution and photometric errors on the total systematic budget, we preferred to use the higher luminosity sample for our analysis.
Appendix B Correcting the redshift evolution of the galaxy–matter bias with autocorrelations when spectroscopic redshifts are not available
In Section 5.1 we showed that we could remove the bias evolution systematic, to within statistical errors, if we could measure the autocorrelation functions of the two samples divided into thin top-hat redshift bins (i.e., binned using true redshifts). Unfortunately, this approach cannot be applied to data, since we only have access to the galaxies' photo-z. Nonetheless, we can ask whether the redshift evolution of the galaxy–matter bias could still be corrected by measuring the autocorrelation functions of the samples binned using photo-z.
In Fig. 9 we show what we would obtain if we binned the WL source samples using the 1-point estimates of the photo-z codes and measured the autocorrelation functions. This is compared to the results shown in Section 5.1, where the autocorrelation functions are binned using the galaxies' true redshifts (quantities in the plot correspond to the 1-bin version of the autocorrelation functions, averaged over angular scales as explained in Section 2.1).
Due to the poor quality of the source galaxies' photo-z, the two measurements are completely different: not only can they span a different redshift range, but their redshift dependence is also completely dissimilar.
For the reference galaxies, the scenario is somewhat different (see Fig. 9). redMaGiC galaxies have high-quality, almost Gaussian photo-z, so we can in principle try to relate the two measurements. This can be done as follows: starting from eq. (1), the autocorrelation function included in the estimator proposed in Section 5.1 can be written as
(21) 
Eq. (21) involves the redshift distribution of the redMaGiC galaxies in a given reference bin and the reference sample galaxy–matter bias. If we assume that the galaxy–matter bias (and the growth factor) evolve with redshift only on scales larger than the reference bin width, so that they are approximately constant within the bin, we can rewrite eq. (21) as
(22) 
where the quantities outside the integral are now computed at the median redshift of the reference bin. This would allow us to relate the 1-bin estimates of the redMaGiC autocorrelation functions computed by binning in true redshift and in photo-z as follows:
(23) 
This correction requires knowledge of the true redshift distribution of the reference sample binned using redMaGiC photo-z. This is usually not available in data, but an estimate can be obtained from the subsample of redMaGiC galaxies with spectra.
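The chain of reasoning from eq. (21) to eq. (23) can be illustrated with the following worked sketch. It is not a transcription of the paper's equations: it assumes that eq. (1) has the usual linear-bias form, in which the angular correlation is an integral over redshift of the product of the two redshift distributions, the two biases and the dark matter correlation. All symbols below are notational assumptions: n_{r,i} is the normalized true-redshift distribution of reference bin i, b_r the reference galaxy–matter bias, \bar{w}_{\rm DM} the scale-averaged dark matter correlation, and z_i the median redshift of the bin. The sketch also assumes that the bin selected by true redshift and the bin selected by photo-z share the same median redshift.

```latex
\begin{align}
% Autocorrelation of reference bin i under the assumed linear-bias form of eq. (1)
\bar{w}_{rr}(z_i) &= \int dz'\, \left[ b_r(z')\, n_{r,i}(z') \right]^2 \bar{w}_{\rm DM}(z') \\
% Bias and dark matter clustering taken as constant across the bin
&\simeq b_r^2(z_i)\, \bar{w}_{\rm DM}(z_i) \int dz'\, n_{r,i}^2(z') \\
% Ratio of the same expression for true-redshift and photo-z binning
\bar{w}_{rr}^{\rm true}(z_i) &\simeq \bar{w}_{rr}^{\rm photo}(z_i)\,
\frac{\int dz'\, \left[ n_{r,i}^{\rm true}(z') \right]^2}
     {\int dz'\, \left[ n_{r,i}^{\rm photo}(z') \right]^2}
\end{align}
```

Under these assumptions the prefactor b_r^2(z_i) \bar{w}_{\rm DM}(z_i) cancels in the ratio, so only the two redshift distributions are needed. For a normalized top-hat of width \Delta z the numerator is simply 1/\Delta z, while photo-z scatter broadens n^{\rm photo} and lowers the denominator, so the correction factor is larger than one.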
Fig. 9 shows that the redMaGiC autocorrelation function binned by true redshift is recovered precisely using this procedure.
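A numerical version of this correction might look like the sketch below. It is a hedged illustration, not the code used for Fig. 9: it assumes the correction is the ratio of the integrals of the squared, normalized redshift distributions, with a top-hat for the true-redshift bin, and it estimates the photo-z binned distribution from a histogram of true (e.g. spectroscopic) redshifts. The function names and arguments are hypothetical.

```python
import numpy as np

def integral_n_squared(z_true, grid_edges):
    """Estimate the integral of n(z)^2 from a histogram of true redshifts.

    The histogram is normalized to unit integral. With few galaxies per
    cell, shot noise biases this estimate high, so the grid should be
    coarse enough for every occupied cell to be well populated.
    """
    n_of_z, edges = np.histogram(z_true, bins=grid_edges, density=True)
    return float(np.sum(n_of_z**2 * np.diff(edges)))

def correct_autocorrelation(w_photo, z_true_in_photo_bin, bin_lo, bin_hi, grid_edges):
    """Rescale a photo-z binned 1-bin autocorrelation to true-redshift binning.

    w_photo             : autocorrelation measured in the photo-z selected bin
    z_true_in_photo_bin : true redshifts of galaxies selected by photo-z
    bin_lo, bin_hi      : edges of the corresponding top-hat true-redshift bin
    """
    tophat_term = 1.0 / (bin_hi - bin_lo)  # integral of n^2 for a normalized top-hat
    photo_term = integral_n_squared(z_true_in_photo_bin, grid_edges)
    return w_photo * tophat_term / photo_term

# Toy check: true redshifts spread evenly over twice the nominal bin width.
grid = np.linspace(0.28, 0.34, 7)
z_broad = 0.29 + (np.arange(4000) + 0.5) * 1e-5   # uniform over 0.29 < z < 0.33
w_corrected = correct_autocorrelation(0.1, z_broad, 0.30, 0.32, grid)
```

In this toy case the photo-z selected galaxies cover a range twice as wide as the top-hat bin, so the measured autocorrelation is rescaled by a factor of two.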
Affiliations
Institut de Física d’Altes Energies (IFAE), The Barcelona Institute of Science and Technology, Campus UAB, 08193 Bellaterra (Barcelona) Spain
Kavli Institute for Particle Astrophysics & Cosmology, P. O. Box 2450, Stanford University, Stanford, CA 94305, USA
Kavli Institute for Cosmological Physics, University of Chicago, Chicago, IL 60637, USA
Universitäts-Sternwarte, Fakultät für Physik, Ludwig-Maximilians Universität München, Scheinerstr. 1, 81679 München, Germany
Department of Physics, Stanford University, 382 Via Pueblo Mall, Stanford, CA 94305, USA
Centro de Investigaciones Energéticas, Medioambientales y Tecnológicas (CIEMAT), Madrid, Spain
Institute of Space Sciences, IEEC-CSIC, Campus UAB, Carrer de Can Magrans, s/n, 08193 Barcelona, Spain
Department of Physics, University of Arizona, Tucson, AZ 85721, USA
Institució Catalana de Recerca i Estudis Avançats, E-08010 Barcelona, Spain
Department of Physics and Astronomy, University of Pennsylvania, Philadelphia, PA 19104, USA
Laboratório Interinstitucional de e-Astronomia - LIneA, Rua Gal. José Cristino 77, Rio de Janeiro, RJ - 20921-400, Brazil
Observatório Nacional, Rua Gal. José Cristino 77, Rio de Janeiro, RJ - 20921-400, Brazil
SLAC National Accelerator Laboratory, Menlo Park, CA 94025, USA
Department of Physics & Astronomy, University College London, Gower Street, London, WC1E 6BT, UK
Department of Physics, ETH Zurich, Wolfgang-Pauli-Strasse 16, CH-8093 Zurich, Switzerland
Fermi National Accelerator Laboratory, P. O. Box 500, Batavia, IL 60510, USA
Center for Cosmology and AstroParticle Physics, The Ohio State University, Columbus, OH 43210, USA
Department of Physics, The Ohio State University, Columbus, OH 43210, USA
ARC Centre of Excellence for All-sky Astrophysics (CAASTRO)
School of Mathematics and Physics, University of Queensland, Brisbane, QLD 4072, Australia
Centre for Astrophysics & Supercomputing, Swinburne University of Technology, Victoria 3122, Australia
Sydney Institute for Astronomy, School of Physics, A28, The University of Sydney, NSW 2006, Australia
Australian Astronomical Observatory, North Ryde, NSW 2113, Australia
The Research School of Astronomy and Astrophysics, Australian National University, ACT 2601, Australia
Purple Mountain Observatory, Chinese Academy of Sciences, Nanjing, Jiangshu 210008, China
Cerro Tololo Inter-American Observatory, National Optical Astronomy Observatory, Casilla 603, La Serena, Chile
LSST, 933 North Cherry Avenue, Tucson, AZ 85721, USA
INAF - Osservatorio Astrofisico di Torino, Pino Torinese, Italy
Department of Astronomy, University of Illinois, 1002 W. Green Street, Urbana, IL 61801, USA
National Center for Supercomputing Applications, 1205 West Clark St., Urbana, IL 61801, USA
George P. and Cynthia Woods Mitchell Institute for Fundamental Physics and Astronomy, and Department of Physics and Astronomy, Texas A&M University, College Station, TX 77843, USA
Department of Physics, IIT Hyderabad, Kandi, Telangana 502285, India
Department of Physics, California Institute of Technology, Pasadena, CA 91125, USA
Jet Propulsion Laboratory, California Institute of Technology, 4800 Oak Grove Dr., Pasadena, CA 91109, USA
Department of Astronomy, University of Michigan, Ann Arbor, MI 48109, USA
Department of Physics, University of Michigan, Ann Arbor, MI 48109, USA
Instituto de Fisica Teorica UAM/CSIC, Universidad Autonoma de Madrid, 28049 Madrid, Spain
Department of Astronomy, University of California, Berkeley, 501 Campbell Hall, Berkeley, CA 94720, USA
Lawrence Berkeley National Laboratory, 1 Cyclotron Road, Berkeley, CA 94720, USA
Astronomy Department, University of Washington, Box 351580, Seattle, WA 98195, USA
Santa Cruz Institute for Particle Physics, Santa Cruz, CA 95064, USA
Argonne National Laboratory, 9700 South Cass Avenue, Lemont, IL 60439, USA
Departamento de Física Matemática, Instituto de Física, Universidade de São Paulo, CP 66318, São Paulo, SP, 05314-970, Brazil
Department of Astrophysical Sciences, Princeton University, Peyton Hall, Princeton, NJ 08544, USA
Institute of Cosmology & Gravitation, University of Portsmouth, Portsmouth, PO1 3FX, UK
Brookhaven National Laboratory, Bldg 510, Upton, NY 11973, USA
School of Physics and Astronomy, University of Southampton, Southampton, SO17 1BJ, UK
Instituto de Física Gleb Wataghin, Universidade Estadual de Campinas, 13083-859, Campinas, SP, Brazil
Computer Science and Mathematics Division, Oak Ridge National Laboratory, Oak Ridge, TN 37831
Excellence Cluster Universe, Boltzmannstr. 2, 85748 Garching, Germany
Max Planck Institute for Extraterrestrial Physics, Giessenbachstrasse, 85748 Garching, Germany