# Accidental deep field bias in CMB T and SNe z correlation

###### Abstract

Evidence presented by Yershov, Orlov and Raikov apparently showed that the WMAP/Planck cosmic microwave background (CMB) pixel-temperatures (T) at supernovae (SNe) locations tend to increase with increasing redshift (z). They suggest this correlation could be caused by the Integrated Sachs-Wolfe effect and/or by some unrelated foreground emission. Here, we assess this correlation independently using Planck 2015 SMICA R2.01 data and, following Yershov et al., a sample of 2783 SNe from the Sternberg Astronomical Institute. Our analysis supports the prima facie existence of the correlation but attributes it to a composite selection bias (high CMB T combined with high SNe z) caused by the accidental alignment of seven deep survey fields with CMB hotspots. These seven fields contain 9.2 per cent of the SNe sample (256 SNe). Spearman’s rank-order correlation coefficient indicates the correlation present in the whole sample (, p-value ) is insignificant for a sub-sample of the seven fields together (, p-value ) and entirely absent for the remainder of the SNe (, p-value ). We demonstrate the temperature and redshift biases of these seven deep fields, and estimate that the likelihood of their falling on CMB hotspots by chance is at least 6.8 per cent (approximately 1 in 15). We show that a sample of 7880 SNe from the Open Supernova Catalogue exhibits the same effect and we conclude that the correlation is an accidental but not unlikely selection bias.

###### keywords:

cosmology: cosmic background radiation – cosmology: observations – supernovae: general – methods: statistical – surveys

Publication year: 2018

## 1 Introduction

Observations of the cosmic microwave background (CMB) and Type Ia supernovae (SNIa) are exceptional probes of cosmological parameters. The measurements of the CMB by the WMAP satellite (Bennett et al., 2013) and of SNIa by the High-z Supernova Search Team (Riess et al., 1998) and the Supernova Cosmology Project (Perlmutter et al., 1999) have established the six-parameter ΛCDM cosmological model. Precision measurements of the CMB temperature, polarisation, and lensing anisotropies from the South Pole Telescope, Planck satellite, and Atacama Cosmology Telescope (e.g., Story et al., 2013; Planck Collaboration et al., 2016c; Louis et al., 2017, and references therein) have strongly reinforced the preference for ΛCDM as the concordance model of cosmology.

Cross-correlation of CMB observations with the large-scale structure (LSS) of the Universe, revealed by surveys such as the Sloan Digital Sky Survey (SDSS, Alam et al., 2015) and the Dark Energy Survey (DES, DES Collaboration et al., 2016), provides powerful tests of ΛCDM (e.g., Giannantonio et al., 2016; DES Collaboration et al., 2017). Increased efforts are currently being made to calibrate the distance-redshift relation accurately and precisely using SNIa up to high redshifts (e.g., LSST Science Collaboration et al., 2009; Kessler et al., 2015) in order to understand the expansion history and late-time () accelerated expansion of the Universe attributed to dark energy.

Yershov et al. (2012, 2014) combined CMB and supernovae (SNe) data and detected a correlation between the CMB temperature anisotropies (T) and the redshift (z) of the SNe (CMB T versus SNe z). High-z SNe appear to be preferentially associated with hotter CMB temperatures (Fig. 1). This effect was particularly strong for SNIa. They concluded that the correlation is not caused by the Sunyaev-Zel'dovich (SZ) effect (Sunyaev & Zel'dovich, 1970) and suggested it may instead be caused by the Integrated Sachs-Wolfe (ISW) effect (Sachs & Wolfe, 1967) or some remnant contamination in the CMB data, possibly from low-redshift foreground (Yershov et al., 2014).

In this paper we re-analyse the SNe samples of Yershov et al. (2012, 2014) and offer an alternative explanation for the correlation, namely that it is a composite selection bias caused by the chance alignment of certain deep survey fields with CMB hotspots. This bias (high CMB T combined with high SNe z) is the combined result of a selection bias (high-z SNe in deep fields) and the chance alignment of those deep fields with CMB hotspots.

The remainder of this paper describes our analyses of the reported correlation. In section 2 we describe the data and summarise the variety of methods used to demonstrate the prima facie existence of the correlation in these data. We present the results from re-analysing the SNe sample of Yershov et al. (2012, 2014) in section 3. Specifically, we identify SNe fields with a high surface density of SNe (3.1) and show that these cause the apparent correlation (3.2) due to their bias to hotter CMB temperature and higher redshift than the remainder of the SNe sample (3.3). We quantify the likelihood of this bias occurring by chance: at least 6.8 per cent, or approximately 1 in 15 (3.4). We present corroborating results from analysing alternative data in section 4. In section 5 we conclude that the correlation reported by Yershov et al. (2012, 2014) is actually an accidental but not exceptionally unlikely composite selection bias and we briefly speculate on further potential implications for cosmology.

## 2 Data and methods

### 2.1 SNe and CMB data

We used CMB data from the Planck 2015 (Planck Collaboration et al., 2016a) maps (http://irsa.ipac.caltech.edu/data/Planck/release_2/all-sky-maps/matrix_cmb.html), specifically SMICA R2.01 with . These maps are provided and analysed using the Hierarchical Equal Area iso-Latitude Pixelisation scheme (HEALPix, http://healpix.jpl.nasa.gov/; Górski et al., 2005). Our temperature distribution was consistent with that previously determined by Yershov et al. (2014) using the Planck 2013 (Planck Collaboration et al., 2014a) SMICA R1.20 map.

The Planck component-separated CMB maps were produced using four techniques: Commander (Eriksen et al., 2008), NILC (Delabrouille et al., 2009), SEVEM (Fernández-Cobos et al., 2012) and SMICA (Cardoso et al., 2008). Planck Collaboration et al. (2016b) provide a critical analysis of the applicability of the resultant four 2015 maps, and confirm that, as in 2013 (Planck Collaboration et al., 2014b), SMICA is preferred for high-resolution temperature analysis.

As recommended by Planck Collaboration et al. (2016b), for analysing component-separated CMB temperature maps, we used the Planck UT78 common mask. This is the union of the Commander, SEVEM, and SMICA confidence masks. UT78 excludes point sources, some of the Galactic plane (note also our subsequent Galactic latitude restriction), and some other bright regions. It has a fraction of unmasked pixels of . Note that our results were consistent using an alternative mask (UTA76), and without masking.

Our initial analysis used SNe data provided by Yershov et al. (2014) as supplementary data (https://academic.oup.com/mnras/article-lookup/doi/10.1093/mnras/stu1932), derived from the Sternberg Astronomical Institute (SAI) Supernova Catalogue (http://www.sai.msu.su/sn/sncat/; Bartunov et al., 2007) as of October 2013. This provides a sample of 6359 SNe of all types. To avoid contamination from the Galactic plane we restricted this sample to high Galactic latitude , the same conservative restriction used by Yershov et al. (2014). We could not reproduce the identical sample for the redshift restriction apparently used by Yershov et al. (2014), so we adopted a restriction of , which yielded a similar sample size to theirs. We excluded SNe on masked (Planck UT78) HEALPix pixels. The resultant SAI sample contained 2783 SNe.

Above redshift , SNe in the SAI sample become rather sparse and are predominantly associated with hotter than average CMB temperature. To analyse this high-redshift region further, we obtained data from the Open Supernova Catalogue (OSC, https://sne.space/; Guillochon et al., 2017) as of June 2017. We removed SNe without co-ordinate information, those not yet confirmed as SNe () and gamma ray bursts (). These selections provide a sample of 12879 SNe of all types. We also restricted this sample to high Galactic latitude and excluded SNe on masked (Planck UT78) HEALPix pixels. The resultant OSC sample contained 7880 SNe.

Unless specified otherwise, all analysis in this paper was performed on the SAI sample after the restrictions on Galactic latitude (), redshift (), and Planck UT78 confidence mask. We repeated our analyses using the OSC sample (section 4.2) with the same restrictions to verify that our results are not specific to the SAI sample.

### 2.2 Methods

We constructed Fig. 1 broadly following Yershov et al. (2014). We determined the temperature of the CMB map pixel at each SN location; we follow the common practice of referring to temperature anisotropies ($\Delta T$) as temperature ($T$). The data were grouped into redshift bins of width and the weighted mean CMB temperature of each bin ($\overline{T}_w$) was calculated as

$$\overline{T}_w = \frac{\sum_{i=1}^{n} w_i T_i}{\sum_{i=1}^{n} w_i} \qquad (1)$$

where $T_i$ is the individual pixel temperature, $w_i$ is the weight of each pixel, and $n$ is the number of SNe per bin. Error bars are the standard error on the weighted bin mean (), calculated from the weighted variances as

(2) |

where is our variance estimate for each pixel. Note that for bins with only one SN, error bars are .

Individual pixel variances ($\sigma_i^2$) were estimated (following private communications with Planck Legacy Archive and NASA/IPAC Infrared Science Archive) by producing a squared, smoothed ( FWHM) half-mission half-difference (HMHD) map from the two Planck 2015 SMICA R2.01 half-mission maps. Weights ($w_i$) are thus

$$w_i = \frac{1}{\sigma_i^2} \qquad (3)$$
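The binning and weighting described above can be sketched in Python as follows. This is a minimal illustration rather than the pipeline used for Fig. 1: the function name and the default bin width are hypothetical, and both the inverse-variance weights and the use of $1/\sqrt{\sum w_i}$ as the standard error of the bin mean are assumptions.

```python
import numpy as np

def binned_weighted_mean(z, temp, var, bin_width=0.01):
    """Inverse-variance weighted mean temperature per redshift bin.

    Returns a list of (bin_centre, weighted_mean, standard_error, n_sne).
    bin_width is a placeholder value, not necessarily the one used in the paper.
    """
    z, temp, var = (np.asarray(a, dtype=float) for a in (z, temp, var))
    w = 1.0 / var                               # assumed inverse-variance weights
    idx = np.floor(z / bin_width).astype(int)   # bin index of every SN
    rows = []
    for k in np.unique(idx):
        m = idx == k
        wsum = w[m].sum()
        mean = (w[m] * temp[m]).sum() / wsum    # weighted bin mean
        sem = np.sqrt(1.0 / wsum)               # assumed standard error of that mean
        rows.append(((k + 0.5) * bin_width, mean, sem, int(m.sum())))
    return rows
```

With these assumptions a bin containing a single SN simply inherits that pixel's own standard deviation as its error bar.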

The choice of smoothing scale and the method of estimating pixel variance affects the resultant weights. We tested a number of these and our results are consistent. See Appendix B for versions of Fig. 1 produced using different smoothing scales (none, , 0.5, and 5 FWHM) and different estimates of individual pixel variance (Planck 2013 R1.20 SMICA map noise, Planck 2015 R2.01 SMICA HMHD and HRHD maps, and Planck 2015 R2.02 143GHz and 217GHz frequency maps).

We fitted an ordinary least squares (OLS) linear regression to the binned data (dashed line) using Python’s statsmodels.api.OLS, and calculated its slope plus the standard error on the gradient. We followed the same method to calculate the OLS linear regression gradient throughout this paper. Note that results using weighted least squares (WLS) linear regression (weighted by or ), and results using OLS and WLS linear regression of unbinned data, were consistent.
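For readers who want to see what the fit returns, the sketch below computes the OLS slope and its standard error in closed form with NumPy. It is an illustrative stand-in, not the statsmodels call itself; the function name is hypothetical. A statsmodels OLS fit reports the same quantities through its `params` and `bse` attributes.

```python
import numpy as np

def ols_gradient(x, y):
    """Slope, intercept and standard error of the slope for y = a + b*x."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    n = x.size
    sxx = np.sum((x - x.mean()) ** 2)
    slope = np.sum((x - x.mean()) * (y - y.mean())) / sxx
    intercept = y.mean() - slope * x.mean()
    resid = y - (intercept + slope * x)
    s2 = np.sum(resid ** 2) / (n - 2)   # residual variance with n - 2 degrees of freedom
    return slope, intercept, np.sqrt(s2 / sxx)
```

Here `x` would hold the redshift bin centres and `y` the weighted mean CMB temperature of each bin.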

Several of our analyses compared various SNe sub-samples using the parametric independent 2-sample Welch’s t-test (or unequal variance t-test, Welch, 1938) and the non-parametric 1-sided Mann-Whitney U (MWU) test. We used the unequal variance t-test to account for the different angular extent of the deep survey fields, and hence difference in variance of their CMB temperature. It also accommodates the wide variation in the number of SNe per redshift bin, and the resultant differing variance of both their CMB temperature and their redshift. The MWU test does not assume that the population follows any specific parameterised distribution, unlike the t-test which assumes a normal distribution, and it is less sensitive to outliers than the t-test.

The t-test gives the probability (p-value) of obtaining SNe sub-samples with differences in mean CMB temperature (or SNe redshift) at least as extreme as those observed, assuming the null hypothesis is true

$$H_0: \mu_1 = \mu_2 \qquad (4)$$

where $\mu_1$ and $\mu_2$ are the population means of the two sub-samples. In other words, it tests whether the means of their populations differ. The 2-sample Welch’s t-test was implemented using Python’s scipy.stats.ttest_ind with equal_var=False.
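A minimal sketch of the Welch test call is given below. The two arrays are synthetic, hypothetical temperatures drawn only to make the example runnable; they are not values from the SAI sample.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
# Hypothetical sub-samples: a small, hot field and a large, broad remainder.
field = rng.normal(100.0, 40.0, size=60)
remainder = rng.normal(10.0, 100.0, size=2000)

# equal_var=False selects Welch's unequal-variance form of the t-test.
t_stat, p_value = stats.ttest_ind(field, remainder, equal_var=False)
```

The unequal sample sizes and spreads in this toy input mirror the reason given above for preferring Welch's form over the pooled-variance test.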

MWU combines the sub-samples of CMB temperature (or SNe redshift), ranks the combined sample, and determines the mean of the ranks ($\overline{R}$) for each sub-sample. It gives the probability (p-value) of obtaining SNe samples with differences in mean ranks at least as extreme as those observed, assuming the null hypothesis is true

$$H_0: \overline{R}_1 = \overline{R}_2 \qquad (5)$$

In practice, this is generally interpreted as whether the distributions of the sub-samples differ, since the ranks of the sub-samples will differ if so. We used the 1-sided MWU to test the alternative hypothesis that the CMB temperature (or SNe redshift) distribution of one sub-sample was greater than that of the other

$$H_1: \overline{R}_1 > \overline{R}_2 \qquad (6)$$

The 1-sided MWU test was implemented using Python’s scipy.stats.mannwhitneyu with alternative=‘greater’.
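The 1-sided MWU call can be illustrated with a small example. The redshift lists are made-up values chosen so that every entry in the first list exceeds every entry in the second; they are not taken from the catalogue.

```python
from scipy import stats

# Hypothetical redshifts for a deep field and for the rest of the sample.
deep = [0.41, 0.55, 0.67, 0.89, 0.94]
rest = [0.02, 0.05, 0.11, 0.18, 0.23, 0.30]

# alternative="greater" tests whether `deep` is stochastically greater than `rest`.
u_stat, p_value = stats.mannwhitneyu(deep, rest, alternative="greater")
```

Because the test works on ranks only, replacing 0.94 with a far larger outlier would leave the result unchanged, which is the robustness property noted above.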

To analyse which modes dominate the apparent correlation (section 3.2.1) we filtered the map to remove large angular scales. We used Python’s healpy.sphtfunc.almxfl to apply a high pass filter to the spherical harmonic coefficients ($a_{\ell m}$) of the Planck 2015 SMICA map. We then computed the filtered map from these filtered $a_{\ell m}$ values. We repeated the OLS linear regression and Spearman’s rank-order correlation coefficient analyses for each filtered map.
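The filtering step can be sketched as below. This is an assumed implementation, not the authors' code: the function names and the `ell_min` parameter are hypothetical, a sharp top-hat cut is only one possible high-pass window, and healpy is imported inside the function so that the window helper can be used without it.

```python
import numpy as np

def high_pass_window(lmax, ell_min):
    """Multipole filter: 0 for ell < ell_min, 1 for ell >= ell_min (sharp cut, assumed)."""
    fl = np.ones(lmax + 1)
    fl[:ell_min] = 0.0
    return fl

def high_pass_map(sky_map, ell_min, lmax=None):
    """Remove angular scales with ell < ell_min from a HEALPix map."""
    import healpy as hp  # imported lazily; only needed when a map is filtered
    nside = hp.npix2nside(len(sky_map))
    lmax = 3 * nside - 1 if lmax is None else lmax
    alm = hp.map2alm(sky_map, lmax=lmax)                   # map -> a_lm
    alm = hp.almxfl(alm, high_pass_window(lmax, ell_min))  # multiply a_lm by f_ell
    return hp.alm2map(alm, nside, lmax=lmax)               # filtered a_lm -> map
```

Calling `high_pass_map` once per cut-off and re-sampling the temperature at the SNe pixels gives the inputs for the repeated regression and correlation analyses.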

In our analysis of likelihood (section 3.4) we performed randomisations of SNe locations within fields and of SNe field centres on the sky. We used Python’s random.uniform to select random HEALPix pixels, subject to the same Galactic latitude restriction () as the original sample. After each randomisation the masking (Planck UT78) was re-applied, the CMB temperature and variance were re-sampled, and the weights were re-calculated before the OLS linear regression was re-fitted.
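One way to draw random sky positions that respect a Galactic latitude cut is sketched below. It differs from the approach described above in that it samples continuous coordinates uniformly in area rather than picking HEALPix pixels; the function name and the `b_min_deg` parameter are hypothetical, and the numerical cut must be supplied by the caller.

```python
import numpy as np

def random_high_latitude_positions(n, b_min_deg, rng=None):
    """Draw n positions uniformly in area with |b| >= b_min_deg.

    Returns Galactic longitude l and latitude b in degrees.
    """
    rng = np.random.default_rng() if rng is None else rng
    s_min = np.sin(np.radians(b_min_deg))
    # Uniform in sin(b) is uniform in area; a random sign picks the hemisphere.
    sin_b = rng.uniform(s_min, 1.0, size=n) * rng.choice([-1.0, 1.0], size=n)
    b = np.degrees(np.arcsin(sin_b))
    l = rng.uniform(0.0, 360.0, size=n)
    return l, b
```

Each draw would then be followed by the re-masking, re-sampling and re-weighting steps listed above before the regression is re-fitted.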

We also assessed the likelihood by creating simulations of the CMB. We used Python’s healpy.sphtfunc.anafast to compute the power spectrum ($C_\ell$) of the original (unmasked) Planck 2015 SMICA map. Note that this extracts $C_\ell$ values from the given map and does not assume a particular power spectrum or underlying cosmology. We then used healpy.sphtfunc.synfast to generate new synthetic maps from these $C_\ell$ values, at full resolution FWHM, (Planck Collaboration et al., 2016b), to match our fiducial Planck 2015 SMICA map. After each simulation the CMB temperature was re-sampled. The SNe mask (Planck UT78), variances, and weights were left unchanged (as the SNe had not moved) and the OLS linear regression was re-fitted.

We apply these methods to the data and various randomisations in the following sections.

## 3 Results: deep field bias

Field | No. SNe | RA (J2000) h:m:s | Dec (J2000) d:m:s | Angular size | No. SNe deg⁻² | No. pixels | Deep survey field(s)
---|---|---|---|---|---|---|---
Field 1 | 50 | 14:19:28 | 52:40:28 | | 41.3 | 1599 | SNLS D3, EGS
Field 2 | 29 | 12:36:55 | 62:16:40 | | 181.3 | 221 | HDF-N, GOODS-N
Field 3 | 54 | 02:31:41 | -08:24:43 | | 13.5 | 4944 | ESSENCE wdd
Field 4 | 22 | 02:25:55 | -04:30:58 | | 18.2 | 1596 | SNLS D1
Field 5 | 64 | 02:07:53 | -04:19:04 | | 16.0 | 5000 | ESSENCE wcc, NDWFS
Field 6 | 21 | 22:15:36 | -17:42:17 | | 17.4 | 1593 | SNLS D4
Field 7 | 16 | 03:32:26 | -27:38:25 | | 100.0 | 219 | CDF-S, GOODS-S
Stripe 82 | 665 | 00:55:00 | 00:00:00 | | 2.8 | 352871 | SDSS Stripe 82
Remainder | 1862 | n/a | n/a | 14474 deg² | 0.2 | 16971444 | n/a

Yershov et al. (2012, 2014) detected a correlation between SNe redshifts and CMB temperature using OLS linear regression and Pearson’s correlation coefficient. We verified this correlation using the independent 2-sample Welch’s t-test, the 1-sided MWU test, and Spearman’s rank-order correlation coefficient, so we do not dispute that a correlation is found.

However, we do not believe that there is any astrophysical origin of the correlation and conclude that it is a composite selection bias caused by the chance alignment of certain deep survey fields with CMB hotspots. In this section we describe our identification of the deep survey fields in question and determine the significance of their contribution to the correlation. We show how their temperature and redshift biases cause this composite selection bias and we quantify the likelihood of it occurring by chance.

### 3.1 Identification of fields

SNe are transient objects, historically detected both by chance and by repeated observations of specific fields, galaxy clusters etc. Reliably detecting and following the lightcurves of SNe at high redshift requires particularly targeted approaches (e.g., Filippenko & Riess, 1998; Dawson et al., 2009). As a result SNe are not evenly detected across the sky and most SNe datasets are not spatially uniform. This situation is changing with wide and time-domain surveys such as DES (DES Collaboration et al., 2016).

We analysed the SAI sample to identify regions with a high surface density of SNe. Visual inspection of a Topcat Sky Plot suggested that defining these as regions containing SNe with an average surface density of SNe per square degree would be appropriate and productive. We placed no constraint on the overall angular size of the region. Our algorithm identified 7 SNe fields meeting these criteria, plus 2 additional fields within SDSS Stripe 82.

Table 1 lists the 7 SNe fields plus Stripe 82. These fields contain a total of 921 SNe (33.1 per cent), with Stripe 82 containing 665 SNe (23.9 per cent) and fields 1-7 containing 256 SNe (9.2 per cent) of the sample. For each field 1-7 we identified corresponding deep survey fields coincident with the SNe field. There were 3 fields from the Supernova Legacy Survey (SNLS, Astier et al., 2006), 2 from the ESSENCE supernova survey (Miknaitis et al., 2007), the Hubble Deep Field North (HDF-N, Williams et al., 1996), and the Chandra Deep Field South (CDF-S, Giacconi et al., 2001).

For each field 1-7 we defined a square in RA and Dec orientation consistent with the deep survey field footprint and encompassing the bulk of the SNe identified by our algorithm. For the majority of the fields the best fit was to increase the deep survey field edge lengths by 10 per cent. For field 2 and field 7, coincident with HDF-N and CDF-S respectively, the best fit was to rotate (to RA and Dec orientation) a square enclosing the deep survey field and increase the deep survey field edge lengths by 20 per cent. Note that localising our SNe fields to the corresponding deep survey fields in this way generally reduced their angular size and the number of SNe they contained, which in some cases reduced the number of SNe and/or their surface density below the initial detection thresholds used.

### 3.2 Contribution to correlation

To test whether these fields contribute to the correlation we compared the OLS linear regression both with and without them in the sample. We created ‘remainder’ samples containing SNe from the SAI sample minus those in fields 1-7 combined, minus those in Stripe 82, and minus those in fields 1-7 and Stripe 82 together. We calculated the OLS linear regression gradient and Spearman’s rank-order correlation coefficient of these remainders and compared them with those of the whole sample. Note that results using WLS linear regression, and values of Pearson’s correlation coefficient, were consistent.
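The with-and-without comparison can be sketched as follows. The helper name and the toy arrays in the usage note are hypothetical; only `scipy.stats.spearmanr` is a real library call.

```python
import numpy as np
from scipy import stats

def spearman_with_and_without(z, temp, in_fields):
    """Spearman coefficient and p-value for the whole sample, the flagged
    fields only, and the remainder after those fields are removed."""
    z = np.asarray(z, dtype=float)
    temp = np.asarray(temp, dtype=float)
    in_fields = np.asarray(in_fields, dtype=bool)
    subsets = {
        "whole": np.ones(z.size, dtype=bool),
        "fields": in_fields,
        "remainder": ~in_fields,
    }
    out = {}
    for label, mask in subsets.items():
        rho, p = stats.spearmanr(z[mask], temp[mask])
        out[label] = (float(rho), float(p))
    return out
```

For example, with `z = [0.1, 0.2, 0.3, 0.4, 0.9, 1.0, 1.1]`, `temp = [3, 1, 4, 2, 100, 101, 102]` and the last three entries flagged as field members, the whole sample shows a strong rank correlation while the remainder shows none, which is the pattern this section tests for.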

Fields removed | No. SNe | Gradient ± std. error (μK per unit redshift) | Corr. coeff. | p-value
---|---|---|---|---
None | 2783 | 61 ± 12 | 0.5 |
Fields 1-7 | 2527 | -2 ± 22 | 0.1 | 0.6
Stripe 82 | 2118 | 54 ± 14 | 0.4 |
Fields 1-7 & Stripe 82 | 1862 | -2 ± 25 | 0.0 | 0.7

Table 2 shows the gradient of the OLS linear regression for each remainder sample in units of μK per unit redshift, plus the standard error on the gradient. In these units the gradient of the whole sample is 61 ± 12, which is significantly above zero. Spearman’s rank-order correlation coefficient for the whole sample shows a moderate correlation (coefficient 0.5) which is statistically significant (p-value ).

SDSS Stripe 82 is the largest field we identified, both in terms of the number of SNe (665) and angular size. Therefore the SNe in Stripe 82 cover a wider variety of CMB pixels and any statistical contribution from them should be much less prone to selection bias. Indeed, removing Stripe 82 from the sample does not significantly affect the OLS linear regression gradient (54 ± 14) or Spearman’s rank-order correlation coefficient (coefficient 0.4, p-value ). However, removing fields 1-7 (256 SNe) reduces the gradient dramatically to -2 ± 22, consistent with zero, and there is no correlation evident (coefficient 0.1, p-value 0.6) in the remainder. Removing both fields 1-7 and Stripe 82 together has a similar effect.

The result of removing fields 1-7 from the SAI sample is illustrated in Fig. 3, which plots the weighted mean CMB temperature at SNe locations in redshift bins of . This plot is repeated for the whole sample (3(a)), fields 1-7 only (3(b)) and the remainder of the sample after fields 1-7 are removed (3(c)). Note that OLS linear regression gradients of unbinned data were consistent.

The OLS gradient and correlation present in the whole sample (gradient , , p-value ) are entirely absent in the remainder (gradient , , p-value ). The OLS gradient of the fields 1-7 sample is slightly positive () but Spearman’s rank-order correlation coefficient indicates that there is no significant correlation evident (, p-value ).

The data clearly indicate that the correlation is caused by fields 1-7 and that SDSS Stripe 82 does not contribute significantly.

#### 3.2.1 Angular scales

We checked whether large-scale hot/cold spots, or anisotropies on scales of (), dominate the apparent correlation. We filtered the Planck 2015 SMICA map to remove large angular scales (, , and ) and repeated the OLS linear regression and Spearman’s rank-order correlation coefficient analyses. In all cases there was no significant correlation evident (e.g., , , p-value ). The large angular scales are dominating, as expected, indicating that the CMB map pixels at SNe locations contribute no more than any other pixels within these scales.

### 3.3 Temperature and redshift

We investigated whether the CMB temperature at SNe locations and/or the redshift of SNe within fields 1-7 and Stripe 82 differ from those in the rest of the sample. We calculated the mean CMB temperature () and mean SNe redshift () for each sample compared with the remainder. We also analysed the CMB temperature and SNe redshift distributions using the independent 2-sample Welch’s t-test and 1-sided MWU test. Note that results using the median CMB temperature at SNe locations, and median SNe redshift, were consistent.

#### 3.3.1 Mean CMB T and mean SNe

Field | Mean CMB temperature at SNe (μK) | Mean CMB temperature of pixels (μK) | Mean SNe redshift
---|---|---|---
Field 1 | 29.1 ± 9.1 | 47.5 ± 1.7 | 0.55 ± 0.03
Field 2 | 92.9 ± 3.3 | 94.3 ± 1.7 | 0.94 ± 0.06
Field 3 | 130.8 ± 9.2 | 137.5 ± 1.0 | 0.41 ± 0.02
Field 4 | 85.3 ± 14.3 | 81.7 ± 1.7 | 0.55 ± 0.05
Field 5 | 49.7 ± 8.5 | 68.2 ± 1.2 | 0.45 ± 0.02
Field 6 | 199.3 ± 13.6 | 168.9 ± 1.9 | 0.67 ± 0.04
Field 7 | 120.5 ± 12.0 | 100.8 ± 4.0 | 0.89 ± 0.11
Fields 1-7 | 87.4 ± 5.0 | 101.5 ± 0.7 | 0.57 ± 0.02
Stripe 82 | 3.8 ± 4.0 | 17.2 ± 0.2 | 0.23 ± 0.01
Whole sample | 10.3 ± 2.0 | 3.1 ± 0.02 | 0.18 ± 0.00

The mean CMB temperature () at SNe locations, and of all CMB map HEALPix pixels within each sample, and mean SNe redshift () are specified in Table 3. The samples are fields 1-7 individually, fields 1-7 combined, Stripe 82, and the whole sample. Fig. 4 illustrates these distributions, namely CMB temperature at SNe locations (1), CMB temperature of all CMB map HEALPix pixels within each sample (2), and SNe redshift (3) in these samples. In both Table 3 and Fig. 4 SNe are restricted by Galactic latitude (), redshift (), and Planck UT78 confidence mask and CMB map HEALPix pixels are restricted by Galactic latitude () and Planck UT78 confidence mask.

SNe in all fields 1-7 are biased to CMB temperatures hotter than the mean of the whole sample (10.3 ± 2.0 μK). Fields 3, 6 and 7 are particularly extreme, with mean CMB temperatures at SNe locations of 130.8 ± 9.2 μK, 199.3 ± 13.6 μK, and 120.5 ± 12.0 μK respectively. SNe in SDSS Stripe 82 are not biased to CMB temperatures hotter than the mean of the whole sample. For all the fields, fields 1-7 and Stripe 82, the CMB temperature distribution (and mean) at SNe locations is generally representative of the CMB map HEALPix pixel temperature distribution (and mean) to within .

SNe in all fields 1-7 are also biased to higher redshift than the mean of the whole sample (0.18). Fields 2, 6 and 7 are particularly extreme, with mean SNe redshifts of 0.94 ± 0.06, 0.67 ± 0.04 and 0.89 ± 0.11 respectively. SNe in SDSS Stripe 82 are biased to a slightly higher redshift (0.23 ± 0.01) than the whole sample. However, although Stripe 82 is deeper it is not hotter, which explains why it does not significantly contribute to the correlation.

Field | CMB temperature p-value (t-test) | CMB temperature p-value (MWU) | SN redshift p-value (t-test) | SN redshift p-value (MWU)
---|---|---|---|---
Field 1 | | | |
Field 2 | | | |
Field 3 | | | |
Field 4 | | | |
Field 5 | | | |
Field 6 | | | |
Field 7 | | | |
Fields 1-7 | | | |
Stripe 82 | 0.7 | 0.2 | |

#### 3.3.2 MWU and t-test

We analysed whether the CMB temperature distribution at SNe locations and/or SNe redshift distribution within fields 1-7 and Stripe 82 differ from the rest of the sample using the independent 2-sample Welch’s t-test and 1-sided MWU test. To recap from section 2.2, Welch’s t-test (or unequal variance t-test) accommodates both the different angular extent of the deep survey fields, and hence difference in variance of the CMB temperature, and the variation in the number of SNe per redshift bin. The MWU test makes fewer assumptions (in particular, the t-test assumes a normal distribution) and is less sensitive to outliers than the t-test. Clearly not all the samples we tested are normally distributed (see Fig. 4) but we have included all the results for completeness.

We performed all the analysis using a constant ‘remainder’ sample created by removing fields 1-7 and Stripe 82 from the sample. This was tested against samples containing SNe from fields 1-7 individually, fields 1-7 combined, and Stripe 82. See Table 4 for the results (p-values). Note that comparison between the very small p-values is unlikely to be meaningful.

For CMB temperature both p-values for SDSS Stripe 82 and the MWU p-value for field 1 are above the per cent significance level. Therefore we cannot reject the null hypotheses that Stripe 82 has the same mean temperature and same temperature distribution as the remainder of the sample, nor the null hypothesis that field 1 has the same temperature distribution as the remainder of the sample.

However, for all the other tests the p-values indicate that the individual field samples do not have the same mean temperature as the remainder, and that the temperature distribution of the fields is significantly hotter than that of the remainder.

For SNe redshift all the p-values of all the samples indicate that the individual fields do not have the same mean redshift as the remainder, and that the redshift distribution of the fields is significantly higher than that of the remainder.

We have demonstrated that fields 1-7 are biased to hotter CMB temperatures, specifically at SNe locations but also at all CMB map HEALPix pixels within the fields. We believe this is the result of the chance alignment of those fields with CMB hotspots. This would not on its own be sufficient to lead to the correlation reported by Yershov et al. (2012, 2014). However, fields 1-7 are also biased to higher redshifts because they are the result of deep survey fields. The remainder of the SNe are generally lower redshift and are spread more uniformly across the sky, so they have a mean CMB temperature closer to the mean of the whole CMB map.

The composite effect is to introduce enough high-redshift SNe at locations of sufficiently high CMB temperature to skew all the analyses we have performed to demonstrate the presence of the correlation, namely OLS linear regression, Welch’s t-test, MWU test, and Spearman’s rank-order correlation coefficient. This effect was caused by 256 SNe, comprising 9.2 per cent of the restricted SAI sample of 2783 SNe.

### 3.4 Likelihood

We quantified the likelihood of this selection bias happening by chance by analysing the effect on the OLS gradient of moving SNe to random positions within fields 1-7, and by moving fields 1-7 to random positions on the sky. In both analyses the SNe were not moved between fields. We also analysed the effect on the OLS gradient of simulating the CMB sky, without moving the SNe at all.

Within each field 1-7 we moved SNe to 1000 random positions within the field boundaries defined in section 3.1. We also moved each field 1-7 to 10000 random positions on the sky, compliant with the Galactic latitude restriction (), whilst keeping the field size and shape constant and the SNe in approximately the same position within each field (within small angle approximation). In both cases all other SNe outside fields 1-7 were left in their original positions. After each move (within each field or of each field on the sky) the masking (Planck UT78) was re-applied, the CMB temperature and variance were re-sampled, and the weights were re-calculated.

We created 10000 simulations of the CMB sky from the power spectrum of our fiducial Planck 2015 SMICA map, as described in section 2.2. All SNe were left in their original positions. After each simulation the CMB temperature was re-sampled. The masking (Planck UT78) and variances were left unchanged as the SNe remained on their original CMB map HEALPix pixel.

Following each move or simulation and subsequent derivations/calculations we binned the data, re-calculated the weighted mean CMB temperature of each bin, fitted an OLS linear regression, and determined the gradient of the slope as previously described (section 2.2).

Fig. 5 shows the distribution of OLS gradients after these random moves and simulations. For comparison, the original gradient () is shown as a solid vertical line. Note that the uncertainty in the original gradient is the standard error on the gradient as calculated by the OLS linear regression, whereas the uncertainties in the means of the distributions, described below, are the standard errors of the means.

After moving SNe within each field 1-7 to 1000 random positions within the fields, the mean of the OLS gradient distribution (5(a)) is . The distribution is narrow and consistent with the original gradient, indicating that the position of SNe within fields 1-7 does not significantly affect the correlation.

After moving each field 1-7 to 10000 random positions on the sky, the mean of the OLS gradient distribution (5(b)) is . The distribution is wide, centred near zero, and inconsistent with the original gradient, unsurprisingly indicating that the position of fields 1-7 on the sky is responsible for the correlation.

After 10000 simulations of the CMB sky, the mean of the OLS gradient distribution (5(c)) is . The distribution is consistent with the results from moving each field 1-7 to 10000 random positions on the sky.

Assuming a standard normal distribution we calculated the z-score (standard score) of the original OLS gradient (X) as

$$z = \frac{X - \mu}{\sigma} \qquad (7)$$

where $\mu$ is the mean of the gradient distribution and $\sigma$ is its standard deviation. We then used the standard normal distribution table to provide the probability of observing a gradient at least as extreme as X within our gradient distributions. For moving fields 1-7 on the sky (5(b)) the probability is 6.8 per cent (approximately 1 in 15) and for simulating the CMB (5(c)) it is 8.9 per cent (approximately 1 in 11). Therefore the chance alignment of fields 1-7 with CMB hotspots is not an exceptionally unlikely event.
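The standard-score calculation can be sketched as below, with `scipy.stats.norm.sf` standing in for the look-up in a standard normal table. The function name is hypothetical, and taking the one-sided upper tail as the meaning of "at least as extreme" is an assumption.

```python
from scipy import stats

def tail_probability(x_obs, mean, std):
    """z-score of x_obs within a distribution of gradients and the
    one-sided upper-tail probability under a normal approximation."""
    z = (x_obs - mean) / std
    return z, stats.norm.sf(z)   # sf(z) = 1 - cdf(z)
```

Here `mean` and `std` would be the mean and standard deviation of the gradients from the random field placements or the CMB simulations, and `x_obs` the original gradient.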

## 4 Results: alternative data

### 4.1 SNe types

Yershov et al. (2014) demonstrated that the correlation between SNe redshifts and CMB temperature was particularly strong for the SNIa sub-sample, whereas for the rest of the SNe it vanished. Is this consistent with our assertion that the correlation is the result of a composite selection bias caused by the chance alignment of certain deep survey fields (fields 1-7) with CMB hotspots?

Sub-sample | No. SNe | No. SNIa | % SNIa |
---|---|---|---|
Whole sample | 2783 | 1,749 | 62.8% |
Fields 1-7 | 256 | 235 | 91.8% |
Remainder | 2527 | 1,514 | 59.9% |

Table 5 shows the number and proportion of Type Ia SNe in our SNe samples. Supernova surveys such as SNLS and ESSENCE primarily targeted SNIa (Pritchet & SNLS Collaboration, 2005; Miknaitis et al., 2007), so it is unsurprising that fields 1-7 contain predominantly SNIa (91.8 per cent). As expected in the whole sample, a little over half the SNe are SNIa. Removing fields 1-7 from the sample does not significantly decrease the proportion of SNIa, which drops from 62.8 per cent in the whole sample to 59.9 per cent in the remainder. However, we have shown that the correlation present in the whole sample (Fig. 3(a)) is entirely absent in this remainder (Fig. 3(c)).

Fields 1-7 together comprise 9.2 per cent of the whole sample, but when the sample is restricted to SNIa only this increases to 13.4 per cent. Thus, restricting the sample to SNIa increases the influence of fields 1-7. We suggest that the correlation is not caused by SNIa themselves, but that it is inadvertently enhanced by restricting the sample to SNIa due to the dominance of SNIa in fields 1-7.
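The percentages quoted above follow directly from the counts in Table 5. A short check, with variable names chosen here purely for illustration:

```python
# (number of SNe, number of SNIa) per sub-sample, copied from Table 5.
counts = {
    "whole": (2783, 1749),
    "fields_1_7": (256, 235),
    "remainder": (2527, 1514),
}

def percent(part, total):
    """Percentage rounded to one decimal place."""
    return round(100.0 * part / total, 1)

# Proportion of SNIa within each sub-sample.
snia_share = {name: percent(n_ia, n_all) for name, (n_all, n_ia) in counts.items()}

# Weight of fields 1-7 in the whole sample and in the SNIa-only sample.
fields_in_all_sne = percent(counts["fields_1_7"][0], counts["whole"][0])
fields_in_snia = percent(counts["fields_1_7"][1], counts["whole"][1])
```

The rise from `fields_in_all_sne` to `fields_in_snia` is the increased influence of fields 1-7 when the sample is restricted to SNIa.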

### 4.2 SNe catalogues

We have demonstrated that the correlation reported by Yershov et al. (2012, 2014) is a composite selection bias caused by the chance alignment of certain deep survey fields with CMB hotspots. Yershov et al. (2012, 2014) analysed the Sternberg Astronomical Institute (SAI, Bartunov et al., 2007) SNe catalogue, but it seems reasonable that other SNe catalogues could show a similar effect.

We repeated our analyses from section 3 using the Open Supernova Catalogue (OSC, Guillochon et al., 2017). Data were obtained, restricted and weighted as described in section 2, yielding a sample of 7880 SNe. We found that the OSC sample does indeed exhibit a similar apparent correlation (with a gradient of ) to the SAI sample.

We applied the same SNe field detection algorithm with the same detection thresholds described in section 3.1 to the OSC sample.

Field | No. SNe (SAI) | No. SNe (OSC) |
---|---|---|
Field 1 | 50 | 152 |
Field 2 | 29 | 91 |
Field 3 | 54 | 79 |
Field 4 | 22 | 116 |
Field 5 | 64 | 91 |
Field 6 | 21 | 92 |
Field 7 | 16 | 55 |
Stripe 82 | 665 | 2445 |

Our algorithm identified the same 7 SNe fields with the same boundaries, but with somewhat increased SNe membership, plus 13 fields within Stripe 82. These fields (Table 6) contain a total of 3,121 SNe (39.6 per cent), with Stripe 82 containing 2,445 SNe (31.0 per cent) and fields 1-7 containing 676 SNe (8.6 per cent) of the OSC sample. We compared the OLS linear regression both with and without these fields in the sample and calculated Spearman’s rank-order correlation coefficient of these samples, as described in section 3.2.

Table 7 shows the gradient of the OLS linear regression slope for each OSC remainder sample in units of per unit redshift, plus the standard error on the gradient. In these units the gradient of the whole sample is , which as for the SAI sample is significantly above zero. Spearman’s rank-order correlation coefficient for the whole sample shows a moderate correlation () which is statistically significant (p-value ). These results are consistent with those for the whole SAI sample (gradient , and p-value ).

SDSS Stripe 82 is again the largest field we identified, both in terms of the number of SNe (2,445) and angular size. Removing Stripe 82 from the OSC sample does not significantly affect the OLS linear regression slope () or Spearman’s rank-order correlation coefficient (, p-value ). However, removing fields 1-7 (676 SNe) reduces the gradient dramatically to , and there is no correlation evident in the remainder (, p-value ). Removing both fields 1-7 and Stripe 82 together has a similar effect.
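The with-and-without comparison can be sketched on toy data. The example below is an illustration under stated assumptions rather than the analysis pipeline: the redshifts, temperatures, and field flags are invented, the regression is unweighted, and no p-value is computed. Spearman's coefficient is obtained as Pearson's coefficient of the tie-averaged ranks.

```python
import math

def average_ranks(values):
    """Ranks starting at 1; tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks

def pearson(x, y):
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def spearman(x, y):
    """Spearman's rank-order correlation coefficient."""
    return pearson(average_ranks(x), average_ranks(y))

def ols_slope_with_error(x, y):
    """OLS slope and its standard error for y = a + b*x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    slope = sum((a - mx) * (b - my) for a, b in zip(x, y)) / sxx
    intercept = my - slope * mx
    rss = sum((b - intercept - slope * a) ** 2 for a, b in zip(x, y))
    return slope, math.sqrt(rss / (n - 2) / sxx)

# Invented data: four scattered SNe plus two SNe in a hot, deep field.
z = [0.1, 0.2, 0.3, 0.4, 0.8, 0.9]
temp = [5.0, -3.0, 2.0, -4.0, 90.0, 110.0]
in_field = [False, False, False, False, True, True]

z_rem = [a for a, f in zip(z, in_field) if not f]
t_rem = [b for b, f in zip(temp, in_field) if not f]

whole = (ols_slope_with_error(z, temp), spearman(z, temp))
remainder = (ols_slope_with_error(z_rem, t_rem), spearman(z_rem, t_rem))
```

In this toy case the positive slope and rank correlation of the whole sample are driven entirely by the two flagged SNe, which is the qualitative behaviour described for fields 1-7.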

Fields removed | No. SNe | Gradient | Corr. coeff. | p-value |
---|---|---|---|---|
None | 7880 | 42 ± 7 | 0.6 | |
Fields 1-7 | 7204 | 10 ± 11 | 0.1 | 0.5 |
Stripe 82 | 5435 | 38 ± 7 | 0.5 | |
Fields 1-7 & Stripe 82 | 4759 | 11 ± 13 | -0.0 | 0.1 |

The result of removing fields 1-7 from the OSC sample is illustrated in Fig. 6, which plots the weighted mean CMB temperature at SNe locations in redshift bins of . This plot is repeated for the whole sample (6(a)), fields 1-7 only (6(b)) and the remainder of the sample after fields 1-7 are removed (6(c)).
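A weighted mean in redshift bins can be sketched as follows. This is a minimal illustration that assumes inverse-variance weights; the weighting actually applied is the one described in section 2, and the bin width, data values, and function name below are hypothetical.

```python
import math
from collections import defaultdict

def binned_weighted_mean(z, temp, var, bin_width):
    """Inverse-variance weighted mean temperature per redshift bin.

    Returns {bin_index: (weighted_mean, standard_error, n_sne)}, where bin k
    covers k*bin_width <= z < (k+1)*bin_width.
    """
    acc = defaultdict(lambda: [0.0, 0.0, 0])  # sum(w*T), sum(w), count
    for zi, ti, vi in zip(z, temp, var):
        k = int(zi // bin_width)
        w = 1.0 / vi  # assumed weighting: inverse of the pixel variance
        acc[k][0] += w * ti
        acc[k][1] += w
        acc[k][2] += 1
    return {
        k: (swt / sw, math.sqrt(1.0 / sw), n)
        for k, (swt, sw, n) in sorted(acc.items())
    }

# Invented values for illustration only.
z = [0.02, 0.07, 0.12, 0.18]
temp = [10.0, 30.0, -20.0, 40.0]
var = [1.0, 3.0, 4.0, 4.0]
bins = binned_weighted_mean(z, temp, var, bin_width=0.1)
```

Each bin reports only a mean and its standard error, so any difference in the spread of temperatures between bins is not visible in a plot of the binned means alone.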

The results for the OSC sample indicate that the correlation is caused by fields 1-7 and that SDSS Stripe 82 does not contribute significantly, which is consistent with those for the SAI sample.

### 4.3 Planck CMB maps

Both our analysis and that of Yershov et al. (2014) used maps produced by the Planck SMICA component separation pipeline (Planck 2015 SMICA R2.01 and Planck 2013 SMICA R1.20 respectively). To check our results are consistent across all four of the Planck component separation pipelines we repeated selected analyses from section 3 using the Planck 2015 Commander, NILC, and SEVEM CMB maps. In all cases the pixel variance estimates were calculated from the corresponding HMHD maps as described in section 2.2.

We repeated the OLS linear regression gradient and Spearman’s rank-order correlation coefficient analyses from section 3.2. For all four maps (Commander, NILC, SEVEM, and SMICA) the OLS gradient and correlation present in the whole sample are entirely absent in the remainder once fields 1-7 are removed. For SMICA the contribution of fields 1-7 to the correlation was illustrated in Fig. 3. For Commander, NILC, and SEVEM see Appendix C Figs. 12, 13, and 14 respectively.

We repeated the mean CMB temperature analysis from section 3.3.1. For all four maps SNe in fields 1-7, and all HEALPix pixels within each sample, are biased to CMB temperatures hotter than the mean of the whole sample. In all cases fields 3, 6, and 7 are particularly extreme (Appendix C Table 8).

Our results are entirely consistent across all four Planck maps.

## 5 Discussion and conclusions

We have shown that the apparent correlation of CMB temperature and SNe redshift reported by Yershov et al. (2012, 2014) using OLS linear regression, Pearson’s correlation coefficient, and an SAI SNe sample, is also evident using Spearman’s rank-order correlation coefficient, Welch’s t-test, and MWU test, and it is discernible in at least one other SNe sample (OSC).

Whilst our analysis supports the prima facie existence of the apparent correlation, the data indicate that it is actually a composite selection bias (high CMB T high SNe ) caused by the accidental alignment of seven deep survey fields (fields 1-7) with CMB hotspots. These fields include 3 from the Supernova Legacy Survey, 2 from the ESSENCE supernova survey, HDF-N and CDF-S. These comprise 9.2 per cent of the SAI sample and 8.6 per cent of the OSC sample. These deep fields by their very nature contain SNe at higher redshift than the remainder of the samples. We have shown that the SNe within fields 1-7 are also biased to hotter CMB temperature than the remainder of the samples. Our results are consistent across all four of the Planck maps.

We have quantified the likelihood of fields 1-7 falling on CMB hotspots by chance and have found this to be at least 6.8 per cent, or approximately 1 in 15. We conclude that the correlation reported by Yershov et al. (2012, 2014) is a composite selection bias caused by the chance alignment of certain deep survey fields with CMB hotspots. This bias (high CMB T high SNe ) is the combined result of both a selection bias (high SNe in deep fields) and the chance alignment of those deep fields with CMB hotspots.

This selection bias results in heteroscedastic data, where the variance of CMB temperature at SNe locations is unequal across the range of redshifts. We have shown that high redshift SNe tend to be in deep survey fields which, given the chance alignments, generally give hot Planck pixel temperatures. Low redshift SNe are more uniformly scattered across the sky and thus have much wider variance of hot and cold Planck pixel temperatures. This heteroscedasticity was hidden by binning the data.

This paper shows that deep survey fields have biased SNe cross-correlation with CMB temperature, but the implications could extend further. Deep fields could potentially bias any cross-correlation between astronomical objects (e.g., SNe, galaxies, GRBs, quasars) and the CMB. It is conceivable that deep fields could, by chance, also be aligned with distant large-scale structures, voids, cosmic bulk flows, or even regions of anisotropic cosmic expansion (should they exist).

## Acknowledgements

We are indebted to the anonymous referee for helpful and constructive comments. We are grateful for advice on calculating individual pixel variances received from the NASA/IPAC Infrared Science Archive (IRSA) Help Desk and the Planck Legacy Archive (PLA) Helpdesk. We thank Christian Reichardt for reading and commenting on the draft manuscript. TF is in receipt of an STFC PhD studentship. SR acknowledges support from the Australian Research Council’s Discovery Projects scheme (DP150103208).

This research has made use of the PLA maintained by the European Space Agency, the IRSA, which is operated by the Jet Propulsion Laboratory, California Institute of Technology, under contract with the National Aeronautics and Space Administration (NASA), and NASA’s Astrophysics Data System. It has used data from the supernova catalogue maintained by the Sternberg Astronomical Institute at Moscow State University, and the Open Supernova Catalogue maintained by James Guillochon and Jerod Parrent at Harvard University. The analysis used the HEALPix pixelisation scheme and software packages maintained by NASA’s Jet Propulsion Laboratory at the California Institute of Technology.

## References

- Alam et al. (2015) Alam S., et al., 2015, ApJS, 219, 12
- Astier et al. (2006) Astier P., et al., 2006, A&A, 447, 31
- Bartunov et al. (2007) Bartunov O. S., Tsvetkov D. Y., Pavlyuk N. N., 2007, Highlights Astron., 14, 316
- Bennett et al. (2013) Bennett C. L., et al., 2013, ApJS, 208, 20
- Bueno Sanchez et al. (2009) Bueno Sanchez J. C., Nesseris S., Perivolaropoulos L., 2009, J. Cosmology Astropart. Phys., 11, 029
- Cardoso et al. (2008) Cardoso J.-F., Le Jeune M., Delabrouille J., Betoule M., Patanchon G., 2008, IEEE Journal of Selected Topics in Signal Processing, 2, 735
- Choudhury & Padmanabhan (2005) Choudhury T. R., Padmanabhan T., 2005, A&A, 429, 807
- DES Collaboration et al. (2016) DES Collaboration et al., 2016, MNRAS, 460, 1270
- DES Collaboration et al. (2017) DES Collaboration et al., 2017, preprint (arXiv:1708.01530)
- Dawson et al. (2009) Dawson K. S., et al., 2009, AJ, 138, 1271
- Delabrouille et al. (2009) Delabrouille J., Cardoso J.-F., Le Jeune M., Betoule M., Fay G., Guilloux F., 2009, A&A, 493, 835
- Eriksen et al. (2008) Eriksen H. K., Jewell J. B., Dickinson C., Banday A. J., Górski K. M., Lawrence C. R., 2008, ApJ, 676, 10
- Fernández-Cobos et al. (2012) Fernández-Cobos R., Vielva P., Barreiro R. B., Martínez-González E., 2012, MNRAS, 420, 2162
- Filippenko & Riess (1998) Filippenko A. V., Riess A. G., 1998, Phys. Rep., 307, 31
- Giacconi et al. (2001) Giacconi R., et al., 2001, preprint (arXiv:astro-ph/0112184)
- Giannantonio et al. (2016) Giannantonio T., et al., 2016, MNRAS, 456, 3213
- Górski et al. (2005) Górski K. M., Hivon E., Banday A. J., Wandelt B. D., Hansen F. K., Reinecke M., Bartelmann M., 2005, ApJ, 622, 759
- Guillochon et al. (2017) Guillochon J., Parrent J., Kelley L. Z., Margutti R., 2017, ApJ, 835, 64
- Karpenka et al. (2015) Karpenka N. V., Feroz F., Hobson M. P., 2015, MNRAS, 449, 2405
- Kessler et al. (2015) Kessler R., et al., 2015, AJ, 150, 172
- LSST Science Collaboration et al. (2009) LSST Science Collaboration et al., 2009, preprint (arXiv:0912.0201)
- Louis et al. (2017) Louis T., et al., 2017, J. Cosmology Astropart. Phys., 6, 031
- Miknaitis et al. (2007) Miknaitis G., et al., 2007, ApJ, 666, 674
- Nesseris & Perivolaropoulos (2007) Nesseris S., Perivolaropoulos L., 2007, J. Cosmology Astropart. Phys., 2, 025
- Perlmutter et al. (1999) Perlmutter S., et al., 1999, ApJ, 517, 565
- Planck Collaboration et al. (2014a) Planck Collaboration et al., 2014a, A&A, 571, A1
- Planck Collaboration et al. (2014b) Planck Collaboration et al., 2014b, A&A, 571, A12
- Planck Collaboration et al. (2016a) Planck Collaboration et al., 2016a, A&A, 594, A1
- Planck Collaboration et al. (2016b) Planck Collaboration et al., 2016b, A&A, 594, A9
- Planck Collaboration et al. (2016c) Planck Collaboration et al., 2016c, A&A, 594, A13
- Pritchet & SNLS Collaboration (2005) Pritchet C. J., SNLS Collaboration, 2005, in Wolff S. C., Lauer T. R., eds, Astronomical Society of the Pacific Conference Series Vol. 339, Observing Dark Energy. p. 60 (arXiv:astro-ph/0406242)
- Riess et al. (1998) Riess A. G., et al., 1998, AJ, 116, 1009
- Sachs & Wolfe (1967) Sachs R. K., Wolfe A. M., 1967, ApJ, 147, 73
- Story et al. (2013) Story K. T., et al., 2013, ApJ, 779, 86
- Sunyaev & Zel'dovich (1970) Sunyaev R. A., Zel'dovich Y. B., 1970, Ap&SS, 7, 3
- Welch (1938) Welch B. L., 1938, Biometrika, 29, 350
- Williams et al. (1996) Williams R. E., et al., 1996, AJ, 112, 1335
- Yershov et al. (2012) Yershov V. N., Orlov V. V., Raikov A. A., 2012, MNRAS, 423, 2147
- Yershov et al. (2014) Yershov V. N., Orlov V. V., Raikov A. A., 2014, MNRAS, 445, 2440

## Appendix A Fields 1-7

## Appendix B Variances and smoothing

## Appendix C Alternative Planck maps

CMB temperature () by field, for each Planck map:

Field | Commander SNe | Commander pixels | NILC SNe | NILC pixels | SEVEM SNe | SEVEM pixels | SMICA SNe | SMICA pixels |
---|---|---|---|---|---|---|---|---|
Field 1 | 32.4 ± 9.4 | 52.3 ± 1.7 | 30.5 ± 9.1 | 49.0 ± 1.7 | 35.7 ± 8.9 | 51.9 ± 1.7 | 29.1 ± 9.1 | 47.5 ± 1.7 |
Field 2 | 94.8 ± 2.8 | 95.2 ± 1.6 | 90.7 ± 3.1 | 92.9 ± 1.6 | 97.0 ± 3.1 | 96.1 ± 1.7 | 92.9 ± 3.3 | 94.3 ± 1.7 |
Field 3 | 133.0 ± 9.4 | 138.9 ± 1.0 | 132.1 ± 9.2 | 139.0 ± 1.0 | 132.2 ± 9.3 | 140.5 ± 1.0 | 130.8 ± 9.2 | 137.5 ± 1.0 |
Field 4 | 91.0 ± 14.7 | 85.7 ± 1.7 | 94.2 ± 14.5 | 88.2 ± 1.7 | 87.4 ± 13.8 | 85.6 ± 1.7 | 85.3 ± 14.3 | 81.7 ± 1.7 |
Field 5 | 51.8 ± 8.9 | 70.2 ± 1.2 | 52.1 ± 8.6 | 71.5 ± 1.2 | 51.8 ± 8.7 | 70.2 ± 1.2 | 49.7 ± 8.5 | 68.2 ± 1.2 |
Field 6 | 197.3 ± 14.5 | 169.5 ± 1.9 | 195.7 ± 14.1 | 167.4 ± 1.9 | 197.0 ± 13.8 | 167.1 ± 1.9 | 199.3 ± 13.6 | 168.9 ± 1.9 |
Field 7 | 117.5 ± 12.3 | 100.5 ± 3.9 | 121.0 ± 12.4 | 100.8 ± 4.1 | 121.9 ± 11.9 | 103.9 ± 4.0 | 120.5 ± 12.0 | 100.8 ± 4.0 |
Fields 1-7 | 89.4 ± 5.0 | 103.6 ± 0.7 | 88.4 ± 5.0 | 103.7 ± 0.7 | 90.1 ± 4.9 | 103.8 ± 4.0 | 87.4 ± 5.0 | 101.5 ± 0.7 |
Stripe 82 | 4.8 ± 4.0 | 17.9 ± 0.2 | 6.1 ± 4.0 | 19.5 ± 0.2 | 5.8 ± 4.0 | 18.6 ± 0.2 | 3.8 ± 4.0 | 17.2 ± 0.2 |
Whole sample | 10.8 ± 2.0 | 3.2 ± 0.0 | 10.4 ± 2.0 | 2.2 ± 0.0 | 11.8 ± 2.0 | 3.0 ± 0.0 | 10.3 ± 2.0 | 3.1 ± 0.0 |