Optimal constraints on primordial gravitational waves from the lensed CMB

# Optimal constraints on primordial gravitational waves from the lensed CMB

Julien Carron Department of Physics & Astronomy, University of Sussex, Brighton BN1 9QH, UK
July 18, 2019
###### Abstract

We demonstrate how to obtain optimal constraints on a primordial gravitational wave component in lensed Cosmic Microwave Background (CMB) data under ideal conditions. We first derive an estimator of the tensor-to-scalar ratio, , by using an error-controlled close approximation to the exact posterior, under the assumption of Gaussian primordial CMB and lensing deflection potential. This combines fast internal iterative lensing reconstruction with optimal recovery of the unlensed CMB. We evaluate its performance on simulated low-noise polarization data targeted at the recombination peak. We carefully demonstrate our -posterior estimate is unbiased and optimal, making it the most powerful estimator of primordial gravitational waves from the CMB. We compare these constraints to those obtained from -mode band-power likelihood analyses on the same simulated data, before and after map-level quadratic estimator delensing, and iterative delensing. Internally, iteratively delensed band powers are only slightly less powerful on average (by less than 10%), promising close-to-optimal constraints from a stage IV CMB experiment.

## I Motivation

After the completion of the Planck Cosmic Microwave Background (CMB) missionAkrami et al. (2018a), the major target of the CMB community has now become precise measurement of the CMB polarization. The magnetic (B) part of the CMB polarization Kamionkowski et al. (1997); Zaldarriaga and Seljak (1997) on degree scales is a unique signature of the stochastic background of primordial gravitational waves produced during inflation Seljak and Zaldarriaga (1997); Kamionkowski and Kovetz (2016). Constraints on the tensor-to-scalar power spectrum ratio, , are expected to increase by two orders of magnitude in precision within the next decade: CMB Stage IV (CMB-S4) has forecast sensitivity down to , using delensed -mode band powers after foreground cleaning Abazajian et al. (2016). Lensing of the CMB photons by large-scale structures generates polarization that effectively appears as an approximately white cosmic variance noise Zaldarriaga and Seljak (1998); Lewis and Challinor (2006). In order to reach such tight constraints on the primordial signal, successful delensing of this K-arcmin noise is mandatory. Currently and for the next few years, the most faithful lensing tracer at the scales relevant for -mode delensing is the Cosmic Infrared Background(CIB) Sherwin and Schmittfull (2015), able to achieve 40% delensing on 60% of the sky Aghanim et al. (2018), and possibly more in areas that are clean from galactic dust. It is also possible to combine the CIB with other large-scale structure tracers Manzotti (2018); Yu et al. (2017), or with the CMB internal reconstruction Aghanim et al. (2018), in order to increase its fidelity to the CMB lensing field. Lensing estimates for CMB-S4 will be dominated by the internal reconstruction, using polarization quadratic estimators Okamoto and Hu (2003) or more powerful iterative estimators, first introduced by Ref. Hirata and Seljak (2003). At the low instrumental noise levels expected for CMB-S4, iterative internal estimation from CMB polarization has been demonstrated on simulated data to give lensing reconstructions that are more than 90% cross-correlated to the true lensing Seljak and Hirata (2004); Carron and Lewis (2017).

One may ask whether it could be possible, at least in principle, to do even better than these forecasts. The lensing deflections introduce non-Gaussianities in the form of higher order statistics in the CMB temperature and polarization Lewis and Challinor (2006), which are used to reconstruct the lensing signal Hanson et al. (2010). Delensing will remove part of the non-Gaussianity, but only imperfectly, and some amount of information must remain beyond the power spectra. Hence, it is plausible that there may be room for alternative statistics that compress more information than delensed -mode band-powers.

This paper has two main purposes. The first is to demonstrate how to obtain directly the posterior probability density (PDF) for , from lensed CMB data. The posterior contains all the information on , and constraints based on it are optimal. The second is to compare this optimal method to band-power likelihood analysis. Finding a posterior width in agreement with naive expectations will confirm current forecasting methods and our understanding of how well CMB experiments can constrain primordial gravitational waves.

Our approach uses an approximate, analytic marginalization of the large-scale structure lensing to build the statistics of interest, here the tensor-to-scalar ratio, . This analytic marginalization is a fairly natural choice, and has been used already in a different context by Ref. Hirata and Seljak (2003), where the aim was to obtain an optimal estimator of the lensing spectrum. In this paper we provide a rigorous discussion of the accuracy of the approximation and show that corrections are negligible for our purposes and the experimental configurations investigated.

We work in the flat-sky approximation. All our simulations use the Planck 2015 cosmology Ade et al. (2016a), with power spectra generated with the camb (Lewis et al., 2000) software. We use square maps of area , assuming periodic boundary conditions for simplicity, with pixels of arcmin on a side. Foreground cleaning is a major challenge to the quest for primordial gravitational waves. We do not consider these complications in this paper, assuming throughout we are working with foreground-cleaned maps with white noise power spectra. We always use white Gaussian noise levels of K-arcmin in polarization, a instrument beam of 3 arcminutes, and consider CMB multipoles below only. The minimum multipole we probe in our flat-sky implementation of the sky patches is , excluding the reionization peak. We consider 3 different levels of tensor modes, with tensor-to-scalar ratio defined at the pivot scale and a vanishing tensor spectral index. The first has , close to current constraints (95% c.f.) from Planck in combination with BICEP2/KECK array BK14 data Akrami et al. (2018b). For the configuration just described, the nominal band powers are enough for a strong detection. Second, , in which case the nominal band-powers cannot detect the waves decisively but the delensed band-powers can. Third, a vanishing amplitude where in all cases only upper limits can be placed from the data.

The paper is built as follows. Sec. II describes our approximation scheme to the exact posterior density function, and Sec. III gives details on our numerical implementation. In Sec. IV, we discuss our nominal and delensed band powers likelihood and implementation. We present in Sec. V our results and summarise and conclude in Sec. VI. One appendix details a couple of technical points for completeness.

## Ii Posterior for tensor-to-scalar ratio

Given CMB Stokes parameter polarization data , and for a uniform prior on , the posterior is proportional to the likelihood

 p(r|Xdat)∝p(Xdat|r). (1)

Owing to the lensing by large scale structures, the CMB probability density function on the right-hand side is non-Gaussian in and does not have a simple analytical description. We may write, however, marginalising over possible lensing deflection maps,

 p(Xdat|r)=∫Dϕpϕ[ϕ]p(Xdat|ϕ,r), (2)

where is the probability density of the lensing potential, defined by the large-scale structure evolution. When the lensing map is known, the lensed CMB is still Gaussian, since the deflections just remap points on the sky to very good approximation Lewis et al. (2017). Hence we can consider Gaussian in , with covariance determined by the lensed spectra of the CMB together with the specified amount of tensor mode, the transfer function and noise covariance matrix. Neglecting for simplicity the tiny cross-correlation between and the CMB -polarization Lewis et al. (2011), which is too small to impact our results222We do include properly all cross-correlations including in all our simulations., we can write explicitly,

 lnp(Xdat|ϕ,r)=−12Xdat,†Cov−1ϕXdat−12lndetCovϕ, (3)

with pixel-space covariance matrix

 Covϕ=BDϕCunl,fidD†ϕB†+N. (4)

Here, is the transfer function including instrument beam, is the lensing deflection operator that maps the unlensed CMB Stokes parameters to the ones deflected by , is our set of unlensed fiducial CMB spectra, and is the noise matrix. In position space and on the flat-sky, has the explicit representation .

The model specified by Eq. (3) neglects the very small effect of lensing of the polarization generated from reionization, so that a single, common lensing deflection can be used. The CMB spectra contains the dependence on tensor modes

 Cunl,fidℓ≡Cscal,fidℓ+rCtens.,fidℓ (5)

We pick the prior on lensing maps in Eq. 2 to be Gaussian with power , calculated from the non-linear matter power333We follow standard practice of denoting lensing multipoles with and CMB multipoles with .. Owing to non-linear evolution in the late Universe, is not exactly Gaussian. However, non-linear effects are weak at scales relevant for the lensing -modes (the non-linear contribution to the degree-scale lensing -power is only a few percent), and generally the effects of non-Gaussian lenses is not expected to bias the lensing reconstruction from polarization Beck et al. (2018); Boehm et al. (2018). Hence this choice is unlikely to significantly bias our results. In practice this assumption could simply be explicitly tested performing the reconstruction of this paper using lensing maps from realistic -body simulations.

The integrand in Eq. 2 peaks at the most probable lensing map, given the data and candidate value for the tensor amplitude. Let be this most probable potential map. We may then write

 e−S[ϕ]≡pϕ[ϕ]p(Xdat|ϕ,r) (6)

and expand the action to second order around ,

 S[ϕ]∼S[ϕ∗(r)]+12(ϕ−ϕ∗(r))Hr[ϕ∗(r)](ϕ−ϕ∗(r)). (7)

Both the lensing potential and the curvature at this point have explicit dependence on the data map, and explicit (through the CMB spectra) and implicit (through ) dependence on . In the following we suppress the data dependence for notational clarity.

The lensing map marginalization, Eq. 2, becomes trivial under this approximation. The result is, up to terms independent from ,

 lnp(r|Xdat)≃−S[ϕ∗(r)]−12lndetHr[ϕ∗(r)], (8)

which forms the basis of our investigations. Leading corrections or alternative expansions are discussed below in Sec. II.1.

In the remainder of this subsection we just note how is related to the -mode power, by looking at its dependence on . Let us introduce the Wiener-filtered CMB maps through

 XWF≡Cunl,fidD†ϕB†Cov−1ϕXdat. (9)

These are the most probable unlensed CMB maps given the observed Stokes data and given the fiducial model entering the covariance . In particular, since the scalar unlensed power vanishes, is the most probable (maximum a posteriori) map of the tensor modes, assuming the fiducial value of and lensing map are the truth. By definition

 δS[ϕ]δϕ∣∣∣ϕ=ϕ∗(r)=0, (10)

hence for this lensing map. For this reason, the prior term does not contribute to the action -derivative, only the likelihood defined in Eq. (3). It then follows (neglecting the tensor contribution)

 dS[ϕ∗(r)]dlnr=12∑ℓ∣∣BWFℓ(r)∣∣2−⟨∣∣BWFℓ(r)∣∣2⟩rCBB,tensℓ. (11)

The first term on the right-hand side comes from the quadratic part in the CMB likelihood. The sum runs over all 2D frequencies of our sky patch. The second term (the average of the first over data realizations) comes from the log-determinant -derivative444This term is most easily derived realizing that on average, the likelihood variation vanishes, , since is a properly normalized probability density for any value of . Hence Eq. 11 must average to zero..

All dependence on the lensing map prior in the reconstructed -posterior Eq. (11) is absorbed into the lensing map reconstruction . If the lensing map were exactly known, is the properly delensed -mode map, there is no marginalization over the lensing map and Eq. (11) directly gives the (derivative of log)-posterior by trivial comparison to the expected power.

### ii.1 Corrections and alternative approximations

Eq. (8) is an approximation of the posterior PDF, that relies on the Gaussianity of the reconstructed lensing map. Lens reconstruction from the CMB is non-linear in the data, and some non-Gaussian features are expected, and visible, in standard lens reconstructions with perfectly Gaussian true input lensingLiu et al. (2016). If the approximation is not good enough, resulting constraints might be biased, or suboptimal, possibly both. Hence it is very useful to be able to assess the size of the leading corrections. We discuss now how corrections can be evaluated, also giving us alternative expansion schemes. Sec. III.4 later on demonstrates using these tools that Eq. (8) is accurate and unbiased.

We can proceed as follows. The exact posterior in Eq. (2) integrates . Using an arbitrary trial action as an approximation to , we may write the exact identity

 ∫Dϕe−S[ϕ]=(∫Dϕe−St[ϕ])⟨e−ΔS[ϕ]⟩t, (12)

where the average is with respect to probability density , and is the mismatch between the true and trial actions. After taking the logarithm, and for the natural expansion Eq. (7), the first term on the right-hand side (the normalization factor of the density ) is our approximation Eq. (8), and the second term encapsulates all errors. The point is that whenever is chosen quadratic in , this error term is an average over Gaussian lensing maps. We can simulate these maps, and estimate this term by averaging over simulations. Unless the approximation is extremely good, in practice we can never probe directly the exponential with a reasonable number of Monte-Carlo’s, but we obtain in this paper leading contributions. Using a standard cumulant expansion we can write the asymptotic expansion

 (13)

In particular, all expansions of the form

 St[ϕ]=S[ϕ∗]+12(ϕ−ϕ∗)†H(ϕ−ϕ∗), (14)

for an arbitrary matrix results in (including here only the leading cumulant corrections)

 lnp(r|Xdat)∼−S[ϕ∗]−12lndetH−κ1+12κ2, (15)

where and are the mean and variance of from a Gaussian ensemble of lensing maps with inverse covariance . This provides alternative expansion schemes and useful consistency checks, where some of the difficulties of dealing with the exact curvature can be eliminated by using a simpler matrix, at the cost of a larger cumulant correction (see Sec. III.4).

## Iii Posterior evaluation

This section describes the implementation of the different terms in the posterior, Eq. 8. Two large log-determinants must be evaluated. The determinant of the CMB data covariance matrix is discussed in III.1, and is in fact fairly harmless. The second, the determinant of the lensing curvature matrix, discussed in III.2, is more complex, and is the most expensive step of the entire implementation. However, we find that its dependence on the data realization is very weak, and can be neglected. For this reason, it only needs to be calculated once for a simulation suite. All terms must be evaluated at the best lensing map . The production of this map together with the Wiener-filtered CMB maps is described in Sec. III.3. Finally we give in Sec. III.4 details on our generation of Gaussian lensing map samples with large and dense covariance matrices, as required for evaluation of the cumulant corrections.

### iii.1 Data covariance determinant

How to estimate log-determinants of very large, dense matrices like ? There is no trivial diagonalization, so that the large size of the matrix renders brute force methods totally useless. Possibilities include integral representations combined with Monte-Carlo evaluation of the diagonal Dorn and Enßlin (2015). We are only interested in the -dependence. This allow us to simplify the problem quite a bit. We first note that

 c(r)≡−∂∂r12lndetCovϕ∗(r)=12TrCov−1ϕ∗(r)∂rCovϕ∗(r)

where only depends on the covariance matrix constructed from tensor CMB spectra555The derivative does not act on ; we are interested in the explicit -dependence only, for the reasons explained at the end of Sec. II. Using Monte-Carlo simulations, we can write an unbiased estimator as

 ^c(r)=12X†Cov−1ϕ∗(r)∂rCovϕ∗(r)X (16)

where are unit spectra random variables with . It is easily seen that the estimator is unbiased. Its Monte-Carlo noise variance (when using Gaussian ) is simply the Fisher information on , divided by the number of simulations used. We can further reduce the variance with the help of a reference ideal covariance matrix, denoted with a subscript , for which we know the determinant: consider the modified estimator

 ^c(r)−^c0(r)+c0(r), (17)

the last term being known analytically. With a good isotropic approximation to the covariance we might expect to be able to reduce the MC noise of the original estimate substantially. Fig. 1 shows in the upper panel the Monte-Carlo error as a function of , using the raw estimator (blue line) of Eq. (16), with the lensed spectra as reference (orange) in Eq. 17, and using unlensed spectra (green) as reference. More specifically, neglecting again the tensor contribution, these last two choices correspond to

 clen0(r)=12∑ℓ(2ℓ+1)CBB,tens.ℓrCBB,tens.ℓ+CBB,lenℓ+CBB,noiseℓ (18)

in the lensed case, and

 cunl0(r)=12∑ℓ(2ℓ+1)CBB,tens.ℓrCBB,tens.ℓ+CBB,noiseℓ (19)

in the unlensed case. In these two equations, is short-hand notation for the exact number of modes of our flat-sky patch, and is the beam-deconvolved noise spectrum. In the unlensed case, it is apparent that with a single simulation we can reach well below percent accuracy on the log-determinant.

In fact, for our experimental configuration at least, the isotropic, unlensed spectra determinant approximation is extremely accurate, and the small deviation from it can be very well captured by the leading term of a perturbative expansion in powers of . This is shown in the lower panel of Fig. 1, which displays the relative deviation of the exact determinant from its isotropic counterpart calculated with unlensed spectra (blue points with error bars) and the perturbative prediction (red line). More explicitly, the perturbative prediction is

 12lndetCovϕ=12lndetCovϕ=0+12∑LRL|ϕL|2, (20)

where the linear, mean-field response matrix is obtained in the appendix, and the first term on the right-hand side is evaluated with the unlensed CMB spectra. At first sight, it might appear counter-intuitive that the unlensed spectra prediction is so accurate, but this is easily explained. In the limit of vanishing instrumental noise and perfect resolution, the data covariance Eq. (3) reduces to

 Covϕlow noise−−−−−−→high res.DϕCunl,fidD†ϕ.

Averaged over lensing maps this gives the CMB spectra lensed by . However, taking the determinant one expects the factorization (for example after imposing a sufficiently high band-limit making all these operators square matrices)

 lndetCovϕlow noise% −−−−−−−→high res.lndetDϕD†ϕ+lndetCunl,fid.

The first determinant has no dependence on and is an irrelevant constant. Hence, usage of the unlensed spectra gives under these conditions the exact result irrespective of the amount of lensing in the data. Fig. 1 demonstrates that in our instrumental configuration, the mean-field response in Eq. (20) captures very well the tiny residual coupling between these terms, the transfer function and the noise matrix.

### iii.2 Hessian determinant

The Hessian is defined as the second variation of the log-posterior of the lensing map, Eq. (7). Operationally speaking there are three terms . The first originates from the quadratic piece of the CMB likelihood and is data realization dependent,

 HdatLL′≡12δ2δϕLδ¯ϕL′XdatCov−1ϕXdat∣∣∣ϕ=ϕ∗(r) (21)

the second also comes from the likelihood but is data-independent,

 HCovLL′≡12δ2δϕLδ¯ϕL′lndet% Covϕ∣∣∣ϕ=ϕ∗(r), (22)

and the third is the lensing map prior curvature, dominant on small scales where the data does not constrain the lensing deflection field. Under our choice of Gaussian statistics for , this prior term is trivially given by

 HPriLL′=δLL′Cϕϕ,fidL. (23)

The calculation of the total Hessian determinant is the main difficulty to get the -posterior. It plays a key role in shaping the final posterior and cannot be neglected, but is difficult to obtain exactly. However, as shown further below, the dependence of log-determinant on the data realization can be neglected for all practical purposes. Hence, this calculation only needs to be performed once for each model.

In general, we solve for by coupling the same trace probing method described in Sec. III.1 to a double layered conjugate gradient inversion. We use for our accurate approximation from Sec. III.1 for the covariance determinant, with the trivial result

 HCovLL′=δLL′RL. (24)

The main operational difficulty lies in the application of to an arbitrary lensing potential map . The details of this calculation are deferred to the appendix. There, it is shown that one can apply the Hessian matrix to a lensing potential vector at the cost of Wiener-filtering one pair of Stokes maps. This operation is itself performed via conjugate gradient inversion. Using a simple diagonal preconditioner for the outer conjugate gradient inversion, we found that we need slightly above 10 iterations and thus the inverse filtering of the same number of polarization maps to conservatively solve for to five significant figures. The same operations must then be repeated for each Monte-Carlo used to probe the trace, and at each point of interest. For the area of square degrees and the resolution of arcminutes dealt with in this paper, a single probe of the trace at some point takes a couple of minutes. The final Monte-Carlo error on the PDFs depends on the -density of trace-probing MC’s, rather than the number of MC’s per point. Hence it can be advantageous to use a dense -grid where each point is sparsely trace-probed. For the PDFs shown in this work, where we reconstruct accurately the entire PDF shape, we typically use 64 points, denser near the peak, and MC’s per point, which gives a crude but still reasonable estimate of the local error.

Figure 2 shows a couple of posterior reconstructions on simulated data. One pair with input (orange), and one pair with (blue). The filled colours shows the reconstructions using the realization-dependent Hessian estimate, where the width displays the uncertainties resulting from the Monte-Carlo determinant evaluation. These errors are independent only for the estimation of the posterior -derivative: after normalization to build the posterior itself, this produces a strong anti-correlation of errors between points on opposite side of the peak on this figure. The black lines show the corresponding curves, but with Hessian estimation swapped between the two realizations. Summary statistics are virtually identical, and this demonstrates that for all practical purposes, we can neglect the data dependence of the log-determinant. However, the dependence on the lensing potential remains, and we do not offer in this work a trivial, fully isotropic approximation.

### iii.3 Iterative lens estimation

The first step in the -posterior reconstruction is always the calculation of the iterative solution together with the Wiener-filtered CMB maps . We follow Ref. Carron and Lewis (2017) very closely, maximizing the lensing map posterior probability for the lensing deflections using a quasi-Newton iterative descent. At each iteration step, the current lensing estimate induces a mean-field (, in standard lensing estimation terminology) term to subtract from the quadratic estimate, originating from the first variation (with respect to ) of the covariance matrix determinant. This term is small on all scales for lens reconstruction from polarization, and we use the same accurate analytical approximation discussed in Sec. III.1 and in the appendix instead of simulations.

### iii.4 Higher-order cumulants calculation

Evaluation of the correction terms in Eq. (13) requires sampling Gaussian lensing maps with inverse covariance given by the curvature matrix chosen for the posterior expansion. This matrix is very large and has no trivial structure, hence is difficult to sample from, and standard generic methods such as Cholesky decomposition are useless. It is possible, however, to apply this matrix to a lensing map in reasonable time, and powerful methods can be used to construct samples with the correct covariance structure using this property. We proceed by deforming continuously the inverse square root of H as follows(Allen et al., 2000). We connect to the identity matrix defining

 Ht≡I+t(H−I), (25)

and introduce where is a unit variance Gaussian vector of the appropriate dimensionality. By definition, and has covariance . Differentiating gives the following ordinary differential equation (ODE) for :

 dϕtdt=−12(H−I)[t(H−I)+I]−1ϕt (26)

The solution of this ODE at is then by construction the desired Gaussian sample. We have used this machinery to test corrections to our posterior approximation Eq. (8) on one data realization. In this case,

 HLL′=δ2S[ϕ]δϕLδ¯ϕL′∣∣∣ϕ=ϕ∗(r). (27)

As a consistency check we also checked the alternative choice of curvature were the lensing deflections are neglected,

 HLL′=δ2S[ϕ]δϕLδ¯ϕL′∣∣∣ϕ≡0. (28)

The second case considerably speeds-up the calculation of the curvature log-determinant, but the posterior is expected to be less accurate. Figure 3 shows the different -posterior estimates, for one data realization with input . The solid lines shows our baseline approximation, without corrections (blue), with first cumulant (orange) and with first two cumulants (green). We have used 4 Monte-Carlo samples per -point, on a grid devised as just described in Sec. III.2. We use in all cases a standard low order Bulirsch-Stoer algorithm Press et al. (1996) to solve the ODE defined in Eq. (26). The bracketed inverse is solved with conjugate-gradient descent. The corrections have a completely negligible effect on the posterior. The dashed lines show the same curves with the choice of unlensed curvature. The agreement, including the corrections, is very good. However, the leading approximation is clearly worse and the corrections are essential in bringing the two expansion schemes in agreement. In the remainder of this paper we stick to the lensed curvature matrix, and safely neglect the cumulant corrections.

## Iv Nominal and delensed B-mode band powers

We now describe our reconstruction of from raw and delensed band powers. The likelihood of CMB data temperature and polarization band powers has been discussed at length in several places already, and efficient parameterizations both at low and high multipoles are well known (e.g. Efstathiou, 2004; Hamimeche and Lewis, 2008). Only preliminary work exists, however, for delensed band powers Namikawa and Nagata (2015). After performing consistency checks described in the next section, we decided to use the Hamimeche and Lewis (HL) likelihood Hamimeche and Lewis (2008), using analytical band-power predictions and covariance matrices as described below. We note that other possibilities such as the parametrization of Ref. Smith et al. (2006) seem to work just as well. We consider -mode multipoles only up to , and use as baseline the maximal number of bins (74) that our flat-sky mode structure allows when building the -posteriors.

All of our -spectra likelihoods may eventually be written

 −2lnp(r|^CBB)≡g(xℓ)^CBBℓΣ−1ℓℓ′^CBBℓ′g(xℓ′)

with and . These power spectra include tensor contribution, residual lensing power and beam-deconvolved noise. The predictions of the spectra are evaluated analytically, using the perturbative expression for the lensing -mode power to linear order in the appropriate lensing spectrum. For the nominal band powers we use the fiducial lensing spectrum . In the case of the quadratic estimator or iteratively delensed band powers, we first empirically measure the cross-correlation coefficient squared of the reconstructed lensing maps to the true input from a small set of simulations. We then use as lensing spectrum . We neglect the tiny lensing effects on the tensor spectra. The lensing maps themselves are obtained as described in Sec. III.3, with the difference that we exclude the tensor modes scales to build the tracers. Including these modes would dramatically complicate the analysis: if the delensed data and lensing tracer contain common multipoles, the delensed -modes would contain very strong, spurious delensing signature that originate from disconnected 4-points and 6-points statistics of the CMB Teng et al. (2011); Carron et al. (2017); Namikawa (2017). The reason is that the lensing estimator cannot distinguish between true and random lensing signatures in the data (such as shear or magnification) and all of these signatures are removed after delensing, leading to a spuriously unlensed-looking CMB at the two-point level. The loss of signal to noise in the lensing tracer by excising the largest angular scales is small: the delensing efficiency is reduced by roughly 1% on the most relevant scales .

To delens the CMB, we use a simple remapping technique. From the Wiener-filtered lensing deflection estimate , we simply remap the Stokes parameter (after beam deconvolution, and discarding multipoles higher than 2000) according to the lensing inverse deflection field , defined through the condition that deflected points get remapped back to themselves

 x+α−1(x+α(x))≡x. (29)

This inversion is performed exactly following Ref. Carron and Lewis (2017), using a fast-converging Newton-Raphson solver. This delensing technique differs operationally and conceptually from the template method used for instance by the SPTpol team on their polarization data Manzotti et al. (2017), which uses a Wiener-filtered -mode map to build a template, then subtracted from the data. The filters of the template method are optimized to minimize the resulting -power at linear order Sherwin and Schmittfull (2015). While the remapping method naturally contains higher-order terms, the noise is also remapped, which can result in higher total power if this is high such as for Planck data Carron et al. (2017); Aghanim et al. (2018). However, we found that there is basically no difference between the two methods for the low levels of noise here. By default we include the noise remapping into our delensed -power predictions, but this is very small. Finally, there are no separation complications Lewis et al. (2002); Smith and Zaldarriaga (2007) since we use periodic patches throughout this paper and the spectra estimator are trivial to build from the delensed or nominal Stokes maps.

For the covariance matrix, we use the simple approximation Smith et al. (2004); Benoit-Levy et al. (2012) combining the Gaussian part with corrections according to the lensing kernels:

 Σℓ1ℓ2=δℓ1ℓ2σ2ℓ1(CBB)+∑ℓ∂CBBℓ1∂CEE,unlℓσ2ℓ(CEE,unl)∂CBBℓ2∂CEE,unlℓ+∑L∂CBBℓ1∂CϕϕLσ2L(Cϕϕ)∂CBBℓ2∂CϕϕL (30)

with Gaussian spectrum variance

 σ2ℓ(CXX)≡2(CXXℓ)22ℓ+1 (31)

where really stands for the number of multipoles in our flat-sky patch. The derivatives are evaluated to first order in , or in for the delensed band powers. On the relevant scales , the covariance correlation coefficients are small and of minor importance, and the perturbative expansion is accurate enough. Collecting reconstructions from simulations, we checked that the diagonal of the covariance matrices matches the prediction to within a couple of percent at least.

## V Results

We have performed our baseline posterior reconstruction, including realization-dependent determinant calculation, on 21 CMB simulations for each of the 3 different levels of tensor modes advertised in the introduction. For each realization, we also performed likelihood analyses from the quadratic-estimator delensed, iteratively delensed -mode band powers and with no delensing. Fig. 5 shows our main results. For each simulation the constraints obtained from either our posterior approximation are displayed (black), together with those obtained from the different band-power analyses (coloured lines). Blue shows the results from the nominal (no delensing) band-power likelihood, orange after delensing with the Wiener-filtered quadratic estimator , and green after delensing with the iterative (formally, the maximum a posteriori (MAP) solution ) lensing solution . Finally, for comparison the red line shows the constraints achievable if the lensing deflections were known perfectly. Up to irrelevant constants, this latter case is given by

 lnp(r,|Xdat,ϕin)∝−12Xdat,†Cov−1ϕinXdat−12lndetCovϕin, (32)

where is the lensing potential input to data realization . The three panels shows and from top to bottom. In the first two panels the quoted numbers are the width of the PDF, and the lowest panel shows the 95% confidence upper limit. The grey area around the black line shows the uncertainty originating from the Monte-Carlo measurement of the Hessian determinant. We evaluate these errors simply by sampling an ensemble of posteriors from the Monte-Carlo log-determinant’s errors. Fig. 4 shows explicitly the posterior PDF for one of the realizations using the same colour scheme. As visible there, all reconstructions show skewness and other non-Gaussian features, hence these summary statistics are only an imperfect representation of the constraints on .

All of our exact posterior estimates mean value lie within 1.8 () of the true value. The series of simulations shows a single outlier, with recovered posterior mean formally lower than the input by , but with a pdf showing substantial skew with no strong evidence of an anomaly. We can try and isolate a possible systematic bias in our posterior approximation by combining all posteriors as if they were independent measurements of . Doing so, we obtain the constraints

 100^r=5.15±0.12(100rin=5,combined) (33)
 100^r=1.04±0.05(100rin=1,combined), (34)

both perfectly consistent with the input values. This constrains a systematic bias to a small fraction of a standard deviation. The same test on all the band-power likelihoods or the known- likelihood (Eq. 32) also shows consistency.

It is apparent that there is significant spread in the summary statistics displayed on Fig. 5. As expected, this is more pronounced for the nominal band-power analysis which has the weakest constraints. While the different methods constraining power do rank as expected on average, the results using the iteratively delensed band powers in green and our full posterior reconstruction in black are very close, with a few reconstructions showing nominally slightly worse summary statistics from the posterior reconstruction. These cases seem compatible with the errors on the posterior. Also, since the exact band-power likelihood is intractable, it is very difficult to assess precisely the impact of our band power-likelihood approximation on each realization. In all cases, iterative delensing does perform better than quadratic estimator delensing, but the improvement shows substantial realization dependence as well, and in a couple of cases can be absolutely minimal. Table 1 shows the summary statistics averaged across all these simulations. We find that the exact posterior outperforms the iteratively delensed band powers by 3%, 6% and 8% for and respectively.

## Vi Summary

Unless the primordial -mode power produced during inflation is very large, sophisticated analysis techniques such as delensing will be essential to provide best constraints on primordial gravitational waves. We have presented a new estimator, based on a close approximation to the exact posterior of the tensor-mode amplitude. By careful Monte-Carlo investigations of corrections to the approximation, we have demonstrated that it is unbiased and very close to optimal, providing the tightest possible constraints on primordial gravitational waves from CMB data. The estimator uses fast, joint estimation of the best lensing deflection map and of the unlensed CMB.

This first investigation used a simplified setting, including periodic sky patches, no analysis mask, and usage of homogeneous noise, facilitating both the iterative lens reconstruction and the unlensed - recovery from the observed lensed Stoke polarization data. These assumptions did not play a key role in obtaining our results, since the presence of the lensing deflections and data realization dependence break isotropy and prevent the existence of trivial basis to work with. Usage of Monte-Carlo simulations and inversion methods akin to conjugate-gradient seem unavoidable. Lens reconstruction and Wiener-filtering on masked data have already been demonstrated successfully Carron and Lewis (2017) with the same methods, at the cost of a manageable increase in execution time.

The posterior reconstruction has dependency on two aspects of the cosmological model used to define the CMB likelihood: the unlensed CMB scalar perturbations spectra and the lensing potential power spectrum. Changes in the lensing spectrum (or lensing deflection map prior) impacts slightly the optimal lens reconstruction, and changes in the scalar spectra the unlensed and polarization field reconstruction. Hence, formally, usage of a slightly different fiducial model might lead to a slightly different result. However, all these power spectra are extremely well constrained in practice from observations, including the lensing spectrum, so this is unlikely to bring significant biases. Furthermore, if necessary it is possible to include the uncertainty in the power spectra in the posterior, by obtaining the linear response to the spectra and extending in this way the posterior density, in analogy to the way state-of-art lensing spectrum reconstruction likelihoods are built Ade et al. (2016b); Sherwin et al. (2016). One can also go a step further and marginalize directly over these using the empirical spectra, as demonstrated by Ref. Aghanim et al. (2018). We also restricted our analysis to the extraction of the tensor-to-scalar ratio assuming a fixed template shape of the tensor spectrum. However, this is not a limitation of this approach; using high-quality observations the obvious generalization of this framework will be able to distinguish features as well.

We have compared the performance of our estimator to that of more traditionally planned -mode band powers extraction. We found standard, well-demonstrated analytical likelihood models are able to describe meaningfully the delensed band powers, and we have used these likelihoods on nominal, quadratic estimator delensed and iteratively delensed band powers. The performances of these delensed band powers do match naive expectations. For the configuration studied here, targeting the recombination peak of the -mode spectrum with noise levels in line with expectations from a CMB stage-IV experiment, our new estimator does outperform the iteratively delensed band powers by a realization-dependent amount, also depending on the exact value of , reaching on average for small values. Producing the posterior PDF for as we did is more expensive numerically than producing band powers. Nevertheless, we demonstrated in this paper that the analysis was possible, providing constraints optimal by construction, and improving prospects on detecting a tantalizing component of modern cosmology.

###### Acknowledgements.
I thank Antony Lewis for many discussions and useful comments on the manuscript, Anthony Challinor for early discussions that eventually lead to the material presented this paper, and Jesús Torrado for discussions on Gaussian vectors with large dense covariance matrices. The research leading to these results has received funding from the European Research Council under the European Union’s Seventh Framework Programme (FP/2007-2013) / ERC Grant Agreement No. [616170]. This research used resources of the National Energy Research Scientific Computing Center, a DOE Office of Science User Facility supported by the Office of Science of the U.S. Department of Energy under Contract No. DE-AC02-05CH11231

## Appendix A Lensing map posterior curvature

This appendix describes in more details the lensing map posterior curvature matrix , defined as the matrix of second variation of the lensing map log-posterior,

 HLL′≡δ2S[ϕ]δϕLδ¯ϕL′∣∣∣ϕ=ϕ∗(r), with S[ϕ]≡12XdatCov−1ϕXdat+12lndetCovϕ+12∑L|ϕL|2Cϕϕ,fidL. (35)

where the first two terms on the right-hand side come from the lensing map likelihood, and the last term from the Gaussian prior. In particular, in this appendix we obtain the result that the data-dependent part of the matrix can be applied in linear time to a vector (at the cost of one CMB Wiener-filtering). This makes the calculation of its log-determinant and inverse possible. The data-independent part poses no particular problems. The Gaussian prior curvature is trivial, and for the second term in Eq. (35) we adopt the same very accurate approximation as in Sec. III.1,

 12lndetCovϕ≃12lndetCovϕ≡0+12∑LL′ϕL¯ϕ′Lδ212lndetCovϕδϕLδ¯ϕL′∣∣ ∣∣ϕ≡0≡12lndetCovϕ≡0+12∑LRL|ϕL|2, (36)

where is the -induced mean-field linear response (hence also the desired curvature term), obtained analytically following Ref. Carron and Lewis (2017) and reproduced briefly at the end of this appendix.

To obtain the data-dependent part of the curvature, we calculate for convenience the curvature with respect to the two position-space deflection components of the flat sky, where . Once this is done, it is straightforward to apply the matrix to a lensing potential vector by expanding it into these two components, applying , and re-projecting eventually onto the potential Fourier harmonics. The data-dependent curvature splits into two terms which we describe next:

 Habdat(x,y)=12Xdat⋅Cov−1ϕδCovϕδαa(x)Cov−1ϕδCovϕδαb(y)Cov−1ϕXdat+(a,x↔b,y)(≡2HabF(x,y))−12Xdat⋅Cov−1ϕδ2Covϕδαa(x)δαb(y)Cov−1ϕXdat(≡Habdet(x,y)). (37)

When the data covariance precisely matches (that is, for our simulations, for fiducial values close to the true value in ). the term coincides, on average, with the Fisher information matrix on the displacement field. Under these conditions, and setting , it is precisely the isotropic inverse lensing reconstruction noise for the deflection angle calculated with the unlensed CMB spectra in the weightsOkamoto and Hu (2003). The second term is subdominant: on average, and again when the data covariance matches , this term would be cancelled by a contribution from . This term would also cancel one of the two factors in Eq. (37). The complete relation Eq. 35 for the lensing potential Fourier modes under these conditions is

 (38)

which can serve as a useful consistency check of the data-dependent curvature calculation.

To proceed, we need the first and second derivatives of the covariance. To simplify notation, we introduce

 ξϕ≡DϕCunlD†ϕ,ξϕ,a≡Dϕ∇aCunlD†ϕ,andξϕ,ab≡Dϕ∇a∇bCunlD†ϕ (39)

where the operator is defined by the block-diagonal matrix Fourier representation , and similarly for with an additional factor . The following relations hold

 δCovϕijδαa(x)=B(xi,x)(ξϕ,aB†)(x,xj)−(Bξϕ,a)(xi,x)(B†)(x,xj), (40)

and

 δ2Covϕijδαa(x)δαb(y)=B(xi,x)(ξϕ,abB†)(y,xj)δD(x−y)−B(xi,x)(ξϕ,ab)(x,y)(B†)(y,xj)−(B)(xi,y)(ξϕ,ab)(y,x)(B†)(x,xi)+(Bξϕ,ab)(xi,y)(B†)(x,xj)δD(x−y). (41)

There are implicit sums over Stokes indices in the above equations. We introduce further the following notation for the inverse-variance weighted CMB maps and Wiener-filtered CMB

 ¯X≡B†Cov−1ϕXdatandXWFϕ≡ξϕB†Cov−1ϕXdat=ξϕ¯X. (42)

The subscript on the Wiener-filtered CMB is present to emphasize that these are the lensed Wiener-filtered delensed CMB maps, in contrast to Eq. (9). With this we may write (full indices)

 [δCovϕδαb(y)Cov−1ϕXdat](xj)=B(xj,y)XWF,b(y)−(Bξϕ,b)(xj,y)¯X(y) (43)

The Fisher-like terms contract two such vectors, with an inverse covariance in the middle. When doing the contraction, the operator appear, and derivatives of on the left-hand side gets a minus sign because the first derivative is anti-symmetric. The result is

 HabF(x,y)=12XWFϕ,a(x)Kϕ(x,y)XWFϕ,b(y)−12¯X(x)(ξϕ,aKϕξϕ,b)(x,y)¯X(y)−12XWFϕ,a(x)(Kϕξϕ,b)(x,y)¯X(y)+12¯X(x)(ξϕ,aKϕ)(x,y)XWF,b(y) (44)

This matrix is positive definite for any model. To apply this a vector , we can do as follows

 ∫d2yHabF(x,y)vb(y)=XWFϕ,a(x)(V−W)(x)+¯X(x)(ξ,a(V−W))(x) (45)

with

 V=KϕXWFϕ,bvbW=Kϕξϕ,b¯Xvb. (46)

There is thus only one set of maps to inverse filter, the difference between and . There remains the term , which comes from the second derivative of the covariance in Eq. (37). It is also a contraction of these types of maps. Explicitly,

 Habdet(x,y)=¯X(x)ξϕ,ab(x,y)¯X(y)−δD(x−y)¯X(x)XWF,ab(x), (47)

which is easily applied to input vectors. The full matrix, after rescaling by a simple isotropic approximation, is close enough to unity that its inverse can be applied to a vector with conjugate gradient inversion without much difficulty.

Finally, we reproduce for completeness the expression for the mean-response used in several places in this work, defined above in Eq. (36). Again, we state the result for the two components of the displacement field, defined as

 Rab(x,y)≡δ212lndetCovϕδαa(x)δαb(x)∣∣ ∣∣ϕ≡0. (48)

Since this is evaluated for vanishing deflection, the matrix is only a function of . The result follows straightforwardly from the covariance first and second variations, Eqs. (39) and (41):

 −Rab(r)=(ξ0,aK0)(r)(ξ0,bK0)(r)+K0(r)(ξ0,aK0ξ0b)(r)−K0(r)ξ0ab(r)+δD(r)(K0ξ0,ab)(r). (49)

A contraction on Stokes indices is implicit in this equation. Contracting after Fourier transformation with gives the lensing potential response . All terms are easily calculated with 2-dimensional FFT methods.

You are adding the first comment!
How to quickly get a good reply:
• Give credit where it’s due by listing out the positive aspects of a paper before getting into which changes should be made.
• Be specific in your critique, and provide supporting evidence with appropriate references to substantiate general statements.
• Your comment should inspire ideas to flow and help the author improves the paper.

The better we are at sharing our knowledge with each other, the faster we move forward.
The feedback must be of minimum 40 characters and the title a minimum of 5 characters