ArtDeco: A beam-deconvolution code for absolute CMB measurements
Key Words: methods: numerical – data analysis – cosmic microwave background

We present a method for beam-deconvolving cosmic microwave background (CMB) anisotropy measurements. The code takes as input the time-ordered data along with the corresponding detector pointings and known beam shapes, and produces as output the harmonic $a^T_{\ell m}$, $a^E_{\ell m}$, and $a^B_{\ell m}$ coefficients of the observed sky. From these one can derive temperature and Q and U polarisation maps. The method is applicable to absolute CMB measurements with wide sky coverage, and is independent of the scanning strategy. We tested the code with extensive simulations, mimicking the resolution and data volume of the Planck 30GHz and 70GHz channels, but with exaggerated beam asymmetry. We applied it to multipoles up to $\ell=1700$ and examined the results in both pixel space and harmonic space. We also tested the method in the presence of white noise.
1 Introduction
Removing systematic effects plays an important role in the data analysis of modern high-sensitivity cosmic microwave background (CMB) experiments, such as the WMAP and Planck missions (Jarosik et al. 2011; Tauber et al. 2010). In this work we concentrate on one source of systematic effects, the beam asymmetry. We present an efficient beam deconvolution code, called artDeco. It can be applied to any absolute CMB experiment with sufficient sky coverage.
The required input consists of the time-ordered data (TOD), the corresponding detector pointings, and known beam shapes. The primary output of the code consists of the harmonic $a^T_{\ell m}$, $a^E_{\ell m}$, and $a^B_{\ell m}$ coefficients of the sky. From these one can derive temperature and Q and U polarisation maps.
This is the usual deconvolution map-making problem (see, e.g., Armitage & Wandelt 2004; Armitage-Caplan & Wandelt 2009; Harrison et al. 2011). Our method differs from the earlier works by an efficient reformulation of the deconvolution problem through heavy use of Wigner functions. The formulation is general, and makes no assumptions about the scanning pattern. Similar ideas have been studied by G. Prézeau, independently of us (private communication).
In the CMB literature the concept of map-making often refers to a procedure that involves removal of correlated noise. Methods for doing this have been treated in several papers (Keihänen et al. 2005, 2010; Poutanen et al. 2006; Ashdown et al. 2007a,b; Kurki-Suonio et al. 2009; Sutton et al. 2009). The methods discussed in these works construct pixelised temperature and Q, U polarisation maps, but do not attempt to correct for an asymmetric beam shape.
Our method steps in at the point where the correlated noise has already been cleaned from the data. The input TOD is assumed to include signal and uncorrelated noise only. We assume here that the correlated noise component can be removed from the data to a sufficient degree prior to the beam deconvolution step. The noise removal step itself is affected by beam asymmetries, because signal differences caused by beam mismatch can be falsely interpreted as noise. Fortunately, in destriping methods it is possible to strongly reduce this effect by masking out the regions with the strongest foreground contamination, which generate most of the effect. Also, destriping methods produce as natural output the noise-cleaned TOD, which is the required input for deconvolution map-making. Accordingly, destriping methods are the natural counterparts of our deconvolution method.
Since the aim of this paper is to present a new deconvolution method, rather than assess the performance of any particular instrument, we tested the code with a somewhat idealised simulation. In particular, we ignored other systematic effects such as frequency response. On the other hand, we chose strongly asymmetric beam shapes to show the beam effects more clearly.
2 Beam deconvolution
2.1 Constructing the linear system
In this section we present the deconvolution problem using a formalism based on Wigner functions. We write the solution in a form that allows us to solve the problem numerically in an efficient way. Some details are left for Appendix A.
We start by writing the signal received by a detector at a given time as
$t_j = \sum_{\ell m k} D^{\ell}_{mk}(\omega_j)\, b_{\ell k}\, a_{\ell m} + n_j .$   (1)
The $a_{\ell m}$ are harmonic coefficients that represent the temperature and polarisation of the sky signal, $b_{\ell k}$ are the corresponding beam coefficients, $n_j$ represents noise, and $D^{\ell}_{mk}(\omega_j)$ is a Wigner function that defines a rotation of the beam from a fiducial orientation at the north pole to its actual position and orientation. For exact definitions, see Appendix A.
We have adopted a compact notation where $\omega$ denotes a combination of three angles $(\theta,\phi,\psi)$, which define the detector’s orientation. Angles $\theta$ and $\phi$ define a point on the celestial sphere, and $\psi$ defines the beam orientation. The angles can be interpreted as Euler rotation angles.
We perform a simple linear least-squares fit of the $a_{\ell m}$ to the data, up to some multipole $\ell_{\max}$.
In general, a linear leastsquares fit leads to an equation of the form
$(A^\dagger N^{-1} A + S^{-1})\,\hat a = A^\dagger N^{-1} t ,$   (2)
where $a$ is the vector of unknowns, $t$ is the data vector, $N$ is the noise covariance, and $S$ represents a priori information on the covariance of $a$ (Wiener 1949). Matrix $A$ relates the unknowns to the data through
$t = A\,a + n .$   (3)
In the present case the elements of matrix $A$ are

$A_{j,(\ell m)} = \sum_k D^{\ell}_{mk}(\omega_j)\, b_{\ell k} ,$   (4)

and the vector $a$ represents the full set of coefficients $a_{\ell m}$.
We made two simplifying assumptions, which are, however, not of a fundamental nature. Firstly, we assume the noise is white with constant variance, and assign equal weights to all samples $t_j$. Secondly, we use no a priori information on the sky signal, i.e. the harmonic coefficients are assumed to be uncorrelated and may host a signal of infinite variance. Under these assumptions the covariance matrices drop out of the equation ($N=\sigma^2 I$, $S^{-1}=0$). The least-squares solution for $a$ is obtained by solving the linear system of equations
$A^\dagger A\, a = A^\dagger t .$   (5)
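As a concrete illustration of the normal equations $A^\dagger A\,a = A^\dagger t$, the toy sketch below fits two unknowns to three noise-free samples. The design matrix here is arbitrary and has nothing to do with the Wigner-function form of $A$ used in this paper; it only shows the mechanics of the least-squares step.

```python
# Toy illustration of the normal equations A^T A a = A^T t (real-valued case).
# The 3x2 design matrix is arbitrary, chosen only for demonstration.

def matmul(X, Y):
    """Multiply two matrices given as lists of rows."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

def transpose(X):
    return [list(col) for col in zip(*X)]

def solve2x2(M, v):
    """Solve a 2x2 system by Cramer's rule."""
    det = M[0][0] * M[1][1] - M[0][1] * M[1][0]
    return [(v[0] * M[1][1] - v[1] * M[0][1]) / det,
            (M[0][0] * v[1] - M[1][0] * v[0]) / det]

A = [[1.0, 0.0], [1.0, 1.0], [1.0, 2.0]]   # 3 samples, 2 unknowns
a_true = [2.0, -1.0]
t = [sum(A[i][j] * a_true[j] for j in range(2)) for i in range(3)]  # noise-free data

AtA = matmul(transpose(A), A)               # left-hand side of Eq. (5)
Att = [sum(A[i][j] * t[i] for i in range(3)) for j in range(2)]  # right-hand side
a_hat = solve2x2(AtA, Att)
```

For noise-free data the fit recovers the input coefficients exactly; in the full problem the same system is solved iteratively rather than by direct inversion.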
In the following we process Eq. (5) further to bring it into a form in which it can be solved efficiently. We treat the right- and left-hand sides separately.
2.1.1 Right-hand side
The right-hand side yields

$(A^\dagger t)_{\ell m} = \sum_k b^*_{\ell k} \sum_j D^{\ell *}_{mk}(\omega_j)\, t_j = \sum_k b^*_{\ell k}\, T^{\ell}_{mk} .$   (6)
Here we define

$T^{\ell}_{mk} = \sum_j D^{\ell *}_{mk}(\omega_j)\, t_j .$   (7)
The triplets $\omega_j=(\theta_j,\phi_j,\psi_j)$ define the detector orientation and are provided as input to the code. To numerically evaluate the transform, we replace the sum over TOD samples by a sum over a three-dimensional grid as follows. We divide the space spanned by all possible triplets $(\theta,\phi,\psi)$ into bins of equal volume and denote by $t_n$ the sum of the sample values falling into bin $n$. In $(\theta,\phi)$ we use HEALPix pixelisation (Górski et al. 2005; http://sourceforge.net/projects/healpix), and in $\psi$ we divide the range $[0,2\pi)$ uniformly. Most elements of $t_n$ are zero, since a particular HEALPix pixel is typically observed in only a few orientations of the beam.
We call the quantity $t_n$ the 3D signal map, and the quantity

$T^{\ell}_{mk} = \sum_n D^{\ell *}_{mk}(\omega_n)\, t_n$   (8)

the Wigner transform of the signal map.
Here the sum runs over all bins $n$. Since most bins are empty, the sum needs to be carried out only over the subset of occupied bins. Expansion (8) can be evaluated numerically quite efficiently using FFTs and fast algorithms for the generation of Wigner functions.
The quantities $t_n$ and $T^{\ell}_{mk}$ carry no memory of the order in which the sky was scanned. They merely record which pixels of the sky were observed in which orientations of the beam.
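The accumulation of samples into the 3D signal and hit maps can be sketched as follows. This is a simplified stand-in: a uniform grid in $(\theta,\phi)$ replaces the HEALPix pixelisation, and the sample values and bin counts are invented for illustration only.

```python
import math
from collections import defaultdict

def bin_tod(samples, n_theta=8, n_phi=16, n_psi=4):
    """Accumulate TOD samples into sparse 3D maps keyed by (theta, phi, psi) bin.

    samples: iterable of (theta, phi, psi, value) with angles in radians.
    Returns two dicts: the signal map t_n and the hit map h_n
    (only occupied bins are stored, since most bins are empty).
    """
    t_map = defaultdict(float)
    h_map = defaultdict(int)
    for theta, phi, psi, val in samples:
        key = (min(int(theta / math.pi * n_theta), n_theta - 1),
               int(phi / (2 * math.pi) * n_phi) % n_phi,
               int(psi / (2 * math.pi) * n_psi) % n_psi)
        t_map[key] += val   # sum of sample values in the bin
        h_map[key] += 1     # number of hits in the bin
    return t_map, h_map

# Two of the three (made-up) samples land in the same bin and are summed.
tod = [(0.5, 1.0, 0.1, 2.0), (0.5, 1.0, 0.1, 3.0), (2.0, 4.0, 3.0, 1.0)]
t_map, h_map = bin_tod(tod)
```

The binned maps are then the only objects the deconvolver needs; the original sample ordering is discarded, as noted above.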
2.1.2 Left-hand side
Consider then the matrix on the left-hand side of Eq. (5),

$(A^\dagger A)_{(\ell m),(\ell' m')} = \sum_{kk'} b^*_{\ell k}\, b_{\ell' k'} \sum_j D^{\ell *}_{mk}(\omega_j)\, D^{\ell'}_{m'k'}(\omega_j) .$   (9)
This matrix is too large to be inverted directly. Instead, we use the conjugate gradient (CG) method to solve Eq. (5) iteratively.
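A minimal conjugate-gradient solver of this kind, with an optional Jacobi (diagonal) preconditioner of the type discussed in Sect. 2.2, might look as follows. The 2×2 test system is arbitrary, and the relative convergence criterion mirrors the one described in Sect. 3.2; this is a sketch, not the artDeco implementation.

```python
def cg_solve(M, b, tol=1e-12, max_iter=1000, precond=None):
    """Preconditioned conjugate gradient for a symmetric positive definite M.

    M and b are plain nested lists; precond, if given, is the diagonal of M,
    used as a Jacobi preconditioner.
    """
    n = len(b)
    x = [0.0] * n
    r = list(b)                         # residual of the all-zero starting vector
    z = [r[i] / precond[i] for i in range(n)] if precond else list(r)
    p = list(z)
    rz = sum(r[i] * z[i] for i in range(n))
    b_norm = sum(v * v for v in b)
    for _ in range(max_iter):
        Mp = [sum(M[i][j] * p[j] for j in range(n)) for i in range(n)]
        alpha = rz / sum(p[i] * Mp[i] for i in range(n))
        x = [x[i] + alpha * p[i] for i in range(n)]
        r = [r[i] - alpha * Mp[i] for i in range(n)]
        if sum(v * v for v in r) < tol * b_norm:   # relative convergence test
            break
        z = [r[i] / precond[i] for i in range(n)] if precond else list(r)
        rz_new = sum(r[i] * z[i] for i in range(n))
        p = [z[i] + (rz_new / rz) * p[i] for i in range(n)]
        rz = rz_new
    return x

M = [[4.0, 1.0], [1.0, 3.0]]
b = [1.0, 2.0]
x = cg_solve(M, b, precond=[M[0][0], M[1][1]])
```

The key property exploited here, as in the full problem, is that CG needs only matrix-vector products, never the explicit inverse.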
We pixelise the three-dimensional pointing space in the same way as we did with the signal, and denote by $h_n$ the number of hits falling into bin $n$. In line with the definition of the 3D signal map, we call $h_n$ the 3D hit map.
We replace the sum over samples by a sum over bins and write

$\sum_j D^{\ell *}_{mk}(\omega_j)\, D^{\ell'}_{m'k'}(\omega_j) = \sum_n h_n\, D^{\ell *}_{mk}(\omega_n)\, D^{\ell'}_{m'k'}(\omega_n) .$   (10)
As the next step, we factorise the Wigner functions as

$D^{\ell}_{mk}(\theta,\phi,\psi) = e^{-im\phi}\, d^{\ell}_{mk}(\theta)\, e^{-ik\psi} ,$   (11)
where the $d^{\ell}_{mk}$ are the reduced Wigner functions. Inserting this into (10), we obtain
$\sum_n h_n\, D^{\ell *}_{mk}(\omega_n)\, D^{\ell'}_{m'k'}(\omega_n) = \sum_{\theta} d^{\ell}_{mk}(\theta)\, d^{\ell'}_{m'k'}(\theta)\, H_{m-m',\,k-k'}(\theta) ,$   (12)

where we have defined

$H_{\mu\kappa}(\theta) = \sum_{n\in\theta} h_n\, e^{i\mu\phi_n}\, e^{i\kappa\psi_n} .$   (13)
The sums over $n$ are carried out over the occupied 3D bins, grouped by the colatitude $\theta$ of the bin. Substituting this back into (9) and rearranging, we bring the left-hand side of the deconvolution equation into the form
$(A^\dagger A\, a)_{\ell m} = \sum_k b^*_{\ell k} \sum_{\theta} d^{\ell}_{mk}(\theta) \sum_{m'k'} H_{m-m',\,k-k'}(\theta) \sum_{\ell'} d^{\ell'}_{m'k'}(\theta)\, b_{\ell' k'}\, a_{\ell' m'} .$   (14)
This can be evaluated much more quickly than the original formula (9), since the summation over the long TOD vector has been replaced by an operation on the much more compact object $H_{\mu\kappa}(\theta)$. Also, the convolution operations in the indices $m$ and $k$ can be carried out efficiently with FFTs.
The factors in Eq. (14), read from right to left, are applied in that order in the code.
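The reduced Wigner functions $d^{\ell}_{mk}(\theta)$ that appear throughout this section can be evaluated, for small $\ell$, directly from Wigner's explicit factorial formula; a dependency-free sketch follows. Production codes, including artDeco, use faster recursive algorithms instead, and sign conventions for $m \ne k$ vary between references.

```python
import math

def wigner_d(l, m, k, theta):
    """Reduced Wigner function d^l_{mk}(theta) from Wigner's explicit formula,
    in a convention with d^l_{mk}(0) = delta_{mk}. Suitable only for small l,
    where the factorials do not overflow."""
    pref = math.sqrt(math.factorial(l + m) * math.factorial(l - m)
                     * math.factorial(l + k) * math.factorial(l - k))
    c, s = math.cos(theta / 2.0), math.sin(theta / 2.0)
    total = 0.0
    # Sum over all j for which every factorial argument is non-negative.
    for j in range(max(0, k - m), min(l - m, l + k) + 1):
        denom = (math.factorial(l - m - j) * math.factorial(l + k - j)
                 * math.factorial(j) * math.factorial(j + m - k))
        total += ((-1) ** j / denom) * c ** (2 * l + k - m - 2 * j) * s ** (m - k + 2 * j)
    return pref * total
```

For example, this reproduces the elementary case $d^1_{00}(\theta) = \cos\theta$, and returns the identity matrix at $\theta = 0$.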
2.2 Formulation through 3j symbols and preconditioner
Eq. (9) can be written in a yet more compact form.
The product of two Wigner functions can be expressed in terms of $3j$ symbols (Appendix A.3) as

$D^{\ell}_{mk}(\omega)\, D^{\ell'}_{m'k'}(\omega) = \sum_{\ell''} (2\ell''+1) \begin{pmatrix} \ell & \ell' & \ell'' \\ m & m' & m'' \end{pmatrix} \begin{pmatrix} \ell & \ell' & \ell'' \\ k & k' & k'' \end{pmatrix} D^{\ell'' *}_{m''k''}(\omega) ,$   (15)

where $m'' = -(m+m')$ and $k'' = -(k+k')$.
Substituting this into (10) and that into (9), we arrive at the formula
$(A^\dagger A)_{(\ell m),(\ell' m')} = \sum_{kk'} (-1)^{m-k}\, b^*_{\ell k}\, b_{\ell' k'} \sum_{\ell''} (2\ell''+1) \begin{pmatrix} \ell & \ell' & \ell'' \\ -m & m' & m-m' \end{pmatrix} \begin{pmatrix} \ell & \ell' & \ell'' \\ -k & k' & k-k' \end{pmatrix} H^{\ell''}_{m-m',\,k-k'} .$   (16)
Indices $\ell$ and $\ell'$ run from 0 to $\ell_{\max}$. The sum over $\ell''$ covers the values for which $|\ell-\ell'| \le \ell'' \le \ell+\ell'$. We have defined the Wigner transform of the hit map as

$H^{\ell}_{mk} = \sum_n D^{\ell *}_{mk}(\omega_n)\, h_n .$   (17)
We use the formula (16) to explicitly construct the diagonal of the matrix, which serves as a preconditioner to speed up the CG iteration.
The formulation presented here is general, and makes no assumptions on the scanning pattern or on the sky coverage.
3 Implementation
3.1 User interface
The artDeco code takes two lists of input data objects, containing for each detector to be processed simultaneously the beam coefficients and the so-called 3D maps. The beam coefficients can be read from FITS files in the format that is also used by the HEALPix package for storing spherical harmonic coefficients.
The 3D maps contain the quantities $t_n$ and $h_n$ for a given detector; they are created from time streams of detector pointings and sky signal, respectively, by another stand-alone module called todtobin. This module achieves the angular discretisation of the detector pointings by sorting them into the pixels of a HEALPix map with a user-defined resolution parameter $N_{\rm side}$, which are in turn subdivided into $n_\psi$ bins for the discretisation in $\psi$. The choice of $N_{\rm side}$ and $n_\psi$ has direct consequences for the runtime of the deconvolver and the accuracy of its results: if high values are chosen, the 3D map becomes very large, and the deconvolver requires more memory and CPU time. On the other hand, too low values result in noticeable discretisation errors and inaccurate output $a_{\ell m}$. As a guideline, $N_{\rm side}$ should be chosen in such a way that the smallest features of all beams are well resolved at the angular resolution

$\theta_{\rm res} \approx \sqrt{\pi/3}\,/\,N_{\rm side}$   (18)

associated with this $N_{\rm side}$. In a similar fashion, $n_\psi$ should be large enough to resolve the azimuthal features of the beams (assuming the beams are centred on a pole). For completely axisymmetric beams, a single $\psi$ bin is sufficient. Generally, an $n_\psi$ of at least a few times $k_{\max}$ (where $k_{\max}$ is the highest azimuthal multipole used to describe the beams) is a good choice.
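The guidelines above can be condensed into a small helper. The resolution formula is the standard HEALPix pixel size $\sqrt{\pi/3}/N_{\rm side}$, but the safety factor of three pixels per FWHM and the $2k_{\max}+1$ choice for $n_\psi$ are our illustrative assumptions, not prescriptions taken from todtobin.

```python
import math

def healpix_resolution_arcmin(nside):
    """Approximate HEALPix pixel size (square root of the pixel area)."""
    return math.degrees(math.sqrt(math.pi / 3.0) / nside) * 60.0

def suggest_params(fwhm_arcmin, kmax, pixels_per_fwhm=3):
    """Suggest (Nside, n_psi): the smallest power-of-two Nside that resolves
    the beam, and enough psi bins to resolve azimuthal structure up to kmax."""
    nside = 1
    while healpix_resolution_arcmin(nside) > fwhm_arcmin / pixels_per_fwhm:
        nside *= 2
    n_psi = max(1, 2 * kmax + 1)   # axisymmetric beams (kmax = 0) need one bin
    return nside, n_psi

nside, n_psi = suggest_params(32.2, kmax=6)   # 30GHz-like beam of Table 1
```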
The computational complexity of todtobin is approximately $O(N_{\rm samp}\log N_{\rm samp})$, where $N_{\rm samp}$ is the total number of TOD samples, since its runtime is dominated by sorting the input TOD according to HEALPix pixel number and $\psi$ bin. Consequently, $N_{\rm side}$ and $n_\psi$ only have a significant effect on the size of the output 3D maps. Typical execution times for todtobin with several billion samples (as used for the experiments shown in this paper) are of the order of a few minutes.
ArtDeco takes several user-supplied parameters that influence its behaviour. Most important among these are $\ell_{\max}$, which specifies the maximum multipole up to which the deconvolution is carried out, and $k_{\max}$, which limits the maximum azimuthal multipole used for the beams. The explicit introduction of $k_{\max}$ is advantageous, since most beams can be expressed accurately even when using a $k_{\max}\ll\ell_{\max}$ (even $k_{\max}=0$ if the beam has rotational symmetry and is unpolarised), and the reduction in $k_{\max}$ reduces CPU and memory requirements dramatically.
ArtDeco’s output consists of a set of $a_{\ell m}$ in HEALPix-compatible format. Depending on whether the user requested polarisation, the file contains only the $a^T_{\ell m}$, or additionally the $a^E_{\ell m}$ and $a^B_{\ell m}$ coefficients.
3.2 Algorithm overview
The deconvolution algorithm has been implemented in C++, making use of the MPI library for parallelisation. The quantities that depend on $\theta$, such as $d^{\ell}_{mk}(\theta)$ and $H_{\mu\kappa}(\theta)$, are distributed among the MPI tasks according to the $\theta$ angle. This allows a fairly good load balance. Summation over $\theta$, when needed, is done efficiently through the collective MPI_Reduce() routine. The $a_{\ell m}$ coefficients are distributed according to the index $\ell$. The beam coefficients are fully present on all tasks.
The procedure of evaluating Eq. (14) can be summarised as follows. First we multiply the input $a_{\ell m}$ by the beam coefficients $b_{\ell k}$. Then we perform a loop over the index $\ell$, inside which we broadcast the coefficients for the current $\ell$ to all processes, multiply by $d^{\ell}_{mk}(\theta)$, constructed for the local values of $\theta$, and accumulate over $\ell$. The multiplication by matrix $H$ is a convolution operation in the indices $m$ and $k$ and can be performed efficiently using 2D FFTs. Each MPI task performs the multiplication by $H$ independently for its local $\theta$ values. After that, we make another loop over the index $\ell$, construct the Wigner matrices again, sum over $\theta$, collect the result according to $\ell$ to the owner process, and multiply by the beam coefficients.
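The claim that multiplication by $H$ is a circular convolution in the indices $m$ and $k$, and hence FFT-friendly, can be checked on a one-dimensional toy example. Plain $O(n^2)$ DFTs stand in for FFTs here to keep the sketch dependency-free; the sequences are arbitrary.

```python
import cmath

def dft(x, sign=-1):
    """Plain discrete Fourier transform; sign=+1 gives the unnormalised inverse."""
    n = len(x)
    return [sum(x[k] * cmath.exp(sign * 2j * cmath.pi * j * k / n)
                for k in range(n)) for j in range(n)]

def circular_convolution_direct(h, f):
    """Direct O(n^2) circular convolution."""
    n = len(h)
    return [sum(h[(i - j) % n] * f[j] for j in range(n)) for i in range(n)]

def circular_convolution_fft(h, f):
    """Convolution theorem: conv(h, f) = IDFT(DFT(h) * DFT(f)) / n."""
    n = len(h)
    H, F = dft(h), dft(f)
    prod = [H[i] * F[i] for i in range(n)]
    return [v / n for v in dft(prod, sign=+1)]

h = [1.0, 2.0, 0.0, -1.0]
f = [0.5, 0.0, 1.0, 0.0]
direct = circular_convolution_direct(h, f)
via_fft = circular_convolution_fft(h, f)
```

In the full code the same identity is applied in two dimensions ($m$ and $k$) with genuine FFTs, turning an $O(n^2)$ convolution per $\theta$ row into an $O(n\log n)$ operation.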
The solution is assumed to have converged as soon as the residual of the current estimate is less than a preset small fraction of the residual of a vector containing all zeroes.
To generate the $3j$ symbols needed to construct the preconditioner, we use a C implementation of the recursive algorithm described by Schulten & Gordon (1975).
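As a cross-check of the $3j$ symbols, the Racah closed-form sum is straightforward to implement for integer arguments. The Schulten & Gordon recursion is preferable at high multipoles, where the factorials in this direct form overflow; the sketch below is for illustration at low $\ell$ only.

```python
from math import factorial, sqrt

def wigner_3j(j1, j2, j3, m1, m2, m3):
    """Wigner 3j symbol for integer arguments, via the Racah formula."""
    if m1 + m2 + m3 != 0 or not abs(j1 - j2) <= j3 <= j1 + j2:
        return 0.0                      # selection rules: m-sum and triangle
    if abs(m1) > j1 or abs(m2) > j2 or abs(m3) > j3:
        return 0.0
    delta = sqrt(factorial(j1 + j2 - j3) * factorial(j1 - j2 + j3)
                 * factorial(-j1 + j2 + j3) / factorial(j1 + j2 + j3 + 1))
    pref = sqrt(factorial(j1 + m1) * factorial(j1 - m1)
                * factorial(j2 + m2) * factorial(j2 - m2)
                * factorial(j3 + m3) * factorial(j3 - m3))
    total = 0.0
    kmin = max(0, j2 - j3 - m1, j1 - j3 + m2)
    kmax = min(j1 + j2 - j3, j1 - m1, j2 + m2)
    for k in range(kmin, kmax + 1):
        total += (-1) ** k / (
            factorial(k) * factorial(j1 + j2 - j3 - k)
            * factorial(j1 - m1 - k) * factorial(j2 + m2 - k)
            * factorial(j3 - j2 + m1 + k) * factorial(j3 - j1 - m2 + k))
    return (-1) ** (j1 - j2 - m3) * delta * pref * total
```

This reproduces tabulated values such as $(1\,1\,2;0\,0\,0)=\sqrt{2/15}$ and vanishes whenever the selection rules are violated.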
4 Simulation setup
The deconvolution algorithm was tested using mock data produced by the Planck simulation package (LevelS, Reinecke et al. 2006).
4.1 Mission parameters
We assumed a scanning strategy that caused the telescope’s spin axis to perform two cycloidal excursions from the ecliptic plane per year; in the azimuthal direction, the spin axis always pointed towards the Sun.
Time-ordered data were generated for a time span of 10 000 hours, corresponding to slightly more than two full sky surveys; the length of the data set is of the order of $10^9$ samples.
4.2 Detector geometry
For the first set of experiments, data were simulated for two horns, i.e. two pairs of detectors, modelled roughly after Planck’s 30GHz channel. In particular, the detectors had an average FWHM of 32.2′, the polarisation directions in each detector pair were rotated by about (but not exactly) 90° against each other, and the polarisation directions of the two horns were in turn shifted by 45°. Both horns followed the same scan path on the sky, though with a small time delay.
The beam shapes were assumed to be elliptical, with ellipticities of 1.7 and 1.8 for each detector in a pair, respectively. These values are considerably higher than in a typical experiment and were chosen to demonstrate the performance of the method even under unfavourable conditions.
In order to show the algorithm’s performance at higher resolutions, another data set was produced using two horns sensitive at 70GHz and with an average FWHM of 13′.
Table 1 gives an overview of the detector properties.
Name  Frequency  Ellipticity  FWHM      σ (K)
–     30GHz      1.7          32.2352′  514
–     30GHz      1.8          32.1377′  514
–     30GHz      1.7          32.2352′  514
–     30GHz      1.8          32.1377′  514
–     70GHz      1.7          13.0328′  –
–     70GHz      1.8          12.9916′  –
–     70GHz      1.7          13.0328′  –
–     70GHz      1.8          12.9916′  –
4.3 Input data
To generate a theoretical power spectrum for the CMB emission, we used the CAMB code (http://camb.info) with standard values of the cosmological parameters. Using the HEALPix tool syn_alm_cxx, we created a random, unconstrained realisation of polarised $a_{\ell m}$ from this power spectrum.
Emission maps of foreground components were generated using the Planck pre-launch Sky Model (PSM, Delabrouille et al. 2012; http://www.apc.univ-paris7.fr/~delabrou/PSM/psm.html), version 1.6.6. We chose to include the contributions from diffuse galactic emission (galactic synchrotron radiation, free-free emission, as well as thermal and spinning dust).
The signal from strong point sources was taken into account as well; to achieve this, a PSMgenerated catalogue of point sources was directly expanded into spherical harmonic coefficients, using a sufficiently high cutoff value that the beam response was negligible at this scale.
Polarised emission was disabled for all contributions except for the CMB, which makes it very easy to detect leakage of the strong temperature signal in the galactic plane into the polarisation signal.
We neglected bandpass effects and assumed an identical delta-function frequency response for all detectors.
The power spectra of the signal components at 30 GHz are plotted in Fig. 1.
4.4 Noise
In some of our simulations, detector noise was added to the idealised sky signal. Since the deconvolution method requires correlated noise to have been removed from the time-ordered data beforehand, we only added Gaussian white noise with an appropriately chosen standard deviation to the simulated samples (see Table 1).
4.5 3D map generation
The mock TOD were processed with the todtobin tool to generate the 3D input maps required by the deconvolver. We set the HEALPix parameter $N_{\rm side}$ separately for the 30GHz and the 70GHz case; $n_\psi$ was always set to 256.
5 Results
5.1 Constructing a sky map
The primary output of the deconvolution code consists of the $a^X_{\ell m}$ (X = T, E, B) coefficients of the sky, up to $\ell=\ell_{\max}$. From these one can construct the usual temperature and Q, U polarisation maps through a harmonic expansion of the coefficients. We did this with the alm2map_cxx tool of the HEALPix package.
For comparison, we also produced binned maps directly from the TOD. A binned map is constructed by assigning each TOD sample entirely to the pixel containing the centre of the beam. The binning operation can be formally written as

$\hat m = (P^T N^{-1} P)^{-1} P^T N^{-1} t .$   (19)

Here $P$ is the pointing matrix, $N$ is the white noise covariance, and $\hat m$ is a (I,Q,U) map triplet.
Binned maps are distorted by beam effects if the beam is asymmetric. In particular, the polarisation component of a binned map is contaminated by temperature leakage through beam shape mismatch. We aim to demonstrate that deconvolution can reduce these undesired effects.
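For uniform white noise, the binning operation of Eq. (19) reduces to an independent 3×3 solve per pixel. A minimal sketch follows; the pixel index and polariser angles are made up, and the signal model $t = I + Q\cos 2\psi + U\sin 2\psi$ is the standard total-power convention, assumed here for illustration.

```python
import math

def gauss_solve(M, v):
    """Solve a small linear system by Gauss-Jordan elimination with pivoting."""
    n = len(v)
    A = [row[:] + [v[i]] for i, row in enumerate(M)]
    for c in range(n):
        piv = max(range(c, n), key=lambda r: abs(A[r][c]))
        A[c], A[piv] = A[piv], A[c]
        for r in range(n):
            if r != c:
                f = A[r][c] / A[c][c]
                A[r] = [A[r][k] - f * A[c][k] for k in range(n + 1)]
    return [A[i][n] / A[i][i] for i in range(n)]

def bin_iqu(samples):
    """Solve (P^T P) m = P^T t per pixel for m = (I, Q, U).

    samples: list of (pixel, psi, value); uniform white noise is assumed,
    so the noise covariance N drops out of Eq. (19).
    """
    acc = {}
    for pix, psi, t in samples:
        M, v = acc.setdefault(pix, ([[0.0] * 3 for _ in range(3)], [0.0] * 3))
        p = [1.0, math.cos(2 * psi), math.sin(2 * psi)]   # pointing-matrix row
        for i in range(3):
            v[i] += p[i] * t
            for j in range(3):
                M[i][j] += p[i] * p[j]
    return {pix: gauss_solve(M, v) for pix, (M, v) in acc.items()}

# One pixel observed at several polariser angles; recover (I, Q, U) = (10, 2, -1).
true_iqu = (10.0, 2.0, -1.0)
angles = [0.0, 0.4, 0.9, 1.3, 2.1]
tod = [(7, a, true_iqu[0] + true_iqu[1] * math.cos(2 * a)
           + true_iqu[2] * math.sin(2 * a)) for a in angles]
maps = bin_iqu(tod)
```

Because each sample is assigned wholly to one pixel, nothing in this operation compensates for the beam shape, which is exactly why the asymmetry leaks into the binned Q and U maps.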
Since our simulation includes point sources, the input $a_{\ell m}$ are not band-limited. However, we can only solve for the coefficients up to some limited multipole $\ell_{\max}$. This leads to ringing artefacts if a map is constructed directly from the raw coefficients. Ringing is particularly prominent around point sources and other sharp features.
Ringing is caused by the sharp cutoff of the spectrum at $\ell_{\max}$, and it occurs even if the deconvolved coefficients are fully correct up to $\ell_{\max}$. For this reason it is necessary to smooth the coefficients with a symmetric Gaussian beam prior to constructing a map. The required amount of smoothing depends on the spectrum of the underlying sky, and on $\ell_{\max}$. The highest $\ell$ that one can solve in turn depends on the beam width; in other words, it is not possible to recover structures significantly smaller than the beam itself. This gives the following rule of thumb: the required smoothing width is given by the smallest features of the original beam.
In Fig. 2 we demonstrate the ringing around a point source. We show an image of a point source, constructed from the input $a_{\ell m}$, but including only coefficients up to $\ell=800$. Ringing artefacts are evident. On the right we show the same region of the sky after the map is smoothed with a FWHM=32′ symmetric Gaussian beam. Smoothing removes the ringing, but naturally also smooths out the smallest structures in the map.
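The smoothing used here amounts to multiplying the $a_{\ell m}$ by the window function of a symmetric Gaussian beam, $b_\ell = \exp[-\ell(\ell+1)\sigma^2/2]$ with $\sigma = {\rm FWHM}/\sqrt{8\ln 2}$; a sketch (the storage layout a[ell][m] is our own choice for illustration):

```python
import math

def gauss_beam_window(fwhm_arcmin, lmax):
    """Window function b_ell of a symmetric Gaussian beam."""
    sigma = math.radians(fwhm_arcmin / 60.0) / math.sqrt(8.0 * math.log(2.0))
    return [math.exp(-0.5 * ell * (ell + 1) * sigma ** 2)
            for ell in range(lmax + 1)]

def smooth_alm(alm, window):
    """Multiply coefficients a[ell][m] by the window, softening the sharp
    cutoff at lmax that causes the ringing."""
    return [[window[ell] * a for a in row] for ell, row in enumerate(alm)]

w = gauss_beam_window(32.0, 800)
```

The window falls monotonically with $\ell$, so the highest multipoles near the cutoff, which carry the ringing, are suppressed most strongly.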
5.2 Signal-only simulations with elliptic beams
We first analyse the noise-free 30GHz simulation in detail. A detailed description of the simulation was given in Section 4. We ran the code with various combinations of the parameters $\ell_{\max}$ and $k_{\max}$, and analysed the results both in pixel space and in harmonic space. We included data from all four detectors and considered both temperature and polarisation.
5.2.1 Temperature map
In Fig. 3 we show a full-sky Mollweide projection of the binned temperature map together with that of the deconvolved map. We used the deconvolution parameters $\ell_{\max}=800$, $k_{\max}=6$.
We chose the value FWHM=32′ as the baseline smoothing width for the 30GHz maps. This corresponds to the mean width of the input beam. In some cases it was necessary to apply even stronger smoothing.
The difference between the binned and the deconvolved map does not appear dramatic in the full-sky image, but becomes evident when we zoom into a point source. Figure 4 shows a randomly selected strong point source below the galactic plane. While the source in the binned map is clearly elongated owing to the beam shape, it appears circular in the deconvolved map.
The deconvolved image can be compared with the right panel of Fig. 2, where we show the same patch of the sky constructed from the input $a_{\ell m}$ coefficients. The maps are nearly identical.
5.2.2 Polarisation maps
The beam effects show up more dramatically in the polarisation maps (Fig. 5). The temperature signals recorded by the two detectors of a pair differ because of the different beam shapes. The binning procedure erroneously interprets this difference as a polarisation signal. This leads to a leakage of temperature signal into the polarisation maps. Since our simulation did not include polarised foregrounds, the entire galactic structure seen in the binned map is temperature leakage through beam mismatch.
We show deconvolved maps with two levels of smoothing below the binned map. Smoothing with a FWHM=32′ beam is not sufficient to fully remove the galactic residual, but if we smooth the map further to FWHM=45′, the residual disappears almost completely.
In Fig. 6 we show a zoom into the Q polarisation map, just below the galactic plane. In the binned map we see a strong galactic residual, and typical “fourleaf clover” structures that arise from temperature leakage at the location of a point source. We show the deconvolved map again for two levels of smoothing: FWHM=32′ and FWHM=45′. The additional smoothing removes the clover structure nearly completely. To show that this is not only an effect of more aggressive smoothing, we smoothed the binned map further with a Gaussian beam of width FWHM=32′, which, combined with the 32′ of the original beam, gives a total smoothing of 45′. This map is shown below the unsmoothed binned map. The additional smoothing has little effect on the galactic residual.
5.2.3 Harmonic space
We proceed to analyse the deconvolution results in harmonic space. We computed the TT, EE, and TE spectra of the deconvolved $a_{\ell m}$ coefficients and compared them with the corresponding input spectra. The spectrum is constructed as

$C_\ell^{XY} = \frac{1}{2\ell+1} \sum_{m=-\ell}^{\ell} a^X_{\ell m}\, a^{Y*}_{\ell m} ,$

where $X$ and $Y$ stand for $T$ and $E$.
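The spectrum estimate can be sketched as follows, with coefficients stored as a[ell][m] for $m=-\ell\ldots\ell$ (the storage layout and the toy coefficient values are our own, chosen for illustration):

```python
def cross_spectrum(a_x, a_y):
    """C_ell^{XY} = (2 ell + 1)^{-1} sum_m a^X_{ell m} * conj(a^Y_{ell m})."""
    spec = []
    for ell, (row_x, row_y) in enumerate(zip(a_x, a_y)):
        assert len(row_x) == 2 * ell + 1 == len(row_y)
        spec.append(sum(x * y.conjugate() for x, y in zip(row_x, row_y)).real
                    / (2 * ell + 1))
    return spec

# Auto-spectrum of a toy set of coefficients up to ell = 2.
alm = [[1.0 + 0j],
       [0.5j, 1.0 + 0j, -0.5j],
       [0j, 0j, 2.0 + 0j, 0j, 0j]]
cl = cross_spectrum(alm, alm)
```

Passing two different coefficient sets gives the cross-spectra (TE); passing the same set twice gives the auto-spectra (TT, EE).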
We ran the code for all combinations of $k_{\max}=4,6$ and $\ell_{\max}=400, 500, 600, 700, 800$. Results for the different combinations are shown in Fig. 7. We applied no smoothing to the coefficients.
We observe that the optimal value for $k_{\max}$ is coupled to $\ell_{\max}$. Results for $k_{\max}=4$ and $k_{\max}=6$ begin to diverge only at the highest multipoles. The results for different $\ell_{\max}$ with $k_{\max}=6$ lie practically on top of each other, except for the highest multipoles near $\ell_{\max}$, which are clearly distorted. When constructing a map, we therefore cut out the last 20 multipoles before making the harmonic expansion.
For comparison we also show the spectrum obtained from a harmonic expansion of the binned map, which we computed using the anafast tool of the HEALPix package. This spectrum decreases rapidly as a function of $\ell$ as a result of beam smoothing. To correct for the smoothing, we divided the spectrum by the window function of a symmetric Gaussian beam with FWHM=32′. This corrects for the mean smoothing and allows one to recover the correct spectrum up to $\ell\approx300$. Above that, the result begins to deviate from the input spectrum because of beam ellipticity, which cannot be corrected for by the application of a window function alone.
Deconvolution recovers the TT input spectrum well up to $\ell=800$. At higher multipoles there is little information left in the signal, and the convergence properties of the code become poor. The weaker EE and TE spectra are recovered up to $\ell\approx400$. Owing to temperature leakage, the binned map gives the correct EE spectrum only for the first ten multipoles, and its TE spectrum is useless.
We did one run with the parameters $\ell_{\max}=600$, $k_{\max}=8$ (not shown) to see whether it would help to recover the EE spectrum to even higher multipoles. The results, however, did not differ significantly from the $\ell_{\max}=600$, $k_{\max}=6$ case.
5.2.4 Reconstruction error of harmonic coefficients
A comparison between the reconstructed spectrum and the input spectrum is not a fully reliable test of the correctness of the deconvolution result. It is possible to construct a set of harmonic coefficients that yields the correct spectrum even though the individual coefficients are incorrect.
To assess the accuracy of the coefficients directly, we constructed a quantity we call the error spectrum. It is computed as follows. We take the difference between the coefficients $\hat a^X_{\ell m}$ we obtain from deconvolution and the corresponding input coefficients $a^{X,{\rm in}}_{\ell m}$. As before, $X$ stands for T or E. We compute the spectrum of this difference as

$C_\ell^{{\rm err},X} = \frac{1}{2\ell+1} \sum_{m=-\ell}^{\ell} \left| \hat a^X_{\ell m} - a^{X,{\rm in}}_{\ell m} \right|^2 .$

The resulting spectrum is a measure of the error in the reconstruction of the $a_{\ell m}$ coefficients, as a function of scale.
We show the error spectra of the 30GHz runs for the T and E coefficients in Fig. 8. For comparison, we show the input TT or EE spectrum in the same figure. The coefficients are recovered with good accuracy when the error spectrum lies well below the input spectrum. The meaning of the blue line is the same as in Fig. 7, that is, we compute the harmonic expansion of the binned map and divide it by the window function of a symmetric Gaussian beam (FWHM=32′).
We made another comparison with an ideal window function, which we computed as the ratio of the uncorrected spectrum of the binned map and the input spectrum (green and black lines in the top panel of Fig. 7). The error spectrum for this is shown in light blue in Fig. 8 (TT spectrum only). By construction, applying the ideal window function to the spectrum of the binned map returns exactly the correct input spectrum. The error spectrum, however, is not improved with respect to the result obtained with the Gaussian window function, which shows that the failure of the binning procedure to recover the correct spectrum at high multipoles in Fig. 7 was not merely the result of a poorly chosen window function.
The error spectrum reveals that the deconvolved coefficients are more accurate than those obtained by harmonic expansion of the binned map already at the lowest multipoles.
5.3 Simulations with noise
The first simulation was rather idealised, since the data were noise-free. We therefore made another simulation, in which we added white noise to the TOD. In other respects the simulation was identical to the first one. No correlated noise was added; we implicitly assume that correlated noise components, if present, have been removed to sufficient accuracy prior to deconvolution.
As expected, noise made the deconvolution process more difficult, and the code did not converge for the highest values of $\ell_{\max}$. We reached full convergence for the combinations $\ell_{\max}=600$, $k_{\max}=6$ and $\ell_{\max}=700$, $k_{\max}=4$. The case $\ell_{\max}=700$, $k_{\max}=6$ no longer converged.
Fig. 9 shows a zoom into a point source, the same as in Fig. 4, in the presence of white noise. The deconvolution parameters were $\ell_{\max}=600$, $k_{\max}=6$. The deconvolved map smoothed with the fiducial FWHM=32′ beam is contaminated by distorted noise. We also see some ringing around the point source, owing to the moderately low $\ell_{\max}$. We therefore smoothed the map further to FWHM=45′, which was enough to remove the distortion. For comparison we also smoothed the binned map to a comparable resolution by applying an additional FWHM=32′ smoothing to it. This, combined with the original beam, gives a total resolution of roughly FWHM=45′. These two maps are shown in the bottom row. Even after the additional smoothing, the point source in the binned map is elongated. Deconvolution restores the circular shape of the source.
Figure 10 shows a zoom into the Q polarisation map. We show the same patch of the sky as in Fig. 6. The raw binned map (not shown) is fully noise-dominated. We show maps smoothed to FWHM=45′. Again, deconvolution removes the galactic leakage that is apparent in the binned map.
Figure 11 shows the spectra constructed from the deconvolved $a_{\ell m}$ for the parameter combinations for which the code converged fully. Again we also show the spectrum of the binned map, and the same spectrum corrected for a symmetric Gaussian beam with FWHM=32′. Deconvolution recovers the true TT spectrum up to the multipole where noise begins to dominate. The spectrum of the binned map begins to deviate at considerably lower multipoles. In the case of the EE spectrum, both deconvolution and binning are unable to recover the true spectrum except for the very lowest multipoles. In the case of the TE spectrum, deconvolution performs considerably better than binning, and the result follows the input spectrum, although noisily, up to high multipoles.
Finally, in Fig. 12 we show the error spectra in the presence of noise. Deconvolution recovers the $a^T_{\ell m}$ coefficients well up to high multipoles. In the case of the $a^E_{\ell m}$ coefficients, the error is well above the input spectrum for all but the lowest multipoles.
It is a well-known and often quoted fact that deconvolution tends to amplify noise at high multipoles. We see this clearly in the upper deconvolved map in Fig. 9. The effect can, however, be suppressed by applying sufficiently strong smoothing to the map, as seen in the lower panels of the same figure.
We can also look at the amplification of noise in harmonic space (Fig. 11). While the TT spectrum of the binned map (green line) continues to fall until it levels out at the noise floor (not seen in the plot), the deconvolution spectrum begins to rise at high multipoles. Note, however, that over the whole multipole range for which we have a deconvolution result, the extra power in the deconvolved map is still below that in the binned map corrected for a symmetric beam (blue line).
In the EE spectrum, the power of the deconvolved map exceeds that of the binned map corrected for a symmetric beam only at multipoles where the maps are fully noise-dominated in any case.
5.4 70GHz simulation
To test the performance of the code at higher multipoles, we generated one noise-free simulation mimicking the data volume of the Planck 70GHz channel. The elliptic beams used have an average FWHM of 13′; for more parameters see Table 1. We ran the deconvolution code with the parameters $\ell_{\max}=1700$, $k_{\max}=6$.
Temperature and polarisation maps constructed from the $a_{\ell m}$, with resolution FWHM=13′ (T) or FWHM=18′ (Q), are shown in Figs. 13 and 14 together with the corresponding binned maps. We show the same region of the sky as for the 30GHz simulations. Again, deconvolution has worked well, restoring the circular shape of the point source and removing the galactic leakage in the Q polarisation map.
In Fig. 15 we show the TT and EE error spectra for the 70GHz simulation. The $a^T_{\ell m}$ are recovered well nearly up to $\ell_{\max}=1700$, and the $a^E_{\ell m}$ well up to high multipoles. Thus we have shown that, with sufficient computational resources, our deconvolution method can be applied to realistic data sets with the volume and resolution of the Planck 70GHz channel.
5.5 Convergence
The convergence criterion we set in our code is rather strict: we require that the square of the residual vector has fallen below a preset small fraction of its initial value.
To see whether a less strict criterion would help to reduce the computation time, we studied the convergence of the solution. We dumped the $a_{\ell m}$ coefficients after every 50 steps and computed the error spectrum with respect to the input coefficients.
We show the TT and EE error spectra after 200 and 500 iteration steps in Fig. 15, along with the final results. We observe that the solution changes very little after the first few hundred iterations, and 200 steps already give a significant improvement over the harmonic expansion of the binned map. In fact, after the first 100 iteration steps we see no visible improvement in the reconstructed sky map. We could thus reduce the computation time by a large factor by using a more relaxed convergence criterion, without significantly losing accuracy.
5.6 Complicated beam shape
To demonstrate the power of the deconvolution method, we made yet another simulation with an unconventional D-shaped beam (D for “deconvolution”), which is shown in Fig. 16. The beam covers a much larger area than the elliptical beams used before. For demonstration purposes we included only point sources and considered temperature only, and only a single full-sky survey was simulated. For the binning into the 3D map, we chose suitable values of $N_{\rm side}$ and $n_\psi$; a comparatively high value of $k_{\max}$ was required to properly take into account the complicated azimuthal structure of the beam.
A map binned directly from the TOD is shown in the left panel of Fig. 17. The beam transforms the image of a point source into a D-shaped pattern resembling the beam. The deconvolved map is shown on the right; it was smoothed with a symmetric Gaussian beam of FWHM = to reduce ringing. Even with this complicated beam shape we are able to recover the underlying point sources.
It is interesting that in this case it was sufficient to smooth the map with a beam of FWHM= to obtain a very satisfactory map, although the full dimension of the beam is much larger.
6 Conclusions
We have presented an efficient beam deconvolution code, designed for absolute CMB experiments, and tested it on simulated data. The code is released under the terms of the GNU General Public License and can be obtained from http://sourceforge.net/projects/artdeco/.
We looked at maps constructed from the recovered harmonic coefficients, and examined the coefficients directly in harmonic space; these reflect different aspects of the solution. We compared the results with the harmonic expansion of a binned map. We have also shown that, with sufficient computational resources, we can extend the method to a maximum multipole of 1700, which would be sufficient for the Planck 70 GHz channel.
In the absence of noise we could recover the coefficients to high accuracy and remove the effects of beam asymmetry. When white noise was added, the recovery extended to a lower maximum multipole, but the advantage of deconvolution over binning remained clear.
In the case of polarisation, deconvolution worked well in the absence of noise: we were able to remove the temperature leakage due to beam mismatch almost completely. When noise was added, the results were less clear-cut. Deconvolution removed the visible Galactic residual that arises from beam mismatch, but in harmonic space it did not seem to bring a clear benefit over binning. The recovered coefficients were dominated by noise at nearly all multipoles, but this was true for both deconvolved and binned maps.
Our simulations used beams that were highly asymmetric and strongly mismatched, which leads to significant polarisation leakage. We did this to show the beam effects more clearly, but it also made the deconvolution problem very challenging. The accuracy that can be expected when the code is applied to a real experiment depends strongly on the beam shapes of that experiment.
Several topics remain for future study. We set a very strict convergence criterion for the CG iteration, and our convergence study indicates that a much more relaxed criterion could suffice for practical purposes. Defining a more suitable convergence criterion could reduce the computation time by a significant factor.
We applied the method to a simulation data set that provides full sky coverage. Our code is built on a formalism that makes no assumptions about the sky coverage; in practice, however, good convergence requires nearly full coverage. This is intuitively evident, since we are solving for harmonic coefficients, which necessarily represent the complete celestial sphere. If the input data cover only part of the sky, the coefficients cannot be well constrained. A straightforward extension of the method would be to insert a prior that constrains the solution in regions not covered by the data, but this is beyond the scope of this paper and is a subject for future study.
Yet another topic for further study is determining the noise bias present in the TT and EE spectra at high multipoles.
Though the simulations used in this work mimic some aspects of the Planck mission, the parameters used do not reflect the properties of the actual instrument. The results presented in this paper are thus not representative of the sensitivity of the Planck experiment, and the authors do not represent the Planck collaboration in this context.
Appendix A Definitions
In this appendix we present definitions and conventions used in the paper. We follow the conventions of Varshalovich et al. (1988).
A.1 Harmonic expansion of the CMB radiation field
The CMB temperature and polarisation fields can be represented by spin-weighted harmonic functions with spins 0 and ±2. Spin 0 represents the temperature field, while the spin ±2 components represent polarisation.
The temperature field on the celestial sphere is expanded as
$$ T(\theta,\phi) = \sum_{\ell m} a^{T}_{\ell m}\, Y_{\ell m}(\theta,\phi). \qquad (20) $$
The Q and U Stokes fields, which describe linear polarisation, can be written as
$$ (Q \pm iU)(\theta,\phi) = \sum_{\ell m} a^{\pm 2}_{\ell m}\, {}_{\pm 2}Y_{\ell m}(\theta,\phi). \qquad (21) $$
Here ${}_{s}Y_{\ell m}$ are spin-weighted harmonic functions, and $a^{s}_{\ell m}$ are the corresponding (complex) coefficients. The harmonic functions and coefficients obey the symmetry relation
$$ {}_{s}Y^{*}_{\ell m} = (-1)^{s+m}\, {}_{-s}Y_{\ell,-m}, \qquad a^{s\,*}_{\ell m} = (-1)^{s+m}\, a^{-s}_{\ell,-m}. \qquad (22) $$
The harmonic functions are related to the Wigner functions through
$$ {}_{s}Y_{\ell m}(\theta,\phi) = \lambda_{\ell}\, D^{\ell\,*}_{m,-s}(\phi,\theta,0), \qquad (23) $$
where
$$ \lambda_{\ell} = \sqrt{\frac{2\ell+1}{4\pi}}. \qquad (24) $$
It is convenient to define further
$$ a^{E}_{\ell m} = -\tfrac{1}{2}\left(a^{2}_{\ell m} + a^{-2}_{\ell m}\right), \qquad a^{B}_{\ell m} = \tfrac{i}{2}\left(a^{2}_{\ell m} - a^{-2}_{\ell m}\right). \qquad (25) $$
The CMB radiation field is expected to be a statistically isotropic Gaussian spin-0 and spin-±2 field. The statistical properties of the harmonic expansion coefficients are then fully characterised by four angular power spectra $C^{TT}_{\ell}$, $C^{EE}_{\ell}$, $C^{BB}_{\ell}$, and $C^{TE}_{\ell}$, such that
$$ \langle a^{X\,*}_{\ell m}\, a^{Y}_{\ell' m'} \rangle = C^{XY}_{\ell}\, \delta_{\ell\ell'}\, \delta_{m m'}, \qquad X,Y \in \{T,E,B\}. \qquad (26) $$
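The defining property of the spectra — that the harmonic coefficients are independent Gaussian variables whose variance is given by the power spectrum — can be illustrated numerically. This is a toy sketch with an invented two-multipole spectrum, unrelated to the artDeco code:

```python
import numpy as np

rng = np.random.default_rng(42)

def draw_alm(cl, m_is_zero=False):
    """Draw one Gaussian coefficient per multipole for the spectrum cl.
    For m != 0 the coefficient is complex with total variance cl; for
    m = 0 it is real, as required by the reality of the map."""
    n = len(cl)
    if m_is_zero:
        return rng.normal(0.0, np.sqrt(cl))
    re = rng.normal(0.0, np.sqrt(cl / 2), n)
    im = rng.normal(0.0, np.sqrt(cl / 2), n)
    return re + 1j * im

cl = np.array([1.0, 0.5])      # toy spectrum for two multipoles
samples = np.array([draw_alm(cl) for _ in range(100000)])

# Monte-Carlo estimates of the defining averages:
var = np.mean(np.abs(samples) ** 2, axis=0)              # approaches cl
cross = np.mean(samples[:, 0] * np.conj(samples[:, 1]))  # approaches 0
```

The empirical variance converges to the input spectrum, while coefficients belonging to different multipoles are uncorrelated, in line with the Kronecker deltas in the defining relation.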
A.2 Wigner functions and rotations on a sphere
The harmonic coefficients transform under rotations as follows.
Assume we are given the coefficients in one coordinate system. We then rotate the coordinate system, as described by Euler angles $(\alpha,\beta,\gamma)$:

Rotate by angle $\gamma$ around the $z$ axis (in the positive direction).

Rotate by angle $\beta$ around the initial $y$ axis.

Rotate by angle $\alpha$ around the initial $z$ axis.
The rotation is equivalent to rotating first by angle $\alpha$ around the $z$ axis, then by angle $\beta$ around the new $y$ axis, and finally by angle $\gamma$ around the new $z$ axis. The harmonic coefficients in the rotated coordinate system are given by
$$ \tilde a_{\ell m} = \sum_{m'} D^{\ell}_{m m'}(\alpha,\beta,\gamma)\, a_{\ell m'}, \qquad (27) $$
where $D^{\ell}_{m m'}$ are Wigner functions.
A Wigner function may be split into a product of three terms, each of which depends on only one of the angles:
$$ D^{\ell}_{m m'}(\alpha,\beta,\gamma) = e^{-im\alpha}\, d^{\ell}_{m m'}(\beta)\, e^{-im'\gamma}. \qquad (28) $$
Functions are the reduced Wigner functions. They have the property
(29) 
They are real-valued and obey the symmetry relation
$$ d^{\ell}_{m m'}(\beta) = (-1)^{m-m'}\, d^{\ell}_{m' m}(\beta) = (-1)^{m-m'}\, d^{\ell}_{-m,-m'}(\beta). \qquad (30) $$
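For illustration, the reduced Wigner functions at low $\ell$ can be evaluated directly from Wigner's closed-form sum (the artDeco code instead uses fast recurrences; see Appendix B.3). This sketch, valid for integer $\ell$ only, allows the symmetry relation to be checked numerically:

```python
from math import factorial, cos, sin, sqrt

def wigner_d(j, mp, m, beta):
    """Reduced Wigner function d^j_{m'm}(beta) via Wigner's explicit sum
    formula (illustrative; practical codes use recurrence relations)."""
    pref = sqrt(factorial(j + mp) * factorial(j - mp)
                * factorial(j + m) * factorial(j - m))
    total = 0.0
    # Summation limits keep every factorial argument non-negative.
    for s in range(max(0, m - mp), min(j + m, j - mp) + 1):
        num = (-1) ** (mp - m + s)
        den = (factorial(j + m - s) * factorial(s)
               * factorial(mp - m + s) * factorial(j - mp - s))
        total += (num / den
                  * cos(beta / 2) ** (2 * j + m - mp - 2 * s)
                  * sin(beta / 2) ** (mp - m + 2 * s))
    return pref * total
```

For example, `wigner_d(1, 0, 0, beta)` reproduces $\cos\beta$, and swapping the two magnetic indices changes the result only by the sign factor $(-1)^{m-m'}$, as the symmetry relation requires.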
The corresponding symmetry relation for the full Wigner functions is
$$ D^{\ell\,*}_{m m'}(\alpha,\beta,\gamma) = (-1)^{m-m'}\, D^{\ell}_{-m,-m'}(\alpha,\beta,\gamma). \qquad (31) $$
A.3 Wigner expansion
Wigner functions for integer values of $\ell$ are defined in the domain $V$:
$$ 0 \le \alpha < 2\pi, \quad 0 \le \beta \le \pi, \quad 0 \le \gamma < 2\pi. \qquad (32) $$
Domain $V$ can be interpreted as the space of Euler angles. The volume element in $V$ is $dV = \sin\beta\, d\alpha\, d\beta\, d\gamma$, and the total volume of $V$ is $8\pi^{2}$.
Any finite function defined in domain $V$ may be expanded in a series of Wigner functions as
$$ f(\alpha,\beta,\gamma) = \sum_{\ell m m'} \frac{2\ell+1}{8\pi^{2}}\, f^{\ell}_{m m'}\, D^{\ell}_{m m'}(\alpha,\beta,\gamma), \qquad (33) $$
where the coefficients are given by
$$ f^{\ell}_{m m'} = \int_{V} f(\alpha,\beta,\gamma)\, D^{\ell\,*}_{m m'}(\alpha,\beta,\gamma)\, dV. \qquad (34) $$
Expansion (33) can be understood as the three-dimensional equivalent of the spherical harmonic expansion.
In particular, a product of two Wigner functions can be expanded as a series of Wigner functions as
$$ D^{\ell_1}_{m_1 m'_1}\, D^{\ell_2}_{m_2 m'_2} = \sum_{\ell_3 m_3 m'_3} (2\ell_3+1) \begin{pmatrix} \ell_1 & \ell_2 & \ell_3 \\ m_1 & m_2 & m_3 \end{pmatrix} \begin{pmatrix} \ell_1 & \ell_2 & \ell_3 \\ m'_1 & m'_2 & m'_3 \end{pmatrix} D^{\ell_3\,*}_{m_3 m'_3}. \qquad (35) $$
An integral over a product of three Wigner functions is given by Wigner 3j symbols as
$$ \int_{V} D^{\ell_1}_{m_1 m'_1}\, D^{\ell_2}_{m_2 m'_2}\, D^{\ell_3}_{m_3 m'_3}\, dV = 8\pi^{2} \begin{pmatrix} \ell_1 & \ell_2 & \ell_3 \\ m_1 & m_2 & m_3 \end{pmatrix} \begin{pmatrix} \ell_1 & \ell_2 & \ell_3 \\ m'_1 & m'_2 & m'_3 \end{pmatrix} \qquad (36) $$
(Varshalovich et al. 1988).
The following properties of the 3j symbols help in restricting the range of summation. A symbol
$$ \begin{pmatrix} \ell_1 & \ell_2 & \ell_3 \\ m_1 & m_2 & m_3 \end{pmatrix} \qquad (37) $$
is nonvanishing only if the two conditions
$$ m_1 + m_2 + m_3 = 0 \qquad (38) $$
and
$$ |\ell_1 - \ell_2| \le \ell_3 \le \ell_1 + \ell_2 \qquad (39) $$
are fulfilled. We can thus eliminate the summation over $m_3$ and $m'_3$. Product (35) becomes
$$ D^{\ell_1}_{m_1 m'_1}\, D^{\ell_2}_{m_2 m'_2} = \sum_{\ell_3} (2\ell_3+1) \begin{pmatrix} \ell_1 & \ell_2 & \ell_3 \\ m_1 & m_2 & -m_1-m_2 \end{pmatrix} \begin{pmatrix} \ell_1 & \ell_2 & \ell_3 \\ m'_1 & m'_2 & -m'_1-m'_2 \end{pmatrix} D^{\ell_3\,*}_{-m_1-m_2,\,-m'_1-m'_2}. \qquad (40) $$
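The two selection rules can be verified with SymPy's `wigner_3j`, used here purely as an external cross-check (the deconvolution code evaluates these symbols internally):

```python
from sympy.physics.wigner import wigner_3j

def rules_satisfied(j1, j2, j3, m1, m2, m3):
    """Selection rules for a 3j symbol: zero m-sum and triangle condition."""
    return m1 + m2 + m3 == 0 and abs(j1 - j2) <= j3 <= j1 + j2

# The symbol vanishes whenever either rule is violated:
print(wigner_3j(1, 1, 2, 1, 1, 0))   # m-sum = 2 != 0       -> 0
print(wigner_3j(1, 1, 3, 0, 0, 0))   # triangle violated    -> 0
# A symbol satisfying both rules is in general nonzero:
print(wigner_3j(1, 1, 0, 1, -1, 0))  # equals 1/sqrt(3)
```

In a summation over 3j symbols, skipping every term for which `rules_satisfied` is false is exactly the pruning used to eliminate the redundant sums above.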
A.4 Beam
A.4.1 Detector signal and definition of beam coefficients
Consider a detector measuring CMB temperature and linear polarisation.
We assume, very generally, that the signal recorded by a detector, apart from noise, is linear in the harmonic coefficients. We define the beam coefficients as follows: when the detector is at some predefined fiducial position and orientation (for instance at the north pole), the signal detected in the absence of noise is
(41) 
It is clear from this definition that the beam coefficients must contain all information not only on the beam shape, but also on the possible polarisation sensitivity and its orientation.
We may then rotate the beam to another position and orientation, as described in Section A.2. The rotation brings the beam to location $(\theta,\phi)$ in spherical coordinates, in an orientation defined by angle $\psi$. The harmonic coefficients transform under rotations according to (27). The signal seen by the rotated detector is then
(42) 
In the following we derive expressions for the beam coefficients in some special cases.
A.4.2 Non-polarised beam
Consider first a detector with an ideal pencil beam (zero beam width) and no sensitivity to polarisation. The signal detected by the beam is directly given by (20). By comparing (20) and (41) we see readily that, if the chosen fiducial beam position is at the north pole ($\theta = 0$), the beam coefficients must be
(43) 
A general nonpolarised beam with sensitivity gives for the components
(44)  
and for .
A.4.3 Polarisation measurements
A polarisation-sensitive detector requires careful treatment, because the Stokes parameters Q and U are not defined at the pole. The beam coefficients can nevertheless be defined in a meaningful way, as we show in the following.
Consider again a beam at a fiducial position and orientation at the north pole. A general beam may have different polarisation sensitivity at different locations. We therefore define the beam shape separately for temperature and polarisation.
The signal detected may be written in a general form as
(45)  
Here is the local direction of polarisation sensitivity, measured as an angle from the local meridian. Functions , , and together define a general beam.
We proceed to calculate the beam coefficients. We write the trigonometric functions in exponential form, to obtain
(46)  
We can now insert the harmonic expansions (20) and (21) into (46) and read the beam coefficients
(47) 
We then use definition (23) and write the harmonic functions in terms of Wigner functions. In terms of the reduced Wigner functions we have
(48) 
The term can be transferred inside the Wigner function. We may thus write for a general polarised beam
(49) 
The above formulas for the beam coefficients are completely general as long as the beam response is linear.
We now make the simplifying assumption that the polarisation sensitivity is “in the same direction” everywhere on the beam. By “same direction” we mean that the direction of polarisation sensitivity at an arbitrary location is obtained by rotating the polarisation direction vector at the north pole to the desired location along a meridian. The local polarisation orientation angle then becomes
(50) 
where is the deviation of polarisation direction from at the north pole.
At this point it is convenient to write the Wigner functions in terms of the reduced Wigner functions (28). The coefficients for a beam with unique polarisation direction become
(51)  
We readily see that for an azimuthally symmetric beam the only nonvanishing coefficients are those with $m = 0$. In particular, for a perfect beam with zero width and perfect polarisation sensitivity () we obtain
(52)  
We have assumed normalisation .
Appendix B Methods used to accelerate computation
B.1 Load balancing
In an MPI-parallel code, good load balancing among the individual tasks is crucial. As mentioned in Section 3.2, some quantities are distributed across tasks depending on their colatitude or the index . Since the computational cost of the algorithm is not independent of these indices, but rather an unknown, smoothly varying function of them, it is not advisable to assign sequential index ranges to individual MPI tasks: choosing ranges that distribute the load evenly would be very hard in that scenario.
To avoid this complication, we adopted a “round robin” strategy for data distribution. Assuming $N$ MPI tasks numbered 0 to $N-1$, we assign to task $i$ the indices $i$, $i+N$, $i+2N$, …, and deal with the other index in an analogous fashion. This results in near-perfect load balancing.
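A minimal sketch of this distribution scheme (plain Python, with a toy quadratic cost model standing in for the unknown per-index workload):

```python
def round_robin(indices, ntasks):
    """Round-robin distribution: task i receives indices[i],
    indices[i + ntasks], indices[i + 2*ntasks], ..."""
    return [indices[i::ntasks] for i in range(ntasks)]

# A smoothly varying per-index cost is spread almost evenly over the
# tasks, whereas contiguous blocks would concentrate the expensive
# indices on a few tasks.
indices = list(range(1024))
chunks = round_robin(indices, 16)
loads = [sum(c * c for c in chunk) for chunk in chunks]   # toy cost ~ index^2

blocks = [indices[i * 64:(i + 1) * 64] for i in range(16)]  # sequential ranges
block_loads = [sum(c * c for c in block) for block in blocks]
```

With the quadratic toy cost, the round-robin loads differ by only a few percent across tasks, while the block assignment leaves the last task with a load hundreds of times that of the first.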
B.2 Convolution optimisation
A significant part of the total CPU time is spent in the FFTs necessary for the convolution with the quantity (see Section 3.2). Fortunately, several measures can be taken to decrease the resource requirements. First of all, we make use of the highly optimised FFTW library (http://fftw.org; Frigo & Johnson 1998), which provides close to optimal performance for FFTs of a given size.
Furthermore, it is important to note that the execution time for an FFT of length $n$ depends sensitively on the prime factorisation of $n$; in general, an $n$ composed of only small prime factors is much preferable.
Since we are performing a convolution, we have some freedom in the choice of array dimensions: they can be increased and zero-padded if desired. In our case the minimal array dimensions are and , respectively; if necessary, we increase both of them to the next larger number that is a product of the prime factors 2, 3, and 5 only. Depending on the prime factorisation of the original array dimensions, this can decrease the run time of this part of the code by integer factors.
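The padding rule — round each dimension up to the next product of powers of 2, 3, and 5 — can be sketched as follows (an illustrative helper, not the actual artDeco routine):

```python
def next_5_smooth(n):
    """Smallest integer >= n whose prime factors are all in {2, 3, 5},
    i.e. a fast FFT length when zero-padding convolution arrays."""
    m = n
    while True:
        k = m
        for p in (2, 3, 5):
            while k % p == 0:
                k //= p
        if k == 1:        # nothing left after dividing out 2, 3, 5
            return m
        m += 1

# e.g. a minimal dimension of 97 (prime) would be padded to 100 = 2^2 * 5^2
```

Since 5-smooth numbers are dense among the integers of interest, the padding overhead is small compared with the speed-up of the FFT itself.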
Finally, the quantities that are to be convolved are real-valued, so their Fourier transforms exhibit Hermitian symmetry (the coefficient at a negative frequency is the complex conjugate of that at the corresponding positive frequency). This allows us to employ the specialised “real-valued” FFT, resulting in another substantial speed-up and a size reduction of the precomputed Fourier transform (see Section B.5).
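NumPy's real-valued FFT illustrates the saving: for a real input of length n, only n//2 + 1 complex coefficients are stored, since the remaining ones follow from Hermitian symmetry (a generic illustration, not artDeco code):

```python
import numpy as np

n = 240                                        # a 5-smooth FFT length
x = np.random.default_rng(0).normal(size=n)    # real-valued input row

spec = np.fft.rfft(x)     # half-complex transform: n//2 + 1 coefficients
full = np.fft.fft(x)      # full complex transform, for comparison

# Hermitian symmetry of the full transform: X[n - k] = conj(X[k]),
# so the upper half of `full` carries no extra information and the
# inverse transform can be recovered from `spec` alone.
```

Halving both the transform work and the stored spectrum is exactly the speed-up and size reduction exploited in the convolution step.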
B.3 Efficient computation of the reduced Wigner functions
During code execution, the reduced Wigner functions are required twice in every conjugate-gradient iteration step. Storing them in memory is prohibitively expensive, so it is important to regenerate them efficiently whenever needed. For this purpose we used code originally presented by Prézeau & Reinecke (2010), extended to make use of the SSE2 instruction set present in all modern Intel-compatible CPUs. We also used the symmetry properties of the matrix to avoid redundant computations.
B.4 Shortcuts for symmetrical beams
If a beam used for deconvolution is symmetric with respect to a rotation by 180° around its axis (which is true for elliptical beams, for example), all its coefficients with odd $m$ vanish by definition. This fact is used to skip unnecessary computations, as well as to save space by not storing array elements that are known to be zero. It also halves the minimum second dimension of the convolution arrays, saving even more time in the FFT operations and reducing memory consumption.
B.5 Optional precomputation
The convolution step mentioned in Section 3.2 requires the Fourier transform of an array that is constant throughout a run. The code permits one to precompute and store this quantity, which reduces the total CPU time (because each convolution step then needs only two FFT operations instead of three). However, especially for runs with high and , this increases the total memory consumption considerably. For situations where the available main memory is insufficient to store this precomputed array, we therefore provide the option of recomputing the Fourier transform whenever necessary, approximately halving the memory consumption of the code but also slowing it down by 5–15 percent. For the 70 GHz runs described in this paper we had to make use of this space-saving feature.
B.6 Supplying an initial guess
While not strictly a performance optimisation, the code allows the user to supply an initial guess from which to start the CG iteration, which is then used instead of a vector of zeroes. This is particularly helpful when computing a series of deconvolutions on the same input data set with different parameters. For example, if a solution for a deconvolution problem with a given and has already been obtained, it can serve as a good starting point for other calculations with higher and/or . In any case, the convergence criterion is still evaluated with respect to an all-zero starting vector, not the supplied guess.
Appendix C Resource usage
Most computations discussed in this paper were performed on a computer with 64 GB of RAM and 24 Intel Xeon E7450 cores. Since this computer is used interactively, we ran our calculations on only 16 of these cores and ensured that interactive usage did not exceed the capacity of the remaining 8 cores. Nevertheless, the measured timings contain small uncertainties and should be interpreted as upper limits.
Name  Mem  

30a  400  6  10:07  1:13  3.9  140  6.6 
30b  500  6  25:49  1:20  5.5  270  8.1 
30c  600  4  1:24:57  1:07  5.1  995  6.4 
30d  600  6  1:08:53  1:35  6.9  591  9.6 
30e  600  8  1:40:40  2:24  10.3  573  11.9 
30f  700  6  3:41:13  1:48  8.5  1567  11.3 
30g  800  6  5:10:41  1:56  10.6  1759  12.8 
D  500  30  10:31:38  8:34  20.3  1842  17.6 
70a  1700  6  63:51:56  15:27  78.3  2923  28.8 
70x  1700  6  9:43:05  5:58  11.8  2955  — 
70y  1700  6  5:25:32  7:20  6.6  2922  — 
Table 2 gives an overview of the parameters of the various runs and their impact on resource usage. Runs 30a–g are the 30 GHz calculations discussed in Sect. 5.2; runs 70a, 70x, and 70y are related to the 70 GHz high-resolution study (Sect. 5.4); and run “D” is the deconvolution with the D-shaped beam presented in Sect. 5.6. For all runs except the D-beam one, the beam was assumed to be symmetrical with respect to a rotation around its axis. Also, for all runs except the 70 GHz ones, the FFT of the constant array was cached (see Section B.5).
For the 70 GHz run on the test machine described above (run 70a), each CG iteration step needed significantly more time than for the calculations using the 30 GHz data set; in combination with the high number of iterations, this leads to an inconveniently long computation time. Deconvolution runs of this magnitude are consequently better suited for execution on massively parallel computers. We therefore repeated the computation on the Louhi supercomputer of CSC (a Cray XT4/XT5), on 128 and 512 processor cores. The corresponding performance figures are listed as runs 70x and 70y in Table 2; memory consumption was not measured for these runs.
The computational power of a single CPU core is roughly the same on both computers used for our tests. Looking at the total wall clock time for the 70GHz runs, it becomes obvious that the scaling from 16 to 128 cores is close to ideal, but becomes much worse when going from 128 to 512 cores. In this range, MPI communication and redundant computations begin to dominate. For deconvolution problems with larger , we expect that this transition will occur at a higher number of MPI tasks.
Contrary to what one might intuitively expect, the numbers of iterations required for the 70 GHz runs are not identical. This is because in an MPI-parallel code some computation steps – most notably summations and other reductions of quantities residing on all tasks – are not carried out in a strictly defined order, but depend on the relative timing of the tasks during such an operation, which is non-deterministic. Since floating-point arithmetic with finite precision is non-associative, the results of such operations can differ slightly from run to run, causing the conjugate-gradient solver to take different paths towards the solution, while of course ultimately solving the same numerical problem.
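The underlying effect is easy to demonstrate: floating-point addition is not associative, so two different reduction orders give slightly different sums (a generic Python illustration):

```python
a, b, c = 0.1, 0.2, 0.3
left = (a + b) + c        # one reduction order
right = a + (b + c)       # another reduction order

# The two results differ in the last bits, which is enough to send an
# iterative solver along a slightly different (but equivalent) path.
print(left == right)      # False
```

In an MPI reduction, the combining order depends on task timing, so such last-bit differences arise naturally between otherwise identical runs.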
Compared with earlier publications on deconvolution map-making, both the memory and CPU requirements of the presented algorithm are quite modest, and they allow deconvolution up to multipoles that were prohibitively expensive until now. Making use of massively parallel computing clusters, we expect that the algorithm can be applied to data sets with even higher resolution than those presented here.
Acknowledgements.
Some of the results in this paper have been derived using the HEALPix (Górski et al. 2005) package. We acknowledge the use of the Planck Sky Model, developed by the Component Separation Working Group (WG2) of the Planck Collaboration. MR is supported by the German Aeronautics Center and Space Agency (DLR), under program 50OP0901, funded by the Federal Ministry of Economics and Technology. EK is supported by Academy of Finland grant 253204. This work was granted access to the HPC resources of CSC made available within the Distributed European Computing Initiative by the PRACE-2IP, receiving funding from the European Community's Seventh Framework Programme (FP7/2007-2013) under grant agreement RI-283493. We thank CSC – the IT Center for Science Ltd. (Finland) – for computational resources. We thank Torsten Enßlin for constructive discussions and feedback, Valtteri Lindholm for help in using the Planck Sky Model, and Ronnie James Dio for the D-beam.

References
 Armitage & Wandelt (2004) Armitage, C. & Wandelt, B. D. 2004, Phys. Rev. D, 70, 123007
 Armitage-Caplan & Wandelt (2009) Armitage-Caplan, C. & Wandelt, B. D. 2009, ApJS, 181, 533
 Ashdown et al. (2007a) Ashdown, M. A. J., Baccigalupi, C., Balbi, A., et al. 2007a, A&A, 471, 361
 Ashdown et al. (2007b) Ashdown, M. A. J., Baccigalupi, C., Balbi, A., et al. 2007b, A&A, 467, 761
 Delabrouille et al. (2012) Delabrouille, J., Betoule, M., Melin, J.-B., et al. 2012, ArXiv e-prints
 Frigo & Johnson (1998) Frigo, M. & Johnson, S. G. 1998, in Proc. 1998 IEEE Intl. Conf. Acoustics Speech and Signal Processing, Vol. 3 (IEEE), 1381–1384
 Górski et al. (2005) Górski, K. M., Hivon, E., Banday, A. J., et al. 2005, ApJ, 622, 759
 Harrison et al. (2011) Harrison, D. L., van Leeuwen, F., & Ashdown, M. A. J. 2011, A&A, 532, A55
 Jarosik et al. (2011) Jarosik, N., Bennett, C. L., Dunkley, J., et al. 2011, ApJS, 192, 14
 Keihänen et al. (2010) Keihänen, E., Keskitalo, R., Kurki-Suonio, H., Poutanen, T., & Sirviö, A.-S. 2010, A&A, 510, A57
 Keihänen et al. (2005) Keihänen, E., Kurki-Suonio, H., & Poutanen, T. 2005, MNRAS, 360, 390
 Kurki-Suonio et al. (2009) Kurki-Suonio, H., Keihänen, E., Keskitalo, R., et al. 2009, A&A, 506, 1511
 Poutanen et al. (2006) Poutanen, T., de Gasperis, G., Hivon, E., et al. 2006, A&A, 449, 1311
 Prézeau & Reinecke (2010) Prézeau, G. & Reinecke, M. 2010, ApJS, 190, 267
 Reinecke et al. (2006) Reinecke, M., Dolag, K., Hell, R., Bartelmann, M., & Enßlin, T. A. 2006, A&A, 445, 373
 Schulten & Gordon (1975) Schulten, K. & Gordon, R. G. 1975, Journal of Mathematical Physics, 16, 1961
 Sutton et al. (2009) Sutton, D., Johnson, B. R., Brown, M. L., et al. 2009, MNRAS, 393, 894
 Tauber et al. (2010) Tauber, J. A., Mandolesi, N., Puget, J.-L., et al. 2010, A&A, 520, A1
 Varshalovich et al. (1988) Varshalovich, D. A., Moskalev, A. N., & Khersonskii, V. K. 1988, Quantum Theory of Angular Momentum (World Scientific Publishers)
 Wiener (1949) Wiener, N. 1949, Extrapolation, interpolation, and smoothing of stationary time series: with engineering applications, Technology press books in science and engineering (Technology Press of the Massachusetts Institute of Technology)