Optimized detection of shear peaks in weak lensing maps
Abstract
We present a new method to extract cosmological constraints from weak lensing (WL) peak counts, which we denote as ‘the hierarchical algorithm’. The idea of this method is to combine information from WL maps sequentially smoothed with a series of filters of different size, from the largest down to the smallest, thus increasing the cosmological sensitivity of the resulting peak function. We compare the cosmological constraints resulting from the peak abundance measured in this way and the abundance obtained by using a filter of fixed size, which is the standard practice in WL peak studies. For this purpose, we employ a large set of WL maps generated by raytracing through body simulations, and the Fisher matrix formalism. We find that if low peaks are included in the analysis (), the hierarchical method yields constraints significantly better than the singlesized filtering. For a large future survey such as or , combined with information from a CMB experiment like , the results for the hierarchical (singlesized) method are: . This forecast is conservative, as we assume no knowledge of the redshifts of the lenses, and consider a single broad bin for the redshifts of the sources. If only peaks with are considered, then there is little difference between the results of the two methods. We also examine the statistical properties of the hierarchical peak function: Its covariance matrix has offdiagonal terms for bins with and aperture mass of , the higher bins being largely uncorrelated and therefore well described by a Poisson distribution.
keywords:
Cosmology: theory – largescale structure of Universe1 Introduction
For more than a decade, weak gravitational lensing (WL) has been considered a powerful probe for testing cosmology due to its potential to map the 3D matter distribution of the Universe in an unbiased way, independent of baryonic matter tracers.
Several surveys have already demonstrated the ability of WL to constrain the cosmological model through cosmic shear measurements, e.g. the Cerro Tololo InterAmerican Observatory (CTIO) lensing survey (Jarvis et al., 2003, 2006), the GarchingBonn deep survey (GaBoDS, Hetterscheidt et al., 2007), and the CanadaFranceHawaii Telescope Legacy Survey (CFHTLS, Hoekstra et al., 2006; Semboloni et al., 2006).
Among the WL probes are shear peaks, regions of high signaltonoise () in shear maps that can be produced by individual clusters or by the alignment of several smaller objects on the line of sight. The abundance of shear peaks is as sensitive to cosmology as the cluster mass function (Marian et al., 2009, 2010; Kratochvil et al., 2010; Kratochvil et al., 2011; Yang et al., 2011). Clusters are one of the four most promising tools to measure dark energy (Albrecht et al., 2006), together with supernova surveys, baryonic acoustic oscillations, and WL surveys. Therefore, the WL peak function is equally promising in principle. There are two major advantages of WL peaks over clusters: (i) WL peaks will come for free with any future lensing survey; (ii) one can very reliably calibrate the abundance of peaks with cold dark matter (CDM) simulations and proceed with direct comparisons to data measurements, thus bypassing the thorny issue of the massobservable relation. Shear peak signals need not be translated into virial masses in order to be able to extract cosmological information from their abundance (Dietrich & Hartlap, 2010). A disadvantage is the absence of an analytical framework for the WL peaks (though see the recent work of Maturi et al., 2010).
Detections of shear peaks in WL data are exemplified in the works of Dahle (2006); Schirmer et al. (2007); Bergé et al. (2008); Abate et al. (2009). However, the exploration of the shear signal of clusters has been focused mostly on mass determination, as for instance in the recent work of Okabe et al. (2010) and Israel et al. (2010).
Since the introduction of the aperture mass by Schneider (1996), there have been many studies of filters optimal for peak detection and of the impact of largescale structure (LSS) projections on cluster mass reconstructions (e.g. Metzler et al., 2001; Hoekstra, 2001; Hamana et al., 2004; Clowe et al., 2004; Tang & Fan, 2005; Maturi et al., 2005; Hennawi & Spergel, 2005; Marian et al., 2010; Becker & Kravtsov, 2011; Gruen et al., 2011). The general agreement is that cluster masses derived from WL measurements are affected by both correlated and uncorrelated LSS projections, as well as by departures of the density profiles of real clusters from the assumed spherical models. These effects cause scatter and bias in the predicted and measured of the clusters. Nonetheless WL mass reconstructions retain the attractive feature of being able to rely on numerical simulations for accurate predictions of such biases.
In this paper we address a more general question related to the abundance of shear peaks. Given the upcoming WL surveys such as the KiloDegree Survey (KiDS, Kuijken, 2010), the Dark Energy Survey (DES, The Dark Energy Survey Collaboration, 2005), the Large Synoptic Survey Telescope (LSST) survey (LSST Science Collaborations et al., 2009), or the survey (The Euclid Collaboration, 2011), it will be possible to measure the shear peak function: what is the optimal way to do this?
The standard approach to peak detection is: A given shear map is smoothed with a given filter function; in the smoothed map, one looks for points of local maximum which have higher than a certain chosen threshold value, and one selects these points as ‘peaks’. This procedure is dependent on the filter used. There have been many studies on the shape of filters that maximize the assuming certain shapes for the peak signal and various types of noise such as shape noise or projection noise (e.g. Hennawi & Spergel, 2005; Maturi et al., 2005; Gruen et al., 2011).
Less attention has been directed towards the size of filters. Indeed in most studies, the peak abundances are measured using a singlesized filter (e.g. Hamana et al., 2004; Dietrich & Hartlap, 2010), though it is clear that each size will lead to a different peak function. For a different approach using wavelets, see Pires et al. (2009). Here we propose a method that we call ‘hierarchical algorithm’: A shear map is smoothed with several filters of the same shape but different size, from the largest to the smallest. We show that by taking into account the extended information from such multiscale filtering, one can assign in the context of an assumed halo paradigm, e.g. the Navarro et al. (1997) model, a unique and (redshiftdependent) mass to the detected peaks. We use the Fisher matrix formalism and a large set of simulated WL maps to show that the cosmological constraints derived from the hierarchical peak function are much improved compared to those obtained using a filter of the same shape but only one size.
The paper is structured as follows. In section §2, we present the body simulations and the raytracing performed to generate the WL maps employed in this study. In §3 we explain the hierarchical scheme and the filter that we adopt. The results of this work are presented in §4, along with a WLpeaks Fisher forecast for surveys like and , the first to be obtained from simulation measurements. In §5 we summarize and conclude.
2 Numerical simulations and raytracing
Cosmological parameters  

zHORIZONI  0.25  0.75  0.04  1  0.8  1.0  70.0 
zHORIZONV1a/V1b  0.25  0.75  0.04  1  0.8  0.95/1.05  70.0 
zHORIZONV2a/V2b  0.25  0.75  0.04  1  0.7/0.9  1.0  70.0 
zHORIZONV3a/V3b  0.2/0.3  0.8/0.7  0.04  1  0.8  1.0  70.0 
zHORIZONV4a/V4b  0.25  0.75  0.04  1.2/0.8  0.8  1.0  70.0 
Simulation Parameters  

zHORIZONI  1500  60  8  27  
zHORIZONV1, V2, V4  1500  60  4  13.5  
zHORIZONV3a  1500  60  4  13.5  
zHORIZONV3b  1500  60  4  13.5 
We generated WL maps from raytracing through body simulations. We used 8 simulations which are part of a larger suite performed on the zBOX2 and zBOX3 supercomputers at the University of Zürich. For all realizations 11 snapshots were output between redshifts ; further snapshots were at redshifts . We shall refer to these simulations as the zHORIZON simulations, and they were described in detail in Smith (2009).
Each of the zHORIZON simulations was performed using the publicly available Gadget2 code (Springel, 2005), and followed the nonlinear evolution under gravity of equalmass particles in a comoving cube of length ; the softening length was . The cosmological model was similar to that determined by the WMAP experiment (Komatsu et al., 2009). We refer to this cosmology as the fiducial model. The transfer function for the simulations was generated using the publicly available cmbfast code (Seljak & Zaldarriaga, 1996), with high sampling of the spatial frequencies on large scales. Initial conditions were set at redshift using the serial version of the publicly available 2LPT code (Scoccimarro, 1998; Crocce et al., 2006). Table 1 summarizes the cosmological parameters that we simulated and Table 2 summarizes the numerical parameters used.
For the Fisher matrix study of peak counts, we employed another series of simulations. Each of the new set was identical in every way to the fiducial model, except that we have varied one of the cosmological parameters by a small amount. For each new set we have generated 4 simulations, matching the random realization of the initial Gaussian field with the corresponding one from the fiducial model. The four parameter variations were: , and we refer to each of the sets as zHORIZONV1a,b,…,zHORIZONV4a,b, respectively. The details are summarized in Tables 1 and 2.
For the WL simulations, we considered a survey similar to (The Euclid Collaboration, 2011) and to (LSST Science Collaborations et al., 2009), with: an rms for the intrinsic image ellipticity, a source number density , and a redshift distribution of source galaxies given by:
(1) 
where the normalization constant insures that the integral of the source distribution over the source redshift interval is unity. If this interval extended to infinity, then the normalization could be written analytically as: . There is a small difference between this value and what we actually used, due to the fact that we considered a source interval of . We took , and required that the median redshift of this distribution be , which fixed , and gave a mean of .
From each body simulation we generated 16 independent fields of view. Each field had an area of and was tiled by pixels, yielding an angular resolution . For each variational model, the total area was of , while for the fiducial model it was of . The effective convergence in each pixel was calculated by tracing a light ray back through the simulation with a multiplelensplane raytracing algorithm (Hilbert et al., 2007, 2009). Gaussian shape noise with variance was then added to each pixel, creating a realistic noise level and correlation in the filtered convergence field (Hilbert et al., 2007). We keep the shape noise configuration fixed for each field in different cosmologies, in order to minimize its impact on the comparisons of the peak abundances measured for each cosmology.
3 Smoothing weak lensing maps
3.1 A matched filter
To find peaks in WL maps, we smooth the latter with an aperturemass filter (Schneider, 1996; Schneider et al., 1998). The smoothed convergence map is a convolution between the filter function and the field of our simulations:
(2) 
where is the convergence, is the tangential shear field, and and are aperture filters for convergence and shear respectively. is an arbitrary point. Aperture mass filters are compensated, which for a spherically symmetric function can be expressed through the equation:
(3) 
where is the compensation radius. In the presence of the ellipticity noise of the source galaxies, it can be shown that an optimal and compensated filter is given by
(4) 
where is an arbitrary normalization constant, is the adopted model for the convergence profile of peaks, i.e. NFW or similar, and is the shape noise variance per ellipticity component. The mean convergence inside a radius is defined by . The analogue filter function for the shear field is given by
(5) 
where is the assumed tangential shear model of the peaks. Under the assumption that the shape noise is the dominant source of noise in the measurements, this filter is optimal because it maximizes the signaltonoise () at the location of a peak with the convergence/shear profile /. From Eqs (2) and (4), the can be written as
(6) 
or in terms of the shear
(7) 
Note that the does not depend on the arbitrary normalization constant .
Although we do not make a comparison between peaks and clusters, and indeed do not use any information on the simulation halos in this study, our choice of provides insight into the halos that generate the peaks (though of course not all the peaks will correspond to a halo). If denotes the location of a peak formed by a halo of mass , redshift , and profile , then we require that the amplitude of the smoothed map at the location of this peak be exactly : . In this case, is given by (Marian & Bernstein, 2006):
(8) 
In the above equation, and it is assumed that the radial integral has an upper limit of . The latter applies also to Eqs (6), (7), (10). Note that for the shear filter the normalization is the same, since
(9) 
Our analysis was performed on convergence maps, and so in the following, we shall focus on the latter. Inserting Eqs (4) and (8) into Eq. (2), we write down the amplitude of the smoothed map for our particular choice of matched aperture filter:
(10)  
Within the validity bounds of our model, i.e. the peak is indeed generated by an NFW halo of that mass and redshift, Eq. (10) represents an unbiased estimator for mass.
Finally, we assume a relation between the model mass and the compensation radius: We take the latter to be the angular scale subtended by the virial radius of a halo with mass , redshift , and convergence profile :
(11) 
where is the angular diameter distance to . This choice enables us to connect the ‘size’ of the filter, i.e. the aperture radius, with the ‘mass’ of the filter , using the standard relation between mass and virial radius provided by models of structure formation, such as NFW for instance. Given the source distribution in Eq. (1), we take . This is just a clarification of what we mean by size and mass of the filter, all the necessary details will be provided in §3.3.
3.2 Hierarchical Filtering
We shall now describe our method to detect shear peaks and assign them masses and . This was already implemented in our previous works Marian et al. (2009, 2010); Marian et al. (2011).
As mentioned in §3.1, smoothed maps are obtained by convolving the convergence field with a filter, e.g. Eq. (2). Peaks are detected as local maxima in the smoothed maps, i.e. points with amplitude higher than that of their 8 neighbors, where the amplitude is given by Eq. (2), or in our particular case, by Eq. (10). Medium or large peaks will still be local maxima even when smoothed with filters of size much larger or smaller than the peak radius. But the and amplitude associated to the peak will be quite different for a range of filter sizes spanning 12 orders of magnitude. Therefore, there is some degree of arbitrariness when trying to classify the abundance of WL peaks in terms of their or amplitude: the answer will depend on the filter size. Also, if the peaks are small, then a large filter may render them quite indistinguishable from spurious shapenoise peaks. The strengths of our topdown approach are:

It uses filters of several sizes, which will increase the scale range of the detected peaks and therefore the cosmological information of the peak counts.

It uses an interpolation scheme for the results of the smoothing with each filter to accurately establish a unique value for the and amplitude of each peak. Thus the ambiguity of classifying peaks in terms of their is removed, and one obtains a ‘general’ peak function, as opposed to a different peak function for each filter size employed in smoothing.
The concrete steps that we take are as follows. We smooth the maps with a sequence of filters of different sizes (masses), in a hierarchical fashion, from the largest to the smallest size. The purpose is to determine the ‘mass’ of the peaks, i.e. the filter size which matches best the size of the peaks:
(12) 
where denotes the location of a detected peak. For each filter in the sequence, the peaks are selected so that: (a) ; (b) , where is the size of the filter. and are the aperture mass and defined in Eqs (6) and (10) corresponding to this particular filter, and we choose as a detection threshold.
Equation (12) is not likely to be satisfied by any particular filter in the sequence, hence we find its solution by interpolating between the results of smoothing with different filters in the sequence. This is illustrated in Figure 1. Suppose there is a peak of true mass and (true according to the assumed model). Then there will be two consecutive filters in the sequence, and , for which the following relations are true:
Since the aperture mass of the peak obtained from consecutive filters varies gently with the size of the filter, we can use linear interpolation to write down the solution for the true mass, i.e. the solution of Eq. (12).
(13) 
Once the true mass is determined, we use Eqs (10) and (12) to determine the of the peak:
(14) 
Given our two selection criteria, in the above example the point of local maximum will be selected as a peak by the filter , but not by its predecessor , if the signal to noise will also be above the detection threshold. It will also appear as a peak in the maps smoothed with filters , provided that the same requirement is fulfilled. In order to be able to carry out the interpolation scheme, for each filter used we record the aperture mass, , and the 2D location of the peaks. A peak might slightly change its coordinates in maps smoothed with differentsized filters; we take this into account, and allow for variations of up to 4 pixels in the directions of the map (a pixel has ). We also record those points of maximum where the aperture mass is smaller, but not much smaller than the mass of the respective filter; to be specific, the points obeying the condition: . These points we call ‘pseudopeaks’ and they are likely to be selected as peaks by the next filter in the sequence; therefore, they are useful for the interpolation that we perform later. The value is of no particular significance, it is suitable for the logarithmicallyspaced sequence of filters that we apply, based on several trials.
Finally, the processing of the peaks resulting from the smoothing with the hierarchical sequence of filters, consists of the following steps: 1. We exclude from the maps those peaks already selected by a larger filter; 2. We apply the interpolation scheme to assign the remaining peaks a unique aperture mass, according to Eq. (13). We then use this mass to compute the value, according to Eq. (14); 3. We exclude those peaks that are within the virial radius of a larger peak, since in many of such cases, the second peak is just an artifact of the smoothing or we simply deal with a very clumpy halo that is split by the smoothing into a large peak and some small ones. We thus remove the problem of ‘peaksinpeaks’ and also do not count substructures as independent halos. This is done for the purpose of obtaining a ‘clean’ peak function, but ultimately such events concern only small peaks and we have checked that the cosmological constraints derived from the counts are not significantly altered by these exclusions.
3.3 Model specifications
We base our halo model on the NFW density profile:
(15) 
where is the mean matter density of the Universe, is the characteristic overdensity, and is the scale radius. We adopt the ShethTormen (ST) definition of mass (Sheth & Tormen, 1999): , i.e. we use the mean matter density to define the overdensity for halo formation, as opposed to the critical density, . The two are related by . for ST and NFW. Integrating Eq. (15) to obtain the virial mass and using the above definition for the latter, one arrives at the following expression for the characteristic overdensity:
(16) 
where the concentration parameter is defined by . In the CDM model, ST and NFW halos have the same density profile, but ST halos have larger cutoff radii and concentration parameters than NFW ones. For the concentration parameter we employed the numerical prescription of Gao et al. (2008), whilst to translate NFW to ST parameters, we used the approach of Smith & Watts (2005).
We use the truncated convergence profile resulting from this profile, i.e. we limit the projection of the 3D density along the line of sight to a region delimited by the virial radius:
(17) 
with being the critical surface density for lensing. The above equation can be rewritten as
(18) 
where is adimensional, and the function depends on cosmology only through the concentration parameter (Hamana et al., 2004):
(19) 
The model that we assume for the of halos is a convolution of the NFW convergence profile defined by Eq. (19) with a twodimensional (2D) Gaussian function with the width of the order of the softening length of the simulations:
(20) 
The above equation can be rewritten as
(21) 
where is the width of the Gaussian function, and is the modified Bessel function of order . The dependence on the lens redshift is implicit for both and .
This model choice accounts for the finite resolution of the numerical simulations. The convolution in Eq. (21) has a similar effect to ‘coring’ the convergence profile, i.e. making it flat in the centre of the cluster, where the WL regime breaks down and measurements are very difficult to obtain. For numerical simulations, cored profile models are desirable because one cannot resolve structures below the softening length. Lastly, Eq. (21) alleviates uncertainties in the location of the centre of the peak, which could lead to large discrepancies between measured and theoretical profiles, if the latter have a cusp at the centre, e.g. like NFW. Therefore, we take the width of the Gaussian present in the convolution to be , where is the softening length of the simulations, for ST halos with , and for . For the redshift that we assume for our filter, arcsec respectively.
Figure 2 shows the comparison of the theoretical and measured profiles for the unfiltered convergence maps of the fiducial model corresponding to sources at redshift 1, in the absence of shape noise. We use the hierarchical method to assign masses to peaks. The peaks are binned according to the assigned mass, and the coordinates of their centres are used to measure the shear and convergence profiles. Each panel in the figure corresponds to a mass bin. The red points depict the average of the measured convergence profiles of the detected peaks. The error bars correspond to errors on the mean of the 128 fiducial fields. The solid blue lines represent the theoretical profile of Eq. (21), estimated for the mean mass of the peaks in the bin and the redshift , i.e. the optimal redshift for lensing for sources at redshift 1. The agreement between the model and measurements is remarkable, given the fact that some peaks correspond to halos at different redshifts or to no halos at all, and the fact that we assume a spherical density model, which is bound to fail for peaks arising from aspherical halos, or to be affected by projection effects. Despite these limitations, the hierarchical method classifies peaks efficiently on average, as shown in the figure. Note that the bottom panels of Figure 2 present an oscillatory feature around the virial radius of the profiles. We generated convergence maps of synthetic, perfect NFW halos, and checked that such features can appear if noise is added to the maps. This owes to the fact that the compensated filter prefers to select peaks which have regions of low convergence around the virial radius. This effect is more pronounced for smaller peaks because these most likely correspond to small halos, which have increased particle shot noise. We have measured the convergence profiles of the friendsoffriends (FoF) halos of the simulations at redshift 0.3, and did not find such features as seen in Figure 2, from which we conclude that they are caused by the filter selection.
We use the abovementioned values for in Eq. (21) to obtain Figure 2; based on several trials, we find these values to yield the closest resemblance between the measured and theoretical profiles. We do this test in the absence of shape noise, since we are trying to address a technical issue arising from our numerical simulations: the impact of the softening length. In the absence of shape noise, the algorithm in §3.2 can be applied by formally setting in Eq. (4), and using only the mass criterion to select peaks; Eq. (10) does not change.
The hierarchical method could be useful for determining cluster masses from WL profiles. In order to increase the accuracy of the results, one should perform a careful analysis of: the chosen filter and its parameters, the compensation radius , the inclusion of the halomatter crosscorrelation term visible in Figure 2, the impact of shape noise and projection noise, the impact of photometric redshifts errors of the source galaxies. However, this is beyond the goals of the present study.
4 Results
We present a comparison between peak statistics results obtained through the hierarchical algorithm described in §3 and from applying three singlesized filters of different size. When using the singlesized filters, we keep the same filter function as given in section §3, as this work is not concerned with assessing the performance of filters of different shape. In this case, we simply select the peaks by requiring that their , with the given by Eq. (6). The three sizes that we consider correspond at redshift to the masses , with the angular size of the virial radii given by arcmin, respectively. Note that due to the fact that ST halos have larger radii than NFW ones of the same mass, these angular sizes are also slightly larger than NFW angular sizes. For the hierarchical filter we consider a series of 12 filters, logarithmically spanning the mass interval , and with . Ultimately, throughout the entire analysis for the fixedsize and hierarchical methods, we shall use only peaks above the threshold , but going to lower values ensures the completeness of the sample of hierarchical peaks. The shape noise contamination makes it difficult to consider smaller filters. We bin the resulting peak abundances in terms of , logarithmically spanning an interval . For reasons discussed in Appendix §2, we choose . Note that for the hierarchical abundance it is useful to also consider binning in mass, as assigned through Eqs (10), (12), (13). This allows to draw analogies between the properties of the WL peaks and those of 3D haloes. Here too we use 20 bins spanning , the lower bound roughly corresponding to the for the analyzed cosmological models.
Comparing peak abundances obtained with different methods is not necessarily relevant: the results will be clearly different, and it would be hard to decide which filter size is more effective. This is shown in Figure 3, where we present the peak functions corresponding to the three singlesized filters, as well as the hierarchical method. The functions are expressed as number of peaks per unit for an area of , and the results are an average of the peak functions measured in the 128 fields of the fiducial model. The error bars correspond to errors on the mean. As expected, each singlesized filter favors the detection of peaks with in accord to its size: the smallest filter peak abundance is mostly formed by low peaks, and similarly for the medium and large filters. The hierarchical peak abundance is similar to the largestfilter abundance for the high bins, and to the smallestfilter abundance for the low bins. We shall next explore how the measured abundances translate into cosmological constraints.
To this effect, we shall resort to the Fisher matrix formalism, with four clear goals:

to provide a comparison between the filtering methods;

to test which range of mass or contributes most to the constraints derived from WL peak counts;

to test the difference between the errors obtained by using the full covariance matrix of counts, and the Poisson errors;

to provide a realistic forecast for surveys like and , in a very direct manner, based on simulation measurements.
4.1 Fisher matrix considerations
Using the measured peak abundances, we compute the Fisher information following the standard definition:
(22) 
where and are elements of the cosmological model parameter set upon which the likelihood depends. In our case the set is: . We assume a Gaussian likelihood
(23) 
where is the vector of peak counts, and is the vector of mean number of peaks; both vectors have the dimension , i.e. the number of bins considered. The covariance matrix of the counts in bins and is
(24) 
From the Fisher matrix, one may obtain an estimate of the marginalized errors and covariances of the parameters:
(25) 
as well as the unmarginalized errors:
(26) 
The size of the errors quantifies the efficiency of the filtering method to extract cosmological information from WL peak counts. For simplicity, we shall ignore the trace term in the Fisher matrix, (Tegmark et al., 1997). In this case, Eq. (22) can be rewritten as
(27) 
We are also interested in the Poisson errors of the peak counts, since the Poisson statistic is widely adopted in forecasting cosmological constraints from WL peak counts. They are given by
(28) 
The mean number of counts for bin is estimated as
(29) 
In the above designates the field number, while is the total number of fields; for the fiducial cosmology, , and for the variational cosmologies . An unbiased, maximumlikelihood estimator for the covariance matrix is:
(30) 
The derivatives of the counts with respect to the cosmological parameters are calculated from
(31) 
where represents the step in the cosmological parameters, e.g. Table 1.
We estimate the Fisher matrix errors using the covariance on the mean for the counts of the fiducial model; the rescaled covariance matrix corresponds to an area of . Together with the survey specifications given in section §2, this makes our study representative for two future surveys, and .
4.2 Comparison of filtering methods
Figure 4 depicts the derivatives of the measured peak abundances with respect to the four cosmological parameters that we consider, as a function of . We show results for the hierarchical method and the largest of the singlesized filters, . The derivatives are estimated using Eq. (31), and the result is divided by the mean counts of the fiducial cosmology. The figure shows that both filtering methods yield peak functions similarly sensitive to cosmology, with the hierarchical derivatives displaying slightly more features than the singlesized ones. For most of the range considered, the peakfunction derivatives are nonzero, signifying that there is cosmological information in the high peaks, as well as in the low ones, as previously noticed by Dietrich & Hartlap (2010). This originates in a similar behaviour displayed by the halo mass function: In a previous work (Smith & Marian, 2011), we found the derivatives of the latter with respect to the same parameters studied here to be nonzero for a large range of halo masses, down to (compare Figures 6, 7 in that work with Figures 4, 5 in this work).
Figure 5 depicts the marginalized Fisher errors for the four cosmological parameters that we consider. The errors are fractional, i.e. the error for each parameter is divided by the fiducial value of that parameter, and cumulative, i.e. we include bins cumulatively, from the largest to the lowest. The central values of the bins are indicated on the axis. The symbols and colors are the same as in Figure 3. It is apparent that the hierarchical filtering performs better than the singlesized filtering, if one takes into account peaks with . The greatest improvement is for : the hierarchical error is smaller by more than a factor of 2 compared to the fixedsize filtering. A smaller improvement happens also in the case of and , while the error on seems unaffected by the filtering method, due to generally poor constraining power that peaks have on this parameter. Figure 5 reinforces the suggestion of Figure 4 that the inclusion of peaks with small improves significantly the cosmological information.
Note that for the hierarchical method we can also use aperturemass bins to measure the peak function derivatives, the covariance matrix, and the Fisher matrix, with the mass given by Eq. (13). We obtain similar results to those presented in Figures 4, 5. The singlesized filters perform very similarly, the largest one being marginally better in the case of , , and . Its diameter of 13.2 arcmin is larger than what previous studies in the literature have used: Hamana et al. (2004) had a arcmin Gaussian filter, Hennawi & Spergel (2005) employed a arcmin NFW filter, and Dietrich & Hartlap (2010) a arcmin one. Since the constraints from filtering with fixed sizes are so similar, we shall only show the results from the marginallybetter one.
In Appendix §A we examine the statistical properties of the peaks detected through the hierarchical and fixedsized methods. We find that:

The Poisson statistic describes well the distribution of hierarchical high and highmass peaks. The mass is defined by Eq. (13). We show that the correlation matrix of binned hierarchical peaks has strong offdiagonal contributions for the small bins, while being largely diagonal for the large bins. The same applies to the massbinned correlation coefficient, and this behaviour is similar to that of halos, as shown in Smith & Marian (2011).

The high singlesized peaks are also reasonably described by the Poisson distribution, due to the fact that such peaks are usually quite massive and rare. The correlation matrix of these peaks seems slightly more correlated for than the hierarchical matrix.
4.3 Forecasting constraints on cosmology
Hierarchical errors Fixedsize errors  

We present a Fishermatrix forecast for the 4dimensional cosmological parameter space explored in this work. This will enable us to compare the filtering methods in a more realistic context, using marginalized errors and also cosmic microwave background (CMB) information.
For the Planck Fisher matrix, we shall assume that the CMB temperature and polarization spectra can constrain 9 parameters: the dark energy equationofstate parameters and ; the density parameter for dark energy ; the CDM and baryon density parameters scaled by the square of the dimensionless Hubble parameter and (); the primordial spectral index of scalar perturbations ; the primordial amplitude of scalar perturbations ; the running of the spectral index ; and the optical depth to the last scattering surface . To compute the CMB Fisher matrix we follow Eisenstein et al. (1999):
(32) 
where , where is the temperature power spectrum, is the Emode polarization power spectrum, is the temperatureEmode polarization crosspower spectrum, and is the Bmode polarization power spectrum. The assumed sky coverage is In order to make the CMB Fisher matrix compatible with our parameters, we rotate it to a new set
(33) 
where for us . We marginalize over the 5 parameters absent from our analysis.
Table 3 and Figure 6 represent the main results of this work, showing the overall improvement the hierarchical method brings over the fixedsize method after marginalization and especially after the inclusion of the CMB information. For the fixedsize method we choose the filter with , i.e. the bestperforming filter among the fixed sizes that we have probed. We also show the unmarginalized errors, for a more complete picture.
Combined with the CMB, the hierarchical errors are a factor of 2 better than the singlefilter method for and , and almost a factor of 3 better for . For there is no significant difference between the filtering methods. This happens because the CMB constrains the primordial power spectrum tighter than WL peak counts.
We further depict these results in Figure 6 as ellipses; the blue dashed ellipses correspond to the singlesized method, and the red solid ones to the hierarchical algorithm. Here we see again that the latter really improves the joint constraints for .
It is difficult to make a comparison to previous forecasts in the literature, as the probed parameter space and survey specifications are not the same, so we shall mention only two. Wang et al. (2004) presented a forecast for in which besides WL counts they also include the cluster power spectrum, which they treat as completely independent. They consider a larger parameter space than ours, including and , and assume and . The constraints that they find when combined with priors (see Table 6 in their paper) are: . They use a Gaussian filter of size arcmin, and their fiducial model has . The results are rather similar to ours, though their constraint of is surprisingly tight, given the sensitivity of to – higher than ours – and the fact that they include , known to degrade substantially the constraint on .
We also make a comparison to our previous work (Marian & Bernstein, 2006), which uses the same type of normalized filter as this study. The detection threshold is 5, the projection noise is accounted for, and instead of , is considered. The fiducial values for and are the same as in Wang et al. (2004); combining with the Planck information, we found the constraints: . These are in agreement with the results from the present study, given the abovementioned differences.
5 Summary and conclusions
In this paper we proposed a new method, which we called ‘the hierarchical algorithm’ to detect and explore WL peak counts. While previous studies have examined the benefits of using filters of a certain shape (Hennawi & Spergel, 2005; Maturi et al., 2005; Gruen et al., 2011), here we have focused on the way the filtering should be performed to maximize the inferred cosmological constraints. To this goal, we have used a large set of WL maps produced by raytracing through body simulations with varying cosmological models, as described in section §2.
Our method was based on the idea of sequential smoothing of the maps with filters of different size, from the largest to the smallest. The chosen filter was an aperturemass filter, matching the NFW density profile of halos. Combining the information contained in the maps smoothed on different scales, we determined the largest filter size for which a peak would not only be a point of local maximum, with a larger than a certain threshold, but also a match to the NFW profile of the filter. For the latter we have assumed a fixed redshift equal to the optimal redshift for lensing given the mean redshift of the source distribution. Under this assumption, we assigned a unique value of mass and to the detected peaks, as described in detail in section §3. Thus, the peak function arising from the hierarchical method does not depend on a particular filter size.
We compared the hierarchical peak abundance to that obtained from applying a filter of fixed size; for the latter we used the same aperture filter and considered three sizes arcmin. At the assumed redshift , these correspond to halos with a ShethTormen mass of . To quantify the efficiency of the smoothing methods, we took the Fisher matrix approach: we compared the errors on the cosmological parameters derived from each method. The considered parameters were: . Our findings are as follows:

The marginalized Fisher matrix errors obtained from the hierarchical peak abundance combined with CMB information from were better by a factor of compared to the results of the singlesized filtering. This was true if we took into account low peaks with ; if we allowed only peaks with then the hierarchical errors were only marginally better.

The three filters of fixed size yield very similar results, the largest being slightly more effective.

We have provided a cosmology forecast for WL peak counts relevant to future surveys like and . Combined with information from a CMB experiment such as , the hierarchical marginalized errors for the considered parameters were: , where the values in the parenthesis corresponded to the results of the largest, fixedsize filter. Note that we have assumed no knowledge of the redshifts of the peaks, and yet have obtained values in a reasonable accord with analytical forecasts in the literature.

The results of the Fisher matrix analysis had a slight dependence on the number of bins used: The most suitable number of bins for the hierarchical method was 20.
We have checked that the hierarchical method yields similar constraints if one bins the peak information in mass and not , which is a reassuring consistency check.
There are certain improvements that one could bring to the hierarchical method. First, the choice of filter shape: in this study, we have resorted to a filter which is optimal if one assumes the shape noise of galaxies as the main source of noise for WL measurements. Though this filter is also effective in reducing the impact of correlated lineofsight projections for the measured peaks (Marian et al., 2009, 2010), one could use a more sophisticated shape, as discussed in Gruen et al. (2011). Second, one should test the benefits of having more redshift information on the source galaxies, i.e. use tomography to improve the cosmological constraints derived from the peak abundance. We defer these issues to a future study.
The main message conveyed by our work is that, compared to the standard approach of singlesized smoothing usually discussed in the literature, the hierarchical method extracts significantly more of the cosmological information enclosed in WL peak counts. Therefore, it will be a very useful tool for surveys like and which have the potential to detect many thousands of peaks.
Acknowledgements
We thank Gary Bernstein for his comments on the manuscript. We also thank V. Springel for making public Gadget2 and for providing his BFoF halo finder. LM, SH, and PS are supported by the Deutsche Forschungsgemeinschaft (DFG) through the grant MA 4967/11, through the Priority Programme 1177 ‘Galaxy Evolution’ (SCHN 342/6 and WH 6/3), and through the Transregio TR33 ‘The Dark Universe’. SH also acknowledges support by NSF grant number AST0807458002. RES was partly supported by the Swiss National Foundation under contract 200021116696/1, the WCU grant R322008000101300, and the University of Zürich under contract FK UZH 57184001. RES also acknowledges support from a Marie Curie Reintegration Grant and the Alexander von Humboldt Foundation.
Appendix A The correlation coefficient of the peak counts
For a covariance matrix , the correlation coefficient is defined by
(34) 
We explore the correlation coefficient of the peak abundance based on measurements from 128 fields of the fiducial model. Figure 7 shows for the hierarchical method, using aperture mass bins. The mass is assigned according to the description given in section §3. The result is very similar to the correlation coefficient of the 3D halo abundance (Smith & Marian, 2011): There are strong correlations at the lowmass end () while the largemass bins are mostly uncorrelated. In Figure 8 we show the hierarchical results for based on the same measurements, but using bins in instead of mass. The same pattern as in the previous figure is visible, with the lower bins having a correlation coefficient of . Finally, Figure 9 depicts measured from the same fiducial maps, using a single filter of size . The correlations seem weaker than for the hierarchical case, but they extend to higher , the highest bin being however largely uncorrelated. This is further explained in the next figure.
Figure 10 presents the mass scatter of peaks detected through the two methods. For the fiducialcosmology peak abundances, we compare the coordinates of the peaks, to select only those peaks found through both methods. For these peaks, we concentrate on the following 3 quantities: the hierarchical given by Eq. (14), the singlesized from Eq. (7), and the hierarchical aperture mass from Eq. (13). We consider bins. For each bin we compute: the mean of the hierarchical and singlesized peaks in that respective bin, as well as the mean mass, and the error on the mass. We emphasize that for both methods, by mass we mean the aperture mass assigned through the hierarchical algorithm, i.e. Eq. (13). The figure shows that both methods yield on average a similar relation between mass and . However, in the hierarchical case, this relation is very tight: large/smallmass peaks have large/small , hence the similarity between the correlation matrices in Figures 7 and 8. For the singlefilter method this is not the case, as the size of the error bars suggests that peaks with quite varying mass are binned in the same bin. This is consistent with the correlation matrix shown in Figure 9.
In Figure 11 we compare the fractional cumulative errors obtained using the full covariance matrix Eq. (27) to the Poisson errors Eq. (28), for the two methods considered. The Poisson and fullcovariance errors converge at the high end, in accord with the fact that the most massive peaks are assigned the largest in both the hierarchical and singlefilter methods. Finally, in Figure 12 we show the fractional and cumulative Fisher errors for the hierarchical method, considering binning in mass, and not . As already suggested by Figure 7, the Poisson statistic captures reasonably well the highmass end of the distribution of hierarchical peaks, where the fullcovariance and Poisson errors converge.
To conclude, the Poisson distribution can be used to approximate the likelihood function of high and highmass hierarchical peaks.
Appendix B The number of bins
We now address the issue of the number of bins used to estimate the Fisher matrix in Eq. (27). We expect that too coarse a binning will diminish the constraints, as the information in the maps would not be fully captured. We also expect the constraints to saturate once a large enough number of bins is considered. However, for all filtering methods we noticed a continuous improvement in the constraints with the increasing number of bins. This is most likely due to noise in the covariance matrix measurement.
As a remedy, we apply the correction discussed by Hartlap et al. (2007). Given a data set drawn from a multivariate Gaussian distribution, and given the maximumlikelihood estimator for the covariance matrix C, i.e. Eq. (30), then an unbiased estimator for the inverse covariance is:
(35) 
where is the number of bins, and is the number of realizations – in our case the number of fiducial fields. We use logarithmicallyspaced bins in the interval . Estimating our Fisher matrix with the above equation alleviated significantly the dependence of the constraints on the number of bins used, as seen in Figure 13. The figure shows the dependence of the unmarginalized Fisher errors on the number of bins in which the interval is divided. All errors show little evolution with the number of bins, and are relatively stable once , except for the case of , where there is slightly more evolution. The entire analysis presented in this work was carried out for .
References
 Abate A., Wittman D., Margoniner V. E., Bridle S. L., Gee P., Tyson J. A., Dell’Antonio I. P., 2009, ApJ, 702, 603
 Albrecht A., Bernstein G., Cahn R., Freedman W. L., Hewitt J., Hu W., Huth J., Kamionkowski M., Kolb E. W., Knox L., Mather J. C., Staggs S., Suntzeff N. B., 2006, ArXiv Astrophysics eprints
 Becker M. R., Kravtsov A. V., 2011, ApJ, 740, 25
 Bergé J., Pacaud F., Réfrégier A., Massey R., Pierre M., Amara A., Birkinshaw M., PaulinHenriksson S., Smith G. P., Willis J., 2008, MNRAS, 385, 695
 Clowe D., De Lucia G., King L., 2004, MNRAS, 350, 1038
 Crocce M., Pueblas S., Scoccimarro R., 2006, MNRAS, 373, 369
 Dahle H., 2006, ApJ, 653, 954
 Dietrich J. P., Hartlap J., 2010, MNRAS, 402, 1049
 Eisenstein D. J., Hu W., Tegmark M., 1999, ApJ, 518, 2
 Gao L., Navarro J. F., Cole S., Frenk C. S., White S. D. M., Springel V., Jenkins A., Neto A. F., 2008, MNRAS, 387, 536
 Gruen D., Bernstein G. M., Lam T. Y., Seitz S., 2011, MNRAS, 416, 1392
 Hamana T., Takada M., Yoshida N., 2004, MNRAS, 350, 893
 Hartlap J., Simon P., Schneider P., 2007, A&A, 464, 399
 Hennawi J. F., Spergel D. N., 2005, ApJ, 624, 59
 Hetterscheidt M., Simon P., Schirmer M., Hildebrandt H., Schrabback T., Erben T., Schneider P., 2007, A&A, 468, 859
 Hilbert S., Hartlap J., White S. D. M., Schneider P., 2009, A&A, 499, 31
 Hilbert S., Metcalf R. B., White S. D. M., 2007, MNRAS, 382, 1494
 Hilbert S., White S. D. M., Hartlap J., Schneider P., 2007, MNRAS, 382, 121
 Hoekstra H., 2001, A&A, 370, 743
 Hoekstra H., Mellier Y., van Waerbeke L., Semboloni E., Fu L., Hudson M. J., Parker L. C., Tereno I., Benabed K., 2006, ApJ, 647, 116
 Israel H., Erben T., Reiprich T. H., Vikhlinin A., Hildebrandt H., Hudson D. S., McLeod B. A., Sarazin C. L., Schneider P., Zhang Y.Y., 2010, A&A, 520, A58+
 Jarvis M., Bernstein G. M., Fischer P., Smith D., Jain B., Tyson J. A., Wittman D., 2003, Astronomical Journal, 125, 1014
 Jarvis M., Jain B., Bernstein G., Dolney D., 2006, ApJ, 644, 71
 Komatsu E., Dunkley J., The WMAP Team 2009, ApJS, 180, 330
 Kratochvil J. M., Haiman Z., May M., 2010, PRD, 81, 043519
 Kratochvil J. M., Lim E. A., Wang S., Haiman Z., May M., Huffenberger K., 2011, ArXiv eprints
 Kuijken K., 2010, in D. L. Block, K. C. Freeman, & I. Puerari ed., Galaxies and their Masks Dark Haloes as Seen with Gravitational Lensing. pp 361–+
 LSST Science Collaborations Abell P. A., Allison J., Anderson S. F., Andrew J. R., Angel J. R. P., Armus L., Arnett D., Asztalos S. J., Axelrod T. S., et al. 2009, ArXiv eprints
 Marian L., Bernstein G. M., 2006, PRD, 73, 123525
 Marian L., Hilbert S., Smith R. E., Schneider P., Desjacques V., 2011, ApJL, 728, L13+
 Marian L., Smith R. E., Bernstein G. M., 2009, ApJL, 698, L33
 Marian L., Smith R. E., Bernstein G. M., 2010, ApJ, 709, 286
 Maturi M., Angrick C., Pace F., Bartelmann M., 2010, A&A, 519, A23+
 Maturi M., Meneghetti M., Bartelmann M., Dolag K., Moscardini L., 2005, A&A, 442, 851
 Metzler C. A., White M., Loken C., 2001, ApJ, 547, 560
 Navarro J. F., Frenk C. S., White S. D. M., 1997, ApJ, 490, 493
 Okabe N., Takada M., Umetsu K., Futamase T., Smith G. P., 2010, Publications of the Astronomical Society of Japan, 62, 811
 Pires S., Starck J.L., Amara A., Réfrégier A., Teyssier R., 2009, A&A, 505, 969
 Schirmer M., Erben T., Hetterscheidt M., Schneider P., 2007, A&A, 462, 875
 Schneider P., 1996, MNRAS, 283, 837
 Schneider P., van Waerbeke L., Jain B., Kruse G., 1998, MNRAS, 296, 873
 Scoccimarro R., 1998, MNRAS, 299, 1097
 Seljak U., Zaldarriaga M., 1996, ApJ, 469, 437
 Semboloni E., Mellier Y., van Waerbeke L., Hoekstra H., Tereno I., Benabed K., Gwyn S. D. J., Fu L., Hudson M. J., Maoli R., Parker L. C., 2006, A&A, 452, 51
 Sheth R. K., Tormen G., 1999, MNRAS, 308, 119
 Smith R. E., 2009, MNRAS, 400, 851
 Smith R. E., Marian L., 2011, MNRAS, 418, 729
 Smith R. E., Watts P. I. R., 2005, MNRAS, 360, 203
 Springel V., 2005, MNRAS, 364, 1105
 Tang J. Y., Fan Z. H., 2005, ApJ, 635, 60
 Tegmark M., Taylor A. N., Heavens A. F., 1997, ApJ, 480, 22
 The Dark Energy Survey Collaboration 2005, ArXiv Astrophysics eprints, pp astro–ph/0510346
 The Euclid Collaboration 2011, arXiv:astroph/1110.3193
 Wang S., Khoury J., Haiman Z., May M., 2004, PRD, 70, 123008
 Yang X., Kratochvil J. M., Wang S., Lim E. A., Haiman Z., May M., 2011, PRD, 84, 043529