# Introducing constrained matched filters for improved separation of point sources from galaxy clusters

## Abstract

Matched filters (MF) are elegant and widely used tools to detect and measure signals that resemble a known template in noisy data. However, they can perform poorly in the presence of contaminating sources of similar or smaller spatial scale than the desired signal, especially if signal and contaminants are spatially correlated. We introduce new multi–component MF and matched multifilter (MMF) techniques that allow for optimal reduction of the contamination introduced by sources that can be approximated by templates. The application of these new filters is demonstrated by applying them to microwave and X–ray mock data of galaxy clusters with the aim of reducing contamination by point–like sources, which are well approximated by the instrument beam. Using microwave mock data, we show that our method allows for unbiased photometry of clusters with a central point source but requires sufficient spatial resolution to reach a competitive noise level after filtering. A comparison of various MF and MMF techniques is given by applying them to Planck multi–frequency data of the Perseus galaxy cluster, whose brightest cluster galaxy hosts a powerful radio source known as Perseus A. We also give a brief outline of how the constrained MF (CMF) introduced in this work can be used to reduce the number of point sources misidentified as clusters in X–ray surveys like the upcoming eROSITA all–sky survey. A Python implementation of the filters is provided at https://github.com/j-erler/pymf.

###### keywords:

galaxies: clusters: general – methods: data analysis – techniques: image processing

## 1 Introduction

Matched filtering (MF) is a technique for the extraction of the flux of sources with a well known spatial template at optimal signal–to–noise ratio (SNR). Matched filtering was first proposed for the study of the kinetic Sunyaev–Zeldovich (kSZ) signal from clusters of galaxies by Haehnelt & Tegmark (1996) and subsequently developed and generalized by Herranz et al. (2002) and Melin, Bartlett & Delabrouille (2006) for the extraction of the thermal Sunyaev–Zeldovich (tSZ) signal from multi–frequency data sets like those delivered by the Planck mission, giving rise to what is now known as the matched multifilter (MMF). These filters have since been adopted with great success by the SPT, ACT and Planck Collaborations to extract the tSZ signal of clusters from their respective multi–frequency data sets (Hasselfield 2013; Bleem et al. 2015; Planck Collaboration 2016a).

While matched filters perform admirably in separating diffuse galactic foregrounds and primary cosmic microwave background (CMB) anisotropies from the SZ signal of clusters, contamination by point sources remains an issue (e.g. Bartlett & Melin 2006; Melin, Bartlett & Delabrouille 2006) and can lead to significant biases in the measured cluster parameters (Knox, Holder & Church, 2004; Aghanim, Hansen & Lagache, 2005; Lin & Mohr, 2007; Sehgal et al., 2010). This problem is mitigated to a degree by MMFs due to the prior knowledge of the tSZ spectrum that is used to construct these multifilters, but accurate photometry of clusters that contain a central radio source remains challenging. Point source confusion is also a central concern for the detection of clusters in X–ray observations (e.g. Biffi, Dolag & Merloni 2018; Koulouridis et al. 2018). Tarrío et al. (2016) and Tarrío, Melin & Arnaud (2018) recently demonstrated that point source confusion can be reduced by a joint SZ and X–ray MMF analysis, which makes use of the very different spectral characteristics of point sources at microwave frequencies compared to the X–ray regime.

In this work we present a multi-component extension of the matched filter concept that can improve the separation of contaminants that can be approximated by well known templates (e.g. point sources) based purely on their spatial characteristics. This approach is mathematically identical to the generalized multi-component internal linear combination (ILC) algorithms introduced by Remazeilles, Delabrouille & Cardoso (2011a, b) and Hurier, Macías-Pérez & Hildebrandt (2013), which can be thought of as matched filters in frequency space, and allows for an unbiased photometry of clusters with a central point source. Generalizing our method to multi–frequency data gives rise to a new matched multifiltering technique that combines spatial and spectral constraints to provide an optimal separation. A similar but less general approach was presented by Herranz et al. (2005), who showed that the tSZ and kSZ signals of clusters can be separated with matched multifilters that use the different spectra of the two effects but take the same spatial template for the two components, which restricts the method from being applied to other contaminating sources. In this work we derive our new filters and demonstrate their application using mock microwave and X–ray data of clusters, as well as Planck data of the Perseus galaxy cluster.

This article is structured as follows: Section 2 introduces matched filters and multifilters for galaxy clusters and our proposed constrained filters in detail. Section 3 describes our simulation pipeline for the creation of mock data that are used to test the performance of the constrained matched filters. The results obtained on both simulations and on data from the Planck mission are presented in Section 4. In Section 5 we provide a discussion of our new technique and give an outlook to its application in future experiments. Section 6 provides a summary and concludes our analysis.

Throughout this paper we assume a flat $\Lambda$CDM cosmology with , , , and . $E(z)$ denotes the redshift-dependent Hubble ratio and $\rho_{\mathrm{crit}}(z)$ the critical density of the universe at redshift $z$. Unless noted otherwise, the quoted parameter uncertainties refer to the 68 per cent confidence interval. All–sky maps were processed with HEALPix (v3.31; Górski et al. 2005).

## 2 Matched filtering

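Setting up a matched filter requires only very limited knowledge about the astrophysical content of a dataset. We assume that an observed map $I_\nu(\boldsymbol{x})$ at frequency $\nu$ represents a linear combination of the desired signal, e.g. the SZ signal from galaxy clusters with the spectrum $f(\nu)$, plus a noise map $N_\nu(\boldsymbol{x})$ that contains both instrumental noise and astrophysical emission:

$$I_\nu(\boldsymbol{x}) = f(\nu)\,A\,y(\boldsymbol{x}) + N_\nu(\boldsymbol{x}) \tag{1}$$

The signal must be well approximated by a known spatial template $y$ like the projected pressure profile of clusters. We now would like to construct a filter $\boldsymbol{\Psi}$ that returns the signal $A$ (i.e. the amplitude of the source template if $y$ is normalized to unity) at maximum significance. Using the flat sky approximation and changing to Fourier space, a matched filter can be constructed by minimizing the variance of the filtered map (e.g. Schäfer et al. 2006)

$$\sigma^2 = \boldsymbol{\Psi}^{\mathrm{T}}\,\mathbf{C}\,\boldsymbol{\Psi} \tag{2}$$

where $\mathbf{C}$ is the azimuthally-averaged noise power spectrum of the unfiltered map expressed as a diagonal matrix with one entry per spatial frequency. Here $\boldsymbol{k}$ denotes the two-dimensional spatial frequency that corresponds to the two-dimensional sky position $\boldsymbol{x}$ in Fourier space. At the same time, we demand the filtered field to be an unbiased estimator of the deconvolved amplitude of the signal template at the position of sources. This condition can be written as

$$\boldsymbol{\Psi}^{\mathrm{T}}\,\boldsymbol{\tau} = 1 \tag{3}$$

where $\boldsymbol{\tau}$ is the Fourier transform of the source template $y$ convolved with the instrument beam. A solution to this optimization problem is found by introducing a Lagrange multiplier $\lambda$, which leads to a system of linear equations

$$\begin{pmatrix} 2\mathbf{C} & -\boldsymbol{\tau} \\ \boldsymbol{\tau}^{\mathrm{T}} & 0 \end{pmatrix} \begin{pmatrix} \boldsymbol{\Psi} \\ \lambda \end{pmatrix} = \begin{pmatrix} \boldsymbol{0} \\ 1 \end{pmatrix} \tag{4}$$

the solution to which is:

$$\boldsymbol{\Psi} = \left[\boldsymbol{\tau}^{\mathrm{T}}\,\mathbf{C}^{-1}\,\boldsymbol{\tau}\right]^{-1}\mathbf{C}^{-1}\,\boldsymbol{\tau} \tag{5}$$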

The matched filter derived here is optimal in the least-squares sense and was first proposed for the study of galaxy clusters by Haehnelt & Tegmark (1996). Although it is most commonly applied to data sets with Gaussian noise, Gaussianity is not a strict requirement. Non–Gaussian noise will not cause a bias but the solution might no longer be optimal (Melin, Bartlett & Delabrouille, 2006). However, optimal matched filters were recently derived for the low–number count Poisson noise regime that is relevant for X–ray and $\gamma$–ray observations (Ofek & Zackay, 2018; Vio & Andreani, 2018).
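The single-template filter can be illustrated with a few lines of NumPy. The following is a minimal sketch, not the pymf implementation: it builds $\boldsymbol{\Psi} \propto \mathbf{C}^{-1}\boldsymbol{\tau}$ on an FFT grid, normalizes it to unit response, and checks that a noise-free source is recovered with its input amplitude. The Gaussian template, the toy noise spectrum, and all function names are assumptions made for illustration.

```python
import numpy as np

def gaussian_template(n, fwhm_pix):
    """Unit-peak circular Gaussian centred on pixel (n // 2, n // 2)."""
    sigma = fwhm_pix / np.sqrt(8.0 * np.log(2.0))
    x = np.arange(n) - n // 2
    r2 = x[:, None] ** 2 + x[None, :] ** 2
    return np.exp(-0.5 * r2 / sigma ** 2)

def matched_filter(template, noise_power):
    """Fourier-space matched filter, C^-1 tau / (tau^T C^-1 tau).

    noise_power is C(k) sampled on the np.fft.fft2 frequency grid. The mean
    over modes (instead of a sum) absorbs the 1/N_pix factor of np.fft.ifft2.
    """
    tau = np.fft.fft2(np.fft.ifftshift(template))   # template centred on (0, 0)
    weight = np.conj(tau) / noise_power
    return weight / np.mean((weight * tau).real)

def apply_filter(image, psi):
    """Filter an image by multiplication in Fourier space."""
    return np.fft.ifft2(np.fft.fft2(image) * psi).real

n = 128
template = gaussian_template(n, fwhm_pix=12.0)
freq = np.fft.fftfreq(n)
k = np.hypot(freq[:, None], freq[None, :])
noise_power = 1.0 + 1e-3 / (k ** 2 + 1e-4)          # toy red + white spectrum (assumed)

psi = matched_filter(template, noise_power)
amplitude = 3.0
row, col = n // 2 + 10, n // 2 - 7                  # arbitrary source position
source = amplitude * np.roll(template, (10, -7), axis=(0, 1))
filtered = apply_filter(source, psi)
recovered = filtered[row, col]                      # equals the input amplitude
peak_position = np.unravel_index(np.argmax(filtered), filtered.shape)
```

Because the filter is normalized to unit response, the value of the filtered map at the source position is directly the template amplitude; the choice of noise spectrum only changes how the modes are weighted, not this normalization.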

### 2.1 Constrained matched filters (CMF)

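We now show that the matched filter concept can be generalized to multiple sources with known spatial templates. For this we assume that the observed sky is a linear combination of $n$ sources with known templates $y_i$ plus noise:

$$I(\boldsymbol{x}) = \sum_{i=1}^{n} A_i\,y_i(\boldsymbol{x}) + N(\boldsymbol{x}) \tag{6}$$

Our goal is to construct a filter that minimizes the variance of the filtered map as defined in equation (2) and at the same time has an unbiased response to the chosen source template. We now place additional constraints by e.g. demanding the filter to have zero response to contaminating sources with well known spatial templates:

$$\boldsymbol{\Psi}^{\mathrm{T}}\,\boldsymbol{\tau}_1 = 1, \qquad \boldsymbol{\Psi}^{\mathrm{T}}\,\boldsymbol{\tau}_i = 0 \quad \text{for } i = 2, \ldots, n \tag{7}$$

In the following it is convenient to construct a matrix $\mathbf{T}$ of dimensions $n_k \times n$, where $n_k$ is the number of spatial frequencies, from the spatial templates $\boldsymbol{\tau}_i$:

$$\mathbf{T} = \left(\boldsymbol{\tau}_1, \boldsymbol{\tau}_2, \ldots, \boldsymbol{\tau}_n\right) \tag{8}$$

We can derive the form of the new filter by solving a system of linear equations analogous to equation (4)

$$\begin{pmatrix} 2\mathbf{C} & -\mathbf{T} \\ \mathbf{T}^{\mathrm{T}} & \mathbf{0} \end{pmatrix} \begin{pmatrix} \boldsymbol{\Psi} \\ \boldsymbol{\lambda} \end{pmatrix} = \begin{pmatrix} \boldsymbol{0} \\ \boldsymbol{e} \end{pmatrix} \tag{9}$$

where $\boldsymbol{e} = (1, 0, \ldots, 0)^{\mathrm{T}}$ is a vector that contains the response of the filter to the constraints defined in equation (7) and $\boldsymbol{\lambda}$ are the Lagrange multipliers. The solution for the constrained matched filter is

$$\boldsymbol{\Psi} = \mathbf{C}^{-1}\,\mathbf{T}\left[\mathbf{T}^{\mathrm{T}}\,\mathbf{C}^{-1}\,\mathbf{T}\right]^{-1}\boldsymbol{e} \tag{10}$$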

which is similar to the one of the traditional matched filter in equation (5). A possible application of this new filter is the reduction of point source contamination in observations of galaxy clusters, which will be explored in Section 4. However, any other contaminating source with a well known template, or even multiple such sources, could be set to zero using this approach. This benefit will come at the cost of a reduced SNR, which will be discussed in Section 4. A comparison of the two filters using simulated microwave data of galaxy clusters and point sources is shown in Fig. 1.
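To make the effect of the additional constraint concrete, the sketch below builds a constrained filter of the form $\mathbf{C}^{-1}\mathbf{T}\,[\mathbf{T}^{\mathrm{T}}\mathbf{C}^{-1}\mathbf{T}]^{-1}\boldsymbol{e}$ for two templates and applies it to a noise-free toy source made of a "cluster" with a point source at its centre. It is an illustration under stated assumptions rather than the authors' code: both templates are Gaussians (the point source is the beam, the cluster a broader beam-convolved profile), the noise is white, and the function names are hypothetical.

```python
import numpy as np

def centred_gaussian_ft(n, fwhm_pix):
    """FFT of a unit-peak circular Gaussian centred on pixel (0, 0)."""
    sigma = fwhm_pix / np.sqrt(8.0 * np.log(2.0))
    x = np.fft.fftfreq(n) * n                       # pixel offsets 0, 1, ..., -1
    r2 = x[:, None] ** 2 + x[None, :] ** 2
    return np.fft.fft2(np.exp(-0.5 * r2 / sigma ** 2))

def constrained_matched_filter(templates_ft, noise_power, response):
    """Filter with prescribed response e_i to each template (e.g. [1, 0]).

    templates_ft: array (n_templates, ny, nx); noise_power: C(k), shape (ny, nx).
    """
    n_tmpl = templates_ft.shape[0]
    T = templates_ft.reshape(n_tmpl, -1)            # rows: templates, columns: modes
    Cinv_T = T / noise_power.reshape(1, -1)
    # Gram matrix T^T C^-1 T; dividing by N_pix absorbs numpy's ifft2 factor
    gram = (np.conj(T) @ Cinv_T.T).real / T.shape[1]
    coeff = np.linalg.solve(gram, np.asarray(response, dtype=float))
    psi = (coeff[:, None] * np.conj(Cinv_T)).sum(axis=0)
    return psi.reshape(noise_power.shape)

n = 128
templates_ft = np.stack([centred_gaussian_ft(n, 20.0),   # toy beam-convolved cluster
                         centred_gaussian_ft(n, 5.0)])   # point source = beam
noise_power = np.ones((n, n))                            # white noise (assumed)

mf = constrained_matched_filter(templates_ft[:1], noise_power, [1.0])
cmf = constrained_matched_filter(templates_ft, noise_power, [1.0, 0.0])

# Cluster of amplitude 2 with a co-located point source of amplitude 5
image_ft = 2.0 * templates_ft[0] + 5.0 * templates_ft[1]
value_mf = np.fft.ifft2(image_ft * mf).real[0, 0]        # contaminated estimate
value_cmf = np.fft.ifft2(image_ft * cmf).real[0, 0]      # point source nulled
```

With a single template and `response=[1.0]` the same function reduces to the ordinary matched filter, which is why it is reused for the comparison. In this toy both sources are positive, so the matched-filter value is biased high; for a tSZ decrement with a positive radio source the bias would have the opposite sign.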

A mathematically identical multi–component generalization to the constrained matched filter has been derived and successfully applied for ILC algorithms (Remazeilles, Delabrouille & Cardoso, 2011a, b; Hurier, Macías-Pérez & Hildebrandt, 2013), which are commonly used to extract Comptonization $y$–maps from Planck data using the spectrum of the tSZ signal while zeroing out the primary CMB anisotropies by constraining their well understood blackbody spectrum.

### 2.2 Constrained matched multifilters (CMMF)

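Both the matched filter and constrained matched filter presented previously were built to be applied to a single–frequency map. However, the matched filter concept can be generalized to multi–frequency datasets like the ones delivered by Planck. These generalized techniques are known as matched multifilters (MMF; Herranz et al. 2002; Melin, Bartlett & Delabrouille 2006; Lanz et al. 2010; Melin et al. 2012; Tarrío et al. 2016; Tarrío, Melin & Arnaud 2018) and are designed to use prior spatial and spectral information about a source to return an optimally filtered map of $A$ in the least-squares sense. We will show here that the matched multifilter concept can be generalized to separate multiple components with known spatial and spectral templates in an analogous way to the single–frequency filter. We start again by constructing a simple model of the observed sky. As before, we can represent observations of the sky as a linear mixture of astrophysical emission and noise:

$$\boldsymbol{I}(\boldsymbol{x}) = A\,\boldsymbol{F}(\boldsymbol{x}) + \boldsymbol{N}(\boldsymbol{x}) \tag{11}$$

Different from equation (1) we now describe the observed maps as vectors in frequency space with $n_\nu$ components at each sky position $\boldsymbol{x}$ in order to simplify the notation. Using this formalism and changing to Fourier space, the multi–frequency source template $\boldsymbol{F}$ will be given at each $\boldsymbol{k}$ as a vector in frequency space

$$\boldsymbol{F}(\boldsymbol{k}) = \left(f(\nu_1)\,B_{\nu_1}(\boldsymbol{k})\,y(\boldsymbol{k}), \ldots, f(\nu_{n_\nu})\,B_{\nu_{n_\nu}}(\boldsymbol{k})\,y(\boldsymbol{k})\right)^{\mathrm{T}} \tag{12}$$

where $B_\nu(\boldsymbol{k})$ denotes the Fourier transform of the beam, which in general will be frequency dependent. We now aim to find a filter $\boldsymbol{\Psi}$ that, as before, has unit response to the multi–frequency source template:

$$\int \mathrm{d}^2k\;\boldsymbol{\Psi}^{\mathrm{T}}(\boldsymbol{k})\,\boldsymbol{F}(\boldsymbol{k}) = 1 \tag{13}$$

We therefore construct a series of $n_\nu$ filters that are the components of $\boldsymbol{\Psi}$. The final result will be a single filtered map that is the linear combination of the observed maps, each convolved with their respective frequency–dependent filter. The matched multifilter is derived analogously to the single–frequency case by demanding minimum variance of the filtered map at each spatial scale

$$\sigma^2 = \int \mathrm{d}^2k\;\boldsymbol{\Psi}^{\mathrm{T}}(\boldsymbol{k})\,\mathbf{P}(\boldsymbol{k})\,\boldsymbol{\Psi}(\boldsymbol{k}) \tag{14}$$

where $\mathbf{P}$ is the noise power spectrum, a matrix in frequency space with components for each $\boldsymbol{k}$ that are defined as $P_{ij}(\boldsymbol{k})\,\delta(\boldsymbol{k}-\boldsymbol{k}') = \langle N_i(\boldsymbol{k})\,N_j^{*}(\boldsymbol{k}')\rangle$. The asterisk denotes the complex conjugate. The matched multifilter is then given by

$$\boldsymbol{\Psi}(\boldsymbol{k}) = \sigma^2\,\mathbf{P}^{-1}(\boldsymbol{k})\,\boldsymbol{F}(\boldsymbol{k}) \tag{15}$$

with the variance of the filtered map:

$$\sigma^2 = \left[\int \mathrm{d}^2k\;\boldsymbol{F}^{\mathrm{T}}(\boldsymbol{k})\,\mathbf{P}^{-1}(\boldsymbol{k})\,\boldsymbol{F}(\boldsymbol{k})\right]^{-1} \tag{16}$$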

The matched multifilter derived here was employed with great success for the detection and photometry of galaxy clusters by the ACT, SPT, and Planck collaborations (Hasselfield 2013; Bleem et al. 2015; Planck Collaboration 2016a).
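As a rough numerical illustration of the multifilter, the sketch below evaluates $\boldsymbol{\Psi}(\boldsymbol{k}) = \sigma^2\,\mathbf{P}^{-1}(\boldsymbol{k})\,\boldsymbol{F}(\boldsymbol{k})$ on a discrete set of modes, with sums over modes standing in for the integrals over $\boldsymbol{k}$. Everything specific here is a toy assumption: three hypothetical channels with made-up spectral weights and Gaussian beams, a Gaussian source profile, and a noise model consisting of white noise plus one CMB-like component that is fully correlated between channels.

```python
import numpy as np

def matched_multifilter(F, P):
    """Matched multifilter on a discrete set of Fourier modes.

    F: (n_modes, n_freq) multi-frequency source template.
    P: (n_modes, n_freq, n_freq) noise cross-power spectrum per mode.
    Returns the filter psi (n_modes, n_freq) and the filtered-map variance.
    """
    Pinv_F = np.linalg.solve(P, F[..., None])[..., 0]    # P^-1 F for every mode
    sigma2 = 1.0 / np.sum(F * Pinv_F)
    return sigma2 * Pinv_F, sigma2

# Toy setup (all numbers are hypothetical)
k = np.linspace(0.0, 1.0, 64)                            # mode "radii"
f_nu = np.array([-1.5, -1.0, 2.0])                       # spectral weights
beam_sigma = np.array([4.0, 3.0, 2.0])                   # beam widths per channel
B = np.exp(-0.5 * (k[:, None] * beam_sigma[None, :]) ** 2)
y_k = np.exp(-0.5 * (6.0 * k) ** 2)                      # source profile in Fourier space
F = f_nu[None, :] * B * y_k[:, None]

cmb = 50.0 / (1.0 + (k / 0.05) ** 2)                     # correlated sky component
white = np.diag([1.0, 0.5, 2.0])                         # instrumental noise
P = cmb[:, None, None] * B[:, :, None] * B[:, None, :] + white[None]

psi, sigma2 = matched_multifilter(F, P)
response = np.sum(psi * F)                               # unit response by construction
variance = np.einsum('ki,kij,kj->', psi, P, psi)         # equals sigma2
```

The templates are kept real for simplicity; for complex Fourier coefficients the transposes would become Hermitian conjugates.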

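We now show that a constrained matched multifilter can be constructed in similar fashion as before. The aim is to find a filter that allows us to constrain multiple well known multi–frequency source templates to reduce the impact of well characterized contaminants on the filtered map. We begin by assuming that the observed sky is a linear mixture of $n$ known sources plus noise:

$$\boldsymbol{I}(\boldsymbol{x}) = \sum_{i=1}^{n} A_i\,\boldsymbol{F}_i(\boldsymbol{x}) + \boldsymbol{N}(\boldsymbol{x}) \tag{17}$$

Next, we define the desired response of the filter to our known source templates:

$$\int \mathrm{d}^2k\;\boldsymbol{\Psi}^{\mathrm{T}}(\boldsymbol{k})\,\boldsymbol{F}_1(\boldsymbol{k}) = 1, \qquad \int \mathrm{d}^2k\;\boldsymbol{\Psi}^{\mathrm{T}}(\boldsymbol{k})\,\boldsymbol{F}_i(\boldsymbol{k}) = 0 \quad \text{for } i = 2, \ldots, n \tag{18}$$

For each $\boldsymbol{k}$ the constraints can be written as a matrix $\mathbf{U}$ with dimensions $n_\nu \times n$:

$$\mathbf{U}(\boldsymbol{k}) = \left(\boldsymbol{F}_1(\boldsymbol{k}), \boldsymbol{F}_2(\boldsymbol{k}), \ldots, \boldsymbol{F}_n(\boldsymbol{k})\right) \tag{19}$$

By minimizing the variance of the filtered map, we find that the constrained matched multifilter is

$$\boldsymbol{\Psi}(\boldsymbol{k}) = \mathbf{P}^{-1}(\boldsymbol{k})\,\mathbf{U}(\boldsymbol{k})\,\mathbf{S}^{-1}\,\boldsymbol{e} \tag{20}$$

with the matrix $\mathbf{S}$ defined as:

$$\mathbf{S} = \int \mathrm{d}^2k\;\mathbf{U}^{\mathrm{T}}(\boldsymbol{k})\,\mathbf{P}^{-1}(\boldsymbol{k})\,\mathbf{U}(\boldsymbol{k}) \tag{21}$$

The variance of the filtered map can be computed as:

$$\sigma^2 = \boldsymbol{e}^{\mathrm{T}}\,\mathbf{S}^{-1}\,\boldsymbol{e} \tag{22}$$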

In this way, the constrained matched multifilter can be used to separate sources in a similar fashion as the single–frequency constrained matched filter, but for multi–frequency datasets. This requires both a spatial and a spectral template for each constrained source: if, for example, we want to extract galaxy clusters from multi–frequency microwave data while minimizing point source contamination, we need to know the beam as well as the spectral energy distribution (SED) of the point sources. This makes the method very efficient in cleaning the data, but it will be limited to a specific type of source. Reducing the contamination of radio and far-infrared point sources at the same time can be achieved by placing two additional constraints using the same spatial template (i.e. the beam) but two different SEDs. We will compare the performance of the different filters presented here by applying them to Planck High Frequency Instrument (HFI) data of the Perseus galaxy cluster in Section 4.
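The constrained multifilter can be sketched in the same discrete-mode toy setting, using a filter of the form $\mathbf{P}^{-1}\mathbf{U}\,\mathbf{S}^{-1}\boldsymbol{e}$ with $\mathbf{S} = \sum_{\boldsymbol{k}} \mathbf{U}^{\mathrm{T}}\mathbf{P}^{-1}\mathbf{U}$. The example below constrains a point source (beam as spatial template, a made-up falling SED) while keeping unit response to a cluster-like source. The channel weights, beams, noise model, and function name are illustrative assumptions, not values or code from this work.

```python
import numpy as np

def constrained_matched_multifilter(U, P, e):
    """Constrained matched multifilter on a discrete set of Fourier modes.

    U: (n_modes, n_freq, n_src) multi-frequency templates of all sources.
    P: (n_modes, n_freq, n_freq) noise cross-power spectrum per mode.
    e: (n_src,) desired response to each source, e.g. [1, 0].
    """
    e = np.asarray(e, dtype=float)
    Pinv_U = np.linalg.solve(P, U)                   # P^-1 U for every mode
    S = np.einsum('kfi,kfj->ij', U, Pinv_U)          # sum over modes of U^T P^-1 U
    coeff = np.linalg.solve(S, e)                    # S^-1 e
    psi = Pinv_U @ coeff                             # (n_modes, n_freq)
    return psi, float(e @ coeff)                     # filter and variance e^T S^-1 e

# Toy setup (all numbers are hypothetical)
k = np.linspace(0.0, 1.0, 64)
beam_sigma = np.array([4.0, 3.0, 2.0])
B = np.exp(-0.5 * (k[:, None] * beam_sigma[None, :]) ** 2)
cluster = np.array([-1.5, -1.0, 2.0])[None, :] * B * np.exp(-0.5 * (6.0 * k) ** 2)[:, None]
point = np.array([1.0, 0.8, 0.5])[None, :] * B       # beam only, falling SED
U = np.stack([cluster, point], axis=-1)              # shape (64, 3, 2)

cmb = 50.0 / (1.0 + (k / 0.05) ** 2)
P = cmb[:, None, None] * B[:, :, None] * B[:, None, :] + np.diag([1.0, 0.5, 2.0])[None]

psi, sigma2_cmmf = constrained_matched_multifilter(U, P, [1.0, 0.0])
_, sigma2_mmf = constrained_matched_multifilter(U[..., :1], P, [1.0])

response = np.einsum('kfi,kf->i', U, psi)            # response to cluster and point source
variance = np.einsum('kf,kfg,kg->', psi, P, psi)     # equals sigma2_cmmf
```

Calling the function with only the first source column reproduces the unconstrained multifilter, which makes the noise penalty of the extra constraint directly visible as `sigma2_cmmf >= sigma2_mmf`.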

## 3 Simulations

### 3.1 The SZ effect of galaxy clusters

In order to test the performance of the constrained matched filter and compare it to the traditional matched filter we prepared a pipeline for the creation of mock images of the microwave sky. We use the tSZ effect signal (Sunyaev & Zeldovich, 1970, 1972; Birkinshaw, 1999; Carlstrom, Holder & Reese, 2002) of galaxy clusters as our sources of interest.

The tSZ effect is a secondary anisotropy of the CMB that is caused by inverse Compton scattering of CMB photons by free electrons in the intracluster medium (ICM). The tSZ effect causes a characteristic distortion of the CMB spectrum with a temperature decrement at low and a temperature increment at high frequencies. Peculiar motion of clusters will cause a red/blue-shift of the CMB in their rest frame, which gives rise to the kSZ effect. The spectra of the SZ signals are commonly expressed as a temperature shift relative to the CMB monopole, which can be written as

$$\frac{\Delta T_{\mathrm{SZ}}}{T_{\mathrm{CMB}}} = f(x, T_{\mathrm{e}})\,y - \tau_{\mathrm{e}}\,\frac{v_{\mathrm{pec}}}{c} \tag{23}$$

where $T_{\mathrm{CMB}}$ is the CMB temperature, $c$ is the speed of light, $v_{\mathrm{pec}}$ is the peculiar velocity along the line of sight, $f(x, T_{\mathrm{e}})$ is the relativistic tSZ (rSZ) spectrum (e.g., Wright, 1979; Itoh, Kohyama & Nozawa, 1998; Chluba et al., 2012), $x$ is the dimensionless frequency, $\tau_{\mathrm{e}}$ is the optical depth of the plasma and $y$ is the Comptonization parameter:

$$y = \frac{\sigma_{\mathrm{T}}\,k_{\mathrm{B}}}{m_{\mathrm{e}}\,c^2}\int_{\mathrm{l.o.s.}} n_{\mathrm{e}}\,T_{\mathrm{e}}\;\mathrm{d}l \tag{24}$$

Here, $k_{\mathrm{B}}$ is the Boltzmann constant, $\sigma_{\mathrm{T}}$ is the Thomson cross-section, $m_{\mathrm{e}}$ is the electron rest mass and $n_{\mathrm{e}}$ and $T_{\mathrm{e}}$ are the number density and temperature of the electrons in the ICM.
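For orientation, the frequency dependence of the tSZ signal can be evaluated in the non-relativistic limit, where the spectrum reduces to $f(x) = x\coth(x/2) - 4$ with $x = h\nu / (k_{\mathrm{B}} T_{\mathrm{CMB}})$. The sketch below is only this limiting case and does not include the relativistic corrections used for the mock data; the CMB temperature of 2.7255 K, the sign convention of the kSZ term, and the function names are assumptions made for the example.

```python
import numpy as np

H_PLANCK = 6.62607015e-34      # J s
K_BOLTZMANN = 1.380649e-23     # J / K
C_KM_S = 299792.458            # km / s
T_CMB = 2.7255                 # K (assumed value)

def tsz_spectrum(nu_ghz):
    """Non-relativistic tSZ spectrum f(x) = x coth(x / 2) - 4."""
    x = H_PLANCK * np.asarray(nu_ghz, dtype=float) * 1e9 / (K_BOLTZMANN * T_CMB)
    return x / np.tanh(0.5 * x) - 4.0

def sz_temperature_shift(nu_ghz, y, tau_e=0.0, v_pec_km_s=0.0):
    """Delta T / T_CMB from the tSZ term plus a kSZ term -tau_e * v_pec / c.

    A positive v_pec is taken to mean a receding cluster (assumed convention).
    """
    return tsz_spectrum(nu_ghz) * y - tau_e * v_pec_km_s / C_KM_S

planck_hfi_ghz = np.array([100.0, 143.0, 217.0, 353.0, 545.0, 857.0])
f_hfi = tsz_spectrum(planck_hfi_ghz)          # decrement below ~217 GHz, increment above
f_null = tsz_spectrum(217.5)                  # close to the tSZ null
shift_100 = sz_temperature_shift(100.0, y=1e-4)
```

The sign change of `f_hfi` between the 143 and 353 GHz entries is the decrement-to-increment transition described in the text.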

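The Comptonization parameter is a measure of the gas pressure integrated along the line of sight (l.o.s.) and is computed by projection of the Generalized Navarro-Frenk-White (GNFW) pressure profile (Nagai, Kravtsov & Vikhlinin, 2007) using the parametrization presented by Arnaud et al. (2010)

$$P(r) = 1.65\times10^{-3}\,E(z)^{8/3}\left[\frac{M_{500}}{3\times10^{14}\,h_{70}^{-1}\,\mathrm{M_\odot}}\right]^{2/3+0.12}\mathbb{p}\!\left(\frac{r}{r_{500}}\right)h_{70}^{2}\ \mathrm{keV\,cm^{-3}} \tag{25}$$

where $h_{70} = H_0 / (70\,\mathrm{km\,s^{-1}\,Mpc^{-1}})$ and $\mathbb{p}$ is the so-called “universal” shape of the cluster pressure profile

$$\mathbb{p}(\xi) = \frac{P_0}{(c_{500}\,\xi)^{\gamma}\left[1+(c_{500}\,\xi)^{\alpha}\right]^{(\beta-\gamma)/\alpha}}, \qquad \xi \equiv \frac{r}{r_{500}} \tag{26}$$

for which we adopt the best fit values for the profile parameters presented by Arnaud et al. (2010). In the following we refer to this profile as the GNFW profile. The characteristic cluster size $r_{500}$ marks the radius of the sphere within which the average matter density is 500 times the critical density, while $M_{500}$ is the total mass enclosed within $r_{500}$. The temperature profile of the clusters is computed assuming a polytropic relation between electron density and temperature, with (Ostriker, Bode & Babul, 2005).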

### 3.2 Simulating the microwave sky

The simulated clusters are added to an artificial CMB map computed from a synthetic power spectrum that was generated using CAMB (Lewis, Challinor & Lasenby, 2000). We account for emission from the cosmic infrared background (CIB) by adding maps of the resolved and the clustered CIB provided by the WebSky Extragalactic CMB Mocks team.

Compact radio sources are modeled by including all sources from the NVSS point source catalog (Condon et al., 1998). The measured flux densities at 1.4 GHz are extrapolated to microwave frequencies assuming a power law SED, with a spectral index randomly drawn for each source from a Gaussian distribution with a mean of 0.5 and a standard deviation of 0.1. Galactic and extragalactic near-infrared point sources are included by adding the sources listed in the IRAS point source catalog (Beichman et al., 1988) by following the approach presented by Delabrouille et al. (2013) to extrapolate the reported flux densities to lower frequencies.
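The radio source extrapolation described above can be sketched as follows. This is a hedged illustration of the procedure, not the simulation pipeline itself: it assumes the convention $S_\nu \propto \nu^{-\alpha}$ (so that a positive index of 0.5 gives a falling spectrum), uses placeholder 1.4 GHz flux densities instead of the NVSS catalog, and the function name and the 150 GHz target frequency are hypothetical.

```python
import numpy as np

def extrapolate_radio_flux(flux_1p4_jy, nu_ghz, rng, alpha_mean=0.5, alpha_std=0.1):
    """Extrapolate 1.4 GHz flux densities to nu_ghz with a power-law SED.

    One spectral index per source is drawn from a Gaussian distribution;
    the SED is assumed to follow S_nu = S_1.4 * (nu / 1.4 GHz) ** (-alpha).
    """
    flux = np.asarray(flux_1p4_jy, dtype=float)
    alpha = rng.normal(alpha_mean, alpha_std, size=flux.shape)
    return flux * (nu_ghz / 1.4) ** (-alpha), alpha

rng = np.random.default_rng(42)                 # fixed seed for reproducibility
flux_1p4 = np.full(10000, 0.1)                  # placeholder catalog: 0.1 Jy sources
flux_150, alpha = extrapolate_radio_flux(flux_1p4, 150.0, rng)
median_ratio = np.median(flux_150 / flux_1p4)   # near (150 / 1.4) ** -0.5
```

Because the index enters in the exponent, the scatter of 0.1 in the spectral index translates into a sizeable spread of the extrapolated flux densities over two decades in frequency.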

We restrict our analysis to the extragalactic sky that is relevant for studies of galaxy clusters and cosmological studies by applying a 40 per cent Galactic dust mask to our mock maps. We furthermore exclude the region of the sky that has not been observed by the NVSS in order to keep the properties of our sky model homogeneous.

All maps are processed at HEALPix , which allows us to generate mock data with a minimum FWHM of . Maps that come at a lower native resolution are oversampled and smoothed with a narrow Gaussian beam to avoid pixelization artifacts. The microwave sky is simulated at and with different spatial resolutions ranging from to , assuming circular Gaussian beams and white instrumental noise with -arcmin and -arcmin. The wide range of simulated spatial resolutions allows us to test our filtering techniques for instruments ranging from Planck to current and future ground–based experiments. The resulting maps are shown in Fig. 2.

### 3.3 Simulating X–ray data

In addition to tests using the simulated microwave data that were previously introduced, we apply the single–frequency filters presented in Section 2 to simulated X–ray data. We chose to create mock images of the upcoming extended ROentgen Survey with an Imaging Telescope Array (eROSITA, Merloni et al. 2012; Predehl 2017). Following Clerc et al. (2018), the images are simulated in the – energy band. Each image has a size of , with a pixel size of and a simulated exposure time of . We include the X–ray and instrumental background model presented in Table 1 of Borm et al. (2014). A randomly distributed population of point sources, which is described by the Moretti et al. (2013) $\log N$–$\log S$ relation, is added. For simplicity, the images only contain a single type of isothermal cluster, simulated using a projected $\beta$–model (Cavaliere & Fusco-Femiano, 1976)

$$S_{\mathrm{X}}(\theta) = S_0\left[1+\left(\frac{\theta}{\theta_{\mathrm{c}}}\right)^2\right]^{-3\beta+1/2} \tag{27}$$

with a fixed flux of , core radius $\theta_{\mathrm{c}}$ of , and $\beta$ of 2/3. All sources are convolved with a Gaussian PSF with a FWHM of , which is the expected value in survey mode for eROSITA (Merloni et al., 2012). We derived the count–rates of the sources from the physical fluxes for a given spectral emission model and using the instrumental response file erosita_iv_7telfov_ff.rsp.

## 4 Results

### 4.1 Photometry of clusters with a central point source

Using the simulation pipeline introduced in the previous section, we first investigate how the new filtering technique presented in this work can improve the photometry of clusters that harbour a bright central point source. We do so by creating mock observations of clusters with masses ranging from to at a constant redshift of . Each simulated cluster features a central radio source with a fixed flux density of at . The beam is assumed to have a FWHM of . The values of the central Comptonization parameter computed from the measured cluster flux after filtering are shown in the left–hand panel of Fig. 3. By construction, the constrained matched filter always returns an unbiased result, while the values obtained through matched filtering are biased low with a linear dependence on the brightness of the point source. For the given flux density this bias has a significance of and increases to for lower cluster masses due to the decreasing cluster size. The biased fluxes can therefore lead to a non-detection of low mass clusters and biased inferred cluster properties for high mass systems.

We also consider a potential offset of a bright central point source relative to the cluster center. If a central point source is not aligned with the cluster, both methods will find a bias due to ringing artifacts around the filtered point source. We find that for small angular separations up to the constrained matched filter returns a value with a bias that is smaller than the one observed in the values returned by the matched filter, while both methods find similar values for larger offsets.

The comparison above highlights a clear advantage of the constrained matched filter, which however is bought with an increase in the noise level in the filtered map that limits the usefulness of the method in some cases. This noise increase results from placing additional constraints that inevitably lower the degrees of freedom available for the optimization of the variance of the filtered map. Figure 4 shows the ratio of the noise in the filtered maps as a function of apparent cluster size and instrument beam. This ratio scales linearly with the cluster-size-to-beam ratio if the map noise is Gaussian. The differences between the results obtained at and are due to the different foreground properties. We find that the constrained matched filter provides maps with a marginally increased noise level for most modern ground–based mm telescopes that offer a typical resolution of . The use case for low-resolution instruments like Planck is however restricted to large, mostly nearby clusters with radii of several tens of arcminutes.
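The origin of the noise penalty can be made explicit in a simple toy case. For two templates, the ratio of the filtered-map noise levels is $\sigma_{\mathrm{CMF}}/\sigma_{\mathrm{MF}} = (1-\rho^2)^{-1/2}$, where $\rho$ is the noise-weighted correlation coefficient between the cluster and point source templates. The sketch below evaluates this for white noise and a Gaussian "cluster" that is $q$ times wider than a Gaussian beam, for which the ratio has the closed form $(1+q^2)/(q^2-1)$. These are illustrative assumptions only: the Gaussian cluster shape, the white noise, and the function name are not taken from this work, and the toy is not meant to reproduce Fig. 4, which uses GNFW templates and realistic foregrounds.

```python
import numpy as np

def noise_ratio(q, beam_fwhm_pix=6.0, n=256):
    """sigma_CMF / sigma_MF for a Gaussian cluster q times wider than the beam.

    White noise is assumed, so the Gram matrix T^T C^-1 T can be evaluated
    in real space (Parseval) up to a constant factor that cancels in the ratio.
    """
    x = np.arange(n) - n // 2
    r2 = x[:, None] ** 2 + x[None, :] ** 2
    sigma_beam = beam_fwhm_pix / np.sqrt(8.0 * np.log(2.0))
    point = np.exp(-0.5 * r2 / sigma_beam ** 2)
    cluster = np.exp(-0.5 * r2 / (q * sigma_beam) ** 2)
    T = np.stack([cluster.ravel(), point.ravel()], axis=1)
    gram = T.T @ T
    var_mf = 1.0 / gram[0, 0]                      # matched filter variance
    var_cmf = np.linalg.inv(gram)[0, 0]            # e^T (T^T T)^-1 e with e = (1, 0)
    return np.sqrt(var_cmf / var_mf)

size_ratios = np.array([1.5, 2.0, 4.0])
numerical = np.array([noise_ratio(q) for q in size_ratios])
analytic = (1.0 + size_ratios ** 2) / (size_ratios ** 2 - 1.0)
```

In this toy the penalty diverges as the source size approaches the beam size and tends to unity for well resolved sources, which mirrors the qualitative statement that compact clusters cannot be separated from point sources with a large beam.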

### 4.2 Application to Planck data

In addition to tests on simulated microwave images we apply the matched filters and multifilters presented in Section 2 to Planck HFI data of the Perseus galaxy cluster at . The brightest cluster galaxy (BCG) of the Perseus cluster (NGC 1275) is a powerful radio source known as Perseus A that is unresolved in all Planck bands. While the matched multifilter and constrained matched multifilter are applied directly to the HFI data without any pre–processing other than converting the and maps to units of , the HFI maps are combined into a single map before applying the single–frequency filters. This is achieved by smoothing the maps to a common resolution of after which they are combined into a $y$–map with ILC or constrained ILC (CILC) algorithms (Remazeilles, Delabrouille & Cardoso 2011a; see Appendix C of Erler et al. 2018 for details). The radio galaxy Perseus A appears as a bright source with negative amplitude in the ILC $y$–map due to its diminishing brightness with increasing frequency, which is also seen in the MILCA and NILC $y$–maps published by the Planck Collaboration (2016c). In contrast, the CILC algorithm allows us to constrain the Perseus A SED and thus remove its contamination of the $y$–map. The Perseus A SED used for the CILC and constrained matched multifilter algorithms is extracted directly from the Planck HFI data using constrained matched filters that remove the tSZ contamination by the ICM of the cluster and is found to be well approximated by a power law with spectral index (see Table 1). We model the tSZ signal of the Perseus cluster with a GNFW pressure profile with (Urban et al., 2014) and use a non-relativistic approximation of the tSZ spectrum. All maps are fields centered on .

$\nu$ (GHz) | FWHM (arcmin) | Perseus A flux density (Jy) |
---|---|---|
100 | 9.68 | 10.36 ± 0.15 |
143 | 7.30 | 7.80 ± 0.13 |
217 | 5.02 | 5.74 ± 0.25 |
353 | 4.94 | 4.12 ± 0.87 |
545 | 4.83 | 2.82 ± 2.81 |
857 | 4.64 | 2.12 ± 7.90 |
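A power-law description of the tabulated SED can be checked with a simple weighted fit. The sketch below is one possible way to do this and is not necessarily the procedure used for the value quoted in the text: it fits a straight line in log–log space with `np.polyfit`, propagating the flux uncertainties to first order, and it assumes that the third column of Table 1 lists the Perseus A flux densities with their 1σ uncertainties. The two highest-frequency points have uncertainties comparable to or larger than the measurements, so they carry almost no weight.

```python
import numpy as np

# Values from Table 1 (third column read as flux density +/- uncertainty)
nu_ghz = np.array([100.0, 143.0, 217.0, 353.0, 545.0, 857.0])
flux_jy = np.array([10.36, 7.80, 5.74, 4.12, 2.82, 2.12])
flux_err_jy = np.array([0.15, 0.13, 0.25, 0.87, 2.81, 7.90])

def fit_power_law(nu, flux, flux_err, nu_ref=100.0):
    """Weighted least-squares fit of S = S_ref * (nu / nu_ref) ** alpha.

    The fit is linear in log space; sigma(ln S) is approximated by sigma_S / S,
    which is only accurate for the well-measured low-frequency points.
    """
    log_nu = np.log(nu / nu_ref)
    log_flux = np.log(flux)
    log_err = flux_err / flux
    slope, intercept = np.polyfit(log_nu, log_flux, deg=1, w=1.0 / log_err)
    return slope, np.exp(intercept)

alpha, flux_ref = fit_power_law(nu_ghz, flux_jy, flux_err_jy)
model = flux_ref * (nu_ghz / 100.0) ** alpha
residual_sigma = (flux_jy - model) / flux_err_jy   # residuals in units of sigma
```

A fit in linear flux space (for example with a non-linear least-squares routine) would treat the noisy high-frequency points more faithfully and may give a slightly different index.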

We summarize our results by providing the extracted values for the central Comptonization parameter and the derived integrated value in Table 2. The latter is integrated in a cylindrical aperture with the radius

(28) |

where is the cluster template that has been normalized to unit amplitude and is the angular diameter distance of the cluster. The processed maps are shown in Fig. 5.

If neither a spatial nor a spectral constraint for Perseus A is used, as in the matched multifilter and ILC + matched filter scenarios, we extract a strongly biased negative value for the central Comptonization parameter and thus for the integrated value. This provides a plausible explanation for the necessity of point source masks that are the reason why the Perseus cluster is not listed in the Planck SZ cluster catalogs (PSZ and PSZ2, Planck Collaboration 2014a, 2016a), which were built using two matched multifilter pipelines (MMF1 and MMF3) and the Bayesian PowellSnakes (PwS) algorithm.

The bias introduced by Perseus A is removed by applying a constrained matched filter to the same $y$–map, which yields . For the application of a constrained matched filter to an ILC $y$–map it is critical to smooth all maps to a common resolution before combining them. Combining the maps in Fourier space at their native resolution will distort the beam in the $y$–map, which increases the complexity of constraining the beam for point source removal.

Using the Perseus A SED to construct a CILC $y$–map before filtering is an alternative way to remove the bias introduced by the radio source. In that case, both the traditional and the constrained matched filter yield similar values for , both of which are consistent with the previous result. Placing a spectral constraint in the CILC step however results in a noisier $y$–map and thus a slightly lower SNR in both cases.

Finally, applying a constrained matched multifilter that uses both the SED of Perseus A and our knowledge of the Planck beams yields , which is in agreement with the previous values and with a SNR of 24 offers the strongest signal of all methods compared here. This SNR is comparable to the SNR of 22 we obtain by applying a matched multifilter to Planck HFI maps of the Coma cluster, a system of similar mass at $z = 0.0231$. Using the scaling relation from the Planck Collaboration (2014b, 2016d) and converting to we find a mass of for the Perseus cluster, which is consistent with the value obtained by Urban et al. (2014).

For Coma, all six methods yield similar values for due to the lack of a bright central radio or FIR source. We find however that the two multifilters deliver an almost identical SNR as the ILC plus matched filter techniques, while the CILC approach gives a slightly lower SNR of 17. This indicates that the additional constraints are “cheaper” for multifilters but come at the drawback that multiple constraints have to be placed for sources with identical spatial template but different SEDs. Combining an ILC map and constrained matched filtering will remove sources just based on their spatial signature with no need to have constraints on their SED.

Technique | Central Comptonization parameter | Integrated Comptonization parameter | SNR |
---|---|---|---|
ILC + MF | -0.74 ± 0.66 | -0.50 ± 0.44 | -1.1 |
ILC + CMF | 9.35 ± 0.70 | 6.31 ± 0.47 | 13.4 |
CILC + MF | 9.44 ± 0.77 | 6.37 ± 0.52 | 12.3 |
CILC + CMF | 9.77 ± 0.82 | 6.59 ± 0.55 | 12.0 |
MMF | -2.64 ± 0.39 | -1.77 ± 0.27 | -6.8 |
CMMF | 10.0 ± 0.42 | 6.76 ± 0.28 | 24.0 |

This example illustrates that there are multiple ways of dealing with point source contamination in clusters. The advantage of the constrained matched filter over using spectral constraints is that it is often easier to characterize the instrument beam than measuring the SED of a source. Radio sources like Perseus A can show variability and extrapolating their fluxes to microwave frequencies based on radio measurements often relies on the assumption of a perfect power-law SED, which can be prone to mistakes since many sources are known to have SEDs that deviate from a power law (Herbig & Readhead, 1992). Furthermore, using spectral information will require individual measurements for each source, while a spatial technique can be applied blindly to a large number of objects.

### 4.3 Blind cluster detection and X–ray application

We also investigate the potential application of the constrained matched filter to reduce point source contamination for blind cluster detection. In tSZ surveys below point sources will not be misclassified as galaxy clusters due to the tSZ effect’s characteristic decrement. They can however lower the decrement or even overpower it, which can lead to a biased flux or a non-detection as has been illustrated previously in Section 4.1. At higher frequencies, in the tSZ increment, point sources can bias the flux and might be misclassified as clusters. For instruments like Planck the situation has been mitigated by multi–frequency coverage (e.g. Bartlett & Melin 2006), but prominent examples like the Perseus cluster remain.

Point source contamination is an even greater issue in X–ray surveys due to the stochastic nature of the observed signal. The upcoming eROSITA survey is expected to detect about 100,000 galaxy clusters (Pillepich, Porciani & Reiprich, 2012; Clerc et al., 2018) as well as millions of active galactic nuclei (AGN). Separating both source populations presents a major challenge for cluster detection algorithms. The constrained matched filter introduced here presents an additional tool for this task that has the benefit of using reasonable assumptions, like well known cluster profiles and the PSF of the instrument, to deliver an optimal result. In the remaining part of this section we will provide a brief outline of how the traditional and constrained matched filters can be combined to detect clusters in X–ray surveys and reduce the number of misclassified point sources.

We perform our tests on the eROSITA mock data that were introduced in Section 3.3. Each field is filtered with both a matched filter and a constrained matched filter. We then apply a simple source finder.

We find that using both filters in conjunction will strongly reduce the number of misclassified point sources. As demonstrated clearly in Fig. 7, the constrained matched filter yields a better segregation of point sources in terms of their SNR but also raises the scatter of the filtered cluster photon counts due to the increased map noise. However this is a very qualitative analysis since we do not account for the Poissonian statistics that govern X–ray observations and do not tune the detection threshold to maximize the number of detected clusters while staying below a fixed rate of spurious detections. We leave a more quantitative analysis of the X–ray application to future work.

## 5 Discussion

The new constrained matched filtering and multifiltering techniques presented in this work are straightforward extensions of the matched filtering concept that enable optimal extraction of sources with known templates while at the same time allowing for an optimal reduction of known contaminating sources. The results presented in Section 4 focused on the reduction of point source contamination to SZ and X–ray observations of galaxy clusters, but it is important to stress that the methods presented here are applicable to any contaminating source that can be approximated through a known template. It is also possible to place more than one constraint, yet care has to be taken since every additional constraint will result in a noisier map. As with any matched filter, the values found in the filtered map will be biased if the source template does not match the true shape of a resolved source. We note however that the constrained filters can provide a slightly larger bias than the traditional matched filter if the desired source is more compact than its template and other compact sources are supposed to be removed.

A technique similar to the constrained matched multifilter presented in this work was explored by Herranz et al. (2005), who derived an unbiased matched multifilter to minimize the contamination of the tSZ to kSZ maps and vice versa. These authors derived a two-component version of the filter presented here and then used the same spatial but different spectral templates for the two SZ components to separate them. However, a potential drawback of this method is that the spatial templates of the tSZ and kSZ signals should in general be different, especially for merging systems.

An important detail of the new methods is their dependence on the spatial resolution of the instrument, which has a crucial impact on the noise level of the filtered map. Compact clusters will thus remain spatially indistinguishable from point sources if the instrument beam is large. This also restricts the application of the constrained filters on Planck data to nearby clusters with large apparent radii. The situation improves when the instrument beam has a FWHM of or less, at which point the noise will only increase by a few percent compared to a matched filtered map for most cluster sizes. Such resolution is quite common for ground–based cluster surveys like the ones performed by the SPT and ACT. However, additional filtering will be applied for ground–based instruments to reduce atmospheric contamination. The impact of these filtering steps on the astrophysical signal has to be understood and characterized before matched filters are applied (e.g. Bleem et al. 2015).
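The resolution dependence can be made explicit for a single point source constraint. The short derivation below assumes the standard minimum-variance forms of both filters. The symbols $\tau$ (beam-convolved cluster template), $\beta$ (beam template), $T = (\tau, \beta)$ and the scalars $a$, $b$, $c$ and $\rho$ are introduced here for illustration only and need not match the notation of Section 2.

```latex
\begin{align}
  a &= \tau^{\rm T} C^{-1} \tau, \qquad
  b = \tau^{\rm T} C^{-1} \beta, \qquad
  c = \beta^{\rm T} C^{-1} \beta, \\
  \sigma_{\rm MF}^{2} &= \frac{1}{a}, \qquad
  \sigma_{\rm CMF}^{2}
    = \left[ \left( T^{\rm T} C^{-1} T \right)^{-1} \right]_{11}
    = \frac{c}{ac - b^{2}}, \\
  \frac{\sigma_{\rm CMF}}{\sigma_{\rm MF}}
    &= \frac{1}{\sqrt{1 - \rho^{2}}}, \qquad
  \rho = \frac{b}{\sqrt{ac}} .
\end{align}
```

Here $\rho$ is the noise-weighted overlap of the two templates. When the beam is large compared to the cluster, the beam-convolved cluster template approaches the beam shape, $\rho \rightarrow 1$ and the noise of the constrained map diverges, whereas for well-resolved clusters $\rho$ is small and the noise penalty is modest.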

The new filtering techniques are especially interesting for studies of the kSZ and relativistic tSZ with upcoming instruments like the Simons Observatory (Simons Observatory Collaboration, 2018) and CCAT–prime^{7}

We also briefly outlined how the constrained matched filter can aid the separation of AGN and galaxy clusters in upcoming X–ray surveys. The need for new techniques for better point source separation was recently highlighted by Biffi, Dolag & Merloni (2018), who used X–ray mocks derived from the hydrodynamical Magneticum Pathfinder simulation to investigate the contribution of AGN inside clusters to the X–ray luminosity of the ICM. The methods presented in this work are especially tailored to this application, since they use few assumptions and offer an optimal result. An important benefit of the filters presented here is that they are able to separate clusters and point sources even when they are aligned. This, however, can lead to biased photometry of clusters with compact cool cores if the template does not account for them. The constrained matched filter should not be considered a replacement for well-proven and tested methods but rather an additional tool that will work best in conjunction with other methods such as the traditional matched filter or the well–known sliding cell (Harnden et al., 1984) and WAVEDETECT (Freeman et al., 2002) algorithms, since significant discrepancies between their extracted signals hint at potential point source contamination. Other recent attempts at improving the separation of point sources and galaxy clusters in X–ray cluster searches include the combination with optical data (Green et al., 2017) and a new matched multifilter technique introduced by Tarrío et al. (2016) and Tarrío, Melin & Arnaud (2018), who used ROSAT data as an additional Planck channel to make use of the very different source populations in the two data sets.

## 6 Conclusions

This work introduced a new way to generalize matched filters and multifilters to separate desired and undesired sources based on just their spatial (CMF) or their spatial and spectral (CMMF) characteristics. Adding constraints will reduce the SNR of the sources, but if both source and contaminant are well approximated by given templates the methods introduced here will allow for unbiased photometry and reduced confusion. When applied to Gaussian data, matched filters are optimal in the least-squares sense, making them ideal tools for the extraction of the SZ signal of galaxy clusters from microwave data. However, traditional matched filtering techniques can perform poorly if microwave data of galaxy clusters are contaminated by point sources.

At microwave frequencies, there are two distinct populations of point-like sources that are spatially correlated with galaxy clusters. The first consists of radio-bright AGN, which are found at the centres of many BCGs, and the second is composed of dusty star-forming galaxies. Using realistic microwave mock data we showed that the constrained matched filter introduced in this work allows for unbiased photometry of clusters that harbour a central point source. If applied at multiple frequencies it enables studies of the SZ spectrum of clusters with no need to account for the SED of the point source. We showed that our method requires sufficient spatial resolution to be competitive and otherwise will yield an unbiased but noisy result. Applying constrained and unconstrained matched filters and multifilters to Planck HFI data of the Perseus cluster, which features a bright central radio source, demonstrated that there are multiple ways to remove a central source from actual data, requiring only spatial or spectral constraints, or the combination of both. In the latter case we showed that Perseus can be detected with a SNR typical for a cluster of its mass and redshift. However, using only spatial constraints will reduce contamination by point sources regardless of their SED.

The application of the methods presented here is especially interesting to the upcoming CCAT–prime and eROSITA cluster surveys. While CCAT–prime will benefit from unbiased photometry of clusters with central point sources for detailed measurements of the rSZ and kSZ effects, point source confusion during cluster detection is a major concern for X–ray surveys. We illustrated how the constrained matched filter can provide an optimal way to distinguish between clusters and point sources and showed that the new method has the potential to be developed into a competitive cluster finding algorithm.

## Acknowledgements

The authors would like to thank Jean-Baptiste Melin, Paula Tarrío, Florian Pacaud, Nicolas Clerc, Eve Vavagiakis and Christos Karoumpis for insightful comments and discussions. J.E., K.B. and F.B. acknowledge partial funding from the Transregio programme TRR33 of the Deutsche Forschungsgemeinschaft (DFG). J.E. furthermore acknowledges support by the Bonn-Cologne Graduate School of Physics and Astronomy (BCGS). M.E.R.C. acknowledges support by the German Aerospace Agency (DLR) with funds from the Ministry of Economy and Technology (BMWi) through grant 50 OR 1514. The simulations of the CIB used in this paper were developed by the WebSky Extragalactic CMB Mocks team, with the continuous support of the Canadian Institute for Theoretical Astrophysics (CITA), the Canadian Institute for Advanced Research (CIFAR), and the Natural Sciences and Engineering Council of Canada (NSERC), and were generated on the GPC supercomputer at the SciNet HPC Consortium. SciNet is funded by: the Canada Foundation for Innovation under the auspices of Compute Canada; the Government of Ontario; Ontario Research Fund – Research Excellence; and the University of Toronto. This research made use of photutils and Astropy, a community–developed core Python package for Astronomy (Astropy Collaboration, 2018).

## Appendix A All–sky formalism

The matched filter formalism presented in Section 2 used the flat sky approximation but can be adapted to the full sphere with little effort. Implementing matched filters on the full sphere can have advantages in certain situations, because we can avoid using an approximate projection to a flat-sky geometry. Schäfer et al. (2006) provide an excellent overview of the details. This section is intended to give a summary of the most important points.

Assuming radial symmetry of the sources that we are interested in (i.e. ) and using the convolution theorem on the sphere we can relate the spherical harmonic coefficients of the unfiltered map to the ones of the filtered map by:

(29) |

The new all–sky matched filter will thus be

(30) |

where C is the power spectrum of the all–sky map recast as a diagonal matrix as was done in Section 2 and the elements of and are defined as:

(31) |

Here, denotes the spherical harmonic transform of the source template profile, while and are the beam and pixel window functions. When computing the it is often useful to mask the brightest regions of the Galaxy to reduce contamination from bright ringing artifacts and ensure that the data is Gaussian.

The constrained matched filter can be applied to the full sphere analogously. Using equation (29) the all–sky filter can be written as:

(32) |

As defined in Section 2 and are matrices built from the spatial constraints

(33) |

(34) |

where the components and are defined for each template as done in equation (31).
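The harmonic-space construction can be sketched numerically as follows. This is a minimal illustration under explicit assumptions: all profiles are azimuthally symmetric and described by Legendre window functions (the convention commonly used for beam windows), which differ by a factor of $\sqrt{4\pi/(2\ell+1)}$ from a convention based on spherical harmonic coefficients. The Gaussian source window, the toy power spectrum and the function names are hypothetical and serve only to show the structure of the calculation.

```python
import numpy as np

def gaussian_window(ell, fwhm_arcmin):
    """Legendre window function of a unit-integral Gaussian profile."""
    sigma = np.radians(fwhm_arcmin / 60.0) / np.sqrt(8.0 * np.log(2.0))
    return np.exp(-0.5 * ell * (ell + 1) * sigma ** 2)

def allsky_filter(windows, cl, response):
    """Harmonic-space filter psi_ell with prescribed responses.

    windows  : (n_templates, n_ell) template window functions
    cl       : (n_ell,) power spectrum of the map (noise plus foregrounds)
    response : (n_templates,) desired central value for each template
    """
    windows = np.atleast_2d(windows)
    ell = np.arange(windows.shape[1])
    weight = (2 * ell + 1) / (4 * np.pi)   # sum over m for symmetric profiles
    A = (windows * weight / cl) @ windows.T
    lam = np.linalg.solve(A, np.asarray(response, dtype=float))
    return (lam @ windows) / cl

ell = np.arange(2001)
weight = (2 * ell + 1) / (4 * np.pi)
beam = gaussian_window(ell, 5.0)           # point source: 5 arcmin beam
source = gaussian_window(ell, 15.0)        # toy stand-in for a cluster window
cl = 1.0 / (1.0 + ell) ** 2 + 1e-6         # toy red plus white spectrum

psi_mf = allsky_filter(source, cl, [1.0])
psi_cmf = allsky_filter([source, beam], cl, [1.0, 0.0])

# Central response of the constrained filter to each template.
resp_source = np.sum(weight * psi_cmf * source)
resp_beam = np.sum(weight * psi_cmf * beam)
# Noise increase of the constrained map relative to the matched filter.
noise_ratio = np.sqrt(np.sum(weight * psi_cmf ** 2 * cl)
                      / np.sum(weight * psi_mf ** 2 * cl))
```

In this convention the filter would be applied by multiplying the spherical harmonic coefficients of the map by `psi_cmf` for each multipole before transforming back to pixel space.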

### Footnotes

- pubyear: 2018
- pagerange: Introducing constrained matched filters for improved separation of point sources from galaxy clusters–A
- The mocks are provided at https://mocks.cita.utoronto.ca
- The eROSITA response file is available at http://www2011.mpe.mpg.de/erosita/response/
- The error on the mass includes the uncertainties of the scaling relation parameters given by Planck Collaboration (2016d), which we assume to be uncorrelated.
- We use the find_peaks() function of the python Photutils package.
- http://www.ccatobservatory.org/

### References

- Aghanim N., Hansen S. H., Lagache G., 2005, A&A, 439, 901
- Arnaud M., Pratt G. W., Piffaretti R., Böhringer H., Croston J. H., Pointecouteau E., 2010, A&A, 517, A92
- Bartlett J. G., Melin J. B., 2006, A&A, 447, 405
- Beichman C. A., Neugebauer G., Habing H. J., Clegg P. E., Chester T. J., 1988, IRAS Catalogs and Atlases, Explanatory Supplement, eds. C. Beichman, et al., NASA RP-1190, 1
- Biffi V., Dolag K., Merloni A., 2018, preprint (arXiv:1804.01096)
- Birkinshaw M., 1999, Phys. Rep., 310, 97
- Bleem L. E. et al., 2015, ApJS, 216, 27
- Borm K., Reiprich T. H., Mohammed I., Lovisari L., 2014, A&A, 567, A65
- Carlstrom J. E., Holder G. P., Reese E. D., 2002, ARA&A, 40, 643
- Cavaliere A., Fusco-Femiano R., 1976, A&A, 49, 137
- Chluba J., Nagai D., Sazonov S., Nelson K., 2012, MNRAS, 426, 510
- Clerc N. et al., 2018, preprint (arXiv:1806.08652)
- Condon J. J., Cotton W. D., Greisen E. W., Yin Q. F., Perley R. A., Taylor G. B., Broderick J. J., 1998, AJ, 115, 1693
- Delabrouille J., 2013, A&A, 553, A96
- Erler J., Basu K., Chluba J., Bertoldi F., 2018, MNRAS, 476, 3360
- Freeman P. E., Kashyap V., Rosner R., Lamb D. Q., 2002, ApJS, 138, 185
- Górski K. M., Hivon E., Banday A. J., Wandelt B. D., Hansen F. K., Reinecke M., Bartelmann M., 2005, ApJ, 622, 759
- Green et al., 2017, MNRAS, 465, 4872
- Haehnelt M. G., Tegmark M., 1996, MNRAS, 279, 545
- Harnden Jr. F. R., Fabricant D. G., Harris D. E., Schwarz J., 1984, SAO Special Report, 393
- Hasselfield M. et al., 2013, J. Cosmology Astropart. Phys., 7, 008
- Herbig T., Readhead A. C. S., 1992, ApJS, 81, 83
- Herranz D., Sanz J. L., Hobson M. P., Barreiro R. B., Diego J. M., Martínez-González E., Lasenby A. N., 2002, MNRAS, 336, 1057
- Herranz D., Sanz J. L., Barreiro R. B., López-Caniego M., 2005, MNRAS, 356, 944
- Hurier G., Macías-Pérez J. F., Hildebrandt S., 2013, A&A, 558, A118
- Itoh N., Kohyama Y., Nozawa S., 1998, ApJ, 502, 7
- Kalberla P. M. W., Burton W. B., Hartmann D., Arnal E. M., Bajaja E., Morras R., Pöppel W. G. L., 2005, A&A, 440, 775
- Knox L., Holder G. P., Church S. E., 2004, ApJ, 612, 96
- Koulouridis et al., 2018, A&A, xxx, xxx
- Lanz L. F., Herranz D., Sanz J. L., González-Nuevo J., López-Caniego M., 2010, MNRAS, 403, 2120
- Lewis A., Challinor A., Lasenby A., 2000, ApJ, 538, 473
- Lin Y.-T., Mohr J. J., 2007, ApJS, 170, 71
- Melin J.-B., Bartlett J. G., Delabrouille J., 2006, A&A, 459, 341
- Melin J.-B. et al., 2012, A&A, 548, A51
- Merloni et al., 2012, preprint (arXiv:1209.3114)
- Mittal A., de Bernardis F., Niemack M. D., 2018, J. Cosmology Astropart. Phys., 2, 032
- Miville-Deschênes M.-A., Lagache G., Boulanger F., Puget J.-L., 2007, A&A, 469, 595
- Moretti A., Vattakunnel S., Tozzi P., Salvaterra R., Severgnini P., Fugazza D., Haardt F., Gilli R., 2013, Mem. Soc. Astron. Italiana, 84, 653
- Nagai D., Kravtsov A. V., Vikhlinin A., 2007, ApJ, 668, 1
- Ofek E. O., Zackay B., 2018, AJ, 155, 169
- Ostriker J. P., Bode P., Babul A., 2005, ApJ, 634, 964
- Parshley S. C. et al., 2018a, Proc. SPIE, 10700, 107005X
- Parshley S. C. et al., 2018b, Proc. SPIE, 10700, 1070041
- Pillepich A., Porciani C., Reiprich T. H., 2012, MNRAS, 422, 44
- Planck Collaboration 2013 XX, 2014b, A&A, 571, A20
- Planck Collaboration 2013 XXIX, 2014a, A&A, 571, A29
- Planck Collaboration 2015 X, 2016b, A&A, 594, A10
- Planck Collaboration 2015 XXII, 2016c, A&A, 594, A22
- Planck Collaboration 2015 XXIV, 2016d, A&A, 594, A24
- Planck Collaboration 2015 XXVII, 2016a, A&A, 594, A27
- Predehl P., 2017, Astronomische Nachrichten, 338, 159
- Remazeilles M., Delabrouille J., Cardoso J.-F., 2011a, MNRAS, 410, 2481
- Remazeilles M., Delabrouille J., Cardoso J.-F., 2011b, MNRAS, 418, 467
- Schäfer B. M., Pfrommer C., Hell R. M., Bartelmann M., 2006, MNRAS, 370, 1713
- Sehgal N., Bode P., Das S., Hernandez-Monteagudo C., Huffenberger K., Lin Y.-T., Ostriker J. P., Trac H., 2010, ApJ, 709, 920
- Simons Observatory Collaboration, 2018, preprint (arXiv:1808.07445)
- Smith R. K., Brickhouse N. S., Liedahl D. A., Raymond J. C., 2001, ApJ, 556, L91
- Soergel B., Giannantonio T., Efstathiou G., Puchwein E., Sijacki D., 2016, MNRAS, 468, 577
- Stacey G. J. et al., 2018, Proc. SPIE, 10700, 107001M
- Sunyaev R. A., Zeldovich Y. B., 1970, Comments Astrophys. Space Phys., 2, 66
- Sunyaev R. A., Zeldovich Y. B., 1972, Comments Astrophys. Space Phys., 4, 173
- Tarrío P., Melin J.-B., Arnaud M., Pratt G. W., 2016, A&A, 591, A39
- Tarrío P., Melin J.-B., Arnaud M., 2018, A&A, 614, A82
- Thorne B., Dunkley J., Alonso D., Næss S., 2017, MNRAS, 469, 2821
- Urban et al., 2014, MNRAS, 437, 3939
- Vavagiakis E. M. et al., 2018, Proc. SPIE, 10708, 107081U
- Vio R., Andreani P., 2018, A&A, 616, A25
- Wright E. L., 1979, ApJ, 232, 348