The faint end of the z\sim 3-7 Luminosity Function of Lyman-alpha emitters behind lensing clusters observed with MUSE

# The faint end of the z∼3−7 Luminosity Function of Lyman-alpha emitters behind lensing clusters observed with MUSE

###### Key Words.:
High redshift – Luminosity function – Lensing clusters – Reionization
###### Abstract

Context:This paper presents the results obtained with the Multi Unit Spectroscopic Explorer (MUSE) at the ESO Very Large Telescope on the faint-end of the Lyman-alpha luminosity function (LF) based on deep observations of four lensing clusters.

Aims:The precise aim of the present study is to further constrain the abundance of Lyman-alpha emitters (LAEs) by taking advantage of the magnification provided by lensing clusters. By construction, this sample of LAEs is complementary to those built from deep blank fields, and makes it possible to determine the shape of the LF at fainter levels, as well as its evolution with redshift.

Methods:We blindly selected a sample of 156 LAEs, with redshifts between and magnification-corrected luminosities in the range [erg s] . The price to pay to benefit from magnification is a reduction of the effective volume of the survey, together with a more complex analysis procedure. To properly take into account the individual differences in detection conditions (including lensing configurations, spatial and spectral morphologies) when computing the LF, a new method based on the approach was implemented. This procedure, including some new methods for masking, effective volume integration and (individual) completeness determinations can be easily generalized to the analysis of IFU observations in blank fields.

Results:As a result of this analysis, the Lyman-alpha LF has been obtained in four different redshift bins: , , and with constraints down to . From our data only, no significant evolution of LF mean slope can be found. When performing a Schechter analysis including also data from the literature to complete the present sample towards the brightest luminosities, a steep faint-end slope was measured varying from to between the lowest and the highest redshift bins.

Conclusions: The contribution of the LAE population to the star formation rate density at is % depending on the luminosity limit considered, which is of the same order as the Lyman-break galaxy (LBG) contribution. The evolution of the LAE contribution with redshift depends on the assumed escape fraction of Lyman-alpha photons, and appears to slightly increase with increasing redshift when this fraction is conservatively set to one. Depending on the intersection between the LAE/LBG populations, the contribution of the observed galaxies to the ionizing flux may suffice to keep the universe ionized at . (abridged)

## 1 Introduction

Reionization is an important change of state of the universe after recombination, and many resources have been devoted in recent years to understand this process. The formation of the first structures, stars and galaxies, marked the end of the dark ages, following the formation of the first structures, the density of ionizing photons was high enough to allow the ionization of the entire neutral hydrogen content of the Inter-Galactic Medium (IGM). It has been established that this state transition was mostly completed by (Fan et al. 2006; Becker et al. 2015). However the identification of the sources responsible for this major transition and their relative contribution to the process is still a matter of substantial debate.

Although quasars were initially considered as important candidates due to their ionising continuum, star-forming galaxies presently appear as the main contributors to the reionization (see e.g. Robertson et al., 2013, 2015; Bouwens et al., 2015a; Ricci et al., 2017). However a large uncertainty still remains on the actual contribution of quasars, as the faint population of quasars at high redshift remains poorly constrained (see e.g. Willott et al., 2010; Fontanot et al., 2012; McGreer et al., 2013). There are two main signatures currently used for the identification of star-forming galaxies around and beyond the reionization epoch. The first one is the Lyman “drop-out” in the continuum bluewards with respect to Lyman-alpha, due to the combined effect of interstellar and intergalactic scattering by neutral hydrogen. Different redshift intervals can be defined to select Lyman Break Galaxies (LBGs) using the appropriate color-color diagrams or photometric redshifts. Extensive literature is available on this topic since the pioneering work by Steidel et al. (1996); (see e.g. Ouchi et al., 2004; Stark et al., 2009; McLure et al., 2009; Bouwens et al., 2015b, and the references therein). The second method is the detection of the Lyman-alpha line to target Lyman-Alpha Emitters (hereafter LAEs). The classical approach is based on wide-field narrow-band surveys, targeting a precise redshift bin (e.g. Rhoads et al., 2000; Kashikawa et al., 2006; Konno et al., 2014) or more recently the efficient use of 3D/IFU spectroscopy in pencil beam mode (e.g. using MUSE/VLT, Bacon et al., 2015), a technique presently limited to 7 in the optical domain.

Based on LBG studies, the UV Luminosity Function (LF) evolves strongly at , with a depletion of bright galaxies with increasing redshift on one hand, and the slope of the faint end becoming steeper on the other hand (Bouwens et al., 2015b). This evolution is consistent with the expected evolution of the halo mass function during the galaxy assembly process. LAE studies find a deficit of strongly-emitting (”bright”) Lyman-alpha galaxies at , whereas no significant evolution is observed below (Kashikawa et al., 2006; Pentericci et al., 2014; Tilvi et al., 2014) a trend which is attributed to either an increase in the fraction of neutral hydrogen in the IGM or an evolution of the parent population, or both. LBGs and LAEs constitute two different observational approaches to select star-forming galaxies, partly overlapping. The prevalence of Lyman-alpha emission in well-controlled samples of star-forming galaxies is also a test for the reionization history. However, a complete and ”as unbiased as possible” census of ionizing sources can only be enabled through 3D/IFU spectroscopy without any photometric preselection.

As pointed out by different authors (see e.g. Maizy et al., 2010), lensing clusters are more efficient than blank-field for detailed (spectroscopic) studies at high redshift, and also to explore the faint-end of the LF. In this respect, they are quite complementary to observations in wide blank fields, which are needed to set reliable constraints on the “bright” end of both the UV and LAE LF. Several recent results in the Hubble Frontier Fields (Lotz et al., 2017) fully confirm the benefit expected from gravitational magnification (see e.g. Laporte et al., 2014; Atek et al., 2014; Infante et al., 2015; Ishigaki et al., 2015; Laporte et al., 2016; Livermore et al., 2017).

This paper presents the results obtained with the Multi Unit Spectroscopic Explorer (MUSE; Bacon et al. 2010) at the ESO Very Large Telescope on the faint-end of the LAE LF based on deep observations of four lensing clusters. The data were obtained as part of the MUSE consortium Garanted Time Observations (GTO) program and first commissioning run. The final goal of our project in lensing clusters is to set strong constraints on the relative contribution of the LAE population to cosmic reionization. As shown in Richard et al. (2015) for SMACSJ2031.8-4036, Bina et al. (2016) for A1689, Lagattuta et al. (2017) for A370, Caminha et al. (2016) for AS1063, Karman et al. (2016) for MACS1149 and Mahler et al. (2018) for A2744, MUSE is ideally designed for the study of lensed background sources, in particular for LAEs at 2.9 6.7. MUSE provides a blind survey of the background population, irrespective of the detection or not of the associated continuum. MUSE is also a unique facility in deriving the 2D properties of “normal” strongly-lensed galaxies, as recently shown by Patricio et al. (2018). In this project, an important point is that MUSE allows to reliably recover a greater fraction of the Lyman-alpha flux for LAE emitters, as compared to usual long-slit surveys or even narrow-band imaging.

The precise aim of the present study is to further constrain the abundance of LAEs by taking advantage from the magnification provided by lensing clusters to build a blindly-selected sample of galaxies which is less biased than current blank-field samples in redshift and luminosity. By construction, this sample of LAEs is complementary to those built in deep blank fields, whether observed by MUSE or by other facilities, and makes it possible to determine in a more reliable way the shape of the luminosity function towards the faintest levels, as well as its evolution with redshift. Here we focus on four well known lensing clusters from the GTO sample, namely Abell 1689, Abell 2390, Abell 2667 and Abell 2744. In this study we present the method and we establish the feasibility of the project before extending this approach to all available lensing clusters observed by MUSE in a future work.

In this paper we present the deepest study of the LAE LF to date, combining deep MUSE observations with the magnification provided by four lensing clusters. In Sect. 2, we present the MUSE data together with the ancillary Hubble Space Telescope (HST) data used for this project, as well as the observational strategy adopted. The method used to extract LAE sources in the MUSE cubes is presented in Sect. 3. The main characteristics and the references for the four lensing models used in this article are presented in Sect. 4, knowing that the present MUSE data were also used to identify new multiply-imaged systems in these clusters, and therefore to further improve the mass-models. The selection of the LAE sample used in this study is presented in Sect. 5. Sect. 6 is devoted to the computation of the LF. In this Section we present the complete procedure developed for the determination of the LF based on IFU detections in lensing clusters, with some additional technical points and examples given in appendices A to D. This procedure includes novel methods for masking, effective volume integration and (individual) completeness determination, using as far as possible the true spatial and spectral morphology of LAEs instead of a parametric approach. The parametric fit of the LF by a Schechter function, including data from the literature to complete the present sample, is presented in Sect. 7. The impact of mass model on the faint end and the contribution of the LAE population to the Star Formation Rate Density (SFRD) are discussed in Sect. 8. Conclusions and perspectives are given in Sect. 9.

Throughout this paper we adopt the following cosmology: = 0.7, = 0.3 and 70 km s Mpc. Magnitudes are given in the AB system (Oke & Gunn, 1983). All redshifts quoted are based on vacuum rest-frame wavelengths.

## 2 Data

### 2.1 MUSE Observations

The sample used in this study consists of four different MUSE cubes of different sizes and exposure times, covering the central regions of well-characterized lensing clusters: Abell 1689, Abell 2390, Abell 2667 and Abell 2744 (resp. A1689, A2390, A2667 and A2744 hereafter). These four clusters already had well constrained mass models before the MUSE observations, as they benefited from previous spectroscopic observations. The reference mass models can be found in Richard et al. (2010) (LoCuSS) for A2390 and A2667, in Limousin et al. (2007) for A1689 and in Richard et al. (2014) for the Frontier Fields cluster A2744.

The MUSE instrument has a Field of View (FoV) and a spatial pixel size of , the covered wavelength range from to with a sampling, effectively making the detection of LAEs possible between redshifts of and . The data were obtained as part of the MUSE GTO program and first commissioning run (for A1689 only). All the observations were conducted in the nominal WFM-NOAO-N mode of MUSE. The main characteristics of the four fields are listed in Table 1. The geometry and limits of the four FoVs are shown on the available HST images, in Fig. 1.

##### A1689:

Observations were already presented in Bina et al. (2016) from the first MUSE commissioning run in 2014. The total exposure was divided into six individual exposures of . A small linear dither pattern of was applied between each exposure to minimize the impact of the structure of the instrument on the final data. No rotation was applied between individual exposures.

##### A2390, A2667 and A2744:

The same observational strategy was used for all three cubes: the individual pointings were divided into exposures of 1800 sec. In addition to a small dither pattern of , the position angle was incremented by between each individual exposure to minimize the striping patterns caused by the slicers of the instrument. A2744 is the only mosaic included in the present sample. The strategy was to completely cover the multiple-image area. For this cluster, the exposures of the four different FoVs are as follows : 3.5, 4, 4, 5 hours of exposure plus an additional 2 hours at the centre of the cluster (see fig. 1 in Mahler et al. 2018 for the details of the exposure map). For A2390 and A2667, the centre of the FoV was positioned on the central region of the cluster as shown in Table 1 and Fig. 1.

#### 2.1.1 Data reduction

All the MUSE data were reduced using the MUSE ESO reduction pipeline (Weilbacher et al., 2012, 2014). This pipeline includes: bias subtraction, flat-fielding, wavelength and flux calibrations, basic sky subtraction, and astrometry. The individual exposures were then assembled to form a final data cube or a mosaic. An additional sky line subtraction was performed with the ZAP software (Zurich Atmosphere Purge, Soto et al. 2016). This software uses a principal component analysis to characterize the residuals of the first sky line subtraction to further remove them from the cubes. Even though the line subtraction is improved by this process, the variance in the wavelength-layers affected by the presence of sky-lines remains higher, making the source detection more difficult on these layers. For simplicity, hereafter we simply use the term layer to refer to the monochromatic images in MUSE cubes.

### 2.2 Complementary data (HST)

For all MUSE fields analysed in this paper, complementary deep data from HST are available. They were used to help the source detection process in the cubes but also for modelling the mass distribution of the clusters (see Sect. 4). A brief list of the ancillary HST data used for this project is presented in Table 2. For A1689 the data are presented in Broadhurst et al. (2005). For A2390 and A2667, a very thorough summary of all the HST observations available are presented in Richard et al. (2008) and more recently in Olmstead et al. (2014) for A2390. A2744 is part of the Hubble Frontier Fields (HFF) program which comprises the deepest observations performed by HST on lensing clusters. All the raw data and individual exposures are available from the Mikulski Archive for Space Telescopes (MAST), and the details of the reduction are addressed in the articles cited above.

## 3 Detection of the LAE population

### 3.1 Source detection

MUSE is most efficient to detect emission lines (see for example Bacon et al. 2017; Herenz et al. 2017). On the contrary, deep photometry is well suited to detect faint objects with weak continua, with or without emission lines. To build a complete catalog of the sources in a MUSE cube, we combined a continuum-guided detection strategy based on deep HST images (see Table 2 for the available photometric data) with a blind detection in the MUSE cubes. Many of the sources end up being detected by both approaches and the catalogs are merged at the end of the process to make a single master catalog. The detailed method used for the extraction of sources in A1689 and A2744 can be found in Bina et al. (2016) and Mahler et al. (2018) 111 The complete catalog of MUSE sources detected by G. Mahler in A2744 is publicly available at http://muse-vlt.eu/science/a2744/ respectively. The general method used for A2744 (which contains the vast majority of sources in the present sample) is summarized below.

The presence of diffuse Intra-Cluster Light (ICL) makes the detection of faint sources difficult in the cluster core, in particular for multiple-images located in this area. A running median filter computed in a window of was applied to the HST images to remove most of the ICL. The ICL-subtracted images were then weighted by their inverse variance map and combined to make a single deep image. The final photometric detection was performed by SExtractor (Bertin & Arnouts, 1996) on the weighted and combined deep images.

For the blind detection on the MUSE cubes, the Muselet software was used (MUSE Line Emission Tracker, written by J. Richard 222Publicly available as part of the python MPDAF package (Piqueras et al., 2017) : http://mpdaf.readthedocs.io/en/latest/muselet.html ). This tool is based on SExtractor to detect emission-line objects from MUSE cubes. It produces spectrally-weighted, continuum-subtracted Narrow Band images (NB) for each layer of the cube. The NB images are the weighted average of 5 wavelength layers, corresponding to a spectral width of . They form a NB cube, in which only the emission-line objects remain. Sextractor is then applied to each of the NB images. At the end of the process, the individual detection catalogs are merged together, and sources with several detected emission lines are assembled as one single source.

After building the master catalog, all spectra were extracted and the redshifts of galaxies were measured. For A1689, A2390 and A2667, 1D spectra were extracted using a fixed aperture. For A2744, the extraction area is based on the SExtractor segmentation maps obtained from the deblended photometric detections described above. At this stage, the extracted spectra are only used for the redshift determination. The precise measurement of the total line fluxes requires a specific procedure, described in Sect. 3.2. Extracted spectra were manually inspected to identify the different emission lines and accurately measure the redshift.

A system of confidence levels was adopted to reflect the uncertainty in the measured redshifts, following Mahler et al. (2018). The reader can find in this paper some examples that illustrate the different cases. All the LAEs used in the present paper belong to the confidence category 2 and 3, meaning that they all have fairly robust redshift measurements. For LAEs with a single line and no continuum detected, the wide wavelength coverage of MUSE, the absence of any other line and the asymmetry of the line were used to validate the identification of the Lyman-alpha emission. For A1689, A2390 and A2667 most of the background galaxies are part of multiple-image systems, and are therefore confirmed high redshift galaxies based on lensing considerations.

In total 247 LAEs were identified in the four fields: 17 in A1689, 18 in A2390, 15 in A2667 and 197 in A2744. The important difference between the number of sources found in the different fields results from a well-understood combination of field size, magnification regime and exposure time, as explained in Sect. 5.

### 3.2 Flux measurements

The flux measurement is part of the main procedure developed and presented in Sect. 6 to compute the LF of LAEs in lensing clusters observed with MUSE. We discuss it here in order to understand the selection of the final sample of galaxies used to build the LF.

For each LAE, the flux measurement in the Lyman-alpha line was done on a continuum subtracted NB image that contains the whole Lyman-alpha emission. For each source, we have built a subcube centred on the Lyman-alpha emission, plus adjacent blue and red ones used to estimate the spectral continuum. The central cube is a square of size and the spectral range depends on the spectral width of the line. To determine this width and the precise position of the Lyman-alpha emission, all sources were manually inspected. The blue and red subcubes are centered on the same spatial position, with the same spatial extent, and are 20Å wide in the wavelength direction. A continuum image was estimated from the average of the blue and red subcubes and it was subtracted pixel-to-pixel from the central NB image. For sources with large full width at half maximum (FWHM), the NB used for flux measurement can regroup more than 20 wavelength layers (or equivalently ).

Because SExtractor with FLUX_AUTO is known to provide a good estimate of the total flux of the sources to the  5% level (see e.g. the SExtractor Manual Sect. 10.4 Fig. 8.), it was used to measure the flux and the corresponding uncertainties on the continuum-subtracted images. FLUX_AUTO is based on Kron first moment algorithm, and is well suited to account for the extended Lyman-alpha halos that can be found around many LAEs (see Wisotzki et al. 2016 for the extended nature of the Lyman-alpha emission). In addition, the automated aperture is useful to properly account for the distorted images that are often found in lensing fields. As our sample contains faint, low surface brightness sources, and given that the NB images are not designed to maximize the signal-to-noise (SN) ratio, it is sometimes challenging to extract sources with faint or low-surface brightness Lyman-alpha emission. In order to measure their flux we force the extraction at the position of the source. To do so, the SExtractor detection parameters were progressively loosened until a successful extraction was achieved. An extraction was considered successful when the source was recovered at less than a certain matching radius () from the original position given by Muselet. Such an offset is sometimes observed between the peak of the UV continuum and the Lyman-alpha emission in case of high magnification. A careful inspection was needed to make sure that no errors or mis-matches were introduced in the process.

Other automated alternatives to SExtractor exist to measure the line flux (see e.g. LSDCat in Herenz et al. 2017 or NoiseChisel in Akhlaghi & Ichikawa 2015 or a curve of growth approach as developed in Drake et al. (2017)). A comparison between these different methods is encouraged in the future but beyond the scope of the present analysis.

## 4 Lensing clusters and mass models

In this work, we used detailed mass models to compute the magnification of each LAE, and the source plane projections of the MUSE FoVs at different redshifts. These projections were needed when performing the volume computation (see Sect. 6.1). The mass models were constructed with Lenstool, using the parametric approach described in Kneib et al. (1996), Jullo et al. (2007) and Jullo & Kneib (2009). This parametric approach relies on the use of analytical dark-matter (DM) halo profiles to describe the projected 2D mass distribution of the cluster. Two main contributions are considered by Lenstool: one for each large-scale structure of the cluster, and one for each massive cluster galaxy. The parameters of the individual profiles are optimized through a Monte Carlo Markov Chain (MCMC) minimization. Lenstool aims at reducing the cumulative distance in the parameter space between the predicted position of multiple images obtained from the model, and the observed ones. The presence of several robust multiple systems greatly improves the accuracy of the resulting mass model. The use of MUSE is therefore a great advantage as it allowed us to confirm multiple systems through spectroscopic redshifts and also to discover new ones (e.g. Richard et al. (2015); Bina et al. (2016); Lagattuta et al. (2017); Mahler et al. (2018)). Some of the models used in this study are based on the new constraints provided by MUSE. An example of source plane projection of the MUSE FoVs is provided in Fig. 2.

Because of the large number of cluster members, the optimization of each individual galaxy-scale clump cannot be achieved in practice. Instead, a relation combining the constant mass-luminosity scaling relation described in Faber & Jackson (1976) and the fundamental plane of elliptical galaxies is used by Lenstool. This assumption allows us to reduce the parameter space explored during the minimization process, leading to more constrained mass models, whereas individual parameterization of clumps would lead to an extremely degenerate final result and therefore, a poorly constrained mass model. The analytical profiles used were double pseudo-isothermal elliptical potentials (dPIEs) as described in Elíasdóttir et al. (2007). The ellipticity and position angle of these elliptical profiles were measured for the galaxy-scale clumps with SExtractor taking advantage of the high spatial resolution of the HST images.
Because the Brightest Cluster Galaxies (BCGs) lie at the center of clusters, they are subjected to numerous merging processes, and are not expected to follow the same light-mass scaling relation. They are modeled separately in order to not bias the final result. In a similar way, galaxies that are close to the multiple images or critical lines are sometimes manually optimized because of the significant impact they can have on the local magnification and the geometry of the critical lines.

The present MUSE survey has allowed us to improve the reference models available in previous works. Table 3 summarizes their main characteristics. For A1689, the model used is an improvement made on the model of Limousin et al. (2007), previously presented in Bina et al. (2016). For A2390, the reference model is presented in Pello et al. (1991), Richard et al. (2010) and the recent improvements in Pello et al. (in prep.) For A2667, the original model was obtained by Covone et al. (2006) and was updated in Richard et al. (2010). For A2744, the gold model presented in Mahler et al. (2018) was used, including as novelty the presence of NorthGal and SouthGal, two background galaxies included in the mass model as they could have a local influence on the position and magnification of multiple images.

## 5 Selection of the final LAE sample

To obtain the final LAE sample used to build the LF, only one source per multiple-image system was retained. The ideal strategy would be to keep the image with the highest signal-to-noise ratio (which often coincides with the image with highest magnification). However, it is more secure for the needs of the LF determination to keep the sources with the most reliable flux measurement and magnification determination. In practice, it means that we often chose the less distorted and most isolated image. The flux and extraction of all sources among multiple systems were manually reviewed to select the best one to be included in the final sample. All the sources for which the flux measurement failed or that were too close to the edge of the FoV were removed from the final sample. One extremely diffuse and low surface brightness source (Id : A2744, 5681) was also removed as it was impossible to properly determine its profile for the completeness estimation in Sect. 6.2.1.

The final sample consists of 156 lensed LAEs: 16 in A1689, 5 in A2390, 7 in A2667 and 128 in A2744. Out of these 156 sources, four are removed at a later stage of the analysis for completeness reasons (see Sect. 6.2.2) leaving 152 to compute the LFs. The large difference between the clusters on the number of sources detected is expected for two reasons :

• The A2744 cube is a MUSE FoV mosaic and is deeper than the three other fields : on average, four hours exposure time for each quadrant whereas all the others have two hours or less of integration time (see Table 1).

• The larger FoV allows us to reach further away from the critical lines of the cluster, therefore increasing the probed volume as we get close to the edges of the mosaic.

This makes the effective volume of universe explored in the A2744 cube much larger (see end of Sect. 6.1.2) than in the three other fields combined. It is therefore not surprising to find most of the sources in this field. This volume dilution effect is most visible when looking at the projection of the MUSE FoVs in the source plane (see Fig. 2). Even though this difference is expected, it seems that we are also affected by an over-density of background sources at as shown in Fig. 3. This over density is currently being investigated as a potential primordial group of galaxies (Mahler et al. in prep.). The complete source catalog is provided in Table LABEL:tab:sample_tab and the Lyman-alpha luminosity distribution corrected for magnification can be found on the lower panel of Fig. 3. The corrected luminosity was computed from the detection flux with:

 LLyα=FLyαμ4πD2L (1)

where and are respectively the magnification and luminosity distance of the source. Here and in the rest of this work, a flux weighted magnification is used to better account for extended sources and for sources detected close to the critical lines of the clusters where the magnification gradient is very strong. This magnification is computed by sending a segmentation of each LAE in the source plane with Lenstool, measuring a magnification for each of its pixels and making a flux weighted average of it. A full probability density of magnification is also computed for each LAE and used in combination with its uncertainties on to obtain a realistic luminosity distribution when computing the LFs (see Sect. 6.3). Objects with the highest magnification are affected by the strongest uncertainties and tend to have very asymmetric with a long tail towards high magnifications. Because of this effect, LAEs with should be considered with great caution.

Figure 4 compares our final sample with the sample used in the MUSE HUDF LAE LF study (Drake et al. 2017, hereafter D17). The MUSE HUDF (Bacon et al., 2017), with a total of 137 hours of integration, is the deepest MUSE observation to date. It consists of a MUSE FoV mosaic, each of the quadrants being a 10 hours exposure, with an additional pointing (udf-10) of 30 hours, overlaid on the mosaic. The population selected in D17 is composed of 481 LAEs found in the mosaic and 123 in the udf-10, for a total of 604 LAEs. On the upper panel of the figure, the plot presents the luminosity of the different samples versus the redshift. Using lensing clusters, the redshift selection tends to be less affected by luminosity bias, especially for higher redshift. On the lower panel, the normalized distribution of the two populations is presented. The strength of the study presented in D17 resides in the large number of sources selected. However, a sharp drop is observed in the distribution at . Using the lensing clusters, with hours of exposure time and a much smaller lens-corrected volume of survey, a broader luminosity selection was achieved. As discussed in the next sections, despite a smaller number of LAEs compared to D17, the sample presented in this paper is more sensitive to the faint end of the LF by construction.

## 6 Computation of the Luminosity Function

Because of the combined use of lensing clusters and spectroscopic data cubes, it is extremely challenging to adopt a parametric approach to determine a selection function. By construction, the sample of LAEs used in this paper includes sources coming from very different detection conditions, from intrinsically bright emitters with moderate magnification to highly magnified galaxies that could not have been detected far from the critical lines. To properly take into account these differences when computing the LF, we have adopted a non parametric approach allowing us to treat the sources individually: the method (Schmidt, 1968; Felten, 1976). We present in this section the four steps developed to compute the LFs:

• The flux computation, performed for all the detected sources. This step was already described in Sect. 3.2 as the selection of the final sample relies partly on the results of the flux measurements.

• The volume computation for each of the sources included in the final sample, presented in Sect. 6.1.

• The completeness estimation using the real source profiles (both spatial and spectral), presented in Sect. 6.2.

• The computation of the points of the differential LF, using the results of the volume computation and the completeness estimations, presented in Sect. 6.3.

### 6.1 Volume computation in spectroscopic cubes in lensing clusters

The value is defined as the volume of the survey where an individual source could have been detected. The inverse value, , is used to determine the contribution of one source to a numerical density of galaxies. Because this survey consists of several fields of view, the value for a given source must be determined from all the fields that are part of the survey, including the fields where the source is not actually present. The volumes were computed in the source plane to avoid multiple counting of parts of the survey that are multiply imaged. For that, we used Lenstool to get the projection of the MUSE fields in the source plane and then used these projections to compute the volume (see Fig. 2 for an example of source plane projection). In this analysis, the volume computation was performed independently from the completeness estimation, focusing on the spectral noise variations of the cubes only.

The detectability of each LAEs needs to be evaluated on the entire survey to compute . This task is not straightforward, as the detectability depends on many different factors:

• The flux of the source: the brighter the source, the higher the chances to be detected. For a given spatial profile, brighter sources have higher values.

• The surface brightness and the line profile of the source: for a given flux, a compact source would have a higher surface brightness value than an extended one, and therefore would be easier to detect. This aspect is especially important as most LAEs have an extended halo (see Wisotzki et al. 2016).

• The local noise level: at first approximation, it depends on the exposure time. This point is especially important for mosaics where noise levels are not the same on different parts of the mosaic as the noisier parts contribute less to the values.

• The redshift of the source. The Lyman-alpha line profile of a source may be affected by the presence of strong sky lines in the close neighborhood. The cubes themselves have strong variations of noise level caused by the presence of those sky emission lines (See e.g., Fig. 5).

• The magnification induced by the cluster. Where the magnification is too small, the faintest sources could not have been detected.

• The seeing variation from one cube to another

This shows that, in order to properly compute , each source has to be individually considered. The easiest method to evaluate the detectability of sources is to simply mask the brightest objects of the survey, assuming that no objects could be detected behind them. This can be achieved from a white light image, using a mask generated from a SExtractor segmentation map. The volume computation can then be done on the unmasked pixels and only where the magnification is high enough to allow the detection of the source. However, as it is shown in Appendix C, this technique has some limitations to account for the 3D morphologies of real LAEs. For this reason, a method to precisely determine the detectability map (referred to as detection mask or simply masks hereafter) of individual sources has been developed. As the detection process in this work is based on 2D collapsed images, we have adopted the same scheme to build the 2D detection masks, and from them, build the 3D masks in the source plane adapted to each LAE of the sample. Using these individual source plane 3D masks, and as previously mentioned, the volume integration was performed on the unmasked pixels only where the magnification is high enough. In the paragraphs below, we quickly summarize the method adopted to produce masks for 2D images and explain the reasons that lead to the complex method detailed in Sects. 6.1.1 and 6.1.2.

The basic idea of our method producing masks for 2D images is to mimic the SExtractor source detection process: for each pixel in the detection image, we determine whether the source could have been detected, had it been centred on this pixel. For this pseudo detection, we fetch the values of the brightest pixels of the source (hereafter ) and compare them pixel-to-pixel to the background Root Mean Square maps (shortened to just RMS maps) produced by SExtractor from the detection image. The pixels where this pseudo detection is successful are left unmasked, and where it failed, the pixels are masked. Technical details of the method for 2D images can be found in appendix A.
The detection masks produced in this way are binary masks and show where the source could have been detected. We use the term “covering fraction” to refer to the fraction of a single FoV covered by a mask. A covering fraction of 1 means that the source could not be detected anywhere on the image, whereas a covering fraction of 0 means that the source could be detected on the entire image.

This method to produce the detection masks from 2D images is precise and quite simple to implement when the survey consists of 2D photometric images. However, when dealing with 3D spectroscopic cubes, its application becomes much more complicated due to the strong variations of noise level with wavelength in the cubes. Because of these variations, the detectability of a single source through the cubes cannot be represented by a single mask, duplicated on the spectral axis to form a 3D mask. An example of the spectral variations of noise level in a MUSE cube is provided in Fig. 5. These spectral variations are very similar for the four cubes. “Noise level” is used here to refer to the average level of noise on a single layer. It is determined from the RMS cubes, which are created by SExtractor from the detection cube (i.e, the Muselet cube of NB images). For a layer of the RMS cube, the noise level corresponds to the spatial median of the RMS layer over a normalization factor:

 Noise level(RMSi)=x,yx,y (2)

In this equation is the spatial median operator. The 2D median RMS map, , is obtained from a median along the wavelength axis for each spatial pixel of the RMS cube. The normalization is the spatial median value of the median RMS map. The main factor responsible for the high frequency spectral variations of noise level is the presence of sky lines affecting the variance of the cubes.

To properly account for the noise variations, the detectability of each source has to be evaluated throughout the spectral direction of the cubes by creating a series of detection masks from individual layers. These masks are then projected into the source plane for the volume computation. This step is the severely limiting factor, as it would take an excessive amount of computation time. For a sample of 160 galaxies in 4 cubes, sampling different noise levels in cubes at only 10 different wavelengths, we would need to do 6 400 Lenstool projections. This represents more than 20 days of computation on a 60 CPU computer, and it is still not representative of the actual variations of noise level versus wavelength. To circumvent this difficulty, we have developed a new approach that allows for a fine sampling of the noise level variations while drastically limiting the number of source plane reconstructions. A flow chart of the method described in the next sections is provided in Fig. LABEL:fig:flow_chart_volume.

The general idea of the method is to use a signal-to-noise proxy of individual sources instead of comparing their flux to the actual noise. In other words, the explicit computation of the detection mask for every source, wavelength layer and cube is replaced by a set of pre-computed masks for every cube, covering a wide range of SN values, in such a way that a given source can be assigned the mask corresponding to its SN in a given layer. Two independent steps were performed before assembling the final 3D masks:

• Computation of the evolution of SN values through the spectral dimension of the cubes for each LAE.

• For each cube, a series of 2D detection masks were created for an independent set of SN values. This is referred to as the SN curves hereafter.

These two steps are detailed below. The final 3D detection masks were then assembled by successively picking the 2D mask that corresponds to the SN value of the source at a given wavelength in a given cube. This process was done for all sources individually.

For the first step, the signal-to-noise value of a given source was defined as follows, from the bright pixels profile of the source and a RMS map, by comparing the maximum flux of the brightest pixels profile () to the noise level of that RMS map.

For each layer of the RMS cube, we compute the SN value the source would have had at that spectral position in the cube. We point out that this is not a proper signal-to-noise value (hence the use of the term “proxy”) as the normalization used to define the noise levels in Eq. 2 depends on the cube. For a layer of the RMS cube, the corresponding value is given by:

 SNi=max(Bp)Noise level(RMSi) (3)

An example of SN curve defined this way is provided in Fig. 7. For a given source, this computation was done on every layer of every cube part of the survey. When computing the SN of a given source in a cube different from the parent one, the seeing difference (see Table 1) is accounted for by introducing convolution or deconvolution procedure to set the detection image of the LAE to the resolution of the cube considered. As a result for each LAE, three additional images are produced. The four images (original detection image plus the three simulated ones) are then used to measure the value of the brightest pixels in all four seeing conditions. For the deconvolution a python implementation of a Wiener filter part of the Scikit-image package (van der Walt et al., 2014) was used. The deconvolution algorithm itself is presented in Orieux et al. (2010) and for all these computations, the PSF of the seeing is assumed to be gaussian.

For the second step, 2D masks are created from a set of SN values that encompass all the possible values for our sample. To produce a single 2D mask, the two following inputs are needed: the list of bright pixels of the source and the RMS maps produced from the detection image (in our case, the NB images produced by Muselet). To limit the number of masks produced, two simplifications were introduced, the main one being that all RMS maps of a same cube present roughly the same pattern down to a certain normalization factor. This is equivalent to say that all individual layers of the RMS cube can be approximately modeled and reproduced by a properly rescaled version of the same median RMS map. The second simplification is the use of four generalized bright-pixel profiles (hereafter ). To be consistent with the seeing variations, one profile is computed for each cluster, taking the median of all the individual LAE profiles computed from the detection images simulated in each seeing condition (see Fig. 17 for an example of generalized bright pixel profile, also including the effect of seeing). These profiles are normalized in such a way that . For each value of the SN set defined, a mask is created for each cluster from its median RMS map and the corresponding , meaning that the 2D detection masks are no longer associated with a specific source, but with a specific SN value.

Using the definition of SN adopted in Eq. 3, the four are rescaled to fit any value of the SN set and to obtain profiles that are directly comparable to the median RMS maps:

 SNj=max(cj×Bpg)Noise level(RMSmedian) (4)

where is the scaling factor. According to Eq. 2, the noise level of the median RMS maps is just 1, and as mentioned above . We can see that the scaling factor is simply . Therefore the four sets of bright-pixels profiles and the corresponding median RMS maps are used as input to produce the set of 2D detection masks.

After the completion of these two steps, the final 3D detection masks were assembled for every source individually. For this purpose, a subset of wavelength values (or equivalently, a subset of layer index) drawn from the wavelength axis of a MUSE cube was used to resample the SN curves of individual sources. For each source and each entry of this wavelength subset, the procedure fetches the value in the SN set that is the closest to the measured one, and returns the associated 2D detection mask, effectively assembling a 3D mask. An example of this 2D sampling is provided in Fig. 7. To each of the red points resampling the SN curve, a pre-computed 2D detection mask is associated, and the higher the density of the wavelength sampling, the higher the precision on the final reconstructed 3D mask. The important point being that, to increase the sampling density, we do not need to create more masks and therefore it is not necessary to increase the number of source plane reconstructions.

#### 6.1.2 Volume integration

In the previous section we presented the construction of 3D masks in the image plane for all sources, with a limited number of 2D masks. For the actual volume computation, the same was achieved in the source plane by computing the source plane projection of all the 2D masks, and combinning them with the magnification maps. Thanks to the method developed in the previous subsection, the number of source plane reconstructions only depends on the length of the SN set initially defined and the number of MUSE cubes used in the survey. It depends neither on the number of sources in the sample nor the accuracy of the sampling of the SN variations. For the projections, we used PyLenstool 333Python module written by G. Mahler, publicly available at http://pylenstool.readthedocs.io/en/latest/index.html that allows for an automated use of Lenstool. Reconstruction of the source plane were performed for different redshift values to sample the variation of both the shape of the projected area and the magnification. In practice, the variations are very small with redshift, and we reduce the redshift sampling to and .

In a very similar way to what is described at the end of the previous section, 3D masks were assembled and combined with magnification maps, in the source plane. In addition to the closest SN value, the procedure also looks for the closest redshift bin in such a way that, for a given point of the resampled SN curve, the redshift of the projection is the closest to .

The last important aspect to take into account when computing is to limit the survey to the regions where the magnification is such that the source could have been detected. The condition is given by:

 μlimμFdδFd=1 (5)

where is the flux weighted magnification of the source, the detection flux and the uncertainty on the detection which reflects the local noise properties. This condition simply states that is the magnification that would allow for a signal-to-noise ratio of 1, under which the detection of the source would be impossible. It is complex to find a signal-to-noise criterion to use here that would be coherent with the way Muselet works on the detection images, since the images used for the flux computation are different and of variable spectral width compared to the Muslet NBs. Therefore, this criterion for the computation of is intentionally conservative to not overestimate the steepness of the faint end slope.

To be consistent with the difference in seeing values and in exposure time from cube to cube, is computed for each LAE and for each MUSE cube (i.e., four values for a given LAE). A source only detected because of very high magnification in a shallow and bad seeing cube (e.g., A1689), would need a much smaller magnification to be detected in a deeper and better seeing cube (e.g., A2744). For the exposure time difference, the ratio of the median RMS value of the entire cube is used, and for the seeing, the ratio of the squared seeing value is used. In other words, the limiting magnification in A2744 for a source detected in A1689 is given by:

 μlim,A2744=x,y,λx,y,λs2A2744s2A1689×μlim,A1689 (6)

where is the median operator over the three axis of the RMS cubes and is the seeing. The exact same formula can be applied to compute the limit magnification of any source in any cube. This simple approximation is sufficient for now as only the volume of the rare LAEs with very high magnification are dominated by the effects of the limiting magnification.

The volume integration is performed from one layer of the source plane projected (and masked) cubes to the next, counting only pixels with . For this integration, the following cosmological volume formula was used:

 V=ωcH0∫zmaxzminD2L(z′)(1+z′)2E(z′)dz′ (7)

where is the angular size of a pixel, is the luminosity distance, and is given by :

 E(z)=√Ωm(1+z)3+(1−Ωm−ΩΛ)(1+z)2+ΩΛ (8)

In practice, and for a given source, when using more than 300 points to resample the SN curve along the spectral dimension, a stable value is reached for the volume (i.e, less than 5% of variation with respect to a sampling of 1 000 points). A comparison is provided in appendix C between the results obtained with this method and the equivalent ones when a simple mask based on SExtractor segmentation maps is adopted instead. The maximum covolume explored between , accounting for magnification, is about , distributed as follows among the four clusters: Mpc for A1689, Mpc for A2390, Mpc for A2667 and Mpc for A2744.

### 6.2 Completeness determination using real source profiles

Completeness corrections account for the sources missed during the selection process. Applying the correction is crucial for the study of the LF. The procedure used in this article separates, on one hand, the contribution to incompleteness due to SN effects across the detection area, and the contribution due to masking across the spectral dimension on the other hand (see in Sect. 6.1).

The 3D masking method presented in the previous section aims to precisely map the volume where a source could be detected. However, an additional completeness correction was needed to account for the fact that a source does not have a 100% chance of being detected on its own wavelength layer. In the continuity of the non parametric approach developed for the volume computation, the completeness was determined for individual sources. To better account for the properties of sources, namely their spatial and spectral profiles, simulations were performed using their real profiles instead of parameterized realizations. Because the detection of sources was done in the image plane, the simulations were also performed in the image plane, on the actual masked detection layer of a given source (i.e the layer of the NB image cube containing the peak of the Lyman-alpha emission of the source). The mask used on the detection layer was picked using the same method as described in 6.1.1, leaving only the cleanest part of the layer available for the simulations.

#### 6.2.1 Estimating the source profile

To get an estimate of the real source profile, we use the Muselet NB image that captures the peak of the Lyman-alpha emission (called the max-NB image hereafter).

Using a similar method to the one presented in Sect. 3.2, the extraction of sources on the max-NB images were forced by progressively loosening the detection criterion. The vast majority of our sources were successfully detected on the first try using the original parameters used by Muselet for the initial detection of the sample: DETECT_THRES = 1.3 and MIN_AREA = 6.

To recover the estimated profile of a source, the pixels belonging to the source were extracted on the filtered image according to the segmentation map. The filtered image here is the convolved and background subtracted image that SExtractor uses for the detection. The use of filtered images allowed us to retrieve a background-subtracted and smooth profile for each LAE. Fig. 8 presents examples of source profile recovery for three representative LAEs.

A flag was assigned to each extracted profile to reflect the quality of the extraction, based on a predefined set of parameters (detection threshold, minimum number of pixels and matching radius) used for the successful extraction of the source. A source with flag 1 is extremely trustworthy, and was recovered with the original set of parameters used for initial automated detection of the sample. A source with flag 2 is still a robust extraction, and a source with flag 3 is doubtful and is not used for the LF computation. 95% of LAEs were properly recovered with a flag value of 1. The summary of flag values is shown in Table 4. The three examples presented in Fig. 8 have a flag value of 1 and where recovered using DETECT_THRESH = 1.3, MIN_AREA=6 and a matching radius of . Objects with flag are less than 5% of the total sample. For the few sources with an extraction flag above 1, several possible explanations are found, listed by order of importance:

• The image used to recover the profiles () is smaller than the entire max-NB image. As the SExtractor background estimation depends on the size of the input image, this may slightly affect the detection of some objects. This is most likely the predominant reason for a flag value of two.

• Small difference in coordinates between the recovered position and the listed position. This may be due to a change in morphology with wavelength or band-width. By increasing the matching radius to recover the profile, we obtain a successful extraction but we also increase the value of the extraction flag.

• The NB used does not actually correspond to the NB that lead the source to be detected. By picking the NB image that catches the maximum of the Lyman-alpha emission we do not necessarily pick the layer with the cleanest detection. For example the peak could fall in a very noisy layer of the cube, whereas the neighboring layers would provide a much cleaner detection.

• The source is extremely faint and was actually detected with relaxed detection parameters or manually detected.

We have checked that we have not included LAEs that were expected to be at a certain position as part of multiple-image system. This is to say, we have not selected the noisiest images in multiple-image systems.

#### 6.2.2 Recovering mock sources

Once a realistic profile for all LAEs was obtained, source recovery simulations were conducted. For this step, the detection process was exactly the same as the one initially used for the sample detection. However, since we limited the simulations to the max-NB (see Sect. 6.2.1) images and not the entire cubes, we do not need to use the full Muselet software. To gain computation time, we only use SExtractor on the max-NB images, using the same configuration files that Muselet uses, to reproduce the initial detection parameters. In this section, the set of parameters were also DETECT_THRESH = 1.3 and MIN_AREA = 6.

To create the mock images, we use the masked max-NB images. Each source profile was randomly injected many times on the corresponding detection max-NB image, avoiding overlapping. After running the detection process on the mocks, the recovered sources were matched to the injected ones based on their position. The completeness values were derived by comparing the number of successful matches to the number of injected sources. The process was repeated forty times to derive the associated uncertainties.

The results of the completeness obtained for each source of the sample are shown in Fig. 9. The average completeness value over the entire sample is  0.74 and the median value is 0.90. The values are this high because we used masked NB images, effectively making source recovery simulations on the cleanest part of the detection layer only. As seen on this figure, there is no well-defined trend between completeness and detection flux. At a given flux, a compact source detected on a clean layer of the cube will have a higher completeness than a diffuse source with the same flux detected on a layer affected by a sky line. Four LAEs with a flag value of 3 or with a completeness value less than 10% are not used for the computation of the LFs in Sect. 6.3.

A more popular approach to estimate the completeness would be to perform heavy Monte Carlo simulations for each of the cubes in the survey to get a parameterized completeness (see Drake et al. 2017 for an example). The “classical” approach consists in injecting sources with parameterized spatial and spectral morphologies, and to retrieve the completeness as a function of redshift and flux. This method is extremely time consuming, in particular for IFUs where the extraction process is lengthy and tedious. The main advantage of computing the completeness based on the real source profile is that it allows us to accurately account for the different shape and surface brightness of individual sources. And because the simulations are done on the detection image of the source in the cubes, we are also more sensitive to the noise increase caused by sky lines. As seen in Fig.10, except from the obvious flux–completeness correlation, it is difficult to identify correlations between completeness and redshift or sky lines. This tends to show that the profile of the sources is a dominant factor when it comes to properly estimating the completeness. The same conclusion was reached in D17 and in Herenz et al. (2019). A non-parametric approach of completeness is therefore better suited in the case of lensing clusters, where a proper parametric approach is almost impossible to implement due to the large number of parameters to take into account (e.g. spatial and spectral morphologies including distortion effects, lensing configuration, cluster galaxies).

### 6.3 Determination of the Luminosity Function

To study the possible evolution of the LF with redshift, the 152 LAE population has been subdivided into several redshift bins : , and . In addition to these three LFs, the global LF for the entire sample was also determined. For a given redshift and luminosity bin, the following expression to build the points of the differential LFs was used:

 Φ(Li)=1ΔlogLi∑j1CjVmax,j (9)

where corresponds to the width of the luminosity bin in logarithmic space, is the index corresponding to the sources falling in the bin indexed by , and stands for the completeness correction of the source j.

To properly account for the uncertainties affecting each LAE, Monte Carlo (MC) iterations are performed to build 10 000 catalogs from the original catalog. For each LAE in the parent catalog, a random magnification is drawn from its , and a random flux and completeness values are also drawn assuming a Gaussian distribution of width fixed by their respective uncertainties. A single value of the LF was obtained at each iteration following Eq. 9. The distribution of LF values obtained at the end of the process was used to derive the average in linear space, and to compute asymmetric error bars. MC iterations are well suited to account for LAEs with poorly constrained luminosities. This happens for sources close, or even on the critical lines of the clusters. Drawing random values from their probability density and uncertainties for magnification and flux results in a luminosity distribution (see Eq. 1) that allows these sources to have a diluted contribution across several luminosity bins.

For the estimation of the cosmic variance, we used the cosmic variance calculator presented in Trenti & Stiavelli (2007). Lacking other options, a single compact geometry made of the union of the effective areas of the four FoVs is assumed and used as input for the calculator. The blank field equivalent of our survey is an angular area of about . Given that a MUSE FoV is a square of size , the observed area of the present survey is roughly square. Our survey is therefore roughly equivalent to a bit more than only one MUSE FoV in a blank field. The computation is done for all the bins as the value depends on the average volume explored in each bin as well as on the intrinsic number of sources. The uncertainty due to cosmic variance on the intrinsic counts of galaxies in a luminosity bin typically range from 15% to 20% for the global LF and from 15% to 30% for the LFs computed in redshift bins. For , the total error budget is dominated by the MC dispersion, which is mainly caused by objects with poorly constrained luminosity jumping from one bin to another during the MC process. The larger the bins the lesser this effect is, as a given source is less likely to jump outside of a larger bin. For the Poissonian uncertainty is slightly larger than the cosmic variance but does not completely dominate the error budget. Finally for , the Poissonian uncertainty is the dominant source of error due to the small volume and therefore the small number of bright sources in the survey.

The data points of the derived LFs and the corresponding error bars are listed in Table 5. These LF points provide solid constraints on the shape of the faint end of the LAE distribution. In the following sections, we elaborate on these results and discuss the evolution of the faint end slope as well as the implications for cosmic reionization.

## 7 Parametric fit of the LF

The differential LFs are presented in Fig. 11 for the four redshift bins. Some points in the LF, displayed as empty squares, are considered as unreliable and presented for comparison purpose only. Therefore, they are not used in the subsequent parametric fits. An LF value is considered as unreliable when it is dominated by the contribution of a single source, with either a small or a low completeness value, due to luminosity and/or redshift sampling. These unreliable points are referred to as “incomplete” hereafter. The rest of the points are fitted with a straight line as a visual guide, the corresponding 68% confidence regions are represented as shaded areas. For , the exercise is limited due to the large uncertainties and the lack of constraints on the bright end. The measured mean slope for the four LFs are: for , for , for and for . These values are consistent with no evolution of the mean slope with redshift.

In addition, and because the integrated value of each LF is of great interest regarding the constraints they can provide on the sources of reionization, the different LFs were fitted with the standard Schechter function (Schechter, 1976) using the formalism described in Dawson et al. (2007). The Schechter function is defined as:

 (10)

where is a normalization parameter, a characteristic luminosity that defines the position of the transition from the power law to the exponential law at high luminosity, and is the slope of the power law at low luminosity. In logarithmic scale the Schechter function writes as:

 Φlog(L)d(logL)=(Lloge)(Φ∗L∗)(LL∗)αexp(−LL∗)d(logL) (11)

It represents the numerical density per logarithmic luminosity interval. The fits were done using the python package Lmfit (Newville et al., 2014) which is specifically dedicated to non-linear optimization and provides robust estimations for confidence intervals. We define an objective function, accounting for the strong asymmetry in the error bars, whose results are then minimized in the least-square sense, using the default Levenberg-Marquardt method provided by the package. The results of this first minimization are then passed to a MCMC process444Lmift uses the emcee algorithm implementation of the emcee python package (see Foreman-Mackey et al. 2013) that uses the same objective function. The uncertainty on the three parameters of the Schechter function () are recovered from the resulting individual posterior distributions. The minimization in the least-square sens is an easy way to fit our data but is not guaranteed to give the most probable parameterization for the LFs. A more robust method would be the maximum-likelihood method. However, because of the non parametric approach used in this work to build the points of the LF, taking into account the specific complexity of the lensing regime, the implementation of a maximum likelihood approach such as the one developed in D17 or in Herenz et al. (2019) could not be envisaged.

Because of the use of lensing clusters, the volume of Universe explored is smaller than in blank field surveys. The direct consequence is that we are not very efficient in probing the transition area around and the high luminosity regime of the LF. Instead, the lensing regime is more efficient in selecting faint and low luminosity galaxies and is therefore much more sensitive to the slope parameter. To properly determine the three best parameters, additional data are needed to constrain the bright end of the LFs. To this aim, previous LFs from the literature are used and combined together into a single average LF with the same luminosity bin size as the LFs derived in this work. This last point is important to ensure that the fits are not dominated by the literature data points that are more numerous, with smaller bin sizes and uncertainties. In this way we determine the three Schechter parameters while properly sampling the covariance between them.

The choice of the precise data sets used for the Schechter fits is expected to have a significant impact on the results, including possible systematic effects. To estimate the extent of this effect and its contribution to uncertainties, different series of data sets were used to fit the LF, among those available in a given redshift interval (see Fig. 13). The best-fit parameters recovered are found to be always consistent within their own error bars.

In addition, the error bars do not account for the error introduced by the binning of the data. To further test the robustness of the slope measurement and to recover more realistic error bars, different bins were tested for the construction of the LF. The exact same fit process was applied to the resulting LFs. The confidence regions derived from these tests are shown in Fig. 12 for and . The bins used hereafter to build the LFs are identified in this figure as black lines. We estimate that these bins are amongst the most reliable possibilities, and in the following they will be referred to as the ”optimal” bins. They were determined in such a way that each bin is properly sampled in both redshift and luminosity, and has a reasonable level of completeness. From Fig. 12, it can be seen that is very stable for and that all the posterior distributions are very similar. Because we are able to probe very low luminosity regimes far below , the effect of binning on the measured slope is negligible for because of the increased statistics. As redshift increases, due to lower statistics and higher uncertainties, the effects of binning on the measured slope increases. For the LF is affected by a small overdensity of LAEs at resulting in a higher dispersion on the faint end slope value when testing different binnings. It was ensured that the optimal binning allowed this fit to be consistent with the fit made for : in both cases the points at , affected by the same sources at , are treated as a small overdensity with respect to the Schechter distribution. Finally, for , the lack of statistics seriously limits the possibilities of binnings to test. The only viable options are the two presented on the right panel of Fig. 12: in both cases the quality of the fit is poor compared to the other redshift bins, but the measured slopes are consistent within their own error bars.

The LF points from the literature used to constrain the bright end are taken from Blanc et al. (2011) and Sobral et al. (2018) for and , Dawson et al. (2007), Zheng et al. (2013) and Sobral et al. (2018) for and finally Ouchi et al. (2010), Santos et al. (2016), Konno et al. (2018) and Sobral et al. (2018) for . The goal here is to extend our own data towards the highest luminosities using available high-quality data with enough overlap to check the consistency with the present data set. The best fits and the literature data sets used for the fits are also shown in Fig. 13 as respectively full lines and lightly colored diamonds. The dark red colored regions indicate the 68% and 95% confidence areas for the Schechter fit. The best Schechter parameters are listed in Table 6. In addition, this Table contains the results obtained when the exact same method of LF computation is applied to the sources of A2744 as an independent data set. This is done to assess the robustness of the method and to see whether or not the addition of low volume / high magnification cubes add significant constraints on the faint end slopes. All four fits made using the complete sample are summed up in Fig. 14 which shows the evolution of the confidence regions for , and with redshift.

From Table 6, it can be seen that the results are very similar for and when considering A2744 only or the full sample. For and the recovered slopes exhibit a small difference at the level. This difference is caused by one single source with that has a high contribution to the density count. When adding more cubes and sources, the contribution of this LAE is averaged down because of the larger volume and the contribution of other LAEs. This argues in favor of a systematic underestimation of the cosmic variance in this work. Using the results of cosmological simulations to estimate a proper cosmic variance is out of the scope of this paper. For the higher redshift bin, even though the same slope is measured when using only the LAEs of A2744, the analysis can only be pushed down to (instead of for the other redshift bins or when using the full sample). This shows the benefit of increasing the number of lensing fields to avoid a sudden drop in completeness at high redshift. The effect of increasing the number of lensing fields will be addressed in a future article in preparation. In the following, only the results obtained with the full sample are discussed

The values measured for are in good agreement with the literature (e.g, in Dawson et al. (2007) for , in Santos et al. (2016) for and a fixed value of , and in Hu et al. (2010) for and a fixed value of ) and tend to increase with redshift. This is not a surprise as this parameter is most sensitive to the data points from the literature used to fit the Schechter functions. Given the large degeneracy and therefore large uncertainty affecting the normalization parameter , a direct comparison and discussion with previous studies is difficult and not so relevant. Regarding the parameter, the Schechter analysis reveals a steepening of the faint end slope with increasing redshift, which in itself means an increase in the observed number of low luminosity LAEs with respect to the bright population, with redshift. However, this is a trend that can only be seen in the light of the Schechter analysis, with a solid anchorage of the bright end, and cannot be seen using only the points derived in this work (see e.g., Fig. 11).

Taking advantage of the unprecedented level of constraints on the low luminosity regime, the present analysis has confirmed a steep faint end slope varying from at to at . The result for the lower redshift bin is not consistent with measured using the maximum likelihood technique in D17. At higher redshift, the slopes measured in D17 are upper limits, consistent with all the values in Table 6. The points in purple in Fig. 13 are the points derived with the from this same study. It can be seen that there is a systematic difference, increasing at lower luminosity for , and . This difference, taken at face value, could be evidence for a systematic underestimation of the cosmic variance both in this work and in D17. This aspect clearly requires further investigation in the future. Faint end slope values of for and for ( for ) were found in respectively Santos et al. (2016) and Konno et al. (2018). These values are reasonably consistent with our measurement made for . Here again, the comparison with the literature is quite limited as the faint end slope is often fixed (see e.g. Dawson et al., 2007; Ouchi et al., 2010) or the luminosity range probed is not adequate leading to poor constraints on .

From Fig. 13, we see that the Schechter function provides a relatively good fit for , , and . The overdensity in number count at for is indeed seen as an overdensity with respect to the Schechter distribution. For however, the fit is not as good with one point well above the confidence area. The final goal of this work is not the measurement of the Schechter slope in itself, but to provide a solid constraint on the shape of the faint end of the LF. Furthermore it is not certain that such a low luminosity population is expected to follow a Schechter distribution. Some studies have already been exploring the possibility of a turnover in the luminosity function of UV selected galaxies (e.g. Bouwens et al., 2017; Atek et al., 2018), and the same possibility is not to be excluded for the LAE population. For the specific needs of this work, it remains convenient to adopt a parametric form as it makes easier the computation of proper integrations with correct error transfer (see Sect. 8) and facilitates the comparison with previous and future works. When talking about integrated LFs, any reasonable deviations from the Schechter form is of little consequence as long as the fit is representative of the data. In other words, as long as no large extrapolation toward low luminosity is made, our Schechter fits provide a good estimation of the integrated values.

## 8 Discussion and contribution of LAEs to reionization

In this section, before going to the integration of the LFs and the constraints and implications for reionization, we discuss the uncertainties introduced by the use of lensing.

As part of the Hubble Frontiers Fields (HFF) program, several good quality mass models were produced and made publicly available by different teams, using different methodologies. The uncertainties introduced by the use of lensing fields when measuring the faint end of the UV luminosity function are discussed in detail in Bouwens et al. (2017) and Atek et al. (2018) through simulations. A more general discussion on the reason why mass models of the same lensing cluster may differ from one another can be found in Priewe et al. (2017). And finally, a thorough comparison of the mass reconstruction produced by different teams with different methods from simulated lensing clusters and HST images is done in Meneghetti et al. (2017). The uncertainties are of two types:

• The large uncertainties for high magnification values. This aspect is well treated in this work through the use of which allows any source to have a diluted and very asymmetric contribution to the LF over a large luminosity range. This aspect was already addressed in Sect. 5.

• The possible systematic variation from one mass model to another. This aspect is more complex as it has an impact on both the individual magnification of sources and on the total volume of the survey.

Figure 15 illustrates the problem of variation of individual magnification from one mass model to another, using the V4 models produced by the GLAFIC team (Kawamata et al., 2016; Kawamata, 2018), Sharon & Johnson (Johnson et al., 2014), and Keeton that are publicly available on the Frontier Fields website . Since we are restricted to the HFF, this comparison can only be done for the LAEs of A2744. The figure shows the Lyman-alpha luminosity histograms when using alternatively the individual magnification provided by these three additional models. The bin size is which is the bin size used in this work for the LFs at , and . For the highest dispersion is of the order of 15%. This shows that even though there is a dispersion when looking at the magnification predicted by the four models, the underlying luminosity population remains roughly the same. Regarding the needs of the luminosity function, this is the most important point.

Figure 10 of Atek et al. (2018) shows an example of the variations of volume probed with rest-frame UV magnitude using different mass models for the lensing cluster MACS1149. This evolution is very similar for the models derived by the Sharon and Keeton teams and, in the worst case scenario, implies a factor of 2 of difference among the models compared in this figure. These important variations are largely caused by the lack of constraints on the mass distribution outside of the multiple image area: a small difference in the outer slope of the mass density affects the overall mass of the cluster and therefore the total volume probed. However, unlike other lensing fields from the HFF program, A2744 has an unprecedented number of good lensing constraints at various redshifts thanks to the deep MUSE observations. These constraints were shared between the teams and are included in all the V4 models used for comparison in this work. These four resulting mass models are robust and coherent, at the state of the art of what can be achieved with the current facilities. It has also been shown by Meneghetti et al. (2017) based on simulated cluster mass distributions, that the methodology employed by the CATS (the CATS model for A2744 is the model presented in Mahler et al. (2018)) and GLAFIC teams are among the best to recover the intrinsic mass distribution of galaxy clusters. To test the possibility of a systematic error on the survey volume, the surface of the source plane reconstruction of the MUSE FoV is compared at using the same four models as in Fig. 15. The surfaces are , , and using respectively the mass models of Mahler, GLAFIC, Keeton and Sharon. The strongest difference is observed between the models of Mahler and Sharon and corresponds to a relatively small difference of only 25%.

Given the complex nature of the MUSE data combined with the lensing cluster analysis, precisely assessing the effect of a possible total volume bias is non trivial and out of the scope of this paper. From this discussion it seems clear that the use of lensing fields introduces an additional uncertainty on the faint end slope. However the luminosity limit under which this effect becomes dominant remains unknown as all the simulations (Bouwens et al., 2017; Atek et al., 2018) were only done for the UV LF for which the data structure is much simpler.

In order to estimate the contribution of the LAE population to the cosmic reionization, its Star Formation Rate Density (SFRD) was computed. From the best parameters derived in the previous section, the integrated luminosity density was estimated. The SFRD produced by the LAE population can be estimated using the following prescription for the (Kennicutt, Jr., 1998) assuming the case B for the recombination (Osterbrock & Ferland, 2006):

 SFRDLyα[M⊙yr−1Mpc−3]=LLyα[erg s−1 Mpc−3]/1.05×1042 (12)

This expression assumes an escape fraction of the Lyman-alpha photons () of 1 and is therefore a lower limit for the SFRD. Uncertainties on this integration were estimated with MC iterations, by perturbing the best-fit parameters within their rescaled error bars, neglecting the correlations between the parameters. The values obtained for the and are presented in Table 6, for a lower limit of integration of which corresponds to the lowest luminosity points used to fit the LFs (i.e, no extrapolation towards lower luminosities). is used as upper limit for all integrations. The upper limit has virtually no impact on the final result because the LF drops so steeply at higher luminosity.

We show on Fig. 16 the results obtained using different lower limits of integration and how they compare to previous studies of both LBG and LAE luminosity functions. The yellow area corresponds to the and SFRD needed to fully reionize the universe, estimated from the cosmic ionizing emissivity derived in Bouwens et al. (2015a). The cosmic emissivity was derived using a clumping factor of 3, the conversion to UV luminosity density was done assuming = 24.50 where is the escape fraction of UV photons and is the Lyman-continuum photon production efficiency. Finally the conversion to SFRD was done with the following relation (see Kennicutt, Jr. 1998; Madau et al. 1998). Because all the slopes are over (for the integral of the Schechter parameterisation diverges), the integrated values increase relatively slowly when decreasing the lower luminosity limit. On the same plot, the SFRD computed from the integration of the LFs derived in Bouwens et al. (2015b) are shown in darker grey for two limiting magnitudes : (which is the observation limit) and which is thought to be the limit of galaxy formation (e.g. Rees & Ostriker 1977, Mac Low & Ferrara 1999 and Dijkstra et al. 2004).

From this plot, and with , we see that the observed LAE population only is not enough to fully reionize the universe at , even with a large extrapolation of 2 dex down to . However, a straightforward comparison is dangerous: an escape fraction would be roughly enough to match the cosmic ionizing emissivity needed for reionization at . Moreover, in this comparison, we implicitly assumed that the LAE population has the same properties () as the LBG population in Bouwens et al. (2015b). A recent study on the typical values of and its scatter for typical star forming galaxies at by Shivaei et al. (2018) has shown that is highly uncertain due to galaxy-to galaxy variations on the stellar population and UV dust attenuation, while most current estimates at high- rely on (too) simple prescriptions from stellar population models. The SFRD obtained from LAEs when no evolution in is introduced remains roughly constant as a function of redshift when no extrapolation is introduced and slightly increases with redshift when using . Figure 16 shows in green/blue, the values derived in previous studies of the LAE LF, namely Ouchi et al. (2008), Cassata et al. (2011) (hereafter, O08, C11) and D17. In C11, a basic correction for IGM absorption was performed assuming varying from 15% at to 50% at and using a simple radiative transfer prescription from Fan et al. (2006). This correction can easily explain the clear trend of increase of SFRD with redshift and the discrepancy with our points at higher redshift. At lower redshifts, the IGM correction is lower and the points are in a relatively good agreement. The points in O08 are the result of a full integration of the LFs with a slope fixed at and are in reasonable agreement for all redshift domains. The two higher redshift points derived in D17 are inconsistent with our measurements. This is not a surprise as the slopes derived in D17 are systematically steeper and inconsistent with this work.

The use of an IFU (MUSE) in D17, in Herenz et al. (2019) (hereafter H19) and in this survey ensures that we better recover the total flux, even though we may still miss the faintest part of the extended Lyman-alpha halos (see Wisotzki et al. 2016). This is not the case for NB (e.g. O08, ) or slit-spectroscopy (e.g. Cassata et al., 2011) surveys where a systematic loss of flux is possible for spatially extended sources or broad emission lines due to the limited aperture of the slits or the limited spectral width of NB filters. It is noted in H19 that the LF estimates in C11 tend to be lower than most literature estimates (including those in H19). One possible explanation would be a systematic loss of flux which results in a systematic shift of the derived LF towards lower luminosities. Interestingly, when assuming point-like sources to compute the selection function, H19 manages to recover very well the results of C11 for this redshift domain. It is also interesting to see that as luminosity decreases, the LF estimates from C11 become more and more consistent with the points and Schechter parameterisation derived in this work. For , the C11 LF is even fully consistent with the Schechter parameterisation across the entire luminosity domain (see Fig. 13). The following line of thought could explain the concordance of this work with the C11 estimates at lower luminosity and higher redshift:

• at lower luminosity/higher redshift, a higher fraction of LAEs detected are point-like sources, making the C11 LFs more consistent with our values.

• at higher luminosity / lower redshift, more extended LAEs are detected and a more complex correction is needed to get a realistic LF estimate.

The second advantage of using an IFU is linked to the selection of the LAE population. O08 use a NB photometric selection of sources with spectroscopic follow-up to confirm the LAE candidates. This results in an extremely narrow redshift window which is likely to lead to lower completeness of the sample due to the two-steps selection process. The studies by D17 and H19, adopt the same approach as this work: a blind spectroscopic selection of sources. In addition, as shown in Fig. 4 and stated in Sect. 7 when discussing the differences in slope between A2744 alone and the full sample, the use of highly magnified observations allows for a more complete source selection at increasing redshift. The sample used in the present work could arguably have a higher completeness level than other previous studies.

To summarize the above discussion, the observational strategy adopted in this study by combining the use of MUSE and lensing clusters has allowed us to:

• reach fainter luminosities, providing better constraints on the faint end slope of the LF, while still taking advantage of the previous studies to constrain the bright end.

• recover a greater fraction of flux for all LAEs.

• cover a large window in redshift and flux.

• reach a higher level of completeness, especially at high redshift.

A steepening of the faint end slope is observed with redshift which follows what is usually expected. This trend can be explained by a higher proportion of low luminosity LAEs observed at higher redshift due to higher dust content at lower redshift. On the other hand, the density of neutral hydrogen is expected to increase across the interval, reducing the escape fraction of Lyman-alpha photons, a trend affecting LAEs in a different way depending on large scale structure. While an increase of SFRD with redshift is observed, the evolution of the observed is also affected by . From the literature point of view, the expected evolution of is an increase with redshift up to and then a sudden drop at higher redshift (see e.g. Clément et al. 2012, Pentericci et al. 2014). For , the increase of is generally explained by the reduced amount of dust at higher redshift. And for and above, we start to probe the reionization era and due to the increasing amount of neutral hydrogen and the resonant nature of the , the escape fraction is expected to drop at some point. It has been suggested in Trainor et al. (2015) and Matthee et al. (2016) that the escape fraction would decrease with an increasing SFRD. This would only increase the significance of the trend observed, as it means the points with the higher SFRD would have a larger correction.

Furthermore the derived LFs and the corresponding SFRD values could be affected by bubbles of ionized hydrogen, especially in the last redshift bin. In our current understanding of the phenomenon, reionization is not an homogeneous process (Becker et al., 2015; Bosman et al., 2018). It could be that the expanding areas of ionized hydrogen develop faster in the vicinity of large structures with a high ionizing flux, leaving other areas of the universe practically untouched. There is an increasing observational evidence of this effect (see e.g. Stark et al. 2017). It was shown in Matthee et al. (2015), using a simple toy model, that an increased amount of neutral hydrgen in the IGM could produce a flattening of the faint end shape of the LF. This same study also concluded that the clustering of LAEs had a large impact on the individual escape fraction, which makes it difficult to estimate a realistic correction, as the escape fraction should be estimated on a source to source basis.

As previously discussed, it is neither certain nor expected that the LAE population alone is enough to reionize the universe at . However, the LBG and the LAE population have roughly the same level of contribution to the total SFRD at face value. Depending on the intersection between the two populations, the observed LAEs and LBGs together could produce enough ionizing flux to maintain the ionized state of the universe at .

This question of the intersection is crucial in the study of the sources of reionization. Several authors have addressed the prevalence of LAE among LBG galaxies, and shown that the fraction of LAE increases for low-luminosity UV galaxies till , whereas the LAE fraction strongly decreases towards (see e.g. Stark et al. 2010, Pentericci et al. 2011). The important point however is to precisely determine the contribution of the different populations of star-forming galaxies within the same volume, a problem that requires the use of 3D/IFU spectroscopy. As a preliminary result, we estimate that % of the sample presented in this study has no detected counterpart on the deep images of the Hubble Frontier Fields. A similar analysis is being conducted on the deepest observations of MUSE on the Hubble Ultra Deep Field (Maseda et al., 2018).

## 9 Conclusions

The goal of this study was to set constraints on the sources of cosmic reionization by studying the LAE LF. Taking advantage of the great capabilities of the MUSE instrument and using lensing clusters as a tool to reach lower luminosities, we blindly selected a population of 156 spectroscopically identified LAEs behind four lensing clusters, with and magnification corrected luminosities .

Given the complexity in combining the spectroscopic data cubes of MUSE with gravitational lensing, and taking into account that each source needs an appropriate treatment to properly account for its magnification and representativity, the computation of the LF needed a careful implementation, including some original developments. For these needs, a specific procedure has been developed, including the following new methods:

• A precise computation for the sources found behind lensing clusters. Based on the creation of 3D masks, this method allows to precisely map the detectability of a given source in MUSE spectroscopic cubes. These masks are then used to compute the cosmological volume in the source plane. This method could be easily adapted to be used in blank field surveys.

• Completeness determination based on simulations using the real profile of the sources. Instead of performing a heavy parametric approach based on Monte Carlo source injection and recovery simulations, which is not ideally suited for lensed galaxies, this method uses the real profile of sources to estimate their individual completeness. The method is faster, more flexible and accounts in a better way for the specificities of individual sources, both in the spatial and spectral dimensions.

After applying this procedure to the LAE population, the Lyman-alpha Luminosity Function has been built for different redshift bins using 152 of the 156 detected LAEs. Four LAEs were removed because their contribution was not trustworthy. Because of the observational strategy, this study provides the most reliable constraints on the shape of the faint-end of the LFs to date and therefore, a more precise measurement of the integrated SFRD associated to the LAE population. The results and conclusions can be summarized as follows :

• The LAE population found behind the four lensing clusters was split in four redshift bins: , , and . Due to the lensing effect, the volume of universe probed is greatly reduced in comparison to blank field studies. The estimated average volume of universe probed in the four redshift bins are respectively Mpc, Mpc, Mpc and Mpc.

• The LAE LF was computed for the four redshift bins. By construction of the sample, the derived LFs efficiently probe the low luminosity regime and the data from this survey alone provide solid constraints on the shape of the faint end of the observed LAE LFs. No significant evolution in the shape of the LF with redshift is found using these points only. These results have to be taken with caution given the complex nature of the lensing analysis on one hand, and the small effective volume probed by the current sample on the other hand. Our results argue in the sense of a possible systematic underestimation of cosmic variance in the present and other similar works.

• A Schechter fit of the LAE LF was performed by combining the LAE LF computed in this analysis with data from previous studies to constrain the bright end. As a result of this study, a steep slope was measured for the faint end, varying with redshift between at and at

• The values were obtained as a function of redshift by the integration of the corresponding Lyman-alpha LF, and compared to the levels needed to ionize the universe as determined in Bouwens et al. (2015a). No assumptions were made regarding the escape fraction of the Lyman-alpha photons and the derived in this work correspond to the observed values. Because of the well-constrained LFs and a better recovery of the total flux, we estimate that the present results are more reliable than previous studies. Even though the LAE population undoubtedly contributes to a significant fraction of the total SFRD, it remains unclear whether this population alone is enough to ionize the universe at . The results depend on the actual escape fraction of Lyman-alpha photons.

• The LAEs and the LBGs have a similar level of contribution at to the total SFRD level of the universe. Depending on the intersection between the two populations, the union of both the LAE and LBG populations may be enough to reionize the universe at .

Through this work, we have shown that the capabilities of the MUSE instrument make it an ideal tool to determine the LAE LF. Being an IFU, MUSE allows a blind survey of LAEs, homogeneous in redshift, with a better recovery of the total flux as compared to classical slit facilities. The selection function is also better understood as compared to NB imaging.

About of the present LAE sample have no identified photometric counterpart, even on the deepest surveys to date (HST Frontier Fields). This is an important point to keep in mind as this is a first element of response regarding the intersection between the LAE and LBG populations. Further investigation is needed to better quantify this intersection. Also the extension of the method presented in this paper to other lensing fields should make it possible to improve the determination of the Lyman-alpha LF, and to make the constraints on the sources of the reionization more robust.

###### Acknowledgements.
We thank the anonymous referee for her/his critical review and useful suggestions. This work has been carried out thanks to the support of the OCEVU Labex (ANR-11-LABX-0060) and the A*MIDEX project (ANR-11-IDEX-0001-02) funded by the ”Investissements d’Avenir” French government program managed by the ANR. Partially funded by the ERC starting grant CALENDS (JR, VP, BC, JM), the Agence Nationale de la recherche bearing the reference ANR-13-BS05-0010-02 (FOGHAR), and the “Programme National de Cosmologie and Galaxies” (PNCG) of CNRS/INSU, France. GdV, RP, JR, GM, JM, BC and VP also acknowledge support by the Programa de Cooperacion Cientifica - ECOS SUD Program C16U02. NL acknowledges funding from the European Research Council (ERC) under the European Union’s Horizon 2020 research and innovation programme (grant agreement No 669253), ABD acknowledges support from the ERC advanced grant “Cosmic Gas” and LW acknowledges support by the Competitive Fund of the Leibniz Association through grant SAW-2015-AIP-2. Based on observations made with ESO Telescopes at the La Silla Paranal Observatory under programme IDs 060.A-9345, 094.A-0115, 095.A-0181, 096.A-0710, 097.A0269, 100.A-0249 and 294.A-5032. Also based on observations obtained with the NASA/ESA Hubble Space Telescope, retrieved from the Mikulski Archive for Space Telescopes (MAST) at the Space Telescope Science Institute (STScI). STScI is operated by the Association of Universities for Research in Astronomy, Inc. under NASA contract NAS 5-26555. This research made use of Astropy, a community-developed core Python package for Astronomy (Astropy Collaboration et al., 2013). All plots in this paper were created using Matplotlib (Hunter, 2007).

## References

• Akhlaghi & Ichikawa (2015) Akhlaghi, M. & Ichikawa, T. 2015, ApJ Supplement Series, 220, 1
• Astropy Collaboration et al. (2013) Astropy Collaboration, Robitaille, T. P., Tollerud, E. J., et al. 2013, A&A, 558, A33
• Atek et al. (2014) Atek, H., Richard, J., Kneib, J.-P., et al. 2014, ApJ, 786, 60
• Atek et al. (2018) Atek, H., Richard, J., Kneib, J.-p., & Schaerer, D. 2018, MNRAS, 479, 5184
• Bacon et al. (2010) Bacon, R., Accardo, M., Adjali, L., et al. 2010, in Ground-based and Airborne Instrumentation for Astronomy III, Vol. 7735, 773508
• Bacon et al. (2015) Bacon, R., Brinchmann, J., Richard, J., et al. 2015, A&A, 575, A75
• Bacon et al. (2017) Bacon, R., Conseil, S., Mary, D., et al. 2017, A&A, 608, A1
• Becker et al. (2015) Becker, G. D., Bolton, J. S., Madau, P., et al. 2015, MNRAS, 447, 3402
• Bertin & Arnouts (1996) Bertin, E. & Arnouts, S. 1996, A&A Supplement Series, 117, 393
• Bina et al. (2016) Bina, D., Pelló, R., Richard, J., et al. 2016, A&A, 14
• Blanc et al. (2011) Blanc, G. a., Adams, J. J., Gebhardt, K., et al. 2011, ApJ, 736, 31
• Bosman et al. (2018) Bosman, S. E. I., Fan, X., Jiang, L., et al. 2018, MNRAS, 479, 1055
• Bouwens et al. (2015a) Bouwens, R. J., Illingworth, G. D., Oesch, P. A., et al. 2015a, ApJ, 811, 140
• Bouwens et al. (2015b) Bouwens, R. J., Illingworth, G. D., Oesch, P. A., et al. 2015b, ApJ, 803, 34
• Bouwens et al. (2017) Bouwens, R. J., Oesch, P. A., Illingworth, G. D., Ellis, R. S., & Stefanon, M. 2017, ApJ, 843, 129
• Broadhurst et al. (2005) Broadhurst, T., Benitez, N., Coe, D., et al. 2005, ApJ, 621, 53
• Caminha et al. (2016) Caminha, G., Karman, W., Rosati, P., et al. 2016, A&A, 595, A100
• Cassata et al. (2011) Cassata, P., Le Fèvre, O., Garilli, B., et al. 2011, A&A, 525, A143
• Clément et al. (2012) Clément, B., Cuby, J. G., Courbin, F., et al. 2012, A&A, 538, A66
• Covone et al. (2006) Covone, G., Kneib, J.-P., Soucail, G., et al. 2006, A&A, 456, 409
• Dawson et al. (2007) Dawson, S., Rhoads, J. E., Malhotra, S., et al. 2007, ApJ, 671, 1227
• Dijkstra et al. (2004) Dijkstra, M., Haiman, Z., Rees, M. J., & Weinberg, D. H. 2004, ApJ, 601, 666
• Drake et al. (2017) Drake, A., Garel, T., & Wisotzki, L. 2017, A&A, 608
• Elíasdóttir et al. (2007) Elíasdóttir, Á., Limousin, M., Richard, J., et al. 2007, ArXiv e-prints [\eprint[arXiv]0710.5636]
• Faber & Jackson (1976) Faber, S. M. & Jackson, R. E. 1976, ApJ, 204, 668
• Fan et al. (2006) Fan, X., Strauss, M. A., Becker, R. H., et al. 2006, AJ, 132, 117
• Felten (1976) Felten, J. E. 1976, ApJ, 207, 700
• Fontanot et al. (2012) Fontanot, F., Cristiani, S., & Vanzella, E. 2012, MNRAS, 425, 1413
• Foreman-Mackey et al. (2013) Foreman-Mackey, D., Hogg, D. W., Lang, D., & Goodman, J. 2013, PASP, 125, 306
• Herenz et al. (2017) Herenz, E. C., Urrutia, T., Wisotzki, L., et al. 2017, A&A, 606, A12
• Herenz et al. (2019) Herenz, E. C., Wisotzki, L., Saust, R., et al. 2019, A&A, 621, A107
• Hu et al. (2010) Hu, E. M., Cowie, L. L., Barger, A. J., et al. 2010, ApJ, 725, 394
• Hunter (2007) Hunter, J. D. 2007, Computing In Science & Engineering, 9, 90
• Infante et al. (2015) Infante, L., Zheng, W., Laporte, N., et al. 2015, ApJ, 815, 18
• Ishigaki et al. (2015) Ishigaki, M., Kawamata, R., Ouchi, M., et al. 2015, ApJ, 799, 12
• Johnson et al. (2014) Johnson, T. L., Sharon, K., Bayliss, M. B., et al. 2014, ApJ, 797, 48
• Jullo & Kneib (2009) Jullo, E. & Kneib, J. P. 2009, MNRAS, 395, 1319
• Jullo et al. (2007) Jullo, E., Kneib, J. P., Limousin, M., et al. 2007, New Journal of Physics, 9, 447
• Karman et al. (2016) Karman, W., Grillo, C., Balestra, I., et al. 2016, A&A, 585, A27
• Kashikawa et al. (2006) Kashikawa, N., Shimasaku, K., Malkan, M. A., et al. 2006, ApJ, 648, 7
• Kawamata (2018) Kawamata, R. 2018, ApJ, 855, 4
• Kawamata et al. (2016) Kawamata, R., Oguri, M., Ishigaki, M., Shimasaku, K., & Ouchi, M. 2016, ApJ, 819, 0
• Kennicutt, Jr. (1998) Kennicutt, Jr., R. C. 1998, ApJ, 498, 541
• Kneib et al. (1996) Kneib, J.-P., Ellis, R. S., Smail, I., Couch, W. J., & Sharples, R. M. 1996, ApJ, 471, 643
• Konno et al. (2014) Konno, A., Ouchi, M., Ono, Y., et al. 2014, ApJ, 797, 16
• Konno et al. (2018) Konno, A., Ouchi, M., Shibuya, T., et al. 2018, PASJ, 70, S16
• Lagattuta et al. (2017) Lagattuta, D. J., Richard, J., Clément, B., et al. 2017, MNRAS, 469, 3946
• Laporte et al. (2016) Laporte, N., Infante, L., Troncoso Iribarren, P., et al. 2016, ApJ, 820, 98
• Laporte et al. (2014) Laporte, N., Streblyanska, A., Clement, B., et al. 2014, A&A, 562, L8
• Limousin et al. (2007) Limousin, M., Richard, J., Jullo, E., et al. 2007, ApJ, 668, 643
• Livermore et al. (2017) Livermore, R. C., Finkelstein, S. L., & Lotz, J. M. 2017, ApJ, 835, 113
• Lotz et al. (2017) Lotz, J. M., Koekemoer, A., Coe, D., et al. 2017, ApJ, 97 [\eprint[arXiv]1605.06567]
• Mac Low & Ferrara (1999) Mac Low, M. & Ferrara, A. 1999, ApJ, 513, 142
• Madau et al. (1998) Madau, P., Pozzetti, L., & Dickinson, M. 1998, ApJ, 498, 106
• Mahler et al. (2018) Mahler, G., Richard, J., Clément, B., et al. 2018, MNRAS, 692, 663
• Maizy et al. (2010) Maizy, A., Richard, J., de Leo, M., Pelló, R., & Kneib, J. 2010, A&A, 509, A105
• Maseda et al. (2018) Maseda, M. V., Bacon, R., Franx, M., et al. 2018, ApJ Letters, 865, L1
• Matthee et al. (2016) Matthee, J., Sobral, D., Oteo, I., et al. 2016, MNRAS, 458, 449
• Matthee et al. (2015) Matthee, J., Sobral, D., Santos, S., et al. 2015, MNRAS, 451, 400
• McGreer et al. (2013) McGreer, I. D., Jiang, L., Fan, X., et al. 2013, ApJ, 768, 105
• McLure et al. (2009) McLure, R., Cirasuolo, M., Dunlop, J., Foucaud, S., & Almaini, O. 2009, MNRAS, 395, 2196
• Meneghetti et al. (2017) Meneghetti, M., Natarajan, P., Coe, D., et al. 2017, MNRAS, 3216, 3177
• Newville et al. (2014) Newville, M., Stensitzki, T., Allen, D. B., & Ingargiola, A. 2014
• Oke & Gunn (1983) Oke, J. B. & Gunn, J. E. 1983, ApJ, 266, 713
• Olmstead et al. (2014) Olmstead, A., Rigby, J. R., Swinbank, M., & Veilleux, S. 2014, AJ, 148
• Orieux et al. (2010) Orieux, F., Giovannelli, J.-f., & Rodet, T. 2010, OSA, 27, 1593
• Osterbrock & Ferland (2006) Osterbrock, D. E. & Ferland, G. J. 2006, Mercury, 35, 40
• Ouchi et al. (2008) Ouchi, M., Shimasaku, K., Akiyama, M., Simpson, C., & Saito, T. 2008, ApJ Supplement Series, 301
• Ouchi et al. (2010) Ouchi, M., Shimasaku, K., Furusawa, H., et al. 2010, ApJ, 723, 869
• Ouchi et al. (2004) Ouchi, M., Shimasaku, K., Okamura, S., et al. 2004, ApJ, 611, 660
• Patricio et al. (2018) Patricio, V., Richard, J., Carton, D., et al. 2018, MNRAS, 28, 1
• Pello et al. (1991) Pello, R., Le Borgne, J.-F., Soucail, G., Mellier, Y., & Sanahuja, B. 1991, ApJ, 366, 405
• Pentericci et al. (2011) Pentericci, L., Fontana, A., Vanzella, E., et al. 2011, ApJ, 743, 132
• Pentericci et al. (2014) Pentericci, L., Vanzella, E., Fontana, A., et al. 2014, ApJ, 793, 113
• Piqueras et al. (2017) Piqueras, L., Conseil, S., Shepherd, M., et al. 2017, ArXiv e-prints, arXiv:1710.03554
• Priewe et al. (2017) Priewe, J., Williams, L. L. R., Liesenborgs, J., Coe, D., & Rodney, S. A. 2017, MNRAS, 465, 1030
• Rees & Ostriker (1977) Rees, M. J. & Ostriker, J. P. 1977, MNRAS, 179, 541
• Rhoads et al. (2000) Rhoads, J. E., Malhotra, S., Dey, A., et al. 2000, ApJ, 545, L85
• Ricci et al. (2017) Ricci, F., Marchesi, S., Shankar, F., La Franca, F., & Civano, F. 2017, MNRAS, 465, 1915
• Richard et al. (2014) Richard, J., Jauzac, M., Limousin, M., et al. 2014, MNRAS, 444, 268
• Richard et al. (2015) Richard, J., Patricio, V., Martinez, J., et al. 2015, MNRAS: Letters, 446, L16
• Richard et al. (2010) Richard, J., Smith, G. P., Kneib, J. P., et al. 2010, MNRAS, 404, 325
• Richard et al. (2008) Richard, J., Stark, D. P., Ellis, R. S., et al. 2008, ApJ, 685, 705
• Robertson et al. (2015) Robertson, B. E., Ellis, R. S., Furlanetto, S. R., & Dunlop, J. S. 2015, ApJ, 802, L19
• Robertson et al. (2013) Robertson, B. E., Furlanetto, S. R., Schneider, E., et al. 2013, ApJ, 768, 71
• Santos et al. (2016) Santos, S., Sobral, D., & Matthee, J. 2016, MNRAS, 14, 1
• Schechter (1976) Schechter, P. L. 1976, APJ, 297 - 306, 610
• Schiminovich et al. (2005) Schiminovich, D., Ilbert, O., Arnouts, S., et al. 2005, ApJ, 619, L47
• Schmidt (1968) Schmidt, M. 1968, ApJ, 151, 393
• Shivaei et al. (2018) Shivaei, I., Reddy, N. A., Siana, B., et al. 2018, ApJ, 855, 42
• Sobral et al. (2018) Sobral, D., Santos, S., Matthee, J., et al. 2018, MNRAS, 476, 4725
• Soto et al. (2016) Soto, K. T., Lilly, S. J., Bacon, R., Richard, J., & Conseil, S. 2016, MNRAS, 458, 3210
• Stark et al. (2009) Stark, D. P., Ellis, R. S., Bunker, A., et al. 2009, ApJ, 697, 1493
• Stark et al. (2017) Stark, D. P., Ellis, R. S., Charlot, S., et al. 2017, MNRAS, 464, 469
• Stark et al. (2010) Stark, D. P., Ellis, R. S., Chiu, K., Ouchi, M., & Bunker, A. 2010, MNRAS, 408, 1628
• Steidel et al. (1996) Steidel, C. C., Giavalisco, M., Pettini, M., Dickinson, M., & Adelberger, K. L. 1996, ApJ, 462, L17
• Tilvi et al. (2014) Tilvi, V., Papovich, C., Finkelstein, S., et al. 2014, ApJ, 794, 5
• Trainor et al. (2015) Trainor, R. F., Steidel, C. C., Strom, A. L., & Rudie, G. C. 2015, ApJ, 809, 89
• Trenti & Stiavelli (2007) Trenti, M. & Stiavelli, M. 2007, ApJ
• van der Burg et al. (2010) van der Burg, R., Hildebrandt, H., & Erben, T. 2010, A&A, 523, A74
• van der Walt et al. (2014) van der Walt, S., Schönberger, J. L., Nunez-Iglesias, J., et al. 2014, PeerJ, 2, e453
• Weilbacher et al. (2012) Weilbacher, P. M., Streicher, O., Urrutia, T., et al. 2012, in Software and Cyberinfrastructure for Astronomy II, Vol. 8451, 84510B
• Weilbacher et al. (2014) Weilbacher, P. M., Streicher, O., Urrutia, T., et al. 2014, 485, 451
• Willott et al. (2010) Willott, C. J., Delorme, P., Reylé, C., et al. 2010, ApJ, 139, 906
• Wisotzki et al. (2016) Wisotzki, L., Bacon, R., Blaizot, J., et al. 2016, A&A, 587, A98
• Zheng et al. (2013) Zheng, Z. Y., Finkelstein, S. L., Finkelstein, K., et al. 2013, MNRAS, 431, 3589

## Appendix A Method to create a mask for a 2D image

In this section we describe the generic method used to create a mask from the detection image of one given source. The goal is to produce a binary mask, or detection mask, that indicate where the source could have been detected. The details on how this generic method can be adapted to produce masks for spectroscopic cubes can be found in Sect. 6.1. The method relies on the detection process itself. For each pixel of the detection image, it checks whether the object would have been detected had it been centered on that pixel. This is done by comparing the local noise to the signal of the brightest pixels of the source used as input.

The method is based on SExtractor. To perform the source detection, SExtractor uses a set of parameters, the most important of which are the DETECT_THRESH and MIN_AREA. The first parameter corresponds to a detection threshold, and the second one to a minimal number of neighboring pixels. SExtractor works on a convolved and background subtracted image, called the filtered image. A source is only detected if at least MIN_AREA neighboring pixels are DETECT_THRESH times above the background RMS map (shortened to only RMS map) produced from the detection image. This RMS map is the noise map of the background image also computed by SExtractor. The comparison between the filtered image and the RMS map is done pixel to pixel meaning that filtered[x,y] is compared to RMS[x,y]

The detection mask computation method is based on the same two parameters: DETECTION_THRESH and MIN_AREA. From the filtered image, the procedure selects only the MIN_AREA brightest pixels of the source, (we call this list of values Bp), and compare them to the RMS map. The bright pixels profiles of our LAE sample are shown on Fig. 17 for illustration purpose. This list contains all the information related to the spatial features of the input source needed by the method. The adopted criterion is close to the one applied by SExtractor for the detection even though it is not, strictly speaking, the same:

• For each pixel [x,y] of the RMS map, a list of 9 RMS pixels is created, the list contains the central RMS pixel as well as the 8 connected neighbouring RMS pixel values. We call this list local_noise[x,y].

• From the Bp list that contains the brightest pixel of the input source, min(Bp) is determined and only this value used for the comparison to local_noise. For the comparison, the following criterion is used: if any value in local_noise[x,y] fulfills the condition min(Bp) / DETECT_THRESH < local_noise[x,y], then the pixel [x,y] is masked. In all of the other cases, the central pixel remains unmasked. This criterion is a bit looser than the one used by SExtractor as the comparison is only done for min(Bp) and not for all the pixels. However assuming that the noise in a certain small area is not too drastically different, the SExtractor criterion and the one we use are still very close. If min(Bp) fulfills the criterion, is it very likely that the other bright pixels, who all have higher signal values, will also fulfills the same criterion at some point on the 9 pixels area.

• The operation is performed for each pixel of the RMS map.

An example of application is given in figure 18. In both cases, the lowest values of the bright pixel list are compared to the 9 pixels in the area set by the red square. The lowest value of the Bp list is set to 6. Using DETECT_THRESH = 2, for the central pixel to be masked, none of the values in the red area must be strictly less than min(Bp) / DETECT_THRESH = 3. However, for the central pixel to remain unmasked, only one pixel in the red area has to be strictly less than 3, which is true for three pixels on the example on the right.

An example of RMS maps, filtered image and mask produced for a given source is provided on Figure 19. The RMS and filtered maps are directly produced by SExtractor. The bright pixels determined on the filtered image are compared to the RMS map to produce the mask according to the method presented above.

This exercise can be used to simulate the detectability of a given source in an image completely independent of the input source. This is useful, for example in the case of a survey that consists of different and independent FoVs. In that situation, the differences in seeing condition have to be accounted for when measuring the bright pixel profile of the source. This can be achieved through convolution or deconvolution of the original image of the source. An example of how the seeing affects the determination of the bright pixel profiles is shown on Fig. 17.

## Appendix B Mask examples using median RMS maps

In this section we illustrate the results found when applying the method presented in A to the different cubes, for LAEs detected with different SN values. A sample of representative masks is presented on Fig. 20. These masks were used for masking the 3D cubes during the volume computation. They were created with the method described in Sect. 6.1.1, including

• A median RMS map for each data cube.

• A Median bright pixel profile to be rescaled in agreement with the actual SN of the source.

The SN values used to build the masks increase from left to right. Note that, in this case, this is not a real SN but a proxy (see Sect. 6.1.1 for details).

We see that at lower SN values, the masks are efficient to retrieve the instrumental patterns. At higher SN values, these patterns disappear, and only the bright galaxies and the edge of the FoVs remain masked. For A2744, we see that the masks are very efficient to account for the difference in exposure time in the mosaic. The central quadrant of the mosaic, being the deepest is mostly not masked, whereas the upper right quadrant, being the shallowest, is only unmasked for the highest SN values.

## Appendix C Comparison of the different volume computation methods

In this section we compare the results obtained when computing the using the method adopted in this study to the classical integration, based on a unique mask. We present on Fig. 22 the comparison between the values obtained from these two different methods. The first one (on the y-axis) is the one used in this project, based on 3D masks following the noise variation through the MUSE cubes. The second one (on the x-axis), uses a mask generated from a unique SExtractor segmentation map, which is replicated across the spectral dimension. An example of such a mask is provided in Fig. 21. It is mostly efficient to mask the brightest sources and halos on the image. Comparing this mask to the masks presented in Fig. 20, we see that they are completely different. Whereas the 3D masks adopted in this paper are able to follow the differences in exposure time while encoding the instrumental noise patterns, the simple masks provide a unique pattern for all sources, irrespective of their SN values. This results on the following effects as seen in Fig. 22:

• A unique mask translates into a unique value for a large number of sources, as only the lensing effects play a role in the determination of . This corresponds to the vertical pattern on the right-hand side of Fig. 22.

• Using the adaptive mask method, systematically lower values are obtained. And more interestingly, for sources in A1689, A2390, and A2667, we see that the differences are less pronounced (or even not significant for some sources) than it is for the sources in the A2744 mosaic.

To explain the first point, it is important to understand that when using a single mask, the only factor that could influence the value, is the limit magnification (see Sect. 6.1.2). A source with a higher value would end up with a smaller as the area of the survey with large magnification is smaller. For the bright sources of the sample, it could be that the computed would be under the lower magnification reached on the survey. For those sources, the volume was integrated on the entire survey area. Using the 3D mask method, still plays a role but it is no longer the only factor affecting the final volume value and the local noise level is properly taken into account.

To explain the second point and to illustrate the systematical difference between the two methods, we can consider a faint source detected in one of the deepest parts of the A2744 mosaic. When comparing the source to the noise level in the rest of the mosaic, the quadrants with the lower integration time end up being completely masked. As for the 3 other cubes, their contribution is zero as they have even less integration time. In that case, only a small portion of the mosaic has a significant contribution to the value and it results in a low