LSDCat: Detection and cataloguing of emission-line sources in integral-field spectroscopy datacubes
Key words: methods: data analysis – techniques: imaging spectroscopy

We present a robust, efficient, and user-friendly algorithm for detecting faint emission-line sources in large integral-field spectroscopic datacubes, together with the public release of the software package LSDCat (Line Source Detection and Cataloguing). LSDCat uses a three-dimensional matched-filter approach, combined with thresholding in signal-to-noise, to build a catalogue of individual line detections. In a second pass, the detected lines are grouped into distinct objects, and positions, spatial extents, and fluxes of the detected lines are determined. LSDCat requires only a small number of input parameters, and we provide guidelines for choosing appropriate values. The software is coded in Python and capable of processing very large datacubes in a short time. We verify the implementation with a source insertion and recovery experiment utilising a real datacube taken with the MUSE instrument at the ESO Very Large Telescope.
1 Introduction
One motivating driver for the construction of the current generation of optical wide-area integral-field spectrographs, such as the Multi Unit Spectroscopic Explorer at the ESO VLT (MUSE, in operation since 2014; Bacon et al. 2014; Kelz et al. 2016) or the Keck Cosmic Web Imager (KCWI, under construction; Martin et al. 2010), is the detection of faint emission lines from high-redshift galaxies. The high-level data products of these instruments are three-dimensional (3D) arrays containing intensity-related values with two spatial axes and one wavelength axis (usually referred to as datacubes). So far, efficient, robust, and user-friendly detection and cataloguing software for faint line-emitting sources in such datacubes has not been publicly available. To remedy this situation we now present the Line Source Detection and Cataloguing (LSDCat) tool, developed in the course of our work within the MUSE consortium.
Automatic source detection in two-dimensional (2D) imaging data is a well-studied problem. Various methods to tackle it have been implemented in software packages that are widely adopted in the astronomical community (see reviews by Bertin 2001 and Masias et al. 2012, or the comparison of two frequently used tools by Annunziatella et al. 2013). A conceptually simple approach consists of two steps: First, the observed imaging data is transformed in order to highlight objects while simultaneously reducing the background noise. A particular transformation that satisfies both requirements is the matched filter (MF) transform. Here the image is cross-correlated with a 2D template that matches the expected light distribution of the sources to be detected. Mathematically, it can be proven that for stationary noise the MF maximises the signal-to-noise ratio (S/N) of a source that is optimally represented by the template (e.g. Schwartz & Shaw 1975; Das 1991; Zackay & Ofek 2015; Vio & Andreani 2016). In the second step, the MF-transformed image is segmented into objects via thresholding, that is, each pixel in the threshold mask is set to 1 if the MF-transformed value is above the threshold, and 0 otherwise. Connected 1-valued pixels then define the objects on which further measurements (e.g. centroid coordinates, brightnesses, ellipticities, etc.) can be performed. Other image transformations (e.g. multi-scale methods) and detection strategies (e.g. histogram-based methods) exist, but the "MF + thresholding" approach is the one most frequently employed, especially in optical/near-infrared imaging surveys for faint extragalactic objects. This widespread preference is certainly attributable to the conceptual simplicity and robustness of the method, despite known limitations (see e.g. Akhlaghi & Ichikawa 2015). But it is also due to the availability of a stable and user-friendly software implementation of this approach (Shore 2009): SExtractor (Bertin & Arnouts 1996).
The detection and cataloguing of astronomical sources in 3D datasets has so far mostly been of interest in the domain of radio astronomy. Similar to integral-field spectroscopy (IFS), these observations result in 3D datacubes containing intensity values with two spatial axes and one frequency axis. Here, especially surveys for extragalactic 21 cm H i emission are faced with the challenge of discovering faint line-emission-only sources in such datacubes. The current generation of such surveys utilises a variety of custom-made software for this task, also relying heavily on manual inspection of the datacubes. Notably, the approach of Saintonge (2007) tries to minimise such error-prone interactivity by employing a search technique based on matched filtering, although only in the spectral direction. More recently, driven mainly by the huge data volumes expected from future generations of large radio surveys with the Square Kilometre Array, development and testing of new 3D source-finding techniques has started (e.g. Koribalski 2012; Popping et al. 2012; Serra et al. 2012; Jurek 2012). Currently, two software packages implementing some of these techniques are available to the community: duchamp (Whiting 2012) and SoFiA (Serra et al. 2015). While in principle these programs could be adopted for source detection in optical integral-field spectroscopic datasets, in practice there are limitations that necessitate the development of a dedicated IFS source detector. For example, the noise properties of long-exposure IFS datacubes are dominated by telluric line and continuum emission and therefore vary with wavelength, in contrast to the more uniform noise properties of radio datacubes. Moreover, the search strategies implemented in the radio 3D source finders are tuned to capture the large variety of signals expected.
But, as we argue in this paper, the emission-line signature of compact high-redshift sources in IFS data is well described by a simple template that, however, needs an IFS-specific parameterisation. Finally, the input data as well as the parameterisation of detected sources differ between the radio domain and optical IFS: typically radio datacubes have their spectral axis expressed as frequency and flux densities measured in Jansky, while in IFS cubes the spectral axis is in wavelengths and flux densities are measured as F_λ in units of erg s⁻¹ cm⁻² Å⁻¹.
LSDCat is part of a long-term effort, initially motivated by the construction of MUSE, to develop source detection algorithms for wide-field IFS datasets (e.g. Bourguignon et al. 2012). As part of this effort, Meillier et al. (2016) recently released the Source Emission Line FInder SELFI. This software is based on a Bayesian scheme utilising a reversible-jump Markov chain Monte Carlo algorithm. While this mathematically sophisticated machinery is quite powerful in unearthing faint emission-line objects in a datacube, its execution time is too long for practical use. In contrast, the algorithm of LSDCat is relatively simple and correspondingly fast in execution on state-of-the-art workstations. It is also robust, as it is based on the matched-filtering method that has long been successfully utilised for detecting sources in imaging data. LSDCat is therefore well suited for surveys exploiting even large numbers of wide-field IFS datacubes. Furthermore, the short execution time also permits extensive fake-source insertion experiments to empirically reconstruct the selection function of such surveys.
This article is structured as follows: In Sect. 2, we describe the mathematical basis and the algorithm implemented in LSDCat. In Sect. 3, we validate the correctness of the implementation in our software. We then provide guidelines for adjusting the free parameters governing the detection procedure in Sect. 4, and conclude in Sect. 5 with a brief outlook on future improvements of LSDCat. A link to the software repository is available from the Astrophysics Source Code Library: http://ascl.net/1612.002. In Appendix A, we provide a short example of how the user interacts with the LSDCat routines. The public release of the software includes a detailed manual, to which we refer all potential users for details of installing and working with the software.
2 Method description & implementation
Processing step  LSDCat routine  Input parameters
Spatial filtering  lsd_cc_spatial.py  – PSF functional form: Moffat or Gaussian
  – p₀, p₁, p₂, and λ₀ for FWHM(λ) (Eq. 15)
  – β (Moffat kurtosis, if Moffat PSF)
Spectral filtering  lsd_cc_spectral.py  – FWHM of Gaussian profile: v_FWHM [km s⁻¹]
Thresholding  lsd_cat.py  – Detection threshold: S/N_det
Measurements  lsd_cat_measure.py  – Analysis threshold: S/N_ana
2.1 Input data
The principal data product of an integral field unit (IFU) observing campaign is a datacube F (e.g. Allington-Smith 2006; Turner 2010). The purpose of LSDCat is to detect and characterise emission-line sources in F. We adopt the following notations and conventions: F is a set of volume pixels (voxels) with intensity-related values, for example, flux densities in units of erg s⁻¹ cm⁻² Å⁻¹. The indices x, y index the spatial pixels (spaxels), and z indexes the spectral layers of the datacube. Mappings between (x, y) and sky position (right ascension and declination), as well as between z and wavelength λ, can be included in the metadata (Greisen & Calabretta 2002; Greisen et al. 2006). For the new generation of wide-field IFUs based on image-slicers, the sky is sampled contiguously; typical MUSE datacubes contain roughly 300 × 300 spaxels and several thousand spectral layers, that is, a datacube consists of the order of 10⁸ voxels.
We make a number of assumptions regarding the data structure, guided by the output of the MUSE data reduction system (DRS; Weilbacher et al. 2012, 2014):

LSDCat currently requires a rectilinear grid in F. In particular, the mapping between z and λ has to be linear with a fixed increment Δλ per spectral layer, that is,

λ(z) = λ₀ + z · Δλ ,  (1)

where λ₀ designates the wavelength of the first spectral layer. For MUSE, this is achieved by the DRS through resampling the raw CCD data into the final datacube F. We also demand a constant mapping between (x, y) and sky position for all spectral layers z, which the MUSE pipeline accounts for in the resampling step by correcting for the wavelength-dependent lateral offset (differential atmospheric refraction) along the parallactic angle (e.g. Filippenko 1982; Roth 2006).
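As an illustration of this constraint, the linear mapping of Eq. (1) can be generated in a few lines; the grid values below (λ₀, Δλ, number of layers) are MUSE-like example values, not parameters prescribed by LSDCat:

```python
import numpy as np

# MUSE-like example grid (assumed values, not prescribed by LSDCat):
lam0 = 4750.0    # wavelength of the first spectral layer [Å]
dlam = 1.25      # fixed increment per spectral layer [Å]
nlayers = 3681   # number of spectral layers

# Eq. (1): lambda(z) = lam0 + z * dlam, for z = 0 ... nlayers - 1
z = np.arange(nlayers)
lam = lam0 + z * dlam
```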

Together with the flux datacube F, LSDCat expects a second cube σ² containing voxel-by-voxel variances. While such a variance cube is provided by the MUSE DRS as formal propagation of the various detector-level noise terms through the data reduction steps, the pipeline currently neglects the covariances between adjacent voxels introduced by the resampling process. We discuss this issue and some practical considerations further in Sect. 4.4; here we simply assume that a datacube with appropriate variance estimates is available.
As a very useful preparatory step for LSDCat, we recommend that galaxies and stars with bright continuum emission (i.e. sources with significant signal in the majority of spectral bins) be subtracted from F. While the presence of such sources does not render the detection and cataloguing algorithm unusable, continuum-bright sources may lead to (possibly many) catalogue entries unrelated to actual emission-line objects. We give some guidance for the subtraction of continuum-bright sources prior to running LSDCat in Sect. 4.1.
Figure 1 depicts, as a flowchart, all the main processing steps of LSDCat, leading from the input datacube F and its associated variances σ² to a catalogue of emission lines. Each processing step is implemented as a stand-alone Python routine (Table 1).
2.2 3D matched filtering
The optimal detection statistic for an isolated signal in a dataset with additive white Gaussian noise is given by the matched filter transform of the dataset (e.g. Schwartz & Shaw 1975; Das 1991; Bertin 2001; Zackay & Ofek 2015; Vio & Andreani 2016). This transform cross-correlates the dataset with a template that matches the properties of the signal to be detected.
We utilise the matched-filtering approach in LSDCat to obtain a robust detection statistic for isolated emission-line sources in wide-field IFS datacubes. For a symmetric 3D template T, this is equivalent to a convolution of F with T:

F̃ = F ⊛ T .  (2)

Here ⊛ denotes the (discrete) convolution operation, that is, every voxel of F̃ is given by

F̃_{x,y,z} = Σ_{x′,y′,z′} F_{x′,y′,z′} · T_{x−x′, y−y′, z−z′} .  (3)

In principle, the summation runs over all dimensions of the datacube, but in practice, terms where the template is essentially zero can be neglected. Propagating the variances σ² through Eq. (3) yields the voxels of the filtered variance cube σ̃²:

σ̃²_{x,y,z} = Σ_{x′,y′,z′} σ²_{x′,y′,z′} · T²_{x−x′, y−y′, z−z′} .  (4)
The template T must be chosen such that its spectral and spatial properties match those of an expected compact emission-line source in F. LSDCat is primarily intended to search for faint compact emission-line sources. For such sources it is a reasonable assumption that their spatial and spectral properties are independent of one another. We can therefore write T as a product

T = T^{2D} · T^{1D} ,  (5)

where T^{2D} is the expected spatial profile and T^{1D} is the expected spectral profile. The voxels of T are thus given by

T_{x,y,z} = T^{2D}_{x,y} · T^{1D}_{z} .  (6)
Of course, spatially extended sources with a velocity profile along the line of sight are not optimally captured by separating the filter into the spectral and spatial domain. However, as we detail in Sects. 4.2 and 4.3, moderate template mismatches do not result in a major reduction of the maximum detectability that could be achieved with an exactly matching template.
Now with Eq. (6), the 3D convolution of Eq. (3) yielding F̃ can be performed as a succession of two separate convolutions, in no particular order: a spatial convolution with the appropriate T^{2D} in each spectral layer z, and a spectral convolution with T^{1D} in each spaxel (x, y):

F̃ = (F ⊛ T^{2D}) ⊛ T^{1D}  (7)
  = (F ⊛ T^{1D}) ⊛ T^{2D} .  (8)

Note that since every spectral layer of the datacube is convolved with a different spatial template, and since the width of the spectral template also changes with λ (see Sect. 2.3), the equivalence between Eq. (7) and Eq. (8) is, mathematically speaking, not strictly exact. However, the templates vary with λ much more slowly than the typical line widths of the spectral templates, so we can approximate the convolution by using a locally invariant template.
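The separability expressed in Eqs. (7) and (8) is easy to verify numerically. The following sketch (using SciPy's FFT-based convolution on a small random cube with arbitrary Gaussian template widths; an illustration, not LSDCat's actual code) checks that the two successive convolutions reproduce the full 3D convolution of Eq. (3):

```python
import numpy as np
from scipy.signal import fftconvolve

rng = np.random.default_rng(1)
cube = rng.normal(size=(40, 20, 20))   # toy datacube, axes (z, y, x)

def gauss1d(n, sigma):
    """Normalised 1D Gaussian kernel of odd length n."""
    g = np.exp(-0.5 * ((np.arange(n) - n // 2) / sigma) ** 2)
    return g / g.sum()

t_spec = gauss1d(11, 1.5)                            # T^1D, spectral template
t_spat = np.outer(gauss1d(9, 2.0), gauss1d(9, 2.0))  # T^2D, spatial template

# Eqs. (7)/(8): spatial convolution of every layer, then spectral
# convolution of every spaxel (the order is interchangeable)
step1 = fftconvolve(cube, t_spat[None, :, :], mode='same')
filtered = fftconvolve(step1, t_spec[:, None, None], mode='same')

# Eq. (3): a single 3D convolution with the separable template of Eq. (6)
t3d = t_spec[:, None, None] * t_spat[None, :, :]
direct = fftconvolve(cube, t3d, mode='same')
assert np.allclose(filtered, direct)
```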
As an indicator of the presence or absence of an emission line in F at a position (x, y, z) we then evaluate the statistic

S/N_{x,y,z} = F̃_{x,y,z} / σ̃_{x,y,z} ,  (9)

with F̃_{x,y,z} from Eq. (3) and σ̃_{x,y,z} as the square root of Eq. (4). The voxels S/N_{x,y,z} constitute the signal-to-noise cube

S/N = F̃ / σ̃  (10)

computed after the matched filtering has been performed. The values on the left side of Eq. (9) can be translated into a probability for rejecting the null hypothesis of no source being present at position (x, y, z) in F. This is commonly referred to as the detection significance of a source. However, in a strict mathematical sense this direct translation is only valid for sources that are exactly described by the adopted template T, in a dataset where the variance terms on the right-hand side of Eq. (4) are stationary and fully uncorrelated. Nevertheless, even if these strict requirements are not met by the IFS datacube, filtering with T will always reduce high-frequency noise while enhancing sources that are similar in appearance to T. Thus Eq. (9) can still be used as a robust empirical measure of the significance of a source present in F at a position (x, y, z). We illustrate this for a faint high-redshift Lyman-α emitting galaxy observed with MUSE in Fig. 2; this source is detected with high significance at a peak value of S/N = 9 in the S/N cube, although it is barely visible in the monochromatic layers of the original MUSE datacube.
In LSDCat, the spatial convolution is performed by the routine lsd_cc_spatial.py, and the spectral convolution by lsd_cc_spectral.py. The output after running both routines in sequence is a FITS file with two header-and-data units, one storing F̃ and the other storing σ̃². Both routines are capable of fully leveraging multiple processor cores, if available, to process several datacube layers or spaxels simultaneously. Moreover, in lsd_cc_spatial.py the computational time is reduced by using the fast Fourier transform method provided by SciPy to convolve the individual layers. However, since the spectral kernel varies with wavelength we cannot utilise the same approach in lsd_cc_spectral.py; instead, the discrete 1D convolution operation can be written as a matrix-vector product with the convolution kernel given by a sparse banded matrix (e.g. Das 1991, Sect. 3.3.6). Hence here we achieve a substantial acceleration of the computational speed by utilising the sparse matrix multiplication algorithm of SciPy. Typical execution times of lsd_cc_spatial.py and lsd_cc_spectral.py running in sequence on a full MUSE datacube, including reading and writing of the data, are of the order of minutes, both on an Intel Core i7 workstation with 8 cores and on a workstation with two AMD Opteron 6376 processors with 32 cores each.
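The banded-matrix formulation of the wavelength-dependent spectral convolution can be sketched as follows; this is a simplified stand-in for lsd_cc_spectral.py, and the linear kernel-width law used here is an arbitrary assumption for the example:

```python
import numpy as np
from scipy import sparse

def spectral_filter_matrix(lam, sigma_of_lam, nker=25):
    """Sparse banded matrix H such that H @ spectrum convolves the
    spectrum with a Gaussian whose width sigma_of_lam(lam[z]) (in
    layers) varies with the wavelength of layer z; rows are normalised."""
    n = lam.size
    offs = np.arange(-(nker // 2), nker // 2 + 1)
    rows, cols, vals = [], [], []
    for z in range(n):
        k = np.exp(-0.5 * (offs / sigma_of_lam(lam[z])) ** 2)
        k /= k.sum()
        zp = z + offs
        ok = (zp >= 0) & (zp < n)   # clip the band at the cube edges
        rows.extend([z] * int(ok.sum()))
        cols.extend(zp[ok])
        vals.extend(k[ok])
    return sparse.csr_matrix((vals, (rows, cols)), shape=(n, n))

lam = 4750.0 + 1.25 * np.arange(200)   # illustrative MUSE-like grid [Å]
# arbitrary example law: kernel width grows slowly with wavelength
H = spectral_filter_matrix(lam, lambda l: 1.0 + 2e-4 * (l - lam[0]))

spec = np.zeros(lam.size)
spec[100] = 1.0                        # a delta-like emission line
filtered = H @ spec                    # wavelength-dependent 1D convolution
```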
2.3 Templates
In line with the approximate separation of spatial and spectral filtering described in the previous subsection, LSDCat requires two templates, a spatial and a spectral one.
For the spatial template the user currently has the choice between a circular Gaussian profile,

T^{2D}(r) = (2π σ_G²)⁻¹ exp( −r² / (2σ_G²) ) ,  (11)

with the standard deviation σ_G, or a Moffat (1969) profile,

T^{2D}(r) = (β − 1)/(π r_d²) · [ 1 + (r/r_d)² ]^{−β} ,  (12)

with the width parameter r_d and the kurtosis parameter β. Both functions are normalised such that their integral over the spatial plane is unity.
Typically the parameter used to characterise the atmospheric seeing is the full width at half maximum (FWHM) of the PSF. For Eq. (11) this is

FWHM = 2 √(2 ln 2) σ_G ,  (13)

and for the Moffat profile in Eq. (12) it is

FWHM = 2 r_d √(2^{1/β} − 1) .  (14)
Generally, the seeing depends on wavelength. Specifically, in MUSE data and adopting the Moffat form to describe the PSF, the variation of the FWHM with λ appears to be mainly driven by r_d, while β appears to be close to constant (e.g. Husser et al. 2016). In LSDCat we approximate the wavelength dependency of the FWHM as a quadratic polynomial

FWHM(λ) = p₀ + p₁ (λ − λ₀) + p₂ (λ − λ₀)² ,  (15)

where the coefficients p₀ [″], p₁ [″ Å⁻¹], p₂ [″ Å⁻²], and the reference wavelength λ₀ are input parameters supplied by the user. In Sect. 4.2 we further discuss the adopted PSF parametrisation and the choice of suitable parameter values.
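Evaluating Eq. (15) is straightforward; the coefficients below are made-up example values, not a fit to real MUSE data:

```python
import numpy as np

def psf_fwhm(lam, p0, p1, p2, lam0):
    """Quadratic approximation of the PSF FWHM wavelength dependence,
    Eq. (15): p0 ["], p1 ["/Å], p2 ["/Å^2], lam0 reference wavelength [Å]."""
    dl = np.asarray(lam) - lam0
    return p0 + p1 * dl + p2 * dl ** 2

# made-up example coefficients (the seeing typically sharpens to the red):
fwhm = psf_fwhm(5000.0, p0=0.80, p1=-4.0e-5, p2=0.0, lam0=7000.0)
```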
As a spectral template in LSDCat we employ a simple 1D Gaussian,

T^{1D}(λ) = (2π σ_λ²)^{−1/2} exp( −(λ − λ_c)² / (2σ_λ²) ) ,  (16)

with the standard deviation σ_λ, centred on the wavelength λ_c of the line. While the line profiles of actual galaxies may deviate in detail from this idealised function, the deviations are usually minor and do not lead to significant changes in the S/N achieved by the matched filter when compared to a simple Gaussian.
Since velocity broadening usually dominates the widths of emission lines from galaxies, LSDCat assumes the width of the spectral template to be fixed in velocity space, parameterised by the velocity FWHM v_FWHM. As long as the mapping between z and λ is linear, σ_λ in Eq. (16) depends linearly on λ when parameterised by v_FWHM:

σ_λ(λ) = v_FWHM λ / (2 √(2 ln 2) c) .  (17)

The input parameter supplied by the LSDCat user is the velocity FWHM v_FWHM of the Gaussian profile. In Sect. 4.3 we show that the S/N of any given emission line does not depend very sensitively on the chosen value of v_FWHM, and in many cases a single spectral template with a typical value of v_FWHM may be sufficient.
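For a concrete example, the conversion from v_FWHM to the template dispersion in spectral layers (Eq. 17) can be written as follows; the MUSE-like values of λ and Δλ are illustrative assumptions:

```python
import numpy as np

C_KMS = 299792.458   # speed of light [km/s]

def sigma_z(lam, v_fwhm, dlam):
    """Dispersion (in spectral layers) of the Gaussian template, Eq. (17),
    for a fixed velocity FWHM v_fwhm [km/s] at wavelength lam [Å] on a
    grid with increment dlam [Å per layer]."""
    fwhm_lam = v_fwhm / C_KMS * lam                        # line FWHM in Å
    return fwhm_lam / (2.0 * np.sqrt(2.0 * np.log(2.0))) / dlam

# illustrative MUSE-like numbers: 250 km/s line at 5000 Å, 1.25 Å layers
s_z = sigma_z(5000.0, v_fwhm=250.0, dlam=1.25)   # about 1.4 layers
```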
2.4 Thresholding
A catalogue of emission-line sources can now be constructed by thresholding the S/N cube with a user-specified value S/N_det. This threshold is used to create a binary cube M with voxels given by

M_{x,y,z} = 1 if S/N_{x,y,z} ≥ S/N_det , and M_{x,y,z} = 0 otherwise.  (18)

The detection threshold S/N_det is the principal input parameter to be set by the LSDCat user. Each cluster of non-zero neighbouring voxels in M (6-connected topology) constitutes a detection. A high threshold will lead to a small number of highly significant detections, while lower values of S/N_det will lead to more entries in the catalogue, but also increase the chance of including spurious entries, that is, entries that do not correspond to real emission-line objects. We give guidelines on the choice of the detection threshold based on our experience with MUSE data in Sect. 4 below.
For each detection, LSDCat records the coordinates (x_peak, y_peak, z_peak) of the local maximum in the S/N cube, its value S/N_max, and the number of voxels N_pix constituting the detection. These values constitute an intermediate catalogue of detections. Each entry is assigned a unique running integer number. Moreover, LSDCat also assigns an integer object identifier to detection clusters occurring at different wavelengths but at similar spatial positions within a small search radius, to account for sources with multiple significantly detectable emission lines. This intermediate catalogue is created by the routine lsd_cat.py, utilising routines from the SciPy ndimage.measurements package. The intermediate catalogue is written to disk as a FITS binary table. The actual execution time is generally much shorter than for the previous step of applying the matched filter, but it depends on the detection threshold, which determines the number of objects.
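The thresholding and clustering step can be emulated with standard SciPy tools; the following sketch (with a synthetic S/N cube, not LSDCat's actual implementation) applies Eq. (18) and labels 6-connected detection clusters:

```python
import numpy as np
from scipy import ndimage

sn = np.zeros((30, 20, 20))            # synthetic S/N cube, axes (z, y, x)
sn[10:13, 8:11, 8:11] = 12.0           # a 27-voxel high-S/N cluster
sn[25, 3, 3] = 9.0                     # a second, single-voxel detection

sn_det = 8.0                           # detection threshold
mask = sn > sn_det                     # Eq. (18)

# 6-connected topology: voxels sharing a face belong to the same cluster
structure = ndimage.generate_binary_structure(3, 1)
labels, n_det = ndimage.label(mask, structure=structure)

idx = range(1, n_det + 1)
peaks = ndimage.maximum_position(sn, labels, index=idx)   # (z, y, x) peaks
npix = ndimage.sum(mask, labels, index=idx)               # voxels per cluster
```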
2.5 Measurements
Output parameter(s)  LSDCat name  Description
–  I  Running ID; see Sect. 2.4
–  ID  Object identifier; see Sect. 2.4
x_peak, y_peak, z_peak  {X,Y,Z}_PEAK_SN  S/N peak coordinate; see Sect. 2.4
N_pix  NPIX  Number of voxels above S/N_det; see Sect. 2.4
S/N_max  DETSN_MAX  S/N value at (x_peak, y_peak, z_peak); see Sect. 2.4
x^com_{S/N}, y^com_{S/N}, z^com_{S/N}  {X,Y,Z}_SN  S/N-weighted centroid; Eq. (2.5.1)
x^com_F, y^com_F, z^com_F  {X,Y,Z}_FLUX  Flux-weighted centroid; Eq. (2.5.1) with S/N substituted by F
x^com_F̃, y^com_F̃, z^com_F̃  {X,Y,Z}_SFLUX  Filtered-flux-weighted centroid; Eq. (2.5.1) with S/N substituted by F̃ from Eq. (3)
z_min, z_max  Z_NB_{MIN,MAX}  Minimum or maximum z coordinate above S/N_ana in the S/N cube
x^(1), y^(1)  {X,Y}_1MOM  2D moment-based centroid in NB image (Eq. 19, Eq. 20)
x^(2), y^(2), [xy]^(2)  {X,Y,XY}_2MOM  2D second central moments in NB image (Eq. 2.5.1)
R_σ  R_SIGMA  Radius derived from the second moments (Eq. 21)
R_Kron  R_KRON  Kron radius (Eq. 22)
F_{k·R_Kron}  FLUX_{k}KRON  Integrated flux in k·R_Kron aperture (Eq. 23)
ΔF_{k·R_Kron}  ERR_FLUX_{k}KRON  Uncertainty on the integrated flux
LSDCat provides a set of basic measurement parameters for each detection of the intermediate catalogue. These parameters are determined using the datacubes F, F̃, and S/N. The set of parameters is chosen to be robust and independent of a specific scientific application. For more complex measurements involving, for example, fitting of flux distributions, the LSDCat measurement capability can serve as a starting point.

In Table 2 we list the various output parameters that are generated for each detection. This parametrisation of the detected sources and their emission lines constitutes the final processing step of LSDCat. The routine lsd_cat_measure.py performs the parameterisation task and writes out the final catalogue. Running lsd_cat_measure.py on a typical intermediate catalogue takes of the order of 100 seconds on the two machines mentioned above in Sect. 2.2, with most of the time spent on reading the four input datacubes.
Centroids
The coordinates of each detection's local maximum in the S/N cube (Sect. 2.4) serve only as a first approximation of its spatial and spectral position. As a refinement, lsd_cat_measure.py can calculate several different sets of centroid positions for each detected line cluster. For example, the 3D S/N-weighted centroid is given by the first moments

( x^com_{S/N}, y^com_{S/N}, z^com_{S/N} ) = ( Σ_{x,y,z} x · S/N_{x,y,z} / Σ_{x,y,z} S/N_{x,y,z} ,  Σ_{x,y,z} y · S/N_{x,y,z} / Σ_{x,y,z} S/N_{x,y,z} ,  Σ_{x,y,z} z · S/N_{x,y,z} / Σ_{x,y,z} S/N_{x,y,z} ) .

Here, for each detection, the summation runs over all non-zero (x, y, z)-neighbouring voxels in a thresholded datacube similar to M (Eq. 18), but with voxels set to one if they are above an analysis threshold S/N_ana. This additional threshold, which must be smaller than or equal to S/N_det, is the required input parameter for lsd_cat_measure.py. Guidelines for choosing S/N_ana are discussed in Sect. 4.5. Similarly, LSDCat can measure 3D centroid coordinates with the original flux cube F and with the filtered flux cube F̃ as weights. In these cases, Eq. (2.5.1) is applied again, but with S/N being substituted by F and F̃, respectively.
3D centroids provide a non-parametric way of calculating spatial and spectral positions, making use of the full 3D information present in the datacube. While the calculation using the flux cube F is unbiased with respect to a particular choice of filter template, it is not very robust for low-S/N detections. This shortcoming is alleviated for the centroids calculated on the filtered flux cube F̃ or the S/N cube. The latter, however, could potentially be biased by local noise extrema, for example, at spectral layers near skylines.
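A minimal sketch of the 3D weighted centroid computation, for a single synthetic detection and assuming the analysis mask is already in hand:

```python
import numpy as np

def centroid_3d(weights, mask):
    """Weighted 3D centroid (first moments, cf. Eq. 2.5.1) over the
    voxels of one detection, selected by the boolean analysis mask."""
    z, y, x = np.nonzero(mask)
    w = weights[z, y, x]
    return (np.sum(x * w) / w.sum(),
            np.sum(y * w) / w.sum(),
            np.sum(z * w) / w.sum())

sn = np.zeros((10, 10, 10))            # synthetic S/N cube, axes (z, y, x)
sn[4, 5, 5] = 10.0
sn[5, 5, 5] = 30.0
sn[6, 5, 5] = 10.0

sn_ana = 4.0                           # analysis threshold
com = centroid_3d(sn, sn > sn_ana)     # -> (x, y, z) = (5.0, 5.0, 5.0)
```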
As an alternative approach to 3D centroids, LSDCat can calculate 2D centroids on synthesised narrow-band images from the filtered flux cube F̃:

NB_{x,y}(F̃) = Σ_{z=z_min}^{z_max} F̃_{x,y,z} .  (19)

The boundary indices z_min and z_max of each synthetic narrow-band image are taken as the minimum and maximum z coordinate of all voxels of a detection above the analysis threshold S/N_ana. Then the 2D weighted centroid coordinates (x^(1), y^(1)) follow from the first image moments:

( x^(1), y^(1) ) = ( Σ_{x,y} x · NB_{x,y}(F̃) / Σ_{x,y} NB_{x,y}(F̃) ,  Σ_{x,y} y · NB_{x,y}(F̃) / Σ_{x,y} NB_{x,y}(F̃) ) .  (20)

In this equation the summation runs over all pixels of NB(F̃) that belong to the detection cluster and are above the analysis threshold in the corresponding layers of the S/N cube.
The 2D narrow-band images are furthermore used by lsd_cat_measure.py to derive basic shape information using the second central image moments of each detection. These are defined as

x^(2) = Σ_{x,y} x² · NB_{x,y}(F̃) / Σ_{x,y} NB_{x,y}(F̃) − (x^(1))² ,
y^(2) = Σ_{x,y} y² · NB_{x,y}(F̃) / Σ_{x,y} NB_{x,y}(F̃) − (y^(1))² ,
[xy]^(2) = Σ_{x,y} x · y · NB_{x,y}(F̃) / Σ_{x,y} NB_{x,y}(F̃) − x^(1) · y^(1) .

These values are simple indicators for the spatial extent and elongation of a detected emission-line cluster. For example, x^(2) = y^(2) and [xy]^(2) = 0 for a circularly symmetric distribution of the object in NB(F̃). Moreover, in this case the radius
R_σ = √(2 ln(1/0.32)) · ( (x^(2) + y^(2)) / 2 )^{1/2} ≈ 1.51 ( (x^(2) + y^(2)) / 2 )^{1/2}  (21)

encircles 68% of the flux in the filtered narrow-band image for a perfect point source blurred by a Gaussian PSF.
While in principle the original flux datacube F could also be used in Eqs. (19) to (2.5.1), in practice the calculation of the moments directly from the flux cube is relatively susceptible to noise for very faint sources. Our experience with MUSE data showed that the centroids and shapes determined in the 2D narrow-band images based on the filtered datacubes provide the most reliable measurements, even for low-S/N detections. Moreover, the spatial coordinates from Eq. (20) are in closest agreement with the centroids determined in broad-band imaging data.
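The first and second central image moments can be sketched as follows; the synthetic Gaussian blob illustrates that the second moments recover the squared width of a circularly symmetric source:

```python
import numpy as np

def image_moments(nb):
    """First and second central moments of a 2D narrow-band image
    (cf. Eq. 20 and the second-moment definitions above)."""
    y, x = np.indices(nb.shape)
    s = nb.sum()
    x1 = (x * nb).sum() / s
    y1 = (y * nb).sum() / s
    x2 = (x ** 2 * nb).sum() / s - x1 ** 2
    y2 = (y ** 2 * nb).sum() / s - y1 ** 2
    xy = (x * y * nb).sum() / s - x1 * y1
    return x1, y1, x2, y2, xy

# circular Gaussian blob with sigma = 3 px: x2 and y2 recover sigma^2 = 9
yy, xx = np.indices((41, 41))
nb = np.exp(-((xx - 20.0) ** 2 + (yy - 20.0) ** 2) / (2.0 * 3.0 ** 2))
x1, y1, x2, y2, xy = image_moments(nb)
```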
Integrated line fluxes
The difficulty in measuring the total flux of a detected line lies in its unknown spectral shape and in the unknown spatial distribution of the flux. While for the detection it is not required to know these properties accurately (as long as the template mismatch is not too severe), the template scaling factor depends very critically on the degree of similarity between source and template and can therefore not be used as a flux indicator. This is different from PSF-matching techniques in stellar fields (e.g. Kamann et al. 2013), where all objects are point sources and no strong mismatches are expected. The task for LSDCat is different: We have to define 3D boundaries for each emission line that allow for the summation of voxel values within these boundaries as a flux measurement, while avoiding the inclusion of too many unrelated empty-sky voxels that would compromise the precision of the measurement. In line with the prime purpose of LSDCat as a detection tool, we implemented an approach that emphasises robustness over sophistication.
LSDCat addresses this task in two steps. The first is the construction of a narrow-band image via setting an analysis threshold S/N_ana as described in the previous subsection. A reasonable choice of S/N_ana should produce spectral boundaries z_min and z_max that enclose the emission line (more or less) completely in the original datacube F. Therefore the pixels of the narrow-band image NB_{x,y}(F), created by summation of F from z_min to z_max (analogous to Eq. 19), can be used for flux integration.
The second step is then to derive suitable apertures for these narrow-band images, taking the spatial extent of each source into account. Currently, LSDCat measures fluxes in circular apertures with radii defined as multiples of the Kron radius R_Kron (Kron 1980) of a detected line. The Kron radius is, however, determined not in the measured dataset itself, but again in the narrow-band image based on the filtered datacube, which makes the procedure much more robust, especially at low S/N. The measured quantity is then k · R_Kron, with

R_Kron = Σ_{x,y} r_{x,y} · NB_{x,y}(F̃) / Σ_{x,y} NB_{x,y}(F̃) ,  (22)

where r_{x,y} denotes the distance of pixel (x, y) from the 2D centroid position.
In order to avoid possibly unphysically small or large values of R_Kron caused by artefacts in the data, a minimal and a maximal value for R_Kron can be set by the user.
The factor k is also defined by the LSDCat user. Multiple values of k result in multiple columns of the output catalogue. The line flux is then given by the sum

F_line = Δλ Σ_{x,y} Σ_{z=z_min}^{z_max} F_{x,y,z} ,  (23)

with the first sum running over all (x, y) that satisfy r_{x,y} ≤ k · R_Kron. The aperture thus has a cylindrical shape, with its symmetry axis going through the 2D centroid position. The factor Δλ, which denotes the increment per spectral layer, is needed to convert the sum of voxel values into a proper integral over the line.
It can be shown that a 2.5 R_Kron aperture includes ≳90% of the total flux even for extended sources with relatively shallow profiles, as long as the determination of the Kron radius in Eq. (22) accounts for pixels at sufficiently large radii (e.g. Graham & Driver 2005). We follow a similar approach as adopted in SExtractor (Bertin & Arnouts 1996) by summing over all (x, y) in Eq. (22) that satisfy

r_{x,y} ≤ 6 R_σ ,  (24)

with R_σ as defined in Eq. (21). The uncertainty of this flux measurement is obtained by propagating the voxel variances σ² through Eq. (23).
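A minimal sketch of the Kron-radius and aperture-flux measurements of Eqs. (22) and (23), applied to a synthetic Gaussian narrow-band image (an illustration, not LSDCat's implementation):

```python
import numpy as np

def kron_radius(nb, x0, y0, r_max):
    """First radial moment of the narrow-band image about (x0, y0),
    cf. Eq. (22); only pixels within r_max enter the sums."""
    y, x = np.indices(nb.shape)
    r = np.hypot(x - x0, y - y0)
    sel = r <= r_max
    return (r[sel] * nb[sel]).sum() / nb[sel].sum()

def aperture_flux(nb, x0, y0, radius, dlam):
    """Flux in a circular aperture, cf. Eq. (23); dlam converts the
    voxel sum into an integral over wavelength."""
    y, x = np.indices(nb.shape)
    sel = np.hypot(x - x0, y - y0) <= radius
    return dlam * nb[sel].sum()

# synthetic Gaussian narrow-band image (sigma = 3 px, unit amplitude)
yy, xx = np.indices((61, 61))
nb = np.exp(-((xx - 30.0) ** 2 + (yy - 30.0) ** 2) / (2.0 * 3.0 ** 2))

rk = kron_radius(nb, 30.0, 30.0, r_max=20.0)       # ~ sigma * sqrt(pi/2)
flux = aperture_flux(nb, 30.0, 30.0, 2.5 * rk, dlam=1.25)
```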
3 Validation of the software
3.1 Creation of test datacubes
We now validate the correctness of the algorithms implemented in the LSDCat software. To this aim we produced a set of datacubes that contain fake emission-line sources at known positions with known fluxes and extents. Instead of utilising a completely artificial dataset with ideal noise, we based our source insertion experiment on the real MUSE HDFS datacube.
We implanted the fake emission-line sources into a continuum-subtracted version of the HDFS datacube at a wavelength of 5000 Å. Continuum subtraction was performed with the median-filter subtraction method detailed in Sect. 4.1. The chosen insertion wavelength ensures a clean test environment that is not hampered by the systematic sky-subtraction residuals which exist in the redder parts of the datacube. In total we created 23 test datacubes spanning a range of fake-source emission-line fluxes in equidistant logarithmic (dex) steps. Each cube contains 51 fake sources with the same emission-line flux. The spatial positions of the implanted sources are based on the pseudo-random Sobol sequence (Press et al. 1992, Sect. 7.7). The pseudo-random grid guarantees that all sources have different distances to the edges of the rectangular grid of MUSE slicer-stacks; thus possible systematic effects from localised noise properties within this slicer-stack grid are mitigated. As test sources we utilised Gaussian emission lines with a line width (FWHM) of 250 km s⁻¹. The sources were assumed to be PSF-like, and we approximated the PSF blurring by a 2D Gaussian with 0.88″ FWHM at 5000 Å, a value we obtained from a 2D Gaussian fit to the brightest star in the HDFS field. The datacubes containing the fake sources, as well as all processing steps needed to reproduce the results from the validation exercise presented in this section, are available via the LSDCat software repository.
3.2 Minimum detectable emission line flux at a given detection threshold
For known background noise, the minimum detection significance at which an emission line can be recovered from the datacube is intrinsically linked to the total flux of the line, its spatial and spectral morphology, and the degree of mismatch between the filter template and the emission-line source signal. The latter we discuss in Sects. 4.2 and 4.3 for the spatial and spectral filtering processes, respectively; expressions for the attenuation of the detection significance due to shape mismatch for Gaussian emission lines are presented in Eq. (32) and Eq. (33).
For the test datacubes described above we have complete control over the shape parameters. Thus we can use the exact template in the matched filtering process. As a benchmark we will now derive an analytic approximation for the minimum recoverable line flux at a given detection threshold and then compare the result to a source recovery experiment performed with LSDCat on the test datacube.
If the match between an emission line and the filter is perfect, the flux distribution of that emission line at position (x_0, y_0, z_0) in the datacube can be written as

f(x, y, z) = (F_line / Δλ) T_λ(z − z_0) T_PSF(x − x_0, y − y_0) ,  (25)

with F_line being the total flux in that line, Δλ being the wavelength increment per spectral layer defined in Eq. (1), and T_λ and T_PSF being the spectral and spatial templates given by Eq. (16) and Eq. (11) for the 1D and 2D Gaussian, respectively. With Eq. (25) and Eq. (8), we can thus write for the peak value of the matched-filtered datacube at position (x_0, y_0, z_0):

F̃(x_0, y_0, z_0) = (F_line / Δλ) Σ_z T_λ²(z) Σ_{x,y} T_PSF²(x, y) .  (26)
Since the noise is usually not constant over the extent of the filter, there exists no general solution for the error propagation given in Eq. (4). To obtain an approximate solution, we take the noise to be, on average, constant spectrally and spatially, at least over all voxels within the matched filter: σ(x, y, z) ≈ σ_c. Due to the absence of strong sky emission lines in the blue part of the MUSE datacube, this approximation is well justified for the test dataset considered here. Eq. (4) can therefore be written as

σ̃²(x_0, y_0, z_0) ≈ σ_c² Σ_z T_λ²(z) Σ_{x,y} T_PSF²(x, y) .  (27)
Assuming that the dispersions σ_z and σ_G of the 1D and 2D Gaussians in Eq. (11) and Eq. (20) are large enough to make sampling and aliasing effects of the profiles in the datacube negligible, we can replace the sums with integrals, thus

Σ_z T_λ²(z) ≈ ∫ T_λ²(z) dz = 1 / (2 √π σ_z)  (28)

and

Σ_{x,y} T_PSF²(x, y) ≈ ∬ T_PSF²(x, y) dx dy = 1 / (4π σ_G²) .  (29)
Inserting these expressions into Eq. (26) and Eq. (27), the peak detection significance of the line at position (x_0, y_0, z_0) follows via Eq. (9) as

SN(x_0, y_0, z_0) = F_line / (Δλ σ_c) × (8 π^{3/2} σ_G² σ_z)^{−1/2} .  (30)
With the expression given in Eq. (30) it is now possible to estimate the minimum recoverable line flux at a given detection threshold. For the artificial sources implanted in the MUSE HDFS data, the spatial sampling of 0.2″ per pixel converts the 0.88″ PSF FWHM to σ_G ≈ 1.9 px. The spectral line width of 250 km s^-1 (FWHM) translates to σ_z ≈ 1.4 px at 5000 Å with Δλ = 1.25 Å. By averaging our empirical noise estimate around 5000 Å we obtain a value for σ_c (in erg s^-1 cm^-2 Å^-1). As we detail in Sect. 4.5 for the MUSE HDFS datacube, a detection threshold of 8 is a value found to be suitable for practical work. Inserting these values into Eq. (30) we calculate

F_line^min = 8 Δλ σ_c (8 π^{3/2} σ_G² σ_z)^{1/2}  (31)

as the minimum line flux at which sources should be detectable at SN ≥ 8, if an exactly matching filter is chosen in the cross-correlation process.
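Numerically, the analytic estimate of Eq. (30) can be inverted for the minimum flux as follows. This is an illustrative sketch: the constant noise value `sigma_c` below is a placeholder, not the measured HDFS noise, and the parameter values follow the test setup of Sect. 3.1.

```python
import numpy as np

def min_detectable_flux(sn_det, sigma_c, dlambda, fwhm_spat_px, fwhm_spec_px):
    """Invert Eq. (30): minimum total line flux for a peak S/N of sn_det."""
    sig_g = fwhm_spat_px / (2 * np.sqrt(2 * np.log(2)))  # spatial sigma [px]
    sig_z = fwhm_spec_px / (2 * np.sqrt(2 * np.log(2)))  # spectral sigma [px]
    return sn_det * dlambda * sigma_c * np.sqrt(8 * np.pi**1.5 * sig_g**2 * sig_z)

# 0.88" FWHM at 0.2"/px; 250 km/s FWHM at 5000 A with 1.25 A/layer;
# sigma_c = 1e-20 erg/s/cm^2/A is a hypothetical noise level for illustration
f_min = min_detectable_flux(sn_det=8, sigma_c=1e-20, dlambda=1.25,
                            fwhm_spat_px=0.88 / 0.2,
                            fwhm_spec_px=250 / 2.998e5 * 5000 / 1.25)
```

For the placeholder noise value the result is of order 10^-18 erg s^-1 cm^-2.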
We now perform the recovery experiment utilising the test datacubes introduced in Sect. 3.1 to check whether LSDCat is indeed able to detect emission line sources at the estimated minimum line flux. To this end we process all 23 test datacubes with LSDCat, utilising the perfectly matched filter and the detection threshold given above. In the resulting catalogues we then search for positional cross-matches with the input source positions. By counting these cross-matches we produce the completeness curve displayed in Fig. 3, which shows the fraction of recovered sources as a function of input emission line flux. As vertical dashed and dotted lines, respectively, we indicate the analytically approximated minimum flux of Eq. (31) and the flux at which we estimate 50% completeness.
As can be seen in Fig. 3, the completeness curve starts to rise at the analytically estimated minimum flux for detectability. This validates the implementation of the detection algorithm in LSDCat. Nevertheless, it is also clear from Fig. 3 that the completeness does not jump from 0 to 1 at the estimated minimum; it takes approximately 0.4 dex in flux, slightly above the estimated minimum value, to recover all emission line sources. Overall, however, there is an excellent match between the simple analytic model of detectability and our realistic implementation of a detection experiment, especially given that the noise in the real data is certainly not exactly Gaussian as assumed in the model.
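The cross-matching step that yields one point of the completeness curve can be sketched as below; the function name and the match radius of 2 pixels are illustrative assumptions, not LSDCat internals.

```python
import numpy as np

def completeness(input_xy, detected_xy, rmatch=2.0):
    """Fraction of inserted sources with a catalogue entry within rmatch pixels."""
    if len(detected_xy) == 0:
        return 0.0
    inp = np.asarray(input_xy, dtype=float)
    det = np.asarray(detected_xy, dtype=float)
    # squared distances between every input and every detected position
    d2 = ((inp[:, None, :] - det[None, :, :])**2).sum(axis=-1)
    # a source counts as recovered if its nearest detection is close enough
    return float(np.mean(d2.min(axis=1) <= rmatch**2))
```

Evaluating this for each of the 23 flux levels gives the curve of Fig. 3.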
3.3 Line flux measurements
Given the known input emission line fluxes of the implanted fake sources, we are equipped to validate the flux integration routine implemented in LSDCat. As detailed in Sect. 2.5.2, LSDCat integrates fluxes in circular apertures whose radii are multiples of the Kron radius (Eqs. 22 and 23). It has been shown that apertures of a few Kron radii are expected to contain nearly the total flux for Gaussian profiles (Graham & Driver 2005). Therefore we compare in this experiment the input flux F_in to the LSDCat-measured flux F_out in three different multiples of the Kron radius. In the absence of noise we thus expect F_out ≈ F_in for every source.
The result of this comparison from our source insertion and recovery experiment is visualised in Fig. 4, where we plot, as a function of input flux, the difference between input and output flux for each individual emission line source detected by LSDCat. In addition to the individual differences we also show, again as a function of input flux, the mean and the standard deviation of the difference over all recovered sources (blue circles and red bars in Fig. 4). We can compare the latter with the expected spread in flux measurements, which is simply given by the average uncertainty on the flux measurement as tabulated by LSDCat for each input flux bin. As noted in Sect. 2.5.2, the flux measurement uncertainties follow from direct propagation of the voxel variances through Eq. (23).
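The propagation of per-pixel variances into a flux uncertainty can be illustrated with a simplified 2D aperture sum. This is a stand-in sketch under stated assumptions (hard, non-fractional pixel membership; a 2D image rather than the 3D data of Eq. 23), not the LSDCat implementation.

```python
import numpy as np

def aperture_flux(image, variance, x0, y0, radius):
    """Sum flux in a circular aperture and propagate the per-pixel variances."""
    ny, nx = image.shape
    yy, xx = np.mgrid[0:ny, 0:nx]
    inside = (xx - x0)**2 + (yy - y0)**2 <= radius**2
    flux = image[inside].sum()
    # variances of independent pixels add; the error is the sqrt of their sum
    flux_err = np.sqrt(variance[inside].sum())
    return flux, flux_err
```

Note that the error bar obtained this way inherits any bias present in the variance cube (cf. Sect. 4.4).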
As can be seen from Fig. 4, there is no bias in the recovered flux levels at higher input fluxes. Even for lower flux levels, the systematic errors are small, with (on average) 90% of the total flux being recovered. Still, at the lowest flux levels, where the detection completeness is also below 100%, a handful of implanted emission line sources are recovered with fluxes that are only 50% of the input flux. We checked that the larger deviations for some of the faintest simulated sources are all due to imperfections in the HDFS datacube. This experiment thus demonstrates the robustness of the flux measurement procedure in LSDCat, but also the need to use real data when quantifying the true performance of a given measurement approach.
3.4 Comparison to manual flux integration on spatially extended objects
To further demonstrate the robustness of the fluxes obtained with LSDCat, we compare the LSDCat flux measurements with manually measured fluxes (Fig. 5). The sample on which we performed this test consists of 60 Lyman α emitting galaxies that we found in MUSE datacubes (Herenz et al., in prep.). As established recently by Wisotzki et al. (2016), such galaxies regularly show extended but low-surface-brightness Lyman α haloes; they thus constitute good test cases for flux measurements of non-trivially shaped extended objects. We obtained the manual flux measurements using a curve-of-growth approach on pseudo-narrow-band images created from the datacube. The width and central wavelength of those images were determined by eye, in order to ensure that the complete emission line signal is encompassed within the bandpass; for this purpose we utilised 1D spectra extracted in circular apertures of three pixel radius centred on the detected positions. On these images we then constructed growth curves by integrating the fluxes in concentric circular apertures of consecutively increasing radius. Finally, we visually inspected these curves to pin down the radius at which each curve saturates, and the total flux within this circular aperture was adopted as the final flux measurement.
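The growth-curve construction on a narrow-band image can be sketched as follows (an illustrative helper, with hard pixel membership assumed; the saturation radius is then picked by visual inspection as described above):

```python
import numpy as np

def curve_of_growth(nb_image, x0, y0, radii):
    """Cumulative flux in concentric circular apertures on a narrow-band image."""
    ny, nx = nb_image.shape
    yy, xx = np.mgrid[0:ny, 0:nx]
    r2 = (xx - x0)**2 + (yy - y0)**2
    # one cumulative aperture flux per requested radius
    return np.array([nb_image[r2 <= r**2].sum() for r in radii])
```

The curve is monotonically non-decreasing for a non-negative image and flattens once the aperture encloses essentially all of the source flux.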
The LSDCat measurements were obtained by applying Eq. (23) in apertures of two different multiples of the Kron radius. As can be seen in Fig. 5, the different measurement approaches agree very well globally. The LSDCat fluxes in the smaller apertures are systematically somewhat below the growth-curve measurements, indicating that these apertures still lose a small but significant fraction of the flux (2% or 6% flux lost compared to the manual fluxes in the median or average of the sample, respectively); this is no longer the case for the larger apertures (+10% or +8% flux gained in the median or average, respectively). We conclude that LSDCat delivers reliable and robust flux measurements for emission lines with non-pathological spatial and spectral shapes.
4 Guidelines for the usage of LSDCat
We now provide some guidelines for using LSDCat on wide-field IFS datacubes. These are based on our experience with applying the code to MUSE datacubes when searching for faint emission lines from high-redshift galaxies (first results are presented in Bacon et al. 2015 and Bina et al. 2016; more will be reported in Herenz et al., in prep.). We intend these guidelines to be instructive for the potential LSDCat user, but they should not be understood as strict recipes.
4.1 Dealing with bright continuum sources in the datacube
Even in relatively empty regions of the sky, any blank-field exposure will contain objects that produce a detectable continuum signal within the datacube, and that correspondingly appear in a white-light image obtained by averaging over all layers of the datacube (e.g. Fig. 3 in Bacon et al. 2015). To detect such sources, conventional 2D source detection algorithms are clearly sufficient. Since the detection process of LSDCat implicitly assumes any significant signal to be due to emission lines, it is advisable to either mask out or, better, subtract any significant continuum signal from the datacube before running LSDCat. This step is not strictly required, and the presence of continuum sources does not render the detection algorithm unusable as such. But the significance of a line detection would clearly change if the line sat on top of a continuum signal, while a very bright continuum-only object would even produce a band of spurious detections along the spectral axis.
To remove the continuum, we found it useful to create and subtract a “continuum-only” cube obtained by median filtering the original flux datacube in the spectral direction only. The width of the median filter should be much broader than the expected widths of the emission lines, while still narrow enough to approximately trace a slowly varying continuum. In our experience, filter widths of the order of the 180 Å default adopted in our implementation (Appendix A) serve these goals very well. The median-filter-subtracted datacube can then be used as input for LSDCat.
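A spectral-direction-only median filter can be expressed compactly with `scipy.ndimage`. The following is a minimal sketch of the idea (not the packaged `median-filter-cube.py` routine); the filter width in layers is a parameter, with 144 layers corresponding to 180 Å at the MUSE sampling of 1.25 Å per layer.

```python
import numpy as np
from scipy.ndimage import median_filter

def subtract_continuum(cube, width_layers=144):
    """Subtract a running spectral median from every spaxel of the cube.

    size=(width_layers, 1, 1) filters along the spectral axis only, so the
    spatial information in each layer is left untouched.
    """
    continuum = median_filter(cube, size=(width_layers, 1, 1), mode='nearest')
    return cube - continuum
```

A narrow emission line is unaffected by the broad median, while a slowly varying continuum is removed almost completely.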
4.2 Width of the spatial filter template
The matched filtering approach produces maximum significance if the shapes of the true signal and of the template agree exactly; any template mismatch leads to a reduced SN value. Fortunately, this dependence of SN on the template shape parameters is relatively weak around the optimum. If both signal and template are 2D Gaussians with widths σ_s and σ_t, respectively, it can be shown that the achievable detection significance decreases only as

SN / SN_max = 2 σ_t σ_s / (σ_t² + σ_s²)  (32)
(Zackay & Ofek 2015). Hence even a difference of 20% between the adopted and the correct FWHM will reduce SN by only ≈2%, which is entirely negligible for our purposes; this number is supported by our own numerical experiments.
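The quoted ~2% loss for a 20% width mismatch is easy to verify numerically, assuming the Gaussian-mismatch form 2 σ_t σ_s / (σ_t² + σ_s²) for the 2D case:

```python
import numpy as np

def sn_ratio_spatial(sig_t, sig_s):
    """Relative S/N for a 2D Gaussian template of width sig_t applied to a
    2D Gaussian signal of width sig_s (Gaussian-mismatch form)."""
    return 2.0 * sig_t * sig_s / (sig_t**2 + sig_s**2)

# a 20% error in the adopted width costs only ~2% in detection significance
loss = 1.0 - sn_ratio_spatial(1.2, 1.0)
```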
When searching for compact emission line objects, a single spatial template modelling the light distribution of a point source (i.e. the PSF) will therefore be sufficient for many applications. Even neglecting the wavelength dependence of the seeing (i.e. setting the zeroth-order polynomial coefficient in Eq. (15) to the mean Gaussian seeing FWHM and all higher-order coefficients to zero) will result in only a very modest reduction of sensitivity at the shortest and longest wavelengths.
To go one step further in accuracy, one has to account for the seeing as a function of wavelength. In the framework of the standard (Kolmogorov) turbulence model of the atmosphere, the seeing FWHM is expected to decrease with increasing wavelength as ∝ λ^{-1/5}, where the constant of proportionality also depends on airmass (e.g. Hickson 2014). In reality, other effects also play a role, such as guiding errors or the blurring induced by co-adding multiple dithered exposures of the same field.
When the observed field contains a point source of sufficient brightness, a direct measurement of the FWHM(λ) relation is possible from the datacube. As an example, we show in Fig. 6 the relation derived from a Moffat fit to the brightest star in the MUSE HDFS datacube (same data as Fig. 2 in Bacon et al. 2015). We overplot the polynomial fit according to Eq. (15) and show the relative difference between this fit and the values derived from the star; this difference is less than 3%, and thus totally negligible in our matched-filter application. For datacubes containing only faint stars, it may be advisable to bin over several spectral layers before modelling the PSF.
If no point source is present within the datacube, the relations by Tokovinin (2002) can be used to get an idea of the dependence of the FWHM on λ. They predict FWHM(λ) given a differential image motion monitor (DIMM) seeing measurement, the airmass of the observation, and an additional parameter, the wavefront outer scale length L_0, that quantifies the maximum size of wavefront perturbations by the atmosphere (e.g. Martin et al. 1998). In Fig. 6 we also compare the relation derived from this model to the actual measurement of the brightest star in the HDFS. For the plot, we adopted a DIMM seeing of 0.75″ at 5000 Å and an airmass of 1.41, both averages over all individual MUSE HDFS observations. We set L_0 to 22 m, the median value measured for the Paranal observatory (Conan et al. 2000). As can be seen, this model provides a very good description of the measured relation. Moreover, from this figure it is also clear that a second-order polynomial is a nearly perfect representation of the Tokovinin prediction.
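That a second-order polynomial represents such a smoothly declining FWHM(λ) relation essentially perfectly over the MUSE wavelength range can be checked with a quick fit. The data below are synthetic (a pure λ^{-1/5} Kolmogorov-like curve anchored at 0.88″ at 5000 Å), not the measured HDFS star.

```python
import numpy as np

# synthetic per-layer seeing measurements over the MUSE wavelength range
lam = np.linspace(4750, 9350, 50)            # wavelength [Angstrom]
fwhm = 0.88 * (lam / 5000.0)**(-0.2)         # Kolmogorov-like FWHM [arcsec]

# fit the second-order polynomial of Eq. (15) and check the residuals
coeffs = np.polyfit(lam, fwhm, deg=2)        # highest order first
fit = np.polyval(coeffs, lam)
max_rel_err = np.max(np.abs(fit - fwhm) / fwhm)
```

The maximum relative deviation of the quadratic from the power law stays well below one per cent over the full range.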
4.3 Width of the spectral filter template
As explained above, LSDCat assumes the spectral (Gaussian) template to have a fixed width in velocity space. We now briefly demonstrate the effect of template mismatch in the spectral domain. Similarly to the 2D case considered in the previous subsection (Eq. 32), it can be shown that the achievable detection significance decreases as

SN / SN_max = [ 2r / (1 + r²) ]^{1/2} ,  (33)
where r is the ratio between the adopted and the true template width. Even when the filter width is half or twice that of the actual object (r = 0.5 or 2), the maximum reachable detection significance is reduced by only ≈11%. For the same reason, the achievable SN is very robust against moderate shape mismatches between real emission line profiles and the Gaussian template profile.
A good choice of the spectral filter width can be motivated by analysing the distribution of expected emission line widths (taking instrumental line broadening into account). If the distribution of observed line widths is relatively narrow, it will be sufficient to adopt a single template with a width close to the midpoint of the expected distribution. If, however, the expected distribution is very broad, for example when searching both for star-forming galaxies and for AGN, it may be useful to generate two filtered datacubes with significantly different filter widths (their ratio should be at least a factor of 3). This implies two LSDCat runs creating two catalogues that later have to be merged.
To demonstrate the impact of varying the template width on the detectability of a particular class of emission line objects, we show in Fig. 7 the dependence of the ratio SN/SN_max on the filter width for all Lyman α emitters in the MUSE Hubble Deep Field South datacube (Bacon et al. 2015). Here, SN_max denotes the maximum detection significance for a line over all considered filter widths. The plot shows that SN/SN_max varies quite slowly with template width, confirming the theoretically expected behaviour (shown by the red curve). A good choice, at least for this object class, appears to be an intermediate filter width for which basically all Lyman α emitters in the sample are detected with at least 90% of their maximum possible detection significance. The same template will capture even completely spectrally unresolved emission (i.e. with just the MUSE instrumental line width of ≈100 km s^-1 FWHM) at more than 80% of the value for a perfectly matching template.
4.4 Empirical noise calibration
Source detection is essentially a decision process based on a test statistic, either rejecting or accepting features in the data as genuine astronomical source signals (e.g. Schwartz & Shaw 1975; Wall 1979; Hong et al. 2014; Zackay & Ofek 2015; Vio & Andreani 2016). This decision is usually based on a comparison with the noise statistics of the dataset under scrutiny. Consequently, a good knowledge of the noise properties is required to decide on meaningful thresholds, for example in terms of false-alarm probability. In LSDCat we assume that the variance datacube contains a good estimate of the true variances. In reality, however, this is often not easy to obtain. In particular, the resampling processes carried out during data reduction usually neglect the covariance terms; even if known, it would be computationally prohibitive to formally include them for a datacube of several 10^8 voxels. In consequence, any noise property derived directly from the propagated variances will underestimate the true noise, possibly by a very substantial factor, and the resulting detection significances will be biased towards overly high values that lose their probabilistic connotation.
A possible remedy is self-calibration of the noise from the flux datacube itself. There are several ways to accomplish this, and here we simply provide a few insights from our own experience with MUSE cubes. It must be realised that the pixel-to-pixel noise in any given spectral layer will be as much affected by the resampling as the propagated variances, and therefore biased low in the same way. This is not so, however, for the variance of a typical aperture, for which the effects of resampling are much smaller (basically acting only on the pixels at the circumference of the aperture). One can therefore estimate effective variances by evaluating the standard deviation between several identical apertures placed at different blank-sky locations, separately for each spectral layer. This approach also implicitly accounts for additional noise-like effects due to imperfect flat-fielding or sky subtraction.
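The aperture-based noise estimate described above can be sketched as follows (a minimal illustration under the stated assumptions: the blank-sky aperture positions are supplied by the user, and hard pixel membership is used):

```python
import numpy as np

def empirical_noise_per_layer(cube, positions, radius=3.0):
    """Per-layer noise from the scatter of identical blank-sky aperture fluxes.

    `positions` is a list of (x, y) centres of apertures placed on empty sky.
    """
    nz, ny, nx = cube.shape
    yy, xx = np.mgrid[0:ny, 0:nx]
    ap_fluxes = []
    for x0, y0 in positions:
        inside = (xx - x0)**2 + (yy - y0)**2 <= radius**2
        ap_fluxes.append(cube[:, inside].sum(axis=1))  # one flux per layer
    # spread over the apertures, evaluated separately for each spectral layer
    return np.std(np.array(ap_fluxes), axis=0)
```

The resulting per-layer values can replace (or rescale) the formally propagated variances.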
Indeed, sky subtraction residuals can still be an issue in MUSE data, especially for the OH forest in the red part of the spectral range. If a dataset is heavily affected and suffers from too many spurious detections that are just sky residuals (and even the ZAP software of Soto et al. 2016 does not provide sufficient improvement), it is still possible to increase the effective variances of the affected spectral layers to a level where only extremely bright sources at these wavelengths are detected by LSDCat.
4.5 Choosing the detection and analysis thresholds
A crucial quantity in any detection process is the incidence rate of false positives. If the noise properties are accurately known, the false-alarm probability can be calculated directly from the detection threshold. Given the several 10^8 voxels of a MUSE-like wide-field datacube, it is obviously necessary to have a very low false-alarm probability. After matched filtering with a typical template, the number of effectively independent voxels is substantially reduced: for a typical spectral and spatial filter (the default values in LSDCat), the spatial sampling of 0.2″ per pixel and the spectral sampling of 1.25 Å per layer imply, at the central wavelength of MUSE (6980 Å), a filter FWHM of ≈10 voxels in each dimension. Yet even assuming perfect Gaussian noise, a moderate detection threshold would still result in a non-negligible number of spurious detections per cube. Moreover, the wings of the noise distribution are never perfectly Gaussian, and flat-fielding and sky subtraction residuals add to them; any such deviations from Gaussianity directly inflate the false detection rate.
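For a rough orientation, the Gaussian expectation can be evaluated directly from the upper-tail probability of the normal distribution; the number of effectively independent resolution elements used below is a hypothetical placeholder, not a value derived for a specific MUSE cube.

```python
import numpy as np
from scipy.stats import norm

# expected spurious detections for perfectly Gaussian noise, given an assumed
# (hypothetical) number of independent resolution elements after filtering
n_independent = 5e5
for thresh in (4.0, 5.0, 6.0):
    n_false = n_independent * norm.sf(thresh)  # sf = upper-tail probability
    print(f"threshold {thresh}: {n_false:.2g} expected false positives")
```

Already between thresholds of 4 and 5 the Gaussian expectation drops from more than one to less than one spurious detection for this placeholder cube size, which illustrates how steeply the false-alarm rate depends on the threshold.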
Instead of relying on theoretically expected false-alarm probabilities, we recommend self-calibrating the rate of spurious detections by running LSDCat on the negated cube. While obviously no real emission line object can be detected in such a dataset, it can reasonably be assumed that the effective noise properties are approximately the same in the original and the negated data. Running LSDCat on both versions then immediately allows us to estimate the ratio of real to spurious detections and to adjust the final detection threshold accordingly. A possible criterion is the point of diminishing returns, where lowering the detection threshold would produce a large increase in spurious detections with only a small compensatory increase in genuine emission lines.
In Fig. 8 we present, as an example, the result of such an analysis for the MUSE HDFS dataset, processed according to the guidelines given in the previous subsections. In the figure we show the number of detections in both the original and the negated datacube for different thresholds. Moreover, in the bottom panel of Fig. 8 we show the ratio N⁻/N⁺, where N⁺ and N⁻ designate the number of detections in the original and negated datacube, respectively. Assuming a noise distribution that is symmetric around zero and no systematic negative holes in the data, the rate of genuine detections within a catalogue at a given threshold can be approximated by 1 − N⁻/N⁺ (e.g. Serra et al. 2012). However, we find that the sky-subtraction residuals in the MUSE HDFS datacube are systematically skewed towards negative values. In addition, since we subtracted the continuum signal utilising the median-filter approach (Sect. 4.1), absorption lines in some continuum-bright objects created holes that mimic emission line signals in the negated datacube. Indeed, most of the detections in the negated cube at high significance can be associated either with sky subtraction residuals or with holes from over-subtracted absorption lines. Taking this into account, we notice in Fig. 8 a second strong increase of detections in the negated dataset, not compensated by positive detections, below a detection threshold of 8. Hence, for this particular dataset a threshold below 8 would not be advisable.
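The bottom-panel statistic of Fig. 8 can be computed from two detection lists as sketched below (a minimal illustration; the inputs are hypothetical peak-S/N values from the original and negated cubes, and the 1 − N⁻/N⁺ estimator carries the caveats discussed above):

```python
import numpy as np

def purity(sn_pos, sn_neg, thresholds):
    """Approximate genuine-detection rate 1 - N_neg/N_pos per threshold."""
    out = []
    for t in thresholds:
        n_pos = np.sum(np.asarray(sn_pos) >= t)   # detections, original cube
        n_neg = np.sum(np.asarray(sn_neg) >= t)   # detections, negated cube
        out.append(1.0 - n_neg / n_pos if n_pos else np.nan)
    return np.array(out)
```

Scanning `thresholds` over a grid and plotting the result reproduces the kind of diagnostic shown in the bottom panel of Fig. 8.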
The choice of the analysis threshold used in the measurement routine (Sect. 2.5) again depends on the noise characteristics. Here it is useful to visually inspect the S/N cubes. Considering, for example, the faint emission line source presented in Fig. 2, a comparison of the left panel (where no source is present) with the right panel that includes the source shows that non-source voxels rarely attain S/N values higher than 3. Hence, in this dataset, voxels around a real detection above an analysis threshold of 3.5 are likely linked to the actual source and should be included in the measurement process.
5 Conclusion and outlook
We have presented LSDCat, a conceptually simple but robust and efficient detection package for emission line objects in wide-field IFS datacubes. The detection utilises a 3D matched-filtering approach to find individual emission lines, which are then grouped into discrete objects. Furthermore, the software measures positions, fluxes, and spatial extents of the detected lines. LSDCat is implemented in Python, with a focus on fast processing of the large data volumes generated by instruments such as MUSE. In this paper we have also provided instructive guidelines for the prospective usage of LSDCat.
LSDCat is open-source software. Following the example of AstroPy (Astropy Collaboration et al. 2013), we release it to the community under a 3-clause BSD style license.
LSDCat is documented in two ways. First, each of the routines described in Sect. 2 is equipped with an online help. Second, we provide an extensive README file describing all routines and options in detail; this README also contains examples and scripts that help to use LSDCat efficiently.
LSDCat is actively maintained, as it is currently used by the MUSE consortium to search for faint high-redshift emission line galaxies (e.g., Bacon et al. 2015; Bina et al. 2016; Herenz et al., in prep.; Urrutia et al., in prep.). Development takes place within a git repository.
While LSDCat is fully operational, we see a number of aspects with room for future improvement. For example, LSDCat currently does not perform a deblending of over-merged detections, nor does it automatically merge detections belonging to a single source unless their initial positions are within a predefined radius. Indeed, in our search for faint line emitters in MUSE datacubes we encountered a few cases of very extended line-emitting galaxies that fragmented into several ‘sources’. We aim to address this problem in a future version; currently such sources have to be merged or deblended manually in the resulting output catalogue. Another improvement planned for a future release is an automatic classification of objects in which multiple lines are detected. To this end, the combination of spectral lines found at the same position on the sky must be matched against known combinations of redshifted galaxy emission lines (Garilli et al. 2010). Finally, while in principle the software could be used for any kind of astronomical datacube (e.g. from radio observations or from other IFS instruments), we have so far focused our development and testing efforts on MUSE datacubes. The code, however, is independent of instrument specifications and requires only valid FITS files following the conventions stated in Sect. 2.1. Despite the possible enhancements listed above, LSDCat is already a complete software package, and we hope that it will be of value to the community.
Acknowledgements.
We thank Maria Werhahn for valuable help with the parameter study shown in Fig. 7 and Joseph Caruana for providing us with the curve-of-growth flux integrations of the 60 Lyα emitters shown in Fig. 5. We thank Rikke Saust for preparing the test data. We also thank the MUSE consortium, led by Roland Bacon, for constructive feedback during the LSDCat development. E.C.H. dedicates this paper to Anna’s cute fat cat Cosmos.
Appendix A Usage example
We provide a short example to demonstrate how the user interacts with the LSDCat routines to obtain an emission line catalogue from an IFS datacube (see also Fig. 1). For this example we use the MUSE HDFS datacube (Bacon et al. 2015) in version 1.34, which can be downloaded from the MUSE consortium data release web page: http://muse-vlt.eu/science/datareleases/. For brevity, we refer to the MUSE HDFS datacube FITS file as datacube.fits. Moreover, in all following commands, the -o option specifies the output filename of a particular routine.
As detailed in Sect. 4.1, any detectable source continua should be removed from a datacube before LSDCat is used. There we outlined that it is often sufficient to subtract a version of the datacube that has been median filtered in the spectral direction. We package with LSDCat an additional routine median-filter-cube.py that performs this task via the following command:
median-filter-cube.py datacube.fits -o mf_datacube.fits
(Run with its default settings, the full width of the running median filter is 180 Å.) The output file mf_datacube.fits contains the continuum-subtracted flux and variance datacubes that serve as input for the matched-filtering procedure (Sect. 2.2).
The spatial convolution and corresponding error propagation can now be achieved by running the following command:
lsd_cc_spatial.py --gaussian --p0=0.65 --p1=-4.5e-5 --lambda0=7050 -i mf_datacube.fits -m mask.fits -o spac_datacube.fits
Here we specified with --p0, --p1, and --lambda0 the coefficients and the zero-point of the linear function FWHM(λ) = p0 + p1 × (λ − λ0), which is an apt representation of the PSF FWHM wavelength dependence in this field. The switch --gaussian models the PSF as a circular Gaussian (Eq. 11). Here we also utilise a mask mask.fits that masks out imperfections in the HDFS datacube near the FoV borders and the brightest star in the field, as these features cause numerous unwanted detections in the datacube.
Next, the spectral convolution is run on spac_datacube.fits via
lsd_cc_spectral.py -i spac_datacube.fits --FWHM=300 -o 3d_filtered_datacube.fits
Here --FWHM is used to specify a filter FWHM of 300 km s^-1. The output 3d_filtered_datacube.fits now contains the matched-filter output and its propagated variances, from which we can compute the signal-to-noise cube (Eq. 10) via:
s2n-cube.py -i 3d_filtered_datacube.fits -o sncube.fits
While optional, the above step reduces the execution time of the routines lsd_cat.py and lsd_cat_measure.py, as they would otherwise create the S/N datacube on the fly.
We can now create an emission line source catalogue with a detection threshold of 10 by typing:
lsd_cat.py -i sncube.fits -t 10 -c catalogue.cat
Utilising the thresholding procedure described in Sect. 2.4, the above command creates an ASCII table catalogue.cat and a FITS table catalogue.fits with 2456 emission line candidates from 225 potential individual objects. Finally, the basic measurements described in Sect. 2.5 are obtained for each of the emission line candidates with the command
lsd_cat_measure.py -ic catalogue.fits -ta 5 -f mf_datacube.fits -ff 3d_filtered_datacube.fits -ffsn sncube.fits
At this stage we recommend that the catalogue be inspected with QtClassify.
More complete and up-to-date documentation describing all routines can be found in the LSDCat repository. Moreover, for all routines, online usage information can be displayed by calling them with the -h switch.
Footnotes
 thanks: The LSDCat software is available for download at http://muse-vlt.eu/science/tools and via the Astrophysics Source Code Library at http://ascl.net/1612.002.
 institutetext: Leibniz-Institut für Astrophysik Potsdam (AIP), An der Sternwarte 16, 14482 Potsdam, Germany
 institutetext: Department of Astronomy, Stockholm University, AlbaNova University Centre, SE106 91,
Stockholm, Sweden
email: christian.herenz@astro.su.se
 Python Software Foundation. Python Language Reference, version 2.7. Available at http://www.python.org
 NumPy version 1.10.1. Available at http://www.numpy.org/.
 SciPy version 0.16.1, available at http://scipy.org/
 AstroPy version 1.0.1, available at http://www.astropy.org/
 In the limiting case β → ∞, Eq. (12) is equal to Eq. (11); cf. Trujillo et al. (2001b).
 A comma-separated list within brackets in the second column indicates the set of corresponding LSDCat output column names. For each spatial coordinate there is also a corresponding right ascension and declination value available, and for each spectral coordinate a corresponding wavelength can be tabulated, provided that a world coordinate system is specified.
 MUSE HDFS version 1.24, available for download from http://muse-vlt.eu/science/hdfsv10/
 https://www.w3.org/Consortium/Legal/2008/03bsdlicense.html
 http://git-scm.com
 In order to follow the example we provide a suitable mask for the HDFS datacube version 1.34 in the examples folder of the LSDCat repository.
 https://bitbucket.org/Leviosa/qtclassify
References
 Akhlaghi, M. & Ichikawa, T. 2015, ApJS, 220, 1
 AllingtonSmith, J. 2006, New A Rev., 50, 244
 Annunziatella, M., Mercurio, A., Brescia, M., Cavuoti, S., & Longo, G. 2013, PASP, 125, 68
 Astropy Collaboration, Robitaille, T. P., Tollerud, E. J., et al. 2013, A&A, 558, A33
 Bacon, R., Brinchmann, J., Richard, J., et al. 2015, A&A, 575, A75
 Bacon, R., Vernet, J., Borisova, E., et al. 2014, The Messenger, 157, 13
 Bertin, E. 2001, in Mining the Sky, ed. A. J. Banday, S. Zaroubi, & M. Bartelmann, 353
 Bertin, E. & Arnouts, S. 1996, A&AS, 117, 393
 Bina, D., Pelló, R., Richard, J., et al. 2016, A&A, 590, A14
 Bourguignon, S., Mary, D., & Slezak, É. 2012, Statistical Methodology, 9, 32, special issue on astrostatistics and spatial statistics
 Conan, R., Ziad, A., Borgnino, J., Martin, F., & Tokovinin, A. A. 2000, in Proc. SPIE, Vol. 4006, Interferometry in Optical Astronomy, ed. P. Léna & A. Quirrenbach, 963–973
 Das, P. K. 1991, Optical Signal Processing (Springer Science + Business Media)
 Filippenko, A. V. 1982, PASP, 94, 715
 Garilli, B., Fumana, M., Franzetti, P., et al. 2010, PASP, 122, 827
 Graham, A. W. & Driver, S. P. 2005, PASA, 22, 118
 Greisen, E. W. & Calabretta, M. R. 2002, A&A, 395, 1061
 Greisen, E. W., Calabretta, M. R., Valdes, F. G., & Allen, S. L. 2006, A&A, 446, 747
 Herenz, E. C. & Wisotzki, L. 2016, LSDCat: Line Source Detection and Cataloguing Tool, Astrophysics Source Code Library
 Hickson, P. 2014, A&A Rev., 22, 76
 Hong, S., Dey, A., & Prescott, M. K. M. 2014, PASP, 126, 1048
 Husser, T.-O., Kamann, S., Dreizler, S., et al. 2016, A&A, 588, A148
 Jones, E., Oliphant, T., Peterson, P., et al. 2001, SciPy: Open source scientific tools for Python, [Online; accessed 20160115]
 Jurek, R. 2012, PASA, 29, 251
 Kamann, S., Wisotzki, L., & Roth, M. M. 2013, A&A, 549, A71
 Kelz, A., Kamann, S., Urrutia, T., Weilbacher, P., & Bacon, R. 2016, in Astronomical Society of the Pacific Conference Series, Vol. 507, Multi-Object Spectroscopy in the Next Decade: Big Questions, Large Surveys, and Wide Fields, ed. I. Skillen, M. Balcells, & S. Trager, 323
 Koribalski, B. S. 2012, PASA, 29, 359
 Kron, R. G. 1980, ApJS, 43, 305
Martin, C., Moore, A., Morrissey, P., et al. 2010, in Society of Photo-Optical Instrumentation Engineers (SPIE) Conference Series, Vol. 7735, Society of Photo-Optical Instrumentation Engineers (SPIE) Conference Series, 0
 Martin, F., Tokovinin, A., Ziad, A., et al. 1998, A&A, 336, L49
 Masias, M., Freixenet, J., Lladó, X., & Peracaula, M. 2012, MNRAS, 422, 1674
 Meillier, C., Chatelain, F., Michel, O., et al. 2016, A&A, 588, A140
 Moffat, A. F. J. 1969, A&A, 3, 455
 Pence, W. D., Chiappetti, L., Page, C. G., Shaw, R. A., & Stobie, E. 2010, A&A, 524, A42
 Popping, A., Jurek, R., Westmeier, T., et al. 2012, PASA, 29, 318
 Press, W. H., Teukolsky, S. A., Vetterling, W. T., & Flannery, B. P. 1992, Numerical recipes in FORTRAN. The art of scientific computing
 Roth, M. M. 2006, New A Rev., 49, 573
 Saintonge, A. 2007, AJ, 133, 2087
Schwartz, M. & Shaw, L. 1975, Signal processing: discrete spectral analysis, detection, and estimation (Tokyo: McGraw-Hill Kogakusha, Ltd.)
 Serra, P., Jurek, R., & Flöer, L. 2012, PASA, 29, 296
 Serra, P., Westmeier, T., Giese, N., et al. 2015, MNRAS, 448, 1922
 Shore, S. N. 2009, A&A, 500, 491
 Soto, K. T., Lilly, S. J., Bacon, R., Richard, J., & Conseil, S. 2016, MNRAS, 458, 3210
 Streicher, O., Weilbacher, P. M., Bacon, R., & Jarno, A. 2011, in Astronomical Society of the Pacific Conference Series, Vol. 442, Astronomical Data Analysis Software and Systems XX, ed. I. N. Evans, A. Accomazzi, D. J. Mink, & A. H. Rots, 257
 Tokovinin, A. 2002, PASP, 114, 1156
 Trujillo, I., Aguerri, J. A. L., Cepa, J., & Gutiérrez, C. M. 2001a, MNRAS, 321, 269
 Trujillo, I., Aguerri, J. A. L., Cepa, J., & Gutiérrez, C. M. 2001b, MNRAS, 328, 977
Turner, J. E. 2010, Canary Islands Winter School of Astrophysics, Vol. XVII, 3D Spectroscopy in Astronomy, ed. E. Mediavilla, S. Arribas, M. Roth, J. Cepa-Nogué, & F. Sánchez (Cambridge University Press), 87–125
 van der Walt, S., Colbert, S. C., & Varoquaux, G. 2011, Computing in Science & Engineering, 13, 22
 Vio, R. & Andreani, P. 2016, A&A, 589, A20
 Wall, J. V. 1979, QJRAS, 20, 138
Weilbacher, P. M., Streicher, O., Urrutia, T., et al. 2012, in Society of Photo-Optical Instrumentation Engineers (SPIE) Conference Series, Vol. 8451, Society of Photo-Optical Instrumentation Engineers (SPIE) Conference Series, 84510B
 Weilbacher, P. M., Streicher, O., Urrutia, T., et al. 2014, in Astronomical Society of the Pacific Conference Series, Vol. 485, Astronomical Data Analysis Software and Systems XXIII, ed. N. Manset & P. Forshay, 451
 Whiting, M. T. 2012, MNRAS, 421, 3242
 Wisotzki, L., Bacon, R., Blaizot, J., et al. 2016, A&A, 587, A98
Zackay, B. & Ofek, E. O. 2015, ArXiv e-prints [arXiv:1512.06872]