Beyond Stacking: A Maximum-Likelihood Method to Constrain Radio Source Counts Below the Detection Threshold
Abstract
We present a statistical method based on a maximum-likelihood approach to constrain the number counts of extragalactic sources below the nominal flux-density limit of continuum imaging surveys. We extract flux densities from a radio map using positional information from an auxiliary catalogue and show that we can model the number counts of this undetected population down to flux-density levels well below the detection threshold of the radio survey. We demonstrate the capabilities that our method will have with future-generation wide-area radio surveys by performing simulations over various sky areas. We show that it is possible to accurately constrain the number counts of the simulated distribution down to one tenth of the flux noise rms with a sky area of just 100 deg². We then test the application of our method using data from the Faint Images of the Radio Sky at Twenty-Centimeters survey (FIRST). We extract flux densities from the FIRST map, sensitive to 150 µJy beam⁻¹ (1σ), using the positional information from a catalogue in the same field, also acquired at the same frequency, sensitive to 12 µJy beam⁻¹ (1σ). Implementing our method with known source positions, we are able to recover the correct differential number counts of the noise-dominated FIRST map fluxes down to a flux-density level one tenth of the FIRST detection threshold.
keywords:
cosmology: observations — galaxies: statistics — methods: data analysis — radio continuum: galaxies

1 Introduction
Large astronomical surveys have become fundamental to understanding the Universe on the largest scales. Widely known examples are the Sloan Digital Sky Survey (SDSS; York:2000), the Faint Images of the Radio Sky at Twenty centimeters survey (FIRST; Becker:1994) and the Two Micron All Sky Survey (2MASS; Skrutskie:2007), all of which produced a plethora of results applicable to essentially all fields of astronomy, demonstrating how powerful wide and deep surveys are, and how promising the next generation of surveys will be. Powerful statistical tools will provide an invaluable way to interpret the observations of next-generation radio telescopes, capable of observing at higher sensitivities and with higher mapping speeds, covering unprecedentedly large areas of the sky.
One of the most immediate quantities to be derived from astronomical surveys is the source counts, dN/dS, the number of sources per flux interval per unit sky area. Longair:1966 first showed that cosmological evolution can be studied from such a basic quantity, and since then it has been shown that the source counts can be used to constrain source populations and galaxy evolution (e.g. Rowan:1970; Condon:1984, and references therein).
While source counts might appear to make full use of astronomical images, the lowest flux-density level at which the counts can be calculated is limited by both instrumental noise from the detectors and Poisson noise from the sky. However, using the statistical properties of the noise itself, one can estimate the characteristics of the sources that lie below the detection limit. A relatively mundane example is the technique of stacking (e.g. Dunne2009; Karim2011), where a collection of sources, in spite of being individually undetected, can have their average flux estimated by summing the signals of the respective observations, thereby beating down the (non-pathological) noise. The final flux estimate is frequently used as an approximate measure of the statistical properties of a well-defined sample of sources, known to exist from other wavelengths.
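As an illustration of the stacking idea, the following toy sketch (with invented numbers, not drawn from any survey) injects faint sources into a pure-noise map and recovers their typical flux from the median of the pixel values at the known positions:

```python
import numpy as np

def stack_fluxes(flux_map, positions):
    """Median-stack the map values at known source positions.

    flux_map  : 2-D array of pixel flux densities (noise-dominated map).
    positions : (row, col) pixel indices from an external catalogue.
    Returns the median stacked flux, an estimate of the typical flux of
    the individually undetected population.
    """
    fluxes = np.array([flux_map[r, c] for r, c in positions])
    return np.median(fluxes)

# Toy demonstration: sources of true flux 0.2 buried in unit-rms noise.
rng = np.random.default_rng(0)
flux_map = rng.normal(0.0, 1.0, size=(100, 100))
positions = [(r, c) for r in range(0, 100, 5) for c in range(0, 100, 5)]
for r, c in positions:
    flux_map[r, c] += 0.2      # inject sources far below the 1-sigma noise
stacked = stack_fluxes(flux_map, positions)   # estimate of the injected flux
```

Note that the stack recovers only a single average (here, median) flux; the method developed in this paper instead retains information on the distribution of the undetected fluxes.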
Over the past decade many authors have used stacking techniques to extract the average properties of populations of extragalactic sources selected in a different waveband. This has allowed the study of objects well below the usual 3–5σ detection threshold of a given survey, providing information on star-formation rates and/or accretion activity inaccessible to standard flux-limited source catalogues. However, stacking also discards information on the distribution of sources below the flux-density limit of a given survey. A more general approach is to analyse the statistical characteristics of the noise itself, from which conclusions can be drawn about the shape of the source counts in the flux-density regime at or below the noise level. The method outlined in this paper can thus be used on the wealth of existing and forthcoming data at all wavelengths to extract real physical understanding of the sub-threshold source populations.
There are currently a number of methods to estimate faint radio source counts in the low flux-density regime, down to µJy levels. In the presence of confusion noise, Scheuer:1957 first showed that a P(D) analysis allows one to statistically measure the source counts from a confusion-limited survey. This method has recently been used to measure number counts down to 1 µJy at 3 GHz (Condon:2012). With the advent of more sensitive and higher-resolution radio telescopes, the level of confusion noise drops significantly, so the extra information from sources below the detection limit will usually be dominated by instrumental noise instead of confusion noise. In this paper we assume surveys which are not limited by confusion, so a P(D) analysis would not be necessary.
Crawford:1970 (see also Murdoch:1973 and Hogg1998) used a maximum-likelihood method to constrain faint galaxy counts at levels above the confusion noise, and it has since been implemented a number of times at submillimetre (Vieira2010; Austermann2009) and radio wavelengths (Fomalont:1993; Windhorst:1993; Fomalont:2002). Although this is a robust method, the drawback is that it is to be used explicitly on sources with flux densities above some detection limit, typically 5σ. Fomalont:1988 report a method which constrains radio source counts by analysing the variances in radio maps at flux densities below the detection threshold, allowing one to measure counts down to 1σ (Fomalont:1993); although powerful, we are interested in going to even deeper flux-density levels, making use of the upcoming availability of catalogues with several millions of positions where radio galaxies are known to exist.
In this paper we develop a technique based on a maximum-likelihood approach that is a hybrid between a stacking analysis and a pure background-noise analysis. The method presented here is similar to stacking in that we use positions from an external catalogue to extract fluxes from a noisy map, but it goes beyond stacking in that we are able to make more detailed statements about the undetected population: namely, we are able to constrain their differential number counts. Put simply, the combined measured fluxes from the noisy map, at positions where we know that galaxies exist, should provide more information than just the average, and we aim to extract that information with this new method. However, we explicitly note that while no general remarks about the overall source population below the detection level can be immediately drawn, the method offers the unique opportunity to build partial source counts for specific populations, using previously known source positions. For example, given a catalogue obtained from observations taken at ultraviolet (UV) wavelengths, one can extract fluxes from a noisy radio map at the UV-detected positions, apply the method presented here, and obtain the number-count contribution of UV sources at radio wavelengths. In this way, our method cannot be compared on a one-to-one basis to the P(D) method, nor can it be interpreted to return the intrinsic number counts of the band in question (e.g. radio in the previous example). On the other hand, as one increases the multi-wavelength coverage of a given sample, it is hoped that a clear picture of the global distribution can be achieved. Considering the new generation of deep, ultra-wide-field radio surveys which will be performed over the coming years, we develop this technique with radio surveys in mind, and exemplify its use with current state-of-the-art observations. A robust implementation of our method requires very large area surveys, with an expected large number of undetected galaxies in the images, and multi-wavelength data capable of providing catalogues with large numbers of these undetected galaxies. This makes future large radio continuum surveys an ideal candidate for this statistical analysis.
The Square Kilometre Array (SKA) Phase 1 mid-frequency array will have arcsecond-level angular resolution and a continuum sensitivity of about 0.7 µJy hr⁻¹ (Dewdney:2013), allowing it to reach µJy rms sensitivities over most of the visible sky and therefore to detect millions of faint galaxies over large areas of the sky (see e.g. JarvisRawlings2004; Wilman2008; Wilman2010). However, given the timetable for the SKA, we should also consider more near-term surveys, which are scheduled to be completed before the end of the decade.
On the shortest timescale, the Low-Frequency Array (LOFAR; vanHaarlem2013) will conduct a tiered survey at low radio frequencies (30–200 MHz), reaching µJy rms flux densities for the first time at such low frequencies. The Karoo Array Telescope (MeerKAT; Booth:2009) will be able to perform GHz-frequency observations from 2016. The MeerKAT continuum survey (see e.g. Jarvis2012) will conduct a two-tier survey to flux-density limits of 0.1 µJy and 1 µJy rms over correspondingly different areas of sky.
Two additional surveys, which are expected to come online within the next few years, will image upwards of millions of sources at cm wavelengths, with rms levels of 10 µJy beam⁻¹. The Evolutionary Map of the Universe survey (EMU; Norris:2011) will be carried out using the Australian Square Kilometre Array Pathfinder telescope (ASKAP; Johnston:2008) and is predicted to detect up to 70 million galaxies in the Southern sky in images reaching an rms of 10 µJy. With the installation of the APERTIF receiver (Oosterloo:2009) on the Westerbork Synthesis Radio Telescope (WSRT), the Westerbork Observations of the Deep Apertif Northern sky survey (WODAN) will detect millions of galaxies in the Northern hemisphere at 1.4 GHz with similar depth and resolution (Rotter:2011).
All the above instruments will have arcsecond-scale resolutions, so that confusion noise (given by the integrated contribution from undetected sources; Condon:2009) and "natural" confusion (when there is a high chance of source overlap; Condon:2012) are expected to lie at much lower flux levels than the corresponding instrumental noise, creating ideal conditions for the application of our method. Based on the simulations of Wilman2008 and Wilman2010, which agree with the latest results on the faint source counts (e.g. Simpson et al. 2012), EMU will have around 70 beams per source at the 10 µJy rms of the survey. With a resolution of 3 arcsec at the depth of the proposed MeerKAT MIGHTEE survey of 1 µJy rms, we expect 30 beams per source. Given that radio sources at these flux-density levels tend to be compact (Mux:2005), we will again be in the realm of instrumental confusion rather than natural source confusion.
2 Statistical description
The objective of this analysis is to constrain the source number density, dN/dS, for a population of galaxies below the detection threshold (usually 3 to 5 times the rms noise) of a given radio survey (as mentioned before, this analysis can be extended to other wavelengths). We can take advantage of the knowledge of the positions of sources, undetected in the radio, from other (e.g., optical, near-infrared) surveys. By measuring the radio flux densities at these known positions, we will obtain noise-dominated flux-density estimates between some minimum and some maximum value. We will assume that the measured flux at each of the catalogue positions corresponds to the flux density from one galaxy plus noise (i.e. we neglect the possibility that more than one galaxy dominates the flux in any given pixel). We can then bin the measured galaxy flux densities into flux-density bins. Note that we will not consider in this analysis the possible angular correlations in the number density of galaxies. This extra information could further constrain the model, but it is usually only useful in the confusion-noise-limited case (the P(D) analysis mentioned earlier). We also note that the "detection" image (i.e. the image from which the flux densities are extracted) will already implicitly contain this clustering information.
If we can calculate the probability, P, of obtaining a given number of galaxies k_j in bin j given our proposed model dN/dS, then Bayes' theorem states that the probability of the model given the data is, up to some normalisation, the probability of the data given the model:
P\!\left(\frac{dN}{dS}\,\middle|\,\{k_j\}\right) \propto P\!\left(\{k_j\}\,\middle|\,\frac{dN}{dS}\right) \qquad (1)
where \{k_j\} is our data.
2.1 Probability function
We will now calculate the theoretical expected distribution. Let us start by assuming that the number of galaxies in a given patch of sky obeys a Poisson distribution, with an average number per flux interval given by dN/dS (after integrating over the area), depending only on the flux S. Ignoring the effect of noise for the moment, the probability of observing N galaxies over the observed sky area in the flux bin between S and S + \Delta S is given by
P(N) = \frac{\mu^{N}\, e^{-\mu}}{N!} \qquad (2)
with \mu = \frac{dN}{dS}\,\Delta S and dN/dS assumed constant across the interval \Delta S. For an infinitesimal interval dS it should be enough to consider only the possibility of having zero galaxies in the bin, or at most one galaxy, for which the probabilities would be:
P(0) = 1 - \frac{dN}{dS}\,dS, \qquad P(1) = \frac{dN}{dS}\,dS \qquad (3)
Adding noise, the measured flux for a given galaxy will be:
S_m = S + n \qquad (4)
where S is the underlying "real" flux of the galaxy and n the experimental noise, which is assumed to obey a Gaussian distribution with average zero and variance \sigma^2. Taking again the case of an infinitesimal interval, it is enough to consider the probability of finding one single galaxy with measured flux in the interval [S_m, S_m + dS_m], which is given by
P_1(S_m)\,dS_m = \left[\int_{S_{\rm min}}^{S_{\rm max}} \frac{dN}{dS}\, \frac{e^{-(S_m - S)^2/2\sigma^2}}{\sqrt{2\pi\sigma^2}}\, dS\right] dS_m \qquad (5)
Here, S_{\rm min} and S_{\rm max} will be set by the model we choose, and dN/dS is assumed to equal zero outside this range. The probability of finding k galaxies in some larger, arbitrary range of measured flux between X_a and X_b can be found by taking into account the different possibilities for distributing those k galaxies in the interval:
P(k) = \frac{\mu_{ab}^{\,k}\, e^{-\mu_{ab}}}{k!}, \qquad \mu_{ab} = \int_{X_a}^{X_b} dS_m \int_{S_{\rm min}}^{S_{\rm max}} \frac{dN}{dS}\, \frac{e^{-(S_m - S)^2/2\sigma^2}}{\sqrt{2\pi\sigma^2}}\, dS \qquad (6)
This is the salient equation of our method: with it, we can calculate the probability of any model dN/dS reproducing a binned distribution from a set of measured, noise-dominated flux densities. Note that for a large number of galaxies this distribution basically translates into a Gaussian.
Similar forms of eq. 6 can be found in Borys2003, Laurent2005 and Coppin2006, where the number counts of millimetre and submillimetre surveys are analysed. Note, however, that our application of the above equation is essentially different, since in the instrumental-noise-dominated regime we would not be capable of fitting for the number counts if the positions of the galaxies were unknown.
Now, if dN/dS were constant across the interval in the above equation, this would give
P(k) = \frac{\mu^{k}\, e^{-\mu}}{k!}, \qquad \mu = \frac{dN}{dS}\,\frac{1}{2}\int_{X_a}^{X_b} \left[\mathrm{erf}\!\left(\frac{S_{\rm max} - S_m}{\sqrt{2}\,\sigma}\right) - \mathrm{erf}\!\left(\frac{S_{\rm min} - S_m}{\sqrt{2}\,\sigma}\right)\right] dS_m \qquad (7)
which, for the zero-noise case, would fall back into the Poisson distribution:
P(k) = \frac{\left[\frac{dN}{dS}\,(X_b - X_a)\right]^{k} e^{-\frac{dN}{dS}(X_b - X_a)}}{k!} \qquad (8)
2.2 Simulation
We will now simulate the expected distribution using a Monte-Carlo approach and compare it to the function obtained above. Note that the probability distribution is, of course, only a function of the model assumed, not of the data. We simulate the expected probability through the following steps:

1. Set a model parameterized as
\frac{dN}{dS} = k\,S^{-\gamma}, \qquad S_{\rm min} \le S \le S_{\rm max} \qquad (9)
2. Define an arbitrary set of flux-density bins, which will be held constant throughout the process.
3. Generate a set of galaxy fluxes between S_{\rm min} and S_{\rm max} according to the model above.
4. Add random noise to each flux, assuming a Gaussian distribution for the noise.
5. Bin the resultant fluxes and record the number in each bin.
6. Repeat the process a large number of times, starting from step (3).
In this way we obtain a histogram of the number of galaxies in each bin, which gives us a probability distribution for each bin.
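The Monte-Carlo loop above can be sketched as follows (a minimal illustration; the parameter values here are arbitrary and not those used in the paper, and the function names are ours):

```python
import numpy as np

def draw_power_law(rng, n, s_min, s_max, gamma):
    """Draw n fluxes from dN/dS proportional to S**-gamma via inverse CDF."""
    a = 1.0 - gamma
    u = rng.random(n)
    return (s_min**a + u * (s_max**a - s_min**a)) ** (1.0 / a)

def one_realisation(rng, k, gamma, s_min, s_max, area, sigma, bin_edges):
    """Steps (3)-(5): draw a Poisson number of sources from the model,
    add Gaussian noise to each flux, and bin the measured fluxes."""
    a = 1.0 - gamma
    mean_n = area * k * (s_max**a - s_min**a) / a   # integral of k * S**-gamma
    n = rng.poisson(mean_n)
    measured = (draw_power_law(rng, n, s_min, s_max, gamma)
                + rng.normal(0.0, sigma, size=n))
    counts, _ = np.histogram(measured, bins=bin_edges)
    return counts

# Step (6): repeat many times to build per-bin count distributions.
rng = np.random.default_rng(42)
edges = np.logspace(-1, 2, 11)                      # 10 logarithmic bins
realisations = np.array([one_realisation(rng, 50.0, 1.8, 0.1, 50.0,
                                         10.0, 1.0, edges)
                         for _ in range(200)])
```

Histogramming `realisations[:, j]` over the repeated realisations then approximates the per-bin probability distribution that is compared against eq. (6).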
We can now compare these results with eq. (6) above. For the example discussed here, we chose a model with fixed values of S_{\rm min}, S_{\rm max}, k and \sigma. The simulation was done over an area of 10 deg² and binned logarithmically. We iterated through the steps a large number of times, resampling both the source fluxes and the noise distribution at each iteration, in order to generate probability distributions for each bin. We then calculated analytic values using eq. 6 for each bin, given the same model, bin sizes and sky area. This comparison can be seen in Fig. 1.
3 Parameter Estimation Method
We aim to constrain the four parameters, k, \gamma, S_{\rm min} and S_{\rm max}, of the model described above, using only the information from a noisy distribution. We assume that the form of the parameterization we choose can adequately model the number counts. If we denote the four unknown variables as \boldsymbol\theta, the probability that a given set of model parameters correctly describes the data is:
P(\boldsymbol\theta \,|\, \{k_j\}) = \frac{P(\{k_j\} \,|\, \boldsymbol\theta)\, P(\boldsymbol\theta)}{P(\{k_j\})} \qquad (10)
where we are interested in the posterior distribution P(\boldsymbol\theta \,|\, \{k_j\}); P(\{k_j\} \,|\, \boldsymbol\theta) is the likelihood, P(\boldsymbol\theta) denotes the priors, and the denominator is the normalising Bayesian evidence term. In this analysis we ignore the Bayesian evidence term, since we are only solving for parameter values and not comparing different parameterizations; in this case the Bayesian evidence term is a constant (note, however, that in the event that we want to perform model selection this term could be included). The likelihood is calculated for each bin using eq. (6). So all we need to do in order to constrain the four parameters is to sample from the parameter space, within some set of priors, and at each iteration over the parameter space compute the likelihood using the given set of observed, binned, noise-dominated fluxes \{k_j\}. The total probability of the given set of model parameters is then just the product of the probabilities for each bin.
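In practice it is numerically safer to sum log-probabilities than to multiply probabilities across bins. A minimal sketch of this bookkeeping, using a plain Poisson per-bin probability as a stand-in for eq. (6) (i.e. omitting the noise convolution, purely to show the structure; all names are ours):

```python
import math

def expected_counts(theta, bin_edges, area):
    """Expected sources per flux bin for dN/dS = k * S**-gamma,
    integrated analytically over each bin (noise convolution omitted)."""
    k, gamma, s_min, s_max = theta
    a = 1.0 - gamma
    mu = []
    for lo, hi in zip(bin_edges[:-1], bin_edges[1:]):
        lo, hi = max(lo, s_min), min(hi, s_max)
        mu.append(area * k * (hi**a - lo**a) / a if hi > lo else 0.0)
    return mu

def log_likelihood(theta, observed, bin_edges, area):
    """Total log-likelihood: sum over bins of log P(k_j | theta)."""
    total = 0.0
    for n, m in zip(observed, expected_counts(theta, bin_edges, area)):
        if m <= 0.0:
            if n > 0:
                return -math.inf    # sources observed where none are expected
            continue
        total += n * math.log(m) - m - math.lgamma(n + 1)   # log Poisson pmf
    return total
```

A sampler then only needs to call `log_likelihood` for each proposed \boldsymbol\theta and compare values, e.g. between a model matching the observed counts and a badly normalised one.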
3.1 MCMC approach
A direct gridding approach, where one generates four different grids spanning the prior range of each parameter and calculates the likelihood at each grid index, is effective but quickly becomes computationally inefficient as the grid resolutions are increased. An obvious solution is to implement a Markov chain Monte Carlo (MCMC) analysis. There is a weak degeneracy between the four variables we are solving for, so a standard Metropolis-Hastings MCMC approach has a tendency to "get stuck" in local maxima, as the degeneracy causes the target probability distribution to be multimodal. This creates an inadequate sampling of the parameter space, and ultimately non-convergent or slowly convergent chains. We find that a reasonable solution is to employ a Parallel Tempering (PT) MCMC algorithm, as discussed in Liu:2001.
The PT algorithm is similar to the Metropolis-Hastings algorithm in that we use the same acceptance/rejection criteria, with eq. (6) as our likelihood function, but differs in that the previous parameter state is not necessarily from the last iteration of the chain in question. The PT algorithm is essentially a way to swap parameter states between a ladder of chains running in parallel, where each chain has a different "temperature" T; we define the parameter \beta = 1/T. The probability distribution sampled by each chain is the target distribution raised to the power \beta. The first chain has \beta = 1 (the target distribution), with \beta decreasing as one moves down the ladder of chains, effectively broadening the probability distribution with each decrease in \beta. In our analysis we ran six chains in parallel, with linearly decreasing \beta values. The chains to be swapped were chosen by generating a random uniform integer i between 1 and 5, where we propose to swap parameter states between chains i and i+1. For chains i and i+1, the probability that a swap occurs is equal to
\alpha_{\rm swap} = \min\left[1,\ \left(\frac{\mathcal{L}(\boldsymbol\theta_{i+1})}{\mathcal{L}(\boldsymbol\theta_{i})}\right)^{\beta_i - \beta_{i+1}}\right] \qquad (11)
and a swap is carried out if \alpha_{\rm swap} is greater than or equal to a randomly generated uniform number.
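A single swap step can be sketched as follows (our own minimal implementation of the standard PT swap rule; `log_post` stands for any function returning the log of the untempered posterior):

```python
import math
import random

def propose_swap(log_post, states, betas, rng):
    """Attempt one parallel-tempering swap between a random adjacent pair
    of chains.  states[i] holds chain i's current parameters and betas[i]
    its inverse temperature (betas[0] = 1 targets the true posterior).
    Chain i samples from posterior**betas[i], so the swap acceptance
    ratio is exp[(beta_i - beta_{i+1}) * (log_post_{i+1} - log_post_i)]."""
    i = rng.randrange(len(states) - 1)              # pick chains i and i+1
    lp_i, lp_j = log_post(states[i]), log_post(states[i + 1])
    log_alpha = (betas[i] - betas[i + 1]) * (lp_j - lp_i)
    if rng.random() < math.exp(min(0.0, log_alpha)):
        states[i], states[i + 1] = states[i + 1], states[i]
        return True
    return False

# Example: the hot chain (beta = 0.5) currently holds a higher-posterior
# state than the cold chain, so the proposed swap is always accepted.
states = [2.0, 0.0]
betas = [1.0, 0.5]
accepted = propose_swap(lambda x: -x * x, states, betas, random.Random(0))
```

Swaps let high-posterior states found by the broad, hot chains migrate to the cold chain, which is what allows the sampler to escape local maxima of a multimodal posterior.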
With this method we were able to efficiently sample the entire parameter space of all four variables. In the end, each chain ran to a fixed minimum length, and only results from the first chain are recorded. We executed the code many times, each time varying the initial values and the interval at which parameter states are swapped; swaps between two chains were proposed between every 30 and 10 000 iterations. (Note that in the limit of high swapping intervals, this method effectively becomes a Metropolis-Hastings algorithm.) In addition, we ran a number of simulations using the more standard Metropolis-Hastings algorithm. For our final analysis output, we mixed all the chains from both the PT and Metropolis-Hastings methods, discarding a burn-in period from each individual chain.
3.2 Prior Selection
In general, MCMC solutions depend directly on the choice of priors. In our analysis we set conservative priors, as we assume one will be able to gain ample insight from the detected population (i.e. sources with flux densities above the detection threshold). We also quantify the minimum flux density at which our method will work by relating S_{\rm min} to the shot-noise component. Given that the integrated flux fluctuations from undetected galaxies should be below the instrumental noise, we impose the following condition on S_{\rm min} when we choose our initial parameter guesses for the MCMC:
\Omega_{\rm pix} \int_{S_{\rm min}}^{S_{\rm max}} S^2\, \frac{dN}{dS}\, dS \;\le\; \epsilon\, \sigma^2 \qquad (12)
where \Omega_{\rm pix} is the sky area subtended by one pixel and \epsilon is an arbitrarily small number less than 1. Thus, as the resolution of the telescope increases, the minimum flux at which our analysis will work decreases. Decreasing \epsilon improves the errors on the model fitting; in our simulations, which are performed down to µJy flux densities, we find a suitably small \epsilon to be acceptable.
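For the power-law model, the shot-noise integral in this condition is analytic, so the constraint on S_{\rm min} is cheap to evaluate when proposing initial parameter guesses. A sketch (our own variable names; the units need only be mutually consistent):

```python
def shot_noise_variance(k, gamma, s_min, s_max, omega_pix):
    """Pixel variance contributed by undetected sources:
    omega_pix * integral of S**2 * k * S**-gamma dS over [s_min, s_max],
    evaluated analytically (valid for gamma != 3)."""
    b = 3.0 - gamma
    return omega_pix * k * (s_max**b - s_min**b) / b

def s_min_allowed(k, gamma, s_min, s_max, omega_pix, sigma, eps):
    """True if the proposed S_min satisfies the prior condition that the
    shot noise stays below eps times the instrumental variance."""
    return shot_noise_variance(k, gamma, s_min, s_max, omega_pix) <= eps * sigma**2
```

With a smaller pixel solid angle (higher resolution), the same counts model yields a smaller shot-noise variance, so a lower S_{\rm min} passes the check, mirroring the statement in the text.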
In addition to the aforementioned constraints, we also imposed uniform priors on each parameter separately.
Area (deg²)  k (µJy deg⁻²)  γ  S_min (µJy)  S_max (µJy)

1  
10  
100  
Target:  1.00  20.00 
3.3 Simulation Results
We test the efficacy of our method with a PT MCMC implementation by performing simulations over various sky areas with a constant model, effectively increasing the average total number of sources with each increase in sky area. We simulated a noisy distribution parameterized in the same way as discussed in Section 2.2, used 10 bins logarithmically spaced between 1.0 and 50.0 µJy, and imposed the prior constraints discussed in the preceding section. With the simulated, noise-dominated sources of a 1 deg² area with our chosen model, we are able to recover all four of the model parameters within a confidence interval of 68%. As we increase the sky area to 100 deg², the errors on the marginal posterior distributions fall by roughly 1/3; the best-fit values for this many sources vary from the true model parameters by at most 7% (e.g. in Table 1). Results of our simulations can be seen in Table 1 and Fig. 2. These results demonstrate the potential of our method under ideal conditions; next we demonstrate our method using real, noisy radio data.
4 Analysis of Faint Radio Data
We now turn to the application of our method to a real data set. In order to do this, we need noisy radio data with source flux densities extending well below the 5σ level. We obtained a noisy distribution of sources by extracting peak fluxes from a shallow radio image using the positions from a significantly deeper catalogue. This was done using data from the 2 deg² Cosmological Evolution Survey (COSMOS; Scoville:2007) and the Faint Images of the Radio Sky at Twenty centimeters survey (FIRST; Becker:1994).
FIRST is a large-area survey using the NRAO Very Large Array (VLA). (The National Radio Astronomy Observatory is operated by Associated Universities, Inc., under cooperative agreement with the National Science Foundation.) We used FIRST tiled images (Becker:1995ei) covering the COSMOS field, which have a typical rms of 150 µJy beam⁻¹, 5 arcsec resolution, and a source detection threshold of 1 mJy. We extracted fluxes from this map using positions from the VLA-COSMOS survey, which is 13 times deeper than the FIRST survey.
The VLA-COSMOS Survey, centered on the COSMOS field, comprises both a deep (2.5 arcsec resolution) and a large (1.5 arcsec resolution) VLA project at 1.4 GHz. We utilised the Large Project Catalogue (Schinnerer2007), which was updated and revised to account for bandwidth smearing (Bondi:2008). The catalogue contains only sources with peak flux densities above 5σ, with 1σ ≈ 12 µJy beam⁻¹.
An example of non-detections in the FIRST image can be seen in Fig. 3. Fig. 4 shows the 5σ fluxes from the VLA-COSMOS merged catalogue against those extracted from the FIRST map using the VLA-COSMOS positional information. Clearly the FIRST fluxes are noise-dominated, with values extending well below zero.
Area (deg²)  k (µJy sr⁻¹)  γ  S_min (µJy)  S_max (µJy)

Random Positions:  2.16  506  
FIRST:  2.16  
Target:  2.16 
4.1 FIRST Map Results
Bondi:2008 performed a detailed analysis of the number-count properties of the sources in the VLA-COSMOS large project catalogue. In their analysis they parameterized the number counts as a sixth-order polynomial, so in order to include their results in our work, we fit their values to a function of the form dN/dS = k\,S^{-\gamma}. For this fitting, we chose only the data with flux-density values in the range given in Table 3 of Bondi:2008. The noise properties of the VLA-COSMOS maps used to generate the catalogues vary as a function of radius, with the highest 5σ detection limit at the edges of the map. An upper limit for the sources we use in this analysis is set at the 5σ detection limit of the FIRST survey (in Becker:1995ei, the FIRST team quoted 1 mJy as the detection limit for the FIRST maps). Our parameterized fit to the Bondi:2008 data agrees reasonably well, and can be seen in Figure 5. We use these values to compare with our PT MCMC analysis of the noise-dominated FIRST data.
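A power-law fit to tabulated counts of this kind can be obtained as a linear least-squares fit in log-log space, where dN/dS = k\,S^{-\gamma} is a straight line. A sketch on synthetic data (the numbers below are invented for illustration, not the Bondi et al. values):

```python
import numpy as np

def fit_power_law(s, dnds):
    """Least-squares fit of dN/dS = k * S**-gamma in log-log space,
    where log10(dN/dS) = log10(k) - gamma * log10(S) is linear.
    Returns (k, gamma)."""
    slope, intercept = np.polyfit(np.log10(s), np.log10(dnds), 1)
    return 10.0**intercept, -slope

# Synthetic counts with k = 2000, gamma = 1.7, recovered by the fit.
s = np.logspace(1, 3, 20)
dnds = 2000.0 * s**-1.7
k_fit, gamma_fit = fit_power_law(s, dnds)
```

For real tabulated counts one would restrict the fit to the flux-density range of interest before taking logarithms, as done with the Bondi et al. values in the text.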
We constrain the number-count properties of the FIRST data in a similar way to the simulations discussed in the previous section. We performed a PT MCMC analysis using the noise-dominated FIRST extracted fluxes and set flat priors on each of the four unknown parameters. We generated multiple chains and mixed them to produce the results presented here. The analysis was done using 7 linearly spaced flux-density bins. In Table 2 we show the best-fit values as computed from each parameter's marginal posterior distribution, and Fig. 5 shows the best-fit curve.
As a check of the results of our method using known source positions, we performed the analysis again using flux densities from a blind extraction on the FIRST image. We extracted flux densities for the same number of galaxies as in the catalogue, but at random pixel positions, and applied the same prior constraints, binning intervals and PT MCMC conditions. The marginal posterior distributions for each of the parameters returned from this analysis have larger errors, and the best-fit values are far from the expected values, as can be seen in Fig. 5.
5 Simulations with Nonconstant Noise Variance
Throughout this paper we have considered distributions dominated by Gaussian noise with a constant rms. This is an idealisation, since radio maps typically have a varying rms which primarily depends on the depth of the map, a function of multiple, non-uniform telescope pointings. In addition, bright sources will also affect the local rms of any given section of the map. Future-generation radio surveys will be reduced with state-of-the-art algorithms (e.g. Smirnov2011a; Smirnov2011b) which should greatly minimize non-uniformities, but small fluctuations in the instrumental noise rms are still expected.
The most direct approach to dealing with this changing rms is to divide the sky into sections where the rms can be taken to be very close to constant. If each section of sky is similar in size, so that the expected number of sources per section is roughly constant, we can apply the previously described method to each section, taking into account that the total probability of the model should be the product of the probabilities for each section. However, if the sections of sky vary in size, additional steps must be taken.
Accounting for varying section sizes begins by binning the extracted fluxes in "rms noise bins" according to the rms noise associated with each extracted flux. This introduces an extra step in the initial stage of our method: in addition to extracting noise-confused source flux densities at known positions, the rms noise of the area adjacent to each source must also be "extracted". This rms binning allows one to perform the previous probability analysis for each group of galaxies approximated to have the same associated noise (i.e. each group of galaxies in the same rms bin). We can still assume a global model for the source counts, dN/dS, but the model fitted to each rms noise bin l, with rms given by \sigma_l, should now be f_l\,dN/dS, where f_l is the fraction of the total number of galaxies that fall into that rms bin. The total probability for a given model is then the product of the probabilities for each noise bin l with model f_l\,dN/dS. This is a straightforward extension of the method outlined in this paper: given the range of rms values, we simply arrange the flux densities as a function of the binned rms values and calculate the probability in eq. 6 for each increment of \sigma_l. We denote the flux-density bins as before, with index j. The probability distribution of eq. 6 then becomes a function of both flux bin index j and rms noise index l, and contains the extra variable f_l:
P(k_{j,l}) = \frac{\mu_{j,l}^{\,k_{j,l}}\, e^{-\mu_{j,l}}}{k_{j,l}!}, \qquad \mu_{j,l} = f_l \int_{\Delta S_j} dS_m \int_{S_{\rm min}}^{S_{\rm max}} \frac{dN}{dS}\, \frac{e^{-(S_m - S)^2/2\sigma_l^2}}{\sqrt{2\pi\sigma_l^2}}\, dS \qquad (13)
and the total probability for any proposed distribution is
P_{\rm total} = \prod_l \prod_j P(k_{j,l}) \qquad (14)
Thus, with the additional step of binning the noise rms, one can extend the method presented in this paper to account for a varying rms.
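The bookkeeping of eq. (14) is a double product, i.e. a double sum in log space. A structural sketch (the per-bin probability `log_prob_bin` is a placeholder standing in for eq. (13); all names are ours):

```python
def total_log_prob(log_prob_bin, counts_by_rms, sigmas, fractions, theta):
    """Sum log P(k_{j,l}) over rms-noise bins l and flux bins j.

    counts_by_rms[l][j] : observed counts in flux bin j for rms bin l.
    sigmas[l]           : rms noise of bin l.
    fractions[l]        : fraction f_l of the sources in rms bin l.
    log_prob_bin(n, j, sigma, f, theta) is assumed to implement the
    per-bin log-probability of eq. (13)."""
    total = 0.0
    for counts, sigma, f in zip(counts_by_rms, sigmas, fractions):
        for j, n in enumerate(counts):
            total += log_prob_bin(n, j, sigma, f, theta)
    return total

# Structural check with a stub per-bin probability.
stub = lambda n, j, sigma, f, theta: -1.0
value = total_log_prob(stub, [[1, 2], [3]], [1.0, 2.0], [0.6, 0.4], None)
```

The constant-rms method of the earlier sections is recovered as the special case of a single rms bin with f_1 = 1.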
6 Summary and Conclusions
The next generation of radio telescopes will survey huge areas of the sky with high sensitivity and resolution, so statistical methods to analyse the resulting data will become increasingly important. Here we present a new method which is able to constrain the differential number counts of specific source populations down to flux-density levels well below the detection threshold. In order to utilize this method, one needs to know the positions of the noise-dominated sources in question, which will likely come from auxiliary catalogues generated with data taken at other wavelengths. Our method does not provide holistic knowledge of the overall source population below the rms noise, but it does allow one to build partial source counts of sources detected at other frequencies, using the flux densities extracted from noisy maps at known source positions. The method described in this paper models the distribution as a simple power law, but can be readily modified to include more complex models.
We demonstrate, with simulated populations parameterized by four values, what our method will be capable of with increasingly large numbers of sources. With the smallest simulated set of noise-dominated sources, we are able to recover all four of the model parameters within a 68% confidence interval. Increasing the simulated sky area, and hence the number of sources, the best-fit values for our parameterization differ from the input values by at most 7%.
Finally, we show a practical application of our method by performing the analysis on a set of noisy flux densities from the FIRST survey, extracted using well-known positions from the VLA-COSMOS survey, which is 13 times deeper than FIRST. We use the results from a previous analysis of the VLA-COSMOS data, which we find can be well parameterized by a power law, and with our method applied to the noise-dominated FIRST data alone we recover consistent number counts. This analysis was performed using only 506 sources and extends below the rms of the FIRST survey.
Acknowledgements
Support from the Science and Technology Foundation (FCT, Portugal) through the research grants PTDC/FIS-AST/2194/2012 (K.M.W., J.A., M.S.), PEst-OE/FIS/UI2751/2011 (K.M.W., J.A.), PTDC/CTE-AST/105287/2008 (K.M.W., J.A.), PTDC/FIS/100170/2008 (M.S.) and SFRH/BD/51791/2011 (K.M.W.) is gratefully acknowledged. The authors thankfully acknowledge the computer resources, technical expertise and assistance provided by CENTRA/IST. Computations were performed on the cluster "Baltasar-Sete-Sóis", supported by the DyBHo-256667 ERC Starting Grant.