The mass-sheet degeneracy and time-delay cosmography: Analysis of the strong lens RXJ1131-1231

The mass-sheet degeneracy and time-delay cosmography: Analysis of the strong lens RXJ1131-1231

[    [    [
Abstract

We present extended modelling of the strong lens system RXJ1131-1231 with archival data in two HST bands in combination with existing line-of-sight contribution and velocity dispersion estimates. Our focus is on source size and its influence on time-delay cosmography. We therefore examine the impact of mass-sheet degeneracy and especially the degeneracy pointed out by Schneider & Sluse (2013) [1] using the source reconstruction scale. We also extend on previous work by further exploring the effects of priors on the kinematics of the lens and the external convergence in the environment of the lensing system. Our results coming from RXJ1131-1231 are given in a simple analytic form so that they can be easily combined with constraints coming from other cosmological probes. We find that the choice of priors on lens model parameters and source size are subdominant for the statistical errors for measurements of this systems. The choice of prior for the source is sub-dominant at present (2% uncertainty on ) but may be relevant for future studies. More importantly, we find that the priors on the kinematic anisotropy of the lens galaxy have a significant impact on our cosmological inference. When incorporating all the above modeling uncertainties, we find km sMpc, when using kinematic priors similar to other studies. When we use a different kinematic prior motivated by Barnabè et al. (2012) [2] but covering the same anisotropic range, we find km sMpc. This means that the choice of kinematic modeling and priors have a significant impact on cosmographic inferences. The way forward is either to get better velocity dispersion measures which would down weight the impact of the priors or to construct physically motivated priors for the velocity dispersion model.

a]Simon Birrer a]Adam Amara a]Alexandre Refregier Prepared for submission to JCAP

The mass-sheet degeneracy and time-delay cosmography: Analysis of the strong lens RXJ1131-1231


  • Institute for Astronomy, Department of Physics, ETH Zurich
    Wolfgang-Pauli-Strasse 27, 8093, Zurich, Switzerland

E-mail: simon.birrer@phys.ethz.ch, adam.amara@phys.ethz.ch, alexandre.refregier@phys.ethz.ch

Keywords: Gravitational lensing, strong lensing, cosmology, parameter estimation, Hubble constant

ArXiv ePrint: 1511.03662

 

 

1 Introduction

Strong lensing systems and the time delays between different images of the same background source can provide information about angular diameter distance relations (see [3] and review of [4] for the early work). Cosmographic analyses rely on measurements of time delay [see e.g., 5, 6, 7, 8, 9, 10, and the COSMOGRAIL collaboration]111www.cosmograil.org and estimates of the line-of-sight structure and lensing potential. This cosmography technique has been applied to determine the Hubble parameter using different strong lens systems [see e.g. 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23] and also by applying statistics to multiple systems [see e.g. 24, 25, 26]. In the past, some of the measurements have produced a wide range of results for [e.g. see section 8.2 of 21]. One concern has been to evaluate the impact of potential systematic errors. In particular, the mass-sheet degeneracy (MSD) [27] and related degeneracies that cause biases due to model assumptions [e.g. 11, 28, 29, 30, 31] need special consideration. For instance, this has been illustrated by [1] where they show that assuming a power-law lens model can cause significant biasing of results.

In this paper, we introduce a new treatment of the MSD and source reconstruction for cosmographic analyses. This approach integrates information coming from imaging, velocity dispersion, external convergence and time delay measurements. For the choice of data and the parameterization of the lens we follow the work of [22], and we infer the values of the parameters using our recent framework presented in [32]. In our framework we reconstruct the source using shapelet basis sets. This allows us to explicitly set an overall scale for the reconstruction. We will show that this enables us to better disentangle the effects coming from source structure and MSD. This then makes it simpler to robustly combine the information coming from the different data sets.

The paper is organized as follow: In Section 2 we briefly review the principles of time delay cosmography. Section 3 presents the data used in this work. Section 4 describes the details of the lens modeling, including kinematics, likelihood analysis and the source reconstruction technique of [32]. In Section 5 we show that the use of this reconstruction technique turns out to be well designed for mapping out the MSD. Section 6 describes the combined likelihood analysis and posterior sampling. Section 7 discuss the cosmological constraints in terms of angular diameter relations and cosmological parameters. In Section 8, we compare our results to others. We summarize our conclusions in Section 9.

2 Theory

Gravitational lensing is caused by deflection of light by matter. In this section, we review the principles of gravitational lensing and time delay cosmography and introduce our conventions.

2.1 Lensing formalism

The lensing potential at an angular position is given by

(2.1)

where is the convergence and is given by

(2.2)

with

(2.3)

is the critical density and is the physical projected surface mass density. , and are the angular diameter distances from the observer to the lens, to the source and from the lens to the source 222 is not the subtraction . In a flat universe: , where is the transverse co-moving distance., respectively. The deflection angle is and the lens equation, which describes the mapping from the source plane to the image plane is given by

(2.4)

The convergence can also be written as

(2.5)

2.2 Time delays

The Fermat potential is defined as

(2.6)

The excess time delay of an image at with corresponding source position is

(2.7)

where

(2.8)

is referred as the time delay distance. The relative time delay difference between two images positioned at and , the actual observable, is then given by

(2.9)

Line-of-sight (LOS) structures external to the lens also affect the observed time delay distance through additional focusing or de-focusing of the light rays. We parameterize the LOS structure with a single constant mass sheet parameter , the external convergence. The actual time delay distance relates to the one inferred by ignoring the external LOS structure by

(2.10)

3 RXJ1131-1231 system

The quadrupole lens system RXJ1131-1231 (Figure 1) was discovered by [33] and the redshift of the lens and of the background quasar source was determined spectroscopically by [33]. The lens was modeled extensively by [34, 35, 22, 32, 36] with single band images. We use the archival HST ACS WFC1 images in filter F814W and F555W (GO 9744; PI: Kochanek). The filter F814W was also used for lens modeling in [22], [23] and [32]. We make use of the MultiDrizzle product from the HST archive. We use a 160 pixel image centered at the lens position with pixel scale 0.05”. This corresponds to a FOV of 8”.

For the analysis in this work, we take the time delay measurements and uncertainties from [37], namely days, days, and days, where represent the quasar images in Figure 1. This data was used in [22], where they also measure the LOS velocity dispersion of km s, that we use in our analysis.

For the external convergence , we take the estimate of [22] based on relative galaxy counts in the field [38] and their modeled external shear component compared with ray tracing of the Millennium Simulation (see their Figure 6). As their probability density function for is not given in a parameterized form, we use an approximation of their PDF in the form of a skewed normal distribution with mean , standard deviation and skewness . This function is illustrated in Figure 2 and described in Appendix E.

Figure 1: HST ACS WFC1 images in filters F814W (left) and F555W (right). The F814W filter has more high signal-to-noise pixels than the F555W filter. In the F555W filter, the substructure in the Einstein ring and the diffraction spikes of the quasar images are more prominent. The letters A,B,C,D indicate the quasar images for the time delay differences.
Figure 2: Probability density function of the external convergence in the form of a skewed normal distribution. The parameters chosen are designed to match well the probability density function quoted in [22] (their Figure 6).

4 Lens modeling

In this section, we present the parameterization of the lens model, the lens light description, the source reconstruction technique, PSF modeling, the modeling of the lens kinematics and the likelihood analysis.

4.1 Lens model parameterization

For the lens model, we use:

  1. An elliptical power-law mass distribution parameterized as

    (4.1)

    where is the Einstein radius, is the ellipticity and is the radial power-law slope.

  2. A second spherical isothermal profile (Equation 4.1 with fixed and ) centered at the position of the visible companion of the lens galaxy about 0.6 arc seconds away from the center.

  3. A constant external shear yielding a potential parameterized in polar coordinates given by

    (4.2)

    with is the shear strength and is the shear angle.

4.2 Lens light parameterization

The light distribution of the lens is modeled in a parameterized form. We use the same profiles as [22], namely two elliptical Sérsic profiles [39] with common centroid for the central elliptical galaxy and an additional spherical Sérsic profile for the companion galaxy. The intensity profile is parameterized as

(4.3)

where is the amplitude, is a constant such that is the effective half-light radius, is the axis ratio and is the Sérsic index. We use the value of half-light radius as the effective radius in the kinematics modeling of Section 4.5.

4.3 Source surface brightness reconstruction

We use the source reconstruction method presented in [32] based on shapelet basis functions introduced by [40]. To apply this method, three choices have to be made. (1) The shapelet center position, which we fixed to quasar source position. The determination of the quasar source position is explained in detail in Section 4.2 of [32]. (2) The width of the shapelet basis function (see Section 5 for its impact). (3) The maximal order of the shapelet polynomials. We set for modeling and parameter inference. With this, most of the features in the extended source can be modeled. Given these three choices, one can reconstruct the angular scales between and around the center of the shapelet in the source plane.

4.4 PSF modeling

We use four bright stars in the same ACS image to model the PSF. After normalizing for flux, we apply a sub-pixel shift to recenter the stars and then stack. When comparing the individual star images and the stack, we see significant variations that we need to consider in our analysis. To do this by measuring the scatter for each pixel and assume that the scatter in high signal-to-noise pixels is due to a model error that we quantify as a fraction of the flux. This leads to an additional error term, beyond the Poisson and background contribution, that is important close to the center of the bright point sources (see Section 4.6). For the quasar point sources, we use a cutout of the PSF of 111 pixels to cover most of the diffraction spikes. For the extended surface brightness we apply a PSF-convolution kernel of 21 pixels.

4.5 Stellar kinematics

We follow the analysis of [21] for the modeling of the stellar velocity dispersion. The mass profile is assumed to be a spherical symmetric power-law in the form of

(4.4)

where is the density at radius and is a power-law slope of the mass profile (the same as for the lens model in Equation 4.1). The normalization of the mass profile can be expressed in terms of the lensing quantities as

(4.5)

where is the external convergence, is the critical projected density, is the Einstein radius, is the angular diameter distance from the observer to the lens and is the Gamma function. The estimation of the projected velocity dispersion along the line of sight requires a description of the anisotropic velocity component split in radial and tangential component

(4.6)

Massive elliptical galaxies are assumed to have isotropic stellar motions in the center of the galaxy () and radial motions in the outskirts (). A simplified description of the transition can be made with an anisotropy radius parameterization defining as a function of radius as

(4.7)

Assuming a Hernquist profile [41] and an anisotropy radius for the stellar orbits in the lens galaxy, the three-dimensional radial velocity dispersion at radius from Jeans modeling is given by

(4.8)

where is related to the effective radius of the lens light profile by and is a hyper geometric function. The modeled luminosity-weighted projected velocity dispersion is given by

(4.9)

where is the projected radius, is the stellar density and is the projected Hernquist distribution. The luminosity weighted LOS velocity dispersion within an aperture is then (see also equation 20 in [21])

(4.10)

where indicate the convolution with the seeing. In Appendix A we describe in detail how we compute a modeled in a numerically stable way. This calculation assumes no rotational behaviour of the lensing galaxy. Priors on the anisotropic behaviour are discussed in section 6.3.

Equation (4.10) can be expressed as a function of angular scales of and paired with a cosmological dependent angular diameter distance relation and an external convergence factor as

(4.11)

where is capturing all the computation of equation (4.10) without cosmological and external convergence specifications. With this calculation, we see that any estimate of the (central) velocity dispersion is dependent on the ratio of angular diameter distance from us to the source and from the deflector to the source. This fact is important when kinematic modeling is used to infer cosmographic information. We separate in the modeling the angular and the cosmological information. The separability allows us to consistently infer cosmographic information without the need of cosmological priors in the kinematic modeling.

4.6 Likelihood analysis

We estimate the pixel uncertainty in the image with a Gaussian background contribution estimated from an empty region in the image and a Poisson contribution from the model signal scaled by the exposure map . In addition, the modeling uncertainty of the PSF of the bright point sources with amplitude , PSF kernel and model uncertainty coming from the star-by-star scatter is given as

(4.12)

at a pixel i, where is the number of quasar images. All together, the uncertainty for each pixel sums up in quadrature as

(4.13)

For the linear source surface brightness reconstruction is replaced by the image intensity .

The likelihood of an image given a model is

(4.14)

with being the number of pixels in the modeled image and is the normalization

(4.15)

At this stage, it is useful to separate the model into nonlinear parameters and linear parameters . The likelihood of the non-linear parameters is given by

(4.16)

The integral is computed in [32] (their Equation 13) assuming flat priors in , which we adopt.

The likelihood for the time delays is the product of the likelihoods of all relative delays of the quasar pairs

(4.17)

The likelihood of the LOS central velocity dispersion is given by

(4.18)

5 The mass sheet degeneracy

There exists many different degeneracies in strong lens modeling [e.g., 28, 42]. In this section we focus on the MSD [27] and in particular its impact on time delay cosmography as it was pointed out by [1]. As shown by [27], a remapping of a reference mass distribution by

(5.1)

combined with an isotropic scaling of the source plane coordinates

(5.2)

will result in the same dimensionless observables (image positions, image shapes and magnification ratios) regardless of the value of . This type of mapping is called mass-sheet-transform (MST), and shows that imaging data, no matter how good, can not break the MSD.

The additional mass term in MST (Equation 5.1) can be internal to the lens galaxy (affecting the lens kinematics) or due to line-of-sight structure (not affecting the lens kinematics) [see e.g., 28, 29]. The external part of the MST can be approximated by an external convergence , which rescales the time delays accordingly. The external contribution also rescales the source plane. Lens modeling often only explicitly models the internal structure of the lens. The inferred source scale has to be rescaled by the external mass sheet to match the physical scale.

5.1 Source scaling and the MSD

An important parameter in the lens model inference is the physical source scale. Neither the lens model nor the source size are direct observables, but they share the MST in each others inference. Given a lens model, certain source sizes are preferred. The opposite is also true: Given a source size, certain lens models are preferred. This is a direct consequence of the MST (Equation 5.1 and 5.2). Therefore, it is important to control the prior on the assumed source scale in the modeling. A particular source surface brightness reconstruction method, depending on the choice of regularization, basis set, pixel grid size or parameters of the source reconstruction, will potentially favor a certain size of the reconstructed source and therefore may indirectly lead to priors on the internal mass model through the MST. As one does not know a priori the physical scales in the source galaxy, this may lead to significant biases in the inference of the lens model.

We use shapelets [40] as the source surface brightness basis functions as implemented in [32]. These basis functions form a complete basis set when the order goes to infinity. When restricting the shapelet basis to a finite order , the reconstruction of an image depends on the chosen scale of the shapelet basis function. As pointed out by [40], for a given , there is a scale that best fits the data. From Equation 5.2, we see that changes in can be remapped into changes in the lensing potential through the linear parameter . Therefore, since our source reconstruction technique has an explicit scale, we have a tool to walk along the MST.

5.2 Varying source scale in the ACS WFC1 images

We have identified the source scale to have an impact on the inference of the lens model within the MST. To investigate the specific dependence of the shapelet scale in the source reconstruction in combination with lens model parameterization (Section 4.1) in our analysis of RXJ1131-1231, we model the ACS WFC1 F814W and F555W images with different choices of the shapelet scale . For the F814W image, we use the range 0.14” - 0.19” and for the F555W image the range 0.13” - 0.18”. The shapelet order was held constant at . To find the best fit model, we used a particle swarm optimization as used in [32] to maximize the likelihood (Equation 4.16). In this section, we only use the HST images for our modeling. Time-delay and kinematic data will be added in Section 6.

Figure 3 shows the source reconstruction of the best fit models of filter F814W for six different scales . We see that the source reconstructions are very similar but scaled by the relative factors of the chosen shapelet scale. More explicitly, we overlay in Figure 4 the intensity contours of the different source reconstructions rescaled by . We also show the reconstructions for the F555W image, which shows the same behavior. On the right of Figure 4 we over-plot a joint source reconstruction of the two bands in a fake color image. In Appendix B, we present the corresponding normalized residuals for this analysis of the F814W reconstruction.

The difference in the likelihood value for different scales from the imaging data exceeds the 10- level between each modeled scale . This reflect the fact that the chosen lens model parameterization (see Section 4.1) does not allow for the full freedom needed for a perfect transform according to the MST (Equation 5.1). The source scale can not be fixed to an arbitrary value and caution on any scale dependent source reconstruction description is needed. When assigning a prior on and infer this parameter together with all the lens model parameters from the image reconstruction, we are able to very precisely determine the corresponding source scale and the parameters of the given functional form of the lens model.

Figure 3: Reconstructed source surface brightness profiles as a function of shapelet scale for filter F814W. The source reconstructions of the best fit lens model configurations are shown with a given . We see that the features become larger with larger choices of .
Figure 4: Left: Intensity contours of the reconstructed source surface profiles rescaled to fiducial value for the different shapelet scales in filter F814W of Figure 3. The contour lines overlay well. The lens model does adopt to the choice of such that the source reconstruction catches the best scales. Middle: Same as left for the filter F555W. The same behavior can be seen as for F814W. Right: Color composite model of the filters F814W and F555W for a chosen joint lens model.

5.3 Relaxing on the lens model assumption

As pointed out by [1], there can also be an internal component to the MST. Namely when the lens model can not reproduce the underlining internal mass distribution. The assumption of a power-law lens model formally sets the internal part of the MST. The parameters will fit preferentially those models, whose shape, modulo an artificial MST, are the most similar to the underlying mass distribution. The only effect visible in the modeling of the imaging data is on the source scale. The inferred source scale will be different from the one of the true lens model. Any assumed mass distribution which can not be rescaled according to Equation (5.1) can thus potentially lead to biased inferences, in particular on the slope of the mass profile. This also can result in significant biases in the inferred lensing potential and lens kinematics. In particular, it was stated by [1] that the assumption of a power-law lens model can potentially lead to a significant bias in the inference of the time delay distance.

Three approaches to handle the concerns of [1] in performing cosmographic estimates are:

  1. One assumes that the true lens model can be described within the functional form of the chosen parameterization. This is the approach done by [22]. In this case we end up with the potentially biased inference discussed in [1], a situation we want to avoid as good as possible.

  2. One choses a more flexible lens model than a single power-law mass profile. This approach was followed in [23] in response to [1]. Different profile parameterizations may lead to different preferred source scales. It is not guaranteed that a more sophisticated lens model parameterization infers an unbiased result in the cosmographic inference.

  3. Perform simplifications and approximations that lead to greater robustness against known degeneracies. For instance accommodating MST through careful handling of the source size inference.

In this work we chose the third option mentioned above. This option requires the least assumptions on the lens model and a prior is placed on the source size, rather through the functional form of the lens model. In Appendix D we specifically state the process in a Bayesian inference way to make clear our steps and approximations and show that a renormalization of the imaging likelihood for different imposed source scales is needed to explore the impact of plausible internal MST on the cosmological inference.

5.4 Adding lens kinematics

Additional constraints on the lens model can come from kinematic data at a different scale than the Einstein ring. This becomes of particular importance when weakening the constraining power of the lens model, as described in Section 5.3. Lens models with different source scales predict different lens kinematics. The prediction depends on the stellar velocity anisotropy which can not be known from the existing data and the external convergence which has to be inferred separately.

As long as the relative likelihood of additional kinematic data (Equation 4.18) can not compete with the relative likelihood of the different shapelet scales (on the 10- level between the chosen source scales, see Section 5.2), the combined likelihood will be dominated by the lens model assumption. Only when re-normalizing the likelihood of the imaging data for different scales , the kinematic data can have a significant impact in the determination of the lens profile and in particular the lens potential for time-delay cosmography.

6 Combined likelihood analysis

In this section, we discuss how we combine the different data sets and their likelihoods. We showed in the previous section that biases can emerge from choices in the lens and source modeling. These aspects have to be taken into account when the data sets are combined.

6.1 Combining imaging and time delay data

In a first step, we do a joint analysis of the independent measurements of the time delay and imaging data. The combined likelihood is

(6.1)

with the independent likelihoods of Equation (4.16) and (4.17). We do not yet combine the kinematic data at the likelihood level. We sample all the lens model parameters and the time delay distance . We keep the lens light parameters fixed at the final position of the particle swarm process in the MCMC process to achieve a more efficient sampling of the relevant parameters. We included the full flexibility of the lens light parameters on a subset of the MCMC chains and come to the conclusion that the additional covariance of the lens light model on the cosmographic analysis is very minor, i.e. the impact on the uncertainty on is below 0.1%.

From Bayes theorem, the likelihood of the parameters given the data is (modulo a normalization):

(6.2)

We apply flat priors on the parameters , , , , and Mpc.

At this stage, we want to emphasize that there are 3 data points in the time delay measurement compared to several thousands of high signal-to-noise pixels in the imaging comparison. In principle, the provided time delay measurement can not only determine , which is independent of the imaging data but also can partially constrain the lens model. In practice, any even minor bias introduced in the image modeling can out-weigh the constraining power of the two additional time delay measurements.

In the following, we present the results of the analysis of filter F814W. The results of the equivalent analysis of filter F555W can be found in Appendix C. To sample the posterior distribution of the parameter space we use CosmoHammer [43]. We fix the shapelet scale at [0.14”, 0.15”, 0.16”, 0.17”, 0.18”, 0.19”] and do a separate inference of the parameters for each choice of . Figure 5 shows the posterior distribution of some of the parameters for the different choices of . The inferred parameter constraints for different values do not overlap. We see that is very narrowly determined for a given shapelet scale but varies from up to depending on the position in the degeneracy plane. We want to stress that the external convergence estimated by [22] is based on an external shear prior of .

Figure 5: Posterior distribution (1-2-3 sigma contours) of lens model parameters and time delay distance of the combined analysis of imaging data of F814W and time delay measurements. Different colors correspond to different choices of the shapelet scale . The posterior samples for different values mutually disagree in almost all parameters presented.

6.2 Constraints from kinematic data

To investigate the potential constraining power of the velocity dispersion data, we are interested in how distinguishable different positions within the MST are in terms of their predicted central velocity dispersions. To do so, we fix the cosmology and the external convergence to fiducial values. This allows us to evaluate the predicted LOS central velocity dispersion (Equation 4.11) for all the posterior samples of Figure 5. We assume a random realization of with a flat prior in the range [0.5,5] for all the posterior positions.

In Figure 6 we illustrate the predicted samples vs the predicted time delay distance . We see that the samples can not be fully distinguished with the current velocity dispersion measurement and the assumed anisotropy prior. The relative distance in the predicted velocity dispersion between the different samples are all within 4 (model given data).

There are three factors which affects the distinction of the source scales by kinematic data. (1) The uncertainty in the spectroscopic measurement, analysis and modeling of km s which is about 6%. This is visually the most obvious contribution in Figure 6, marked by the gray band. The mean values of the predicted samples of the different source scales differ by about one sigma of this estimated uncertainty. (2) The anisotropic uncertainty in the lens galaxy kinematics. This is the main driver of the spread in the predictions of the velocity dispersion within each source scale sample. This scatter has a relative spread of 10% given . (3) The predicted velocity dispersion depend highly on the observational conditions and configuration. The PSF and the slit size of the spectrograph results in a convolution and averaging over a wide range of radial scales. The predicted velocity dispersion for different concentrations of the mass in the lens galaxy (i.e. power-law slope ) differ the most in the very center of the lens. At the Einstein radius itself, the different lens models predict basically the same kinematics. With the PSF of 0.7” and a slit size of 0.81” 0.7”, power-law mass profiles with slopes in the range differ by about 100 km s in their predicted velocity dispersion . A smaller slit and seeing conditions of FWHM 0.1” can double this relative difference and therefore could improve the constraining power of the kinematic data significantly.

The combined effect of non-perfect data and non-perfect modeling of the kinematic data with prior can be translated in a relative error in the time delay distance of about 7.5% from Figure 6. Only kinematic data of the lens galaxy and its analysis can reduce this error budget.

In Section 5.2 we showed that the individual image likelihoods of the different samples differ by more than 10. Before including the velocity dispersion measurement in our cosmographic analysis, we re-normalize the image likelihood such that it is independent of the source scale (see Section 5.3). This re-normalization is done by taking the same number of MCMC posterior samples from the different source scales when doing further inferences with the lens model parameters.

Figure 6: Estimated LOS central velocity dispersions vs. time delay distances of the sample of lens models from Figure 5 (in the same colors) for a kinematic anisotropy prior of . The 1-2-3 sigma contours are shown. The external convergence was explicitly set to zero and the cosmology has been fixed to the Planck mean values in this particular plot. The gray band reflects the 1- uncertainty range of the LOS velocity dispersion estimates from the data. This shows that velocity dispersion estimates add important information on the lens model constraints.

6.3 Source scale and kinematic anisotropy priors

The combination and inference coming from the different data sets relies on priors on the source scale of the background galaxy and on the anisotropic behaviour of the stellar kinematics in the lens galaxy. In particular, the inference of the Hubble constant is related to the inference of the angular diameter distance as

(6.3)

In Figure 6, we see a significant dependence between the size of the source galaxy () and . Furthermore the interpretation of the kinematic data is also dependent on the anisotropic behaviour of the lens galaxy.

Choices of the priors on the source size and aniosotropic kinematic must be chosen with care based on information gained from other work as these priors potentially have a significant impact on the infered parameter posterior (i.e. ). In the following, we discuss two different priors in the kinematic anisotropy and the source scale. 333Comments from the authors about confirmation bias: The analysis of the mentioned priors on the cosmological inference has been made after posting a first version of this paper on arXiv.

6.3.1 Source size prior

A simple form of the source size prior which does not impose any specific form of knowledge about is a uniform prior in the range . We refer to this prior as . This prior ignores any knowledge about the population of galaxies. The model parameter is directly related to the brightness of the source as

(6.4)

The number density of galaxies as a function of luminosity is a well measured quantity (luminosity function, LF) and its faint end slope for the blue galaxy population can be well described with a single power-law slope as

(6.5)

with [44]. In this form, the expected source size can be stated as

(6.6)

This prior is weakly dependent on such that smaller source sizes are prefered. We chose as our default prior and explore the impact with in section 7.4.

6.3.2 Anisotropic kinematic prior

Studies of early type (lens) galaxies have been made by e.g. [45, 46] which reveal similar properties compared to local early type galaxies. We consider two priors which cover the same range in the mean anisotropic behaviour and their predicted velocity dispersion . (1) The prior used in Figure 6 is flat in (equation 4.7) in the range [0.5,5]. This prior should cover the expected scale where the transition between isotropic and radial velocity dispersion should occur in an uniform way and is exactly the same prior used in [22]. We refer to this prior as .

(2) We model a global contribution of the anisotropic behaviour in the form

(6.7)

in the range . This reflects the same range in allowed values for a given mass model. We refer to this prior as . indicates a isotropic velocity dispersion and , for which the velocity dispersion ellipsoid is very elongated along the radial direction with , corresponds to with the same mean anisotropy within the aperture. This is the same functional form of the prior as used in [2] to analyze a spiral lens galaxy althought with less range into a pure radial dispersion.

7 Cosmological inference

In this section, we study the cosmological constraints from strong lensing using data from images, time delays, central velocity dispersion of the lensing galaxy and independent external convergence estimates. We first show that the data can be used to constrain the angular diameter relation. Based on the constraints on the angular diameter distances, we then introduce the likelihood that allows us to infer the parameters within the flat CDM cosmological model.

7.1 Angular diameter distance posteriors

We can combine the posterior samples of Figure 5 with the independent velocity dispersion measurement to calculate the angular diameter distance relations and (Equation 4.11 and 2.8) as

(7.1)

and

(7.2)

To take into account the errors in , and , we importance sample the posteriors from the independent measurements ( and ) and for we uniformly sample in the range [0.5,5] times [see e.g. 47, 21, 22, for similar use].

The vs plane as shown in Figure 7 inherits the cosmological information of this analysis coming from the combined data and consistently translates the uniform prior in the source scale into the cosmological inference. This plane covers a wide range but the constrained region is more narrow. [48] did a very similar analysis in term of folding in the velocity dispersion measurement. In our case, we get a degeneracy in the two-dimensional plane coming from the MST whereas [48] and the forecasting of [49] assume independence in the two quantities. We over-plot the posterior samples of WMAP DR9 [50] and Planck15 [PlanckCollaboration:2015p9875] converted to the angular diameter distances of the lens system. We find that at least the posterior samples of one chosen source scale parameter is consistent within 2 with the CMB experiment posteriors in a flat CDM cosmology for the low redshift angular diameter distance relations. Without the renormalization of the imaging likelihood (see Section 5.3), this statement can not be made.

Figure 7: The constraints of the angular diameter distance relation for discrete positions in the MSD plane for filter F814W (same analysis for filter F555W is shown in Figure 11 in the appendix). The chosen priors in the source scale and the kinematic anisotropy of the lensing galaxy are and . Different colors indicate different imposed source scales. On the left panel: vs . Also over-plotted are the posteriors of the WMAP DR9 and Planck 2015 CDM posteriors mapped in the same angular diameter distance relation. On the right panel: Re-mapping of the angular diameter relations into a vs plane. The linear fit is indicated by the thick black line and the (1,2,3)- upper and lower limits of the projected distribution are plotted in different gray scale. The parameters of the fit are indicated in the figure.

7.2 An analytic likelihood for cosmology

So far, we have discretized the degeneracy plane by uniformly sample in steps of 0.01”. Effectively this means that while all the other parameters are sampled through standard MCMC methods, the direction is sampled on a grid. This separation is needed to allow us to do the re-normalization of the likelihood as described in Section 6.2. Sampling the -grid finely is computationally expensive. In the following, we show how we can analytically describe the posterior distribution and fill the gaps in without additional sampling.

To do so, we first map the vs plane of Figure 7 (left panel) into a vs plane (right panel). We see a linear relations between the posterior samples in a monotonic and equally spaced increasing fashion as a function of . We fit with linear regression the function

(7.3)

with being the slope and being the intercept. The legend of Figure 7 (right panel) shows the best fit values, which we discuss in more detail later. The linear fit is a good description of the combined samples of different source scalings. The same is shown for the filter F555W analysis in Appendix C. The spread of the distribution orthogonal to the linear relation is not well fit by a Gaussian distribution, but we find a skewed normal distribution provides a good description.

The one-dimensional likelihood of the strong lens system data given a cosmological model is given by the one-dimensional probability density of the samples relative to the fitted line:

(7.4)

where is the standard deviation, the skewness and is the re-parameterized skewed normal distribution function described in Appendix E. How the different source scale priors on fold in the likelihood is described in Appendix G and equation G.3. In this section, we apply a flat prior in , , and a flat prior in , , (see section 6.3). The inferences for the other combinations of the choices of priors are presented in Section 7.4.

For the analysis of the HST band F814W we fit the values , , and . For band F555W the fits result in , , and . Fitting the combined samples of the band F814W and F555W leads to , , and . The units of these parameters are given in respect with the angular diameter distances in Mpc.

The simple form of the likelihood enables a fast and consistent combination of different strong lensing systems also in combination with other cosmological probes.

7.3 Cosmological parameter constraints

The constraints on the angular diameter distance relations can be turned into constraints on the cosmological parameters of the background evolution. In the following we assume a flat CDM cosmology. The homogeneous expansion can be described in terms of the matter density and the Hubble constant . We use the likelihood of Equation (7.4) with the values of , , and from the analysis of F814W and F555W separately. First, we sample the parameters and simultaneously with uniform priors of and . Figure 8 shows the posterior distributions for the filter F814W (left panel) and F555W (middle panel) for the priors (, ) separately. The degeneracy in is strong but can be determined fairly well. A good approximation of the degeneracy shown in the --plane can be described by

(7.5)

where is the value for at fixed and is the marginalized error at fixed . This form allows us to more directly compare with other results from the literature.

Figure 8: Posterior sampling of the cosmological parameters for the filters F814W (left), F555W (middle) and combined with equal weight of the likelihoods of the two images (right). The posterior distribution of WMAP DR9 and Planck 2015 are over-plotted. The chosen priors in the source scale and the kinematic anisotropy of the lensing galaxy are and .

For a fixed value of , we infer a Hubble constant of km sMpcfor the F814W and km sMpcfor the F555W analysis.

From the analysis of each filter separately, we get an uncertainty coming from the imaging data only to be below 1% in the resulting inference of . Given the fact that our estimates for the two filters F814W and F555W is about 4.0% different while using exactly the same analysis and the same time-delay and kinematic data for all other parameters involved, we conclude that the imaging data inference is partially driven by unknown systematics in the modeling and the data. To marginalize out potential systematics in the analysis, we combine the two analyses on the angular diameter posterior level. The two-dimensional posteriors are shown in the right panel of Figure 8. In this way, we get a Hubble constant of km sMpc. The full posteriors for both samples are shown in Figure 9.

Figure 9: Posterior distribution for the value of for a fixed for filter F814W (green), F555W (blue) and the combined samples (red). The chosen priors in the source scale and the kinematic anisotropy of the lensing galaxy are and .

7.4 Prior dependence

In this section we investigate the dependence of the cosmological inference from the choice of priors of the source scale and the anisotropic kinematics of the lensing galaxy . In Section 6.3 we stated for each parameter two different priors, each of them being quoted to be uninformative and probing the same range in the physics. In table 1 the likelihood parameters and the resulting inference for fixed in flat CDM are stated. We see a strong prior dependence on the posterior distribution which can result in a mean shift in of more than 10 km sMpc. The source scale prior can result in a weak mean shift of about 1-2 km sMpcwithout a change in the uncertainty. This means that the information content in the imprinted priors are roughly the same and the systematic uncertainty is subdominant to the quoted total uncertainty. The situation changes for the kinematic prior . The flat prior approach for the two different parameterizations shifts the mean infered value of by more than . The precision is also affected: The prior results in a significantly higher precision inference than . This implies that inherits more information for the specific task of measuring than . If this prior is not representative of the distribution of early type galaxies, the inference with can be significantly biased compared with .

444For fixed in flat CDM.
- -
-0.0012 263.8
- -
-0.0014 264.2
Table 1: Likelihood and posteriors for different choices of priors. The inference is for fixed . indicates a flat prior in in the range in the parameterization of equation 4.7 and indicates a flat prior in of equation 6.7 of the anisotropic behavior of the lens galaxy. reflects a flat prior in the source scale and reflects a prior of the galaxy luminosity function (see section 6.3.1). The parameters describe the likelihood function stated in equation 7.4 and G.3.

8 Joint uncertainties and comparison with other work

In this Section we analyze the impact of the different data sets on the cosmological inference and we compare our method and results with the literature.

8.1 Uncertainties from the different data sets

We assign uncertainty estimates on the inference of coming from the independent data sets, namely the time delays, the HST ACS images, the line-of-sight analysis of wide field data and the spectra of the lens galaxy for the kinematic estimate 555In this analysis we ignore the dependence of the line-of-sight analysis on the shear term from the ACS image reconstruction.. We do so by forecasting a perfect modeling result for all data sets except the one in question. We then proceed in exactly the same way as presented in Section 7. This leads to an inference of the cosmological parameters only affected by the uncertainties coming from one single data set. We perform this analysis with the default priors and .

In Table 2 the estimated uncertainties from the different data sets are summarized and the 1- uncertainties on for fixed is stated. The Gaussian approximation of all these errors leads to a total uncertainty of 9.4% on . The estimate of the uncertainty coming from the full sampling results in 7.9%. This analysis does not include further potential systematics and does not question the priors chosen.

Our approach on the error analysis is different than the one chosen by [22]. We do not quote an error on the lens model itself, as this inference is dependent on different data sets. We quote an error on the lens model modulo a MST for the image reconstruction and separately an error on the kinematic estimate, which potentially can fully break the degeneracy.

We clearly see that the dominant contribution in the final uncertainty can be related to the kinematic data and its modeling. As discussed in Section 6.2, high resolution spectroscopy can provide data which can better constrain different positions in the MST and therefore significantly reduce the uncertainty on the angular diameter distance relation. The second most dominant uncertainty come from the line-of-sight contribution.

8.2 Comparison with other work

Cosmographic inference has been published by [22] with the same lens model parameterization and by [23] in combination with a composite (dark matter and baryonic matter separated) lens model, in response to the work of [1]. The values and uncertainties on the Hubble constant are km sMpcfor a value of in [22], a 5.5% error, and km sMpc, a 5.75% error, with for a flat CDM universe.

One difference between the work of [22, 23] and the one presented in this work arise from the explicit treatment of the MSD and related degeneracies in our work and its link to the source surface brightness reconstruction method. This allows us to overcome (at least partially) systematics from the source reconstruction method and the mass profile assumption. On the other hand, this weakens the constraining power of the image reconstruction. This explains our larger uncertainties compared to [22, 23]. Furthermore, their stated values on are -independent in the flat scenario while our values do depend on (see our Figure 8 vs. Figure 8 in [22]). This comes from the different description of the cosmological likelihood. The likelihood in [22] is described fully in terms of the time-delay distance where else our likelihood has an additional dependence on . In that sense, their stated value is independent of but ours requires a prior on .

A second difference is that we work in a 2D-plane of angular diameter distance relations (Figure 7) without the need of cosmological priors to define our angular diameter distance likelihood. This results in a different shape of the posterior distribution in the - plane (Figure 8) and the inferred projected posteriors have a strong dependence.

The best comparison with the work of [22, 23] should be done when comparing the inference with the same kinematic prior (first or second row in Table 1). We want to stress that we use explicit priors on the source scale. The cosmological inference is dependent on this prior as the constraining power of the kinematic data is weak. Therefore a shift of about 1 in our stated uncertainty on the inference of is not surprising.

Comparing our results with the CMB experiments, we get a 2.5 shift for and a 1 shift for in the CDM parameter inference. We conclude that the angular diameter distance at last scattering and the inferred angular diameter distance relation at lower redshift from this analysis are consistent with a flat CDM cosmology. Our analysis depends on uninformative priors on the kinematics of the lens galaxy and the source reconstruction scale . Further systematics can potentially also occur and are not included in this analysis.

Description Uncertainty
Time delays 1.6%
HST ACS image reconstruction 2.8%
Line-of-sight contribution 4.7%
Lens kinematics 666The quoted uncertainty includes the uncertainty in the unisotropy radius with a prior of . 7.5%
Total (Gaussian) 9.4%
Total (full sampling 777The uncertainty in the full sampling is given as half of the 68% confidence interval divided by the mean posterior value.) 7.9%
Table 2: Error budget on for a fixed .

9 Conclusions

In this work we applied the newly developed source reconstruction technique of [32] to the strong lens system RXJ1131-1231 to extract cosmographic information. We showed how different source reconstruction scales probe different regimes in the MST even when the lens model is not fully transformable through the MST.

This work is built on the modeling and the data of [22] and the systematics analysis of [1]. We incorporate a re-normalization of the imaging likelihood such that we have explicit priors on the source scale before combining with the kinematic data.

We introduced a cosmographic inference analysis which enables us to combine imaging, time-delay and kinematic data without relying on any cosmological priors. We came up with a likelihood function only based on the angular diameter distance relations, which can be described in analytic terms.

We find that the choice of priors on lens model parameters and source size are subdominant for the statistical errors for measurements of this systems. The choice of prior for the source is sub-dominant at present (2% uncertainty on ) but may be relevant for future studies. More importantly, we find that the priors on the kinematic anisotropy of the lens galaxy have a significant impact on our cosmological inference. When incorporating all the above modeling uncertainties, we find km sMpc(for ), when using kinematic priors similar to other studies. When we use a different kinematic prior motivated by Barnabè et al. (2012) [2] but covering the same anisotropic range, we find km sMpc. This means that the choice of kinematic modeling and priors have a significant impact on cosmographic inferences. Further systematics in the data and modeling can also occur. The way forward is either to get better velocity dispersion measures which would down weight the impact of the priors or to construct physically motivated priors for the velocity dispersion model.

This inference analysis was achieved with a single strong lens system in two imaging bands. Combining the information of multiple systems with comparable data can add vital constraints about the late time expansion history of the universe, also in terms of extensions of the standard cosmological model.

Acknowledgments

We thank Sherry Suyu and the co-authors of [22] and [23] for useful comments and discussions. We thank the Referee for useful comments on the manuscript that helped us improving the text. We acknowledge the import, partial use or inspiration of the following python packages: CosmoHammer [51], FASTELL [52], numpy 888www.numpy.org, scipy 999www.scipy.org, astropy 101010www.astropy.org, triangle 111111https://github.com/dfm/triangle.py. This work has been supported by the Swiss National Science Foundation (grant 200021_149442/1 and 200021_143906/1).

References

Appendix A Numerical computation of the luminosity-weighted LOS velocity dispersion

The computation of the luminosity-weighted LOS velocity dispersion within an aperture under certain seeing conditions (Equation 4.10) involves numerically challenging projection integrals and convolutions. In this section, we describe our approach to achieve a numerically stable and fast computation with a Monte-Carlo ray-tracing approach, similarly used by e.g. [53] to render convolved Galaxy light profiles. This method is based on drawing positions representing the total light distribution of the galaxy.

For the light in the galaxy, we take a Hernquist profile [41]

(A.1)

where is the total flux and relaxed to the effective radius of the galaxy by . The radial distribution function of flux is then

(A.2)

The cumulative distribution function is

(A.3)

A sample of can then be drawn from the distribution

(A.4)

where is the uniform distribution in .

In the following, we describe the steps starting from a representative sample of the flux in the galaxy to get to the estimate of the aperture averaged velocity dispersion:

  1. Draw a representative sample of radii drawn from the three-dimensional light distribution of the Hernquist profile (Equation A.4).

  2. Project the radius on a random two-dimensional plane and compute its projected radius and the projected coordinates . This sample represents the projected light profile of the galaxy.

  3. Displace the two-dimensional coordinates with a random realization according to the seeing distribution to . We assume the PSF is a two-dimensional Gaussian distribution. This sample represents the convolved, projected two-dimensional light distribution of the galaxy.

  4. Select samples, whose displaced position is on the aperture . This selects a sample representative for the luminosity and radial weighting within the aperture.

  5. Evaluate , the projected (but unweighted) velocity dispersion for the remaining samples.

  6. Take the sample average of the velocity dispersion . This average (once converged) corresponds to with the assumption of a Gaussian velocity dispersion.

About 100 samples evaluated in the aperture gives already an accuracy in of about 1%. For this paper, the computation is done with 1000 samples.

Appendix B Residual maps

In Figure 10 the normalized residuals corresponding to the source models with different source scales in Section 5.2 are shown. The residual maps differ significantly between the best fit values of the different shapelet scales . This reflects the fact that extended structure in the Einstein ring can give constraints on the local slope of the mass profile and the given mass model can not adopt equally well to different source scales as it is can not be rescaled according to the mass-sheet transform. The inferred lens models can be understood as the best fit power-law profiles at different positions within the MST.

Figure 10: The normalized residual maps for the best fit reconstruction for the different choices of the shapelet scale for the F814W image. The residuals differ significantly for the different choices of . From the imaging data only, a scale is favored over a scale by more than 30 . This statement is entirely lens model dependent.

Appendix C Analysis on WFC1 F555W

In the paper, we did focus on the analysis of the WFC1 F814W filter band. Here we present the same analysis for filter F555W. Figure 12 shows the posterior distribution of the lens model parameters and time delay distance for F555W. Figure 11 shows the constraints on the angular diameter distance relation. The values describing the distribution can be found in the main text.

Figure 11: The constraints of the angular diameter distance relation for discrete positions in the MSD plane for filter F555W (same as Figure 7 for filter F814W). Different colors indicate different imposed source scales. On the left panel: vs . Also over-plotted are the posteriors of the WMAP DR9 and Planck 2015 CDM posteriors mapped in the same angular diameter distance relation. On the right panel: Re-mapping of the angular diameter relations into a vs plane. The linear fit is indicated by the thick black line and the (1,2,3)- upper and lower limits of the projected distribution are plotted in gray scale. The parameters of the fit are indicated in the figure.
Figure 12: Posterior distribution (1-2-3 sigma contours) of lens model parameters and time delay distance of the combined analysis of imaging data of F555W and time delay measurements. Different colors correspond to different choices of the shapelet scale . (same as Figure 5 for filter F814W).

Appendix D Bayesian description and renormalization of the imaging likelihood

One of the steps presented in this paper is the renormalization of the imaging likelihood for different source scales . In Section 5.3 we provided heuristic arguments for this approach in the case of time delay cosmography. In the following Section, we provide a Bayesian interpretation and justification of our choice in performing this calculation.

Let us assume that there is a complete model that is able to fully describe the lens, with parameters . However, when we fit the data, in our modeling process, we use a restricted subset of the model containing only the parameters and that the missing degrees of freedom are captured by the parameters . To complete our notations, the source scale is given as , the cosmological parameters as . We also denote the image data as , the kinematic data as and any other independent data of the time delays and the lens environment as .

Our goal is to estimate the cosmological parameters given the data, which is . We can state, using Bayes rule

(D.1)

Independence of , and results in

(D.2)

The internal part of the MST is encapsulated in the term . One way to think about MST is that the source scale cannot be measured from imaging data alone. In other words, given image data and marginalizing over all possible lens models, one should recover the source size prior. The Bayesian expression for the MST is then