A Mosaic of quasar catalog

New constraints on Lyman- opacity with a sample of 62 quasars at


We present measurements of the mean and scatter of the IGM Lyman- opacity at along the lines of sight of 62 quasars at , the largest sample assembled at these redshifts to date by a factor of two. The sample size enables us to sample cosmic variance at these redshifts more robustly than ever before. The spectra used here were obtained by the SDSS, DES-VHS and SHELLQs collaborations, drawn from the ESI and X-Shooter archives, reused from previous studies or observed specifically for this work. We measure the effective optical depth of Lyman- in bins of 10, 30, 50 and 70 cMpc , construct cumulative distribution functions under two treatments of upper limits on flux and explore an empirical analytic fit to residual Lyman- transmission. We verify the consistency of our results with those of previous studies via bootstrap re-sampling and confirm the existence of tails towards high values in the opacity distributions, which may persist down to . Comparing our results with predictions from cosmological simulations, we find further strong evidence against models that include a spatially uniform ionizing background and temperature-density relation. We also compare to IGM models that include either a fluctuating UVB dominated by rare quasars or temperature fluctuations due to patchy reionization. Although both models produce better agreement with the observations, neither fully captures the observed scatter in IGM opacity. Our sample of 62 quasar spectra opens many avenues for future study of the reionisation epoch.

dark ages, reionization, first stars  quasars: absorption lines  intergalactic medium

1 Introduction

The first billion years of the Universe are currently a frontier of late-time cosmology, both observationally and theoretically. During this stretch of time the first stars and galaxies assembled from the primordial gas left behind by reheating, and the atomic hydrogen permeating the early Universe became ionised. This “reionization” transition is believed to be largely completed by redshift six. The precise timing and topology of reionisation are strongly influenced by the processes at work in the first galaxies and active galactic nuclei (AGN), as well as the large–scale structure of the early intergalactic medium (IGM).

Quasars located at have proven to be useful tools for obtaining information about reionisation due to their high intrinsic luminosities and prominent Lyman- emission lines. These properties have yielded results on multiple fronts, from measuring the sizes of quasar proximity zones across time, which are expected to diminish with increasing IGM neutrality and decreasing quasar lifetime (e.g. Fan et al., 2006; Carilli et al., 2010; Keating et al., 2015) to constraining enrichment processes by probing the cosmic abundances of intervening metals (e.g. Ryan-Weber et al., 2009; D’Odorico et al., 2013; Becker et al., 2015a; Chen et al., 2016; Bosman et al., 2017). The Lyman- forest extending bluewards of Å in the quasar rest frame is of particular interest to reionisation as it traces the diffuse intergalactic gas whose ionization is sensitive to the metagalactic ultraviolet background (UVB). The Lyman- opacity in the forest increases with redshift, and eventually complete absorption is reached once the IGM reaches average hydrogen neutral fractions of per cent (Gunn & Peterson, 1965). The characterisation of Lyman- opacity across redshift is a powerful constraint on models of reionisation, as the amount of residual transmission is sensitive to the nature of the UV sources, the thermal state of the IGM, and the large-scale clustering of sources among other factors (e.g. Wyithe & Bolton 2011; McQuinn et al. 2011; Davies et al. 2017).

Lyman- transmission along QSO lines of sight is often quantified by an “effective optical depth”, , where is the observed (residual) flux in the Lyman- forest, and is the unabsorbed continuum (e.g. Fan et al. 2006). The first studies of the optical depth distribution pointed to a large scatter in transmission along lines of sight (Songaila, 2004; Fan et al., 2006). Although this scatter was potentially incompatible with the predicted scatter due to large-scale fluctuations in the density field alone (Lidz et al., 2006), firm conclusions were limited by the relatively modest sample sizes (e.g. Mesinger 2010). Becker et al. (2015b) discovered a cMpc Lyman- trough extending down to and demonstrated that its existence, as well as the general distribution of Lyman- opacity measurements at , is incompatible with a spatially uniform UVB. The discovery prompted a flurry of new reionisation models (see e.g. Chardin et al. 2015; D’Aloisio et al. 2015; Davies & Furlanetto 2016). The combined samples of Fan et al. (2006) and Becker et al. (2015b) included 26 quasars at , which is only a fraction of the more than 200 quasars now known at these redshifts. In this paper we gather 62 spectra of quasars, more than doubling previous samples.

We present updated measurements of the Lyman- opacity distribution function (PDF) for the redshift range . The number of known quasars at is increasing rapidly due to searches by the Dark Energy Survey (DES; Reed et al., 2015), the Subaru High- Exploration of Low-Luminosity Quasars (SHELLQs; Matsuoka et al., 2016), Pan-STARRS (Kaiser et al., 2010), the VISTA Kilo-Degree Infrared Galaxy (VIKING; Venemans et al. 2013; Carnall et al. 2015) survey, the Canada-France High-redshift Quasar Survey (CFHQS; Willott et al., 2007) and UKIDSS (Venemans et al. 2007; Mortlock et al. 2009, 2011) as well as the completion of the search for high redshift quasars in the Sloan Digital Sky Survey (SDSS, York et al., 2000, Jiang et al., 2016). Here we take advantage of this increase to provide smoother constraints on the Lyman- PDF with a better handle on cosmic variance. In addition, we are able to robustly sample the Lyman-alpha opacity distribution up to for the first time.

The paper is structured as follows. In Section 2 we describe our sample of 62 quasars and present four previously unpublished spectra, briefly discussing the properties of our sample. Our methods for measuring Lyman- opacity distributions are presented, and compared to previous studies in Section 3. Challenges in dealing with the wide range of spectral resolutions and signal to noise ratios (SNRs) across our sample are discussed. Section 4 gives our results spanning the redshift range using multiple ways of accounting for the inhomogeneous quality of the data and non-detections of transmitted flux. These results are confronted with predictions from numerical models and discussed in Section 5. Section 6 introduces our empirical functional form to residual Lyman- transmission and outlines our maximum likelihood fitting method. We discuss implications for the process of reionisation and caveats of the work in Section 7. The results are summarized in Section 8 and extra figures, including a mosaic of the entire sample, are shown in the Appendix. Throughout the paper we use (Planck Collaboration et al., 2016) and quote comoving distances in units of .

All measurements of obtained and used in this paper are made available online1.

2 Data

Our sample consists spectra of 62 quasars at observed over the last 11 years. Out of these objects, 4 are discovery spectra from the SHELLQs survey, 10 were discovered by DES-VHS (out of which 4 are currently unpublished), 13 are SDSS discovery spectra, 13 are new reductions of archival data, 19 are adopted from previous studies on Lyman- transmission, and 3 are new to this work. Ten different optical spectrographs were used to obtain the data: ESI, X-Shooter, GMOS, MagE, EFOSC, FOCAS, MMT RCS, HIRES, MIKE and LBT-MODS. The following sections describe the make-up of the sample in more detail. Table 1 details the provenance of each spectrum. A mosaic of the entire sample is plotted in Appendix A.

QSO Instrument SNR Survey Notes Discovery ref. Spectrum ref.
J1120+0641 7.0842 X-Shooter 35.0 UKIDSS (1) (26)
J1205-0000 6.8 FOCAS 3.5 SHELLQs (20)
J0224–4711 6.50 GMOS 6.5 DES–VHS (2)
J0210–0456 6.44 ESI 5.3 CFHQS new reduction (3) (32)
J2329–0301 6.43 ESI 6.5 CFHQS new reduction (11) (25)
J1148+5251 6.419 HIRES 29.7 SDSS (4) (22)
J1152+0055 6.37 FOCAS 3.1 SHELLQs (20)
J1148+0702 6.339 HIRES 3.4 SDSS (5)
J0100+2802 6.30 X-Shooter 85.2 SDSS new spectrum (6) this paper
J1030+0524 6.28 X-Shooter 28.0 SDSS new reduction (7) (22)
J0050+3445 6.25 ESI 24.4 CFHQS (3) (23)
J0323–4701 6.25 EFOSC 12.5 DES–VHS (2)
J0330–4025 6.25 GMOS 12.1 DES–VHS (2)
J1623+3112 6.247 ESI 16.1 SDSS (8) (22)
J2325-???? 6.23 EFOSC 1.8 DES–VHS (24)
J0410–4414 6.21 GMOS 12.7 DES–VHS (2)
J0227–0605 6.20 ESI 7.5 CHFQS new reduction (27) (25)
J1048+4637 6.198 HIRES 29.2 SDSS (4) (28)
J1609+3041 6.16 MMT 6.1 SDSS (5)
J2229+1457 6.15 ESI 6.0 CHFQS new reduction (3) (25)
J1250+3130 6.13 ESI 26.2 SDSS new reduction (9)
J0033–0125 6.13 ESI 6.1 CHFQS new reduction (11) (25)
J1319+0950 6.132 X-Shooter 96.8 UKDISS/SDSS (10) (23)
J1509–1749 6.12 X-Shooter 88.9 CFHQS (11) (22)
J2315–0023 6.117 ESI 29.8 SDSS (12) (23)
J0454–4448 6.10 MagE 5.8 DES (21)
J0109–???? 6.10 EFOSC 4.2 DES–VHS (24)
J2216-0016 6.10 FOCAS 2.4 SHELLQs (20)
J1602+4228 6.09 MMT 33.3 SDSS new reduction (8) (25)
J0303–0019 6.078 ESI 8.0 SDSS new reduction (12) (32)
J0353+0104 6.072 ESI 80.7 SDSS (12) (23)
J2054–0005 6.062 ESI 39.5 SDSS (12) (23)
J1630+4012 6.058 MMT 17.0 SDSS (4) (22)
J1641+3755 6.04 ESI 9.0 CHFQS new reduction (11) (25)
J0408–5632 6.03 EFOSC 4.3 DES–VHS (2)
J1257+6349 6.02 MMT 6.1 SDSS (13)
J1306+0356 6.016 X-Shooter 55.8 SDSS new reduction (7) (22)
J1137+3549 6.01 ESI 31.7 SDSS new spectrum (9) this paper
J2310+1855 6.003 LBT-MODS 17.9 SDSS (5)
J0818+1722 6.0 HIRES 39.8 SDSS (9) (28)
J0131–???? 6.00 EFOSC 3.9 DES–VHS (24)
Table 1: Data used in this work. References are given in the caption of Fig 2. A dash ‘–’ indicates the discovery spectrum is used. Question marks in quasar names indicate a quasar yet unpublished by the discovering authors.
QSO Instrument SNR Survey Notes Discovery ref. Spectrum ref.
J0841+2905 5.96 ESI 11.2 SDSS (15) (22)
J0122-???? 5.96 EFOSC 3.0 DES–VHS (24)
J1202–0057 5.93 FOCAS 2.2 SHELLQs (20)
J0008–0626 5.929 MMT 4.4 SDSS (13)
J1411+1217 5.927 ESI 15.9 SDSS (8) (22)
J0148+0600 5.923 X-Shooter 128.0 SDSS (13) (23)
J1335+3533 5.901 ESI 16.3 SDSS (9) (22)
J2119–0040 5.87 MMT 4.0 SDSS (5)
J2307+0031 5.87 MMT 3.6 SDSS (5)
J0850+3246 5.867 MMT 3.7 SDSS (13)
J0203+0012 5.86 ESI 17.4 UKIDSS/SDSS (12) (23)
J0005–0006 5.850 ESI 28.8 SDSS new reduction (8) (32)
J1243+2529 5.85 MMT 4.1 SDSS (5)
J0840+5624 5.844 ESI 17.6 SDSS (9) (22)
J1436+5007 5.83 MMT 3.2 SDSS (9)
J0239–0045 5.82 MMT 4.9 SDSS (16)
J0836+0054 5.810 X-Shooter 93.4 SDSS new reduction (7) (22)
J0002+2550 5.8 HIRES 71.7 SDSS (8) (28)
J0810+5105 5.80 MMT 10.0 SDSS (5)
J1044–0125 5.782 ESI 49.2 SDSS new reduction (17) (25)
J0927+2001 5.772 X-Shooter 73.7 SDSS new spectrum (9) this paper
J1621+5155 5.71 MMT 10.3 SDSS (5)
Table 2: Current list of quasars, continued. Quasar names including question marks are not public yet. References: (1) Mortlock et al., 2011; (2) Reed et al., 2017; (3) Willott et al., 2010; (4) Fan et al., 2003; (5) Jiang et al., 2016; (6) Wu et al., 2015; (7) Fan et al., 2001; (8) Fan et al., 2004; (9) Fan et al., 2006; (10) Mortlock et al., 2009; (11) Willott et al., 2007; (12) Jiang et al., 2008; (13) Jiang et al., 2015; (14) Carnall et al., 2015; (15) Goto, 2006; (16) Jiang et al., 2009; (17) Fan et al., 2000; (18) Wang et al., 2016; (19) Morganson et al., 2012; (20) Matsuoka et al., 2016; (21) Reed et al., 2015; (22) McGreer et al., 2015; (23) Becker et al., 2015b; (24) Reed in prep.; (25) KOA 2017; (26) Bosman et al. 2017; (27) Willott et al. 2009; (28) Becker et al. 2006; (29) Venemans et al. 2013; (30) X-Shooter archives; (31) Venemans et al. 2015; (32) Eilers et al. 2017

2.1 SDSS quasars

The SDSS is a sky survey over 14,555 which provides imagining in the ugriz photometric bands as well as spectroscopic follow-up using a 2.5m dedicated telescope located at Apache Point Observatory (Fukugita et al., 1996; Hogg et al., 2001). Here we briefly outline the detection procedure of quasars in the SDSS (see Jiang et al. 2016 for a more in-depth summary). Candidates are selected in the first step as drop-outs with no detections in the ugr photometric bands and with colours in excess of . After quality cuts, follow-up photometry is obtained in the near infra-red (IR) and a second cut is imposed (e.g. Fan et al. 1999). Alternative colour cuts are used in deeper areas of the survey near the Galactic cap (Jiang et al., 2008, 2009) and in regions scanned two or more times (Jiang et al., 2015).

Confirmation spectra of the quasar candidates are typically obtained obtained with the Red Channel Spectrograph (RCS) on the 6.5m Multiple Mirror Telescope (MMT) or Double Spectrograph on the Hale 5.1m telescope (DBSP) (e.g. Jiang et al., 2016), and in one occasion with the Multi-Object Double Spectrographs for the Large Binocular Telescope on Mt. Graham in south-eastern Arizona (LBT-MODS, Pogge et al. 2012). Additional near-IR spectra taken for some objects do not cover the range 7500Å – 10,000Å required for coverage of Lyman- at and are not used in this work (e.g. Jiang et al., 2007, Simcoe et al., 2011).

Jiang et al. (2016) presented the 52 final quasars discovered by the SDSS, most of which are included in this work. Out of those, 29 have been re-observed since their discovery to obtain higher quality data, while 23 have not. The discovery spectra for those 13 of these objects are included in our sample. Eight of those objects were reported in Jiang et al. (2016), three objects in Jiang et al. (2015), one objects in Jiang et al. (2009) and one object in Fan et al. (2006).

2.2 DES and DES–VHS quasars

The Dark Energy Survey (DES) covers an area of 5000 in the southern hemisphere in visible imaging. It employs the dedicated Dark Energy Camera (DECam) on the Blanco 4m telescope, Cerro Tololo (The Dark Energy Survey Collaboration, 2005). The first high- quasar discovered in DES was presented in Reed et al. (2015). Quasar candidates are selected using a drop-out technique similar to the SDSS procedure described above, this time with the condition . In addition, the DES survey includes the Y band, allowing a more efficient removal of red dwarves from the sample via a constraint on the quasar continuum of . In Reed et al. (2017), eight additional quasars were detected by combining DES data with infrared observations in overlapping footprint of the VISTA Hemisphere Survey (VHS; McMahon et al. 2013). Nine additional objects have been discovered in the same way since (Reed et al, in prep), of which four included here.

Spectroscopic confirmation of the candidates was conducted either with the ESO Faint Object Spectrograph and Camera (EFOSC, Buzzoni et al. 1984) or the Gemini Multi-Object Spectrographs (GMOS, Hook et al. 2004), with some objects subsequently observed in higher quality with the Magellan Echellette (MagE, Marshall et al. 2008). The best quality spectrum for each of the 10 DES–VHS quasars was chosen as shown in Table 1.

2.3 SHELLQs quasars

The Subaru High-z Exploration of Low-luminosity Quasars (SHELLQs, Matsuoka et al. 2016) is a new imaging survey utilising the Hyper Suprime-Cam on the Subaru 8.2m telescope (Miyazaki et al., 2012). A search for quasars has currently been conducted over an area of 430 . The SHELLQs project aims to obtain deeper exposures in the grizy bands compared to SDSS and DES, leading to the discovery of 33 faint quasars so far (Matsuoka et al., 2016, 2017). In this work, we include four out of the first nine SHELLQs quasar spectra presented in Matsuoka et al. (2016). The confirmation spectra for these objects were taken with the Faint Object Camera and Spectrograph on the Subaru telescope (FOCAS, Kashikawa et al. 2002) as described in the discovery paper.

2.4 Other quasar spectra

In this work, we re-use 20 quasar spectra presented in previous investigations of Lyman- opacity. McGreer et al. (2011) and McGreer et al. (2015) conducted observations of 22 previously known quasars with the MagE, MMT and the X-Shooter instrument on Cassegrain UT2 (Vernet et al., 2011). We are making use of 9 of those observations as indicated in Table 1. Similarly, Becker et al. (2015b) published spectra of seven quasars, at , not included in a previous work by Fan et al. (2006), obtained on the Echellette Spectrograph and Imager (ESI) on the Keck II telescope (Sheinis et al., 2002), and X-Shooter on the VLT. Three additional spectra were first presented in Becker et al. (2006). The quasars followed up in the above papers were initially discovered by various surveys including the UKIRT Infrared Deep Sky Survey (UKIDSS, Lawrence et al. 2007), the Canada-France High-z Quasar Survey (CFHQS, Willott et al. 2007), SDSS, and the Panoramic Survey Telescope and Rapid Response System (Pan-STARRS, Morganson et al. 2012). In addition, we also include a 30h X-Shooter spectrum of ULAS J1120+0641 at , first presented in Bosman et al. (2017).

2.5 New reductions

After a proprietary period of 18 months, raw data obtained with ESI is made publicly available through the Keck Observatory Archive (KOA2). In this work, we use 7 ESI spectra of quasars at re-reduced from raw data obtained from the KOA. Our ESI reduction pipeline is the same as described in Subsection 2.6.

Finally, we re-reduced the spectra of five quasars which have been previously published. X-Shooter spectra of the quasars J0836+0054 and J1030+0524 were introduced in McGreer et al. (2015), but here we use our own reduction of the raw X-Shooter files instead, in an attempt to improve data quality. ESI spectra of quasars J0210-0459, J0303-0019, and J0005-0006 were part of an observing run (PID: C197E; PI: Sargent) in October 2010 and have been previously used in e.g. Eilers et al. (2017). Here we use our own reductions of the raw ESI data.

Object Instrument Date Exposure Slit Width Seeing
Time (s) (arcsec) (arcsec)
SDSS J0100+2302 6.3 X-Shooter 23rd Oct 2015 1800 1.00 0.80
SDSS J1137+3549 6.01 ESI 18th Mar 2016 3000 1.00 0.80
SDSS J0927+2001 5.772 X-Shooter 13th Jan 2010 1800 1.00 0.77
Table 3: New observations of four quasars which are were not presented in previous work.

2.6 New spectra

We present 3 new observations of quasars which were carried out on the ESI, X-Shooter and EFOSC instruments as detailed in Table 3. The spectra are plotted in Figure 1.

An X-Shooter spectrum was obtained in January 2010 of quasar SDSS J0927+2001 at and had not previously been published. The reduction procedure is the same as the one presented in e.g. Becker et al. (2015b). The spectrum was extracted optimally (Horne, 1986) using 10 km bins after being flat-fielded and sky-subtracted following Kelson (2003). Custom telluric absorption routines were used as presented in Becker et al. (2012).

We obtained a 3000s ESI spectrum of the quasar SDSS J1137+3549 on the 18th of March, 2016.

A deep X-Shooter spectrum of J0100+2302 was obtained in collaboration with Max Pettini over 2015 and 2016 as part of the 13 hours program 096.A-0095(A). Here we make use of only one exposure of 1800s of the object, which is nevertheless a great improvement upon the previous LBT-MODS spectrum of the quasar.

Figure 1: New quasar spectra used in this work. Details of the observations can be found in Table 3.

2.7 Sample properties and notes on individual objects

Our 62 quasars have source redshifts in the range with a peak at and a distribution as shown in Fig 2. We investigate the Lyman- forest over Å, resulting in a redshift coverage shown in Fig 3 with up to 59 lines of sight covering the interval . These distributions vary slightly depending on the choice of proximity-zone cut-off.

Figure 2: Redshift distribution of the quasars included in our sample. Red indicates the whole sample while grey corresponds to the ‘SILVER’ sub-sample of objects with SNR.
Figure 3: Cumulative number of lines of sight covering a particular redshift. Red indicates the whole sample while grey corresponds to the ‘SILVER’ sub-sample of objects with SNR.

Redshifts for the objects in our sample are based the on best available estimates using nebular lines (such as [CII]) whenever possible. See Jiang et al. (2016) for the origin of the measured redshifts of SDSS quasars. Since we are primarily concerned with measuring Lyman-  opacity in the IGM, accurate values for the quasar redshifts are only relevant to the exclusion of quasar proximity zones from the analysis. In this work we are not attempting to measure the evolution of quasar proximity zone length across redshift. We use a fix cut–off for the end of the proximity zone of Å after checking that this is a reasonable choice and that more stringent criteria do not affect the results. This analysis is presented in Section 3.1.

We list the signal-to-noise ratio (SNR) for our spectra in the fourth column of Table 1. The SNR measurement is complicated by the disparate resolutions and redshifts of the quasar spectra, and because common sky lines fall at different rest wavelengths across the sample. To measure SNR, we first normalise the spectrum by a power law as described below, then keep the pixels located at Å which are not affected by sky lines. This wavelength range covers a portion of quasar spectra minimally affected by broad emission lines. We present here the mean SNR per 60 km s. The SNR is then computed as,


where is the error and the number of pixels per 60 km s interval. This is computed using , where is the pixel size in km s. For spectra where bins of fixed wavelength interval, , are used, rather than a fixed velocity interval, is measured at Å. While this is not the only way of homogeneously measuring spectral SNR, it is sufficient for our purposes to discriminate between data quality regardless of resolution, and has the advantage of being invariant under re-binning of the spectra. The values obtained range from SNR for J2325-5229, a DES–VHS quasar with continuum emission barely above the detection threshold, to SNR in a deep X-Shooter exposure of J1319+0959 first used in Becker et al. (2015b).

Out of the quasars in our sample, none show the characteristic features of broad absorption line spectra (BAL) or contamination by a DLA. Such objects were explicitely excluded during the sample assembly, with BAL and DLA features accounting for 5 out of 30 object rejections. We note that the fraction of quasar spectra displaying these features (5 out of 92 or about 5%) is lower than measured at later times. This could be due to the selection techniques employed to discover high redshift quasars, as the presence of a BAL feature diminishes the photometric colours most commonly used to select quasar candidates as drop-outs. It is also likely that some contamination by DLAs has gone undetected in our sample. Due to the saturation of the Lyman- forest, the most reliable way to detect and remove DLAs from the sample would be by detecting associated metal absorption at the DLA redshift. Mg II is unfortunately not visible for the redshifts of interest here, and the quality of the spectra is insufficient to detect typical C IV absorption systems in the majority of cases. Only deep X-Shooter spectra would provide sufficient coverage and sufficient SNR to completely remove DLA contamination at .

3 Methods

The spectra are first normalised by fitting a power law to the unabsorbed continuum. The portion being fitted extends from 1270 Å–1450 Å in the rest frame of the quasar; the range 1270 Å–1350 Å is used instead when the spectral coverage stops short of 1500 Å to avoid portions of the spectrum affected by the falling response of the instrument. This is the case for instance for spectra of quasars taken with the MMT or MagE, whose coverage extends to Å but the response of which decays significantly from Å. Pixels affected by sky lines are excluded and a first power law (PL) fit is made, from which we then exclude any pixels for which . The remaining flux is then fit with a power law again, and the process repeated a second time with a deviation coefficient of 1.5 to ensure convergence. Finally, the full flux array is divided pixel-by-pixel by the best fit power law function thus obtained.

Figure 4: Stack of flux over 1100 Å–1215 Å of 62 quasars. Spectra were normalised and sky lines masked before stacking. While the proximity zone transmission has fallen considerably by 1180Å, the exact end of the host quasar’s influence on the flux is unclear. Five spectra are excluded as described in Section V.2.7. Panels show the evolution of the stacked spectrum over , and . The thin blue lines show a total flux of zero.

Three objects which displayed too little continuum (due to a combination of high redshift and spectrograph wavelength coverage) are excluded from the analysis since no satisfactory estimate of the continuum could be obtained. The continua for all other objects were checked visually. We checked that the best fit power law parameters were robust to small changes in the fitting window bounds. The effects of window choices were run all the way through the analysis; we find an end effect on values of of magnitude . This effect can be seen in Fig.6, where the only differences between our measurements and those of Becker et al. (2015b) are due to small differences in the choices of continuum fitting. These errors are in all cases much smaller than the effect of cosmic variance.

We measure the average transmitted flux in windows of 50 comoving cMpc extending from the end of the quasar’s proximity zone at Å down to the onset of Lyman-  absorption (1041 Å in the rest frame). The average continuum-normalised flux is transformed into effective opacity following and associated to the redshift corresponding to the middle of that 50 cMpc region. The analysis is repeated for window sizes of 10, 30 and 70 cMpc .

We treat non-detections of transmitted flux in two different ways. First, following previous work, we take the upper limit on the flux to correspond to twice the error in the flux over the measurement window. If individual peaks of transmission are detected at more than significance over that range, then we take the lower limit on the flux to be equal to twice the flux in those peaks alone following where is the error over the wavelength covered by peaks (Becker et al., 2015b). This allows us to compare our results to the previous samples of Fan et al. (2006) and Becker et al. (2015b) which are a sub-set of our catalogue. The Lyman- opacities in these two samples were not measured in identical ways, as the lengths of the excluded proximity zones and the details of the continuum fitting were subtly different. This might have resulted in a mild tension between the two samples, which we can now harmonize by

  1. doing a bootstrap re-sampling of our catalog to mimic previously used sample sizes, and

  2. treating both samples identically through the same continuum fitting routine and same proximity zone exclusion technique.

Secondly, we treat upper limits by plotting the most optimistic and pessimistic bounds on the cumulative distribution function (CDF). The optimistic bound is given, as above, by taking the intrinsic average flux to be equal to two times the average error over the measurement window, i.e. just below detection sensitivity. However, different spectrographs with different exposure times will be sensitive to different thresholds: for instance, none of the data in our sample could measure an effective optical depth . To reflect this ambiguity, we attribute maximal opaqueness () to all non-detections in order to obtain a ‘maximally pessimistic bound’ (so called due to the increased difficulty of reconciling this outcome with current reionisation models). The ‘true’ CDF will necessarily lie in-between these two extremes so long as the bounds are resolved.

Figure 5: Effect of incrementally increasing the excluded proximity zone size on the Lyman- CDF at various redshifts (see also Fig 4). By ending the proximity zone at Å, the average opacity is affected at and . However, no statistically significant difference is seen between a cut-off at Å and the very conservative case Å, at any redshift. Throughout this paper we adopt the traditional cut-off Å.

3.1 Proximity zone exclusion

We aim to measure the Lyman- opacity of the IGM. In order to achieve this, we need to avoid any bias introduced by the quasar proximity zones: the regions immediately surrounding the quasars where the ionisation of the gas is enhanced due to ionising radiation from the object themselves. In the past, cut-offs for quasar proximity zones were chosen either on a case-by-case basis as the point where the quasar’s Lyman- flux had fallen below 10% of its peak value (Fan et al., 2006; Eilers et al., 2017) or chosen as a fixed value of 1176 Å determined based on a small sample of objects (Becker et al., 2015b). The former definition is not useful in the context of a sample containing spectroscopic data of varied resolution since clumps of neutral gas within the proximity zone might be resolved by some instruments but not others. A fixed-value cut-off is therefore more suited to a large dataset and facilitates future refinements of the measurements.

To ensure that the traditional value of 1176Å is sufficiently stringent as to remove all proximity zone influence, we first plot a stacked spectrum of our entire catalog in the range Å (Figure 4). The stacking was carried out by interpolating the spectra onto a common wavelength array after normalising them by a fitted power law and removing bad pixels. For the purposes of the stacking, objects with different measurements errors are not weighed differently. Some interesting features are visible, such as the slight increase in average opacity along lines of sight with redshift and the average power-law shape of the proximity zone. Based on the stack, we see that the proximity zone influences average transmitted flux at Å, but effects further away are unclear. To confirm and refine this, we incrementally increase the amount of excluded flux blueward of the Lyman- line and compare the resulting CDFs. This is equivalent to restricting the measurement to spectrum segments which are increasingly distant from the redshift of interest. The results are shown in Fig 5 for several redshift bins. The effect of the proximity zone flux is visible as a modest decrease in opacity when a cut-off of Å is used. No statistically significant difference is seen between the Å and the Å cases, indicating that the shape of the CDF has converged. We therefore adopt a value of the proximity zone end at Å in the rest of the paper.

Individual quasars showing anomalously long proximity zones (whether due to extreme bolometric luminosities, location in an under–dense IGM region, or chance) could be present in the sample and would not appear in the stacks in Fig 4. The resulting contamination could potentially bias the average opacities to be too low. However, we do not find any such objects in the sample by visual inspection. Uncovering trends in quasar proximity zones is beyond the scope of this work, and it is enough for our purposes to confirm that no boost to average flux is seen at Å.

Figure 6: Reproduction of the Becker et al. (2015b) opacity PDFs at (left) and (right). The same lines of sight are used, except for 4 quasars at that are excluded from our work for the lower redshift case. Differences are accounted for by slight differences in the continuum fitting between the authors and previous measurements taken from Fan et al. (2006).

4 Results

Figure 7: New results, obtained using the method described in Becker et al. (2015b). Only spectra of moderate or good quality (SNR, ‘SILVER’ sample) are used, and non-detections of transmitted flux are treated as data points with values of twice the average error (see text). No significant discrepancy with previous results is found.
Figure 8: New results, plotting the most optimistic and most pessimistic contours based on the intrinsic values values of non-detections. The leftmost contour corresponds to non-detections have intrinsic values of twice the average error (as in Fig 7) while the rightmost contour assumes non-detections are maximally opaque (see text). The thin dashed line displays the most likely lognormal distribution computed in a maximum likelihood scheme (see Section 6).

4.1 Effect of data quality

We select sub-samples from the quasar catalog based on the signal to noise ratios. A ‘GOLD’ sample is chosen with SNR to match the SNR of the worst spectrum used in McGreer et al. (2015). This yields a sample of 33 high-quality spectra, nearly halving the sample. Similarly, we construct a ‘SILVER’ sample by applying a less stringent cut of SNR, matching the data quality from Eilers et al. (2017). This yields 45 lines of sight. For consistency, we will refer to the full sample as the ‘BRONZE’ sample. We note that the sample of 26 quasars used by Becker et al. (2015b), the largest one so far, includes measurements from Fan et al. (2006) made from spectra with SNR much lower than these thresholds – down to SNR. The results are shown in Figure 11. There is no significant trend with data quality at , where the distribution is well-sampled even by the small GOLD sample. A disparity appears at .

The differences between the SILVER and GOLD samples at are due to the small sample size of the GOLD sample. The difference in number of lines of sight in this range is a factor of 2.5, which reflects the fact that very high quality spectra are harder to obtain for quasars at . The SILVER and GOLD sample are consistent according to a KS-test ().

On the other hand, the sizes of the SILVER and GOLD samples are similar at , and agree with each other well. However, the additional lines of sight included in the BRONZE sample are of insufficient quality to distinguish opacity beyond . Since the mean opacity measured in the SILVER sample is roughly , these SNR spectra yield upper limits (equal to twice the error) which are very poor in comparison. Two of BRONZE lines of sight do show signs of residual transmission (see below).

In light of these results, we decide to adopt the SILVER distribution in every instance in this paper where the data is presented or analysed in the form of a cumulative distribution function. The maximum likelihood method described above has been explicitly designed to account for observational errors and sensitivity, and we accordingly use the BRONZE sample only in Section 6.

4.2 New Lyman- distributions

Figure 7 presents our results compared to the previous CDFs of Becker et al. (2015b). As done in Becker et al. (2015b), the CDFs include lower limits. Figure 8 presents the same results, plotted to show the ‘pessimistic’ and ‘optimistic’ bounds described earlier. The results over are completely consistent with previous studies. We find a clear, well-defined tail of high-opacity () lines of sight at redshift . This trend was already visible in the CDF reported in Becker et al. (2015b). Roughly of lines of sight at have opacities , which might pose problems for IGM models that assume a spatially uniform UV background and temperature-density relation.

At , we find a small but significant tail of transparent lines of sight, with roughly of measurements showing . This tail was not visible in the Becker et al. (2015b) sample as most of the relevant objects were not included. The two samples are consistent according to a KS test (, Kolmogorov 1933). At , we find that opacities are slightly smaller than the ones previously reported. Again, the two samples are consistent according to a KS test (). Our sample for is of comparable size to Becker et al. (2015b)’s samples at and , so that small differences are expected between our measurement and a ‘true’ representation of cosmic variance in the same way as seen at lower redshifts.

At , our sample is smaller than all the ones used in Becker et al. (2015b) at lower redshift and the results should be interpreted with caution. Seven out of eleven 50 cMpc regions included in our SILVER sample display residual peaks of transmission, while the remaining four pose tight constraint on transmission. We find a very high average opacity of . Such opacities are only accessible to current spectrographs with large time investments. This is readily visible in the large panel of Figure 11, where the ‘transparent’ lines of sight in the BRONZE sample are upper limits originating from spectrographs which struggle to distinguish opacities beyond . It is worthy to note that two lines of sight with SNR do display residual transmission at the level of . The quasar J1148+0702 displays a transmission peak within Å of the formal end of the proximity zone, while the quasar J2325 displays such a peak outside of its proximity zone but has a SNR of 1.8, making it impossible to definitely rule of reduction issues. Further scrutiny of these objects is required in order to determine whether these peaks may be related to particularly long and sporadic proximity zones.

In Figure 9 we plot the distribution function of Lyman- opacity across redshift. We distinguish between detections and lower limits using separate histogram colours. The distributions are clearly non-gaussian, with peak values increasing linearly with redshift. The tail of opaque lines of sight at is clearly visible and appears smooth and well sampled.

The effect of varying the size of the integration window is shown in Figure 10. Although the effect is subtle, decreasing the window size tends to broaden the distribution, as expected. This is a natural consequence of cosmic variance. The broadening is particularly pronounced when the window size is decreased below 30 cMpc . The redshift range is more clearly affected, possibly because these redshifts are better sampled. We find no statistically significant difference between bins of 50 and 70 cMpc at any redshift.

4.3 Comparison to previous studies

As a first test of our procedure, we reproduce the CDF presented in Becker et al. (2015b) using the spectra of 24 out of 27 quasars used in that work, which are a subset of our catalog. Figure 6 presents the results at and . The latter measurement makes exclusive use of those 26 quasars, and therefore any deviations are due entirely to subtle differences in the continuum fitting – such as the precise wavelength ranges used and the number of sigma-clipping iterations. In addition, the opacities of the spectra used in Fan et al. (2006) were not recomputed in Becker et al. (2015b), giving rise to a separate set of slight discrepancies. All of our measurements of in 50 cMpc windows agree within error with those quoted in the original papers. The Becker et al. (2015b) results at make use of 4 quasars at which are not used in our sample. Nevertheless the shapes of the cumulative PDFs are in excellent agreement, and the measurements of in individual windows all agree within error.

Next, we compare our results with those Becker et al. (2015b) by computing the CDF obtained from a random sub-sample of the size used by those authors from our larger sample. The contours of one hundred such realisations are plotted in Figure 7. At all redshifts we find the results are in statistical agreement, with a mild tension at where our work finds a slightly lower average opacity.

Figure 9: Differential distributions of opacity in 50 cMpc bins, showing detections in black and lower limits in gray.
Figure 10: The effect of varying the size of the window over which Lyman- transmission is measured. No significant effect is seen between windows of 30, 50 and 70 cMpc at any redshift, suggesting that fluctuations occur on even larger scales. Binning the data in 10 cMpc windows strongly affects the distribution, in particular at lower redshifts where it results a broader distribution.
Figure 11: Cumulative distribution functions of Lyman- opacity across redshift, computed using the full sample (black), the SILVER sub-sample of 51 objects matching the quality of data used in Eilers et al. (2017) (SNR), and the GOLD sample of 35 objects which match the quality from McGreer et al. (2015) (SNR). Shaded blue areas show the ‘optimistic’ and ‘pessimistic’ bounds presented in Fig. 8. At , the distributions are well resolved by all samples and the distributions therefore agree. At , the GOLD distribution lies between the bounds shown in Figure 8. At , the difference is attributable to the small sample size of the sub-samples (see text).
Chardin et al. (2015) Keating et al. (2017) Sherwood
5.0 N/A 1.703 0.662
5.2 N/A 1.522 0.590
5.4 0.878 1.509 0.565
5.6 0.848 1.135 0.590
5.8 0.841 1.055 0.666
6.0 N/A 0.944 0.796
Table 4: Emissivity rescaling factors () used to tune the simulations discussed here. The factors are chosen to match the observed flux following (see text). The first column gives the measured values of . To obtain the most accurate measurement possible, only the GOLD sample was used. The errors are estimated using bootstrap re-sampling.

5 Comparison with models

In this section we compare the updated CDFs of Lyman- transmission in 50 cMpc windows with three simulations. Here, we briefly describe these simulations and outline their most relevant features.

When comparing predictions from these numerical simulations to observational results, it is important to keep in mind a few caveats. Firstly, all of the following numerical models explicitly re-scale the ionising background intensities to match the observed mean effective optical depth of the observations presented in this paper, , by freely choosing a parameter such that:


At low redshift (), this rescaling is small and is somewhat justified by the difficulty of self-consistently generating the ionising UVB. In the following analysis, all simulation snapshots have been rescaled in this way to observed for each redshift bin. The values of are computed using the SILVER sample of objects with SNR, and are shown in Table 4.

These simulations all use the UVB prescription of Haardt & Madau (2012) (HM12) as a starting point, which incorporates the best available estimates of the nature of ionising sources, the ratio of galaxy to quasar contributions, the escape fraction, the spectral shape of sources, and other factors. The precise values of these parameters are not known, and vary across redshift. The HM12 emissivity is therefore used as a best guess, and the final re-scaling of the emissivity reflects the known inaccuracy in the model. The rescaling is therefore simulation– and redshift–dependent, but also resolution–dependent since the small-scale recombination and self-shielding effects necessary to calculate the UVB self-consistently are currently beyond the reach of numerical simulations. In Table 4 we list the values used to rescale in optical depths in each case (), chosen to match the observed flux following . At high redshift () this rescaling procedure becomes less valid because the correction is not small. For instance, the mean flux values of the Sherwood simulation used here had to be re-scaled by a factor at . This reflects the fact that the HM12 background is a bad representation of the ionising emissivity at high redshift, and ideally should not be used.

Secondly, the timing of reionisation is a free parameter in most of the following models. The Sherwood simulation, and the HM12 ionising background, were designed to fit Lyman- transmission at lower redshifts than explored here. Because the time evolution of Lyman- transmission is slower at , the successful predictions of these cosmological simulations are usually robust to shifts up to over the redshift range considered here (see e.g. Chardin et al. 2018).

5.1 Full forward modelling

To meaningfully confront simulated Lyman- lines of sight with observations, it is important to post-process them in a way which mimics data. We take a full forward modelling approach when comparing the data to simulations, transforming simulated lines of sight into realistic observations before treating them in the same way as the data. This simplifies the comparison, and enables us to estimate the errors in an empirical way.

We implement this by selecting the same number of simulated lines of sight as present in the data at each redshift, and post-processing them with observational profiles. An observational profile consists of an instrumental resolution and an error array. Each selected simulated line of sight is randomly matched to such a profile drawn randomly from the observations relating to the redshift under study. The simulated spectrum is first mapped onto a corresponding wavelength array, convolved with a gaussian with width of the instrumental resolution and re-binned onto the same wavelength array as the observation. Finally noise is added randomly at each pixel following a gaussian distribution with width corresponding to the observed error in that pixel.

The resulting post-processed lines of sight are equal in number to the observations for the corresponding redshift range, and the CDF extracted from both sets of lines of sight should now in principle be completely comparable. To get an estimate of the variance between random realisations, we run the entire process 100 times until the envelope of the CDF bounds converges. CDFs outside of those bounds – even by a single bin – therefore empirically have much less than one chance in one hundred of being observed if the underlying transmission is given by the simulated model. These bounds are shown overlayed with the observational bounds in Figures 12 and 13 corresponding to the rare sources model, radiative transfer model, and uniform UVB cases respectively. The raw predictions from the simulations, without post-processing, are all shown in Figure 15.

Figure 12: Comparison of the measured Lyman- PDFs at with fully post-processed outputs of numerical simulations from Chardin et al. (2015) (top) and Keating et al. (2017) (bottom). The coloured contours show the envelope of the pessimistic and optimistic bounds for 100 realisations. Post-processing consists of randomly drawing a number of lines of sight from the simulations equal to the number of observations in the corresponding redshift range. The simulated lines of sight are then forward-modelled to mimic the observed spectra (see text).
Figure 13: Same as Figure 12 for lines of sight drawn from the Sherwood simulation (Bolton et al., 2017).

5.2 The Sherwood Simulation

Figure 14: Comparison of the measured Lyman- PDFs at with outputs from a range of numerical simulations. This plots shows only the solid lines from Figures 12 and Fig 13, and the errors have been omitted for the sake of comparison.

The Sherwood simulation suite (Bolton et al., 2017) was designed to reproduce the Lyman- forest over and test its sensitivity to a range of model parameters such as galactic and AGN-driven outflows, thermal histories, and cold/warm dark matter. With gas particle masses of and particles for a box size of 40 cMpc , the Sherwood simulation used here possesses higher mass resolution than other larger-scale cosmological simulations such as Illustris and EAGLE. The simulation is run using the hydrodynamics code P-GADGET 3 (Springel, 2005) and used core-hours of computing time. Bolton et al. (2017) compare simulated Lyman- lines of sight with data from Fan et al. (2006), finding remarkable agreement over with a slight lack of strongly opaque regions at . Because it was designed to match observations at redshifts less than , the Sherwood simulation uses a uniform UVB with a shape given by the HM12 model, in a scenario where reionisation is driven mostly by galaxies and with no radiative transfer. The interrelations of hydrogen neutral fraction, temperature, photon mean free path, and density are therefore not taken into account, and the simulations are expected to fail in the hydrogen reionisation regime where the UVB is known to be inhomogeneous.

Here we compare the 40 cMpc Sherwood simulation box with the Lyman- CDF over . Each measurement of requires stitching together two simulated lines of sight and truncating to 50 cMpc ; 2500 total values of are obtained. The model predictably falls short of reproducing the variety of line-of-sight opacities at . It is perhaps more surprising that the uniform UVB model also fails dramatically at , where line-of-sight variation is slightly more pronounced than previously reported.

5.3 Radiative transfer simulation

In addition to the homogeneous-UVB Sherwood simulations, we compare our results to the full cosmological radiative transfer simulations of Keating et al. (2017). Following D’Aloisio et al. (2015), these simulations test the effect of regions ionising at different redshifts on the spatial variations in the temperature-density relation. The temperature dependence of recombinations rates could then lead to increased fluctuations in Lyman- opacity. Unlike previous models, Keating et al.’s simulations use an extended and self–consistent reionisation history to boost the IGM temperature. The injected energy and history are chosen to match the temperature and photo-ionisation rates of the IGM at measured using the Lyman- forest.

This difference to previous models turns out to be important, as Keating et al. (2017) find their more realistic choices of reionisation heating do not reproduce the large Lyman- opacity fluctuations previously reported. Choosing a higher ionisation energy to match the IGM heating in D’Aloisio et al. (2015) they find the fluctuations are still not large enough, and moreover the produced lines of sight are in tension with low redshift Lyman- forest data as well as transmission peak statistics at high redshift.

The simulations are also run in P-GADGET 3 but snapshots of the resulting density field are then post-processed to include the effects of temperature and ionisation using mono-frequency radiative transfer. Here we use a 40 cMpc box with mass particles, which should better capture large-scale variations of opacity than the higher resolution 20 cMpc box. Keating et al. (2017) compare their Lyman- PDFs to data over and find no trend with box size. Here we stitch together two simulated lines of sight for each of the 2500 total measurements of . We find that this model, while performing better than the Sherwood simulation, does not provide a satisfactory fit to the Lyman- CDFs. Although including radiative transfer does increase line-of-sight variance, the effect is too small by at least factor of two. This is in agreement with the results of Keating et al. (2017).

5.4 Rare sources simulation

Finally, we compare the results to predictions of a quasar-driven reionisation toy model from Chardin et al. (2015). These models include a density field produced by the hydrodynamics code RAMSES onto which radiative transfer is added in post-processing with the ATON code. They found the fluctuations in the UV produced by galaxies in the redshift range to be on scales too small ( cMpc) to account for the spread in Lyman- opacity. A much better fit was found when rare, bright ionising sources were added to the simulation with a carefully chosen spatial density. Such sources could be (faint) quasar , or alternatively extremely bright star-forming galaxies.

The simulation boxes of Chardin et al. (2015) are only 20 cMcp in size, but this is not a problem as the line of sight variance is mostly driven by the presence or absence of a strong ionising source nearby. The simulations use mass particles. Because of the smaller simulation volume, we have to stitch three simulated lines of sight together for each measurement of . This results in 1020 measurements, a smaller number than in the previous cosmological simulations.

We find that the rare sources simulations aptly reproduce the CDF at , and are the only model to do so out of the ones we tested. We note however that the line of sight variance seems to disappear very quickly at lower redshifts, resulting in the model under-estimating the opacity variance at . Unfortunately no snapshots were available at , but it is unlikely that those would show a sufficient amount of opaque lines of sight since those are already gone from the simulations at . However, we note that there might be room for reionisation to occur later within this model, which could potentially ease the conflict. Chardin et al. (2015) also note that their simulations are in better agreement with the Lyman- CDFs calculated with smaller window sizes of cMpc h. This is still the case with the updated measurements, as can readily be seen by comparing Figures 10 and 15.

4.9 – 5.1 N/A N/A 0.138 0.84 0.141 0.67
5.1 – 5.3 N/A N/A 0.110 1.07 0.105 0.87
5.3 – 5.5 0.089 1.82 0.087 0.80 0.086 0.67
5.5 – 5.7 0.062 3.02 0.057 1.50 0.057 1.00
5.7 – 5.9 0.038 5.70 0.035 1.82 0.030 1.44
5.9 – 6.1 N/A N/A 0.007 2.80 0.005 1.78
Table 5: Most likely intrinsic parameter values with 68% credible intervals, following Equations 3 and 4. Best fit values are marginalised over the other parameter. The errors on simulations are approximately for and for , corresponding to the resolution down to which the bayesian likelihood analysis was run. Although the emissivity in the simulations is tuned to match , small differences emerge due to the random nature of the forward modelling. The values listed for here are the values derived from the maximum likelihood fit to the log-normal distribution, not the ones extracted directly from the measurements, which are more robust and given in Table 4.

There is also tension between this model and observations for low-opacity lines of sight at : in a QSO-driven scenario, low opacity regions arise purely due to proximity to a source quasar and are therefore not expected to disappear completely at high redshift, as long as the QSOs are active. This appears to be in contrast with observations which do report that transparent lines of sight go missing at . However, this problem is mitigated by

  1. the smaller observed sample size at these redshifts, which means the discovery of even 1 transparent line of sight could ease the conflict, and

  2. while the ionising emissivity is already tuned to observations, it might be sensible to rescale the predicted opacity by a larger factor to improve the agreement with data (see later).

We note that the volume density of rare bright sources in this model was explicitly chosen to reproduce the Lyman- CDFs of Becker et al. (2015b) which use ‘optimistic’ measurements i.e. following our leftmost CDF contour. It is conceivable that the model could be modified to reproduce a CDF closer to the mid-point of our contours. This will be discussed in more detail in the next section.

Figure 15: Evolution with redshift of the mean flux (right) and the skewness parameter (right) of Lyman- transmission. A rapid decrease in mean flux with increasing redshift is accompanied by an increase in skewness, as the distribution of fluxes is increasingly non-gaussian. Simulations are post-processed to match the mean flux measured from observations; the offset between observations and simulations at the highest redshift bins reflects the fact that the most likely intrinsic mean flux is lower than the measured mean flux.

6 Empirical Analytic Fit

The treatment of non-detections implemented above is conservative, but does not return a best fit or most likely distribution. Such a fit should ideally incorporate weighting of the observational constraints based on corresponding observational errors in a way where extra data, no matter how large the errors, should not make the fit worse. In order to achieve this, we perform a fully bayesian maximum likelihood analysis. This requires a parametrisation of the Lyman- flux distribution; we empirically find that the data is well fit by a scale-dependant lognormal distribution on the form,


The lognormal parameters can be recast into the more physically meaningful variables of mean flux and skewness parameter following


This parametrization has the advantage of making the tuning used in simulations explicit: in the simulations is chosen, at each redshift, to match the observations. The difference between data and models is then limited to the shape or skewness parameter , on which meaningful constraints can be obtained. A small value of indicates a roughly normal distribution, while tends to a log distribution.

For each set of parameters , the likelihood of the observations given those parameters is computed following


where is the normalised probability distribution of a measurement. This is taken to be a gaussian centered on the observed and with variance equal to the error on . represents the probability distribution of the intrinsic flux within a spectrum region and naturally weights the observations according to their observational errors. We implement the prior that the intrinsic flux is necessarily positive by defining only for . In this way, observations in which the mean flux is formally below the detection threshold, and even observations where the mean flux is formally negative, can be used as constraints on the underlying distribution of fluxes.

Following the likelihood ratio confidence bounds methods the best-fit parameters are found where the value of is highest, with 68%, 90% and 95% credibility regions found where , 0.2585 and 0.1465 respectively (see review in e.g. Andrae 2010). This has the useful feature of being insensitive to multiplicative rescaling of . The post-processed predictions from simulations are obtained by forward modelling 100 simulated lines of sight which then go through the same pipeline as the observations.

The posterior distribution on and , and predictions from post-processed simulations, are shown in Appendix B. The parameters are non-degenerate and the credible regions of parameter space are therefore well-defined. We marginalise over each parameter in turn by collapsing the likelihood matrix and computing new credible interval bounds. Both parameters increase smoothly with redshift, as shown in Figure 15. The average flux decreases steadily with increasing redshift over . While the UVB in simulations is explicitly tuned to match the mean flux, the mean flux recovered after post-processing the lines of sight is slightly lower than measured from observations. This most likely reflects a difference in the clustering of transmission in the simulations: if the transmission occurs in isolated spikes, the flux is more likely to go undetected after observational errors are taken into account (see eg. ?).

This decrease in average flux with redshift is accompanied by a smooth increase in the skewness of the distribution until , followed by a tentative decrease at . This correspond to an increasingly non-normal distribution of transmission at higher redshift. The decrease in skewness in the highest redshift bin reflects the fact that transparent lines of sight are not found at . However, no current spectrograph is capable of measuring opacities larger than , and using a log-normal form for residual flux distribution becomes increasingly less appropriate.

The Sherwood simulation and the model of Keating et al. (2017) shows a very modest increase with redshift, while only the model from Chardin et al. (2015) is displaying sufficient non-gaussianity. Table 5 gives the most likely values for and at all redshifts and for all post-processed models. The most likely intrinsic distribution of opacities is plotted in Fig 8 as a dashed line.

7 Discussion

Our results confirm the long-lasting inhomogeneity of the IGM opacity to Lyman- after the end of reionisation. This inhomogeneity is seen to persist as late at , which is perplexing since photon percolation is predicted to have happened by that redshift by all theoretical models. We confirm the increasing scatter of Lyman- opacity with higher redshift first reported by Becker et al. (2015b). In fact, previous studies may have been too optimistic in their treatments of non-detections of Lyman- flux. With a large enough sample of lines of sight, we found that the distribution of residual fluxes was aptly described by a lognormal distribution and were able to incorporate constraints from non-detections in a fully consistent way. This choice of distribution is purely empirical, as the lognormal is the most common distributing function for variables which are constrained to be positive, such as flux. Using this parametrisation, we extract a linear increase in mean opacity from at to at . It is remarkable, but perhaps not surprising, that all the numerical simulations we confronted with the data required a strongly redshift-dependent rescaling of the source emissivity to match this smooth increase in opacity.

The main caveat in this study, which is also related to a weak point in current numerical simulations, lies in the difficulty of ensuring we are measuring the opacity of the IGM itself as opposed to intervening DLA systems and other absorbers that are not part of the optically thin IGM. Finding these systems via the accompanying decrement to Lyman- flux is very difficult at the redshifts studied here given how strongly the Lyman- forest is already absorbed. We have explicitly removed all systems displaying C IV absorption in the quasar continuum redward of Lyman-. However, weakly ionised transitions such as Mg II and O I require far red and infrared spectra of reasonably high resolution and SNR, which we lack in many cases. Systems containing these ions often do not show C IV in absorption, and their numbers could potentially be increasing at high redshift (Becker et al., 2006; Bosman et al., 2017; Codoreanu et al., 2017). Nevertheless, one would need to discard the measurements obtained along the most opaque of our sample in order for the remaining lines of sight to match the Sherwood smooth-UVB prediction at , and this fraction rises to of the sample at . A smooth UVB at those redshifts is therefore confidently ruled out even in the presence of this caveat. At the same time, recent research has highlighted the crucial importance of including self-shielding effects for numerical simulations of reionisation (e.g. Madau 2017). Bridging this gap can therefore be done from different angles, as improved surveys will constrain the occurrence rates of low-mass systems in the epoch of reionisation to enable better removal, and numerical simulations become more refined.

8 Summary

We have assembled a sample of 62 optical spectra of quasars with in order to measure the distribution of IGM Lyman- transmission over . These objects consist of 13 SDSS-discovered quasars, 10 quasars from DES-VHS, 4 from the SHELLQs survey, 13 from online telescope archives, 19 from previous similar studies and 3 new spectra. The data originates from a total of 10 different optical spectrographs and have been collected over the course of the last 11 years. We use this unprecedented sample of high- quasars to improve the measurements of residual Lyman- transmission of Fan et al. (2006) and Becker et al. (2015b). The large variance in opacity among lines of sight has previously been shown to be incompatible with a uniform UVB. At the same time, unexpectedly large longitudinal correlations in opacity of up to 110 cMpc at mean that only a dramatic increase in number of background sources – and not a finer sampling – can aptly quantify cosmic variance (Becker et al., 2015b).

We confirm the existence of a long-lasting inhomogeneity in the Lyman- opacity of the IGM at , but also detect a significant departure from the opacity distribution expected for a spatially uniform UVB and temperature-density relation down to . If the data genuinely reflect large fluctuations in intergalactic opacity at such low redshifts it may present a significant further challenge to models of reionisation the post-reionisation IGM. We also extend our study to , finding increased opacity.

In order to deal with the disparate data quality in our sample and present limits in a transparent way, we then introduce a second bound on the CDF which is “maximally pessimistic”, i.e. non-detections are taken to mean that . This allows us to incorporate moderately deep data while being confident that the ‘true’ CDF lies in-between these two bounds; in other words our results present the region permitted by current data.

We further explore an empirical log-normal fit to the CDF, which is characterized by a mean opacity and a skewness. Both detections and non-detections are used to constrain the underlying shape of the opacity distribution. We find a linear increase in mean Lyman- opacity with redshift from at to at , accompanied by a smooth increase in skewness.

Altering the comoving size of the binning window produces only subtle effects on the distribution between the and cMpc cases at any redshift. A binning with cMpc significantly broadens the distribution of effective optical depths at . We also vary the length of the excluded “proximity region” which is affected by the source quasar itself, finding no effect at any redshift on the statistical distribution of transmitted flux as long as Å is adopted. The traditionally used value of Å is thus valid and we don’t expect significant contamination from the quasar proximity zones.

We compare our final results with outputs from three different published numerical models: the Sherwood simulation, which uses a spatially uniform UVB (Bolton et al., 2017); the radiative transfer post-processed simulation of Keating et al. (2017), which models temperature fluctuations arising from differences in the timing of reionisation; and a model including rare, bright sources from Chardin et al. (2015). Echoing previous works, we find that the data strongly disfavour the uniform UVB model and the radiation transfer model in their current forms. The rare sources model is marginally consistent with data at but not at . More work may be required to determine whether variations of these models may be more consistent with the present data.

In light of these results, the extreme scatter of Lyman- opacity at the tail end of reionisation remains a perplexing puzzle.


SEIB thanks Alberto Rorai for many productive discussions. The authors are grateful to Laura Keating, Jonathan Chardin, and Girish Kulkarni for kindly sharing the outputs of the radiative transfer, rare sources, and Sherwood simulations, respectively. The authors also thank Ian McGreer for agreeing to share reduced spectra from his 2015 paper.

This research has made use of the Keck Observatory Archive (KOA), which is operated by the W. M. Keck Observatory and the NASA Exoplanet Science Institute (NExScI), under contract with the National Aeronautics and Space Administration. Based on data obtained from the ESO Science Archive Facility. This work is based in part on observations made with ESO Telescopes at the La Silla Paranal Observatory under program IDs 084.A-0390, 096.A-0095 and 096.A-0411.

This paper used data obtained with the MODS spectrographs built with funding from NSF grant AST-9987045 and the NSF Telescope System Instrumentation Program (TSIP), with additional funds from the Ohio Board of Regents and the Ohio State University Office of Research.

Some of the data presented herein were obtained at the W. M. Keck Observatory, which is operated as a scientific partnership among the California Institute of Technology, the University of California and the National Aeronautics and Space Administration. The Observatory was made possible by the generous financial support of the W. M. Keck Foundation. The authors wish to recognize and acknowledge the very significant cultural role and reverence that the summit of Maunakea has always had within the indigenous Hawaiian community. We are most fortunate to have the opportunity to conduct observations from this mountain.

SEIB acknowledges support from the European Research Council Advanced Grant FP7/669253 as well as a Graduate Studentship from the Science and Technology Funding Coucil (STFC). Support by the ERC Advanced Grant Emergence – 32056 is gratefully acknowledged. XF acknowledges support from the U.S. NSF grant AST 15-15115. LJ acknowledges support from the National Key R&D Program of China (2016YFA0400703) and from the National Science Foundation of China (grant 11533001). GB acknowledges support from the NSF under grant AST-1615814.

Appendix A Mosaic of quasar catalog

In Figure A1 and A2 we plot all the quasar spectra used in this work in a common format. The spectra are normalised by dividing the flux by a best fit power-law. Wavelengths are divided by to bring the spectra into the rest frame. The y axis is calibrated so that it spans the range continuum for each quasar. We do not bin the spectra in order to reflect the diversity of data qualities present in the sample. Error arrays are shown in red.

Figure 16: First half of the quasar catalog. The origin of each spectrum and the instruments used are listed in Table 1. Wavelength runs from 1000 to 1550 Å and the fluxes have been normalised by dividing by the best-fit power law to the continuum.
Figure 17: Second half of the quasar catalog. Data is as in Fig 18.

Appendix B Posterior distribution of and

In Figure B1 we show the posterior distribution in parameter space for both the data and best fit values for the post-processed simulations. The results given throughout the paper are marginalised over either one of the two parameters. We plot the skewness on a logarithmic scale to emphasize the fact that the distribution is consistent with being ‘maximally skewed’ at , which following Equation 3 corresponds to a exponential distribution tending to infinity at . The 68% and 90% credible intervals are shown as concentric contours. The colored thick lines correspond to simulations from Chardin et al. (2015) (orange), Keating et al. (2017) (red) and Bolton et al. (2017) (blue), post-processed as described in Section 5.1.

Figure 18: Posterior distributions on the skewness and mean opacity of Lyman- transmission. Different contours correspond to redshift ranges of beginning at , following the direction indicated by the arrow. The colored thick lines correspond to simulations from Chardin et al. (2015) (orange), Keating et al. (2017) (red) and Bolton et al. (2017) (blue), post-processed to mimic observational data as described in Section 5.1.


  1. web address
  2. https://koa.ipac.caltech.edu/cgi-bin/KOA/nph-KOAlogin


  1. Andrae, R. 2010, ArXiv e-prints, arXiv:1009.2755
  2. Becker, G. D., Bolton, J. S., & Lidz, A. 2015a, PASA, 32, e045
  3. Becker, G. D., Bolton, J. S., Madau, P., et al. 2015b, MNRAS, 447, 3402
  4. Becker, G. D., Sargent, W. L. W., Rauch, M., & Carswell, R. F. 2012, ApJ, 744, 91
  5. Becker, G. D., Sargent, W. L. W., Rauch, M., & Simcoe, R. A. 2006, ApJ, 640, 69
  6. Bolton, J. S., Puchwein, E., Sijacki, D., et al. 2017, MNRAS, 464, 897
  7. Bosman, S. E. I., Becker, G. D., Haehnelt, M. G., et al. 2017, MNRAS, 470, 1919
  8. Buzzoni, B., Delabre, B., Dekker, H., et al. 1984, The Messenger, 38, 9
  9. Carilli, C. L., Wang, R., Fan, X., et al. 2010, ApJ, 714, 834
  10. Carnall, A. C., Shanks, T., Chehade, B., et al. 2015, MNRAS, 451, L16
  11. Chardin, J., Haehnelt, M. G., Aubert, D., & Puchwein, E. 2015, MNRAS, 453, 2943
  12. Chardin, J., Haehnelt, M. G., Bosman, S. E. I., & Puchwein, E. 2018, MNRAS, 473, 765
  13. Chen, S.-F. S., Simcoe, R. A., Torrey, P., et al. 2016, ArXiv e-prints, arXiv:1612.02829
  14. Codoreanu, A., Ryan-Weber, E. V., Crighton, N. H. M., et al. 2017, ArXiv e-prints, arXiv:1708.00304
  15. D’Aloisio, A., McQuinn, M., & Trac, H. 2015, ApJ, 813, L38
  16. Davies, F. B., & Furlanetto, S. R. 2016, MNRAS, 460, 1328
  17. Davies, F. B., Hennawi, J. F., Eilers, A.-C., & Lukić, Z. 2017, ArXiv e-prints, arXiv:1703.10174
  18. D’Odorico, V., Cupani, G., Cristiani, S., et al. 2013, MNRAS, 435, 1198
  19. Eilers, A.-C., Davies, F. B., Hennawi, J. F., et al. 2017, ApJ, 840, 24
  20. Fan, X., Strauss, M. A., Schneider, D. P., et al. 1999, AJ, 118, 1
  21. Fan, X., White, R. L., Davis, M., et al. 2000, AJ, 120, 1167
  22. Fan, X., Narayanan, V. K., Lupton, R. H., et al. 2001, AJ, 122, 2833
  23. Fan, X., Strauss, M. A., Schneider, D. P., et al. 2003, AJ, 125, 1649
  24. Fan, X., Hennawi, J. F., Richards, G. T., et al. 2004, AJ, 128, 515
  25. Fan, X., Strauss, M. A., Becker, R. H., et al. 2006, AJ, 132, 117
  26. Fukugita, M., Ichikawa, T., Gunn, J. E., et al. 1996, AJ, 111, 1748
  27. Goto, T. 2006, MNRAS, 371, 769
  28. Gunn, J. E., & Peterson, B. A. 1965, ApJ, 142, 1633
  29. Haardt, F., & Madau, P. 2012, ApJ, 746, 125
  30. Hogg, D. W., Finkbeiner, D. P., Schlegel, D. J., & Gunn, J. E. 2001, AJ, 122, 2129
  31. Hook, I. M., Jørgensen, I., Allington-Smith, J. R., et al. 2004, PASP, 116, 425
  32. Horne, K. 1986, PASP, 98, 609
  33. Jiang, L., Fan, X., Vestergaard, M., et al. 2007, AJ, 134, 1150
  34. Jiang, L., McGreer, I. D., Fan, X., et al. 2015, AJ, 149, 188
  35. Jiang, L., Fan, X., Annis, J., et al. 2008, AJ, 135, 1057
  36. Jiang, L., Fan, X., Bian, F., et al. 2009, AJ, 138, 305
  37. Jiang, L., McGreer, I. D., Fan, X., et al. 2016, ApJ, 833, 222
  38. Kaiser, N., Burgett, W., Chambers, K., et al. 2010, in Proc. SPIE, Vol. 7733, Ground-based and Airborne Telescopes III, 77330E
  39. Kashikawa, N., Aoki, K., Asai, R., et al. 2002, PASJ, 54, 819
  40. Keating, L. C., Haehnelt, M. G., Cantalupo, S., & Puchwein, E. 2015, MNRAS, 454, 681
  41. Keating, L. C., Puchwein, E., & Haehnelt, M. G. 2017, ArXiv e-prints, arXiv:1709.05351
  42. Kelson, D. D. 2003, PASP, 115, 688
  43. KOA. 2017, W. M. Keck Observatory Archive
  44. Kolmogorov, A. 1933, Inst. Ital. Attuari, Giorn., 4, 83
  45. Lawrence, A., Warren, S. J., Almaini, O., et al. 2007, MNRAS, 379, 1599
  46. Lidz, A., Oh, S. P., & Furlanetto, S. R. 2006, ApJ, 639, L47
  47. Madau, P. 2017, ArXiv e-prints, arXiv:1710.07636
  48. Marshall, J. L., Burles, S., Thompson, I. B., et al. 2008, in Proc. SPIE, Vol. 7014, Ground-based and Airborne Instrumentation for Astronomy II, 701454
  49. Matsuoka, Y., Onoue, M., Kashikawa, N., et al. 2016, ApJ, 828, 26
  50. —. 2017, ArXiv e-prints, arXiv:1704.05854
  51. McGreer, I. D., Mesinger, A., & D’Odorico, V. 2015, MNRAS, 447, 499
  52. McGreer, I. D., Mesinger, A., & Fan, X. 2011, MNRAS, 415, 3237
  53. McMahon, R. G., Banerji, M., Gonzalez, E., et al. 2013, The Messenger, 154, 35
  54. McQuinn, M., Hernquist, L., Lidz, A., & Zaldarriaga, M. 2011, MNRAS, 415, 977
  55. Mesinger, A. 2010, MNRAS, 407, 1328
  56. Miyazaki, S., Komiyama, Y., Nakaya, H., et al. 2012, in Proc. SPIE, Vol. 8446, Ground-based and Airborne Instrumentation for Astronomy IV, 84460Z
  57. Morganson, E., De Rosa, G., Decarli, R., et al. 2012, AJ, 143, 142
  58. Mortlock, D. J., Patel, M., Warren, S. J., et al. 2009, A&A, 505, 97
  59. Mortlock, D. J., Warren, S. J., Venemans, B. P., et al. 2011, Nature, 474, 616
  60. Planck Collaboration, Ade, P. A. R., Aghanim, N., et al. 2016, A&A, 594, A13
  61. Pogge, R. W., Atwood, B., O’Brien, T. P., et al. 2012, in Proc. SPIE, Vol. 8446, Ground-based and Airborne Instrumentation for Astronomy IV, 84460G
  62. Reed, S. L., McMahon, R. G., Banerji, M., et al. 2015, MNRAS, 454, 3952
  63. Reed, S. L., McMahon, R. G., Martini, P., et al. 2017, MNRAS, 468, 4702
  64. Ryan-Weber, E. V., Pettini, M., Madau, P., & Zych, B. J. 2009, MNRAS, 395, 1476
  65. Sheinis, A. I., Bolte, M., Epps, H. W., et al. 2002, PASP, 114, 851
  66. Simcoe, R. A., Cooksey, K. L., Matejek, M., et al. 2011, ApJ, 743, 21
  67. Songaila, A. 2004, AJ, 127, 2598
  68. Springel, V. 2005, MNRAS, 364, 1105
  69. The Dark Energy Survey Collaboration. 2005, ArXiv Astrophysics e-prints, astro-ph/0510346
  70. Venemans, B. P., McMahon, R. G., Warren, S. J., et al. 2007, MNRAS, 376, L76
  71. Venemans, B. P., Findlay, J. R., Sutherland, W. J., et al. 2013, ApJ, 779, 24
  72. Venemans, B. P., Verdoes Kleijn, G. A., Mwebaze, J., et al. 2015, MNRAS, 453, 2259
  73. Vernet, J., Dekker, H., D’Odorico, S., et al. 2011, A&A, 536, A105
  74. Wang, F., Wu, X.-B., Fan, X., et al. 2016, ApJ, 819, 24
  75. Willott, C. J., Delorme, P., Omont, A., et al. 2007, AJ, 134, 2435
  76. Willott, C. J., Delorme, P., Reylé, C., et al. 2009, AJ, 137, 3541
  77. —. 2010, AJ, 139, 906
  78. Wu, X.-B., Wang, F., Fan, X., et al. 2015, Nature, 518, 512
  79. Wyithe, J. S. B., & Bolton, J. S. 2011, MNRAS, 412, 1926
  80. York, D. G., Adelman, J., Anderson, Jr., J. E., et al. 2000, AJ, 120, 1579
Comments 0
Request Comment
You are adding the first comment!
How to quickly get a good reply:
  • Give credit where it’s due by listing out the positive aspects of a paper before getting into which changes should be made.
  • Be specific in your critique, and provide supporting evidence with appropriate references to substantiate general statements.
  • Your comment should inspire ideas to flow and help the author improves the paper.

The better we are at sharing our knowledge with each other, the faster we move forward.
The feedback must be of minimum 40 characters and the title a minimum of 5 characters
Add comment
Loading ...
This is a comment super asjknd jkasnjk adsnkj
The feedback must be of minumum 40 characters
The feedback must be of minumum 40 characters

You are asking your first question!
How to quickly get a good answer:
  • Keep your question short and to the point
  • Check for grammar or spelling errors.
  • Phrase it like a question
Test description