Simulating and interpreting deep Jwst/NIRSpec observations in the Hubble Ultra Deep Field
The James Webb Space Telescope (JWST) will enable the detection of optical emission lines in galaxies spanning a broad range of luminosities out to redshifts . Measurements of key galaxy properties, such as star formation rate and metallicity, through these observations will provide unique insight into, e.g., the role of feedback from stars and active galactic nuclei (AGNs) in regulating galaxy evolution, the co-evolution of AGNs and host galaxies, the physical origin of the ‘main sequence’ of star-forming galaxies and the contribution by star-forming galaxies to cosmic reionization. We present an original framework to simulate and analyse observations performed with the Near Infrared Spectrograph (NIRSpec) on board JWST. We use the \oldtextscbeagle tool (BayEsian Analysis of GaLaxy sEds) to build a semi-empirical catalogue of galaxy spectra based on photometric spectral energy distributions (SEDs) of dropout galaxies in the Hubble Ultra Deep Field (HUDF). We demonstrate that the resulting catalogue of galaxy spectra satisfies different types of observational constraints on high redshift galaxies, and use it as input to simulate NIRSpec/prism () observations. We show that a single ‘deep’ ( ks) NIRSpec/prism pointing in the HUDF will enable detections of multiple optical emission lines in () galaxies at () down to AB mag. Such observations will allow measurements of galaxy star formation rates, ionization parameters and gas-phase metallicities within factors of 1.5, mass-to-light ratios within a factor of 2, galaxy ages within a factor of 3 and -band attenuation optical depths with a precision of 0.3.
keywords:galaxies: evolution – galaxies: formation – galaxies: ISM – HII regions – dark ages, reionization, first stars – telescopes
#1\NAT@spacechar\@citeb\@extra@b@citeb\NAT@date \@citea\NAT@nmfmt\NAT@nm\NAT@spacechar\NAT@@open#1\NAT@spacechar\NAT@hyper@\NAT@date \LetLtxMacro\oldtextsc
Our picture of the formation and evolution of galaxies across cosmic time has improved significantly in the past 15 years. Galaxy surveys have provided researchers with a wealth of spectro-photometric data, at both low (e.g. SDSS, York et al. 2000) and high (e.g., GOODS, Stanway et al. 2003; GLARE, Stanway et al. 2004, 2007; COSMOS, Scoville et al. 2007; HUDF, Bunker et al. 2004; Beckwith et al. 2006; Bouwens et al. 2010; Bunker et al. 2010; McLure et al. 2010; Ellis et al. 2013; Illingworth et al. 2013; CANDELS, Grogin et al. 2011; Koekemoer et al. 2011; VUDS, Le Fèvre et al. 2015; VANDELS, McLure et al. 2017) redshifts. On the theoretical front, computer simulations of galaxy formation (e.g. EAGLE, Schaye et al. 2015; IllustrisTNG, Pillepich et al. 2017) can now accurately predict several key galaxy properties, such as the redshift evolution of the galaxy stellar mass function and star formation rate density (e.g. Genel et al. 2014; Furlong et al. 2015). Yet, many details of the physical processes driving the evolution of galaxies remain unknown. AGN feedback is a key ingredient of galaxy formation models (see Somerville & Davé 2015 and references therein), but observational evidence for AGN-driven ‘quenching’ of star formation is ambiguous (e.g. Carniani et al. 2016; Suh et al. 2017). Similarly, although several correlations exist between the physical properties of AGNs and their host galaxies (see Kormendy & Ho 2013 and references therein), a causal connection among these properties indicating a ‘co-evolution’ of galaxies and AGNs has not been clearly demonstrated. The apparent existence of a tight relation between galaxy masses and star formation rates (i.e., ‘main sequence’, Brinchmann et al. 2004; Noeske et al. 2007), and perhaps of a more ‘fundamental’ relation involving gas-phase metallicity as well (Mannucci et al. 2010), can be an indication of self-regulation of star formation within galaxies (e.g. Lilly et al. 2013), but the significance and redshift evolution of these relations remains unclear (e.g. Yabe et al. 2015; Telford et al. 2016). The increasing relative abundance of UV-faint galaxies at redshift (e.g. Bouwens et al. 2015a; Finkelstein et al. 2015) would support a picture in which low-mass star-forming galaxies provide the bulk of H-ionizing (LyC) photons required for cosmic reionization (e.g. Wilkins et al. 2011; Finkelstein et al. 2012; Robertson et al. 2013; Bouwens et al. 2015b), but this conclusion relies on several assumptions about the ionizing emissivity of galaxies and escape fraction of LyC photons from galaxies.
Advancing our understanding of the above processes, and of many others, that drive the evolution of galaxies from the reionization epoch to the present day, requires measuring physical properties – such as stellar masses, star formation rates, stellar and gas metallicities, stellar ages, gas ionization state, the dynamics of gas and stars – for large samples of galaxies, across a broad range of redshifts and galaxy stellar masses. While multi-band photometric campaigns have collected high quality SEDs for thousands of galaxies spanning a wide range of masses out to (e.g. HUDF, CANDELS), the availability of spectroscopic observations at is much more limited. At those redshifts, the strongest optical emission lines (e.g. [O ii], H, [O iii], H) are shifted to near infrared wavelengths, where observations with ground-based telescopes are challenging because of regions of low atmospheric transmittance, bright sky background and contamination from bright, variable sky lines. Optical emission lines have been measured out to for large samples of galaxies using low-resolution slitless spectroscopy with HST (e.g. WISP, Atek
et al. 2010; 3D-HST, Brammer
et al. 2012; GLASS, Treu
et al. 2015; FIGS, Pirzkal
et al. 2017). Higher-resolution spectroscopic surveys from the ground (e.g. VUDS, VANDELS) have mainly relied on multi-object spectrographs operating at optical wavelengths (e.g. VIMOS at VLT, Le Fèvre
et al. 2003), hence are limited, at high redshift, to the measurement of rest-frame UV emission lines. These lines are intrinsically weaker than the optical ones (e.g., see table 5 of Steidel et al. 2016), and their interpretation is complicated by radiative transfer effects, dust attenuation and potential contamination from outflows.
In the near future, the limitations of existing observatories to study the high redshift Universe will largely be overcome by the launch of the James Webb Space Telescope (JWST, Gardner et al. 2006). JWST is expected to revolutionise our view of galaxies at , and of low-mass galaxies at , by providing a wide range of imaging and spectroscopic capabilities in the wavelength range µm. In particular, the Near Infrared Spectrograph (NIRSpec, Bagnasco et al. 2007; Birkmann et al. 2016) onboard JWST will increase by more than an order of magnitude the emission line sensitivity in the wavelength region µm covered by existing ground-based instruments, while reaching an even greater sensitivity at wavelengths µm inaccessible with existing observatories. The multi-object spectroscopic capabilities of NIRSpec will enable the simultaneous measurement of up to galaxy spectra, allowing the observation of standard optical emission lines for large samples of galaxies out to (see Section 2.4). Combined with stellar mass measurements based on JWST/NIRCam imaging (Near Infrared Camera, Horner & Rieke 2004), this will provide researchers with unique data to constrain the physical processes driving the evolution of galaxies at , the formation of the first galaxies at , the way cosmic reionization proceeded in space and time and the contribution of different sources to the cosmic reionization budget.
The uniqueness of JWST/NIRSpec, however, also poses new challenges for the planning and interpretation of observations of high redshift galaxies. Unlike ground-based multi-object spectrographs, which typically require the user to define a ‘mask’ of apertures on the sky, NIRSpec is equipped with a micro-shutter array (MSA, Kutyrev et al. 2008) providing a fixed grid of apertures on the sky. Optimising the NIRSpec MSA usage for a given scientific goal therefore requires a careful prioritization of the targets, as well as calculations to study spectral overlaps and truncations. The different filters and dispersers available on NIRSpec, covering the wavelength range µm and spectral resolutions , and , provide complementary information, but the optimal choice of filters, dispersers and exposure times depends on the scientific case, target properties and redshift range. Given the paucity of high quality galaxy spectra at , NIRSpec data will open a largely unexplored space of observables, for both emission lines and stellar continuum studies. Maximising the information extracted from such data therefore requires models and approaches adapted to describing galaxies with a wide range of stellar populations and interstellar medium properties, likely extending well beyond those measured with existing observatories. Having adequate models and simulations is therefore critical, especially for the planning of Cycle 1 and Cycle 2 observations, as these will be largely based on spectroscopic follow-up of existing, pre-JWST imaging data.
In this paper, we build a physically-motivated, semi-empirical framework to simulate integrated galaxy spectra as could be observed at low spectral resolution with JWST/NIRSpec in the Hubble Ultra Deep Field (HUDF). We then use these simulated spectra to study our ability to constrain different galaxy physical parameters, such as star formation rates, ages, mass-to-light ratios, metallicities and properties of dust attenuation and ionised gas, for objects at redshift . We adopt a self-consistent physical model which accounts for stellar and ionised gas emission, integrated in the state-of-the-art \oldtextscbeagle tool (Chevallard & Charlot 2016), to generate the input mock catalog of galaxy spectra and to fit the simulated observed spectra to retrieve the galaxy properties.
In Section 2, we describe the catalogue of dropouts of Bouwens et al. (2015a) and our approach to fit the UV-to-near infrared photometry of these sources with the \oldtextscbeagle tool. We also present our method for associating model spectra with each dropout galaxy, and demonstrate how such a method produces spectra consistent with several, independent observables. In Section 3 we present our simulations of JWST/NIRSpec observations and the full-spectrum fitting of these simulations with \oldtextscbeagle. In Section 4 we examine the constraints on different galaxy physical parameters obtained by fitting the simulated NIRSpec data. In Section 5 we discuss our results in the light of NIRSpec observational programs, in particular of the NIRSpec Guaranteed Time Observations (GTO) program, and future extensions of this work to cover a broader range of NIRSpec observing modes. Finally, we summarize our conclusions in Section 6.
Throughout the paper, we express magnitudes in the AB system, adopt a zero-age solar metallicity (corresponding to a present-day metallicity of 0.01524, see Table 3 of Bressan et al. 2012) and the latest constraints on cosmological parameters from Planck, i.e. , and (see last column ‘TT,TE,EE+lowP+lensing+ext’ of Table 4 of Planck Collaboration et al. 2015). All the emission line equivalent widths in the text refer to rest-frame values.
2 A semi-empirical catalogue of galaxy spectra in the HUDF
In this section we detail our approach to create a semi-empirical catalogue of high-redshift galaxy spectra. This catalogue is constructed by matching the predictions of a spectral evolution model to a large sample of galaxies with multi-band HST photometry, and represents the first of several steps involved in the process of creating and interpreting simulated NIRSpec observations. The adoption of a semi-empirical approach is motivated by the need to minimize the model-dependence of our analysis, decoupling it from a particular theoretical approach (e.g. hydro-dynamic simulation vs semi-analytic model) and its specific implementation. Also, anchoring our simulations to existing HST photometry allows us to create and study NIRSpec observations which more closely resemble those that will be obtained early on in the JWST mission, i.e. based on HST-selected targets. Adopting a purely empirical approach based on existing spectroscopic observations, on the other hand, is not viable because of the widely different wavelength coverage and sensitivity of current observatories compared to JWST/NIRSpec. Existing spectroscopic surveys targeting high redshift galaxies are often limited to observe wavelengths µm (e.g. VVDS, Le Fèvre et al. 2013, VUDS, VANDELS), while surveys extending to µm (e.g. MOSDEF) target relatively bright () galaxies.
2.1 Multi-band Hst photometry of dropout galaxies in the HUDF
In this work, we focus on galaxies at redshift , for which JWST/NIRSpec will enable measurements of standard optical emission lines (e.g. H, [O iii], H, [N ii], [S ii]), which are largely inaccessible with existing ground-based telescopes. As shown by the initial works of Steidel et al. (1996) and Madau et al. (1996), high redshift star-forming galaxies can be effectively selected from broad-band photometric data by using the Lyman break ‘dropout’ technique. This technique exploits the (nearly) complete absorption by neutral hydrogen of any light emitted by a galaxy blue-ward the Lyman limit (912 Å). As the Lyman limit is redshifted to redder wavelengths with increasing redshift, this makes an object become undetectable (‘drop-out’) from a given optical/near-infrared band. Several groups over the years have used this technique to identify high- star-forming galaxies, initially using optical data from the HST Advanced Camera for Surveys (ACS) to select objects out to (e.g. Steidel et al. 1999; Dickinson et al. 2004; Bunker et al. 2004), and later exploiting near-infrared observations with the Wide Field Camera 3 (WFC3) to obtain large samples of galaxies (e.g. Wilkins et al. 2010; Bouwens et al. 2011; McLure et al. 2013; Schenker et al. 2013).
Here we use the high- galaxy candidates selected with the dropout technique by Bouwens et al. (2015a) in the Hubble Ultra Deep Field (HUDF). Since we are interested in galaxies out to the highest redshifts, we only consider sources selected from the 4.7 region in the HUDF with deep HST/WFC3 near-infrared observations. We adopt the multi-band HST catalogue of Illingworth et al. (2013), built from a combination of all available HST/ACS and WFC3 observations in the HUDF (for a complete list see table 2 of Illingworth et al. 2013), and refer to this as the ‘eXtreme Deep Field’ (XDF) catalogue. The XDF catalogue includes observations in 9 bands, 5 in the optical, based on the ACS/WFC filters F435W, F606W, F775W, F814W and F850LP, and 4 in the near-infrared, taken with the WFC3 filters F105W, F125W, F140W and F160W. The depth of a point source is , computed within a circular aperture of 0.35 arcsec diameter, while the combined image used for the source extraction reaches a depth of 31.2 within the same circular aperture (Illingworth et al. 2013). We do not use Spitzer/IRAC data, since the vast majority of our galaxies are too faint to be detected in existing Spitzer images.
We consider dropout galaxies in the filters (), (), (), () and (), which we will indicate as , , , and dropouts in the remainder of this paper. Details on the Lyman Break selection in each band can be found in sec 3.2 of Bouwens et al. (2015a) (see also their table 2). Fig. 1 shows the F160W magnitude distribution of galaxies selected in the different dropout bands. The vast majority ( per cent) of the galaxies at ( dropouts and above) have , while the 50 per cent completeness magnitude varies between a minimum of (F105W filter, dropouts) to a maximum of (F160W filter, dropouts). The adopted Lyman Break selection produces, for each dropout band, the redshift distributions shown in Fig. 2. Galaxy redshifts are centred around the average values reported above, with Full Width Half Maximum (FWHM) of the redshift distribution set by the filter widths, i.e. for the , , and dropouts, increasing to for the dropouts. The absolute UV magnitude of the sources, computed from the median magnitude in each dropout class, assuming a flat spectrum, and using the central redshift of each dropout band, varies from ( dropouts), to (), (), (), ().
2.2 Broad-band SED fitting of high-redshift galaxies in the HUDF
|\arraybackslash Parameter||Prior||Description||XDF photometry||NIRSpec spectra|
|\arraybackslash||\arraybackslashUniform||\arraybackslashStellar mass (does not account for fraction of mass returned to the ISM by stellar mass loss)||\arraybackslash✵||\arraybackslash✵|
|\arraybackslash||\arraybackslashUniform||\arraybackslashTime scale of star formation in a SFH described by a delayed exponential function.||\arraybackslash✵||\arraybackslash✵|
|\arraybackslash||\arraybackslashGaussian truncated||\arraybackslashAge of onset of star formation in the galaxy||\arraybackslash✵||\arraybackslash✵|
|\arraybackslash||\arraybackslashGaussian truncated||\arraybackslashStar formation rate averaged over the last yr||\arraybackslash✵||\arraybackslash✵|
|\arraybackslash||\arraybackslashUniform||\arraybackslashEffective gas ionization parameter||\arraybackslash✵||\arraybackslash✵|
|\arraybackslash||\arraybackslashUniform||\arraybackslashDust-to-metal mass ratio||\arraybackslash✵||\arraybackslash✵|
|\arraybackslash||\arraybackslashExponential truncated||\arraybackslash-band attenuation optical depth||\arraybackslash✵||\arraybackslash✵|
|\arraybackslash||\arraybackslashUniform||\arraybackslashFraction of -band attenuation arising in the diffuse ISM||\arraybackslash||\arraybackslash✵|
Similarly to Chevallard &
Charlot (2016, see their sec 4), we use the \oldtextscbeagle tool to fit the XDF photometry of 715 dropout galaxies with a self-consistent physical model that includes stellar emission, continuum+line emission from H ii regions and diffuse ionized gas and dust attenuation. We do not model the emission from an AGN potentially contaminating the HST photometry, as the expected number of type-1 AGNs in the 4.7 field here considered is consistent with zero (e.g. see section 4.2.5 of Grazian
et al. 2015). We let the redshift free to vary, and adopt a two-component star formation history constituted by a ‘smooth’ function and a burst. The smooth component is described by a delayed exponential function , where is the star formation timescale and the lookback time, which varies from yr to , the age of onset of star formation in the galaxy.
The burst (with constant star formation rate) covers the last of star formation, the timescale over which per cent of the H-ionizing photons are emitted (e.g. Charlot &
Fall 1993; Binette et al. 1994), and it is parametrised in terms of the ‘current star formation rate’ , i.e. the SFR averaged over the past 10 Myr. Decoupling the ‘past’ star formation history and the ‘current’ star formation rate allows us to obtain galaxy spectra with any contribution of emission lines relative to stellar continuum. We fix the maximum redshift for the formation of the first stars in a galaxy at , so that at any redshift the maximum allowed age for the onset of star formation is , where and refer to the age of the Universe at redshift and , respectively. We approximate the distribution of stellar metallicities in a galaxy, including the metallicity of young stars (with ages yr, ) and the interstellar metallicity (), with a single metallicity , i.e. .
Following the Bayesian approach adopted in \oldtextscbeagle, we define the posterior probability distribution of the model free parameters as
where indicates the data, the adopted model, the prior distribution and the likelihood function. We adopt independent priors for all parameters, uniform for the parameters , , , , and , Gaussian for and and exponential for (see Table 1).
where the summation index runs over all observed bands (i.e., bands with positive errors, and negative or positive fluxes), is the observed flux, , where is the observational error and is an additional error term that we add to avoid obtaining results dominated by systematic uncertainties (see sec. 4.2 of Chevallard & Charlot 2016) and indicates the fluxes predicted by our model for a set of parameters .
We adopt the Nested Sampling algorithm (Skilling et al. 2006) as implemented in \oldtextscmultinest (Feroz et al. 2009) to sample the posterior probability distribution of the 9 free parameters . It is worth briefly pausing to discuss the potential risk of over-fitting our data by using such a flexible physical model. The 9 photometric bands used in the fitting cover the wavelength range , hence they mainly probe the rest-frame UV emission of galaxies, and only bands redward the Lyman break (4 bands for the dropouts) provide constraints on the galaxy physical properties. In star-forming galaxies, the UV continuum emission is mostly sensitive to the recent ( yr) star formation history and to dust attenuation, while other parameters, such as the mass of older stars and the physical conditions of gas, have little influence on the emission at these wavelengths. Since in this work we aim at simulating NIRSpec observations covering a large variety of intrinsic galaxy spectra, extending beyond those observed in relatively bright galaxies at , we must also account for the variation of model parameters largely unconstrained by existing photometric observations. For this reason we adopt the flexible, 9-parameters model described above, and combine the weak constraints provided by HST photometry on some model parameters with well established relations among galaxy physical quantities to obtain physically-motivated combinations of parameters (see Section 2.3). We then validate this approach by comparing our model predictions with external (photometric and spectroscopic) data-sets at redshift (Section 2.4).
The results of the \oldtextscbeagle fitting of the XDF data are summarised in Fig. 2 in terms of the stacked posterior probability distribution of redshift. This is computed by combining the redshift probability distribution of each galaxy using a kernel-density estimator. Similarly to an histogram, Fig. 2 represents a density plot, where the solid curves depend on both the density of galaxies at each redshift and on the redshift probability distribution of each individual source. Fig. 2 provides an interesting visual comparison among the redshift distributions obtained by Bouwens et al. (2015a) from the analysis of Monte Carlo simulations of artificial sources (see their section 4.1) and those computed in this work. Given the very different approaches adopted to compute the two sets of lines of Fig. 2, the figure highlights a good agreement between the two methods. The differences among the two approaches are larger for the dropouts, for which the peaks of the two distributions are separated by (), possibly because of the low constraining power of data for the dropouts (low , few bands red-ward the continuum break) combined with a low number of sources ().
2.3 Linking model spectra and observed galaxies in the HUDF
The \oldtextscbeagle tool provides us with the posterior probability distribution of the 9 free parameters of the model (reported in column 5 of Table 1), along with sets of selected observables (e.g., full spectrum, absolute and apparent magnitudes) and derived quantities (e.g. rates of H- and He-ionizing photons). These are obtained by sampling the posterior probability distribution of model parameters using the \oldtextscmultinest algorithm (see section 3.3 of Chevallard & Charlot 2016 for more information on the \oldtextscbeagle output). From a purely statistical perspective, we could associate with a given XDF source any spectra from the set drawn using \oldtextscmultinest in this way. However, since the data are consistent with a broad range of model spectra, this approach would not ensure that these match other, independent observables, such as galaxy colours at longer wavelengths, or emission line measurements. The latter is particularly relevant for XDF galaxies, which exhibit faint continuum emission and will likely be only detected via emission lines by JWST/NIRSpec. By obtaining a physically-motivated distribution of emission line strengths, we can produce simulated NIRSpec observations with realistic ratios. This, in turn, should guarantee that the constraints on the galaxy physical parameters derived in Section 3 can be applied to future ‘deep’ NIRSpec observations of HST-selected high-redshift galaxies.
After some experimentation, we find that imposing model parameters to follow a redshift-dependent relation between mass and star formation rate (the so-called ‘main sequence’ of star-forming galaxies) produces observables in agreement with different types of observations. In practice, we enforce this relation by multiplying the posterior probability distribution of equation 2.1 by a ‘weight’ function which depends on the stellar mass and star formation rate. In order to more densely populate the tails of mass–star formation relation, we adopt a Student’s-t weight function with 3 degrees of freedom, which is significantly broader than a Gaussian function. The adopted functional form is
where is the standardised variable
and, as before, indicates the star formation rate. We fix the scatter (see, e.g., Speagle et al. 2014; Shivaei et al. 2015), and adopt the (redshift-dependent) mass-star formation rate relation of Speagle et al. (2014) (their equation 28) to compute (below expressed as a function of time instead of redshift )
where indicates the galaxy stellar mass. Equations (2.3)–(2.5) provide the conditional distribution of star formation rate given the galaxy stellar mass and redshift, . This conditional distribution does not depend on the data , hence it can be readily interpreted as a conditional prior distribution linking the three parameters star formation rate, stellar mass and redshift.
It is well known that the photometric degeneracy between the Lyman and Balmer breaks can make a galaxy photometric SED be equally well described by two different models with widely different redshifts (e.g. Ilbert et al. 2006). As the \oldtextscbeagle tool can identify these multiple redshift solutions, in this work we restrict the analysis to redshifts consistent with the expected dropout redshift of Bouwens et al. (2015a). To achieve this, among the combinations of model parameters obtained with \oldtextscbeagle, we only consider those satisfying the condition , where for the dropouts, for the , for the and for the and ones (Fig. 2). We then randomly draw, for each galaxy, a set of parameters among those sampled by \oldtextscmultinest, where the probability of drawing any set is proportional to the re-weighted posterior probability distribution. We can repeat the random draw (with replacement) times to obtain model spectra consistent with the observed HST photometry, and corresponding to combinations of satisfying the adopted mass-star formation rate relation. This is particularly relevant to evaluate selection effects and to study the consequences of deriving population-wide relations among physical parameters (e.g. mass-metallicity, mass-star formation rate) from a finite number of observations. Both applications will be part of successive works, while in the remainder of this paper we consider a single random draw to associate a model spectra to each XDF source, and refer to the ensemble of 715 model spectra as a ‘mock catalogue’.
We show in Fig. 3 by means of (grey) density contours on a logarithmic scale the two-dimensional stacked posterior probability distribution of stellar mass and star formation rate , along with the pairs drawn from the re-weighted posterior probability distribution (circles of different colours). By analogy with the one-dimensional stacked posterior probability distribution of redshift (see Section 2.2), we compute the two-dimensional stacked posterior probability distribution of and summing the individual probability distributions computed with a kernel-density estimator. Fig. 3 shows the effect of varying in our model both the current star formation rate and total stellar mass when only rest-frame UV observations are available. The observed HST photometry of a large fraction of XDF galaxies can be reproduced equally well by our model for widely different combinations of and , which makes the stacked distribution occupy a broad region of Fig. 3. We note that the sharp upper envelope created by the coloured circles in Fig. 3 is caused by the fixed duration (10 Myr) of the current burst of constant SF, which limits the maximum attainable at fixed .
Fig. 4 shows the distribution of the main model free parameters of the mock catalogue. The low constraining power of HST photometry with respect to most model parameters makes these distributions reflect the adopted priors, i.e. Gaussian for , exponential for and uniform for the other parameters (see Table 1). Also, no strong correlations are visible except for the relation between and that we enforced. It is worth briefly discussing the absence in Fig. 4 of a correlation between mass and metallicity, and metallicity and ionization parameter. The mass–metallicity relation is well-established at low redshift (e.g. from SDSS data, Tremonti et al. 2004), while observations targeting galaxy populations similar to those used in our analysis, i.e. faint high-redshift galaxies, depict a less clear picture. Recent results from the VUDS survey suggest that highly star-forming galaxies with stellar masses comparable to those of our mock galaxies () span a broad range of gas-phase metallicities at fixed stellar mass (Calabrò et al. 2017). Similar results have been obtained by Izotov et al. (2015) and Jimmy et al. (2015) from the analysis of low-mass local ‘analogues’ of high redshift galaxies. These findings can have a physical origin, since the metal content of star-forming gas results from the complex interaction between enrichment from previous stellar generations, AGN and stellar feedback and gas accretion, or can be driven by systematic uncertainties in the metallicity estimators (e.g. Kewley & Ellison 2008).
Similarly, an anti-correlation between gas-phase metallicity and ionization parameter has been observed at low redshift (Carton et al. 2017), and can be expected on theoretical grounds (e.g. Dopita et al. 2006), but the redshift evolution and dispersion of the relation for galaxy populations as those considered in this work are largely unknown. While future observations may show that certain combinations of model parameters (e.g. high metallicity and large ionization parameter) are not well suited to describe the conditions of photoionized gas in high-redshift star-forming galaxies, we have decided not to impose any relation between these parameters in our mock galaxy catalogue. This will enable users to adopt any relation between mass–metallicity and metallicity–ionization parameter to select sub-samples of sources from the different realisations of the mock catalogue.
2.4 Comparison of model spectra with independent data
Our approach to build a catalogue of mock galaxy spectra guarantees, by construction, that these match the HST photometry of XDF dropouts. In this section, we compare the mock spectra with other observables, namely near infrared photometry from the Infrared Array Camera (IRAC) onboard the Spitzer Space Telescope, and measurements of the [C iii]907+C iii]909 emission line performed with different instruments. The lower sensitivity of Spitzer with respect to HST only enables the detection in the IRAC band 1 (3.6 µm filter, ) and band 2 (4.5 µm filter, ) of a minority of XDF dropouts with (see Fig. 1). For this reason, we have to rely on Spitzer observations of stacked galaxies and of individual (brighter) galaxies over a much wider area than the XDF region.
We compare in Fig. 5 the predicted optical-to-near infrared SEDs ( bands) of our mock catalogue with the stacked SEDs computed by González et al. (2012) at and galaxies. González et al. (2012) consider HST/ACS, HST/WFC3 and Spitzer/IRAC observations in the GOODS-South region (Giavalisco et al. 2004; Bouwens et al. 2012), and compute the median SEDs of , , and dropouts in bins of observed UV luminosity. In practice, they compute the median HST fluxes of the sources in each UV luminosity bin, and perform aperture photometry on the median-combined image in each IRAC band. They evaluate the uncertainties on the median fluxes through bootstrap resampling. We note that when comparing model predictions with Spitzer/IRAC fluxes of galaxies, one must consider the effect of optical emission lines contaminating the and bands. At ( dropouts), contaminates the band, while at both IRAC bands can be contaminated by either or . This makes the IRAC colours computed from the stacked SEDs of galaxies highly dependent on the emission line properties and redshift distribution of the sources entering the stacks, and this dependence is exacerbated in stacks computed from a low number of sources (e.g. see fig. 7 of González et al. 2012). For this reason, we only consider the and stacked SEDs of González et al. (2012) computed from a minimum of 15 individual sources. Fig. 5(a) and (b) show a good agreement among the SEDs of our mock catalogue and the median SEDs González et al. (2012) in all the UV luminosity bins considered. This test is particularly important since the IRAC fluxes, unlike the HST ones, are not matched to any observation during the mock catalogue creation. Fig. 5 hence demonstrates that, on average, the mass and ages of evolved stars in our mock galaxies, traced by the IRAC fluxes probing the rest-frame optical-to-near infrared emission of galaxies, are consistent with the observed values.
The Spitzer/IRAC colour can be used to further test the agreement between observations and mock spectra. As we noted above, the redshift evolution of this colour is sensitive to the presence of optical emission lines, especially the two groups and , contaminating the and bands (e.g. observationally, Stark et al. 2013; de Barros et al. 2014; theoretically, Wilkins et al. 2013). We therefore compare in Fig. 6 the evolution of the colour in the redshift range predicted by our mock catalogue with data from different sources. These include, from low to high redshift, observations from 3D-HST (Skelton et al. 2014), from the spectroscopic sample of Smit et al. (2016) and from the photometric samples of Rasappu et al. (2016), Smit et al. (2015) and Smit et al. (2014). Since our mock catalogue is based on observations from the small, 4.7 XDF area, while data are drawn from much larger areas ( ), we encode in the circle sizes the rest-frame UV luminosity of each object: the larger the circle, the more luminous the galaxy. Fig. 6 highlights the ability of the mock spectra to reproduce the sharp IRAC colour change at () caused by the entry of H in the () band, including the most extreme blue and red colours, which are caused by . We note that the different extremes reached by the IRAC colours at () and at () are likely caused by the different widths of the IRAC filters, µm for filter and µm for one. This difference translates into a different contamination of the integrated broad-band flux of each band, at fixed emission line equivalent width (EW). At redshifts the band is contaminated by the group of lines and the band by , therefore the colour depends on the relative intensities of H-Balmer lines vs [O iii], i.e. on the physical conditions of ionized gas, especially metallicity and ionization parameter. In the narrow redshift window , the band is free of strong emission lines, while is contaminated by . Smit et al. (2015) exploit this property to select extreme emission lines galaxies with accurate photometric redshifts. They search for such extreme objects in the 5 CANDELS fields (), and find sources with . These are relatively bright galaxies falling in a narrow redshift range, hence, not surprisingly, they do not appear in our mock catalogue, which reaches at most at . Overall, Fig. 6 indicates that the strengths of the and lines in our mock catalogue are consistent with those inferred from Spitzer/IRAC colours of galaxies. This validation is particularly important since the strength of the optical emission lines in our mock catalogue will translate into a distribution of in the NIRSpec simulations (see Section 3.2), therefore directly affecting the constraints on galaxy physical parameters form NIRSpec spectra discussed in Section 4.
The above tests only provide indirect constraints on spectroscopic features through the contamination of broad band filters by emission lines. As we have already noted, existing observatories do not allow the detection of optical emission lines at , hence to study the evolution of galaxy spectral features across the widest redshift range one must rely on UV emission lines. The Ly line is the most luminous emission line at UV wavelengths, but radiative transfer effects caused by its resonant nature make the comparison of model predictions and observations non trivial (e.g., Verhamme et al. 2006). We therefore consider another bright UV line, the doubly-ionized carbon line [C iii]907+C iii]909, which has been observed both at low (e.g. Berg et al. 2016; Senchyna et al. 2017) and high redshift (e.g. Erb et al. 2010; Stark et al. 2015). We compare in Fig. 7 the redshift evolution of the [C iii]907+C iii]909 equivalent width from literature with that predicted by our mock catalogue. In the same figure, we also show with a histogram the distribution of [C iii]907+C iii]909 equivalent widths in our mock catalogue. Fig. 7 indicates that while most galaxies in our catalogue exhibit Å, a significant number of mock spectra attain substantially larger values, reaching the most extreme Å found by Stark et al. (2015, 2017) and Amorín et al. (2017).
The comparisons of our mock spectra with external observables presented in this section validate our semi-empirical approach to build a mock galaxy catalogue based on HST photometry in the HUDF. In particular, the above tests demonstrate the ability of our catalogue to match the rest-frame optical continuum emission of and galaxies (Fig. 5), the contamination of the strongest optical emission lines to Spitzer/IRAC bands (Fig. 6), and the observed range of [C iii]907+C iii]909 equivalent widths at (Fig. 7).
3 Simulating and fitting Jwst/NIRSpec observations
In this study, we focus on simulations of deep (100 ks) observations performed with JWST/NIRSpec. We consider the multi-object spectroscopy (MOS) mode and adopt the low spectral resolution configuration NIRSpec/prism (PRISM/CLEAR spectral configuration), which enables the simultaneous observation of up to objects over the full spectral range µm.
3.1 Simulating Jwst/NIRSpec spectra
A detailed description of the approach adopted to simulate NIRSpec spectra can be found in Appendix A. Here, we only provide an overview of our approach and highlight a few key features of the simulations. For each galaxy, we start from the (noiseless) mock spectrum obtained using the procedure described in Section 2. This high-resolution spectrum is redshifted to the photometric redshift obtained with the \oldtextscbeagle analysis and then rebinned to the spectral pixel size of the NIRSpec/prism configuration. We then use an idealized model accounting for the response of the telescope and of the instrument to derive the number of electrons per-second generated at detector level in each pixel. Combined with a noise model, this allows us to generate a simulated (noisy) spectrum. This approach is very similar to the one used by many exposure-time calculators (ETCs), but simpler than the elaborate two-dimensional scheme used by JWST’s official ETC ‘Pandeia’ (Pontoppidan et al. 2016).
In our simulations, we also account for the fact that a significant fraction of the object light may fall outside of the standard 3-shutter slitlet, especially for extended objects (the projected size of the aperture of an individual micro-shutter is arcsec). In practice, we compute the (wavelength-dependent) throughput of an extended source with effective radius described by an exponential profile (Sersic index equal to 1) and average over all source positions within the aperture of the central shutter of the slitlet. Averaging over random ellipticities and position angles produces small corrections ( few per cent), which we therefore ignore. We fix the effective radius to kpc at redshift (e.g. Shibuya et al. 2015, corresponding to arcsec), and adopt a conservative approach keeping the physical size fixed and accounting for the cosmological increase of the (apparent) sizes with redshift to arcsec at , at , at and at . We do not consider a redshift-dependent evolution of the effective radius because of the large uncertainties in the measurement of the sizes of faint galaxies at (e.g. see fig. 12 of Kawamata et al. 2017), but we note that assuming a decreasing with increasing redshift would lead to a larger throughput. The total throughput at is between 39 per cent (at ) and 33 per cent (at ) for the aperture, light profile and size adopted. Assuming a smaller effective radius kpc (0.06 arcsec) at (Shibuya et al. 2015) would increase the throughput at from 33 to 44 per cent.
We note that the resolution of the NIRSpec prism has a strong wavelength dependence, from a minimum of at µm to a maximum of at µm. This makes groups of neighbouring emission lines appear blended or unblended depending on a galaxy redshift. Since UV and optical emission lines will provide the firmest constraints on the physical parameters of faint galaxies at high redshift, we have summarised in Fig. 8 the visibility and blending/unblending of different lines as a function of redshift. Emission lines in Fig. 8 are considered unblended when their redshifted wavelengths are separated by a corresponding to 2.2 pixels, i.e. the typical instrumental spectral resolution element size. This does not account for the effect of an extended source on the emission line profiles. Our simulations cover the redshift range , with most objects in the range 4–8. While groups of UV lines such as and are blended at all redshifts, at H is unblended with respect to [O iii] and [O iii], while these [O iii] lines are unblended at . The H line is blended with [N ii] at , while H is blended with [O iii]636 up to . We note that all the lines in Fig. 8 but and are unblended when using the disperser, while the disperser allows one to resolve also those doublets at most redshifts. Fig. 8 also highlights the ability of NIRSpec/prism observations to provide the simultaneous measurement of the main UV and optical emission lines for galaxies at : H is accessible up to , H to and [O iii] to .
We show in Fig. 9 and 10 a few examples of simulated NIRSpec/prism spectra. Fig. 9(a) shows a galaxy with , and Fig. 9(b) one at , with . Fig. 9(a) shows a high signal-to-noise detection () of the Balmer lines H and H, and of the Oxygen lines [O iii] and [O iii]. Fig. 9(b) illustrates that the sensitivity of JWST/NIRSpec will allow us to obtain highly significant detections () of emission lines (H, H, [O iii] and [O iii]) even for a fainter galaxy at redshift . We note that the flux excess in the F140W band visible in the inset of Fig. 9(b) likely indicates a problem in the HST data, since at there are no strong emission lines falling in that band that can cause such an excess. Fig. 10 provides an overview of NIRSpec/prism observations of 10 galaxies at covering a broad range of continuum luminosities . The figure highlights how the large star formation rates per unit stellar mass and young ages of high redshift galaxies should enable statistically significant detections of emission lines even in galaxies out to .
3.2 Characteristics of the simulated spectra
We show in Fig. 11 the distribution of equivalent widths of H and of the two groups of lines and computed from the mock spectra. The mean (median) redshift of the galaxies in the mock catalogue is (), and the mean (median) equivalent width of (), which compares favourably with the value found by Smit et al. (2016) at redshift , and of and found by Rasappu et al. (2016) at redshift in their photometric and spectroscopic samples, respectively.
Fig. 12 shows the relation between the signal-to-noise ratio of the H line and the observed F160W magnitude for all galaxies in the mock catalogue, while Table 2 indicates the fraction of galaxies in bins of F160W magnitude with , 5 and 10. The reported in Fig. 12 and Table 2 accounts for the effect of the instrumental spectral response, while our simulations do not account for this effect. The consequence is, on average, a per cent larger emission line in our simulated spectra with respect to a computation accounting for the spectral response (see Appendix A).
Fig. 12 and Table 2 indicate that NIRSpec/prism ‘deep’ observations will detect most galaxies in the HUDF through their emission lines, with typical . More than 70 per cent of the XDF dropouts with will have detections of emission lines, and we expect to detect with about one third of the faintest () galaxies. The reported values imply an even larger of H (visible out to ), and likely of [O iii] (out to ), given the large ratios observed at high redshift (e.g. Strom et al. 2017).
3.3 Fitting simulated Jwst/NIRSpec spectra
We use the \oldtextscbeagle tool to perform a full, pixel-by-pixel fitting of each simulated NIRSpec spectrum. We adopt a model similar to the one used to fit the HST/XDF photometry (see Section 2.2), but fixing the redshift to the estimated value to mimic the way data will be analysed, i.e. with redshift being determined directly from the position of the spectral lines. Unlike in the HST broad-band fitting, we let the interstellar metallicity and fraction of -band attenuation arising in the diffuse ISM, , free to vary. We adopt the same independent priors as in Section 2.2, i.e. uniform for , , , , , and , exponential for and Gaussian for and (see Table 1). As in Section 2.2, we consider a multi-variate Gaussian likelihood function, assuming independent errors in each pixel (see however Appendix A.6). The likelihood function can therefore be described by equation (2.2), considering the summation over all spectral pixels, and assuming , i.e. removing any additional error term since, by construction, we have a perfect knowledge of the noise properties of the simulated spectra. We defer to a future work, based on the NIRSpec Instrument Performance Simulator, a more detailed analysis of other sources of random and systematic uncertainties in JWST/NIRSpec data, related, for instance, to detector defects, flat-fielding and spectro-photometric calibration (see Appendix A.6).
We show in Fig. 9 two examples of the \oldtextscbeagle fitting of the simulated spectra. The red lines indicate the spectra obtained by computing, in each pixel, the median of the posterior distribution of predicted fluxes, while the red shaded regions show the 95 per cent central credible interval computed from the same distribution of fluxes. Fig. 9(a) and (b) show how the high- detections of several emission lines constrain the model predictions to high precision. As we will discuss in the next section, this in turn induces tight constrains on several physical quantities describing the conditions of photoionized gas in these galaxies.
4 Jwst/NIRSpec constraints on galaxy properties
We summarise in Fig. 13 and Table 3 the main results of our analysis of simulated ‘deep‘ NIRSpec observations of high redshift galaxies. We report the constraints on the 6 quantities stellar mass , galaxy age , current star formation rate , ionization parameter , gas-phase metallicity and -band attenuation optical depth . Fig. 13 shows in different panels the relation between retrieved and input value for the 6 quantities above. The large errorbars visible in Fig. 13(a) in galaxies with , along with the ‘flattening’ of the relation between input and retrieved value, highlight the large uncertainty affecting the determination of the stellar masses, for which the posterior distribution is dominated by the prior. The stellar continuum emission in these galaxies, related to the stellar mass, is in fact too faint to be detected in our simulated spectra. Similarly, the large errorbars and spread of the points in Fig. 13(b) show the difficulty of determining the age of the oldest stars from our simulated NIRSpec spectra. The reason is that the UV-to-optical spectra of galaxies with large specific star formation rates are dominated by the emission from young stars, which outshine the older ones. Fig. 13(c), (d), (e) and (f) show that the star formation rate, ionization parameter, gas-phase metallicity and attenuation optical depth can be constrained across the entire range spanned by the input parameters. The constraints on these quantities come from emission lines ratios, and are therefore largely unaffected by our ability to detect the stellar continuum emission in faint, high-redshift galaxies.
We quantify in Table 3 the precision of the constraints on the 6 physical parameters shown in Fig. 13. For this, we compute for each parameter the cumulative distribution of the errors , and report in Table 3 the errors corresponding to 0.2, 0.5 and 0.9 of the inverse cumulative distribution , i.e. , and . The values therefore indicate the error on the parameter such that 20, 50, or 90 per cent of the objects exhibit errors . The central column of Table 3 indicates that for per cent of the simulated galaxies the stellar masses (or mass-to-light ratios) can be constrained within a factor of ( dex), the age of onset of star formation within a factor of ( dex), the star formation rate, ionization parameter and gas-phase metallicity within a factor of albeit the precision worsens for galaxies at ( and dropouts). The -band dust attenuation optical depth can be constrained with a precision .
The model parameters not reported in Fig. 13 and Table 3, namely the timescale of star formation , dust-to-metal mass ratio and fraction of dust attenuation arising in the diffuse ISM , are largely unconstrained by this analysis. As for the age of the oldest stars, constraining and would require detecting at high the rest-frame optical-to-near infrared (stellar) continuum emission of the galaxies, which traces older stellar populations. Since the NIRSpec observations here considered mainly provide high- detections of optical emission lines, a way to better constrain these parameters would be to combine NIRSpec spectra with JWST/NIRCam and MIRI photometry. This type of analysis will be included in a future study. Constraining , on the other hand, requires measuring, with high , emission lines of refractory and non-refractory elements, such as oxygen ([O ii], [O iii]), and nitrogen and/or sulphur ([N ii], [S ii], e.g. see figure 3 of Gutkin et al. 2016). The emission lines of non-refractory elements, however, are too weak to be detected in the galaxies studied in this work. We note that although , and are unconstrained by the simulated data, their inclusion in the fitting allows us to appropriately propagate into the constraints on the other physical parameters the uncertainty deriving from their poor knowledge, i.e. , and in this analysis play the role of ‘nuisance’ parameters.
We now discuss the effect on the precision of the physical parameters retrieval of the H line moving beyond the spectral coverage of NIRSpec at (see Fig. 8). Table 3 shows that the error threshold for the parameters , and slowly increases from the , to , to dropouts, reflecting the presence of fainter objects, on average, when moving from lower to higher redshift sources (see Fig. 1). The and dropouts, on the other hand, exhibit significantly larger values, likely because of the disappearance of the H line from the NIRSpec data at those redshifts. These parameters are in fact mainly constrained by Balmer lines () or line ratios involving such lines [, ], and given that H is intrinsically more luminous than H, and that it suffers less attenuation, the constraining power of NIRSpec data suddenly drops as H is shifted outside the NIRSpec range. The NIRSpec constraints on the stellar mass (or mass-to-light ratio) worsen with increasing redshift, following the average dimming of the sources, while those on do not significantly evolve with redshift, likely because of the small values in our mock catalogue, and hence relatively large error thresholds on here considered. The age of onset of star formation is the parameter with the poorest constraints: as for , constraining the galaxy age requires detections of the stellar continuum of evolved stellar populations, which is only possible for the brightest galaxies in the sample.
The results presented in the previous section demonstrate how the unique combination of sensitivity and wavelength coverage of the NIRSpec/prism configuration will allow the precise characterisation of the physical properties of star-forming galaxies across a wide range of redshifts and luminosities. Specifically, the ability to observe multiple optical emission lines at high should enable precise measurements of the gas-phase metallicity and star-formation rate, which, combined with stellar mass constraints from NIRCam, should significantly boost our ability to understand the mass and metal assembly in galaxies at redshift . We note that a way to improve the precision of the physical parameter estimates of objects with low emission line detections would be to combine them with galaxies with higher measurements into a Bayesian hierarchical framework. This would enable population-wide constraints on the parameters and to increase the precision of individual constraints. An ongoing effort to study such an approach is currently underway (Curtis-Lake et al., in prep).
The analysis presented so far illustrates the statistical constraints on several galaxy properties that can be obtained from 100 ks NIRSpec/prism observations of high-redshift sources, but it does not provide any insight on the actual number of sources that will be observed with a NIRSpec pointing. Here, we therefore discuss the number of HST-selected galaxies expected to be observed in a single NIRSpec/prism pointing centred on the HUDF. Fig. 14 shows the HUDF/GOODS-South area along with the four NIRSpec MSA quadrants, where we centred one quadrant on the 4.6 area of the HUDF with deep WFC3 observations. Fig. 14 shows how HUDF sources will only be observable from one MSA quadrant, while for the remaining three quadrants galaxies will have to be selected from shallower data in GOODS-South.
For XDF, we include the and dropouts. For CANDELS, we consider galaxies at and from table 4 of Bouwens et al. (2015a).
For XDF, we include the , and dropouts. For CANDELS, we consider galaxies at from the same table as in a.
On-sky target density of the objects in units of .
Average number of non-overlapping spectra observed with the NIRSpec/prism in a single pointing.
The number of objects