Simulating deep NIRSpec observations in the HUDF

Simulating and interpreting deep Jwst/NIRSpec observations in the Hubble Ultra Deep Field

Jacopo Chevallard, Emma Curtis-Lake, Stéphane Charlot, Pierre Ferruit, Giovanna Giardino, Marijn Franx, Michael V. Maseda, Ricardo Amorin, Santiago Arribas, Andy Bunker, Stefano Carniani, Bernd Husemann, Peter Jakobsen, Roberto Maiolino, Janine Pforr, Timothy D. Rawle, Hans-Walter Rix, Renske Smit, Chris J. Willott

Scientific Support Office, Directorate of Science and Robotic Exploration, ESA/ESTEC, Keplerlaan 1, 2201 AZ Noordwijk, The Netherlands
Sorbonne Universités, UPMC-CNRS, UMR7095, Institut d’Astrophysique de Paris, F-75014, Paris, France
Leiden Observatory, Leiden University, NL-2300 RA Leiden, Netherlands
Cavendish Laboratory, University of Cambridge, 19 J. J. Thomson Ave., Cambridge CB3 0HE, UK
Kavli Institute for Cosmology, University of Cambridge, Madingley Road, Cambridge CB3 0HA, UK
Departamento de Astrofísica, Centro de Astrobiología, CSIC-INTA, Cra. de Ajalvir, 28850-Madrid, Spain
Department of Physics, University of Oxford, Oxford, UK
Max Planck Institute for Astronomy, Königstuhl 17, D-69117 Heidelberg, Germany
Dark Cosmology Centre, Niels Bohr Institute, University of Copenhagen, Juliane Maries vej 30, DK-2100 Copenhagen, Denmark
European Space Agency, c/o STScI, 3700 San Martin Drive, Baltimore, MD 21218, USA
NRC Herzberg, 5071 West Saanich Rd, Victoria, BC V9E 2E7, Canada
E-mail: jchevall@cosmos.esa.intESA Research Fellow
Submitted to MNRAS on

The James Webb Space Telescope (JWST) will enable the detection of optical emission lines in galaxies spanning a broad range of luminosities out to redshifts . Measurements of key galaxy properties, such as star formation rate and metallicity, through these observations will provide unique insight into, e.g., the role of feedback from stars and active galactic nuclei (AGNs) in regulating galaxy evolution, the co-evolution of AGNs and host galaxies, the physical origin of the ‘main sequence’ of star-forming galaxies and the contribution by star-forming galaxies to cosmic reionization. We present an original framework to simulate and analyse observations performed with the Near Infrared Spectrograph (NIRSpec) on board JWST. We use the \oldtextscbeagle tool (BayEsian Analysis of GaLaxy sEds) to build a semi-empirical catalogue of galaxy spectra based on photometric spectral energy distributions (SEDs) of dropout galaxies in the Hubble Ultra Deep Field (HUDF). We demonstrate that the resulting catalogue of galaxy spectra satisfies different types of observational constraints on high redshift galaxies, and use it as input to simulate NIRSpec/prism () observations. We show that a single ‘deep’ ( ks) NIRSpec/prism pointing in the HUDF will enable detections of multiple optical emission lines in () galaxies at () down to AB mag. Such observations will allow measurements of galaxy star formation rates, ionization parameters and gas-phase metallicities within factors of 1.5, mass-to-light ratios within a factor of 2, galaxy ages within a factor of 3 and -band attenuation optical depths with a precision of 0.3.

galaxies: evolution – galaxies: formation – galaxies: ISM – HII regions – dark ages, reionization, first stars – telescopes

1 Introduction

Our picture of the formation and evolution of galaxies across cosmic time has improved significantly in the past 15 years. Galaxy surveys have provided researchers with a wealth of spectro-photometric data, at both low (e.g. SDSS, York et al. 2000) and high (e.g., GOODS, Stanway et al. 2003; GLARE, Stanway et al. 2004, 2007; COSMOS, Scoville et al. 2007; HUDF, Bunker et al. 2004; Beckwith et al. 2006; Bouwens et al. 2010; Bunker et al. 2010; McLure et al. 2010; Ellis et al. 2013; Illingworth et al. 2013; CANDELS, Grogin et al. 2011; Koekemoer et al. 2011; VUDS, Le Fèvre et al. 2015; VANDELS, McLure et al. 2017) redshifts. On the theoretical front, computer simulations of galaxy formation (e.g. EAGLE, Schaye et al. 2015; IllustrisTNG, Pillepich et al. 2017) can now accurately predict several key galaxy properties, such as the redshift evolution of the galaxy stellar mass function and star formation rate density (e.g. Genel et al. 2014; Furlong et al. 2015). Yet, many details of the physical processes driving the evolution of galaxies remain unknown. AGN feedback is a key ingredient of galaxy formation models (see Somerville & Davé 2015 and references therein), but observational evidence for AGN-driven ‘quenching’ of star formation is ambiguous (e.g. Carniani et al. 2016; Suh et al. 2017). Similarly, although several correlations exist between the physical properties of AGNs and their host galaxies (see Kormendy & Ho 2013 and references therein), a causal connection among these properties indicating a ‘co-evolution’ of galaxies and AGNs has not been clearly demonstrated. The apparent existence of a tight relation between galaxy masses and star formation rates (i.e., ‘main sequence’, Brinchmann et al. 2004; Noeske et al. 2007), and perhaps of a more ‘fundamental’ relation involving gas-phase metallicity as well (Mannucci et al. 2010), can be an indication of self-regulation of star formation within galaxies (e.g. Lilly et al. 2013), but the significance and redshift evolution of these relations remains unclear (e.g. Yabe et al. 2015; Telford et al. 2016). The increasing relative abundance of UV-faint galaxies at redshift (e.g. Bouwens et al. 2015a; Finkelstein et al. 2015) would support a picture in which low-mass star-forming galaxies provide the bulk of H-ionizing (LyC) photons required for cosmic reionization (e.g. Wilkins et al. 2011; Finkelstein et al. 2012; Robertson et al. 2013; Bouwens et al. 2015b), but this conclusion relies on several assumptions about the ionizing emissivity of galaxies and escape fraction of LyC photons from galaxies.

Advancing our understanding of the above processes, and of many others, that drive the evolution of galaxies from the reionization epoch to the present day, requires measuring physical properties – such as stellar masses, star formation rates, stellar and gas metallicities, stellar ages, gas ionization state, the dynamics of gas and stars – for large samples of galaxies, across a broad range of redshifts and galaxy stellar masses. While multi-band photometric campaigns have collected high quality SEDs for thousands of galaxies spanning a wide range of masses out to (e.g. HUDF, CANDELS), the availability of spectroscopic observations at is much more limited. At those redshifts, the strongest optical emission lines (e.g. [O ii], H, [O iii], H) are shifted to near infrared wavelengths, where observations with ground-based telescopes are challenging because of regions of low atmospheric transmittance, bright sky background and contamination from bright, variable sky lines. Optical emission lines have been measured out to for large samples of galaxies using low-resolution slitless spectroscopy with HST (e.g. WISP, Atek et al. 2010; 3D-HST, Brammer et al. 2012; GLASS, Treu et al. 2015; FIGS, Pirzkal et al. 2017). Higher-resolution spectroscopic surveys from the ground (e.g. VUDS, VANDELS) have mainly relied on multi-object spectrographs operating at optical wavelengths (e.g. VIMOS at VLT, Le Fèvre et al. 2003), hence are limited, at high redshift, to the measurement of rest-frame UV emission lines. These lines are intrinsically weaker than the optical ones (e.g., see table 5 of Steidel et al. 2016), and their interpretation is complicated by radiative transfer effects, dust attenuation and potential contamination from outflows.111The Ly line is theoretically the most luminous emission line at UV and optical wavelengths, but being a resonant line it suffers from radiative transfer effects which affect its visibility and make its physical interpretation extremely challenging. Multi-object, ground-based spectrographs operating at near infrared wavelengths (e.g. MOSFIRE at Keck, McLean et al. 2008; KMOS at VLT, Sharples et al. 2013) permit the extension of rest-frame optical-emission-line measurements out to , albeit only for galaxies with relatively bright emission lines (e.g. MOSDEF, Kriek et al. 2015; KBSS, Steidel et al. 2014). These measurements enabled, for example, new constraints on the dust attenuation properties of galaxies at (e.g. Reddy et al. 2015) and on the conditions of ionized gas (metal abundances, ionization state) at (e.g. Strom et al. 2017; Shapley et al. 2017).

In the near future, the limitations of existing observatories to study the high redshift Universe will largely be overcome by the launch of the James Webb Space Telescope (JWST, Gardner et al. 2006). JWST is expected to revolutionise our view of galaxies at , and of low-mass galaxies at , by providing a wide range of imaging and spectroscopic capabilities in the wavelength range µm. In particular, the Near Infrared Spectrograph (NIRSpec, Bagnasco et al. 2007; Birkmann et al. 2016) onboard JWST will increase by more than an order of magnitude the emission line sensitivity in the wavelength region µm covered by existing ground-based instruments, while reaching an even greater sensitivity at wavelengths µm inaccessible with existing observatories. The multi-object spectroscopic capabilities of NIRSpec will enable the simultaneous measurement of up to galaxy spectra, allowing the observation of standard optical emission lines for large samples of galaxies out to (see Section 2.4). Combined with stellar mass measurements based on JWST/NIRCam imaging (Near Infrared Camera, Horner & Rieke 2004), this will provide researchers with unique data to constrain the physical processes driving the evolution of galaxies at , the formation of the first galaxies at , the way cosmic reionization proceeded in space and time and the contribution of different sources to the cosmic reionization budget.

The uniqueness of JWST/NIRSpec, however, also poses new challenges for the planning and interpretation of observations of high redshift galaxies. Unlike ground-based multi-object spectrographs, which typically require the user to define a ‘mask’ of apertures on the sky, NIRSpec is equipped with a micro-shutter array (MSA, Kutyrev et al. 2008) providing a fixed grid of apertures on the sky. Optimising the NIRSpec MSA usage for a given scientific goal therefore requires a careful prioritization of the targets, as well as calculations to study spectral overlaps and truncations. The different filters and dispersers available on NIRSpec, covering the wavelength range µm and spectral resolutions , and , provide complementary information, but the optimal choice of filters, dispersers and exposure times depends on the scientific case, target properties and redshift range. Given the paucity of high quality galaxy spectra at , NIRSpec data will open a largely unexplored space of observables, for both emission lines and stellar continuum studies. Maximising the information extracted from such data therefore requires models and approaches adapted to describing galaxies with a wide range of stellar populations and interstellar medium properties, likely extending well beyond those measured with existing observatories. Having adequate models and simulations is therefore critical, especially for the planning of Cycle 1 and Cycle 2 observations, as these will be largely based on spectroscopic follow-up of existing, pre-JWST imaging data.

In this paper, we build a physically-motivated, semi-empirical framework to simulate integrated galaxy spectra as could be observed at low spectral resolution with JWST/NIRSpec in the Hubble Ultra Deep Field (HUDF). We then use these simulated spectra to study our ability to constrain different galaxy physical parameters, such as star formation rates, ages, mass-to-light ratios, metallicities and properties of dust attenuation and ionised gas, for objects at redshift . We adopt a self-consistent physical model which accounts for stellar and ionised gas emission, integrated in the state-of-the-art \oldtextscbeagle tool (Chevallard & Charlot 2016), to generate the input mock catalog of galaxy spectra and to fit the simulated observed spectra to retrieve the galaxy properties.

In Section 2, we describe the catalogue of dropouts of Bouwens et al. (2015a) and our approach to fit the UV-to-near infrared photometry of these sources with the \oldtextscbeagle tool. We also present our method for associating model spectra with each dropout galaxy, and demonstrate how such a method produces spectra consistent with several, independent observables. In Section 3 we present our simulations of JWST/NIRSpec observations and the full-spectrum fitting of these simulations with \oldtextscbeagle. In Section 4 we examine the constraints on different galaxy physical parameters obtained by fitting the simulated NIRSpec data. In Section 5 we discuss our results in the light of NIRSpec observational programs, in particular of the NIRSpec Guaranteed Time Observations (GTO) program, and future extensions of this work to cover a broader range of NIRSpec observing modes. Finally, we summarize our conclusions in Section 6.

Throughout the paper, we express magnitudes in the AB system, adopt a zero-age solar metallicity (corresponding to a present-day metallicity of 0.01524, see Table 3 of Bressan et al. 2012) and the latest constraints on cosmological parameters from Planck, i.e. , and (see last column ‘TT,TE,EE+lowP+lensing+ext’ of Table 4 of Planck Collaboration et al. 2015). All the emission line equivalent widths in the text refer to rest-frame values.

2 A semi-empirical catalogue of galaxy spectra in the HUDF

In this section we detail our approach to create a semi-empirical catalogue of high-redshift galaxy spectra. This catalogue is constructed by matching the predictions of a spectral evolution model to a large sample of galaxies with multi-band HST photometry, and represents the first of several steps involved in the process of creating and interpreting simulated NIRSpec observations. The adoption of a semi-empirical approach is motivated by the need to minimize the model-dependence of our analysis, decoupling it from a particular theoretical approach (e.g. hydro-dynamic simulation vs semi-analytic model) and its specific implementation. Also, anchoring our simulations to existing HST photometry allows us to create and study NIRSpec observations which more closely resemble those that will be obtained early on in the JWST mission, i.e. based on HST-selected targets. Adopting a purely empirical approach based on existing spectroscopic observations, on the other hand, is not viable because of the widely different wavelength coverage and sensitivity of current observatories compared to JWST/NIRSpec. Existing spectroscopic surveys targeting high redshift galaxies are often limited to observe wavelengths µm (e.g. VVDS, Le Fèvre et al. 2013, VUDS, VANDELS), while surveys extending to µm (e.g. MOSDEF) target relatively bright () galaxies.

2.1 Multi-band Hst photometry of dropout galaxies in the HUDF

Figure 1: Distributions of the WFC3/F160W magnitude for the (, purple line), (, cyan), (, green), ( objects, orange) and ( objects, red) dropout galaxies selected by Bouwens et al. (2015a) in the HUDF.

In this work, we focus on galaxies at redshift , for which JWST/NIRSpec will enable measurements of standard optical emission lines (e.g. H, [O iii], H, [N ii], [S ii]), which are largely inaccessible with existing ground-based telescopes. As shown by the initial works of Steidel et al. (1996) and Madau et al. (1996), high redshift star-forming galaxies can be effectively selected from broad-band photometric data by using the Lyman break ‘dropout’ technique. This technique exploits the (nearly) complete absorption by neutral hydrogen of any light emitted by a galaxy blue-ward the Lyman limit (912 Å). As the Lyman limit is redshifted to redder wavelengths with increasing redshift, this makes an object become undetectable (‘drop-out’) from a given optical/near-infrared band. Several groups over the years have used this technique to identify high- star-forming galaxies, initially using optical data from the HST Advanced Camera for Surveys (ACS) to select objects out to (e.g. Steidel et al. 1999; Dickinson et al. 2004; Bunker et al. 2004), and later exploiting near-infrared observations with the Wide Field Camera 3 (WFC3) to obtain large samples of galaxies (e.g. Wilkins et al. 2010; Bouwens et al. 2011; McLure et al. 2013; Schenker et al. 2013).

Here we use the high- galaxy candidates selected with the dropout technique by Bouwens et al. (2015a) in the Hubble Ultra Deep Field (HUDF). Since we are interested in galaxies out to the highest redshifts, we only consider sources selected from the 4.7  region in the HUDF with deep HST/WFC3 near-infrared observations. We adopt the multi-band HST catalogue of Illingworth et al. (2013), built from a combination of all available HST/ACS and WFC3 observations in the HUDF (for a complete list see table 2 of Illingworth et al. 2013), and refer to this as the ‘eXtreme Deep Field’ (XDF) catalogue. The XDF catalogue includes observations in 9 bands, 5 in the optical, based on the ACS/WFC filters F435W, F606W, F775W, F814W and F850LP, and 4 in the near-infrared, taken with the WFC3 filters F105W, F125W, F140W and F160W. The depth of a point source is , computed within a circular aperture of 0.35 arcsec diameter, while the combined image used for the source extraction reaches a depth of 31.2 within the same circular aperture (Illingworth et al. 2013). We do not use Spitzer/IRAC data, since the vast majority of our galaxies are too faint to be detected in existing Spitzer images.

We consider dropout galaxies in the filters (), (), (), () and (), which we will indicate as , , , and dropouts in the remainder of this paper. Details on the Lyman Break selection in each band can be found in sec 3.2 of Bouwens et al. (2015a) (see also their table 2). Fig. 1 shows the F160W magnitude distribution of galaxies selected in the different dropout bands. The vast majority ( per cent) of the galaxies at ( dropouts and above) have , while the 50 per cent completeness magnitude varies between a minimum of (F105W filter, dropouts) to a maximum of (F160W filter, dropouts). The adopted Lyman Break selection produces, for each dropout band, the redshift distributions shown in Fig. 2. Galaxy redshifts are centred around the average values reported above, with Full Width Half Maximum (FWHM) of the redshift distribution set by the filter widths, i.e. for the , , and dropouts, increasing to for the dropouts. The absolute UV magnitude of the sources, computed from the median  magnitude in each dropout class, assuming a flat spectrum, and using the central redshift of each dropout band, varies from ( dropouts), to (), (), (), ().

Figure 2: Redshift distribution of the , , , and dropouts, colour coded as in Fig. 1. Dashed lines indicate the expected redshift distribution computed by Bouwens et al. (2015a) by means of Monte Carlo simulations of artificial sources. Solid lines are computed by ‘stacking’ (i.e., summing) the posterior probability distribution of redshift computed with the \oldtextscbeagle tool for each object, normalising the resulting distribution to a maximum value of 1, and then, following Bouwens et al. (2015a), convolving it with a Normal distribution with zero mean and standard deviation of 0.2.

2.2 Broad-band SED fitting of high-redshift galaxies in the HUDF

Parameter Prior Description XDF photometry NIRSpec spectra
Uniform Redshift
Uniform Stellar mass (does not account for fraction of mass returned to the ISM by stellar mass loss)
Uniform Time scale of star formation in a SFH described by a delayed exponential function.
Gaussian truncated Age of onset of star formation in the galaxy
Uniform Stellar metallicity
Uniform Interstellar metallicity
Gaussian   truncated Star formation rate averaged over the last yr
Uniform Effective gas ionization parameter
Uniform Dust-to-metal mass ratio
Exponential truncated -band attenuation optical depth
Uniform Fraction of -band attenuation arising in the diffuse ISM
Table 1: Priors relative to the different free parameters used in the \oldtextscbeagle tool for the broad-band SED fitting of HST/XDF photometry and for the full-spectrum fitting of JWST/NIRSpec simulated spectra. The symbol indicates a Gaussian (Normal) distribution.

Similarly to Chevallard & Charlot (2016, see their sec 4), we use the \oldtextscbeagle tool to fit the XDF photometry of 715 dropout galaxies with a self-consistent physical model that includes stellar emission, continuum+line emission from H ii regions and diffuse ionized gas and dust attenuation. We do not model the emission from an AGN potentially contaminating the HST photometry, as the expected number of type-1 AGNs in the 4.7  field here considered is consistent with zero (e.g. see section 4.2.5 of Grazian et al. 2015). We let the redshift free to vary, and adopt a two-component star formation history constituted by a ‘smooth’ function and a burst. The smooth component is described by a delayed exponential function , where  is the star formation timescale and  the lookback time, which varies from yr to , the age of onset of star formation in the galaxy. The burst (with constant star formation rate) covers the last of star formation, the timescale over which per cent of the H-ionizing photons are emitted (e.g. Charlot & Fall 1993; Binette et al. 1994), and it is parametrised in terms of the ‘current star formation rate’ , i.e. the SFR averaged over the past 10 Myr. Decoupling the ‘past’ star formation history and the ‘current’ star formation rate allows us to obtain galaxy spectra with any contribution of emission lines relative to stellar continuum. We fix the maximum redshift for the formation of the first stars in a galaxy at , so that at any redshift the maximum allowed age for the onset of star formation is , where  and refer to the age of the Universe at redshift and , respectively. We approximate the distribution of stellar metallicities in a galaxy, including the metallicity of young stars (with ages yr, ) and the interstellar metallicity (), with a single metallicity , i.e. .222The interstellar metallicity was indicated as  in the original paper describing the \oldtextscbeagle tool (Chevallard & Charlot 2016), while here we follow the nomenclature adopted in Gutkin et al. (2016). Following Gutkin et al. (2016, see also ) we describe the properties of gas ionized by young stars by means of galaxy-wide (‘effective’) parameters. The ionization parameter  determines the ratio of H-ionizing photons to gas density at the Strömgren radius of an effective star cluster, while the dust-to-metal mass ratio  (‘depletion factor’) sets the amount of depletion of heavy elements onto dust grains (see sec 2.3 of Gutkin et al. 2016 for a discussion of depletion factors).333Our definition of  implies a volume-averaged ionization parameter (see equation 1 of Hirschmann et al. 2017). We model the effect of dust attenuation on stellar and gas emission by appealing to the two-component model (diffuse ISM + birth clouds) of Charlot & Fall (2000), parametrized in terms of the total attenuation optical depth  and the fraction of attenuation arising in the diffuse ISM . We account for the effect of absorption from the intergalactic medium (IGM) by means of the average prescription of Inoue et al. (2014).

Following the Bayesian approach adopted in \oldtextscbeagle, we define the posterior probability distribution of the model free parameters  as


where  indicates the data, the adopted model, the prior distribution and the likelihood function. We adopt independent priors for all parameters, uniform for the parameters , , , ,  and , Gaussian for and and exponential for  (see Table 1).444We note that the adopted prior for  would correspond to the non-informative, scale-invariant Jeffreys prior for a one-dimensional Gaussian likelihood function depending only on . We consider a multi-variate Gaussian likelihood function with independent errors on each measurement


where the summation index runs over all observed bands (i.e., bands with positive errors, and negative or positive fluxes), is the observed flux, , where  is the observational error and is an additional error term that we add to avoid obtaining results dominated by systematic uncertainties (see sec. 4.2 of Chevallard & Charlot 2016) and indicates the fluxes predicted by our model for a set of parameters .

We adopt the Nested Sampling algorithm (Skilling et al. 2006) as implemented in \oldtextscmultinest (Feroz et al. 2009) to sample the posterior probability distribution of the 9 free parameters . It is worth briefly pausing to discuss the potential risk of over-fitting our data by using such a flexible physical model. The 9 photometric bands used in the fitting cover the wavelength range , hence they mainly probe the rest-frame UV emission of galaxies, and only bands redward the Lyman break (4 bands for the dropouts) provide constraints on the galaxy physical properties. In star-forming galaxies, the UV continuum emission is mostly sensitive to the recent ( yr) star formation history and to dust attenuation, while other parameters, such as the mass of older stars and the physical conditions of gas, have little influence on the emission at these wavelengths. Since in this work we aim at simulating NIRSpec observations covering a large variety of intrinsic galaxy spectra, extending beyond those observed in relatively bright galaxies at , we must also account for the variation of model parameters largely unconstrained by existing photometric observations. For this reason we adopt the flexible, 9-parameters model described above, and combine the weak constraints provided by HST photometry on some model parameters with well established relations among galaxy physical quantities to obtain physically-motivated combinations of parameters (see Section 2.3). We then validate this approach by comparing our model predictions with external (photometric and spectroscopic) data-sets at redshift (Section 2.4).

The results of the \oldtextscbeagle fitting of the XDF data are summarised in Fig. 2 in terms of the stacked posterior probability distribution of redshift. This is computed by combining the redshift probability distribution of each galaxy using a kernel-density estimator. Similarly to an histogram, Fig. 2 represents a density plot, where the solid curves depend on both the density of galaxies at each redshift and on the redshift probability distribution of each individual source. Fig. 2 provides an interesting visual comparison among the redshift distributions obtained by Bouwens et al. (2015a) from the analysis of Monte Carlo simulations of artificial sources (see their section 4.1) and those computed in this work. Given the very different approaches adopted to compute the two sets of lines of Fig. 2, the figure highlights a good agreement between the two methods. The differences among the two approaches are larger for the dropouts, for which the peaks of the two distributions are separated by (), possibly because of the low constraining power of data for the dropouts (low , few bands red-ward the continuum break) combined with a low number of sources ().

2.3 Linking model spectra and observed galaxies in the HUDF

The \oldtextscbeagle tool provides us with the posterior probability distribution of the 9 free parameters of the model (reported in column 5 of Table 1), along with sets of selected observables (e.g., full spectrum, absolute and apparent magnitudes) and derived quantities (e.g. rates of H- and He-ionizing photons). These are obtained by sampling the posterior probability distribution of model parameters using the \oldtextscmultinest algorithm (see section 3.3 of Chevallard & Charlot 2016 for more information on the \oldtextscbeagle output). From a purely statistical perspective, we could associate with a given XDF source any spectra from the set drawn using \oldtextscmultinest in this way. However, since the data are consistent with a broad range of model spectra, this approach would not ensure that these match other, independent observables, such as galaxy colours at longer wavelengths, or emission line measurements. The latter is particularly relevant for XDF galaxies, which exhibit faint continuum emission and will likely be only detected via emission lines by JWST/NIRSpec. By obtaining a physically-motivated distribution of emission line strengths, we can produce simulated NIRSpec observations with realistic  ratios. This, in turn, should guarantee that the constraints on the galaxy physical parameters derived in Section 3 can be applied to future ‘deep’ NIRSpec observations of HST-selected high-redshift galaxies.

After some experimentation, we find that imposing model parameters to follow a redshift-dependent relation between mass and star formation rate (the so-called ‘main sequence’ of star-forming galaxies) produces observables in agreement with different types of observations. In practice, we enforce this relation by multiplying the posterior probability distribution of equation 2.1 by a ‘weight’ function which depends on the stellar mass and star formation rate. In order to more densely populate the tails of mass–star formation relation, we adopt a Student’s-t weight function with 3 degrees of freedom, which is significantly broader than a Gaussian function. The adopted functional form is


where is the standardised variable


and, as before,  indicates the star formation rate. We fix the scatter (see, e.g., Speagle et al. 2014; Shivaei et al. 2015), and adopt the (redshift-dependent) mass-star formation rate relation of Speagle et al. (2014) (their equation 28) to compute (below expressed as a function of time  instead of redshift )


where  indicates the galaxy stellar mass. Equations (2.3)–(2.5) provide the conditional distribution of star formation rate given the galaxy stellar mass and redshift, . This conditional distribution does not depend on the data , hence it can be readily interpreted as a conditional prior distribution linking the three parameters star formation rate, stellar mass and redshift.

Figure 3: Stacked two-dimensional posterior probability distribution of mass and star formation rate for the , , , and dropouts (grey density contours, plotted on a logarithmic scale). Circles, colour coded as in Fig. 1, indicate the pairs [, ] corresponding to a random realisation of model parameters drawn from the re-weighted posterior probability distribution. The solid yellow line indicates the mass–star formation rate relation measured by Santini et al. (2017) at from galaxies selected from 4 gravitationally-lensed HST Frontier Fields, while the dashed line is an extrapolation of this relation at lower stellar masses.

It is well known that the photometric degeneracy between the Lyman and Balmer breaks can make a galaxy photometric SED be equally well described by two different models with widely different redshifts (e.g. Ilbert et al. 2006). As the \oldtextscbeagle tool can identify these multiple redshift solutions, in this work we restrict the analysis to redshifts consistent with the expected dropout redshift of Bouwens et al. (2015a). To achieve this, among the combinations of model parameters  obtained with \oldtextscbeagle, we only consider those satisfying the condition , where for the dropouts, for the , for the and for the and ones (Fig. 2). We then randomly draw, for each galaxy, a set of parameters among those sampled by \oldtextscmultinest, where the probability of drawing any set is proportional to the re-weighted posterior probability distribution. We can repeat the random draw (with replacement) times to obtain model spectra consistent with the observed HST photometry, and corresponding to combinations of satisfying the adopted mass-star formation rate relation. This is particularly relevant to evaluate selection effects and to study the consequences of deriving population-wide relations among physical parameters (e.g. mass-metallicity, mass-star formation rate) from a finite number of observations. Both applications will be part of successive works, while in the remainder of this paper we consider a single random draw to associate a model spectra to each XDF source, and refer to the ensemble of 715 model spectra as a ‘mock catalogue’.

Figure 4: Distribution of model free parameters of the mock catalogue of galaxy spectra computed as detailed in Sections 2.22.3. The colour coding indicates mock galaxies at different redshifts, and it is the same as in Fig. 1. The off-diagonal panels show the relation between each pair of parameter , while the diagonal panels display, by means of histograms, the distribution of each parameter.

We show in Fig. 3 by means of (grey) density contours on a logarithmic scale the two-dimensional stacked posterior probability distribution of stellar mass  and star formation rate , along with the pairs drawn from the re-weighted posterior probability distribution (circles of different colours). By analogy with the one-dimensional stacked posterior probability distribution of redshift (see Section 2.2), we compute the two-dimensional stacked posterior probability distribution of  and  summing the individual probability distributions computed with a kernel-density estimator. Fig. 3 shows the effect of varying in our model both the current star formation rate and total stellar mass when only rest-frame UV observations are available. The observed HST photometry of a large fraction of XDF galaxies can be reproduced equally well by our model for widely different combinations of  and , which makes the stacked distribution occupy a broad region of Fig. 3. We note that the sharp upper envelope created by the coloured circles in Fig. 3 is caused by the fixed duration (10 Myr) of the current burst of constant SF, which limits the maximum  attainable at fixed .

Fig. 4 shows the distribution of the main model free parameters of the mock catalogue. The low constraining power of HST photometry with respect to most model parameters makes these distributions reflect the adopted priors, i.e. Gaussian for , exponential for  and uniform for the other parameters (see Table 1). Also, no strong correlations are visible except for the relation between and that we enforced. It is worth briefly discussing the absence in Fig. 4 of a correlation between mass and metallicity, and metallicity and ionization parameter. The mass–metallicity relation is well-established at low redshift (e.g. from SDSS data, Tremonti et al. 2004), while observations targeting galaxy populations similar to those used in our analysis, i.e. faint high-redshift galaxies, depict a less clear picture. Recent results from the VUDS survey suggest that highly star-forming galaxies with stellar masses comparable to those of our mock galaxies () span a broad range of gas-phase metallicities at fixed stellar mass (Calabrò et al. 2017). Similar results have been obtained by Izotov et al. (2015) and Jimmy et al. (2015) from the analysis of low-mass local ‘analogues’ of high redshift galaxies. These findings can have a physical origin, since the metal content of star-forming gas results from the complex interaction between enrichment from previous stellar generations, AGN and stellar feedback and gas accretion, or can be driven by systematic uncertainties in the metallicity estimators (e.g. Kewley & Ellison 2008).

Similarly, an anti-correlation between gas-phase metallicity and ionization parameter has been observed at low redshift (Carton et al. 2017), and can be expected on theoretical grounds (e.g. Dopita et al. 2006), but the redshift evolution and dispersion of the relation for galaxy populations as those considered in this work are largely unknown. While future observations may show that certain combinations of model parameters (e.g. high metallicity and large ionization parameter) are not well suited to describe the conditions of photoionized gas in high-redshift star-forming galaxies, we have decided not to impose any relation between these parameters in our mock galaxy catalogue. This will enable users to adopt any relation between mass–metallicity and metallicity–ionization parameter to select sub-samples of sources from the different realisations of the mock catalogue.

2.4 Comparison of model spectra with independent data

Figure 5: (a) Diamonds with () error bars indicate the stacked photometric SEDs of redshift galaxies ( dropout) of González et al. (2012) in three bins of observed F775W magnitudes, used as a proxy for the rest-frame UV luminosity. Shaded ‘violins’ indicate the magnitude distribution (95 per cent interval) covered by our mock spectra based on dropouts in the XDF, split in the same UV luminosity bins as in González et al. (2012). We indicate in the inset legend the number of objects in each González et al. (2012) stack, and in parenthesis the number of objects in our mock catalogue entering each bin. (b) Same as (a), but for dropout galaxies (redshift ). We did not plot our median SED for the bin because of the low (4) number of galaxies falling in the bin.

Our approach to build a catalogue of mock galaxy spectra guarantees, by construction, that these match the HST photometry of XDF dropouts. In this section, we compare the mock spectra with other observables, namely near infrared photometry from the Infrared Array Camera (IRAC) onboard the Spitzer Space Telescope, and measurements of the [C iii]907+C iii]909 emission line performed with different instruments. The lower sensitivity of Spitzer with respect to HST only enables the detection in the IRAC band 1 (3.6 µm filter, ) and band 2 (4.5 µm filter, ) of a minority of XDF dropouts with (see Fig. 1). For this reason, we have to rely on Spitzer observations of stacked galaxies and of individual (brighter) galaxies over a much wider area than the XDF region.

Figure 6: Comparison of the Spitzer/IRAC colour at from the literature (coloured circles) with that of the galaxies in our mock catalogue (grey circles). In order of increasing redshift, we show galaxies with spectroscopic redshifts at from 3D-HST/GOODS-North and GOODS-South (cyan circles, Skelton et al. 2014), objects from the spectroscopic sample of Smit et al. (2016) (light green) and from the photometric samples of Rasappu et al. (2016) (dark green), Smit et al. (2015) (orange) and Smit et al. (2014) (red). We only plot objects for which the quoted uncertainty on the IRAC colour is . For clarity, we only show error bars when these are . As indicated in the inset legend, the area of the circles is proportional to the absolute UV magnitude of each galaxy .
Figure 7: Comparison of the [C iii]907+C iii]909 (rest-frame) equivalent widths  from literature (coloured circles) with those measured from the spectra of our mock catalogue (grey circles). Literature values are taken, in order of increasing redshift, from Stark et al. (2014, cyan circles), Maseda et al. (2017, light green), Amorín et al. (2017, blue), Ding et al. (2017, dark green), Stark et al. (2015, orange) and Stark et al. (2017, red). The upper limits indicate 3 limits. As in Fig. 6, the area of the circles is proportional to the absolute UV magnitude  of each galaxy.

We compare in Fig. 5 the predicted optical-to-near infrared SEDs ( bands) of our mock catalogue with the stacked SEDs computed by González et al. (2012) at and galaxies. González et al. (2012) consider HST/ACS, HST/WFC3 and Spitzer/IRAC observations in the GOODS-South region (Giavalisco et al. 2004; Bouwens et al. 2012), and compute the median SEDs of , , and dropouts in bins of observed UV luminosity. In practice, they compute the median HST fluxes of the sources in each UV luminosity bin, and perform aperture photometry on the median-combined image in each IRAC band. They evaluate the uncertainties on the median fluxes through bootstrap resampling. We note that when comparing model predictions with Spitzer/IRAC fluxes of galaxies, one must consider the effect of optical emission lines contaminating the  and  bands. At ( dropouts),  contaminates the  band, while at both IRAC bands can be contaminated by either  or . This makes the IRAC colours computed from the stacked SEDs of galaxies highly dependent on the emission line properties and redshift distribution of the sources entering the stacks, and this dependence is exacerbated in stacks computed from a low number of sources (e.g. see fig. 7 of González et al. 2012). For this reason, we only consider the and stacked SEDs of González et al. (2012) computed from a minimum of 15 individual sources. Fig. 5(a) and (b) show a good agreement among the SEDs of our mock catalogue and the median SEDs González et al. (2012) in all the UV luminosity bins considered. This test is particularly important since the IRAC fluxes, unlike the HST ones, are not matched to any observation during the mock catalogue creation. Fig. 5 hence demonstrates that, on average, the mass and ages of evolved stars in our mock galaxies, traced by the IRAC fluxes probing the rest-frame optical-to-near infrared emission of galaxies, are consistent with the observed values.

The Spitzer/IRAC colour  can be used to further test the agreement between observations and mock spectra. As we noted above, the redshift evolution of this colour is sensitive to the presence of optical emission lines, especially the two groups  and , contaminating the  and  bands (e.g. observationally, Stark et al. 2013; de Barros et al. 2014; theoretically, Wilkins et al. 2013). We therefore compare in Fig. 6 the evolution of the  colour in the redshift range predicted by our mock catalogue with data from different sources. These include, from low to high redshift, observations from 3D-HST (Skelton et al. 2014), from the spectroscopic sample of Smit et al. (2016) and from the photometric samples of Rasappu et al. (2016), Smit et al. (2015) and Smit et al. (2014). Since our mock catalogue is based on observations from the small, 4.7  XDF area, while data are drawn from much larger areas ( ), we encode in the circle sizes the rest-frame UV luminosity  of each object: the larger the circle, the more luminous the galaxy. Fig. 6 highlights the ability of the mock spectra to reproduce the sharp IRAC colour change at () caused by the entry of H in the  () band, including the most extreme blue and red colours, which are caused by . We note that the different extremes reached by the IRAC colours at () and at () are likely caused by the different widths of the IRAC filters, µm for  filter and µm for  one. This difference translates into a different contamination of the integrated broad-band flux of each band, at fixed emission line equivalent width (EW). At redshifts the  band is contaminated by the group of lines  and the  band by , therefore the  colour depends on the relative intensities of H-Balmer lines vs [O iii], i.e. on the physical conditions of ionized gas, especially metallicity and ionization parameter. In the narrow redshift window , the  band is free of strong emission lines, while  is contaminated by . Smit et al. (2015) exploit this property to select extreme emission lines galaxies with accurate photometric redshifts. They search for such extreme objects in the 5 CANDELS fields (), and find sources with . These are relatively bright galaxies falling in a narrow redshift range, hence, not surprisingly, they do not appear in our mock catalogue, which reaches at most at . Overall, Fig. 6 indicates that the strengths of the  and  lines in our mock catalogue are consistent with those inferred from Spitzer/IRAC colours of galaxies. This validation is particularly important since the strength of the optical emission lines in our mock catalogue will translate into a distribution of  in the NIRSpec simulations (see Section 3.2), therefore directly affecting the constraints on galaxy physical parameters form NIRSpec spectra discussed in Section 4.

The above tests only provide indirect constraints on spectroscopic features through the contamination of broad band filters by emission lines. As we have already noted, existing observatories do not allow the detection of optical emission lines at , hence to study the evolution of galaxy spectral features across the widest redshift range one must rely on UV emission lines. The Ly line is the most luminous emission line at UV wavelengths, but radiative transfer effects caused by its resonant nature make the comparison of model predictions and observations non trivial (e.g., Verhamme et al. 2006). We therefore consider another bright UV line, the doubly-ionized carbon line [C iii]907+C iii]909, which has been observed both at low (e.g. Berg et al. 2016; Senchyna et al. 2017) and high redshift (e.g. Erb et al. 2010; Stark et al. 2015). We compare in Fig. 7 the redshift evolution of the [C iii]907+C iii]909 equivalent width from literature with that predicted by our mock catalogue. In the same figure, we also show with a histogram the distribution of [C iii]907+C iii]909 equivalent widths in our mock catalogue. Fig. 7 indicates that while most galaxies in our catalogue exhibit Å, a significant number of mock spectra attain substantially larger  values, reaching the most extreme Å found by Stark et al. (2015, 2017) and Amorín et al. (2017).

The comparisons of our mock spectra with external observables presented in this section validate our semi-empirical approach to build a mock galaxy catalogue based on HST photometry in the HUDF. In particular, the above tests demonstrate the ability of our catalogue to match the rest-frame optical continuum emission of and galaxies (Fig. 5), the contamination of the strongest optical emission lines to Spitzer/IRAC bands (Fig. 6), and the observed range of [C iii]907+C iii]909 equivalent widths at (Fig. 7).

3 Simulating and fitting Jwst/NIRSpec observations

In this study, we focus on simulations of deep (100 ks) observations performed with JWST/NIRSpec. We consider the multi-object spectroscopy (MOS) mode and adopt the low spectral resolution configuration NIRSpec/prism (PRISM/CLEAR spectral configuration), which enables the simultaneous observation of up to objects over the full spectral range µm.

3.1 Simulating Jwst/NIRSpec spectra

Figure 8: Visibility and blending/unblending of the strongest UV and optical emission lines as a function of redshift, for NIRSpec/prism observations. The thin grey lines indicate the lines visibility as a function of redshift, computed assuming the nominal NIRSpec wavelength coverage . The thick black lines show the unblended single lines or blended group of lines which will be observable at different redshifts.

A detailed description of the approach adopted to simulate NIRSpec spectra can be found in Appendix A. Here, we only provide an overview of our approach and highlight a few key features of the simulations. For each galaxy, we start from the (noiseless) mock spectrum obtained using the procedure described in Section 2. This high-resolution spectrum is redshifted to the photometric redshift obtained with the \oldtextscbeagle analysis and then rebinned to the spectral pixel size of the NIRSpec/prism configuration. We then use an idealized model accounting for the response of the telescope and of the instrument to derive the number of electrons per-second generated at detector level in each pixel. Combined with a noise model, this allows us to generate a simulated (noisy) spectrum. This approach is very similar to the one used by many exposure-time calculators (ETCs), but simpler than the elaborate two-dimensional scheme used by JWST’s official ETC ‘Pandeia’ (Pontoppidan et al. 2016).

In our simulations, we also account for the fact that a significant fraction of the object light may fall outside of the standard 3-shutter slitlet, especially for extended objects (the projected size of the aperture of an individual micro-shutter is arcsec). In practice, we compute the (wavelength-dependent) throughput of an extended source with effective radius  described by an exponential profile (Sersic index equal to 1) and average over all source positions within the aperture of the central shutter of the slitlet. Averaging over random ellipticities and position angles produces small corrections ( few per cent), which we therefore ignore. We fix the effective radius to kpc at redshift (e.g. Shibuya et al. 2015, corresponding to arcsec), and adopt a conservative approach keeping the physical size fixed and accounting for the cosmological increase of the (apparent) sizes with redshift to arcsec at , at , at and at . We do not consider a redshift-dependent evolution of the effective radius because of the large uncertainties in the measurement of the sizes of faint galaxies at (e.g. see fig. 12 of Kawamata et al. 2017), but we note that assuming a decreasing  with increasing redshift would lead to a larger throughput. The total throughput at is between 39 per cent (at ) and 33 per cent (at ) for the aperture, light profile and size adopted. Assuming a smaller effective radius kpc (0.06 arcsec) at (Shibuya et al. 2015) would increase the throughput at from 33 to 44 per cent.

We note that the resolution of the NIRSpec prism has a strong wavelength dependence, from a minimum of at µm to a maximum of at µm. This makes groups of neighbouring emission lines appear blended or unblended depending on a galaxy redshift. Since UV and optical emission lines will provide the firmest constraints on the physical parameters of faint galaxies at high redshift, we have summarised in Fig. 8 the visibility and blending/unblending of different lines as a function of redshift. Emission lines in Fig. 8 are considered unblended when their redshifted wavelengths are separated by a corresponding to 2.2 pixels, i.e. the typical instrumental spectral resolution element size. This does not account for the effect of an extended source on the emission line profiles. Our simulations cover the redshift range , with most objects in the range 4–8. While groups of UV lines such as and are blended at all redshifts, at H is unblended with respect to [O iii] and [O iii], while these [O iii] lines are unblended at . The H line is blended with [N ii] at , while H is blended with [O iii]636 up to . We note that all the lines in Fig. 8 but and are unblended when using the disperser, while the disperser allows one to resolve also those doublets at most redshifts. Fig. 8 also highlights the ability of NIRSpec/prism observations to provide the simultaneous measurement of the main UV and optical emission lines for galaxies at : H is accessible up to , H to and [O iii] to .

Figure 9: (a) Simulated NIRSpec/prism spectrum (thick grey line) of a dropout (XDFV-3948262597, ) placed at redshift , with stellar mass , star formation rate and specific star formation rate . The red line and red shaded region indicate the posterior median and 95 per cent credible interval, respectively, from the \oldtextscbeagle fit of the simulated spectrum. (b) same as (a), but for a dropout (XDFZ-3312565447, ), with , , and . The simulated NIRSpec/prism spectrum correspond to an exposure time of ks. The small inset in each panel shows the observed XDF photometry (blue diamonds, from Bouwens et al. 2015a) along with the model photometry predicted by \oldtextscbeagle for the set of physical parameters associated to these sources with the procedure outlined in Sec. 2.3 (orange stars).
Figure 10: Illustration of simulated NIRSpec/prism spectra of 10 galaxies at . The grey line indicates the noiseless input spectrum, while the blue line shows the simulated one. Each spectrum is normalized to its maximum value, then shifted vertically for clarity. The magnitudes reported in the figure are observed ones and refer to the HST/WFC3 F160W filter, while the redshift is the \oldtextscbeagle-based photo- of the mock galaxy. Shaded regions indicate the location of the main optical emission lines [O ii] (blue),  (green) and  (red).

We show in Fig. 9 and 10 a few examples of simulated NIRSpec/prism spectra. Fig. 9(a) shows a galaxy with , and Fig. 9(b) one at , with . Fig. 9(a) shows a high signal-to-noise detection () of the Balmer lines H and H, and of the Oxygen lines [O iii] and [O iii]. Fig. 9(b) illustrates that the sensitivity of JWST/NIRSpec will allow us to obtain highly significant detections () of emission lines (H, H, [O iii] and [O iii]) even for a fainter galaxy at redshift . We note that the flux excess in the F140W band visible in the inset of Fig. 9(b) likely indicates a problem in the HST data, since at there are no strong emission lines falling in that band that can cause such an excess. Fig. 10 provides an overview of NIRSpec/prism observations of 10 galaxies at covering a broad range of continuum luminosities . The figure highlights how the large star formation rates per unit stellar mass and young ages of high redshift galaxies should enable statistically significant detections of emission lines even in galaxies out to .

3.2 Characteristics of the simulated spectra

Figure 11: Distribution of equivalent widths of H (short-dashed line), (solid line) and (long-dashed line), in our simulated galaxies.

We show in Fig. 11 the distribution of equivalent widths of H and of the two groups of lines  and  computed from the mock spectra. The mean (median) redshift of the galaxies in the mock catalogue is (), and the mean (median) equivalent width of (), which compares favourably with the value found by Smit et al. (2016) at redshift , and of and found by Rasappu et al. (2016) at redshift in their photometric and spectroscopic samples, respectively.

Fig. 12 shows the relation between the signal-to-noise ratio of the H line  and the observed F160W magnitude for all galaxies in the mock catalogue, while Table 2 indicates the fraction of galaxies in bins of F160W magnitude with , 5 and 10. The  reported in Fig. 12 and Table 2 accounts for the effect of the instrumental spectral response, while our simulations do not account for this effect. The consequence is, on average, a per cent larger emission line  in our simulated spectra with respect to a computation accounting for the spectral response (see Appendix A).

Figure 12: Relation between S/N of the H emission line and observed F160W magnitude. The colour coding is the same as in Fig 1 and reflects galaxies in different dropout classes.
(199) 1.00 0.97 0.83
(78) 0.95 0.81 0.51
(104) 0.88 0.64 0.19
(160) 0.71 0.42 0.05
(124) 0.65 0.19 0.00
(44) 0.34 0.05 0.00
Table 2: Fraction of objects in different bins of observed F160W magnitude with , 5 and 10.

Fig. 12 and Table 2 indicate that NIRSpec/prism ‘deep’ observations will detect most galaxies in the HUDF through their emission lines, with typical . More than 70 per cent of the XDF dropouts with will have detections of emission lines, and we expect to detect with about one third of the faintest () galaxies. The reported  values imply an even larger  of H (visible out to ), and likely of [O iii] (out to ), given the large ratios observed at high redshift (e.g. Strom et al. 2017).555Detector noise dominates the noise budget of NIRSpec observations of faint targets at the wavelengths relevant for this analysis. This causes a nearly linear increase of the line  with increasing received flux, at fixed observed wavelength (see also Appendix A). Assuming a constant photon conversion efficiency of the instrument, standard emission line intensities , and (Steidel et al. 2016), and ignoring the effect of dust attenuation, the  distribution in Fig. 12 would increase by a factor for H and for [O iii]. The factors and account for the different number of photons, and hence of photo-electrons, generated at different wavelengths at fixed emitted energy. As we will discuss in Section 4, the simultaneous detection at high  of several hydrogen and metal emission lines enables tight constraints on several key galaxy properties, such as star formation rates and gas-phase metallicities.

3.3 Fitting simulated Jwst/NIRSpec spectra

We use the \oldtextscbeagle tool to perform a full, pixel-by-pixel fitting of each simulated NIRSpec spectrum. We adopt a model similar to the one used to fit the HST/XDF photometry (see Section 2.2), but fixing the redshift to the estimated value to mimic the way data will be analysed, i.e. with redshift being determined directly from the position of the spectral lines. Unlike in the HST broad-band fitting, we let the interstellar metallicity  and fraction of -band attenuation arising in the diffuse ISM, , free to vary. We adopt the same independent priors as in Section 2.2, i.e. uniform for , , , , , and , exponential for  and Gaussian for and (see Table 1). As in Section 2.2, we consider a multi-variate Gaussian likelihood function, assuming independent errors in each pixel (see however Appendix A.6). The likelihood function can therefore be described by equation (2.2), considering the summation over all spectral pixels, and assuming , i.e. removing any additional error term since, by construction, we have a perfect knowledge of the noise properties of the simulated spectra. We defer to a future work, based on the NIRSpec Instrument Performance Simulator, a more detailed analysis of other sources of random and systematic uncertainties in JWST/NIRSpec data, related, for instance, to detector defects, flat-fielding and spectro-photometric calibration (see Appendix A.6).

We show in Fig. 9 two examples of the \oldtextscbeagle fitting of the simulated spectra. The red lines indicate the spectra obtained by computing, in each pixel, the median of the posterior distribution of predicted fluxes, while the red shaded regions show the 95 per cent central credible interval computed from the same distribution of fluxes. Fig. 9(a) and (b) show how the high- detections of several emission lines constrain the model predictions to high precision. As we will discuss in the next section, this in turn induces tight constrains on several physical quantities describing the conditions of photoionized gas in these galaxies.

4 Jwst/NIRSpec constraints on galaxy properties

Figure 13: Comparison between the input physical parameter (on the x-axis) and the value retrieved by fitting the full NIRSpec/prism simulated spectra with \oldtextscbeagle (on the y-axis) for the 6 model parameters stellar mass , , current star formation rate , ionization parameter , gas-phase metallicity  and attenuation optical depth . The different colours correspond to the different dropout classes, and follow the colour coding of Fig. 1. The retrieved parameter value corresponds to the posterior median of the marginal posterior distribution, while the error bars to the 68 per cent central credible region. To avoid overcrowding the plot, we only show errorbars for 20 randomly selected objects.
0.13 0.16 0.22 0.27 0.22 0.29 0.31 0.37 0.37 0.37 0.61 0.59 0.50 0.53 0.52
0.25 0.36 0.33 0.35 0.30 0.55 0.55 0.53 0.52 0.46 1.03 0.90 0.76 0.85 0.67
0.05 0.06 0.09 0.18 0.19 0.10 0.14 0.17 0.31 0.41 0.36 0.50 0.73 1.54 1.47
0.05 0.06 0.09 0.10 0.16 0.15 0.19 0.20 0.30 0.33 0.53 0.58 0.54 0.79 0.80
0.04 0.05 0.06 0.11 0.06 0.08 0.11 0.14 0.20 0.17 0.24 0.30 0.34 0.57 0.78
0.16 0.18 0.21 0.27 0.22 0.34 0.32 0.34 0.40 0.34 0.94 0.80 0.73 0.89 0.67
Table 3: Error thresholds on the retrieved model parameters corresponding to, 0.2, 0.5 and 0.9 of the inverse cumulative distribution function (quantile function) of the errors , for the different dropout classes. The error is defined as half of the 68 per cent central credible interval ( for a Gaussian distribution). Each cell indicate the error threshold on the parameter such that the cumulative distribution , 0.5 and 0.9.

We summarise in Fig. 13 and Table 3 the main results of our analysis of simulated ‘deep‘ NIRSpec observations of high redshift galaxies. We report the constraints on the 6 quantities stellar mass , galaxy age , current star formation rate , ionization parameter , gas-phase metallicity  and -band attenuation optical depth . Fig. 13 shows in different panels the relation between retrieved and input value for the 6 quantities above. The large errorbars visible in Fig. 13(a) in galaxies with , along with the ‘flattening’ of the relation between input and retrieved value, highlight the large uncertainty affecting the determination of the stellar masses, for which the posterior distribution is dominated by the prior. The stellar continuum emission in these galaxies, related to the stellar mass, is in fact too faint to be detected in our simulated spectra. Similarly, the large errorbars and spread of the points in Fig. 13(b) show the difficulty of determining the age of the oldest stars from our simulated NIRSpec spectra. The reason is that the UV-to-optical spectra of galaxies with large specific star formation rates are dominated by the emission from young stars, which outshine the older ones. Fig. 13(c), (d), (e) and (f) show that the star formation rate, ionization parameter, gas-phase metallicity and attenuation optical depth can be constrained across the entire range spanned by the input parameters. The constraints on these quantities come from emission lines ratios, and are therefore largely unaffected by our ability to detect the stellar continuum emission in faint, high-redshift galaxies.

We quantify in Table 3 the precision of the constraints on the 6 physical parameters shown in Fig. 13. For this, we compute for each parameter the cumulative distribution of the errors , and report in Table 3 the errors corresponding to 0.2, 0.5 and 0.9 of the inverse cumulative distribution , i.e. , and . The values therefore indicate the error on the parameter such that 20, 50, or 90 per cent of the objects exhibit errors . The central column of Table 3 indicates that for per cent of the simulated galaxies the stellar masses (or mass-to-light ratios) can be constrained within a factor of ( dex), the age of onset of star formation within a factor of ( dex), the star formation rate, ionization parameter and gas-phase metallicity within a factor of albeit the precision worsens for galaxies at ( and dropouts). The -band dust attenuation optical depth can be constrained with a precision .

The model parameters not reported in Fig. 13 and Table 3, namely the timescale of star formation , dust-to-metal mass ratio and fraction of dust attenuation arising in the diffuse ISM , are largely unconstrained by this analysis. As for the age of the oldest stars, constraining  and  would require detecting at high  the rest-frame optical-to-near infrared (stellar) continuum emission of the galaxies, which traces older stellar populations. Since the NIRSpec observations here considered mainly provide high- detections of optical emission lines, a way to better constrain these parameters would be to combine NIRSpec spectra with JWST/NIRCam and MIRI photometry. This type of analysis will be included in a future study. Constraining , on the other hand, requires measuring, with high , emission lines of refractory and non-refractory elements, such as oxygen ([O ii], [O iii]), and nitrogen and/or sulphur ([N ii], [S ii], e.g. see figure 3 of Gutkin et al. 2016). The emission lines of non-refractory elements, however, are too weak to be detected in the galaxies studied in this work. We note that although ,  and  are unconstrained by the simulated data, their inclusion in the fitting allows us to appropriately propagate into the constraints on the other physical parameters the uncertainty deriving from their poor knowledge, i.e. ,  and  in this analysis play the role of ‘nuisance’ parameters.

We now discuss the effect on the precision of the physical parameters retrieval of the H line moving beyond the spectral coverage of NIRSpec at (see Fig. 8). Table 3 shows that the error threshold for the parameters ,  and  slowly increases from the , to , to dropouts, reflecting the presence of fainter objects, on average, when moving from lower to higher redshift sources (see Fig. 1). The and dropouts, on the other hand, exhibit significantly larger values, likely because of the disappearance of the H line from the NIRSpec data at those redshifts. These parameters are in fact mainly constrained by Balmer lines () or line ratios involving such lines [, ], and given that H is intrinsically more luminous than H, and that it suffers less attenuation, the constraining power of NIRSpec data suddenly drops as H is shifted outside the NIRSpec range. The NIRSpec constraints on the stellar mass (or mass-to-light ratio) worsen with increasing redshift, following the average dimming of the sources, while those on  do not significantly evolve with redshift, likely because of the small  values in our mock catalogue, and hence relatively large error thresholds on  here considered. The age of onset of star formation is the parameter with the poorest constraints: as for , constraining the galaxy age requires detections of the stellar continuum of evolved stellar populations, which is only possible for the brightest galaxies in the sample.

5 Discussion

The results presented in the previous section demonstrate how the unique combination of sensitivity and wavelength coverage of the NIRSpec/prism configuration will allow the precise characterisation of the physical properties of star-forming galaxies across a wide range of redshifts and luminosities. Specifically, the ability to observe multiple optical emission lines at high  should enable precise measurements of the gas-phase metallicity and star-formation rate, which, combined with stellar mass constraints from NIRCam, should significantly boost our ability to understand the mass and metal assembly in galaxies at redshift . We note that a way to improve the precision of the physical parameter estimates of objects with low  emission line detections would be to combine them with galaxies with higher  measurements into a Bayesian hierarchical framework. This would enable population-wide constraints on the parameters and to increase the precision of individual constraints. An ongoing effort to study such an approach is currently underway (Curtis-Lake et al., in prep).

The analysis presented so far illustrates the statistical constraints on several galaxy properties that can be obtained from 100 ks NIRSpec/prism observations of high-redshift sources, but it does not provide any insight on the actual number of sources that will be observed with a NIRSpec pointing. Here, we therefore discuss the number of HST-selected galaxies expected to be observed in a single NIRSpec/prism pointing centred on the HUDF. Fig. 14 shows the HUDF/GOODS-South area along with the four NIRSpec MSA quadrants, where we centred one quadrant on the 4.6  area of the HUDF with deep WFC3 observations. Fig. 14 shows how HUDF sources will only be observable from one MSA quadrant, while for the remaining three quadrants galaxies will have to be selected from shallower data in GOODS-South.666The HUDF/XDF area is larger than the on-sky area of a single MSA quadrant ( arcmin, excluding vignetted shutters), but the separation between quadrants ( arcmin along one axis, arcmin along the other) and unknown telescope roll-angle will in practice limit the possibility of selecting HUDF/XDF targets in more than one quadrant. We report in Table 4 the expected number  of galaxies at and that can be observed with a single NIRSpec pointing in the HUDF/GOODS-South area, along with the on-sky density of galaxies in the HUDF/XDF and GOODS-South regions. Because the faintest targets in the HUDF/XDF may not be detected at a significant level in our deep NIRSpec/prism observations, we also report the expected number of targets observed in the HUDF/XDF with and . From the on-sky density of the targets, and assuming Poisson-distributed sources, a 3 shutters slitlet configuration, and allowing targets to occupy any position within the open area of a shutter (Ferruit at al., in prep), we can compute the number of expected sources observed in a single NIRSpec pointing. At , we consider potential targets in the HUDF/XDF and in CANDELS/GOODS-South, which implies non-overlapping galaxy spectra observed, on average, in a single pointing. At , the larger number of targets, in the XDF and in CANDELS/GOODS-South, translates into spectra observed in a single pointing. This will leave unused slitlets which will be occupied by galaxies at lower redshift. Table 4 also indicates that restricting the targets in the HUDF/XDF to galaxies with (which should exhibit , see Table 2), reduces by (from to ) the number of observed galaxies in a pointing. Given the paucity of targets, it is therefore advisable to prioritise the brighter targets without excluding the fainter ones, as we have previously shown that a significant fraction of galaxies should show detections of H (see Table 2). We note that the main factor limiting the NIRSpec multiplexing of galaxies is their on-sky density outside the HUDF/XDF region, as this density is only in the CANDELS/GOODS-South ‘deep’ area, where data reach a depth of .

Catalog a b
c d c d
XDFe 304 80
413 132
all 513 202
GOODS-Sf all 2061 303
  • For XDF, we include the and dropouts. For CANDELS, we consider galaxies at and from table 4 of Bouwens et al. (2015a).

  • For XDF, we include the , and dropouts. For CANDELS, we consider galaxies at from the same table as in a.

  • On-sky target density of the objects in units of .

  • Average number of non-overlapping spectra observed with the NIRSpec/prism in a single pointing.

  • The number of objects refers to the entire XDF area of , while to compute we consider 1 MSA quadrant.

  • We consider the total number of objects in CANDELS/GOODS-South ‘deep’, corresponding to an area of

Table 4: Number of galaxies at and expected to be observed with a single NIRSpec/prism pointing centred on the HUDF.
Figure 14: The four quadrants of the NIRSpec micro-shutter array (MSA) overplotted on galaxies at redshift (blue symbols) and