# Parametrizations of the 21-cm global signal and parameter estimation from single-dipole experiments

###### Abstract

One approach to extracting the global 21-cm signal from total-power measurements at low radio frequencies is to parametrize the different contributions to the data and then fit for these parameters. We examine parametrizations of the 21-cm signal itself, and propose one based on modelling the Ly background, intergalactic medium temperature and hydrogen ionized fraction using tanh functions. This captures the shape of the signal from a physical modelling code better than an earlier parametrization based on interpolating between maxima and minima of the signal, and imposes a greater level of physical plausibility. This allows less biased constraints on the turning points of the signal, even though these are not explicitly fit for. Biases can also be alleviated by discarding information which is less robustly described by the parametrization, for example by ignoring detailed shape information coming from the covariances between turning points or from the high-frequency parts of the signal, or by marginalizing over the high-frequency parts of the signal by fitting a more complex foreground model. The fits are sufficiently accurate to be usable for experiments gathering of data, though in this case it may be important to choose observing windows which do not include the brightest areas of the foregrounds. Our assumption of pointed, single-antenna observations and very broad-band fitting makes these results particularly applicable to experiments such as the Dark Ages Radio Explorer, which would study the global 21-cm signal from the clean environment of a low lunar orbit, taking data from the far side.

###### keywords:

methods: statistical – cosmology: theory – dark ages, reionization, first stars – diffuse radiation – radio lines: general.^{†}

^{†}pubyear: 2015

^{†}

^{†}pagerange: Parametrizations of the 21-cm global signal and parameter estimation from single-dipole experiments–Parametrizations of the 21-cm global signal and parameter estimation from single-dipole experiments

## 1 Introduction

The sky-averaged or ‘global’ 21-cm signal, , is the mean differential brightness temperature of the 21-cm line of hydrogen, relative to the cosmic microwave background (CMB), as a function of redshift or observed frequency. The amplitude of the signal is determined by the amount of neutral hydrogen present and by the relative number of electrons in the ground and excited states of the 21-cm transition, which is often defined in terms of the spin temperature, , of the transition. depends in turn on the kinetic temperature, , of the gas, and on the efficiency with which can be driven towards , and away from the CMB temperature, by collisions and by Ly coupling (the Wouthuysen-Field effect; Wouthuysen, 1952; Field, 1958). is, therefore, sensitive to the evolution of a variety of different radiation fields: ionizing radiation, which destroys neutral hydrogen; X-rays, which can heat the gas and raise ; and Ly , which causes Wouthuysen-Field coupling. This makes it a valuable probe of sources of radiation and heating up to the end of the epoch of reionization at (see e.g., Pritchard & Loeb, 2012, and references therein).

To make inferences about the properties of the high-redshift Universe from radio observations at the relevant frequencies (), it is useful to have models for that can be fit to the data and compared with each other. We note that some of the information contained in the signal, especially about the radiation fields which affect it most directly, may be extracted in a relatively model-independent way (Mirocha et al., 2013), and that it may be possible to extract the signal as a residual without explicitly fitting for it during foreground removal (Liu et al., 2013; Switzer & Liu, 2014). Nonetheless, we generally deal with parametrized models. These parameters may be physical — describing assumptions about, for example, the spectra of early sources, the efficiency of different types of star formation and the mass of star-forming haloes — and we then require a code which can compute the radiation backgrounds from the evolving population of sources, and their effect on the 21-cm signal. At the other end of the scale, we can choose a flexible parametric form which we hope can describe the shape of without having to specify the details of the physics in advance, such as a cubic spline, as in Pritchard & Loeb (2010).

The fitting is complicated by the fact that the redshifted 21-cm spectrum is superimposed on that of bright astrophysical foregrounds, and is observed by an instrument with a frequency response which may be complicated and/or known only through careful calibration. One approach to this problem is to find parametrized models for the foregrounds, instrument, and any other components of the observed spectrum (or spectra, if multiple pointings are used in order to gather more independent information on the foregrounds and the instrument), and fit them simultaneously with the parameters of the 21-cm signal. This entails searching what may be a high-dimensional parameter space, which one may do with a Markov Chain Monte Carlo sampler (Harker et al., 2012), nested sampling (Harker, 2015) or similar methods. This allows us to rigorously characterize the errors, study the degeneracies between foreground and signal parameters, and so on, but may require the likelihood (and therefore a realization of the signal) to be computed many times. This places requirements on the computational cost of our signal model, as well as on its flexibility and accuracy.

The aim of this paper is to study different parametrizations of , comparing their ability to represent the signal faithfully, to retain the astrophysical information in the measured spectrum, and to be used in a signal extraction pipeline where the cost of computing the model signal may be important. This is timely because a number of projects (current or proposed) aim to detect the 21-cm global signal, e.g. EDGES (Rogers et al., 2015), LEDA (Greenhill & Bernardi, 2012), BIGHORNS (Sokolowski et al., 2015) and SCI-HI (Voytek et al., 2014). EDGES, especially, has already produced tentative constraints on the rapidity of reionization () from observations between 100 and 200 MHz (Bowman & Rogers, 2010). We describe the setup of these experiments in general terms in Section 2, then, for concreteness, focus more specifically on the Dark Ages Radio Explorer (DARE; Burns et al. 2012; Burns et al. in preparation). Our parametrizations of the signal are introduced in Section 3, and we then describe the process of extracting the signal from the data in Section 4. The quality of the recovery is compared for different parametrizations and experimental setups in Section 5, which we discuss further in Section 6 before offering some conclusions in Section 7.

## 2 Experimental setup

We consider a pointed experiment, that is it takes integrated low-frequency radio spectra in a number of discrete directions rather than, say, adopting some sort of scanning approach. We therefore concentrate on the case where we have some small number of independent spectra from which we wish to extract the global signal. In the case of DARE, the maximum number of independent pointings is , since the antenna power pattern is broad (the beam has a full width at half-maximum of tens of degrees, depending on frequency). All else being equal, a larger number of pointings should improve our constraints, since they provide more independent information on the foregrounds and the instrument. If a very large fraction of the sky is covered, however, this implies that some spectra will include brighter regions near the Galactic Centre, reducing our sensitivity. This implies a tradeoff which we examine later.

In designing a global signal experiment, including designing a suitable antenna, the optimal frequency range must also be considered. A very low frequency experiment to study the dark ages would encounter significant problems if conducted from the ground, because of the ionosphere (Datta et al., 2014; Vedantham et al., 2014), but might also require a large antenna which could cause problems for a space mission, so we do not consider it here. We wish to include the start of the cosmic dawn in our analysis, which suggests starting at or lower, while a practical antenna can offer sensitivity and a smooth beam over perhaps a factor of 3 in frequency, suggesting a maximum of around . This will probably allow an experiment to capture the start of reionization, but not the end. We will generally consider a frequency window of –, though we will also look at the effects of narrower ranges.

At each frequency, the sky temperature seen by the antenna is an integral of the true sky over the antenna power pattern. The true sky includes both the 21-cm signal and the foreground signal. The parametrization of the signal is our main concern here, and so we treat the foregrounds very simply, assuming they can be described by a polynomial in – for each pointing, where is frequency. The coefficients of this polynomial, which constitute the parameters of our foreground model, are computed from the global sky model of de Oliveira-Costa et al. (2008). This is done by first integrating the sky model over the power pattern of the antenna at a large number of frequencies for each pointing we wish to consider, and then computing a polynomial fit in –. Because of this subsequent fitting stage, the power pattern we assume for the antenna is not important; we take it to be Gaussian, so that the computation can be done efficiently using the routines in healpy^{1}^{1}1https://github.com/healpy/healpy, which is based on the healpix (Górski
et al., 2005) package^{2}^{2}2http://healpix.sourceforge.net/.

We assume a simple model for the instrument response: the calibrated noise-free (modelled) spectrum is given by

(1) |

where is the receiver temperature, which we assume to be a constant. The noise on a frequency channel of width after an integration time is then given by the radiometer equation, , assuming two polarizations are averaged together. For the high sky temperatures at these frequencies, therefore makes a relatively minor contribution to the noise temperature unless is very small, so although we assume it to be for concreteness, this has little influence on our results.

## 3 The 21-cm global signal and parametrizations

Our model for the 21-cm signal is computed with the Accelerated Reionization Era Simulations (ares) code^{3}^{3}3https://bitbucket.org/mirochaj/ares, first designed to investigate the signatures of X-ray heating in the global 21-cm signal (Mirocha, 2014). As in previous works (e.g., Furlanetto, 2006; Pritchard &
Loeb, 2010), ares divides the intergalactic medium (IGM) into two phases: (i) a fully ionized phase representing \ionHii bubbles around galaxies, whose volume filling factor affects the overall normalization of the global 21-cm signal, and (ii) a mostly neutral ‘bulk IGM’ phase beyond bubbles, whose spin temperature determines the strength and sign of the global 21-cm signal.

There are many parameters in ares that can be varied to generate different realizations of the 21-cm signal. In this work, we consider the cosmological parameters and the primordial power spectrum to be fixed, which in principle specifies the amount of matter which has been confined in haloes (and the mass function of haloes) as a function of redshift. The remaining parameters of the model govern how efficiently this collapsed mass is converted into radiation of different wavelengths, including the Ly which causes Wouthuysen-Field coupling, ionizing radiation (which affects ), and X-rays which heat the IGM efficiently. For example, we may specify a minimum halo virial temperature below which a halo cannot form stars, the formation efficiency and spectral energy distribution of different source populations, the fraction of the radiation which escapes into the IGM, and so on. Our reference model assumes values for these parameters which are consistent with low-redshift values, and results in a history for which the Thomson scattering optical depth to the CMB is consistent with constraints from Planck (Planck Collaboration XIII, 2015).

Part of the aim of this series of papers is to determine how well the values of these parameters really can be expected to represent the physics generating the signal. That is, if we generate a signal with ares, and use it as part of a synthetic data set, to what extent can we recover what we put in? This includes recovery of (i) the signal itself, (ii) the properties of the IGM consistent with that recovered signal, and ultimately, (iii) the properties of galaxies required to explain the IGM properties.

In its simplest mode of operation – in which the cosmological radiative transfer is treated approximately – ares is in principle fast enough to be used to model the signal when fitting a synthetic data set, which simultaneously yields a recovered signal, the entire history of Ly emission, ionization and heating in the IGM, and constraints on the values for physical parameters in the code. To fit more computationally expensive physical models, which have more free parameters, more advanced physics, or both, may require a “two-stage” approach, in which a preliminary fit is performed using a computationally efficient – but perhaps phenomenological – parametrization of the global 21-cm signal. This first stage yields a set of measurements to be fit subsequently by a more complex model. We will now consider two parametrizations of the signal that are inexpensive alternatives to the full ares model, and revisit the two-stage approach in §5.

The first parametrization that we will consider is that put forward by Pritchard & Loeb (2010), which we will call the ‘turning points’ parametrization. This has also been used in subsequent studies (Harker et al., 2012; Harker, 2015) as a fast and convenient method to describe the major features of a generic global signal. In this picture, the 21-cm spectrum has a number of maxima and minima, caused as different effects become dominant in determining . The positions (in redshift or frequency, and in brightness temperature) of these turning points are the parameters of the model. The signal between these turning points is modelled as a cubic spline. This parametrization is very flexible and can describe a wide range of plausible 21-cm signals, but the turning point positions require further interpretation in order to relate them to the physics of the first sources, via their constraints on the intensity of various radiation fields and the properties of the IGM at the redshifts of the turning points (Mirocha et al., 2013).

Label | Type | Description | |||
---|---|---|---|---|---|

A | 16.1 | 87.2 | Minimum | ‘Dark ages’; collisional coupling becomes ineffective. | |

B | 47.4 | 29.0 | Maximum | ‘Cosmic dawn’; Ly coupling becomes effective. | |

C | 71.0 | 19.0 | Minimum | Heating becomes important. | |

D | 111.4 | 11.7 | 19.2 | Maximum | Heating saturated; reionization begins. |

E | 180 | 6.9 | 0 | Minimum | End of reionization; null signal after this point. |

We will also consider a parametrization which is in some sense intermediate between the ‘turning points’ model and physical models like ares, which we will call the ‘tanh’ parametrization. In this model, the Ly background (which determines the strength of Wouthuysen-Field coupling), the temperature of the IGM, and the ionized fraction of hydrogen, all evolve according to a tanh model, i.e. for each quantity we have

(2) |

so that they are zero at high redshift^{4}^{4}4With the exception of the IGM temperature, which is a sum of a ‘tanh’ term and an adiabatic cooling term, , in order to reproduce the ‘dark ages’ signal prior to first light., switch on over an interval of width around a redshift of , and become saturated at a value of at low redshift. Therefore each of these three quantities has three parameters describing its evolution, though e.g. the parameter for ionization has a natural value of unity and is not free to vary if the model is to represent a physical history. Because the tanh model specifies values for IGM quantities, unlike the ‘turning points’ model which is purely phenomenological, it can also yield e.g. the Thomson optical depth. The parameters are not linked directly to source properties, however, and so it is in this sense that we describe it as being intermediate between the ares-based parametrization and the ‘turning points’ parametrization.

Although we will introduce our fitting procedure fully in the next section, we show the result of attempting to fit the signal generated by ares with the turning points and tanh parametrizations now, in order to point out some features of the parametrizations. This fitting, shown in Fig. 1, assumes an idealized instrument model, simple foregrounds, and 1000 h of observation in four sky regions, which should be a relatively simple case for the extraction to deal with, in order to focus on the differences between parametrizations.

We first note that the ares model plotted in Fig. 1 has the typical features we expect: a maximum at corresponding to the onset of Ly coupling, a minimum at where the IGM starts heating, and a broad maximum at where ionization starts to become important. We will refer to these features as turning points B, C, and D, respectively (see Table 1). If we interpolate between these maxima and minima using a cubic spline (the turning points parametrization), we produce the magenta dot–dashed line. Despite enforcing the correct parameter values, the shape of the curve under this popular parametrization scheme does not closely match the curve produced by the physical model. This might raise concerns that a turning points model will not perform well in the extraction of the signal from synthetic data, and these concerns are not dispelled by the solid blue line, which is the result of attempting this fitting.

The overall difference in normalization between the black and blue lines comes about because a constant offset in the signal can be almost perfectly absorbed by the foregrounds, and so is very difficult to determine from the data. This can lead to an unphysical signal, which we can attempt to solve by imposing stricter priors, but which is not prevented by the parametrization itself. Even if we correct this offset in normalization by hand, however, we see that the shape of the recovered signal does not closely follow the shape of the input signal. This is shown by the dashed red line, for which we artificially add an offset to the recovered signal to ensure that the temperature of the minimum of the absorption trough matches the input. We can see that even the frequencies of turning points B and D are poorly recovered: B is at too high a frequency, and D is too low. A possible reason for this can be seen by comparison with the magenta dot–dashed line, which goes through the correct turning points. We see that near the minimum of the signal, the offset recovered curve captures the shape of the input signal much better. It seems, then, that by matching the shape of the signal between the turning points, we recover worse values for the positions of the turning points themselves.

Finally, we examine a fit using the tanh model, shown with the dotted cyan line. This captures the overall shape of the signal much better than the turning points model, though there seems to be an offset at high frequency which we examine later. In fact, if we look at the position of the turning points recovered from the tanh model, they match those of the input signal better than the turning points model even though they are not explicit parameters of the tanh model. Moreover, since the tanh functions in this parametrization describe properties of the IGM, which are then translated into a , we can guarantee that the signal is physically plausible much more readily than for the turning points parametrization. The tanh model is, however, much faster to evaluate than a full ares run, and so it is a promising parametrization for which to test our fitting in the rest of this paper.

## 4 Description of the recovery process

Given a synthetic data set, i.e. a small number of independent, noisy spectra (from pointing towards different areas of the sky) including a 21-cm signal, foregrounds, instrumental response and noise, generated according to the procedure outlined in Section 2, we wish to fit some parameters describing the different contributions to the data. In this paper we wish to concentrate our attention on the difference between parametrizations of the 21-cm signal, and so we fix the instrument model to be the same as the one used to generate the data set, and fit the parameters of a signal model (not necessarily the same one used to generate the data) and a straightforward foreground model. The global 21-cm signal is uniform over the sky at the angular resolutions and noise levels we consider here (Bittner & Loeb, 2011), so the parameters of the signal are the same in each sky area. The foreground parameters are allowed to differ, however.

We have updated our fitting routines from the ones used by Harker et al. (2012) for an earlier version of the DARE pipeline. Fitting moderately large numbers of sky regions, more complex foregrounds, more computationally expensive signal models, and potentially parameters of an instrument model, requires us to be able to explore a complicated parameter space with perhaps a few dozen dimensions. For this reason, we now use the emcee package (Foreman-Mackey et al., 2013), which implements the affine-invariant Markov Chain Monte Carlo sampler of Goodman & Weare (2010). This yields samples from the posterior probability distribution of the parameters of interest, given the data. In computing the likelihood, we assume all the frequency channels in all sky regions are independent, i.e. the probability density for obtaining the value , where indexes the sky region, for a vector of parameters , is

(3) |

where is the rms noise in the channel, computed from , the bandwidth and the integration time using the radiometer equation, and the likelihood is just the product over all the channels,

(4) |

More generally, we could concatenate the individual spectra into a single ‘data vector’, in which case our independence assumption implies a diagonal data covariance matrix, but this machinery is not necessary for the current work. In practice, we work with the log-likelihood, so the product in equation (4) is computed as a sum. We adopt broad, Gaussian priors for the foreground parameters, which have little impact since the data generally constrain them quite well. For the signal parameters we generally adopt uniform priors; occasionally, these do come into play, for example in preventing the 21-cm signal parametrized by its turning points from becoming unphysical, but it is generally clear when this is the case, and we comment on it when relevant.

### 4.1 Two-stage fitting

As alluded to in the previous section, we will also consider a scenario in which foreground removal has yielded the parameters of some 21-cm signal model, with errors, and we want to perform a second stage of inference about a different model. This situation might arise when the full parameter space has high dimension (for example, we wish to fit foregrounds in a large number of sky regions, perhaps including contributions from the Sun, Moon, etc., along with instrumental parameters) meaning that we can only perform the signal extraction using a signal model which can be generated rapidly. For example, we might infer the positions of the turning points, under that parametrization.

We might then wish to take those turning point constraints, and use them to infer something about the parameters of a model which is slower to evaluate, but has more physically meaningful parameters, for example the full ares signal model. We wish to test how much of the physical information in the data is retained in this two-step approach, since it may determine the requirements we place on our signal extraction pipeline. We also perform this second-stage fitting using emcee, but rather than the likelihood being computed as a sum over all the frequency channels in all the sky areas, it is computed assuming Gaussian errors on the parameters of the intermediate parametrization (either independent errors, or using the covariance matrix coming from the first-stage fitting).

## 5 Recovering parameters from observations

We start by comparing the turning points recovered by a direct fit of the ‘turning points’ parametrization to an ares model with the turning points recovered by fitting the tanh parametrization (where in the latter case, the turning points are the extrema of the reconstructed signal).

The direct fit of the turning points model, for the same case as for Fig. 1, is shown in Fig. 2, which is a ‘corner plot’ made using Foreman-Mackey’s triangle_plot package^{5}^{5}5https://github.com/dfm/triangle.py.

As noted in the discussion of Fig. 1, the turning point constraints are biased, and all the ‘true’ values of the turning point parameters lie outside the axis ranges of the panels in Fig. 2. The statistical errors are therefore misleading, even though the instrument model is perfect and fixed, and the foreground model is able (by construction) to perfectly model the input foregrounds. None the less, some of the correlations we can see in the 2D posteriors, such as the positive correlation between the frequency and differential brightness temperature of turning point C, persist in other cases. For example, we see this correlation when the input model (as well as the fit mode) uses the ‘turning points’ parametrization, or when the turning points are inferred from a tanh fit to the data.

Similarly, Fig. 3 shows the parameters of the tanh fit corresponding to the cyan dotted line in Fig. 1.

Although there are some strong degeneracies, for example between the normalization and central redshift of the Ly history, and between the central redshift and width of the temperature step, which may suggest that a lower-dimensional parametrization could work, the constraints seem plausible and physical. Constraints on the reionization midpoint and duration are comparable to numbers quoted in the literature, but should be interpreted with caution as the bandpass used in the fit is truncated at 120 MHz, i.e., . As a result, these constraints would likely change if one imposed prior information from the late stages of reionization. However, such constraints are still meaningful: we will focus on how these parameters are related to the volume filling factor of \ionHii regions, , and the volume-averaged ionization rate, , momentarily.

We see how these translate into constraints on the turning points in Fig. 4.

The statistical errors are similar to those for the direct spline fit through the turning points (Fig. 2), but the best-fitting values are clearly much closer to the true values, i.e. the bias is substantially reduced. Some biases remain, however, and are a consistent feature in all of our calculations, regardless of the integration time or number of sky regions. We have checked that despite the small residual bias, the fitting correctly captures the variation in turning point position with changes in ares parameters. For example, as the efficiency of X-ray production changes, the position of turning point C (in frequency and temperature) also changes, and the fitted values accurately reflect this.

We should expect biases in the turning point constraints to propagate to constraints on the IGM properties, so we will next investigate how to mitigate these biases. First, we test whether the constraints might be affected by the frequency range used for the fitting and the complexity of the foreground model. We can see from Fig. 1 that there is a mismatch between the input model and the recovered tanh-fit at the highest frequencies, where it seems to lie below the input signal. This is not unique to the specific realization fitted in Fig. 1, and may simply be due to a degeneracy with the foreground, which is more difficult to overcome at high frequencies where the signal is smoothest. Biases at high frequencies, depending on the parametrization, could propagate to lower frequencies. A more complex foreground model may help by absorbing the smooth divergence between the tanh model and the input curve at high frequencies. Usually, this sort of degeneracy is a disadvantage since it weakens the constraints on the signal, but we aim to test whether, by allowing a wider range of histories at low redshift, a more complex model might avoid biases at higher redshift, perhaps at the expense of increased errors.

The turning point constraints for the same model as in the earlier figures, for a data set truncated at , and for a data set using the full frequency range but a fourth-order rather than a third-order polynomial model for the foregrounds, are shown in Fig. 5.

As one would expect, truncating the frequency range significantly weakens the constraints on turning point D. It does, however, reduce the bias on the frequency of turning point C, for which the constraints were only marginally consistent with the true value in the standard case. Therefore it does seem that discarding frequencies where the parametrization is unable to capture the shape of the signal helps with recovery. Perhaps surprisingly, increasing the complexity of the foreground model has a similar effect, increasing the errors on the high-frequency turning point but reducing the bias on turning point C. The extra foreground parameters act as extra nuisance parameters which absorb the difference between the tanh model and the data, and which are marginalized over to produce an unbiased constraint.

A similar effect can be seen at work if we examine one- and two-stage fits for the IGM parameters. By a one-stage fit, we mean that the properties of the IGM are taken directly from the tanh parametrization. By a two-stage fit, as discussed in §4.1, we mean that, first, a tanh fit is used to infer the positions of the turning points and then, secondly, the constraints on the turning points are used to infer IGM parameters. We look at two flavours of the second-stage fit in Figs 6 and 7.

In the first, we simply take the errors on the frequency and temperature of each turning point to be independent and Gaussian. In the second, we still assume the errors are Gaussian, but use the covariance matrix for the turning point parameters obtained from the posterior samples of the tanh fit.

If we only use the turning points to constrain the IGM parameters, and do not make use of any other shape information in the signal, this naturally increases the errors, as can be seen in the larger contours and broader 1D distributions for the two-stage fits. This is especially so if we disregard the correlation structure and treat each turning point parameter independently (red lines). The two-stage fits do, however, reduce the bias on the inference of the IGM properties. The black contours are inconsistent with the true values, while the red and green contours from the two-stage fits enclose the true value. Our interpretation of this is that although we have discarded shape information, retaining only the turning point positions, this shape information was unreliable, and biased our constraints. The turning points encode robust information about the signal, even when they are inferred from a parametrization (the tanh model) which does not explicitly include their positions as parameters. This highlights the importance either of finding a parametrization which is flexible enough to be able to capture the true shape of the signal, or finding robust quantities which yield reliable information even if the parametrization is imperfect. Of course, for a real experiment we do not know the shape of the signal in advance, though we may be able to choose between different parametrizations using e.g. the Bayesian evidence.

In Fig. 8, we distil the results of a calculation suite in which the total integration time, , and the number of independent sky regions, , is varied, showing constraints on the turning points as a function of and . Increasing the integration time by a factor of 10 has a much more substantial effect than increasing the number of sky regions, though subtle -level biases, particularly in the frequency and brightness temperature of turning point D, are persistent even with 1000 h integration. Increasing from 1 to 2 has a greater impact than any further increases, though this conclusion may change if we must also fit parameters of the instrument, since we will then have constraints from a greater variety of signals being passed through the instrumental response. The results for and are perhaps surprisingly poor, but this reflects the fact that by using the entire sky, we end up including the parts with the most intense foregrounds, in the direction of the Galactic Centre. Since this paper’s focus is on signal parametrizations, an investigation of the tradeoff between the number of independent sky regions and the intensity of the foregrounds (and how this affects constraints on the instrumental response) is beyond its scope.

## 6 Discussion

While these calculations can provide some indication of the most useful or flexible parametrizations to use for signal extraction for observational data, it would be more useful to have an indication of which is the best model to use from the data themselves. This may be supplied by a computation of the Bayesian evidence, but although emcee provides in principle for the computation of the appropriate marginal likelihood using its parallel tempering mode, we have found that to be impractical. We have used nested sampling (Skilling, 2004) to perform model selection for an idealized and simplified version of the problem we study here (Harker, 2015), but we found that multinest (Feroz et al., 2009) did not scale well enough to apply to the problems in the current paper, and an alternative such as polychord (Handley et al., 2015) may be required.

In principle, the best parametrization might also depend on the properties of the instrument, since instrumental parameters might be more degenerate with some signal parametrizations than with others. If the instrumental response is known, but is different from the response we have assumed here (a constant 85 per cent efficiency across the band), our conclusions are unaffected, though the errors on the parameters can change owing to the change in sensitivity. We have checked this using a modelled instrumental response for DARE. If we must fit instrumental parameters, then the conclusions may depend in detail on the properties of the instrument, and so are beyond the scope of this paper, which attempts a more general discussion.

## 7 Conclusions

We have described a ‘tanh’ model for the global 21-cm signal between the end of the dark ages and the start of the epoch of reionization, which employs simple parametric forms for the Ly background, IGM temperature and reionization histories, and which matches the shape of physical models much better than the ‘turning points’ model used in previous work. The tanh model also helps pin down the overall normalization of the signal, and thus the position of its turning points, despite the fact that the turning points are not explicit parameters of the model. This is largely because the tanh model has stronger theoretical priors, e.g., the ‘dark ages’ feature is confined to a narrow ‘track’ at MHz, and the signal cannot become ‘super-saturated’ at late times (low redshifts). Moreover, by describing IGM properties explicitly, it opens the door to including other constraints on the reionization history (e.g. from the CMB) in our likelihood function. It does all this while being several orders of magnitude faster to compute than a full physical model of the 21-cm signal, allowing us to explore the large parameter spaces which are required if we are to simultaneously fit parameters of the signal, foregrounds and instrument. We can none the less take the parametric fits and use them to constrain simple galaxy formation models, as shown by Mirocha et al. (2015).

We have found that integration time plays a larger role than the number of independent sky areas in the quality of signal recovery, though subtle biases persist in turning points constraints, particularly at the highest frequencies ( MHz). This can be remediated by a more complex foreground model or by truncating the band at MHz, though the latter renders all constraints on the IGM at the lowest redshifts meaningless.

Even when the turning point constraints are unbiased relative to the input values, inferences of the properties of the IGM at the turning points can be biased. This is due to a subtle mismatch in shape between the tanh model and the ares physical model, which can be seen upon evaluation of the curvature at the turning points. In ‘two-stage fits’, one can mitigate such effects by treating the errors on the turning points as independent Gaussians: while this is admittedly a more conservative estimate of the errors, it is a treatment which removes most knowledge of the detailed shape of the signal, keeping only the information which is more robust.

Finally, a direct ‘single-stage fit’ using the parameters of the ares model might well be ideal, and would provide a useful point of comparison to our two-stage fits. For example, it would be interesting to see if biases in IGM properties would persist even if the signal were fit with the exact model used to generate it. We did not find this to be computationally feasible for this work, which highlights the need for future work to consider other samplers to explore our parameter space, and to tackle model selection as well as parameter estimation.

## Acknowledgements

GH acknowledges funding from the People Programme (Marie Curie Actions) of the European Union’s Seventh Framework Programme (FP7/2007–2013) under REA grant agreement no. 327999. JM acknowledges support through the NASA Earth and Space Science Fellowship programme, grant number NNX14AN79H. JRP acknowledges support under ERC-2014-STG grant #638743-FIRSTDAWN, FP7-PEOPLE-2012-CIG grant #321933-21ALPHA, and STFC consolidated grant ST/K001051/1. This research has been supported by the NASA Ames Research Center via Cooperative Agreements NNA09DB30A and NNX15AD20A, and by NASA ATP grant NNX15AK80G. This work utilized the Janus supercomputer, which is supported by the National Science Foundation (award number CNS-0821794) and the University of Colorado Boulder. The Janus supercomputer is a joint effort of the University of Colorado Boulder, the University of Colorado Denver and the National Center for Atmospheric Research.

## References

- Bittner & Loeb (2011) Bittner J. M., Loeb A., 2011, J. Cosmology Astropart. Phys., 4, 38
- Bowman & Rogers (2010) Bowman J. D., Rogers A. E. E., 2010, Nature, 468, 796
- Burns et al. (2012) Burns J. O., et al., 2012, Adv. Space Res., 49, 433
- Datta et al. (2014) Datta A., Bradley R., Burns J. O., Harker G., Komjathy A., Lazio T. J. W., 2014, Effects Of The Ionosphere On Ground-Based Detection Of The Global 21 CM Signal From The Cosmic Dawn And The Dark Ages (arXiv:1409.0513)
- Feroz et al. (2009) Feroz F., Hobson M. P., Bridges M., 2009, MNRAS, 398, 1601
- Field (1958) Field G. B., 1958, Proc. IRE, 46, 240
- Foreman-Mackey et al. (2013) Foreman-Mackey D., Hogg D. W., Lang D., Goodman J., 2013, PASP, 125, 306
- Furlanetto (2006) Furlanetto S. R., 2006, MNRAS, 371, 867
- Goodman & Weare (2010) Goodman J., Weare J., 2010, Comm. Appl. Math. Comput. Sci., 5, 65
- Górski et al. (2005) Górski K. M., Hivon E., Banday A. J., Wandelt B. D., Hansen F. K., Reinecke M., Bartelmann M., 2005, ApJ, 622, 759
- Greenhill & Bernardi (2012) Greenhill L. J., Bernardi G., 2012, preprint, (arXiv:1201.1700)
- Handley et al. (2015) Handley W. J., Hobson M. P., Lasenby A. N., 2015, MNRAS, 450, L61
- Harker (2015) Harker G. J. A., 2015, MNRAS, 449, L21
- Harker et al. (2012) Harker G. J. A., Pritchard J. R., Burns J. O., Bowman J. D., 2012, MNRAS, 419, 1070
- Liu et al. (2013) Liu A., Pritchard J. R., Tegmark M., Loeb A., 2013, Phys. Rev. D, 87, 043002
- Mirocha (2014) Mirocha J., 2014, MNRAS, 443, 1211
- Mirocha et al. (2013) Mirocha J., Harker G. J. A., Burns J. O., 2013, ApJ, 777, 118
- Mirocha et al. (2015) Mirocha J., Harker G. J. A., Burns J. O., 2015, ApJ, 813, 11
- Planck Collaboration XIII (2015) Planck Collaboration XIII 2015 (arXiv:1502.01589)
- Pritchard & Loeb (2010) Pritchard J. R., Loeb A., 2010, Phys. Rev. D, 82, 023006
- Pritchard & Loeb (2012) Pritchard J. R., Loeb A., 2012, Rep. Prog. Phys., 75, 086901
- Rogers et al. (2015) Rogers A. E. E., Bowman J. D., Vierinen J., Monsalve R., Mozdzen T., 2015, Radio Science, 50, 130
- Skilling (2004) Skilling J., 2004, in Fischer R., Preuss R., Toussaint U. V., eds, AIP Conf. Proc. Vol. 735, Bayesian Inference and Maximum Entropy Methods in Science and Engineering, Am. Inst. Phys., New York. p. 395, doi:10.1063/1.1835238
- Sokolowski et al. (2015) Sokolowski M., et al., 2015, Publ. Astron. Soc. Australia, 32, 4
- Switzer & Liu (2014) Switzer E. R., Liu A., 2014, ApJ, 793, 102
- Vedantham et al. (2014) Vedantham H. K., Koopmans L. V. E., de Bruyn A. G., Wijnholds S. J., Ciardi B., Brentjens M. A., 2014, MNRAS, 437, 1056
- Voytek et al. (2014) Voytek T. C., Natarajan A., Jáuregui García J. M., Peterson J. B., López-Cruz O., 2014, ApJ, 782, L9
- Wouthuysen (1952) Wouthuysen S. A., 1952, AJ, 57, 31
- de Oliveira-Costa et al. (2008) de Oliveira-Costa A., Tegmark M., Gaensler B. M., Jonas J., Landecker T. L., Reich P., 2008, MNRAS, 388, 247