Reconstruction of Galaxy Star Formation Histories through SED Fitting:
The Dense Basis Approach
Abstract
We introduce the Dense Basis method for Spectral Energy Distribution (SED) fitting. It accurately recovers traditional SED parameters, including M, SFR and dust attenuation, and reveals previously inaccessible information about the number and duration of star formation episodes and the timing of stellar mass assembly, as well as uncertainties in these quantities. This is done using basis Star Formation Histories (SFHs) chosen by comparing the goodnessoffit of mock galaxy SEDs to the goodnessofreconstruction of their SFHs. We train and validate the method using a sample of realistic SFHs at drawn from stochastic realisations, semianalytic models, and a cosmological hydrodynamical galaxy formation simulation. The method is then applied to a sample of 1100 CANDELS GOODSS galaxies at to illustrate its capabilities at moderate S/N with 15 photometric bands. Of the six parametrizations of SFHs considered, we adopt linearexponential, besselexponential, lognormal and gaussian SFHs and reject the traditional parametrizations of constant (TopHat) and exponential SFHs. We quantify the bias and scatter of each parametrization. of galaxies in our CANDELS sample exhibit multiple episodes of star formation, with this fraction decreasing above . About of the CANDELS galaxies have SFHs whose maximum occurs at or near the epoch of observation. The Dense Basis method is scalable and offers a general approach to a broad class of datascience problems.
Subject headings:
galaxies: star formation — galaxies: evolution — techniques: photometric1. Introduction
The integrated light of a galaxy offers a vast amount of information. When measured with sufficient precision and suitably analysed, the Spectral Energy Distribution (SED) offers insights about a galaxy’s composition from its birth to its time of observation (Acquaviva et al., 2011; Conroy & Gunn, 2010). This can be used to estimate the galaxy’s star formation rate as a function of time, which traces its evolution and merger history (Heavens et al., 2000; Tojeiro et al., 2007; Chevallard & Charlot, 2016; Leja et al., 2016). Combined with other observations, this provides valuable knowledge of cosmic structure formation.
Existing methods of SED fitting use a variety of sophisticated techniques. These include inversion methods (Heavens et al., 2000), bayesian codes for estimating uncertainties and covariances (Acquaviva et al., 2015; Chevallard & Charlot, 2016), machine learning methods with training sets (Leistedt & Hogg, 2016), and templatebased models (Bolzonella et al., 2000). To search the large parameter spaces of the variables in consideration, Markov Chain Monte Carlo (MCMC) methods have become increasingly popular.
These advances have been necessitated by the increasing detail provided by theory, and the expanding size of galaxy catalogues available through surveys. A large amount of (spectro)photometric data of unprecedented quality will be generated in upcoming surveys, like LSST (Ivezic et al., 2008), HETDEX (Papovich et al., 2016) and JPAS (Benitez et al., 2014). SDSS (Eisenstein et al., 2011) has already measured spectrophotometry for objects. The HETDEX/SHELA field will cover roughly 600,000 objects with multiband photometry and fiber spectroscopy. JPAS will cover 9,000 square degrees with 59 filters (+54 narrowband filters across optical) for galaxies. Large regions covered in the NIR with Euclid (Laureijs et al., 2010) and WFIRST (Green et al., 2012) will overlap with LSST, which leads to SEDs for objects by 2022, many of which will have panchromatic photometry. In keeping with the large amounts of reduced data generated by these collaborations, it is imperative that advanced methods of analysis are developed in order to gain useful information from the integrated light of the galaxies under consideration.
The star formation history (SFH) of a galaxy can sometimes be poorly constrained through different approaches to SED fitting. Typical methods assume a predetermined parametrization like constant star formation or exponentially declining star formation to estimate physical quantities of interest like the stellar mass, star formation rate (SFR), or the time at which the galaxy started forming stars. A few approaches instead seek to reconstruct the SFH from the data, using methods that include reducing the dimensionality of the parameter space using data compression methods (Heavens et al., 2000), finebinning the interval that makes the maximum contribution to flux (Tojeiro et al., 2007), mapping the discretizedtime photometric fitting to a linear inversion problem (Dye, 2008), or comparing against a large basis of realistic model SEDs using a Bayesian method (Pacifici et al., 2012). In the current work, we aim to show that using a wellmotivated basis allows us to reconstruct robust star formation histories from galaxy SEDs.
The paper is organized as follows: in §2, we introduce the Dense Basis formalism of SED Fitting, and how it can be applied to the specific problem of reconstructing SFHs, including the motivation for a particular choice of basis and the fitting procedure with a particular basis set. We describe the training of the atlas using different sources of realistic SFHs: SAMs, Hydrodynamic simulations, and stochastic SFHs in §3. In §4 we validate the method on both synthetic SEDs from the SAMs as well as real SEDs from the CANDELS GOODSS field. We then present results in §5 including the number of episodes of star formation in the galaxy’s past and constraints on the timing and duration of star formation activity, quantities that were previously inaccessible through SED fitting. In §6, we discuss biases introduced by adopting single parametrizations of SFHs, compare with other SFH reconstruction methods, and mention the application of the Dense Basis method to larger datasets.
2. The Dense Basis Formalism
The Dense Basis SED fitting method reconstructs Star Formation Histories (SFHs) of individual galaxies using an atlas comprised of SEDs corresponding to well motivated families of SFHs that effectively cover the space of all physical SFHs^{1}^{1}1While an expansion using an infinite number of polynomials or a fourier decomposition would provide a true basis in the sense of spanning the space of all possible curves, our basis functions only do so approximately; however, since they can reconstruct any star formation history to the level of precision attainable with spectrophotometric data, they provide an effective basis.. It does so by training the atlas on mock catalogs prior to fitting the full dataset. This allows us to use the reconstructed SFHs to perform novel analyses and to tackle problems that were previously intractable with SED fitting, such as estimating the number and duration of star formation episodes in a galaxy’s past. To avoid any bias due to choice of prior, the method is currently implemented in a frequentist manner. In this section, we briefly describe the Dense Basis methodology and training of the basis set. An overview of the process is described in Figure1.
2.1. A well motivated basis of SFHs
The collection of multiple families of well motivated SFHs and their corresponding SEDs with which we fit galaxies; henceforth atlas of SEDs and SFHs, should be designed to utilise the Dense Basis method to its full potential. The choice of appropriate families of functions to best describe the formation of stars in the galaxy in SFH space () determines how the SEDfitting procedure encodes realistic star formation. We employ seven major considerations in the choice of basis that should be satisfied for every functional family under consideration:

Physically or empirically motivated: The functional form of the SFH needs to be realistic, arising either from statistical analysis of star formation in model galaxies, or deduced from observed galaxies. For the latter, as in Gladders et al. (2013), skewed distributions such as linear rise followed by exponential decline and lognormal arise in physical processes restricted to nonnegative domains. In this case, SFHs should also satisfy at the Big Bang.

Robustness of reconstruction: The family of basis SFHs should be chosen such that a good fit in an SED space () should correspond to a good reconstruction in SFH space (). This correspondence can be tested in various ways and could potentially be different for different datasets since the representative form of the SFH could differ across epochs. It is a useful metric for eliminating SFH families that fit SEDs well but yield biased SFH results, such as exponentially declining SFH parametrizations, which describe star formation reasonably well at recent times, but bias quantities such as Age and , the lookback time at which the galaxy accumulates 50% of its observed mass, (Pacifici et al., 2012). Analogous to isochrone synthesis and matrix inversion methods, this is possible since the SEDs are piecewise linear in their dependence on the SFH and can be decomposed into multiple representations using different functional families.

Dense in SFH space: To avoid degeneracies and biases, (i.e., to better reveal the local minima of the likelihood surface in parameter space), we need to ensure the basis is sufficiently dense in the space of nparameter curves spanning in the interval .

Minimal number of parameters: The number of parameters used to describe the functional form of the SFH basis functions will determine the amount of data compression possible in reconstructing the spectrum of the galaxy from its bestfit coordinates in parameter space: . For the present application, we model the star formation history as a sum of star formation basis functions, each needing three parameters to describe each reconstructed episode of star formation, the timing of the peak, the timescale, and the stellar mass formed..

Temporally consistent: The families should be chosen such that they produce consistent results for an SFH, independent of when the galaxy is observed, within uncertainties.

Positive definite: Any functional used to describe the SFH should be positive definite, since , which allows us to extract physical information from multicomponent solutions to the reconstructed SFH, as opposed to methods like PCA (Ferreras et al., 2006) or piecewiselinear matrix inversion (Dye, 2008), which need regularization to yield physical solutions.

Robust to noise: The atlas spans the space of physically motivated SFHs, but not the space of all possible SEDs. This makes it robust to noise in the sense that distortions due to noise that are not accessible through the physically motivated families of SFHs under consideration do not bias the fits, as described in Appendix.D.
We describe a few of the 2parameter families of curves for the current analysis. An overall normalization corresponding to the stellar mass acts as a third parameter. A visual representation of these families is shown in Figure 2.

TopHat: Historically, simple stellar populations (SSPs) assumed that a galaxy’s stellar population formed in a single instantaneous burst (Tinsley, 1980). An improvement over that was the extension to constant star formation (CSF) from a start time through the time of observation at a fixed rate. Here we use a twoparameter version of this parametrization, with a start time and a width^{2}^{2}2Since this is a positive definite version of the TopHat wavelet, this illustrates the possibility of extending our method to a wavelet basis.. This is also useful for comparison with quantities in the literature computed using CSF histories, which correspond to setting .
(1) where denotes the Heaviside function with for and for , is the time at which star formation starts, and is the width of the TopHat.

ESF: Exponentially declining star formation rates, a parametrization that performs well for local ellipiticals and for comparison with older literature with the time at which star formation starts and the rate constant of the exponential decline.
(2) 
Gaussian: A parametrization that is useful for describing symmetric episodes of star formation, where is the time at which star formation peaks and is the standard deviation, which sets the width of the episode of star formation.
(4) 
Besselexp: Besselfunction rise, followed by exponential decline (henceforth ). The order of the Bessel function of the first kind, determines when the SFR peaks^{3}^{3}3Although there is no closed form expression for this, it can be easily determined from a lookup table for the zeros of and to linear approximation is Yr, and sets the width of the episode of star formation.
(6) We add a linear piece such that ), to ensure that the set of functions described by this family remains positive definite, while also satisfying at the big bang.
These functions offer the advantages of being able to model short episodes of star formation at specific times (small t) or long periods of star formation where the rate rises and then falls (e.g., Pacifici et al. (2012); Tomczak et al. (2016)). Figure 3 shows a typical star formation history drawn from simulations and fits using the six families of SFHs described above. It can be seen that the standard parametrisations of constant star formation and exponentially declining star formation under and overestimate the stellar mass of the galaxy, while the other families show an improved estimation of the general trend of star formation. Additionally, the expansion of the basis to include all physically motivated combinations of singlecomponent SFHs will allow us to describe SFHs with multiple episodes of star formation separated by periods of relative quiescence in a galaxy’s SFH.
2.2. The SED fitting problem: reconstruction of SFHs
For a Simple Stellar Population, which assumes that all of its stars form at a single lookback time (T) and with the same metallicity (Z), the luminosity at a given wavelength () is simply
(7) 
where is the time since the big bang, is the age of the galaxy at , and is the spectrum giving the luminosity of an SSP of metallicity Z at age t since formation. The SSP spectrum contains assumptions for the IMF, stellar tracks, and metallicity, which we hold constant in the current study. Some of the effects of relaxing this assumption are noted in §4.3, and are discussed further in §6.2.
Generalising from Simple Stellar Populations (SSPs) to Composite Stellar Populations (CSPs), we can then represent the SED for a galaxy with a given star formation history (SFH ) as an integral over all of the star formation events that occurred at different times from the birth of the universe to the time of observation. Composite stellar populations are written as a sum over a nonorthogonal set of star formation histories that satisfy the constraints outlined in §2.1, such that
(8) 
with denoting an overall normalization corresponding to the stellar mass formed by the SFH . Given a basis of SFHs that spans this space, we can expand this instead as a sum over the parameter space, akin to a Fourier expansion, as,
(9) 
where the contribution to the luminosity from an episode of star formation described by a family of curves from eq.(16) with the parameters is given by,
(10) 
Dust reddening and nebular emission lines are then applied to the spectrum as described in §2.3, denoted by the notation . The photometry in passband from the basis SFH parametrised by , is then given by,
(11) 
Using this as a mapping from the basis of SFHs to the space of all physically motivated SEDs, we can then define a surface, which denotes the metric distance in the vector space of photometry between the observed SED and its closest match in the atlas. Finding the reconstructed SFH in the basis is then reduced to an optimization problem on the likelihood surface. For example, with a surface defined using a metric, we get
(12) 
In the following sections, we train the basis set using different mock datasets for which we can quantify both the goodnessoffit in SED space, given by as well as the goodnessofreconstruction in SFH space, given by , defined in §3. We choose basis functions that show sufficient correspondence between the optima of these two quantities, which lets us reconstruct SFHs in the presence of model degeneracies, systematics and instrumental noise.
2.3. Generating the Atlas
In order to implement the densebasis algorithm, it is necessary to first generate an atlas of template spectral energy distributions (SEDs) and then to use it to fit the observed SEDs. This is done as follows:
1. Basis SFHs belonging to the functional families described in §2.1 are generated on a grid of wellchosen discrete parameter values.
2. SEDs corresponding to these star formation histories are then generated using the isochrone synthesis code BC03. (Bruzual & Charlot, 2003), using input parameter ranges as described in Table 1.
3. Nebular emission is added according to the prescription in Orsi et al. (2014) using MAPPINGS III, a one dimensional shock and photoionization code for modelling nebular line and continuum emission. (Allen et al., 2008). We use in this work the precomputed HII region model grid described in (Kewley et al., 2001), with the incident ionization spectra computed using Staburst99 (Leitherer et al., 1999), at , from which we compute the ionization parameter using,
(13) 
This prescription does not add effective degrees of freedom to the atlas and could be expanded to accommodate more realistic emission in future work with higher S/N SEDs.
4. Calzetti Dust Attenuation (Calzetti, 2001) is applied to atlas SED spectra with discrete values of to extend parameter space in dust for procedures where dusty SEDs are fit, using
(14) 
where
with and the coefficients adjusted for in microns. Since attenuation inferred from nebular emission lines differs from that inferred from the continuum (UV spectral slope), we use (Calzetti, 2001), where is applied to both UV nebular continuum and nebular emission lines.
5. After nebular emission lines are added to the spectrum, and dust attenuation is applied, the photometry for the basis SEDs , where denotes the photometric bands, or spectroscopic bins, at a redshift z is given by,
(15) 
where is the transmission curve of passband (the spectroscopic equivalent would be the resolution element and throughput at that ), and is the luminosity distance (a of 10 parsecs is assumed when , as in BC03). For convenience, the flux densities are obtained as the ratio of the number of photons corresponding to the fluxes () to the number of photons produced by a flat spectrum in passband . This yields the observations, predictions and uncertainties in identical units. The notation indicates that nebular emission and dust reddening have been applied to the spectrum.
2.4. Choosing the number of basis functions
In practice, galaxies rarely have sufficiently smooth starformation histories to be perfectly fit by a functional form, as inferred from our mock datasets as well as Hammer et al. (2005); Kelson (2014); Weisz et al. (2011); Sparre et al. (2015); Diemer et al. (2017). In addition, considering the errors in the photometry, incomplete empirical knowledge of the mapping from SFH to SED spaces, and degeneracies between the SFH and other factors like dust and metallicity, we need to assess methods of reconstruction using multiple basis SFHs to reconstruct as close to the true SFH as possible given the quality of available data. Considering a solution to the minimization problem in Eq.12, we can express the BestFit SED as
(16) 
where is the number of components determined using the Ftest, given by,
(17) 
This is used to determine the number of components in the SFH space that the SED should be fit with. The Ftest assesses the null hypothesis that the fit with a larger number of parameters is not a statistical improvement over a fit with a smaller number of parameters^{4}^{4}4To motivate the choice of as our bounding value, it helps to think of the case with an equal number of degrees of freedom (), where a better statistical model has , which corresponds to . For the general case of , the cutoff provides a metric where statistical improvement is sufficient to justify the extra degrees of freedom., where are degrees of freedom corresponding to the number of components () being fit with, with denoting the number of photometric bands.
We then reconstruct the SFH according to the optimal components of the likelihood surface for the chosen .
2.5. Estimating Uncertainties
We estimate uncertainties for the reconstructed SFHs via a fully forward modeled frequentist approach using the likelihood surface of the fit, after rescaling the bestfit to correct for artificially low obtained for very noisy galaxies and artificially high values for the brightest galaxies.
A subsurface of the complete likelihood surface is then obtained by imposing a cutoff using a procedure similar to (Avni, 1976). We compare the SFH corresponding to each point in the subsurface to the median SFH and exclude outlier SFHs that have an excursion greater than 1.5 times the maximum value, yielding robust confidence intervals as in (Xie & Singh, 2013; Zhao & Ma, 2016). The uncertainties in SFR at each point in time are then found using a distribution of the remaining acceptable SFHs. Our tests using the sample of 1200 mock SFHs show that this method robustly estimates the confidence bounds, such that for a formal confidence interval, the true SFH lies within the confidence interval of the time. We show the surface computed using this procedure for a single family in Figure 4 showing the bestfit SFH and threshold for uncertainties. In Figure 5 we show some representative examples of the uncertainties with the top panel demonstrating the method’s ability to constrain an older episode of star formation and the bottom panel showing the case of uncertainties with multiple episodes of star formation.
2.6. Choice of parameter space and photometric bands
For the initial implementation of the method in this work, we have used the Bruzual and Charlot (2003) library of stellar tracks, with the parameter space as described by Table 1. The Dense Basis formalism can be applied equivalently with any set of free parameters, including the set of SSP models, to use the training and validation steps to help constrain the variable parameters, as discussed in §6.2. All three datasets are standardized to contain a sample of 400 galaxies with the same realistic distribution of stellar mass.
Parameter choice  range  
IMF  Chabrier/Salpeter   
SED generation:  BC03   
Bands fit  11   
Tracks  Padova’94   
Metallicity    
Dust law  Calzetti  
SFH form  Linexp  Gyr, Gyr 
”  Gaussian  Gyr, Gyr 
”  Besselexp  Gyr, Gyr 
”  lognormal  Gyr, Gyr 
”  exponential  Gyr, Gyr 
”  tophat  Gyr, Gyr 
For training and validation, we consider fitting 11 of the 17 CANDELS GOODSS (Guo et al., 2013) bands: [u_ctio, HST/ACS F435w, F606w, F775w, F850lp, HST/WFC3 F105w, F125w, F160w, VLT/HAWKI Ks, and Spitzer/IRAC 3.6,4.5m], excluding , F814w, F098w, and Isaac Ks for the maximum photometric orthogonality, excluding IRAC 5.8 and 8.0m since the BC03 tracks do not account for the PAH emission that appear in those bands at z=1. Once the method was tested, we expanded to include the , F814w, F098w, and Isaac Ks bands as well, leading to fits using 15band photometry in §4.2. The training is performed on the mock datasets described in §.3, the validation is performed using both the mocks datasets as well as the CANDELS sample for which SpeedyMC results are available. Finally, the results are compiled using the full CANDELS sample at .
We present results at z=1 in the current work since it allows us to analyse restUV information that comes into the UBV bands as well as the Balmer 4000A break, while avoiding dust reemission in the midIR. This choice of redshift and filter set is compatible with the BC03 SPS models while providing a moderate S/N regime in which to test the reconstruction of SFHs. The procedure can be generalized to all redshifts and is discussed in §6.3.
3. Training the SFH families
To inform the choice of a functional form for the SFH basis, we train and validate the method with three mock datasets
To inform the choice of a functional form for the SFH basis, we train and validate the method with three mock datasets of 400 galaxies each, drawn from SemiAnalytic models, Hydrodynamical simulations, and stochastic realisations of star formation histories. We work with multiple datasets to minimise the effect of any single training set on our choice of SFH families. Using these three mock catalogs, we look at various families of 2parameter curves, and their combinations, to find the families that perform best at reconstructing SFHs. The atlas generated using that basis is then used to fit the real catalog. Before we go into the details of the training procedure, we first briefly describe the three datasets being used.
3.1. Training with SAMs
The first dataset is drawn from mock catalogs with known realistic star formation histories from stateoftheart SemiAnalytic Models (Somerville et al., 2015). These simulations use dark matter halo ‘merger trees’ extracted from dissipationless Nbody simulations in a CDM universe (Klypin et al., 2011) to determine the masses of dark matter halos collapsing at a given epoch, following which halos merge to form larger structures. In this framework, SAMs use analytic recipes to model the radiative cooling of gas, suppression of gas infall, and cooling due to the presence of a photoionizing background, collapse of cold gas to form a rotationallysupported disk, conversion of cold gas into stars, and feedback and chemical enrichment from massive stars and supernovae. A more recent generation of SAMs also includes prescriptions for the growth of supermassive black holes and the impact of the energy they release on galaxies and their surroundings (Croton et al., 2006; Bower et al., 2006; Somerville et al., 2008). Recent comparisons have shown that SAMs produce similar predictions for fundamental galaxy properties to those of numerical hydrodynamic simulations, perhaps because of the common framework of CDM, which dictates gravitationally driven gas accretion rates (Somerville et al., 2015). However, SAMs require orders of magnitude less computing time for a given volume than hydrodynamical cosmological simulations. The resulting galaxy star formation and enrichment histories are outputs. We then use these SFHs to produce SEDs to train and validate against, using a realistic mass distribution of galaxies with .
3.2. Training with hydrodynamic simuations
We also train the method against a set of SFHs obtained from the MUFASA meshless hydrodynamic simulations (Davé et al., 2016), which satisfies multiple observational constraints like the stellar mass  halo mass relation (Behroozi et al., 2013; Munshi et al., 2013), the mass metallicity relation (Steidel et al., 2014; Sanders et al., 2015), and the SFRM relation (Speagle et al., 2014; Kurczynski et al., 2016). The star formation histories arereported as instantaneous star formation events that take place, ranging from to events for different galaxies. We restrict the fits to galaxies with , which have well defined SFHs in the simulation. To ensure the SFHs are not artificially stochastic, we generate the SFHs by convolving the instantaneous star formation events using an Epanechnikov kernel with a width of 100Myr (R.Dave, private comm.) The galaxies in MUFASA follow a realistic mass distribution that we use to sample all three mock datasets. We restrict the fits to galaxies with , which have well defined SFHs in the simulation. .
3.3. Training with stochastic SFHs
Following the prescription of Kelson (2014), we generate stochastic SFHs with different values for the Hurst parameter H, which quantified the autocorrelation of the SFH. A value of corresponds to a random walk in , is correlated and is anticorrelated (Mandelbrot & Van Ness, 1968). This provides a sample for testing the other families against to determine possible biases. We generate different SFHs by varying the Hurst exponent, which encodes the longtime correlation of the stochastic SFHs in , sampling galaxies with the same mass distribution as the SAM and hydrodynamic SFH samples. We exclude from the sample SFHs with a Hurst parameter since these do not correspond to realistic looking SFHs.
3.4. Training procedure and results:
We quantify the correspondence between the goodnessoffit in SED space and the goodnessofreconstruction in SFH space as a metric to judge the success of each family of basis functions.
For the SED goodnessoffit, we use , given by,
(18) 
where the index k sums over the entire basis of SFHs, with a number of components determined using the Ftest as described in §2.4 and the index j sums over the photometric bands. The is optimized for each basis function, effectively making stellar mass the normalization. The global minima of the surface corresponds to the maximum likelihood, given by .
To quantitatively compare how well the families perform at reconstructing the SFHs of the galaxies in the three mock catalogs, we quantify the accuracy of reconstruction of the SFH by computing the statistic, given by,
(19) 
where (AndersonSprecher, 1994) quantifies the amount of variance explained by the fit. We set an ambitious goal for the reconstruction by asking if it does as well as direct fits in SFH space to the true SFHs using polynomials of the same order. Since the true SFHs exhibit a large amount of stochasticity, the question of good due to overfitting does not usually occur and is handled by the Ftest in §4.1. We define the statistic in logarithmic time; since the SED is sensitive to changes in the SFR over roughly equal logarithmic intervals of time, this provides a more sensitive estimator. To handle all three datasets on the same footing, since they contain SFHs with differing amounts of structure and stochasticity, we apply a small nonparametric smoothing (Cleveland, 1979) to the SFHs. This statistic has proved to be the most robust for the current application, matching the qualitative results with other statistics, as detailed in Appendix C. ranges from , with the most successful reconstruction given by .
In the noiseless regime, most galaxies show the expected correspondence between the goodnessoffit and goodnessofreconstruction, especially in the regime of high likelihood (). Since we can access only observationally, this correspondence is important since it allows us to obtain a good reconstruction for a galaxy whose SED is well fit. In order for an SFH family to be robust, we require that the SFHs for the ensemble of galaxies should be reconstructed as well as possible, comparing with direct fits to the true SFH using a polynomial with the same number of degrees of freedom.
In Figure 6, we show the computed for each mock dataset using all six SFH families, showing that the Linexp, Besselexp, Gaussian and Lognormal families perform better overall at SFH reconstruction in comparison to the traditional parametrisations of constant and exponentially declining star formation histories. On the basis of this, we we prune our basis SFH set to retain only the Linexp, Besselexp, Gaussian and lognormal families, hereafter denoted ‘Best4 basis’, to be used in further work.
As an additional step of validating our training statistic we examine the correspondence between and a related statistic, , computed using the uncertainties obtained through the method outlined in §.2.5 in Appendix E. We also study the possible biases that could arise in the reconstruction with a particular family, and how our choice of basis mitigates them.
4. Validation
4.1. Validation using three datasets: Hydrodynamic Simulations, SemiAnalytic Models and Stochastic SFHs
Having trained our method to arrive at an optimal basis for the dataset in consideration, we now apply the SED Fitting method to the full sample of 1200 SFHs drawn from the hydrodynamic simulation, SemiAnalytic Model and the stochastic realizations.
Realistic simulated photometric noise has been applied to the mock SEDs to simulate observing conditions. The noise consists of a multiplicative factor corresponding to the zeropoint uncertainty in each band: 3% for the spacebased bands, (HST/WFC3 and HST/ACS) and 10% for the groundbased bands (U ctio, U vimos, Isaac Ks, HawkI Ks) and IRAC Ch.14, as well as a photometric additive factor corresponding to the median errors in each band computed from the CANDELS dataset. With these added in quadrature to yield the for each band , simulated fluxes were drawn from a gaussian distribution .
We show fits using the Best4 basis that is a combination of Linexp, Besselexp, Gaussian and lognormal families, as was determined through the training step in §3.1. The galaxy SEDs have been fit with an atlas consisting of two component basis SFHs. This basis is constructed using all physical combinations of elements from the single episode basis and is seen to have a smaller scatter around the true values, as described in §.6.1. In Figure 7, we show the reconstructed SFHs for two randomly selected galaxies from each mock dataset, illustrating the recovery of both recent episodes of star formation, as well as the overall trend of star formation including periods of relative quiescence.
In Figure 8, we illustrate the recovery of four physical quantities: the stellar mass (), the star formation rate averaged over the last 100Myr (), the lookback time over which the galaxy accreted of its observed stellar mass(), and dust extinction (). The top row depicts the results for the stochastic realizations, the middle row for the hydrodynamic simulations, and the bottom row for the SemiAnalytic Models. All the fits show the bias to now be significantly smaller than the scatter that may occur from using a single family of SFHs, as discussed in §6.1. The scatter in stellar mass increases at lower mass, corresponding to more noisy SEDs, while the increased scatter in SFR as compared to the values in §.6.1 is more due to the presence of dust than noise since fits without dust show a much smaller scatter of dex. The 0.3 dex scatter in the reconstruction of is reasonable. However, the distributions for the SAMs and the hydrodynamic simulations look poor due to the narrow range of true values for these models with the top row being more representative of the method’s performance with a broader distribution of . Reconstruction of simulated dust drawn from an exponential distribution is done using an atlas containing 25 values of dust ranging from to using the Calzetti dust law, with reasonable scatter of and negligible bias of .
We find that our choice of basis yields comparable good results to all three mock datasets, with the bias in the estimation of these physical quantities derived from the SFH not exceeding 0.05dex, as seen in Figure 8. This is an important criterion to be met before the method is applied to observational data, since it lets us relax the assumption that the SFHs corresponding to the training SEDs match the actual star formation histories of galaxies at a given epoch, in favor of the slightly weaker assumption that the SFHs are drawn from a similar distribution. The ensemble results show that the reconstruction is nearly unbiased for the physical parameters of interest and can be used to extract a variety of derived quantities from the SED of distant galaxies in a robust manner.
We also present results in Figure 9 for the fraction of a sample of galaxies that are reconstructed with a second episode of star formation. For our three datasets, we perform the Ftest using Eq.18, with , and obtain the fraction of galaxies that are significantly better fit with a second component of star formation. In some cases, the second component has similar peak time and serves only to modify the SFH shape, eg. Gaussian + lognormal, with a single peak; we term these single episode SFHs. We then find the fraction of the mock galaxies that have a distinct second peak to their reconstructed SFH, which can only happen when the reconstruction prefers a second component. We compare this number to the number of galaxies in the sample whose true SFH has two episodes of star formation, computed using a peak finding routine. Since the true SFHs show a large amount of stochasticity, only the most prominent peaks with a separation greater than 100Myr are selected by smoothing over the local variations as in §3.1 and finding the lookback times at which the SFH peaks, using and . The results are summarized as boxplots in Figure 9, for two mass bins chosen such that roughly half the sample lies in each mass bin. For the high mass bin, the higher S/N leads to more accurate predictions of the fraction of SEDs with more than one episode of star formation. Since our atlas is restricted to SEDs corresponding to physically motivated SFHs, we generally do not overfit the noise, as is seen by the small number of false positive results^{5}^{5}5excluding the SAM (realZ) results, which we discuss further in §4.3. However, a large amount of noise makes it more difficult for the Ftest to detect a statistically significant improvement to the fit. This leads to a systematic underestimation of the fraction of galaxies with more than a single episode of star formation, as shown in our results for the lowermass galaxies. The decreased S/N also results in the fraction of fits with false negatives^{6}^{6}6which can be expressed as , where are the fractions of false negatives and positives, and are the true and reconstructed fractions of galaxies with multiple episodes of star formation. being higher in this mass bin.
4.2. Validation against SpeedyMC results for CANDELS SEDs
We now apply the method to a sample of 1100 CANDELS galaxies in the GOODSS field at from Kurczynski et al. (2016), for which we have physical quantities derived using SpeedyMC^{7}^{7}7a much faster version of the GalMC Markov Chain Monte Carlo algorithm (Acquaviva et al., 2011):
http://www.ctp.citytech.cuny.edu/vacquaviva/web/GalMC.html (Acquaviva et al., 2011, 2015) for 742 galaxies. This sample provides a good representative redshift to test the method for the recovery of SFHs at moderate S/N, as discussed towards the end of §2.6. We perform the fitting with discrete values for and , which adds some scatter to the results. For redshift, we choose bin edges at , and we let vary from 0 to 2.5 in increments of 0.1. The purpose of this comparison is to ensure that our SED fitting code developed to implement the dense basis method, which was also used to generate mock SEDs, does not contain circular errors. Additionally, it is a useful test to match the physical quantities that can be recovered through traditional SED fitting before presenting previously inaccessible quantities.
In order to make the comparison as consistent as possible, we match the initial conditions of the fitting procedure to the SpeedyMC parameter space, as summarized in Table 2, and limit our SFH basis to singlecomponent Linexp curves. In Figure 10, we show the results comparing our fits to the SpeedyMC results for the stellar mass (), SFR and . The slight bias in could be due to the difference in the way the two codes implement nebular emission. The colorbars denote the spectroscopic redshifts corresponding to the observed galaxies.
Dense basis  SpeedyMC  
IMF  Salpeter  Salpeter 
GoF:  
SPS:  BC03  BC03 
Bands fit  
Metallicity  
Dust law  Calzetti  Calzetti 
Nebular emission  MAPPINGS III  custom 
SFH form  Linexp  Linexp 
4.3. Validation against SAM SEDs with multiple metallicities:
We address a final possible source of systematic bias in the fits: the assumption of a single metallicity () in building the atlas and performing the fits at . To take into account the distribution of metallicities found in real galaxies, we go back to the SAM SFHs and consider the individual metallicity components of the overall SFHs. We generate spectra corresponding to each of these metallicities using six values of metallicity available for the Padova’94 tracks in BC03, given by . Using this procedure, we obtain spectra corresponding to the SFH in each metallicity bin and use a weighted sum to obtain SEDs corresponding to galaxies with realistic metallicity histories, which we denote as SAM (realZ). We then fit these SEDs with our singlemetallicity basis to test how robust our fits are at .
In the left panel of Figure 11, we show the distributions of observed metallicities at in the SAMs, weighted using the dominant contribution to the total mass of the galaxy for realistic sampling in Stellar Mass () used in §4.1. Upon examining the SEDs corresponding to a sample of galaxies of different ages generated by combining the spectra corresponding to the star formation histories in each of BC03’s six metallicity bins, we find that older, more massive galaxies are more metalrich at the observed epoch and thus show a greater deviation from the template SEDs, which currently assume for the entire SFH.
In the four panels to the right of Figure 11 we show the reconstruction of physical quantities (, , and ) for the cumulative SEDs with realistic metallicities. The increased bias in the appears to be the result of poorly fitting older galaxies, which have much higher metallicities than those in the atlas. For Gyr and older, galaxies tend to have at the time of observation. This effect in addition to the narrow distribution of true causes the scatter in to appear poor even though it is comparable to the fits in §4.1. The recovered SFHs themselves are still representative of the true SFH of the galaxy up to a lookback time of , after which the degeneracies in the surface due to the contributions from older stars, dust and differing metallicities impose larger uncertainties on the reconstruction by a factor of .
In Figure 9, we now focus on the results in the last columns in each mass bin. The results agree well in the high mass bin, due to roughly equal numbers of false negatives and positives. The net results in both mass bins are still acceptable, as a result of which the method is still valid even in its current simple realisation with a singlemetallicity.
5. Results
The Dense basis method of SEDfitting allows us to reconstruct the star formation histories of galaxies in a nonparametric fashion, not being restricted to the choice of a particular number or family of basis SFHs, while being able to compress the reconstructed SFHs using a small number of parameters to describe a best fit. We show the results of applying this method to our sample of CANDELS galaxies at and mock SAM galaxies at .
5.1. Going beyond ’age’ and instantaneous SFR
The ‘Age’ of a galaxy, defined as the lookback time at which the galaxy first started forming stars , is not as meaningful with realistic SFHs as it used to be with simple stellar populations, which formed all their stars at a single lookback time, given by the Age (Tinsley, 1980; Bruzual & Charlot, 2003). Realistic SFHs as seen in the SAM and the hydrodynamic simulations may maintain a small amount of star formation before ramping up to a major episode of star formation, which results in the true Age for most galaxies approaching the age of the universe. Since the SED of a galaxy is most sensitive to its largest episodes of star formation, with its sensitivity decreasing as we go back in lookback time, the ’Age’ recovered through SED fitting methods is not a robust physical quantity. However, if we were to estimate the lookback time at which the galaxy accumulated the first of its observed stellar mass, we estimate the lookback time at which any major star formation activity in the galaxy started. While the distributions of the Age and are similar for a given sample of galaxies, the latter is a more meaningful quantity in terms of studying galaxy growth and evolution and is more robustly estimated through SED fitting. (Pacifici et al., 2015, 2016) This can be seen from the top panels of Figure 12, with the right panel showing noiseless reconstructions of the Age, and the left panel showing noiseless reconstructions of for all three samples of galaxies used in §4.1 using the same basis set and format for the plots. The latter quantity is more robust, as can be seen from the reduced bias and scatter in the estimation of .
In a similar manner, due to the large amount of stochasticity that realistic SFHs show, it is more robust to estimate the Star Formation Rate (SFR) averaged over the last 100Myr in lookback time, rather than the instantaneous SFR, as shown in the bottom panels of Figure 12. The panel on the left denotes , which has less scatter than , shown on the right. It is widely appreciated that broadband SED fitting is primarily sensitive to SFR averaged over the past 100Myr^{8}^{8}8However, when nebular emission lines are strong enough to contribute significantly to the broadband photometry, SED fitting can probe Myr timescales. (Conroy, 2013; Johnson et al., 2013), but SED fitting traditionally reports SFR in its chosen parametrization nonetheless. With rapid rises and exponential declines possible, these quantities can differ significantly, leading to the extra scatter in the bottom right panel of Figure 12.
5.2. The number of episodes of star formation experienced by CANDELS galaxies
It is an important feature of the densebasis method to be able to recover the number of strong episodes of star formation in a galaxy. Doing so allows us to detect recent bursts of star formation, or a period of relative quiescence between episodes of continuous star formation, with the amount of data that can be extracted depending upon the S/N. This can then be used to infer valuable information about the galaxy’s evolution and merger history.
In this paper, we have demonstrated the use of an Ftest to detect if the addition of a second component of star formation is a statistically significant improvement to the fit. This is then used to infer the fraction of galaxies whose SFHs contain a second major episode of star formation, and was validated for the mock galaxies in the high S/N regime in §4.1. For our current sample of 1100 CANDELS GOODSS galaxies, we can reliably fit 790 galaxies, with the remaining galaxies either having poor or with missing fluxes in multiple filters, preventing robust estimation of the SFH and its uncertainties. The Ftest then determines that 134 galaxies out of the sample of 790 galaxies show a statistical improvement upon being fit with a second component, of which 117 galaxies contain a second episode of star formation. This corresponds to roughly of the sample, similar to the results for the mocks. Figure 13 shows six examples of the procedure, showing three galaxies that were fit by a single basis SFH and three with two components.
Additionally, we provide a breakdown of the fraction of galaxies in each mass bin from shown in Figure 14. This figure reveals a decrease a significant decrease in the fraction of galaxies that are fit with two major episodes of star formation as the stellar mass increases above . As seen in §.4.1,we expect to underestimate the fraction of 2episode galaxies at lower masses in the CANDELS sample. Hence the increased number of 2episode galaxies at is a robust indication that 2episode galaxies are more common at lower mass. This discrepancy between the data and simulations is intriguing.
5.3. Constraints on timing and duration of episodes
Using the reconstructed SFHs for our CANDELS sample, we can obtain constraints on the timing and duration of episodes of star formation. This is possible since the reconstructed SFH using our well motivated basis SFH set captures the general trend of star formation, even if the finer stochastic details are lost. For each fit, we obtain the number of episodes of star formation, the lookback time of peak star formation, and the FWHM of that episode, thus obtaining the timescale of starformation episodes both on a galaxybygalaxy as well as an ensemble basis, as shown in Figure 15. For of the galaxies, we find that the SFH is still rising at the time of observation, comparable to for galaxies from the SAM and Hydrodynamic Simulation. In estimating the width of an episode of star formation, we estimate the width of an episode up to the time of observation, leading to truncated widths for the subsample of galaxies whose SFHs are still rising, shown in red in the histogram. We find that the widths for our sample are smaller by a factor of than those for the mock galaxies. This discrepancy bears further investigation, with a similar difference seen in Diemer et al. (2017). Additionally, we can also find the interval between episodes of star formation for the galaxies that are reconstructed with two episodes of star formation.
5.4. Statistics of and uncertainties
The reconstructed SFHs for the CANDELS galaxies computed using the Dense Basis method are used to compute the lookback times at which the galaxy accumulates a certain fraction of its observed mass. These quantities, defined , satisfy the equality,
(20) 
Generalising from §5.1, this lets us follow the mass assembly using the lookback times at which the ensemble of galaxies accumulated a certain fraction of its observed mass. We do this for the CANDELS sample at in Figure 16, providing histograms showing the overall lookback times at which the individual galaxies accumulated 10%, 50% and 90% of their observed stellar mass. This allows us to infer the overall growth and evolution of galaxies at that epoch.
6. Discussion
6.1. Biases from using single SFH parametrizations
The flexibility in the choice of SFH family used for SED fitting makes it possible to quantify the bias introduced in the estimation of physical quantities due to the choice of SFH parametrization used. We briefly list these biases at for the six families of SFHs presented in this work, highlighting the particular families that perform best at the estimation of a particular quantity. For the seven physical quantities listed in Tables 3 and 4 below, we formulate the bias and scatter as the median and standard deviation of the histogram , which gives the scatter after taking the bias into account. This is done using physical quantities computed using the reconstructions of the SFHs of the 1200 mock galaxies from §4.1 with a realistic mass distribution, using fits without dust or noise to highlight the bias due to the SFH parametrization. We have included the CSF, TopHat and Exponential biases in Table 3 below in an effort to standardize quantities in comparison to older literature, while also listing the reduced bias and scatter with the full Dense Basis method with up to two components of basis SFHs from the Linexp, Besselexp, Gaussian and lognormal families. In order to ensure a fair comparison, all families contain the same number of basis SFHs and are dense enough to converge, i.e., a denser grid on the parameter space does not change the results significantly.
SFR  SFR  Age  

CSF  14%  5%  4%  24%  13%  19%  43% 
tophat  17%  2%  11%  27%  24%  41%  59% 
exponential  20%  2%  7%  21%  34%  55%  70% 
Linexp  18%  1%  7%  18%  28%  39%  50% 
Gaussian  16%  1%  7%  20%  31%  27%  16% 
lognormal  14%  2%  3%  16%  25%  33%  26% 
Besselexp  19%  1%  7%  21%  34%  42%  43% 
Dense Basis  6%  4%  1%  4%  4%  22%  29% 
SFR  SFR  Age  

CSF  28%  43%  40%  36%  48%  51%  37% 
tophat  28%  50%  49%  35%  44%  46%  32% 
exponential  27%  16%  20%  34%  36%  36%  26% 
Linexp  29%  23%  25%  34%  38%  37%  29% 
Gaussian  24%  26%  24%  29%  31%  28%  28% 
lognormal  26%  16%  20%  35%  34%  31%  26% 
Besselexp  23%  16%  18%  29%  30%  25%  27% 
Dense Basis  13%  7%  10%  18%  43%  34%  20% 
Almost all the families tend to underestimate the stellar mass. However, the scatter in is generally larger than this bias except for the TopHat family. The scatter is even larger for SFR, and thus the bias doesn’t significantly affect the results except at the low SFR () end, as seen from Figure 8. Age has the greatest bias of all the estimated quantities, and it can be seen that it decreases when we estimate , which is a more robust quantity, as we proposed in §5.1. About of the mock galaxies form of their mass at . The small contribution to the observed flux from these older stars is dominated by more recent contributions, as a result of which the method does detect these older stars and underestimates the age. Since most of the mock galaxies start forming stars at , the distribution of true ages is extremely narrow and can only be underestimated, since the method does not allow . An artifact of this bias is also seen in , although it is smaller. However, since the Dense Basis method recovers the major episodes of star formation and the bias is largely due to the distribution of the true ages, the reconstructed SFHs are robust. In a similar vein, the bias decreases in considering and even further with . Age has a lower scatter than , since most galaxies start forming stars at , and this creates a narrow distribution for the true Ages. For SED fitting methods that use Age, setting would lead to a bias of and a scatter of , fully competitive with any of the single families. The scatter in for the Dense Basis method is also in part due to expanding to a larger parameter space, which yields a smaller bias at the expense of increased scatter regulated by the Ftest. The TopHat, exponential and Linexp parametrizations have a large bias in age, and should be accounted for in comparisons of ages in the literature. is the most robust of the massassembly times, with the Linexp and lognormal families performing best in its estimation. The Dense Basis method offers the least scatter in M, SFR, and is nearly unbiased in these quantities, as well as .
6.2. Comparison with other methods
The Dense Basis method offers an SED fitting approach that minimizes the bias and scatter introduced due to traditional SFH parametrizations. In this section, we consider comparisons with existing methods of SFH reconstruction. MOPED (Heavens et al., 2000) fixes logarithmic time bins and finds the SFR in each bin with a piecewise constant SFH using fitting with data compression, giving more weight to those pixels in the spectrum that carry most information about a given parameter. VESPA (Tojeiro et al., 2007) adaptively bins the lookback time, i.e., the in Eq.11, provided there are enough free parameters to avoid overfitting. Dye’s (2008) method adopts a similar approach with photometry, but uses regularization in order to the make the SFR in each bin positive, which might bias the likelihood surface and is computationally more expensive. None of these methods reconstructs smooth SFHs; the fits do not provide us with SFHs that allow us to analyse the number of episodes of star formation or to analyse the peak times and widths of star formation episodes. The method introduced here uses a physically motivated functional form of SFHs that requires a smaller number of free parameters to fit the SFH, thus obtaining smooth SFHs with multiple components through photometric SED fitting, comparable to what was previously accessible with spectroscopy or CMD reconstruction (Weisz et al., 2011). Another advantage is the ability to use real SEDs to test functional forms for a best match against starformation mechanisms at a given redshift. The usage of wellmotivated parametrised functional forms instead of time bins with variable heights allows us to obtain a smooth reconstruction of the SFH with a smaller number of parameters without the need for regularization, since the basis SFHs are smooth and positive definite.
6.3. Possible extensions of the Dense Basis method
In addition to the two parameter families described in §2.2, it is possible to extend the approach to a larger parameter space by using families of curves including the 4parameter families described in Simha et al. (2014) and the Exponential+ power law (Behroozi et al., 2013),
(21) 
where , and indexes the time at which star formation begins.
Currently, however, we restrict our attention to the two parameter families since we also consider combinations of curves from these families, which let us model a much more versatile set of trajectories in SFH space.
An advantage of our method is that it will recover only as many SFH basis components as are needed to produce a good fit to the SED, thus enabling us to extend the procedure to reconstructing metallicity histories, and to use multiple dust extinction models. It is also possible to extend the code to additional SPS models, which is naturally incorporated with the Conroy FSPS models (Conroy & Gunn, 2010) that contain the BaSTI and Padova isochrone sets. Model dependency due to the choice of tracks and IMF is also an issue that could be incorporated into future versions, which will have more data available that can be used to address degeneracies between different sets of isochrone synthesis models, stellar evolution tracks, and IMF choices.
Additionally, the superposition of ‘stochastic’ bursts on top of these smooth functional forms has been better shown to reproduce the observed spectroscopic properties of individual galaxies (Kauffmann et al., 2003; Brinchmann et al., 2004). This can be explored in future applications of the dense basis method to spectroscopic data, using realistic stochastic SFHs as in the approach of (Pacifici et al., 2015), or the theoretical stochastic SFHs from (Kelson, 2014).
The current formulation is frequentist, and the training and validation produce parameter uncertainty estimates consistent with this approach. A Bayesian formulation of the method is certainly possible, but since the priors on SFHs are poorly known, significant care would be required.
6.4. Handling Big Data
A large amount of data will be generated from the upcoming generation of surveys including LSST (Ivezic et al., 2008), HETDEX/SHELA (Papovich et al., 2016) and JPAS (Benitez et al., 2014), which will yield a mixture of broadband photometry and spectroscopy for galaxies.
Methods for analysing these galaxies using SED Fitting techniques need to be both computationally efficient as well as capable of handling and storing large volumes of data in a memoryefficient manner.
The dense basis method was designed with these two requirements in mind. It takes for a single run on a 2.9 GHz laptop, albeit with large memory requirements for storing the 2component basis, which runs to with 18 values of and values of . After the initial generation of the atlas, the fits themselves can be stored simply by saving the index of the bestfit SED and the normalization for each component, leading to efficient storage of the fits as coefficients for each reconstructed SFH.
6.5. Broader Data science applications
This method can be used to solve problems of the general type
(22) 
where d represents a vector of observables, i.e., galaxy SEDs in the current work, and the functionals represent possible SFHs. The index sums over basis functions and refers to multiple photometric bands. We adopt functionals that can be shifted by varying and scaled by varying because this is reasonable for the underlying physics of star formation. This is not a requirement for solving Eq (22) and additional constraints upon the functionals will depend upon the problem being considered. Upon generalization, this formulation is particularly useful for the class of problems where constrained observed data is used to recover quantities in an otherwise inaccessible parameter space, such as singleepoch observations of historical processes. In the absence of a known analytic mapping from the parameter space to the space of observables and the lack of a definite correlation between the goodness of estimation in these two spaces, traditional methods like MonteCarlo estimation through the parameter space need not lead to accurate estimation of the , since a good fit need not correspond to an accurate reconstruction of the functional. Methods like Principal Component analysis may be used in the parameter space, but the principal components do not always correspond to physical representations of the observables. Such situations can frequently arise due to the presence of noise and degeneracies between different parameters that affect the observables.
In such cases, it is possible to apply the training method described in the current work, based on pruning a training atlas from a large space of informed estimates from empirical observations and statistical motivations, leading to an oversampled nonorthogonal ‘Dense Basis’. This lets us perform any subsequent fitting to the data in a subset of parameter space where the correspondence between the goodnessoffit and goodnessofreconstruction exists and is well defined. Since the functionals in the parameter space are well motivated, they do not span the space of all observables and are robust to noise that would correspond to ‘unphysical’ results. In the current framing, the method is readily applicable to timeseries problems, where the observables are integrated quantities depending on the overall nature of the timeseries.
In an expanding arsenal of datascience tools, the Dense Basis method provides a convenient formalism to solve the above class of problems in a tractable manner, and to train and implement a solution finding method. The advantages of using this method include not having the constraints of regularization imposed by matrix inversion methods or suffering from the lack of correlation between observables and principal vectors in solution space that techniques like PCA exhibit, while also being robust to noise.
7. Conclusions
The standard assumption of a simple parametric form for galaxy Star Formation Histories (SFHs) during Spectral Energy Distribution (SED) fitting biases estimations of physical quantities and underestimates their true uncertainties. In this paper, we introduce the Dense Basis method, which offers a general approach when a vector of observed data points d can be modelled as a sum of positivedefinite, continuous functionals m obeying d m. Here we apply it to the case where d represents a galaxy SED and the functionals are possible SFHs.
We train the method using SFHs from mock catalogs at from three different sources: a Semi Analytic Model (SAM), meshless hydrodynamic simulations, and stochastic realizations. We do this to ensure that the method can successfully reconstruct a wide variety of SFHs allowing us to relax the assumption that our training SFHs are prefectly representative of the true SFHs of galaxies at that epoch. The training step allows us to compare the goodnessoffit in SED space to the goodnessofreconstruction in SFH space. We use this comparison to eliminate SFH families that provide poor or biased reconstructions, leading us to drop the TopHat and Exponential families from our basis, while keeping the Linexp, Besselexp, Gaussian and Lognormal families.
A basis consisting of these four families and their combinations is then used to apply the Dense Basis method to the broadband CANDELS photometry of a sample of galaxies at . The method allows us to accurately estimate physical quantities of interest that explicitly depend on the SFH, notably Stellar Mass (M), and SFR, which we note is more robust than SFR and dust attenuation. The method also allows us to estimate previously inaccessible quantities, including the number and duration of star formation episodes in a galaxy’s past, and the lookback times at which the galaxy accumulates of its observed mass, which are more robust quantities than the Age of a galaxy, and allow us to track the galaxy’s growth and evolution as a function of lookback time. The current frequentist implementation of the method allows us to estimate confidence intervals for these quantities. We quantify the bias and scatter in these quantities due to various SFH parametrizations including the traditional parametrizations of constant and exponentially declining SFHs.
The method can be expected to have broad data science applications, and can be scaled and applied to high S/N spectrophotometry from upcoming surveys across all redshift ranges to reconstruct the SFHs of individual galaxies, as well as to infer the growth and evolution of the ensemble of galaxies at various epochs.
8. Acknowledgements
The authors would like to thank the anonymous referee, as well as Viviana Acquaviva, Robin Ciardullo, Caryl Gronwall, Alex Hagen, Boris Leistedt, Regina Liu, Camilla Pacifici, David Spergel, MinGe Xie and Greg Ziemann for their insightful comments and suggestions, Rachel Somerville for providing the SAM SEDs and SFHs, Peter Kurczynski for providing the SpeedyMC results from Kurczynski et al. (2016), and Romeel Dave for providing the MUFASA SFHs. The authors acknowledge support from Rutgers University. This material is based upon work supported by the National Science Foundation under Grant No. AST1055919. Support for Program number HSTAR14564.001A was provided by NASA through a grant from the Space Telescope Science Institute, which is operated by the Association of Universities for Research in Astronomy, Incorporated, under NASA contract NAS526555.
Appendix A Consistency across filter curves
It is possible to perform fits to the mock galaxies observed at different redshifts and ensure that the reconstructed SEDs yield physical quantities that are robust independent of the redshift. This analysis can be extended to determine the redshift range across which a given atlas is robust, since the amount of information contained within the filters changes with redshift.
Since we restrict our mock dataset to and the observed dataset to in current work, we perform this consistency check fitting the same mock galaxies whose rest frame spectra are computed considering them to be at and . We perform Dense Basis fitting on the galaxies, and compare the derived quantities and find that the estimation of these quantities remains robust within uncertainties.
Appendix B Dotproduct SED fitting as a computational speedup
We present an additional approach to finding the optimal reconstruction given an atlas of SEDs using a nonorthogonal dotproduct, i.e., a projection product, that might prove to be a useful computational speedup for dealing with large datasets.
Since the projection product is done in a nonorthogonal basis, reconstruction of the original vector using the dotproduct coefficients is more involved than the procedure in the case of the inner product in an orthogonal space. Various methods have been tested for this reconstruction, including iteratively refitting the residuals as long as they remain above the noise level, constructing a reduced orthogonal space by projecting out components of vectors along a principal component, and constructing an expanded basis of linear combinations of the basis functionals. This method is expected to operate on the timescale of operations, where N is the size of the basis and M is the number of bands of the photometry in consideration.
The bestfit is estimated through a nonorthogonal equivalent of a dot product in the photometric vector space, through a mapping given by,
(B1) 
where is a mapping such that .
For an equivalent orthogonal basis, the dot product coefficient is given by the same mapping, with an additional constraint imposed due to orthogonality, which is,
(B2) 
which allows us to reconstruct the original vector simply using
(B3) 
However, in the absence of orthogonality, we turn to more involved methods of reconstruction, bounded by both computational costs and error margins on the photometry, which could lead to overfitting if not accounted for.
Other factors held constant, the coefficient of the photometric dotproduct indicates the projection of the true SFH of the galaxy on to the basis SFH. Therefore, without any degeneracies in the basis SEDs, a higher coefficient would mean that the SFH is closer to the true SFH of the galaxy, with denoting a perfect match with basis vector . The procedure returns similar results to the fitting procedure described in §2.4, with a slight computational speedup requiring of the time for fitting an SED, which might be helpful in fitting large datasets of SEDs from upcoming surveys.
Appendix C Alternative methods of defining goodnessofreconstruction:
Given that the SFHs are not a directly measurable quantity, care must be taken in comparing the reconstructed SFHs to the true ones, accounting for the unequal sensitivity of the SEDs to the same interval of time at different epochs, as well as the large amount of stochasticity present in the simulated SFHs (SAM, Hydro., Stochastic). We outline some of the methods viable for this as alternatives to be considered in other applications of the Dense Basis method. These statistics, while useful for comparing how well a given reconstruction approximates the true SFH, are significantly affected by stochasticity. Since we are only interested in the relative performance of the families of curves in current work, we choose to compare the goodnessofreconstruction to that of a polynomial fit with the same number of degrees of freedom as the parametrizations under consideration.

and adjusted:
The coefficient of determination is among the simplest ways to compare two sets of points, comparing how well the reconstructed SFH approximates the true one. This gives the first indication of the fact that some families of SFHs may be more useful than others for a given dataset at SFH reconstruction.
The statistic is given by,
(C1) which quantifies the amount of variance explained by the fit. Since the stochasticity of the different mocks differs, the median for fits to the three datasets can vary widely, with the SAM galaxies doing the best and the MUFASA galaxies doing the worst. It is possible to adjust this by smoothing the true SFHs using a nonparametric method until they all exhibit an equal level of stochasticity, or simply by rebinning the SFH with a time interval of the order of the least stochastic sample. Another improvement, as implemented in the current work, is to compare the of the reconstruction with a reference with the same number of degrees of freedom, such as a fit using a polynomial.

The Pearson correlation coefficient ()
Since the standard implementation of as a goodnessofreconstruction metric fails to account for the different amounts of stochasticity present in the different mock datasets, we consider the Pearson correlation coefficient, which accounts for the inherent stochasticity of an SFH through a normalization. As an alternative to the previous method, we can present the results for the training step as likelihood vs the Pearson correlation coefficient, written as
(C2) since this could better provide an estimate of the goodnessofreconstruction for highly stochastic SFHs without the need for an additional adjustment step. However, the coefficient in this form assumes Gaussian statistics which are not always applicable for our datasets.

Spearman’s correlation coefficient ()
Pearson’s correlation coefficient assumes a Gaussian distribution of noise around a linear relation, and finds the degree of correlation around it. However, the relation we seek, to compare two timeordered sets of curves, needs to be more robust. Therefore we considered the Spearman coefficient, which compares two monotonic functions using ranks in order to find the degree of correlation between them. For distinct ranks, the coefficient is given by , where is the difference between the two ranks of each observation. This, however does not work very well at describing the fit for young galaxies, where a significant fraction of the two star formation histories is tied at the same rank due to long periods of vanishing SFR at early times.

MISE
The mean integrated square error given by also provides a method to quantify the goodnessofreconstruction. However, it does have a well defined range to compare different quantities, and provides no accounting for the varying amounts of stochasticity of the different datasets. The concept of minimum distance estimation that this method implements can also be generalized to the KolmogorovSmirnov statistic, which depends on the maximum absolute difference between the true and estimated cumulative SFHs, but does not provide sufficiently sensitive results to make a distinction between the different families using a correspondence between the statistic and the goodnessoffit.
Appendix D Robustness to noise
The Dense Basis method performs SED fitting using an atlas of SEDs corresponding to well motivated basis SFHs that satisfy the conditions in §2.2. Although the mapping from SFH space to the observed photometry is theoretically bijective, an SED at a given noise level for a given set of photometric bands is degenerate in SFH space to the extent that all the SFHs that produce the same SED within error limits are an acceptable fit. Our formulation then ensures that the basis is effectively dense in SFH space, allowing us to reconstruct the overall trend of star formation even if it doesn’t capture the finer stochastic details. However, even though the basis is effectively dense in SFH space, it is not dense in SED space, since a large region of the photometry space is not accessible through any physically motivated SFH. This allows our method to ignore all noise that is ‘unphysical’ while performing the SEDFitting step. Even though this yields worse and the noise biases the reconstruction to an extent documented in Figure 8, it does not overfit the SED by fitting for any noise that does not correspond to a physically motivated SFH. This allows our method to be robust to a large fraction of the noise, as is seen in Figure 17, where we show an example of 1000 noisy realisations to a SAM spectrum in red and the corresponding reconstruction in blue, which successfully ignores major outliers in fitting the SED. Extended to the entire ensemble of 1200 mock galaxies, we find that the ratio of the residuals to the noise is , with a standard deviation of , i.e.,
(D1) 
The decomposition of this quantity into the sensitivity to noise in individual bands in Figure 18 shows that the F160w is the most sensitive to noise, with the method being remarkably robust to the noise in the groundbased bands. The maximum deviation due to noise is computed and found to be in the bounding bands (u ctio and IRAC 4.5m). This is expected, since the endpoints are the most unconstrained in the fitting process.
Appendix E Examining the correspondence for individual galaxies
We examine the correspondence between and for individual galaxies in greater detail. We provide examples for three randomly chosen galaxies from each mock catalog as examples in Figure 19, and discuss possible biases and how they should be minimised. We compute as follows:
(E1) 
where the index k sums over the entire basis of SFHs as above, and denotes the stellar mass normalisation. The denote symmetric pointwise uncertainties computed through the procedure described in §.2.5 for each family being tested. Since the code is implemented over a grid, the integral over time is effectively a sum over discrete time intervals as described in §.2.2. The statistic computes the accuracy of reconstruction and is better for training the basis families, since it does not reward SFH families that yield larger uncertainties, as opposed to , which does so. However, we can use this statistic to observe the correspondence between fits in SED space and reconstructions in SFH space on a galaxy by galaxy basis, and study sources of biases in the reconstruction.
We encounter two types of biases in the plots, summarised as follows:

Degenerate : If, in addition to the correspondence, some good fits to the SED () correspond to bad reconstructions of the SFH (), the SFH reconstruction may be biased. However, these are often removed as outliers in the procedure used to compute uncertainties, as described in §.2.5.

Suboptimal : The best fit to the SED corresponds to a significantly worse reconstruction than the best possible reconstruction of the SFH with that basis. However, like the true SFH, the best possible reconstruction is generally within our reported uncertainties around the bestfit determined via .
For the first point, we quantify the two kinds of biases using the surface generated for each galaxy in the ensemble of 1200 galaxies using each SFH family. An example of the two kinds of bias is shown in Figure 20, showing the plot for a single galaxy with a single SFH family. Since there is a certain amount of degeneracy introduced in SED fitting due to noise, we consider the set of all good fits () instead of the best fit . As shown in Figure 19, we see that there is generally a good correspondence between and in the regime of good fits. We then find the families that minimise the two types of biases in SFH reconstruction.
We quantify the degenerate bias by examining the histogram of the set . Since this is the set of good fits, we then say that a galaxy has a type 1 bias if it has multiple peaks in this histogram, separated by a minimum distance of 1 dex in . This is the more common type of bias, and the probability that it will bias the fit towards a poorer reconstruction depends on the ratio of the areas under the two peaks. We show the number of occurrences of this type of bias for each family in Table 5, finding that the exponential and CSF families have the highest occurrence of this behaviour. While these biases are more common than suboptimal biases, they only indicate the possibility of a bias due to noise, and are usually much milder than the example shown.
Linexp  Besselexp  Gaussian  Lognormal  Exponential  TopHat  

Type 1 bias:  216  179  108  145  323  294 
Type 2 bias:  5  2  8  1  10  12 
For suboptimal bias, we find the distance for each galaxy with each SFH family. This distance denotes the difference between the best possible in the basis and the corresponding to the bestfit SED in the basis. If the latter quantity is much worse than the former, we say that a galaxy has a bias due to suboptimal . We find that this is best quantified by the condition . We show the number of these biases for each family in Table 5, finding much lower rates of occurrence and that the TopHat family shows the highest occurrence of this behaviour.
References
 Acquaviva et al. (2011) Acquaviva, V., Gawiser, E., & Guaita, L. 2011, Proceedings of the International Astronomical Union, 7, 42
 Acquaviva et al. (2015) Acquaviva, V., Raichoor, A., & Gawiser, E. 2015, The Astrophysical Journal, 804, 8
 Allen et al. (2008) Allen, M. G., Groves, B. A., Dopita, M. A., Sutherland, R. S., & Kewley, L. J. 2008, The Astrophysical Journal Supplement Series, 178, 20
 AndersonSprecher (1994) AndersonSprecher, R. 1994, The American Statistician, 48, 113
 Avni (1976) Avni, Y. 1976, The Astrophysical Journal, 210, 642
 Behroozi et al. (2010) Behroozi, P. S., Conroy, C., & Wechsler, R. H. 2010, The Astrophysical Journal, 717, 379
 Behroozi et al. (2013) Behroozi, P. S., Wechsler, R. H., & Conroy, C. 2013, The Astrophysical Journal, 770, 57
 Benitez et al. (2014) Benitez, N., Dupke, R., Moles, M., et al. 2014, arXiv preprint arXiv:1403.5237
 Bolzonella et al. (2000) Bolzonella, M., Miralles, J.M., & Pelló, R. 2000, Astron. Astrophys, 363, 476
 Bower et al. (2006) Bower, R., Benson, A., Malbon, R., et al. 2006, Monthly Notices of the Royal Astronomical Society, 370, 645
 Brinchmann et al. (2004) Brinchmann, J., Charlot, S., White, S., et al. 2004, Monthly Notices of the Royal Astronomical Society, 351, 1151
 Bruzual & Charlot (2003) Bruzual, G., & Charlot, S. 2003, Monthly Notices of the Royal Astronomical Society, 344, 1000
 Calzetti (2001) Calzetti, D. 2001, Publications of the Astronomical Society of the Pacific, 113, 1449
 Chevallard & Charlot (2016) Chevallard, J., & Charlot, S. 2016, arXiv preprint arXiv:1603.03037
 Cleveland (1979) Cleveland, W. S. 1979, Journal of the American statistical association, 74, 829
 Conroy (2013) Conroy, C. 2013, Annual Review of Astronomy and Astrophysics, 51, 393
 Conroy & Gunn (2010) Conroy, C., & Gunn, J. E. 2010, The Astrophysical Journal, 712, 833
 Croton et al. (2006) Croton, D. J., Springel, V., White, S. D., et al. 2006, Monthly Notices of the Royal Astronomical Society, 365, 11
 Davé et al. (2016) Davé, R., Rafieferantsoa, M. H., Thompson, R. J., & Hopkins, P. F. 2016, arXiv preprint arXiv:1610.01626
 Diemer et al. (2017) Diemer, B., Sparre, M., Abramson, L. E., & Torrey, P. 2017, arXiv preprint arXiv:1701.02308
 Dressler & Abramson (2014) Dressler, A., & Abramson, L. 2014, Proceedings of the International Astronomical Union, 10, 140
 Dye (2008) Dye, S. 2008, Monthly Notices of the Royal Astronomical Society, 389, 1293
 Eisenstein et al. (2011) Eisenstein, D. J., Weinberg, D. H., Agol, E., et al. 2011, The Astronomical Journal, 142, 72
 Ferreras et al. (2006) Ferreras, I., Rogers, B., & Lahav, O. 2006, arXiv preprint astroph/0611456
 Gavazzi et al. (2002) Gavazzi, G., Bonfanti, C., Sanvito, G., Boselli, A., & Scodeggio, M. 2002, The Astrophysical Journal, 576, 135
 Gladders et al. (2013) Gladders, M. D., Oemler, A., Dressler, A., et al. 2013, The Astrophysical Journal, 770, 64
 Green et al. (2012) Green, J., Schechter, P., Baltay, C., et al. 2012, arXiv preprint arXiv:1208.4012
 Guo et al. (2013) Guo, Y., Ferguson, H. C., Giavalisco, M., et al. 2013, The Astrophysical Journal Supplement Series, 207, 24
 Hammer et al. (2005) Hammer, F., Flores, H., Elbaz, D., et al. 2005, Astronomy & Astrophysics, 430, 115
 Heavens et al. (2000) Heavens, A. F., Jimenez, R., & Lahav, O. 2000, Monthly Notices of the Royal Astronomical Society, 317, 965
 Ivezic et al. (2008) Ivezic, Z., Tyson, J., Abel, B., et al. 2008, arXiv preprint arXiv:0805.2366
 Johnson et al. (2013) Johnson, B. D., Weisz, D. R., Dalcanton, J. J., et al. 2013, The Astrophysical Journal, 772, 8
 Kauffmann et al. (2003) Kauffmann, G., Heckman, T. M., White, S. D., et al. 2003, Monthly Notices of the Royal Astronomical Society, 341, 33
 Kelson (2014) Kelson, D. D. 2014, arXiv preprint arXiv:1406.5191
 Kewley et al. (2001) Kewley, L. J., Dopita, M., Sutherland, R., Heisler, C., & Trevena, J. 2001, The Astrophysical Journal, 556, 121
 Klypin et al. (2011) Klypin, A. A., TrujilloGomez, S., & Primack, J. 2011, The Astrophysical Journal, 740, 102
 Kurczynski et al. (2016) Kurczynski, P., Gawiser, E., Acquaviva, V., et al. 2016, The Astrophysical Journal Letters, 820, L1
 Laureijs et al. (2010) Laureijs, R. J., Duvet, L., Sanz, I. E., et al. 2010, in SPIE Astronomical Telescopes+ Instrumentation, International Society for Optics and Photonics, 77311H–77311H
 Lee et al. (2010) Lee, S.K., Ferguson, H. C., Somerville, R. S., Wiklind, T., & Giavalisco, M. 2010, The Astrophysical Journal, 725, 1644
 Leistedt & Hogg (2016) Leistedt, B., & Hogg, D. W. 2016, arXiv preprint arXiv:1612.00847
 Leitherer et al. (1999) Leitherer, C., Schaerer, D., Goldader, J. D., et al. 1999, The Astrophysical Journal Supplement Series, 123, 3
 Leja et al. (2016) Leja, J., Johnson, B. D., Conroy, C., van Dokkum, P. G., & Byler, N. 2016, arXiv preprint arXiv:1609.09073
 Mandelbrot & Van Ness (1968) Mandelbrot, B. B., & Van Ness, J. W. 1968, SIAM review, 10, 422
 Munshi et al. (2013) Munshi, F., Governato, F., Brooks, A., et al. 2013, The Astrophysical Journal, 766, 56
 Orsi et al. (2014) Orsi, Á., Padilla, N., Groves, B., et al. 2014, Monthly Notices of the Royal Astronomical Society, 443, 799
 Pacifici et al. (2012) Pacifici, C., Kassin, S. A., Weiner, B., Charlot, S., & Gardner, J. P. 2012, The Astrophysical Journal Letters, 762, L15
 Pacifici et al. (2016) Pacifici, C., Oh, S., Oh, K., Lee, J., & Yi, S. K. 2016, arXiv preprint arXiv:1604.02460
 Pacifici et al. (2015) Pacifici, C., da Cunha, E., Charlot, S., et al. 2015, Monthly Notices of the Royal Astronomical Society, 447, 786
 Papovich et al. (2016) Papovich, C., Shipley, H., Mehrtens, N., et al. 2016, arXiv preprint arXiv:1603.05660
 Sanders et al. (2015) Sanders, R. L., Shapley, A. E., Kriek, M., et al. 2015, The Astrophysical Journal, 799, 138
 Simha et al. (2014) Simha, V., Weinberg, D. H., Conroy, C., et al. 2014, arXiv preprint arXiv:1404.0402
 Somerville et al. (2008) Somerville, R. S., Hopkins, P. F., Cox, T. J., Robertson, B. E., & Hernquist, L. 2008, Monthly Notices of the Royal Astronomical Society, 391, 481
 Somerville et al. (2015) Somerville, R. S., Popping, G., & Trager, S. C. 2015, Monthly Notices of the Royal Astronomical Society, 453, 4337
 Sparre et al. (2015) Sparre, M., Hayward, C. C., Springel, V., et al. 2015, Monthly Notices of the Royal Astronomical Society, 447, 3548
 Speagle et al. (2014) Speagle, J. S., Steinhardt, C. L., Capak, P. L., & Silverman, J. D. 2014, The Astrophysical Journal Supplement Series, 214, 15
 Steidel et al. (2014) Steidel, C. C., Rudie, G. C., Strom, A. L., et al. 2014, The Astrophysical Journal, 795, 165
 Tinsley (1980) Tinsley, B. M. 1980, Fundamentals of cosmic physics, 5, 287
 Tojeiro et al. (2007) Tojeiro, R., Heavens, A. F., Jimenez, R., & Panter, B. 2007, Monthly Notices of the Royal Astronomical Society, 381, 1252
 Tomczak et al. (2016) Tomczak, A. R., Quadri, R. F., Tran, K.V. H., et al. 2016, The Astrophysical Journal, 817, 118
 Weisz et al. (2011) Weisz, D. R., Dalcanton, J. J., Williams, B. F., et al. 2011, The Astrophysical Journal, 739, 5
 Xie & Singh (2013) Xie, M.g., & Singh, K. 2013, International Statistical Review, 81, 3
 Zhao & Ma (2016) Zhao, G., & Ma, Y. 2016, Statistics & Probability Letters, 116, 72