# Bayesian Estimation of Thermonuclear Reaction Rates

C. Iliadis¹,², K.S. Anderson¹, A. Coc³, F.X. Timmes⁴,⁵, S. Starrfield⁴

¹Department of Physics & Astronomy, University of North Carolina at Chapel Hill, Chapel Hill, NC 27599-3255, USA
²Triangle Universities Nuclear Laboratory, Durham, NC 27708-0308, USA
³Centre de Sciences Nucléaires et de Sciences de la Matière (CSNSM), CNRS/IN2P3, Univ. Paris-Sud, Université Paris–Saclay, Bâtiment 104, F-91405 Orsay Campus, France
⁴School of Earth and Space Exploration, Arizona State University, Tempe, AZ 85287-1504, USA
⁵Joint Institute for Nuclear Astrophysics
###### Abstract

The problem of estimating non-resonant astrophysical S-factors and thermonuclear reaction rates, based on measured nuclear cross sections, is of major interest for nuclear energy generation, neutrino physics, and element synthesis. Many different methods have been applied to this problem in the past, almost all of them based on traditional statistics. Bayesian methods, on the other hand, are now in widespread use in the physical sciences. In astronomy, for example, Bayesian statistics is applied to the observation of extra-solar planets, gravitational waves, and type Ia supernovae. However, nuclear physics, in particular, has been slow to adopt Bayesian methods. We present astrophysical S-factors and reaction rates based on Bayesian statistics. We develop a framework that incorporates robust parameter estimation, systematic effects, and non-Gaussian uncertainties in a consistent manner. The method is applied to the d(p,γ)3He, 3He(3He,2p)4He, and 3He(α,γ)7Be reactions, important for deuterium burning, solar neutrinos, and big bang nucleosynthesis.

methods: numerical — nuclear reactions, nucleosynthesis, abundances — stars: interiors — primordial nucleosynthesis

## 1 Introduction

Thermonuclear reaction rates are at the heart of nuclear astrophysics. They are essential for understanding key phenomena in the universe, including main-sequence stars, red giants, AGB stars, white dwarfs, core-collapse and thermonuclear supernovae, classical novae, and type I X-ray bursts. The evaluations provided by Willy Fowler and collaborators were of outstanding importance in this regard (Fowler, Caughlan & Zimmerman, 1967, 1975; Caughlan & Fowler, 1988). Their recommended experimental reaction rates provided, for the first time, a solid nuclear physics foundation for models of stars and big bang nucleosynthesis. A further milestone was reached with the NACRE collaboration (Angulo et al., 1999). Their work provided not only updated rates, but also included approximate reaction rate error estimates. These ideas were subsequently extended to heavier target nuclei and to reactions involving short-lived targets (Iliadis et al., 2001).

In recent years, a growing volume of astronomical data motivated an increased number of nucleosynthesis sensitivity studies to better quantify the impact of given reactions on nuclear burning. The first such reaction network studies utilized published recommended thermonuclear reaction rates together with somewhat arbitrary methods of varying the rates (Iliadis et al., 2002; Stoesz & Herwig, 2003; Rapp et al., 2006; Parikh et al., 2008; Iliadis et al., 2011). It became apparent that improved experimental reaction rate estimates, based on sound statistical methods, would be very valuable. Such experimental rates were first published in 2010 (Longland et al., 2010; Iliadis et al., 2010a, b, c). They were obtained using Monte Carlo sampling of the many experimental nuclear physics input quantities (e.g., resonance energies and strengths, partial widths and reduced widths) entering in a reaction rate calculation.

The output of this procedure is a probability density for the reaction rate at each temperature of interest. The probability density is used to extract statistically meaningful rate estimates, such as a recommended rate (from the median) or rate uncertainties (from the 16th and 84th percentiles for a 68% coverage probability). Experimental Monte Carlo-based reaction rates are tabulated in the STARLIB reaction rate library (Sallaska et al., 2013) and are publicly available. The STARLIB library has already been used in Monte Carlo nucleosynthesis studies of classical novae (Kelly et al., 2013) and in studies of globular cluster polluters (Iliadis et al., 2016). Recently, STARLIB has been used with the MESA stellar evolution software instrument (Paxton et al., 2011, 2013, 2015) to study the impact of uncertainties in nuclear reaction rates on the properties of carbon-oxygen white dwarfs (Fields et al., 2016).
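The percentile-based summary described above is straightforward to reproduce. The sketch below (Python, with a made-up lognormal distribution standing in for actual Monte Carlo rate samples at one temperature) extracts a recommended rate and a 68% interval:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical Monte Carlo rate samples at a single temperature; a lognormal
# shape is typical for rates built from many multiplicative nuclear inputs.
samples = rng.lognormal(mean=-20.0, sigma=0.1, size=100_000)

# Recommended rate (median) and 68% coverage interval (16th/84th percentiles).
low, median, high = np.percentile(samples, [16, 50, 84])
print(f"recommended rate: {median:.3e}")
print(f"68% interval:     [{low:.3e}, {high:.3e}]")
```

The same three percentiles are what STARLIB tabulates at each temperature grid point.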

Experimental Monte Carlo-based thermonuclear reaction rates are so far available for (p,γ), (p,α), and (α,γ) reactions, involving both stable and unstable target nuclei. The Monte Carlo-based method of estimating reaction rates is limited, in its present form, to nuclear reactions that are dominated by resonant contributions to the total rate. Non-resonant contributions are included in the method (Longland et al., 2010), but their random sampling is only performed in the simplest possible manner by providing an approximate uncertainty of the non-resonant astrophysical S-factor. While this treatment of the non-resonant component is not statistically rigorous, it has little practical effect on the total rates for the reactions referred to above, precisely because they are dominated by resonant contributions.

The calculation of non-resonant reaction rates (with the expression "non-resonant", we refer to astrophysical S-factors that vary smoothly with energy) directly from experimental data has its own difficulties and pitfalls. Such rates have been estimated for light-particle reactions, using the R-matrix reaction model and fits to the data, for the purpose of studying big bang nucleosynthesis (Descouvemont et al., 2004). Light-particle reaction rates have also been computed for solar models (Adelberger et al., 2011). The experimental data were analyzed in the latter work by χ² minimization, using either a polynomial S-factor expansion or the output of theoretical nuclear reaction models. Typical problems encountered in the analysis of non-resonant rates include the treatments of data normalization factor uncertainties (i.e., systematic errors) and discrepant data sets. In a recent study of the cosmic evolution of deuterium (Coc et al., 2015), a number of different methods, all based on χ² minimization, have been employed to compute rates for the d(p,γ)3He, d(d,n)3He, and d(d,p)3H reactions.

In this work we provide a fresh look by calculating the non-resonant reaction rates using Bayesian probability theory. The advantages of this approach are manifold. First, the Bayesian approach yields directly the quantity of interest in nucleosynthesis sensitivity studies, i.e., the reaction rate probability density function. These rates can be easily implemented, together with the Monte Carlo-based rates discussed above, into the STARLIB rate library. Second, the Bayesian model provides a more consistent method for extracting information from measured data, even in ill-conditioned situations, compared to traditional statistics.

In Section 2, we will discuss how to incorporate systematic uncertainties, robust regression, and non-Gaussian statistical uncertainties into a Bayesian analysis. Bayesian astrophysical S-factors and thermonuclear rates for the d(p,γ)3He, 3He(3He,2p)4He, and 3He(α,γ)7Be reactions are presented in Sections 3 and 4, respectively. A summary and conclusions are provided in Section 5. Since Bayesian inference has rarely been applied before to S-factors and reaction rates (for an interesting application of Bayesian methods to estimate the parameters of effective field theories, and an application to the S-factor and reaction rates of 7Be(p,γ)8B, see Zhang, Nollett & Phillips (2015) and Wesolowski et al. (2016)), we will discuss in Appendix A how to use Bayes' theorem to estimate model parameters. To clarify our discussion of Bayesian S-factors and reaction rates in the main text, these ideas are applied in Appendix B to the simple problem of linear regression.

## 2 Systematic uncertainties, robust regression, and non-Gaussian statistical uncertainties

For the analysis of Bayesian models, we will employ the program JAGS (“Just Another Gibbs Sampler”) using Markov chain Monte Carlo (MCMC) sampling. More information on JAGS, including a simple example, is provided in the Appendices. Before we can analyze astrophysical S-factor data using Bayesian inference, we have to consider how to include systematic uncertainties, outliers, and non-Gaussian statistical uncertainties in the likelihood function.

### 2.1 Systematic uncertainties

Experimental data are subject to statistical and systematic uncertainties. Statistical uncertainties are well understood and they usually follow a known probability distribution, e.g., a Gaussian or Poissonian. When a series of independent experiments is performed, statistical uncertainties will give rise to different results in each individual measurement. The magnitude of the statistical uncertainty can be estimated from the standard deviation of the data, if the experiments are uncorrelated. Statistical effects can be reduced by combining the results from several measurements.

Systematic effects, on the other hand, do not usually signal their existence by a larger fluctuation of the data. When the experiment is repeated, the presence of systematic effects may not produce different answers. Similarly, systematic uncertainties are frequently not reduced when combining the results from different measurements. Reported systematic uncertainties are at least partially based on assumptions made by the experimenter, are model dependent, and follow vaguely known probability distributions (Heinrich & Lyons, 2007).

Consider as an example the measurement of an astrophysical S-factor at a given bombarding energy. The experimental result is frequently reported as x ± σ_stat ± σ_syst, where x is the mean value, σ_stat is the standard deviation representing the statistical uncertainty, and σ_syst denotes the systematic uncertainty. The latter two quantities are either reported as absolute or relative (i.e., percent) uncertainties. If a single uncertainty is required, statistical and systematic uncertainties can be combined in quadrature on the grounds that they are uncorrelated. Since a systematic effect will shift all points of a given data set in the same direction, it can either be quantified as an (additive) offset or a (multiplicative) normalization. The true value of the offset or normalization is, of course, unknown, otherwise there would be no systematic uncertainty. However, we do have one piece of information: the expectation value of the systematic uncertainty is zero, if the systematic effect is quantified as an offset, or unity, if it is described as a normalization. If this were not the case, we would have corrected the data for the systematic effect.
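As a minimal worked example of the quadrature rule mentioned above (the numbers are purely illustrative, not from any of the data sets analyzed here):

```python
import math

# Hypothetical S-factor measurement: mean value, statistical and systematic
# uncertainties, all in the same units. Adding in quadrature assumes the two
# error sources are uncorrelated.
mean, stat, syst = 0.214, 0.006, 0.010   # illustrative values, e.g. in keV b

total = math.sqrt(stat**2 + syst**2)
print(f"S = {mean} +/- {total:.4f} keV b")
```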

The S-factor data we will analyze in Section 3 have been reported with systematic uncertainties described by normalization factors. For example, if the systematic uncertainty for a given data set is reported as 10%, the normalization uncertainty is given by a factor of 1.10. A useful distribution for factor uncertainties is the lognormal probability density, given by

 f(x) = \frac{1}{\sigma\sqrt{2\pi}\,x}\, e^{-(\ln x - \mu)^2/(2\sigma^2)}, \qquad x > 0   (1)

It is characterized by two quantities: the location parameter, μ, and the spread parameter, σ. Notice that μ and σ are not the mean and standard deviation of the lognormal distribution, but of the Gaussian distribution for ln x. The median value of the lognormal distribution is given by e^μ, while the factor uncertainty, for a coverage probability of 68%, is f.u. = e^σ. Therefore, we include a systematic effect in our Bayesian model as a highly informative, lognormal prior with a median of 1.0, i.e., μ = ln(1.0) = 0, and a factor uncertainty given by the systematic uncertainty, i.e., f.u. = 1.10, or σ = ln(1.10) ≈ 0.095, in the above example. A specific example, including the syntax, for implementing systematic uncertainties into JAGS is given in Appendix B.
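A short numerical sketch (Python, assuming an illustrative 10% systematic uncertainty) confirms that a lognormal prior with μ = 0 and σ = ln(f.u.) has a median of 1.0 and a 68% interval spanning exactly the factor uncertainty:

```python
import numpy as np

rng = np.random.default_rng(1)

# Illustrative 10% systematic (normalization) uncertainty:
fu = 1.10                            # factor uncertainty
mu, sigma = np.log(1.0), np.log(fu)  # median 1.0, spread ln(f.u.)

prior_samples = rng.lognormal(mu, sigma, size=200_000)

median = np.median(prior_samples)
low, high = np.percentile(prior_samples, [16, 84])
print(f"median = {median:.3f}, 68% interval = [{low:.3f}, {high:.3f}]")
```

The 16th and 84th percentiles come out at 1/1.10 and 1.10, i.e., the data set's normalization is allowed to float within the reported systematic uncertainty.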

### 2.2 Robust regression

Outliers can bias the data analysis significantly if they are not properly taken into account. The frequently applied procedure of disregarding data points that are subjectively deemed to be outliers has no statistical justification. A number of different approaches have been applied in Bayesian inference to include outliers in the analysis. For example, the data could be described by applying a distribution that has heavier tails, such as Student's t distribution (Lange, Little, & Taylor, 1989), compared to the ubiquitous Gaussian distribution.

In the present work, we adopt a different approach that is based on Andreon & Weaver (2015). The method treats the complete data set as a mixture of two populations: one population of supposedly correctly measured uncertainties, and another one for which the reported uncertainty estimates are too optimistic. The goal is to design an algorithm that can automatically identify and reduce the weight of the data points with over-optimistic uncertainties (i.e., outliers). This is achieved by including a parameter describing the membership to the different populations into the random sampling of the posterior.

The procedure has a number of advantages. In the analysis, each datum contributes to the posterior with a larger weight the smaller the uncertainty and the higher the probability that the reported uncertainty is correct. All of the data points are taken into account in the analysis and none are discarded. The MCMC sampling also quantifies the outlier probability for a given datum. The JAGS implementation of robust regression for a simple example is described in Appendix B.
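The weighting idea can be illustrated with a stripped-down, fixed-parameter version of the two-population mixture. In the actual analysis the population membership is sampled along with the other parameters; the variance inflation factor of 10 and the 10% prior outlier fraction below are illustrative choices, not values from this work:

```python
import math

def normal_pdf(x, mu, sd):
    """Gaussian probability density."""
    return math.exp(-0.5 * ((x - mu) / sd) ** 2) / (sd * math.sqrt(2 * math.pi))

def outlier_probability(y, mu, sd, inflate=10.0, p_out=0.1):
    """Posterior probability that datum y (reported error sd, model value mu)
    belongs to the population with over-optimistic uncertainties, modeled as
    a Gaussian with an inflated standard deviation."""
    good = (1 - p_out) * normal_pdf(y, mu, sd)
    bad = p_out * normal_pdf(y, mu, inflate * sd)
    return bad / (good + bad)

# A datum one reported sigma from the model is almost surely "good"...
p1 = outlier_probability(1.0, 0.0, 1.0)
# ...while a datum five sigma away is flagged as a likely outlier.
p5 = outlier_probability(5.0, 0.0, 1.0)
print(f"{p1:.3f} {p5:.3f}")
```

Discrepant points are thus automatically down-weighted rather than discarded by hand.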

### 2.3 Non-Gaussian statistical uncertainties of data points

Nuclear reaction cross sections, or astrophysical S-factors, are experimentally determined by products and ratios of many nuclear physics input quantities: measured net intensities, incident beam charge, detection efficiencies, number of target nuclei, stopping powers, etc. According to the central limit theorem, a derived quantity, such as the cross section or S-factor, will then follow a lognormal rather than a normal probability density (Equation 1). This situation was discussed in Longland et al. (2010). The lognormal parameters are given by

 \mu = \ln(E[x]) - \frac{1}{2}\ln\left(1 + \frac{V[x]}{E[x]^2}\right)   (2)
 \sigma = \sqrt{\ln\left(1 + \frac{V[x]}{E[x]^2}\right)}   (3)

where E[x] and V[x] denote the mean value and the variance (i.e., the square of the standard deviation), respectively. For standard deviations of less than about 10% of the mean value, the lognormal probability density is very close in shape to a Gaussian. However, with increasing relative standard deviations, the differences between the lognormal density function and a Gaussian approximation increase. Notice also that, unlike the lognormal probability density, a Gaussian density function predicts a finite probability for negative values of the random variable, which is unphysical for manifestly positive quantities, such as nuclear reaction cross sections or astrophysical S-factors. The JAGS syntax for implementing lognormal likelihoods is given in Appendix B.
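A quick numerical check of Equations 2 and 3 (Python; the mean value and standard deviation are arbitrary illustrative numbers) shows that the resulting lognormal indeed reproduces the input moments:

```python
import math

def lognormal_params(mean, var):
    """Lognormal mu and sigma reproducing a given mean and variance (Eqs. 2-3)."""
    mu = math.log(mean) - 0.5 * math.log(1.0 + var / mean**2)
    sigma = math.sqrt(math.log(1.0 + var / mean**2))
    return mu, sigma

# Illustrative S-factor: mean 0.20 (arbitrary units) with a 5% standard deviation.
mean, sd = 0.20, 0.01
mu, sigma = lognormal_params(mean, sd**2)

# Closed-form lognormal moments recover the inputs exactly.
mean_back = math.exp(mu + sigma**2 / 2)
var_back = (math.exp(sigma**2) - 1) * math.exp(2 * mu + sigma**2)
print(mean_back, math.sqrt(var_back))
```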

At low bombarding energies, where the experimental yields are very small, the reported uncertainties on data points are frequently large. For example, how should one interpret a reported S-factor (in keV b) whose quoted uncertainty is comparable to the mean value itself, implying a finite chance of a zero S-factor? It is certainly inappropriate to assume a Gaussian likelihood in this case, because the S-factor cannot become negative. But it is equally inappropriate to assume a lognormal likelihood, which predicts zero probability for a zero S-factor. In such cases, the total statistical uncertainty is dominated by counting statistics and the appropriate likelihood function to use is a Poissonian (or a difference of Poissonians, if the net intensity is inferred from total and background counts).

Suppose we perform a simple counting measurement. We have measured the total and the background counts, and we are interested in estimating the signal (i.e., total minus background) counts. When we set up a Bayesian model for this situation, it is appropriate to assume Poissonian likelihoods for both the total and the background counts. Figure 1 shows numerical results obtained using JAGS. The panels display the posteriors for the signal counts. We assumed a broad uniform prior, i.e., the posterior will closely reflect the shape of the likelihood (see Equation A3). The different panels are obtained for decreasing values of the total number of counts, while the background counts are kept fixed. The predicted mean and standard deviation of the signal are indicated in each panel. When the total number of counts is relatively large (see first panel), the probability for predicting zero signal counts is negligible. In addition, the shape of the density function is well approximated by a lognormal distribution. On the other hand, when the total number of counts is similar to the background counts (see last panel), the posterior predicts a large probability at zero signal counts and certainly does not resemble a lognormal distribution. The sequence of panels shows that the probability density can be approximated by a lognormal distribution as long as the ratio of mean value to standard deviation is sufficiently large (see second panel). Similar results are obtained when different priors are used (e.g., gamma functions, exponentials, or hyperpriors). Consequently, we will exclude from our analysis of experimental S-factors the few data points that do not satisfy this criterion.
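The behavior described above can be reproduced without JAGS by evaluating the posterior on a grid. The sketch below assumes a known mean background and a uniform prior on the signal, a simplification of the full model (where the background is also assigned its own Poisson likelihood); the counts are illustrative:

```python
import numpy as np

def signal_summary(n_total, b, s_grid):
    """Posterior mean and standard deviation of the signal s for a Poisson
    likelihood Poisson(n_total; s + b), with known mean background b and a
    uniform prior on s >= 0 (grid-based stand-in for the JAGS model)."""
    log_like = n_total * np.log(s_grid + b) - (s_grid + b)  # log Poisson, up to constants
    w = np.exp(log_like - log_like.max())
    w /= w.sum()                                            # normalized grid weights
    mean = (s_grid * w).sum()
    sd = np.sqrt(((s_grid - mean) ** 2 * w).sum())
    return mean, sd

s = np.linspace(0.0, 60.0, 6001)
b = 10.0   # known mean background counts (illustrative)

results = {n: signal_summary(n, b, s) for n in (40, 12)}
for n, (mean, sd) in results.items():
    print(f"total counts {n}: signal = {mean:.1f} +/- {sd:.1f} (mean/sd = {mean/sd:.1f})")
```

With many total counts the posterior is narrow and lognormal-like; when the total counts approach the background level, the mean-to-standard-deviation ratio collapses and substantial probability piles up at zero signal.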

## 3 Bayesian astrophysical S-factors

We will now apply the Bayesian method to the estimation of astrophysical S-factors, S(E). (Analyzing S-factor data rather than cross section data has a number of advantages, among them a dramatic reduction in the energy dependence at the low bombarding energies considered here, and a straightforward comparison to literature results.) This quantity is defined as (Iliadis, 2015)

 S(E) \equiv E\, \sigma(E)\, e^{2\pi\eta}   (4)

where σ(E) is the nuclear reaction cross section at the center-of-mass energy, E. The quantity 2πη denotes the Gamow factor, given by

 2\pi\eta = 0.9895\, Z_0 Z_1 \sqrt{\frac{M_0 M_1}{M_0 + M_1}} \sqrt{\frac{1}{E}}   (5)

with Z₀ and Z₁ the charges of the projectile and target; in this expression, the relative atomic masses, M₀ and M₁, and the center-of-mass energy, E, are in units of u and MeV, respectively.
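A minimal sketch of Equations 4 and 5 (Python; the cross section value is hypothetical, the p + d masses are approximate, and the coefficient follows Equation 5):

```python
import math

def gamow_2pi_eta(z0, z1, m0, m1, energy_mev):
    """2*pi*eta for charges z0, z1 and relative atomic masses m0, m1 (in u),
    at a center-of-mass energy in MeV (coefficient as in Eq. 5)."""
    return 0.9895 * z0 * z1 * math.sqrt(m0 * m1 / (m0 + m1)) * math.sqrt(1.0 / energy_mev)

def s_factor(energy_mev, sigma_barn, z0, z1, m0, m1):
    """Astrophysical S-factor, S(E) = E * sigma(E) * exp(2*pi*eta) (Eq. 4)."""
    return energy_mev * sigma_barn * math.exp(gamow_2pi_eta(z0, z1, m0, m1, energy_mev))

# Illustrative: p + d at E = 0.1 MeV with a hypothetical cross section of 1 microbarn.
s = s_factor(0.1, 1.0e-6, 1, 1, 1.00783, 2.01410)
print(f"S = {s:.3e} MeV b")
```

The exponential removes the steep Coulomb-barrier energy dependence from the cross section, which is the advantage mentioned above for working with S-factors.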

The experimental S-factor can be extracted from data using fitting functions based either on a polynomial representation or on nuclear reaction models. The former provides a result that is independent of nuclear theory. Because this procedure has no theoretical justification beyond the known data points, it requires that the S-factor data cover the entire energy region of astrophysical interest. However, this is frequently not the case, especially at low bombarding energies, where the Coulomb barrier greatly inhibits direct measurements.

When data are missing in the region of interest, fitting functions motivated by nuclear theory (e.g., potential models, microscopic calculations, or R-matrix approaches) are usually preferred (Adelberger et al., 2011). With this method, it is assumed that the nuclear model reliably describes the energy dependence of the S-factor, but that the absolute scale is determined by a fit of the data using the nuclear model. The assumption of an additional normalization motivated by experimental data can be explained qualitatively. For example, many microscopic models compute the interior wave functions over truncated configuration spaces, with consequences for the normalization. Similarly, in ab initio models, small variations in the strength of the effective nucleon-nucleon interaction, which is adjusted to reproduce nucleon-nucleon scattering data, will result in changes of the S-factor normalization. Nevertheless, we emphasize that the need for an additional normalization has no rigorous theoretical justification. However, since we cannot easily compute microscopic models and vary the model parameters, the assumption of a normalization factor determined by experiment represents the most straightforward method. In any case, the extrapolation of the S-factor beyond the measured data will have some theoretical justification.

We will present in the following a Bayesian analysis for several light-particle nuclear reactions, assuming for the model S-factor either a polynomial representation or the results of nuclear models. Each of these reactions has its own intricacies.

### 3.1 S-factor for d(p,γ)3He

The d(p,γ)3He reaction represents the second step in the pp chains of stellar hydrogen burning. Since its rate is much faster than that of the first step, p(p,e+ν)d, uncertainties in the d(p,γ)3He reaction are usually not important for stellar energy generation. In special situations, however, this reaction does play a crucial role. For example, during the earliest stages of stellar evolution, when a cloud of interstellar gas collapses to form a protostar, the central temperature reaches a few million kelvin. At this temperature, primordial deuterium fuses with hydrogen (deuterium burning), thereby generating nuclear energy that slows the contraction and the central heating of the gas until the deuterium is consumed. The d(p,γ)3He reaction also plays a crucial role in big bang nucleosynthesis (Coc et al., 2015, and references therein), which begins once the temperature has declined sufficiently, corresponding to relevant kinetic energies in the keV range. The uncertainty in the d(p,γ)3He reaction rate impacts the primordial abundances of d, 3He, and 7Li. For example, the reaction rate needs to be known to better than a few percent at the relevant low energies to compare big bang nucleosynthesis predictions to the very precise deuterium-to-hydrogen (D/H) abundance ratio measured in very metal-poor, damped Lyman-α systems (Cooke et al., 2014).

Most recently, S-factors and reaction rates for d(p,γ)3He have been presented by Coc et al. (2015). A reliable estimation of S-factors requires simultaneous knowledge of statistical and systematic uncertainties, as discussed in Section 2.1. Among the many data sets published during 1962-2008, this information is only available for four studies (Ma et al., 1997; Schmid et al., 1997; Casella et al., 2002; Bystritsky et al., 2008). These were the only data sets used by Coc et al. (2015) for their S-factor estimation, and we will apply the same data selection. (It appears that the center-of-mass energies of the three data points of Bystritsky et al. (2008) were misinterpreted in Coc et al. (2015); the corrected energies are used in the present work.) The data point at the lowest measured bombarding energy of Casella et al. (2002) does not satisfy the mean-to-standard-deviation criterion of Section 2.3 and has been omitted from our analysis. This data point was included in the analysis of Coc et al. (2015). The data adopted for the present analysis are displayed as black symbols in Figure 2, where the displayed error bars refer to (1σ) statistical uncertainties only.


The S-factor fit to the data was performed in previous work by using polynomials (e.g., Adelberger et al., 2011) or results from nuclear theory (e.g., Descouvemont et al., 2004; Coc et al., 2015). Similar to Coc et al. (2015), we will adopt in the present work the theoretical S-factors from Marcucci et al. (2005). They were obtained using variational wave functions for the p-d continuum and the 3He bound state, together with a Hamiltonian consisting of two-nucleon and three-nucleon potentials. We will assume that the theoretical model adequately describes the shape of the S-factor curve, but that the absolute scale of the model S-factor is determined by the fit to the data.

The JAGS model for each of the four data sets includes the effects of systematic uncertainties, robust regression, and lognormal likelihood functions, as discussed in Section 2. Our Bayesian model has five parameters. Four of these are the normalizations of the individual data sets. The highly informative priors for these parameters are computed using the systematic uncertainty factors of Ma et al. (1997), Schmid et al. (1997), Casella et al. (2002), and Bystritsky et al. (2008) that are listed in Table I of Coc et al. (2015). The fifth parameter is the common scaling factor by which the nuclear model results have to be multiplied to fit the data. We assume a non-informative prior for this parameter, i.e., a normal probability density with a location of zero and a large standard deviation. The distribution was truncated at zero since the scaling factor must be a positive quantity. Other choices of priors (i.e., uniform and gamma functions) gave very similar results. The theoretical S-factor from Marcucci et al. (2005), available to us as a table of 100,000 S-factor values over the relevant range of center-of-mass energies, was directly implemented into JAGS.

With our Bayesian model, we generated random samples using three independent Markov chains, each of length 75,000 (without burn-in). This ensures that the Monte Carlo fluctuations are negligible compared to the statistical and systematic uncertainties. A first impression can be obtained from Figure 2. The grey shaded region consists of lines that correspond to the credible S-factor curves, where each line corresponds to one sampled set of model parameters. The blue line represents the median (50th percentile), and the red lines the 16th and 84th percentiles of all credible S-factors. More information on the meaning of these lines can be found in Appendix A.
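The credible bands in Figure 2 are obtained by computing percentiles across the sampled curves at each energy. A toy version of this (with a fabricated linear "theoretical" shape and a fabricated posterior for the single scale factor, both purely illustrative) looks like:

```python
import numpy as np

rng = np.random.default_rng(2)

# Stand-ins for the real inputs: a fixed theoretical S-factor shape on an
# energy grid, and posterior samples of the common scale factor.
energies = np.linspace(0.01, 1.0, 50)        # MeV (illustrative grid)
shape = 0.2 + 5.0 * energies                 # fabricated "theory" curve
scales = rng.normal(1.0, 0.02, size=5000)    # fabricated posterior samples

# One credible S-factor curve per posterior sample.
curves = scales[:, None] * shape[None, :]

# Per-energy percentiles give the median (blue) and 68% band (red) of Fig. 2.
p16, p50, p84 = np.percentile(curves, [16, 50, 84], axis=0)
print(p50[:3])
```

In the actual analysis the curves come from the MCMC samples of all five parameters, but the percentile construction is identical.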

Details of our analysis are given in Table 1 and are compared to recently published results obtained using traditional statistics (χ² minimization). The top part of the table displays the normalization factors ("norm") of each data set, taking into account the reported systematic uncertainties (see above). The present and previous values overlap within uncertainties. However, the magnitude of the uncertainties differs significantly. For example, for the data of Casella et al. (2002) our uncertainties are a factor of 4 larger than those of Coc et al. (2015), while for the data of Bystritsky et al. (2008) our uncertainties are smaller by a factor of 2. The fourth column summarizes the outlier probability of the different data sets. The values are computed from the average of the outlier probabilities of all data points in a given set, as predicted by JAGS. The data of Schmid et al. (1997) have an average outlier probability of 72%, in agreement with the elevated reduced χ² found by Coc et al. (2015) for this set.

The same data sets were analyzed in both Coc et al. (2015) and in the present work, and the same systematic uncertainties were adopted in both studies. Therefore, it is interesting to investigate the main reason for the significant differences, mentioned above, that are obtained in the analysis of the data of Casella et al. (2002) and Bystritsky et al. (2008). We performed a series of tests and found that neither the inclusion nor the omission of the lowest-lying data point in Casella et al. (2002), nor the use of the correct or incorrect center-of-mass energies in Bystritsky et al. (2008) (see above), had an effect on our derived normalization factors listed in the top part of Table 1. These changes in the data sets are too small to affect the analysis. We also performed a test by using Gaussian instead of lognormal likelihoods for the data points and obtained again results in agreement with those listed in Table 1. This is not surprising because, with few exceptions, the data points have relatively small error bars, implying that a Gaussian closely approximates the lognormal likelihood (Section 2.3). We thus conclude that the significant differences obtained presently and previously regarding the data sets of Casella et al. (2002) and Bystritsky et al. (2008) are caused by the adoption of a Bayesian model in our work as opposed to the traditional method employed by Coc et al. (2015).

The lower part of Table 1 compares the present and previous values for the scale factor of the theoretical model results, and the astrophysical S-factor at zero energy. Our Bayesian analysis verifies the results reported by Coc et al. (2015), both for the recommended values and the magnitude of the uncertainties. Our zero-energy S-factor also agrees with the value presented in Adelberger et al. (2011), although our uncertainty is significantly smaller. The analysis by Adelberger et al. (2011) was performed using a χ² minimization and assuming a quadratic parameterization of the S-factor. The zero-energy S-factor presented in Xu et al. (2013) has a much larger uncertainty compared to all other recent values. It was obtained, using a potential model, from a standard χ² fit in conjunction with a "fit-by-eye" technique.

In summary, completely independent methods of analysis provide comparable results. But unlike the χ² minimization applied previously, the Bayesian technique provides consistent answers without the need to resort to Gaussian assumptions and other approximations. Thermonuclear reaction rates will be presented in Section 4.1.

### 3.2 3He(3He,2p)4He

The 3He(3He,2p)4He reaction represents the third and final step of the pp1 chain. The competition of this process with the 3He(α,γ)7Be reaction determines the relative neutrino fluxes that originate from the pp and pep reactions (pp1 chain) compared to the 7Be and 8B decays (pp2 and pp3 chains). The S-factor ratio for 3He(3He,2p)4He and 3He(α,γ)7Be enters directly in the calculation of the solar neutrino energy losses, and thus impacts the relationship between the photon luminosity and the total energy production of the Sun (Adelberger et al., 2011).

The astrophysical S-factor of 3He(3He,2p)4He was recently evaluated by Adelberger et al. (2011). As discussed above and in that work, a reliable estimation of the astrophysical S-factor requires separate knowledge of statistical and systematic uncertainties. This information is reported in four studies. The quoted systematic uncertainties are 4.5% (Krauss et al., 1987), 3.7% (Junker et al., 1998), 5.7% (Bonetti et al., 1999), and 3.8% (Kudomi et al., 2004). To these data we added one more study (Dwarakanath & Winkler, 1971) for which we could infer separate statistical (4%-7%) and systematic uncertainties (8.2%) based on the information provided. Our reasoning is discussed in more detail in Appendix C.2. The data adopted in the present work are displayed as black symbols in Figure 3, where the displayed error bars refer to (1σ) statistical uncertainties only.

We disregarded the two data points from Bonetti et al. (1999) at the lowest center-of-mass energies, for the reasons given in Section 2.3. Only a single event was observed at each of these two energies, and the quoted errors refer to statistical uncertainties only (see their Table I). It would not be difficult to include such data in a Bayesian model if their probability densities were known. Since this information is not provided by Bonetti et al. (1999), we cannot include these two data points with very large errors. They should also not be included in a traditional (χ² minimization) analysis. Bonetti et al. (1999) and Adelberger et al. (2011) do not mention whether or not they included these two data points.

The S-factor fit to the data was performed in previous work (Bonetti et al., 1999; Adelberger et al., 2011) by using the expressions

 S(E) = S_bare(E) exp[πη(U_e/E)] (6)
 S_bare(E) = S(0) + S′(0)E + (1/2)S″(0)E² (7)

where S_bare(E) is the bare nucleus S-factor that is not influenced by electron screening, S(0) is the S-factor at zero center-of-mass energy, S′(0) and S″(0) are the first and second energy derivatives of the S-factor at zero energy, and U_e is the electron-screening potential energy. This expression adequately describes the total measured S-factor, S(E), at energies below  MeV.
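The screening enhancement factor in Eq. (6) is easy to evaluate numerically. The Python sketch below uses the standard approximation 2πη ≈ 0.98951 Z₁Z₂ (μ/E)^(1/2), with E in MeV and μ in amu, together with illustrative (not fitted) values for 3He+3He: S(0) ≈ 5.2 MeV b and a screening potential of ~300 eV, both assumptions made for this sketch only:

```python
import math

def pi_eta(z1, z2, mu_amu, e_mev):
    """pi*eta for charges z1, z2, reduced mass in amu, c.m. energy in MeV.
    Uses 2*pi*eta = 0.98951 * z1 * z2 * sqrt(mu/E)."""
    return 0.5 * 0.98951 * z1 * z2 * math.sqrt(mu_amu / e_mev)

def s_bare(e_mev, s0, s1, s2):
    """Quadratic expansion of the bare S-factor, Eq. (7)."""
    return s0 + s1 * e_mev + 0.5 * s2 * e_mev**2

def s_screened(e_mev, s0, s1, s2, ue_mev, z1, z2, mu_amu):
    """Total S-factor including electron screening, Eq. (6)."""
    return s_bare(e_mev, s0, s1, s2) * math.exp(pi_eta(z1, z2, mu_amu, e_mev) * ue_mev / e_mev)

# Enhancement factor S/S_bare at 20 keV and 200 keV for 3He+3He
# (Z1 = Z2 = 2, mu ~ 1.508 amu, Ue ~ 300 eV; illustrative values only):
f_20 = s_screened(0.020, 5.2, 0.0, 0.0, 300e-6, 2, 2, 1.508) / s_bare(0.020, 5.2, 0.0, 0.0)
f_200 = s_screened(0.200, 5.2, 0.0, 0.0, 300e-6, 2, 2, 1.508) / s_bare(0.200, 5.2, 0.0, 0.0)
```

The enhancement is largest at the lowest energies and approaches unity as E grows, which is why laboratory data taken at low energies must be corrected for screening before extrapolating to the bare-nucleus S-factor.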

Our Bayesian model has nine parameters. Five of these are the normalizations of the individual data sets. The highly informative priors for these parameters are computed using the systematic uncertainty factors quoted above. The other four parameters are S(0), S′(0), S″(0), and U_e. We assume non-informative priors for these parameters, i.e., normal probability densities located at zero with large values for the standard deviations. The distributions for S(0) and U_e were truncated at zero since both the S-factor and the electron-screening potential energy are positive quantities. Other choices of priors gave consistent results.

We generated random samples using three independent Markov chains, each of length 75,000 (without burn-in). Results are shown in Figure 3. The two sets of credible lines display the S-factors with (upper grey lines) and without (lower grey lines) electron screening corrections. All other lines have the same meaning as in Figure 2.


Our results are listed in Table 2, where they are compared to recently published values obtained using traditional statistics (χ² minimization). The top part of the table displays the normalization factors (“norm”) of each data set, taking into account the reported systematic uncertainties (see above). The fourth column summarizes the outlier probability of the different data sets, which is computed from the average of the outlier probabilities of all data points in a given set, as predicted by JAGS. We find the largest outlier probabilities for the data of Krauss et al. (1987) (67%) and Bonetti et al. (1999) (58%).

The lower part of Table 2 compares the present and previous values for the S-factor expansion coefficients, S(0), S′(0), S″(0), and the electron screening potential, U_e. Notice that each of the listed present values is marginalized over all other parameters (see Appendix B). Our results agree with those of Bonetti et al. (1999) within uncertainties. However, our values for S′(0) and S″(0) disagree with those reported by Adelberger et al. (2011). Since their value of S(0) agrees with our result, we conclude that the disagreement for the other parameters is caused by the significantly smaller bombarding energy range analyzed by Adelberger et al. (2011), i.e.,  keV, compared to  MeV in the present work. Our uncertainty of 2.6% for the zero-energy S-factor is much smaller than the value of 9.4% reported by Xu et al. (2013), who obtained the S-factor using a phenomenological nuclear reaction model and a χ² minimization.

Furthermore, Adelberger et al. (2011) report an S-factor of  MeVb at the Gamow peak (  keV) for the Sun’s central temperature (  MK), corresponding to an uncertainty of 4.3%. Our result is  MeVb, corresponding to a significantly smaller uncertainty of 2.7%. Thermonuclear reaction rates will be presented in Section 4.2.

### 3.3 3He(α,γ)7Be

The detection of solar neutrinos has entered a precision era, enabling the measurement of neutrino fluxes with a total uncertainty at the few-percent level by various neutrino detectors (Aharmin et al., 2013; Smy, 2013; Bellini et al., 2014). The measured neutrino fluxes can be used to probe the solar core and test solar models, provided that the relevant thermonuclear reaction rates are accurately known. Since the 3He(α,γ)7Be reaction competes with 3He(3He,2p)4He, it determines the number of 7Be and 8B neutrinos originating from the pp2 and pp3 chains. The 3He(α,γ)7Be reaction also plays a prominent role in big bang nucleosynthesis. While the primordial abundances of d, 3He, and 4He predicted by standard big bang models are in reasonable agreement with those derived from observations, the models overproduce the primordial abundance of 7Li by a factor of 3. This “7Li problem” is among the unsolved mysteries in astrophysics (Iocco et al., 2009). Most of the 7Li in the early universe is produced as 7Be, via the 3He(α,γ)7Be reaction, which subsequently decays by electron capture to 7Li. Although a new determination of the 3He(α,γ)7Be rate does not appear to solve the 7Li problem, it is nevertheless desirable to determine a reliable rate for this reaction.

Many groups have measured the 3He(α,γ)7Be reaction using various experimental strategies. For summary discussions, see Adelberger et al. (2011); Bordeanu et al. (2013); deBoer et al. (2014). Similar to the procedure in Adelberger et al. (2011), we adopt a subset of all published measurements for the present analysis. First, we only consider those studies that provide separate statistical and systematic uncertainties. This excludes all measurements performed before the year 2000. Second, we focus on the center-of-mass energy region below  MeV. Third, we only consider those experiments that directly measure the total cross section, i.e., via activation or recoil detection. We exclude prompt γ-ray data, since these studies rely so far on computed rather than measured corrections for γ-ray angular correlation effects. Based on these selection criteria, four data sets remain: Brown et al. (2007); Nara Singh et al. (2004); Di Leva et al. (2009); Costantini et al. (2008), which were labeled “Seattle”, “Weizmann”, “ERNA”, “LUNA”, respectively, in Adelberger et al. (2011). For systematic uncertainties, we adopt 3.0% (Brown et al., 2007), 5.1% (Nara Singh et al., 2004), 5.0% (Di Leva et al., 2009), and 3.1% (Costantini et al., 2008). Notice that Adelberger et al. (2011) adopt a systematic uncertainty of only 2.2% for the data of Nara Singh et al. (2004). However, this value applies to their highest energy data point only, while the other three data points have considerably higher systematic uncertainties (4.1%-7.1%). We adopt here the average value. The data adopted in the present work are displayed as black symbols in Figure 4, where the displayed error bars refer to statistical uncertainties only.

Several different strategies have been employed in the past to fit the experimental S-factor data for the 3He(α,γ)7Be reaction, including potential models (Tombrello & Parker, 1963), parametrized analytical functions (Cyburt & Davids, 2008), resonating-group methods (Kajino, 1986), and R-matrix approaches (deBoer et al., 2014). In this work, we will focus on three microscopic models. The first is the resonating-group study of Kajino (1986), which has been used in several previous investigations. In this model, the nuclear system is characterized by antisymmetrized wave functions describing the relative motion of two clusters. The required phenomenological nucleon-nucleon interactions were tuned to reproduce the properties of bound and scattering states within the restricted cluster model space. The second is the calculation of Nollett (2001), which employed accurate nucleon-nucleon potentials. The bound states were computed using the variational Monte Carlo method, while the relative motion of the nuclei in the initial state was described by one-body wave functions generated from the intercluster potential A of Kim, Izumoto & Nagatani (1981). The third is the ab initio model of Neff (2011), which employed realistic interactions to solve the many-body problem using a large model space. The latter work found that the assumption of a predominant external capture, which was commonly adopted in most previous studies, is not that well satisfied. Similar to the discussion in Section 3.1, we assume that these models adequately describe the shape of the S-factor, but that the absolute scale of each model S-factor is determined by a fit to the data. In the following, we first discuss and quote results obtained using the model of Neff (2011). Subsequently, we use the models of Kajino (1986) and Nollett (2001) to estimate the model uncertainty for the extrapolation of the S-factor to low energies, where no data exist.
We obtained the theoretical S-factors for all three models from the original authors as numerical tables, which were directly implemented into JAGS. Tests showed that the linear interpolation between grid points introduces negligible errors.
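The size of the interpolation error can be checked outside JAGS. The Python sketch below uses a hypothetical coarse table with a smooth stand-in shape (not the actual model S-factors, which are tabulated much more densely) and compares interpolated values against the underlying curve:

```python
import numpy as np

# Hypothetical coarse S-factor table: energy grid in MeV, smooth
# stand-in quadratic shape for the tabulated S-factor values.
e_grid = np.linspace(0.0, 2.5, 26)                 # 0.1 MeV spacing
s_grid = 0.56 - 0.30 * e_grid + 0.15 * e_grid**2

def s_interp(e):
    """Linear interpolation between grid points, as done inside JAGS."""
    return np.interp(e, e_grid, s_grid)

# Maximum deviation from the underlying smooth curve at off-grid energies:
e_test = np.linspace(0.05, 2.45, 97)
err = np.max(np.abs(s_interp(e_test) - (0.56 - 0.30 * e_test + 0.15 * e_test**2)))
```

For a smooth S-factor, the linear-interpolation error scales with the square of the grid spacing, so a sufficiently dense table makes the error negligible compared to the experimental uncertainties.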

Our Bayesian model has five parameters. Four of these are the normalizations of the individual data sets. The highly informative priors for these parameters are computed using the systematic uncertainty factors quoted above. The fifth parameter is the common scaling factor by which the nuclear model results have to be multiplied to fit the data. We assume a non-informative prior for the latter parameter, i.e., a normal probability density with a location of zero and a standard deviation of . The distribution was truncated at zero since the scaling factor must be a positive quantity. Other choices of priors gave very similar results. We generated random samples using three independent Markov chains, each of length 75,000 (without burn-in). Results are shown in Figure 4, where the lines have the same meaning as in Figure 2.

Our numerical results are listed in Table 3 and are compared to recently published values obtained using different methods. The top part of the table displays the normalization factors (“norm”) of each data set, taking into account the reported systematic uncertainties (see above). The fourth column summarizes the outlier probability of the different data sets, which is computed from the average of the outlier probabilities of all data points in a given set, as predicted by JAGS. We obtain the largest average outlier probability for the data of Brown et al. (2007) (81%).

The lower part of Table 3 compares the present and previous values for the S-factor at zero energy, S(0). From fitting the data using the model of Neff (2011), we find  MeVb, representing an uncertainty of 2.1%. Our result agrees within the quoted uncertainties with those of Adelberger et al. (2011), who used in their analysis the same data sets as we did. Notice, however, that Adelberger et al. (2011) employed an analytic function that approximated the S-factor of Nollett (2001) “to better than %, on average”. In contrast, we directly used the original numerical tables of Neff (2011) and there was no need for an approximation. Compared to Adelberger et al. (2011), our uncertainty in S(0) from fitting the data is smaller by a factor of . It is also interesting that our value for S(0) disagrees with the R-matrix result of deBoer et al. (2014),  MeVb, where their quoted error is based on the data fit only. Their quoted mean value of S(0) is lower by 5.5% compared to our result.


For low energies, especially those pertaining to the solar Gamow peak, data do not exist and the S-factor must be extrapolated to compute the reaction rates. Therefore, past work has included a “theory error” that is based on the spread in values obtained when different theoretical models are used to fit the data. In our case, we repeated our analysis using the theoretical model S-factors of Kajino (1986) and Nollett (2001). With the theoretical model of Nollett (2001), we find almost identical S-factors (mean value and uncertainties) compared to the model of Neff (2011). The model of Kajino (1986) resulted in a similar uncertainty, but a mean value smaller by %. Using the full spread of % as an estimate for the “theory error”, our result is  MeVb. This can be compared to  MeVb (Adelberger et al., 2011) and  MeVb (deBoer et al., 2014). Concerning the result of deBoer et al. (2014), the first uncertainty was obtained from the data fit, the second from varying the background pole energies and the R-matrix channel radius, and the third from using different scattering data sets to define the phase shifts. Thermonuclear reaction rates will be presented in Section 4.2.

## 4 Bayesian reaction rates

The thermonuclear reaction rate per particle pair, N_A⟨σv⟩, can be written as (Iliadis, 2015)

 N_A⟨σv⟩ = N_A (8/(π m₀₁))^(1/2) (kT)^(−3/2) ∫₀^∞ e^(−2πη) S(E) e^(−E/kT) dE (8)

where m₀₁ is the reduced mass of projectile and target, and N_A is Avogadro’s constant; the product of the Boltzmann constant, k, and the plasma temperature, T, is numerically given by

 kT = 0.086173324 T9 (MeV) (9)

with the temperature, T9, given in units of GK.
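Equations (8) and (9) can be sketched numerically as follows. This Python sketch uses the practical form N_A⟨σv⟩ = 3.7318×10^10 μ^(−1/2) T9^(−3/2) ∫ S(E) e^(−2πη) e^(−E/kT) dE (E in MeV, S in MeV b, rate in cm³ mol⁻¹ s⁻¹; the constant and the 2πη approximation are taken from standard practice and should be checked against Iliadis 2015), with an illustrative constant S-factor standing in for the sampled S(E):

```python
import math

def rate_na_sv(t9, s_mevb, mu_amu, z1, z2, e_max=2.0, n=20000):
    """Trapezoid-rule integration of Eq. (8) for a constant S-factor."""
    kt = 0.086173324 * t9                     # kT in MeV, Eq. (9)
    de = e_max / n
    total = 0.0
    for i in range(1, n + 1):                 # skip E = 0 (integrand -> 0)
        e = i * de
        two_pi_eta = 0.98951 * z1 * z2 * math.sqrt(mu_amu / e)
        w = 0.5 if i == n else 1.0            # trapezoid end-point weight
        total += w * s_mevb * math.exp(-two_pi_eta - e / kt) * de
    return 3.7318e10 * total / (math.sqrt(mu_amu) * t9 ** 1.5)

# Illustrative constant S-factor of 2.5e-7 MeV b for p + d (mu ~ 0.672 amu):
r_low = rate_na_sv(0.1, 2.5e-7, 0.672, 1, 1)
r_high = rate_na_sv(1.0, 2.5e-7, 0.672, 1, 1)
```

In the actual analysis this integral is evaluated once per MCMC sample of the S-factor parameters, so the spread of the resulting rates directly reflects the posterior uncertainty.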

For each set of parameters sampled by the MCMC algorithm, we calculate the reaction rates by numerical integration of Eq. (8) on a grid of temperatures between  MK and  GK. The resulting set of reaction rate values constitutes the reaction rate probability density at a given temperature. Using this probability density, we follow the procedure recommended in Longland et al. (2010) to compute a recommended rate (50th percentile), a low and high rate (16th and 84th percentiles, respectively), and the parameters, μ and σ, of the lognormal approximation of the total reaction rate. The rate factor uncertainty, f.u. = e^σ, corresponds to a coverage probability of 68% (see Equation 1). Notice that we directly compute the lognormal parameters from the expectation value and variance of all rate samples at a given temperature. This ensures that the results can be directly incorporated into the STARLIB library (Sallaska et al., 2013).
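The moment-matching step can be sketched as follows (Python; the synthetic “rate samples” are drawn from an exactly lognormal density with arbitrary parameters, so the recovered μ and σ can be verified):

```python
import numpy as np

def lognormal_summary(samples):
    """Lognormal parameters mu, sigma matched to the sample mean and
    variance (method of moments), plus the factor uncertainty exp(sigma)."""
    ex = np.mean(samples)
    vx = np.var(samples)
    sigma2 = np.log(1.0 + vx / ex**2)
    mu = np.log(ex) - 0.5 * sigma2
    sigma = np.sqrt(sigma2)
    return mu, sigma, np.exp(sigma)

rng = np.random.default_rng(1)
# Synthetic rate samples: exactly lognormal with arbitrary mu = -46,
# sigma = 0.05, so the method of moments should recover these values.
x = rng.lognormal(mean=-46.0, sigma=0.05, size=200000)
mu, sigma, fu = lognormal_summary(x)
```

When the sampled rates are close to lognormal, as found here, the 16th and 84th percentiles nearly coincide with e^(μ−σ) and e^(μ+σ), which is what makes the two-parameter summary adequate for tabulation in STARLIB.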

### 4.1 Reaction rates for d(p,γ)3He

The present rates for the d(p,γ)3He reaction, together with the corresponding factor uncertainties, are listed in columns 2 and 3 of Table 4. The rate factor uncertainty is constant, % (except at the highest temperatures; see below), since it is determined by a single parameter (i.e., the common scaling factor; Section 3.1). Reaction rate probability densities are shown in Figure 5 for two selected temperatures,  MK (top), near the range important for deuterium burning, and  GK (bottom), relevant for big bang nucleosynthesis. The reaction rate samples (red histograms) are computed using the S-factor samples obtained from the Bayesian model (Section 3.1). The sampled rates are well represented by lognormal probability densities, shown as blue curves.

For temperatures of  MK, our rates agree with the recently evaluated results of Coc et al. (2015) within %. However, at lower temperatures, important for deuterium burning, our rates deviate strongly from the previous results. For example, at the lowest temperature,  MK, our rates are larger by a factor of  compared to those of Coc et al. (2015); this large difference has no impact on big bang nucleosynthesis (see below). The disagreement is explained by an erroneously assumed lower integration limit of  keV in the previous work, which is too high for computing the reaction rates at the lowest temperatures. Our estimated reaction rate factor uncertainties are close to the values given previously at all temperatures. We numerically integrate the reaction rates only up to  MeV, i.e., the highest center-of-mass energy for which we have theoretical S-factors from Marcucci et al. (2005). Since we may miss rate contributions at the highest temperatures,  GK, we adopt in this region the values from Coc et al. (2015), which are shown in italics in Table 4.


It is straightforward to calculate the effect of the new reaction rate on the predicted primordial D/H ratio. Our mean value for the scale factor (Table 1) represents a % increase compared to Coc et al. (2015). This difference translates into a decrease of the central D/H value by only % (Iocco et al., 2009), while the total uncertainty remains unchanged at %.

### 4.2 Reaction rates for 3He(3He,2p)4He

The present rates for the 3He(3He,2p)4He reaction, together with the corresponding factor uncertainties, are listed in columns 4 and 5 of Table 4. The rate factor uncertainties amount to %-% for temperatures of  GK. The reaction rate probability density is shown in Figure 6 (top) for the temperature at the Sun’s center ( MK). The reaction rate samples (red histograms) are computed using the S-factor samples obtained from the Bayesian model (Section 3.2). The sampled rates are well represented by a lognormal probability density, shown as a blue curve.

Although for temperatures of  GK our rates agree with those of Angulo et al. (1999), our estimated reaction rate factor uncertainties are significantly smaller, by a factor of (i.e., % versus %). Since we numerically integrate the reaction rates only up to  MeV, we may miss rate contributions at the highest temperatures,  GK. Therefore, we adopt in this region the values from Angulo et al. (1999), which are shown in italics in Table 4.

At temperatures relevant for the center of the Sun (  MK), the 7Be and 8B solar neutrino fluxes approximately scale with the 3He(3He,2p)4He S-factor according to the relations given in Table XV of Bahcall & Ulrich (1988). Therefore, compared to the rate of Adelberger et al. (2011), our results translate into an increase in the 7Be and 8B solar neutrino fluxes by 1.5% and 1.4%, respectively. A more detailed analysis, incorporating our much reduced uncertainty (by a factor of ), is beyond the scope of the present work.

### 4.3 Reaction rates for 3He(α,γ)7Be

The present rates for the 3He(α,γ)7Be reaction, together with the corresponding factor uncertainties, are listed in columns 6 and 7 of Table 4. The rate factor uncertainty amounts to % for temperatures of  GK. The reaction rate probability density is shown in Figure 6 (bottom) for the temperature at the Sun’s center ( MK). The reaction rate samples (red histograms) are computed using the S-factor samples obtained from the Bayesian model (Section 3.3). The sampled rates are well represented by a lognormal probability density, shown as a blue curve.

For the Sun’s central temperature, the present reaction rates barely agree with the R-matrix results of deBoer et al. (2014) within the quoted uncertainties (corresponding to 68% probability density intervals). However, our recommended rate is larger by 6.0%. At big bang temperatures (  GK), our result is in good agreement with the rate of deBoer et al. (2014). As already mentioned, we fit the S-factor data up to an energy of  MeV. Therefore, we can only compute the reaction rates up to a temperature of  GK. For higher temperatures, we adopt the values from Kontos et al. (2013), which are shown in italics in Table 4.

At temperatures relevant for the center of the Sun (  MK), the 7Be and 8B solar neutrino fluxes approximately scale with the 3He(α,γ)7Be S-factor according to the relations of Bahcall & Ulrich (1988). Therefore, compared to the rate of deBoer et al. (2014), our results translate into an increase in the 7Be and 8B solar neutrino fluxes by 4.7% and 4.5%, respectively.

The 3He(α,γ)7Be rate is the major nuclear physics source of uncertainty for the prediction of the primordial 7Li abundance. The 7Li/H ratio varies almost linearly with this reaction rate (Iocco et al., 2009). The study of primordial nucleosynthesis by Coc et al. (2015) adopted the rate of deBoer et al. (2014), which agrees with our result at big bang temperatures to within a few percent. Therefore, we expect only minor modifications to the predicted primordial 7Li/H ratio. Such small variations are negligible compared to the factor of 3 discrepancy between predicted and observed primordial 7Li/H ratios, and thus are not relevant for the 7Li problem.

## 5 Summary

We discussed astrophysical S-factors and reaction rates based on Bayesian statistics, and developed a framework that incorporates robust parameter estimation, systematic effects, and non-Gaussian uncertainties. Unlike the χ² minimization applied previously, the Bayesian technique provides consistent answers without the need to resort to Gaussian assumptions and other frequently applied approximations. The method is used to estimate the d(p,γ)3He, 3He(3He,2p)4He, and 3He(α,γ)7Be S-factors and reaction rates, important for deuterium burning, solar neutrinos, and big bang nucleosynthesis.

For the d(p,γ)3He reaction, our analysis verifies the results reported by Coc et al. (2015), both for the recommended values and the magnitude of the uncertainties. Our zero-energy S-factor also agrees with the value presented in Adelberger et al. (2011), although our uncertainty is smaller by a factor of . The zero-energy S-factor presented in Xu et al. (2013) has a much larger uncertainty (19%) compared to all other recently published values. Our reaction rate factor uncertainty is 3.7% for all temperatures below  GK. Compared to Coc et al. (2015), our reaction rate at big bang temperatures is larger by about %. This translates into a decrease in the primordial D/H value by only %, while the total uncertainty remains unchanged at %.

For the 3He(3He,2p)4He reaction, our results agree with those of Bonetti et al. (1999) within uncertainties. However, our parameter values for S′(0) and S″(0) disagree with those reported by Adelberger et al. (2011). Our uncertainty of 2.6% for the zero-energy S-factor is much smaller than the value of 9.4% reported by Xu et al. (2013). Furthermore, Adelberger et al. (2011) report an S-factor of  MeVb at the Gamow peak (  keV) for the Sun’s central temperature (  MK), corresponding to an uncertainty of 4.3%. Our result is  MeVb, corresponding to a smaller uncertainty of 2.7%. Compared to the reaction rate of Adelberger et al. (2011), our results translate into an increase in the 7Be and 8B solar neutrino fluxes by 1.5% and 1.4%, respectively.

For the 3He(α,γ)7Be reaction, we find  MeVb, representing an uncertainty of 2.1%, from fitting the data using the ab initio model of Neff (2011). Our result agrees within the quoted uncertainties with that of Adelberger et al. (2011). However, compared to the latter work, our uncertainty in S(0) from fitting the data is smaller by a factor of . Also, our value for S(0) disagrees with the R-matrix result of deBoer et al. (2014). Their quoted mean value of S(0) is lower by 5.5% compared to our result. This translates into an increase in the 7Be and 8B solar neutrino fluxes by 4.7% and 4.5%, respectively.

We would like to thank Stefano Andreon, Jack Dermigny, Ryan Fitzgerald, Jordi José, Richard Longland, Thomas Neff, and Ken Nollett for their input and feedback. This work was supported in part by NASA under the Astrophysics Theory Program grant 14-ATP14-0007 and the Theoretical and Computational Astrophysics Networks grant NNX14AB53G, U.S. DOE under Contract No. DE-FG02-97ER41041, and NSF under the Software Infrastructure for Sustained Innovation grant 1339600 and grant PHY 08-022648 for the Physics Frontier Center “Joint Institute for Nuclear Astrophysics – Center for the Evolution of the Elements” (JINA-CEE). F.X.T. acknowledges sabbatical support from the Simons Foundation. Software: JAGS (Plummer, 2003), R (R Core Team, 2015)

## Appendix A Bayesian inference

Bayesian methods have revolutionized many scientific fields, including archeology, ecology, genetics, linguistics, political science, and psychology. A brief historical account and a comparison of traditional (“frequentist”) and Bayesian statistics can be found in Brooks (2003). An introduction to Bayesian inference in physics (von Toussaint, 2011) and a textbook on Bayesian methods for the physical sciences (Andreon & Weaver, 2015) have been published recently.

Denoting by p(A|B) the probability that “proposition A is true given that proposition B is true”, we can write the product rule of elementary logic as

 p(A∧B) = p(A|B) p(B) (A1a)
 p(B∧A) = p(B|A) p(A) (A1b)

Since p(A∧B) = p(B∧A), solving for p(A|B) yields the general form of Bayes’ theorem

 p(A|B) = p(B|A) p(A) / p(B) (A2)

The above expression applies to any kind of proposition. When applied to experimental data and continuous model parameters, it can be written as (Kruschke, 2015)

 p(θ|D) = p(D|θ) p(θ) / p(D) = p(D|θ) p(θ) / ∫ p(D|θ) p(θ) dθ (A3)

The factor p(D|θ) is the likelihood function, the same as in traditional (“frequentist”) statistics, and denotes the probability that the data, D, were obtained assuming given values for the model parameters, θ. Since in most cases more than one parameter is involved in a given model, θ denotes the complete set of model parameters, (θ1, θ2, …, θn). The factor p(θ) is called the prior, which represents our state of knowledge before seeing the data. The product of likelihood and prior determines the factor p(θ|D), called the posterior. The denominator, p(D), called the evidence, is a normalization factor representing the product of likelihood and prior, integrated over all values of the parameters, θ. All of the factors entering Bayes’ theorem represent probability densities.

Equation A3 shows that the traditional maximum likelihood estimate, obtained by maximizing the likelihood function, generally differs from the posterior estimate because of the presence of the prior, p(θ). The maximum likelihood estimate is often mistakenly interpreted as “the most probable estimate given the data”. This is incorrect since in frequentist statistics the model parameters are not random variables. Their true values are unknown. In Bayesian statistics, on the other hand, the model parameters are random variables and the posterior directly provides the information we seek, that is, the probability of a given set of model parameters given the data. Bayesian inference is used for parameter estimation, value prediction, and model selection. We will be concerned in this work with parameter estimation and value prediction only. In that case, only the numerator on the right-hand side of Equation A3 is of interest.
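For a single parameter, the numerator of Equation A3 can be evaluated on a grid and normalized directly, without MCMC. The Python sketch below does this for the mean of Gaussian data with known unit standard deviation and a broad normal prior, and compares the grid result with the analytic (conjugate normal-normal) posterior mean:

```python
import numpy as np

rng = np.random.default_rng(0)
data = rng.normal(2.0, 1.0, size=50)        # Gaussian data, known sigma = 1

theta = np.linspace(-5.0, 5.0, 4001)        # parameter grid
d = theta[1] - theta[0]
log_like = -0.5 * ((data[None, :] - theta[:, None]) ** 2).sum(axis=1)
log_prior = -0.5 * (theta / 10.0) ** 2      # broad prior N(0, 10^2)
log_post = log_like + log_prior
post = np.exp(log_post - log_post.max())    # unnormalized posterior
post /= post.sum() * d                      # normalize: the "evidence"

post_mean = (theta * post).sum() * d
# Analytic conjugate result for comparison:
n, s2, tau2 = len(data), 1.0, 100.0
exact_mean = (data.sum() / s2) / (n / s2 + 1.0 / tau2)
```

With a broad prior and 50 data points, the posterior mean is pulled only negligibly away from the sample mean, which illustrates the statement above that the prior matters little when the data are informative.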

In a Bayesian analysis, it is important to compare posterior inferences under different reasonable choices of prior distributions. The posterior will be insensitive to the choice of prior when the sample size is large. However, when the sample size is small, the prior distribution becomes more important. Prior distributions range from “non-informative”, e.g., a uniform density between two reasonable limits, to “highly informative”, e.g., when fairly precise information is available for a given parameter (Gelman, 2002). In this work, we will explore uniform, Gaussian, and lognormal distributions as prior densities.

Without the use of numerical algorithms, the Bayesian method discussed so far is only applicable to very simple problems, involving few parameters, for which analytical solutions exist. The main reason for the wide adoption of Bayesian techniques in many scientific fields is that the random sampling of the posterior can be performed numerically over many parameter dimensions using Markov chain Monte Carlo (MCMC) algorithms (Metropolis et al., 1953; Hastings, 1970; Geyer, 2011).

A Markov chain is a random walk in which a transition from one state to the next is independent (“memory-less”) of how the current state was reached. The fundamental theorem of Markov chains states that for a very long random walk the proportion of time (i.e., probability) the chain spends in some state is independent of the initial state it started from. This set of limiting, long random walk, probabilities is called the stationary (or equilibrium) distribution of the Markov chain. Consequently, when a Markov chain is constructed with a stationary distribution equal to the posterior, p(θ|D), the samples drawn at every step during a sufficiently long random walk will closely approximate the posterior density. Several related algorithms (e.g., Metropolis, Metropolis-Hastings, Gibbs) are known to solve this problem numerically. The combination of Bayes’ theorem and MCMC algorithms allows for computing models that are too difficult to estimate using traditional statistical methods.
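A random-walk Metropolis sampler, the simplest of these algorithms, can be written in a few lines. The Python sketch below targets a standard normal density (not one of the posteriors of this work) and illustrates that, after discarding burn-in, the chain samples the stationary distribution:

```python
import numpy as np

def metropolis(logpost, x0, n_steps, step, seed=0):
    """Random-walk Metropolis: a Markov chain whose stationary
    distribution is proportional to exp(logpost)."""
    rng = np.random.default_rng(seed)
    chain = np.empty(n_steps)
    x, lp = x0, logpost(x0)
    for i in range(n_steps):
        xp = x + rng.normal(0.0, step)       # symmetric proposal
        lpp = logpost(xp)
        if np.log(rng.uniform()) < lpp - lp:  # accept with min(1, ratio)
            x, lp = xp, lpp
        chain[i] = x
    return chain

# Target: standard normal, started deliberately far from the mode.
chain = metropolis(lambda x: -0.5 * x * x, x0=5.0, n_steps=60000, step=1.0)
samples = chain[10000:]                      # discard burn-in
```

JAGS uses more sophisticated (Gibbs and slice) samplers, but the principle is the same: after equilibration, the retained samples are draws from the target density.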

In this work we employ the program JAGS (“Just Another Gibbs Sampler”) for the analysis of Bayesian models using Markov chain Monte Carlo sampling (Plummer, 2003). Specifically, we will employ the rjags package that works directly with JAGS within the R language (R Core Team, 2015). Running a JAGS model refers to generating random samples from the posterior distribution of model parameters. This involves the definition of the model, likelihood, and priors, as well as the initialization, adaptation, and monitoring of the Markov chain.

Two major issues that need to be tested in a Bayesian analysis are the mixing and convergence of the Markov chains, and the sensitivity of the results to the priors. The former is achieved using suitable diagnostic tools (Gelman & Rubin, 1992; Geweke, 1992; Raftery & Lewis, 1992), while the latter can be investigated by comparing posterior inferences under different reasonable choices of prior distribution.

## Appendix B A simple example: linear regression

To illustrate these ideas with a simple example, we apply the Bayesian method to the problem of linear regression. A comprehensive treatment of regression using Bayesian statistics can be found in Gelman & Hill (2007).

Suppose we aim to fit a straight line to some data, D, given by the triplets (x_i, y_i, σ_i), with x, y, and σ the independent variable, dependent variable, and the error in the dependent variable, respectively. For simplicity, we will assume no error on the independent variable. The linear relationship between the variables satisfies

 y′=α+βx (B1)

but y′ cannot be observed directly. Instead, we observe the quantity

 y_i = y′_i + ϵ_i (B2)

If we further assume for this simple example that the errors, ϵ_i, are Gaussian random variables with standard deviations of σ_i, the likelihood function for all data points is

 p(D|α,β) = ∏_{i=1}^{n} (1/(σ_i √(2π))) e^(−(y_i − [α + βx_i])² / (2σ_i²)) (B3)

which represents a product of normal distributions, each with a mean of α + βx_i and a standard deviation of σ_i. In abbreviated form, we may write symbolically for each data point

 y_i ∼ N(y′_i, σ_i²) (B4)

implying that its value is sampled from a normal distribution with a mean equal to the true value, y′_i, and a variance of σ_i².

A specific example is displayed in Figure 7. The artificial data shown as open circles have been generated under the following assumptions: (i) first, x-values are sampled from a uniform probability density over a fixed range; (ii) the corresponding y-values are computed using the linear relationship of Equation B1, i.e., with fixed values of the intercept and slope; (iii) a noise contribution to each y-value is sampled from a normal probability density with a mean of zero and a fixed standard deviation; (iv) the observed value, y_i, is obtained by adding the noise contribution to the true (but unobserved) value of y′_i, according to Equation B2. In other words, in this simple example we assume the scatter is represented by the same distribution for all data points.
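The same data-generation recipe can be sketched in Python. The specific intercept, slope, and noise values below are hypothetical stand-ins, not the ones used for Figure 7, and an ordinary least-squares fit (shown here in place of the MCMC fit) recovers the generating line:

```python
import numpy as np

rng = np.random.default_rng(42)

# Hypothetical generating values (NOT those used for Figure 7):
alpha_true, beta_true, sigma_true = 10.0, 2.0, 5.0
n = 50
x = rng.uniform(0.0, 20.0, size=n)                 # step (i)
y_true = alpha_true + beta_true * x                # step (ii), Eq. (B1)
y_obs = y_true + rng.normal(0.0, sigma_true, n)    # steps (iii)-(iv), Eq. (B2)

# Least-squares fit; with the broad priors of this appendix, the Bayesian
# posterior medians are nearly identical to these point estimates.
beta_hat, alpha_hat = np.polyfit(x, y_obs, 1)
```

The point of the Bayesian analysis, of course, is not the point estimates but the full posterior densities, including the scatter parameter, which least squares does not provide.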

We now analyze this hypothetical data set using Bayesian inference and are particularly interested in whether the analysis recovers the input parameters used to generate the data, i.e., the slope, the intercept, and the magnitude of the noise. The JAGS model is set up as follows (note that, unlike R, Fortran, or C, JAGS is a declarative language: the syntax below is a model declaration and does not define a set of computational steps to be run sequentially; at compilation, the declaration is turned into a set of instructions corresponding to a program in the conventional sense, so the precise order of the statements is unimportant):

model {
  for (i in 1:N) {
    # LIKELIHOOD
    obsy[i] ~ dnorm(y[i], pow(sigmay, -2))
    y[i] = alpha + beta * obsx[i]
  }
  # PRIORS
  alpha ~ dnorm(0.0, pow(100, -2))
  beta ~ dnorm(0.0, pow(100, -2))
  sigmay ~ dunif(0, 200)
}


The statement y[i] = alpha + beta * obsx[i] encodes our model, i.e., a linear relationship between the unobserved (true) y-value and the x-value. The statement obsy[i] ~ dnorm(y[i], pow(sigmay, -2)) describes the likelihood for each data point i. The “~” symbol stands for “distributed as” or “sampled from”. Thus, for each data point, the observed value, obsy[i], is sampled from a normal probability density with a mean given by the true value, y[i], and a standard deviation of sigmay. The pow() command appears because JAGS requires as input the precision, 1/σ², instead of the standard deviation, σ. The three parameters of our model are the intercept (alpha), the slope (beta), and the scatter (sigmay). Next, each of these parameters requires a prior probability density. For the slope and intercept we adopt normal distributions with a mean of zero and a standard deviation of 100, i.e., very broad and slowly declining priors. For the standard deviation of the noise, we assume a uniform prior between values of 0 and 200.

The output of the JAGS model is displayed in Figure 8. The panels on the left-hand side show samples of the intercept, slope, and scatter for three independent Markov chains (indicated by different colors in each panel). The initial samples were discarded to ensure that the chains had reached equilibrium ("burn-in"). It is apparent that the scatter is uniform and the chains are well mixed. The panels on the right-hand side display the corresponding posterior densities. The 16th, 50th (median), and 84th percentiles extracted from the posteriors recover, within uncertainty, the original values used to generate the data set, shown as vertical red lines. Notice that the panels on the right-hand side represent "marginalized" posterior densities, i.e., each distribution was obtained by integrating out the other two parameters. It would be inappropriate to use the full widths of the posteriors for estimating regression lines: such a procedure would overestimate the uncertainties because of parameter correlations.
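Outside of JAGS, the same posterior can be explored with a short random-walk Metropolis sketch (a stand-in for JAGS's own samplers; the data set and its true parameters below are hypothetical):

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical data, standing in for the artificial set of Figure 7:
# intercept 2.0, slope 0.5, Gaussian noise of standard deviation 1.0.
x = rng.uniform(0.0, 10.0, 20)
y = 2.0 + 0.5 * x + rng.normal(0.0, 1.0, 20)

def log_post(theta):
    """Log posterior: Gaussian likelihood plus the broad priors of the text."""
    alpha, beta, sigma = theta
    if not 0.0 < sigma < 200.0:                  # uniform prior on the scatter
        return -np.inf
    resid = y - (alpha + beta * x)
    loglike = -0.5 * np.sum(resid**2) / sigma**2 - len(y) * np.log(sigma)
    logprior = -0.5 * (alpha**2 + beta**2) / 100.0**2   # N(0, 100) priors
    return loglike + logprior

# Random-walk Metropolis: propose a small jump, accept with the usual ratio.
theta = np.array([0.0, 0.0, 1.0])
lp = log_post(theta)
chain = []
for _ in range(20000):
    prop = theta + rng.normal(0.0, [0.3, 0.06, 0.15])
    lp_prop = log_post(prop)
    if np.log(rng.uniform()) < lp_prop - lp:
        theta, lp = prop, lp_prop
    chain.append(theta.copy())
chain = np.array(chain)[5000:]                   # discard burn-in samples

# Marginalized 16th, 50th, and 84th percentiles, one parameter at a time.
q = np.percentile(chain, [16, 50, 84], axis=0)
```

The rows of q hold the 16th, 50th, and 84th percentiles of the intercept, slope, and scatter; taking percentiles column by column is precisely the marginalization discussed above.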

Credible regression lines, calculated using the sampled values for the intercept and slope, marginalized over the scatter, form the grey shaded area in Figure 7. A few interesting observations can be made. First, the density of grey lines decreases with increasing distance from the data points. Second, the width of the credible region is smallest near the middle of the data set and increases towards lower and higher x-values. Both observations agree with expectation, since the uncertainties should increase in regions devoid of data. We can quantify the credible region by computing suitable percentiles of y-values on a grid of x-values. The 50th percentile is shown as a blue line, whereas the 16th and 84th percentiles are displayed as red lines. Therefore, at any given x-value, there is a 68% probability that the true (but usually unknown) y-value is located between the two red lines.
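Given posterior samples of the intercept and slope, such a credible band is obtained by evaluating each sampled line on a grid of x-values and taking percentiles column by column. In the sketch below, hypothetical independent normal draws stand in for actual MCMC samples (real samples would also carry the correlation between intercept and slope, which is what narrows the band near the middle of the data):

```python
import numpy as np

rng = np.random.default_rng(2)

# Stand-ins for MCMC posterior samples of intercept and slope; in practice
# these come from the sampler, and the values here are illustrative.
alpha_s = rng.normal(2.0, 0.4, 5000)
beta_s = rng.normal(0.5, 0.08, 5000)

xg = np.linspace(0.0, 10.0, 50)                   # grid of x-values
lines = alpha_s[:, None] + beta_s[:, None] * xg   # one line per sample
lo, med, hi = np.percentile(lines, [16, 50, 84], axis=0)
# At each grid point, 68% of the credible lines lie between lo and hi.
```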

The two dashed lines in Figure 7 correspond to the 50th percentile, plus or minus the mean of the sampled values of the scatter. The region between the two dashed lines indicates the most likely location of new data points acquired under the same conditions as the data shown.

We will now provide the JAGS implementation of systematic uncertainties, robust regression, and non-Gaussian statistical uncertainties, discussed in Section 2, for the example of linear regression. A systematic normalization uncertainty of a factor of 1.1 can be included as follows:

# LIKELIHOOD
obsy[i] ~ dnorm(z[i], pow(erobsy[i], -2))
z[i] = n.factor * y[i]
y[i] = alpha + beta * obsx[i]
# PRIORS
alpha ~ dnorm(0.0, pow(100, -2))
beta ~ dnorm(0.0, pow(100, -2))
## lognormal density
n.factor ~ dlnorm(logmu, pow(logsigma, -2))
logmu = log(1.0)
logsigma = log(1.1)


The likelihood statement now contains the experimental statistical uncertainty, erobsy, of each individual data point, instead of the common scatter parameter we assumed above. The next statement folds the normalization factor (n.factor) into the likelihood. The last three lines provide the prior for the systematic uncertainty: n.factor is sampled from a lognormal density (dlnorm) with parameters logmu=0 and logsigma=log(1.1), where log denotes the natural logarithm.
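As a quick check of this prior, one can sample the normalization factor directly: a lognormal with logmu = log(1.0) and logsigma = log(1.1) has a median of 1, with 16th and 84th percentiles a factor of about 1.1 below and above the median. A short sketch:

```python
import numpy as np

rng = np.random.default_rng(3)

# Normalization factor with a "factor of 1.1" systematic uncertainty:
# lognormal with mu = ln(1.0) = 0 and sigma = ln(1.1).
logmu = np.log(1.0)
logsigma = np.log(1.1)
n_factor = rng.lognormal(mean=logmu, sigma=logsigma, size=100000)

# Median is exp(logmu) = 1; the 16th/84th percentiles sit a factor of
# ~1.1 below/above the median, as expected for this parameterization.
lo, med, hi = np.percentile(n_factor, [16, 50, 84])
```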

Robust regression can be implemented into our simple example as:

# LIKELIHOOD
obsy[i] ~ dnorm(y[i], pow(corr.er[i], -2))
p.alt[i] ~ dcat(p[])
corr.er[i] = erobsy[i] * phi[p.alt[i]]
y[i] = alpha + beta * obsx[i]
# PRIORS
alpha ~ dnorm(0.0, pow(100, -2))
beta ~ dnorm(0.0, pow(100, -2))
# if measured errors are correct:
phi[1] = 1
# if measured errors are overoptimistic:
phi[2] ~ dunif(1, 50)
p[1] ~ dunif(0, 1)
p[2] = 1 - p[1]


First, a two-element vector phi[] is defined: phi[1]=1 assumes that the reported uncertainty of a given datum is correct, while phi[2] is sampled from a uniform distribution between 1 and some higher value (50 in this example). The dcat() distribution draws a component index according to the probabilities p[]: for example, if p[1]=0.2 and p[2]=0.8, then dcat() returns 1 or 2 with probabilities of 0.2 and 0.8, respectively, each time it is called. If p.alt[i]=1, we obtain phi[1]=1, and the observed uncertainty of that datum is taken to be correct. If, on the other hand, p.alt[i]=2, the reported uncertainty of that datum is multiplied by the factor phi[2]. The MCMC sampling quantifies the outlier probability of a given datum by counting how often index 1 (no outlier) or index 2 (outlier) is drawn.
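The behavior of dcat() and the error-inflation mixture can be mimicked as follows; the probabilities p[1]=0.2 and p[2]=0.8 are the illustrative values quoted in the text:

```python
import numpy as np

rng = np.random.default_rng(4)

# Mimic JAGS's dcat(): draw a component index, 1 or 2, with probabilities p.
p = np.array([0.2, 0.8])               # illustrative values from the text
idx = rng.choice([1, 2], size=100000, p=p)

# phi[1] = 1 (reported errors correct); phi[2] drawn from U(1, 50),
# here once, standing in for one MCMC step.
phi2 = rng.uniform(1.0, 50.0)
phi = np.where(idx == 1, 1.0, phi2)    # error inflation factor per draw

# The fraction of draws flagged as outliers approximates p[2].
outlier_frac = np.mean(idx == 2)
```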

To account for the lognormal likelihood of the data analyzed here, we can replace the likelihood statement in the last JAGS code shown above,

# LIKELIHOOD
obsy[i] ~ dnorm(y[i], pow(corr.er[i], -2))


which is appropriate for a Gaussian likelihood function, by

# LIKELIHOOD
obsy[i] ~ dlnorm(ylg[i], pow(corrlg.er[i], -2))
ylg[i] = log(y[i]) - 0.5*log(1 + pow(corr.er[i],2)/pow(y[i],2))
corrlg.er[i] = sqrt(log(1 + pow(corr.er[i],2)/pow(y[i],2)))


The quantities ylg[i] and corrlg.er[i] denote the lognormal parameters, μ and σ, respectively, of datum i (Equations 2 and 3).
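The conversion encoded in ylg[i] and corrlg.er[i] can be written as a small helper function; the mean of 10 and standard deviation of 2 used below are purely illustrative:

```python
import numpy as np

def lognormal_params(y, sigma):
    """Lognormal mu, sigma matching a variable with mean y and std sigma."""
    var_term = np.log(1.0 + (sigma / y) ** 2)
    mu_lg = np.log(y) - 0.5 * var_term
    sigma_lg = np.sqrt(var_term)
    return mu_lg, sigma_lg

mu_lg, sig_lg = lognormal_params(10.0, 2.0)

# Consistency check: a lognormal with these parameters reproduces the
# original mean and standard deviation.
mean_back = np.exp(mu_lg + 0.5 * sig_lg**2)
std_back = mean_back * np.sqrt(np.exp(sig_lg**2) - 1.0)
```

By construction, the round trip recovers the Gaussian mean and standard deviation exactly, which is why these expressions appear in the likelihood above.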

## Appendix C Nuclear data

### c.1 The d(p,γ)3He reaction

The data for the d(p,γ)3He reaction analyzed in the present work have recently been evaluated by Coc et al. (2015). We adopt the data listed in their Appendix B, with two exceptions. First, the energies in Bystritsky et al. (2008) were misinterpreted by Coc et al. (2015); the correct center-of-mass energies of the three data points, used in the present work, are  keV,  keV, and  keV (see Footnote 6). Second, the data point at the lowest measured bombarding energy of Casella et al. (2002) has a mean-value-to-standard-deviation ratio in excess of a factor of  and has been omitted from our analysis for the reasons given in Section 2.3.

### c.2 The 3He(3He,2p)4He reaction

#### c.2.1 The data of Kudomi et al. (2004) and Bonetti et al. (1999)

The S-factors from Kudomi et al. (2004) are taken from their Table II and are reproduced in Table 5, which only lists statistical uncertainties. The authors state that the sum of the systematic uncertainties for the S-factor is 3.8%. They also re-evaluate the systematic uncertainties reported in the experiments of Krauss et al. (1987) and Junker et al. (1998), and obtain 5.5% and 3.7%, respectively.

The S-factors listed in Table I of Bonetti et al. (1999) are reproduced in Table 6.

#### c.2.2 Data of Junker et al. (1998) and Krauss et al. (1987)

The S-factors obtained from Table I of Junker et al. (1998) are presented in Table 7 with statistical uncertainties only; they supersede the preliminary results reported by Arpesella et al. (1996). Junker et al. (1998) note that the systematic uncertainty (one standard deviation) includes uncertainties in the gas target pressure (1%), beam power (3%), detection efficiency (2%), beam energy resolution, and beam energy loss (10%). Kudomi et al. (2004) re-evaluated the total systematic uncertainty of this experiment and obtained a value of 3.7%.

The S-factors of Krauss et al. (1987) are extracted from their Table I and are listed in Table 8 with statistical uncertainties only. Systematic uncertainties of 3.0% and 3.4% from the excitation function normalization and the absolute cross section scale, respectively, have to be considered as well.

#### c.2.3 Data of Dwarakanath & Winkler (1971)

We included the data of Dwarakanath & Winkler (1971) in our analysis. Statistical and systematic uncertainties are not reported directly in that work, but it is possible to estimate the respective contributions from the information provided. Experimental S-factors versus center-of-mass energy are shown in their Figure 8 and we extracted the data points directly from the figure. Three data points with error bars are shown in representative energy regions. Their energies and S-factors are ( %), ( %), and ( %). The error bars “include statistical and estimated systematic errors in both measured total cross sections and center-of-mass energy”.

The total cross section in Dwarakanath & Winkler (1971) is approximately given by 4 times the differential cross section at . Their Figure 6 shows the differential cross section at a center-of-mass energy of  keV. Representative uncertainties are shown at two laboratory angles ( and ). Each of these consists of two error bars, “the larger error bar indicates the absolute error in the measured differential cross section and the smaller error bar indicates the relative error between measurements at different angles”. Extracting these values from their figure, we find for the relative (i.e., statistical) uncertainty a value of 5.5% and for the absolute (statistical and systematic) uncertainty a value of 9.9%. Assuming that statistical and systematic uncertainties have been added quadratically in Dwarakanath & Winkler (1971), we find a systematic uncertainty of 8.2% at a center-of-mass energy near 150 keV. Therefore, we adopt a global systematic uncertainty of 8.2% for all data points. For the statistical uncertainty we assume a value of 4.0% at energies above  keV, and % at lower energies. These estimates agree with the overall uncertainties quoted above for the three total S-factors. Our adopted values are listed in Table 9.
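The quadrature subtraction above amounts to:

```python
import math

# Assuming the absolute (9.9%) uncertainty is the quadrature sum of the
# relative/statistical (5.5%) and systematic parts, the systematic
# contribution follows by subtraction in quadrature:
systematic = math.sqrt(0.099**2 - 0.055**2)   # fractional uncertainty, ~8.2%
```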

#### c.2.4 Other data

Several data sets used in previous evaluations (Angulo et al., 1999; Descouvemont et al., 2004; Adelberger et al., 2011; Xu et al., 2013) were not incorporated into our analysis. For example, Brown et al. (1987) measured the cross section at energies too high to be of interest here, while Dwarakanath (1974), Bacher & Tombrello (1967), and Wang et al. (1966) did not provide enough information to reliably estimate the separate contributions of statistical and systematic uncertainties.

### c.3 The 3He(α,γ)7Be reaction

#### c.3.1 The data of Brown et al. (2007)

We extracted only the activation data of Brown et al. (2007) from their Table III, and list the values here in Table 10. A systematic uncertainty of % is adopted from their Table IV.

#### c.3.2 Data of Nara Singh et al. (2004)

We use the four activation data points of Nara Singh et al. (2004); see their Fig. 3 and Table II. The statistical and systematic uncertainties are taken from their Table II. The adopted results are listed in our Table 11.

#### c.3.3 Data of Di Leva et al. (2009)

The cross section data are taken from Table I of Di Leva et al. (2009). The corresponding S-factor values, computed using Eqs. 4 and 5, are displayed in Table 12, together with statistical uncertainties. The total systematic uncertainty of 5% is dominated by contributions from the target thickness (4%) and the current integration (1%).

#### c.3.4 The LUNA data

The results of activation measurements at LUNA are presented in Bemmerer et al. (2006) and Gyürky et al. (2007). They are summarized in Table 2 of Costantini et al. (2008) and are listed in our Table 13.

## References

• Adelberger et al. (2011) Adelberger, E. G., et al. 2011, Rev. Mod. Phys., 83, 195
• Aharmin et al. (2013) Aharmin, B., et al. 2013, Phys. Rev. C, 88, 025501
• Andreon (2010) Andreon, S. 2010, MNRAS, 407, 263
• Andreon & Weaver (2015) Andreon, S., & Weaver, B. 2015, Bayesian Methods for the Physical Sciences (Switzerland: Springer International Publishing)
• Angulo et al. (1999) Angulo, C., et al. 1999, Nucl. Phys. A, 656, 3
• Arpesella et al. (1996) Arpesella, C., et al. 1996, Phys. Lett. B, 389, 452
• Bacher & Tombrello (1967) Bacher, A. D., & Tombrello, T. A. 1967, reported by T. A. Tombrello, in Nuclear Research with Low-Energy Accelerators, ed. J. B. Marion & D. M. Van Patter (New York: Academic Press), p. 195
• Bahcall & Ulrich (1988) Bahcall, J. N., & Ulrich, R. K. 1988, Rev. Mod. Phys., 60, 297
• Bellini et al. (2014) Bellini, G., et al. 2014, Phys. Rev. D, 89, 112007
• Bemmerer et al. (2006) Bemmerer, D., et al. 2006, Phys. Rev. Lett., 97, 122502
• Bonetti et al. (1999) Bonetti, R., et al. 1999, Phys. Rev. Lett., 82, 5205
• Bordeanu et al. (2013) Bordeanu, C., et al. 2013, Nucl. Phys. A, 908, 1
• Brooks (2003) Brooks, S. P. 2003, Phil. Trans. R. Soc. Lond. A, 361, 2681
• Brown et al. (1987) Brown, R. E., Correll, F. D., Hegland, P. M., Koepke, J. A., & Poppe, C. H. 1987, Phys. Rev. C, 35, 383
• Brown et al. (2007) Brown, T. A. D., et al. 2007, Phys. Rev. C, 76, 055801
• Bystritsky et al. (2008) Bystritsky, V. M., et al. 2008, Nucl. Instr. Meth. A, 595, 543
• Caughlan & Fowler (1988) Caughlan, G. R., & Fowler, W. A. 1988, At. Data Nucl. Data Tab., 40, 283
• Casella et al. (2002) Casella, C., et al. 2002, Nucl. Phys. A, 706, 203
• Coc et al. (2015) Coc, A., Petitjean, P., Uzan, J.-P., Vangioni, E., Descouvemont, P., Iliadis, C., & Longland, R. 2015, Phys. Rev. D, 92, 123526
• Cooke et al. (2014) Cooke, R., et al. 2014, ApJ, 781, 31
• Costantini et al. (2008) Costantini, H., et al. 2008, Nucl. Phys. A, 814, 144
• Cyburt & Davids (2008) Cyburt, R. H., & Davids, B. 2008, Phys. Rev. C, 78, 064614
• deBoer et al. (2014) deBoer, R. J., et al. 2014, Phys. Rev. C, 90, 035804
• Descouvemont et al. (2004) Descouvemont, P., Adahchour, A., Angulo, C., Coc, A., & Vangioni-Flam, E. 2004, At. Data Nucl. Data Tab., 88, 203
• Di Leva et al. (2009) Di Leva, A., et al. 2009, Phys. Rev. Lett., 102, 232502
• Dwarakanath & Winkler (1971) Dwarakanath, M. R., & Winkler, H. 1971, Phys. Rev C, 4, 1532
• Dwarakanath (1974) Dwarakanath, M. R., 1974, Phys. Rev. C, 9, 805
• Fields et al. (2016) Fields, C. E., Farmer, R., Petermann, I., Iliadis, C., & Timmes, F. X. 2016, ApJ, 823, 46
• Fowler, Caughlan & Zimmerman (1967) Fowler, W. A., Caughlan, G. R., & Zimmerman, B. A. 1967, Annu. Rev. Astron. Astrophys., 5, 525
• Fowler, Caughlan & Zimmerman (1975) Fowler, W. A., Caughlan, G. R., & Zimmerman, B. A. 1975, Annu. Rev. Astron. Astrophys., 13, 69
• Gelman & Rubin (1992) Gelman, A, & Rubin, D. B. 1992, Statistical Science, 7, 457
• Gelman (2002) Gelman, A. 2002, in: Encyclopedia of Environmetrics, Eds. A. H. El-Shaarawi & W. W. Piegorsch, Vol. 3, p. 1634 (Chichester: John Wiley & Sons)
• Gelman & Hill (2007) Gelman, A., & Hill, J. 2007, Data Analysis Using Regression and Multilevel/Hierarchical Models (Cambridge: Cambridge University Press)
• Geweke (1992) Geweke, J. 1992, in: Bayesian Statistics 4, Eds. J. M. Bernardo, J. O. Berger, A. P. Dawid, & Smith, A. F. M. 1992 (Oxford: Oxford University Press) p. 169
• Geyer (2011) Geyer, C. J. 2011, in Handbook of Markov Chain Monte Carlo, Eds. S. Brooks, A. Gelman, G. L. Jones & X.-L. Meng (Boca Raton: Chapman & Hall/CRC)
• Gyürky et al. (2007) Gyürky, Gy., et al. 2007, Phys. Rev. C, 75, 035805
• Hastings (1970) Hastings, W. K. 1970, Biometrika, 57, 97
• Heinrich & Lyons (2007) Heinrich, J., & Lyons, L. 2007, Annu. Rev. Nucl. Part. Sci., 57, 145
• Iliadis et al. (2001) Iliadis, C., D’Auria, J. M., Starrfield, S., Thompson, W. J., & Wiescher, M. 2001, ApJS, 134, 151
• Iliadis et al. (2002) Iliadis, C., Champagne, A. E., José, J., Starrfield, S., & Tupper, P. 2002, ApJS, 142, 105
• Iliadis et al. (2010a) Iliadis, C., Longland, R., Champagne, A. E., Coc, A., & Fitzgerald, R. 2010, Nucl. Phys. A, 841, 31
• Iliadis et al. (2010b) Iliadis, C., Longland, R., Champagne, A. E., & Coc, A. 2010, Nucl. Phys. A, 841, 251
• Iliadis et al. (2010c) Iliadis, C., Longland, R., Champagne, A. E., & Coc, A. 2010, Nucl. Phys. A, 841, 323
• Iliadis et al. (2011) Iliadis, C., Champagne, A. E., Chieffi, A., & Limongi, M. 2011, ApJS, 193, 16
• Iliadis (2015) Iliadis, C. 2015, Nuclear Physics of Stars, (2nd Edition; Weinheim; Wiley-VCH)
• Iliadis et al. (2016) Iliadis, C., Karakas, A. I., Prantzos, N., Lattanzio, J. C., & Doherty, C. L. 2016, ApJ, 818, 98
• Iocco et al. (2009) Iocco, F., Mangano, G., Miele, G., Pisanti, O., & Serpico, P. D. 2009, Phys. Rep. 472, 1
• Junker et al. (1998) Junker, M., et al. 1998, Phys. Rev. C, 57, 2700
• Kajino (1986) Kajino, T. 1986, Nucl. Phys. A, 460, 559
• Kelly et al. (2013) Kelly, K. J., Iliadis, C., Downen, L., José, J., & Champagne, A. E. 2013, ApJ, 777, 130
• Kim, Izumoto & Nagatani (1981) Kim, B. T., Izumoto, T., & Nagatani, K. 1981, Phys. Rev. C, 23, 33
• Kontos et al. (2013) Kontos, A., et al. 2013, Phys. Rev. C, 87, 065804
• Krauss et al. (1987) Krauss, A., Becker, H. W., Trautvetter, H.-P., & Rolfs, C. 1987, Nucl. Phys. A, 467, 273
• Kruschke (2015) Kruschke, J. K. 2015, Doing Bayesian Data Analysis, (2nd Edition; Amsterdam; Elsevier)
• Kudomi et al. (2004) Kudomi, N., et al. 2004, Phys. Rev. C, 69, 015802
• Lange, Little, & Taylor (1989) Lange, K. L., Little, R. J. A., & Taylor, J. M. G. 1989, J. Amer. Stat. Assoc., 84, 881
• Longland et al. (2010) Longland, R., Iliadis, C., Champagne, A. E., Newton, J. R., Ugalde, C., Coc, A., & Fitzgerald, R. 2010, Nucl. Phys. A, 841, 1
• Ma et al. (1997) Ma, L., et al. 1997, Phys. Rev. C, 55, 588
• Marcucci et al. (2005) Marcucci, L. E., Viviani, M., Schiavilla, R., Kievsky, A., & Rosati, S. 2005, Phys. Rev. C, 72, 014001
• Metropolis et al. (1953) Metropolis, N., Rosenbluth, A. W., Rosenbluth, M. N., Teller, A. H., & Teller, E. 1953, J. Chem. Phys., 21, 1087
• Nara Singh et al. (2004) Nara Singh, B. S. , Hass, M., Nir-El, Y., & Haquin, G. 2004, Phys. Rev. Lett., 93, 262503
• Neff (2011) Neff, T. 2011, Phys. Rev. Lett., 106, 042502
• Nollett (2001) Nollett, K. M. 2001, Phys. Rev. C, 63, 054002
• Parikh et al. (2008) Parikh, A., José, J., Moreno, F., & Iliadis, C. 2008, ApJS, 178, 110
• Paxton et al. (2011) Paxton, B., et al. 2011, ApJS, 192, 3
• Paxton et al. (2013) Paxton, B., et al. 2013, ApJS, 208, 4
• Paxton et al. (2015) Paxton, B., et al. 2015, ApJS, 220, 15
• Plummer (2003) Plummer, M. 2003, JAGS: A program for analysis of Bayesian graphical models using Gibbs sampling, in: Proceedings of the 3rd International Workshop on Distributed Statistical Computing (dsc 2003), Vienna, Austria, ISSN 1609-395X
• Raftery & Lewis (1992) Raftery, A. E., & Lewis, S. M. 1992, Statistical Science, 7, 493
• Rapp et al. (2006) Rapp, W., Görres, J., Wiescher, M., Schatz, H., & Käppeler, F. 2006, ApJ, 653, 474
• R Core Team (2015) R Core Team 2015, R: A language and environment for statistical computing, R Foundation for Statistical Computing, Vienna, Austria (https://www.R-project.org/)
• Sallaska et al. (2013) Sallaska, A. L., Iliadis, C., Champagne, A. E., Goriely, S., Starrfield, S., & Timmes, F. X. 2013, ApJS, 207, 18
• Schmid et al. (1997) Schmid, G. J., et al. 1997, Phys. Rev. C, 56, 2565
• Smy (2013) Smy, M. 2013, Nucl. Phys. B (Proc. Suppl.), 235, 49
• Stoesz & Herwig (2003) Stoesz, J. A., & Herwig, F. 2003, MNRAS, 340, 763
• Tombrello & Parker (1963) Tombrello, T. A., & Parker, P. D. 1963, Phys. Rev., 131, 2582
• von Toussaint (2011) von Toussaint, U. 2011, Rev. Mod. Phys., 83, 943
• Wang et al. (1966) Wang, N., Novatskii, V., Osetinskii, G., Chien, N., & Chepurcenko, I. A. 1966, Soviet J. Nucl. Phys., 3, 777
• Wesolowski et al. (2016) Wesolowski, S., Klco, N., Furnstahl, R.J., Phillips, D.R., & Thapaliya, A. 2016, J. Phys. G, 43, 074001
• Xu et al. (2013) Xu, Y., Takahashi, K., Goriely, S., Arnould, M., Ohta, M., & Utsunomiya, H. 2013, Nucl. Phys. A, 918, 61
• Zhang, Nollett & Phillips (2015) Zhang, X., Nollett, K.M., & Phillips, D.R. 2015, Phys. Lett. B, 751, 535