Nuclear mass predictions based on Bayesian neural network approach with pairing and shell effects
The Bayesian neural network (BNN) approach is employed to improve the nuclear mass predictions of various models. It is found that the noise error in the likelihood function plays an important role in the predictive performance of the BNN approach. By including a distribution for the noise error, an appropriate value can be found automatically in the sampling process, which optimizes the nuclear mass predictions. Furthermore, two quantities related to nuclear pairing and shell effects are added to the input layer in addition to the proton and mass numbers. As a result, the theoretical accuracies are significantly improved not only for nuclear masses but also for single-nucleon separation energies. Owing to the inclusion of the shell effect, the BNN approach predicts a shell-correction structure in the unknown region similar to that in the known region, e.g., the predicted underestimation of nuclear mass around the magic numbers in the relativistic mean-field model. This demonstrates that better predictive performance can be achieved if more physical features are included in the BNN approach.
PACS numbers: 21.10.Dr, 21.60.-n, 21.30.Fe
Mass is a fundamental property of atomic nuclei. It can be used to extract various nuclear structure information, such as nuclear pairing correlations, shell effects, and deformation transitions Lunney2003RMP (). It is also widely used to determine nuclear effective interactions Bender2003RMP (). Moreover, nuclear mass is essential for determining nuclear reaction energies in astrophysics and hence plays a crucial role in understanding the origin of the elements in the Universe Burbidge1957RMP (). In addition, accurate mass determination is very important for testing the unitarity of the Cabibbo-Kobayashi-Maskawa matrix Liang2009PRC (); Hardy2015PRC ().
Measurements of nuclear mass have achieved great progress in recent years Franzke2008MSR (); Sun2015FP () and about nuclear masses have been measured up to now Wang2017CPC (). However, accurate predictions of nuclear mass remain a great challenge for theoretical models, owing to the difficulties in the exact theory of the nuclear interaction and in quantum many-body calculations. Three types of nuclear models are mainly used in global mass predictions at present: macroscopic, macroscopic-microscopic, and microscopic mass models. The Bethe-Weizsäcker (BW) mass formula, which belongs to the macroscopic type, was the first model used to estimate nuclear masses Weizsacker1935ZP (); Bethe1937RMP (). It treats the nucleus as a charged liquid drop, so microscopic effects, such as the shell effect, cannot be well described. By taking into account the important corrections related to the microscopic effects, macroscopic-microscopic models were developed, such as the finite-range droplet model (FRDM) Moller2012PRL () and the Weizsäcker-Skyrme (WS) model Wang2014PLB (). The microscopic mass models are mainly rooted in density functional theory; they are more complicated but potentially have better extrapolation ability. In the non-relativistic framework, a series of Hartree-Fock-Bogoliubov (HFB) mass models have been constructed with the Skyrme Goriely2009PRLSkyrme (); Goriely2016PRC () or Gogny Goriely2009PRLGogny () effective interactions. In recent years, the relativistic mean-field (RMF) model has also received wide attention due to its success in describing various nuclear phenomena Vretenar2005PRp (); Meng2006PPNP (); Meng2016Book (); Meng2006PRC (); Liang2008PRL (); Niu2013PRCR (); Niu2017PRC () and its successful applications in astrophysics Sun2008PRC (); Niu2009PRC (); Xu2013PRC (); Niu2013PLB (). Based on the RMF model, global calculations of nuclear mass have been made and the accuracies have been gradually improved Geng2005PTP (); Hua2012SCPMA (); Arteaga2016EPJA ().
The accuracies of these mass models range from about MeV for the BW model Kirson2008NPA () to about MeV for the WS model Wang2014PLB (). However, these accuracies are still insufficient for studies of exotic nuclear structure and astrophysical nucleosynthesis. In particular, these models predict very different nuclear masses, with differences of up to tens of MeV, when they are extrapolated to the neutron drip line. Therefore, there is still a strong demand to further improve the existing nuclear mass models. Some techniques have been developed along this direction, such as the radial basis function (RBF) approach Wang2011PRC (); Niu2013PRCb (); Zheng2014PRC (); Niu2016PRC () and the image reconstruction technique with the CLEAN algorithm Morales2010PRC (). Moreover, the neural network has proved to be a very powerful tool and has been widely used in an impressive range of problem domains, such as pattern recognition and machine learning; see, e.g., the books Haykin2009Book (); Bishop2006Book () and the references therein. The application of neural networks to the prediction of nuclear masses can be traced back to the 1990s Gazula1992NPA (). A series of subsequent works further improved its predictive performance Gernoth1993PLB (); Athanassopoulos2004NPA (); Zhang2017JPG (). It was also extended to study other nuclear properties, such as nuclear β-decay half-lives Costiris2009PRC (). These approaches usually need many parameters, in general hundreds or even thousands, to achieve better predictions, so the over-fitting problem and the quantification of uncertainties in the predictions should be treated in a reliable way.
The Bayesian approach can avoid the over-fitting problem by introducing a prior distribution for the parameters, and it can quantify the uncertainties in the predictions since all parameters have probability distributions Neal1996Book (). Thus, it would be a valuable approach for improving the mass predictions of nuclear models. However, the Bayesian approach involves high-dimensional integrals over the whole parameter space, so its calculations are very time-consuming, and great progress was achieved only in recent decades along with the developments in sampling methods and dramatic improvements in the speed and memory of computers Bishop2006Book (). Recently, the Bayesian neural network (BNN) approach was applied to improve the theoretical predictions of nuclear masses Utama2016PRC () and nuclear charge radii Utama2016JPG (). The noise error in the likelihood function is a key quantity in the BNN approach. However, it was usually simplified to a fixed value in the previous studies Utama2016PRC (); Utama2016JPG (). In this work, we introduce a prior distribution for the noise error. Furthermore, only the proton and mass numbers were considered in the input layer of the neural network in the previous studies Utama2016PRC (); Utama2016JPG (). Here we incorporate more physical features into the input layer, i.e., we include two quantities related to the well-known nuclear pairing and shell effects, and investigate their influence on the predictive performance of the BNN approach.
In the Bayesian approach, the model parameters are described probabilistically. A probability distribution is introduced over all possible values of based on our background knowledge, which is called the prior distribution. When we observe a set of data , this distribution will be updated by using the Bayes’ theorem
where and are input and output data, is the number of data; is the likelihood function, which contains the information about parameters derived from the observations; is the probability distribution of parameters after the data are considered, which is called the posterior distribution; is a normalization constant, which ensures the posterior distribution is a valid probability density and integrates to one.
For the likelihood function , a Gaussian distribution, , is usually employed, where the objective function reads
Here, the standard deviation parameter is the associated noise error related to the th observable. For the BNN approach, the function is described with a neural network, which is
where and , and and are the numbers of neurons in the hidden layer and the number of input variables, respectively. In total, the number of parameters in this neural network is .
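The network and the Gaussian likelihood described above can be sketched as follows. This is only a minimal illustration under stated assumptions: a single hidden layer with a tanh activation, a chi-square objective with a common noise error, and the names `network`, `n_parameters`, `chi_square`, and `log_likelihood` are all hypothetical choices, not taken from the text. The parameter count follows from this assumed architecture, and the layer sizes in the usage note are toy values.

```python
import numpy as np

def network(x, a, b, c, d):
    """Assumed form: a + sum_j b[j] * tanh(c[j] + sum_i d[j, i] * x[i])."""
    x = np.asarray(x, dtype=float)
    return a + b @ np.tanh(c + d @ x)

def n_parameters(n_inputs, n_hidden):
    # 1 output bias + n_hidden output weights + n_hidden hidden biases
    # + n_hidden * n_inputs input-to-hidden weights (for the assumed architecture)
    return 1 + (2 + n_inputs) * n_hidden

def chi_square(targets, outputs, noise):
    # Sum of squared residuals scaled by the noise error (assumed objective)
    r = (np.asarray(targets, dtype=float) - np.asarray(outputs, dtype=float)) / noise
    return float(np.sum(r ** 2))

def log_likelihood(targets, outputs, noise):
    # Gaussian log-likelihood up to an additive constant: -chi^2 / 2
    return -0.5 * chi_square(targets, outputs, noise)
```

With these toy sizes, a network with 2 inputs and 6 hidden neurons and one with 4 inputs and 4 hidden neurons both have 25 parameters, which illustrates how two input layers of different width can be compared at equal model size.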
The prior distributions of the model parameters are usually taken to be Gaussian distributions with zero means. However, the precisions (inverses of the variances) of these Gaussian distributions are not fixed by hand. Instead, we assign gamma distributions to them, so that the precisions can vary over a large range and the BNN approach can search for their optimal values automatically in the sampling process.
After specifying the likelihood function and the prior distribution , the posterior distribution of model parameters is known in principle. One can then make predictions based on this posterior distribution,
Since the model parameters are described with a probability distribution, an estimate of uncertainty in theoretical predictions is obtained naturally as
Note that Eq. (4) involves a high-dimensional integral over the whole parameter space. To evaluate it, we employ Monte Carlo integration, where the posterior distribution is sampled using the flexible Bayesian model developed by Neal Neal1996Book (), in which the Markov chain Monte Carlo algorithm is employed.
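As a rough sketch of the Monte Carlo step, the posterior integral can be approximated by averaging the network output over parameter samples drawn from the posterior. The function name `predictive_mean_and_std` and the callable `model(x, w)` are hypothetical, and taking the uncertainty as the standard deviation over the samples is an assumption about how it is estimated.

```python
import numpy as np

def predictive_mean_and_std(model, x, parameter_samples):
    """Average model(x, w) over posterior samples w and return (mean, std)."""
    outputs = np.array([model(x, w) for w in parameter_samples], dtype=float)
    mean = outputs.mean(axis=0)
    # std = sqrt(<y^2> - <y>^2); clipped at zero to guard against round-off
    variance = np.maximum((outputs ** 2).mean(axis=0) - mean ** 2, 0.0)
    return mean, np.sqrt(variance)
```

In practice the samples would come from a Markov chain after burn-in; here any iterable of parameter sets works.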
In this work, we will employ the BNN approach to reconstruct mass residuals between experimental data and mass predictions of various models, i.e.,
As in Refs. Utama2016PRC (); Utama2016JPG (), the inputs are usually taken as . Here, however, we incorporate more physical information into the BNN approach by including two extra inputs, and , related to nuclear pairing and shell effects, which are
Here, and are the differences between the actual nucleon numbers and and the nearest magic numbers (, , , , , for protons and , , , , , , for neutrons) Kirson2008NPA (). For simplicity, we will use BNN-I2 and BNN-I4 to denote the BNN approaches with and , respectively. The numbers of hidden neurons are taken as and , respectively, so that both neural networks have the same number of model parameters, .
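To make the two extra inputs concrete, one possible realization is sketched below. Everything specific here is an assumption rather than the definition used in this work: the pairing input is taken as an even-odd indicator, the shell input as the product of the distances to the nearest magic numbers divided by their sum, and the magic-number lists are the commonly quoted ones. All function and constant names are hypothetical.

```python
# Commonly quoted magic numbers (assumed here for illustration)
MAGIC_PROTONS = (8, 20, 28, 50, 82, 126)
MAGIC_NEUTRONS = (8, 20, 28, 50, 82, 126, 184)

def distance_to_magic(n, magic):
    """Distance between a nucleon number and the nearest magic number."""
    return min(abs(n - m) for m in magic)

def pairing_input(z, n):
    # Assumed form: +1 for even-even, 0 for odd-A, -1 for odd-odd nuclei
    return ((-1) ** z + (-1) ** n) / 2

def shell_input(z, n):
    # Assumed form: nu_p * nu_n / (nu_p + nu_n), set to 0 for doubly magic nuclei
    nu_p = distance_to_magic(z, MAGIC_PROTONS)
    nu_n = distance_to_magic(n, MAGIC_NEUTRONS)
    if nu_p + nu_n == 0:
        return 0.0
    return nu_p * nu_n / (nu_p + nu_n)
```

With these assumed definitions, the shell input vanishes whenever either nucleon number is magic and grows towards mid-shell, so the network can distinguish closed-shell from open-shell nuclei.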
The experimental masses are taken from the atomic mass evaluation of 2016 (AME2016) Wang2017CPC (), while only those nuclei with and experimental errors keV are considered. There are data left that compose the entire data set. In order to examine the validity of the BNN approach, we separate the entire set into two different sets: the learning set and the validation set. The learning set is built by randomly selecting nuclei from the entire set and the remaining nuclei compose the validation set. For the theoretical mass models, we take two microscopic (RMF Geng2005PTP () and HFB-31 Goriely2016PRC ()), two macroscopic-microscopic (WS4 Wang2014PLB () and FRDM12 Moller2012PRL ()), and two macroscopic (BW Kirson2008NPA () and BW2 Kirson2008NPA ()) mass models as examples.
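A random separation into learning and validation sets can be sketched as below. The function name, the seed, and the set sizes in the usage note are hypothetical; only the idea of a random, non-overlapping split is taken from the text.

```python
import numpy as np

def split_learning_validation(n_total, n_learning, seed=0):
    """Return sorted index arrays for a random learning/validation split."""
    rng = np.random.default_rng(seed)
    permutation = rng.permutation(n_total)
    learning = np.sort(permutation[:n_learning])
    validation = np.sort(permutation[n_learning:])
    return learning, validation
```

For example, `split_learning_validation(10, 7)` yields 7 learning indices and 3 validation indices that together cover all 10 entries exactly once.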
The noise errors in Eq. (2) were usually taken as a fixed value estimated from the mass differences between experimental data and model predictions Utama2016PRC (); Utama2016JPG (). A more elegant way is to assign a distribution to the noise error, so that the sampling process can search for an appropriate value automatically and thereby optimize the nuclear mass predictions. In this work, we use a gamma distribution for the noise precision (inverse of squared noise error ), because the gamma distribution is the conjugate prior of the precision of a Gaussian distribution, which simplifies the mathematics Bishop2006Book ().
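The practical benefit of conjugacy can be illustrated with a short sketch. Assuming a single noise precision common to all data and a gamma prior in the shape-rate convention, the conditional posterior of the precision for fixed network outputs is again a gamma distribution with updated parameters. The function names and the idea of using this as one step of a sampler are illustrative assumptions, not a description of the code actually used.

```python
import numpy as np

def update_noise_precision(alpha, beta, residuals):
    """Gamma(alpha, rate=beta) prior -> posterior parameters given Gaussian residuals."""
    r = np.asarray(residuals, dtype=float)
    return alpha + 0.5 * r.size, beta + 0.5 * float(np.sum(r ** 2))

def sample_noise_error(alpha, beta, rng, size=1):
    # NumPy parametrizes the gamma distribution by shape and scale = 1 / rate
    precision = rng.gamma(shape=alpha, scale=1.0 / beta, size=size)
    return 1.0 / np.sqrt(precision)
```

Large residuals increase the rate parameter and thus favor a larger noise error, while small residuals pull the noise error down, which is how an appropriate value can emerge during sampling.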
Table 1 gives the root-mean-square (rms) deviations of nuclear mass with respect to the experimental data in the learning sets for various mass models and their counterparts improved by the BNN-I2 approaches. Clearly, the BNN approach can significantly improve the mass predictions even with a fixed noise precision. By using a gamma distribution, the improvements are further enhanced and the reduction in the rms deviations even approaches for the RMF and BW2 models. In the following, all calculations will be performed with a gamma distribution for the noise precision.
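For reference, the rms deviation and its relative reduction can be computed as in the following sketch; the function names are hypothetical and the numbers in the usage note are toy values, not results from the tables.

```python
import numpy as np

def rms_deviation(m_exp, m_th):
    """Root-mean-square deviation between experimental and theoretical masses."""
    diff = np.asarray(m_exp, dtype=float) - np.asarray(m_th, dtype=float)
    return float(np.sqrt(np.mean(diff ** 2)))

def relative_reduction(rms_before, rms_after):
    # Fraction by which the rms deviation is reduced after the correction
    return 1.0 - rms_after / rms_before
```

For instance, reducing a toy rms deviation from 2.0 to 0.5 corresponds to `relative_reduction(2.0, 0.5)`, i.e., a reduction of 0.75.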
It is well known that nuclear pairing and shell effects play very important roles in mass predictions Lunney2003RMP (). To further reduce the mass deviations related to such effects, two extra inputs and are included in addition to and . The corresponding rms deviations for various mass models improved by the BNN-I4 approach are given in Table 2. The results of the original models and those improved by BNN-I2 are also shown for comparison.
The clearest example is the liquid-drop BW mass model, which only includes the volume, surface, symmetry, and Coulomb terms, while both pairing and shell effects are fully neglected Kirson2008NPA (). When improved by BNN-I2, its posterior rms deviation is still much larger than those of the other mass models. With the BNN-I4 approach, however, its posterior rms deviation is significantly reduced from to keV.
In general, the rms deviations of all mass models are significantly reduced by the BNN-I4 approach, e.g., by more than for the BW model, as can be seen clearly in Fig. 1(a). In addition, from the rms deviations for the validation set shown in Table 2, one can evaluate the predictive performance of the BNN approach. Although the rms deviations for the validation set are slightly larger than those for the learning set, the improvements over the original models are still significant.
The single-nucleon separation energies are related to the derivatives of the nuclear mass surface. They are also very important for nucleon-capture reactions in astrophysics. Therefore, it is interesting to investigate the improvements of single-nucleon separation energies with the BNN approaches. Previous studies found that the RBF approach is a powerful technique for improving the mass predictions of nuclear models Wang2011PRC (); Niu2013PRCb (); Zheng2014PRC (), but its improvement of the overall mass predictions can even worsen the description of the single-nucleon separation energy ( or ) unless the RBF is applied twice separately Niu2016PRC (). Table 3 shows the rms deviations of and with respect to the data in the learning and validation sets for various mass models and their counterparts improved by the BNN approaches. For completeness, the two-neutron () and two-proton () separation energies are given as well. The results for the learning set are shown in Fig. 1(b). It is clear that the BNN approach can improve the predictions of nuclear masses and single-nucleon separation energies simultaneously, most notably with the BNN-I4 approach. This indicates that the BNN-I4 approach is more effective than the BNN-I2 approach in simultaneously improving the descriptions of the nuclear mass surface and its derivatives.
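The relation between masses and single-nucleon separation energies can be made explicit with a small sketch. It assumes masses are stored as atomic mass excesses in MeV in a dictionary keyed by (Z, N); the constants are approximate values of the neutron and hydrogen-atom mass excesses, and all names as well as the toy table in the usage note are hypothetical.

```python
NEUTRON_MASS_EXCESS = 8.0713   # MeV, approximate
HYDROGEN_MASS_EXCESS = 7.2890  # MeV, approximate

def neutron_separation_energy(mass_excess, z, n):
    """S_n(Z, N) = M(Z, N-1) + m_n - M(Z, N), written with mass excesses."""
    return mass_excess[(z, n - 1)] + NEUTRON_MASS_EXCESS - mass_excess[(z, n)]

def proton_separation_energy(mass_excess, z, n):
    """S_p(Z, N) = M(Z-1, N) + m_H - M(Z, N), written with mass excesses."""
    return mass_excess[(z - 1, n)] + HYDROGEN_MASS_EXCESS - mass_excess[(z, n)]
```

With a toy table such as `{(2, 2): 1.0, (2, 3): 4.0, (3, 3): 6.0}`, the neutron separation energy at (2, 3) is 1.0 + 8.0713 - 4.0 MeV. Because each separation energy is a difference of neighboring masses, a correction that varies irregularly from nucleus to nucleus can improve the masses while degrading these differences, which is why they are checked separately.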
The rms deviation provides only a gross assessment of the accuracy of a nuclear mass model. To show some details, we present the mass differences between the experimental data and the predictions for each nucleus in the entire set in panel (a) of Fig. 2, taking the RMF mass model as an example. Clearly, there are some large differences, such as in the region around the magic numbers. These discrepancies around the magic numbers were also found in the HFB mass models with the Skyrme force Kortelainen2010PRC () or the Gogny force Goriely2016EPJA (), and are generally attributed to physics missing from the energy density functionals, the so-called “beyond mean-field” physics. The idea of the BNN approach is to employ a neural network to simulate such missing physics in nuclear mass models, so it is expected that the mass predictions of nuclear models can be improved. Panel (b) gives the mass corrections of RMF obtained with the BNN-I2 approach. It is found that the structures in panel (a) are very similar to those inside the contour lines of panel (b). This indicates that the BNN approach can describe well the smooth deviations between the experimental data and theoretical predictions. The mass differences between the experimental data and the mass predictions improved by the BNN-I2 approach are shown in panel (c). Clearly, the mass deviations of the RMF model are almost eliminated. Quantitatively, the resulting rms deviation is reduced from to MeV. However, the remaining differences still show some odd-even staggering structures, i.e., smaller and larger differences appear alternately. In addition, from the structure outside the contour lines in panel (b), the BNN approach predicts a systematic overestimation (underestimation) of nuclear mass in the neutron-rich (neutron-deficient) region except for heavy neutron-rich nuclei. This differs from the structure in the known region inside the contour lines, which holds richer features and predicts an overestimation of nuclear masses for nuclei around the magic numbers.
It is well known that the odd-even staggering and the local structures around magic numbers are related to the nuclear pairing correlation and the shell effect, respectively. Therefore, the inclusion of and in Eq. (7) is expected to resolve these problems. The corresponding results for the BNN-I4 approach are shown in panels (d) and (e) of Fig. 2. From panel (d), it is clear that the BNN-I4 approach eliminates not only the smooth deviations but also the odd-even staggering. As a result, no remarkable odd-even staggering is left in the mass differences in panel (e). Furthermore, the BNN corrections outside the contour lines in panel (d) show more structural features than those in panel (b). For example, the BNN-I4 approach predicts an overestimation of mass for nuclei towards and . This may indicate that the extrapolation ability of BNN-I4 is more reliable than that of BNN-I2.
To evaluate the extrapolation ability of the BNN approach, we use those nuclei that are in AME2016 but were not selected into the entire set because their present experimental errors keV. Taking the Ba isotopes and isotones as examples, the corresponding results are shown in Figs. 3 and 4, respectively. The gray-hatched regions denote the range of the entire set. Clearly, both the BNN-I2 and BNN-I4 approaches can eliminate the smooth mass deviations to a large extent, while the BNN-I4 approach also remarkably reduces the odd-even staggering. If the extrapolation is not far away from the training region, i.e., when the change in neutron or proton number is not larger than , the RMF mass predictions are well improved by the BNN approaches, especially by BNN-I4. This further shows that the BNN-I4 approach achieves better predictive performance than the BNN-I2 approach.
Apart from improving the mass predictions of nuclear models, the BNN approach also provides the uncertainties in the mass predictions, which are shown in Figs. 3 and 4 as well. It is found that the uncertainties of the BNN mass predictions grow for both BNN approaches as they are extrapolated away from the training region, while the uncertainties in the BNN-I4 approach are smaller than those in the BNN-I2 approach. In addition, the yellow-hatched regions in Figs. 3 and 4 give the mass uncertainties from the average errors of the theoretical models and the experimental errors. We find that the BNN mass predictions generally agree well with the data within uncertainties, even when they are extrapolated from the training region. This demonstrates that the BNN approaches can estimate the uncertainties in mass predictions in quite a reliable way. However, if the extrapolation goes too far away from the known region, there might be new physics effects that are hidden in the known region and hence cannot be discovered by training the neural network on the known data.
In summary, we have employed the Bayesian neural network approach to improve the nuclear mass predictions of various models. By using a distribution for the noise error in the likelihood function, the BNN approach can find the optimal value of the noise error automatically, which improves the nuclear mass predictions remarkably. To better describe the nuclear pairing and shell effects on mass predictions, we further include two relevant quantities in addition to the proton and mass numbers, keeping the number of parameters unchanged. It is found that the present BNN approach not only eliminates the smooth mass deviations significantly but also remarkably reduces the odd-even staggering in the mass deviations. As a result, the accuracies of all mass models considered here are significantly improved not only for the nuclear masses but also for the separation energies. Furthermore, the mass corrections with the present BNN approach show more structural features, e.g., an overestimation of nuclear masses is predicted for nuclei towards and in the RMF mass model. This shows that better predictive performance can be achieved not only in the known region but also in the unknown region far from the β-stability line, if more physical features are included in the BNN approach.
It is known that there exists an exact universal energy density functional for nuclear ground-state properties, though it is very difficult, if not impossible, to construct it. If one is able to find an accurate energy density functional with the BNN approach by taking various densities as the direct inputs, one can make reliable predictions for various nuclear ground-state properties. Work along this line is now in progress. In addition, one can also apply the BNN approach to improve the description of other nuclear properties for which many experimental data exist, such as nuclear charge radii, β-decay half-lives, and so on.
We are grateful to Professor T. Hatsuda and Dr. Y. F. Niu for the fruitful discussions. This work was partly supported by the National Natural Science Foundation of China under Grant No. 11205004, the Natural Science Foundation of Anhui Province under Grant No. 1708085QA10, the Key Research Foundation of Education Ministry of Anhui Province under Grant No. KJ2016A026, and the RIKEN iTHES project and iTHEMS program.
- (1) D. Lunney, J. M. Pearson, C. Thibault, Rev. Mod. Phys. 75, 1021 (2003).
- (2) M. Bender, P.-H. Heenen, and P.-G. Reinhard, Rev. Mod. Phys. 75, 121 (2003).
- (3) E. Margaret Burbidge, G. R. Burbidge, William A. Fowler, and F. Hoyle, Rev. Mod. Phys. 29, 547 (1957).
- (4) H. Z. Liang, N. Van Giai, J. Meng, Phys. Rev. C 79, 064316 (2009).
- (5) J. C. Hardy, I. S. Towner, Phys. Rev. C 91, 025501 (2015).
- (6) B. Franzke, H. Geissel, G. Münzenberg, Mass Spectrom. Rev. 27, 428 (2008).
- (7) B. H. Sun, Yu. A. Litvinov, I. Tanihata, Y. H. Zhang, Front. Phys. 10, 102102 (2015).
- (8) M. Wang, G. Audi, F. G. Kondev, W. J. Huang, S. Naimi, and X. Xu, Chin. Phys. C 41, 030003 (2017).
- (9) C. F. Von Weizsäcker, Z. Phys. 96, 431 (1935).
- (10) H. A. Bethe, R. F. Bacher, Rev. Mod. Phys. 8, 82 (1936).
- (11) P. Möller, W. D. Myers, H. Sagawa, and S. Yoshida, Phys. Rev. Lett. 108, 052501 (2012).
- (12) N. Wang, M. Liu, X. Z. Wu, J. Meng, Phys. Lett. B 734, 215 (2014).
- (13) S. Goriely, N. Chamel, J.M. Pearson, Phys. Rev. Lett. 102, 152503 (2009).
- (14) S. Goriely, N. Chamel, J. M. Pearson, Phys. Rev. C 93, 034337 (2016).
- (15) S. Goriely, S. Hilaire, M. Girod, S. Péru, Phys. Rev. Lett. 102, 242501 (2009).
- (16) J. Meng, H. Toki, S. G. Zhou, S. Q. Zhang, W. H. Long, and L. S. Geng, Prog. Part. Nucl. Phys. 57, 470 (2006).
- (17) D. Vretenar, A. V. Afanasjev, G.A. Lalazissis, P. Ring, Phys. Rep. 409, 101 (2005).
- (18) International Review of Nuclear Physics, Vol. 10, Relativistic Density Functional for Nuclear Structure, edited by J. Meng (World Scientific, Singapore, 2016).
- (19) J. Meng, J. Peng, S. Q. Zhang, and S. G. Zhou, Phys. Rev. C 73, 037303 (2006).
- (20) H. Z. Liang, N. Van Giai, J. Meng, Phys. Rev. Lett. 101, 122502 (2008).
- (21) Z. M. Niu, Y. F. Niu, Q. Liu, H. Z. Liang, and J. Y. Guo, Phys. Rev. C 87, 051303(R) (2013).
- (22) Z. M. Niu, Y. F. Niu, H. Z. Liang, W. H. Long, and J. Meng, Phys. Rev. C 95, 044301 (2017).
- (23) B. Sun, F. Montes, L. S. Geng, H. Geissel, Yu. A. Litvinov, and J. Meng, Phys. Rev. C 78, 025806 (2008).
- (24) Z. M. Niu, B. H. Sun, and J. Meng, Phys. Rev. C 80, 065806 (2009).
- (25) Z. M. Niu, Y. F. Niu, H. Z. Liang, W. H. Long, T. Nikšić, D. Vretenar, and J. Meng, Phys. Lett. B 723, 172 (2013).
- (26) X. D. Xu, B. Sun, Z. M. Niu, Z. Li, Y. Z. Qian, and J. Meng, Phys. Rev. C 87, 015805 (2013).
- (27) L. S. Geng, H. Toki, and J. Meng, Prog. Theor. Phys. 113, 785 (2005).
- (28) X. M. Hua, T. H. Heng, Z. M. Niu, B. H. Sun, J. Y. Guo, Sci. China Phys. Mech. Astron. 55, 2414 (2012).
- (29) D. Peña-Arteaga, S. Goriely, and N. Chamel, Eur. Phys. J. A 52, 320 (2016).
- (30) M. W. Kirson, Nucl. Phys. A 798, 29 (2008).
- (31) N. Wang and M. Liu, Phys. Rev. C 84, 051303(R) (2011).
- (32) Z. M. Niu, Z. L. Zhu, Y. F. Niu, B. H. Sun, T. H. Heng, and J. Y. Guo, Phys. Rev. C 88, 024325 (2013).
- (33) J. S. Zheng, N. Y. Wang, Z. Y. Wang, Z. M. Niu, Y. F. Niu, and B. Sun, Phys. Rev. C 90, 014303 (2014).
- (34) Z. M. Niu, B. H. Sun, H. Z. Liang, Y. F. Niu, and J. Y. Guo, Phys. Rev. C 94, 054315 (2016).
- (35) Irving O. Morales, P. Van Isacker, V. Velázquez, J. Barea, J. Mendoza-Temis, J. C. López Vieyra, J. G. Hirsch, and A. Frank, Phys. Rev. C 81, 024304 (2010).
- (36) S. Haykin, Neural Networks and Learning Machines (Pearson Education, New Jersey, 2009).
- (37) C. M. Bishop, Pattern Recognition and Machine Learning (Springer, Singapore, 2006).
- (38) S. Gazula, J. W. Clark, and H. Bohr, Nucl. Phys. A 540, 1 (1992).
- (39) K. A. Gernoth, J. W. Clark, and J. S. Prater, Phys. Lett. B 300, 1 (1993).
- (40) S. Athanassopoulos, E. Mavrommatis, K. A. Gernoth, and J. W. Clark, Nucl. Phys. A 743, 222 (2004).
- (41) H. F. Zhang, L. H. Wang, J. P. Yin, P. H. Chen, and H. F. Zhang, J. Phys. G: Nucl. Part. Phys. 44, 045110 (2017).
- (42) N. J. Costiris, E. Mavrommatis, K. A. Gernoth, and J. W. Clark, Phys. Rev. C 80, 044332 (2009).
- (43) R. Neal, Bayesian Learning for Neural Networks (Springer, New York, 1996).
- (44) R. Utama, J. Piekarewicz, and H. B. Prosper, Phys. Rev. C 93, 014311 (2016).
- (45) R. Utama, W. C. Chen, and J. Piekarewicz, J. Phys. G: Nucl. Part. Phys. 43, 114002 (2016).
- (46) M. Kortelainen, T. Lesinski, J. Moré, W. Nazarewicz, J. Sarich, N. Schunck, M. V. Stoitsov, and S. Wild, Phys. Rev. C 82, 024313 (2010).
- (47) S. Goriely, S. Hilaire, M. Girod, and S. Péru, Eur. Phys. J. A 52, 202 (2016).