Likelihood Approach to the First Dark Matter Results from XENON100
Many experiments that aim at the direct detection of Dark Matter are able to distinguish a dominant background from the expected feeble signals, based on some measured discrimination parameter. We develop a statistical model for such experiments using the Profile Likelihood ratio as a test statistic in a frequentist approach. We take data from calibrations as control measurements for signal and background, and the method allows the inclusion of data from Monte Carlo simulations. Systematic detector uncertainties, such as uncertainties in the energy scale, as well as astrophysical uncertainties, are included in the model. The statistical model can be used to either set an exclusion limit or to quantify a discovery claim, and the results are derived with a proper treatment of statistical and systematic uncertainties. We apply the model to the first data release of the XENON100 experiment, which allows to extract additional information from the data, and place stronger limits on the spin-independent elastic WIMP-nucleon scattering cross-section. In particular, we derive a single limit, including all relevant systematic uncertainties, with a minimum of for WIMPs with a mass of .
pacs:95.35.+d, 29.40.-n, 29.85.Fj, 02.50.-r
The XENON100 Collaboration
It is now well established that the dominant mass fraction of our Universe consists of some yet-unknown form of dark matter Jarosik et al. (2011). Well-motivated models predict Dark Matter in the form of Weakly Interacting Massive Particles (WIMPs) Nakamura et al. (2010) for which searches can be conducted in direct scattering experiments located on Earth. In such experiments, WIMPs are expected to induce nuclear recoils with a roughly exponentially falling spectrum that extends to at most a few tens of keV in energy Cerdeno and Green (2010). In contrast, the dominant background is usually composed of electronic recoils from - or -radiation. Most direct Dark Matter search experiments are able to discriminate nuclear recoils from electronic recoils based on some discrimination parameter, such as the ratio of ionization to phonon signal Ahmed et al. (2010); Armengaud et al. (2010), scintillation to phonon signal Angloher et al. (2009), ionization to scintillation signal Angle et al. (2008); Aprile et al. (2010), or the pulse shape of the scintillation light Lippincott et al. (2008). In a plot of this discrimination parameter as function of energy, the signal and background events thus separate. Typically, a signal acceptance area is defined a priori in this parameter space. Only events that fall within this area are considered as WIMP candidate events for the interpretation of the result in terms of a Dark Matter detection, or for the calculation of limits on the WIMP-nucleon scattering cross-section.
Such a hard cut has some obvious disadvantages. Most importantly, the particular location of the signal candidate events within the signal acceptance area is not taken into account. In general, the expected signal and background will follow a different spectrum, which is utilized by the maximum-gap and optimum-interval methods Yellin (2002), a frequentist approach to calculate limits. The generalization to two dimensions, the maximum-patch method Henderson et al. (2008), extends this feature to the case where background leaks into the signal acceptance region from high or low discrimination parameters. An intrinsic advantage of these methods is that the resulting limit is robust against small changes of the signal acceptance region, and they are widely used in discriminating direct Dark Matter searches today. However, these methods also have some severe drawbacks: they are designed to always result in an upper limit, and a natural transition to a claim of detection, as for example inherent to the Feldman-Cousins-approach Feldman and Cousins (1998), is not possible. In addition, systematic uncertainties are not taken into account. Thus, the resulting limit can be quoted at a given statistical confidence level (CL), typically taken to be 90%, but systematic uncertainties need to be treated separately.
In this paper, we present an approach that overcomes both of these problems of the conventional treatment of data from direct Dark Matter detection experiments. As a practical example, we use data from the first Dark Matter results of the XENON100 experiment Aprile et al. (2010). This data was previously analyzed in the classical way, by pre-defining a signal acceptance region based on calibration data. In particular, this region was defined using the prompt scintillation signal as a measurement of recoil energy, in the range between 4-20 photoelectrons (PE). In energy, this corresponds to about (keV nuclear recoil), depending on the energy calibration through the assumed relative scintillation efficiency Hitachi (2005); Lindhard and Scharff (1961). The logarithm of the ratio of proportional scintillation, , to prompt scintillation, , was taken as a discrimination parameter Aprile et al. (2006), and the signal acceptance region was defined below the median of the nuclear recoil band, as measured with a AmBe neutron calibration. The electronic recoil discrimination in this signal acceptance region was determined to be larger than 99% based on calibration data from Compton-scattered Co gammas. Using 11.17 live days of data, no events were observed in the signal acceptance region. Thus, upper limits on the WIMP-nucleon scattering cross-section were derived, based on zero observed events. Two separate limit curves were calculated for two different parameterizations of in the low-energy region where measurements are lacking or are uncertain Manalaysay (2010).
Here, we re-analyze this data with the Profile Likelihood Ratio statistical approach Eadie et al. (1971), that is inherent to the MINOS package commonly used with the MINUIT software James and Roos (1975). This method relies on the assumption that one can properly model the background via calibration measurements or Monte Carlo simulations, as opposed to the Yellin methods Yellin (2002), where no information on the background is assumed. This increases the sensitivity of the method at the price of being less robust against systematic uncertainties on the background. However, by incorporating systematic uncertainties into the model, one can adjust the trade-off between sensitivity and robustness to the level desired by the experimentalist. In any case, understanding of the background is of course essential to any method that is used to quantify the significance of a discovery claim.
The particular Profile Likelihood model used here is presented in Section II. Uncertainties in , that can serve as an example for systematic uncertainties related to the detector, are treated in Section III, together with systematic uncertainties from the escape velocity as an example for astrophysical uncertainties. Calibration data for both the electronic recoil background as well as the nuclear recoil signal are taken as constraining control measurements. These constrain the full likelihood model, which is constructed in Section IV. The sensitivity of the experiment is calculated, based solely on this model, in Section V. Taking the actual measurement into account, the resulting exclusion curve is derived also in Section V. Discussion and conclusions follow in Section VI.
Ii The Statistical Model
Events recorded by the XENON100 detector can be characterized by their prompt and proportional scintillation signals, and are therefore treated in this analysis as a set of points in the plane. The distribution of signal events in can be predicted from theoretical models taking into account the detector response, as will be described below. The joint distribution in and , for signal and background events, is estimated from the calibration measurements. The current measurements of and are similarly treated as another set of control measurements. The statistical model describing the outcome of the measurements thus includes a number of unknown quantities, i.e. nuisance parameters. Those are the expected number of background events given the exposure and the parameter range in question, a set of probabilities and describing the distribution of signal and background events in the plane, the relative scintillation efficiency , as well as the escape velocity . The full likelihood function , for a given WIMP mass and cross-section , is written as a product of five terms:
The first term describes the main measurement of the XENON100 detector, while the following terms describe the subsidiary measurements that are used to constrain the nuisance parameters in the main likelihood term . The precise definition of these terms will be given in the following sections. Additional uncertainties can easily be incorporated by adding additional Likelihood terms, but are not relevant for the data set analyzed here.
The signal cross-section is the one parameter of interest. All other parameters are nuisance parameters which are profiled out with a profile likelihood ratio, as explained below (Equation 3). We follow the procedure of a hypothesis test based on the profile likelihood ratio Rolke et al. (2005); Cowan et al. (2010). This technique can be used both to exclude a WIMP with a specific mass and cross-section, or to establish the significance of a discovery.
A test statistic reduces the observed data to only one value and is constructed in order to test the signal hypothesis . It is given by
where is the maximum likelihood estimator (MLE) of , i.e. the value of that maximizes the likelihood Equation II. is the Profile Likelihood ratio and is given by
The double-hat parameters in the numerator are the conditional MLEs of the nuisance parameters when the signal cross-section is fixed to a given value . The ‘single-hat’ parameters in the denominator are the MLEs of all parameters allowing also to vary. By construction, , hence . equals zero when the best-fit value of the cross-section () equals the hypothesized value (), which corresponds to the most signal-like outcome. Larger values of the test statistic indicate that the data are less compatible with the signal hypothesis . Since we are concerned with calculating a one-sided upper bound, we only consider outcomes with as an evidence against the signal hypothesis and set to zero otherwise.
Let be the probability distribution function of the test statistic under the signal hypothesis , and let be the value of the test statistic obtained with the observed data. The signal -value , is the probability that the outcome of a hypothetical, random XENON100 experiment results in a test statistic larger (less signal-like) than the observed one, when the signal hypothesis is true. Therefore, given by
The signal hypothesis is rejected at 90% CL if .
Downward fluctuations of the background might lead to exclusions of very small cross-sections to which the experiment is not sensitive. To protect against such an effect, is modified Junk (1999); Read (2002) to
is the probability of the test statistic to be larger than the observed one under the background-only hypothesis . Other protection procedures would also be conceivable, but this particular extension has a conservative over-coverage nature. It has been verified by Monte Carlo simulations that the coverage of our claimed 90% CL upper bound on the cross-section is in the range (92-95)% and thus larger than 90%, in accordance with the conservative nature of the method.
The upper limit on the cross-section for a given WIMP mass is found by solving
Wilks’ theorem Wilks (1938) states that follows a chi-square distribution in the limit of a large number of observations, which in this context not only includes the observed WIMP candidate events, but also all of the control measurements. It has been verified by Monte Carlo simulations that this asymptotic behavior indeed describes the distribution of the test statistic to a good approximation. We therefore use the chi-square approximation in order to estimate the signal -value . is estimated from Monte Carlo simulations of background only events.
We can also use the above statistical method in a natural way to quantify the significance of a possible additional event population or signal discovery. To this end, we test the background-only hypothesis (), and try to reject it. Similar to Equation 2, the discovery test statistic is defined as
Setting in Equation 3 yields
The -value is defined as
where is the probability density function of under the background-only hypothesis . A -value of corresponds to a 3 signal evidence. Following Wilks’ theorem, asymptotically distributes as a chi-square distribution under the background-only hypothesis. Under Wilks’ theorem, the discovery significance is given by with . As a general remark, the Profile Likelihood can result in the discovery of a process that looks like the signal hypothesis. However, the interpretation of the physical origin of such a signal is of course left to scientific discourse.
Iii Expected Signal with and as Control Measurements
The relative scintillation efficiency is the largest systematic uncertainty for the data analyzed here. It relates the expected number of photoelectrons to the recoil energy . The recoil energy dependence of together with its uncertainty are taken from a Gaussian fit of all available direct measurements Arneodo et al. (2000); Akimov et al. (2002); Bernabei et al. (2001); Aprile et al. (2005); Chepel et al. (2006); Aprile et al. (2009); Manzur et al. (2010) as shown in figure 1. To extrapolate below , where there are no available measurements of yet, the 68% confidence interval for is constructed such that it follows the two different extrapolations used in Aprile et al. (2010) to facilitate the comparison of the resulting limit. We emphasize that this extrapolation follows both the new data published in Plante et al. (2011), the theoretical expectations spelled out in Bezrukov et al. (2010), as well as the trend expected from simulations Szydagis et al. (2011).
We parametrize the uncertainty of with a single nuisance parameter , which is defined to be normally distributed with zero mean and unit variance. Thus, the -term in Equation II corresponds simply to
The number of expected signal events is given by the integral over the predicted WIMP energy spectrum Lewin and Smith (1996) as seen in the detector. is calculated for a given WIMP mass , cross-section , Galactic escape velocity as well as a given target mass and exposure time.
For the uncertainty on the escape velocity , we use the asymmetric probability distribution as shown in Figure 7 of Smith et al. (2007), yielding at 90% CL, with the median being . The likelihood term is defined accordingly to be
Let be the expected number of photoelectrons (PE) for a given recoil energy , given by
and Aprile et al. (2006) are the scintillation quenching factors due to the electric field for electronic and nuclear recoils, respectively. The normalization light yield for this data sample was measured to be Aprile et al. (2010). Errors on these parameters are small and have been neglected for the current analysis.
The signal rate in number of photoelectrons is then given by
Taking into account also the finite average single-photoelectron resolution of the XENON100 photomultipliers, the resulting -spectrum is given by
where is the acceptance of the applied cuts Aprile et al. (2010).
The total number of expected signal events is calculated from the integral
where and is the energy interval considered in the analysis. The effect of varying (Equation 11) for different WIMP masses is shown in Figure 2. It is calculated for an exposure of 11.17 days with 40 kg of xenon and a cross-section of .
Finally, the predicted normalized WIMP spectrum is given by
Variation of modifies the expected number of total signal events at a given cross-section, but the shape of the distribution is found to be hardly affected, see Figure 3. We therefore assume that is independent of , which greatly simplifies the analysis.
Iv Likelihood Construction
The total number of calibration events used for this analysis in the 4 PE-20 PE -region was nuclear recoils from AmBe and electronic recoils from Compton-scattered Co gammas. Those were divided into bands in the plane, containing approximately equal fractions (about 4%) of nuclear recoils in each band, see Figure 4. and are the corresponding probabilities for a signal or background event to fall in a given band . These bands were constructed such that the fraction of nuclear recoil events in each band is independent of , so that the signal spectrum is the same in all bands. The binning resolution of the -plane is limited by the amount of available neutron calibration data. It was verified that the resulting limit is not sensitive to the number of bands.
We take the calibration measurements as control measurements of and events in band , in order to constrain and . and are assumed to be Poisson distributed with expectations and . The corresponding likelihood terms in Equation II are thus
Finally, given a set of data points in a band , the likelihood that they emerge from a given WIMP spectrum is given by
Collecting all the likelihood terms together, the full likelihood function reads
Differentiating with respect to the expected number of signal events and with respect to , we find the following relations between the MLEs:
These relations prove useful in the minimization of the likelihood.
V WIMP Exclusion
To assess the statistical power of the experiment, the exclusion sensitivity can be calculated even before analyzing the actual data. The sensitivity of the experiment is the limit that can be expected for given exposure and experimental conditions. The test statistic is obtained by plugging the likelihood (Equation 21) into equations 2 and 3. To verify the validity of Wilks’ theorem, we generate a signal for a given WIMP mass and add it to a background simulation. The resulting test statistic distribution for a WIMP is shown in Figure 5 (thick red histogram). One can clearly see that the distribution is well approximated by a chi-square distribution. This allows to estimate analytically.
To estimate the exclusion sensitivity, background-only experiments are simulated, based on a Poisson distribution with the expectation of 22 events as observed in Aprile et al. (2010). The test statistic distribution is shown as the thin (blue) curve in Figure 5. The sensitivity is defined as the median of , also indicated in the figure. One can see that a large fraction of experiments under the background-only hypothesis (thin blue histogram) result in a test statistic that is similar to or even more signal-like (the area to the left of ) than that of experiments with a signal (thick red histogram). Figure 6 shows the sensitivity as its (dark shaded) and (light shaded) bands. For completeness of the statistical analysis we note that the -value of the background-only hypothesis is 50% as there are less events than expected, and .
To derive the upper bound on the WIMP cross-section as function of WIMP mass, we solve Equation 7 for the cross-section that satisfies (see Section II). The test statistic actually observed in data is shown in Figure 5, and the actual limit is shown in Figure 6. This limit has a minimum of at .
Two major advantages of the method presented here manifest themselves when comparing this new limit to the conventionally calculated one previously published in Aprile et al. (2010). In the high WIMP mass region, the limit obtained here is stronger by about 20%-30%. This simply reflects the fact that in Aprile et al. (2010), only half of the available parameter space (50% nuclear recoil acceptance) was defined a priori as the signal acceptance region. In contrast, the current analysis considers the full available discrimination parameter space, also taking the background in each band into account. Since there are no events observed in the data even in the region above the nuclear recoil median (see Figure 3 in Aprile et al. (2010)), the limit improves accordingly.
The second major difference between these two analyses appears in the low WIMP mass region, which is most sensitive to uncertainties in . The limit obtained here is more stringent than the limit shown in Aprile et al. (2010), constraining the areas favored by CoGeNT Aalseth et al. (2010) and DAMA Bernabei et al. (2010); Savage et al. (2009). This comes from our definition of the uncertainty bands of as shown in Figure 1, where we set the interval to equal the two extrapolations used in the previous analysis. We emphasize that here we do not assume any specific behavior of , but rather consider it as an unknown quantity with the appropriate measurement uncertainty, as described in Section III. Such systematic uncertainty is incorporated in a natural way in the calculation of the limit and results in a single limit at 90% CL, including all considered uncertainties.
We have introduced and applied the frequentist statistical method based on the profile likelihood test statistic to re-analyze the first data release from the XENON100 direct Dark Matter search experiment. This method avoids the need to a priori define a signal acceptance region, but instead takes all measured data into account. The background was estimated using calibration data as control measurements. Uncertainties in the relative scintillation efficiency and the Galactic escape velocity were taken into account in the construction of the likelihood model. Using the profile likelihood test statistic allows to calculate the sensitivity of the experiment with its uncertainty bands, and to set a well-defined single limit, taking all systematic uncertainties into account. Applying the method to the previously published XENON100 data results in an improvement of the limit over a wide WIMP mass range, with a minimum for WIMPs with mass . In addition, this method can easily be applied for the discovery of a WIMP signal.
We gratefully acknowledge support from NSF Grants No. PHY07-05326, PHY07-05337, PHY09-04220, PHY09-04212, and PHY09-04224, DOE Grant No. DE-FG-03-91ER40662, SNF Grants No. 20-118119 and 20-126993, the Volkswagen Foundation, FCT Grant No. PTDC/FIS/100474/2008, STCSM Grant No. 10ZR1415000, the Minerva Gesellschaft and GIF. We are grateful to the LNGS staff for their continued support.
- Jarosik et al. (2011) N. Jarosik et al., Astrophys. J. Suppl. 192, 14 (2011), arXiv:1001.4744.
- Nakamura et al. (2010) K. Nakamura et al. (Particle Data Group), J. Phys. G37, 075021 (2010).
- Cerdeno and Green (2010) D. G. Cerdeno and A. M. Green, Direct detection of WIMPs (Cambridge University Press, 2010), Chapter 17 in: Particle Dark Matter: Observations, Models and Searches, ed. G. Bertone, arXiv:1002.1912.
- Ahmed et al. (2010) Z. Ahmed et al., Science 327, 1619 (2010), arXiv:0912.3592.
- Armengaud et al. (2010) E. Armengaud et al., Phys. Lett. B687, 294 (2010), arXiv:0912.0805.
- Angloher et al. (2009) G. Angloher et al., Astropart. Phys. 31, 270 (2009), arXiv:0809.1829.
- Angle et al. (2008) J. Angle et al., Phys. Rev. Lett. 100, 021303 (2008), arXiv:0706.0039.
- Aprile et al. (2010) E. Aprile et al., Phys. Rev. Lett. 105, 131302 (2010), arXiv:1005.0380.
- Lippincott et al. (2008) W. H. Lippincott et al., Phys. Rev. C78, 035801 (2008), arXiv:0801.1531.
- Yellin (2002) S. Yellin, Phys. Rev. D66, 032005 (2002), arXiv:physics/0203002, See also S. Yellin (2007), Extending the optimum interval method, arXiv:0709.2701.
- Henderson et al. (2008) S. Henderson, J. Monroe, and P. Fisher, Phys. Rev. D78, 015020 (2008), arXiv:0801.1624.
- Feldman and Cousins (1998) G. J. Feldman and R. D. Cousins, Phys. Rev. D57, 3873 (1998), arXiv:physics/9711021.
- Hitachi (2005) A. Hitachi, Astropart. Phys. 24, 247 (2005).
- Lindhard and Scharff (1961) J. Lindhard and M. Scharff, Phys. Rev. 124, 128 (1961).
- Aprile et al. (2006) E. Aprile et al., Phys. Rev. Lett. 97, 081302 (2006), arXiv:astro-ph/0601552.
- Manalaysay (2010) A. Manalaysay (2010), arXiv:1007.3746.
- Eadie et al. (1971) W. Eadie, D. Drijard, F. James, M. Roos, and B. Sadoulet, Statistical Methods in Experimental Physics (North-Holland Publ. Co., Amsterdam, 1971), see in particular pp. 203–205.
- James and Roos (1975) F. James and M. Roos, Comput. Phys. Commun. 10, 343 (1975).
- Rolke et al. (2005) W. A. Rolke, A. M. Lopez, and J. Conrad, Nucl. Instrum. Meth. A551, 493 (2005), arXiv:physics/0403059.
- Cowan et al. (2010) G. Cowan, K. Cranmer, E. Gross, and O. Vitells (2010), arXiv:1007.1727.
- Junk (1999) T. Junk, Nucl. Instrum. Meth. A434, 435 (1999), arXiv:hep-ex/9902006.
- Read (2002) A. L. Read, J. Phys. G28, 2693 (2002).
- Wilks (1938) S. S. Wilks, Ann. Math. Statist. 9, 60 (1938).
- Arneodo et al. (2000) F. Arneodo et al., Nucl. Instrum. Meth. A449, 147 (2000).
- Akimov et al. (2002) D. Akimov et al., Phys. Lett. B524, 245 (2002), arXiv:hep-ex/0106042.
- Bernabei et al. (2001) R. Bernabei et al., Eur. Phys. J. direct C3, 11 (2001).
- Aprile et al. (2005) E. Aprile et al., Phys. Rev. D72, 072006 (2005), arXiv:astro-ph/0503621.
- Chepel et al. (2006) V. Chepel et al., Astropart. Phys. 26, 58 (2006).
- Aprile et al. (2009) E. Aprile et al., Phys. Rev. C79, 045807 (2009), arXiv:0810.0274.
- Manzur et al. (2010) A. Manzur et al., Phys. Rev. C81, 025808 (2010), arXiv:0909.1063.
- Plante et al. (2011) G. Plante et al. (2011), arXiv:1104.2587.
- Bezrukov et al. (2010) F. Bezrukov, F. Kahlhoefer, and M. Lindner (2010), arXiv:1011.3990.
- Szydagis et al. (2011) M. Szydagis et al. (2011), arXiv:1106.1613.
- Lewin and Smith (1996) J. D. Lewin and P. F. Smith, Astropart. Phys. 6, 87 (1996), see also Detector response corrections: correction and Spin factors - revised tables, available from http://hepwww.rl.ac.uk/UKDMC/pub/publications.html.
- Smith et al. (2007) M. C. Smith et al., Mon. Not. Roy. Astron. Soc. 379, 755 (2007), arXiv:astro-ph/0611671.
- Aprile et al. (2011) E. Aprile et al., Phys. Rev. D83, 082001 (2011), arXiv:1101.3866.
- Trotta et al. (2008) R. Trotta, F. Feroz, M. P. Hobson, L. Roszkowski, and R. Ruiz de Austri, JHEP 12, 024 (2008), arXiv:0809.3792.
- Aalseth et al. (2010) C. E. Aalseth et al. (2010), arXiv:1002.4703.
- Savage et al. (2009) C. Savage, G. Gelmini, P. Gondolo, and K. Freese, JCAP 0904, 010 (2009), arXiv:0808.3607.
- Bernabei et al. (2010) R. Bernabei et al., Eur. Phys. J. C67, 39 (2010), arXiv:1002.1028.