$Z'$ Bosons at Colliders: a Bayesian Viewpoint
We revisit the CDF data on di-muon production to impose constraints on a large class of $Z'$ bosons occurring in a variety of $E_6$ GUT based models. We analyze the dependence of these limits on various factors contributing to the $Z'$ production cross-section, showing that currently systematic and theoretical uncertainties play a relatively minor role. Driven by this observation, we emphasize the use of the Bayesian statistical method, which allows us to straightforwardly (i) vary the gauge coupling strength, $g'$, of the underlying $U(1)'$; (ii) include interference effects with the $Z'$ amplitude (which are especially important for large $g'$); (iii) smoothly vary the $U(1)'$ charges; (iv) combine these data with the electroweak precision constraints as well as with other observables obtained from colliders such as LEP 2 and the LHC; and (v) find preferred regions in parameter space once an excess is seen. We adopt this method as a complementary approach for a couple of sample models and find limits on the $Z'$ mass, generally differing by only a few percent from the corresponding CDF ones when we follow their approach. Another general result is that the interference effects are quite relevant if one aims at discriminating between models. Finally, the Bayesian approach frees us of any ad hoc assumptions about the number of events needed to constitute a signal or exclusion limit for various actual and hypothetical reference energies and luminosities at the Tevatron and the LHC.
The search for new physics beyond the Standard Model (SM) is one of the main objectives of current and future collider experiments. A promising signature of such physics is a neutral $Z'$ boson, which appears in numerous models extending the SM gauge symmetry group by an additional $U(1)'$ factor (for reviews, see [1, 2, 3, 4]). This is not only interesting in its own right. In fact, many (or even most) theories, scenarios, and models beyond the SM have been supplemented by extra $U(1)'$ symmetries in order to cure or ease specific problems that have arisen there. Thus, by finding a $Z'$ and reconstructing the underlying $U(1)'$ charges, one may obtain clues regarding the underlying physics and principles, such as supersymmetry, large or warped extra dimensions, other strong dynamics, etc.
There is currently no experimental evidence for a $Z'$, neither in the electroweak precision data (EWPD) [5, 6], nor in the study of interference effects at LEP 2, nor in searches for resonance production at the Tevatron or the LHC [9, 10]. It is therefore customary to set lower limits on the $Z'$ boson masses, $M_{Z'}$, for a number of models and relative to a fixed value of the $U(1)'$ gauge coupling, $g'$.
However, one can extract more information from all the available experimental results than is reflected in a collection of mass limits. Indeed, the information is based on a variety of different channels and observables (e.g., cross-sections and asymmetries) which can be used to disentangle the underlying model parameters and to diagnose the $Z'$. Moreover, precision constraints come from the $Z$-pole and related observables, from low energy measurements, and from the flavor sector. Thus, one should find a framework that allows these very different sources to be discussed simultaneously and transparently. Bayesian data analysis, being particularly suited for parameter estimation (as opposed to hypothesis and model testing), proves very convenient for achieving this goal, and in this paper we take a first step in this direction. Specifically, we consider as an example the recent di-muon results by the CDF Collaboration.
Working towards the above mentioned goal, we exploit here the implicit features of the Bayesian approach (i) to study the effects of interference of the $Z'$ boson with the photon and $Z$ bosons; and (ii) to project exclusion limits for current, future, and hypothetical colliders and various luminosity reaches. For the latter, this approach avoids any ad hoc assumptions about how many observed or expected events would constitute an exclusion or a discovery.
In our previous study of $Z'$ bosons we analyzed the most recent EWPD, which (as usual) was based on least-$\chi^2$ fits, i.e., the likelihood is given by $e^{-\chi^2/2}$. Further factors entering the total (posterior) probability density need to be constructed for each data set,
where $\theta_{ZZ'}$ is the mixing angle between the $Z'$ and the ordinary $Z$ boson and the prior factor is a non-informative prior density. We usually take it to be flat for variables defined over $(-\infty, +\infty)$, having in mind an infinitely wide multivariate Gaussian distribution. In other cases a parameter transformation may be in order. The dots refer to further parameters such as those characterizing the $U(1)'$ charges. We believe that computing the full posterior density is a (long-term) task worth the effort in order to make full use of all experimental (and theoretical) information. This is particularly obvious when a signal is seen in one or several places and one wants to narrow down the space of possible underlying $U(1)'$ symmetries, as mentioned above. Notice that the posterior can easily be updated by regarding it as a (now informative) prior density and multiplying it by a new likelihood factor (we are assuming here that there are no experimental or theoretical correlations between the various factors, so that the factorization property holds).
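The updating step described above can be sketched numerically. The snippet below is a minimal illustration, not the analysis code: the one-dimensional parameter grid, the function names, and the two toy Gaussian likelihood factors are all assumptions made for the example. It starts from a flat prior, multiplies in one likelihood factor at a time, and renormalizes, so that each posterior serves as the prior for the next data set.

```python
import math

def normalize(weights):
    """Rescale non-negative weights so that they sum to one."""
    total = sum(weights)
    return [w / total for w in weights]

def update(prior, likelihood):
    """Multiply a gridded prior by one likelihood factor and renormalize.

    Assumes the new factor is uncorrelated with all earlier ones, so that
    the total posterior factorizes.
    """
    return normalize([p * l for p, l in zip(prior, likelihood)])

def gaussian_likelihood(grid, mean, sigma):
    """Toy likelihood exp(-chi^2 / 2) with chi^2 = ((x - mean) / sigma)^2."""
    return [math.exp(-0.5 * ((x - mean) / sigma) ** 2) for x in grid]

# Hypothetical parameter grid and flat (non-informative) prior.
grid = [0.1 * k for k in range(-50, 51)]
posterior = normalize([1.0] * len(grid))

# Two toy data sets; the posterior after the first acts as prior for the second.
for mean, sigma in [(0.5, 1.0), (-0.5, 1.0)]:
    posterior = update(posterior, gaussian_likelihood(grid, mean, sigma))

# Grid point with the highest posterior weight.
best = grid[max(range(len(grid)), key=lambda k: posterior[k])]
```

Because the factors simply multiply, the order in which independent data sets are folded in does not matter.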
Since among the final goals of our approach is the ability to discriminate among models ($U(1)'$ charges), we review in Section 2 a large and popular class of models based on the $E_6$ gauge group and try to put these on an equal footing. $U(1)'$ models with the same charges (at least as far as the SM fermions are concerned) also arise from a bottom-up approach when demanding the cancellation of gauge and mixed gauge-gravitational anomalies in supersymmetric extensions of the SM together with a set of fairly general requirements such as allowing the SM Yukawa couplings, gauge coupling unification, a solution [13, 14] to the $\mu$-problem, the absence of dimension 4 proton decay as well as fractional electric charges, and chirality (to protect all fields from acquiring very large masses).
In Section 3 we lay out the theoretical framework for $Z'$ hadro-production, where some technical details relevant to the calculation of cross-sections are given for completeness in Appendix A. As a check, we reproduce the CDF limits following their approach and extend the limits to models not considered in their original analysis. Moreover, we project limits for various integrated luminosities and center-of-mass (CM) energies for the LHC.
Finally, in Section 4 we discuss the proposed Bayesian statistical method. We derive mass limits and compare them with those in Section 3. We then compute exclusion contours for some illustrative models, emphasizing the role of interference between the $Z'$ and SM amplitudes, especially for large $g'$. We conclude in Section 5.
2 The model class
As mentioned in the introduction, a very large class of $U(1)'$ symmetries underlying the $Z'$ bosons are subgroups of $E_6$, and can be written in the form,
Here is the mixing angle between the and maximal subgroups defined by  and , respectively, and is non-vanishing when there is a mixing term  between the hypercharge, , and field strengths , and this kinetic mixing term has been undone by field redefinitions. The , , and groups are mutually orthogonal in the sense that when the trace is taken over a complete representation of . The second form appearing in Eq. (2) uses a different orthogonal basis, , , and , which are the maximal subgroups  defined by and , referring here to the trinification subgroup  of .
The charges of the particles appearing in the fundamental representation of are shown in Table 1 in terms of the parameters , and , satisfying,
The values of , , and the for some specific models are given in Table 2. We also display the charges for the models listed in Table 2 more explicitly in Table 3. The general classification of all models with integer charges, as well as models arising from breaking chains involving maximal subgroups is the subject of Ref. .
There are also classes of models described by one continuous parameter. For example, one can restrict oneself to subgroups of , i.e., those perpendicular to the and therefore with . These models are equivalent (up to non-chiral sets of fermions) to the ones described by the real parameter and denoted by in Ref. , i.e., with charges defined by , when one identifies,
This class  contains the models based on left-right symmetry, , which can be seen from the breaking, . Incidentally, Table 2 shows that is the diagonal combination of and (and indeed was initially dubbed ) and has manifestly left-right symmetric charges. However, left-right symmetry is broken in the SM. In the fully symmetric case, i.e., when all gauge couplings are equal, (the tilde denotes normalization which we use here to simplify the discussion), the
must be orthogonal to the since these are obtained by an rotation of . This yields the and . At lower energies, renormalization group (RG) effects will generally split the gauge couplings. One then has [3, 32]
where the second step uses the relation  , and where the weak mixing angle appears in the last. Thus, assuming manifest left-right symmetry, , one finds . Formally one has the entire range, , but realistic breaking patterns  suggest . Finally, this range can be extended to include all models by identifying .
Similarly, there is a class of models perpendicular to the and therefore with . Under the identification,
these models are equivalent to those denoted by , i.e., with charges related by . Finally, Ref.  discussed another one-parameter subset of models, , which can be obtained by demanding and , and by identifying,
Of course, any other one-parameter subset of models may be considered. All models are guaranteed to be free of anomalies due to the absence of an independent cubic Casimir invariant from . On the other hand, the model class  is not contained in except for , and a different anomaly-free completion of the model (e.g., involving charged lepton singlets ) is needed. A similar remark applies to the models in Ref.  which predict charges of the SM fermions as in but distinct charges of exotics. E.g., the model  couples like the to the SM fermions as well as to the exotic charged leptons, but the and charges are multiplied by a factor of 3/2 and further SM singlets must be added (see, e.g., Table III in Ref.  for details and generalizations).
Finally, we will consider two models corresponding to maximal constructive () and destructive () interference of the amplitude for quarks (dominating the Drell-Yan production process at large momentum transfer) with those of the and the ordinary boson (see Section 4).
3 Direct searches at hadron colliders
Due to the large QCD background at the Tevatron, the decay into a lepton pair is the preferred discovery channel for a since leptons are relatively easy to identify and their energies and momenta can be measured more precisely than those of hadrons, although quarks [33, 34], quarks [34, 35] and di-jets [36, 37, 38] can also be detected. Among leptons, the background for a pair is harder to manage  compared to the and channels, of which the former is preferable still .
The theoretical production cross-section of a $Z'$ boson at a hadron collider depends on certain crucial factors, such as the treatment of the parton distribution functions (PDFs) of the incoming quarks and of the radiative corrections to the leading order (LO) process. The PDF sets for quarks and gluons are evaluated at various perturbative orders for a wide range of factorization scales and momentum fractions by a number of independent groups. These sets generally agree with high precision, so that the choice depends on whether a particular group provides PDFs at the required perturbative order, the inclusion or neglect of small corrections, the data sets available as of the latest update, etc. For the publication on which we base our analysis, the CDF Collaboration employed CTEQ6 PDFs. The PDF sets have since been updated a number of times by the CTEQ group. We redo this analysis using the latest sets available and verify the limits using the MSTW set.
As shown in Eq. (A) of the appendix, for every parton the next-to-leading order (NLO) differential Drell-Yan (DY) cross-section consists of three main parts: the PDFs for the incoming hadrons, the parton-level hard cross-section, and the QCD higher order terms. The determination of the PDFs requires experimental input. To evaluate them, the parameters of some functional form are fit to the data sets from a number of experiments (see, e.g., Ref. ). The central fit, $S_0$, corresponds to the minimum of the $\chi^2$ function. To allow error estimates, the CTEQ and MSTW Collaborations also provide PDF sets, $S_i^\pm$, which are defined as the eigenvectors of the Hessian matrix. Thus, the $S_i^\pm$ are uncorrelated by construction, providing an efficient method of calculating the induced variations of the PDF predictions for a chosen practical tolerance value, $T$, defining the region of ‘acceptable fits’ with $\Delta\chi^2 \leq T^2$. The eigenvectors of the Hessian matrix are normalized in such a way that the confidence levels correspond to hyper-spheres. The uncertainty can then be computed from the simple master formula,
$$\Delta X = \frac{1}{2} \sqrt{\sum_{i=1}^{N} \left[ X(S_i^+) - X(S_i^-) \right]^2},$$
where $N$ is the number of eigenvectors, $X$ is the observable (in our case the cross-section $\sigma$) and $X(S_i^\pm)$ are the predictions for $X$ based on the PDF sets $S_i^\pm$.
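For concreteness, the symmetric Hessian master formula, half the quadrature sum of the differences between the predictions from each pair of eigenvector sets, can be sketched as follows. The function name and the toy cross-section values are illustrative assumptions; in practice the inputs would be the observable evaluated with each of the plus and minus eigenvector PDF sets.

```python
import math

def hessian_uncertainty(x_plus, x_minus):
    """Symmetric PDF uncertainty 0.5 * sqrt(sum_i (X_i^+ - X_i^-)^2).

    x_plus[i] and x_minus[i] are the predictions for the observable from
    the i-th pair of eigenvector PDF sets.
    """
    if len(x_plus) != len(x_minus):
        raise ValueError("need one '+' and one '-' prediction per eigenvector")
    return 0.5 * math.sqrt(sum((p - m) ** 2 for p, m in zip(x_plus, x_minus)))

# Toy example with three eigenvector pairs (numbers are made up).
sigma_plus = [1.03, 1.00, 1.04]
sigma_minus = [0.97, 1.00, 0.96]
delta_sigma = hessian_uncertainty(sigma_plus, sigma_minus)
```

Since the eigenvector directions are uncorrelated by construction, the individual shifts add in quadrature and no covariance terms are needed.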
CDF used CTEQ6 PDFs for their calculation of the DY cross-section, for which CTEQ employed the Particle Data Group average for the strong coupling, $\alpha_s$. The CTEQ6M package contains the LO sets in addition to the central NLO PDF sets as well as the eigenvector sets for the latter. The latest version 6.6M has 22 pairs of eigenvector sets for error calculations. In the PDF sets produced by the MSTW group, $\alpha_s$ has been treated as a fit parameter. This results in LO, NLO, and NNLO values of $\alpha_s(M_Z)$ of approximately 0.139, 0.120 and 0.117, respectively [42, 47]. The MSTW sets contain LO, NLO and NNLO PDFs, in each case along with 20 pairs of eigenvector sets.
The QCD corrections to the LO hadronic process of $Z'$ production may considerably alter the magnitude of the cross-section. Conventionally, these corrections are taken into account with a ‘$K$-factor’, labelled here as $K_i$ and defined as,
$$K_i \equiv \frac{d\sigma_i}{d\sigma_0},$$
where $d\sigma_i$ is the differential cross-section to order $i$ in $\alpha_s$. [Footnote 1: While it may be obvious from the definition (10), we recall that the cross-sections need to be evaluated with the corresponding order PDFs.] This factor expresses higher order corrections to $Z'$ production only and not to the complete process. If we consider only one quark flavor in Eq. (10) then $K_i$ is independent of the $Z'$ model and $g'$. Thus, the proper way to account for higher order QCD corrections is to calculate a different $K$-factor for every flavor, even though it is common practice to choose a universal factor for all flavors and models. The main results of the present paper are calculated for the cross-section defined in Eq. (27), with a $K$-factor included as we now explain.
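The difference between flavor-dependent and universal correction factors can be made concrete with a small sketch. Everything below is illustrative: the function names, the flavor labels, and the numerical values are assumptions chosen for the example, not numbers from the analysis. Each factor is the ratio of a higher-order cross-section to the LO one for a single flavor, and the corrected total is the sum over flavors of the LO contribution times its own factor.

```python
def k_factor(sigma_higher_order, sigma_lo):
    """Ratio of a higher-order cross-section to the LO one (same flavor)."""
    if sigma_lo <= 0.0:
        raise ValueError("LO cross-section must be positive")
    return sigma_higher_order / sigma_lo

def corrected_cross_section(sigma_lo_by_flavor, k_by_flavor):
    """Sum over flavors of LO contribution times its own correction factor."""
    return sum(sigma_lo_by_flavor[q] * k_by_flavor[q] for q in sigma_lo_by_flavor)

# Made-up LO contributions (arbitrary units) and per-flavor factors.
sigma_lo = {"u": 2.0, "d": 1.0}
k_flavor = {"u": k_factor(2.6, 2.0), "d": k_factor(1.2, 1.0)}

sigma_total = corrected_cross_section(sigma_lo, k_flavor)

# The "universal" factor that would reproduce the same total depends on the
# relative flavor weights, and hence on the model's quark couplings.
k_effective = sigma_total / sum(sigma_lo.values())
```

In this toy case the effective factor lies between the two per-flavor values, and it would shift if the relative weight of the two flavors changed.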
As intimated earlier, CDF has used a LO expression (with LO PDFs) and a $K$-factor (taken from Ref. ) for their calculation of the DY cross-section via $Z'$ exchange. We have compared the $K$-factors from that work with effective values obtained using the expressions given in Appendix A and found good agreement (within a few %). We then also adopted $K$-factors in such a way that our results are effectively NNLO. Only for the comparisons in Table 4 with the CDF results do we use the $K$-factor of the CDF analysis instead. [Footnote 2: For invariant masses beyond 1 TeV at the Tevatron, we use constant factors.] As for the LHC, NLO results suffice.
We take the cross-section upper limit from the data curve in Fig. 3 of Ref.  and find the intersection with our cross-section, $\sigma$, as defined in Eq. (27) for $Z'$ boson exchange. Final state radiation effects can be ignored in this case since we integrate over almost the entire invariant mass range, so that these effects mostly cancel according to the Kinoshita theorem. The multi-dimensional integrals involved are evaluated using the CUHRE and SUAVE programs of the CUBA package. We input the numerical values of couplings and charges from the FORTRAN package GAPP. Only decays into SM fermions are assumed and fermion masses are neglected. A numerical routine extracted from the PEGASUS package is used for the running of $\alpha_s$ from its value at $M_Z$ to the factorization scale. Fig. 1 shows the ‘model-lines’ for the $Z'$ bosons, including some models not included in the original CDF analysis. The slopes of the model-lines and their intersection points with the experimental data line (giving the limits) match those in the original CDF plot within a few per mille for models included in both analyses. The 95% C.L. mass limits for various models are listed in Table 4. For comparison, the CDF and EWPD limits are also quoted. The last column in the table gives the 95% C.L. mass limits anticipated at the end of the current Tevatron run, obtained using the Bayesian statistical method explained in the next section.
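The intersection procedure described above can be sketched as a one-dimensional root search. The snippet below is a simplified illustration under stated assumptions: both the model cross-section and the observed upper limit are given as hypothetical tabulated values on a common mass grid, they are interpolated linearly, and a single crossing is located by bisection. The numbers are made up, and a real analysis would more likely interpolate the logarithm of the steeply falling cross-section.

```python
def interp(x, xs, ys):
    """Piecewise-linear interpolation on a sorted grid xs."""
    if not xs[0] <= x <= xs[-1]:
        raise ValueError("x outside the tabulated range")
    for k in range(len(xs) - 1):
        if xs[k] <= x <= xs[k + 1]:
            t = (x - xs[k]) / (xs[k + 1] - xs[k])
            return ys[k] + t * (ys[k + 1] - ys[k])

def mass_limit(masses, sigma_model, sigma_limit, tol=1e-6):
    """Mass at which the model prediction drops below the observed limit.

    Assumes the model lies above the limit at the lowest mass, below it at
    the highest mass, and that the two curves cross exactly once.
    """
    def diff(m):
        return interp(m, masses, sigma_model) - interp(m, masses, sigma_limit)

    lo, hi = masses[0], masses[-1]
    if diff(lo) <= 0.0 or diff(hi) >= 0.0:
        raise ValueError("no crossing bracketed by the mass grid")
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if diff(mid) > 0.0:
            lo = mid   # model still excluded at mid: limit is higher
        else:
            hi = mid
    return 0.5 * (lo + hi)

# Toy inputs: masses in GeV, cross-sections in arbitrary units.
masses = [800.0, 900.0, 1000.0, 1100.0]
sigma_model = [40.0, 20.0, 10.0, 5.0]
sigma_limit = [12.0, 12.0, 12.0, 12.0]
limit = mass_limit(masses, sigma_model, sigma_limit)
```

Masses below the returned value are excluded in this toy setup, since there the predicted cross-section exceeds the observed upper limit.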
[Table: PDF sets compared, with columns CTEQ6M, CTEQ6.6M, MSTW2008]
Fig. 2 shows the area preserving sinusoidal (Sanson-Flamsteed) projection of a hemisphere parameterizing the bosons in terms of the angles and . We note that the lower bound for quoted in Table 4 corresponds to the normalization given in Ref. , and hence differs from the corresponding value in Fig. 2, where all the models are normalized as in Eq. (5).
In Table 5 we display the dependence of these limits on the PDFs. To also investigate the uncertainties due to them we use the central CTEQ6M PDFs and the corresponding eigenvector PDF sets as displayed in Table 6 for two selected values of . One sees that the relative uncertainty in is very large for the which is due to in this model and points to the fact that for a given the uncertainties in the quark PDFs are larger than those of the quarks. Also, the quark contribution is suppressed by more than an order of magnitude with respect to that of the quark which is reflected by the ratio of the cross-sections for the and the . We recall that we use a common normalization for all models so that the cross-sections can be directly compared. Finally we show in the table how the uncertainty in affects the limits in these models.
4 The Bayesian statistical method
The CDF Collaboration collected an integrated 2.3 fb$^{-1}$ of data in the $\mu^+\mu^-$ channel, binned in inverse invariant di-muon mass, $m_{\mu\mu}^{-1}$. The CDF analysis then looks for an enhancement in di-muon production above the SM background for particular $Z'$ models, and so their lower limits on $M_{Z'}$ correspond to upper limits on the cross-section. [Footnote 3: It utilizes signal templates that have been generated with a fixed and relatively narrow width.] But $Z'$ bosons interfere with the SM neutral gauge bosons, and destructive interference would result in a reduction of the SM cross-section. For this reason, in addition to the general motivation given in Section 1 (the clear-cut combination of constraints from quite distinct sources), we adopt a statistical framework wherein it is straightforward to address interference effects and to vary the coupling strength (see also Ref. ) up to the strong coupling regime (and broad resonances).
The basic idea is to apply the Bayesian analysis of the SM Higgs mass, $M_H$, of Ref.  to $Z'$ physics. In that analysis the collider constraints from LEP 2 and the Tevatron were included using the published log-likelihood ratios,
These are given in terms of the probabilities (likelihoods) to obtain the data, conditional on the signal plus background hypothesis, , and background only hypothesis, , and may be compounded of many experiments, channels, energies, etc. The depend on the parameter(s), , of interest, ( in Ref. ), through the signal hypothesis. Information on is obtained by Bayes’s theorem,
where is the prior probability density function (pdf) entering Eq. (1), and is a summary of our knowledge, if any, prior to the experiment or analysis. In the absence of prior information, or if the prior information is explicitly taken into account by extra factors , then is called non-informative, and is most conservatively taken as or , whenever may be an arbitrary real number or positive real number, respectively. Notice, that drops out from likelihood ratios. This is crucial: if various data points show poor compatibility, or if an excess (or deficit) is observed, this will have an impact only if some value of describes the data better than some other.
We follow here a philosophy similar to that of Ref. , but we first have to construct the corresponding LLR ourselves. Our input data are the numbers of events, $n_i$, in bin $i$ (see Fig. 1 in Ref. ). The parameter set may include all of the parameters introduced in Sections 1 and 2, namely $M_{Z'}$, $g'$, $\theta_{ZZ'}$, $\alpha$, and $\beta$, but in this paper we allow only $M_{Z'}$ and $g'$ to vary for some specific models (i.e., fixed values of $\alpha$ and $\beta$) and set $\theta_{ZZ'} = 0$. Complementary data sets, such as other channels, LHC and DØ results, EWPD data [5, 6], and LEP 2 constraints, will be necessary to disentangle these parameters in an integrated analysis. The SM point corresponds to $M_{Z'} \to \infty$ or $g' = 0$. The events, $n_i$, in each bin follow Poisson statistics,
$$p(n_i | \nu_i) = \frac{\nu_i^{n_i} e^{-\nu_i}}{n_i!},$$
where $\nu_i$ is the predicted number of events in bin $i$ given specific values for $M_{Z'}$ and $g'$.
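A binned Poisson likelihood of this kind is most conveniently evaluated in logarithmic form. The following is a minimal sketch; the function name and the example counts are illustrative assumptions. It sums, over bins, the log of the Poisson probability of observing a given count when a given mean is predicted, using `math.lgamma` for the factorial so that large counts do not overflow.

```python
import math

def poisson_log_likelihood(observed, expected):
    """Sum over bins of ln[ nu^n * exp(-nu) / n! ].

    observed: event counts per bin (non-negative integers)
    expected: predicted mean number of events per bin (positive floats)
    """
    if len(observed) != len(expected):
        raise ValueError("observed and expected must have the same length")
    total = 0.0
    for n, nu in zip(observed, expected):
        if nu <= 0.0:
            raise ValueError("expected counts must be positive")
        # lgamma(n + 1) equals ln(n!)
        total += n * math.log(nu) - nu - math.lgamma(n + 1)
    return total

# Toy example: two bins with made-up observed and predicted counts.
log_l = poisson_log_likelihood([1, 2], [1.0, 2.0])
```

An empty bin still carries information: observing zero events where a mean of two is predicted contributes a log-likelihood of minus two.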
It is important to note here that the above likelihood is the same as the one employed by the CDF Collaboration in their analysis. However, they determine the predicted event numbers by summing the bin-counts from the SM and the bin-counts from the signal template as mentioned earlier, effectively summing the cross-sections of the SM process and the $Z'$-mediated process without any interference. Therefore, the CDF approach essentially differs from ours not in the choice of the likelihood, but in that the interference effects were neglected in their analysis in order to keep it simple and model-independent, deliberately making it blind to wider resonances through the use of templates based on a narrow signal width. We, conversely, treat the coupling strength as a free parameter and avoid signal templates, which makes the inclusion of interference effects rather natural in our framework.
In practice, we compute a grid of values for the predicted event numbers, discretizing $M_{Z'}$ and $g'$, and interpolate in every one of the first 35 bins corresponding to the invariant mass range searched by CDF. [Footnote 4: Alternatively, calculating the predictions directly without the grid gives mass limits which differ by at most 3 GeV for our benchmark models. However, this dramatically increases CPU time if a multi-variate minimization is performed, i.e., without fixing the model parameters. It is therefore expedient to avoid the mostly redundant PDF integrations.] We have thus effectively reduced the analysis to a least-$\chi^2$ fit where any observed event count adds a piece,
to the overall $\chi^2$ function (the two expectations entering this piece refer here to the SM and SM plus new physics, respectively).
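One form of the per-bin piece that is consistent with Poisson statistics is minus twice the logarithm of the likelihood ratio between the SM plus new physics expectation and the SM-only expectation. Whether this coincides exactly with the expression used in the text is an assumption; the sketch below, with hypothetical function and variable names and made-up numbers, only illustrates that such a piece can take either sign.

```python
import math

def delta_chi2(observed, nu_sm, nu_np):
    """-2 ln [ L(nu_np) / L(nu_sm) ] summed over bins, for Poisson counts.

    observed: event counts per bin
    nu_sm:    SM-only expectation per bin (positive)
    nu_np:    SM plus new physics expectation per bin (positive)
    The n! terms cancel in the ratio, leaving
    2 * [ (nu_np - nu_sm) + n * ln(nu_sm / nu_np) ] per bin.
    """
    total = 0.0
    for n, b, s in zip(observed, nu_sm, nu_np):
        total += 2.0 * ((s - b) + n * math.log(b / s))
    return total

# No events seen where the new physics predicts more: disfavored (positive).
worse = delta_chi2([0], [1.0], [2.0])

# An upward fluctuation matched by the new physics: favored (negative).
better = delta_chi2([3], [1.0], [3.0])
```

A negative total means the new physics hypothesis describes the data better than the SM, which is how an excess (or a deficit from destructive interference) would show up in this kind of fit.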
The detector resolution, TeV, is approximately constant (in the variable ), and must be taken into account since it is of the order of the bin size of (3.5 TeV). We define as the theoretical invariant mass of the muon pairs, as opposed to the nominally measured , and introduce the convolution,
where fb is the integrated luminosity, is the detector efficiency, which we take as a constant 0.982 for all bins, is given by Eq. (A), and refers to the non-DY background which is extracted directly from Fig. 1 of Ref. . is the total acceptance of the CDF detector, which increases from about at the -pole to about at 1 TeV and then falls off rapidly . For our current analysis, the acceptance values have been gleaned from .
Along with the QCD corrections, the QED radiative corrections  also have a sizable effect on the DY cross-section, and strongly affect the shape of the di-lepton invariant mass distribution. While initial state radiation is negligible for di-muon masses between 50 and 100 GeV, final state QED corrections are in fact larger than the QCD corrections and so have to be taken into account. The QED corrections are shown in Fig. 6 of Ref.  with a rapid variation visible in the range GeV. Just below the peak, these corrections enhance the cross-section by up to a factor of 1.9, so the cross-sections in the neighboring bins to the peak differ considerably from the values expected without these corrections. When the full di-muon mass range is integrated over, the large negative and positive corrections tend to cancel and do not have a big impact on the total cross-section . For GeV, they uniformly reduce the differential cross-section by 7%, and for our calculations555The figure  has been generated for TeV, while our process is being computed for the TeV of the current Tevatron run, but we expect this to have negligible effect on our final results. we have extrapolated the data points from the mentioned figure and used these as multiplicative factors, which we refer to as . In principle, such effects should also appear near the -pole, but considering that the bin around the expected mass is fairly wide, any large effect around the peak will be washed out at least for weak and intermediate coupling strength.
Returning now to Eq. (15), the quantity is our smearing function,
with and and is constructed as a Beta distribution with mean and variance . Note that it approaches a Gaussian form for (which is the case except for the first few bins), but is more adequate than a Gaussian since takes non-negative values only. We note in passing that we neglect here systematic and theoretical uncertainties, justifying this with the very small event numbers in the most relevant bins so that statistics dominates.
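A Beta distribution with a prescribed mean and variance can be obtained by the method of moments. The sketch below assumes the variable has been rescaled to the unit interval, which is an assumption of this example rather than a statement about the actual smearing function; the function names and the test values are likewise illustrative. It converts a mean and variance into the two shape parameters and evaluates the density using log-gamma functions.

```python
import math

def beta_params(mean, var):
    """Shape parameters (a, b) of a Beta distribution on (0, 1) with the
    given mean and variance (method of moments)."""
    if not 0.0 < mean < 1.0:
        raise ValueError("mean must lie strictly between 0 and 1")
    if not 0.0 < var < mean * (1.0 - mean):
        raise ValueError("variance must be positive and below mean*(1-mean)")
    common = mean * (1.0 - mean) / var - 1.0
    return mean * common, (1.0 - mean) * common

def beta_pdf(x, a, b):
    """Beta density at x; returns 0 outside the open interval (0, 1).

    Returning 0 at the endpoints is a simplification for this sketch.
    """
    if x <= 0.0 or x >= 1.0:
        return 0.0
    log_norm = math.lgamma(a + b) - math.lgamma(a) - math.lgamma(b)
    return math.exp(log_norm + (a - 1.0) * math.log(x)
                    + (b - 1.0) * math.log(1.0 - x))

# Toy example: mean 0.5 and variance 0.05 correspond to a = b = 2.
a, b = beta_params(0.5, 0.05)
peak = beta_pdf(0.5, a, b)
```

When the standard deviation is small compared with the distance of the mean from either endpoint, both shape parameters become large and the density approaches a Gaussian, while the support remains bounded.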
Eq. (14) also makes it explicit how our approach allows the new physics to enter with either sign, as is always the case for interfering bosons. At the level of the differential cross-section, the interference terms change sign when crosses the or poles. Thus, there are fairly large cancellations at work when the whole range of is integrated over, and when the objective is the usual hunting for a narrow bump where neglect of interference effects is justified . Since here we put more emphasis on the event distribution over larger numbers of bins, the interference issue becomes more interesting. To have a closer look as to how significant the interference effects are numerically, we now discuss two cases where they are enhanced, and . They are defined to have, respectively, maximum constructive and destructive interference with the SM amplitudes for up quarks in the limit , i.e., we extremize the expression (in a slightly more compact notation),
We chose as our reference value because then any dependence on and drops out. Moreover, at large the down quark contribution to the PDFs is strongly suppressed, providing a further simplification. For the case of the it now turns out that neglecting the second ( interference) term shifts the corresponding values for our model parameters and only at the level, so that we can neglect this term, as well, and we find,
which is relatively close to the case. The facts that the boson couples only vector-like and that the vector-coupling for the muons is suppressed by a factor , may give a rationale for why in this case the interference term is small. Similarly, the is numerically close to the boson which has only axial-vector couplings to the SM fermions. In this case we can neglect the first ( interference) term in Eq. (17) and simply define . As for the integrated cross-sections, the constructive interference for is about an order of magnitude larger than the destructive interference in , and in the latter case we find that the sign of the interference effect in the total cross-section is reversed compared to the amplitude level in the limit.
We illustrate the interference effects for the and bosons in Fig. 4. As can be seen, they become significant for values of order unity. In fact, for large most of the expected signal events come from the interference, since the pure exchange contribution is more strongly mass suppressed. Another way to quantify the interference effects is to look at the behavior of the best fit location. E.g., for the model (not included in the plot) we found the global best fit at TeV and , with the value of relative to the SM. On the other hand, if we turn off the interference effects, the global minimum strongly shifts to TeV and with .
We also stress that the interferences are important if one wants to discriminate between models666The importance of interference effects in forward-backward asymmetries was emphasized in Ref. .. E.g., the and models have their global minimum at low mass and weak coupling similar to the values above regardless of the interference. But the -minimum becomes deeper in the presence of interference effects, even though we show in Fig. 4 that the mass limits are unaffected at small coupling. Moreover, without interference is virtually degenerate at the minimum for the three mentioned models, but this is lifted by the interference effects.
The contours in the plane are given in Fig. 5 for the . For comparison with our approach, we extended the CDF limit corresponding to to other values of the coupling. Crucially, the CDF line breaks down at owing, again, to the fact that their templates assume a narrow boson. As can be seen, our method reveals a strong variation of the mass limit with , while the frequentist method used by CDF shows a weaker and mostly monotonic dependence. The PE line is obtained by assuming an event count (i.e., the SM expectation even though is not integer valued) in Eq. (14), in place of the actually observed number of events for all the bins. This yields a smooth and monotonic contour, demonstrating that the strong dependence of the