Studying Gaugino Mass Unification at the LHC

Baris Altunkaynak, Phillip Grajek, Michael Holmes,

Gordon Kane, and Brent D. Nelson

(1) Department of Physics, Northeastern University, Boston, MA 02115
(2) Michigan Center for Theoretical Physics, Randall Lab., University of Michigan, Ann Arbor, MI 48109

We begin a systematic study of how gaugino mass unification can be probed at the CERN Large Hadron Collider (LHC) in a quasi-model-independent manner. As a first step in that direction we focus our attention on the theoretically well-motivated mirage pattern of gaugino masses, a one-parameter family of models of which universal (high-scale) gaugino masses are a limiting case. We improve on previous methods to define an analytic expression for the metric on signature space and use it to study one-parameter deviations from universality in the gaugino sector, randomizing over the other soft supersymmetry-breaking parameters. We put forward three ensembles of observables targeted at the physics of the gaugino sector, allowing for a determination of this non-universality parameter without reconstructing individual mass eigenvalues or the soft supersymmetry-breaking gaugino masses themselves. In this controlled environment we find that approximately 80% of the supersymmetric parameter space would give rise to a model for which our method will detect non-universality in the gaugino mass sector at the 10% level with roughly 20 fb⁻¹ of integrated luminosity. We discuss strategies for improving the method and for adding more realism in dealing with the actual experimental circumstances of the LHC.

1 Introduction

As the Large Hadron Collider (LHC) era fast approaches, the theoretical community is increasingly focused on how the new discoveries made there will be interpreted. The first step, most obviously, is to establish the presence of physics beyond the Standard Model. This will be done using search strategies that are by now well-established, though many interesting “what-if” scenarios continue to be proposed and investigated [1]. We continue to believe that supersymmetry (SUSY) is the best-motivated extension to the Standard Model for physics at the LHC energy scale. Furthermore, if supersymmetry is indeed relevant at the electroweak scale there are many reasons to expect that its presence will be established early on in the LHC program [2]. Indeed, even some properties of the spectrum, such as the masses and spins of low-lying new states, may be crudely known even after relatively little integrated luminosity [3, 4, 5]. In this paper we begin a research program into what comes next: how to connect the multiple LHC observations to organizing principles in some (high-energy) effective Lagrangian of underlying physics.

This secondary problem can be further divided into two sub-problems. The first has come to be called the “inversion” problem. Briefly stated, the inversion problem is the recognition that even in very restrictive model frameworks it is quite likely that more than one set of model parameters will give predictions for LHC observations that are in good agreement with the experimental data [6]. Much recent work has focused on how to address this issue [7, 8, 9, 10, 11], and we will borrow much of the philosophy and many of the useful techniques from this recent literature. But our focus here is on what we might call the second sub-problem: how to turn the ensemble of distinct LHC signatures into a determination of certain broad properties of the underlying Lagrangian at low energies. Clearly the most direct attack on this second sub-problem is to perform a global fit to the parameters of a particular model [12, 13], modulo the degeneracy issue just described above. Not surprisingly, therefore, the work we will describe in this paper will make significant use of likelihood fits. But our ultimate goal is to fit to certain broad properties of the underlying physics itself – and not simply to a particular model of that physics.

We will refine this rather vague-sounding goal in a moment. But it is helpful to first consider an example of what we mean by the phrase “broad properties of the underlying physics.” Consider a high-energy theorist interested in connecting the (supersymmetric) physics at the LHC to physics at an even higher energy scale, such as some underlying string theory. What sort of information would be of most use to him or her in this pursuit? Would it be a precise measurement of the gluino mass, or of the mass splitting in the top squark sector, or some other such measurement? Obtaining such information is (at least in principle) possible at the LHC, but far more valuable would be knowledge of the size of the supersymmetric μ-parameter and whether it is very small. Such information is far more difficult to obtain at the LHC [14] but is more correlated with moduli stabilization and/or how the μ-parameter is generated in string models [15]. For example, this knowledge may tell us whether the μ-parameter is fundamental in the superpotential or generated via the Kähler potential, as in the Giudice-Masiero mechanism [16]. This, in turn, is far more powerful in discriminating between potential string constructions than the gluino mass itself – no matter how accurately it is determined. We might refer to the genesis of the μ-parameter as a “broad property of the underlying physics.”

If all such key broad properties of the underlying physics were enumerated, it is our view that one of the most important would be the question of gaugino mass universality: that is, the notion that at the energy scale at which supersymmetry breaking is transmitted to the observable sector, the gauginos of the minimal supersymmetric Standard Model (MSSM) all acquired soft masses of the same magnitude. This issue is intimately related to another, perhaps equally important one: the wave-function of the lightest supersymmetric particle, typically the lightest neutralino. Few properties of the superpartner spectrum have more far-reaching implications for low-energy phenomenology, the nature of supersymmetry breaking, and the structure of the underlying physics Lagrangian [17]. If the theorist could be told only one “result” from the LHC data, the answer to the simple question “Is there evidence for gaugino mass universality?” might well be it. But these soft parameters are not themselves directly measurable at the LHC [18]. One might consider performing a fit to some particular theory, such as minimal supergravity (mSUGRA), in which universal gaugino masses are assumed [21] – or perhaps to certain models with fixed, non-universal gaugino mass ratios [22, 23]. But we are not so much interested in whether mSUGRA – or any other particular theory for which gaugino mass universality is a feature – is a good fit to the data. Rather, we wish to know whether gaugino mass universality is a property of the underlying physics independent of all other properties of the model. From this example both the ambitiousness and the difficulty inherent in our task are clear.

We have therefore decided to begin our attack by considering a concrete parametrization of non-universalities in soft gaugino masses. Many such frameworks present themselves, but we will choose a parametrization that has the virtue of also having a strong theoretical motivation from string theory. In recent work by Choi and Nilles [24] soft supersymmetry-breaking gaugino mass patterns were explored in a variety of string-motivated contexts. In particular, the so-called “mirage pattern” of gaugino masses provides an interesting case study in gaugino mass non-universality. Yet as mentioned above, these soft supersymmetry breaking parameters are not themselves directly measurable. Linking the soft parameters to the underlying Lagrangian is important, but without the crucial step of linking the parameters to the data itself it will be impossible to reconstruct the underlying physics from the LHC observations.

The mirage paradigm gets its name from the fact that, should the mirage pattern of gaugino masses be used as the low-energy boundary condition of the (one-loop) renormalization group equations, there will exist some high-energy scale at which all three gaugino masses are identical. This unification has nothing to do with grand unification of gauge groups, however, and the gauge couplings will in general not unify at this particular energy scale – hence the name “mirage.” The set of all such low-energy boundary conditions that satisfy the mirage condition defines a one-parameter family of models. This parameter can be taken to be the mirage unification scale itself, or some other quantity, such as the ratio between the various contributions to the gaugino soft masses. We note that the minimal supergravity paradigm of soft supersymmetry breaking is itself a member of this family of models, since it is defined by the property that gaugino masses are universal at the scale μ_GUT. Indeed, in the parametrization we adopt from [24], the gaugino mass ratios at the electroweak scale take the form

M_1 : M_2 : M_3 ≃ (1 + 0.66 α) : (2 + 0.2 α) : (6 − 1.8 α)    (1.1)

where the case α = 0 is precisely the unified mSUGRA limit. Note that when we speak of testing gaugino mass universality, therefore, we do not imagine a common gaugino soft mass at the low-energy scale. Instead, the “universality” paradigm implies the ratios

M_1 : M_2 : M_3 ≃ 1 : 2 : 6    (1.2)

The goal of this work is to ask whether it is possible to determine that the parameter α of (1.1) is different from zero – and if so, how.

The theoretical details behind the ratios of (1.1) will be the topic of Section 2 of this paper. These details are largely irrelevant for the analysis that follows in Sections 3 and 4, but may nevertheless be of interest to many readers. For those who are interested only in the methodology we will pursue and the results, this section can be omitted. At the end of Section 2 we will present two benchmark scenarios that arise from concrete realizations of the mirage pattern of gaugino masses in certain classes of string models. As this is a paper about the interface of theory and experiment at the LHC – and not about string phenomenology per se – we will leave the theoretical description of these models to the Appendix. In Section 3 we discuss how we will go about attempting to measure the value of the parameter α in (1.1) and describe the process that led us to an ensemble of specific LHC observables targeted for precisely this purpose. In Section 4 this list of signatures is tested on a large collection of MSSM models, as well as on our two special benchmarks from Section 2. We will see that the signature lists constructed using the method of Section 3 do an excellent job of detecting the presence of non-universality in the gaugino soft masses over a very wide array of superpartner spectrum hierarchies and mass ranges. Non-universality on the order of 30-50% should become apparent within the first 10 fb⁻¹ of analyzed data for most supersymmetric models consistent with current experimental constraints. Detecting non-universality at the 10% level would require an increase in data by roughly a factor of two. Nevertheless, depending on the details of the superpartner spectrum, some cases will require far more data to truly measure the presence of non-universality. Of course all of these statements must here be understood in the context of the very particular assumptions of this study. Some thoughts on how the process can be taken further in the direction of increased realism are discussed in the concluding section.

Before moving to the body of the paper, however, we would like to take a moment to emphasize a few broad features of the theoretical motivation behind the parametrization in (1.1). In the limit of very large values for the parameter α the ratios among the gaugino masses approach those of the anomaly-mediated supersymmetry breaking (AMSB) paradigm [25, 26]. In fact, the mirage pattern is most naturally realized in scenarios in which a common contribution to all gaugino masses is balanced against an equally sizable contribution proportional to the beta-function coefficients of the three Standard Model gauge groups. Such an outcome arises in string-motivated contexts, such as KKLT-type moduli stabilization in D-brane models [27, 28] and Kähler stabilization in heterotic string models [29]. These string-derived manifestations can also be extended easily to include the presence of gauge mediation, in which case the mirage pattern is maintained in the gaugino sector [30, 31]. Importantly, however, it can arise in non-stringy models, such as deflected anomaly mediation [32, 33]. We note that in none of these cases is the pure-AMSB limit likely to be obtained, so our focus here will be on small to moderate values of the parameter α in (1.1). We will further refine these observations in Section 2 before turning our attention to the measurement of the parameter α at the LHC.

2 Theoretical Motivation and Background

In this section we wish to understand the origin of the mass ratios in (1.1) from first principles. We will treat the mirage mass pattern here in complete generality, without any reference to its possible origin from string-theoretic considerations. This short section concludes with two specific sets of soft parameters, both of which represent models with the mirage gaugino mass pattern (though the physics behind the rest of their soft supersymmetry-breaking parameters is quite different). In the Appendix we will recast the discussion of this section in terms of the degrees of freedom present in low-energy effective Lagrangians from string model building. There we will also present the string-theory origin of the two benchmark models that appear in Table 1 at the end of this section.

Let us begin by imagining a situation in which there are two contributions to the soft supersymmetry-breaking gaugino masses. We assume that these contributions arise at some effective high-energy scale at which supersymmetry breaking is transmitted from some hidden sector to the observable sector. Let us refer to this scale as simply the ultraviolet scale μ_UV. It is traditional in phenomenological treatments to take this scale to be the GUT scale at which gauge couplings unify, but in string constructions one might choose a different (possibly higher) scale at which the supergravity approximation for the effective Lagrangian becomes valid. We will further assume that one contribution to gaugino masses is universal in nature while the other contribution is proportional to the beta-function coefficient of the corresponding Standard Model gauge group. More specifically, consider the universal piece to be given by

M_a^univ = M_0 ,   a = 1, 2, 3    (2.3)

where a labels the Standard Model gauge group factors and M_0 represents some mass scale in the theory. The second piece is the so-called anomaly-mediated piece, which arises from loop diagrams involving the auxiliary scalar field of supergravity [35, 36]. It will take the form

M_a^an = (g_a²/16π²) b_a m_3/2    (2.4)

where the b_a are the beta-function coefficients for the Standard Model gauge groups. In our conventions these are given by

b_a = −( 3 C_a − Σ_i C_a^i )    (2.5)

where C_a and C_a^i are the quadratic Casimir operators for the gauge group G_a, respectively, in the adjoint representation and in the representation of the matter fields i charged under that group. For the MSSM these are

(b_1, b_2, b_3) = (33/5, 1, −3)    (2.6)

Note that if we take all three gauge couplings equal to their unified value g_U then we have

M_1^an : M_2^an : M_3^an = 33/5 : 1 : −3    (2.7)

The mass scale m_3/2 is common to all three gauge groups; the subscript is meant to indicate that the contribution in (2.4) is related to the gravitino mass. The full gaugino mass at the high-energy boundary condition scale is therefore

M_a(μ_UV) = M_0 + (g_a²(μ_UV)/16π²) b_a m_3/2    (2.8)

Now imagine evolving the boundary conditions in (2.8) to some low-energy scale μ via the (one-loop) renormalization group equations (RGEs). For the anomaly-generated piece of (2.4) we need only replace the gauge coupling with its value at the appropriate scale

M_a^an(μ) = (g_a²(μ)/16π²) b_a m_3/2    (2.9)

while for the universal piece we can use the fact that M_a/g_a² is a constant of the one-loop RGEs. After some manipulation this yields

M_a^univ(μ) = (g_a²(μ)/g_a²(μ_UV)) M_0 = M_0 [ 1 − 2 (g_a²(μ)/16π²) b_a ln(μ_UV/μ) ]    (2.10)

Combining (2.10) and (2.9) gives the low scale expression

M_a(μ) = M_0 + (g_a²(μ)/16π²) b_a [ m_3/2 − 2 M_0 ln(μ_UV/μ) ]    (2.11)

For gaugino masses to be unified at the low scale μ, the quantity in the square brackets in (2.11) must be engineered to vanish. This can be achieved with a judicious choice of the values M_0 and m_3/2 for a particular high-energy input scale μ_UV. Put differently, for a given μ_UV (such as the GUT scale) and a given overall scale M_0, there is a one-parameter family of models defined by the choice of m_3/2.

It is possible, however, to find a more convenient parametrization of the family of gaugino mass patterns defined by (2.11). Consider defining the parameter α by

α ≡ m_3/2 / [ M_0 ln(μ_UV/μ) ]    (2.12)

so that (2.11) becomes

M_a(μ) = M_0 [ 1 + (g_a²(μ)/16π²) b_a (α − 2) ln(μ_UV/μ) ]    (2.13)

and the requirement of universality at the scale μ_UV now implies α = 0. Normalizing the three gaugino masses by the overall scale M_0 and evaluating the gauge couplings at a scale μ ≈ 1 TeV we obtain the mirage ratios

M_1 : M_2 : M_3 ≃ (1 + 0.66 α) : (2 + 0.2 α) : (6 − 1.8 α)    (2.14)

for μ_UV = μ_GUT, in good agreement with the expression in (1.1).

Let us generalize the parametrization in (2.12) once more. Instead of defining the parameter α in terms of the starting and stopping points in the RG evolution of the gaugino mass parameters, we will fix it in terms of mass scales in the theory itself. Thus we follow the convention of Choi et al. [38] and define

α ≡ m_3/2 / [ M_0 ln(M_Pl/m_3/2) ]    (2.15)

where M_Pl is the reduced Planck mass M_Pl ≈ 2.4 × 10^18 GeV. Our parametrization is now divorced from the boundary-condition scales of the RG flow and can be fixed in advance. The choice of mass parameters in the logarithm of (2.15) may seem arbitrary – and at this point it is indeed completely arbitrary – but they have been chosen so as to make better contact with string constructions, such as those which we present in the Appendix. Inserting (2.15) into (2.11) yields

M_a(μ) = M_0 [ 1 + (g_a²(μ)/16π²) b_a ( α ln(M_Pl/m_3/2) − 2 ln(μ_UV/μ) ) ]    (2.16)

Comparing this expression with (2.10) it is clear that if the gauge couplings unify at the scale μ_UV = μ_GUT, then we should expect the soft supersymmetry-breaking gaugino masses to unify at an effective scale μ_mir given by

μ_mir = μ_GUT ( m_3/2 / M_Pl )^(α/2)    (2.17)

We see that our parametrization in terms of α is indeed equivalent to a parametrization in terms of the effective unification scale, as suggested in the introduction.

The value of α as defined in (2.12) or (2.15) can be crudely thought of as the ratio of the anomaly contribution to the universal contribution to the gaugino masses. Indeed, the limit α → 0 is the limit of the minimal supergravity paradigm, while α → ∞ is the AMSB limit. But as (2.8) makes clear, these two contributions will be of comparable size only if m_3/2 is at least an order of magnitude larger than M_0. We could therefore have chosen a parametrization based on the ratio m_3/2/M_0, with interesting values of this ratio being large, of order the logarithm in (2.15). But such a parametrization spoils the simple relation with the mirage unification scale (2.17). Furthermore, the introduction of the factor ln(M_Pl/m_3/2) in (2.15) provides the needed large factor, taking a value of roughly 33 for m_3/2 ≈ 10 TeV. To obtain the mirage pattern it is therefore necessary for the underlying theory to generate some large number m_3/2/M_0 ~ ln(M_Pl/m_3/2). Specific examples of how this is achieved in explicit string-based models are given in the Appendix to this paper.
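To make the α-dependence concrete, the following Python sketch evaluates the relations (2.13) and (2.17) as reconstructed above. It is an illustration only: the one-loop coupling values at 1 TeV, the choice μ_UV = 2 × 10^16 GeV, and all function names are our own assumptions, not inputs taken from this paper.

import math

# One-loop MSSM beta-function coefficients (GUT-normalized), as in (2.6)
b = {1: 33.0 / 5.0, 2: 1.0, 3: -3.0}
# Approximate gauge couplings squared at mu ~ 1 TeV (illustrative inputs)
g2 = {1: 0.21, 2: 0.42, 3: 1.17}

MU_UV = 2.0e16   # GeV, taken to be the GUT scale
MU_LOW = 1.0e3   # GeV, electroweak/TeV scale
M_PL = 2.4e18    # GeV, reduced Planck mass

def gaugino_masses(alpha, m0=1.0, mu_uv=MU_UV, mu=MU_LOW):
    """Low-scale gaugino masses M_a(mu) in units of M_0, following (2.13)."""
    big_l = math.log(mu_uv / mu)
    return {a: m0 * (1.0 + g2[a] * b[a] / (16.0 * math.pi ** 2) * (alpha - 2.0) * big_l)
            for a in (1, 2, 3)}

def mirage_scale(alpha, m32, mu_gut=MU_UV, m_pl=M_PL):
    """Mirage unification scale of (2.17)."""
    return mu_gut * (m32 / m_pl) ** (alpha / 2.0)

for alpha in (0.0, 0.3, 1.0):
    m = gaugino_masses(alpha)
    print(f"alpha = {alpha:3.1f}: M1 : M2 : M3 = 1 : {m[2]/m[1]:.2f} : {m[3]/m[1]:.2f}")

print(f"alpha = 0.3, m_3/2 = 1.5 TeV  -> mu_mir ~ {mirage_scale(0.3, 1.5e3):.1e} GeV")
print(f"alpha = 1.0, m_3/2 = 16.3 TeV -> mu_mir ~ {mirage_scale(1.0, 16.3e3):.1e} GeV")

For α = 0 the printed ratios are close to the 1 : 2 : 6 pattern of (1.2), and they compress toward one another as α grows – which is precisely the effect our signature lists are designed to detect.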

Parameter   Point A    Point B
α           0.3        1.0
m_3/2       1.5 TeV    16.3 TeV
M_1         198.7      851.6
M_2         172.1      553.3
M_3         154.6      339.1
            193.0      1309
            205.3      1084
            188.4      1248
Table 1: Soft Term Inputs. Initial values of supersymmetry-breaking soft terms in GeV (unless otherwise noted) at the initial scale μ_UV = μ_GUT. Both points are taken to have the same values of tan β and sgn(μ). The actual value of μ is fixed by the electroweak symmetry-breaking conditions.

In Table 1 we have collected the necessary soft supersymmetry-breaking parameters to completely specify two benchmark points for further analysis in what follows. The details behind these two models are described in the Appendix. Here we will simply indicate that Point A represents a heterotic string model with Kähler stabilization of the dilaton which was studied in detail in [37]. This particular example has a value of α = 0.3. Point B is an example from a class of Type IIB string compactifications with fluxes which was studied in [38]. This second example has a value of α = 1.0. Both are examples of the mirage pattern of gaugino masses, with mirage unification scales, via (2.17), of order 10^14 GeV and 10^9 GeV, respectively. Note that these soft supersymmetry-breaking terms are taken to be specified at the GUT energy scale of approximately 2 × 10^16 GeV and must be evolved to electroweak-scale energies through the renormalization group equations.

3 Determining α: Methodology

3.1 Setting Up the Problem

As mentioned in the introduction, the ultimate goal of this avenue of study is to determine whether or not soft supersymmetry-breaking gaugino masses obey some sort of universality condition, independent of all other facts about the supersymmetric model. Such a goal cannot be met in a single paper, so we have begun by asking a simpler question: assuming the world is defined by the MSSM with gaugino masses obeying the relation (1.1), how well can we determine the value of the parameter α? At the very least we would like to be able to establish that α ≠ 0 with a relatively small amount of integrated luminosity. The first step in such an incremental approach is to demonstrate that some set of “targeted observables” [12] (we will call them “signatures” in what follows) is sensitive to small changes in the value of the parameter α in a world where all other parameters which define the SUSY model are kept fixed. In subsequent work we intend to relax this strong constraint and treat the issue of gaugino mass universality more generally. Despite the lack of realism we feel this is a logical point of departure – very much in the spirit of the “slopes” of the Snowmass Points and Slopes [39] and other such benchmark studies. Thus, where the Snowmass benchmarks speak of slopes, we will here speak of “model lines” in which all parameters are kept fixed but the value of α is varied in a controlled manner.

To construct a model line we must specify the supersymmetric model in all aspects other than the gaugino sector. The MSSM is completely specified by 105 distinct parameters, but only a small subset are in any way relevant for the determination of LHC collider observables [14]. We will therefore choose a simplified set of 17 parameters, as in the two benchmark models of Table 1:

{ tan β, μ, m_A, M_3, A_t, A_b, A_τ, m_Q(1,2), m_U(1,2), m_D(1,2), m_L(1,2), m_E(1,2), m_Q3, m_U3, m_D3, m_L3, m_E3 }    (3.18)

The parameters in (3.18) are understood to be taken at the electroweak scale, so no renormalization group evolution is required. The gluino soft mass M_3 will set the overall scale for the gaugino mass sector. The other two gaugino masses M_1 and M_2 are then determined relative to M_3 via (2.14). A model line will take the inputs of (3.18) and then construct a family of theories by varying the parameter α from α = 0 (the mSUGRA limit) to some non-zero value in even increments.

For each point along the model line we pass the model parameters to PYTHIA 6.4 [40] for spectrum calculation and event generation. Events are then sent to the PGS4 [41] package to simulate the detector response. Additional details of the analysis will be presented in later sections. The end result of our procedure is a set of observable quantities that have been designed and (at least crudely) optimized so as to be effective at separating α = 0 from other points along the model line in the least amount of integrated luminosity possible. In Section 3.2 we describe the manner in which we perform this separation between models. The signature lists, and the analysis behind their construction, are presented in Section 3.3. In Section 4 we will demonstrate the effectiveness of these signature lists on a large sample of randomly generated model lines and provide some deeper insight into why the whole procedure works by examining our benchmarks in greater detail.

3.2 Distinguishability

The technique we will employ to distinguish between candidate theories using LHC observables was suggested in [12] and subsequently refined in [6]. The basic premise is to construct a variable similar to a traditional chi-square statistic

(ΔS_AB)² = (1/n) Σ_{i=1..n} (S_i^A − S_i^B)² / (δS_i^AB)²    (3.19)

where S_i is some observable quantity (or signature). The index i labels these signatures, with n being the total number of signatures considered. The labels A and B indicate two distinct theories which give rise to the signature sets {S_i^A} and {S_i^B}, respectively. Finally, the error term δS_i^AB is an appropriately-constructed measure of the uncertainty of the term in the numerator, i.e. the difference between the signatures. In this work we will always define a signature as an observation interpreted as a count (or number) and denote it with a capital S_i. One example is the number of same-sign, same-flavor lepton pairs in a certain amount of integrated luminosity. Another example is taking the invariant mass of all such pairs and forming a histogram of the results, then integrating from some minimum value to some maximum value to obtain a number. In principle there can be an infinite number of signatures defined in this manner. In practice experimentalists will consider a finite number, and many such signatures are redundant.

We can identify any signature S_i with an effective cross section σ̄_i via the relation

σ̄_i = S_i / L    (3.20)

where L is the integrated luminosity. We refer to this as an effective cross section, as it is defined by the counting signature, which contains in its definition such things as the geometric cuts that are performed on the data, the detector efficiencies, and so forth. Furthermore, these effective cross sections, whether inferred from actual data or simulated data, are subject to statistical fluctuations. As we increase the integrated luminosity we expect that this effective cross section (as inferred from the data) converges to an “exact” cross section given by

σ_i = lim_{L→∞} S_i / L    (3.21)

These exact cross sections are (at least in principle) calculable predictions of a particular theory, making them the more natural quantities to use when trying to distinguish between theories. The transformation in (3.20) allows for a comparison of two signatures with differing amounts of integrated luminosity. This will prove useful in cases where the experimental data is presented after a limited amount of integrated luminosity L_A, but the simulation being compared to the data involves a much higher integrated luminosity L_B. Using these notions we can re-express our chi-square variable in terms of the cross sections

(ΔS_AB)² = (1/n) Σ_{i=1..n} (σ̄_i^A − σ̄_i^B)² / (δσ_i^AB)²    (3.22)

We will assume that the errors associated with the signatures are purely statistical in nature and that the integrated luminosities L_A and L_B are precisely known, so that

(δσ_i^AB)² = σ̄_i^A/L_A + σ̄_i^B/L_B    (3.23)

and therefore (ΔS_AB)² is given by

(ΔS_AB)² = (1/n) Σ_{i=1..n} (σ̄_i^A − σ̄_i^B)² / ( σ̄_i^A/L_A + σ̄_i^B/L_B )    (3.24)

where each cross section includes the (common) Standard Model background, i.e. σ̄_i = σ̄_i^SUSY + σ̄_i^SM.
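As a minimal illustration of the metric (3.24), the short Python sketch below compares two hypothetical signature sets; the cross-section values and luminosities are invented placeholders, not numbers from our analysis.

import numpy as np

def delta_s_squared(sigma_a, sigma_b, lum_a, lum_b):
    """The (Delta S_AB)^2 of (3.24) from effective cross sections (fb)
    measured with integrated luminosities lum_a, lum_b (fb^-1)."""
    sigma_a = np.asarray(sigma_a, dtype=float)
    sigma_b = np.asarray(sigma_b, dtype=float)
    variance = sigma_a / lum_a + sigma_b / lum_b   # statistical errors only, per (3.23)
    return np.mean((sigma_a - sigma_b) ** 2 / variance)

# Three signatures, each including the common Standard Model background
sig_a = [120.0, 45.0, 8.0]    # model A, effective cross sections in fb
sig_b = [110.0, 52.0, 6.5]    # model B
print(delta_s_squared(sig_a, sig_b, lum_a=10.0, lum_b=100.0))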

The variable (ΔS_AB)² forms a measure of the distance between any two theories in the space of signatures defined by the S_i. We can use this metric on signature space to answer the following question: how far apart should two sets of signatures {S_i^A} and {S_i^B} be before we conclude that theories A and B are truly distinct? The original criterion used in [6] was as follows. Imagine taking any supersymmetric theory and performing a collider simulation. Now choose a new random number seed and repeat the simulation. Due to random fluctuations we expect that even the same set of input parameters, after simulation and event reconstruction, will produce a slightly different set of signatures. That is, we expect (ΔS_AA)² ≠ 0, since it involves the effective cross sections as extracted from the simulated data. Now repeat the simulation a large number of times, each with a different random number seed. Use (3.24) to compute the distance of each new simulation from the original simulation in signature space. The set of all (ΔS_AA)² values so constructed will form a distribution. Find the value of (ΔS_AA)² in this distribution which represents the 95th percentile. This might be taken as a measure of the uncertainty in “distance” measurements associated with statistical fluctuations.

This procedure for defining distinguishability is unwieldy in a number of respects. Determining the threshold for separating models in this way is computationally intensive, as it requires many repeated simulations of the same model (as well as of the Standard Model background). More importantly, the “brute force” determination of the threshold is particular to model A as well as to the list of signatures used in (3.24). Each change in either the model parameters or the signature mix demands a new determination of the threshold for distinguishability. We will therefore propose a new criterion that has the benefit of being analytically calculable, with a form that is universal to any pair of models and any set of signatures.

To do that let us reconsider the statistical fluctuations. At a finite integrated luminosity L we can describe the outcome of a counting experiment as a Poisson distribution approximated by a normal distribution (a good approximation for approximately 10 counts or more), which can be expressed as

S = L σ + (L σ)^(1/2) x    (3.25)

Here x is a standard random variable, i.e. a random variable having a normal distribution centered at 0 with a standard deviation of 1. Note that by introducing statistical fluctuations via the variable x we can write (3.25) directly in terms of the exact cross section. Equation (3.25) then merely states the well-known fact that the distribution in measured values should form a normal distribution about the value L σ. To combine two such distributions S_1 and S_2 we may write

S_1 + S_2 = L (σ_1 + σ_2) + [ L (σ_1 + σ_2) ]^(1/2) x    (3.26)

where x is a new standard random variable and σ = σ_1 + σ_2 is the total cross section. For example, σ_1 might be the contribution to a particular final state arising from Standard Model processes while σ_2 might be the contribution arising from production of supersymmetric particles.

With the above in mind we can revisit the definition (3.24) and obtain an analytic approximation for the distribution in (ΔS_AB)² values by using random variables to represent the signatures. The measured cross sections can be related to the exact cross sections via

σ̄_i^A = σ_i^A + ( σ_i^A / L_A )^(1/2) x_i^A    (3.27)

with a similar expression for the model B. Substituting (3.27) into (3.24) gives

(ΔS_AB)² ≃ (1/n) Σ_{i=1..n} ( z_i + λ_i^(1/2) )² ,   λ_i ≡ (σ_i^A − σ_i^B)² / ( σ_i^A/L_A + σ_i^B/L_B )    (3.28)

where we have combined x_i^A and x_i^B into the standard random variables z_i, and have assumed that the event counts are sufficiently large to be able to neglect the terms quadratic in the fluctuations. In this limit we immediately see that (ΔS_AB)² is itself a random variable, with a probability distribution for the quantity n (ΔS_AB)² given by

P( n (ΔS_AB)² ) = χ′²_n( n (ΔS_AB)² ; λ )    (3.29)

where χ′²_n is the non-central chi-squared distribution for n degrees of freedom. The non-centrality parameter λ is given by

λ = Σ_{i=1..n} λ_i = Σ_{i=1..n} (σ_i^A − σ_i^B)² / ( σ_i^A/L_A + σ_i^B/L_B )    (3.30)

and now the σ_i represent exact cross sections. This is actually the result we expect, since the original expression in (3.24) is essentially a chi-square-like function. Note that since the σ_i in the distribution of (3.28) are exact, we have the anticipated result that fluctuations of the quantity n (ΔS_AA)² should be given by the central chi-square distribution χ²_n. We note, however, that the derivation of (3.28) implicitly assumed that the signatures which we consider are uncorrelated – or more precisely, that the fluctuations in these signatures are uncorrelated. We will have more to say about signature correlations in Section 3.3 below. We now have a measure of separation in signature space that is related to well-known functions in probability theory.

Figure 1: Plot of the distribution in (ΔS)² values. The top panel plots the probability distribution function (3.29) for λ = 0 and n = 5 and 10. The lower panel plots the cumulative distribution function – the absolute probability for obtaining that value of (ΔS)². The 95% threshold is indicated by the horizontal lines, and the corresponding critical values of (ΔS)² are indicated by the marked values on the horizontal axis.

Armed with this technology, let us return to the issue of distinguishing a model from itself. From (3.28), (3.29) and (3.30) it is apparent that all the physics behind the distribution of possible (ΔS_AB)² values is contained in the values of n and λ. In particular, the distribution of possible (ΔS_AA)² values (a central chi-square distribution) should depend only on the number of signatures considered – not on the model point nor on the nature of those signatures. When comparing a model with itself we can therefore dispense with the subscripts and write (ΔS)². We plot the probability distribution of (3.29) for λ = 0 and various values of n in the top panel of Figure 1. We have also plotted the cumulative distribution function for the same values in the lower panel of Figure 1. To rule out the null hypothesis (i.e. the hypothesis that models A and B are in fact the same model) to a level of confidence p requires demanding that (ΔS_AB)² be larger than the p-th percentile value of the distribution (3.29) for the appropriate n value. For example, if we use the p = 95% criterion from [6], then for n = 5 signatures this threshold is 2.21. We have indicated this value for the cumulative distribution function by the horizontal dashed line in Figure 1. In general we will denote this particular value of (ΔS)² for each value of n by the symbol (ΔS)²_p. It can be found via the cumulative distribution function as in Figure 1, or by numerically solving the equation

p = γ( n/2 , n (ΔS)²_p / 2 ) / Γ( n/2 )    (3.31)

where Γ is Euler's gamma function and γ is the (lower) incomplete gamma function. A summary of these values for smaller values of n is given in Table 2. If we measure our n signatures, extract the cross sections, and form (ΔS_AB)², and the number is greater than (ΔS)²_p, then we can say that the null hypothesis is ruled out at a level of confidence given by p × 100%. The value of this critical (ΔS)²_p is a universal number determined only by our choice of p value and the number of signatures n that we choose to consider.
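The critical values in Table 2 follow directly from (3.31) and can be reproduced with any implementation of the central chi-square quantile function; for instance, in Python with SciPy (a sketch, assuming SciPy's standard scipy.stats interface):

from scipy.stats import chi2

# (Delta S)^2_p = chi2.ppf(p, n) / n, equivalent to solving (3.31); cf. Table 2
for n in range(1, 11):
    row = [chi2.ppf(p, n) / n for p in (0.95, 0.975, 0.99, 0.999)]
    print(n, "  ".join(f"{v:5.2f}" for v in row))

Running this reproduces the entries of Table 2, e.g. 3.84 for n = 1 at p = 0.95 and 1.83 for n = 10.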

Confidence Level
n 0.95 0.975 0.99 0.999
1 3.84 5.02 6.64 10.83
2 3.00 3.69 4.61 6.91
3 2.61 3.12 3.78 5.42
4 2.37 2.79 3.32 4.62
5 2.21 2.57 3.02 4.10
6 2.10 2.41 2.80 3.74
7 2.01 2.29 2.64 3.48
8 1.94 2.19 2.51 3.27
9 1.88 2.11 2.41 3.10
10 1.83 2.05 2.32 2.96
Table 2: List of (ΔS)²_p values for various values of the parameters n and p. The value (ΔS)²_p represents the position of the p-th percentile in the distribution of (ΔS)² for any list of n signatures. For example, if we consider a list of 10 signatures, then the quantity (ΔS_AB)² formed by these ten measurements must be larger than 1.83 to say that models A and B are distinct with 95% confidence. If we demand 99% confidence this threshold becomes 2.32.

If, however, our measurement gives (ΔS_AB)² < (ΔS)²_p then we cannot say the two models are distinct, at least not at the confidence level p. But they may still be separate models: we may simply have been unfortunate, with statistical fluctuations producing a small value of (ΔS_AB)². If we accumulate more data and measure again, we may find a different result. To quantify the probability that two different models A and B will give a particular value of (ΔS_AB)² requires the use of the non-central chi-square distribution in (3.29). The degree of non-centrality is given by the quantity λ in (3.30). Clearly, the more distinct the predictions {σ_i^A} and {σ_i^B} are from one another, the larger this number will be. In Figure 2 we plot the distribution for a fixed number of signatures n and several values of λ. As expected, the larger this parameter is, the more likely we are to find large values of (ΔS_AB)².

Figure 2: Plot of the distribution in (ΔS_AB)² values for fixed n and various λ. The probability distribution function (3.29) is plotted for several values of the non-centrality parameter, up to λ = 35. The curves are normalized such that the total area under each distribution remains unity. Note that the peak in the distribution moves to larger values of (ΔS_AB)² as the non-centrality parameter λ is increased.

Let us assume for the moment that “model A” is the experimental data, which corresponds to an integrated luminosity of L_A. Our “model B” can then be a simulation with integrated luminosity L_B. We might imagine that L_B can be arbitrarily large, limited only by computational resources. We can then rewrite (3.30) as

λ = L_A Σ_{i=1..n} (σ_i^A − σ_i^B)² / ( σ_i^A + (L_A/L_B) σ_i^B )    (3.32)

From this expression it is clear that we can expect the value of this parameter to increase as experimental data is collected. The larger the value of λ, the less likely it becomes to find a particularly small value of (ΔS_AB)². This confirms our basic intuition that, given any observable (or set of observables) for which the two models predict different values, with sufficient integrated luminosity it should always be possible to distinguish the models to an arbitrary degree of confidence.

For any given value of λ, the probability that a measurement of (ΔS_AB)² will fluctuate to a value so small that it is not possible to separate two distinct models (to confidence level p) is simply the fraction of the probability distribution in (3.29) that lies to the left of the value n (ΔS)²_p. If we wish to be at least 95% certain that our measurements will correctly recognize that two different models are indeed distinct we must require

∫_0^{n (ΔS)²_p} χ′²_n( x ; λ ) dx ≤ 0.05    (3.33)

Since the value of the integral in (3.33) decreases monotonically as λ increases, the value of this parameter which makes (3.33) an equality is the minimum non-centrality value λ_min such that the two models can be distinguished.
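Numerically, λ_min is found by imposing (3.33) as an equality. A small SciPy sketch (our own illustration, using the non-central chi-square CDF and a root finder):

from scipy.optimize import brentq
from scipy.stats import chi2, ncx2

def lambda_min(n, p, power=0.95):
    """Smallest non-centrality such that a fraction `power` of the
    non-central distribution lies above the critical value from (3.31)."""
    crit = chi2.ppf(p, n)          # this is n * (Delta S)^2_p
    return brentq(lambda lam: ncx2.cdf(crit, n, lam) - (1.0 - power), 1e-6, 200.0)

for n in range(1, 11):
    row = [lambda_min(n, p) for p in (0.95, 0.975, 0.99, 0.999)]
    print(n, "  ".join(f"{v:6.2f}" for v in row))

The output matches Table 3 below, e.g. λ_min = 12.99 for n = 1 at p = 0.95.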

Confidence Level
n 0.95 0.975 0.99 0.999
1 12.99 17.65 24.03 40.71
2 15.44 20.55 27.41 44.99
3 17.17 22.60 29.83 48.10
4 18.57 24.27 31.79 50.66
5 19.78 25.71 33.50 52.88
6 20.86 26.99 35.02 54.88
7 21.84 28.16 36.41 56.71
8 22.74 29.25 37.69 58.40
9 23.59 30.26 38.89 59.99
10 24.39 31.21 40.02 61.48
Table 3: List of λ_min values for various values of the parameters n and p. A distribution such as those in Figure 2 with λ = λ_min will have precisely the fraction 0.95 of its total area at larger values of n (ΔS)² than the corresponding critical value (ΔS)²_p from Table 2. A graphical example of this statement is shown in Figure 3.

In other words, for two distinct models A and B, any combination of experimental signatures such that λ ≥ λ_min will be effective in demonstrating that the two models are indeed different 95% of the time, with a confidence level of 95%. We have successfully reduced the problem to an exercise in pure mathematics, as these values can be calculated analytically without regard to the physics involved. A collection of λ_min values for small values of n is given in Table 3. Note that as we increase n the necessary λ_min value increases, reflecting the fact that as more observations are made it becomes increasingly likely that at least one will show a large deviation purely by chance. Indeed, the quantity λ^(1/2) can be thought of as a measure of the overall distance between the two models in the n-dimensional signature space, in units of the variances. As an example, again consider the case n = 3. For this value of n the corresponding critical value (ΔS)²_p = 2.61 can be found from Table 2, while we find λ_min = 17.17 from Table 3. We plot the distributions (3.29) for λ = 0 and λ = λ_min simultaneously in Figure 3. By construction, the area of the non-central distribution to the left of the indicated critical value will be precisely 5% of the total area.

Figure 3: Determination of λ_min for the case n = 3. The plot shows an example of the distribution of (ΔS)² for n = 3. The curve on the left represents the λ = 0 case, i.e. the values we will get when we compare a model to itself; 95% of the possible outcomes of this comparison lie below 2.61, which is marked on the plot. The curve on the right has λ = λ_min and 95% of its area lies beyond 2.61. As λ increases, this curve moves further to the right and becomes flatter.

Having reached the end of our somewhat lengthy digression on probability theory, we now return to the physics issue at hand. The requirement that λ ≥ λ_min can be translated into a condition on the signature set and/or luminosity via the definition in (3.32). Let us make one final notational definition

ΔΣ_AB ≡ Σ_{i=1..n} (σ_i^A − σ_i^B)² / ( σ_i^A + (L_A/L_B) σ_i^B )    (3.34)

where ΔΣ_AB has the units of a cross section. Our condition for 95% certainty that we will be able to separate two truly distinct models at the 95% confidence level becomes

L_A ΔΣ_AB ≥ λ_min    (3.35)

Given two models A and B and a selection of n signatures, both λ_min and ΔΣ_AB are completely determined. Therefore the minimum amount of integrated luminosity needed to separate the models experimentally will be given by

L_min = λ_min / ΔΣ_AB    (3.36)

We will be using (3.36) repeatedly throughout the rest of this paper. A well-chosen set of signatures will be one that makes the resulting value of L_min determined from (3.36) as small as it can possibly be.
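In code, (3.34)–(3.36) amount to a few lines. The sketch below takes the L_B → ∞ limit by default; the cross sections are hypothetical placeholders, and λ_min = 17.17 is the n = 3, p = 0.95 entry of Table 3.

def delta_sigma(sigma_a, sigma_b, lum_ratio=0.0):
    """The separation cross section of (3.34), in fb.
    lum_ratio = L_A/L_B; use 0.0 for the idealized L_B -> infinity limit."""
    return sum((sa - sb) ** 2 / (sa + lum_ratio * sb)
               for sa, sb in zip(sigma_a, sigma_b))

def l_min(sigma_a, sigma_b, lam_min):
    """Minimum integrated luminosity (fb^-1) of (3.36)."""
    return lam_min / delta_sigma(sigma_a, sigma_b)

print(l_min([120.0, 45.0, 8.0], [110.0, 52.0, 6.5], lam_min=17.17))  # ~7.8 fb^-1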

3.3 Specific Signature Choice

Following the discussion in Section 3.2 we are in a position to define the goal behind our signature selection more precisely. We wish to select a set of n signatures such that the quantity L_min defined in (3.36), for a given value of λ_min, is as small as it can possibly be over the widest possible array of model pairs A and B. We must also do our best to ensure that the n signatures we choose to consider are reasonably uncorrelated with one another, so that the statistical treatment of the preceding section is applicable. We will address the latter issue below, but let us first turn our attention to the matter of optimizing the signature list.

We took as our starting point an extremely large initial set of possible signatures. These included all the counting signatures and most of the kinematic distributions used in [6], all of the signatures of [42], several “classic” observables common in the literature [43], and several more which we constructed ourselves. Removing redundant instances of the same signature, this yielded 46 independent counting signatures and 82 kinematic distributions represented by histograms: 128 signatures in total. We might naively think that the best strategy is to include all of these signatures in the analysis (neglecting for now the issue of possible correlations among them). In fact, if the goal is statistically separating two models, the optimal strategy is generally to choose a rather small subset of the total signatures. Let us understand why that is the case. To do so we need a quantitative way of establishing an absolute measure of the “power” of any given signature to separate two models A and B. This can be provided by considering the condition in (3.36). For any signature i we can define an individual L^i_min by

L^i_min = λ_min^(1) / ΔΣ^i_AB ,   ΔΣ^i_AB ≡ (σ_i^A − σ_i^B)² / ( σ_i^A + (L_A/L_B) σ_i^B )    (3.37)

where, for example, λ_min^(1) = 12.99 for p = 0.95. This quantity is exactly the integrated luminosity required to separate models A and B, to confidence level p, by using the single observable i. For a list of N signatures it is possible to construct N such values and order them from the smallest value (most powerful) to the largest value (least powerful). If we take any subset of n of these, then the requisite L_min that results from considering all n simultaneously is given by

L_min = λ_min^(n) / Σ_{i=1..n} ΔΣ^i_AB = ( λ_min^(n) / λ_min^(1) ) [ Σ_{i=1..n} 1/L^i_min ]^(−1)    (3.38)

Referring back to Table 3 we see that the ratio λ_min^(n)/λ_min^(1) grows with n. This indicates that as we add signatures with ever-diminishing values of ΔΣ^i_AB we will eventually encounter a point of negative returns, where the resulting overall L_min starts to grow again.

Figure 4: An example of finding an “optimal” signature list. By ordering the calculated L^i_min values for any particular pair of models in ascending order, it is always possible to find the optimal set of signatures for that pair by applying (3.38). In this particular example the minimum value of L_min is found after combining just the first 12 signatures. After just the best six signatures we are already within 20% of the optimal value, as indicated by the shaded band.

As more signatures are added, the threshold for adding the next signature in the list becomes steadily more stringent. For a particular pair of models A and B, it is always possible to find the optimal list of signatures from among a given grand set by ordering the resulting L^i_min values and adding signatures sequentially until a minimum of L_min is observed. To do so, we note that kinematic distributions must be converted into counts (and all counts are then converted into effective cross sections). This conversion requires specifying an integration range for each histogram. The choice of this range can itself be optimized, by considering each integration range as a separate signature and choosing the values such that L^i_min is minimized.

Figure 4, based on an actual pair of models from one of our model lines, represents the outcome of just such an optimization procedure. In this case a clearly optimal signature set is given by the 12 signatures represented by the circled point, which yields the smallest value of L_min. The situation in Figure 4 is typical of the many examples we studied: the optimal signature set usually consisted of roughly a dozen signatures. If we are willing to settle for a luminosity just 20% higher than this minimal value then we typically need only about six signatures. This 20% range is indicated by the shaded band in Figure 4. Of course this “optimal” set of signatures is only optimal for the specific pair of models A and B. We must repeat this optimization procedure on a large collection of model pairs and form a suitable average of the results in order to find a set of signatures that best approximates the truly optimal set over the widest possible set of model pairs. The lists we will present at the end of this section represent the results of just such a procedure.
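The sequential procedure just described can be sketched as follows; the per-signature ΔΣ^i values are invented for illustration, and the λ_min column is the p = 0.95 column of Table 3.

def optimal_signature_set(delta_sigmas, lam_min_table):
    """Greedy application of (3.38): sort signatures by power, add them
    one at a time, and keep the combination minimizing L_min."""
    ranked = sorted(delta_sigmas, reverse=True)       # most powerful first
    best_lum, best_n, total = float("inf"), 0, 0.0
    for n, ds in enumerate(ranked, start=1):
        total += ds
        lum = lam_min_table[n] / total                # combined L_min of (3.38)
        if lum < best_lum:
            best_lum, best_n = lum, n
    return best_n, best_lum

ds_values = [2.0, 1.5, 1.1, 0.6, 0.3, 0.15, 0.08, 0.04, 0.02, 0.01]  # fb, hypothetical
lam_min = dict(enumerate(
    [12.99, 15.44, 17.17, 18.57, 19.78, 20.86, 21.84, 22.74, 23.59, 24.39], start=1))
print(optimal_signature_set(ds_values, lam_min))      # here the minimum occurs at n = 4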

But before we present them, we must address the issue of correlations. To be able to use the analytic results of our statistical treatment of the problem in Section 3.2, we must be careful to choose signatures only from a list in which all the members are uncorrelated with one another. This immediately suggests a dilemma: once a signature is chosen, many others in the grand set will now be excluded for being correlated with the first. This complicates the process of optimization considerably – the task now becomes to perform the above optimization procedure over the largest possible list of uncorrelated (or at least minimally correlated) signatures. To find the correlation between any two signatures i and j it is sufficient to construct their correlation coefficient ρ_ij, given by

ρ_ij = Σ_k ( σ_i^(k) − ⟨σ_i⟩ )( σ_j^(k) − ⟨σ_j⟩ ) / [ Σ_k ( σ_i^(k) − ⟨σ_i⟩ )² Σ_k ( σ_j^(k) − ⟨σ_j⟩ )² ]^(1/2)    (3.39)

where the σ_i^(k) represent the individual results obtained from each of the repeated cross-section measurements, labeled by the index k, and ⟨σ_i⟩ denotes their mean.

In our analysis we estimated the entries in the 128 × 128 dimensional matrix ρ_ij of (3.39) in the following crude manner. We began with a simple MSSM model specified by a parameter set as in (3.18), with gaugino masses having the unified ratios of (1.2). We simulated this model a large number of times, each time with a different random number seed. Each simulation involved generating 5 fb⁻¹ of events using PYTHIA 6.4, which were passed to the detector simulator PGS4. After simulating the detector response and object reconstruction, the default level-one triggers included in the PGS4 detector simulation were applied. Further object-level cuts were then performed, as summarized in Table 4. After these object-specific cuts we applied an event-level cut on the surviving detector objects similar to those used in [6]. Specifically, we required all events to satisfy minimum requirements on the missing transverse energy and the transverse sphericity, as well as on the overall effective mass of the event (400 GeV for events with 2 or more leptons). Once all cuts were applied, the grand list of 128 signatures was computed for each run, and from these signatures the correlation matrix in (3.39) was constructed. All histograms and counting signatures were constructed and analyzed using the ROOT-based analysis package Parvicursor [44].
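The estimate just described reduces to forming the correlation matrix of the signature vector over the repeated runs. A NumPy sketch, with Gaussian toy data standing in for the actual PYTHIA/PGS4 pseudo-experiments:

import numpy as np

rng = np.random.default_rng(0)

# Hypothetical ensemble: one row of 128 extracted cross sections (fb) per run
n_runs, n_sig = 50, 128
sigma = rng.normal(loc=100.0, scale=5.0, size=(n_runs, n_sig))

rho = np.corrcoef(sigma, rowvar=False)   # the 128 x 128 matrix rho_ij of (3.39)
print(rho.shape)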

Object     Minimum p_T   Maximum |η|
Photon     20 GeV        2.0
Electron   20 GeV        2.0
Muon       20 GeV        2.0
Tau        20 GeV        2.4
Jet        50 GeV        3.0
Table 4: Initial cuts to keep an object in the event record. After event reconstruction using the package PGS4 we apply additional cuts to the individual objects in the event record. Detector objects that fail to meet the above criteria are removed from the event record and do not enter our signature analysis. These cuts are applied in all analyses described in this paper.

Not surprisingly, many of the signatures considered in our grand list of 128 observables were highly correlated with one another. For example, the distribution of transverse momenta for the hardest jet in any event was correlated with the overall effective mass of the jets in the event (defined as the scalar sum of all jet p_T values: M_eff^jets = Σ_jets p_T). Both were correlated with the distribution of the all-object effective mass defined in (3.42) below, and so forth. The consistency of our approach would then require that only a subset of these signatures be included. One way to eliminate correlations is to partition the experimental data into mutually exclusive subsets through some topological criteria, such as the number of jets and/or leptons. For example, the distribution of M_eff^jets values in the set having any number of jets and zero leptons will be uncorrelated with the same signature in the set having any number of jets and at least one lepton. Our analysis indicated that this partitioning strategy has its limitations, however. The resolving power of any given signature tends to diminish as the set it is applied to is made ever more exclusive. This is in part due to the diminishing cross section associated with the more exclusive final state (recall that our metric for evaluating signatures is proportional to the cross section). It is also the case that the statistical error associated with extracting these cross-section values from the counts will grow as the number of events drops. We were thus led to consider a very simple two-fold partitioning of the data in both lepton and jet multiplicity:

{ n_lep = 0 or n_lep ≥ 1 } × { n_jet ≤ 4 or n_jet ≥ 5 }    (3.40)

This choice of data partitioning is reflected in the signature tables at the end of this section.
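Operationally the partition of (3.40) is a two-bit label per event; a trivial sketch, with assumed field names and labels of our own choosing:

def partition(n_leptons, n_jets):
    """Assign an event to one of the four disjoint subsets of (3.40)."""
    lep = "0lep" if n_leptons == 0 else "ge1lep"
    jet = "le4jet" if n_jets <= 4 else "ge5jet"
    return lep + "_" + jet

print(partition(1, 6))   # -> "ge1lep_ge5jet"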

Within each of the four subsets it is still necessary to perform a correlation analysis and construct the matrix in (3.39). Let us for the moment imagine that we are willing to tolerate a correlation among signatures given by some value ρ_cut. Then the matrix of correlations in (3.39) can be converted into a matrix A_ij, which defines the uncorrelated signatures, by assigning the values

A_ij = 1 if |ρ_ij| ≤ ρ_cut ;   A_ij = 0 otherwise    (3.41)

The matrix A_ij is actually the adjacency matrix of a graph, and the problem of finding all the possible sets of uncorrelated signatures is equivalent to finding all the complete subgraphs (or “cliques”) of that graph. A complete graph is a graph which has an edge between each pair of vertices. In terms of our problem, this means a set of signatures having at most a correlation at the level of ρ_cut between any two of them. This is a well-known problem in combinatorics that becomes exponentially more difficult to solve as the number of signatures increases. For our purposes we will be working with relatively small sets of signatures which were pre-selected on the basis of their effectiveness for separating α = 0 from non-zero values of this parameter. From these sets we then proceed to build the maximal complete subgraph for our choice of allowed correlation ρ_cut.
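For small signature lists this can be done by brute force; in Python the graph of (3.41) and its maximal cliques can be obtained with the networkx package (a sketch with a toy 4 × 4 correlation matrix):

import networkx as nx
import numpy as np

def uncorrelated_sets(rho, rho_cut=0.10):
    """Build the graph of (3.41): vertices are signatures, with an edge
    whenever |rho_ij| <= rho_cut. Return all maximal cliques."""
    n = rho.shape[0]
    graph = nx.Graph()
    graph.add_nodes_from(range(n))
    for i in range(n):
        for j in range(i + 1, n):
            if abs(rho[i, j]) <= rho_cut:
                graph.add_edge(i, j)
    return list(nx.find_cliques(graph))

rho = np.array([[1.00, 0.05, 0.80, 0.02],
                [0.05, 1.00, 0.30, 0.04],
                [0.80, 0.30, 1.00, 0.60],
                [0.02, 0.04, 0.60, 1.00]])
print(uncorrelated_sets(rho))   # signatures {0, 1, 3} form a mutually uncorrelated set

The 10% and 30% correlation tolerances used for Signature Lists B and C below correspond to rho_cut = 0.10 and 0.30 here.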

   Description                                    Min Value   Max Value
1  M_eff^all = Σ p_T + E_T^miss  [All events]     1250 GeV    End
Table 5: Signature List A. The effective mass formed from the transverse momenta of all objects in the event (including the missing transverse energy) was the single most effective signature of the 128 signatures we investigated. Since this “list” is a single item it was not necessary to partition the data in any way. For this distribution we integrate from the minimum value of 1250 GeV to the end of the distribution.

We constructed a large number of model families in the manner described in Section 3.1, each involving a range of values for the parameter α, varied in even steps. For each point along these model lines we generated 100,000 events using PYTHIA 6.4 and PGS4. To this we added an appropriately-weighted Standard Model background sample consisting of 5 fb⁻¹ each of t t̄ and b b̄ pair production, high-p_T QCD dijet production, single W- and Z-boson production, pair production of electroweak gauge bosons (WW, WZ and ZZ), and Drell-Yan processes. To examine which of our 128 signatures would be effective in measuring the value of the parameter α, we fixed “model A” to be the point on each of the model lines with α = 0 and then treated each point along the line with α ≠ 0 as a candidate “model B.” Clearly each model line we investigated – and each α value along that line – gave slightly different sets of maximally effective signatures. The lists we will present in Tables 5, 6 and 7 represent an ensemble average over these model lines, restricted to a maximum pairwise correlation ρ_cut as described above.

   Description                                         Min Value   Max Value
1  M_eff^jets  [0 leptons, ≤ 4 jets]                   1100 GeV    End
2  M_eff^jets  [0 leptons, ≥ 5 jets]                   1450 GeV    End
3  M_eff^all  [≥ 1 leptons, ≥ 5 jets]                  1550 GeV    End
4  p_T (Hardest Lepton)  [≥ 1 lepton, ≥ 5 jets]        150 GeV     End
5  M_inv (jets)  [0 leptons, ≤ 4 jets]                 0 GeV       850 GeV
Table 6: Signature List B. The collection of our most effective observables, restricted to the case where the maximum correlation between any two of these signatures is 10%. Note that the jet-based effective mass variables would normally be highly correlated had we not partitioned the data according to (3.40). For these distributions we integrate from “Min Value” to “Max Value”.

Let us begin with Table 5, which gives the single most effective signature for separating models with different values of the parameter α. It is the effective mass formed from all objects in the event,

M_eff^all = E_T^miss + Σ_objects p_T    (3.42)

where we form the distribution from all events which pass our initial cuts. That this one signature should be the most powerful is not a surprise, given the way we have set up the problem. It is the most inclusive possible signature one can imagine (apart from the overall event rate itself) and therefore has the largest overall cross section. Furthermore, the variable in (3.42) is sensitive to the mass differences between the gluino and the lighter electroweak gauginos – precisely the quantity that is governed by the parameter α. Yet as we will see in Section 4 this one signature can often fail to be effective at all in certain circumstances, resulting in a rather large L_min being required to separate α = 0 from non-vanishing cases. In addition it is built from precisely the detector objects that suffer the most from experimental uncertainty. This suggests a larger and more varied set of signatures would be preferable.
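A sketch of the observable in (3.42), with an assumed event representation (missing E_T plus a flat list of reconstructed objects):

def m_eff_all(met, objects):
    """All-object effective mass of (3.42): missing transverse energy
    plus the scalar sum of the p_T of every reconstructed object."""
    return met + sum(obj["pt"] for obj in objects)

# Hypothetical event (GeV): two hard jets, one lepton, one photon
event = {"met": 310.0,
         "objects": [{"pt": 420.0}, {"pt": 280.0}, {"pt": 95.0}, {"pt": 60.0}]}
print(m_eff_all(event["met"], event["objects"]))   # 1165.0 -> below a 1250 GeV cut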

We next consider the five signatures in Table 6. These signatures were chosen by taking our most effective observables and restricting ourselves to the set for which the maximum pairwise correlation is 10%. We again see the totally inclusive effective mass variable of (3.42), as well as the more traditional effective mass variable M_eff^jets, defined via (3.42) but with the scalar sum of p_T values now running over the jets only. We now include the p_T of the hardest lepton in events with at least one lepton and five or more jets, as well as the invariant mass of the jets in events with zero leptons and four or fewer jets. The various jet-based effective mass variables would normally be highly correlated with one another if we were not forming them from disjoint partitions of the overall data set. The favoring of jet-based observables over those based on leptons is again largely due to the fact that jet-based signatures will have larger effective cross sections, for reasonable values of the SUSY parameters in (3.18), than leptonic signatures. The best signatures are those which track the narrowing gap between the gluino mass and the electroweak gauginos, and the narrowing gap between the lightest chargino/second-lightest neutralino mass and the LSP mass. In this case the first leptonic signature to appear – the transverse momentum of the leading lepton in events with at least one lepton – is an example of just such a signature.

    Description                                        Min Value   Max Value
Counting Signatures
1   N_events  [≥ 1 leptons, ≤ 4 jets]
2   N(ℓ⁺ℓ⁻ pairs with |m_ℓℓ − M_Z| ≤ 5 GeV)
3   N(events with ≥ 2 B-jets)
[0 leptons, ≤ 4 jets]
4                                                      1000 GeV    End
5                                                      750 GeV     End
6                                                      500 GeV     End
[0 leptons, ≥ 5 jets]
7                                                      1250 GeV    3500 GeV
8   p_T^(3) / p_T^(1)  [3 jets ≥ 200 GeV]              0.25        1.0
9   p_T (4th Hardest Jet)                              125 GeV     End
10  E_T^miss / M_eff                                   0.0         0.25
[≥ 1 leptons, ≥ 5 jets]
11  E_T^miss / M_eff                                   0.0         0.25
12  p_T (Hardest Lepton)                               150 GeV     End
13  p_T (4th Hardest Jet)                              125 GeV     End
14  M_eff^jets + E_T^miss                              1250 GeV    End
Table 7: Signature List C. In this collection of signatures we have allowed the maximum correlation between any two signatures to be as high as 30%. Note that some of the signatures are normalized (#8, #10 and #11), while the first three are true counting signatures. A description of each of these observables is given in the text. For all distributions we integrate from “Min Value” to “Max Value”.

Finally, let us consider the larger ensemble of signatures in Table 7. In this final set we have relaxed our concern over the issue of correlated signatures, allowing as much as 30% correlation between any two signatures in the list. This allows for a larger number, as well as a wider variety, of observables to be included. As we will see in Section 4 this can be very important in some cases in which the supersymmetric model has unusual properties, or in cases where the two α values being considered give rise to different mass orderings (or hierarchies) in the superpartner spectrum. In displaying the signatures in Table 7 we find it convenient to group them according to the partition of the data being considered. Note that the counting signatures are taken over the entire data set.

The first counting signature is simply the total size of the partition from (3.40) in which the events have at least one lepton and four or fewer jets. This was the only observable taken on this data set that made our list of the most effective observables. The next two signatures are related to “spoiler” modes for the trilepton signal. Note that the trilepton signal itself did not make the list: it is a wonderful discovery mode for supersymmetry, but the event rates between a model with α = 0 and one with non-vanishing α were always very similar (and low). This made the trilepton counting signature ineffective at distinguishing between models. By contrast, counting the number of b-jet pairs (a proxy for counting on-shell Higgs bosons) or the number of opposite-sign electron or muon pairs whose invariant mass was within 5 GeV of the Z-mass (a proxy for counting on-shell Z-bosons) proved to be excellent signatures for separating models from time to time. This was especially true when the two models in question had very different values of α, such that the mass difference between the heavier neutralinos and the LSP was quite different in the two cases. We will give specific examples of such outcomes in Section 4.

The following three sections of Table 7 involve some of the same types of observables as in the previous tables, with a few notable changes and surprises. First note that several of the observables in Table 7 involve some sort of normalization – in particular numbers 8, 10 and 11. Our estimate of the correlations among signatures found that the fluctuations of these normalized signatures tended to be less correlated with the other observables for their partition than the un-normalized quantities. However, normalizing signatures in this way also tended to reduce their ability to distinguish models. Signature #8 is defined as the following ratio

r ≡ p_T^(3) / p_T^(1)    (3.43)

where p_T^(i) is the transverse momentum of the i-th hardest jet in the event. For this signature we require that there be at least three jets with p_T ≥ 200 GeV. This signature, like the p_T of the hardest lepton or the