Search for excited quarks of light and heavy flavor in $\gamma+\text{jet}$ final states in proton-proton collisions at $\sqrt{s}=13$ TeV


A search is presented for excited quarks of light and heavy flavor that decay to $\gamma+\text{jet}$ final states. The analysis is based on data corresponding to an integrated luminosity of 35.9 fb$^{-1}$ collected by the CMS experiment in proton-proton collisions at $\sqrt{s}=13$ TeV at the LHC. A signal would appear as a resonant contribution to the invariant mass spectrum of the $\gamma+\text{jet}$ system, above the background expected from standard model processes. No resonant excess is found, and upper limits are set on the product of the excited quark cross section and its branching fraction as a function of its mass. These are the most stringent limits to date in the $\gamma+\text{jet}$ final state, and exclude excited light quarks with masses below 5.5 TeV and excited b quarks with masses below 1.8 TeV, assuming standard model couplings.







1 Introduction

High-energy proton-proton collisions containing a photon and a jet with large transverse momenta ($p_{\mathrm{T}}$) provide a powerful means of searching for new physics. For example, models involving compositeness [1, 2, 3] predict excited states of quarks that can be identified by searching for events that contain a photon and a jet from their decays. We present a search for excited states of light (u, d) and heavy (b) quarks using this decay signature.

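We assume that the coupling between the excited quark ($\mathrm{q}^*$), the ordinary quarks, and gauge bosons proceeds through a gauge-invariant magnetic-moment operator, described by the effective Lagrangian [4]:

$$\mathcal{L}_{\text{int}} = \frac{1}{2\Lambda}\,\bar{\mathrm{q}}^{*}_{\mathrm{R}}\,\sigma^{\mu\nu}\left[ g_s f_s \frac{\lambda_a}{2} G^{a}_{\mu\nu} + g f \frac{\tau}{2} W_{\mu\nu} + g' f' \frac{Y}{2} B_{\mu\nu} \right]\mathrm{q}_{\mathrm{L}} + \text{h.c.}, \qquad (1)$$

where $\mathrm{q}^{*}_{\mathrm{R}}$ is the right-handed excited quark field; $\sigma^{\mu\nu}$ the Pauli spin matrix; $\mathrm{q}_{\mathrm{L}}$ the left-handed quark field; $G^{a}_{\mu\nu}$, $W_{\mu\nu}$, and $B_{\mu\nu}$ are the field tensors of the SU(3), SU(2), and U(1) gauge fields, respectively; $\lambda_a$, $\tau$, and $Y$ are the corresponding gauge structure constants; and $g_s$, $g$, and $g'$ are the gauge couplings. The compositeness scale $\Lambda$ is the energy scale typical for these interactions. The quantities $f_s$, $f$, and $f'$ are unknown dimensionless constants that represent the strengths of the excited quark couplings to the standard model (SM) partners. Their values are determined by the compositeness dynamics, and are usually assumed to be of order unity.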

In pp collisions, excited quarks are expected to be produced predominantly through quark-gluon fusion ($\mathrm{qg} \to \mathrm{q}^*$), and then decay into a quark and a gauge boson (g, $\gamma$, W, or Z). Searches have been performed in different channels [5, 6, 7, 8, 9, 10, 11], but no evidence for the existence of excited quarks has yet been found. This analysis looks for evidence of $\mathrm{q}^*$ (where q represents u or d) and $\mathrm{b}^*$ production by searching for resonances in $\gamma+\text{jet}$ final states. The signal model includes excited quarks with spin-1/2, and assumes a compositeness scale that equals the mass of the resonance ($\Lambda = m_{\mathrm{q}^*}$). It is also assumed that $f_s$, $f$, and $f'$ have identical values [3, 4]; henceforth these are referred to collectively as $f$. The data correspond to an integrated luminosity of 35.9 fb$^{-1}$ collected by the CMS experiment in pp collisions at $\sqrt{s}=13$ TeV at the CERN LHC in 2016.

A final state with a photon and a jet is produced in the SM mainly through prompt $\gamma+\text{jet}$, multijet, and W/Z+$\gamma$ processes. Among these, the main irreducible backgrounds are quark-gluon Compton scattering ($\mathrm{qg} \to \mathrm{q}\gamma$) and quark-antiquark annihilation ($\mathrm{q}\bar{\mathrm{q}} \to \mathrm{g}\gamma$). Although the probability for a jet to be reconstructed as a photon is small, the cross section for multijet production is two to three orders of magnitude larger than that for the irreducible backgrounds, depending on the $p_{\mathrm{T}}$ of the jet [12], making jet misidentification the second-largest source of background. Electroweak production of W/Z+$\gamma$, where the W or Z boson decays to a pair of quark jets, contributes a very small fraction of the background because of its small production cross section.

This Letter provides a brief description of the CMS detector in Section 2. The main strategy used in selecting the events is discussed in Section 3. Section 4 contains information about signal and background models, while Section 5 lists the systematic uncertainties estimated in this analysis. The results of the study are presented in Section 6 and summarized in Section 7.

2 The CMS detector

The central feature of the CMS detector is a superconducting solenoid of 6 m internal diameter, providing a magnetic field of 3.8 T. Within the solenoid volume are a silicon pixel and strip tracker, a lead tungstate crystal electromagnetic calorimeter (ECAL), and a brass and scintillator hadron calorimeter (HCAL), each composed of a barrel and two endcap sections. The very forward regions of the detector near the beam line are covered by the forward calorimeters. Muons are measured in gas-ionization detectors embedded in the steel flux-return yoke outside the solenoid. In the barrel section of the ECAL, an energy resolution of about 1% is achieved for unconverted or late-converting photons in the tens of GeV energy range. The remaining barrel photons have a resolution of about 1.3% up to pseudorapidity $|\eta| = 1$, rising to about 2.5% at $|\eta| = 1.4$ [12], where $\eta$ is defined as $-\ln[\tan(\theta/2)]$, $\theta$ being the polar angle of the cylindrical coordinates of the CMS detector. In the endcaps, the resolution of unconverted or late-converting photons is about 2.5%, while the remaining endcap photons have a resolution between 3 and 4%. When combining information from the entire detector, the jet energy resolution is typically around 15% at 10 GeV, 8% at 100 GeV, and 4% at 1 TeV. A more detailed description of the CMS detector, together with a definition of the coordinate system used and the relevant kinematic variables, can be found in [13].
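As a small illustration of the coordinate convention, the sketch below converts between the polar angle and the pseudorapidity, assuming the standard definition $\eta = -\ln[\tan(\theta/2)]$. The function names are hypothetical and are not taken from any CMS software.

```python
import math


def pseudorapidity(theta):
    """Pseudorapidity for a polar angle theta (radians), 0 < theta < pi."""
    if not 0.0 < theta < math.pi:
        raise ValueError("theta must lie strictly between 0 and pi")
    return -math.log(math.tan(theta / 2.0))


def polar_angle(eta):
    """Inverse mapping: polar angle (radians) for a given pseudorapidity."""
    return 2.0 * math.atan(math.exp(-eta))


if __name__ == "__main__":
    for eta in (0.0, 1.0, 2.5):
        theta_deg = math.degrees(polar_angle(eta))
        print(f"eta = {eta:3.1f}  ->  theta = {theta_deg:6.2f} deg")
```

With this definition, a direction perpendicular to the beam has $\eta = 0$, and $|\eta| = 1$ corresponds to a polar angle of roughly 40 degrees from the beam axis.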

The CMS experiment selects physics events using a two-tier trigger system: a hardware-based level-1 (L1) trigger and a software-based high-level trigger (HLT). The L1 trigger selects events of interest using information from the calorimeters and the muon system only, and reduces the readout rate from the bunch crossing frequency of 40 MHz to below 100 kHz. The HLT system further decreases this rate to an average of a few hundred Hz, with a maximum of 1 kHz. The events selected by the HLT are then reconstructed offline and used for analysis.

3 Event selection

Events are analyzed using a particle-flow (PF) algorithm [14], which reconstructs and identifies each individual particle with an optimized combination of information from the various elements of the CMS detector. The energy of photons is directly obtained from the ECAL measurement, corrected for zero-suppression effects [12]. The energy of electrons is determined from a combination of the electron momentum at the primary interaction vertex as determined by the tracker, the energy of the corresponding ECAL cluster, and the energy sum of all bremsstrahlung photons spatially compatible with originating from the electron track. The energy of muons is obtained from the curvature of the corresponding track. The energy of charged hadrons is determined from a combination of their momentum measured in the tracker and the matching ECAL and HCAL energy deposits, corrected for zero-suppression effects and for the response function of the calorimeters to hadronic showers. Finally, the energy of neutral hadrons is obtained from the corresponding corrected ECAL and HCAL energy.

The jets in each event are formed mainly from photons and from charged and neutral hadrons using the infrared- and collinear-safe anti-$k_{\mathrm{T}}$ algorithm [15], with a distance parameter $R$, where $R = \sqrt{(\Delta\eta)^2 + (\Delta\phi)^2}$, with $\Delta\eta$ and $\Delta\phi$ being the pseudorapidity and azimuthal angle (in radians) differences between the jet axis and its constituents. Jet momenta and energies are corrected to establish a uniform calorimetric response in $\eta$ and an absolute response in $p_{\mathrm{T}}$ at the particle level using calibration constants [16] obtained from simulation, test beam results, and pp collision data at $\sqrt{s}=13$ TeV.
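The angular distance used in the jet definition can be sketched as follows. This is only an illustration of the $\sqrt{(\Delta\eta)^2 + (\Delta\phi)^2}$ measure, with the azimuthal difference wrapped into $[-\pi, \pi)$; the function names are hypothetical and the jet clustering itself is not reproduced here.

```python
import math


def delta_phi(phi1, phi2):
    """Azimuthal difference phi1 - phi2, wrapped into [-pi, pi)."""
    dphi = (phi1 - phi2) % (2.0 * math.pi)
    if dphi >= math.pi:
        dphi -= 2.0 * math.pi
    return dphi


def delta_r(eta1, phi1, eta2, phi2):
    """Angular distance sqrt(d_eta^2 + d_phi^2) between two directions."""
    return math.hypot(eta1 - eta2, delta_phi(phi1, phi2))


if __name__ == "__main__":
    # Two directions on either side of the phi = +-pi boundary are close,
    # even though their raw phi values differ by more than 6 radians.
    print(delta_r(0.0, 3.1, 0.0, -3.1))
```

The wrapping step matters because $\phi$ is periodic: without it, two nearby directions on either side of $\phi = \pm\pi$ would appear to be far apart.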

The data sample used in this analysis consists of events that are selected by a photon trigger having a $p_{\mathrm{T}}$ threshold of 165 GeV and an additional condition on the ratio of the photon energy deposited in the HCAL to that in the ECAL (H/E), which is required to be less than 10%. The efficiency of the trigger used in the study has been evaluated separately using samples collected with photon, muon, or jet triggers to account for possible biases in trigger selection. The trigger efficiencies measured in these samples are greater than 95% for photons with $p_{\mathrm{T}} > 200$ GeV, as measured offline.

In the offline selection, each event is required to have at least one reconstructed primary vertex with at least four associated tracks, lying within 24 cm along the $z$ direction and within 2 cm in the transverse plane of the nominal collision point. The reconstructed vertex with the largest value of summed physics-object $p_{\mathrm{T}}^2$ is taken as the primary pp interaction vertex. The physics objects are the jets, clustered using the jet finding algorithm [15, 17] with the tracks assigned to the vertex as inputs, and the associated missing transverse momentum, taken as the negative vector sum of the $p_{\mathrm{T}}$ of those jets.

The photon identification [12] is based on requirements on H/E and on the shower profile of the photon. The photon must be isolated from any identified electrons in the detector, and well isolated from other photons and hadrons within a cone around its axis. The photon must also satisfy a minimum $p_{\mathrm{T}}$ requirement and lie in the central barrel region of the ECAL. Among the photons passing the above criteria in each event, the one with the highest $p_{\mathrm{T}}$ is selected to reconstruct the mass of the photon+jet system in the event. The isolation quantities are corrected for effects from overlapping pp interactions (pileup) in the same or adjacent bunch crossings, by subtracting the energy calculated from the mean energy density in the event, as computed using the FastJet package [17]. The photon identification and isolation criteria used in this analysis lead to a signal efficiency of 80% with an estimated background rejection of 90%.

To be combined with a photon to form a resonance candidate, the selected jet must be well separated in $\Delta R$ from the chosen photon candidate and satisfy the jet identification criteria [18]. These criteria comprise requirements on the number of constituents, and on the fraction of jet energy carried by each constituent type. The jet is also required to lie within the $\eta$ acceptance of the analysis and to pass a minimum $p_{\mathrm{T}}$ threshold. The angular separation between the selected photon and jet is restricted by requiring $|\Delta\eta(\gamma,\text{jet})| < 1.5$. This selection removes a large fraction of the multijet background without rejecting signal events, and thus enhances the signal-over-background ratio. If more than one jet candidate is present in the event, the jet with the highest $p_{\mathrm{T}}$ is used in the analysis. The selected events form the “inclusive category” for the search for light excited quarks.

Jets originating from b quarks are identified using the combined secondary vertex v2 algorithm (CSVv2) [19, 20]. The algorithm combines the information from the primary vertex, impact parameters, and secondary vertices within the jet using a neural network discriminator. The loose working point used in the analysis has an 81% b jet selection efficiency, a 10% misidentification rate for light-quark and gluon jets, and a 40% misidentification rate for c quark jets [19]. Depending on the outcome of the CSVv2 algorithm, a jet is tagged either as a b jet or as a non-b jet candidate, and the events are classified accordingly in the “1b tag” and “0b tag” categories. Since the acceptance falls off slightly for the 1b tag category at higher masses (Fig. 1), the sensitivity of the search is improved by including the results from the 0b tag category in the limit computation.

The above selection criteria are optimized for the best expected 95% confidence level (CL) limits on the cross section versus mass of $\mathrm{q}^*$ and $\mathrm{b}^*$.

The efficiencies for assigning events to the 1b tag and 0b tag categories, determined from the Monte Carlo (MC) simulation, are corrected using b tag scale factors (SFs), to take into account the observed differences between the b tagging efficiency of the CSVv2 tagger applied to data and to MC simulation. The SFs are defined as $\mathrm{SF} = \epsilon_{\text{data}}/\epsilon_{\text{MC}}$, where $\epsilon_{\text{data}}$ and $\epsilon_{\text{MC}}$ correspond to the b tagging efficiencies of the CSVv2 algorithm in data and MC simulation, respectively. These SFs have been measured using the techniques described in [19].
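The scale-factor correction can be illustrated with a short sketch, assuming the usual definition of the SF as the ratio of the data and MC tagging efficiencies. The per-event weights below follow one common reweighting convention, which is an assumption for illustration and not necessarily the procedure used in this analysis; the data efficiency of 0.77 is a hypothetical number, while 0.81 is the loose working point efficiency quoted above.

```python
def btag_scale_factor(eff_data, eff_mc):
    """Scale factor: ratio of the tagging efficiency in data to that in MC."""
    if eff_mc <= 0.0:
        raise ValueError("MC efficiency must be positive")
    return eff_data / eff_mc


def event_weight(is_tagged, scale_factor, eff_mc):
    """Weight for a simulated single-jet event (assumed convention).

    Tagged events are scaled by SF; untagged events are scaled so that
    the total simulated yield is unchanged by the correction.
    """
    if is_tagged:
        return scale_factor
    return (1.0 - scale_factor * eff_mc) / (1.0 - eff_mc)


if __name__ == "__main__":
    eff_mc = 0.81     # loose working point efficiency quoted in the text
    eff_data = 0.77   # hypothetical value, for illustration only
    sf = btag_scale_factor(eff_data, eff_mc)
    print("SF =", sf)
    print("1b tag weight =", event_weight(True, sf, eff_mc))
    print("0b tag weight =", event_weight(False, sf, eff_mc))
```

In this convention an SF below unity moves simulated signal from the 1b tag to the 0b tag category while keeping the sum of the two categories fixed.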

The invariant mass of the selected ($\gamma+\text{jet}$) system is required to exceed a minimum value, to avoid the turn-on region due to the requirements imposed on the kinematic properties of the trigger objects. Fig. 1 shows the total selection and reconstruction efficiency times acceptance for $\mathrm{q}^*$ and $\mathrm{b}^*$ processes. The acceptance times efficiency for the 1b tag category decreases with increasing mass owing to the decrease in the efficiency of the track reconstruction and the degraded resolution of the reconstructed track parameters with increasing $p_{\mathrm{T}}$ of the jet.
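For reference, the invariant mass of a photon+jet pair can be computed from the $p_{\mathrm{T}}$, $\eta$, $\phi$, and mass of the two objects. The following is a minimal sketch with hypothetical function names and illustrative kinematics; it is not the analysis code.

```python
import math


def four_vector(pt, eta, phi, mass=0.0):
    """Return (E, px, py, pz) from collider coordinates (GeV, radians)."""
    px = pt * math.cos(phi)
    py = pt * math.sin(phi)
    pz = pt * math.sinh(eta)
    energy = math.sqrt(px * px + py * py + pz * pz + mass * mass)
    return energy, px, py, pz


def invariant_mass(p1, p2):
    """Invariant mass of the sum of two four-vectors."""
    energy = p1[0] + p2[0]
    px = p1[1] + p2[1]
    py = p1[2] + p2[2]
    pz = p1[3] + p2[3]
    m2 = energy * energy - px * px - py * py - pz * pz
    return math.sqrt(max(m2, 0.0))  # guard against rounding below zero


if __name__ == "__main__":
    # Illustrative back-to-back central photon and jet, 1000 GeV each.
    photon = four_vector(1000.0, 0.0, 0.0)
    jet = four_vector(1000.0, 0.0, math.pi)
    print(invariant_mass(photon, jet))  # close to 2000 GeV
```

For massless objects this reduces to $m^2 = 2\,p_{\mathrm{T},1}\,p_{\mathrm{T},2}\,(\cosh\Delta\eta - \cos\Delta\phi)$, which makes explicit why a requirement on $\Delta\eta$ shapes the mass spectrum.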

Figure 1: The product of acceptance and efficiency for $\mathrm{q}^*$ and $\mathrm{b}^*$ signals as a function of generated $\mathrm{q}^*$ or $\mathrm{b}^*$ mass, calculated using MC simulation.

4 Modelling signal and background

The signal samples for $\mathrm{q}^*$ and $\mathrm{b}^*$ are simulated at leading order (LO) with the PYTHIA 8.212 event generator [21] for $f = 1.0$, 0.5, and 0.1 at different resonance masses in the range from 1 to 7 TeV at intervals of 1 TeV and from 1 to 5 TeV at intervals of 0.5 TeV, respectively. The generated events are processed through a full CMS detector simulation based on GEANT4 [22]. The simulation uses the CUETP8M1 underlying event tune [23, 24], a renormalization and factorization scale defined from the kinematics of the hard-scattered partons, and NNPDF2.3LO parton distribution functions (PDFs) [25]. The natural width of the resonance, at parton level, is approximately proportional to $f^2$ [3]. The production cross section is also proportional to $f^2$. The signals for intermediate mass points are interpolated at intervals of 50 GeV.

The MadGraph5_aMC@NLO v2.2.2 program [26] has been used to generate the $\gamma+\text{jet}$ and W/Z+$\gamma$ background MC samples at LO, with the showering and hadronization carried out by the PYTHIA 8.212 program. Double counting of the partons generated with MadGraph and those with PYTHIA is removed using the MLM [27] matching scheme. The multijet MC events are generated using the PYTHIA 8.212 event generator. The same event reconstruction is employed in data and MC simulations. However, the background is evaluated from data, and the MC simulation is used only for the optimization of the event selection. The invariant mass distribution of the SM background falls smoothly and can be described by an analytic function.

The inclusive invariant mass distribution and the distributions for the 1b tag and 0b tag categories, expressed in TeV, are shown in Figs. 2 and 3, respectively. The binning is chosen to have a bin width approximately equal to the expected mass resolution, which varies from about 4.5% at a mass of 1 TeV to 3.3% at 6 TeV. These distributions are modeled using an empirical parametrization that has been used widely in similar previous searches [7, 8, 10, 11]:

$$\frac{\mathrm{d}\sigma}{\mathrm{d}m} = \frac{P_0\,(1-x)^{P_1}}{x^{P_2 + P_3 \ln x}}, \qquad (2)$$

where $x = m_{\gamma+\text{jet}}/\sqrt{s}$ and $P_0$, $P_1$, $P_2$, and $P_3$ are four parameters used to describe the background distribution and its normalization. The order of the function has been chosen by performing Fisher tests [28], with a cut-off p-value of 0.05. The highest invariant mass event observed in data has $m_{\gamma+\text{jet}}$ of 4.6 TeV with a b-tagged jet, and thus belongs to both the inclusive and 1b tag categories.
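A background parametrization of this kind is straightforward to evaluate numerically. The sketch below assumes the four-parameter form $P_0 (1-x)^{P_1} / x^{P_2 + P_3 \ln x}$ with $x = m/\sqrt{s}$, which is the form commonly used in searches of this type; the function name and the parameter values in the usage example are hypothetical and are not fit results from this analysis.

```python
import math

SQRT_S_TEV = 13.0  # centre-of-mass energy in TeV


def background_shape(mass_tev, p0, p1, p2, p3, sqrt_s=SQRT_S_TEV):
    """Evaluate p0 * (1 - x)**p1 / x**(p2 + p3 * ln x), with x = m / sqrt(s)."""
    x = mass_tev / sqrt_s
    if not 0.0 < x < 1.0:
        raise ValueError("mass must lie between 0 and sqrt(s)")
    return p0 * (1.0 - x) ** p1 / x ** (p2 + p3 * math.log(x))


if __name__ == "__main__":
    # Hypothetical parameters, chosen only to give a smoothly falling curve.
    params = dict(p0=1.0e-3, p1=10.0, p2=5.0, p3=0.1)
    for mass in (1.0, 2.0, 3.0, 4.0, 5.0):
        print(f"m = {mass:.1f} TeV  ->  {background_shape(mass, **params):.4g}")
```

Setting $P_3 = 0$ recovers the three-parameter version of the function, which is the kind of nested alternative a Fisher test compares against when choosing the order.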

To examine the presence of a possible systematic bias due to the choice of background fitting function, tests are performed using alternative functional forms. These alternative expressions are polynomial functions that also provide adequate descriptions of the data. To perform these tests, an invariant mass distribution of the SM background is obtained from MC simulation. This distribution is fitted with the alternative test functions, and the results of the fit, taken as the truth model, are used to generate a large number of pseudo-data samples that have bin-to-bin statistical fluctuations similar to those of the data. A signal with a cross section close to the expected sensitivity is also injected into the pseudo-data distributions. These distributions are then fitted using the default background function along with a signal model, and the signal cross section is extracted. Pull distributions, defined as the difference between the true and extracted signal cross sections divided by the estimated statistical uncertainty, are then constructed. The deviation of the mean of the pull distribution from zero is a measure of the bias present in the model. The pull distributions for $\mathrm{q}^*$ and $\mathrm{b}^*$ modelling over the studied mass range are found to be consistent with normalized Gaussian forms, with medians deviating by no more than 0.5 from zero and widths consistent with unity for the full mass range. When added in quadrature with the statistical uncertainty, the bias uncertainty is found to contribute approximately 10% of the total. Therefore, it is concluded that the systematic uncertainty associated with the choice of the parametric function is negligible, and the statistical uncertainty of the fit is the only uncertainty in the background prediction that needs to be considered.
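The logic of such a bias test can be illustrated with a deliberately simplified toy. The sketch below reduces the fit to a single counting bin, approximates Poisson fluctuations with a Gaussian, and extracts the signal as the observed count minus the assumed background; all of these simplifications, the function name, and the numbers are assumptions made for illustration only. The pull is taken here as extracted minus true, a sign convention that does not affect the size of the bias.

```python
import math
import random
import statistics


def pull_summary(b_true, b_model, s_true, n_toys=2000, seed=1):
    """Mean and width of the signal pull over many pseudo-experiments.

    b_true  : background yield of the "truth" model used to generate toys
    b_model : background yield assumed when extracting the signal
    s_true  : injected signal yield
    """
    rng = random.Random(seed)
    pulls = []
    expected = b_true + s_true
    for _ in range(n_toys):
        # Gaussian approximation to a Poisson fluctuation (large counts).
        n_obs = rng.gauss(expected, math.sqrt(expected))
        s_fit = n_obs - b_model
        sigma = math.sqrt(max(n_obs, 1.0))
        pulls.append((s_fit - s_true) / sigma)
    return statistics.mean(pulls), statistics.stdev(pulls)


if __name__ == "__main__":
    print("unbiased model:", pull_summary(10000.0, 10000.0, 100.0))
    print("biased model:  ", pull_summary(10000.0, 9950.0, 100.0))
```

When the assumed background matches the truth, the pull has a mean near zero and a width near one; a mismatch in the background model shifts the mean while leaving the width essentially unchanged, which is the signature the test looks for.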

Figure 2: The invariant mass distribution in data (black points) for the inclusive category used for the analysis, after final selection. The result of the fit to the data using the parametrization defined in Eq. (2) is shown by the blue dashed curve, with associated bands indicating the uncertainty. The bin-by-bin pull, (Data-Fit)/(stat. unc.), where the denominator refers to the statistical uncertainty in data, is also presented. The green and yellow bands correspond to 1 and 2 standard deviations, respectively, from the mean value. Simulations of excited quark signals representing the expected excess of signal events over the background are shown for mass values of 1.0, 2.0, and 5.0 TeV.
Figure 3: The invariant mass distribution in data (black points) used for the analysis, after final selection, for the (left) 1b tag category and (right) 0b tag category. The result of the fit to the data using the parametrization defined in Eq. (2) is shown by the blue dashed curve, with associated bands indicating the uncertainty. The bin-by-bin pull, (Data-Fit)/(stat. unc.), where the denominator refers to the statistical uncertainty in data, is also presented. The green and yellow bands correspond to 1 and 2 standard deviations, respectively, from the mean value. Simulations of excited b quark signals representing the expected excess of signal events over the background are shown for the 1b and 0b tag categories for mass values of 1.0 and 2.0 TeV.

5 Systematic uncertainties

The dominant sources of the systematic uncertainties affecting the $\mathrm{q}^*$ and $\mathrm{b}^*$ signals are summarized in Table 5.


Summary of the dominant sources of uncertainties and their effect on the signal yield.

| Source | Effect on the signal yield (%) |
| --- | --- |
| Integrated luminosity | 2.5 |
| Jet energy scale | 1 |
| Jet energy resolution | 0.2–0.4 |
| Photon energy scale | 0.6 |
| Photon energy resolution | 0.2–0.4 |
| Pileup | 1–2 |
| Photon ID efficiency | 2 |
| Trigger efficiency | 5 |
| Signal interpolation | 0.5–1 |
| PDF choice | 1.5–3 |
| b tag SF (only $\mathrm{b}^*$) | 1 |
| b tag SF normalization (only $\mathrm{b}^*$) | 2 |

The uncertainties in the jet energy scale and jet energy resolution [16] affect both the signal yield and its distribution. The size of the effect is determined by varying the four-momenta of the jets by the corresponding uncertainties and repeating the full analysis with the modified quantities.

The systematic uncertainties in the photon energy scale and resolution, and in the photon identification efficiency, are derived from $\mathrm{Z} \to \mathrm{e}^+\mathrm{e}^-$ events. The uncertainty in the photon energy scale is found to be about 1% and includes the uncertainty in the extrapolation to higher $p_{\mathrm{T}}$, beyond the reach of the control samples [12]. The uncertainty in the photon identification is estimated to be around 2%. A systematic uncertainty of 5% is also included to account for the precision of the photon trigger efficiency measurement. The effect of the b tagging scale-factor uncertainty on the distribution of the signal is evaluated to be around 1%, while its effect on the normalization is around 2%. The method used to interpolate the signal distributions from the generated distributions is assigned an uncertainty of 0.5–1.0%, which accounts for the difference between the generated and interpolated signals. The PDF uncertainty affects the signal acceptance by 1.5–3.0% for both $\mathrm{q}^*$ and $\mathrm{b}^*$ quarks and is evaluated using PDF4LHC recommendations [29].

The uncertainties in the measurement of the integrated luminosity (2.5%) [30] and pileup description (1%) affect the overall signal yield. The uncertainty in the background estimate is accounted for in the fit by varying the parameter values within their respective uncertainties, with no additional constraints.

6 Results

In the mass region studied, no significant excess has been observed. We use the invariant mass spectra (Figs. 2 and 3), the background parametrization, and the $\mathrm{q}^*$ and $\mathrm{b}^*$ theoretical predictions to set 95% CL upper limits on the production cross section of $\mathrm{q}^*$ and $\mathrm{b}^*$ decaying to $\mathrm{q}\gamma$ and $\mathrm{b}\gamma$, respectively.

The modified frequentist CL$_{\mathrm{s}}$ method [31, 32] in the asymptotic approximation [33] is used to set upper limits on signal cross sections. To evaluate limits, a likelihood function is constructed as the product of the Poisson likelihoods of all the bins in the distribution. The systematic uncertainties in the signal are implemented in terms of nuisance parameters with Gaussian and log-normal constraints. The uncertainty due to the background parametrization is found to have the largest impact and is quantified by considering the effect of changing the parameters from their central values by their estimated one standard deviation uncertainties. We calculate limits by evaluating the likelihood independently at successive values of resonance mass from 1 to 6 TeV for $\mathrm{q}^*$, and from 1 to 5 TeV for $\mathrm{b}^*$, in steps of 50 GeV. The cross section limits are not evaluated below 1 TeV, because of uncertainties in the signal efficiency associated with the invariant mass selection.
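The binned likelihood described above can be sketched as a product of per-bin Poisson terms, written here as a negative log-likelihood. This is a minimal illustration without nuisance parameters; the function names and the yields in the usage example are hypothetical.

```python
import math


def expected_counts(background, signal_shape, mu):
    """Per-bin expectation: background plus mu times the signal template."""
    return [b + mu * s for b, s in zip(background, signal_shape)]


def poisson_nll(observed, expected):
    """Negative log of the product of per-bin Poisson probabilities."""
    nll = 0.0
    for n, lam in zip(observed, expected):
        if lam <= 0.0:
            raise ValueError("expected yield must be positive in every bin")
        # -ln Poisson(n | lam) = lam - n ln(lam) + ln(n!)
        nll += lam - n * math.log(lam) + math.lgamma(n + 1.0)
    return nll


if __name__ == "__main__":
    # Hypothetical yields in three mass bins.
    background = [100.0, 50.0, 20.0]
    signal = [0.0, 10.0, 5.0]
    data = [102, 61, 24]
    for mu in (0.0, 0.5, 1.0, 2.0):
        nll = poisson_nll(data, expected_counts(background, signal, mu))
        print(f"mu = {mu:.1f}  ->  -ln L = {nll:.3f}")
```

Scanning the signal strength `mu` and comparing likelihood values is the building block on which test statistics for limit setting are constructed; the full procedure additionally profiles the nuisance parameters.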

To evaluate limits for $\mathrm{b}^*$, the likelihoods for the 1b and 0b tag categories are combined. The observed and expected mass limits for $\mathrm{q}^*$ and $\mathrm{b}^*$ are computed at 95% CL. The results are presented in terms of limits on the product of the cross section ($\sigma$) and branching fraction ($\mathcal{B}$). The cross section upper limits, for $f = 1.0$, are compared to the LO theoretical predictions for all three couplings to estimate the lower mass limit on excited quarks, as shown in Figs. 4 and 5 for $\mathrm{q}^*$ and $\mathrm{b}^*$, respectively. The dependence of the limits on $f$ is found to be negligible since the resonance width is small compared to the experimental mass resolution. Observed lower bounds of 5.5 and 1.8 TeV are obtained for $\mathrm{q}^*$ and $\mathrm{b}^*$, respectively, for $f = 1.0$. The corresponding expected mass limits are 5.4 (1.8) TeV for $\mathrm{q}^*$ ($\mathrm{b}^*$). The variation of the excluded mass as a function of the coupling strength, obtained by interpolating the efficiencies for three signal MC samples corresponding to $f = 1.0$, 0.5, and 0.1, is shown for $\mathrm{q}^*$ and $\mathrm{b}^*$ in Fig. 6. This result can also be interpreted in terms of the ratio of the resonance mass to the compositeness scale, $m_{\mathrm{q}^*}/\Lambda$: if the assumption $\Lambda = m_{\mathrm{q}^*}$ is relaxed, the excited quark production cross section depends on $f$ as well as on $m_{\mathrm{q}^*}/\Lambda$.
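A lower mass limit of this kind is the mass at which the theoretical cross section curve falls below the observed upper limit. The sketch below finds that crossing by interpolating linearly in the logarithm of the cross section between tabulated mass points; the interpolation choice, the function name, and the numbers in the usage example are assumptions for illustration and are not values from this analysis.

```python
import math


def excluded_mass(masses, limit, theory):
    """Mass at which the theory curve first drops below the limit curve.

    masses : increasing list of mass points
    limit  : upper limits on the cross section at those masses
    theory : predicted cross sections at those masses
    Returns None if the curves do not cross in the scanned range.
    """
    diffs = [math.log(t) - math.log(l) for t, l in zip(theory, limit)]
    for i in range(len(masses) - 1):
        d0, d1 = diffs[i], diffs[i + 1]
        if d0 > 0.0 and d1 <= 0.0:
            frac = d0 / (d0 - d1)  # linear interpolation in log space
            return masses[i] + frac * (masses[i + 1] - masses[i])
    return None


if __name__ == "__main__":
    # Hypothetical curves (arbitrary units), for illustration only.
    masses = [1.0, 2.0, 3.0, 4.0]
    limit = [1.0, 0.5, 0.3, 0.2]
    theory = [50.0, 5.0, 0.4, 0.03]
    print(excluded_mass(masses, limit, theory))
```

Masses below the returned value are excluded, since there the predicted cross section exceeds the upper limit.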

Figure 4: The observed and expected upper limits at 95% CL on $\sigma\mathcal{B}$ as a function of the mass of the excited quark, for $f = 1.0$. The limits are compared with theoretical predictions for excited quark production for three couplings. The inner (green) band and the outer (yellow) band indicate the regions containing 68 and 95%, respectively, of the distribution of limits expected under the background-only hypothesis.
Figure 5: The observed and expected upper limits at 95% CL on $\sigma\mathcal{B}$ as a function of the mass of the excited b quark, for $f = 1.0$. The limits are compared with theoretical predictions for excited b quark production for three couplings. The inner (green) band and the outer (yellow) band indicate the regions containing 68 and 95%, respectively, of the distribution of limits expected under the background-only hypothesis.
Figure 6: The observed and expected regions excluded at 95% CL for $\mathrm{q}^*$ and $\mathrm{b}^*$ production and decay, as a function of the coupling strength $f$ and the excited quark mass.

7 Summary

A search has been presented for excited states of light and b quarks in $\gamma+\text{jet}$ final states, using data corresponding to an integrated luminosity of 35.9 fb$^{-1}$, collected at $\sqrt{s}=13$ TeV. Upper limits at the 95% confidence level are placed on the product of production cross section and decay branching fraction for the presence of $\mathrm{q}^*$ and $\mathrm{b}^*$ excited quarks in $\gamma+\text{jet}$ final states. Comparing these upper limits with theoretical predictions, excited light quarks within the mass range $1.0 < m_{\mathrm{q}^*} < 5.5$ TeV and excited b quarks within the mass range $1.0 < m_{\mathrm{b}^*} < 1.8$ TeV are excluded at 95% confidence level, assuming standard model couplings. These are the most sensitive limits for $\mathrm{q}^*$ and $\mathrm{b}^*$ searches in the $\gamma+\text{jet}$ final states. In addition, the search for excited b quarks is the first to be presented in any final state at $\sqrt{s}=13$ TeV.

We congratulate our colleagues in the CERN accelerator departments for the excellent performance of the LHC and thank the technical and administrative staffs at CERN and at other CMS institutes for their contributions to the success of the CMS effort. In addition, we gratefully acknowledge the computing centers and personnel of the Worldwide LHC Computing Grid for delivering so effectively the computing infrastructure essential to our analyses. Finally, we acknowledge the enduring support for the construction and operation of the LHC and the CMS detector provided by the following funding agencies: BMWFW and FWF (Austria); FNRS and FWO (Belgium); CNPq, CAPES, FAPERJ, and FAPESP (Brazil); MES (Bulgaria); CERN; CAS, MoST, and NSFC (China); COLCIENCIAS (Colombia); MSES and CSF (Croatia); RPF (Cyprus); SENESCYT (Ecuador); MoER, ERC IUT, and ERDF (Estonia); Academy of Finland, MEC, and HIP (Finland); CEA and CNRS/IN2P3 (France); BMBF, DFG, and HGF (Germany); GSRT (Greece); OTKA and NIH (Hungary); DAE and DST (India); IPM (Iran); SFI (Ireland); INFN (Italy); MSIP and NRF (Republic of Korea); LAS (Lithuania); MOE and UM (Malaysia); BUAP, CINVESTAV, CONACYT, LNS, SEP, and UASLP-FAI (Mexico); MBIE (New Zealand); PAEC (Pakistan); MSHE and NSC (Poland); FCT (Portugal); JINR (Dubna); MON, RosAtom, RAS, RFBR and RAEP (Russia); MESTD (Serbia); SEIDI, CPAN, PCTI and FEDER (Spain); Swiss Funding Agencies (Switzerland); MST (Taipei); ThEPCenter, IPST, STAR, and NSTDA (Thailand); TUBITAK and TAEK (Turkey); NASU and SFFR (Ukraine); STFC (United Kingdom); DOE and NSF (USA). Individuals have received support from the Marie-Curie program and the European Research Council and Horizon 2020 Grant, contract No. 675440 (European Union); the Leventis Foundation; the A. P. Sloan Foundation; the Alexander von Humboldt Foundation; the Belgian Federal Science Policy Office; the Fonds pour la Formation à la Recherche dans l’Industrie et dans l’Agriculture (FRIA-Belgium); the Agentschap voor Innovatie door Wetenschap en Technologie (IWT-Belgium); the Ministry of Education, Youth and Sports (MEYS) of the Czech Republic; the Council of Science and Industrial Research, India; the HOMING PLUS program of the Foundation for Polish Science, cofinanced from European Union, Regional Development Fund, the Mobility Plus program of the Ministry of Science and Higher Education, the National Science Center (Poland), contracts Harmonia 2014/14/M/ST2/00428, Opus 2014/13/B/ST2/02543, 2014/15/B/ST2/03998, and 2015/19/B/ST2/02861, Sonata-bis 2012/07/E/ST2/01406; the National Priorities Research Program by Qatar National Research Fund; the Programa Severo Ochoa del Principado de Asturias; the Thalis and Aristeia programs cofinanced by EU-ESF and the Greek NSRF; the Rachadapisek Sompot Fund for Postdoctoral Fellowship, Chulalongkorn University and the Chulalongkorn Academic into Its 2nd Century Project Advancement Project (Thailand); the Welch Foundation, contract C-1845; and the Weston Havens Foundation (USA).

