# Subtracted Cumulants: Mitigating Large Background in Jet Substructure

Yang-Ting Chien (Center for Theoretical Physics, Massachusetts Institute of Technology, Cambridge, MA 02139, USA); Daekyoung Kang (Key Laboratory of Nuclear Physics and Ion-beam Application (MOE) and Institute of Modern Physics, Fudan University, Shanghai, China 200433); Kyle Lee (C.N. Yang Institute for Theoretical Physics, Stony Brook University, Stony Brook, NY 11794, USA; Department of Physics and Astronomy, Stony Brook University, Stony Brook, NY 11794, USA); Yiannis Makris (Theoretical Division T-2, Los Alamos National Laboratory, Los Alamos, NM, 87545, USA)
###### Abstract

We introduce a new approach for jet physics studies using subtracted cumulants of jet substructure observables, which are shown to be insensitive to contributions from soft-particle emissions uncorrelated with the hard process. Subtracted cumulants therefore allow comparisons between theoretical calculations and experimental measurements without the complication of large background contamination, such as underlying and pile-up events in hadron collisions. We test our method using subtracted jet mass cumulants by comparing Monte Carlo simulations to analytic calculations performed using soft-collinear effective theory. We find that, for proton-proton collisions, the method efficiently eliminates contributions from multiparton interactions and pile-up events. We also find that, within theoretical uncertainty, our analytic calculations are in good agreement with the subtracted cumulants calculated from ATLAS jet mass measurements.

Preprint: LA-UR-18-31092, MIT-CTP 5088

Introduction. Jets have become essential objects of study at high energy colliders. They are produced ubiquitously in hard scattering processes as well as hadronic decays of heavy particles. Tremendous progress has been made to understand and use jets for testing the standard model and searching for new physics, with reliable Monte Carlo simulations Sjostrand et al. (2008); Bellm et al. (2016) of high-energy collisions as well as useful jet substructure analysis tools Larkoski et al. (2017); Andrews et al. (2018). However, theoretical precision is often limited by the need to model soft radiation, such as that from multiparton interactions (MPI) in hadron collisions, which are insensitive to the hard process in a wide range of energies. Other contributions to the underlying event (UE) are either calculable (e.g. initial state radiation (ISR)) or relatively small (e.g. hadronization). In high-luminosity (HL) collisions there can also be a large number of uncorrelated pile-up (PU) events producing a significant background of soft particles. With accurate vertex determination using charged-particle tracks, one can remove PU charged particles but still not PU neutral particles. Also, heavy-ion collisions (HIC) can produce a large number of soft particles through the interactions among a large number of nucleons in the nuclei. The soft particles are observed to be mostly uncorrelated with the hard process of interest and exhibit novel collective behaviors Voloshin and Zhang (1996); Poskanzer and Voloshin (1998). However, correlated effects certainly exist and significantly quench the jets Bjorken (1982); Connors et al. (2018).

It is clear that jet observables are affected dramatically by an uncorrelated large background, which can overwhelm the correlated effects one wants to probe such as jet modifications and medium responses in HIC. Many subtraction techniques Cacciari and Salam (2008); Feige et al. (2012); Larkoski and Thaler (2014); Krohn et al. (2014); Cacciari et al. (2015); Bertolini et al. (2014a); Berta et al. (2014); Larkoski et al. (2014); Komiske et al. (2017); Soyez (2018) have been developed in order to correct jets back to their true compositions. Ideally one would like the subtraction to work for each jet, which is impossible due to the intrinsic ambiguity between signal and background particles. One could then hope to remove the background and correct jet observable distributions statistically. The precision of the background subtraction will then have to be quantified when comparing measurements to analytic calculations.

In this paper, instead of relying on algorithms to remove soft particles from jets, we provide an alternative approach for comparing theoretical calculations directly to experimental measurements without the complication of modeling soft uncorrelated emissions (SUEs). Specifically, we define subtracted cumulants which cancel the contributions from SUEs. This approach was first introduced in the context of transverse energy of Drell-Yan processes Kang et al. (2018a). Here we extend its use to jet substructure observables to which SUEs additively contribute. Additive contributions from uncorrelated emissions can be easily removed, and the resulting subtracted cumulant is thus useful for precision jet physics studies. The jet mass, $m_J$, is a classic observable which receives additive contributions from SUEs. In this paper, as proof of concept, we focus only on the first cumulant of jet invariant mass in proton-proton collisions. It is straightforward to extend our study to other jet substructure observables and higher cumulants.

The rest of the paper is organized as follows. We first give the definition of the subtracted jet mass cumulant, $\Delta^{jk}_{\tau}$, and we demonstrate its robustness against SUEs using Pythia Monte Carlo simulations. Since jet substructure observables depend strongly on the jet-initiating partons, we also show the sensitivity of $\Delta^{jk}_{\tau}$ to quark-gluon jet fractions. Finally, we show that our theoretical predictions performed using soft-collinear effective theory (SCET) Bauer et al. (2000, 2001); Bauer and Stewart (2001); Bauer et al. (2002a, b); Beneke et al. (2002) are in good agreement with the results computed from ATLAS measurements.

Definition of the observable. For a jet substructure observable, $e$, which receives additive contributions from individual particles within a jet, we have

$$e=\sum_{i\in\text{jet}}\hat{e}(i), \tag{1}$$

where $\hat{e}(i)$ depends on the four-momentum of the $i$-th particle in the jet. In the presence of SUEs, because of the additivity, the observable can be decomposed into two terms,

$$e=\sum_{i\in\text{signal}}\hat{e}(i)+\sum_{i\in\text{SUEs}}\hat{e}(i)\equiv e_S+e_B, \tag{2}$$

where “S” refers to the signal contributions which are correlated with the hard process and “B” to the background from SUEs. Here $e_B$ is the background contribution, which is statistically independent of $e_S$. Its probability density does not depend on the kinematics and details of the hard process, such as the jet energy, angular direction, and the flavor of the initiating parton. Let $P$, $P_S$, and $P_B$ denote the probability densities of the observables $e$, $e_S$, and $e_B$, respectively. Since SUEs are uncorrelated with the signal, the joint probability density at the values $e_S$ and $e_B$ is simply a product of uncorrelated distributions, $P(e_S,e_B)=P_S(e_S)\,P_B(e_B)$. Then $P(e)$ is given by

$$\begin{aligned}
P(e) &= \int de_S\,de_B\,\delta(e-e_S-e_B)\,P(e_S,e_B)\\
&= \int de_B\,P_S(e-e_B)\,P_B(e_B),
\end{aligned} \tag{3}$$

which has a convolution form. The cumulants $\kappa_n$ are defined using the cumulant-generating function $K(t)$,

$$K(t)=\sum_{n=1}^{\infty}\kappa_n\frac{t^n}{n!}=\log\langle\exp(te)\rangle, \tag{4}$$

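where $\langle\cdots\rangle$ denotes the expectation value. Note the additivity of cumulants, $\kappa_n(e)=\kappa_n(e_S)+\kappa_n(e_B)$, which will allow us to cancel uncorrelated contributions in the subtraction between cumulants. Also, cumulants are in one-to-one correspondence with the moments $\langle e^n\rangle$: $\kappa_1=\langle e\rangle$, $\kappa_2=\langle e^2\rangle-\langle e\rangle^2$, $\kappa_3=\langle e^3\rangle-3\langle e^2\rangle\langle e\rangle+2\langle e\rangle^3$, etc.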
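The additivity of cumulants for independent contributions can be checked numerically. The following is a minimal sketch with toy distributions: the gamma-distributed "signal" and exponential "background" are arbitrary choices made for illustration and are not taken from the simulations in this paper.

```python
import numpy as np

def first_three_cumulants(x):
    """Estimate (kappa_1, kappa_2, kappa_3) from samples.

    kappa_1 is the mean; kappa_2 and kappa_3 equal the second and
    third central moments.
    """
    mean = np.mean(x)
    centered = x - mean
    return np.array([mean, np.mean(centered**2), np.mean(centered**3)])

rng = np.random.default_rng(seed=1)
n_events = 400_000

# Toy distributions, chosen only for illustration.
e_S = rng.gamma(shape=2.0, scale=1.5, size=n_events)  # "signal"
e_B = rng.exponential(scale=0.7, size=n_events)       # independent "background"

kappa_S = first_three_cumulants(e_S)
kappa_B = first_three_cumulants(e_B)
kappa_total = first_three_cumulants(e_S + e_B)

for n, (total, parts) in enumerate(zip(kappa_total, kappa_S + kappa_B), start=1):
    print(f"kappa_{n}: total = {total:.3f}, signal + background = {parts:.3f}")
```

The first cumulant is additive for any two contributions. For the higher cumulants, additivity relies on statistical independence and holds here only up to sampling fluctuations.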

We define the jet substructure observable $\hat{\tau}$, which is closely related to the jet invariant mass and receives additive contributions from signal and background,

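$$\hat{\tau}=2\cosh(\eta)\sum_{i\in\text{jet}}p_i^{+}=\hat{\tau}_S+\hat{\tau}_B=\frac{m_J^2}{p_T}\left[1+\mathcal{O}\!\left(\frac{m_J^2}{p_T^2}\right)\right], \tag{5}$$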

where $p_T$ and $\eta$ are the jet transverse momentum and pseudorapidity with respect to the beam axis, and $p_i^{+}$ is the small light-cone component of the $i$-th constituent’s momentum with respect to the jet axis. Then for the dimensionless observable $\tau=\hat{\tau}/p_T\simeq m_J^2/p_T^2$, up to corrections suppressed by $m_J^2/p_T^2$, Eq. (3) becomes

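$$\frac{d\sigma}{d\tau\,dp_T\,d\eta}(\tau,p_T)=\int d\hat{\tau}_B\,\frac{d\sigma^{\rm corr}}{d\tau\,dp_T\,d\eta}\!\left(\tau-\frac{\hat{\tau}_B}{p_T},\,p_T\right)f(\hat{\tau}_B), \tag{6}$$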

where the function $f(\hat{\tau}_B)$ is the normalized probability distribution of the SUE contribution to the observable $\hat{\tau}$. It was shown that this convolutional expression with a simple model for $f$ describes well the MPI contribution in Monte Carlo simulations Stewart et al. (2015); Hoang et al. (2017); Kang et al. (2018a) and experimental measurements Kang et al. (2018b). Note that the expression in Eq. (6) resembles the factorization of hadronization contributions derived using the operator product expansion Lee and Sterman (2007), although hadronization is correlated with the hard process. Here, $f$ only includes SUEs and is observed to be independent of the jet pseudorapidity $\eta$ in the plateau region. Because hadronization effects have a similar convolution structure, they are also largely removed in the subtracted cumulant we define below for proton-proton collisions.

For the first cumulant (equivalently, the first moment), which is denoted by $\langle\tau\rangle$ and is a function of the jet $p_T$ and $\eta$,

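$$\langle\tau\rangle=\left(\frac{d\sigma}{dp_T\,d\eta}\right)^{-1}\int d\tau\,\tau\,\frac{d\sigma}{d\tau\,dp_T\,d\eta}=\langle\tau^{\rm corr}\rangle+\frac{\Omega_f}{p_T}, \tag{7}$$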

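where $\langle\tau^{\rm corr}\rangle$ is the first cumulant in the absence of SUEs and $\Omega_f=\int d\hat{\tau}_B\,\hat{\tau}_B\,f(\hat{\tau}_B)$, the first moment of $f$, is independent of the hard scale $p_T$. Therefore one can define an SUE-independent observable by taking the derivative of the $p_T$-weighted cumulant,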

$$\frac{d}{dp_T}\Big(p_T\langle\tau\rangle\Big)=\frac{d}{dp_T}\Big(p_T\langle\tau^{\rm corr}\rangle\Big). \tag{8}$$
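The step from Eq. (6) to Eqs. (7) and (8) can be made explicit with a change of variables. The following is a sketch that assumes the background distribution $f$ is normalized to unity and that boundary effects in the $\tau$ integration can be neglected; the shorthand $\tau'$ is introduced here only for illustration, and $\Omega_f$ is taken to be the first moment of $f$.

$$\begin{aligned}
\int d\tau\,\tau\,\frac{d\sigma}{d\tau\,dp_T\,d\eta}
&=\int d\hat{\tau}_B\,f(\hat{\tau}_B)\int d\tau'\left(\tau'+\frac{\hat{\tau}_B}{p_T}\right)\frac{d\sigma^{\rm corr}}{d\tau'\,dp_T\,d\eta},
\qquad \tau'\equiv\tau-\frac{\hat{\tau}_B}{p_T},\\
&=\frac{d\sigma^{\rm corr}}{dp_T\,d\eta}\left[\langle\tau^{\rm corr}\rangle+\frac{1}{p_T}\int d\hat{\tau}_B\,\hat{\tau}_B\,f(\hat{\tau}_B)\right].
\end{aligned}$$

Integrating Eq. (6) over $\tau$ with $\int d\hat{\tau}_B\,f(\hat{\tau}_B)=1$ gives $d\sigma/(dp_T\,d\eta)=d\sigma^{\rm corr}/(dp_T\,d\eta)$, so dividing by the rate reproduces Eq. (7). Multiplying by $p_T$ then gives

$$p_T\langle\tau\rangle=p_T\langle\tau^{\rm corr}\rangle+\Omega_f,$$

and because $\Omega_f$ does not depend on $p_T$, it drops out of the derivative in Eq. (8).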

For a binned cross section $\sigma^{[i,j]}$ of the $i$-th bin in $\tau$ and $j$-th bin in jet $p_T$, the cumulant of the $j$-th $p_T$ bin is the following,

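$$\langle\tau\rangle^{[j]}=\frac{\sum_i\tau^{[i]}\sigma^{[i,j]}}{\sum_i\sigma^{[i,j]}}=\langle\tau^{\rm corr}\rangle^{[j]}+\Omega_f\,\langle p_T^{-1}\rangle^{[j]}, \tag{9}$$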

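where $\tau^{[i]}$ is the central value of the $i$-th $\tau$ bin and $\langle p_T^{-1}\rangle^{[j]}$ is the average of $p_T^{-1}$ in the $j$-th $p_T$ bin. We then define the subtracted cumulant $\Delta^{jk}_{\tau}$, with the same mass dimension as $\langle\tau\rangle$,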

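$$\Delta^{jk}_{\tau}=\langle\tau\rangle^{[j]}-\langle\tau\rangle^{[k]}\,\frac{\langle p_T^{-1}\rangle^{[j]}}{\langle p_T^{-1}\rangle^{[k]}}=\langle\tau^{\rm corr}\rangle^{[j]}-\langle\tau^{\rm corr}\rangle^{[k]}\,\frac{\langle p_T^{-1}\rangle^{[j]}}{\langle p_T^{-1}\rangle^{[k]}}. \tag{10}$$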

The model function dependence vanishes and we are left with purely signal-correlated contributions. Note that we do not have to assume any specific form for the model function, $f$.
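The binned definitions in Eqs. (9) and (10) translate directly into a few lines of array code. The sketch below uses made-up cross sections, bin centers, and an arbitrary background moment; the function names are hypothetical and only illustrate how the background term cancels.

```python
import numpy as np

def binned_cumulant(tau_centers, sigma):
    """Eq. (9): first cumulant of tau in each pT bin.

    tau_centers has shape (n_tau,); sigma[i, j] is the cross section
    in the i-th tau bin and j-th pT bin.
    """
    return tau_centers @ sigma / sigma.sum(axis=0)

def subtracted_cumulant(mean_tau, mean_inv_pt, j, k):
    """Eq. (10): <tau>[j] - <tau>[k] * <1/pT>[j] / <1/pT>[k]."""
    return mean_tau[j] - mean_tau[k] * mean_inv_pt[j] / mean_inv_pt[k]

# Hypothetical inputs: four tau bins (rows) and three pT bins (columns).
tau_centers = np.array([0.005, 0.015, 0.025, 0.035])
sigma_corr = np.array([[4.0, 5.0, 6.0],
                       [3.0, 3.0, 3.0],
                       [2.0, 1.5, 1.0],
                       [1.0, 0.5, 0.2]])
mean_inv_pt = np.array([1 / 220.0, 1 / 320.0, 1 / 450.0])  # GeV^-1, made up

mean_tau_corr = binned_cumulant(tau_centers, sigma_corr)

# Add a background shift Omega_f * <1/pT>[j], as on the right of Eq. (9).
omega_f = 3.0  # GeV, arbitrary value for illustration
mean_tau = mean_tau_corr + omega_f * mean_inv_pt

delta_with_bkg = subtracted_cumulant(mean_tau, mean_inv_pt, j=1, k=0)
delta_signal = subtracted_cumulant(mean_tau_corr, mean_inv_pt, j=1, k=0)
print(delta_with_bkg, delta_signal)
```

The two printed values agree up to floating-point rounding for any choice of `omega_f`, which is the cancellation stated in Eq. (10).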

Removal of soft uncorrelated emissions. We discuss and demonstrate using Pythia simulations that subtracted cumulants are indeed insensitive to MPI and PU contributions in proton-proton collisions. We compare the results with the perturbative calculation performed in Kang et al. (2018b) at next-to-leading logarithmic and next-to-leading order accuracy (NLL′+NLO) using SCET. (See also Dasgupta et al. (2012); Chien et al. (2013); Jouttenus et al. (2013); Liu et al. (2015); Hornig et al. (2017) for previous jet mass calculations.) Within theoretical uncertainties, the calculation agrees well with the simulations even for a large number of PU events.

The NLL′+NLO result is obtained by matching the resummed and fixed-order results,

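$$\frac{d\sigma^{\rm NLL'+NLO}}{d\tau\,dp_T\,d\eta}=\frac{d\sigma^{\rm NLL'}}{d\tau\,dp_T\,d\eta}+\frac{d\sigma^{\rm NLO}}{d\tau\,dp_T\,d\eta}-\frac{d\sigma^{\text{NLO-sing.}}}{d\tau\,dp_T\,d\eta}, \tag{11}$$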

where $d\sigma^{\rm NLL'}$, $d\sigma^{\rm NLO}$, and $d\sigma^{\text{NLO-sing.}}$ are the resummed, fixed-order, and fixed-order singular cross sections, respectively. The NLO result is obtained using MadGraph 5 Alwall et al. (2014). For the simulation, we use Pythia Sjostrand et al. (2006, 2008) with the ATLAS-A14-variation-2 tune. We study the effect of MPI on subtracted cumulants by switching its contribution on and off in Pythia. PU events are simulated by soft QCD processes and added on top of signal events, and the number of PU events follows a Poisson distribution. Here we present results for mean PU event numbers of 7.5 and 50. Jets are reconstructed using the anti-$k_T$ algorithm Cacciari et al. (2008) implemented in FastJet Cacciari et al. (2012).

We first show in FIG. 1 the contributions of MPI and PU to the $\tau$ distributions. Both MPI and PU affect the peak position of the distribution significantly, especially for lower-$p_T$ jets with a large jet radius $R$. FIG. 2 then shows the results of subtracted jet mass cumulants. The blue band is the theoretical uncertainty of the NLL′+NLO calculation, estimated by varying the characteristic energy scales by a factor of two. Remarkably, the simulation results from Pythia for different cases with and without MPI or PU contributions all agree with the analytic calculation of the signal distribution within theoretical uncertainty. This clearly demonstrates that the proposed subtracted cumulants largely mitigate contributions from UE and PU.

Modification for high luminosity collisions. For the situation with large background contamination from PU at HL-LHC or UE in HIC, the jet is significantly altered by SUEs and the jet mass is no longer an additive observable from jet constituents. Therefore, we instead consider the observable, $\hat{\tau}$, defined in Eq. (5), which is explicitly additive. Note that the jet direction is assumed to be only mildly affected by a large but approximately uniform background, or one can use a recoil-free axis Bertolini et al. (2014b). On the other hand, since SUE contamination alters the value of the jet $p_T$ significantly, in order to compare subtracted cumulants between experiment and theory we need to correct for the jet $p_T$ bin migration. This can be effectively achieved using the area subtraction method Cacciari and Salam (2008); Cacciari et al. (2010), and we refer to the result as the corrected $p_T$. The subtracted cumulants for $\hat{\tau}$ are defined as follows,

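$$\Delta^{jk}_{\hat{\tau}}=\langle\hat{\tau}\rangle^{[j]}-\langle\hat{\tau}\rangle^{[k]}=\langle\hat{\tau}^{\rm corr}\rangle^{[j]}-\langle\hat{\tau}^{\rm corr}\rangle^{[k]}, \tag{12}$$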

where the indices $j$ and $k$ label the corrected $p_T$ bins. Note that the subtracted cumulant of $\hat{\tau}$ above differs from Eq. (10) in the $p_T$-weighting factor.

In FIG. 3 we demonstrate the robustness of $\Delta^{jk}_{\hat{\tau}}$ against large SUEs by comparing the Pythia partonic result to the one including MPI and PU with 200 PU events, which is typical at HL-LHC and can give an indication of how this observable removes SUEs in HIC. Note the remarkable agreement between the two results. In practice, we use the approximation $\hat{\tau}\simeq m_J^2/p_T$ from Eq. (5), which is in terms of the well-studied invariant mass. For this reason, and in contrast to the previous plots, we choose to subtract the highest, instead of the lowest, $p_T$ bin, where this approximation is more accurate.
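The cancellation in Eq. (12) can also be illustrated with a toy Monte Carlo. This sketch is not the Pythia study described above: the signal distributions, the per-particle background model, and all numerical values are assumptions chosen for illustration, with the same background distribution added in both bins.

```python
import numpy as np

rng = np.random.default_rng(seed=7)
n_jets = 200_000

def toy_background(size, mean_pu=200, scale=0.05):
    """Toy SUE contribution to tau-hat (GeV).

    The number of soft particles is Poisson distributed with mean mean_pu,
    and each contributes an exponentially distributed amount with mean
    `scale`. The sum of N such terms follows a Gamma(N, scale) distribution.
    """
    n_particles = rng.poisson(mean_pu, size=size)
    return rng.gamma(shape=n_particles, scale=scale)

# Toy signal tau-hat samples (GeV) in two pT bins; purely illustrative.
tau_hat_S = {"j": rng.gamma(shape=4.0, scale=7.5, size=n_jets),
             "k": rng.gamma(shape=4.0, scale=5.0, size=n_jets)}

# The same background distribution is added in both bins.
tau_hat = {b: s + toy_background(n_jets) for b, s in tau_hat_S.items()}

delta_signal = tau_hat_S["j"].mean() - tau_hat_S["k"].mean()
delta_total = tau_hat["j"].mean() - tau_hat["k"].mean()
shift = tau_hat["j"].mean() - tau_hat_S["j"].mean()

print(f"background shift of <tau-hat> in bin j: {shift:.2f} GeV")
print(f"Delta (signal only): {delta_signal:.3f} GeV")
print(f"Delta (signal + background): {delta_total:.3f} GeV")
```

Each bin's mean is shifted by a large amount, but the difference between bins is unchanged up to statistical fluctuations, because the shift is the same in both bins.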

Sensitivity to quark/gluon jet fraction. We discuss the sensitivity of subtracted cumulants to quark and gluon jet fractions, $f_q$ and $f_g$, respectively. Assuming that the fractions vary slowly within each $p_T$ bin $j$, the $\tau$ distribution is a weighted sum of the corresponding quark and gluon distributions,

$$\frac{d\sigma^{[j]}}{d\tau}=f_g^{[j]}\frac{d\sigma_g^{[j]}}{d\tau}+\left(1-f_g^{[j]}\right)\frac{d\sigma_q^{[j]}}{d\tau}, \tag{13}$$

and similarly for $\langle\tau\rangle^{[j]}$,

$$\langle\tau\rangle^{[j]}=f_g^{[j]}\langle\tau\rangle_g^{[j]}+\left(1-f_g^{[j]}\right)\langle\tau\rangle_q^{[j]}. \tag{14}$$

Since and , we have . The subtracted cumulants are

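$$\Delta^{jk}_{\tau}=\Delta^{jk}_{\tau,q}+\left[f_g^{[j]}\left(\langle\tau\rangle_g-\langle\tau\rangle_q\right)^{[j]}-f_g^{[k]}\left(\langle\tau\rangle_g-\langle\tau\rangle_q\right)^{[k]}\frac{\langle p_T^{-1}\rangle^{[j]}}{\langle p_T^{-1}\rangle^{[k]}}\right]. \tag{15}$$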
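The decomposition in Eq. (15) follows from inserting Eq. (14) into Eq. (10), and it can be verified numerically. The per-bin cumulants and gluon fractions below are made-up numbers used only to check the algebra and to show that a different gluon fraction gives a different subtracted cumulant.

```python
import numpy as np

def subtracted_cumulant(mean_tau, mean_inv_pt, j, k):
    """Eq. (10): <tau>[j] - <tau>[k] * <1/pT>[j] / <1/pT>[k]."""
    return mean_tau[j] - mean_tau[k] * mean_inv_pt[j] / mean_inv_pt[k]

# Hypothetical per-bin inputs for three pT bins.
mean_inv_pt = np.array([1 / 220.0, 1 / 320.0, 1 / 450.0])  # GeV^-1
tau_q = np.array([0.010, 0.008, 0.0065])   # pure-quark cumulants
tau_g = np.array([0.018, 0.014, 0.011])    # pure-gluon cumulants
f_g = np.array([0.70, 0.62, 0.55])         # gluon fractions

j, k = 2, 0
ratio = mean_inv_pt[j] / mean_inv_pt[k]
diff = tau_g - tau_q

# Left-hand side: mix first with Eq. (14), then subtract with Eq. (10).
mean_tau = f_g * tau_g + (1 - f_g) * tau_q
lhs = subtracted_cumulant(mean_tau, mean_inv_pt, j, k)

# Right-hand side of Eq. (15): quark result plus the fraction-dependent term.
rhs = (subtracted_cumulant(tau_q, mean_inv_pt, j, k)
       + f_g[j] * diff[j] - f_g[k] * diff[k] * ratio)

# A different gluon fraction changes the subtracted cumulant.
f_g_alt = 0.8 * f_g
lhs_alt = subtracted_cumulant(f_g_alt * tau_g + (1 - f_g_alt) * tau_q,
                              mean_inv_pt, j, k)
print(lhs, rhs, lhs_alt)
```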

We use Pythia to simulate pure quark and gluon jets, and we mix the samples manually using the parameterized function $f_g(p_T;a,b)$ (see Appendix A for details). Within the $p_T$ range of interest we examine two scenarios in which the gluon jet fraction is larger (model-1) or smaller (model-2) than the expected value in proton-proton collisions.

FIG. 4 shows the gluon jet fraction and subtracted cumulant as a function of jet $p_T$ for model-1 and model-2, as well as theoretical predictions at NLL accuracy for proton-proton collisions. We find that a change of quark-gluon jet fraction can induce a significant change of the subtracted cumulant that is distinguishable within the theoretical precision. Precise measurements of subtracted cumulants of inclusive jets (gluon-enriched) and photon-tagged jets (quark-enriched) will then give useful information about the different quark-gluon jet fractions as well as subtracted cumulants of pure quark and gluon jet samples. Since quark and gluon jets are initiated by partons with different color charges, one expects that the two are quenched differently and thus their fractions may change from proton-proton to HIC Spousta and Cole (2016); Chien and Vitev (2016). The fraction change can induce modifications of jet substructure which should be disentangled from the jet-by-jet modification, for which subtracted cumulants can be very useful.

Comparison with experimental data. We compare our analytic calculation and simulation to subtracted cumulants calculated from the experimental data measured by the ATLAS collaboration at the LHC at center-of-mass energies of 7 TeV Aad et al. (2012) and 5.02 TeV collaboration (2018).

FIG. 5 shows the results for the NLL′+NLO calculation (blue band) and Pythia simulations with (black) or without (red) MPI effect and hadronization. The data points are calculated from ATLAS measurements of jet mass distributions. The error bars include only the statistical uncertainty and are calculated from the variance of $\langle\tau\rangle$: $\mathrm{Var}(\langle\tau\rangle)=\mathrm{Var}(\tau)/N$, where $\mathrm{Var}(\tau)$ is the variance of the $\tau$ distribution and $N$ is the total number of jets estimated from the integrated luminosity $\mathcal{L}$ and the cross section $\sigma$: $N=\mathcal{L}\,\sigma$. The statistical error in these experiments is small, resulting in the small error bars in the plots. Including the systematic uncertainty requires experimental details and is beyond the scope of this work. For the 7 TeV case, only the differential distributions in jet mass $m_J$ are available rather than in $\tau$; thus we redefine the subtracted cumulant in terms of the cumulant of $s\equiv m_J^2$ as follows,

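$$\Delta^{jk}_{s}=\langle s\rangle^{[j]}-\langle s\rangle^{[k]}\,\frac{\langle p_T\rangle^{[j]}}{\langle p_T\rangle^{[k]}}=\langle s^{\rm corr}\rangle^{[j]}-\langle s^{\rm corr}\rangle^{[k]}\,\frac{\langle p_T\rangle^{[j]}}{\langle p_T\rangle^{[k]}}. \tag{16}$$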

This redefinition is only necessary due to the large bin sizes in the experiment. The average values $\langle p_T\rangle^{[j]}$ are not given in Aad et al. (2012), and we use the ones generated by Pythia including hadronization and underlying event contributions, since these quantities are well described by simulations.
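The procedure for turning a binned distribution into a data point with a statistical error bar, and then into the subtracted cumulant of Eq. (16), can be sketched as follows. All inputs are made up, the function names are hypothetical, and $s$ is taken here to be the squared jet mass, which is an assumption of this sketch. The error is the standard error of the mean, the square root of the distribution's variance divided by the number of jets.

```python
import numpy as np

def mean_and_stat_error(centers, counts, n_jets):
    """Mean of a binned distribution and its statistical error.

    The error is sqrt(Var / N), with Var the variance of the binned
    distribution and N the number of jets.
    """
    weights = counts / counts.sum()
    mean = np.sum(weights * centers)
    variance = np.sum(weights * (centers - mean) ** 2)
    return mean, np.sqrt(variance / n_jets)

def subtracted_cumulant_s(mean_s, mean_pt, j, k):
    """Eq. (16): <s>[j] - <s>[k] * <pT>[j] / <pT>[k]."""
    return mean_s[j] - mean_s[k] * mean_pt[j] / mean_pt[k]

# Hypothetical binned s distributions (GeV^2) in two pT bins.
s_centers = np.array([500.0, 1500.0, 2500.0, 3500.0])
counts = {0: np.array([40.0, 30.0, 20.0, 10.0]),
          1: np.array([25.0, 30.0, 25.0, 20.0])}
n_jets = {0: 1.0e6, 1: 4.0e5}        # luminosity times cross section, made up
mean_pt = np.array([250.0, 350.0])   # GeV, made up

results = {b: mean_and_stat_error(s_centers, counts[b], n_jets[b]) for b in (0, 1)}
mean_s = np.array([results[0][0], results[1][0]])
delta_s = subtracted_cumulant_s(mean_s, mean_pt, j=1, k=0)

print("means:", mean_s, "errors:", [results[b][1] for b in (0, 1)])
print("Delta_s:", delta_s)
```

How the per-bin errors are propagated to the subtracted cumulant is not specified in the text and is left out of this sketch.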

For both the 7 TeV and 5.02 TeV cases, we find that the results of analytic calculations and simulations are in good agreement with the experimental data.

Conclusions. In this paper, we extend the work in Kang et al. (2018a) to jet substructure observables and introduce a new method of comparing theoretical calculations of jet substructure observables to data using subtracted cumulants. The method makes the comparison insensitive to soft uncorrelated emissions such as multiparton interactions and pile-up, without using background subtraction algorithms to correct each jet or having to model uncorrelated effects. Our theoretical prediction at NLL′+NLO accuracy using SCET shows excellent agreement with the subtracted cumulants calculated from two independent ATLAS jet mass measurements and those from Pythia simulations. We also demonstrate that subtracted jet substructure cumulants remove large background contamination from up to 200 pile-up events. This robustness makes subtracted cumulants useful for jet studies at the high-luminosity LHC and in heavy-ion collisions, where the identification of signal jets is challenged by a large background. We also show that subtracted cumulants are sensitive to the change of quark-gluon jet fraction. This could allow for precise determination of the fraction and its modification in heavy-ion collisions, which will be useful for discriminating possible medium effects and contributions. The mitigation of UE with flow modulation in HIC will be studied in future work.

###### Acknowledgements.
The authors would like to thank Yongsun Kim, Yen-Jie Lee, Christopher Lee, Duff Neill, Felix Ringer, Iain Stewart, Jesse Thaler and Ivan Vitev for useful conversations during the completion of this work. YTC is supported by the LHC Theory Initiative Postdoctoral Fellowship under the National Science Foundation grant PHY-1419008. DK is supported by the National Natural Science Foundation of China under Grant No. 11875112. KL is supported by the National Science Foundation under Grants No. PHY-1316617 and No. PHY-1620628. YM is supported by the DOE Office of Science under Contract DE-AC52-06NA25396, the Early Career Program (Christopher Lee, P.I.) and the LDRD Program at LANL.

## Appendix A Dependence of subtracted cumulants on quark and gluon jet fraction

We give the details of the parameterization used in FIG. 4 for the two models with different quark and gluon jet fractions. The gluon fraction, $f_g$, has the following power-law modification form,

$$f_g(p_T;a,b)=f_g^{\rm NLL}(p_T)\left(\frac{p_T}{a}\right)^{b}, \tag{17}$$

where $a$ and $b$ can be varied. The function $f_g^{\rm NLL}(p_T)$ is the analytic result extracted from the NLL calculation of the inclusive cross section Kang et al. (2016). It is the gluon fraction at the jet scale, obtained by evolving the partons produced at the hard scale. We check using Pythia simulations and find that the distribution formed by weighting the pure quark and gluon distributions from the respective hard processes with the fraction $f_g^{\rm NLL}$ agrees well with the full simulation. For the models 1 and 2 in FIG. 4 we choose the following parameters,

$$\begin{aligned}
\text{model 1:}\quad & a=120~\text{GeV},\quad b=+1/3,\\
\text{model 2:}\quad & a=120~\text{GeV},\quad b=-1/2.
\end{aligned} \tag{18}$$
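The behavior of the two parameter choices in Eq. (18) can be seen from a short sketch of Eq. (17). The baseline function below is a hypothetical stand-in, since the actual NLL gluon fraction is not reproduced in this text; only the power-law factor $(p_T/a)^b$ is taken from Eq. (17).

```python
import numpy as np

def f_g_baseline(pt):
    """HYPOTHETICAL stand-in for the NLL gluon fraction; not the real result."""
    return 0.7 * (pt / 100.0) ** (-0.2)

def f_g_model(pt, a, b, baseline=f_g_baseline):
    """Eq. (17): baseline gluon fraction times (pT / a)**b."""
    return baseline(pt) * (pt / a) ** b

pt = np.array([120.0, 240.0, 480.0])  # GeV

model_1 = f_g_model(pt, a=120.0, b=1.0 / 3.0)   # Eq. (18), model 1
model_2 = f_g_model(pt, a=120.0, b=-0.5)        # Eq. (18), model 2
baseline = f_g_baseline(pt)

for row in zip(pt, baseline, model_1, model_2):
    print("pT = {:5.0f} GeV: baseline {:.3f}, model 1 {:.3f}, model 2 {:.3f}".format(*row))
```

At $p_T=a$ both models reduce to the baseline. Above it, the positive exponent of model 1 raises the gluon fraction and the negative exponent of model 2 lowers it.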

Also, we demonstrate that different quark and gluon jet fractions can give the same subtracted cumulants, as shown in FIG. 6. For simplicity, we assume that the cumulants of pure quark and pure gluon jets depend linearly on jet $p_T$, and the subtracted cumulants are defined in Eq. (15). The left panel shows the cumulants corresponding to different quark and gluon jet fractions: pure quark, pure gluon, and two interpolations between pure quark and pure gluon across jet $p_T$, as well as cases A and B that sit between the pure quark and gluon cases. The right panel shows the subtracted cumulants. We can clearly see that cases A and B give the same subtracted cumulant, and that the cases of pure quark and gluon jets do not represent extreme values of subtracted cumulants.
