A CPU time measurements

Reweighting QCD matrix-element and parton-shower calculations


We present the implementation and validation of the techniques used to efficiently evaluate parametric and perturbative theoretical uncertainties in matrix-element plus parton-shower simulations within the Sherpa event-generator framework. By tracing the full and PDF dependences, including the parton-shower component, as well as the fixed-order scale uncertainties, we compute variational event weights on-the-fly, thereby greatly reducing the computational costs to obtain theoretical-uncertainty estimates.
Keywords: QCD, NLO, Monte Carlo generators, Matrix Elements, Parton Showers

1 Introduction

The first operational run of the LHC collider during the years 2009-2013 was a tremendous success, clearly culminating in the announcement of the discovery of a Higgs-boson candidate by the ATLAS and CMS collaborations in July 2012 [1, 2]. Through a large number of experimental analyses, focusing on a variety of final states and observables, the LHC experiments (re)established and underpinned to an unprecedented level of accuracy the validity of the Standard Model of particle physics (SM) [3].

When comparing theoretical predictions with actual collider data, Monte-Carlo event generators prove to be an indispensable tool. In particular parton-shower Monte-Carlo programmes like Herwig [4, 5], Pythia [6] and Sherpa [7, 8] provide simulations at the level of exclusive particle-level final states [9]. The cornerstones of these generators are their implementations of QCD parton-shower algorithms and their modelling of the non-perturbative parton-to-hadron fragmentation process. With the advent of sophisticated techniques to combine parton-shower simulations with exact higher-order QCD calculations at leading [10, 11], next-to-leading [12, 13] and even next-to-next-to-leading order [14, 15, 16, 17], Monte-Carlo simulations have developed into high-precision tools, encapsulating the best of our current knowledge of perturbative QCD.

With these simulations being widely used for making SM predictions, e.g. of the background expectation in searches for New Physics or the detailed properties of Higgs-boson production final states, a comprehensive and efficient evaluation of associated theoretical uncertainties is of utmost importance. A comprehensive list of sources for generator uncertainties has been quoted in [18]. Following the categories identified there, when focusing on systematics related to the perturbative phases of event evolution, the following uncertainties might be distinguished:

  • Parametric uncertainties reflecting the dependence of the prediction on input parameters such as couplings, particle masses or the parton-density functions (PDFs).

  • Perturbative uncertainties originating from the fact that perturbation theory is used in making predictions, to fixed-order in the matrix elements and resummed to all-orders with a certain logarithmic accuracy in the showers, thereby, however, neglecting higher-order contributions. Similarly, the use of the large- approximation in the showers belongs in this category.

  • Algorithmic uncertainties corresponding to the actual choices made in the implementation of the shower algorithm, i.e. for the evolution variable, the inclusion of non-singular terms in the splitting functions, or the employed matching/merging prescription. Per construction, for sensible choices, these systematics also correspond to higher-order perturbative corrections, but might be addressed separately.

In addition to the listed categories, generically non-perturbative effects such as hadronisation or the underlying event are described through phenomenological models that feature various generator-specific choices and parameters, typically subject to tuning against experimental data, see for instance [19, 20].

This publication focuses on the efficient evaluation of parametric and (some) perturbative uncertainties in matrix-element plus parton-shower simulations within the Sherpa event-generator framework. We present a comprehensive approach to fully trace the and PDF dependences in the matrix-element and parton-shower components of particle-level Sherpa simulations in leading- [21] and next-to-leading [22] order merged calculations based on the Sherpa dipole-shower implementation [23]. Furthermore, we provide the means to quickly evaluate the renormalisation- and factorisation-scale dependence of the fixed-order matrix-element contributions. Our approach is based on event-wise reweighting and allows us to provide with a single generator run a set of variational event weights corresponding to the predefined parameter and scale variations, that would otherwise have to be determined through dedicated re-evaluations. The alternative event weights can either be accessed through the output of a HepMC event record [24], or directly passed via the internal interface of Sherpa to the Rivet analysis framework [25].

The systematics of leading-order parton-shower simulations with Herwig 7 have recently been discussed in [18], a corresponding reweighting procedure has been presented in [26]. A similar reweighting implementation for the Pythia 8 parton shower has also appeared recently [27]. A discussion of uncertainty estimates for the Vincia shower model can be found in [28, 29, 30]. A comprehensive comparison of various generators is presented in [31]. The impact of PDFs in parton-shower simulations has been discussed in [32, 33].

Our paper is organised as follows. In Sec. 2 we review the dependence structure of leading-order (LO) and next-to-leading-order QCD calculations on , the PDFs and the renormalisation and factorisation scales, and introduce the reweighting approach. In Sec. 3 we extend this to parton-shower simulations and in particular the algorithm employed in the Sherpa framework. In Sec. 4 we present the generalisation of the reweighting approach to multijet-merged calculations, based on leading and next-to-leading-order matrix elements matched to the parton shower. Our conclusions are summarised in Sec. 6. In App. A we present CPU time measurements that assess the reduction in computational time when the reweighting is used. The technical details on enabling and accessing the variations considered in Sherpa runs are listed in App. B.

Note, while fixed-order reweighting is already available with Sherpa-2.2, the general reweighting implementation described here, including parton showers and multijet merging, will be part of the next release, i.e. Sherpa-2.3.

2 Reweighting fixed-order calculations

In order to re-evaluate a QCD cross-section calculation for a new choice of input parameters, i.e. , PDFs or renormalisation and factorisation scales, it is necessary to understand and trace-out its respective dependences. This is a rather easy task at leading-order (LO) but is already more involved when considering next-to-leading order (NLO) calculations in a given subtraction scheme. However, these decompositions have been presented for Catani–Seymour dipole subtraction and the FKS subtraction formalism in [34, 35].

In this section, we briefly review the dependence structure and discuss the corresponding reweighting equations for LO and Catani–Seymour subtracted NLO calculations within the Sherpa framework. With this paragraph we also introduce the notation used in the later sections, which explore the reweighting of more intricate QCD calculations, involving QCD parton showers and merging different final-state multiplicity processes.

2.1 The leading-order case

A LO parton-level calculation of some observable or measurement function of the final-state momenta is based on Born matrix elements of . It exhibits explicit dependences on the PDFs , the running strong coupling , the renormalisation scale and the factorisation scale :


with , denoting the number of attempts to generate an accepted event configuration, and


Therein, the contains all couplings, symmetry and flux factors, and PDFs, whereas has the PDFs, here for assumed two incoming parton flavours and , and the strong coupling stripped off. Note that we have suppressed the event index here. It is understood that depends on the event kinematics and that and can be chosen dynamically, i.e. in a momentum (and flavour) dependent way. Changing the input parameters , , and the input functions , results in


From eq. (2.3) we conclude that for PDF reweighting it is necessary to know the values of the event.

For an unweighted event generation, the event weights are uniform initially, i.e. , eq. (2.1) thus simplifies to


Scale and parameter variations then work the very same way as for weighted events. Applying eq. (2.3) then, however, leads to a broader weight distribution and eq. (2.1) has to be used again. Partially unweighted events can be treated on the same footing. These conclusions hold irrespective of the type of event generation whenever (partially) unweighted event generation is possible, i.e. when the weight distribution is bounded from above and below. We therefore will not comment further on it.

2.2 The next-to-leading-order case

A full NLO parton-level calculation including real-emission and one-loop corrections of based in Catani–Seymour dipole subtraction [36, 37] has the following structure


where the new parts have the following dependences


Therein, combines the renormalised one-loop matrix element with the -operator of the Catani-Seymour subtraction scheme. This operator gives the flavour-diagonal endpoint contribution of the integrated subtraction terms. is thus separately infrared finite and exhibits a common transformation behaviour. Thus, for , , and


with - and PDF-independent coefficients and . Again, is stripped of all coupling and PDF factors.

The -terms are defined as the remainders of the integrated dipole subtraction terms, containing all flavour changing and -dependent pieces, combined with the collinear counterterms. Here, are the ratios of the partonic momentum fractions in the respective dipole before and after radiation. Again, this combination is separately infrared finite and transforms as one unit. When evaluated for the modified set of input parameters, they read


with the coefficients for , , and

for , respectively. Thereby, the sum over includes all light-quark flavours, corresponding to all potential quarks emitting a gluon. We note that in order to obtain the reweighted expressions for the and contributions the additional book-keeping of the , and (altogether 18)1 coefficients is required [34]. Due to its composite structure, the -terms do not possess a coupling- and PDF-stripped version . Nonetheless, we formally introduce a still PDF-dependent version in eq. (2.8) for reference in later sections.

The remaining pieces of eq. (2.5) are the Born matrix element , the real emission contribution and the differential dipole subtraction terms . The latter defines an underlying Born configuration through its dipole-dependent phase-space map, employing the phase-space factorisation . While the transformation of under the exchange of input parameters was detailed in eq. (2.3), the transformation of and the contributions works identically, merely having to adjust the power of the strong-coupling factor.

2.3 Validation

The reweighting approach outlined above has been implemented in the Sherpa framework for the two matrix-element generators Amegic [38] and Comix [39, 40] in conjunction with the corresponding Catani–Seymour dipole-subtraction implementation [41]. The required decomposition of virtual amplitudes is generic and can be used for matrix elements from BlackHat [42, 34], OpenLoops [43], GoSam [44], Njet [45], the internal library of simple processes, or, via the BLHA interface [46].

Here we shall present the validation of the reweighting approach in particular of NLO QCD event samples. For that purpose we consider \PW-boson production in \SI13\TeV proton-proton collisions at NLO QCD, and focus on the transverse-momentum distribution for the \PWand the lepton it decays to. In Fig. 1, the scale, and PDF uncertainty bands for the \PW and the lepton distributions are presented. All three bands have been produced for both observables using the internal reweighting of Sherpa from a single event generation run using with


a scale choice that has been motivated in [47]. For the PDFs the NNPDF 3.0 NLO set [48] has been used with . The running of is calculated within Sherpa using its renormalisation group equation at NLO with parton thresholds as given by the PDF.

The treatment of partonic thresholds deserves a short discussion. While any flavour thresholds in the running of do not present any challenges to the reweighting algorithm as for all and any loop order, this is different for the PDFs, where crossing a parton threshold results in a vanishing PDF for that flavour. Hence, the cross section component of the given partonic channel may be zero if no other non-zero contribution exists. Such an event will be discarded and, thus, cannot be reweighted. If now the respective parton threshold of the target PDF is smaller than the target factorisation scale while the one of the nominal PDF is larger than the nominal factorisation scale we are in a region of phase space where the reweighting must fail to reproduce a dedicated calculation. This could be remedied by storing events as well which vanish solely due to crossing PDF thresholds. However, as only observables sensitive to on-threshold production of light quarks (typically bottom quarks) are susceptible to these effects, they are of little relevance to the vast majority of LHC observables.2

Figure 1: The gauge-boson and lepton transverse momenta in off-shell \PWproduction at the LHC with independent variations of (green), (red) and the PDF (blue). In the right-hand panels, the individual uncertainty bands, calculated via an on-the-fly reweighting, are compared to uncertainty bands from dedicated calculations (yellow). They are found to be equal.

For the scale uncertainty band we employ a 7-point scale variation for and : Both scales are varied independently by factors of \sfrac12 and 2, omitting the variations with ratios of 4 between the two scales. The uncertainty is then taken as the envelope of all variations. The uncertainty band is generated by varying the numerical value of the starting point of the running coupling, , to the following five values: , , , and . Note that this variation of should also enter the PDF fit, and hence the PDFs are varied consistently. This is expected to extenuate the effect of the variation in most cases, as the PDF of the varied is still fitted to describe the same data as the PDF of the nominal . This consistent +PDF variation is also part of the PDF4LHC recommendations for LHC Run II [49]. The envelope of these +PDF variations is taken as the respective uncertainty. The pure PDF uncertainty estimate is generated using the average and the standard deviation over the PDF replicas provided by the NNPDF3.0 set (at a fixed value of ). This corresponds to the \SI68\percent confidence level. This set-up is repeated for later reference in Table 1, along with a CT14 PDF variant, which is used in later studies.

nominal variations error band
PDF sets CT14 56 Hessian error sets with a \SI90\percent CL Hessian
NNPDF3.0 100 statistical replicas with a \SI68\percent CL statistical
value 0.118 0.115, 0.117, 0.119, 0.121 envelope
/ factors , , , , , envelope
Table 1: Variations, which are used for studies in this publication, with two variants depending on the PDF choice. Note that each value is used with its associated PDF set variant in the context of hadronic collisions.

Comparing the uncertainties for the \PW, we observe that the scale uncertainties are the largest, with relative deviations of . The relative deviations related to the PDF and the strong coupling do not exceed . The scale uncertainty exhibits a minimum for . The reason is that the variations of alone cross the central value prediction in this range, such that only the variation contributes to the overall scale uncertainty here.

Note that at , and therefore only real-emission events contribute to the distribution. Hence, the observable is only described to leading-order. We introduce it here as a reference for our later validations including the parton-shower, which use this observable. For the current validation, we complement the discussion of the \PWtransverse momentum with the one of the lepton it decays to, as the region below is already filled at , and therefore we have in part a true next-to-leading description for this observable. In fact, the scale uncertainties are much larger in that region, especially towards the threshold, and at the lepton cut at \SI25\GeV. This gives a more realistic picture of the perturbative uncertainties than in the leading-order region above the threshold.

The small panels on the right of Fig. 1 compare the uncertainty bands calculated using the reweighting approach to uncertainty bands where dedicated calculations have been done for each variation. We observe that all bands overlap perfectly for both observables. This is because the reweighting as presented above is exact and for all runs the same phase-space points could be used: The reweighted and the dedicated predictions for each variation are therefore equal, and so are the uncertainty bands.3

3 Reweighting parton-shower calculations

If parton-showering is added to a LO calculation, the value of the observable is not evaluated at any longer, but at , which denotes the phase-space point after showering. Applying this modification to eq. (2.1) yields


Therefore the reweighting for does not need to be altered, but the parton-shower emissions depend on the PDF, the strong coupling, their respective scale prefactors and (detailed below) and the starting scale , i.e.


In order to reweight the parton-shower emissions, we first need to identify its exact dependence structure. Schematically, it acts on the phase-space element in the following way


where the Sudakov form factor of the -parton state, , and its splitting kernel have been introduced. While the first term describes the no-emission probability between the starting scale and the infrared cut-off and therefore does not change the phase-space element, the second term describes the emission of a parton at scale in the configuration (the integration boundaries are to be understood in this decomposition), leading to a configuration . The Jacobian is not relevant to the discussion here and is subsequently absorbed in the splitting kernel . As the emissions are ordered in , the Sudakov form factor in the second term ensures that the current emission is the hardest after starting the evolution at . Additional emissions may occur at smaller and are not resolved at this stage – they are described by the parton shower acting on the newly produced state with the new starting scale . In eq. (3.3) the dependences on , the PDFs, and their respective scale prefactors and have been omitted for brevity. They directly carry over to the splitting kernel and the Sudakov form factor, according to


When considering parton-shower emissions off NLO QCD matrix elements special emphasis has to be given to the first emission as described in Sec. 3.3 below.

3.1 Parton-shower dependence structure

Table 2: Definition of the evolution and splitting variables for each dipole type. The fifth column lists the splitting process as seen from the Born process, and refer to the flavour of the initial state before and after the splitting process, respectively. The variables , , , , , and are defined in [36, 37, 23].

The default parton shower of Sherpa, dubbed CSShower [23], is based on Catani–Seymour dipole factorisation [36, 37]. Each branching of an emitter parton into two daughters is witnessed by a spectator parton, which takes the recoil, and ensures that on-shell states are transferred into on-shell states and energy-momentum conservation is respected simultaneously. The emitter and spectator partons reside either in the initial-state (I) or final-state (F), such that four dipole types need to be distinguished: II, IF, FI and FF. In this notation, the first letter refers to emitter, and the second to the spectator parton. The no-branching probabilities are given by the four corresponding Sudakov form factors


They share the common form


wherein the kinematics of the splitting are given by the default choice for in the massless case while the denote the coupling and PDF stripped splitting kernels incorporating the remaining pieces of the and the Jacobian of the phase-space parametrisation. The precise definitions of the variables for each dipole type are given in Table 2. It directly follows that for FF-type dipole splittings the ratio of PDFs is simply unity. Eq. (3.6) further details the dependence on the and PDF scale factors and . These multiplicative factors as well as their variations are assumed to be of order one, such that they do not induce spurious large logarithms. The generalisation to the massive case is straightforward and only involves generalised definitions of , , and , cf. [23].

3.2 Reweighting trial emissions

To numerically integrate Sudakov form factors typically the Sudakov Veto Algorithm is used [50, 51, 52, 53, 54, 55]. Therein the integrands found in the Sudakov form factors are replaced with integrable overestimates . This is balanced by only accepting a proposed emission with probability . A multiplicative factor in is therefore equivalent to a multiplicative factor in  [52]. This observation is for example used to apply matrix-element corrections [54], where the splitting kernels are replaced with a real-emission-like kernel . This is done a-posteriori, i.e. the event weight is multiplied by , the emission itself is unchanged. The same method is also used in the Vincia parton shower to calculate uncertainty variations for different scales, finite terms of the antenna functions, ordering parameters and sub-leading colour corrections [28]. Here we employ this technique to account for variations of the strong-coupling parameter and the PDFs in the shower evolution of LO and NLO QCD matrix elements.

As has been laid out in the previous section, the emission kernels depend linearly on and on a ratio of parton densities . A change of PDFs , the strong coupling and the scale prefactors entering both, i.e.  and , is equivalent to modifying the emission probability accordingly4:


where the scale dependence and the definition of and can be read off the Sudakov form factors given in eq. (3.6) and Table 2. In case of FF dipoles eq. (3.7) simplifies significantly as the ratios of PDF factors reduces to unity. It further follows, that the event weight for each accepted emission needs to be multiplied by the corresponding factor in order to incorporate the new choice of , PDFs and the scales they are evaluated at. Accordingly, the probability to reject an emission is changed to


Consequently, for each rejected emission the event weight receives a corrective weight of . Proofs that this treatment indeed results in the correct Sudakov form factors can be found in [52, 27, 26].

3.3 Next-to-leading-order matching

To match NLO QCD parton-level calculations with subsequent parton-shower evolution Sherpa employs a variant of the original Mc@Nlo algorithm presented in [12], referred to as S-Mc@Nlo [54]. Schematically, such a S-Mc@Nlo calculation has the following structure:


Here the real-emission contribution of the NLO calculation has effectively been split into an infrared-singular (soft) and an infrared-regular (hard) part, the resummation kernel and the finite hard remainder , respectively, such that  [57, 54]. The -function has the following explicit parameter dependences


From the perspective of parameter reweighting, the resummation kernel behaves the same way as the subtraction term . In fact, in our reweighting implementation the contribution is treated as a single term, as indicated. It is only to note that their PDFs are evaluated at the partonic momentum fraction and external flavours and of their phase-space configuration rather than those of . The other parts of the -function can then be reweighted as described in Sec. 2.2, leading to


The -function then transforms as


wherein each subtraction term has its own scales , defined on its underlying Born configuration . Writing eq. (3.9) as a Monte-Carlo sum over events with B-like and R-like structure, which are conventionally called and events in Mc@Nlo calculations, and with , we obtain


Thus, under , , and both of the -events and  of the -events transform as composite objects in terms of their constituents, as defined above. This leaves the S-Mc@Nlo parton shower, , defined through