FastJet user manual(for version 3.0.1)

FastJet user manual
(for version 3.0.1)

Matteo Cacciari, Gavin P. Salam and Gregory Soyez
LPTHE, UPMC Univ. Paris 6 and CNRS UMR 7589, Paris, France
Université Paris Diderot, Paris, France
CERN, Physics Department, Theory Unit, Geneva, Switzerland
Department of Physics, Princeton University, Princeton, NJ 08544,USA
Institut de Physique Théorique, CEA Saclay, France
Abstract

FastJet is a C++ package that provides a broad range of jet finding and analysis tools. It includes efficient native implementations of all widely used sequential recombination jet algorithms for and collisions, as well as access to 3rd party jet algorithms through a plugin mechanism, including all currently used cone algorithms. FastJet also provides means to facilitate the manipulation of jet substructure, including some common boosted heavy-object taggers, as well as tools for estimation of pileup and underlying-event noise levels, determination of jet areas and subtraction or suppression of noise in jets.

CERN-PH-TH/2011-297

Contents

1 Introduction

Jets are the collimated sprays of hadrons that result from the fragmentation of a high-energy quark or gluon. They tend to be visually obvious structures when one looks at an experimental event display, and by measuring their energy and direction one can approach the idea of the original “parton” that produced them. Consequently jets are both an intuitive and quantitatively essential part of collider experiments, used in a vast array of analyses, from new physics searches to studies of Quantum Chromodynamics (QCD). For any tool to be so widely used, its behaviour must be well defined and reproducible: it is not sufficient that one be able to visually identify jets, but rather one should have rules that project a set of particles onto a set of jets. Such a set of rules is referred to as a jet algorithm. Usually a jet algorithm involves one or more parameters that govern its detailed behaviour. The combination of a jet algorithm and its parameters is known as a jet definition. Suitable jet definitions can be applied to particles, calorimeter towers, or even to the partonic events of perturbative QCD calculations, with the feature that the jets resulting from these different kinds of input are not just physically close to the concept of partons, but can be meaningfully be compared to each other.

Jet finding dates back to seminal work by Sterman and Weinberg [1] and several reviews have been written describing the various kinds of jet finders, their different uses and properties, and even the history of the field, for example [2, 3, 4, 5, 6].

It is possible to classify most jet algorithms into one of two broad classes: sequential recombination algorithms and cone algorithms.

Sequential recombination algorithms usually identify the pair of particles that are closest in some distance measure, recombine them, and then repeat the procedure over and again, until some stopping criterion is reached. The distance measure is usually related to the structure of divergences in perturbative QCD. The various sequential recombination algorithms differ mainly in their particular choices of distance measure and stopping criterion.

Cone algorithms put together particles within specific conical angular regions, notably such that the momentum sum of the particles contained in a given cone coincides with the cone axis (a “stable cone”). Because QCD radiation and hadronisation leaves the direction of a parton’s energy flow essentially unchanged, the stable cones are physically close in direction and energy to the original partons. Differences between various cone algorithms are essentially to do with the strategy taken to search for the stable cones (e.g. whether iterative or exhaustive) and the procedure used to deal with cases where the same particle is found in multiple stable cones (e.g. split–merge procedures).

One of the aims of the FastJet C++ library is to provide straightforward, efficient implementations for the majority of widely used sequential-recombination algorithms, both for hadron-hadron and colliders, and easy access also to cone-type jet algorithms. It is distributed under the terms of version 2 of the GNU General Public License (GPL) [7].

To help introduce the terminology used throughout FastJet and this manual, let us consider the longitudinally-invariant algorithm for hadron colliders [8, 9]. This was the first jet algorithm to be implemented in FastJet [10] and its structure, together with that of other sequential recombination algorithms, has played a key role in the design of FastJet’s interface. The algorithm involves a (symmetric) distance measure, , between all pairs of particles and ,

(1)

where is the transverse momentum of particle with respect to the beam () direction and , with and respectively ’s rapidity and azimuth. The algorithm also involves a distance measure between every particle and the beam

(2)

in eq. (1), usually called the jet radius, is a parameter of the algorithm that determines its angular reach. In the original, so-called “exclusive” formulation of the algorithm [8] (generally used with ), one identifies the smallest of the and . If it is a , one replaces and with a single new object whose momentum is — often this object is called a “pseudojet”, since it is neither a particle, nor yet a full jet.111In FastJet we actually will use PseudoJet to denote any generic object with 4-momentum. If instead the smallest distance is a , then one removes from the list of particles/pseudojets and declares it to be part of the “beam” jet. One repeats this procedure until the smallest or is above some threshold ; all particles/pseudojets that are left are then that event’s (non-beam) jets.

In the “inclusive” formulation of the algorithm [9], the and distances are the same as above. The only difference is that when a is smallest, then is removed from the list of particles/pseudojets and added to the list of final “inclusive” jets (this is instead of being incorporated into a beam jet). There is no threshold and the clustering continues until no particles/pseudojets remain. Of the final jets, generally only those above some transverse momentum are actually used.222This transverse momentum cut has some similarity to in the exclusive case, since in the exclusive case pseudojets with become part of the beam jets, i.e. are discarded. Because the distance measures are the same in the inclusive and exclusive algorithms, the clustering sequence is common to both formulations (at least up to ), a property that will be reflected in FastJet’s common interface to both formulations.

Having seen these descriptions, the reader may wonder why a special library is needed for sequential-recombination jet finding. Indeed, the algorithm can be easily implemented in just a few dozen lines of code. The difficulty that arises, however, is that at hadron colliders, clustering is often performed with several hundreds or even thousands of particles. Given particles, there are distances to calculate, and since one must identify the smallest of these distances at each of iterations of the algorithm, original implementations of the algorithm [11, 12] involved operations to perform the clustering. In practice this translates to about 1 s for . Given that events with pileup can have multiplicities significantly in excess of 1000 and that it can be necessary to cluster hundreds of millions of events, timing quickly becomes prohibitive, all the more so in time-critical contexts such as online triggers. To alleviate this problem, FastJet makes use of the observation [10] that the smallest pairwise distance remains the same if one uses the following alternative (non-symmetric) distance measure:

(3)

For a given , the smallest of the is simply found by choosing the that minimises the , i.e. by identifying ’s geometrical nearest neighbour on the cylinder. Geometry adds many constraints to closest pair and nearest neighbour type problems, e.g. if is geometrically close to and is geometrically close to , then and are also geometrically close; such a property is not true for the . The factorisation of the problem into momentum and geometrical parts makes it possible to calculate and search for minima among a much smaller set of distances. This is sufficiently powerful that with the help of the external Computational Geometry Algorithms Library (CGAL) [13] (specifically, its Delaunay triangulation modules), FastJet achieves expected timing for many sequential recombination algorithms. This strategy is supplemented in FastJet with several other implementations, also partially based on geometry, which help optimise clustering speed up to moderately large multiplicities, . The timing for is thus reduced to a few milliseconds. The same techniques apply to a range of sequential recombination algorithms, described in section 4.

At the time of writing, sequential recombination jet algorithms are the main kind of algorithm in use at CERN’s Large Hadron Collider (LHC), notably the anti- algorithm [14], which simply replaces with in eqs. (1,2). Sequential recombination algorithms were also widely used at HERA and LEP. However at Fermilab’s Tevatron, and in much preparatory LHC work, cone algorithms were used for nearly all studies. For theoretical and phenomenological comparisons with these results, it is therefore useful to have straightforward access also to cone algorithm codes. The main challenge that would be faced by someone wishing to write their own implementation of a given cone algorithm comes from the large number of details that enter into a full specification of such algorithms, e.g. the precise manner in which stable cones are found, or in which the split–merge step is carried out. The complexity is such that in many cases the written descriptions that exist of specific cone algorithms are simply insufficient to allow a user to correctly implement them. Fortunately, in most cases, the authors of cone algorithms have provided public implementations and these serve as a reference for the algorithm. While each tends to involve a different interface, a different 4-momentum type, etc., FastJet has a “plugin” mechanism, which makes it possible to provide a uniform interface to these different third party jet algorithms. Many plugins (and the corresponding third party code) are distributed with FastJet. Together with the natively-implemented sequential-recombination algorithms, they ensure easy access to all jet algorithms used at colliders in the past decade (section 5). Our distribution of this codebase is complemented with some limited curatorial activity, e.g. solving bugs that become apparent when updating compiler versions, providing a common build infrastructure, etc.

In the past few years, research into jets has evolved significantly beyond the question of just “finding” jets. This has been spurred by two characteristics of CERN’s LHC experimental programme. The first is that the LHC starts to probe momentum scales that are far above the the electroweak scale, , e.g. in the search for new particles or the study of high-energy scattering. However, even in events with transverse momenta , there can simultaneously be hadronic physics occurring on the electroweak scale (e.g. hadronic decays). Jet finding then becomes a multi-scale problem, one manifestation of which is that hadronic decays of W’s, Z’s and top quarks may be so collimated that they are entirely contained within a single jet. The study of this kind of problem has led to the development of a wide array of jet substructure tools for identifying “boosted” object decays, as reviewed in [15]. As was the situation with cone algorithms a few years ago, there is considerable fragmentation among these different tools, with some public code available from a range of different sources, but interfaces that differ from one tool to the next. Furthermore, the facilities provided with version 2 of FastJet did not always easily accommodate tools to manipulate and transform jets. Version 3 of FastJet aims to improve this situation, providing implementations of the most common jet substructure tools333To some extent there is overlap here with SpartyJet [16], however we believe there are benefits to being able to easily carry jet structure manipulations natively within the framework of FastJet. and a framework to help guide third party authors who wish to write further such tools using a standard interface (section 9). In the near future we also envisage the creation of a FastJet “contrib” space, to provide a common location for access to these new tools as they are developed.

The second characteristic of the LHC that motivates facilities beyond simple jet finding in FastJet is the need to use jets in high-noise environments. This is the case for proton-proton () collisions, where in addition to the collision of interest there are many additional soft “pileup” collisions, which contaminate jets with a high density of low-momentum particles. A similar problem of “background contamination” arises also for heavy-ion collisions (also at RHIC) where the underlying event in the nuclear collision can generate over a TeV of transverse momentum per unit rapidity, part of which contaminates any hard jets that are present. One way of correcting for this involves the use of jet “areas”, which provide a measure of a given jet’s susceptibility to soft contamination. Jet areas can be determined for example by examining the clustering of a large number of additional, infinitesimally soft “ghost” particles [17]. Together with a determination of the level of pileup or underlying-event noise in a specific event, one can then perform event-by-event and jet-by-jet subtraction of the contamination [18, 19]. FastJet allows jet clustering to be performed in such a way that the jet areas are determined at the same time as the jets are identified, simply by providing an “area definition” in addition to the jet definition (section 7). Furthermore it provides the tools needed to estimate the density of noise contamination in an event and to subtract the appropriate amount of noise from each jet (section 8). The interface here shares a number of characteristics with the substructure tools, some of which also serve to remove noise contamination. Both the substructure and pileup removal make use also of a “selectors” framework for specifying and combining simple cuts (section 6).

While FastJet provides a broad range of facilities, usage for basic jet finding is straightforward. To illustrate this, a quick-start guide is provided in section 2, while the core classes (PseudoJet, JetDefinition and ClusterSequence) are described in section 3. For more advanced usage, one of the design considerations in FastJet has been to favour user extensibility, for example through plugins, selectors, tools, etc. This is one of the topics covered in the appendices. Further information is also available from the extensive “doxygen” documentation, available online at http://fastjet.fr.

2 Quick-start guide

For the impatient, the FastJet package can be set up and run as follows.

  • Download the code and the unpack it

     curl -O http://fastjet.fr/repo/fastjet-X.Y.Z.tar.gz
     tar zxvf fastjet-X.Y.Z.tar.gz
     cd fastjet-X.Y.Z/
    

    replacing X.Y.Z with the appropriate version number. On some systems you may need to replace “curl -O” with “wget”.

  • Compile and install (choose your own preferred prefix), and when you’re done go back to the original directory

     ./configure --prefix=‘pwd‘/../fastjet-install
     make
     make check
     make install
     cd ..
    

    If you copy and paste the above lines from one very widespread PDF viewer, you should note that the first line contains back-quotes not forward quotes but that your PDF viewer may nevertheless paste forward quotes, causing problems down the line (the issue arises again below).

  • Now paste the following piece of code into a file called short-example.cc

    #include "fastjet/ClusterSequence.hh"
    #include <iostream>
    using namespace fastjet;
    using namespace std;
    int main () {
      vector<PseudoJet> particles;
      // an event with three particles:   px    py  pz      E
      particles.push_back( PseudoJet(   99.0,  0.1,  0, 100.0) );
      particles.push_back( PseudoJet(    4.0, -0.1,  0,   5.0) );
      particles.push_back( PseudoJet(  -99.0,    0,  0,  99.0) );
      // choose a jet definition
      double R = 0.7;
      JetDefinition jet_def(antikt_algorithm, R);
      // run the clustering, extract the jets
      ClusterSequence cs(particles, jet_def);
      vector<PseudoJet> jets = sorted_by_pt(cs.inclusive_jets());
      // print out some info
      cout << "Clustered with " << jet_def.description() << endl;
      // print the jets
      cout <<   "        pt y phi" << endl;
      for (unsigned i = 0; i < jets.size(); i++) {
        cout << "jet " << i << ": "<< jets[i].perp() << " "
                       << jets[i].rap() << " " << jets[i].phi() << endl;
        vector<PseudoJet> constituents = jets[i].constituents();
        for (unsigned j = 0; j < constituents.size(); j++) {
          cout << "    constituent " << j << "’s pt: "<< constituents[j].perp() << endl;
        }
      }
    }
  • Then compile and run it with

     g++ short-example.cc -o short-example \
         ‘fastjet-install/bin/fastjet-config --cxxflags --libs --plugins‘
     ./short-example
    

    (watch out, once again, for the back-quotes if you cut and paste from the PDF).

The output will consist of a banner, followed by the lines

Clustering with Longitudinally invariant anti-kt algorithm with R = 0.7
and E scheme recombination
        pt y phi
jet 0: 103 0 0
    constituent 0’s pt: 99.0001
    constituent 1’s pt: 4.00125
jet 1: 99 0 3.14159
    constituent 0’s pt: 99

More evolved example programs, illustrating many of the capabilities of FastJet, are available in the example/ subdirectory of the FastJet distribution.

3 Core classes

All classes are contained in the fastjet namespace. For brevity this namespace will usually not be explicitly written below, with the possible exception of the first appearance of a FastJet class, and code excerpts will assume that a “using namespace fastjet;” statement is present in the user code. For basic usage, the user is exposed to three main classes:

  class fastjet::PseudoJet;
  class fastjet::JetDefinition;
  class fastjet::ClusterSequence;

PseudoJet provides a jet object with a four-momentum and some internal indices to situate it in the context of a jet-clustering sequence. The class JetDefinition contains a specification of how jet clustering is to be performed. ClusterSequence is the class that carries out jet-clustering and provides access to the final jets.

3.1 fastjet::PseudoJet

All jets, as well as input particles to the clustering (optionally) are PseudoJet objects. They can be created using one of the following constructors

  PseudoJet (double px, double py, double pz, double  E);
  template<class T> PseudoJet (const T & some_lorentz_vector);

where the second form allows the initialisation to be obtained from any class T that allows subscripting to return the components of the momentum (running from in the order ). The default constructor for a PseudoJet sets the momentum components to zero.

The PseudoJet class includes the following member functions for accessing the components

  double E()        const ; // returns the energy component
  double e()        const ; // returns the energy component
  double px()       const ; // returns the x momentum component
  double py()       const ; // returns the y momentum component
  double pz()       const ; // returns the z momentum component
  double phi()      const ; // returns the azimuthal angle in range 
  double phi_std()  const ; // returns the azimuthal angle in range 
  double rap()      const ; // returns the rapidity
  double rapidity() const ; // returns the rapidity
  double pseudorapidity() const ; // returns the pseudo-rapidity
  double eta()      const ; // returns the pseudo-rapidity
  double pt2()      const ; // returns the squared transverse momentum
  double pt()       const ; // returns the transverse momentum
  double perp2()    const ; // returns the squared transverse momentum
  double perp()     const ; // returns the transverse momentum
  double m2()       const ; // returns squared invariant mass
  double m()        const ; // returns invariant mass ( if )
  double mt2()      const ; // returns the squared transverse mass = 
  double mt()       const ; // returns the transverse mass
  double mperp2()   const ; // returns the squared transverse mass = 
  double mperp()    const ; // returns the transverse mass
  double operator[] (int i) const; // returns component i
  double operator() (int i) const; // returns component i
  /// return a valarray containing the four-momentum (components 0-2
  /// are 3-momentum, component 3 is energy).
  valarray<double> four_mom() const;

The reader may have observed that in some cases more than one name can be used to access the same quantity. This is intended to reflect the diversity of common usages within the community.444The pt(), pt2(), mt(), mt2() names are available only from version 3.0.1 onwards.

There are two ways of associating user information with a PseudoJet. The simpler method is through an integer called the user index

  /// set the user_index, intended to allow the user to label the object (default is -1)
  void set_user_index(const int index);
  /// return the user_index
  int user_index() const ;

A more powerful method, new in FastJet 3, involves passing a pointer to a derived class of PseudoJet::UserInfoBase. The two essential calls are

  /// set the pointer to user information (the PseudoJet will then own it)
  void set_user_info(UserInfoBase * user_info);
  /// retrieve a reference to a dynamic cast of type L of the user info
  template<class L> const L & user_info() const;

Further details are to be found in appendix B and in example/09-user_info.cc.

A PseudoJet can be reset with

  /// Reset the 4-momentum according to the supplied components, put the user
  /// and history indices and user info back to their default values (-1, unset)
  inline void reset(double px, double py, double pz, double E);
  /// Reset just the 4-momentum according to the supplied components,
  /// all other information is left unchanged
  inline void reset_momentum(double px, double py, double pz, double E);

and similarly taking as argument a templated some_lorentz_vector or a PseudoJet (in the latter case, or when some_lorentz_vector is of a type derived from PseudoJet, reset also copies the user and internal indices and user-info).

Additionally, the +, -, * and / operators are defined, with +, - acting on pairs of PseudoJets and *, / acting on a PseudoJet and a double coefficient. The analogous +=, etc., operators, are also defined.555 The +, - operators return a PseudoJet with default user information; the * and / operators return a PseudoJet with the same user information as the original PseudoJet; the +=, -=, etc., operators all preserve the user information of the PseudoJet on the left-hand side of the operator.

There are also equality testing operators: (jet1 == jet2) returns true if the two jets have identical 4-momenta, structural information and user information; the (jet == 0.0) test returns true if all the components of the 4-momentum are zero. The != operator works analogously.

Finally, we also provide routines for taking an unsorted vector of PseudoJets and returning a sorted vector,

  /// return a vector of jets sorted into decreasing transverse momentum
  vector<PseudoJet> sorted_by_pt(const vector<PseudoJet> & jets);
  /// return a vector of jets sorted into increasing rapidity
  vector<PseudoJet> sorted_by_rapidity(const vector<PseudoJet> & jets);
  /// return a vector of jets sorted into decreasing energy
  vector<PseudoJet> sorted_by_E(const vector<PseudoJet> & jets);

These will typically be used on the jets returned by ClusterSequence.

A number of further PseudoJet member functions provide access to information on a jet’s structure. They are documented below in sections 3.5 and 3.6.

3.2 fastjet::JetDefinition

The class JetDefinition contains a full specification of how to carry out the clustering. According to the Les Houches convention detailed in [20], a ‘jet definition’ should include the jet algorithm name, its parameters (often the radius ) and the recombination scheme. Its constructor is

  JetDefinition(fastjet::JetAlgorithm jet_algorithm,
                double R,
                fastjet::RecombinationScheme recomb_scheme = E_scheme,
                fastjet::Strategy strategy = Best);

The jet algorithm is one of the entries of the JetAlgorithm enum:666As of v2.3, the JetAlgorithm name replaces the old JetFinder one, in keeping with the Les Houches convention. Backward compatibility is assured at the user level by a typedef and a doubling of the methods’ names. Backward compatibility (with versions 2.3) is however broken for user-written derived classes of ClusterSequence, as the protected variables _default_jet_finder and _jet_finder have been replaced by _default_jet_algorithm and _jet_algorithm.

  enum JetAlgorithm {kt_algorithm, cambridge_algorithm,
                     antikt_algorithm, genkt_algorithm,
                     ee_kt_algorithm, ee_genkt_algorithm, ...};

Each algorithm is described in detail in section 4. The represent additional values that are present for internal or testing purposes. They include plugin_algorithm, automatically set when plugins are used (section 5) and undefined_jet_algorithm, which is the value set in JetDefinition’s default constructor.

The parameter R specifies the value of that appears in eq. (1) and in the various definitions of section 4. For one algorithm, ee_kt_algorithm, there is no parameter, so the constructor is to be called without the R argument. For the generalised algorithm and its version, one requires and (immediately after ) an extra parameter . Details are to be found in sections 4.44.6. If the user calls a constructor with the incorrect number of arguments for the requested jet algorithm, a fastjet::Error() exception will be thrown with an explanatory message.

The recombination scheme is set by an enum of type RecombinationScheme, and it is related to the choice of how to recombine the 4-momenta of PseudoJets during the clustering procedure. The default in FastJet is the -scheme, where the four components of two 4-vectors are simply added. This scheme is used when no explicit choice is made in the constructor. Further recombination schemes are described below in section 3.4.

The strategy selects the algorithmic strategy to use while clustering and is an enum of type Strategy. The default option of Best automatically determines and selects a strategy that should be close to optimal in speed for each event, based on its multiplicity. A discussion of the main available strategies together with their performance is given in appendix A.

A textual description of the jet definition can be obtained by a call to the member function std::string description().

3.3 fastjet::ClusterSequence

To run the jet clustering, create a ClusterSequence object through the following constructor

  template<class L> ClusterSequence(const std::vector<L> & input_particles,
                                    const JetDefinition & jet_def);

where input_particles is the vector of initial particles of any type (PseudoJet, HepLorentzVector, etc.) that can be used to initialise a PseudoJet and jet_def contains the full specification of the clustering (see Section 3.2).

3.3.1 Accessing inclusive jets

Inclusive jets correspond to all objects that have undergone a “beam” clustering (i.e. recombination) in the description following Eq. (2). For nearly all hadron-collider algorithms, the “inclusive” jets above some given transverse momentum cut are the ones usually just referred to as the “jets”.

To access inclusive jets, the following member function should be used

  /// return a vector of all jets (in the sense of the inclusive
  /// algorithm) with pt >= ptmin.
  vector<PseudoJet> inclusive_jets (const double & ptmin = 0.0) const;

where ptmin may be omitted, then implicitly taking the value zero. Note that the order in which the inclusive jets are provided depends on the jet algorithm. To obtain a specific ordering, such as decreasing , the user should perform a sort themselves, e.g. with the sorted_by_pt(...) function, described in section 3.1.

With a zero transverse momentum cut, the number of jets found in the event is not an infrared safe quantity (adding a single soft particle can lead to one extra soft jet). However it can still be useful to talk of all the objects returned by inclusive_jets() as being “jets”, e.g. in the context of the estimation underlying-event and pileup densities, cf. section 8.

3.3.2 Accessing exclusive jets

There are two ways of accessing exclusive jets, one where one specifies , the other where one specifies that the clustering is taken to be stopped once it reaches the specified number of jets.

  /// return a vector of all jets (in the sense of the exclusive algorithm) that would
  /// be obtained when running the algorithm with the given dcut.
  vector<PseudoJet> exclusive_jets (const double & dcut) const;
  /// return a vector of all jets when the event is clustered (in the exclusive sense)
  /// to exactly njets. Throws an error if the event has fewer than njets particles.
  vector<PseudoJet> exclusive_jets (const int & njets) const;
  /// return a vector of all jets when the event is clustered (in the exclusive sense)
  /// to exactly njets. If the event has fewer than njets particles, it returns all
  /// available particles.
  vector<PseudoJet> exclusive_jets_up_to (const int & njets) const;

The user may also wish just to obtain information about the number of jets in the exclusive algorithm:

  /// return the number of jets (in the sense of the exclusive algorithm) that would
  /// be obtained when running the algorithm with the given dcut.
  int n_exclusive_jets (const double & dcut) const;

Another common query is to establish the value associated with merging from to jets. Two member functions are available for determining this:

  /// return the dmin corresponding to the recombination that went from n+1 to n jets
  /// (sometimes known as ).
  double exclusive_dmerge (const int & n) const;
  /// return the maximum of the dmin encountered during all recombinations up to the one
  /// that led to an n-jet final state; identical to exclusive_dmerge, except in cases
  /// where the dmin do not increase monotonically.
  double exclusive_dmerge_max (const int & n) const;

The first returns the in going from to jets. Occasionally however the value does not increase monotonically during successive mergings and using a smaller than the return value from exclusive_dmerge does not guarantee an event with more than jets. For this reason the second function exclusive_dmerge_max is provided — using a below its return value is guaranteed to provide a final state with more than jets, while using a larger value will return a final state with or fewer jets.

For collisions, where it is usual to refer to ( is the total (visible) energy) FastJet provides the following methods:

  double exclusive_ymerge (int njets);
  double exclusive_ymerge_max (int njets);
  int n_exclusive_jets_ycut (double ycut);
  std::vector<PseudoJet> exclusive_jets_ycut (double ycut);

which are relevant for use with the algorithm and with the Jade plugin (section 5.4.2).

3.3.3 Other functionality

Unclustered particles.

Some jet algorithms (e.g. a number of the plugins in section 5) have the property that not all particles necessarily participate in the clustering. In other cases, particles may take part in the clustering, but not end up in any final inclusive jet. Two member functions are provided to obtain the list of these particles. One is

  vector<PseudoJet> unclustered = clust_seq.unclustered_particles();

which returns the list of particles that never took part in the clustering. The other additionally returns objects that are the result of clustering but that never made it into a inclusive jet (i.e. into a “beam” recombination):

  vector<PseudoJet> childless = clust_seq.childless_pseudojets();

A practical example where this is relevant is with plugins that perform pruning [21], since they include a condition for vetoing certain recombinations.777To obtain the list of all initial particles that never end up in any inclusive jet, one should simply concatenate the vectors of constituents of all the childless PseudoJets.

Copying and transforming a ClusterSequence.

A standard copy constructor is available for ClusterSequences. Additionally it is possible to copy the clustering history of a ClusterSequence while modifying the momenta of the PseudoJets at all (initial, intermediate, final) stages of the clustering, with the ClusterSequence member function

  void transfer_from_sequence(const ClusterSequence & original_cs,
                              const FunctionOfPseudoJet<PseudoJet> * action_on_jets = 0);

FunctionOfPseudoJet<PseudoJet> is an abstract base class whose interface provides a PseudoJet operator()(const PseudoJet & jet) function, i.e. a function of a PseudoJet that returns a PseudoJet (cf. appendix D). As the clustering history is copied to the target ClusterSequence, each PseudoJet in the target ClusterSequence is set to the result of action_on_jet(original_pseudojet). One use case for this is if one wishes to obtain a Lorentz-boosted copy of a ClusterSequence, which can be achieved as follows

  #include "fastjet/tools/Boost.hh"
  // ...
  ClusterSequence original_cs(...);
  ClusterSequence boosted_cs;
  Boost boost(boost_4momentum);
  boosted_cs.transfer_from_sequence(cs, &boost);

3.4 Recombination schemes

When merging particles (i.e. PseudoJets) during the clustering procedure, one must specify how to combine the momenta. The simplest procedure (-scheme) simply adds the four-vectors. This has been advocated as a standard in [3], was the main scheme in use during Run II of the Tevatron, is currently used by the LHC experiments, and is the default option in FastJet. Other choices are listed in table 1, and are described below.

E_scheme
pt_scheme
pt2_scheme
Et_scheme
Et2_scheme
BIpt_scheme
BIpt2_scheme
Table 1: Members of the RecombinationScheme enum; the last two refer to boost-invariant version of the and schemes (as defined in section 3.4).
Other schemes for collisions.

Other schemes provided by earlier -clustering implementations [11, 12] are the , , and schemes. They all incorporate a ‘preprocessing’ stage to make the initial momenta massless (rescaling the energy to be equal to the 3-momentum for the and schemes, rescaling to the 3-momentum to be equal to the energy in the and schemes). Then for all schemes the recombination of and is a massless 4-vector satisfying

(4a)
(4b)
(4c)

where is for the and schemes, and is for the and schemes.

Note that for massive particles the schemes defined in the previous paragraph are not invariant under longitudinal boosts. As a workaround for this issue, we propose boost-invariant and schemes, which are identical to the normal and schemes, except that they omit the preprocessing stage. They are available as BIpt_scheme and BIpt2_scheme.

Other schemes for collisions.

On request, we may in the future provide dedicated schemes for collisions.

User-defined schemes.

The user may define their own, custom recombination schemes, as described in Appendix E.1.

3.5 Constituents and substructure of jets

For any PseudoJet that results from a clustering, the user can obtain information about its constituents, internal substructure, etc., through member functions of the PseudoJet class.888This is a new development in version 3 of FastJet. In earlier versions, access to information about a jet’s contents had to be made through its ClusterSequence. Those forms of access, though not documented here, will be retained throughout the 3.X series.

Jet constituents.

The constituents of a given PseudoJet jet can be obtained through

  vector<PseudoJet> constituents = jet.constituents();

Note that if the user wishes to identify these constituents with the original particles provided to ClusterSequence, she or he should have set a unique index for each of the original particles with the PseudoJet::set_user_index function. Alternatively more detailed information can also be set through the user_info facilities of PseudoJet, as discussed in appendix B.

Subjet properties.

To obtain the set of subjets at a specific scale inside a given jet, one may use the following PseudoJet member function:

  /// Returns a vector of all subjets of the current jet (in the sense of the exclusive
  /// algorithm) that would be obtained when running the algorithm with the given dcut
  std::vector<PseudoJet> exclusive_subjets (const double & dcut) const;

If jets are found, this takes a time (owing to the internal use of a priority queue). Alternatively one may obtain the jet’s constituents, cluster them separately and then carry out an exclusive_jets analysis on the resulting ClusterSequence. The results should be identical. This second method is mandatory if one wishes to identify subjets with an algorithm that differs from the one used to find the original jets.

In analogy with the exclusive_jets(...) functions of ClusterSequence, PseudoJet also has exclusive_subjets(int nsub) and exclusive_subjets_up_to(int nsub) functions.

One can also make use of the following methods, which allow one to follow the merging sequence (and walk back through it):

  /// If the jet has parents in the clustering, returns true and sets parent1 and parent2
  /// equal to them. If it has no parents returns false and sets parent1 and parent2 to 0
  bool has_parents(PseudoJet & parent1, PseudoJet & parent2) const;
  /// If the jet has a child then returns true and sets the child jet otherwise returns
  /// false and sets the child to 0
  bool has_child(PseudoJet & child) const;
  /// If this jet has a child (and so a partner), returns true and sets the partner,
  /// otherwise returns false and sets the partner to 0
  bool has_partner(PseudoJet & partner) const;
Accessibility of structural information.

If any of the above functions are used with a PseudoJet that is not associated with a ClusterSequence, an error will be thrown. Since the information about a jet’s constituents is actually stored in the ClusterSequence and not in the jet itself, these methods will also throw an error if the ClusterSequence associated with the jet has gone out of scope, been deleted, or in any other way become invalid. One can establish the status of a PseudoJet’s associated cluster sequence with the following enquiry functions:

  // returns true if this PseudoJet has an associated (and valid) ClusterSequence.
  bool has_valid_cluster_sequence() const;
  // returns a (const) pointer to the parent ClusterSequence (throws if inexistent
  // or no longer valid)
  const ClusterSequence* validated_cluster_sequence() const;

There are also has_associated_cluster_sequence() and associated_cluster_sequence() member functions. The first returns true even when the cluster sequence is not valid, and the second returns a null pointer in that case. Further information is to be found in appendix C.

There are contexts in which, within some function, one might create a ClusterSequence, obtain a jet from it and then return that jet from the function. For the user to be able to access the information about the jet’s internal structure, the ClusterSequence must not have gone out of scope and/or been deleted. To aid with potential memory management issues in this case, as long the ClusterSequence was created via a new operation, then one can tell the ClusterSequence that it should be automatically deleted after the last external object (jet, etc.) associated with it has disappeared. The call to do this is ClusterSequence::delete_self_when_unused(). There must be at least one external object already associated with the ClusterSequence at the time of the call (e.g. a jet, subjet or jet constituent). Note that ClusterSequence tends to be a large object, so this should be used with care.

3.6 Composite jets, general considerations on jet structure

There are a number of cases where it is useful to be able to take two separate jets and create a single object that is the sum of the two, not just from the point of view of its 4-momentum, but also as concerns its structure. For example, in a search for a dijet resonance, some user code may identify two jets, jet1 and jet2, that are thought to come from a resonance decay and then wish to return a single object that combines both jet1 and jet2. This can be accomplished with the function join:

  PseudoJet resonance = join(jet1,jet2);

The 4-momenta are added,999This corresponds to -scheme recombination. If the user wishes to have the jets joined with a different recombination scheme he/she can pass a JetDefinition::Recombiner (cf. section E.1) as the last argument to join(...). and in addition the resonance remembers that it came from jet1 and jet2. So, for example, a call to resonance.constituents() will return the constituents of both jet1 and jet2. It is possible to join 1, 2, 3 or 4 jets or a vector of jets. If the jets being joined had areas (section 7) then the joined jet will also have an area.

For a jet formed with join, one can find out which pieces it has been composed from with the function

  vector<PseudoJet> pieces = resonance.pieces();

In the above example, this would simply return a vector of size 2 containing jet1 and jet2. The pieces() function also works for jets that come from a ClusterSequence, returning two pieces if the jet has parents, zero otherwise.

Enquiries as to available structural information.

Whether or not a given jet has constituents, recursive substructure or pieces depends on how it was formed. Generally a user will know how a given jet was formed, and so know if it makes sense, say, to call pieces(). However if a jet is returned from some third-party code, it may not always be clear what kinds of structural information it has. Accordingly a number of enquiry functions are available:

  bool has_structure();         // true if the jet has some kind of structural info
  bool has_constituents();      // true if the jet has constituents
  bool has_exclusive_subjets(); // true if there is cluster-sequence style subjet info
  bool has_pieces();            // true if the jet can be broken up into pieces
  bool has_area();              // true if the jet has jet-area information
  string description();         // returns a textual description of the type
                                // of structural info associated with the jet

Asking (say) for the pieces() of a jet for which has_pieces() returns false will cause an error to be thrown. The structural information available for different kinds of jets is summarised in appendix C.

3.7 Version information

Information on the version of FastJet that is being run can be obtained by making a call to

  std::string fastjet_version_string();

(defined in fastjet/JetDefinition.hh). In line with recommendations for other programs in high-energy physics, the user should include this information in publications and plots so as to facilitate reproducibility of the jet-finding.101010We devote significant effort to ensuring that all versions of the FastJet program give identical, correct clustering results, and that any other changes from one version to the next are clearly indicated. However, as per the terms of the GNU General Public License (v2), under which FastJet is released, we are not able to provide a warranty that FastJet is free of bugs that might affect your use of the program. Accordingly it is important for physics publications to include a mention of the FastJet version number used, in order to help trace the impact of any bugs that might be discovered in the future. We recommend also that the main elements of the jet_def.description() be provided, together with citations to the original article that defines the algorithm, as well as to this manual and, optionally, the original FastJet paper [10].

4 FastJet native jet algorithms

4.1 Longitudinally invariant jet algorithm

The longitudinally invariant jet algorithm [8, 9] comes in inclusive and exclusive variants. The inclusive variant (corresponding to [9], modulo small changes of notation) is formulated as follows:

  • For each pair of particles , work out the distance111111In the soft, small angle limit for , the distance is the (squared) transverse momentum of relative to .

    (5)

    with , where , and are the transverse momentum (with respect to the beam direction), rapidity and azimuth of particle . is a jet-radius parameter usually taken of order . For each parton also work out the beam distance .

  • Find the minimum of all the . If is a merge particles and into a single particle, summing their four-momenta (this is -scheme recombination); if it is a then declare particle to be a final jet and remove it from the list.

  • Repeat from step 1 until no particles are left.

The exclusive variant of the longitudinally invariant jet algorithm [8] is similar, except that (a) when a is the smallest value, that particle is considered to become part of the beam jet (i.e. is discarded) and (b) clustering is stopped when all and are above some . In the exclusive mode is commonly set to .

The inclusive and exclusive variants are both obtained through

   JetDefinition jet_def(kt_algorithm, R);
   ClusterSequence cs(particles, jet_def);

The clustering sequence is identical in the inclusive and exclusive cases and the jets can then be obtained as follows:

   vector<PseudoJet> inclusive_kt_jets = cs.inclusive_jets();
   vector<PseudoJet> exclusive_kt_jets = cs.exclusive_jets(dcut);

4.2 Cambridge/Aachen jet algorithm

Currently the Cambridge/Aachen (C/A) jet algorithm [22, 23] is provided only in an inclusive version [23], whose formulation is identical to that of the jet algorithm, except as regards the distance measures, which are:

(6a)
(6b)

To use this algorithm, define

   JetDefinition jet_def(cambridge_algorithm, R);

and then extract inclusive jets from the cluster sequence.

Attempting to extract exclusive jets from the Cambridge/Aachen algorithm with a parameter simply provides the set of jets obtained up to the point where all . Having clustered with some given , this can actually be an effective way of viewing the event at a smaller radius, , thus allowing a single event to be viewed at a continuous range of within a single clustering.

We note that the true exclusive formulation of the Cambridge algorithm [22] (in ) instead makes use an auxiliary () distance measure and ‘freezes’ pseudojets whose recombination would involve too large a value of the auxiliary distance measure. Details are given in section 5.4.1.

4.3 Anti- jet algorithm

This algorithm, introduced and studied in [14], is defined exactly like the standard algorithm, except for the distance measures which are now given by

(7a)
(7b)

While it is a sequential recombination algorithm like and Cambridge/Aachen, the anti- algorithm behaves in some sense like a ‘perfect’ cone algorithm, in that its hard jets are exactly circular on the - cylinder [14]. To use this algorithm, define

   JetDefinition jet_def(antikt_algorithm, R);

and then extract inclusive jets from the cluster sequence. We advise against the use of exclusive jets in the context of the anti- algorithm, because of the lack of physically meaningful hierarchy in the clustering sequence.

4.4 Generalised jet algorithm

The “generalised ” algorithm again follows the same procedure, but depends on an additional continuous parameter , and has the following distance measure:

(8a)
(8b)

For specific values of , it reduces to one or other of the algorithms list above, (), Cambridge/Aachen () and anti- (). To use this algorithm, define

  JetDefinition jet_def(genkt_algorithm, R, p);

and then extract inclusive jets from the cluster sequence (or, for , also the exclusive jets).

4.5 Generalised algorithm for collisions

FastJet also provides native implementations of clustering algorithms in spherical coordinates (specifically for collisions) along the lines of the original algorithms [24], but extended following the generalised algorithm of [14] and section 4.4. We define the two following distances:

(9a)
(9b)

for a general value of and . At a given stage of the clustering sequence, if a is smallest then and are recombined, while if a is smallest then is called an “inclusive jet”.

For values of in eq. (9), the generalised algorithm behaves in analogy with the algorithms: when an object is at an angle from all other objects then it forms an inclusive jet. With the choice this provides a simple, infrared and collinear safe way of obtaining a cone-like algorithm for collisions, since hard well-separated jets have a circular profile on the D sphere, with opening half-angle . To use this form of the algorithm, define

   JetDefinition jet_def(ee_genkt_algorithm, R, p);

and then extract inclusive jets from the cluster sequence.

For values of , FastJet replaces the factor in the denominator of eq. (9a) with . With this choice (as long as ), the only time a will be relevant is when there is just a single particle in the event. The inclusive_jets(...) will then always return a single jet consisting of all the particles in the event. In such a context it is only the exclusive_jets(...) call that provides non-trivial information.

4.6 algorithm for collisions

The algorithm [24], often referred to also as the Durham algorithm, has a single distance:

(10)

Note the difference in normalisation between the in eqs. (9) and (10), and the fact that in neither case have we normalised to the total energy in the event, contrary to the convention adopted originally in [24] (where the distance measure was called ). To use the algorithm, define

   JetDefinition jet_def(ee_kt_algorithm);

and then extract exclusive jets from the cluster sequence.

Note that the ee_genkt_algorithm with and gives a clustering sequence that is identical to that of the ee_kt_algorithm. The normalisation of the ’s will however be different.

5 Plugin jet algorithms

It can be useful to have a common interface for a range of jet algorithms beyond the native (, anti- and Cambridge/Aachen) algorithms, notably for the many cone algorithms that are in existence. It can also be useful to be able to use FastJet features such as area-measurement tools for these other jet algorithms. In order to facilitate this, the FastJet package provides a plugin facility, allowing almost any other jet algorithm121212Except those for which one particle may be assigned to more than one jet, e.g. algorithms such as ARCLUS [25], which performs clustering. to be used within the same framework.

Generic plugin use is described in the next subsection. The plugins distributed with FastJet are described afterwards in sections 5.25.4. They are all accessible within the fastjet namespace and their code is to be found in FastJet’s plugins/ directory. New user-defined plugins can also be implemented, as described in section E.2. Some third-party plugins are linked to from the tools page at http://fastjet.fr/.

Not all plugins are enabled by default in FastJet. At configuration time ./configure --help will indicate which ones get enabled by default. To enable all plugins, run configure with the argument --enable-allplugins, while to enable all but PxCone (which requires a Fortran compiler, and can introduce link-time issues) use --enable-allcxxplugins.

5.1 Generic plugin use

Plugins are classes derived from the abstract base class fastjet::JetDefinition::Plugin. A JetDefinition can be constructed by providing a pointer to a JetDefinition::Plugin; the resulting JetDefinition object can then be used identically to the normal JetDefinition objects used in the previous sections. We illustrate this with an example based on the SISCone plugin:

  #include "fastjet/SISConePlugin.hh"
  // allocate a new plugin for SISCone (for other plugins, simply
  // replace the appropriate parameters and plugin name)
  double cone_radius = 0.7;
  double overlap_threshold = 0.75;
  JetDefinition::Plugin * plugin = new SISConePlugin(cone_radius, overlap_threshold);
  // create a jet definition based on the plugin
  JetDefinition jet_def(plugin);
  // run the jet algorithm and extract the jets
  ClusterSequence clust_seq(particles, jet_def);
  vector<PseudoJet> inclusive_jets = clust_seq.inclusive_jets();
  // then analyse the jets as for native fastjet algorithms
  ...
  // only when you will no longer be using the jet definition, or
  // ClusterSequence objects that involve it, may you delete the plugin
  delete plugin;

In cases such as this where the plugin has been created with a new statement and the user does not wish to manage the deletion of the corresponding memory when the JetDefinition (and any copies) using the plugin goes out of scope, then the user may wish to call the JetDefinition’s delete_plugin_when_unused() function, which tells the JetDefinition to acquire ownership of the pointer to the plugin and delete it when it is no longer needed.

5.2 SISCone Plugin

SISCone [26] is an implementation of a stable-cone jet algorithm with a split–merge step (SC-SM in the notation of [5]). As with most modern cone algorithms, it is divided into two parts: first it searches for stable cones; then, because a particle can appear in more than one stable cone, a ‘split–merge’ procedure is applied, which ensures that no particle ends up in more than one jet. The stable cones are identified using an seedless approach. This (and some care in the the ‘split–merge’ procedure) ensures that the jets it produces are insensitive to additional soft particles and collinear splittings, i.e. the algorithm is infrared and collinear safe.

The plugin library and include files are to be be found in the plugins/SISCone directory, and the main header file is SISConePlugin.hh. The SISConePlugin class has a constructor with the following structure

  SISConePlugin (double cone_radius,
                 double overlap_threshold,
                 int    n_pass_max = 0,
                 double protojet_ptmin = 0.0,
                 bool   caching = false,
                 SISConePlugin::SplitMergeScale
                                split_merge_scale = SISConePlugin::SM_pttilde);

A cone centred at is stable if the sum of momenta of all particles satisfying has rapidity . The overlap_threshold is the fraction of overlapping momentum above which two protojets are merged in a Tevatron Run II type [3] split–merge procedure. The radius and overlap parameters are a common feature of most modern cone algorithms. Because some event particles are not to be found in any stable cone [27], SISCone can carry out multiple stable-cone search passes (as advocated in [28]): in each pass one searches for stable cones using just the subset of particles not present in any stable cone in earlier passes. Up to n_pass_max passes are carried out, and the algorithm automatically stops at the highest pass that gives no new stable cones. The default of is equivalent to setting it to .

Concern had at some point been expressed that an excessive number of stable cones might complicate cone jets in events with high noise [3], and in particular lead to large “monster” jets. The protojet_ptmin parameter allows one to use only protojets with in the split–merge phase (all others are thrown away), so as to limit this issue. A large value of the split–merge overlap threshold, e.g. , also helps to disfavour the production of these monster jets through repeated merge operations.

In many cases SISCone’s most time-consuming step is the search for stable cones. If one has multiple SISConePlugin-based jet definitions, each with caching=true, a check will be carried out whether the previously clustered event had the same set of particles and the same cone radius and number of passes. If it did, the stable cones are simply reused from the previous event, rather than being recalculated, and only the split–merge step is repeated, often leading to considerable speed gains.

A final comment concerns the split_merge_scale parameter. This controls both the scale used for ordering the protojets during the split–merge step during the split–merge step, and also the scale used to measure the degree of overlap between protojets. While various options have been implemented,

  enum SplitMergeScale {SM_pt, SM_Et, SM_mt, SM_pttilde};

we recommend using only the last of them , which is also the default scale. The other scales are included only for historical comparison purposes: (used in several other codes) is IR unsafe for events whose hadronic component conserves momentum, (advocated in [3]) is not boost-invariant, and is IR unsafe for events whose hadronic component conserves momentum and stems from the decay of two identical particles.

An example of the use of the SISCone plugin was given in section 5.1. As can be seen there, SISCone jets are to be obtained by requesting inclusive jets from the cluster sequence. Note that it makes no sense to ask for exclusive jets from a SISCone based ClusterSequence.

Some cone algorithms provide information beyond that simply about the jets. Such additional information, where available, can be accessed with the help of the ClusterSequence::extras() function. In the case of SISCone, one can access that information as follows:

  const fastjet::SISConeExtras * extras =
            dynamic_cast<const fastjet::SISConeExtras *>(clust_seq.extras());

To determine the pass at which a given jet was found, one then uses131313 In versions of FastJet prior to 3.0.1, a jet’s user index indicated the pass at which it had been found. The value was however incorrectly set for single-particle jets. The current choice is to leave the user index unchanged from its default.

  int pass = extras->pass(jet);

One may also obtain a list of the positions of original stable cones as follows:

  const vector<PseudoJet> & stable_cones = extras->stable_cones();

The stable cones are represented as PseudoJets, for which only the rapidity and azimuth are meaningful. The user_index() indicates the pass at which a given stable cone was found.

SISCone uses -scheme recombination internally and also for constructing the final jets from the list of constituents. For the latter task, the user may instead instruct SISCone to use the jet-definition’s own recombiner, with the command

  plugin->set_use_jet_def_recombiner(true);

For this to work, plugin must explicitly be of type SISConePlugin * rather than JetDefinition::Plugin *.

Since SISCone is infrared safe, it may meaningfully be used also with the ClusterSequenceArea class. Note however that in that case ones loses the cone-specific information from the jets, because of the way FastJet filters out the information relating to ghosts in the clustering. If the user needs both areas and cone-specific information, she/he may use the ClusterSequenceActiveAreaExplicitGhosts class (for usage information, see the corresponding .hh file).

A final remark concerns speed and memory requirements: as mentioned above, SISCone takes time to find jets, and the memory use is ; taking as a reference point, it runs in a few tenths of a second, making it about 100 times slower than native FastJet algorithms. These are ‘expected’ results, i.e. valid for a suitably random set of particles.141414In area determinations, the ghost particles are not entirely random, but distributed close to a grid pattern, all with similar transverse momenta. Run times and memory usage are then, in practice, somewhat larger than for a normal QCD event with the same number of particles. We therefore recommend running with not too small a ghost_area (e.g. ) and using grid_scatter=1 (cf. section 7), which helps to reduce the number of stable cones (and correspondingly, the time and memory usage of the subsequent split–merge step). An alternative, which has been found to be acceptable in many situations, is to use a passive area, since this is relatively fast to calculate with SISCone.

Note that the underlying implementation of SISCone is optimised for large . An alternative implementation that is faster for was presented in [29].

5.3 Other plugins for hadron colliders

Most of the algorithms listed below are cone algorithms. They are usually either infrared (IR) or collinear unsafe. The details are indicated for each algorithm following the notation of Ref. [5]: IR means that the hard jets may be modified if, to an ensemble of hard particles in a common neighbourhood, one adds a single soft particle; Coll means that for hard particles in a common neighbourhood, the collinear splitting of one of them may modify the hard jets. The FastJet authors (and numerous theory-experiment accords) advise against the use of IR and/or collinear unsafe jet algorithms. Interfaces to these algorithms have been provided mainly for legacy comparison purposes.

Except where stated, the usual way to access jets from these plugins is through ClusterSequence::inclusive_jets().

5.3.1 CDF Midpoint

This is one of the two cone algorithms used by CDF in Run II of the Tevatron, based on [3]. It is a midpoint-type iterative cone with a split–merge step.

  #include "fastjet/CDFCones.hh"
  // ...
  CDFMidPointPlugin(double R,
                    double overlap_threshold,
                    double seed_threshold = 1.0,
                    double cone_area_fraction = 1.0);

The overlap threshold () used by CDF is usually , the seed threshold is  GeV. With a cone area fraction , the search for stable cones is performed with a radius that is , i.e. it becomes the searchcone algorithm of [27]. CDF has used both and . It is our understanding that the particular choice of is not always clearly documented in the corresponding publications.

Further control over the plugin can be obtained by consulting the header file.

The original underlying code for this algorithm was provided on a webpage belonging to Joey Huston [30] (with minor modifications to ensure reasonable run times with optimising compilers for 32-bit Intel processors — these modifications do not affect the final jets).

This algorithm is IR unsafe in the limit of zero seed threshold [26] (with it becomes IR unsafe [28]). With a non-zero seed threshold (and no preclustering into calorimeter towers) it is collinear unsafe. It is to be deprecated for new experimental or theoretical analyses.

5.3.2 CDF JetClu

JetClu is the other cone algorithm used by CDF during Run II, as well as their main algorithm during Run I [31]. It is an iterative cone with split-merge and optional “ratcheting” if iratch == 1 (particles that appear in one iteration of a cone are retained in future iterations). It can be obtained as follows:

  #include "fastjet/CDFCones.hh"
  // ...
  CDFJetCluPlugin (double   cone_radius,
                   double   overlap_threshold,
                   double   seed_threshold = 1.0,
                   int      iratch = 1);

The overlap threshold is usually set to 0.75 in CDF analyses. Further control over the plugin can be obtained by consulting the header file.

The original underlying code for this algorithm was provided on a webpage belonging to Joey Huston [30].

This algorithm is IR unsafe (for zero seed threshold; some IR unsafety persists with non-zero seed threshold). It is to be deprecated for new experimental or theoretical analyses. Note also that the underlying implementation groups particles together into calorimeter towers, with CDF-type geometry, before running the jet algorithm.

5.3.3 DØ Run I cone

This is the main algorithm used by DØ in Run I of the Tevatron [32]. It is an iterative cone algorithm with a split-merge step. It comes in two versions

  #include "fastjet/D0RunIpre96ConePlugin.hh"
  // ...
  D0RunIpre96ConePlugin (double R,
                         double min_jet_Et,
                         double split_ratio = 0.5);

and

  #include "fastjet/D0RunIConePlugin.hh"
  // ...
  D0RunIConePlugin (double R,
                    double min_jet_Et,
                    double split_ratio = 0.5);

corresponding to versions of the algorithm used respectively before and after 1996. They differ only in the recombination scheme used to determine the jet momenta once each jet’s constituents have been identified. In the pre-1996 version, a hybrid between an -like scheme and an scheme recombination is used, while in the post-1996 version it is just the scheme [32].

The algorithm places a cut on the minimum of the cones during iteration (related to min_jet_Et). The split_ratio is the same as the overlap threshold in other split-merge based algorithms (DØ usually use 0.5). It is the FastJet authors’ understanding that the value used for min_jet_Et was 8 GeV, corresponding to a cut of on cones. The publication that describes this algorithm [32] mentions the use of a seed threshold applied to preclustered calorimeter towers in order to obtain the seeds for the stable cone finding. Such a threshold and the pre-clustering appear not to be included in the code distributed with FastJet.

The underlying code for this algorithm was provided by Lars Sonnenschein.

Note: this algorithm is IR unsafe. It is recommended that it be used only for the purpose of comparison with Run I data from DØ. It is to be deprecated for new experimental or theoretical analyse

5.3.4 DØ Run II cone

This is the main algorithm used by DØ in Run II of the Tevatron. It is a midpoint type iterative cone with split-merge step. The DØ collaboration usually refers to Ref. [3] when introducing the algorithm in its articles. That generic Tevatron document does not reflect all details of the actual DØ algorithm, and for a complementary description the reader is referred to Ref. [33].

  #include "fastjet/D0RunIIConePlugin.hh"
  // ...
  D0RunIIConePlugin (double R,
                     double min_jet_Et,
                     double split_ratio = 0.5);

The parameters have the same meaning as in the DØ Run I cone. There is a cut on the minimum of the cones during iteration, which by default is half of min_jet_Et. It is the FastJet authors’ understanding that two values have been used for min_jet_Et, 8 GeV (in earlier publications) and 6 GeV (in more recent publications).

As concerns seed thresholds and preclustering, DØ describes them as being applied to calorimeter towers in data and in Monte Carlo studies that include detector simulation [33]. However, for NLO calculations and Monte Carlo studies based on stable particles, no seed threshold is applied. The code distributed with FastJet does not allow for seed thresholds.

The underlying code for this algorithm was provided by Lars Sonnenschein.

Note: this algorithm is IR unsafe (IR for jets with energy too close to min_jet_Et). It is to be deprecated for new experimental or theoretical analyses.

5.3.5 ATLAS iterative cone

This iterative cone algorithm, with a split-merge step, was used by ATLAS during the preparation for the LHC.

  #include "fastjet/AtlasConePlugin.hh"
  // ...
  ATLASConePlugin (double R,
                   double seedPt = 2.0,
                   double f = 0.5);

is the overlap threshold

The underlying code for this algorithm was extracted from an early version of SpartyJet [16] (which itself was distributed under the GPL license). Since version 3.0 of FastJet it is a slightly modified version that we distribute, where an internal sort function has been replaced with a stable_sort, to ensure reproducibility of results across compilers and architectures (results are unchanged when the results of the sort are unambiguous).

Note: this algorithm is IR unsafe (in the limit of zero seed threshold). It is to be deprecated for new experimental or theoretical analyses.

5.3.6 CMS iterative cone

This iterative cone algorithm with progressive removal was used by CMS during the preparation for the LHC.

  #include "fastjet/CMSIterativeConePlugin.hh"
  // ...
  CMSIterativeConePlugin (double ConeRadius, double SeedThreshold=0.0);

The underlying code for this algorithm was extracted from the CMSSW web site, with certain small service routines having been rewritten by the FastJet authors. Permission to redistribute the resulting code with FastJet has been granted by CMS under the terms of the GPL license. The code was validated by clustering 1000 events with the original version of the CMS software and comparing the output to the clustering performed with the FastJet plugin. The jet contents were identical in all cases. However the jet momenta differed at a relative precision level of , related to the use of single-precision arithmetic at some internal stage of the CMS software (while the FastJet version is in double precision).

Note: this algorithm is Coll unsafe [14]. It is to be deprecated for new experimental or theoretical analyses.

5.3.7 PxCone

The PxCone algorithm is an iterative cone with midpoints and a split-drop procedure:

  #include "fastjet/PxConePlugin.hh"
  // ...
  PxConePlugin (double  cone_radius      ,
                double  min_jet_energy = 5.0  ,
                double  overlap_threshold = 0.5,
                bool    E_scheme_jets = false);

It includes a threshold on the minimum transverse energy for a cone (jet) to be included in the split-drop stage. If E_scheme_jets is true then the plugin applies an -scheme recombination to provide the momenta of the final jets (by default an type recombination scheme is used).

The base code for this plugin is written in Fortran and, on some systems, problems have been reported at the link stage due to mixing Fortran and C++. The Fortran code has been modified by the FastJet authors to provide the same jets regardless of the order of the input particles. This involved a small modification of the midpoint procedure, which can have a minor effect on the final jets and should make the algorithm correspond to the description of [34].

The underlying code for this algorithm was written by Luis del Pozo and Michael Seymour with input also from David Ward [35] and they have granted permission for their code to be distributed with FastJet under the terms of the GPL license.

This algorithm is IR unsafe. It is to be deprecated for new experimental or theoretical analyses.

5.3.8 TrackJet

This algorithm has been used at the Tevatron for identifying jets from charged-particle tracks in underlying-event studies [36].

  #include "fastjet/TrackJetPlugin.hh"
  // ...
  TrackJetPlugin (double radius,
                  RecombinationScheme jet_recombination_scheme=pt_scheme,
                  RecombinationScheme track_recombination_scheme=pt_scheme);

Two recombination schemes are involved: the first one indicates how momenta are recombined to provide the final jets (once particle-jet assignments are known), the second one indicates how momenta are combined in the procedure that constructs the jets.

The underlying code for this algorithm was written by the FastJet authors, based on code extracts from the (GPL) Rivet implementation, written by Andy Buckley with input from Manuel Bahr and Rick Field. Since version 3.0 of FastJet it is a slightly modified version that we distribute, where an internal sort function has been replaced with a stable_sort, to ensure reproducibility of results across compilers and architectures (results are unchanged when the results of the sort are unambiguous, which is the usual case).

Note: this algorithm is believed to be Coll unsafe. It is to be deprecated for new experimental or theoretical analyses.

5.3.9 GridJet

GridJet allows you to define a grid and then cluster particles such that all particles in a common grid cell combine to form a jet. Its main interest is in providing fast clustering for high multiplicities (the clustering time scales linearly with the number of particles). The jets that it forms are not always physically meaningful: for example, a genuine physical jet may lie at the corner of 4 grid cells and so be split up somewhat arbitrarily into 4 pieces. It is therefore not intended to be used for standard jet finding. However for some purposes (such as background estimation) this drawback is offset by the greater uniformity of the area of the jets. Its interface is as follows

  #include "fastjet/GridJetPlugin.hh"
  // ...
  GridJetPlugin (double ymax, double requested_grid_spacing);

creating a grid that covers ymax with a grid spacing close to the requested_grid_spacing: the spacings chosen in and are those that are closest to the requested spacing while also giving an integer number of grid cells that fit exactly into the rapidity and ranges.

Note that for background estimation purposes the GridMedianBackgroundEstimator is much faster than using the GridJetPlugin with ghosts and a JetMedianBackgroundEstimator.

The underlying code for this algorithm was written by the FastJet authors.

5.4 Plugins for collisions

5.4.1 Cambridge algorithm

The original Cambridge [22] algorithm is provided as a plugin:

  #include "fastjet/EECambridgePlugin.hh"
  // ...
  EECambridgePlugin (double ycut);

This algorithms performs sequential recombination of the pair of particles that is closest in angle, except when , in which case the less energetic of and is labelled a jet, and the other member of the pair remains free to cluster.

To access the jets, the user should use the inclusive_jets(), i.e. as they would for the majority of the algorithms.

The underlying code for this algorithm was written by the FastJet authors.

5.4.2 Jade algorithm

The JADE algorithm [37, 38], a sequential recombination algorithm with distance measure , is available through

  #include "fastjet/JadePlugin.hh"
  // ...
  JadePlugin ();

To access the jets at a given , the user should call ClusterSequence::exclusive_jets_ycut(double ycut).

Note: the JADE algorithm has been used with various recombination schemes. The current plugin will use whatever recombination scheme the user specifies with for the jet definition. The default -scheme is what was used in the original JADE publication [37]. To modify the recombination scheme, the user may first construct the jet definition and then use either of

  void JetDefinition::set_recombination_scheme(RecombinationScheme recomb_scheme);
  void JetDefinition::set_recombiner(const Recombiner * recomb)

(cf. sections 3.4, E.1) to modify the recombination scheme.

The underlying code for this algorithm was written by the FastJet authors.

5.4.3 Spherical SISCone algorithm

The spherical SISCone algorithm is an extension [39] to spherical coordinates of the hadron-collider SISCone algorithm [26].

  #include "fastjet/SISConeSphericalPlugin.hh"
  // ...
  SISConeSphericalPlugin(double R,
                         double overlap\_threshold
                         double protojet_Emin = 0.0,
                         bool   caching = false,
                         SISConeSphericalPlugin::SplitMergeScale
                           split_merge_scale = SISConeSphericalPlugin::SM_Etilde,
                         double split_merge_stopping_scale = 0.0);

The functionality follows directly that of SISConePlugin.

Note that the underlying implementation of spherical SISCone is optimised for large . An alternative implementation that is faster for was presented in [29]. That reference also contains a nice description of the algorithm.

6 Selectors

Analyses often place constraints (cuts) on jets’ transverse momenta, rapidity, maybe consider only some hardest jets, etc. There are situations in which it is convenient to be able to define a basic set of jet cuts in one part of a program and then have it used elsewhere. To allow for this, we provide a fastjet::Selector class, available through

  #include "fastjet/Selector.hh"

6.1 Essential usage

As an example of how Selectors are used, suppose that we have a vector of jets, jets, and wish to select those that have rapidities and transverse momenta above . We might write the following:

  Selector select_rapidity = SelectorAbsRapMax(2.5);
  Selector select_pt       = SelectorPtMin(20.0);
  Selector select_both     = select_pt && select_rapidity;
  vector<PseudoJet> selected_jets = select_both(jets);

Here, Selector is a class, while SelectorAbsRapMax and SelectorPtMin are functions that return an instance of the Selector class containing the internal information needed to carry out the selection. Selector::operator(const vector<PseudoJet> & jets) takes the jets given as input and returns a vector containing those that pass the selection cuts. The logical operations &&, || and ! enable different selectors to be combined.

Nearly all selectors, like those above, apply jet by jet (the function Selector::applies_jet_by_jet() returns true). For these, one can query whether a single jet passes the selection with the help of the function bool Selector::pass(const PseudoJet &).

There are also selectors that only make sense applied to an ensemble of jets. This is the case specifically for SelectorNHardest(unsigned int n), which, acting on an ensemble of jets, selects the jets with largest transverse momenta. If there are fewer than jets, then all jets pass.

When a selector is applied to an ensemble of jets one can also use

  Selector::sift(vector<PseudoJet> & jets,
                 vector<PseudoJet> & jets_that_pass,
                 vector<PseudoJet> & jets_that_fail)

to obtain the vectors of PseudoJets that pass or fail the selection.

For selectors that apply jet-by-jet, the selectors on either side of the logical operators && and || naturally commute. For operators that act only on the ensemble of jets the behaviour needs specifying. The design choice that we have made is that

  SelectorNHardest(2)    && SelectorAbsRapMax(2.5)
  SelectorAbsRapMax(2.5) && SelectorNHardest(2)

give identical results: in logical combinations of selectors, each constituent selector is applied independently to the ensemble of jets, and then a decision whether a jet passes is determined from the corresponding logical combination of each of the selectors’ results. Thus, here only jets that are among the 2 hardest of the whole ensemble and that have will be selected. If one wishes to first apply a rapidity cut, and then find the 2 hardest among those jets that pass the rapidity cut, then one should instead use the * operator:

  SelectorNHardest(2)  *  SelectorAbsRapMax(2.5)

In this combined selector, the right-hand selector is applied first, and then the left-hand selector is applied to the results of the right-hand selection.

A complementary selector can also be defined using the negation operator. For instance

  Selector sel_allbut2hardest = !SelectorNHardest(2);

Note that, if directly applying (as opposed to first defining) a similar negated selector to a collection of jets, one should write

  vector<PseudoJet> allbut2hardest = (!SelectorNHardest(2))(jets);

with the brackets around the selector definition being now necessary due to () having higher precedence in C++ than Boolean operators.

A user can obtain a string describing a given Selector’s action by calling its description() member function. This behaves sensibly also for compound selectors.

New selectors can be implemented as described in section E.3.

6.1.1 Other information about selectors

Selectors contain a certain amount of additional information that can provide useful hints to the functions using them.

One such piece of information is a selector’s rapidity extent, accessible through a get_rapidity_extent(rapmin,rapmax) call, which is useful in the context of background estimation (section 8). If it is not sensible to talk about a rapidity extent for a given selector (e.g. for SelectorPtMin) the rapidity limits that are returned are the largest (negative and positive) numbers that can be represented as doubles. The function is_geometric() returns true if the selector places constraints only on rapidity and azimuth.

Selectors may also have areas associated with them (in analogy with jet areas, section 7). The has_finite_area() member function returns true if a selector has a meaningful finite area. The area() function returns this area. In some cases the area may be computed using ghosts (by default with ghosts of area ; the user can specify a different ghost area as an argument to the area function).

6.2 Available selectors

6.2.1 Absolute kinematical cuts

A number of selectors have been implemented following the naming convention Selector{Var}{LimitType}. The {Var} indicates which variable is being cut on, and can be one of

       pt, Et, E, Mass, Rap, AbsRap, Eta, AbsEta

The {LimitType} indicates whether one places a lower-limit on the variable, an upper limit or a range, corresponding to the choices

       Min, Max, Range

A couple of examples are

  SelectorPtMin(25.0)        // Selects  (units are user’s default for momenta)
  SelectorRapRange(1.9,4.9)  // Selects 

Following a similar naming convention, there are also SelectorPhiRange() and SelectorRapPhiRange().

6.2.2 Relative kinematical cuts

Some selectors take a reference jet. They have been developed because it is can be useful for a selector to make its decision based on information about some other jet. For example one might wish to select all jets within some distance of a given reference jet; or all jets whose transverse momentum is at least some fraction of a reference jet’s. That reference jet may change from event to event, or even from one invocation of the Selector to the next, even though the Selector is fundamentally performing the same underlying type of action.

The available selectors of this kind are:

  SelectorCircle()                   // a circle of radius R around the reference jet
  SelectorDoughnut(, )          // a doughnut between  and 
  SelectorStrip(half_width)           // a rapidity strip 2*half_width large
  SelectorRectangle(half_rap_width, half_phi_width) // a rectangle in rapidity and phi
  SelectorPtFractionMin()            //  larger than 

One example of selectors taking a reference jet is the following. First, one constructs the selector,

  Selector sel = SelectorCircle(1.0);

Then if one is interested in the subset of jets near jet1, and then those near jet2, one performs the following operations:

  sel.set_reference(jet1);
  vector<PseudoJet> jets_near_jet1 = sel(jets);
  sel.set_reference(jet2);
  vector<PseudoJet> jets_near_jet2 = sel(jets);

If one uses a selector that takes a reference without the reference having been actually set, an exception will be thrown. If one sets a reference for a compound selector, the reference is automatically set for all components that take a reference. One can verify whether a given selector takes a reference by calling the member function

  bool Selector::takes_reference() const;

Attempting to set a reference for a Selector that returns false here will cause an exception to be thrown.

6.2.3 Other selectors

The following selectors are also available:

  SelectorNHardest()      //  selects the  hardest jets
  SelectorIsPureGhost()    //  selects jets that are made exclusively of ghost particles
  SelectorIsZero()         //  selects jets with zero momentum
  SelectorIdentity()       //  selects everything. Included for completeness

7 Jet areas

Jet areas provide a measure of the surface in the - plane over which a jet extends, or, equivalently, a measure of a jet’s susceptibility to soft contamination.

Since a jet is made up of only a finite number of particles, one needs a specific definition in order to make its area an unambiguous concept. Three definitions of area have been proposed in [17] and implemented in FastJet:

  • Active areas add a uniform background of extremely soft massless ‘ghost’ particles to the event and allow them to participate in the clustering. The area of a given jet is proportional to the number of ghosts it contains. Because the ghosts are extremely soft (and sensible jet algorithms are infrared safe), the presence of the ghosts does not affect the set of user particles that end up in a given jet. Active areas give a measure of a jet’s sensitivity to diffuse background noise.

  • Passive areas are defined as follows: one adds a single randomly placed ghost at a time to the event. One examines which jet (if any) the ghost ends up in. One repeats the procedure many times and the passive area of a jet is then proportional to the probability of it containing the ghost. Passive areas give a measure of a jet’s sensitivity to point-like background noise.

  • The Voronoi area of a jet is the sum of the Voronoi areas of its constituent particles. The Voronoi area of a particle is obtained by determining the Voronoi diagram for the event as a whole, and intersecting the Voronoi cell of the particle with a circle of radius centred on the particle. Note that for the algorithm (but not in general for other algorithms) the Voronoi area of a jet coincides with its passive area.

In the limit of very densely populated events, all area definitions lead to the same jet-area results [17].151515This can be useful when one area is particularly expensive to calculate: for example active areas for SISCone tend to be memory and CPU intensive; however, for dense events, they can be adequately replaced with passive areas, which, for SISCone, are computationally more straightforward.

The area of a jet can be calculated either as a scalar, or as a 4-vector. Essentially the scalar case corresponds to counting the number of ghosts in the jet; the -vector case corresponds to summing their 4-vectors, normalised such that for a narrow jet, the transverse component of the -vector is equal to the scalar area.

To access jet areas, the user is exposed to two main classes:

  class fastjet::AreaDefinition;
  class fastjet::ClusterSequenceArea;

with input particles, a jet definition and an area definition being supplied to a ClusterSequenceArea in order to obtain jets with area information. Typical usage would be as follows:

  #include "fastjet/ClusterSequenceArea.hh"
  // ...
  double ghost_maxrap = 5.0; // e.g. if particles go up to y=5
  AreaDefinition area_def(active_area, GhostedAreaSpec(ghost_maxrap));
  ClusterSequenceArea clust_seq(input_particles, jet_def, area_def);
  vector<PseudoJet> jets = sorted_by_pt(clust_seq.inclusive_jets());
  double area_hardest_jet = jets[0].area();

Details are to be found below and an example program is given as example/06-area.cc.

When jet areas are to be used to establish the level of a diffuse noise that might be present in the event (e.g. from underlying event particles or pileup), and maybe subtract it from jets, further classes such as fastjet::JetMedianBackgroundEstimator and fastjet::Subtractor are useful. This topic is discussed in Section 8 and an example program is given in example/07-subtraction.cc.

7.1 AreaDefinition

Area definitions are contained in the AreaDefinition class. Its two main constructors are:

  AreaDefinition(fastjet::AreaType area_type,
                 fastjet::GhostedAreaSpec ghost_spec);

for the various active and passive areas (which all involve ghosts) and

  AreaDefinition(fastjet::VoronoiAreaSpec voronoi_spec);

for the Voronoi area. A default constructor exists, and provides an active area with a ghost_spec that is acceptable for a majority of area measurements with clustering algorithms and typical Tevatron and LHC rapidity coverage.

Information about the current AreaDefinition can be retrieved with the help of description(), area_type(), ghost_spec() and voronoi_spec() member functions.

7.1.1 Ghosted Areas (active and passive)

There are two variants each of the active and passive areas, as defined by the AreaType enum:

  enum AreaType{ [...],
                 active_area,
                 active_area_explicit_ghosts,
                 one_ghost_passive_area,
                 passive_area,
                 [...]};

The two active variants give identical results for the areas of hard jets. The second one explicitly includes the ghosts when the user requests the constituents of a jet and also leads to the presence of “pure ghost” jets. The first of the passive variants explicitly runs through the procedure mentioned above, i.e. it clusters the events with one ghost at a time, and repeats this for very many ghosts. This can be quite slow, so we also provide the passive_area option, which makes use of information specific to the jet algorithm in order to speed up the passive-area determination.161616This ability is provided for , Cambridge/Aachen, anti- and the SISCone plugin. In the case of it is actually a Voronoi area that is used, since this can be shown to be equivalent to the passive area [17] (though some approximations are made for 4-vector areas). For other algorithms it defaults back to the one_ghost_passive_area approach.

In order to carry out a clustering with a ghosted area determination, the user should also create an object that specifies how to distribute the ghosts.171717Or accept a default — which uses the default values listed in the explicit constructor and This is done via the class GhostedAreaSpec whose constructor is

  GhostedAreaSpec(double ghost_maxrap,
                  int    repeat        = 1,   double ghost_area    = 0.01,
                  double grid_scatter  = 1.0, double pt_scatter    = 0.1,
                  double mean_ghost_pt = 1e-100);

The ghosts are distributed on a uniform grid in and , with small random fluctuations to avoid clustering degeneracies.

The ghost_maxrap variable defines the maximum rapidity up to which ghosts are generated. If one places ghosts well beyond the particle acceptance (at least beyond it), then jet areas also stretch beyond the acceptance, giving a measure of the jet’s full extent in rapidity and azimuth. If ghosts are placed only up to the particle acceptance, then the jet areas are clipped at that acceptance and correspond more closely to a measure of the jet’s susceptibility to contamination from accepted soft particles. This is relevant in particular for jets within a distance of the particle acceptance boundary. The two choices are illustrated in fig. 1. To define more complicated ghost acceptances it is possible to replace ghost_maxrap with a Selector, which must be purely geometrical and have finite rapidity extent.

Figure 1: Two choices for ghost placement. The grey area in each plot indicates the region containing ghosts, while the dots are particles, which here are accepted up to . The circular regions indicate the areas that will be found for two particular jets. In the left-hand case, with ghosts that extend well beyond the acceptance for particles, jet areas are unaffected by the particle acceptance; in the right-hand case, with ghosts placed only up to the acceptance limit, jet areas are clipped at the edge of the acceptance.

The ghost_area parameter in the GhostedAreaSpec constructor is the area associated with a single ghost. The number of ghosts is inversely proportional to the ghost area, and so a smaller area leads to a longer CPU-time for clustering. However small ghost areas give more accurate results. We have found the default ghost area given above to be adequate for most applications. Smaller ghost areas are beneficial mainly for high-precision determinations of areas of jets with small .

By default, one set of ghosts is generated for each event that is clustered. The small random fluctuations in ghost positions and ’s, introduced to break clustering degeneracies, mean that for repeated clustering of the same event a given hard jet’s area will be different after each clustering. This is especially true for sparse events, where a jet’s particle content fails to accurately delineate the boundaries of the jet. For the active_area choice (and certain passive areas), specifying causes FastJet to directly cluster the same hard event with multiple ghost sets. This results in a pseudo-Monte Carlo type evaluation of the jet areas. A statistical uncertainty on the area is available for each jet, through the jet.area_error() call. It is calculated as the standard deviation of areas obtained for that jet, divided by . While there are situations in which this facility is useful, for most applications of jet areas it is sufficient to use .181818Several parameters are available to control the properties and randomness of the ghosts: each ghost’s position differs from an exact grid vertex by a random amount distributed uniformly in the range times the grid spacing in both the rapidity and azimuth directions. Each ghost’s is distributed randomly in the range times mean_ghost_pt. For nearly all applications, it makes sense to use the default values. Facilities are also available to set and retrieve the seeds for the random-number generator, notably through the set_random_status(...) and get_random_status(...) members of GhostedAreaSpec.

After initialisation, the parameters can be modified and retrieved respectively with calls such as set_ghost_area(...) and ghost_maxrap() (similarly for the other parameters191919In versions of FastJet prior to 3.0.1, the names mean_ghost_kt and kt_scatter should be used rather than mean_ghost_pt and pt_scatter. The former names will be maintained for the foreseeable future.). A textual description of the GhostedAreaSpec can be obtained, as usual, with the description() member function.

7.1.2 Voronoi Areas

The Voronoi areas of jets are evaluated by summing the corresponding Voronoi areas of the jets’ constituents. The latter are obtained by considering the intersection between the Voronoi cell of each particle and a circle of radius centred on the particle’s position in the rapidity-azimuth plane.

The jets’ Voronoi areas can be obtained from ClusterSequenceArea by passing the proper VoronoiAreaSpec specification to AreaDefinition. Its constructors are

  /// default constructor (effective_Rfact = 1)
  VoronoiAreaSpec() ;
  /// constructor that allows you to set effective_Rfact
  VoronoiAreaSpec(double effective_Rfact) ;

The second constructor allows one to modify (by a multiplicative factor effective_Rfact) the radius of the circle which is intersected with the Voronoi cells. With , for the algorithm, the Voronoi area is equivalent to the passive area. Information about the specification in use is returned by effective_Rfact() and description() member functions.

The Voronoi areas are calculated with the help of Fortune’s () Voronoi diagram generator for planar static point sets [40].

One use for the Voronoi area is in background determination with the algorithm (see below, section 8): with the choice it provides an acceptable approximation to the algorithm’s active area and is often significantly faster to compute than the active area. Note that it is not currently possible to clip Voronoi areas with a given particle acceptance. As a result, given particles up to , only jets with will have areas that reflect the jets’ sensitivity to accepted particle contamination. It is only these jets that should then be used for background determinations.

7.2 ClusterSequenceArea

This is the class 202020 ClusterSequenceArea is derived from ClusterSequenceAreaBase (itself derived from ClusterSequence) and makes use of one among ClusterSequenceActiveAreaExplicitGhosts, ClusterSequenceActiveArea, ClusterSequencePassiveArea, ClusterSequence1GhostPassiveArea or ClusterSequenceVoronoiArea (all of them in the fastjet namespace of course), according to the choice made with AreaDefinition. The user can also use these classes directly. ClusterSequenceActiveAreaExplicitGhosts is particularly useful in that it allows the user to specify their own set of ghost particles. This is exploited to provide area support in a number of the transformers of section 9. used for producing a cluster sequence that also calculates jet areas. Its constructor is

  template<class L> ClusterSequenceArea(const std::vector<L> & input_particles,
                                        const JetDefinition & jet_def,
                                        const AreaDefinition & area_def);

and the class includes the methods

  /// Return a reference to the area definition
  virtual const AreaDefinition & area_def() const;
  /// Returns an estimate of the area contained within the selector that is free of jets.
  /// The selector needs to have a finite area and be applicable jet by jet.
  /// The function returns 0 if active_area_explicit_ghosts was used.
  virtual double empty_area(const Selector & selector) const;

As long as an instance of this class is in scope, a user can access information about the area of its jets using the following methods of PseudoJet:

  /// Returns the scalar area associated with the given jet
  double area = jet.area();
  /// Returns the error (uncertainty) associated with the determination of the
  /// scalar area of the jet; gives zero when the repeat=1 and for passive/Voronoi areas
  double area_error = jet.area_error();
  /// Returns a PseudoJet whose 4-vector is defined by the following integral
  ///
  ///        PseudoJet(,,) * (" inside jet boundary")
  ///
  /// where PseudoJet(,,) is a 4-vector with the given rapidity (),
  /// azimuth () and , while (" inside jet boundary") is 1
  /// when  define a direction inside the jet boundary and 0 otherwise.
  PseudoJet area_4vector = jet.area_4vector();
  /// When using active_area_explicit_ghosts, returns true for jets made
  /// exclusively of ghosts and for ghost constituents.
  bool is_pure_ghost = jet.is_pure_ghost();

8 Background estimation and subtraction

Events with hard jets are often accompanied by a more diffuse “background” of relatively soft particles, for example from the underlying event (in or PbPb collisions) or from pileup (in collisions). For many physical applications, it is useful to be able to estimate characteristics of the background on an event-by-event basis, for example the per unit area (), or fluctuations from point to point (). One use of this information is to correct the hard jets for the soft contamination, as discussed below in section 8.1.2.

One of the issues in characterising the background is that it is difficult to introduce a robust criterion to distinguish “background” jets from hard jets. The main method that is available in FastJet involves the determination of the distribution of for the jets in a given event (or region of the event) and then taking the median of the distribution as an estimate of , as proposed in [18] and studied in detail also in [41, 42]. This is largely insensitive to the presence of a handful of hard jets, and avoids any need for introducing a scale to distinguish hard and background jets.

The original form of this method used the or Cambridge/Aachen jet algorithms to find the jets. These algorithms have the advantage that the resulting jets tend to have reasonably uniform areas212121Whereas anti- and SISCone suffer from jets with near zero areas or, for SISCone, sometimes huge, “monster” jets, biasing the determination. They are therefore not recommended. In the meantime a variant of the approach that has emerged is to cluster the particles into rectangular grid cells in and and determine their median . This has the advantage of simplicity and much greater speed. One may worry that a hard jet will sometimes lie at a corner of multiple grid cells, inducing larger biases in the median than with a normal jet finder jets, however we have found this not to be an issue in practice [42].

8.1 General Usage

8.1.1 Background estimation

The simplest workflow for background estimation is first, outside the event loop, to create a background estimator. For the jet-based method, one creates a fastjet::JetMedianBackgroundEstimator,

  #include "fastjet/tools/JetMedianBackgroundEstimator.hh"
  // ...
  JetMedianBackgroundEstimator bge(const Selector & selector,
                                   const JetDefinition & jet_def,
                                   const AreaDefinition & area_def);

where the selector is used to indicate which jets are used for background estimation (for simple use cases, one just specifies a rapidity range, e.g. SelectorAbsRapMax(4.5) to use all jets with ), together with a jet definition and an area definition. We have found that the or Cambridge/Aachen jet algorithms with generally provide adequate background estimates, with the lower range of values to be preferred if the events are likely to be busy [41, 42]. An active area with explicit ghosts is generally recommended.222222With the algorithm one may also use a Voronoi area (effective_Rfact = 0.9 is recommended), which has the advantage of being deterministic and faster than ghosted areas. In this case however one must use a selector that is geometrical and selects only jets well within the range of event particles. E.g. if particles are present up to