A simulation study to distinguish prompt photon from \pi^{0} and beam halo in a granular calorimeter using deep networks


In a hadron collider environment, identifying prompt photons that originate in a hard partonic scattering process, and rejecting non-prompt photons that come from hadronic jets or from beam-related sources, is the first step in the study of processes with photons in the final state. Photons from the decay of π⁰'s produced inside a hadronic jet and photons produced in catastrophic bremsstrahlung by beam halo muons are two major sources of non-prompt photons. In this paper the potential of deep learning methods for separating prompt photons from beam halo and π⁰'s in the electromagnetic calorimeter of a collider detector is investigated, using an approximate description of the CMS detector. It is shown that, using only calorimetric information as images with a Convolutional Neural Network, beam halo (π⁰) can be separated from prompt photons with 99.96% (97.7%) background rejection at 99.00% (90.0%) signal efficiency, which is much better than what the traditionally employed variables achieve.

S. Ghosh (a), A. Harilal (b,c,1), A. R. Sahasransu (b,d), R. K. Singh (b) and S. Bhattacharya (a)

(1) Corresponding author.

Prepared for submission to JINST


  (a) High Energy Nuclear and Particle Physics Division, Saha Institute of Nuclear Physics, HBNI,
      1/AF Bidhannagar, Kolkata, India

  (b) Department of Physical Sciences, Indian Institute of Science Education and Research Kolkata,
      Mohanpur, 741246, India

  (c) Department of Physics, Carnegie Mellon University, Pittsburgh, USA

  (d) Vrije Universiteit Brussel, Belgium

E-mail: aharilal@andrew.cmu.edu

Keywords: Calorimeters, Detector modelling and simulations, Data processing methods, Performance of High Energy Physics Detectors, Deep Learning, Convolutional Neural Network



1 Introduction

Understanding the symmetry breaking mechanism in the standard model (SM) and searching for new physics beyond the standard model are major goals of the physics program of the Large Hadron Collider (LHC). Photons and electrons play important roles in these searches as they provide clean signatures in a hadronic environment. Distinguishing prompt photons (coming from the hard scattering) from photons coming from neutral meson (π⁰) decays or from other anomalous sources (e.g. bremsstrahlung photons from beam halo muons) is of critical importance. An example is the search for the Higgs boson using its H → γγ decay mode, which was one of the main channels for the discovery of the Higgs boson. The di-photon final state produced by this decay has a large background from hadronic jets in which most of the jet energy has gone to a single π⁰. The sensitivity of this search depends largely on the power to reject this background. Another example is the search for dark matter and large extra dimensions in final states with a photon and large missing transverse momentum. In this channel, along with the photons coming from pion decays, beam halo photons pose an additional challenge.
Currently employed traditional techniques use a combination of shower shape variables, constructed by physicists to capture the difference in spatial patterns between signal and background events in the raw output of an electromagnetic calorimeter. These variables are only a few out of an infinite set of possible ones. Classification using supervised learning algorithms such as an artificial neural network (ANN) or a boosted decision tree (BDT), with the high-level features built by physicists as inputs, has also been done successfully. A prime example of such an analysis is the search for the Higgs boson in the Higgs to diphoton channel performed by the CMS experiment at the LHC, where a boosted decision tree was deployed for the identification of prompt photons [1]. With the advent of new image recognition networks, it should now be possible to benefit from these modern techniques, and significant efforts in this direction have been made only recently. There are a number of recent instances in High Energy Physics where classification problems have been recast as computer vision problems, such as neutrino classification [2, 3] and jet image classification [4, 5, 6, 7].
The analysis presented in this paper is motivated by such recent uses of emerging Deep Learning (DL) techniques like the Convolutional Neural Network (CNN) [8]. In this approach the machine learns to construct many high-level feature variables starting from the raw detector image; every filter in the CNN projects out one such high-level variable. A CNN is thereby employed to extract the maximum amount of information from the raw output of the electromagnetic calorimeter, which can lead to better performance in discriminating between the two classes. In this study, data taken directly from the detector are used as images, and a CNN is run on them without any high-level physics information being provided. The performance of the CNN is compared with that of an ANN fed with specialized physics variables, so as to evaluate the efficacy of these deep learning algorithms in identifying features from the data without much human input.
The rest of the paper is planned as follows: in Section 2, an outline of photon identification in high energy physics and the various classes of photons considered in this paper are given. In Section 3, the details of the detector simulation are described, and in Section 4, the full analysis and results are presented, followed by a conclusion of this study in Section 5.

2 Photon Identification

2.1 Prompt photons

Prompt photons are photons produced in quark-gluon Compton scattering, quark-antiquark annihilation or gluon-gluon fusion. Photons from the decay of hadrons such as π⁰ and from the fragmentation process, as well as photons from bremsstrahlung of charged particles faking a prompt photon, form a large background over the signal. The tracker material in front of the electromagnetic calorimeter (ECAL) causes some photons to convert to e⁺e⁻ pairs in the tracker before reaching the ECAL. Unlike the photons that pass through the tracker unconverted, these converted photons have hits in the tracker and an ECAL energy deposition that is more spread out in φ, due to the bending of the e⁺e⁻ trajectories in the strong magnetic field. Typical energy deposit maps of different classes of photons are shown in Figure 1.

Figure 1: Energy deposit maps in the ECAL, normalized to the seed crystal energy, of (a) a prompt photon, (b) a converted photon, (c) a π⁰ and (d) a beam halo photon.

2.2 Photons from π⁰ decay and fragmentation

The diphoton decay of a neutral meson (π⁰) inside a hadronic jet is the single largest background to prompt photons. In this study, the two photons from a π⁰ above 10 GeV hit the same crystal in the ECAL and appear like one photon. The hadronic multijet production cross section at the LHC is orders of magnitude higher than that of a typical new physics process. These jets copiously produce neutral mesons, a small fraction of which can fake a prompt photon. For the H → γγ search this background is almost two orders of magnitude larger than the signal, assuming a jet faking photon rate of ~ [9].

The energy deposits of prompt photons and of photons from π⁰ and beam halo in the ECAL, as seen from the interaction point of our simulation, are shown in Figure 2.

Figure 2: Tracks and hits from a 10 GeV (a) prompt photon, (b) π⁰ and (c) beam halo photon.

2.3 Beam halo photons

One of the main backgrounds in high energy particle detectors, known as machine induced background (MIB), comes from particles entering the detector from the accelerator. These particles, which are produced in the hadronic and electromagnetic showers resulting from beam protons interacting with collimators or with residual gas molecules in the vacuum pipe, are called beam halo. Pions, being the lightest hadrons, are produced easily in these hadronic interactions and constitute the majority of the beam halo. Being short lived, the neutral pions decay into two photons, and the charged pions decay into muons. Some of these muons are very energetic, with energies of hundreds of GeV, i.e. greater than the critical energy of muons in the lead tungstate crystals of the calorimeter. These high energy muons undergo bremsstrahlung as they interact with the atoms of the crystals and emit photons. These halo photons result in final states with a single photon and no other object to balance its energy and momentum, thus mimicking final states with invisible particles recoiling against photons. These photons travel parallel to the beam pipe, and so have a shower elongated along η.

3 Simulation

A model detector comprising a calorimeter and a tracker has been constructed [10] using GEANT4 [11] to resemble CMS as described in its technical design report (TDR) [12]. The calorimeter construction includes the barrel region between pseudorapidity . It is made of parametrized volumes, each a part of a sphere, arranged as an array of crystals placed in a cylindrical arrangement. Each parametrized volume has coverage, azimuthal angle (φ) coverage and radial length. The size of the crystals is in the front face and varies from to in the back face. These values are within of the ECAL crystal sizes in the TDR. A greatly simplified version of a tracker has been implemented as concentric cylinders of silicon of varying thickness. The first three layers have a thickness of followed by four layers of and six more layers of thickness. Additional material has been added to obtain a material budget similar to that of the CMS tracker in the pseudorapidity region . A uniform magnetic field of 4 T has been applied along the positive z-axis. The cross-sectional view of the geometry is shown in Figure 3.

Figure 3: The cylindrical detector simulated in GEANT4.

The simulation has a flag for filtering out photons which convert anywhere inside the tracking volume. The standard physics list FTFP_BERT has been used for simulating physics processes, to keep the simulation as close as possible to that of the general purpose detectors CMS and ATLAS [13]. These include bremsstrahlung, pair production and photo-electric effect for photons, as well as Compton scattering for . The particle is assumed to deposit its entire energy at a point if it cannot travel a distance of mm further from that point.

4 Analysis and Results

This section describes the details of the analysis procedure and the results. The information from the cylindrical ECAL has been represented as a 2D image in η-φ space, as an n×n matrix of cells (n = 11 or 25, depending on the problem, as described in the sections below) around a local maximum in energy deposit (the seed crystal). The cells represent calorimeter crystals, and the value of each cell is the energy contained in that crystal. The values have been further normalized to the seed crystal energy before the matrix of cells is fed as an input to the networks. Three different network analyses have been performed for each classification problem:

- A shallow ANN with the traditional shape variables constructed out of cell values, as input.

- The normalized cell values fed to an ANN or DNN.

- An n×n matrix of normalized cell energies fed to a CNN.

In the first analysis the shape variables are constructed from intuition, utilizing the knowledge of the narrow lateral profile of an electromagnetic shower. Some of the variables are modelled on the standard shower shape variables used in the CMS ECAL [14, 15] and in other prior studies of homogeneous, granular calorimeter hodoscopes. The main study, with the cylindrical geometry described in section 3, has been cross checked with a planar geometry with a magnetic field parallel to the face of the crystals.
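The image construction described at the start of this section can be sketched as follows. This is a minimal illustration, not the code used in the study: the function name `seed_image`, the toy grid size, the use of the global (rather than a local) energy maximum as seed, and the treatment of the detector edges (wrap-around in φ, zero padding in η) are all assumptions made here.

```python
import numpy as np

def seed_image(energies, n):
    """Return an n x n window centred on the highest-energy crystal,
    normalised to the seed energy.

    `energies` is a 2D array indexed as [i_eta, i_phi]. phi is periodic in a
    cylindrical barrel, so that axis wraps around; eta is not, so crystals
    beyond the eta edge are filled with zeros (an assumption of this sketch).
    """
    if n % 2 == 0:
        raise ValueError("n must be odd so that the seed sits at the centre")
    half = n // 2
    i_eta, i_phi = np.unravel_index(np.argmax(energies), energies.shape)
    seed_energy = energies[i_eta, i_phi]

    # Zero-pad in eta so windows near the barrel edge keep their n x n shape.
    padded = np.pad(energies, ((half, half), (0, 0)), mode="constant")
    eta_rows = np.arange(i_eta, i_eta + n)  # seed row is i_eta + half in `padded`
    phi_cols = np.arange(i_phi - half, i_phi + half + 1) % energies.shape[1]

    window = padded[np.ix_(eta_rows, phi_cols)]
    return window / seed_energy

# Toy calorimeter (hypothetical 40 x 60 grid) with a seed close to both edges.
rng = np.random.default_rng(0)
ecal = rng.uniform(0.0, 0.05, size=(40, 60))
ecal[3, 59] = 10.0
image = seed_image(ecal, 11)
```

After normalisation the seed cell is exactly 1 and every other cell is the fraction of the seed energy deposited in that crystal, which is the input format used for all the image-based networks.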

4.1 Network Architecture and Ranking

The Keras package [16], with TensorFlow [17] as backend, has been used to implement the networks used in the analysis. The various networks used are listed below:

Convolutional Neural Network (CNN):

A CNN has been constructed with two convolutional layers with filters of size 3×3 and stride 1×1, with the rectified linear unit (RELU) activation function [18] applied to the outputs along with L2 regularisation, followed by max pooling with a pool window of size 2×2. This is followed by a fully connected layer of 64 nodes with dropout regularization [19] of 30%. Finally, there is a fully connected layer with the softmax activation function giving a binary output. The structure of the CNN used is shown in Figure 4.

Artificial Neural Network (ANN):
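To see what the convolution and pooling steps do to the two image sizes, the side length of the feature map can be tracked by hand. The sketch below assumes unpadded ("valid") convolutions, a single pooling step applied after the second convolution, and a pooling stride equal to the pool window; none of these choices is stated in the text, so the numbers are illustrative only.

```python
def conv_out(size, kernel=3, stride=1):
    """Output side length of an unpadded convolution or pooling step."""
    return (size - kernel) // stride + 1

def cnn_feature_map_side(n):
    side = conv_out(n, kernel=3, stride=1)     # first 3x3 convolution
    side = conv_out(side, kernel=3, stride=1)  # second 3x3 convolution
    side = conv_out(side, kernel=2, stride=2)  # 2x2 max pooling
    return side

for n in (11, 25):
    print(n, "->", cnn_feature_map_side(n))
# prints "11 -> 3" and "25 -> 10"
```

Under these assumptions the 64-node fully connected layer sees a 3×3 map per filter for the beam halo problem and a 10×10 map per filter for the π⁰ problem.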

For the photon - beam halo separation, an ANN with one hidden layer of 32 nodes with RELU activation and an output layer of 2 nodes with softmax activation is used. Optimal classification is obtained with this simple network with a single hidden layer. The network architecture is shown in Figure 5, where only one hidden layer is considered.

Deep Neural Network (DNN):

For photon - π⁰ separation, an ANN with two hidden layers is used, as the problem is more difficult and a deeper network is found to perform better. The first layer has 64 nodes and the second layer has 32 nodes, both with RELU activation, followed by a dropout of 30%, and the output layer has 2 nodes with softmax activation. The structure of the ANN used is shown in Figure 5.
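The forward pass of the two-hidden-layer network described above can be written out in a few lines of NumPy. This is only a sketch of the layer arithmetic: the weights are random rather than trained, dropout is omitted because it is inactive at inference time, and the helper names (`init_layer`, `dnn_forward`) are hypothetical.

```python
import numpy as np

def relu(x):
    return np.maximum(x, 0.0)

def softmax(z):
    z = z - z.max(axis=-1, keepdims=True)   # subtract max for numerical stability
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def init_layer(rng, n_in, n_out):
    # Small random weights and zero biases (illustrative initialisation only).
    return rng.normal(0.0, 0.05, size=(n_in, n_out)), np.zeros(n_out)

def dnn_forward(images, params):
    x = images.reshape(len(images), -1)     # flatten 25x25 -> 625 inputs
    (w1, b1), (w2, b2), (w3, b3) = params
    h1 = relu(x @ w1 + b1)                  # hidden layer 1: 64 nodes
    h2 = relu(h1 @ w2 + b2)                 # hidden layer 2: 32 nodes
    return softmax(h2 @ w3 + b3)            # output: 2 class probabilities

rng = np.random.default_rng(1)
params = [init_layer(rng, 625, 64), init_layer(rng, 64, 32), init_layer(rng, 32, 2)]
batch = rng.uniform(0.0, 1.0, size=(4, 25, 25))  # four toy 25x25 images
probs = dnn_forward(batch, params)
```

Each row of `probs` holds the two softmax outputs for one image and sums to one; either column can serve as the classifier score when scanning a cut.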

All the networks use the cross-entropy loss function with the ADADELTA optimizer [20].
Receiver Operating Characteristic (ROC) curves of signal efficiency vs background rejection have been plotted to evaluate the performance of each classifier. The area under the ROC curve (AUC) has also been used as a quality criterion for the classification.
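The two figures of merit quoted in the tables, background rejection at a fixed signal efficiency and the AUC, can be computed from classifier scores as sketched below. The function names are hypothetical, and the convention that a higher score means "more signal-like" is an assumption of this sketch.

```python
import numpy as np

def rejection_at_efficiency(sig_scores, bkg_scores, efficiency):
    """Fraction of background rejected by the cut that keeps `efficiency` of the signal."""
    sig = np.asarray(sig_scores, dtype=float)
    bkg = np.asarray(bkg_scores, dtype=float)
    # Events with score >= cut are kept, so the cut is the (1 - efficiency)
    # quantile of the signal score distribution.
    cut = np.quantile(sig, 1.0 - efficiency)
    return np.mean(bkg < cut)

def roc_auc(sig_scores, bkg_scores):
    """AUC as the probability that a random signal event outscores a random
    background event (ties count one half)."""
    sig = np.asarray(sig_scores, dtype=float)[:, None]
    bkg = np.asarray(bkg_scores, dtype=float)[None, :]
    return np.mean(sig > bkg) + 0.5 * np.mean(sig == bkg)

# Toy scores: two overlapping Gaussians standing in for network outputs.
rng = np.random.default_rng(2)
toy_sig = rng.normal(2.0, 1.0, size=2000)
toy_bkg = rng.normal(0.0, 1.0, size=2000)
toy_rejection = rejection_at_efficiency(toy_sig, toy_bkg, 0.90)
toy_auc = roc_auc(toy_sig, toy_bkg)
```

The pairwise form of the AUC is quadratic in the sample size; for samples much larger than this toy one, a sort-based computation would be preferable.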

Figure 4: The CNN architecture. The input image is of size 11×11 for beam halo separation and 25×25 for π⁰ separation.
Figure 5: Architecture of the neural network. For π⁰ separation it has two hidden layers with 64 and 32 nodes respectively, with RELU activation. For the beam halo separation, the network has one hidden layer of 64 nodes with RELU activation. The output layer is acted on by softmax activation.

4.2 Datasets

51,000 images each of the signal (prompt photons) and background (non-prompt photons) classes have been generated. For prompt photon - beam halo separation, images of dimensions 11×11 have been used, while for prompt photon - π⁰ separation, images of dimensions 25×25 have been used to efficiently capture the conversion of photons.
Out of the total sample set, 80% has been used for training and 20% for testing. Out of the training set, 30% has been used for cross validation.
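The sample sizes implied by these fractions can be worked out directly. The sketch assumes that the 51,000 images per class apply to each classification problem separately and that the validation images are taken out of the training share, as the wording above suggests.

```python
images_per_class = 51_000
total = 2 * images_per_class          # signal + background

n_test = total * 20 // 100            # 20% held out for testing
n_train = total - n_test              # 80% used for training
n_val = n_train * 30 // 100           # 30% of the training share for validation
n_fit = n_train - n_val               # images actually used to update the weights

print(total, n_train, n_test, n_val, n_fit)
# prints: 102000 81600 20400 24480 57120
```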

4.3 Beam halo - prompt photon separation

10 GeV samples of prompt photons and beam halo photons have been generated, and the CNN and the ANN have been run on the 11×11 images. Both networks perform remarkably well and comparably to one another, with high background rejection at high signal efficiency (see Figure 7 and Table 1).
Another approach to distinguishing prompt photons from fake photons is to use shower shape variables that capture the differences in the energy spread in η-φ space, as shown in Figure 1. The following set of shower shape variables, based on the energy in a 5×5 cluster centered on the maximum energy (seed) crystal, has been fed to the ANN as input:

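  • s9/s25, σ_iηiη, σ_iφiφ

s9/s25 is the ratio of the energy contained within the 3×3 matrix of crystals centered on the seed crystal to the total energy contained in the 5×5 matrix around the seed crystal, and

$$\sigma_{i\eta i\eta}^{2} = \frac{\sum_{i}^{5\times5} w_i \,(i\eta_i - i\eta_{\mathrm{seed}})^{2}}{\sum_{i}^{5\times5} w_i}, \qquad w_i = \max\left(0,\; 4.7 + \ln\frac{E_i}{E_{5\times5}}\right),$$

where E_i and iη_i are the energy and η index of the i-th crystal within the 5×5 cluster, and iη_seed is the η index of the seed crystal [14]. Similarly, for σ_iφiφ we have

$$\sigma_{i\phi i\phi}^{2} = \frac{\sum_{i}^{5\times5} w_i \,(i\phi_i - i\phi_{\mathrm{seed}})^{2}}{\sum_{i}^{5\times5} w_i}.$$

With the beam halo photons having an elongated spread in η and the prompt photons a more circular spread in η-φ space, these variables have been chosen to distinguish between the two classes by exploiting these differences.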
The distributions of these variables are shown in Figure 6.
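For concreteness, the three shower shape variables can be computed from a seed-centred image as sketched below. The sketch assumes the CMS-style definition of [14], in which each crystal in the 5×5 cluster enters with a logarithmic weight w_i = max(0, 4.7 + ln(E_i/E_5×5)) and the spread is measured in crystal-index units relative to the seed; the function name and the toy shower are hypothetical.

```python
import numpy as np

def shower_shapes(image, w0=4.7):
    """s9/s25, sigma_ietaieta and sigma_iphiiphi from an odd-sized image
    indexed as [i_eta, i_phi] with the seed crystal at the centre."""
    c = image.shape[0] // 2
    e5x5 = image[c - 2:c + 3, c - 2:c + 3]
    e3x3 = image[c - 1:c + 2, c - 1:c + 2]
    s25 = e5x5.sum()
    s9_over_s25 = e3x3.sum() / s25

    # Logarithmic weights w_i = max(0, w0 + ln(E_i / E_5x5)); empty crystals get 0.
    with np.errstate(divide="ignore"):
        w = np.maximum(0.0, w0 + np.log(e5x5 / s25))

    d = np.arange(-2, 3)                        # crystal index relative to the seed
    d_eta, d_phi = np.meshgrid(d, d, indexing="ij")
    sigma_ieta = np.sqrt((w * d_eta**2).sum() / w.sum())
    sigma_iphi = np.sqrt((w * d_phi**2).sum() / w.sum())
    return s9_over_s25, sigma_ieta, sigma_iphi

# Toy "beam halo like" deposit: a single column of crystals elongated along eta.
toy = np.zeros((11, 11))
toy[3:8, 5] = [0.3, 0.6, 1.0, 0.6, 0.3]
s9_s25, s_eta, s_phi = shower_shapes(toy)
```

For this toy deposit the η width is non-zero while the φ width vanishes, which is the kind of asymmetry the two σ variables are meant to expose for beam halo photons.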

Method used          Background rejection (%)   Signal efficiency (%)   AUC
CNN on image         99.96                      99.00                   0.9997
ANN on image         99.89                      99.00                   0.9990
ANN on 3 variables   71.31                      99.00                   0.9748
Table 1: Results for beam halo - prompt photon separation
Figure 6: The distribution of the shower shape variables of 10 GeV prompt photons and beam halo photons: (a) s9/s25, (b) σ_iηiη, and (c) σ_iφiφ.
Figure 7: (a) ROCs for all methods used in the separation of prompt photons and beam halo photons, (b) zoomed-in view of the ROCs, which shows that the CNN performs better than the ANN.

The ANN fed with these variables does not perform as well as the CNN and the ANN on the images, as is evident from the ROCs in Figure 7. The results are listed in Table 1.

4.4 π⁰ - prompt photon separation

In the prompt photon sample, each photon has about a 38% chance of converting into an e⁺e⁻ pair, leading to a different signature (see Figure 1). Each π⁰ in the π⁰ sample yields two photons, and there is about a 61% chance that at least one of them converts.
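The second number follows from the first if the two decay photons are assumed to convert independently and with the same probability as a prompt photon (an assumption of this check, not a statement from the simulation):

```python
p_convert = 0.38                        # per-photon conversion probability quoted above
p_none = (1.0 - p_convert) ** 2         # neither of the two photons converts
p_at_least_one = 1.0 - p_none           # at least one of them converts

print(f"{p_at_least_one:.4f}")          # prints 0.6156
```

The result is consistent with the quoted value of about 61%, given that the 38% input is itself rounded.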
The samples have been grouped into different sets as follows:

  • Set A - converted and unconverted photons vs converted and unconverted π⁰'s

  • Set B - unconverted photons vs converted and unconverted π⁰'s

  • Set C - unconverted photons vs unconverted π⁰'s

The CNN and the ANN have been run on these sets of 25×25 images, and from the ROCs in Figure 9, it is evident that the CNN outperforms the ANN.

The ANN is fed with the following set of shower shape variables, computed in a 5×5 cluster centered on the maximum energy (seed) crystal, as the input:

Method used                    Background rejection (%)   Signal efficiency (%)   AUC
CNN on image set A             76.4                       90.0                    0.9030
ANN on image set A             73.4                       90.0                    0.8825
ANN on 9 variables for set A   46.8                       90.0                    0.8196
CNN on image set B             92.5                       90.0                    0.9567
CNN on image set C             97.7                       90.0                    0.9848
Table 2: Results for π⁰ - prompt photon separation

Along with the variables defined in section 4.3, is the energy of the seed crystal, is the maximum energy contained within a 2×2 matrix centered on the seed crystal, is the maximum energy contained within a 4×4 matrix centered on the seed crystal, is the energy of the crystal with the next highest energy adjacent to the seed crystal, is the ratio of s with the energy in the 25×25 supercluster, and

Figure 8: The distributions of the nine shower shape variables, panels (a) to (i), for set A of prompt photons and π⁰'s.
Figure 9: ROCs for all the methods used for the separation of π⁰'s and prompt photons.

The distributions of each of these variables for set A of photons and π⁰'s are shown in Figure 8. They do not have much discriminating power, and the ROCs in Figure 9 indicate how poorly they perform compared to the CNN on the images. The background rejection, signal efficiency and the area under the curve of the ROCs are listed in Table 2.

5 Conclusion

A study has been presented using deep learning techniques for separating prompt photons from neutral pions and beam halo. Different network types have been compared on simulated data from an approximate CMS detector geometry consisting of a tracker and a calorimeter.
For separating photons from beam halo, the CNN based on images gives the maximum background rejection of 99.96% at 99.00% signal efficiency. For separating neutral pions from prompt photons, the CNN based on images gives the maximum background rejection of 97.7% at 90.0% signal efficiency in the best case scenario (set C, unconverted photons vs unconverted π⁰'s). In both cases, it is evident from the ROC AUCs that the CNN outperforms the ANN with the same input image as well as the ANN using topological variables.
For the beam halo separation, the spatial patterns of the prompt photon and the beam halo photon are so distinct that a simple ANN with one hidden layer suffices for a high accuracy classification. The π⁰ classification proved to be a more difficult problem owing to the very similar energy deposition patterns of the π⁰ decay photons and a prompt photon. The CNN still performs well on unconverted photons vs unconverted π⁰'s. These image-based neural network techniques are generic and can be applied to any calorimeter.


The authors would like to thank Dr. Ananda Dasgupta for the useful discussions through the course of this project.


  • [1] CMS collaboration, S. Chatrchyan et al., Observation of a new boson at a mass of 125 GeV with the CMS experiment at the LHC, Phys. Lett. B 716 (2012) 30.
  • [2] A. Aurisano, A. Radovic, D. Rocco, A. Himmel, M. D. Messier, E. Niner et al., A Convolutional Neural Network Neutrino Event Classifier, JINST 11 (2016) P09001, [1604.01444].
  • [3] MicroBooNE collaboration, R. Acciarri et al., Convolutional Neural Networks Applied to Neutrino Events in a Liquid Argon Time Projection Chamber, JINST 12 (2017) P03011, [1611.05531].
  • [4] L. de Oliveira, M. Kagan, L. Mackey, B. Nachman and A. Schwartzman, Jet-images — deep learning edition, JHEP 07 (2016) 069, [1511.05190].
  • [5] P. Baldi, K. Bauer, C. Eng, P. Sadowski and D. Whiteson, Jet Substructure Classification in High-Energy Physics with Deep Neural Networks, Phys. Rev. D93 (2016) 094034, [1603.09349].
  • [6] J. Barnard, E. N. Dawe, M. J. Dolan and N. Rajcic, Parton Shower Uncertainties in Jet Substructure Analyses with Deep Neural Networks, Phys. Rev. D95 (2017) 014018, [1609.00607].
  • [7] P. T. Komiske, E. M. Metodiev and M. D. Schwartz, Deep learning in color: towards automated quark/gluon jet discrimination, JHEP 01 (2017) 110, [1612.01551].
  • [8] Y. LeCun, L. Bottou, Y. Bengio and P. Haffner, Gradient-based learning applied to document recognition, Proceedings of the IEEE 86 (November, 1998) 2278–2324.
  • [9] M. Pieri, S. Bhattacharya, I. Fisk, J. Letts, V. A. Litvine and J. G. Branson, Inclusive search for the Higgs boson in the H → γγ channel.
  • [10] A. R. Sahasransu and R. Singh, “Opencmsg4v0.1.” https://github.com/ars12ms062/OpenCMSG4, 2018.
  • [11] GEANT4 collaboration, S. Agostinelli et al., GEANT4: A Simulation toolkit, Nucl. Instrum. Meth. A506 (2003) 250–303.
  • [12] CMS collaboration, The CMS tracker: addendum to the Technical Design Report. Technical Design Report CMS. CERN, Geneva, 2000.
  • [13] S. Banerjee (for the CMS experiment), Validation of physics models of Geant4 using data from CMS experiment, Journal of Physics: Conference Series 898 (2017) 042005.
  • [14] CMS collaboration, Photon reconstruction and identification at sqrt(s) = 7 TeV, Tech. Rep. CMS-PAS-EGM-10-005, CERN, Geneva, 2010.
  • [15] CMS collaboration, Isolated Photon Reconstruction and Identification at sqrt(s) = 7 TeV, Tech. Rep. CMS-PAS-EGM-10-006, CERN, Geneva, 2011.
  • [16] “Keras.” https://keras.io/, 2018.
  • [17] “Tensorflow.” https://www.tensorflow.org/, 2018.
  • [18] V. Nair and G. E. Hinton, Rectified linear units improve restricted Boltzmann machines, in Proceedings of the 27th International Conference on Machine Learning, ICML’10, (USA), pp. 807–814, Omnipress, 2010.
  • [19] N. Srivastava, G. Hinton, A. Krizhevsky, I. Sutskever and R. Salakhutdinov, Dropout: A simple way to prevent neural networks from overfitting, Journal of Machine Learning Research 15 (2014) 1929–1958.
  • [20] M. D. Zeiler, ADADELTA: an adaptive learning rate method, CoRR abs/1212.5701 (2012) , [1212.5701].