CaloGAN: Simulating 3D High Energy Particle Showers in Multi-Layer Electromagnetic Calorimeters with Generative Adversarial Networks

Michela Paganini Department of Physics, Yale University, New Haven, CT 06520, USA Lawrence Berkeley National Laboratory, Berkeley, CA, 94720, USA    Luke de Oliveira Lawrence Berkeley National Laboratory, Berkeley, CA, 94720, USA    Benjamin Nachman Lawrence Berkeley National Laboratory, Berkeley, CA, 94720, USA
August 22, 2019

The precise modeling of subatomic particle interactions and propagation through matter is paramount for the advancement of nuclear and particle physics searches and precision measurements. The most computationally expensive step in the simulation pipeline of a typical experiment at the Large Hadron Collider (LHC) is the detailed modeling of the full complexity of physics processes that govern the motion and evolution of particle showers inside calorimeters. We introduce CaloGAN, a new fast simulation technique based on generative adversarial networks (GANs). We apply these neural networks to the modeling of electromagnetic showers in a longitudinally segmented calorimeter, and achieve speedup factors comparable to or better than existing full simulation techniques on CPU (roughly 100-1000 times) and even faster on GPU (up to about 100,000 times). There are still challenges for achieving precision across the entire phase space, but our solution can reproduce a variety of geometric shower shape properties of photons, positrons, and charged pions. This represents a significant stepping stone toward a full neural network-based detector simulation that could save significant computing time and enable many analyses now and in the future.

I Introduction

The physics programs of all experiments based at the LHC rely heavily on detailed simulation for all aspects of event reconstruction and data analysis. Simulated particle collisions, decays, and material interactions are used to interpret the results of ongoing experiments and estimate the performance of new ones, including detector upgrades.

State-of-the-art simulations are able to precisely model detector geometries and physical processes spanning distance scales from the subnuclear regime of the initial parton-parton scattering all the way to material interactions at meter length scales. These processes, which include nuclear and atomic interactions, such as ionization, as well as strong, weak, and electromagnetic processes, alter the state of incoming particles as they propagate through and interact with layers of material in the various detector components. Detection techniques such as calorimetry exploit these physical interactions to detect the presence and measure the energy of particles such as photons, electrons, and hadrons via their interactions with hundreds of thousands of detector components. Upon interaction with a calorimeter, a cascade (shower) of secondary particles is produced, and their energy is collected and transformed into electric signals.

Physics-based (full simulation) modeling of particle showers in calorimeters (with Geant4 GEANT4 Collaboration (2003) as the state of the art) is the most computationally demanding part of the whole simulation process, and can take minutes per event on modern, distributed high performance platforms Aad et al. (2010); Rahmat et al. (2012). The production of physics results is often limited by the absence of adequate Monte Carlo (MC) simulation, and the increase in luminosity at the LHC will only exacerbate the problem. For example, the ATLAS and CMS experiments at the high-luminosity phase of the LHC (HL-LHC) will each see about 3 billion top quark pair events Czakon and Mitov (2014); Botje et al. (2011); Martin et al. (2009); Gao et al. (2014); Ball et al. (2013); Khachatryan et al. (2017); Aaboud et al. (2016a); for a MC statistical uncertainty that is significantly below the data uncertainty, hundreds of billions of simulated events would be required. This is not possible using full detector simulation techniques with existing computing resources. Currently, full MC simulation occupies 50-70% of the experiments' worldwide computing resources, equivalent to billions of CPU hours per year Flynn (2015); Karavakis et al. (2014); Bozzi (2015).

The relevance of the calorimeter simulation step has sparked the development of approximate, fast simulation solutions to mitigate its computational complexity. Fast simulation techniques rely on parametrized showers Grindhammer and Peters (1993); The ATLAS Collaboration et al. (2010); Grindhammer et al. (1990) for fluctuations, and look-up tables for low energy interactions Barberio et al. (2009). For many applications, these techniques are sufficient. However, analyses that utilize the detailed structure of showers for particle identification as well as energy and direction calibration may not be able to rely on these simplified approaches ATL (2014).

We introduce a Deep Learning model to enable high-fidelity fast simulation of particle showers in electromagnetic calorimeters. Previous work de Oliveira et al. (2017) assessed the viability of GAN-based simulation of jet-images Cogan et al. (2015) – sparse, structured, 2D representations of jet fragmentation analogous to a single-layer, idealized calorimeter – and focused on providing architectural guidelines for this regime. Neural network-based generation, including GANs, Variational Auto-Encoders Kingma and Welling (2013), and Adversarial Auto-Encoders Makhzani et al. (2015), have also been tested in other areas of science, such as Cosmology Ravanbakhsh et al. (2016); Schawinski et al. (2017), Condensed Matter Physics Mosser et al. (2017), and Oncology Kadurin et al. (2016). The longitudinally segmented calorimeter simulation addressed in this work offers unique challenges due to the sparsity of hit cells, the non-uniform granularity among the detector layers, and their sequential structure. In addition to enabling physics analyses at the LHC, the CaloGAN may form a base for solving similar computationally intensive modeling problems in other domains of science, medicine, and technology.

The paper is organized as follows. Section II introduces the dataset of calorimeter showers and Sec. III briefly reviews the generic GAN setup. The CaloGAN is described in Sec. IV and first results of its performance are documented in Sec. V. The paper ends with conclusions and future outlook in Sec. VI.

II Dataset

A detector simulation begins with a list of quasi-stable particles, i.e. those whose lifetimes are long enough for them to reach the detector material. For each particle, we are given its type (e.g. electron, pion, etc.), its energy, and its direction. The particle type determines when and how the particle interacts with the material along its trajectory. Material interactions with the detector factorize: the energy deposited in a calorimeter by various particles is the sum of the energy from each shower treated independently. (Energy losses factorize, but detector readout does not. Due to threshold and digitization effects, the energy readout from two energy deposits in different detector elements need not be the same as the recorded energy from the two deposits in the same element. In detector simulations, these non-linear effects are treated after accounting for the material interactions and are therefore beyond the scope of the CaloGAN. It may be interesting in future work to consider an end-to-end generator that includes these effects, but it may not save much time, since simulation is far more costly than reconstruction.)

There are two flavors of calorimeters: electromagnetic and hadronic. Electromagnetic calorimeters are designed to stop electrons and photons, which have shallower and narrower showers compared with protons, neutrons, and charged pions. Hadronic calorimeters are thicker and deeper in order to capture penetrating radiation that forms irregular showers from nuclear interactions. In this first application of GANs to a longitudinally segmented calorimeter, we choose to focus only on electromagnetic showers. In addition to already providing the capability to simulate electrons and photons, the electromagnetic shower contains all of the new challenges described in Sec. I.

Transverse segmentation is critical for particle identification and energy calibration in an electromagnetic calorimeter. For example, the radiation pattern can be used to distinguish prompt photons from π⁰ → γγ decays, where the separation between the two photons is roughly 2 m(π⁰)/E(π⁰) ≈ 3 cm for a 10 GeV π⁰ at one meter from the interaction point. Pion rejection and an excellent resolution for photons in the Higgs boson discovery channel H → γγ were driving factors for the design of the ATLAS Liquid Argon (LAr) electromagnetic calorimeter lar (1996), which serves as an inspiration for the calorimeter used in this study. In particular, the calorimeter used in this study is a cube 480 mm on a side, with no material in front of it. There are three instrumented layers in the radial (depth) direction, i.e. the direction along which prompt neutral particles would enter the calorimeter without any prior material interactions, with thicknesses of 90 mm, 347 mm, and 43 mm. The active material is LAr and the absorber material is lead. Only the total energy per layer, which includes both the active and inactive contributions, is used in what follows.

In contrast to the complex accordion geometry of the actual ATLAS calorimeter, our simplified setup (built on the Geant4 B4 example) uses flat alternating layers of lead and LAr that are 2 mm and 4 mm thick, respectively. Each of the three layers has a different segmentation, which is also not square in the first and third layers. In particular, the cells in the first layer are 160 mm × 5 mm, the cells in the second layer are 40 mm × 40 mm, and the cells in the third layer are 40 mm × 80 mm. The short direction in the first layer corresponds to what would be the beam direction in a full experiment, while the short direction in the third layer is perpendicular to it. Table LABEL:table:dimensions summarizes the calorimeter geometry.

The training data set Nachman et al. (2017) is prepared as follows. Geant4 10.2.0 GEANT4 Collaboration (2003) is used to generate particles and simulate their interaction with our calorimeter using the Ftfp_Bert physics list, based on the Fritiof Andersson et al. (1987, 1996); Nilsson-Almqvist and Stenlund (1987); Ganhuyag and Uzhinsky (1997) and Bertini intra-nuclear cascade Guthrie et al. (1968); Bertini and Guthrie (1971); Karmanov (1980) models with the standard electromagnetic physics package Burkhardt et al. (2004). Positrons, photons, and charged pions with various energies are incident perpendicularly on the center of the calorimeter front face. Energies in the training set are uniform in the range between 1 GeV and 100 GeV. Fig. LABEL:fig:calo_image shows an example 10 GeV electron event with the exact energy deposits from Geant4 (Fig. LABEL:fig:calo_image2) and after discretizing them according to our calorimeter geometry (Fig. LABEL:fig:calo_image1). For visualization purposes, a 3-dimensional particle energy signature (Fig. LABEL:fig:3d) will be displayed in the rest of this paper as a series of three 2D images (Fig. LABEL:fig:2d), where the pixel intensity represents the sum of the energies of all particles incident on that cell. (For the purposes of this study, the cell energy is the sum of the energy deposited in the lead and the argon; in practice, only the LAr energy deposits are measured. Dividing out these two components is left for future work; see Sec. VI.) The first layer can be represented as a 3 × 96 image, the middle layer as a 12 × 12 image, and the last layer as a 12 × 6 image.
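For concreteness, the per-shower data layout described above can be sketched as three per-layer energy grids. The shapes follow the segmentation given in this section; the container code itself is illustrative, not the dataset's actual on-disk format:

```python
import numpy as np

# One shower = three 2D "calorimeter images", one per longitudinal layer,
# with the granularities described in the text.
LAYER_SHAPES = [(3, 96), (12, 12), (12, 6)]

def empty_shower():
    """Zero-initialized shower: a list of per-layer energy grids (GeV)."""
    return [np.zeros(shape, dtype=np.float32) for shape in LAYER_SHAPES]

def total_energy(shower):
    """Total energy deposited across all three layers (GeV)."""
    return float(sum(layer.sum() for layer in shower))

shower = empty_shower()
shower[1][6, 6] = 5.0  # deposit 5 GeV in a central cell of the middle layer
```

Because the three grids have different shapes, they cannot be stacked into a single rectangular tensor, which is one reason the model below processes each layer in its own stream.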

III Generative Adversarial Networks

Since their first formulation Goodfellow et al. (2014), Generative Adversarial Networks have attracted rapidly growing attention in the Machine Learning literature, with many applications in natural image processing. However, there are far fewer applications in the basic sciences and, prior to this work, none in high energy or nuclear physics.

Generative Adversarial Networks (GANs) cast the task of training a deep generative model as a two-player non-cooperative minimax game, in which a generator network G is trained concomitantly with an adversary, the discriminator network D, in order to learn a target distribution p_data. The generator learns a map from a latent space (usually a simple prior such as a uniform or Gaussian distribution) to the space of generated samples, while D learns a map from the sample space to [0, 1], the probability that a shown sample is real. Note that the map that the generator learns implicitly defines a density p_G. The game-theoretical basis for this framework Goodfellow et al. (2014); Goodfellow (2014) ensures that if we extend the space of allowed functions that G and D can draw from to the space of all continuous functions, then there exists some G (and, by construction, an implicit p_G) that exactly recovers the target distribution, i.e., p_G = p_data, while for every sample produced by the generator, the discriminator is maximally confused and admits a posterior probability of being real of 1/2. In order to train both G and D, the traditional formulation of GANs Goodfellow et al. (2014) utilizes the loss function shown in Eqn. 1.

\min_G \max_D V(D, G) = \mathbb{E}_{x \sim p_\text{data}}[\log D(x)] + \mathbb{E}_{z \sim p_z}[\log(1 - D(G(z)))]   (1)

Though the GAN framework has shown promise, stability remains a major roadblock, and various ad hoc and theoretical improvements have been suggested, from architectural guidelines Radford et al. (2015); Odena et al. (2016); Mirza and Osindero (2014); Chen et al. (2016); Zhang et al. (2016) to reformulations of the loss in Eqn. 1 that move away from the Jensen-Shannon divergence Arjovsky et al. (2017); Salimans et al. (2016); Nowozin et al. (2016); Qi (2017); Mao et al. (2016); Lin (2017). As suggested in Theis et al. (2015), we impose task-specific metrics, which allow us to move away from loss-level notions of quality and focus on task-level fidelity measures. We make the conscious decision to use the vanilla loss formulation, as we find adequate performance with this version.
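For intuition about the value function of Eqn. 1, the following toy helper (a pedagogical stand-in, not the training code of this work) estimates the empirical GAN objective from discriminator outputs; a maximally confused discriminator, outputting 1/2 everywhere, yields V ≈ -2 log 2:

```python
import numpy as np

def gan_value(d_real, d_fake, eps=1e-8):
    """Empirical estimate of the minimax value V(D, G):
    mean of log D(x) over real samples plus mean of log(1 - D(G(z)))
    over generated samples. `eps` guards against log(0)."""
    d_real = np.asarray(d_real, dtype=float)
    d_fake = np.asarray(d_fake, dtype=float)
    return float(np.mean(np.log(d_real + eps))
                 + np.mean(np.log(1.0 - d_fake + eps)))

# Discriminator outputting 1/2 for every sample (maximal confusion):
v = gan_value([0.5, 0.5, 0.5], [0.5, 0.5, 0.5])
```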

IV The CaloGAN

Generative Adversarial Networks are explored as a tool to speed up full simulation of particle showers in an electromagnetic calorimeter. We dub this solution the CaloGAN.

For it to be useful in realistic physics applications, such a system needs to be able to accept requests for the generation of showers originating from an incoming particle of a given type at a given energy E. (This covers a significant portion of the challenge; in practice, the fast simulation must also take the position of incidence and the incidence angles as inputs. We leave this to future work; see Sec. VI.) We introduce an auxiliary task of energy reconstruction to condition on E, a real-valued variable. The Auxiliary Classifier GAN Odena et al. (2016) formalism is tested to also condition on the particle class, but is ultimately abandoned in favor of training a specific generative model for each particle type, as we expect that versioning and particle-specific improvements will be prioritized in any practical implementation.

In practice, the requested energy is rescaled to a numerically convenient range and multiplied element-wise into the 1024-dimensional latent vector z. (The dimensionality of the latent space is a hyper-parameter that need not be the same as the dimensionality of the target space.) The generator then maps this input to three gray-scale image outputs with different numbers of pixels, which represent the energy patterns collected by the three calorimeter layers as the requested particle propagates through them. The discriminator accepts the three images as inputs, along with E, the requested particle energy. The inputs are mapped to a binary output that classifies showers into real and fake, and a continuous output which calculates the total energy deposited in the three layers and compares it with the requested energy E.
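A minimal sketch of the latent-space conditioning just described; the rescaling constant here is an assumed placeholder, since the exact factor is a design choice not reproduced from the original implementation:

```python
import numpy as np

rng = np.random.default_rng(0)
LATENT_DIM = 1024       # dimensionality of the latent space (from the text)
ENERGY_SCALE = 100.0    # hypothetical rescaling constant (illustrative only)

def conditioned_latent(requested_energy_gev):
    """Draw a latent vector z and condition it on the requested energy
    by multiplying the rescaled energy into every component of z."""
    z = rng.normal(size=LATENT_DIM)
    return (requested_energy_gev / ENERGY_SCALE) * z

z50 = conditioned_latent(50.0)  # latent input for a 50 GeV shower request
```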

IV.1 Model Architecture

Given the sparsity levels and high dynamic range in the data described in Section II, we follow the LAGAN guidelines de Oliveira et al. (2017) to modify the DCGAN Radford et al. (2015) architecture for this specific regime.

In the generator (shown in Fig. 1), our design combines parallel LAGAN-like processing streams with a trainable attention mechanism that encodes the sequential connection among calorimeter layers. The LAGAN submodules are composed of a 2D convolutional unit followed by two locally-connected units with batch-normalization Ioffe and Szegedy (2015) layers in between. The dimensionality and granularity mismatch among the three longitudinal segmentations of the detector demand separate streams of operations with suitably sized kernels. To provide a readily adaptable tool, we construct the architecture simply as a function of the desired output image size, seeking a common denominator that can be readily applied to a variety of particles in order to obtain reasonable baselines in a quick R&D cycle.

Modeling the sequential nature of the relationship among the energy patterns collected by the three layers requires extra care. Drawing inspiration from Zhang et al. (2016), we choose an attention mechanism to allow dependence among layers, in which we define trainable transfer functions to optimally resize and apply knowledge of the energy pattern in previous layers to the generation of the subsequent layer's readout. More specifically, in-painting takes as input a resized image from a previous layer and the hypothesized image from the current layer, and learns a per-pixel attention weight via a trainable weighting function, such that the pre-ReLU version of the current layer is a per-pixel weighted combination of the two (using the Hadamard product). This end-to-end trainable unit can utilize information about the two layers to decide what information to propagate through from the previous particle deposition. An alternative architectural choice that includes a recurrent connection will be the subject of future studies.
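The in-painting step can be sketched as follows. This is a simplified numpy rendition under the assumption that the pre-ReLU output is a per-pixel convex combination of the two inputs; the exact combination and the form of the weighting function in the trained model are not reproduced here:

```python
import numpy as np

def inpaint(prev_resized, current_hypothesis, attention_logits):
    """Per-pixel attention between the (resized) previous-layer image I0
    and the current-layer hypothesis I1. A sigmoid maps logits to weights
    W in [0, 1]; the pre-ReLU output is the Hadamard combination
    W * I1 + (1 - W) * I0 (assumed form, for illustration)."""
    w = 1.0 / (1.0 + np.exp(-attention_logits))  # sigmoid -> [0, 1]
    return w * current_hypothesis + (1.0 - w) * prev_resized

prev = np.ones((12, 12))            # resized previous-layer energy pattern
curr = np.zeros((12, 12))           # hypothesized current-layer pattern
out = inpaint(prev, curr, np.zeros((12, 12)))  # zero logits -> W = 0.5
```

In the actual model the weights are produced by a trainable network, so the unit can learn, pixel by pixel, how much of the previous layer's deposition to propagate forward.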

Leaky Rectified Linear Units Maas et al. (2013) are chosen as activation functions throughout the system, with the exception of the output layers of , in which we prefer Rectified Linear Units Glorot et al. (2011) for the creation of sparse samples de Oliveira et al. (2017).

In the discriminator (shown in Fig. 2), the feature space produced by each LAGAN-style output stream is augmented with a sub-differentiable version of the sparsity percentage (the fraction of hit pixels, or occupancy; sub-derivatives generalize derivatives to functions with kinks), as well as minibatch discrimination Salimans et al. (2016) on both the standard locally-connected network-produced features and the output sparsity itself, to ensure a well-examined space of sparsities. These are represented in Fig. 2 by the 'features' vector.

The discriminator is further customized with domain-specific features to ensure fidelity of samples. Given the importance of matching the requested energy E, the discriminator directly calculates the empirical energy in each of the three layers, as well as the total energy. Minibatch discrimination is performed on this vector of per-layer energies to ensure a proper distributional understanding. We also add the absolute difference between the reconstructed and requested energies as a feature, as well as a binary, sub-differentiable indicator of whether this difference exceeds a tolerance (in GeV), which encodes how far GAN-produced showers are allowed to stray from their requested energy.

Further specifications of the exact hyper-parameter and architectural choices as well as software versioning constraints are available in the source code Paganini et al. (2017).

Figure 1: Composite Generator, illustrating three streams with attentional layer-to-layer dependence.
Figure 2: Composite Discriminator, depicting additional domain specific expressions included in the final feature space.

Two additional architectural modifications were tested in order to build a particle-type conditioning system directly into the learning process. Neither the AC-GAN Odena et al. (2016) nor the conditional GAN Mirza and Osindero (2014) frameworks were able to handle the substantial differences among the three particle types.

We suspect that both a significantly richer model and a larger latent space could alleviate some problems associated with conditioning using the investigated approaches. Although building a fully joint model is an interesting Machine Learning challenge, the practicality and flexibility of this application may suffer from having one single model for all particle showers.

IV.2 Loss Formulation

In this work, we augment the classical adversarial loss term (Eqn. 1) – which penalizes the system whenever D fails to classify samples originating from the generated or target distributions – with a mean absolute error term:

L_E = \mathbb{E}\left[\, |E - \hat{E}| \,\right]   (2)

where E is the requested energy and \hat{E} is the reconstructed energy. This allows us to penalize instances of too little or too much deposited energy. This solution not only helps confine the generated energy to a desirable range, but also encodes a 'soft' physical notion of conservation of energy, according to which no more energy than that of the incoming particle can be physically collected by the detector.
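A minimal sketch of the augmented per-sample objective, using the weight λ = 0.05 quoted in Sec. IV.3; the batching and exact reduction of the real loss are simplified here:

```python
def combined_loss(adv_loss, e_requested, e_reconstructed, lam=0.05):
    """Adversarial loss augmented with the mean absolute error on energy:
    L = L_adv + lambda * |E_req - E_hat|, per sample (simplified sketch;
    lambda = 0.05 as stated in the training-strategy section)."""
    return adv_loss + lam * abs(e_requested - e_reconstructed)

# A 10 GeV mismatch adds lambda * 10 = 0.5 to the adversarial loss:
loss = combined_loss(adv_loss=1.0, e_requested=50.0, e_reconstructed=40.0)
```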

Note, however, that this formulation discourages, but does not forbid, a deposition of more energy than was requested. We can remedy this unphysical result by sampling from the conditional distribution until energy conservation is met. This issue is further addressed in Sec. V.2.

During training, the generator will maximize Eqn. 3, and the discriminator will maximize Eqn. 4:

L_G = -L_\text{adv} - \lambda L_E   (3)

L_D = L_\text{adv} - \lambda L_E   (4)

IV.3 Training Strategy

The weight λ is set to 0.05 to down-weight the importance of L_E compared to L_adv and to rescale the absolute error, which is measured in GeV, to a range comparable to that of L_adv. This hyper-parameter can be tuned in a systematic way, but with minimal tuning we were able to find a reasonable value.

The weights in the generator and discriminator are optimized in an alternating fashion over a set of 100,000 Geant4-simulated events for each particle type, in batches of 256, using the Adam optimizer Kingma and Ba (2014) with separate learning rates for the discriminator and the generator. We note that, outside of initial rough hyper-parameter tuning, we perform no dedicated optimization per particle type, and simply apply the same training parameters to all three networks. We expect significant performance improvements (especially for pions) with dedicated training.
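The alternating update schedule can be sketched as follows. The gradient updates are stubbed out (the commented names `sample_geant4_batch`, `generator`, and `sample_latent` are hypothetical), and the dataset/epoch sizes are toy values, not the 100,000-event, 50-epoch configuration used in the paper:

```python
BATCH_SIZE = 256   # as in the text
N_EVENTS = 1024    # toy dataset size (the paper uses 100,000 per particle type)
N_EPOCHS = 2       # toy value (the paper trains for 50 epochs)

def train(n_events=N_EVENTS, n_epochs=N_EPOCHS, batch_size=BATCH_SIZE):
    """Skeleton of alternating optimization: one discriminator step, then
    one generator step, per minibatch. Actual network updates are stubbed."""
    d_steps = g_steps = 0
    for _ in range(n_epochs):
        for _ in range(n_events // batch_size):
            # real = sample_geant4_batch(batch_size)        # hypothetical loader
            # fake = generator(sample_latent(batch_size))   # hypothetical generator
            d_steps += 1  # Adam update of D on real + fake samples
            g_steps += 1  # Adam update of G through the frozen D
    return d_steps, g_steps

steps = train()
```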

Each system is trained for 50 epochs. Sixteen NVIDIA K80 graphics cards are used for initial hyper-parameter sweeps, with two Titan X Pascal Architecture cards used for final training. Keras v2.0.3 Chollet (2017) is used to construct all models, with the TensorFlow v1.1.0 backend Abadi et al. (2015).

V Performance

As discussed in Theis et al. (2015), several methods exist to qualitatively and quantitatively assess the performance of generative networks, but not all evaluation criteria are equally suitable and reliable for all applications. In this paper, we choose application-driven methods focused on sample quality. A first qualitative assessment is accompanied by a quantitative evaluation based on physics-driven similarity metrics. The choice reflects the domain-specific procedure for data-simulation comparison. These similarity metrics are based on one-dimensional statistics of the shower probability distribution. Visualizing and verifying the performance in higher dimensions is a challenge. One way to probe the modeling of higher dimensions is to study the ability to classify showers from different particles, as done in Sec. V.3.

V.1 Qualitative Assessment

We first examine the average calorimeter deposition per voxel (volumetric pixel). On average, the systems learn a complete picture of the underlying physical processes governing the cascades of e⁺, γ, and π⁺ with uniform energy between 1 GeV and 100 GeV (Figs. 3, 4, and 5).

Diversity and overtraining concerns can be investigated by considering the nearest neighbors among the training and generated datasets. Figs. 6, 7, and 8 show five randomly selected events and their GAN-generated nearest neighbors in all three calorimeter layers for e⁺, γ, and π⁺ showers, respectively. Good qualitative agreement is found between the two distributions across all layers, without obvious signs of mode collapse: a failure mode in which the generator learns to produce only a small subset of samples from the distribution. Compared to the other two particle types explored in this application, charged pions clearly display a higher degree of complexity and diversity in their showers at the individual image level: some deposit energy in all cells of a given layer, while others hit only a handful of them. This is because charged pions undergo nuclear interactions in addition to electromagnetic interactions.
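The nearest-neighbor check just described reduces to a Euclidean distance over flattened images; a minimal sketch:

```python
import numpy as np

def nearest_neighbor(query, candidates):
    """Index of the candidate shower image closest to `query` in Euclidean
    distance after flattening; used as a qualitative diversity/overtraining
    check between training and generated samples."""
    q = query.ravel()
    dists = [np.linalg.norm(c.ravel() - q) for c in candidates]
    return int(np.argmin(dists))

# Toy middle-layer images: candidate 1 is closest to the all-ones query.
query = np.ones((12, 12))
cands = [np.zeros((12, 12)), np.full((12, 12), 0.9), np.full((12, 12), 2.0)]
idx = nearest_neighbor(query, cands)
```

In practice one would apply this per calorimeter layer (or on concatenated flattened layers) between a Geant4 sample and a pool of CaloGAN candidates.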

Figure 3: Average Geant4 e⁺ shower (top), and average CaloGAN e⁺ shower (bottom), with progressive calorimeter depth (left to right).
Figure 4: Average Geant4 γ shower (top), and average CaloGAN γ shower (bottom), with progressive calorimeter depth (left to right).
Figure 5: Average Geant4 π⁺ shower (top), and average CaloGAN π⁺ shower (bottom), with progressive calorimeter depth (left to right).
Figure 6: Five randomly selected e⁺ showers per calorimeter layer from the training set (top) and their five nearest neighbors (by Euclidean distance) from a set of CaloGAN candidates (bottom).
Figure 7: Five randomly selected γ showers per calorimeter layer from the training set (top) and their five nearest neighbors (by Euclidean distance) from a set of CaloGAN candidates (bottom).
Figure 8: Five randomly selected π⁺ showers per calorimeter layer from the training set (top) and their five nearest neighbors (by Euclidean distance) from a set of CaloGAN candidates (bottom).

V.2 Shower Shapes

Electron and photon classification and energy calibration use properties of the calorimeter shower Aad et al. (2012, 2014); Aaboud et al. (2016b, 2017). These same features can be used to quantitatively assess the quality of the GAN samples. The list of features used for evaluation is provided in Table 3 in Appendix A. The key physical quantity that governs the shapes of these distributions is the number of radiation lengths (X₀) traversed by the particle. By definition, X₀ is the distance over which an electron's energy is reduced, on average, to 1/e of its initial value. The equivalent distance for photons is slightly longer (by a factor of 9/7 Olive et al. (2014)) and is set by the mean free path for pair production. The transverse shower size is also proportional to X₀. For a brief review, see e.g. Olive et al. (2014).
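The radiation-length definition above corresponds to exponential attenuation of the mean electron energy, E(x) = E₀ exp(-x/X₀), which can be checked numerically:

```python
import math

def electron_energy(e0_gev, depth, x0):
    """Mean electron energy after traversing `depth` of material with
    radiation length `x0` (same units as depth): the energy falls to
    1/e of its value for each radiation length traversed."""
    return e0_gev * math.exp(-depth / x0)

# After exactly one radiation length, a 10 GeV electron retains 10/e GeV:
e_after_one_x0 = electron_energy(10.0, depth=1.0, x0=1.0)
```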

The 1-dimensional distributions for Geant4- and GAN-generated samples are shown in Fig. 9. Although the sparsity levels per layer are only roughly matched, for the majority of the remaining variables the GAN picks up on complex features in the distributions across several orders of magnitude and all particle types. The unique features that pions exhibit, compared to the other particles, make it unfavorable to train a single model for multiple particle types.

Note that shower shape variables were not explicitly part of the training, which is based only on the distribution of pixel intensities and energy. In the future, one could integrate the shower shape distributions into the loss function itself. For now, we have left them out so that they can serve as an unbiased validation.

Figure 9: Comparison of shower shape variables, introduced in Table 3, and other variables of interest, such as the sparsity level per layer, for the Geant4 and CaloGAN datasets for e⁺, γ, and π⁺.

In addition to comparing shower shapes to reference distributions, we want to measure the quality of the conditioning on energy. As outlined in Sec. IV.2, we cannot explicitly impose conservation of energy, but one can devise a simple sampling system that only keeps simulated showers obeying this constraint.

As can be noted in the example in Fig. 10, our loss formulation coupled with the uniform training distribution admits an approximately symmetric conditional output energy distribution. In Fig. 10, note that the vertical lines that approximately coincide with the mode of each distribution represent the requested energy, and could easily be used as a threshold on selecting physical events. A noteworthy feature of this system is that one can request energies that lie outside the trained region (capped at 100 GeV in this application), to which a trained CaloGAN will return samples around the requested energy level – though with broader width, and mode shifted towards the training domain. Whether or not these extrapolated samples obey shower shape distributions and other metrics is left as future work.
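The post hoc sampling scheme amounts to rejection sampling against the conservation constraint. In this sketch the generator stand-in is hypothetical (a symmetric distribution around the requested energy, mimicking the behavior described above); the trained CaloGAN would take its place:

```python
import numpy as np

rng = np.random.default_rng(2)

def fake_generator_total_energy(e_requested):
    """Hypothetical stand-in for the trained generator: returns a total
    deposited energy roughly symmetric around the requested value."""
    return e_requested + rng.normal(scale=0.05 * e_requested)

def sample_physical(e_requested, max_tries=1000):
    """Rejection-sample showers until the deposited energy does not exceed
    the requested (incident) energy, enforcing conservation post hoc."""
    for _ in range(max_tries):
        e_dep = fake_generator_total_energy(e_requested)
        if e_dep <= e_requested:
            return e_dep
    raise RuntimeError("no physical sample found within max_tries")

e = sample_physical(50.0)  # guaranteed <= 50 GeV
```

With an approximately symmetric conditional distribution, roughly half of the draws survive, so the expected overhead of this filter is about a factor of two in generation calls.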

Figure 10: Post-conditioning empirical energy response for e⁺ incident at 1, 25, 50, 100, and 150 GeV. Though our model is only trained on the uniform range between 1 and 100 GeV, it still admits a compelling peak at 150 GeV.

V.3 Classification as a Performance Proxy

Transferability of classification performance from GAN-generated samples to Geant4-generated samples can be used as a proxy both for CaloGAN image quality and potential utility in a practical fast simulation setting.

We perform ten identical trainings of simple six-layer fully-connected binary classifiers for two particle-pair classification tasks, and we report the accuracies for in-domain and out-of-domain testing (Table 1) along with the following observations:

  • when training on Geant4, testing on the generated CaloGAN data set yields similar results to testing on a separate Geant4 data set, leading us to believe that the GAN has learned most of the discriminating physics between the classes of particles. Note that percent-level differences in accuracy may however be relevant for particular applications;

  • the significantly higher performance obtained on the CaloGAN-generated test set when training on a separate dataset of CaloGAN-generated images highlights a greater inter-class differentiation in the GAN synthetic dataset than originally present in the target Geant4 distribution.

This could either be due to new unphysical, class-dependent features produced by the GAN, or to the inability of the GAN to cover the entire feature space for at least one of the particle classes. It is likely that both of these contribute. To some extent, unphysical features are mitigated by the discriminator network of the GAN training itself, but both physical and unphysical features that are not very useful for distinguishing real from fake could turn into very useful features for the two-particle classification case. Such information would therefore appear discriminative in GAN images but not in Geant4. While classification is a useful metric for probing the high-dimensional feature space and shows promising results, there are still challenges for interpreting and improving upon the outcome.

First classification task:
                      Test on Geant4      Test on CaloGAN
Train on Geant4       99.6% ± 0.1%        96.5% ± 1.1%
Train on CaloGAN      98.2% ± 0.9%        99.9% ± 0.2%

Second classification task:
                      Test on Geant4      Test on CaloGAN
Train on Geant4       66.1% ± 1.2%        70.6% ± 2.6%
Train on CaloGAN      54.3% ± 0.8%        100.0% ± 0.0%

Table 1: Mean and standard deviation of classification accuracy over 10 trials for each of the two binary particle-classification tasks, using a six-layer fully-connected network with dropout. The networks are trained on a dataset from the domain specified in the first column and tested on an independent dataset from the domain specified in the header.

V.4 Computational Performance

In addition to being a promising high-fidelity fast simulation paradigm that respects many shower shape variables, the CaloGAN affords many orders of magnitude of computational speedup. (Non-distributed training can take a day or more, depending on the total number of training epochs, but it executes in constant time regardless of the number of showers requested at generation time, so it is a fixed cost that is not relevant for the majority of the total computing budget.) We benchmark generation time for showers with incident energy drawn uniformly between 1 GeV and 100 GeV. Geant4 and the CaloGAN on CPU are benchmarked on nearly identical compute nodes on the PDSF distributed cluster at the National Energy Research Scientific Computing Center (NERSC), and numerical results are averaged over 100 runs. The CaloGAN on GPU hardware is benchmarked on an Amazon Web Services (AWS) p2.8xlarge instance, where a single NVIDIA K80 is used for the purposes of benchmarking.

In Table 2, we show the time required to generate a single particle shower, in milliseconds. We provide results for several CaloGAN batch sizes, as we expect different use cases to have different batching requirements. We note that a batch can accept any number of different requested energies. With the largest batch sizes on GPU, our method achieves a speedup of five orders of magnitude compared to the single-threaded Geant4 benchmark. In addition, generation time with Geant4 scales with incident energy, whereas CaloGAN generation time is flat as a function of incident energy.

Simulator   Hardware   Batch Size   ms/shower
Geant4      CPU        N/A          1772
CaloGAN     CPU        1            13.1
                       10           5.11
                       128          2.19
                       1024         2.03
CaloGAN     GPU        1            14.5
                       4            3.68
                       128          0.021
                       512          0.014
                       1024         0.012
Table 2: Total expected time (in milliseconds) required to generate a single shower under various algorithm-hardware combinations.
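The batch-size dependence in Table 2 comes from amortizing per-call overhead across many showers. The sketch below demonstrates the effect with a toy single-dense-layer "generator" in numpy (the real network is far larger, and the absolute timings are purely illustrative): generating 1024 showers one at a time is dominated by per-call overhead, while a single batched call is much cheaper per shower.

```python
import time
import numpy as np

rng = np.random.default_rng(0)
W = rng.normal(size=(64, 504)).astype(np.float32)  # toy "generator" weights

def generate(z):
    """Stand-in generator: one dense layer with ReLU activation."""
    return np.maximum(z @ W, 0.0)

def ms_per_shower(n_showers=1024, batch_size=1):
    """Best-of-3 wall-clock time per generated shower for a given batch size."""
    best = float("inf")
    for _ in range(3):
        t0 = time.perf_counter()
        for start in range(0, n_showers, batch_size):
            z = np.ones((min(batch_size, n_showers - start), 64), dtype=np.float32)
            generate(z)
        best = min(best, time.perf_counter() - t0)
    return 1000.0 * best / n_showers

t_single = ms_per_shower(batch_size=1)     # per-call overhead dominates
t_batched = ms_per_shower(batch_size=1024)  # overhead amortized over the batch
```

The same qualitative trend (decreasing ms/shower with increasing batch size) is what Table 2 reports for the actual CaloGAN model on CPU and GPU.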

V.4.1 Implementation Notes

As noted previously in Sec. IV.1, separating per-particle-type CaloGAN architectures and implementations affords many benefits. It is easy to imagine situations in which the life cycles of models for different particle types differ substantially. In addition, this separation allows for total independence in versioning, framework, and language.

When possible, any GAN-based simulator should make maximal use of batching: we imagine most applications can request all showers for one event simultaneously, taking full advantage of the CPU/GPU while minimizing data transfer overhead.
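Event-level batching can be sketched as follows. This is a minimal numpy illustration, not the paper's implementation: `generate_event_showers`, `LATENT_DIM`, and the toy conditional generator are all hypothetical names, and the toy generator simply scales a ReLU'd linear map by the requested energy. The key point, as noted above, is that a single batched call can serve any mix of requested energies.

```python
import numpy as np

LATENT_DIM = 64  # assumed latent size; the actual model's may differ

def generate_event_showers(generator, energies, rng=None):
    """Produce every shower requested for one event in a single batched call.

    `energies` may all differ: the conditional generator receives one
    (latent vector, requested energy) pair per shower."""
    rng = rng or np.random.default_rng()
    e = np.asarray(energies, dtype=float).reshape(-1, 1)
    z = rng.normal(size=(len(e), LATENT_DIM))
    return generator(z, e)

# Toy energy-conditioned generator standing in for the trained network.
_W = np.random.default_rng(0).normal(size=(LATENT_DIM, 504))
def toy_generator(z, e):
    return e * np.maximum(z @ _W, 0.0)  # output scales with requested energy

# One call covers all showers needed for an event, at three different energies.
showers = generate_event_showers(toy_generator, [1.0, 37.5, 100.0],
                                 rng=np.random.default_rng(1))
```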

VI Conclusions and Future Outlook

Using modern generative deep neural network techniques, we have generated three-dimensional electromagnetic showers in a multi-layer sampling LAr calorimeter with uneven spatial segmentation, while preserving correlations among layers. Our approach infuses physics domain knowledge and reproduces many key shower shape properties at a level comparable to the Geant4 full simulation. We showed the possibility of up to a five-order-of-magnitude decrease in computing time.

Future work will focus on improving performance by drawing on recent machine learning developments in GAN training procedures, as well as on testing the direct inclusion of important shower shape variables as constraints at training time. Further developments will build on this result and continue expanding the complexity of the training dataset to include incoming particles at different locations and angles within the detector, as well as the hadronic calorimeter. Concurrent plans include testing the computational performance on high performance computing (HPC) clusters and porting these solutions into the simulation packages used by the nuclear and particle physics communities, so that the various experiments can maximally benefit from this new technology.

This work was supported in part by the Office of High Energy Physics of the U.S. Department of Energy under contracts DE-AC02-05CH11231 and DE-FG02-92ER40704. The authors would like to thank Wahid Bhimji, Zach Marshall, Mustafa Mustafa, and Prabhat for helpful conversations.


Appendix A Shower Shape Variables

Table 3 contains the description and mathematical definition of the shower shape variables used to compare the generated and target distributions. These are defined as functions of I_i, the vector of pixel intensities for the image in layer i, where i ∈ {0, 1, 2}.

Shower Shape Variable      Formula                                                  Notes
E_i                        Σ_j I_{i,j}                                              Energy deposited in the i-th layer of the calorimeter
E_tot                      Σ_{i=0}^{2} E_i                                          Total energy deposited in the electromagnetic calorimeter
f_i                        E_i / E_tot                                              Fraction of measured energy deposited in the i-th layer
E_ratio,i                  (I_{i,(1)} − I_{i,(2)}) / (I_{i,(1)} + I_{i,(2)})        Difference in energy between the highest (I_{i,(1)}) and second-highest (I_{i,(2)}) energy deposits in the cells of the i-th layer, divided by their sum
d                          max{ i : E_i > 0 }                                       Deepest calorimeter layer that registers non-zero energy
l_d                        Σ_{i=0}^{2} i · E_i                                      Depth-weighted total: the sum of the energy per layer, weighted by layer number
Shower Depth, s_d          Σ_{i=0}^{2} i · E_i / E_tot                              The energy-weighted depth in units of layer number
Shower Depth Width, σ_{s_d}  sqrt( Σ_{i=0}^{2} i² E_i / E_tot − s_d² )              The standard deviation of s_d in units of layer number
Layer Lateral Width, σ_i   sqrt( Σ_η η² I_{i,η} / E_i − (Σ_η η I_{i,η} / E_i)² )    The standard deviation of the transverse energy profile in layer i, in units of cell numbers (η is the transverse cell index)
Table 3: One-dimensional observables used to assess the quality of the GAN samples.
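The observables in Table 3 are straightforward to compute from the per-layer pixel intensities. The sketch below is a direct numpy transcription of the definitions above; `shower_shapes` is an illustrative helper name, not part of the released code, and the all-zero-layer guard in the E_ratio computation is our own convention.

```python
import numpy as np

def shower_shapes(layers):
    """Compute the Table 3 observables from per-layer pixel intensities I_i."""
    E = np.array([I.sum() for I in layers])       # E_i: energy per layer
    E_tot = float(E.sum())                        # total deposited energy
    i = np.arange(len(layers), dtype=float)
    f = E / E_tot                                 # f_i: per-layer fraction
    l_d = float((i * E).sum())                    # depth-weighted total energy
    s_d = l_d / E_tot                             # shower depth
    sigma_sd = float(np.sqrt((i**2 * E).sum() / E_tot - s_d**2))  # depth width
    d = int(np.nonzero(E > 0)[0].max())           # deepest non-empty layer
    e_ratio = []
    for I in layers:                              # per-layer E_ratio
        first, second = np.sort(I.ravel())[::-1][:2]
        total = first + second
        e_ratio.append((first - second) / total if total else 0.0)
    return {"E": E, "E_tot": E_tot, "f": f, "d": d, "l_d": l_d,
            "s_d": s_d, "sigma_sd": sigma_sd, "E_ratio": np.array(e_ratio)}
```

For instance, a shower depositing energies (1, 4, 1) across the three layers has s_d = (0·1 + 1·4 + 2·1)/6 = 1.0 and σ_{s_d} = sqrt(8/6 − 1) ≈ 0.577.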