Detecting gravitationalwave transients at five sigma:
a hierarchical approach
Abstract
As secondgeneration gravitationalwave detectors prepare to analyze data at unprecedented sensitivity, there is great interest in searches for unmodeled transients, commonly called bursts. Significant effort has yielded a variety of techniques to identify and characterize such transient signals, and many of these methods have been applied to produce astrophysical results using data from firstgeneration detectors. However, the computational cost of background estimation remains a challenging problem; it is difficult to claim a detection with reasonable computational resources without paying for efficiency with reduced sensitivity. We demonstrate a hierarchical approach to gravitationalwave transient detection, focusing on longlived signals, which can be used to detect transients with significance in excess of using modest computational resources. In particular, we show how previously developed seedless clustering techniques can be applied to large datasets to identify highsignificance candidates without having to trade sensitivity for speed.
Introduction. With secondgeneration gravitationalwave (GW) detectors coming online later this year, the first direct detection of GWs may be near. In order to establish the significance of a detection, it is common to report a false alarm probability (FAP), which quantifies the probability that a noise fluctuation could produce an event at least as loud as the observed candidate (as measured by some detection statistic). In some subfields, e.g., particle physics,“ significance,” corresponding to , is used as a detection threshold.
In order to estimate the FAP of a GW candidate, it is common to perform timeshifts in which the GW strain time series from one detector is shifted with respect to the series from a second detector by an amount greater than both the travel time between the detectors and the coherence time of the signals being targeted. Timeshifting preserves nonGaussian and nonstationary features that characterize the zerolag (no timeshift) noise, while simultaneously eliminating true GW signals. By performing timeshifts, it is possible to generate a distribution of the detection statistic, which can be used to estimate FAP to a level of . The threshold corresponds to . In many cases it is computationally impractical to carry out this many timeshifts, though, it has been accomplished in the “detection” of a LIGO blind injection with a matched filter search Abadie et al. (2012a). Despite the pervasive use of timeshifts, there are limitations Was et al. (2010).
For many transient GW searches, the significant computing costs incurred by background estimation arise from a desire to use a coherent detection statistic. Coherent algorithms utilize the complexvalued crosspower obtained by crosscorrelating strain data from detectors instead of or in addition to the incoherent autopower observed in each detector separately; see, e.g., Sutton P. et al. (2010); Klimenko et al. (2008); Thrane et al. (2011); Abadie et al. (2012b). The extra phase information helps differentiate signal from background, improving the sensitivity of the search. However, the cost of background estimation for coherent searches is relatively large compared to a comparable incoherent search because the detection statistic must be recalculated for each timeshift, after the fresh application of a clustering algorithm. Some algorithms use singledetector autopower exclusively, which allows for much more rapid background estimation Anderson et al. (2001); Abadie et al. (2010).
In this Letter, we describe a hierarchical approach to background estimation in the context of a search for longlived, unmodeled GW transients using seedless clustering Thrane and Coughlin (2013, 2014). First, we identify “events” using a computationally intensive, but incoherent, singledetector statistic. Second, we calculate a computationally fast, coherent detection statistic for each event identified with the singledetector statistic. The second, coherent detection statistic is used to evaluate significance. By splitting the calculation into an incoherent stage and a coherent stage, the computationally intensive calculations are carried out just once, allowing rapid background estimation without sacrificing the sensitivity gained by the use of coherence.
We demonstrate this technique by estimating the background—past the level—for one week of simulated Monte Carlo data and one week of Initial LIGO noise, recolored to resemble data from Advanced LIGO Aasi et al. (2015). We calculate the sensitivity of this mock search for several toy model waveforms and find that it is not adversely affected by the incoherent stage. The remainder of this Letter is organized as follows. We review the basics of transient identification with seedless clustering and describe the details of the new hierarchical detection statistic, we describe a mock data analysis carried out on one week of Monte Carlo data and one week of recolored Initial LIGO noise, and we present results demonstrating the ability to estimate background at the level.
Method. In previous work, we have described seedless clustering Thrane and Coughlin (2013, 2014); Thrane et al. (2015); Coughlin et al. (2014, 2015), a technique in which GW transients are identified by looking for clusters of excess coherence integrated along many different parametrized curves through frequencytime space. In particular, cubic Bézier curves Farin (1996) provide a useful family of curves suitable for the detection of bursting sources lasting tens of seconds to weeks, which slowly evolve in frequency, but which are approximately narrowband on short time scales Thrane and Coughlin (2013, 2014); Thrane et al. (2015). Such longlived signals Thrane et al. (2011), created, e.g., by rotational instabilities in nascent neutron stars Piro and Pfahl (2007); Piro and Ott (2011); Piro and Thrane (2012); Corsi and Mészáros (2009) or by the fragmentation of an accretion disk Kiuchi et al. (2011); van Putten (2001, 2008), are potentially detectable by secondgeneration detectors Thrane and Coughlin (2013, 2014).
Previous applications of seedless clustering have been employed to look for clusters of excess coherence using a coherent statistic Thrane and Coughlin (2014); Thrane et al. (2015):
(1) 
Here, is the Fourier transform of the strain time series in detector for a data segment centered on time with frequencies . is the autopower spectrum calculated using (typically nine) neighboring segments while is a Fourier normalization constant. By integrating along suitable curves in frequencytime space (and applying a phase factor to “point” in different sky directions), one can define a signaltonoise ratio for the cluster ^{1}^{1}1We correct the normalization factor given in Eq. 5 of Thrane and Coughlin (2014): the coefficient is as opposed to .
(2) 
Here, is a template describing the set of spectrogram pixels in the parametrized (Bézier) curve, is the number of pixels in the curve, and is the directiondependent time delay between two detectors.
By calculating for many different templates and for many different time delays , it is possible to find very weak signals buried in noise Thrane and Coughlin (2014). The loudest event in each spectrogram is deemed to have a signaltonoise ratio:
(3) 
is a coherent detection statistic. The observed value of is compared to a distribution of generated using timeshifts in order to assign a falsealarm probability.
To reap the benefits of seedless clustering, it is desirable to employ a large number of templates: for a single spectrogram The calculation can be sped up by parallelization, but it becomes prohibitive to repeat times on timeshifted data, the number of realizations required to carry out background estimation. In order to eliminate this computational bottleneck, we introduce a singledetector, incoherent statistic
(4) 
Note that is equivalent to for the case where the two detector indices are equal . It is the ratio of the autopower in detector at time to the autopower in detector at the times neighboring .
While the seedless clustering algorithms described above were developed in the context of a coherent search using , it is straightforward to apply them to the new incoherent statistic by simply defining a new singledetector signaltonoise ratio:
(5) 
The phase factor from Eq. 2 vanishes because the time delay between two (now identical) detectors is zero. As before, we identify the most significant cluster in each spectrogram. It is necessary to do this separately for each detector: .
While and are similarly defined, it is important to note differences in their statistical behavior. If no signal is present, the distribution of is peaked symmetrically about zero. However, the distribution of is positive definite and asymmetric with a peak near one. Thus, [calculated from ] and [calculated from ] have very different distributions. While corresponds to a highly significant event, is typical for noise.
If we stopped here, and used as a detection statistic, the method would rely on an incoherent statistic. However, we can do much better by using clusters identified with to calculate a coherent statistic. The loudest cluster as ranked by the singledetector statistic is denoted . The coherent signaltonoise ratio for this cluster is:
(6) 
The term indicates that the sum is carried out for evenly spaced values of , sufficient to match the diffractionlimited resolution. represents the coherent signaltonoise ratio in detectors and for the loudest cluster identified using only the autopower from detector . Last, we define the detection statistic as the maximum value of among the two detectors: .
To illustrate why this hierarchical design is useful, it is helpful to describe the procedure of background estimation as a series of numbered steps.

We break the coincident data into conveniently long spectrograms to be analyzed for clusters. In this Letter, we use overlapping, spectrograms.

For each spectrogram, we identify the loudest cluster in each detector using the singledetector, incoherent statistic .

If is less than some predetermined threshold , proceed no further. The cluster is not promising enough to spend time calculating the coherent statistic.

For each spectrogram—if there is a cluster passing the cut —we calculate the coherent detection statistic .

Take the spectrogram data produced in step 1 and timeshift the clusters in one detector to create a new noise realization. Repeat steps 2–4 to generate a background distribution for .
Since steps 2–3 use a singledetector statistic, we obtain the same list of singledetector clusters every time—it does not matter if the data streams are shifted with respect to one another. This means that we can carry out steps 2–3 just once and reuse the results for subsequent timeslides. This is important because step 2 (cluster identification) is the computationally expensive step. We still need to recalculate the coherent statistic for each timeslide, but this is a cheap calculation since we have reduced the number of templates from per spectrogram to one (and only a fraction of spectrograms will contain a cluster passing the step 3 cut). The zerolag data are analyzed identically with the same hierarchical process, ensuring that the background estimation can be used to identify detections.
Mock data analysis. We carry out two mock data analyses. For the first analysis, we analyze Monte Carlo noise. The simulated noise is Gaussian, stationary in time, and colored according to the design sensitivity of Advanced LIGO Aasi et al. (2015). For the second analysis, we analyze timeshifted data from Initial LIGO, which has been recolored to resemble Advanced LIGO noise at design sensitivity. The recolored noise preserves nonGaussian noise artifacts that are present in real data, but not in Monte Carlo simulations. Of course, the character of the nonGaussian noise present in Advanced LIGO noise is likely to be different from that of Initial LIGO as the detectors are significantly different. The recoloured noise results should therefore be taken as an plausible approximation of Advanced LIGO noise. In both cases, we use a twodetector network consisting of the LIGO Hanford and Livingston observatories.
In order to estimate the background for one week of data, we analyze of data, corresponding to , long, overlapping spectrograms. The spectrograms span a frequency range of – and are composed of pixels, overlapping in time. This choice of spectrographic resolution is suitable for many longlived transient waveforms with slowly varying frequency Piro and Thrane (2012); Thrane and Coughlin (2013, 2014); Aasi et al. (2013).
In order to impose the constraints of a realistic search, we assume that the seedless clustering parameters are tuned just once. We employ cubic Bézier curves with a minimum duration of . For each spectrogram, we employ templates to find the loudest autopower cluster for each detector . The difference in frequency between the first and third Bézier points determine the extent to which the signal may vary in time. We limit this variation so that the frequency of the third control point is within of the frequency of the first control point.
We consider three waveforms, which we have previously used to benchmark past studies using seedless clustering Thrane and Coughlin (2013, 2014). In particular, we employ two downchirping accretion disk instability (ADI) signals from Santamaría and Ott (2011) and one upchirping fallback accretion (FA) signal from Piro and Thrane (2012). Following Thrane and Coughlin (2013, 2014), the accretion disk instability signals are scaled so that the isotropic equivalent energy is . The spectrographic properties of the signals are described in Tab. 1. We follow the naming conventions adopted in Thrane and Coughlin (2013, 2014).
waveform  duration (s)  – (Hz)  median strain 

ADI 1  –  
ADI 2  –  
FA 2  – 
We do not include the results for a fourth waveform FA 1 from Thrane and Coughlin (2013, 2014), which persists for and evolves from –. The implementation described above performed poorly detecting this waveform because it is relatively short and it evolves rapidly in frequency. It is possible that we could effectively detect signals of this type using a different tuning from the one we describe above, but we seek to apply a uniform set of parameters, which are optimized for longer, lessquicklyevolving signals.
Above, we listed five steps necessary for carrying out the hierarchical search. The blue curve in Fig. 1a shows the FAP associated with the singledetector statistic obtained after steps 1–2 with recolored noise. FAP is defined for a single spectrogram so corresponds to a false alarm rate of . The vertical lines in Fig. 1a show the median values of the singledetector statistic for data that includes an injected ADI 2 signal injected faceon at an optimal sky location. The different colors indicate different source distances.
For step 3, it is necessary to define a threshold for the singledetector statistic. This threshold is applied in order to control the computational costs. By throwing out belowthreshold candidates, we reduce the number of surviving clusters to include in subsequent steps. The choice of is a balancing act. Increasing the threshold reduces the computational burden, but it can also reduce sensitivity by inadvertently discarding true signals. For the Monte Carlo analysis described here, we find that eliminates of background triggers (thereby reducing the computational burden of steps 4–5 by a hundred fold) without severely harming sensitivity. The step 3 cut is indicated by dashed black line on Fig. 1a. Note that (faceon, optimally oriented) signals with distance are likely to survive the cut. For recolored noise, the threshold is selected to be .
Having eliminated of the clusters, we proceed to calculate . In Fig. 1b, we plot FAP as a function of for recolored noise. The different colored vertical lines indicate the median value of for injected signals (faceon, optimal orientation). If an injected signal does not pass the step 3 autopower cut in at least one detector, it is assigned a value of . The blue curve starts at because of the clusters are removed by the autopower cut. The horizontal dashed lines indicate the required significance for , , and detections in one week of data. Recall that FAP is measured in reference to a spectrogram. One week corresponds to (overlapping) spectrograms and so corresponds to .
From Fig. 1b, it is apparent that the signals that we avoided cutting in step 3 (Fig. 1a) are detectable with the coherent statistic at the level. From Figs. 1a and b, we see why the hierarchical approach is superior to the singledetector statistic. The coherent statistic yields a FAP which is many orders of magnitude smaller. Next, we consider how the sensitivity is affected by the use of an incoherent statistic as an intermediate step. We compare to for mock signals that are loud enough to detect with the hierarchical scheme at and find that they are comparable to within . Somewhat surprisingly, the hierarchical scheme is actually slightly more sensitive in this regime because it searches the sky in a systematic way whereas the statistic pairs each track with a random sky position guess Thrane and Coughlin (2014).
The initial clustering procedure (corresponding to steps 1–3) is carried out on Kepler GK104s Graphics Processor Units (GPUs) at a cost of continuously running GPUs. We find that Intel Xeon E52670 Central Processing Unit (CPU) cores are required to match the performance of one GPU. Steps 4–5, in which we timeshift the data in order to calculate the coherent statistic , are performed using single CPU cores at a cost of continuously running cores in order to achieve the background estimation necessary to identify detection candidates. The calculation in steps 4–5 does not, at present, benefit dramatically from GPU acceleration. To put this in perspective, a fully coherent search, calculating for each timeslide with seedless clustering would require continuously running GPUs.
In Tab. 2, we summarize the results of a sensitivity study in which we estimate detection distance: the distance at which we can detect a waveform with a FAP corresponding to with a false dismissal probability (FDP) of . We consider the cases of faceon, optimally oriented sources (useful for comparison with previous work Thrane and Coughlin (2013, 2014)) and also randomly oriented sources with random sky locations. The first column is the waveform, the second is the type of noise (MC = Monte Carlo or RC = recolored noise), and the second is the detection distance. The RC detection distances are sometimes smaller than the MC distances because nonstationary noise artifacts present in real data tend to decrease the sensitivity compared to idealized Gaussian noise.
Waveform  Noise  distance (Mpc) 

ADI 1 optimal  MC  250 
RC  250  
ADI 1 random  MC  150 
RC  127  
ADI 2 optimal  MC  232 
RC  232  
ADI 2 random  MC  139 
RC  130  
PT 2 optimal  MC  30 
RC  25  
PT 2 random  MC  18 
RC  14 
In order to put Tab. 2 in context, we compare the faceon, optimal sky location, detection distances presented here—estimated for one week of data—to the detection distances from Thrane and Coughlin (2014)—estimated for a single spectrogram with a smaller wide band. Despite the fact that the results here include an additional trial factor, the hierarchical detection distances for Monte Carlo are only times less than the detection distances from Thrane and Coughlin (2014) obtained using seedless clustering. For recolored noise the ratios are . The event rate and loudness of longlived transients are unknown, but the sensitivity we demonstrate here is sufficient to potentially detect longlived signals with secondgeneration detectors Piro and Thrane (2012); van Putten (2008).
For illustrative purposes, we focus here on the detection of 5 events with minimal resources. As a result, the aggressive autopower cut eliminates some fraction of 5 signals that could be detected with a different tuning. However, the algorithm can be tuned differently to balance cost with sensitivity for different FAPs. For future work, it is worthwhile considering if this strategy can be usefully employed in other situations where the character of the noise is less wellbehaved. Moreover, the general strategy we outline here—splitting a search into a computationally expensive “incoherent” step followed by a computationally cheap “coherent” step in order to facilitate rapid background estimation—may find use in the broader community. For example, a similar scheme could prove useful in determining the significance of potentially faint electromagnetic signatures found (by telescopes) in coincidence with gravitationalwave detections. We thank Anthony Piro for sharing the fallback accretion waveforms used in this analysis. We thank Peter Shawhan and Jonah Kanner for helpful comments.
References
 Abadie et al. (2012a) J. Abadie et al., Phys. Rev. D 85, 082002 (2012a).
 Was et al. (2010) M. Was, M.A. Bizouard, V. Brisson, F. Cavalier, M. Davier, P. Hello, N. Leroy, F. Robinet, and M. Vavoulidis, Class. Quant. Grav. 27, 015005 (2010).
 Sutton P. et al. (2010) Sutton P. et al., New Journal of Physics 12, 053034 (2010).
 Klimenko et al. (2008) S. Klimenko, I. Yakushin, A. Mercer, and S. Mitselmakher, Class. Quant. Grav. 25, 114029 (2008).
 Thrane et al. (2011) E. Thrane, S. Kandhasamy, C. D. Ott, et al., Phys. Rev. D 83, 083004 (2011).
 Abadie et al. (2012b) J. Abadie et al., Phys. Rev. D 85, 122007 (2012b).
 Anderson et al. (2001) W. G. Anderson, P. R. Brady, J. D. E. Creighton, and É. É. Flanagan, Phys. Rev. D 63, 042003 (2001).
 Abadie et al. (2010) J. Abadie et al., Phys. Rev. D 81, 102001 (2010).
 Thrane and Coughlin (2013) E. Thrane and M. Coughlin, Phys. Rev. D 88, 083010 (2013).
 Thrane and Coughlin (2014) E. Thrane and M. Coughlin, Phys. Rev. D 89, 063012 (2014).
 Aasi et al. (2015) J. Aasi et al., Class. Quant. Grav. 32, 074001 (2015).
 Thrane et al. (2015) E. Thrane, V. Mandic, and N. Christensen, Phys. Rev. D 91, 104021 (2015).
 Coughlin et al. (2014) M. Coughlin, E. Thrane, and N. Christensen, Phys. Rev. D 90, 083005 (2014).
 Coughlin et al. (2015) M. Coughlin, P. Meyers, E. Thrane, J. Luo, and N. Christensen, Phys. Rev. D 91, 063004 (2015).
 Farin (1996) G. Farin, Curves and Surfaces for CAGD, Fourth Edition: A Practical Guide (Academic Press, 1996).
 Piro and Pfahl (2007) A. L. Piro and E. Pfahl, Astrophys. J. 658, 1173 (2007).
 Piro and Ott (2011) A. L. Piro and C. D. Ott, Astrophys. J. 736, 108 (2011).
 Piro and Thrane (2012) A. L. Piro and E. Thrane, Astrophys. J. 761, 63 (2012).
 Corsi and Mészáros (2009) A. Corsi and P. Mészáros, Astrophys. J. 702, 1171 (2009).
 Kiuchi et al. (2011) K. Kiuchi, M. Shibata, P. J. Montero, and J. A. Font, Phys. Rev. Lett. 106, 251102 (2011).
 van Putten (2001) M. H. P. M. van Putten, Phys. Rev. Lett. 87, 091101 (2001).
 van Putten (2008) M. H. P. M. van Putten, Astrophys. J. Lett. 684, L91 (2008).
 Aasi et al. (2013) J. Aasi et al., Phys. Rev. D 88, 122004 (2013).
 Santamaría and Ott (2011) L. Santamaría and C. D. Ott, Gravitational wave emission from accretion disk instabilities  analytic models (2011), https://dcc.ligo.org/LIGOT1100093v2/public.