# Investigating Binary Black Hole Mergers with Principal Component Analysis

## Abstract

Despite recent progress in numerical simulations of the coalescence of binary black hole systems, highly asymmetric spinning systems and the construction of accurate physical templates remain challenging and computationally expensive. We explore the feasibility of a prompt and robust test of whether detected signals exhibit evidence for generic features that can inform new simulations. We form catalogs of numerical relativity waveforms with distinct physical effects and compute the relative probability that a gravitational wave signal belongs to each catalog. We introduce an algorithm that performs this task for coalescence signals, using principal component analysis of waveform catalogs together with Bayesian model selection, and demonstrate its effectiveness.

## 0.1 Introduction

The coalescence of two black holes is arguably the most powerful source of gravitational waves (GWs) detectable by the second generation of ground-based detectors: Advanced LIGO (Harry:2010zz), Advanced Virgo (Acernese:2009), and KAGRA (Somiya:2011np). The discovery of these signatures, forecast within the next few years (Aasi:2013wya), will open a new era of gravitational wave astrophysics, in which the GW signature will provide insight into the physics of the source.

To decode the information in the GW waveform, we need a careful mapping between the waveform and the masses, spin magnitudes and spin orientations of the black holes; this is the task of numerical relativity (NR). While available NR waveforms span an increasing portion of the physical parameter space of unequal-mass, spinning and precessing binary black holes (BBHs) (Ajith:2012az; Hinder:2013oqa), each simulation takes a week or more to run. Complete coverage of the full parameter space remains a slow but important endeavor to enable GW matched filtering and parameter estimation (Thorne:87; Aasi:2013jjl).

The LIGO and Virgo Collaborations have refined techniques for the search of generic GW transients, or bursts, which do not assume a specific waveform but rely on a coherent GW signal in multiple detectors for a variety of plausible sources (Abadie:2012rq; Andersson:2013mrx). The work presented here asks how a transient detected by a template-less burst search can trigger new NR simulations in interesting regions of the BBH parameter space. We introduce a proof-of-concept study that uses Principal Component Analysis (PCA) to compare a plausible signal to catalogs of NR waveforms, each of which represents a certain region of the BBH physical parameter space.

## 0.2 Binary Black Hole Merger Simulations

| Name | Q | HR | RO3 |
|---|---|---|---|
| Mass ratio | 1 – 2.5 | 1 – 4 | 1.5 – 4 |
| Spin magnitude | 0.0 | 0.4, 0.6 | |
| Tilt angle | 0.0 | 0.0 | |
| N waveforms | 13 | 15 | 20 |

The GW waveform produced by solar- and intermediate-mass BBH systems spans the sensitive band of ground-based detectors through the inspiral, merger and ringdown phases. While post-Newtonian and perturbation theories adequately describe the inspiral and ringdown, numerical relativity is necessary to capture the physics of the merger. NR has been probing the parameter space of binary black hole mergers since the breakthrough of 2005 (Pretorius:2005gq), achieving extreme mass ratios (Lousto:2010ut), extreme spin magnitudes (Lovelace:2011nu) and many precessing runs (Mroue:2013xna; Pekowsky:2013ska).

The NR waveforms used in this paper were produced by the Maya code of the Georgia Institute of Technology (Vaishnav:2007nm). The Maya code uses the Einstein Toolkit.

For this work we use 48 NR runs, listed in Table 0.1, without hybridization with post-Newtonian waveforms. The Q-series contains 13 non-spinning, unequal-mass simulations. We use 15 runs from the HR-series, a set of unequal-mass, equal-spin simulations with initial spins parallel to the initial angular momentum. The RO3-series is a set of 20 unequal-mass simulations in which the spin of the lighter black hole is aligned with the initial angular momentum (z-axis) and the spin of the other black hole is tilted with respect to the z-axis in the xz-plane. These systems are precessing: the tilt angles are defined at a specific separation of the black holes, at one instant in the evolution of the binary, and change in time. While the runs are tabulated by their initial parameters, there is no functional form that relates one waveform to the next; we therefore use a Principal Component Analysis to determine the main features of each catalog.

## 0.3 Principal Component Analysis and Bayesian Model Selection

We parametrize the NR waveform catalogs of §0.2 with an orthonormal set of principal components (PCs), obtained with a standard singular value decomposition (heng:09; roever:09). For a catalog of $n$ waveforms, each with $m$ samples, we create an $m \times n$ matrix $H$ whose columns correspond to the individual waveforms. We then factorize $H$ so that:

$$H = U \Sigma V^{T} \qquad (0.1)$$

where $U$ is an $m \times m$ matrix whose columns are the eigenvectors of $H H^{T}$ and $V$ is an $n \times n$ matrix whose columns are the eigenvectors of $H^{T} H$. The $m \times n$ matrix $\Sigma$ is zero everywhere except for the diagonal terms $\sigma_{ii}$, each of which is the square root of the $i$th eigenvalue $\lambda_i$. $U$ contains the catalog’s PCs, ranked by their corresponding eigenvalue: the first column is the first PC, which encapsulates the most significant features common to all waveforms in the catalog; the second column, corresponding to the second largest eigenvalue, describes the second most significant common features in the catalog, and so on. The waveforms in $H$ can be reconstructed as a linear combination of PCs:

$$h_i \approx \sum_{j=1}^{k} \beta_j U_j \qquad (0.2)$$

where $h_i$ is the $i$th catalog waveform, $U_j$ is the $j$th PC and $\beta_j$ is the corresponding coefficient, obtained by projecting $h_i$ onto $U_j$. The sum over $k$ PCs is an approximation of the desired waveform, since in general $k < n$. In this analysis, the choice of $k$ is determined by the cumulative eigenvalue energy, $E(k)$, shown in Figure 0.1:

$$E(k) = \frac{\sum_{i=1}^{k} \lambda_i}{\sum_{i=1}^{n} \lambda_i} \qquad (0.3)$$

In this analysis we use $k$ PCs, chosen so that $E(k)$ reaches a fixed threshold. This corresponds to 2, 4 and 5 PCs for the Q, HR and RO3 catalogs respectively. A selection of the waveforms from the HR catalog and the corresponding PCs are shown in Figure 0.2.

*Left*: the waveforms in the HR catalog of spinning, non-precessing waveforms used in this study.

*Right*: The principal component decomposition of the HR catalog.
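The decomposition and truncation described in this section can be sketched in a few lines of NumPy. This is a minimal illustration, not the code used for the study: the toy catalog of windowed sinusoids stands in for the NR waveforms, the function names are hypothetical, and the energy threshold of 0.9 is an arbitrary value chosen for the example.

```python
import numpy as np

def principal_components(H):
    """SVD of a catalog matrix H with shape (m samples, n waveforms).

    Returns the PCs as columns of U and the eigenvalues of H H^T,
    which are the squared singular values. full_matrices=False keeps
    only the first n columns of U, which is all the catalog can span.
    """
    U, s, _ = np.linalg.svd(H, full_matrices=False)
    return U, s**2

def cumulative_energy(eigenvalues):
    """Normalised running sum of the eigenvalues, E(k) for k = 1..n."""
    return np.cumsum(eigenvalues) / np.sum(eigenvalues)

def choose_k(eigenvalues, threshold):
    """Smallest k for which E(k) >= threshold."""
    E = cumulative_energy(eigenvalues)
    idx = int(np.searchsorted(E, threshold))
    return min(idx + 1, E.size)

def reconstruct(h, U, k):
    """Project h onto the first k PCs and rebuild it from them."""
    beta = U[:, :k].T @ h            # projection coefficients
    return U[:, :k] @ beta, beta

# Toy catalog (assumption): Gaussian-windowed sinusoids, not NR waveforms.
t = np.linspace(-1.0, 1.0, 512)
freqs = np.linspace(3.0, 5.0, 8)
H = np.column_stack([np.exp(-4 * t**2) * np.sin(2 * np.pi * f * t)
                     for f in freqs])

U, lam = principal_components(H)
k = choose_k(lam, threshold=0.9)     # threshold value is illustrative only
h_rec, beta = reconstruct(H[:, 0], U, k)
print(k, np.linalg.norm(H[:, 0] - h_rec) / np.linalg.norm(H[:, 0]))
```

Using all $n$ PCs reproduces every catalog waveform exactly (up to rounding), so the truncation to $k$ PCs is the only source of reconstruction error in this sketch.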

Following the seminal work on burst signals in (clark:07; Logue:2012zw), the PCs can be used to identify generic features in a measured waveform through the posterior odds ratio, which is widely used in GW data analysis to compare the probabilities of two competing models $M_i$ and $M_j$. Given data $D$, the odds ratio is the ratio of the posterior probabilities of the two models:

$$O_{ij} = \frac{p(M_i|D)}{p(M_j|D)} = \frac{p(M_i)}{p(M_j)}\,\frac{p(D|M_i)}{p(D|M_j)} \qquad (0.4)$$

where $p(M_i)/p(M_j)$ is the *prior odds ratio*, which reflects any prior preference between the models, and $p(D|M_i)$ is the evidence for model $M_i$. The evidence ratio $B_{ij} = p(D|M_i)/p(D|M_j)$ is referred to as the Bayes factor and reflects the influence of the data. To demonstrate the efficiency of our algorithm, we assume here $p(M_i)/p(M_j) = 1$. In this context, the models are the waveform catalogs and the evidences are obtained by marginalizing over all model parameters, which are the coefficients $\beta_j$ used to construct the signal model in equation 0.2 from the catalog’s PCs. We adopt a uniform prior for each $\beta_j$, in a range obtained by projecting the waveforms from each catalog onto its corresponding PCs. As in (Logue:2012zw), the likelihood and corresponding evidences are computed with a nested sampling algorithm. The model evidence is largest for the most parsimonious model that best explains the data; $O_{ij} > 1$ indicates that $M_i$ is preferred over $M_j$.
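To make the marginalization over PC coefficients concrete, the sketch below evaluates the evidence in closed form for a deliberately simplified setting: white Gaussian noise of known standard deviation, and a uniform prior on each coefficient over the range spanned by the catalog's own projections. This is not the nested sampling calculation in colored noise used in the study. It relies on the PCs being orthonormal, so the likelihood factorizes into one-dimensional Gaussian integrals. The toy catalogs, the noise level and the choice of three PCs are all assumptions made for the example.

```python
import math
import numpy as np

def pcs_and_prior(H, k):
    """First k PCs of catalog H (samples x waveforms) and the uniform-prior
    range of each coefficient, taken from the catalog's own projections."""
    U, _, _ = np.linalg.svd(H, full_matrices=False)
    Uk = U[:, :k]
    coeffs = Uk.T @ H                      # shape (k, n)
    return Uk, coeffs.min(axis=1), coeffs.max(axis=1)

def log_evidence(d, Uk, lo, hi, sigma):
    """Log evidence for d = Uk @ beta + white noise (std sigma), with a
    uniform prior on each beta_j over [lo_j, hi_j]. Exact for this toy model."""
    m = d.size
    b = Uk.T @ d                           # best-fit coefficients
    resid = d @ d - b @ b                  # squared norm left unexplained
    erf = np.vectorize(math.erf)
    s2 = sigma * math.sqrt(2.0)
    frac = 0.5 * (erf((hi - b) / s2) - erf((lo - b) / s2))
    frac = np.maximum(frac, 1e-300)        # guard against log(0) on underflow
    per_coeff = (math.log(sigma * math.sqrt(2.0 * math.pi))
                 + np.log(frac) - np.log(hi - lo))
    return (-0.5 * m * math.log(2.0 * math.pi * sigma**2)
            - resid / (2.0 * sigma**2) + per_coeff.sum())

# Toy catalogs (assumption): two families of Gaussian-windowed sinusoids.
t = np.linspace(-1.0, 1.0, 512)

def catalog(freqs):
    return np.column_stack([np.exp(-4 * t**2) * np.sin(2 * np.pi * f * t)
                            for f in freqs])

H_a = catalog(np.linspace(2.0, 3.0, 6))
H_b = catalog(np.linspace(6.0, 7.0, 6))

sigma = 0.1
rng = np.random.default_rng(0)
d = H_a[:, 2] + sigma * rng.standard_normal(t.size)   # "injection" from A

log_Z = {}
for name, H in (("A", H_a), ("B", H_b)):
    Uk, lo, hi = pcs_and_prior(H, k=3)
    log_Z[name] = log_evidence(d, Uk, lo, hi, sigma)

log_B_ba = log_Z["B"] - log_Z["A"]         # negative when A is preferred
print(log_B_ba)
```

The prior-width term `-log(hi - lo)` is where the preference for parsimonious models enters: each extra coefficient with a wide prior range lowers the evidence unless it buys a matching reduction in the residual.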

## 0.4 Identifying Binary Black Hole Merger Phenomenology

We demonstrate the efficacy of the PCA-based Bayesian model selection with a Monte-Carlo analysis in which simulated GW signals from each catalog are added to colored Gaussian noise, generated as in (Logue:2012zw). For this proof-of-principle study we assume a single aLIGO detector operating at design sensitivity in the “zero-detuned, high-power” configuration (Harry:2010zz). We make the further assumptions that the time of peak amplitude of the signal is known, that the source is optimally oriented and located on the sky with respect to the detector and, finally, that the total mass of the system is fixed to a single value. This choice of mass ensures that the signals “switch on” below the minimum sensitive frequency of the aLIGO noise spectrum (10 Hz). The physical distance of the simulated signal is scaled such that the injections have an SNR of 50. The GW signals from our catalogs are injected into 50 independent noise realizations. Thus, for each waveform we obtain 50 values of the evidence for each of the three catalog models: $p(D|M_{\mathrm{Q}})$, $p(D|M_{\mathrm{HR}})$ and $p(D|M_{\mathrm{RO3}})$.
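The rescaling of an injection to a fixed SNR can be illustrated with the standard optimal matched-filter SNR, $\rho^2 = 4\int |\tilde{h}(f)|^2 / S_n(f)\, df$, evaluated above a low-frequency cutoff. The sketch below is an illustration under stated assumptions: `toy_psd` is a made-up smooth curve, not the aLIGO design noise spectrum, and the sine-Gaussian signal and sample rate are placeholders.

```python
import numpy as np

def optimal_snr(h, dt, psd_fn, f_low=10.0):
    """Optimal matched-filter SNR of time series h for a one-sided PSD,
    integrating only over frequencies at or above f_low."""
    freqs = np.fft.rfftfreq(h.size, d=dt)
    h_tilde = dt * np.fft.rfft(h)          # approximate continuous transform
    df = freqs[1] - freqs[0]
    band = freqs >= f_low
    integrand = np.abs(h_tilde[band])**2 / psd_fn(freqs[band])
    return np.sqrt(4.0 * df * np.sum(integrand))

def scale_to_snr(h, dt, psd_fn, target_snr, f_low=10.0):
    """Rescale h so that its optimal SNR equals target_snr. Because the
    amplitude falls off as 1/distance, this is equivalent to moving the source."""
    return h * (target_snr / optimal_snr(h, dt, psd_fn, f_low))

def toy_psd(f):
    """Hypothetical placeholder PSD shape, NOT the aLIGO design curve."""
    return 1e-46 * (1.0 + (50.0 / f)**4 + (f / 300.0)**2)

# Placeholder signal (assumption): a 60 Hz sine-Gaussian sampled at 4096 Hz.
dt = 1.0 / 4096.0
t = np.arange(0.0, 1.0, dt)
h = 1e-21 * np.exp(-((t - 0.5) / 0.05)**2) * np.sin(2 * np.pi * 60.0 * t)

h_inj = scale_to_snr(h, dt, toy_psd, target_snr=50.0)
print(optimal_snr(h_inj, dt, toy_psd))
```

Since the SNR is linear in the signal amplitude, a single multiplicative factor is enough and no iteration is needed.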

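To demonstrate that model selection can correctly identify the catalog from which a given injection originated, we form, for each injection, the Bayes factors between each of the other catalogs and the true model, i.e. the catalog from which the injection was drawn. For example, if an injection is performed from the HR catalog, we compute the log Bayes factors:

$$\log B_{\mathrm{Q,HR}} = \log p(D|M_{\mathrm{Q}}) - \log p(D|M_{\mathrm{HR}}), \qquad \log B_{\mathrm{RO3,HR}} = \log p(D|M_{\mathrm{RO3}}) - \log p(D|M_{\mathrm{HR}}) \qquad (0.5)$$

If the algorithm correctly discriminates between the waveform catalogs, both $B_{\mathrm{Q,HR}}$ and $B_{\mathrm{RO3,HR}}$ will be less than unity, i.e. both log Bayes factors will be negative.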
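The bookkeeping for this test is simple enough to write down directly. In the sketch below the log-evidence values are invented placeholders, chosen only to exercise the code; they are not results from this study, and the function names are hypothetical.

```python
import numpy as np

def log_bayes_factors(log_Z, true_model):
    """Log Bayes factor of every other catalog against the true one,
    computed per noise realization as a difference of log evidences."""
    ref = np.asarray(log_Z[true_model])
    return {f"{name},{true_model}": np.asarray(z) - ref
            for name, z in log_Z.items() if name != true_model}

def fraction_correct(log_B):
    """Fraction of realizations in which every competing catalog is
    disfavoured, i.e. all log Bayes factors are negative."""
    stacked = np.vstack(list(log_B.values()))
    return float(np.mean(np.all(stacked < 0.0, axis=0)))

# Placeholder log evidences for four noise realizations of one HR injection.
log_Z = {
    "Q":   [-1210.0, -1198.5, -1225.1, -1180.0],
    "HR":  [-1002.3, -1001.1, -1003.8, -1181.0],
    "RO3": [-1050.2, -1047.9, -1061.0, -1179.5],
}

log_B = log_bayes_factors(log_Z, true_model="HR")
print(fraction_correct(log_B))   # 0.75 for these placeholder numbers
```

Working with differences of log evidences avoids forming the evidences themselves, which would underflow for values of this size.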

Figure 0.3 summarizes the distribution of log Bayes factors for HR waveforms. The majority of the boxes lie well below zero, indicating that the algorithm correctly identifies the HR catalog as the most probable for these simulations. Qualitatively similar results are found when analyzing signals from the Q and RO3 catalogs and will be explored more fully in a follow-up publication.