Constraining astrophysical observables of Galaxy and Supermassive Black Hole Binary Mergers using Pulsar Timing Arrays
Abstract
We present an analytic, parametric model to describe the supermassive black hole binary (SMBHB) merger rate in the Universe in terms of astrophysical observables: galaxy stellar mass function, pair fraction, merger timescale and black hole  host galaxy relations. We construct observational priors for each observable and compute the allowed range of the characteristic spectrum of the gravitational wave background (GWB) emitted to be at a frequency of 1/1yr. We then exploit our novel parametrization to tackle the problem of astrophysical inference from Pulsar Timing Array (PTA) observations. We simulate a series of progressively more stringent PTA upper limits, as well as detections and use a nested sampling algorithm to explore the parameter space of the underlying model. Corroborating previous results, we find that the current PTA nondetection does not place significant constraints on any observables; however, either more stringent future upper limits or detections will significantly enhance our knowledge of the SMBHB population and the physics of galaxy mergers. If a GWB is not detected at a nominal level of , our current understanding of the physics driving galaxy and SMBHB mergers is disfavoured at a level, indicating a combination of severe binary stalling, overestimating of the SMBH – host galaxy relations, and extreme dynamical properties of merging SMBHBs. Conversely, future detections will allow to constrain some physical key parameters. An SKAtype array will be able to constrain the normalization of the SMBHB merger rate in the universe, the typical time elapsed between galaxy pairing and SMBHB mergers, the intrinsic normalization of the SMBH – host galaxy relations and the dynamical properties of the merging SMBHBs, including their typical eccentricity and the density of their stellar environment.
keywords:
gravitational waves – black hole physics – pulsars: general – galaxies: formation and evolution – methods: data analysis1 Introduction
It is well established that supermassive black holes (SMBHs) reside at the centre of massive galaxies, and their mass correlates with several properties of the host (see Kormendy & Ho, 2013, and references therein). In the hierarchical clustering scenario of structure formation (White & Rees, 1978), the SMBHs hosted in merging galaxies sink to the centre of the merger remnant because of dynamical friction, eventually forming a bound SMBH binary (SMBHB) at parsec scales (Begelman et al., 1980). The binary subsequently hardens because of (hydro)dynamical interaction with the dense background of stars and gas (see Dotti et al., 2012, for a review), until gravitational wave (GW) emission takes over at subparsec separations, leading to the final coalescence of the system (Peters & Mathews, 1963). SMBHBs are the loudest sources in the Universe of low frequency GWs. Upon coalescence, the frequency emitted by a SMBHB of mass at the last stable orbit is Hz, where M. Therefore, inspiralling SMBHBs emit GWs at subHz frequencies, accessible via precise timing of millisecond pulsars (MSPs). The most massive systems closest to Earth might be powerful enough to be detected as individual deterministic sources at nanoHz frequencies (Sesana et al., 2009; Mingarelli et al., 2017). There are, however, massive galaxies in the Universe. If each of them experienced one or more major merger in its lifetime and if the resulting SMBHB emits GWs at nanoHz frequencies for Myr (Mingarelli et al., 2017), then there are, at any time million SMBHBs emitting GWs in the frequency band probed by pulsar observations, resulting in an unresolved stochastic GW background (GWB, see e.g. Rajagopal & Romani, 1995; Jaffe & Backer, 2003; Sesana et al., 2008; Ravi et al., 2012).
GWs affect the time of arrivals (TOAs) of radio pulses emitted by MSPs. MSPs are the most stable macroscopic clocks in the Universe, some of them ticking with a stability of the order of 100 ns over decades (see Verbiest et al., 2016, for a compilation of the properties of the most stable MSPs). A GW crossing the line of sight to the pulsar modifies the null geodesic along which the photons travel, leaving a distinct fingerprint in their TOAs. As multiple GW sources addup incoherently in a GWB, the oscillatory nature of the GW signal is washed out and the net effect is to produce a red noise in the pulsar TOAs. This makes GWB detection problematic, as there are many other sources of (red) noise in individual MSPs (Cordes, 2013). It has been recognized by Hellings & Downs (1983), that due to the quadrupolar nature of GWs, the red noise produced by a GWB in the TOAs of a pair of MSPs has a specific correlation pattern, that solely depends on the angle dividing the two MSPs in the sky.
This correlation pattern (analogue to the overlap reduction function for GW interferometers) goes under the name of Hellings & Downs curve, is different than other sources of correlated noise (Tiburzi et al., 2016), and is the smoking gun of the presence of a GWB in pulsar timing data. To produce a confident detection of a GWB it is therefore necessary to sample this correlation by timing the largest possible number of pulsars, thus creating a pulsar timing array (PTA Foster & Backer, 1990). Although a GWB has not been detected yet, the three currently leading PTAs – the European Pulsar Timing Array (EPTA Desvignes et al., 2016), the North American Nanohertz Observatory for Gravitational Waves (NANOGrav Arzoumanian et al., 2018), and the Parkes Pulsar Timing Array (PPTA Reardon et al., 2016) – already produced stringent upper limits (Shannon et al., 2015; Lentati et al., 2015; Arzoumanian et al., 2018). The three PTAs work together under the aegis of the International Pulsar Timing Array (IPTA Verbiest et al., 2016), with the goal of building a larger TOA dataset to improve sensitivity. With the contribution of emerging PTAs in India, China and South Africa, a detection is expected within the next decade (Rosado et al., 2015; Taylor et al., 2016b; Kelley et al., 2017b).
Besides detecting the low frequency GWB, the final goal of PTAs is to extract useful astrophysical information from their data to address the ’inverse problem’. Since the GWB shape and normalization depends on the statistical properties of the SMBHB population and on the dynamics of individual binaries in their late inspiral (see Sesana, 2013a, for a general discussion of the relevant processes), stringent upper limits, and eventually a detection, will allow to gain invaluable insights in the underlying relevant physical processes. In fact, the most stringent upper limits to date have been already used to place tentative constrains on the population of SMBHBs (Simon & BurkeSpolaor, 2016; Taylor et al., 2017; Middleton et al., 2018). Generally speaking, the normalization of the GWB depends on the cosmic SMBHB merger rate, and its shape on the typical SMBHB eccentricity and on the effectiveness of energy and angular momentum loss to the dense environment of gas and stars surrounding the binary. Arzoumanian et al. (2016) investigated the implications of the NANOGrav nineyear upper limits on several astrophysical ingredients defining the underlying SMBHB population model. They found that the data prefer low SMBHB merger rate normalization, light SMBHs for a given galaxy mass (i.e. a low normalization of the SMBH – host galaxy scaling relation), eccentric binaries and dense stellar environment. Their analysis should be taken as a proof of concept, since each parameter was investigated separately, keeping all the other fixed. In a subsequent extension of the work, Simon & BurkeSpolaor (2016) have shown that it is possible to use that limit to constrain simultaneously the parameters describing the and the typical SMBHB merger timescale, but still keeping other relevant parameters within a narrow prior range and assuming a GWB characteristic strain, , described by a powerlaw, appropriate for circular, GW driven binaries (thus not considering the detailed SMBHB dynamics). Taylor et al. (2017) focused on the determination of the parameters driving the dynamical evolution of individual binaries, showing with detailed GWB simulations interpolated by means of Gaussian processes, that eccentricity and density of the stellar environment can be constrained for a specific choice of the SMBHB merger rate. Finally, a more sophisticated astrophysical inference investigation was conducted in Arzoumanian et al. (2018), including model selection between different SMBHB population models from the literature, and constrains on the SMBHB eccentricity and environment density for different scaling relations.
In Middleton et al. (2016) we started a longterm project of creating a general framework for astrophysical inference from PTA data. In Chen et al. (2017b) we presented a fast and flexible way to compute the stochastic GWB shape for a general parametrization of the SMBHB merger rate and the relevant properties defining the SMBHB dynamics. We demonstrated the versatility of our model on synthetic simulations in Chen et al. (2017a) and eventually applied it to the most stringent PTA limit to date in Middleton et al. (2018). This latter study, in particular, was instrumental in demonstrating that PTA upper limits are not in tension with our current understanding of the cosmic galaxy and SMBH buildup.
In this paper, we make an important step forward by rewriting our model of the SMBHB merger rate as a parametric function of astrophysical observables. In fact, as shown in Sesana (2013b), the SMBHB merger rate can be derived from the galaxy stellar mass function (GSMF), the galaxy pair fraction, the SMBHB merger timescale and the scaling relation connecting SMBHs and their hosts. By expressing the SMBHB merger rate as a function of simple analytical parametrizations of these ingredients – constrained by independent observations –, we build a GWB model that allows to use PTA observations to constrain a number of extremely relevant astrophysical observables.
The paper is organized as follows. Section 2 summarizes the model to compute the characteristic strain of the GWB and highlights the changes introduced in this paper. Section 3 derives the parametric formulation of the SMBHB merger rate as a function of all the relevant observational parameters describing the properties of merging galaxies and their SMBHs. In section 4 we briefly described how the PTA signal is constructed, the simulation setup of the different investigated PTAs, and the Bayesian method used in the analysis. Observationally motivated prior distributions for all model parameters are given in section 5. Detailed results are presented and discussed in section 6 and in section 7 we summarize our main findings and outline future research directions.
Unless stated otherwise, we use the standard Lambda CDM as our cosmology with the Hubble parameter and constant km Mpcs and energy density ratios , and .
2 GWB strain model
Deviations from an unperturbed spacetime arising from an incoherent superposition of GW sources (i.e. a stochastic GWB) are costumarily described in terms of characteristic strain , which represents the amplitude of the perturbation per unit logarithmic frequency interval. We compute following Chen et al. (2017b) (paper I hereafter). The model allows for the quick calculation of given the chirp mass , redshift and eccentricity at decoupling of any individual binary. The total strain of the GWB can then be computed by integrating over the population , giving the main equation of paper I:
(1) 
where is the strain of a reference binary with chirp mass , redshift and eccentricity and is the peak frequency of the spectrum (see equation 13 in paper I and relative discussion therein). The main concept of equation (1) is to use the selfsimilarity of the characteristic strain of a purely GW emission driven binary to shift the reference spectrum to match the emission of a binary with arbitrary parameters.
As in paper I, we assume that the evolution of the binary is driven by hardening in a stellar environment before GW emission takes over at a transition frequency given by (equation 21 in paper I):
(2) 
where
(3) 
(Peters & Mathews, 1963), is the density of the stellar environment at the influence radius of the SMBHB, is the velocity dispersion of stars in the galaxy and is the rescaled chirp mass. The stellar density is described by a Dehnen profile (Dehnen, 1993) with a fiducial profile parameter . We take the velocity dispersion to be a constant (see paper I).
Finally, the spectrum described by equation (1) is corrected by including an a high frequency drop related to an upper mass limit calculated, at each frequency, via (equation 39 paper I)
(4) 
This upper mass limit takes into account that, particularly at high frequencies, there is less than 1 binary above contributing to the signal within a frequency bin . Statistically, this means that in a given realization of the universe, there will be either one or zero loud sources contributing to the signal. In the case the source is present, it can be removed from the GWB computation since it will be likely resolvable as an individual deterministic GW source (see discussion in Sesana et al., 2008).
In paper I, we used a phenomenological parametric function to describe the SMBHB merger rate , and introduced an extra parameter to allow for eccentric binaries at . The quantity , however, cannot be directly measured from observations. It can be either computed theoretically from galaxy and SMBH formation and evolution models (e.g. Sesana et al., 2008; Ravi et al., 2012; Kelley et al., 2017b) or it can be indirectly inferred from observations of other astrophysical quantities, such as the galaxy mass function, pair fraction, typical merger timescales, and the SMBH – host galaxy relation. Parametrizing the SMBHB merger rate as a function of astrophysical observables would therefore allow to reverse engineer the outcome of current and future PTA observations to obtain useful constrains on those observables. With this goal in mind, in this paper we expand the model from paper I in two ways:

we multiply the fiducial density profile by an extra parameter to allow for variations in density of the stellar environment;

we cast the phenomenological SMBHB merger rate in terms of astrophysical observables, such as galaxy mass function and pair fraction, galaxy  black hole relations, etc., as we detail next in Section 3.
3 Parametric model of the SMBHB merger rate
As detailed in Sesana (2013a) and Sesana et al. (2016), the differential galaxy merger rate per unit redshift, mass and mass ratio, can be written as
(5) 
where is the redshift dependent galaxy stellar mass function (GSMF), is the differential pair fraction with respect to the mass ratio (see equation (12) below) and is the merger timescale. is the mass of the primary galaxy, is the redshift of the galaxy pair and is the mass ratio between the two galaxies. It is important to note that a pair of galaxies at redshift will merge at redshift . The timescale is used to convert the pair fraction of galaxies at into the galaxy merger rate at (Mundy et al., 2017). The merger redshift is obtained by solving for the implicit equation
(6) 
where, assuming a flat Lambda CDM model,
(7) 
The galaxy stellar mass function can be written as a single Schechter function (Conselice et al., 2016)
(8) 
where are phenomenological functions of redshift of the form (Mortlock et al., 2015):
(9)  
(10)  
(11) 
The 5 parameters are sufficient to fit the original Schechter functions at any redshift; an example is shown in figure 1. To simplify the notation, in the following will implicitly denote their corresponding redshift dependent functions .
The differential pair fraction as a function of is given by
(12) 
where is an arbitrary reference mass. Note that, in the literature, pair fractions are usually given as a function of primary galaxy mass and redshift only (e.g. Mundy et al., 2017), such that
(13) 
i.e. integrated over the mass ratio of the pairs. The integral of equation (12) over gives
(14) 
which becomes equivalent to equation (13) by setting
(15) 
Equation (15) allows to map an observational prior of the form of equation (13) into the four parameters of our model .
We use an analogue parametrization for the merger timescale:
(16) 
where , and are four further model parameters. Equation (16) has originally been derived to describe the galaxy merger timescale (Snyder et al., 2017). A further delay is, however, expected between the galaxy merger and the SMBHB final coalescence. In fact, after dynamical friction has merged the two galaxies and has brought the two SMBHs in the nuclear region, the newly formed SMBHB has to harden via energy and angular momentum losses mediated by either stars or gas, before GW emission eventually takes over (see Dotti et al., 2012, for a review). Depending on the details of the environment, this process can take up to several Gyrs, and even cause the binary to stall (Sesana & Khan, 2015; Vasiliev et al., 2015; Kelley et al., 2017a). For simplicity, we assume here that this further delay can be reabsorbed in equation (16), which we then use to describe the time elapsed between the observed galaxy pair and the final SMBHB coalescence.
Substituting equations (8), (12) and (16) into (5) gives
(17) 
where the effective parameters are
(18) 
Equation (17) is still a function of the merging galaxy stellar masses, which needs to be translated into SMBH masses. The total mass of a galaxy can be converted into its bulge mass , using assumptions on the ellipticity of the galaxy: more massive galaxies are typically elliptical and have higher bulge to total stellar mass ratio. We use a phenomenological fitting function (Bernardi et al., 2014; Sesana et al., 2016) to link the bulge mass to the total stellar mass of a galaxy:
(19) 
Note that this fit is appropriate for ellipticals and spheroidals, whereas spiral galaxies usually have smaller bulge to total mass ratio. In Sesana (2013a) different scaling relations were used for blue and red galaxy pairs (under the assumption that blue pairs are predominantly spirals and red pairs predominantly elliptical). The result was that the GW signal is completely dominated by red pairs. We have checked on Sesana (2013a) data that approximating all galaxies as spheroidals affects the overall signal by less than dex. We therefore apply equation (30) to all galaxies, independent on their colour or morphology.
We can then apply a scaling relation between the galaxy bulge mass and black hole mass of the form (see, e.g., Kormendy & Ho, 2013)
(20) 
where is a log normal distribution with mean value and standard deviation , to translate galaxy mass into black hole mass . Note that the galaxy mass ratio is in general different from the black hole mass ratio . Finally, the galaxy merger rate (17) can be converted into the SMBHB merger rate :
(21) 
Equation (20) adds three further parameters to the model: . Lastly, it is convenient to map into the SMBHB chirp mass , by performing a variable change and integrate over the black hole mass ratio to produce a SMBHB merger rate as a function of chirp mass and redshift only:
(22) 
Summarizing, the SMBHB merger rate is described as a function of 16 empirical parameters that are related to astrophysical observable: for the GSMF, for the pair fraction, for the merger timescale, and for the galaxy – SMBH scaling relation. Further, the first three sets of parameters can be grouped into the four effective parameters given by equation (3). Te two extra parameters enter the computation of the shape of the GW spectrum via the transition frequency given in equation (2). We can therefore express the stochastic GWB in equation (1) as a function of 18 phenomenological parameters, listed in table 1.
4 GWB simulations and analysis
As in Chen et al. (2017a) (paper II hereafter), we compute the signaltonoiseratio S/N of a detection of a GWB in the frequency domain as (Moore et al., 2015; Rosado et al., 2015):
(23) 
where are the HellingsDowns coefficients (Hellings & Downs, 1983):
(24) 
where and is the relative angle between pulsars and . and in equation (23) are spectral densities of the signal and noise respectively, and includes the ’self noise’ contribution of the pulsar term (see equation 11 in paper II for details).
We can simplify equation (23) by assuming that all pulsars are identical (except for their position in the sky), i.e. all pulsars have the same properties: rms , observation time and observation cadence . Furthermore, we also assume that there is a sufficient number of pulsars , uniformly distributed in the sky, so that each individual coefficient can be replaced by the rms computed across the sky , and the double sum over all pairs of pulsars becomes . For an observation time the spectrum of the GWB is resolved in Fourier frequency resolution bins of equal width , centred at . The total S/N in equation (23) can thus be split into frequency bin components
(25) 
In the strong signal regime () equation (25) can further be reduced to the approximate total S/N of a strong detection in frequency bins
(26) 
where we used the fact that and . Equation (26) is a drastic simplification, still it provides the relevant scaling between , number of pulsars in the array, and frequency range in which the signal is resolved. For , to achieve a S/N 5 in the lowest few frequency bins, an array of about 20 equally good pulsars is needed.
PTA data are simulated as in paper II. For a signal with amplitude in the th frequency bin, the detection S/N is related to the detection uncertainty via (see equation 18 in paper II)
(27) 
4.1 Simulated datasets
Besides adding a future and an ideal upper limit, we use the same simulation setup as in paper II, with the simplifying assumptions that all pulsars are observed with the same cadence for the same duration of and have the same rms of . These assumptions only affect the S/N of the detection, thus setting the error bars . This is purely a choice of convenience that does not affect the general validity of our results. We expand upon the 4 cases from paper II by adding 2 more upper limit cases to get a total of 6 fiducial cases (3 upper limits and 3 detections):

case PPTA15: we use the upper limit curve of the most recent PPTA analysis, as given by Shannon et al. (2015), which is representative of current PTA capabilities and results in a GWB upper limit of ;^{1}^{1}1 represents the amplitude of the GWB at a reference frequency of yr under the assumption that its spectrum is described by a single power law with , appropriate for circular, GWdriven binaries.

case PPTA16: we shift the PPTA15 curve down by one order of magnitude, which is representative of an upper limit of , reachable in the SKA era;

case PPTA17: we shift the PPTA15 curve down by two orders of magnitude, which is representative of an upper limit of . Although such sensitivity cannot be achieved in the foreseeable future, we use it to infer what conclusions can be drawn by a nondetection at a level well below currently predicted GWB values;

case IPTA30: , ns, yr, week. This PTA results in a detection S/N and is based on a future extrapolation of the current IPTA, without the addition of new telescopes;

case SKA20: , ns, yr, week. This PTA results in a high significance detection with S/N, which will be technically possible in the SKA era;

case ideal: , ns, yr, week. This is beyond SKA capabilities but provides useful insights of what might be achievable in principle.
As stated above, for each simulations we compute the S/N at each frequency bin. If , we then assume an observed signal with amplitude and error described by a lognormal distribution with width given by equation (27). Unlike paper II, if , we do not place an upper limit at that frequency. This is to save on computational costs, having verified that including the upper limit does not affect the resulting posterior distributions. Examples of signal generation are shown in figure 2 for spectra with at 1/1yr in the IPTA30 case. This setup results in an initial detection with low S/N, .
4.2 Data analysis method
We apply Bayes’ theorem to perform inference on our model , given some data and a set of parameters :
(28) 
where is the posterior distribution coming from the analysis of the PTA measurement, is the prior distribution and accounts for any beliefs on the constraints of the model parameters (prior to the PTA measurement), is the likelihood of producing the data for a given model and parameter set, and is the evidence, which is a measure of how likely the model is to produce the data.
To simulate detections we apply the likelihood from paper II
(29) 
to each frequency bin for which , and then sum over the frequency bins to obtain the total likelihood. For the upper limit analyses, we use the directly derived likelihood from the PPTA upper limit, as described in Appendix A.3 of Middleton et al. (2018).
Prior distributions are taken from independent theoretical and observational constrains, as described in Section 5. The parameter space is sampled using cpnest (Del Pozzo & Veitch, 2015), which is a parallel implementation of the nested sampling algorithm in the spirit of Veitch et al. (2015) and Skilling (2004). Nested sampling algorithms do not only provide posterior distributions, but also the total evidence. This allows us to compute Bayes factors for model comparisons. Each simulation has been run with 1000 livepoints, producing independent posterior samples.
5 Defining the prior ranges of the model parameters
There is a vast literature dedicated to the measurement of the GSMF, galaxy pair fraction, merger timescale and SMBH – host galaxy scaling relations. We now described how independent observational and theoretical work translates into constrained prior distributions of the 18 parameters of our model. A summary of all the prior ranges is given in table 1.
5.1 Galaxy stellar mass function
At any given redshift, the GSMF is usually described as a Schechter function with three parameters . The parameters, however, are independently determined at any redshift. Depending on the number of redshift bins to be considered in the computation, this can easily lead to a very large number of parameters . To reduce the dimensionality of the problem from to five, we note that the parameters show clear linear trends with redshift, whilst is fairly constant (see Mortlock et al. (2015)). This allows for a reparametrisation as a function of the 5 parameters performed in Section 3.
A comprehensive list of published values for the parameters for various redshift bins can be found in Conselice et al. (2016), which forms the basis of our prior distribution. We compute between and for all sets of , dividing the sample in two redshift bins: and . This gives a range of values for , shown in figure 3. We then take uniform distributions of , compute the for each sample and redshift bins and compare them against the allowed range. If the value is within the range, the sample is accepted, otherwise it is rejected. The resulting prior distributions are shown in figure 4.
5.2 Pair fraction
Constraints on the pair fraction have been derived by counting the numbers of paired and merged galaxies in various surveys with a number of different photometric and spectroscopic techniques (see, e.g., Conselice et al., 2003; Xu et al., 2012; Robotham et al., 2014; Keenan et al., 2014). Recently, Mundy et al. (2017) have combined data from several surveys to produce an overall uptodate constrain. We base our prior range on the results reported in table 3 of their paper, for the All+GAMA+D17 survey combination and galaxy separation of kpc:
(30) 
This is one of the flatter redshift dependences within the Mundy et al. (2017) compilation. It is, however, likely the more accurate measurement, coming from a combination of deep surveys. Moreover, while stronger redshift dependences are common for Milky Waysize galaxies, most measurements of galaxies with have a relatively flat redshift dependence. Most of the GWB will come from SMBHBs hosted in those massive galaxies, this justifies our choice. Noting that both sets of parameters and uncertainties in equation (30) are similar, we use flat priors for and for all galaxy masses. Steeper redshift dependences are allowed in our set of ’extended prior’, introduced in section 5.6 below. Mundy et al. (2017) also find no significant dependency on galaxy mass, thus we pick , adding the possibility of a mild deviation by imposing a flat prior .
5.3 Merger timescale
In this paper, we define the merger timescale, as the time elapsed between the observation of a galaxy pair at a given projected separation (usually 20 or 30 kpc) and the final coalescence of the SMBHB. Galaxy merger timescales have been computed both for simulations of isolated galaxy mergers (Lotz et al., 2011) and from ensemble of halos and galaxies extracted from large cosmological simulations (Kitzbichler & White, 2008; Snyder et al., 2017), resulting in a large dynamical range. We therefore choose the parametrisation given by equation (16), which is sufficiently generic to cover the whole range of possible effects influencing the total merger time. To include further delays of the SMBHB due to the hardening phase, we choose wide uniform prior ranges Gyr and . The mass dependencies are generally found to be milder, playing a minor role. We therefore choose flat prior ranges .
5.4 relation
Since SMBHs are thought to have an important impact on the formation and evolution of their host galaxy and vice versa, the relation between their mass and several properties of the host galaxy has been studied and constrained by a number of authors (see Kormendy & Ho, 2013, for a comprehensive review). Here we use the tight relation between the SMBH mass and the stellar mass of the spheroidal component (i.e. the bulge) of the host galaxy, which has been described as a powerlaw of the form of equation (20) with some intrinsic scattering. Although nonlinear functions have been proposed in the literature (see, e.g., Graham & Scott, 2012; Shankar et al., 2016), the nonlinearity is mostly introduced to describe the (observationally very uncertain) low mass end of the relation. Since the vast majority of the GWB is produced by SMBH with masses above (Sesana et al., 2008), we do not consider here those alternative parametrizations.
To construct the prior distributions, we apply the same method as in Section 5.1. We define the allow region of the relation as the one enclosed within a compilation of relations collected from the literature in Middleton et al. (2018). We then draw relations from a uniform distribution of and and accept them if they fall within the region allowed by observations. Additionally, we assume a flat distribution for the scattering . Figure 5 shows the obtained prior distributions for .
5.5 Eccentricity and stellar density
The last two parameters deal with the properties of the individual binary. As the eccentricity at decoupling is not well constrained (see, e.g. Sesana & Khan, 2015; Mirza et al., 2017), we choose an uninformative flat prior . The other additional parameter describes the stellar density around the SMBHB (see section 2). is a multiplicative factor added to the density at the SMBHB influence radius, , calculated by using the fiducial Dehnen profile defined in paper I. This has an impact on the frequency of decoupling, as a higher density of stars in the galactic centre means more efficient scattering. The SMBHB thus experiences a faster evolution, reaching a higher before transitioning to the efficient GW emission stage. We choose to include densities that are between 0.01 and 100 times the fiducial value, aiming at covering the large variation of stellar densities observed in cusped vs cored galaxies (Kormendy et al., 2009). This translates into a flat prior .
parameter  description  standard  extended 

GSMF norm  
GSMF norm redshift evolution  
GSMF scaling mass  
GSMF mass slope  
GSMF mass slope redshift evolution  
pair fraction norm  [0.02,0.03]  [0.01,0.05]  
pair fraction mass slope  [0.2,0.2]  [0.5,0.5]  
pair fraction redshift slope  [0.6,1]  [0,2]  
pair fraction mass ratio slope  [0.2,0.2]  [0.2,0.2]  
merger time norm  [0.1,2]  [0.1,10]  
merger time mass slope  [0.2,0.2]  [0.5,0.5]  
merger time redshift slope  [2,1]  [3,1]  
merger time mass ratio slope  [0.2,0.2]  [0.2,0.2]  
relation norm  
relation slope  
relation scatter  [0.3,0.5]  [0.2,0.5]  
binary eccentricity  [0.01,0.99]  [0.01,0.99]  
stellar density factor  [2,2]  [2,2] 
5.6 Extended prior ranges
Unless otherwise stated, the prior ranges just described are used in our analysis. However, we also consider ’extended’ prior ranges for some of the parameters. Although observational determination of the galaxy mass function is fairly solid, identifying and counting galaxy pairs in large galaxy surveys is a delicate endeavour, especially beyond the local universe. We therefore also consider extended prior ranges , and , allowing for more flexibility in the overall normalization, redshift and mass evolution of the galaxy pair fraction. Likewise, SMBHB merger timescales are poorly constrained. The prior range adopted in Section 5.3 is rather wide, but notably does not allow for stalling of low redshift binaries (the maximum allowed merger timescale being 2 Gyrs). Also in this case we consider extended prior ranges Gyr, and , allowing the possibility of SMBHB stalling at any redshift. Finally we also consider a wider prior on the scatter of the relation , mostly because several authors find , which is at the edge of our standard prior. All standard and extended priors are listed in table 1.
6 Results and discussion
Having defined the mathematical form of the signal, the prior ranges of all the model parameters, the simulated data and the form of the likelihood function, we performed our analysis on the six limits and detections described in Section 4.1. In this section, we present the results of our simulations and discuss their astrophysical consequences in detail. We first present the implications of current and future upper limits and then move onto discussing the different cases of detection. Note that, although all 18 parameters are left free to vary within their respective priors, we will present posteriors only for the subset of parameters that can be significantly constrained via PTA observations. Those are the overall normalization of the merger rate , the parameters defining the merger timescale , the parameters defining the relation , the eccentricity at the transition frequency , and the normalization of the stellar density . Because the large number of parameters and the limited information enclosed in the GWB shape and normalization, other parameters are generally unconstrained. Corner plots including all 18 parameters for all the simulated upper limits and detections are presented in Appendix A. All runs are performed using the standard prior distributions derived in Section 5, unless stated otherwise.
6.1 Predicted GWB Strain
A direct product of combining the GWB model described in Section 2 to the astrophysical priors presented in Section 5 is a robust update to the expected shape and normalization of the signal. Thus, before proceeding with the analysis of our PTA simulations, we present this result. In figure 6 the predicted strain of the GWB using our standard prior is compared to the ALL model from Middleton et al. (2018). The shapes and normalization of the two predictions, shown in the top panel, are fairly consistent. At yr our model predicts at 90% confidence, which is slightly more restrictive than the ALL model. This has to be expected since model ALL from Middleton et al. (2018) is constructed following the method of Sesana (2013a). The latter, in fact, gave equal credit to all measurements of the galaxy mass function, pair fractions and SMBH – galaxy scaling relations, without considering any possible correlation between their underlying parameters. Our detailed selection of the prior range takes correlations between different parameters into account (see e.g. figure 4) and is likely more restrictive in terms of galaxy pair fraction.
The bottom panel of figure 6 shows the predicted range assuming circular, GW driven binaries and no high frequency drop, hence producing the standard spectral shape. In this simplified case yr is a factor of higher, spanning from to . Still, most of the predicted range lies below current PTA upper limits, as well as being consistent with other recent theoretical calculations (Dvorkin & Barausse, 2017; Kelley et al., 2017b; Bonetti et al., 2018).
6.2 Upper limits
6.2.1 Current Upper limit at
Firstly, we discuss the implication of current PTA upper limits. Here, we use the PPTA upper limit, nominally quoted as , which represents the integrated constraining power over the entire frequency range assuming a powerlaw. As it has been recently pointed out by Arzoumanian et al. (2018), the sensitivity of PTAs has become comparable to the uncertainty in the determination of the solar system ephemeris SSE – the knowledge of which is required to refer pulse time of arrivals collected at the telescopes to the solar system baricenter. Thus, it has become necessary to include an extra parametrized model of the SSE into the GWB search analysis pipelines. This leads to a more robust albeit higher upper limit, as part of the constraining power is absorbed into the uncertainty of the SSE. A robust upper limit including this effect has recently been placed by the NANOGrav Collaboration at , which is higher but of the same order as the PPTA upper limit. We therefore consider the PPTA upper limit in this analysis, with the understanding that the recent NANOGrav upper limit would lead to very similar implications.
Figure 7 shows that upper limits add very little knowledge to our understanding of the SMBHB population as constrained by the priors on our model parameters. This is in agreement with there being no tension between the current PTA nondetection of the GWB and other astrophysical observations, as extensively discussed in (Middleton et al., 2018). The range of characteristic strain of the GWB predicted by the prior ranges of our model , shown in the upper left plot of figure 7, is only mildly reduced by current PTA observations. Therefore, PTAs are starting to probe the interesting, astrophysical region of the parameter space, without yet being able to rule out significant areas, as can be seen in the posterior distribution of the model parameters shown at the bottom of figure 7. This results into a logarithmic Bayesian evidence . The evidence is normalized so that an ideal reference model that is unaffected by the measurement has . The log evidence can therefore be directly interpreted as the Bayes factor against such a model. In this specific case, we find , indicating that current upper limits do not significantly disfavour the prior range of our astrophysical model. This can also be seen in the bottom row posteriors of figure 7 where the posterior and prior distributions are almost identical, e.g. the effective merger rate (top left histogram) has an upper limit of for the posterior(prior) respectively.
6.2.2 Future Upper limit at
To investigate what useful information on astrophysical observables can be extracted by future improvements of the PTA sensitivity, we have shifted the upper limit down by an order of magnitude to , indicative of the possible capabilities in the SKA era (Janssen et al., 2015).
Results are shown in figure 8. Unlike the current situation, a future upper limit can put significant constraints on the allowed parameter space, also reflected in value of the Bayesian evidence . The odds ratio compared to a reference model untouched by the limit is now , indicating that our astrophysical prior would be disfavoured a level. The Bayes factor provides evidence that there is tension between current constraints on astrophysical observables (defining our prior) and a PTA upper limit of on the GWB level. The top left panel of figure 8 shows that is relegated at the bottom of the allowed prior range, and the top right panel indicates that a low normalization to the relation is preferred. The bottom row posteriors in figure 8 show significant updates with respect to their prior distributions. A more restrictive upper limit on the effective merger rate (top left histogram) at can be placed and the distribution of all parameters defining the merger timescale are skewed towards high values, meaning that longer merger timescales, i.e. fewer mergers within the Hubble time, are preferred. Besides favouring lower merger rates, light SMBH are also required, as shown by the posterior of the parameter. Lastly, there is a slight preference for SMBHBs to be very eccentric and in dense stellar environments, although the whole prior range of these parameters is still possible.
6.2.3 Ideal Upper limit at
Pushing the exercise to the extreme, we shift the future upper limit down by another order of magnitude to , which might be reached in the far future by a postSKA facility (Janssen et al., 2015). Nonetheless, this unveils what would be the consequences of a severe non detection, well below the level predicted by current SMBHB population models. Figure 9 compares inference on model parameters for the PPTA17 run, assuming either standard or extended prior distributions.
If we assume standard priors, constraints are pushed to the extreme compared to those derived in the PPTA16 case. The Bayesian evidence is now . The odds ratio compared to a reference model untouched by the limit becomes , indicating that our astrophysical prior would be severely disfavoured at a level. This would rule out the vast majority of our current constraints on the GSMF, pair fraction, merger timescale and relation. Although the effective merger rate is only limited to be smaller than , all other parameters in the bottom row corner plots in figure 9 show rather extreme posterior distributions. Since our standard prior does not allow stalling of low redshift SMBHBs (the maximum normalization of the local merger timescale being 2 Gyrs), skewing the merger timescale to extreme values is not sufficient to explain the non detection. Further, the normalization to the is severely pushed to the low end, at , thus completely ruling out several currently popular relations (e.g., Kormendy & Ho, 2013; McConnell & Ma, 2013). Even with the smallest possible , a non detection at requires a very high frequency turnover of the GWB (see upper left panel of figure 9), which can be realized only if all binaries have eccentricity and reside in extremely dense environments (at least a factor of 10 larger than our fiducial Dehnen profile).
As mentioned above, our standard prior on the total merger timescale (see Section 5.3), implies that stalling hardly occurs in nature. Although this is backed up by recent progresses in Nbody simulations and the theory of SMBHB hardening in stellar environments (see, e.g., Sesana & Khan, 2015; Vasiliev et al., 2015), we want to keep all possibilities open and check what happens when arbitrary long merger timescales, and thus stalling, are allowed. We note, however, that such a model is intrinsically inconsistent, because when very long merger timescales are allowed, one should also consider the probable formation of SMBH triplets, due to subsequent galaxy mergers. Triple interactions are not included in our models but they have been shown (Bonetti et al., 2018; Ryu et al., 2018) to drive about 1/3 to the stalled SMBHBs to coalescence in less than 1 Gyr. Therefore, we caution that actual constrains on model parameters would likely be more stringent than what described in the following. The extended prior distributions relaxes the strong evidence of to and the Bayes factor becomes comparable to the PPTA16, this is mainly due to allowing binaries to stall as the merger timescale increases to Gyr. The extreme constraints on the other parameters are consequently loosened, although posterior distributions of indicate that light SMBHBs are favoured, along with large eccentricities and dense environments. The stalling of a substantial fraction of SMBHB pushes the effective merger rate to drop below .
Table 2 summarizes the increasing constraining power as the upper limits are lowered. As they become more restrictive, fewer mergers are allowed. The effective merger rate is therefore pushed to be as low as possible with long merger timescales, low SMBHB masses, large eccentricities and dense environments. Bayes factors comparing the current observational constraints, i.e. the prior ranges, with posterior constraints can be calculated from the evidences. These, however, show that the tension increases from with the current upper limit of to with an ideal upper limit at . Relaxing the upper bound on the merger time norm and other constraints (see Section 5.6) can alleviate the tension between current observations and such a upper limit to (although this does not take into account for tripleinduced mergers, as mentioned above).
parameter  

standard prior: no upper limit  0  
standard prior:  0.55  
standard prior:  4.32  
standard prior:  13.69  
extended prior:  4.56 
6.3 Simulated detections
Although it is useful to explore the implication of PTA upper limits, it is more interesting to consider the case of a future detection, which is expected within the next decade (Rosado et al., 2015; Taylor et al., 2016b; Kelley et al., 2017b). We therefore turn our attention at simulated detections and their potential to put further constraints on the astrophysics of galaxy evolution and SMBHB mergers. The amplitude of the simulated GWB is defined by the 16 parameters describing the SMBHB merger rate. We fix those as follows: (2.6, 0.45, , 1.15, 0.1, 0.025, 0.1, 0.8, 0.1, 0.8, 0.1, 2, 0.1, , 1, 0.3). The low frequency turnover is defined by the two extra parameters . We fix and we produce two GWB spectra distinguished solely by the assumed value of the eccentricity: (circular case) and (eccentric case). This set of parameters is chosen such that it results in a GWB strain of at 1/1yr (i.e. well within current upper limits), whilst being consistent with the current constraints of all the relevant astrophysical observables:

GSMF: the values for are chosen, such that they accurately reproduce the currently best measured GSMF, i.e., they are close to the best fit values of the reparametrisation described in Section 5.1;

relation: have been chosen to produce the injected characteristic strain amplitude, consistent with the allowed prior shown in figure 5.
The other parameters are chosen to be close to the centre of their prior ranges, except for the eccentricity, as mentioned above.
6.3.1 Circular case
Figure 10 shows a comparison of the results of the IPTA30 (left column) and SKA20 (right column) setups for the circular case (). In the IPTA30(SKA20) case the GWB has been detected in 10(14) frequency bins up to frequencies of Hz, for a total detection S/N 7(35). Qualitatively, both detections provide some extra constraints on selected prior parameters. The injected spectrum, mass and redshift function are recovered increasingly better as the S/N increases. Still, a broad portion of the initial parameter space is allowed, especially for the redshift evolution of the SMBHB merger rate. It should be noted that PTAs have the most constraining power around the bend of the mass function, at the SMBHB chirp mass . The posterior panels at the bottom of figure 10 show that there is not much additional information gained compared to the prior knowledge for most of the parameters (full corner plots shown in Appendix A), with three notable exceptions:

merger timescale. is marginally constrained around the injected value (0.8 Gyrs) in the IPTA30 case, the constraint becomes better in the SKA20 case. is also skewed towards low values (consistent with the injection). A clean PTA detection thus potentially allow to constrain the timescale of SMBHB coalescence, which can help in understanding the processes driving the merger;

relation. The panels show a tightening of the distribution with increasing S/N. A detection would thus also allow to constrain the relation;

eccentricity and stellar density. The posterior distributions for and show some marginal update. In particular in the SKA20 case, extreme eccentricities, above can be safely ruled out. Note that the absence of a low frequency turnover also favours small value of , fully consistent with the injected value .
6.3.2 Eccentric case
The results for the IPTA30 and SKA20 eccentric cases are shown in figure 11, with full corner plots reported in Appendix A. In general results are comparable to the circular case shown above, as the only difference is in the injected eccentricity parameter. The left column (IPTA30 case) of figure 11 shows nearly identical posterior distributions to its circular counterpart reported in figure 10, this also translates into similar recovered spectrum, mass and redshift functions.
However, in the SKA20 case, the detection S/N is high enough to allow a clear detection of the spectrum turnover in the lowest frequency bins. Which is not the case for IPTA30, as can be seen in the top row spectra plots of figure 11. This has important consequences for astrophysical inference since an observable turnover is only possible if binaries are significantly eccentric and/or evolve in very dense environments. This is shown in the and posterior distributions at the bottom right of figure 11: eccentricities are excluded and densities higher than what predicted by the fiducial Dehnen model are strongly favoured. The full corner plot 22 reported in Appendix A also highlights the degeneracy, as a low frequency turnover can be caused by either parameters; very eccentric binaries in low density stellar environments pruduce a turnover at the same frequency as more circular binaries in denser stellar environments. Additionally, a large region in the plane has been ruled out( and ). This also prompts some extra constrain in the relation, as can be seen in the trends in the and distributions.
Summarizing, little extra astrophysical information (besides the nontrivial confirmation that SMBHBs actually do merge) can be extracted in the IPTA30, whereas many more interesting constrains emerge as more details of the GWB spectrum are unveiled in the SKA20 case. Although posteriors on most of the parameters remain broad, the typical SMBHB coalescence timescale can be constrained around the injected value; the posterior distributions of and are tightened, providing some extra information on the SMBHB merger rate and on the scaling relation; significant constrains onto the SMBHB eccentricity and immediate environment can be placed if a low frequency turnover is detected.
6.3.3 Ideal case
We show ideal detections for both the circular and eccentric cases in figure 12. Although, such detection may not be achievable by PTAs in the foreseeable future, these results show what might be constrained in principle by combining astrophysical prior knowledge to precise measurements of the amplitude and shape of the nanoHz stochastic GWB.
The spectra, mass and redshift functions (not shown in the figure) are recovered extremely well in both cases. Both corner plots also show interesting constrains on some key parameters. The typical merger timescale is correctly measured and constrained within less than 1 Gyr uncertainty, and clear trends in and provide some extra information on the merger timescale evolution with galaxy mass and redshift. Note that those are parameters defining the SMBHB coalescence time which are unlikely to be measured by any other means. The normalization of the relation is also significantly constrained, as shown by the tight posterior distributions. Again we see both in the circular and eccentric cases the degeneracy between eccentricity and stellar density , as in the SKA20 eccentric case above. The posterior regions contain the injected values and exclude a large area from the prior: , for the circular and , for the eccentric case (95th percentile). Although, the ideal eccentric detection has a vastly larger S/N than its SKA20 analogue, the constraints on and are comparable due to the degeneracy between the two parameters. Table 3 shows the increasing constraining power on selected key parameters as the detection S/N improves for the eccentric case.
7 Conclusions and outlook
parameter  

prior  
IPTA30  
SKA20  
ideal  
injection 
We have presented an analytic parametrized model for the SMBHB merger rate in terms of astrophysical observables, including: galaxy stellar mass function, pair fraction, merger timescale and black hole  host galaxy relations. We described each individual ingredient with a simple analytic function and exploited our state of the art knowledge from observations, theory and simulations to define the prior range of each free parameter in the model. We then sampled the allowed parameter space (18 parameters in total) to produce an updated measure of the expected amplitude of the stochastic gravitational wave background across the frequency range. At our model with the prior selection from Section 5 results in a characteristic strain , confirming recent findings (e.g., Middleton et al., 2018). We used our model to interpret current and future pulsar timing array upper limits and detections, linking the outcome of PTA observations to constraints on interesting observed quantities describing the cosmic population of merging galaxies and SMBHs.
Consistent with our previous results (Middleton et al., 2018), we find that current PTA upper limits can only add very little to the prior knowledge of the physical parameters as determined by current observations and simulations. However, as the sensitivity of PTA improves over time, upper limits can become stringent enough to probe interesting regions of the prior parameter space. The more stringent the upper limit becomes, the more extreme the conditions for the SMBHB population must be. Longer merger time (maybe even stalling) of binaries, less massive black holes and a spectral turnover at nanoHz, all contribute to reduce the characteristic strain of the GWB in the PTA observable band. A upper limit at indicates moderate tension (at a nominal 2.5 level) between PTA observations and current astrophysical constraints. Pushing it down to would imply a strong tension with our current knowledge of the process of SMBHB formation and dynamics. Explaining a GWB below this level requires invoking a combination of SMBHB stalling, overestimate of the SMBH – host galaxy scaling relations, extreme eccentricities and dense environments.
Although exploring progressively stringent upper limits is a useful exercise, we are particularly interested in addressing the astrophysical significance of a future PTA detection. An initial detection at S/N will only put marginally better constraints on the underlying astrophysics of galaxies and SMBHs. As the detection significance increases, so do the constraints, as shown in table 3. A full SKAtype array, detecting the GWB at