[
Abstract
Decadelong timing observations of arrays of millisecond pulsars have placed highly constraining upper limits on the amplitude of the nanohertz gravitationalwave stochastic signal from the mergers of supermassive blackhole binaries ( strain at ). These limits suggest that binary merger rates have been overestimated, or that environmental influences from nuclear gas or stars accelerate orbital decay, reducing the gravitationalwave signal at the lowest, most sensitive frequencies. This prompts the question whether nanohertz gravitational waves are likely to be detected in the near future. In this letter, we answer this question quantitatively using simple statistical estimates, deriving the range of true signal amplitudes that are compatible with current upper limits, and computing expected detection probabilities as a function of observation time. We conclude that small arrays consisting of the pulsars with the least timing noise, which yield the tightest upper limits, have discouraging prospects of making a detection in the next two decades. By contrast, we find large arrays are crucial to detection because the quadrupolar spatial correlations induced by gravitational waves can be well sampled by many pulsar pairs. Indeed, timing programs which monitor a large and expanding set of pulsars have an probability of detecting gravitational waves within the next ten years, under assumptions on merger rates and environmental influences ranging from optimistic to conservative. Even in the extreme case where of binaries stall before merger and environmental coupling effects diminish lowfrequency gravitationalwave power, detection is delayed by at most a few years.
Subject headings:
methods: data analysis — gravitational waves — pulsars: individual — pulsars: generalAre we there yet?]Are we there yet?
Time to detection of nanohertz gravitational waves based on pulsartiming array limits
1. Introduction
For the last ten years, three international collaborations have been collecting precise timing observations of the most stable millisecond pulsars, with the aim of identifying the imprint of gravitational waves (GWs) in the nanohertz frequency band (Hobbs, 2013; Kramer & Champion, 2013; McLaughlin, 2013). The bestmotivated GW source for such pulsartiming arrays (PTAs) is the stochastic background signal from the cosmological population of gravitationally bound supermassive blackhole–binaries (SMBHBs: Rajagopal & Romani 1995; Jaffe & Backer 2003; Wyithe & Loeb 2003) that are expected to form at the centers of galaxies after these merge (Sesana et al., 2004, 2008). If the binary inspirals are driven purely by GW radiation reaction at orbital separations corresponding to the PTA frequency band, the resulting timingresidual signal has a powerlaw spectrum with a characteristic exponent,
(1) 
where is the characteristic strain spectrum of the GW background. Theoretical expectations for the GW amplitude center around a few (McWilliams et al., 2014; Sesana, 2013; Ravi et al., 2014). Recent observational upper limits from the three international pulsartiming consortia (Lentati et al., 2015; Arzoumanian et al., 2015a; Shannon et al., 2015) imply at 95% confidence. These limits appear in tension with theoretical expectations, so much so that it seems plausible that environmental effects in galactic centers either stall the formation of gravitationally bound SMBHBs, or accelerate their inspiral; in both cases, the effective is reduced at the frequencies where PTAs are most sensitive (Arzoumanian et al., 2015a; Shannon et al., 2015).
Indeed, the PTA detection of the stochastic GW background (GWB) from SMBHBs, considered imminent only a few years ago (Siemens et al., 2013; McWilliams et al., 2014), now seems to be receding toward the future. Is this really the case? In this letter, we answer this question quantitatively. Namely, given the upper limit obtained by the Parkes PTA (PPTA; Shannon et al. 2015), we ask when we can expect to make a positive detection using different PTAs: the PPTA (Hobbs, 2013), limited to the four low–timingnoise pulsars used for the upper limit; the North American Nanohertz Observatory for Gravitationalwaves (NANOGrav; McLaughlin 2013); the European PTA (EPTA; Kramer & Champion 2013); the International PTA (IPTA; see, e.g., Hobbs et al. 2010); and the pulsartiming project that will be supported by SKA1 (Janssen et al., 2015).
We adopt the frequentist formalism developed by Rosado et al. (2015), and we characterize the PTAs simply by listing, for each pulsar, the duration of the observation and the levels of measurement and timing noise. While GW searches and upper limits with PTAs have recently been given a Bayesian treatment (van Haasteren et al., 2009; van Haasteren & Levin, 2013; Lentati et al., 2013), the frequentist approach based on optimal statistics is both convenient and appropriate for the purpose of this paper, because it dispenses with the simulation of actual datasets, and because the quantification of detection probability is intrinsically a frequentist statement (see, e.g., Vallisneri 2012). Furthermore, experience shows that frequentist upper limits and detection prospects are rather close to Bayesian results (see, e.g., Arzoumanian et al. 2015a).
We proceed as follows: for a specified PTA and true GW background amplitude (), we compute the detection probability (DP) as a function of observation time, using the detection statistic described below. Detection is a probabilistic endeavor since the actual realization of measurement and timing noise may obscure or expose the underlying GW signal. Indeed, we do not currently have access to , but only to the observed upper limit . Setting upper limits is also a probabilistic endeavor, because different realizations of noise lead to different given the same , so we use the upperlimit optimal statistic (also described below) to compute . By introducing an astrophysically motivated prior , we can then use Bayes’ theorem to obtain . Finally, we obtain the expected detection probability as .^{1}^{1}1By assuming a prior, we are introducing a Bayesian element in a frequentist scheme, but this is necessary because we wish to make a statement about a range of astrophysical possibilities, which are all compatible to varying degrees with the observed upper limit.
We conclude that, while small arrays of a few low–timingnoise pulsars (represented here by the Shannon et al. 2015 configuration) provide the most constraining upper limits on the nanohertz GWB, they are suboptimal for detection in comparison to larger arrays such as NANOGrav, EPTA, fullPPTA, and the IPTA, which are expanded regularly with newly discovered pulsars. Indeed, detection and upper limits have rather different demands, and the PPTA itself may employ a pulsar array for detection in the near future (Reardon et al., 2016). Encouragingly, we find that larger PTAs have probability of making a detection in the next ten years, even allowing for a reduced GWB signal due to significant binary stalling or environmental influences.
2. Statistics
The details of the frequencydomain statistical framework used here can be found in Rosado et al. (2015). In the frequentist context, a statistic is a single number that summarizes the (noisy) data with respect to the presence of a (GW) signal, so that, on average, a higher yields a higher , with fluctuations due to the range of possible noise realizations. Given a set of data, there is one observed , and setting an upper limit , amounts to stating that if were equal to , for 95% of noise realizations the observed would have been higher than .
Detection schemes based on statistics, on the other hand, proceed as follows: one considers separately the case where a signal is present in the data at a certain level, and the case where it is not, and computes the probability distribution of the statistic (over the ensemble of noise realizations) in both cases. The zerosignal distribution is used to set a detection threshold as a function of a falsealarm probability; the signal distribution is used to compute the probability that for a certain the statistic will exceed the threshold—that is, the detection probability (or efficiency).
In the following we use crosscorrelation statistics for both upperlimit and detection considerations, making use of the correlated influence of a GWB signal on pulsars which are widely separated on the sky.
Upper limits.– For the discussion of upper limits in this work we will use the socalled statistic of Rosado et al. (2015) modified so that the standard deviation is measured in units of squared GWB amplitude and that the value of the statistic is equal to the injected GWB amplitude on average, as is done in Chamberlin et al. (2015). Assuming that the distribution of this statistic is Gaussian (a reasonable approximation at low signaltonoise ratios (Chamberlin et al., 2015)) we obtain
(2) 
with
(3) 
where denotes a sum over frequencies , denotes a sum over pulsar pairs, is the overlap reduction function (i.e. this is the Hellings & Downs (1983) curve for an isotropic GWB) for pulsars and , is the modeled crosspower spectral density, and is the intrinsic noise power spectral density. From this Gaussian probability distribution, it can be shown that
(4) 
where is the measured value of the GWB amplitude and is our upper limit confidence (e.g., 95%). Thus, the distribution is a Gaussian distribution with standard deviation and mean . Note, to compute we perform a coordinate transformation to obtain = .
Detection.– In the previous section we used the statistic for upper limits in order to compare our results with upper limits computed using the statistic presented in Chamberlin et al. (2015). For assessing our detection probability, however, we will make use of the statistic of Rosado et al. (2015). In this case, we wish to determine the detection probability
(5) 
where is the false alarm probability (FAP); , , and (distinct from previous) are the mean in the presence of a signal, the standard deviation in the presence (subscript 1) and the standard deviation in the absence (subscript 0) of a signal (see Eqs. A16–A18 of Rosado et al. 2015 for more details). Both and grow with increasing signal amplitude. Here we choose a FAP of to match the 3sigma detection threshold of Siemens et al. (2013).^{2}^{2}2This choice is clear if one notices that the 3sigma range is a twosided confidence limit, thus 0.27% of the probability density function is outside of this range. To convert to a standard FAP, we need the onesided limit which is simply 0.27/2 0.13.
Implementation.– In the work of Rosado et al. (2015), it is argued that timing model subtraction (namely subtraction of the quadratic term of the timing model; due to pulsar spin down) can be emulated by not including the lowest frequency in the sum over frequencies. However, we have found that these analytic results agree much better with simulations including timing model subtraction if all frequencies are included in the sum. Furthermore, to emulate data sets with different time spans we only include frequencies in the sum that are greater than , where is the shortest time span for the given pulsar pair.
In this work we always include the GWB power spectral density in the intrinsic pulsar noise . This is meant to mimic a real optimal statistic analysis where the intrinsic noise would be based on singlepulsaranalyses in which there is no way to distinguish the intrinsic noise from GWB power. This will make our analysis conservative in the sense that we are overestimating the intrinsic noise.
3. Results & discussion
In the following, we characterize detection prospects in terms of small and large PTAs, and specifically consider the following five configurations:

NANOGrav, consisting of the pulsars described in Table of Arzoumanian et al. (2015b) with their associated noise properties, to be expanded in future observations by adding four new pulsars each year with ns TOA measurement noise, in line with current expectations.

EPTA, consisting of the pulsars described in Table of Caballero et al. (2015) with their associated noise properties, and regularly adding pulsars as in . Some of these additional pulsars are monitored with the Large European Array for Pulsars (LEAP; Kramer & Champion 2013; Bassa et al. 2016), which synthesizes a 194m steerable dish from the five EPTA telescopes.

A conservative IPTA+, consisting of the pulsars of the first IPTA data release, described in Verbiest et al. (2016) with measurement and timing noise^{3}^{3}3For IPTA pulsars timing noise is taken to consist only of red spin noise, and not of any other system or bandspecific red noise. The latter components may be isolated with multisystem and multifrequency observations, while the former is completely conflated with a GWB signal in the absence of crosscorrelation measurements. properties as in Lentati et al. (2016), and expanding by six new pulsars each year, again with ns TOA measurement noise.

A theorist’s PTA (TPTA), consisting of the toy configuration of Rosado et al. (2015) which may be supported by an advanced radio telescope such as SKA: pulsars with ns TOA measurement noise and no intrinsic timing noise.
These configurations are summarized in Table 1.^{4}^{4}4We rescale PPTA, NANOGrav, EPTA, and IPTA measurement noises (but not the timing noise) by a common factor for each PTA, so that the corresponding statistic upper limits match the results of Shannon et al. (2015), Arzoumanian et al. (2015a), Lentati et al. (2015), and Verbiest et al. (2016); this rescaling has little impact on detection probability in the future, since white measurement noise becomes subdominant to red timing noise at low frequencies.
PTA  configuration  number of pulsars 

PPTA  PPTA (Shannon et al., 2015)  4 
NANOGrav  NANOGrav  new/year with ns 
(Arzoumanian et al., 2015b)  TOA measurement precision  
EPTA  EPTA (Caballero et al., 2015)  new/year with ns 
TOA measurement precision  
IPTA  IPTA Data Release  new/year with ns 
(Verbiest et al., 2016)  TOA measurement precision  
(Lentati et al., 2016)  
Theorist’s PTA  ns TOA measurement precision 
We consider four combined estimates for the amplitude and spectral shape of the stochastic GWB from SMBHBs, ranging from more optimistic (detection wise) to more conservative. For the amplitude, we adopt the observationally motivated lognormal distribution of Sesana (2013) with mean and standard deviation of ; however, we also consider the case where of binaries require longer than Gyr before reaching the PTA band (i.e., the binaries stall, BurkeSpolaor & Simon 2015; McWilliams et al. 2014), which corresponds to a mean , with the same standard deviation. For each of these amplitude priors, we examine the purely GWdriven powerlaw spectrum of Eq. (1), as well as the case where the GW strain has a turnover at caused by SMBHB interactions with stars in galactic nuclei, which accelerate binary inspiral and remove lowfrequency power with respect to the pure power law (see, e.g., Sampson et al. 2015 and Arzoumanian et al. 2015a). The turnover frequency was chosen to lie just beyond the lowfrequency reach of the current PPTA dataset, since the Shannon et al. (2015) analysis still found consistency with the purely GWdriven model of Sesana (2013) for . Similarly, the recent NANOGrav (Arzoumanian et al., 2015a) analysis finds 20% consistency using a 9year dataset (Arzoumanian et al., 2015b). Furthermore, our choice of a turnover at corresponds to reasonable assumptions about the density of stellar populations interacting with SMBHBs in galactic nuclei (see Figs. 10 and 11 of Arzoumanian et al. 2015a). At very low frequencies, the frequency scaling of the timingresidual spectrum becomes due to the environmental influences.
In the top panel of Fig. 1 we show the probability distribution obtained by setting the 95% statistic upper limit to the Shannon et al. (2015) value, , and by assuming nostalling (solid blue line) and 90%stalling (faded solid red line) amplitude priors, with pure powerlaw true spectra. The dashed lines correspond to turnover spectra.^{5}^{5}5When analyzing these latter cases, we use the “wrong” spectrum – a pure power law – for the in Eq. (3), for consistency with existing analyses that assume GWdriven inspirals in deriving limits. As discussed in Rosado et al. (2015), doing so has negligible effects on the distribution of the statistic. For comparison, in the bottom panel of Fig. 1 we show also as computed for NANOGrav’s optimalstatistic upper limit (Arzoumanian et al., 2015a). The underlying astrophysicallymotivated priors for nostalling and stalling are shown in each panel as blue and red dashdot lines, respectively.
Next, we compute the statistic GW detection probabilities, Eq. (5), for each PTA, as a function of and of the observation time beyond current datasets. The resulting DPs are shown in Figs. 2: the left and right columns correspond to the pure powerlaw and turnover spectra, respectively.^{6}^{6}6In computing the statistic, we do assume that the analysis accounts correctly for the true spectral shape. Again, this choice has minimal effects.
As mentioned above, for NANOGrav/EPTA and IPTA+ we add new pulsars at rates of four and six per year, respectively. This annual expansion has a positive impact on the DP, since it provides more independent pulsar pairs at differing angular separations, which help discriminate GWs with quadrupolar spatial correlations (the Hellings & Downs 1983 curve, or the more general anisotropic signatures discussed by Mingarelli et al. 2013 and Gair et al. 2014) from non–spatiallycorrelated red noise processes (Siemens et al., 2013). The unique GW correlation signature is the smoking gun for detection, and we must coordinate future efforts to maximize its measurement.
We obtain the expected detection probabilities as a function of time by integrating against the curves of Fig. 1 (top panel). The resulting are shown in Figs. 3. As in Fig. 1, the darker blue and lighter red curves correspond to nostalling and 90%stalling amplitude priors. For the PPTA array, expected detection probabilities remain below throughout the next years of observation, even in the most optimistic GWsignal scenario. This is unsurprising—a PTA consisting of a few exquisitely timed pulsars may provide very tight upper limits, but it will not usually yield convincing detection statistics, which require an array of many pulsars to map out the expected spatial correlation signature.
By contrast, large pulsar arrays such as NANOGrav, EPTA, and IPTA provide high detection probabilities even with strong binary environmental couplings, since they allow the quadrupolar spatial correlation signature of a GWB to be mapped by many different pulsar pairings. We expect these results to be qualitatively the same for a full PPTA that regularly adds pulsars to the array. In NANOGrav, EPTA, and IPTA+, detection probabilities begin to grow rapidly after only five years of observation beyond current datasets. Binary stalling (the red curves) stunts this growth by three years at most. While the lowfrequency turnover reduces detection probability, its effect is mitigated by the large number of pulsars and the annual catalog expansion.^{7}^{7}7Shannon et al. (2015) advocate performing observations with higher cadence to improve sensitivity at yr, where the GWB would be less affected by environmental couplings. In fact, highercadence observations would improve sensitivity across all frequencies, but not if they come at the cost of reducing the number of monitored pulsars because of limited observingtime allocations. Furthermore, GW searches are actually sensitive to the spectrum of timing residuals rather than of strain itself. The two are related by , so even a turnover spectrum will leave a steep rednoise signature in the pulsar timing residuals.
Finally, we see that the large number of welltimed pulsars in the TPTA builds highly convincing detection probabilities after only a few years of operation. By the time the pulsars have been observed long enough that the influence of the turnover at may be noticeable, the DP is already close to unity. The same is true were we to add timing noise at currently known levels to these TPTA pulsars–the DP will already be close to unity when lowfrequency timing noise begins to have a significant influence.
We stress that there are several important caveats to our analysis. We have focused on detecting the stochastic background rather than deterministic signals. Even for these, using data from more pulsars is desirable, in order to build evidence for a coherent signal. We have assumed for each pulsar a single value of measurement noise that does not improve over time (as occurs in reality when receivers and backends are upgraded), nor do we consider interstellar medium effects such as dispersion (see, e.g., Stinebring 2013). The influence of a GWB spectral turnover depends on its frequency, which is a function of the typical environmental properties of galactic nuclei (either directly or in how these properties drive SMBHB orbital eccentricity)—our choice corresponds to reasonable assumptions about the stellar mass density of SMBHB environments. The timelines for the growth of detection in each PTA are approximate, intended to emphasize the differences between PTAs suited to upperlimits versus detection, and to demonstrate the influence of various binary stalling and environmental scenarios on detection probabilities.
We conclude by emphasizing the different demands of placing stringent upper limits on the stochastic background versus actually detecting it. To wit:

Highly constraining, astrophysically significant upper limits are achievable with only a few exquisitely timed pulsars, but such a PTA is suboptimal for the detection of a stochastic GW background.

Timing many pulsars allows for the quadrupolar spatial correlation signature of the SMBHB background to be sampled at many different angular separations, enhancing prospects for detection.

Adding more pulsars regularly to PTAs will continually improve detection probability, in addition to the gains already made by timing existing pulsars for longer, and will help to mitigate the deleterious influences of binary stalling and environmental couplings.
4. Acknowledgements
It is our pleasure to thank Pablo Rosado, Alberto Sesana, Jonathan Gair, Lindley Lentati, Sarah BurkeSpolaor, Xavier Siemens, Maura McLaughlin, Joseph Romano, and Michael Kramer for very useful suggestions. We also thank the full NANOGrav collaboration for their comments and remarks. SRT was supported by an appointment to the NASA Postdoctoral Program at the Jet Propulsion Laboratory, administered by Oak Ridge Associated Universities through a contract with NASA. MV acknowledges support from the JPL RTD program. JAE and RvH acknowledge support by NASA through Einstein Fellowship grants PF4150120 and PF3140116, respectively. CMFM was supported by a Marie Curie International Outgoing Fellowship within the European Union Seventh Framework Programme. This work was supported in part by National Science Foundation Physics Frontier Center award no. 1430284, and by grant PHYS1066293 and the hospitality of the Aspen Center for Physics. This research was performed at the Jet Propulsion Laboratory, under contract with the National Aeronautics and Space Administration.
References
 Arzoumanian et al. (2015a) Arzoumanian, Z., et al. 2015a, ArXiv eprints, 1508.03024
 Arzoumanian et al. (2015b) —. 2015b, Astrophys. J., 813, 65
 Bassa et al. (2016) Bassa, C. G., et al. 2016, MNRAS, 456, 2196
 BurkeSpolaor & Simon (2015) BurkeSpolaor, S., & Simon, J. 2015, private communication
 Caballero et al. (2015) Caballero, R. N., et al. 2015, ArXiv eprints
 Chamberlin et al. (2015) Chamberlin, S. J., Creighton, J. D. E., Siemens, X., Demorest, P., Ellis, J., Price, L. R., & Romano, J. D. 2015, Phys. Rev. D , 91, 044048
 Gair et al. (2014) Gair, J., Romano, J. D., Taylor, S., & Mingarelli, C. M. F. 2014, Phys. Rev. D , 90, 082001
 Hellings & Downs (1983) Hellings, R. W., & Downs, G. S. 1983, Astrophys. J. Lett , 265, L39
 Hobbs (2013) Hobbs, G. 2013, Classical and Quantum Gravity, 30, 224007
 Hobbs et al. (2010) Hobbs, G., et al. 2010, Classical and Quantum Gravity, 27, 084013
 Jaffe & Backer (2003) Jaffe, A. H., & Backer, D. C. 2003, Astrophys. J., 583, 616
 Janssen et al. (2015) Janssen, G., et al. 2015, Advancing Astrophysics with the Square Kilometre Array (AASKA14), 37
 Kramer & Champion (2013) Kramer, M., & Champion, D. J. 2013, Classical and Quantum Gravity, 30, 224009
 Lentati et al. (2013) Lentati, L., Alexander, P., Hobson, M. P., Taylor, S., Gair, J., Balan, S. T., & van Haasteren, R. 2013, Phys. Rev. D , 87, 104021
 Lentati et al. (2015) Lentati, L., et al. 2015, MNRAS, 453, 2576
 Lentati et al. (2016) Lentati, L., et al. 2016, MNRAS
 McLaughlin (2013) McLaughlin, M. A. 2013, Classical and Quantum Gravity, 30, 224008
 McWilliams et al. (2014) McWilliams, S. T., Ostriker, J. P., & Pretorius, F. 2014, Astrophys. J., 789, 156
 Mingarelli et al. (2013) Mingarelli, C. M. F., Sidery, T., Mandel, I., & Vecchio, A. 2013, Physical Review D, 88, 062005
 Rajagopal & Romani (1995) Rajagopal, M., & Romani, R. W. 1995, Astrophys. J., 446, 543
 Ravi et al. (2014) Ravi, V., Wyithe, J. S. B., Shannon, R. M., Hobbs, G., & Manchester, R. N. 2014, MNRAS, 442, 56
 Reardon et al. (2016) Reardon, D. J., et al. 2016, MNRAS, 455, 1751
 Rosado et al. (2015) Rosado, P. A., Sesana, A., & Gair, J. 2015, MNRAS, 451, 2417
 Sampson et al. (2015) Sampson, L., Cornish, N. J., & McWilliams, S. T. 2015, Phys. Rev. D , 91, 084055
 Sesana (2013) Sesana, A. 2013, MNRAS, 433, L1
 Sesana et al. (2004) Sesana, A., Haardt, F., Madau, P., & Volonteri, M. 2004, Astrophys. J., 611, 623
 Sesana et al. (2008) Sesana, A., Vecchio, A., & Colacino, C. N. 2008, MNRAS, 390, 192
 Shannon et al. (2015) Shannon, R. M., et al. 2015, Science, 349, 1522
 Siemens et al. (2013) Siemens, X., Ellis, J., Jenet, F., & Romano, J. D. 2013, Classical and Quantum Gravity, 30, 224015
 Stinebring (2013) Stinebring, D. 2013, Classical and Quantum Gravity, 30, 224006
 Vallisneri (2012) Vallisneri, M. 2012, Phys. Rev. D , 86, 082001
 van Haasteren & Levin (2013) van Haasteren, R., & Levin, Y. 2013, Monthly Notices of the Royal Astronomical Society, 428, 1147
 van Haasteren et al. (2009) van Haasteren, R., Levin, Y., McDonald, P., & Lu, T. 2009, MNRAS, 395, 1005
 Verbiest et al. (2016) Verbiest, J. P. W., et al. 2016, MNRAS
 Wyithe & Loeb (2003) Wyithe, J. S. B., & Loeb, A. 2003, Astrophys. J., 590, 691