European Pulsar Timing Array Limits On An Isotropic Stochastic GravitationalWave Background
Abstract
We present new limits on an isotropic stochastic gravitationalwave background (GWB) using a six pulsar dataset spanning 18 yr of observations from the 2015 European Pulsar Timing Array data release. Performing a Bayesian analysis, we fit simultaneously for the intrinsic noise parameters for each pulsar, along with common correlated signals including clock, and Solar System ephemeris errors, obtaining a robust 95 upper limit on the dimensionless strain amplitude of the background of at a reference frequency of and a spectral index of , corresponding to a background from inspiralling supermassive black hole binaries, constraining the GW energy density to at 2.8 nHz. We also present limits on the correlated power spectrum at a series of discrete frequencies, and show that our sensitivity to a fiducial isotropic GWB is highest at a frequency of Hz. Finally we discuss the implications of our analysis for the astrophysics of supermassive black hole binaries, and present 95 upper limits on the string tension, , characterising a background produced by a cosmic string network for a set of possible scenarios, and for a stochastic relic GWB. For a NambuGoto field theory cosmic string network, we set a limit , identical to that set by the Planck Collaboration, when combining Planck and high Cosmic Microwave Background data from other experiments. For a stochastic relic background we set a limit of , a factor of 9 improvement over the most stringent limits previously set by a pulsar timing array.
1 Introduction
The first evidence for gravitationalwaves (GWs) was originally obtained through the timing of the binary pulsar B1913+16. The observed decrease in the orbital period of this system was found to be completely consistent with that predicted by general relativity, if the energy loss was due solely to the emission of gravitational radiation (Taylor & Weisberg, 1989). Despite a decrease of only 2.3ms over the course of 30 yr, by exploiting the high precision with which the time of arrival (TOA) of electromagnetic radiation from pulsars can be measured, deviations from general relativity have been constrained by this system to be less than 0.3 (Weisberg, Nice & Taylor, 2010).
Since then, observations of the doublepulsar, PSR J07373039, have provided even greater constraints, placing limits on deviations from general relativity of less than 0.05% (Kramer et al. (2006), Kramer et al. in prep.). It is this extraordinary precision that also makes pulsar timing one possible route towards the direct detection of GWs, which remains a key goal in experimental astrophysics.
For a detailed review of pulsar timing we refer to Lorimer & Kramer (2005). In general, one computes the difference between the expected arrival time of a pulse, given by a pulsar’s timing model which characterises the properties of the pulsar’s orbital motion, as well as its timing properties such as its spin frequency, and the actual arrival time. The residuals from this fit then carry physical information about the unmodelled effects in the pulse propagation, including those due to GWs (e.g. Sazhin, 1978; Detweiler, 1979).
Individual pulsars have, for several decades, been used to set limits on the amplitude of gravitational radiation from a range of sources (e.g. Kaspi, Taylor & Ryba, 1994). However, by using a collection of millisecond pulsars, known as a pulsar timing array (PTA, Foster & Backer, 1990), one can both increase the signaltonoise ratio of the effect of gravitational radiation in the timing residuals, and use the expected form for the cross correlation of the signal between pulsars in the array to discriminate between the GW signal of interest, and other sources of noise in the data, such as the intrinsic spinnoise due to rotational irregularities (e.g. Shannon & Cordes, 2010), or delays in the pulse arrival time due to propagation through the interstellar medium (e.g. Keith et al., 2013). In the specific case of an isotropic stochastic gravitationalwave background (GWB), which is the focus of this paper, this correlation is known as the ‘HellingsDowns’ curve (Hellings & Downs, 1983), and is only a function of the angular separation of pairs of pulsars in the array.
The lowest frequency to which a particular pulsar timing dataset will be sensitive is set by the total observing span for that dataset. Sensitivity to frequencies lower than this is significantly decreased due to the necessity of fitting a quadratic function in the pulsar timing model describing its spin down. PTA datasets are now entering the regime where observations span decades, and as such are most sensitive to GWs in the range Hz. The primary GW sources in this band are thought to be supermassive black hole binaries (SMBHBs) (Rajagopal & Romani, 1995; Jaffe & Backer, 2003; Wyithe & Loeb, 2003; Sesana et al., 2004; Sesana, Vecchio & Colacino, 2008), however other sources such as cosmic strings (see, e.g. Vilenkin, 1981; Vilenkin & Shellard, 1994) or relics from inflation (see, e.g. Grishchuk, 2005) have also been suggested.
The formation of SMBHBs is a direct consequence of the hierarchical structure formation paradigm. There is strong evidence that SMBHs are common in the nuclei of nearby galaxies (see Kormendy & Ho, 2013, and references therein). The fact that many distant galaxies harbour active nuclei for a short period of their life implies that they were also common in the past. In Cold Dark Matter (CDM) cosmology models galaxies merge frequently (Lacey & Cole, 1993). During a galaxy merger the SMBHs harboured in the galactic nuclei will sink to the center of the merger remnant, eventually forming a SMBHB (Begelman, Blandford & Rees, 1980). As a consequence the Universe should contain a potentially large number of gradually inspiralling SMBHBs. The incoherent superposition of GWs from these binaries is expected to form an isotropic stochastic GWB. Deviations from isotropy, however, such as from a small number of bright nearby sources, could result in individually resolvable systems (Lee et al., 2011), and an anisotropic distribution of power across the sky (Mingarelli et al., 2013; Taylor & Gair, 2013; Gair et al., 2014). These latter situations are the subject of two companion papers (Taylor et al., 2015; Babak et al., 2015) here we focus on the possibility of detecting a stochastic isotropic GWB, and we will discuss the implications of our findings for the astrophysics of SMBHBs, cosmic strings, and relics from inflation.
An isotropic, stochastic GWB of cosmological or astrophysical origin can be described in terms of its GW energy density content per unit logarithmic frequency, divided by the critical energy density, , to close the Universe:
(1) 
Here, is the GW frequency, is the critical energy density required to close the Universe, km s Mpc is the Hubble expansion rate, with the dimensionless Hubble parameter, and is the total energy density in GWs (Allen & Romano, 1999; Maggiore, 2000).
Typically the ‘characteristic strain’, , associated with a GWB energy density is parametrised as a single powerlaw for several backgrounds of interest:
(2) 
where is the strain amplitude at a characteristic frequency of 1yr, and describes the slope of the spectrum. Finally, is directly related to the observable quantity induced by a GWB in our timing residuals, the onesided power spectral density, , given by:
(3) 
where . Note that unless explicitly stated otherwise, henceforth when referring to spectral indices we will be referring to the quantity .
The expected spectral index varies depending on the source of the stochastic background. For a GWB resulting from inspiraling SMBHBs the characteristic strain is approximately (Rajagopal & Romani, 1995; Jaffe & Backer, 2003; Wyithe & Loeb, 2003; Sesana et al., 2004), or equivalently, , whereas primordial background contributions or cosmic strings are expected to have powerlaw indices of (Grishchuk, 2005), and (Ölmez, Mandic & Siemens, 2010; Damour & Vilenkin, 2005) respectively. However, for cosmic strings in particular, a single spectral index is not expected to accurately describe the spectrum in the PTA frequency band (Sanidas, Battye & Stappers, 2012).
A multitude of experiments have set limits on the amplitude of the stochastic GWB, either at a reference frequency as is done for PTAs (Shannon et al., 2013) and groundbased interferometers (Aasi et al., 2014), or by reporting a value for GW energy density integrated over all frequencies as is done by Big Bang Nucleosynthesis measurements, e.g. (Cyburt et al., 2005) and Cosmic Microwave Background (CMB) measurements (Smith, Pierpaoli & Kamionkowski, 2006; Sendra & Smith, 2012). As such, an upper limit on the stochastic GWB reported in terms of either , or for a specified value of provides a clear way to report our limits.
In the last few years the European PTA (EPTA), Parkes PTA (PPTA), and the North American NanoHertz Observatory for Gravitational waves (NANOGrav) have placed 95 upper limits on the amplitude of a stochastic GWB at a reference frequency of of 6 (van Haasteren et al., 2011), 2.4 (Shannon et al., 2013), and 7 (Demorest et al., 2013) respectively. While many of the same pulsars are used by all the PTAs, and both the EPTA and PPTA have similar total observing spans, all of these limits have been placed using different datasets, and different methodologies. As such, these similarly constraining limits should not be seen as redundant, but rather as complementary. For example, the first EPTA limit used Bayesian analysis methods, producing an upper limit while simultaneously fitting for the intrinsic timing noise of the pulsars. Subsequent limits have used simulations to obtain conservative upper bounds consistent with the data, or made use of frequentist methods, fixing the noise at values derived from analysis of the individual pulsars. Naturally a simultaneous analysis of the intrinsic properties of the pulsars with the GWB is the preferred method, and we will show explicitly in Section 5 that fixing the noise properties of the individual pulsars can lead to an erroneously stringent limit on the amplitude of a GWB in the pulsar timing data. The three PTA projects also work together as the International PTA (IPTA; Hobbs et al., 2010), where all three datasets are combined in order to produce ever more robust and constraining limits on the GWB, with the eventual goal of making a first detection.
In this work we make use of the Bayesian methods presented in Lentati et al. (2013) (henceforth L13), which allows us to greatly extend what is computationally feasible for a Bayesian analysis of pulsar timing data. In particular, while we obtain upper limits on the amplitude of a GWB using a simple, two parameter power law as in van Haasteren et al. (2011), we can also make use of a much more general model, enabling us to place robust limits on the correlated power spectrum at discrete frequencies. We can also include additional sources of common noise in our analysis simultaneously with the GWB, such as those that could be expected from errors in the Solar System ephemeris, or in the reference time standard used to measure the TOAs of the pulses. Finally we also take two approaches to parameterising the spatial correlations between pulsars, without having to assume anything about the form it might take. This spatial correlation is the ‘smoking gun’ of a signal from a GWB, and so the ability to extract it directly from the data is crucial for the credibility of any future detections from pulsar timing data.
The key results of our analysis, compared to current theoretical predictions for a range of models of stochastic background and indirect limits in the PTA range are summarised in Fig. 1. In Section 2 we describe the deterministic and stochastic models that we included in this analysis. In Section 3 we discuss the implementation of these methods in our Bayesian and frequentist frameworks. In Section 4 we introduce the EPTA dataset adopted for the analysis (Desvignes et al. in prep.), and in Section 5 we present the results obtained from our analysis. The implications of our findings for SMBHB astrophysics, cosmic strings, and relics from inflation are discussed in Section 6, and finally, in Section 7 we summarise and discuss future prospects.
This research is the result of the common effort to directly detect gravitationalwaves using pulsar timing, known as the EPTA (EPTA; Kramer & Champion, 2013) ^{1}^{1}1www.epta.eu.org/.
2 Signal and noise models
The search for a stochastic GWB in pulsar timing data requires the estimation of a correlated signal of common origin in the pulse TOAs recorded for the different pulsars in the array. The difficulty lies in the intrinsic weakness of the signal and the presence of a range of effects – both deterministic and stochastic – that conspire to mask the signal of interest. At the heart of our analysis methods is the variancecovariance matrix
(4) 
that describes the expectation value of the correlation between TOA from pulsar , with a TOA from pulsar . In the following description uppercase latin indices identify pulsars, and lower case latin indices are short hand notation for the TOAs . Eq. (4) depends on the unknown parameters that describe the model adopted to describe the data and enter the likelihood function in the Bayesian analysis, and the optimal statistic in the frequentist approach.
For any pulsar we adopt a model for the observed pulse TOAs, which we denote , that results from a number of contributions and physical effects according to:
(5) 
In Eq. (5) we have:

, the deterministic model that characterises the pulsar’s astrometric properties, such as position and proper motion, as well as its timing properties, such as spin period, and additional orbital parameters if the pulsar is in a binary.

, the stochastic contribution due to the combination of instrumental thermal noise, and intrinsic pulsar white noise.

, the stochastic contribution due to red spinnoise.

, the stochastic contribution due to changes in the dispersion of radio pulses traveling through the interstellar medium.

, the stochastic contribution due to ‘common noise’, present across all pulsars in the timing array (described in Sec. 2.6), as could be expected from errors in the Solar System ephemeris, or in the reference time standard used to measure the TOAs of the pulses.

, the stochastic contribution due to a GWB.
Our model assumes that all stochastic contributions are zero mean random Gaussian processes. Each of the contributions just described depends on a number of unknown parameters that need to be simultaneously estimated in the analysis. While all these elements, which we set out in detail below, will be present in the Bayesian analysis described in Section 3.1, we do not incorporate the common pulsar noise terms in the frequentist optimalstatistic analysis described in Section 3.2 as this approach by design interprets all crosscorrelated power as originating from a stochastic GWB.
2.1 The timing model
The first contribution to the total signal model that we must consider is the deterministic effect due to the intrinsic evolution of the ‘pulsar clock’, encapsulated by the pulsar’s timing ephemeris. We identify with the dimensional parameter vector for pulsar that describes the relevant set of timing model parameters, and denote as the set of arrival times determined by the adopted model and specific value of the parameters. We use Tempo2 (Hobbs, Edwards & Manchester, 2006; Edwards, Hobbs & Manchester, 2006) to construct a weighted leastsquares fit, in which the stochastic contributions have been determined from a Bayesian analysis of the individual pulsars using the TempoNest plugin (Lentati et al., 2014). We can define the set of ‘postfit’ residuals that results from subtracting the predicted TOA for each pulse at the Solar System Barycenter from our observed TOAs as:
(6) 
In everything that follows, rather than use the full nonlinear timing model we consider an initial estimate of the timing model parameters , and construct a linear approximation to that model such that any deviations from those initial estimates are encapsulated using the parameters such that:
(7) 
Therefore, we can express the change in the postfit residuals that results from the deviation in the timing model parameters as:
(8) 
where M is the ‘design matrix’ which describes the dependence of the timing residuals on the model parameters.
When we perform our Bayesian GWB analysis, we will marginalise analytically over the linear timing model, as described in Section 3.1. When performing this marginalisation the matrix M is numerically unstable. To remedy this issue we follow the same process as in van Haasteren & Vallisneri (2014) and take the SVD of M, to form the set of matrices . Here U is an matrix, which we can divide into two components:
(9) 
where G is a matrix, which can be thought of as a projection matrix (van Haasteren & Levin, 2013), and is the complement. represents a set of orthonormal basis vectors that contain the same information as but is stable numerically. We therefore replace M with in the subsequent analysis.
2.2 White noise
We next consider the contribution to the total signal model that results from a stochastic white noise component, . This noise component is usually divided into two components, and this is the model that we adopt in our analysis:

For a given pulsar , each TOA has an associated error bar, , the size of which will vary across a set of observations. We can introduce an extra free parameter, referred to as EFAC, to account for possible miscalibration of this radiometer noise. The EFAC parameter therefore acts as a multiplier for all the TOA error bars for a given pulsar, observed with a particular ‘system’ (i.e. a unique combination of telescope, recording system and receiver).

A second white noise component is also used to represent some additional source of time independent noise, which we call EQUAD, and adds in quadrature to the TOA error bar. In principle this parameter represents something physical about the pulsar, for example, contributions from the high frequency tail of the pulsar’s red spinnoise power spectrum, or jitter noise that results from the time averaging of a finite number of single pulses to form each TOA (see e.g. Cordes & Downs, 1985; Liu et al., 2011; Shannon et al., 2014). While this term should be independent of the observing system used to generate a given TOA, differences in the integration times between TOAs for different observing epochs can muddy this physical interpretation.
We can therefore modify the uncertainty , defining such that the statistical description is:
(10) 
where
(11) 
where and represent the EFAC and EQUAD parameters applied to TOA for pulsar respectively. In Section 4 we list the number of different observing systems per pulsar used in the analysis presented in this paper.
2.3 Spinnoise
Individual pulsars are known to sometimes suffer from ‘spinnoise’, which is observed in the pulsar’s residuals as a red noise process. This is a particularly important noise source, as most models for a stochastic GWB predict that this too will induce a red spectrum signal in the timing residuals. The spinnoise component is specific to each individual pulsar, and is uncorrelated between pulsars in the timing array. The statistical properties of the spinnoise signal are therefore given by:
(12) 
where the matrix element denotes the covariance in the spinnoise signal between residuals for pulsar . In order to construct the matrix , we will use the timefrequency method described in L13, which we will summarise below.
We begin by writing the spinnoise component of the stochastic signal as
(13) 
where the matrix denotes the Fourier transform such that for signal frequency and time we will have both:
(14) 
and an equivalent cosine term, and are the set of free parameters that describe the amplitude of the sine and cosine components at each frequency.
We include in our model the set of frequencies with values , where the longest period to be included in the model and the number of frequencies to be sampled is . In our analysis presented in Section 4 we take to be , which is the total observing span across all the pulsars in our dataset, and we take such that we include in our model periods up to which is sufficient to describe the stochastic signals present in the data (Caballero et al. in prep.). For typical PTA data Lee et al. (2012) and van Haasteren & Levin (2013) showed that taking to be the longest time baseline in the dataset is sufficient to accurately describe the expected longterm variations present in the data, as the quadratic term present in the timing model significantly diminishes our sensitivities to periods longer than this in the data.
The covariance matrix of the spinnoise coefficients between pulsars at model frequencies , which we denote will be diagonal, with components:
(15) 
where the set of coefficients represent the theoretical power spectrum of the spinnoise signal present in pulsar . In our analysis of the dataset presented in Section 4 we assume that this intrinsic spinnoise can be well described by a 2parameter power law model in frequency, given by:
(16) 
with and the amplitude and spectral index of the power law.
We note that as discussed in L13, whilst Eq. (15) states that the spinnoise model components are orthogonal to one another, this does not mean that we assume they are orthogonal in the time domain where they are sampled, and it can be shown that this nonorthogonality is accounted for within the likelihood (van Haasteren & Vallisneri, 2015). The covariance matrix for the red noise signal present in the data alone can then be written:
(17) 
with the diagonal matrix containing the TOA uncertainties, such that .
2.4 Dispersion measure variations
The plasma located in the interstellar medium (ISM) can result in delays in the propagation of the pulse signal between the pulsar and the observatory. Variations in the columndensity of this plasma along the line of sight to the pulsar can appear as a red noise signal in the timing residuals.
Unlike other red noise signals however, the severity of the observed dispersion measure (DM) variations is dependent upon the observing frequency, and as such we can use this additional information to isolate the component of the red noise that results from this effect.
In particular, the group delay at an observed frequency is given by the relation:
(18) 
where the dispersion constant is defined to be:
(19) 
and the DM is defined as the integral of the electron density from the Earth to the pulsar:
(20) 
While many different approaches to performing DM correction exist (e.g. Lee et al. (2014); Keith et al. (2013)), in our analysis we use the methods described in L13. DM corrections can then be included in the analysis as an additional set of stochastic parameters in a similar manner to the intrinsic spinnoise. Further details on the DM variations present in the EPTA dataset, including comparisons between different models, will be presented in a seperate paper (Janssen et al. in prep.). In our analysis, as for the spinnoise, we assume a 2parameter power law model, with an equivalent form to Eq. (16), however we omit the factor for the DM variations.
We first define a vector of length for a given pulsar as:
(21) 
for observation with observing frequency .
We then make a change to Eq. (14) such that our DM Fourier modes are described by:
(22) 
and an equivalent cosine term, where the set of frequencies to be included is defined in the same way as for the red spinnoise, such that we choose the number of frequencies, , to also be 50. Unlike when modelling the red spinnoise, where the quadratic terms in the timing model that accounts for pulsar spindown acts as a proxy to the low frequency () fluctuations in our data, we are still sensitive to the low frequency power in the DM signal. As such these terms must be accounted for either by explicitly including these low frequencies in the model, or by including a quadratic term in DM to act as a proxy, defined as:
(23) 
with free parameters to be fit for, and the barycentric arrival time for TOA . This can be achieved most simply by adding the timing model parameters and into the pulsar timing model, which are equivalent to and in Eq. (23), and this is the approach we take in our analysis here.
As for the spinnoise component we can then write down the time domain signal for our DM variations as:
(24) 
with the set of free parameters that describe the amplitude of the sine and cosine components at each frequency.
2.5 Combining model terms
In order to simplify notation from this point forwards, for each pulsar we combine the matrices , and into a single, matrix, where is the number of TOAs in pulsar , is the number of timing model parameters, and the factor 2 in front of both and accounts for the sine and cosine terms included for each model frequency. We denote this combined matrix , such that:
(25) 
and similarily we append the vectors , , and to form the single vector . In this way we can write our complete signal model for a single pulsar as:
(26) 
We can then construct the block diagonal matrix T such that each block is given by the matrix for each pulsar , and finally append the set of vectors for all pulsars to form the complete vector of signal coefficients . In this way the concatenated signal model as described thus far for all pulsars, which we denote here as , can be written simply:
(27) 
2.6 Common noise
In Tiburzi (2015 PhD Thesis) and Tiburzi et al. (2015, in prep.) it was shown that additional sources of noise which are common to all pulsars in the PTA can be highly correlated with the quadrupole signature of a stochastic GWB. If these sources of noise are present in our dataset, we will become less sensitive to a GWB if we do not include them in our model. Therefore, in order to ensure that our analysis remains robust to the presence of such signals, we will include in our model the 3 most likely sources of additional common noise:
 1:

A common, uncorrelated noise term. This allows us to account for the possibility that all the millisecond pulsars in our dataset suffer from a similar, potentially steep, red noise process, as discussed in Shannon & Cordes (2010).
 2:

A clock error. Hobbs et al. (2012) showed that a PTA is sensitive to errors in the time standard used to measure the arrival times of pulses. Errors in this time standard would result in a monopole signal being present in all pulsars in the dataset.
 3:

An error in the Solar System ephemeris. Champion et al. (2010) demonstrated that any error in the planet masses, or any unmodelled Solar System bodies will result in an error in our determining the barycentric time of arrival of the pulses. This leads to a dipole correlation being induced in the timing residuals.
We note that there are other possible sources of common correlated noise in a PTA dataset beyond the three listed above. In Section 2.7 we will describe models that allow us to fit for a correlated signal, where the form of the correlation is unknown, and is described by free parameters in our analysis. In principle one could then simultaneously fit for both a GWB, and this additional more general signal. While this would significantly decrease our sensitivity to the GWB it would ensure that our analysis remained robust to the existence of unknown correlated signals in the data. More optimally, one could perform an evidence comparison between a model that includes a GWB, and a model that includes a signal with an arbitrary correlation between pulsars in the PTA, in order to test which model the data supports.
A common, uncorrelated noise term can be trivially included by adding the model power spectrum to the diagonal of the elements of the matrix that correspond to the intrinsic red noise, such that we have:
(28) 
where the set of coefficients represent the theoretical power spectrum of the common uncorrelated signal, which is the same for all pulsars in the array.
In order to include a clock error within the framework described thus far, we append to our matrix an additional set of matrices – one for each pulsar in the array – each of them identical to the matrix , given by Eq. (14), for the corresponding pulsar . Each of these matrices is multiplied by the same set of signal coefficients , which are appended to the vector of coefficients , representing a single signal being fit to all pulsars simultaneously. We use the same number of frequencies in the model for the clock error as for the intrinsic spinnoise, and assume a 2parameter power law model for the power spectrum, which we denote , as in Eq. (16). From this we construct the covariance matrix which we define:
(29) 
the elements of which can be appended to the total covariance matrix for the signal coefficients . We stress that modelling the clock signal in this way ensures that we correctly account for both the uneven time spans, and unequal weighting of the individual pulsars. Additionally, because we fit for the timing model simultaneously with the clock signal, the uncertainty in the low frequency variations of the signal are factored into the analysis appropriately. We show this in a simple simulation in which we use the time sampling from our dataset described in Section 4, and include a clock error consistent with 10 times the difference between the TAI and BIPM2013 time standards, and white noise consistent with the TOA uncertainties in our dataset. In Fig. (2) we show the clock signal used in our simulation after the maximum likelihood timing model has been subtracted from the joint analysis (black line), and the time averaged maximum likelihood recovered clock signal with 1 uncertainties (red points with error bars). The uncertainties in the clock error vary by a factor 9 across the dataset, as different pulsars contribute different amounts to the constraints. We find the recovered signal is consistent with the injected signal across the whole dataspan.
Finally, in order to model an error in the Solar System ephemeris, we can define an error signal , which will be observed in any pulsar as the dot product between this error vector, and the position vector of the pulsar , such that the induced residual as a function of time, will be given by:
(30) 
We can incorporate this effect into our analysis by defining a set of basis vectors separately for each of the 3 components of , similarly to Eq 14. For example, the component in the direction for pulsar will have basis vectors:
(31) 
such that the signal induced in the pulsar will be given by:
(32) 
with the set of signal coefficients to be fit for. This model term is incorporated into our analysis in exactly the same way as for the clock error, with the basis vectors for the 3 components appended to the total matrix T, the 3 sets of signal coefficients appended to the vector , and the diagonal covariance matrix constructed from the power law model appended to the matrix .
While this parametrisation does not constitute a physical model of the Solar System dynamics, it allows us to incorporate our uncertainty regarding possible errors in the Solar System ephemeris, such as errors in the mass measurements of a number of planets or the effects of unknown Solar System bodies. Given the dominant source of error in the Solar System ephemeris is likely to come from errors in the masses of planets such as Saturn, it could be advantageous to include these parameters explicitly in our model. In our analysis presented in Section 5 we opt for the more conservative approach, and use the general model described here to model such errors.
Once again we include the same number of frequencies in the model as for the spinnoise model, and parameterise the power spectrum for each of the 3 components, (x,y,z), of the error vector with a separate 2 parameter power law, as in Eq. (16).
2.7 Gravitationalwave background
When dealing with a signal from a stochastic GWB, it is advantageous to include the cross correlated signal between the pulsars on the sky. We do this by using the overlap reduction function – a dimensionless function which quantifies the response of the pulsars to the stochastic GWB. For isotropic stochastic GWBs, when the pulsars are separated from the Earth and from each other by many GW wavelengths (i.e., in the shortwavelength approximation, cf Mingarelli & Sidery, 2014), this is also known as the HellingsDowns curve (Hellings & Downs, 1983):
(33) 
Here is the angle between the pulsars and on the sky and is the overlap reduction function, which represents the expected correlation between the TOAs given an isotropic stochastic GWB, and the term accounts for the pulsar term for the autocorrelation. With this addition, our covariance matrix for the Fourier coefficients becomes
(34) 
In our analysis presented in Section 5.1 we define using both the 2parameter power law model given in Eq. (16), and also take a more general approach, where the power at each frequency included in the model is a free parameter in the analysis. In this case we define simply as:
(35) 
where we fit for the set of parameters , and use a prior that is uniform in the amplitude .
If we do not want to assume the isotropic (HellingsDowns) overlap reduction function as the description of the correlations between pulsars in our dataset, we can instead fit for its shape. In Section 5, we will do this in two ways: firstly fitting directly for the correlation coefficient between each pulsar, , and secondly using a set of four Chebyshev polynomials, where we fit for the coefficients parameterised such that, defining we will have:
(36) 
3 Analysis Methods
While the majority of the results presented in Section 5 have been obtained using a Bayesian approach, we also employ a frequentist maximumlikelihood estimator of the GWB strainspectrum amplitude as a consistency check. In the following sections we outline the key elements of both these approaches to aid further discussion.
3.1 Bayesian approach
3.1.1 General remarks
Bayesian Inference provides a consistent approach to the estimation of a set of parameters in a model or hypothesis given the data, . Bayes’ theorem states that:
(37) 
where is the posterior probability distribution of the parameters, is the likelihood,
is the prior probability distribution, and
is the Bayesian Evidence.
In parameter estimation, the normalizing evidence factor is usually ignored, since it is independent of the parameters . Inferences are therefore obtained by taking samples from the (unnormalised) posterior using, for example, standard Markov chain Monte Carlo (MCMC) sampling methods.
In contrast to parameter estimation, for model selection the evidence takes the central role and is simply the factor required to normalise the posterior over :
(38) 
where is the dimensionality of the parameter space.
As the average of the likelihood over the prior, the evidence is larger for a model if more of its parameter space is likely and smaller for a model where large areas of its parameter space have low likelihood values, even if the likelihood function is very highly peaked. Thus, the evidence automatically implements Occam’s razor: a simpler theory with a compact parameter space will have a larger evidence than a more complicated one, unless the latter is significantly better at explaining the data.
The question of model selection between two models and can then be decided by comparing their respective posterior probabilities, given the observed data set , via the posterior odds ratio :
(39) 
where is the a priori probability ratio for the two models, which can often be set to unity but occasionally requires further consideration.
The posterior odds ratio then allows us to obtain the probability of one model compared with the other, simply as:
(40) 
3.1.2 MultiNest
The nested sampling approach (Skilling, 2004) is a MonteCarlo method targeted at the efficient calculation of the evidence, but also produces posterior inferences as a byproduct. In Feroz & Hobson (2008) and Feroz, Hobson & Bridges (2009) this nested sampling framework was built upon with the introduction of the MultiNest algorithm, which provides an efficient means of sampling from posteriors that may contain multiple modes and/or large (curving) degeneracies. Since its release MultiNest has been used successfully in a wide range of astrophysical problems, from detecting the SunyaevZel’dovich effect in galaxy clusters (AMI Consortium, 2012), to inferring the properties of a potential stochastic GWB in PTA data in a mock data challenge (L13).
In higher dimensions ( 50), the sampling efficiency of MultiNest begins to decrease significantly. To help alleviate this problem, MultiNest includes a ‘constant efficiency’ mode, which ensures that the sampling efficiency meets some user set target. This, however comes at the expense of less accurate evidence values. Recently, the MultiNest algorithm has been updated to include the concept of importance nested sampling (INS; Cameron & Pettitt 2013) which provides a solution to this problem. Details can be found in Feroz et al. (2013), but the key difference is that, where with normal nested sampling the rejected points play no further role in the sampling process, INS uses every point sampled to contribute towards the evidence calculation. One outcome of this approach is that even when running in constant efficiency mode the evidence calculated is reliable even in higher ( 50) dimensional problems. In pulsar timing analysis, and especially when determining the properties of a correlated signal between pulsars, we will often have to deal with models that can contain parameters. As such, the ability to run in constant efficiency mode whilst still obtaining accurate values for the evidence when these higher dimensional problems arise is crucial in order to perform reliable model selection.
All the analyses presented in Section 5 are performed using INS, running in constant efficiency mode, with 5000 live points and an efficiency of .
3.1.3 Likelihood function
Equivalent to the approach described in L13, we can write the joint probability density of:

the linear parameters , which describe variations in the determinstic timing model and the signal realisations for the red noise and DM variations for each pulsar, and the common noise terms.

the stochastic parameters, (, ) that describe the intrinsic white noise properties for each pulsar,

the powerspectrum hyperparameters that define the spinnoise and DM variation power laws, and the spectra of the common noise terms such as the stochastic GWB, which we collectively refer to as ,
as:
In our analysis we simply use priors that are uniform in all the model parameters, so we can write the conditional distributions that make up Eq. (3.1.3) as:
(42)  
and:
(43) 
We can now marginalise over all linear parameters analytically in order to find the posterior for the remaining parameters alone.
Defining as , and as our marginalised posterior for the stochastic parameters alone is given by:
3.2 Frequentist techniques
As a consistency check of our Bayesian method, we also employ a weaksignal regime maximumlikelihood estimator of the GWB strainspectrum amplitude, known as the optimalstatistic (Anholm et al., 2009; Siemens et al., 2013; Chamberlin et al., 2014). It also maximises the signaltonoise ratio (S/N) in this regime, reproducing the results of an optimallyfiltered crosscorrelation search without explicitly introducing a filter function.
The form of this statistic is
(45) 
where is the autocovariance of the postfit residuals in pulsar , which we can write in terms of the matrices and as:
(46) 
where the matrix is constructed from maximumlikelihood noise estimates obtained in previous singlepulsar analysis. Any GW signal will have been absorbed into the rednoise estimation during this previous analysis. The signal term is defined such that , where we assume that no signal other than GWs induce crosscorrelations between pulsar TOAs. The normalisation of is chosen such that .
The standard deviation of the statistic in the absence of a crosscorrelated signal reduces to
(47) 
which can be used as an approximation to the error on in the weaksignal regime. Hence, for a particular signal and noise realisation where we have measured the optimalstatistic, the S/N of the power in the crosscorrelated signal is given by
(48) 
with an expectation over all realisations of
(49) 
This S/N effectively measures how likely it is (in terms of number of standard deviations from zero) that we have found a crosscorrelated signal in our data rather than an uncorrelated signal. The properties of the signal crossterm are determined by a fixed input spectral shape, which in this case is a powerlaw with slope , matching the predicted spectral properties of the strainspectrum resulting from a population of circular GWdriven SMBHBs.
To compute upperlimits with the optimalstatistic, we follow the procedure outlined in Anholm et al. (2009), where the distribution of is assumed to be a Gaussian with mean and variance . The latter assumption is clearly only appropriate in the weaksignal regime, but serves as a useful approximation. We want to find such that, in some predetermined fraction of hypothetical experiments (), the value of the optimalstatistic would exceed the actual measured value. Hence we can claim that to confidence , otherwise we would have seen it exceed the measured value a fraction of the time. The solution is given by
(50) 
It was shown in Chamberlin et al. (2014) that the crosscorrelation statistic of Demorest et al. (2013) is identical to the aforementioned optimalstatistic, and in fact allows us to achieve a measure of the individual crosspower values between pulsars. In the high S/N limit one would expect these crosspower values to map out the Hellings and Downs curve when plotted as a function of pulsar angular separations. The crosspower values and their associated errors are given by
(51) 
where , and are the Hellings and Downs crosscorrelation values.
4 The Dataset
Pulsar  J06130200  J1012+5307  J16003053  J1713+0747  J17441134  J19093744 

Dataspan (yr)  16.05  16.83  7.66  17.66  17.25  9.38 
N^{a}  14  15  4  14  9  3 
s) ^{b}  1.691  1.610  0.563  0.679  0.801  0.131 
Log A  13.58 0.40 (13.41)  13.05 0.09 (13.04)  13.71 0.54 (13.42)  14.31 0.46 (14.20)  13.63 0.27 (13.60)  14.22 0.42 (13.98) 
2.50 0.99 (2.09)  1.56 0.37 (1.56)  1.91 1.05 (1.38)  3.50 1.16 (3.51)  2.21 0.82 (2.16)  2.23 0.89 (2.17)  
Log A  11.61 0.12 (11.57)  12.25 0.47 (11.92)  11.75 0.39 (11.67)  11.97 0.14 (11.90)  12.19 0.38 (11.93)  12.76 0.53 (12.51) 
1.36 0.48 (1.11)  1.26 0.97 (0.27)  1.64 0.80 (1.46)  2.03 0.55 (1.82)  1.41 1.09 (0.36)  2.23 1.07 (2.16)  
Global EFAC  1.01 0.02 (1.01)  0.98 0.02 (0.98)  1.03 0.04 (1.03)  1.00 0.02 (1.00)  1.01 0.03 (1.00)  1.02 0.04 (1.01) 
95% upper limit ^{c} 

Number of unique observing ‘systems’ that make up the dataset for each pulsar.

Weighted rms for the DM subtracted residuals for each pulsar (D15).

Upper limit obtained from the 5dimensional analysis described in Section 4.
Our limits for an isotropic stochastic background are obtained using a subset of the full 2015 EPTA data release described in Desvignes et al. (in prep.) (henceforth D15). In particular we use a set of 6 pulsars, listed in Table 1, that contribute 90 of the total S/N for this dataset (see Babak et al. (in prep.) for details). We use this subset of the full 42 pulsar dataset in order to minimise the dimensionality of the problem, and thus enable accurate evidence calculations using MultiNest. The pulsar that contributed next in terms of sensitivity, PSR J1640+2224, contributes at only the level, so even were we to add a small number of additional pulsars, the overall gain in sensitivity would be minimal. The DMsubtracted residuals, as well as the frequency coverage as a function of time for these pulsars are shown in Fig. 3 (left and right panel respectively). For each of these pulsars a full timing analysis has been performed using the TempoNest plugin for the Tempo2 pulsar timing package, which simultaneously includes the white noise modifiers EFAC and EQUAD for each observing system, as well as intrinsic red noise, and frequency dependent DM variations. In Fig. (4) we show the mean parameter estimates and 1 uncertainties for the EFAC parameters obtained for each system from this initial analysis. We find all EFACs are consistent with values equal to or greater than 1 within their uncertainties, with the exception of the Westerbork 1380 MHz data in PSR J1713+0747. This could be the result of systematic effects that occur in the template forming phase, and is the subject of ongoing work. As these systems do not contribute a large fraction of the total weight in the dataset, however, it will not have a significant impact on the subsequent analysis. Further analysis of the white noise parameters will be presented in Caballero et al. (in prep.). In the joint analysis presented in this work we use the linear approximation to the timing model. As such, timing solutions obtained from the initial TempoNest analysis were checked for convergence, and the linear regime was found to be suitable in all cases.
The EPTA dataset contains observations from four of the largest radio telescopes in Europe: the Effelsberg Radio Telescope in Germany, the Lovell Radio Telescope at the Jodrell Bank Observatory in the UK, the Nancay Radio Telescope in France, and the Westerbork Synthesis Radio Telescope in The Netherlands. Each of these telescopes operates at multiple observing frequencies, and so the number of unique ‘systems’ present for any one pulsar can be as large as 15. For the 6 pulsars in this dataset we list the number of observing systems present in each in Table 1, which in combination results in 118 white noise parameters. When accounting for the 4 spinnoise and DM variation power law parameters for each pulsar, we have a total of 142 intrinsic noise parameters before the addition of any correlated model components. In an effort to decrease the dimensionality we therefore use the mean estimates for the EFAC and EQUAD parameters from the individual timing analysis presented in D15, and fit a single global EFAC for each pulsar, reducing the number of intrinsic noise parameters to 30.
In order to check the validity of this simplification we performed a 5dimensional analysis for each of the 6 pulsars in this dataset, fitting for power law intrinsic red noise and DM variations, in addition to a global EFAC parameter after adjusting the error bars using the mean values from D15. The parameter estimates obtained are given in Table 1, and the 1dimensional marginalised posteriors for J19093744, J1713+0747, and J17441134 from this analysis are shown in Fig. 5. In each case we show the red noise and DM variation power law parameters for the full noise analysis (red) and 5dimensional analysis (blue). We also show the global EFAC parameter from the 5dimensional analysis in each case. We find the posteriors are consistent between the two sets of analysis.
Table 1 also lists the 95% upper limit on a red noise process with a spectral index of at a reference frequency of for each of the 6 pulsars used in our analysis. This limit was obtained when simultaneously fitting for the 5 intrinsic noise parameters for each pulsar in addition to the steep spectrum noise term. The two pulsars with the most constraining upper limit are PSRs J19093744, and J1713+0747, consistent with the results obtained in Babak et al. (in prep.), with values of , and respectively.
5 Results
5.1 Limits on an Isotropic Stochastic GWB
5.1.1 Bayesian approach
In Table 2 we list the complete set of free parameters that we include in the different models used in the analysis presented in this section, along with the prior ranges used for those parameters.
Parameter  Description  Prior range  
White noise  
Global EFAC  uniform in [0.5 , 1.5]  1 parameter per pulsar (total 6)  
Spinnoise  
Spinnoise power law amplitude  uniform in [ , ]  1 parameter per pulsar (total 6)  
Spinnoise power law spectral index  uniform in  1 parameter per pulsar (total 6)  
DM variations  
DM variations power law amplitude  uniform in [ , ]  1 parameter per pulsar (total 6)  
DM variations power law spectral index  uniform in  1 parameter per pulsar (total 6)  
Common noise  
Uncorrelated common noise power law amplitude  uniform in [ , ]  1 parameter for the array  
Uncorrelated common noise power law spectral index  uniform in  1 parameter for the array  
Clock error power law amplitude  uniform in [ , ]  1 parameter for the array  
Clock error power law spectral index  uniform in  1 parameter for the array  
Solar System ephemeris error power law amplitude  uniform in [ , ]  3 parameters for the array (x, y, z)  
Solar System ephemeris error power law spectral index  uniform in  3 parameters for the array (x, y, z)  
Stochastic GWB  
GWB power law amplitude  uniform in [ , ]  1 parameter for the array  
GWB power law spectral index  uniform in  1 parameter for the array  
GWB power spectrum coefficient at frequency  uniform in [ , ]  1 parameter for the array per frequency in  
unparameterised GWB power spectrum model (total 20)  
Stochastic background angular correlation function  
Chebyshev polynomial coefficient  uniform in  see Eq. (36)  
Correlation coefficient between pulsars (I,J)  uniform in  1 parameter for the array per unique pulsar pair (total 15) 
Model  95% upper limit 

Bayesian Analysis  
Fixed Noise  Fixed Spectral Index  
Varying Noise  Fixed Spectral Index  
Additional Common Signals  Fixed Spectral Index  
Fixed Noise  Varying Spectral Index  
Varying Noise  Varying Spectral Index  
Additional Common Signals  Varying Spectral Index  
Frequentist Analysis  
Fixed Noise  Fixed Spectral Index  
Simulations  Varying Spectral Index  
White Noise Only  
White and intrinsic spinnoise  
White and intrinsic spinnoise and DM variations 
Model  95% upper limit 

Additional Common Signals  Varying Spectral Index  
(x)  
(y)  
(z) 
When parameterising the stochastic GWB using the power law model in Eq. (3), we run two parallel sets of analyses: in the first set we fix , consistent with a stochastic GWB dominated by SMBHBs; in the second set we allow to vary freely within a prior range of [0,7]. In both cases we consider three different models:

with the intrinsic timing noise for each pulsar fixed at the maximum likelihood values given in Table 1;

with the intrinsic timing noise for each pulsar allowed to vary;

as in (ii), but including additional common uncorrelated red noise, a clock error, and errors in the Solar System ephemeris as discussed in Section 2.6.
The upper limits for the amplitude of an isotropic stochastic GWB in the six different models are listed in Table 3. In Table 4 we then list the upper limits for the additional common noise terms that were included in model (iii), when allowing the spectral indicies to vary. All upper limits in this section are reported at a reference frequency of .
The onedimensional marginalised posteriors for the amplitude of the GWB for each of these models are shown in Fig. 6. We find that in both the fixed, and varying spectral index model for the GWB, limits placed under the assumption of fixed intrinsic timing noise are erroneously more stringent than when the noise is allowed to vary, by a factor and respectively. This is a direct result of using values for the intrinsic noise that have been obtained from analysis of the individual pulsars, in which the red spinnoise signal, and any potential GWB signal will be completely covariant. The natural consequence is that fixing the properties of the intrinsic noise to those obtained from the single pulsar analysis will always push the upper limit for the GWB lower in a subsequent joint analysis.
Both of the most recent isotropic GWB limits from pulsar timing have been set using frequentist techniques, either by performing a fixed noise analysis (Demorest et al., 2013), or using simulations (Shannon et al., 2013), obtaining 95% upper limits of and respectively. In both cases, therefore, the analysis performed was fundamentally different to the Bayesian approach presented in this work. As such it is difficult to compare our results directly, or to ascertain the effect of fixing the intrinsic noise parameters on limits obtained using those methods.
The most recent limit placed when allowing the intrinsic noise parameters of the pulsars to vary is given by van Haasteren et al. (2011), in which a 95% upper limit of was obtained at a spectral index of . Our model (ii) is most comparable to this analysis, in which we obtain a 95% upper limit of , an improvement of a factor of two. This translates into a limit on at 2.8 nHz. We confirm this result by analysing the 2015 EPTA dataset with model (ii) using an independent code^{2}^{2}2https://github.com/stevertaylor/NX01, which makes use of the PAL, paralleltempered adaptive MCMC sampler^{3}^{3}3https://github.com/jellis18/PAL2 which explores the parameter space in a fundamentally different way to MultiNest, and obtain a consistent 95% upper limit.
Finally for model (iii) when including additional common or correlated terms in the analysis we find the extra parameters have a negligible impact on our sensitivity, with consistent upper limits obtained in both the fixed and varying spectral index models.
We find the upper limits for the uncorrelated common red noise model to be consistent with those obtained for the GWB, however we find the upper limit for the clock error signal to be slightly lower, with