# Free-form modelling of galaxy clusters: a Bayesian and data-driven approach

## Abstract

A new method is presented for modelling the physical properties of galaxy clusters. Our technique moves away from the traditional approach of assuming specific parameterised functional forms for the variation of physical quantities within the cluster, and instead allows for a ‘free-form’ reconstruction, but one for which the level of complexity is determined automatically by the observational data and may depend on position within the cluster. This is achieved by representing each independent cluster property as some interpolating or approximating function that is specified by a set of control points, or ‘nodes’, for which the number of nodes, together with their positions and amplitudes, are allowed to vary and are inferred in a Bayesian manner from the data. We illustrate our nodal approach in the case of a spherical cluster by modelling the electron pressure profile in analyses both of simulated Sunyaev–Zel’dovich (SZ) data from the Arcminute MicroKelvin Imager (AMI) and of real AMI observations of the cluster MACSJ0744+3927 in the CLASH sample. We demonstrate that one may indeed determine the complexity supported by the data in the reconstructed , and that one may constrain two very important quantities in such an analysis: the cluster total volume integrated Comptonisation parameter () and the extent of the gas distribution in the cluster (). The approach is also well-suited to detecting clusters in blind SZ surveys.

###### keywords:

galaxies: cluster– cosmology: observations – methods: data analysis^{1}

^{2}

## 1 Introduction

Determining the properties of clusters of galaxies, such as their total and baryonic mass, has the potential to provide an independent tool for constraining the parameters of the model, since cluster population properties are sensitive to several cosmological parameters, most notably (see e.g. Sievers et al. 2013, Planck Collaboration et al. 2016, and de Haan et al. 2016). The challenge lies, however, in obtaining a robust estimate of the cluster masses as these are not directly observable. The mass distribution within a cluster is usually measured using a variety of observational methods, including X-ray (see e.g. Vikhlinin et al. 2009, Ettori 2013, Mantz et al. 2014 and Olamaie et al. 2015), the Sunyaev–Zel’dovich (SZ) effect (see e.g. Planck Collaboration et al. 2013, Schammel et al. 2013, Hasselfield et al. 2013, Perrott et al. 2015, Rumsey et al. 2016, and Rodríguez-Gonzálvez et al. 2017), and gravitational lensing (see e.g. Rozo et al. 2010, Hoekstra et al. 2013, Rozo et al. 2014 and Battaglia et al. 2016).

Each of these approaches relies on developing some method for determining the cluster’s mass (distribution) from its observable properties, namely its distributions of X-ray surface brightness, SZ Comptonisation parameter, and weak-lensing shear distribution. This is usually achieved by modelling the physical properties through the cluster in terms of some specific parameterised functional forms. Typical examples include assuming an NFW profile (Navarro, Frenk, & White, 1996; Navarro, Frenk & White, 1997) for the dark matter density distribution, a -model (Cavaliere & Fusco-Femiano, 1976, 1978) for the gas density, or a generalised NFW (GNFW) profile (Nagai, Kravtsov & Vikhlinin, 2007) for the gas pressure. The cluster mass (distribution) is then usually calculated under the standard assumption of a spherical cluster model obeying hydrostatic equilibrium (HSE) and/or some scaling relationships.

Even for physically-based cluster models (see e.g. Olamaie et al. 2012, 2013), there still remains considerable uncertainty regarding the appropriate form one should assume for the radial variation of cluster properties, and this can lead to different cluster mass estimates (see e.g. Olamaie et al. 2012, Giodini et al. 2013, and Köhlinger, Hoekstra & Eriksen 2015, De Martino & Atrio-Barandela 2016). This may result from either adopting inappropriate functional forms, often by extrapolating their use to cluster masses and/or redshifts that are not well sampled by observations or simulations, or by fitting models that depend on parameters to which the data are insensitive.

In this paper, we therefore move away from the traditional approach of assuming specific parameterised forms for the variation of cluster properties, such as the pressure, density and/or temperature distribution, and instead allow for a ‘free-form’ reconstruction, but one for which the level of complexity is determined automatically by the observational data and may depend on position within the cluster. This is achieved by representing each independent cluster property as some interpolating or approximating function (for example, a piecewise linear interpolation or a spline) that is specified by a set of control points, or ‘nodes’. The positions and amplitudes of these nodes and, most importantly, the number of nodes used are allowed to vary and constitute the set of parameters to be inferred from the data in a Bayesian manner.

We note that we have already successfully applied such a Bayesian nodal modelling approach to a number of cosmological analysis problems, including the reconstruction of the primordial anisotropy power spectrum and the variation of the dark energy equation-of-state parameter with redshift (see e.g. Vázquez et al. 2012a,b and Hee et al. 2016, 2017 ).

The structure of this paper is as follows. In Section 2, we briefly summarise Bayesian inference, in particular parameter estimation and model selection. Our nodal approach to modelling galaxy clusters is presented in Section 3, and in Section 4 we apply it to the particular case of modelling the pressure profile in a spherical cluster observed via its SZ effect. Section 5 outlines our Bayesian methodology for inferring the cluster parameters from interferometric SZ observations, and summarises our simulated cluster observations and real observations of the cluster MACSJ0744+3927 using AMI. In Section 6, we present the results of our Bayesian nodal analysis of these simulations and real observations, and we conclude in Section 7.

## 2 Bayesian inference

For the analysis of some data in the context of a model (or hypothesis) that depends on some set of parameters , Bayes’ theorem states that

(1) |

which is usually interpreted as the prior probability distribution of the parameters being updated by the likelihood of obtaining the observed data given some set of parameter values to yield the posterior probability distribution of the parameters, which is normalised by the Bayesian evidence (which does not depend on the parameters ). The joint (unnormalised) posterior provides the complete inference in Bayesian parameter estimation, and can be subsequently marginalised over each parameter to obtain individual parameter constraints.

Similarly, in Bayesian model selection, one can calculate the probability of a model given the data as

(2) |

Taking the ratio of the probabilities of two models signifies our degree of belief in one model over another, and is given by the posterior odds ratio (POR)

(3) |

where is the Bayes factor (Jeffreys, 1961). Clearly, if , so that the two models are considered a priori equally probable, then . Table 1 lists a modern version of Jeffreys’ criteria, which are used to give meaning to this quantification (Kass & Raftery, 1995; Feroz, 2013).

Favouring of over | |
---|---|

None | |

Slight | |

Significant | |

Decisive |

Thus, for Bayesian model selection, in contrast to parameter estimation, the evidence takes a central role. Typically, the evidence for each model is calculated separately and their ratios evaluated. From (1), the evidence is the normalisation constant for the posterior, which is given by

(4) |

where is the dimensionality of the parameter space. As the average of the likelihood over the prior, the evidence embodies the notion of Occam’s razor (see e.g, Jaynes 1986 and Sivia 2005): a simple theory with a compact parameter space will have a larger evidence than a more complicated one, unless the latter is significantly better at explaining the data.

The evaluation of the multidimensional integral (4) over the whole parameter space is a challenging numerical task. We perform this calculation here using the nested sampling algorithm Multinest (Feroz & Hobson, 2008; Feroz, Hobson & Bridges, 2009; Feroz et al., 2013). This Monte-Carlo method is targeted at the efficient calculation of the evidence, but as a by-product also produces posterior inferences for parameter estimation; it is also very efficient at exploring posteriors that contain multiple modes and/or large (curving) degeneracies.

We note that elsewhere (Hee et al., 2016, 2017), we have presented an alternative method for performing Bayesian model selection, without explicitly computing evidences, which uses a combined likelihood and introduces an integer model selection parameter . If the total number of models under consideration is specified a priori, the full joint parameter space of the models is of fixed dimensionality and can be explored using standard sampling methods, without the need for trans-dimensional techniques, although the posterior is usually highly multimodal and hence nested sampling is again an obvious choice. Bayes factors, or more generally posterior odds ratios, may then be read off directly from the posterior on , which is obtained by straightforward marginalization. To keep our discussion simple, however, we will not use this method here, but plan to apply it to nodal modelling of galaxy clusters in a forthcoming publication.

In closing this section, it should be mentioned that, in general, a gain in information via a Bayesian analysis may be achieved in several ways: it can occur because of a tightening of the parameter constraints, a shift in position of the peak(s) of the distribution from prior to posterior, or an increase in the evidence (see e.g. Trotta et al. 2008, Seehars et al. 2014 and Seehars et al. 2016).

## 3 Nodal model for a galaxy cluster

As described briefly earlier, in our nodal approach to free-form modelling of a galaxy cluster, each independent physical property of the cluster is represented by some interpolating or approximating function that is determined by a set of control points, or nodes. In principle, these functions can be fully three-dimensional to allow modelling of arbitrary structure in each property of interest in the cluster. To illustrate the method simply, however, we will consider here the special case of a spherical cluster, such that each property is a function only of radius from the cluster centre. Moreover, we will specialise still further to the case where one constructs a nodal model of just a single property of interest, described by some function . It is a straightforward matter to extend the following analysis to multiple properties of interest.

The basic idea is to represent not with some standard parameterised functional form, as in most current cluster analyses, but in terms of a number of nodes in -space, together with their corresponding positions and amplitudes (thus each node can ‘move’ both horizontally and vertically– see figure 1; although we will in fact restrict the movement of the ‘end’ nodes, as described below). The nodes act as control points for the continuous function . In this way, one obtains a continuous free-form reconstruction of the profile , for which the complexity is regularised by the data under analysis.

One is free to choose from a wide range of possible interpolation or approximation methods, such as polynomials, rational functions, splines, Bézier curves or even Gaussian processes. It should be noted, however, that some forms of smooth interpolating or approximating function, such as cubic splines, have continuity requirements on the function and its derivatives which can significantly reduce the ability of the resulting to reproduce abrupt features in the cluster profile (Vázquez et al., 2012). In principle, the nature of the interpolating or approximating function could be determined by performing a straightforward Bayesian model selection between the various options, although we will not consider that further here. Instead, for illustration, we choose here the simplest approach of linearly interpolating between the nodes; since we are performing an interpolation, rather than an approximation, one has .

As mentioned above, given the nature of the one-dimensional function that we wish to model in the case of a galaxy cluster, we typically restrict the movement of the first and last nodes (or end nodes) as follows. The first node has a fixed position, at the origin, such that always. Consequently, its corresponding amplitude parameter represents the central value , which is often of interest in galaxy cluster analyses. In contrast, the last node has a fixed amplitude of zero, such that . Thus, its corresponding position parameter can sometimes be interpreted as an extent of the cluster , which is again often of interest (although this interpretation does depend on the nature of the quantity being modelled). In our demonstration of the method presented in Section 4, represents the electron pressure profile of the cluster and so represents the extent of the cluster gas. Figure 1 illustrates our linearly-interpolated nodal model of for nodes.

In the context of Bayesian inference, we denote the model consisting of nodes by , which has the parameters for (with delta function priors on and ), plus any further ‘global’ parameters, such as the cluster position on the sky or parameters describing any contaminating signals or noise in the data. The priors on the parameters may be chosen to accommodate whatever information is available a priori. In general, however, we typically also impose a ‘sorting condition’ on the node positions, such that . This can be done straightforwardly, without the need to reject any samples. Indeed, if each node position is considered to be drawn from a uniform distribution in some given range, as is usually the case, the corresponding (non-separable) joint prior on the node positions, which incorporates the sorting condition , has an analytic form in terms of the beta distribution (Handley, Hobson & Lasenby 2015). It is also worth mentioning that, although we will assume here that each model is equally likely a priori, such that (and hence PORs are equivalent to Bayes factors), this is not necessary. One may view this assumption as imposing a uniform prior (within some range) on the number of nodes, but one could equally well impose, for example, a Poisson prior on with some given mean, from which one can read off the corresponding values of .

In the straightforward model selection approach used here, one begins by analysing the model, using Multinest to calculate the evidence and obtain samples from the posterior . This process is then repeated separately for until the evidence has decreased well below that of the most favoured model. One may then proceed either by conditioning on the most favoured model and simply infer parameters and the corresponding constraints on from the posterior or perform a ‘multi-model’ inference (see e.g. Parkinson & Liddle 2013, Hee et al. 2016, 2017, and Planck Collaboration et al. 2016) in which the constraints on are determined by averaging over all the models considered, weighted by their PORs. In the interests of simplicity we will adopt the former approach here.

## 4 Application to SZ observations

Although our nodal approach may be used to model observations of galaxy clusters in any waveband, we illustrate the method here by applying it to SZ observations (see e.g. Sunyaev & Zeldovich 1970, Birkinshaw 1999, and Carlstrom, Holder & Reese 2002). Since the SZ surface brightness is proportional to the line-of-sight integral of the electron pressure, as we show below, it is natural to use the nodal approach to model the radial electron pressure profile of the ionised gas within the cluster.

The observed SZ surface brightness in the direction of an electron reservoir is given by

(5) |

Here is the blackbody spectrum, K (Fixsen et al. 1996) is the temperature of the CMB radiation, is the frequency dependence of thermal SZ signal, , is Planck’s constant, is the frequency and is Boltzmann’s constant. The function takes into account the relativistic corrections in the study of the thermal SZ effect which is due to the presence of thermal weakly relativistic electrons in the ICM and is derived by solving the Kompaneets equation up to the higher orders (Rephaeli 1995, Itoh, Kohyama, & Nozawa 1998, Nozawa, Itoh, & Kohyama 1998, Pointecouteau, Giard, & Barret 1998, Challinor & Lasenby 1998). It should be noted that at 15 GHz (AMI observing frequency) and therefore the relativistic correction, as shown by Rephaeli (1995), is negligible for . The dimensionless parameter , known as the Comptonization parameter, is the integral of the number of collisions multiplied by the mean fractional energy change of photons per collision, along the line of sight

(6) |

where is the electron pressure at radius respectively, is Thomson scattering cross-section, is the electron rest mass, is the speed of light and is the line element along the line of sight. It should be noted that (6) assumes an ideal gas equation of state.

The integral () of the Comptonisation parameter over the solid angle subtended by the cluster is proportional to the volume integral of the gas pressure. It is thus a good estimate for the total thermal energy content of the cluster and hence its mass (see e.g. Bartlett & Silk 1994). The parameter in cylindrical and spherical geometries, respectively, may be described as

(7) | |||||

(8) |

where is the projected radius of the cluster on the sky. Thus, by using our nodal approach to model , one may constrain in either geometry.

## 5 Analysis of interferometric SZ data

In order to verify that our proposed model, with its corresponding assumptions, can describe profiles of cluster physical properties accurately, we carry out a Bayesian analysis of simulated SZ observations using AMI (AMI Consortium: Zwart et al., 2008) of a set of three clusters, as well as real AMI SZ observations of the cluster MACSJ0744+3927 (Rumsey et al., 2016).

### 5.1 Bayesian methodology

An interferometer like AMI operating at a frequency measures samples from the complex visibility plane . These are given by a weighted Fourier transform of the surface brightness , namely

(9) |

where is the position on the sky relative to the phase centre, is the (power) primary beam of the antennas at observing frequency (normalised to unity at its peak) and is the baseline vector in units of wavelength.

Details of our Bayesian methodology for modelling interferometric SZ data, primordial CMB anisotropies, and resolved and unresolved radio point-source are given in Hobson & Maisinger (2002); Feroz & Hobson (2008); Feroz et al. (2009); AMI Consortium: Davies et al. (2011) and AMI Consortium: Olamaie M. et al. (2012).

In short, the measured interferometer visibilities in our model are assumed to have the form

(10) |

where the signal part, , contains the contributions from the SZ cluster and identified radio point sources, whereas the generalised noise part, , contains contributions from the background of unsubtracted radio point sources, primary CMB anisotropies and instrumental noise. The last two contributions to the generalised noise are well described by Gaussian processes, whereas the background of unsubtracted radio sources is strictly a Poisson process. Nonetheless, in the limit of a large number of faint unsubtracted sources, this contribution can also be well approximated as Gaussian (Feroz et al. 2009). Thus, the generalised noise is assumed to be Gaussian distributed, so that the likelihood function has the form

(11) |

where is the standard statistic quantifying the misfit between the observed data and the predicted data :

(12) |

in which is the generalised noise matrix relating the frequency channels and . Under the assumption that the three contributions to the generalized noise discussed above are independent, this matrix is simply the sum of the covariance matrices for each component. These matrices are described in Feroz et al. (2009) and are assumed to be independent of the parameters to be fitted in the analysis. In particular, the primary CMB anisotropies are assumed to be consistent with the concordance cosmology.

In the Bayesian analysis, the point sources are modelled using delta-function priors on position and Gaussian priors on flux and spectral index, usually determined from higher-resolution observations from the AMI Large Array, as discussed in Feroz et al. (2009). We focus here, however, on the inference of the cluster parameters. For the model , having nodes, the cluster parameters are

(13) |

where and are the cluster projected position on the sky. We carry out the analysis for . We assume that the priors on these sampling parameters are separable, apart from imposing the ‘sorting condition’ on the nodes positions, as discussed in Section 3, so that

(14) |

The position of the first node is fixed to , whereas the position of the last node has a uniform prior in the range . The positions of the intervening nodes have uniform priors in the range . The prior on the pressure of the first node is a truncated exponential distribution with mean MeVm in the range , whereas the pressure of the last node is fixed to zero. The pressures of the intervening nodes have uniform priors in the range . Finally, we assume Gaussian priors on the cluster position parameters and , centred on the origin with a standard deviation of 1 arcmin. It is also worth mentioning that we assume here that each model is equally likely a priori, such that (and hence PORs are equivalent to Bayes factors), which is equivalent to imposing a uniform prior on in the range . We note that the above priors are chosen to be consistent with the results of -body simulations and real cluster observations of galaxy clusters (see e.g. Olamaie et al. 2012, Olamaie et al. 2013 and references therein). In all analyses, the redshift of the cluster is assumed known.

### 5.2 Simulated AMI observations

To generate simulated SZ skies and observe them with a model AMI instrument, we use the methods outlined in Hobson & Maisinger (2002), Grainge et al. (2002), Feroz et al. (2009), and AMI Consortium: Olamaie M. et al. (2012). In particular, we generate simulated observations of three different model clusters, which we call , and . These differ in the input pressure profile used to generate them, and all include primordial CMB anisotropies, receiver thermal noise, and simulated residual points sources typical of AMI pointed observations of clusters.

For and , we assume an input pressure profile that matches our nodal model precisely, in order to investigate the ability of our approach to recover the true parameter values in the simplest case. In particular, the pressure profiles for and are generated by linear interpolation between and nodes, respectively, and are plotted in figure 2. For both simulations, we assume the cluster lies at a redshift of .

For SIM, we use a more realistic cluster model to test the ability of our nodal approach to recover a pressure profile that is not of the form assumed in the analysis. In this case, the cluster is simulated using the model described in Olamaie et al. (2012, 2013), which assumes that the dark matter density follows an NFW profile and the ICM plasma pressure is described by the generalised NFW (GNFW) profile. The model also assumes that hydrostatic equilibrium is satisfied and that the local gas fraction is small throughout the cluster. This cluster model is fully specified by just three parameters, for which we assume the values , and . The resulting pressure profile is shown in figure 2, and is formally singular at the origin, decreasing sharply with radius in the central regions of the cluster.

### 5.3 AMI observations of MACSJ0744+3927

We also analyse real AMI observations of MACSJ0744+3927, one of the clusters in the CLASH (Cluster Lensing And Supernova survey with Hubble) sample (Postman et al., 2012). MACSJ0744+3927 is a rich cluster at redshift z = 0.689 and has been studied through its X-ray emission, strong lensing, weak lensing and SZ effect (Schmidt & Allen, 2007; Ettori & Balestra, 2009; Umetsu et al., 2016; Rumsey et al., 2016). The SZ signal (decrement) on the AMI map appears circular, (figure 7 in Rumsey et al. 2016) in agreement with the X-ray surface brightness from the Chandra archive data (figure 6 in Postman et al. 2012). Details of AMI pointed observation towards the cluster, data reduction pipeline and mapping are described in Rumsey et al. (2016). In particular, we note that the Bayesian analysis includes 23 radio point sources in the AMI field. We focus here, however, on the determination of the parameters defining our nodal model of the cluster.

## 6 Results and Discussion

We now present the results of our Bayesian nodal analysis applied to the three simulated data sets , , and , and real AMI observations of the cluster MACSJ0744+3927.

### 6.1 Simulation SIM

The results of our evidence-based Bayesian model selection analysis to determine the number of nodes in the reconstruction for are given in figure 3. This shows the histogram of Bayes factors , i.e. relative to the ‘straight line’ model, as a function of the number of nodes in the reconstruction of the pressure profile for simulation SIM. Given our prior choice on the models, the Bayes factors are equal to the PORs , and so the plotted histogram is equivalent to the marginalised posterior on . Recalling that the input pressure profile for SIM is constructed by linear interpolation between nodes, one sees that our analysis has recovered the true value of as the most favoured. In particular, it is worth noting that the log Bayes factor , indicating strong evidence for the model over the (straight line) model, according to Table 1. The Bayes factors then gradually decline for , ultimately reaching the value , which indicates no preference for over the model, demonstrating that the ability of the model to (over)fit the data is offset by the penalty of its increased complexity.

As mentioned earlier, we will adopt here the straightforward approach of determining the constraints on parameters by conditioning on the most favoured model , rather than performing model averaging according to their Bayes factors. Since it is of interest to understand any biases or constraints on the parameters imposed by our choice of priors, we first consider the ‘posterior’ on the cluster parameters obtained in the absence of any data. This is calculated simply by setting the likelihood to a constant value, so that the sampler explores just the prior . The resulting 1-D and 2-D marginalised distributions for the sampling parameters are shown in figure 4 (left panel); in addition we plot the derived parameter defined in equation (7). These plots show that we correctly recover the assumed prior distributions, and also reveal the constraints that our choice of priors has placed on the derived parameter . The plots are produced using the open source Python library corner.py (Foreman–Mackey, 2016).

The corresponding plot obtained after analysing the simulation SIM is shown in figure 4 (right panel), and shows the effect that the data have in updating the prior, via the likelihood function, to produce the posterior, in the spirit of Bayes’ theorem. In particular, one sees that the cluster position ( and ) on the sky is firmly constrained and the true values lie within few arcseconds of the means of the posterior probability distributions for and . Turning to the parameters defining the nodal model of the pressure profile, one sees that the amplitude of the first node is not well constrained, and we are essentially recovering just the prior distribution. This is to be expected, since an interferometric SZ observation is insensitive to lengths scales on the sky that correspond to Fourier modes in the visibility plane lying well below the shortest baseline (in units of wavelengths) of the interferometer. Consequently, for this simulation, the observations cannot probe the cluster inner core and thus provide no information on the pressure at the centre. By contrast, the position and amplitude of the second node, (, ), are both constrained relative to their prior distributions, although their 2D marginal reveals a clear degeneracy between them. The position of the third (and final) node is very well constrained. Since this node corresponds to the point at which the gas pressure drops to zero, it is a valuable quantity for defining the extent of the gas distribution in the cluster. Indeed, obtaining a robust constraint on this quantity can provide insight to the dynamical state of the cluster. Finally, we note that the important derived parameter is also very well constrained. From their 2-D marginal, however, one sees that there is some degeneracy between the parameters and .

Rather than viewing the posterior constraints on the individual parameters , , and , as in figure 4, it can be more intuitive and instructive to plot the corresponding inference on the reconstructed pressure profile directly. This may be performed in a number of ways. For example, one may plot the posterior probability , in normalised slices at constant (Hee et al., 2016). We will not pursue that method here, however, and instead adopt the simpler approach of plotting the mean position and amplitude of each node, together with horizontal and vertical error bars representing the 68 percent marginalised Bayesian credible interval in each direction; these nodes are then linearly interpolated to obtain the reconstructed pressure profile, as shown in figure 5. The input pressure profile used to generate the simulation is also plotted. As one might expect from figure 4 (right panel), the reconstructed pressure profile is consistent with the input profile, with the true node locations in -space all lying within the Bayesian credible intervals of the corresponding inferred node locations. It is again clear, however, that the central pressure is poorly constrained, as discussed above, but that the remaining node parameters are reasonably well determined.

In the remainder of the paper, we will plot the reconstructed directly, as in figure 5, together with the posterior distributions on the position ( and ) of the cluster and the important physical parameters and .

### 6.2 Simulation SIM

The histogram of Bayes factors as a function of obtained in the analysis of simulation SIM is shown in figure 6. Recalling that the input pressure profile for SIM is constructed by linear interpolation between nodes, one sees that our analysis has again recovered the true value of as the most favoured. In this case, one sees the (straight-line) model is again strongly disfavoured. In particular, one finds , indicating strong evidence for the most favoured model over the model, according to Table 1. The Bayes factors then gradually decline for in a similar way to that found for SIM, ultimately reaching the value ; this again indicates no preference for over the model, for the reasons discussed above.

Conditioning on nodes, figure 7 shows the reconstructed pressure profile (top panel), and the 1-D and 2-D marginal posterior distributions on the cluster position parameters and (middle panel) and on the gas extent and total Comptonisation parameter (bottom panel). One again sees that the reconstructed pressure profile is consistent with the input profile used to generate the simulation, but that cluster gas pressure is poorly constrained, for the reasons we discussed above in the context of SIM. The remaining node parameters are again reasonably well constrained, especially the first internal node parameters . Moreover, one again obtains tight constraints on the cluster position, consistent with the input values. The cluster extent and total Comptonisation parameter are also both well constrained and in agreement with the input values. As in SIM, however, the 2-D marginal distribution of and reveals a mild degeneracy.

### 6.3 Simulation SIM

The histogram of Bayes factors as a function of obtained in the analysis of simulation SIM is shown in figure 8. The input pressure profile for SIM is not constructed from a linear interpolation between nodes, but instead from the cluster model of Olamaie et al. (2012, 2013), which assumes that the pressure is described by the generalised NFW (GNFW) profile (Nagai, Kravtsov & Vikhlinin, 2007). Hence, in this case, there is no ‘correct’ number of nodes to recover. Instead, the most favoured value of nodes gives an indication of the level of complexity in the pressure profile reconstruction that is supported by the data. Thus, for interferometric SZ observations of the type simulated, the data support only a very simple reconstruction of the pressure profile, favouring a representation consisting of just two straight-line segments. Indeed, the variation of the Bayes factor with is very similar to that shown in figure 3 for SIM, for which the input pressure profile had precisely this simple form.

Conditioning on nodes, figure 9 shows the reconstructed pressure profile (top panel), and the 1-D and 2-D marginal posterior distributions on the cluster position parameters and (middle), and the gas extent and total Comptonisation parameter (bottom). Although in this simulation there are no ‘correct’ locations for the nodes in -space, one sees that the reconstructed node locations yield reconstructed pressure profiles that are consistent with the input one, although the plotted ‘best-fit’ profile is somewhat shallower than the GNFW profile assumed in the simulation. One can understand this behaviour by recalling that the SZ Comptonisation parameter given in (7) is proportional to the integral of along the line-of-sight through the cluster. Thus, the larger amplitude of the first (only) internal node relative to the input profile can offset the smaller amplitude of the node at the origin. Once again our analysis serves to highlight the rather coarse level of detail in the reconstructed pressure profile that is achievable with SZ observations of the type simulated. Nonetheless, one still obtains tight constraints on the cluster position, which are consistent with the input values, and also on the cluster extent and total Comptonisation parameter. We again see, however, that there is a slight degeneracy in the 2-D marginal distribution of and .

### 6.4 Cluster MACSJ0744+3927

The histogram of Bayes factors as a function of obtained in the analysis of real AMI observations of the cluster MACSJ0744+3927 is shown in figure 10. For this real cluster, one sees that the variation of the Bayes factor is broadly similar to that shown in figure 8, obtained from the analysis of simulation SIM, for which an input GNFW pressure profile was assumed. In particular, the most favoured model again has nodes, after which the Bayes factors gradually decline with increasing . This indicates that the data support a model for the pressure profile that is no more complex than two straight-line segments. It is worth noting, however, that in comparing the favoured model with the base (straight-line) model, one obtains , which corresponds to a decisive favouring of the former model, according to Table 1. Thus, one may deduce at high confidence from our analysis, in a model independent manner, that the pressure profile is not simply a linear function of .

Conditioned on nodes, figure 11 shows the reconstructed pressure profile (top panel), and the 1-D and 2-D marginal posterior distributions on the cluster position parameters and (middle), and the gas extent and total Comptonisation parameter (bottom). Once again the central pressure is poorly constrained, but the remaining node locations in -space are better determined. It is interesting that the resulting reconstructed pressure profile is convex, rather than concave such as those obtained from the simulated observations, although the uncertainty in the location of the single internal node also allows for concave profiles. It is also worth recalling from our analysis of simulated data how SZ observations of this type allow for only a very coarse reconstruction of the cluster pressure profile, owing to the SZ effect being proportional to the line-of-sight integral of the pressure. Indeed, from figure 9, we recall that in the analysis of simulation SIM the reconstructed pressure profile was somewhat flatter than the input GNFW profile, although still consistent with it to within the error-bars, and a similar effect could be occurring in figure 11.

Nonetheless, as we found in our analysis of simulated data, the cluster position ( and ) is well constrained, and the corresponding 2-D marginal posterior distribution shows no sign of degeneracy. The gas extent and total Comptonisation parameter are also both very well determined, although their 2-D marginal posterior distribution shows that same slight degeneracy as seen in the analyses of the simulation observations.

## 7 Conclusions

Almost all current approaches to modelling observations of galaxy clusters rely on assuming some parameterised functional form for the properties of the cluster, such as gas density, dark matter density or temperature. A generic weakness of this approach is that these functional forms have usually been arrived at through empirical means, via the analysis of -body simulations or observations, and are often chosen to have simple analytic expressions, rather than being fundamental or physically well motivated.

In this paper, we have moved away from this approach and presented a free-form model for the physical properties of galaxy clusters. Previous attempts to model clusters in this way have typically relied simply on dividing the cluster into a predefined number of cells, or concentric shells for spherical clusters, and determining the value of each physical quantity of interest within these subregions (Tchernin et al., 2015). Such approaches typically lead to under-determined inverse problems that therefore need to be regularised in some way. There is considerable freedom in how to choose the level or nature of the regularisation to apply, and the results can vary significantly depending on how this choice is made. We have therefore presented an alternative approach to free-form reconstruction in which the complexity of the model is determined directly from the data. This is achieved by representing each independent cluster property as some interpolating or approximating function that is specified by a set of control points, or ‘nodes’, for which the number of nodes, together with their positions and amplitudes, are allowed to vary and are inferred from the data in a Bayesian manner, employing both model selection and parameter estimation.

To demonstrate our approach in a simple setting, we have applied it to the particular case of modelling interferometric SZ observations of spherical galaxy clusters. In this context, the free-form part of the cluster model is simply a nodal representation of the electron pressure profile . We have performed Bayesian analyses of simulated observations with the Arcminute Microkelvin Imager (AMI) of three separate model clusters.

In the first two simulations, the input pressure profile has the same form as that assumed in the analysis, namely a linear interpolation between a set of nodes (with and , respectively). We showed that, in both cases, our Bayesian model selection analysis returned the true value of as the most favoured. Moreover the resulting reconstructed pressure profiles were consistent with those used as input. In our third simulation, in which the input pressure was assumed to follow a GNFW, the most favoured model again had nodes, and the resulting reconstructed pressure profile was consistent with the input one. In all cases, we found that the central pressure of the cluster is not well determined, since interferometric observations of the type simulated do not probe length scales corresponding to the inner core. In the analysis of our third simulation, we also noted that the reconstructed pressure profile was somewhat shallower that the singular GNFW profile used to generate the simulation (although still consistent with it), which results from the SZ effect being proportional only to the line-of-sight integral of the pressure in the cluster. A general feature of our results is that SZ interferometric observations of this type allow for only a very coarse reconstruction of the cluster pressure profile. Nonetheless, we also find that in all cases one obtains tight constraints on the cluster position, and that the cluster extent and total Comptonisation parameter are also both well determined.

We also applied our approach to real AMI observations of the cluster MACSJ0744+3927. We found that the most favoured model has nodes, and that the corresponding best-fit reconstructed pressure profile is convex, although the uncertainties of the node locations in -space also allow for concave profiles. As we found in the analysis of simulations, the central pressure is poorly determined but the remaining node parameters are reasonably well constrained. Once again, we found that cluster position, cluster gas extent and total Comptonisation parameter are all very well constrained.

In closing, some further general points and avenues for future research are worth discussing. First, the tight constraints obtained on the cluster position and on the two very important cluster parameters and demonstrate the robustness of our approach. Moreover, with only minor modification, the method may prove very useful in cluster detection. Although, for the sake of illustration, we assumed the cluster redshift in our analyses presented here, this is not necessary. One can easily re-perform the analysis by instead constructing a nodal model for the pressure profile , where is the projected angle on the sky from the centre of the cluster. In this way, the approach does not depend on the redshift, but will still produce the tight constraints on the cluster position, angular extent and . Finally, since our Bayesian approach to the inference produces posterior weighted samples in the parameter space, further directions for future development include defining other derived parameters that capture particular features of interest in the pressure profile. One example, motivated by our analysis of the cluster MACSJ0744+3927, would be a statistic that embodies the concavity or convexity of the pressure profile. Others might include a parameter that quantifies the cuspy versus core nature of the central region of the cluster. In any case, one may easily use the posterior samples to determine the full (joint) posterior distribution of such derived parameters.

## Acknowledgments

The authors thank William Handley for illuminating discussions regarding Bayesian inference. This work was performed using both the Darwin Supercomputer of the University of Cambridge High Performance Computing Service (http://www.hpc.cam.ac.uk/), and COSMOS Shared Memory system at DAMTP, University of Cambridge operated on behalf of the STFC DiRAC HPC Facility. Darwin Supercomputer is provided by Dell Inc. using Strategic Research Infrastructure Funding from the Higher Education Funding Council for England and funding from the Science and Technology Facilities Council. COSMOS Shared Memory system is funded by BIS National E-infrastructure capital grant ST/J005673/1 and STFC grants ST/H008586/1, ST/K00333X/1. We are grateful to Stuart Rankin and COSMOS management team for their computing assistance. MO thanks Astro Hack Week 2016 for valuable discussions and insights on Bayesian inference and statistics. YCP acknowledges support from a Trinity College Junior Research Fellowship.

### Footnotes

- pagerange: LABEL:firstpage–References
- pubyear: 2017

### References

- AMI Consortium: Davies et al., 2011, MNRAS, 415, 2708
- AMI Consortium: Olamaie M. et al., 2012, MNRAS, 419, 2921
- AMI Consortium: Zwart J. T. L., et al., 2008, MNRAS, 391, 1545
- Arnaud M., Pratt G. W., Piffaretti R., Böhringer H., Croston J. H., Pointecouteau E., 2010, A &A, 517, A92
- Battaglia N., et al., 2016, JCAP, 8, 013
- Birkinshaw M., 1999, PhR, 310, 97
- Bonamente M., et al., 2012, NJPh, 14, 025010
- Borgani S., et al., 2004, MNRAS, 348, 1078
- Carlstrom J. E., Holder G. P., Reese E. D., 2002, ARA &A, 40, 643
- Cavaliere A., Fusco-Femiano R., 1976, A&A, 49, 137
- Cavaliere A., Fusco-Femiano R., 1978, A&A, 70, 677
- Challinor A., Lasenby A., 1998, ApJ, 499, 1
- de Haan T., et al., 2016, ApJ, 832, 95
- De Martino I., Atrio-Barandela F., 2016, MNRAS, 461, 3222
- Ettori S., Tozzi P., Borgani S., Rosati P., 2004, A&A, 417, 13
- Ettori S., Balestra I., 2009, A&A, 496, 343
- Ettori S., Gastaldello F., Leccardi A., Molendi S., Rossetti M., Buote D., Meneghetti M., 2011, A&A, 526, 1
- Ettori S., 2013, MNRAS, 435, 1265
- Feroz F., Hobson M. P., 2008, MNRAS, 384, 449
- Feroz F., Hobson M. P., Zwart J. T. L., Saunders R. D. E., Grainge K. J. B., 2009, MNRAS,398, 2049
- Feroz F., Hobson M. P., Bridges M., 2009, MNRAS, 398, 1601
- Feroz F., Hobson M. P., Cameron E., Pettitt A. N., 2013, arXiv:1306.2144
- Feroz F.,2013,IEEE 13th International Conference, 10.1109/ICDMW.2013.21
- Foreman–Mackey D., 2016, The Journal of Open Source Software, 24
- Grainge K., Jones M. E., Pooley G., Saunders R., Edge A., Grainger W. F., Kneissl R., 2002, MNRAS, 333, 318
- Giodini S., Lovisari L., Pointecouteau E., Ettori S., Reiprich T. H., Hoekstra H., 2013, SSRv, 177, 247
- Handley W. J., Hobson M. P., Lasenby A. N., 2015, MNRAS, 453, 4384
- Hasselfield M., et al., 2013, JCAP, 7, 008
- Hee S., Handley W. J., Hobson M. P., Lasenby A. N., 2016, MNRAS, 455, 2461
- Hee S., Vázquez J. A., Handley W. J., Hobson M. P., Lasenby A. N., 2017, MNRAS, 466, 369
- Hobson M. P., Maisinger K., 2002, MNRAS, 334, 569
- Hoekstra H., Bartelmann M., Dahle H., Israel H., Limousin M., Meneghetti M., 2013, SSRv, 177, 75
- Itoh N., Kohyama Y., Nozawa S., 1998, Ap J, 502, 7
- Jaynes E. T., 1986, Bayesian methods: an introductory tutorial, Cambridge University Press
- Jeffreys H., 1961, Theory of Probability, 3rd ed. Oxford: University Press.
- Kass Robert E. and Raftery Adrian E., 1995, Journal of the American Statistical Association, 90, 430
- Köhlinger F., Hoekstra H., Eriksen M., 2015, MNRAS, 453, 3107
- Mantz A. B., Allen S. W., Morris R. G., Rapetti D. A., Applegate D. E., Kelly P. L., von der Linden A., Schmidt R. W., 2014, MNRAS, 440, 2077
- Nagai D., Kravtsov A. V., Vikhlinin A., 2007, ApJ, 668, 1
- Navarro J. F., Frenk C. S., White S. D. M., 1996, ApJ, 462, 563
- Navarro J. F., Frenk C. S., White S. D. M., 1997, ApJ, 490, 493
- Nozawa S., Itoh N., Kohyama Y., 1998, ApJ, 508, 17
- Olamaie M. et al., 2012, MNRAS, 421, 1136
- Olamaie M., Hobson M. P., Grainge K. J. B., 2012, MNRAS, 423, 1534
- Olamaie M., Hobson M. P., Grainge K. J. B., 2013, MNRAS, 430, 1344
- Olamaie M., Feroz F., Grainge K. J. B., Hobson M. P., Sanders J. S., Saunders R. D. E., 2015, MNRAS, 446, 1799
- Parkinson D., Liddle A. R., 2013, SADM, 6, 3
- Planck Collaboration, et al., 2013, A&A, 550, A128
- Planck Collaboration, et al., 2016, A&A, 594, A24
- Perrott Y. C., et al., 2015, A&A, 580, A95
- Planck Collaboration, et al., 2016, A&A, 594, A20
- Pointecouteau E., Giard M., Barret D., 1998, A&A, 336, 44
- Postman M., et al., 2012, ApJS, 199, 25
- Rephaeli Y., 1995, ARA&A, 33, 541
- Rodríguez-Gonzálvez C., Chary R. R., Muchovej S., Melin J.-B., Feroz F., Olamaie M., Shimwell T., 2017, MNRAS, 464, 2378
- Rozo E., et al., 2010, ApJ, 708, 645
- Rozo E., Bartlett J. G., Evrard A. E., Rykoff E. S., 2014, MNRAS, 438, 78
- Rumsey C., et al., 2016, MNRAS, 460, 569
- Sayers J., et al., 2013, ApJ, 768, 177
- Schammel M. P., et al., 2013, MNRAS, 431, 900
- Schmidt R. W., Allen S. W., 2007, MNRAS, 379, 209
- Seehars S., Amara A., Refregier A., Paranjape A., Akeret J., 2014, PhRvD, 90, 023533
- Seehars S., Grandis S., Amara A., Refregier A., 2016, PhRvD, 93, 103507
- Sievers J. L., et al., 2013, JCAP, 10, 060
- Sivia D. S., Skilling J., 2005, Data Analysis A Bayesian tutorial, Oxford University Press
- Sunyaev R. A., Zeldovich Y. B., 1970, CoASP, 2, 66
- Tchernin C., Majer C. L., Meyer S., Sarli E., Eckert D., Bartelmann M., 2015, A&A, 574, A122
- Trotta R., Feroz F., Hobson M., Roszkowski L., Ruiz de Austri R., 2008, JHEP, 12, 024
- Umetsu K., Zitrin A., Gruen D., Merten J., Donahue M., Postman M., 2016, ApJ, 821, 116
- Vázquez J. A., Bridges M., Hobson M. P., Lasenby A. N., 2012, JCAP, 9, 020
- Vázquez J. A., Bridges M., Hobson M. P., Lasenby A. N., 2012, JCAP, 6, 006
- Vikhlinin A., Markevitch M., Murray S. S., Jones C., Forman W., Van Speybroeck L., 2005, ApJ, 628, 655
- Vikhlinin A., Kravtsov A., Forman W., Jones C., Markevitch M., Murray S. S., Van Speybroeck L., 2006, ApJ, 640,691
- Vikhlinin A., et al., 2009, ApJ, 692, 1060