# Discrete versus continuous domain models for disease mapping

###### Abstract

Disease mapping aims to assess variation of disease risk over space and identify high-risk areas. A common approach is to use the Besag-York-Mollié model on data aggregated to administrative areal units. When precise geocodes are available, it is more natural to use Log-Gaussian Cox processes. In a simulation study mimicking childhood leukaemia incidence using actual residential locations of all children in the canton of Zürich, Switzerland, we compare the ability of these models to recover risk surfaces and identify high-risk areas. We then apply both approaches to actual data on childhood leukaemia incidence in the canton of Zürich during 1985-2015.

Konstantinoudis G]Garyfallos Konstantinoudis Schuhmacher D]Dominic Schuhmacher Rue H]Håvard Rue Konstantinoudis G, Schuhmacher D, Rue H & Spycher B]Ben Spycher

## 1 Introduction

Disease mapping, i.e. calculating and visualising disease risk across space, is an important exploratory tool in epidemiology. The information obtained can provide new clues about the aetiology of a disease, identify areas of high risk or hot spots, and support monitoring prevention efforts. Data used for disease mapping usually consist of disease counts in smaller area units, typically administrative units such as counties, covering a larger area of interest. Mapping directly area-level incidence can be misleading, often yielding extreme estimates when the denominator (population at risk) is small (Wakefield, 2007). This problem is usually confronted by exploiting spatial autocorrelation and borrowing information from neighbouring areas. In the Bayesian framework a popular class of models are those proposed by Besag et al. (1991), often referred to as Besag–York–Mollié (BYM) models, which assume global and local smoothing through conditional autoregressive priors; see Freni-Sterrantino et al. (2018) for a recent treatment. Less frequently, exact geocodes are available, allowing modelling a disease as a point process over the continuous spatial domain. An attractive model class of choice in this situation are the Log-Gaussian Cox processes (LGCPs), among other things because of the tractability of their first and second moments (Møller et al., 1998). Nowadays we have the computational tools to fit LGCPs in reasonable time but the additional benefits over the widely used BYM model are not well understood.

Disease mapping based on areal data is commonly done using the BYM model, see Halonen et al. (2016); Riesen et al. (2018) for examples. The BYM model is an extension of the ICAR (Intrinsic Conditional Autoregressive) model, obtained by adding a spatially unstructured random effect to the already given spatially structured random effect. The latter is a realisation of a Gaussian Markov random field (GMRF) with zero mean and a sparse precision matrix capturing strong spatial dependence (Rue and Held, 2005). The unstructured random effect may be seen as a collection of independent random intercepts for the various areal units. This specification leads to a piecewise constant risk surface which depends on the spatial unit selected and assumes uniform risk across this spatial unit. Advances in Bayesian inference using integrated nested Laplace approximations (INLA) have made this method widely accessible and investigators can get quickly posterior estimates (Rue et al., 2009; Illian et al., 2012; Rue et al., 2017; Bakka et al., 2018). The combination of easy accessible data and freely available code with a toolbox (Lindgren and Rue, 2015) have contributed significantly to the popularity of the BYM model (Blangiardo et al., 2013).

When precise geocodes are available, it is more natural to study the point pattern using spatial point process models, see Diggle et al. (2005, 2013) and Giorgi et al. (2016) for examples in disease mapping. LGCPs model locations of cases (geocodes) as an inhomogeneous Poisson process conditional on a latent field, which is a realisation of a Gaussian random field (GRF) (Møller et al., 1998; Illian et al., 2012; Yuan et al., 2017). In order to do computations, the GRF is often discretized to a regular grid. The covariance matrix of the discretized field has an intuitive interpretation, but is typically a dense matrix, leading to high computation costs (big problem; Lasinio et al., 2013). Computational techniques can be exploited which make this procedure tractable, but when combined with Monte Carlo algorithms the computational burden remains large. Advances include more efficient inferential tools that use better proposal mechanisms (Girolami and Calderhead, 2011), INLA (Rue et al., 2009) or different approximations of the covariance matrix (Heaton et al., 2017). Recently, Lindgren et al. (2011) proposed a finite element based approximation to the stochastic weak solutions of the stochastic partial differential equations (SPDE; Whittle, 1954) that describe certain GRFs with Matérn covariance function. This approach allows to specify an arbitrary triangulation of space and yields a GMRF representation of the (approximate) solution indexed by the vertices; see Bakka et al. (2018) for a recent review. This is more appealing than the dense LGCP approach described above, since the Markov property allows to do computations based on a sparse precision matrix, while keeping the continuous GRF model without an artificial specification of a regular grid; see Pereira et al. (2017) for an example.

The continuous nature of LGCPs leads to several preferable theoretical characteristics compared to the BYM models. First LGCPs are resolution invariant, i.e. they bypass all the problems arising when dealing with arbitrary boundaries; for example, the modifiable areal unit problem, where the results are highly dependent on the areal unit selected (Openshaw, 1984). Inference for BYM is also complicated by numerous irregular changes in the regions on which health data is reported (Li et al., 2012b). In addition, BYM assumes constant risk within the spatial units, but in most situations the unknown spatial covariates associated with the disease of interest are expected to be continuous, making this starting point a strong assumption. Furthermore, if the areas of higher risk are smaller than the areal unit selected, the BYM model is not expected to be as sensitive and specific as a continuously indexed model. Lastly covariates are often available at different spatial scales. LGCPs allow using all the data sources available, retaining high-resolution and overcoming problems such as spatial misalignment and ecological bias (Gotway and Young, 2002). So in theory LGCPs should outperform the BYM model. But is this true in practice and how can we quantify any such improvement?

There are a few published studies that compared these methods. A study examining lupus incidence in Toronto, simulated 40 Gaussian random fields using a Matérn correlation function with roughness and variance parameters fixed, varying the range parameter (Li et al., 2012a). They compared the models’ ability to calculate the risk and identify areas of higher risk and concluded that LGCPs outperform BYM in all instances. Using similar simulation procedure and metrics, Li et al. (2012b) extended the LGCP model, assuming that exact case locations are unknown and information is only available at larger area units (census tracks in their example), and compared this version with the BYM model. They reported that their LGCP version outperforms the BYM model, however when case locations are available, it is preferable to use LGCP on the exact points rather than LGCP on aggregated data. It is not surprising though that in both studies LGCPs performed best, given that the processes used to generate and fit the data (Matérn with roughness parameter 2) were the same. An Australian study using 6 scenarios consistent with a previous study (Illian et al., 2012) assessed the performance of, among other models, the BYM and LGCP with a Matérn correlation function on different spatial scales (Kang et al., 2013) by assessing the deviance information criterion (DIC) and the logarithmic score. They concluded that the models’ prediction performance was scenario dependent and suggested that the analysis should be performed using different spatial scales and thus smoothness priors. However, they did not examine their ability to identify areas of higher risk. All three studies were based on a small number of datasets and none incorporated the continuous (triangulation-based) specification of the precision matrix by Lindgren et al. (2011).

The objective of this article is to investigate the performance of BYM and LGCP when the interest lies in quantifying risk across space (mapping) and identifying areas of increased risk. For this we perform an extensive simulation study based on a real spatial population. Our findings are then used to interpret the BYM and LGCP model fits for the childhood leukaemia incidence during 1985-2015 in the canton of Zürich. The remainder of the paper is laid out as follows. Section 2 describes the methods used in this article, how data was simulated and what metrics are used to assess the performance. In Section 3 we present and discuss the results of the simulation study, whereas in Section 4 the models are applied to the childhood leukaemia incidence in the canton of Zürich. Section 5 gives a general discussion and areas for future work and section 6 ends with the conclusion.

## 2 Methods

### 2.1 Models

Let be an observation window subdivided in spatial units and denote by be the disease count in the -th unit. Suppose that , where is the population in the -th spatial unit and the corresponding risk. The BYM model specification assumes:

(1) |

where is a constant, is a spatially structured random effect (ICAR component; denotes ), and is a spatially unstructured random effect (independent random intercepts for different ). The represent weights taking the value 1 when spatial units and are first order neighbours and 0 otherwise, and and denote random precision parameters. Specifying appropriate priors for the precision parameters completes the Bayesian representation of the above model. Following the parametrisation by Simpson et al. (2017) and Riebler et al. (2016) the above equation is rewritten as:

(2) |

where , is a standardised spatial component that has characteristic marginal variance equal to 1 (Sørbye and Rue, 2014), is a mixing parameter and controls the marginal precision. Using the representation given in (1) leads to an independent assignment of priors on the precision parameters, which may lead to identifiability issues for the case where no spatial dependence is found (Simpson et al., 2017; MacNab, 2011). In 2 the hyperparameters and are orthogonal in interpretation, which allows us to specify priors independently.

Turning now to the continuous domain, let be a an inhomogeneous Poisson point process on with mean expected number of points in any set equal to , where is the population density and is the risk at location (Simpson et al., 2016). In an LGCP model we assume that the log-risk (and hence the log-intensity of ) is the realisation of a Gaussian random field . Assuming stationarity and isotropy yields the model specification:

(3) |

where is a symmetric non-negative definite function depending on the marginal variance and a range parameter , beyond which correlations fall below a certain threshold of approximately . The LGCP specification (3) allows for the inclusion of covariates via further additive terms in the first equation; the same holds for the BYM specification (1). Typical choices for include the exponential, Gaussian, and spherical covariance functions. For this particular approach we used the popular and very flexible class of Matérn covariance functions, which has an additional roughness parameter that is fixed (determined by the investigator). Following Lindgren et al. (2011), we assume a finite element representation of the Matérn field based on a fairly dense triangulation referred to as mesh (online supplement, Figure S1):

(4) |

where denotes the total number of mesh nodes, are random weights and is a set of piecewise linear basis functions taking the value at the -th mesh node, and at every other node. Whittle (1954, 1963) showed that the solution of the stochastic partial differential equation

(5) |

is a GRF with Matérn covariance function under the reparametrization

where is the dimension of the space. Here denotes Gaussian white noise and is the Laplacian. For this analysis we use . Computing an approximate stochastic weak solution of (5) based on the finite element representation (4) results in a Gaussian vector with mean zero and sparse precision matrix . Unlike traditional methods for inference in LGCP models, this appoach uses the precise locations in the point pattern without aggregation and provides a continuous approximation of the latent field.

### 2.2 Data simulation

To compare the performance of the two models described above, we conducted a simulation study. In this section, we describe the data simulation procedure.

The selection of scenarios was motivated by the example of childhood leukaemia incidence in Switzerland. Childhood leukaemia is a rare cancer and over the period 1985–2015 we observed childhood leukaemia cases in the canton of Zürich, which had a total childhood population ( years of age) of in 2000. Precise geocodes were available from the national census in 2000 allowing to simulate case locations from the true underlying geographic distribution of the population at risk.

We considered scenarios varying in the size of high-risk areas (radius of circular high risk areas in ; ), the risk ratio between the low risk area and the high risk area (), the expected number of cases generated (, where with from above) and the shape of the risk surface (step function or smooth function). All of the resulting 36 scenarios included 3 high risk areas with centres located in a highly urban area (Zürich; Figure 1, circles on the left), a semi-urban area (Winterthur; Figure 1, top-right circles) and a highly rural area (Gossau; Figure 1, bottom-right circles). We also included 3 scenarios with a flat risk surface for . For each of the resulting 39 scenarios, we generated 300 datasets.

We selected a circular shape for the high risk areas because of its simplicity (defined only by centre and radius), rotational invariance (thus avoiding arbitrary choices of angular orientation), and because it can be regarded as a generic model of environmental contamination from a point source. Furthermore it is unlikely to favour any of the models by unintentional alignment with the subdivisions of space used in model fitting, i.e. municipalities for BYM or a Voronoi tesselation or regular grid for LGCP models.

In the scenarios for which the true risk surface is a step function set

where is the risk outside the circles, is the proportion of the excess risk inside the circles, is the centre of the -th circle, , and takes the value 1 if the condition is satisfied and 0 otherwise. The risk at the location of residence of the -th child is then given by for .

For each value of and , the baseline risk was selected such that the overall number of expected cases generated would equal . To generate case locations, we sampled a value from for each person , and declared the person to be a case if the sampled value was smaller than . We thus generated datasets. The full algorithm used to generate the datasets is given in the online supplement as Algorithm S1.

In the scenarios with a smooth risk surface the excess risk was modelled using Gaussian functions as follows:

where denotes the background risk and are as above. While taking the sum of the three Gaussian components may seem more intuitive, we selected the , because this way the shape of the high risk areas remains intact (clear circles). For each combination of and , we selected the new parameters such that a) on average 80% of the excess cases produced by an isolated Gaussian risk function over an infinite area occur within a circle of radius ; b) the expected number of excess cases produced by the risk surface over the canton of Zürich is the same as under , and c) the expected total number of cases is the same under both risk surfaces. To sample locations we used the same procedure as described above. For more information how and were derived, for the sampling algorithm and a graphical representation of the risk surfaces under different scenarios, refer to the online supplement, Section 1, Algorithm S2 and Figures S2–4.

### 2.3 Prior selection and inference

Both for the BYM and LGCP models and across all datasets in the simulation, we followed the results from Simpson et al. (2017) to construct penalised complexity priors. These priors are invariant to parametrisations, have a natural connection with Jeffrey’s priors, are parsimonious and have excellent robustness properties (Fuglstad et al., 2018; Sørbye and Rue, 2017; Simpson et al., 2017). For the BYM model we set a prior for in (2) such that indicating that the log-risk in a fixed area is unlikely to have variance more than . For the mixing parameter we assigned implying that the median of the mixing parameter is 0.5 (i.e. equal contribution of the overdispersion component and the ICAR component to the latent field). For the LGCP model we followed a similar approach for the marginal standard deviation, setting again , whereas for the range parameter we set corresponding to a weakly informative prior using the fact that is roughly half of the diameter of the domain. Inference for both models was conducted using INLA as introduced by Rue et al. (2009); see Blangiardo and Cameletti (2015) for book-treatment of the subject.

### 2.4 Performance measures

We used the root mean integrated squared error evaluated on a fine grid as a metric to assess the ability of a model to estimate the true risk surface:

(6) |

where denotes a weight function, is the fitted value at (a random variable having the marginal posterior distribution) and is the true value at . For approximating the integral we use on the right hand side the partition of the domain into small pixels and , , are suitably chosen representative values of , , on , respectively. More precisely, is a value simulated from the marginal posterior distribution at and the expectation on the right hand side is the average over all such simulated values. We considered four versions of this RMISE, varying the weights among and where denotes the area of and the -values among and , where is evaluated at the centroid of grid cells. For the rest of the paper, RMISE refers to the version with and unless otherwise stated.

As a second measure to assess a model’s ability to capture the true risk, we used the coverage probability. Let be an indicator taking the value one whenever lies inside the 95% credibility region of and zero otherwise for the -th dataset. We defined the coverage probability of the -th cell as . We also calculated a coverage proportion of cells correctly covered by the -th map defined as . For the BYM on municipalities we used the credibility regions of the municipality, in which the centroid of the grid cell lay.

To assess a model’s ability to identify high-risk areas we estimated the receiver operating characteristic (ROC) curve and determined the area under this curve (AUC). More specifically, we defined regions of high risk based on exceedance probabilities as the set of grid cells satisfying for some , where the probability is taken over the posterior distribution of . Denoting the true high risk region, given by , as and the region of high risk indicated by the exceedance probability as , we define the area-based sensitivity and specificity as

where denotes area and and denote the complements of and , respectively; see Figure S5 in the online supplement for illustration. We evaluate the area-based sensitivity and specificity at and calculate as the area under the ROC curve defined by plotting sensitivity against specificity. We also use a population-based version of sensitivity and specificity using the same formulae as above with denoting population in a given area. For the rest of the manuscript, refers to the area-based version unless otherwise stated.

## 3 Results

Data generating model | Step function | Smooth function | ||

Fitted model | BYM | LGCP | BYM | LGCP |

k=1 | ||||

Radius = | ||||

c = 2 | 6.76 (4.8, 12.1) | 6.83 (4.5, 12.4) | 6.71 (4.7, 12.2) | 6.76 (4.37, 12.1) |

c = 5 | 11.8 (7.92, 17.8) | 16.4 (10.5, 21.9) | 12.3 (8.15, 19.4) | 16.2 (10.9, 22.6) |

Radius = | ||||

c = 2 | 14.8 (12.4, 19.5) | 14.6 (12, 18.9) | 13.6 (10.7, 18.3) | 13.8 (10.9, 18.4) |

c = 5 | 28.3 (25.4, 33.3) | 26.6 (24.1, 32.3) | 25.1 (22.6, 30.1) | 23.3 (20.3, 28.9) |

Radius = | ||||

c = 2 | 16.9 (15.1, 19.7) | 14.7 (13.5, 17.9) | 15.4 (13.3, 18.3) | 13.5 (11.6, 17.4) |

c = 5 | 35.6 (34, 37.6) | 27 (25.4, 29.5) | 27.2 (25.6, 29.4) | 19.8 (18.1, 23.5) |

k=5 | ||||

Radius = | ||||

c = 2 | 4.47 (3.17, 6.81) | 6.62 (4.24, 9.88) | 4.48 (3.1, 6.88) | 6.51 (4.27, 9.9) |

c = 5 | 10.4 (8.77, 12.5) | 14.8 (13.1, 17.1) | 10.8 (8.82, 12.5) | 14.8 (13, 16.8) |

Radius = | ||||

c = 2 | 11.6 (10.6, 13.1) | 12.2 (10.8, 14.7) | 10.4 (9.32, 12) | 11 (9.33, 14.3) |

c = 5 | 22.8 (21.4, 24.5) | 21.5 (19.6, 24.6) | 19.2 (18, 20.6) | 16.8 (14.8, 19.9) |

Radius = | ||||

c = 2 | 14.9 (14.3, 15.8) | 12.1 (11, 14.4) | 12.3 (11.5, 13.4) | 10.1 (8.57, 12.7) |

c = 5 | 28.4 (27.3, 29.8) | 22.3 (20.8, 24.6) | 21.8 (21, 22.8) | 13.9 (12.1, 17) |

k=10 | ||||

Radius = | ||||

c = 2 | 4 (3.01, 5.77) | 7.32 (5.42, 9.68) | 3.99 (2.89, 5.89) | 7.34 (5.43, 9.82) |

c = 5 | 9.76 (8.65, 11) | 14 (12.8, 15.7) | 9.88 (8.77, 11.1) | 13.9 (12.7, 15.6) |

Radius = | ||||

c = 2 | 10.4 (9.8, 11.4) | 11.5 (10.2, 13.4) | 9.12 (8.44, 10) | 10.3 (8.66, 12.4) |

c = 5 | 20.6 (19.6, 21.8) | 19.9 (18.2, 22.8) | 16.9 (16.1, 18.1) | 14.7 (12.9, 17.2) |

Radius = | ||||

c = 2 | 13.6 (13.1, 14.2) | 11.8 (10.4, 13.9) | 11.1 (10.5, 11.8) | 9.17 (7.75, 11.7) |

c = 5 | 25 (24.2, 26) | 21 (19.7, 23.3) | 19 (18.2, 19.8) | 11.9 (10.5, 14.8) |

Data generating model | Step function | Smooth function | ||

Fitted model | BYM | LGCP | BYM | LGCP |

k=1 | ||||

Radius = | ||||

c=2 | 0.99(0.94,0.99) | 1.00(0.94,1.00) | 0.99(0.95,1.00) | 0.99(0.94,1.00) |

c=5 | 0.94(0.90,0.99) | 0.99(0.98,1.00) | 0.94(0.91,0.99) | 0.99(0.98,1.00) |

Radius = | ||||

c=2 | 0.94(0.85,0.97) | 0.97(0.87,1.00) | 0.95(0.93,0.96) | 0.99(0.94,1.00) |

c=5 | 0.90(0.86,0.94) | 0.95(0.91,0.97) | 0.92(0.90,0.93) | 1.00(0.97,1.00) |

Radius = | ||||

c=2 | 0.62(0.29,0.99) | 0.90(0.47,0.98) | 0.97(0.54,1.00) | 0.99(0.84,1.00) |

c=5 | 0.51(0.43,0.91) | 0.82(0.59,0.90) | 0.86(0.62,0.96) | 0.99(0.92,1.00) |

k=5 | ||||

Radius = | ||||

c=2 | 0.94(0.91,0.99) | 0.99(0.89,1.00) | 0.95(0.91,0.99) | 0.99(0.88,1.00) |

c=5 | 0.90(0.88,0.90) | 0.99(0.98,0.99) | 0.90(0.89,0.93) | 0.99(0.98,1.00) |

Radius = | ||||

c=2 | 0.90(0.85,0.94) | 0.95(0.89,0.97) | 0.92(0.90,0.93) | 0.99(0.93,1.00) |

c=5 | 0.88(0.84,0.90) | 0.92(0.89,0.94) | 0.87(0.85,0.89) | 1.00(0.96,1.00) |

Radius = | ||||

c=2 | 0.88(0.52,0.95) | 0.88(0.8,0.94) | 0.93(0.82,0.95) | 1.00(0.94,1.00) |

c=5 | 0.85(0.76,0.9) | 0.84(0.78,0.88) | 0.85(0.71,0.9) | 0.99(0.96,1.00) |

k=10 | ||||

Radius = | ||||

c=2 | 0.94(0.90,0.99) | 0.98(0.85,0.99) | 0.94(0.91,0.99) | 0.99(0.86,1.00) |

c=5 | 0.90(0.88,0.90) | 0.98(0.98,0.99) | 0.90(0.88,0.90) | 0.99(0.98,0.99) |

Radius = | ||||

c=2 | 0.89(0.85,0.91) | 0.92(0.85,0.96) | 0.90(0.88,0.91) | 0.97(0.91,1.00) |

c=5 | 0.87(0.82,0.89) | 0.92(0.88,0.93) | 0.85(0.82,0.87) | 0.99(0.94,1.00) |

Radius = | ||||

c=2 | 0.88(0.74,0.93) | 0.86(0.79,0.91) | 0.90(0.82,0.93) | 0.98(0.92,1.00) |

c=5 | 0.86(0.80,0.90) | 0.84(0.79,0.87) | 0.83(0.77,0.86) | 0.99(0.95,1.00) |

Table 1 shows the median and the th and th percentile over the simulations of the area-based RMISE, evaluating the error on the log scale ( and ). Regardless of the sample size or the shape of the data-generating risk surface, LGCP outperforms BYM for large radii (), but also for medium radii () combined with high risk increases (). In contrast, BYM tends to outperfom LGCP in the case of small radii, small risk increases, and when the risk surface is flat (online supplement, Table S1). The results across the scenarios are similar when we consider the population weights or the fitted values on the risk scale; refer to the online supplement, Tables S2–4.

Maps of coverage probabilities are shown in Figures S6–11 in the online supplement. From these it is clear that LGCP outperforms BYM for all data-generating scenarios with medium () to large () size of the high risk areas. Coverage probabilities of LGCP are high both in and outside the high-risk areas, and the only regions of poor coverage are along the immediate boundaries of the high risk areas in the step function scenarios. This was to be expected, given that it is impossible for a smooth function to perfectly approximate a step function. For the BYM, considerable extents of areas within or without the high risk areas show sub-optimal coverage in all these scenarios. None of the models properly capture the high risk areas when these are confined to small circles (). However, even for this case the areas of low coverage are restricted to the circles for LGCP, while they extend to the entire municipalities for BYM.

Table 2 shows the median and th and th percentiles of the coverage proportions (proportion of area for which the true risks lie within the credibility regions). In line with the maps of coverage probabilities, LGCP consistently shows a higher coverage proportion when the data-generating process has a smooth risk surface, while the BYM coverage proportion remains often under 95%. In this scenario, the only situation in which BYM and LGCP perform similarly is when high risk areas are small () and the disease rare (). Similarly, LGCP outperforms BYM in almost all scenarios when the underlying risk is a step function. There are few exceptions for which BYM appears to perform marginally better, namely for the combinations of medium or large circles, higher risk increases, and higher incidence rates. But as the Figures S6–11 in the online supplement show, areas of poorer coverage for LGCPs are confined to the circular transition areas from high to low risk. On the remaining area (both within and outside of high risk areas) coverage probabilities tend to be high in all these situations.

Figure 2 shows the variation across grid cells of the mean (over simulations) of the posterior mean and standard deviation of estimated risk when the expected number of generated cases is set to (). In all scenarios the geographic variability of risks estimated by LGCP is closer to the true variability of risks compared to estimates from BYM. This suggests a stronger tendency for shrinkage to the mean for BYM. Thus, even when the high risk areas are small (), LGCP models attempt to capture these risk increases, likely leading to greater variability in the estimates even for the areas outside the circles. This is a plausible explanation for the poorer performance of LGCPs in terms of RMISE for small radii and small risk increase: The BYM model better captures the risk outside the circles and, although it fails to capture the risks within the circles, this yields a better RMISE because the circles are very small. Stronger shrinkage to the mean is also a plausible explanation for the better performance of the BYM model in the constant risk scenario (online supplement, Figure S12). Except in the scenarios of small risk areas, the LGCP risk estimates tend to be more stable, i.e. on average have narrower posterior distribution as shown by the distribution of standard deviations. The results are similar for and (online supplement, Figures S13 and S14).

Figure 3 shows the pointwise median and 95% envelopes of the area-based sensitivity against specificity (ROC curve). The legend states the median and th and th percentiles of the over the simulations, where the expected number of generated cases is set to . For all scenarios LGCP clearly outperforms BYM in terms of identifying areas of high risk (AUC consistently higher). While the two ROC curves are similar for scenarios with both small risk areas () and small risk increases (), it is clearly visible that LGCP has higher sensitivity and specificity in all other scenarios for all the exceedance probability thresholds considered. We observe similar results when increasing or decreasing the number of cases or using the population-based version of sensitivity and specificity (online supplement, Figures S15–19). For more information on the sensitivity and specificity per probability threshold refer to the online supplement, Figures S20–25.

## 4 Childhood leukaemia incidence in the Canton of Zürich

Childhood leukaemia is a rare cancer and the only established environmental risk factor is ionising radiation in high doses (Wakeford, 2013). The childhood leukaemia example is of particular interest, as there have been a number of reports of childhood leukaemia clusters in the literature (McNally and Eden, 2004). Most of these clusters were discovered incidentally and it is not possible, in retrospect, to judge whether they represent true deviations from a flat risk scenario. Indeed in a recent systematic investigation of spatial clustering in Switzerland, we found that quite remarkable aggregations of cases are well compatible with a flat risk scenario (Konstantinoudis et al., 2017). Disease mapping is another approach of identifying areas of high risk, that may be more sensitive to areas of irregular shapes and long range spatial trends.

Data for childhood leukaemia was available through the Swiss Childhood Cancer Registry (SCCR), which is a nationwide registry with a estimated completeness since the mid 90s (Schindler et al., 2015). For this study we used the precise geocoded locations of place at diagnosis of the 334 registered childhood leukaemia cases diagnosed during 1985–2015 in the canton of Zürich. Precise geocodes for the population were available through the 2000 Swiss census and the density was calculated using a grid. We have fitted LGCP and BYM models using the exact same specifications as in the simulation study (see online supplement, Figure S26 for prior-posterior plots of the hyperparameters). We have mapped the risk estimates of both models as well as the exceedance probabilities defined as , where 0.0016 is the ratio of the number of leukaemia cases divided by the total population at risk . We have also highlighted areas, for which the exceedance probabilities surpass the thresholds 0.5 and 0.8. The sensitivity and specificity observed in our simulation study for these thresholds are reported in Table S5 of the online supplement.

Figure 4 shows the fitted risk suggested by the BYM and LGCP models in the top panels and the exceedance probabilities in the lower panels. Overall there appears to be little spatial variation of childhood leukaemia risk in the canton of Zürich. The variation of risk estimates from the LGCP is somewhat larger with a median risk of 0.00158 and compared to the variation retrieved from the BYM model, where the median risk is 0.0016 and . The map based on the BYM model is more patchy, highlighting individual municipalities that stand out quite markedly from their neighbours. In contrast the risk surface based on the LGCP model shows gradual changes with two spatially coherent areas of higher risk, one near the city of Zürich and one in the South-East of the canton. While the BYM highlights the whole municipality of Zürich, the LGCP shows no elevated risk in the western part of the municipality, but locates a high risk area in the eastern part of the municipality. The exceedance probability in this small area surpasses 0.8, while the BYM does not find any region exceeding this threshold. The estimated factor of risk increase (posterior mean) is with 95% CI of , compared to the overall risk over the domain (ie. ). Assuming that there is a real increase at this location, LGCP would have greater sensitivity than the BYM in identifying it. This illustrates that assuming constant risk over administrative areas may be quite misleading.

We cannot know if there is true spatial variation in risk over the period considered. The observed geographical variation in the posterior mean of the risk is compatible with the scenario of the simulation study, where and ; see online supplement, Figure S12. The observed risk increase could be spurious and an attribute to sampling variability or imperfect spatial adjustment for person years at risk (we used population density at the census 2000, but cases were diagnosed during 1985–2015). On the other hand, the observed risk increase could be also real and an attribute to environmental factors, such as traffic related air pollutants (Spycher et al., 2015), though it is not obvious which environmental factor might be implicated in the two areas indicated by the LGCP. Identifying potential factors underlying the observed variation is out of the scope of this study, and more research is required incorporating putative risk factors.

## 5 Discussion

Overall, we have found that in the framework of our study LGCP models perform better than BYM models in quantifying disease risk over space and in identifying areas of high-risk. LGCP clearly outperformed BYM when risk increases and the areas affected by these were sufficiently large to be detected. In these situations LGCP remained superior regardless of whether the underlying risk surface was a step function or a baseline risk plus a Gaussian, and regardless of any changes in the disease incidence rate. When the high-risk areas were small none of the models managed to reliably detect the increases or quantify the risks within these areas. In these scenarios BYM tended to produce a smaller RMISE due to a more efficient estimation of the flat risk surface in the large remaining area. The more reliable estimation of a flat risk surface appears to be the only advantage of BYM over LGCP. In our example using true childhood leukaemia incidence data from the canton of Zürich, the LGCP model identified smooth risk increases over the continuous domain in two spatially coherent areas, while the map produced by BYM was patchy, with multiple non-contiguous areas of elevated risk. Furthermore, risks estimated by LGCP showed greater variation over space and revealed variation at the sub-municipal level that could not be picked up by BYM.

Our results are consistent with two out of three previous studies in the literature. Motivated by studying the lupus incidence in Toronto, Li et al. (2012b) simulated 40 Gaussian surfaces with zero mean, keeping the variance and roughness parameters constant ( and ) and varying the range parameter (). They compared the performance of BYM and LGCP. Arguing that lupus risk is too low, they simulated cases using stomach and lung cancer risk. They used the mean squared error and ROC curves to examine the ability of the models to estimate the risk and pick up areas of higher risk. They consistently reported that the LGCP outperforms the BYM model. Li et al. (2012a) extended the LGCP model to aggregated data and compared them with the LGCP model based on case locations and the BYM model using a similar simulation procedure and metrics as in their previous study. They reported that the LGCP extension on aggregated data performed better than the BYM on aggregated data, however the LGCP on case location data was always superior. Kang et al. (2013) simulated point data, as guided by a previous study by Illian et al. (2012), aggregated this data on a range of different spatial scales and used a variety of smoothness priors to examine the impact of spatial scale and prior in the predictive performance of spatial models. Among the different priors were the BYM and a Matérn model, which with a fine grid selection approximates an LGCP. They conducted inference with INLA and reported mixed results in the sense that model performance depended on the individual scenarios.

Our work has some strengths and limitations. At its heart it is an extensive simulation study using samples from a true population that yields datasets with realistic spatial distribution of cases and persons at risk. We considered a range of different scenarios with different sizes of high-risk areas, risk increases, levels of urbanicity and shapes of the risk function, attempting not to favour either of the models used for fitting. The shape of the high-risk areas was always circular, which is an intuitive shape for disease mapping (hot spots). This choice also provides parsimony with respect to the parameters that need to be set and varied (centres and radii). However, more complex shapes should be considered in future studies. We also did not examine the effect of any spatially varying covariates, an issue discussed by Sørbye et al. (2017).

The results we found are subject to the mesh specification and a denser mesh provides more precise estimates (see Teng et al., 2017). It is tempting to assume that the failure of LGCPs to capture the risk increases over small areas (radius ) can be attributed to the mesh selection and the resulting loss of spatial resolution. However, this is unlikely to be the case: We performed an ad-hoc analysis simulating 100 datasets to examine the effect of the mesh size on the RMISE, setting and and assuming smooth risk surface and . We selected the centroids of grid cells as the mesh nodes, which resulted in a mesh with nodes, almost twice as many as used for the main analysis (; supplementary Figure S1). The reason for choosing this regular grid is that the same grid is used for estimating the posterior risk and calculating RMISE, so that no projection of the representation (4) to the regular grid is required but the values can be used directly. Consequently, we expect our estimates to be as close to the truth as a model of this grid size can produce. The results are reported in the online supplement, Figure S27. As expected the denser mesh yields a more accurate risk surfaces for the LGCP model, with the results being more pronounced for radius . However, the denser mesh does not remove the outperformance of BYM when radius . Increasing the mesh comes with a considerable increase in computation time: the mean processing time of the LGCP model in this case is approximately in contrast to needed on average for the same scenarios under the coarser mesh specification. This initial mesh selection was a compromise between precision and computation time across all simulations.

A more plausible explanation for the tendency of BYM to perform better when there are just a few peaks of radius seems to be that the large flat risk surface dominates the estimation of parameters determining variance and spatial correlation of the Gaussian field, and as a consequence these risk peaks are smoothed out. At the same time the sensitivity estimates for both models are fairly similar (online supplement, Figures S20–25 and Table S5). These findings are in line with previous simulation studies that reported a tendency of the BYM model to oversmooth the point estimates but to perform well at overall classification of areas into higher-risk areas (Best et al., 2005).

Our results suggest that, under the given scenarios and when using exceedance probabilities to define areas of high-risk, LGCPs may be a promising tool for cluster detection. The most popular cluster detection test is Kulldorff’s circular (or elliptic) scan (Kulldorff, 1997; Kulldorff et al., 2006). However, these methods do not provide smooth risk estimates over the domain, have difficulties in detecting clusters of irregular shapes and are slightly conservative when there is more than one cluster in the domain. Using a model-based approach we bypass some of these issues. However results are expected to be sensitive to the prior specification. Furthermore the selection of a threshold for the exceedance probabilities is often arbitrary, creates an additional bottleneck in the analysis and possibly multiple testing issues. For our scenarios the circular scan would be expected to perform better, as it is constructed to be used for circular cluster detection. LGCP and other disease mapping models provide no formal test for the presence of clusters, however this avenue might be pursued in future research. Future studies should examine different methods for identifying high-risk areas using LGCP or BYM models, such as excursion sets (Bolin and Lindgren, 2015) or quantile regression (Padellini and Rue, 2018)), and compare these approaches with Kulldorff’s scan.

This study highlights the strengths of continuous domain models for disease mapping when precise geocodes are available. However, data confidentiality concerns are an important reason for not making such data available even when they exist. Future research should seek ways to utilizing data at its maximum resolution while fully respecting privacy concerns. In this line, it would be interesting to examine how sensitive the results are to data perturbation (jittering) as a way for preserving data confidentiality. Future studies should also compare the performance of discrete and continuous domain models when the underlying risk is linked to individual or spatial covariates. In theory, continuous domain models should allow bypassing problems in regression models based on discrete area units, including ecological bias and spatial misalignment.

## 6 Conclusion

This study suggests that the use of LGCP models in combination with point pattern data in disease mapping offers important advantages over traditional BYM models in combination with aggregated areal counts. LGCPs outperform BYM models in quantifying risks and in identifying areas of high risk when the true risk surface shows important spatial variation. In contrast BYM models show a stronger tendency for shrinkage toward the mean and, although being efficient in retrieving flat risk surfaces, tend to oversmooth risk increases that occur on an intermediate spatial scale. Our findings suggest that there are important gains to be made from the use of continuous domain models in spatial epidemiology.

## Acknowledgements

The authors thank Dr Alex Karagiannis-Voules and Dr Haakon Bakka for their valuable input and comments.

This work was supported by Swiss Cancer Research (4012-08-2016, 3515-08-2014). BD Spycher was supported by a Swiss National Science Foundation fellowship (PZ00P3_147987).

The work of the Swiss Childhood Cancer Registry (SCCR) is supported by the Swiss Paediatric Oncology Group (www.spog.ch), Schweizerische Konferenz der kantonalen Gesundheitsdirektorinnen und -direktoren (www.gdk-cds.ch), Swiss Cancer Research (www.krebsforschung.ch), Kinderkrebshilfe Schweiz (www.kinderkrebshilfe.ch), Ernst-Göhner Stiftung, Stiftung Domarena and National Institute of Cancer Epidemiology and Registration (www.nicer.ch).

We thank the Swiss Federal Statistical Office for providing mortality and census data and for the support, which made the Swiss National Cohort (SNC) and this study possible. The work of the SNC was supported by the Swiss National Science Foundation (grant nos. 3347CO-108806, 33CS30_134273 and 33CS30_148415).

The members of the Swiss Pediatric Oncology Group Scientific Committee: M Ansari (Geneva), M Beck-Popovic (Lausanne), P Brazzola (Bellinzona), J Greiner (St Gallen), M Grotzer (Zürich), H Hengartner (St Gallen), T. Kuehne (Basel), C Kuehni (Bern), F Niggli (Zürich), J Rössler (Bern), F Schilling (Lucerne), K Scheinemann (Aarau), N von der Weid (Basel).

The members of the Swiss National Cohort Study Group: Matthias Egger (Chairman of the Executive Board), Adrian Spoerri and Marcel Zwahlen (all Bern), Milo Puhan (Chairman of the Scientific Board), Matthias Bopp (both Zürich), Nino Künzli (Basel), Fred Paccaud (Lausanne) and Michel Oris (Geneva).

## Conflict of interest

The authors declare that they have no conflict of interest.

## Ethical standard

Ethics approval was granted through the Ethics Committee of the Canton of Bern to the SCCR on the 22th of July 2014 (KEK-BE: 166/2014).

## References

- Bakka et al. (2018) Bakka, H., Rue, H., Fuglstad, G. A., Riebler, A., Bolin, D., Illian, J., Krainski, E., Simpson, D. and Lindgren, F. (2018) Spatial modelling with R-INLA: A review. arXiv preprint arxiv:1802.06350.
- Besag et al. (1991) Besag, J., York, J. and Mollie, A. (1991) Bayesian image restoration with two applications in spatial statistics. Ann Inst Statist Math, 43, 1–20.
- Best et al. (2005) Best, N., Richardson, S. and Thomson, A. (2005) A comparison of Bayesian spatial models for disease mapping. Statistical Methods in Medical Research, 14, 35–59.
- Blangiardo and Cameletti (2015) Blangiardo, M. and Cameletti, M. (2015) Spatial and Spatio-temporal Bayesian Models with R-INLA. John Wiley & Sons.
- Blangiardo et al. (2013) Blangiardo, M., Cameletti, M., Baio, G. and Rue, H. (2013) Spatial and spatio-temporal models with R-INLA. Spatial and Spatio-Temporal Epidemiology, 4, 39–55.
- Bolin and Lindgren (2015) Bolin, D. and Lindgren, F. (2015) Excursion and contour uncertainty regions for latent gaussian models. Journal of the Royal Statistical Society Series B (Statistical Methodology), 77, 85–106.
- Diggle et al. (2005) Diggle, P., Rowlingson, B. and Su, T. L. (2005) Point process methodology for on-line spatio-temporal disease surveillance. Environmetrics, 16, 423–434.
- Diggle et al. (2013) Diggle, P. J., Moraga, P., Rowlingson, B. and Taylor, B. M. (2013) Spatial and spatio-temporal Log-Gaussian Cox Processes: Extending the geostatistical paradigm. Statistical Science, 28, 542–563.
- Freni-Sterrantino et al. (2018) Freni-Sterrantino, A., Ventrucci, M. and Rue, H. (2018) A note on intrinsic Conditional Autoregressive models for disconnected graphs. Spatial and Spatio-temporal Epidemiology, 26, 25–34.
- Fuglstad et al. (2018) Fuglstad, G. A., Simpson, D., Lindgren, F. and Rue, H. (2018) Constructing priors that penalize the complexity of Gaussian random fields. Journal of the American Statistical Association, 0, 1–8.
- Giorgi et al. (2016) Giorgi, E., Kreppel, K., Diggle, P. J., Caminade, C., Ratsitorahina, M., Rajerison, M. and Baylis, M. (2016) Modeling of spatio-temporal variation in plague incidence in madagascar from 1980 to 2007. Spatial and Spatio-temporal Epidemiology, 19, 125–135.
- Girolami and Calderhead (2011) Girolami, M. and Calderhead, B. (2011) Riemann manifold Langevin and Hamiltonian Monte Carlo methods. Journal of the Royal Statistical Society Series B (Statistical Methodology), 73, 123–214.
- Gotway and Young (2002) Gotway, C. A. and Young, L. J. (2002) Combining incompatible spatial data. Journal of the American Statistical Association, 97, 632–648.
- Halonen et al. (2016) Halonen, J. I., Blangiardo, M., Toledano, M. B., Fecht, D., Gulliver, J., Ghosh, R., Anderson, H. R., Beevers, S. D., Dajnak, D., Kelly, F. J., Wilkinson, P. and Tonne, C. (2016) Is long-term exposure to traffic pollution associated with mortality? A small-area study in London. Environmental Pollution, 208, 25 – 32.
- Heaton et al. (2017) Heaton, M. J., Datta, A., Finley, A., Furrer, R., Guhaniyogi, R., Gerber, F., B. Gramacy, R., Hammerling, D., Katzfuss, M., Lindgren, F., Nychka, D., Sun, F. and Zammit-Mangion, A. (2017) Methods for analyzing large spatial data: A review and comparison. arXiv preprint arXiv:1710.05013.
- Illian et al. (2012) Illian, J. B., Sorbye, S. H. and Rue, H. (2012) A toolbox for fitting complex spatial point process models using Integrated Nested Laplace Approximation (INLA). Annals of Applied Statistics, 6, 1499–1530.
- Kang et al. (2013) Kang, S. Y., McGree, J. and Mengersen, K. (2013) The impact of spatial scales and spatial smoothing on the outcome of Bayesian spatial model. Plos One, 8.
- Konstantinoudis et al. (2017) Konstantinoudis, G., Kreis, C., Ammann, R. A., Niggli, F., Kuehni, C. E., Spycher, B. D., Swiss Paediatric Oncology, G. and the Swiss National Cohort Study, G. (2017) Spatial clustering of childhood leukaemia in switzerland: A nationwide study. Intenrational Journal of Cancer, 141, 1324–1332.
- Kulldorff (1997) Kulldorff, M. (1997) A spatial scan statistic. Communications in Statistics - Theory and Methods, 26, 1481–1496.
- Kulldorff et al. (2006) Kulldorff, M., Huang, L., Pickle, L. and Duczmal, L. (2006) An elliptic spatial scan statistic. Statistics in Medicine, 25, 3929–43.
- Lasinio et al. (2013) Lasinio, G. J., Mastrantonio, G. and Pollice, A. (2013) Discussing the ”big n problem”. Statistical Methods and Applications, 22, 97–112.
- Li et al. (2012a) Li, Y., Brown, P., Gesink, D. C. and Rue, H. (2012a) Log Gaussian Cox processes and spatially aggregated disease incidence data. Statistical Methods in Medical Research, 21, 479–507.
- Li et al. (2012b) Li, Y., Brown, P., Rue, H., al Maini, M. and Fortin, P. (2012b) Spatial modelling of lupus incidence over 40 years with changes in census areas. Journal of the Royal Statistical Society Series C (Applied Statistics), 61, 99–115.
- Lindgren and Rue (2015) Lindgren, F. and Rue, H. (2015) Bayesian spatial modelling with R-INLA. Journal of Statistical Software, 63, 1–25.
- Lindgren et al. (2011) Lindgren, F., Rue, H. and Lindstrom, J. (2011) An explicit link between Gaussian fields and Gaussian Markov random fields: the stochastic partial differential equation approach. Journal of the Royal Statistical Society Series B (Statistical Methodology), 73, 423–498.
- MacNab (2011) MacNab, Y. C. (2011) On gaussian markov random fields and bayesian disease mapping. Statistical Methods in Medical Research, 20, 49–68.
- McNally and Eden (2004) McNally, R. J. and Eden, T. O. (2004) An infectious aetiology for childhood acute leukaemia: a review of the evidence. British journal of haematology, 127, 243–263.
- Møller et al. (1998) Møller, J., Syversveen, A. R. and Waagepetersen, R. P. (1998) Log Gaussian Cox processes. Scandinavian Journal of Statistics, 25, 451–482.
- Openshaw (1984) Openshaw, S. (1984) The modifiable areal unit problem. Concepts and techniques in modern geography.
- Padellini and Rue (2018) Padellini, T. and Rue, H. (2018) Model-based quantile regression for discrete data. arXiv preprint arXiv:1804.03714.
- Pereira et al. (2017) Pereira, S., Turkman, K. F., Correia, L. and Rue, H. (2017) Unemployment estimation: Spatial point referenced methods and models. arXiv preprint arXiv:1706.08320.
- Riebler et al. (2016) Riebler, A., Sørbye, S. H., Simpson, D. and Rue, H. (2016) An intuitive Bayesian spatial model for disease mapping that accounts for scaling. Statistical Methods in Medical Research, 25, 1145–1165.
- Riesen et al. (2018) Riesen, M., Konstantinoudis, G., Lang, P., Low, N., Hatz, C., Maeusezahl, M., Spaar, A., Buehlmann, M., Spycher, B. D. and Althaus, C. L. (2018) Exploring variation in human papillomavirus vaccination uptake in switzerland: a multilevel spatial analysis of a national vaccination coverage survey. BMJ Open, 8.
- Rue and Held (2005) Rue, H. and Held, L. (2005) Gaussian Markov Random Fields: Theory and Applications, vol. 104 of Monographs on Statistics and Applied Probability. London: Chapman & Hall.
- Rue et al. (2009) Rue, H., Martino, S. and Chopin, N. (2009) Approximate Bayesian inference for latent Gaussian models by using integrated nested Laplace approximations. Journal of the Royal Statistical Society Series B (Statistical Methodology), 71, 319–392.
- Rue et al. (2017) Rue, H., Riebler, A., Sorbye, S. H., Illian, J. B., Simpson, D. P. and Lindgren, F. K. (2017) Bayesian computing with INLA: A review. Annual Review of Statistics and Its Application, 4, 395–421.
- Schindler et al. (2015) Schindler, M., Mitter, V., Bergstraesser, E., Gumy-Pause, F., Michel, G., Kuehni, C. E. and Swiss Paediatric Oncology, G. (2015) Death certificate notifications in the swiss childhood cancer registry: assessing completeness and registration procedures. Swiss Medical Weekly, 145.
- Simpson et al. (2016) Simpson, D., Illian, J., Lindgren, F., Sørbye, S. and Rue, H. (2016) Going off grid: Computational efficient inference for log-Gaussian Cox processes. Biometrika, 103, 1–22.
- Simpson et al. (2017) Simpson, D., Rue, H., Riebler, A., Martins, T. G. and Sorbye, S. H. (2017) Penalising model component complexity: A principled, practical approach to constructing priors. Statistical Science, 32, 1–28.
- Sørbye et al. (2017) Sørbye, S. H., Illian, J. B., Simpson, D. P., Burslem, D. and Rue, H. (2017) Careful prior specification avoids incautious inference for log-Gaussian Cox point processes. arXiv preprint arXiv:1709.06781.
- Sørbye and Rue (2014) Sørbye, S. H. and Rue, H. (2014) Scaling intrinsic Gaussian Markov random field priors in spatial modelling. Spatial Statistics, 8, 39–51.
- Sørbye and Rue (2017) — (2017) Penalised complexity priors for stationary autoregressive processes. Journal of Time Series Analysis, 38, 923–935.
- Spycher et al. (2015) Spycher, B. D., Lupatsch, J. E., Zwahlen, M., Röösli, M., Niggli, F., Grotzer, M. A., Rischewski, J., Egger, M. and Group, S. N. C. S. (2015) Background ionizing radiation and the risk of childhood cancer: A census-based nationwide cohort study. Environmental Health Perspectives (Online), 123, 622.
- Teng et al. (2017) Teng, M., Nathoo, F. and Johnson, T. D. (2017) Bayesian computation for Log-Gaussian Cox processes: A comparative analysis of methods. Journal of Statistical Computation and Simulation, 1–26.
- Wakefield (2007) Wakefield, J. (2007) Disease mapping and spatial regression with count data. Biostatistics, 8, 158–183.
- Wakeford (2013) Wakeford, R. (2013) The risk of childhood leukaemia following exposure to ionising radiation - A review. Journal of Radiological Protection, 33, 1.
- Whittle (1954) Whittle, P. (1954) On stationary processes in the plane. Biometrika, 41, 434–449.
- Whittle (1963) — (1963) Stochastic processes in several dimensions. Bull. Inst. Internat. Statist., 40, 974–994.
- Yuan et al. (2017) Yuan, Y., Bachl, F. E., Borchers, D. L., Lindgren, F., Illian, J. B., Buckland, S. T., Rue, H. and Gerrodette, T. (2017) Point process models for spatio-temporal distance sampling data. Annals of Applied Statistics, 11, 2270–2297.