Bootstrap based uncertainty bands for prediction in functional kriging
Abstract
The increasing interest in spatially correlated functional data has led to the development of appropriate geostatistical techniques that allow to predict a curve at an unmonitored location using a functional kriging with external drift model that takes into account the effect of exogenous variables (either scalar or functional). Nevertheless uncertainty evaluation for functional spatial prediction remains an open issue. We propose a semiparametric bootstrap for spatially correlated functional data that allows to evaluate the uncertainty of a predicted curve, ensuring that the spatial dependence structure is maintained in the bootstrap samples. The performance of the proposed methodology is assessed via a simulation study. Moreover, the approach is illustrated on a well known data set of Canadian temperature and on a real data set of PM concentration in the Piemonte region, Italy. Based on the results it can be concluded that the method is computationally feasible and suitable for quantifying the uncertainty around a predicted curve. Supplementary material including R code is available upon request.
Keywords: Bsplines; band depth; functional data modelling; generalized additive models; geostatistics; tracevariogram
1 Introduction
Kriging is a well known prediction method in the geostatistics community (see e.g. Chiles and Delfiner (2012)); it allows to predict a (scalar) random field or spatial process in a new spatial location given a set of observed values , taking into account the underlying correlation structure. Spatially dependent functional data (see e.g. the last two chapters of the book by Horváth and Kokoszka (2012)) have received increasing interest over the last few years. Geostatistical techniques for functional data were first introduced in the pioneering work of Goulard and Voltz (1993), but the development of such techniques is rather recent. The simplest case would be that of ordinary kriging, which allows to predict a curve at an unmonitored site under the assumption of a constant mean (see e.g. Delicado et al. (2010); Giraldo et al. (2011); Nerini et al. (2010)). The case of a mean function that depends on longitude and latitude was considered in Caballero et al. (2013); Menafoglio et al. (2013); Reyes et al. (2015). In their work, Ignaccolo et al. (2014) consider more complex forms of nonstationarity, where the mean function may depend on exogenous variables (either scalar or functional), developing the so called kriging with external drift  or regression kriging  in a functional data setting.
While much effort has been put in prediction, the uncertainty of a predicted curve remains an open issue, since there is no functional version of the kriging variance. The lack of a distribution function in the functional framework leads to the use of resampling methods for confidence band calculation. In this context, Cuevas et al. (2006) consider the standard bootstrap and a smoothed version of it to obtain confidence intervals for location estimators; an informal discussion on the asymptotic validity of the bootstrap approach in a functional framework can also be found in their paper. Goldsmith et al. (2013) use a bootstrap approach to account for the uncertainty in Functional Principal Components decomposition in estimanting the functional mean and constructing a confidence band for it. Further, Ferraty et al. (2010) propose using “wild bootstrapping” in the case of a nonparametric regression model with scalar response and functional covariate and derive asymptotic results; Rana et al. (2016) extend this last work to the case of mixing dependence. The recent paper by GonzálezRodríguez and Colubi (2017) shows the consistency of some bootstrap approaches for separable Hilbertvalued random elements but under the assumption of independence.
While the bootstrap theory is well established for independent data, in the spatial data setting a bootstrap procedure needs to mimic the data generating mechanism in order to reproduce the spatial dependence structure in the bootstrap samples. Mostly by extending bootstrap for time series, several variants of spatial subsampling and spatial block bootstrap methods have been proposed in the literature (see chapter 12 in Lahiri (2003) for a good overview on resampling methods for spatial data). In classical geostatistics, it is common to assume a decomposition of data variability in large and smallscale components, so that a “semiparametric” bootstrap method as described in Cressie (1993) (p.493)  and inspired by Freedman and Peters (1984) and Solow (1985)  seems appropriate. The latter consists in transforming the residuals of a regression model (i.e. after estimation of the largescale component) to remove the spatial dependence structure (smallscale) so that resampling can be done on uncorrelated data, to then reintroduce the spatial correlation on bootstrapped samples and finally add up the largescale component. Recently, Iranpanah et al. (2011) compare the semiparametric bootstrap with a moving block bootstrap for variance estimation of estimators in a simulation study, and point out some advantages of the semiparametric approach in terms of precision and accuracy of the estimator. A semiparametric bootstrap approach has also been considered in Schelin and de Luna (2010), where the focus is on the ordinary kriging predictor for data whose distribution is not necessarily Gaussian, and indeed their proposal does not need any distributional assumptions about the data generating process. While Iranpanah et al. (2011) consider the presence of a nonconstant mean structure too, the main difference between the two proposals is related to the considered statistics: Iranpanah et al. (2011) suggest to create the bootstrap distribution of the spatial predictor, while Schelin and de Luna (2010) construct a bootstrap distribution for the contrast defined as the difference between the spatial predictor and the unknown value (the one we want to predict).
The literature available considers either prediction bands for functional data in the case of independent observations or prediction intervals for spatially correlated data but in a scalar framework. In this sense, we fill the gap by providing a solution for spatially correlated functional data. In this framework, however, mimicking the data generating process is expected to be more difficult than in the scalar case. In this paper, we consider the case of functional kriging with external drift (FKED) developed in Ignaccolo et al. (2014) and extend it to take into account spatial correlation when estimating the drift functional coefficients by means of an iterative algorithm. To evaluate the uncertainty of a predicted curve, we propose to extend the semiparametric bootstrap approach for spatially correlated data introduced by Schelin and de Luna (2010) to the case of functional data, with the addition of a functional drift in the kriging model (a scalar drift was considered in Iranpanah et al. (2011)). Concurrently, Pigoli et al. (2016) have proposed a similar approach to evaluate the kriging prediction error for manifold valued data, but always in terms of a unique value for each location. The extension of the semiparametric bootstrap to the functional data setting is not straightforward and implies dealing with two main issues: i) the specification and estimation of the spatial dependence structure and ii) the ordering of curves to obtain functional quantiles. In lack of theoretical results about the asymptotic validity of the proposed bootstrap, we rely on a simulation study to evaluate the performance of the proposed method by analysing widths of the functional prediction bands and coverages defined for functional data.
The paper is organized as follows. In Section 2, we summarize the kriging with external drift methodology whereas in Section 3 we illustrate the proposed method for deriving prediction bands in the general FKED setting. A simulation study is presented in Section 4, followed by an application to two real data sets. All computations are coded in R (R Core Team, 2015). A discussion completes the paper.
2 Functional kriging with external drift (FKED)
Let be a functional random variable observed at location , whose realization is a function of , where is a compact subset of . Assume that we observe a sample of curves , for , , taking values in a separable Hilbert space of square integrable functions. The set constitutes a functional random field or a spatial functional process (Delicado et al., 2010) that is not necessarily stationary. The following model is assumed:
(1) 
where is the drift describing a spatial trend and is a zeromean, secondorder stationary and isotropic residual random field, with covariance function with , where represents the Euclidean distance between locations and . At a fixed site , , and domain point , the model can be rewritten as a functional concurrent linear model (Ignaccolo et al., 2014)
(2) 
where represents the residual spatial functional process at site . The drift term can be expressed in terms of a set of scalar and functional covariates:
(3) 
where is a functional intercept, is the scalar covariate at site , is the functional covariate at site and and are the covariate functional coefficients. Model (3) parameters can be estimated by means of a generalized additive model (GAM) representation using the R package mgcv (see Ignaccolo et al. (2014), Wood (2006) and Wood (2015) for details). The generalized additive model representation of Model (2) can be reexpressed as a mixed effects model (Robinson, 1991; Speed, 1991) whose parameters are estimated using REstricted Maximum Likelihood (REML) (Wood, 2011) under the assumption of Gaussianity for with a longitudinal point of view.
To take into account the spatial correlation between functional observations when estimating the drift term, we propose an iterative algorithm that considers the term as a functional random intercept (i.e. a location specific smooth residual) with a given covariance structure, similarly to Scheipl et al. (2015). An iterative algorithm is also proposed in Menafoglio et al. (2013) to estimate drift coefficients for scalar covariates in universal kriging, as well as in Ignaccolo et al. (2015) to take into account the heteroskedasticity of functional residuals. The algorithm can be summarized as follows:

Fit a standard functional concurrent linear model; estimate the drift term following Model (3) assuming independent functional observations and obtain the functional residuals .

Estimate the correlation matrix of the residual spatial functional process using the tracesemivariogram (Giraldo et al., 2011). This is defined, for a zeromean weaklystationary isotropic process, as . The tracesemivariogram can be estimated as:
where . The estimate becomes computationally efficient when data are expressed using cubic Bsplines, as integration can be avoided by reexpressing the integral in terms of the spline coefficients and basis (Giraldo et al., 2011). Once estimated, the empirical tracesemivariogram provides a cloud of points to which a parametric model (e.g. exponential, spherical, Matérn) can be fitted as in classical geostatistics using, for example, weighted least squares (Cressie, 1993).

Fit Model (2) considering the term as a functional random effect, where the inverse of the estimated correlation matrix (dim) is used as the precision matrix of a random field across locations, as proposed in Scheipl et al. (2015). In practice this can be implemented using the function gamm in the mgcv package.
The algorithm’s convergence is determined based on the Akaike Information Criterion , since the effective degrees of freedom may change from iteration to iteration. The algorithm stops when the rate is smaller than 0.1%, where the criterion rate at the iteration is calculated as
The resulting functional residuals (at the last iteration) can be used to predict the residual curve at an unmonitored site via one of three kriging options: 1) ordinary kriging for functional data (Giraldo et al., 2011), according to which , with kriging coefficients ; 2) continuous timevarying kriging (Giraldo et al., 2010), where the kriging coefficients now depend on and 3) functional kriging total model (Giraldo et al., 2009; Nerini et al., 2010), where the kriging coefficients are defined on and Prediction at the unmonitored site is obtained by adding up, as in the classical regression kriging, the two terms, i.e.
depends on the covariate values and at site .
From now on we focus on the ordinary kriging case for the residual field; this contributes to keep a moderate computational complexity in what follows (the two other cases will be considered in the discussion). Indeed, the ordinary case turns to be computationally convenient because of the tracevariogram’s use. Not only integration can be avoided when using Bsplines, but also it is possible to show (see appendix) that the tracevariogram induces a covariance structure that is separable with respect to space and the domain of the functional data (e.g. time or depth). Moreover, Menafoglio and Petris (2016) show that the solution to the Ordinary Kriging Problem via the tracecovariogram turns out to be the best finitedimensional approximation of the operatorial kriging predictor for Hilbertspace valued random fields.
Note that, in practice, data are gathered as a finite discrete set of observations , , , . Thus, before fitting Model (2), raw data can be transformed into functional observations assuming , where represents measurement error and is a continuous function that corresponds to a realization of the functional random field at site . The conversion from discrete data to curves involves smoothing; we use cubic Bsplines and the R package fda (Ramsay et al., 2014), choosing the number of basis functions and penalty parameter using functional crossvalidation (Ignaccolo et al., 2014).
3 Bootstrap uncertainty bands for functional kriging
To evaluate the uncertainty of a predicted curve at a new site , we propose a semiparametic bootstrap approach for spatially correlated functional data that builds on the work done by Schelin and de Luna (2010) and Iranpanah et al. (2011) for scalar data. The main idea is to decorrelate the data so that resampling can be done on independent observations and then transform back, ensuring that the spatial dependence structure is maintained in the bootstrapped samples. After removing the drift as in Iranpanah et al. (2011), we propose to follow Schelin and de Luna (2010) according to which in the following algorithm a sample of size is drawn and an augmented covariance matrix created, such that a bootstrap datum is generated at the unmonitored location .
Suppose that follows the distribution , a prediction interval for can be built as , with the quantile of the unknown distribution . The idea is to construct bootstrap replicates and approximate by , the empirical distribution of . The bootstrapping algorithm can be summarized as follows:

Iteratively estimate the drift following Model (3) as proposed in steps 13 of the FKED algorithm to take into account the spatial correlation and take the functional residuals .

Estimate the functional residuals covariance matrix through the estimated tracesemivariogram. Resampling directly from the functional residuals is not appropriate due to spatial dependence; instead, we can transform them first as it is usual practice when bootstrapping spatial data. Using Cholesky decomposition, and the functional residuals can be transformed so that they become (spatially) uncorrelated:

Generate bootstrap samples with size , by sampling with replacement from .

Create the augmented covariance matrix , where , is the estimated covariance function and is the estimated scale. Use Cholesky decomposition so that and transform the bootstrap samples as

The final bootstrap sample is determined as , and .
The bootstrap samples are then fed into the FKED model to obtain prediction curves and the differences are considered. The prediction interval for can be written as
with the percentile of , that can be obtained ordering the set of curves . However, the idea of ordering curves is not as straightforward as ordering scalar values, and to our knowledge there is no gold standard for doing so. We consider two different ordering techniques available in the literature.
The first one builds on the idea of band depth (LopezPintado and Romo, 2009), that can be defined for any set of curves. In their paper, LopezPintado and Romo (2009) suggest using =3, stating that there is no need to increase as “the band depth induced order is very stable in ”. Even for =2, the computational cost is considerable when the sample includes a large number of functional curves; to improve computation times, we have adopted the fast algorithm proposed in Sun and Genton (2011) and Sun et al. (2012) with =2. The band in delimited by the curves is defined as
The sample band depth (BD) of a curve in a set of curves can be calculated as
where is the indicator function and is the graph of a curve defined as the subset of the plane . Potential problems of using include ties (i.e. more than one curve with the same depth value) and crossing over of the curves delimiting the band (in which case the band “is degenerated in a point and, with probability one, no other curve will be inside this band”; see LopezPintado and Romo (2009)). To avoid these problems and still count with the computational advantage of using , band depth can be modified to take into account whether a portion of the curve is in the band, giving rise to the modified band depth (MBD), defined as
where is the Lebesgue measure on (for further details see LopezPintado and Romo (2009)). With this scheme, the bigger the band depth value, the more central the curve is.
The second ordering scheme is based on distance between curves (Cuevas et al., 2006). In this case, the bootstrapbased predicted curves are ordered based on how distant they are from the zero curve, according to the distance definition:
With this scheme, the smaller the distance, the more central the curve is.
For a confidence level , the lower/upper limits of a prediction band are obtained by taking either the pointwise (w.r.t. ) minimum/maximum of the deepest curves using band depth (i.e. those closest to the center of the distribution) (Sun and Genton, 2011) or of the curves closest to the zero curve using distance (Cuevas et al., 2006).
To evaluate the performance of the bootstrap method proposed, we can use different indicators. First of all, we consider the width of the resulting prediction band at the unmonitored site . Then we evaluate the “domain coverage” , defined as the proportion over the domain of the simulated curve within the prediction band, calculated as
(4) 
where is the number of points used for discretizing the curve . The latter should not be understood as coverage in the classical sense, and thus we propose a “functional coverage” defined as the percentage of times that the prediction band contains the true curve in a set of simulations, and calculated as
where and represent the prediction band and domain coverage for the simulation, ; in the simulation study that follows . In practice, the functional coverage depends somehow on the discretization that one makes of the corresponding curve , in the sense that the finer the discretizing grid, the more difficult it is for the whole curve to be contained in the band, and indeed the functional coverage varies slightly depending on the number of points chosen to represent the curve. That said, we opt for a fine discretization (101 points) but allow for a small tolerance in the functional coverage, in the sense that rather than requiring the whole true curve to be contained in the band, we require at least of the true curve to be contained in the band. This modified version of the functional coverage with tolerance can be evaluated as
(5) 
Obviously, coincides with when .
4 Simulation Study
The performance of the bootstrapping method proposed in Section 3 is evaluated here through a simulation study. Our aim is to analyse the impact of the spatial structure of the functional residual random field, by means of the covariance function parameters (scale and range), as well as that of the ordering technique chosen to derive the uncertainty bands when increasing the number of sites.
Data were simulated using cubic Bsplines on a spatial irregular grid ( locations) on and curve domain . We used a Bspline basis on with 10 basis functions. The residual functional random field was built as , where is the basis function evaluated at (i.e. a curve) and are the spatially correlated spline coefficients. These were generated independently for each using the same exponential covariance function with range and scale parameters and respectively, resulting in 9 different scenarios. As already mentioned in Section 2 and shown in the appendix, the tracevariogram of turns out to be proportional to the variogram of the spline coefficients . The drift was obtained as
where and are the spatial coordinates, is a functional intercept and are functional coefficients. The functional coefficients , and can be expressed in terms of the same Bspline basis with scalar spline coefficients (common for all sites) that are drawn from normal distributions as follows: for we draw 10 splines coefficients from ; for and we draw 10 splines coefficients from with , where represents the identity matrix with dimension . Finally, simulated observations were built as
where is a vector of random errors for each fixed ; in practice we consider 101 equally spaced points in .
For each simulation scenario, we generated functional data at and 90 nested locations. Additionally, data were generated at 10 more sites (always the same for all three sample sizes) used as validation stations. Note that these validation stations, numbered 1 to 10 throughout the paper, represent a number of different situations that could be found in real applications, ranging from isolated locations with no information nearby (e.g. number 3) to locations situated very close to sites where data are available (e.g. number 9). The locations can be seen in Figure 1, while the simulated data can be seen in Figure 2 for (note that the cases and are just subsets of this).
For each of the three sample sizes, and each simulation scenario, the FKED model presented in Section 2 was applied to the corresponding data set to predict curves at the 10 validation sites. In particular, a drift term depending on longitude and latitude was considered and ordinary kriging was used to obtain the predicted residuals. In practice, the variogram model is chosen automatically among exponential, gaussian and spherical based on minimum SSE. Computational times ranged from 2 seconds (n=25) up to 16 seconds (n=90). The resulting predicted curves, along with the observed data, can be seen in Figure 3 in the case of and range and scale parameters and =0.25, respectively. The estimated range and scale parameters and for all 9 simulation scenarios are in agreement with what one would expect given the values set for and in the simulation design and the relationship between the tracevariogram and the variogram described in the appendix. It appears that there is good accordance between simulated and predicted observations. Similar figures for the remaining cases are available upon request as supplementary material. Following the algorithm illustrated in Section 3, a bootstrap sample of size was obtained for each validation station. These 500 curves were ordered using both distance and band depth, where the latter was calculated using the modified version MBD. The resulting 95% prediction bands are shown in Figure 3. Overall, the two uncertainty measures provide very similar prediction bands.
In Figure 4 (top), we show the depth based band width corresponding to a sample size of and all simulation scenarios, while Figure 4 (bottom) summarizes the difference in width when using band depth and distance for the same sample size. The remaining figures (for ) are not shown here due to space limitations but are available upon request as supplementary material. Greater values of the scale parameter lead to wider prediction bands, while for a fixed value of the scale parameter, the width of the band decreases slightly with increasing range . When comparing band depth and distance (Figure 4 (bottom)) it is interesting to see that the depth based interval is predominantly wider than the distance based one, regardless of the value of and . Figure 5 shows the depth based prediction band width for a fixed simulation scenario ( and ) and all three samples sizes; here it can be seen that as increases, the width of the interval decreases and it becomes slightly more stable on .
On the other hand, the median domain coverage (see Eq. (4)) over simulations is summarized in Figure 6 for all simulation scenarios and sample sizes. Looking at the empirical distribution of over simulations the maximum is always 100%, whereas the minimum varies from 78.2% to 100%. This coverage is linked to the width of the interval (the larger the width the larger the coverage) but also to the goodness of FKED prediction because the uncertainty band is built around the predicted curve: if the prediction is far from the original data, the corresponding coverage will be poor, and viceversa. We can see in Figure 6 that median domain coverage improves as increases and it is slightly better for small values of . Overall, median coverage ranges from 85.2% to 100% and is highly dependent on validation site. In particular, it can be seen that when all locations apart from validation stations numbered 4 and 7 have a median domain coverage greater than 95%. Station 3 seems to perform particularly badly when but this is explained by the fact that it is located on the border of the spatial region considered (see Figure 1(left)) and has no neighbouring sites.
Following the earlier discussion at the end of Section 3, we allow for a small tolerance when calculating functional coverage (see Eq. (5)), and this is shown in Figure 7 for and . It can be seen that decreases as increases but improves with increasing values of (i.e. when the dependence structure becomes stronger) and varies greatly with validation site. Once again, when (Figure 7 top) sites 4 and 7 show a lower functional coverage for ; if we disregard these two, for the remaining stations is very close to 100% in the best scenario (, ). Given the empirical distribution of domain coverage discussed above, it is intuitive to see that we cannot expect a functional coverage around the nominal level 95% for all validation stations. For example, the minimum and median domain coverage for station 4 when , , are 82.2% and 90.1% respectively, while the percentile is 84.2%; this means that if we set the tolerance the nominal level 95% cannot be reached. If instead we allow for a tolerance (see Section 6 for a discussion), the functional coverage improves considerably as shown in Figure 7(bottom). Moreover, if rather than discretizing the curve to 101 points we use , the results on coverage improve; for example, median domain coverage ranges from 90% to 100%.
5 Real data analysis
5.1 Canadian temperature data
As first case study, we have chosen the well known data set of Canadian temperature. This has been repeatedly used in the functional data literature (see, for example, Menafoglio et al. (2013); Giraldo et al. (2010); Ramsay and Silverman (2006); Scheipl et al. (2015)). The data set consists of daily annual mean temperature collected at 35 meteorological stations in Canada’s Maritimes Provinces: Nova Scotia, New Brunswick and Prince Edward Island, over the period 1960 to 1994. Note that the data set used here (available in the geofd R package; Giraldo et al. (2012)), which is the same as in Menafoglio et al. (2013); Giraldo et al. (2010), covers a smaller geographical area than that in Ramsay and Silverman (2006). While other papers have been devoted to prediction of these temperature curves, our objective is to provide uncertainty bands for the predicted curves. Therefore, discussing the fitted model is out of the scope of this paper. For this aim, we selected five stations at random to use as validation stations for which we will provide prediction bands according to our proposal in Section 3. These are marked in red in Figure 8 (left). Data were converted to functional observations through smoothing by using penalized cubic Bsplines with 120 basis functions and penalty parameter equal to zero. These values were chosen using functional crossvalidation. The FKED model, with longitude and latitude as covariates, was then fitted to the remaining 30 stations and predicted temperature curves were obtained using ordinary functional kriging with an exponential variogram model for the 5 validation stations. From the empirical tracevariogram, there was no evidence of a discontinuity at the origin and hence we fixed the nugget equal to zero. For each of the validation sites, a bootstrap sample of size was obtained. Band depth was calculated using the modified version MBD. The resulting 95% prediction bands are shown in Figure 9. The uncertainty bands are fairly narrow, as expected when observing the small variability among curves in Figure 8 (right) but they become slightly wider in winter. Overall, the two uncertainty measures seem to agree well, although in some cases the distance based prediction band appears to be slightly narrower than the depth based one. Domain coverage percentages for all 5 validation sites range from 98.9% to 100%.
5.2 Air pollution data
The second case study considered consists of daily PM concentrations (in ) measured in 24 sites (red triangles in Figure 10) from October 2005 to March 2006 by the monitoring network of Piemonte region (Italy). Measurements were also available at 10 additional locations (blue dots in Figure 10) that are used as validation stations. Apart from geographical information, i.e. longitude, latitude and altitude of each station, which were considered as scalar covariates, information was available on daily maximum mixing height, daily total precipitation, daily mean wind speed, daily mean temperature and daily emission rates of primary aerosols, which were taken as functional covariates. These are available as a result of a nested system of deterministic computerbased models implemented by the environmental agency ARPA Piemonte (Finardi et al., 2008). This data set had already been analyzed in Cameletti et al. (2011). For further details on the model the reader is referred to Ignaccolo et al. (2014). A log transformation was used on the response variable to achieve normality and stabilize withinstation variability. Prior to modelling, data (both response and functional covariates) were smoothed by means of cubic Bsplines with 146 basis functions and penalty parameter equal to 0. These values were chosen using functional crossvalidation (Ignaccolo et al., 2014).
The functional kriging with external drift model described in Section 2 was applied to the air pollution data, including the (standardized) scalar and functional covariates mentioned above, to obtain prediction curves (via ordinary kriging for functional data with an exponential variogram model and zero nugget) at the 10 validation sites. To obtain 95% prediction bands for each of the predicted curves, a bootstrap sample of size 1000 was obtained for each site following the algorithm proposed in Section 3. We obtained prediction bands according to both the modified band depth (with ) and distance induced order. Prediction bands for the ten validation sites are shown in Figure 11.
Overall, the two uncertainty measures seem to agree well, although in some cases the depth based band appears to be slightly wider than the distance based one. The domain coverage varies from 97.3% to 100%.
6 Discussion
The functional kriging with external drift model allows to predict a whole curve  regardless of the domain of the functional observations  taking into account exogenous covariates and the underlying spatial dependence. Nevertheless, uncertainty evaluation in the functional kriging context has hardly ever been addressed in the literature. The classic functional kriging variance provides a unique value over the whole domain , while our aim is to provide an uncertainty measure whose value may change along the predicted curve. Given the lack of an analytic expression of a domainvarying kriging variance for a curve, we propose a semiparametric bootstrap approach that not only allows the uncertainty to change over the domain but also takes into account the uncertainty due to drift estimation. The contrast considered to build the uncertainty band, defined as the difference between the spatial prediction and the unknown value, could be thought of as a sort of pivotal quantity, the distribution of which is unknown. The use of a pivotal quantity is recalled to improve the performance of the bootstrap in the parametric framework (Canty et al., 2006).
We considered two different techniques proposed in the literature for ordering the bootstrapped curves, namely functional depth and distance, but we did not find great differences in the results for either the simulations or the two case studies. To be conservative, one could argue that it is safer to use the band depth as this criterion gives larger bands on average.
Overall, using the known data at validation sites in both simulations and real case studies, it can be concluded that the proposed method is a valid approach in evaluating uncertainty for a predicted curve. In particular, the simulation study shows that the width of the bands depends on the scale and range parameters of the covariance structure, at least for the exponential case, and becomes more stable over the domain with increasing sample size.
In terms of functional coverage, we can conclude that once we allow for a small tolerance the method performs generally well. If for a nominal domain coverage we are willing to accept an effective coverage of 90%, then it seems reasonable to assume a tolerance , in which case results are satisfactory. Note that, in practice, when the method is applied to a real dataset, the functional coverage cannot be calculated and assessment would have to be done based solely on domain coverage, whose median in the simulation study was not far from the nominal level of 95%.
Data in the two case studies considered in this paper were regularly spaced over the domain ; however, that is not always the case (see e.g. atmospheric profiles in Ignaccolo et al. (2015)). For irregularly spaced data, curves could be aligned at the initial smoothing step before fitting the functional kriging with external drift model, as it is straightforward to evaluate the curves for every in a common grid for all curves.
The algorithm proposed for uncertainty evaluation of spatially correlated curves is computationally feasible (running times with : 4 hours for the PM case study and 2.35 hours for the Canadian temperature case study using an Intel Core i74770 CPU 3.40GHz 16GB RAM) and applies to a wide range of practical situations, as it can be used regardless of the complexity of the drift term, or even in the absence of the latter. Our proposal has been specified in the case of ordinary kriging where the weight coefficients are constant and the spatial structure is determined by means of the tracevariogram. We are aware that the simulation study is limited to the simple case in which the spatial dependence structure does not change w.r.t. , as assumed by the tracecovariogram. While it might be worth exploring more complex scenarios a full analysis of the air pollution data described in Section 5.2, Ignaccolo et al. (2014) showed that in practice the assumption of a spatial structure that is constant over time is reasonable, as the drift term in the model picks up most of the spatial variability over time. This suggests that in similar situations, where the available covariates are able to explain a large part of the timespace variability, ordinary residual kriging should be appropriate.
Nevertheless, it may be the case that a more complex kriging alternative is desiderable in order to let the weights vary with (for both the real data and the bootstrap samples). To be consistent with the way the kriging weights have been specified in the cases of continuous timevarying kriging and functional kriging total model, the underlying spatial structure (and hence the matrices in Section 2 and in Section 3) should not be determined by means of the tracevariogram. In fact, in Giraldo et al. (2011) and Giraldo et al. (2009) functional data are expressed as linear combinations of splines and a Linear Model of Coregionalization is used to estimate crosscorrelations among the spline coefficients; this way a possible interaction between the curve domain and the space domain is taken into account. Consequently the matrices and would need to be adjusted accordingly but with the added burden of an increased computational load.
We believe that the method proposed in this paper is appropriate in the framework of functional data and is able to provide uncertainty bands for a predicted curve in an unmonitored site. Further, half the width of the resulting prediction band could be considered as an approximate margin of error. We think this will prove useful for monitoring purposes and policy assessment where the uncertainty should always accompany the related prediction.
7 Acknowledgments
This work was supported by “Futuro in Ricerca” 2012 Grant (project no. RBFR12URQJ, named StEPhI) provided by the Italian Ministry of Education, Universities and Research. The authors are grateful to all the participants of the StEPhI workshops for the fruitful discussions.
References
 Caballero et al. (2013) Caballero, W., R. Giraldo, and J. Mateu (2013). A universal kriging approach for spatial functional data. Stochastic Environmental Research and Risk Assessment 27(7), 1553–1563.
 Cameletti et al. (2011) Cameletti, M., R. Ignaccolo, and S. Bande (2011). Comparing spatiotemporal models for particulate matter in piemonte. Environmetrics 22(8), 985–996.
 Canty et al. (2006) Canty, A., A. Davison, D. Hinkley, and V. Ventura (2006). Bootstrap diagnostics and remedies. Canadian Journal of Statistics 34(1), 5–27.
 Chiles and Delfiner (2012) Chiles, J.P. and P. Delfiner (2012). Geostatistics: Modeling Spatial Uncertainty, 2nd Edition. Wiley, Hoboken, NJ.
 Cressie (1993) Cressie, N. (1993). Statistics for Spatial Data. Revised Edition Wiley, New York.
 Cuevas et al. (2006) Cuevas, A., M. Febrero, and R. Fraiman (2006). On the use of the bootstrap for estimating functions with functional data. Computational Statistics and Data Analysis 51, 1063–1074.
 Delicado et al. (2010) Delicado, P., R. Giraldo, C. Comas, and J. Mateu (2010). Statistics for spatial functional data: Some recent contributions. Environmentrics 21, 224–239.
 Ferraty et al. (2010) Ferraty, F., I. V. Keilegom, and P. Vieu (2010). On the validity of the bootstrap in nonparametric functional regression. Scandinavian Journal of Statistics 37, 286–306.
 Finardi et al. (2008) Finardi, S., R. DeMaria, A. D’Allura, C. Cascone, G. Calori, and F. Lollobrigida (2008). A deterministic air quality forecasting system for Torino urban area, Italy. Environmental Modelling and Software 23(3), 344–355.
 Freedman and Peters (1984) Freedman, D. and S. Peters (1984). Bootstrapping a regression equation: Some empirical results. Journal of the American Statistical Association 79, 97–106.
 Giraldo et al. (2009) Giraldo, R., P. Delicado, and J. Mateu (2009). Geostatistics with infinite dimensional data: A generalization of cokriging and multivariable spatial prediction. Technical report, Reporte Interno de Investigacion No. 14, Universidad Nacional de Colombia.
 Giraldo et al. (2010) Giraldo, R., P. Delicado, and J. Mateu (2010). Continuous timevarying kriging for spatial prediction of functional data: An environmental application. Journal of Agricultural, Biological, and Environmental Statistics 15(1), 66–82.
 Giraldo et al. (2011) Giraldo, R., P. Delicado, and J. Mateu (2011). Ordinary kriging for functionvalued spatial data. Environmental and Ecological Statistics 18(3), 411–426.
 Giraldo et al. (2012) Giraldo, R., J. Mateu, and P. Delicado (2012). geofd: An R package for functionvalued geostatistical prediction. Revista Colombiana de Estadística 35(3), 383–405.
 Goldsmith et al. (2013) Goldsmith, J., S. Greven, and C. Crainiceanu (2013). Corrected confidence bands for functional data using principal components. Biometrics 69, 41–51.
 GonzálezRodríguez and Colubi (2017) GonzálezRodríguez, G. and A. Colubi (2017). On the consistency of bootstrap methods in separable Hilbert spaces. Econometrics and Statistics 1, 118–127.
 Goulard and Voltz (1993) Goulard, M. and M. Voltz (1993). Geostatistical interpolation of curves: A case study in soil science. In A. Soares (Ed.), Geostatistics Troia ’92, pp. 805–816. Kluwer Academic, Dordrecht.
 Horváth and Kokoszka (2012) Horváth, L. and P. Kokoszka (2012). Inference for functional data with applications. Springer, New York.
 Ignaccolo et al. (2015) Ignaccolo, R., M. FrancoVilloria, and A. Fassò (2015). Modelling collocation uncertainty of 3D atmospheric profiles. Stochastic Environmental Research and Risk Assessment 29(2), 417–429.
 Ignaccolo et al. (2014) Ignaccolo, R., J. Mateu, and R. Giraldo (2014). Kriging with external drift for functional data for air quality monitoring. Stochastic Environmental Research and Risk Assessment 28, 1171–1186.
 Iranpanah et al. (2011) Iranpanah, N., M. Mohammadzadeh, and C. Taylor (2011). A comparison of block and semiparametric bootstrap methods for variance estimation in spatial statistics. Computational Statistics and Data Analysis 55, 578–587.
 Lahiri (2003) Lahiri, S. (2003). Resampling Methods for Dependent Data. SpringerVerlag, New York.
 LopezPintado and Romo (2009) LopezPintado, S. and J. Romo (2009). On the concept of depth for functional data. Journal of the American Statistcal Association 104(486), 718–734.
 Menafoglio and Petris (2016) Menafoglio, A. and G. Petris (2016). Kriging for Hilbertspace valued random fields: The operatorial point of view. Journal of Multivariate Analysis 146, 84–94.
 Menafoglio et al. (2013) Menafoglio, A., P. Secchi, and M. D. Rosa (2013). A universal kriging predictor for spatially dependent functional data of a Hilbert space. Electronic Journal of Statistics 7, 2209–2240.
 Nerini et al. (2010) Nerini, D., P. Monestiez, and C. Manté (2010). Cokriging for spatial functional data. Journal of Multivariate Analysis 101, 409–418.
 Pigoli et al. (2016) Pigoli, D., A. Menafoglio, and P. Secchi (2016). Kriging predictions for manifoldvalued random fields. Journal of Multivariate Analysis 145, 117–131.
 R Core Team (2015) R Core Team (2015). R: A Language and Environment for Statistical Computing. Vienna, Austria: R Foundation for Statistical Computing.
 Ramsay and Silverman (2006) Ramsay, J. and B. Silverman (2006). Functional Data Analysis. Springer, New York.
 Ramsay et al. (2014) Ramsay, J. O., H. Wickham, S. Graves, and G. Hooker (2014). fda: Functional Data Analysis. R package version 2.4.4.
 Rana et al. (2016) Rana, P., G. Aneiros, J. Vilar, and P. Vieu (2016). Bootstrap confidence intervals in functional nonparametric regression under dependence. Electronic Journal of Statistics 10, 1973–1999.
 Reyes et al. (2015) Reyes, A., R. Giraldo, and J. Mateu (2015). Residual kriging for functional spatial prediction of salinity curves. Communications in Statistics  Theory and Methods 44(4), 798–809.
 Robinson (1991) Robinson, G. (1991). That BLUP is a good thing: The estimation of random effects. Statistical Science 6, 15–32.
 Scheipl et al. (2015) Scheipl, F., A.M. Staicu, and S. Greven (2015). Functional additive mixed models. Journal of Computational and Graphical Statistics 24(2), 477–501.
 Schelin and de Luna (2010) Schelin, L. and S. S. de Luna (2010). Kriging prediction intervals based on semiparametric bootstrap. Mathematical Geosciences 42, 985–1000.
 Solow (1985) Solow, A. (1985). Bootstrapping correlated data. Journal of the International Association for Mathematical Geology 17, 769–775.
 Speed (1991) Speed, T. (1991). [That BLUP is a good thing: The estimation of random effects]: Comment. Statistical Science 6(1), 42–44.
 Sun and Genton (2011) Sun, Y. and M. Genton (2011). Functional boxplots. Journal of Computational and Graphical Statistics 20(2), 316–334.
 Sun et al. (2012) Sun, Y., M. Genton, and D. Nychka (2012). Exact fast computation of band depth for large functional datasets: How quickly can one million curves be ranked? Stat 1, 68–74.
 Wood (2006) Wood, S. (2006). Generalized Additive Models: An Introduction with R. Chapman and Hall/CRC.
 Wood (2011) Wood, S. (2011). Fast stable restricted maximum likelihood and marginal likelihood estimation of semiparametric generalized linear models. Journal of the Royal Statistical Society (B) 73(1), 3–36.
 Wood (2015) Wood, S. (2015). mgcv: Mixed GAM Computation Vehicle with GCV/AIC/REML Smoothness Estimation. R package version 1.8.6.
Appendix: note on the tracevariogram
Let us recall that the tracevariogram is defined, for a zeromean weaklystationary isotropic process, as
where represents the Euclidean distante between locations and .
By assuming a finite development for the functions such that where is the number of considered basis functions and is the th basis function evaluated at , we can write
where is assumed identical for all
and . Thus we get a factorization with respect to the spatial domain and the curve domain .
Moreover, for every , and we can write
With the assumption of stationarity and isotropy for we obtain
so that the tracevariogram is written as a product of a (classical) spatial variogram and a constant depending on the basis functions .