1 Introduction

Simultaneous Confidence Band for Stationary

Covariance Function of Dense Functional Data

Jiangyan Wang, Guanqun Cao, Li Wang and Lijian Yang

Nanjing Audit University, China, Auburn University, USA,

Iowa State University, USA, and Tsinghua University, China

Address for correspondence: Li Wang (lilywang@iastate.edu) and Lijian Yang (yanglijian@tsinghua.edu.cn)

Abstract: Inference via simultaneous confidence band is studied for stationary covariance function of dense functional data. A two-stage estimation procedure is proposed based on spline approximation, the first stage involving estimation of all the individual trajectories and the second stage involving estimation of the covariance function through smoothing the empirical covariance function. The proposed covariance estimator is smooth and as efficient as the oracle estimator when all individual trajectories are known. An asymptotic simultaneous confidence band (SCB) is developed for the true covariance function, and the coverage probabilities are shown to be asymptotically correct. Simulation experiments are conducted on the numerical performance of the proposed estimator and SCB. The proposed method is also illustrated by two real data examples.
Key words and phrases: Confidence band, Covariance function, Functional data, Stationary.

## 1 Introduction

Since Ramsay and Dalzell (1991) first coined the term “functional data analysis” (FDA), recent years have seen numerous publications emerging in the FDA theory, methods and applications, making it an important area in statistics research. Motivated by specific problems and complex data being collected in modern experiments, such as Ramsay and Silverman (2002), Ramsay and Silverman (2005), Hsing and Eubank (2015), considerable efforts have been made to analyze functional data. Estimation for population mean function in functional data has been extensively studied, for instance, Cardot (2000), Cao et al. (2012), Chen and Song (2015), Ma et al. (2012), Ferraty and Vieu (2006) and Zheng et al. (2016) and so on.

Being related to the smoothness, the second-order structure of random functions can be depicted by the covariance, thus the covariance function is another indispensable ingredient in many areas. In this sense, Cao et al. (2016) proposed a simultaneous confidence envelope of covariance function in functional data; Horváth and Kokoszka (2012) considered the detection of sequential changes in the covariance structure of functional observations; Horváth et al. (2013) proposed a consistent estimator for the long-run covariance operator of stationary time series; Pantle et al. (2010) were concerned with the estimation of integrated covariance functions, which is required to construct asymptotic confidence intervals and significance tests for the mean vector in the context of stationary random fields. Since the covariance function measures stronger association among variables that are closer to each other, the employing of covariance function is considerably highlighted in spatial data analysis when the geometric structure of the surface is rough and self-similar. A common situation is that the observations are specified via a Gaussian process whose finite dimensional joint distributions are determined by a valid covariance function; see, for instance, Choi et al. (2013).

Let be a stochastic process defined on a compact interval , with . It is covariance stationary if , where

 G(x,x′)=Cov{η(x),η(x′)},x,x′∈χ. (1)

Consider a collection of trajectories , which are iid realizations of , with mean and covariance functions , , respectively. The trajectories are decomposed as , where can be viewed as a small-scale variation of on the th trajectory, and is assumed to be a weakly stationary process with and covariance .

According to classical functional analysis, there exist eigenvalues , , and corresponding eigenfunctions of , the latter being an orthonormal basis of , and , . The standard process , , then allows the well-known Karhunen-Loève  representation , in which the random coefficients , called functional principal component (FPC) scores, are uncorrelated with mean  and variance . The rescaled eigenfunctions, , called FPC, satisfy that , , for . The th process , , is written as , in which the FPC scores , , are iid copies of . Although the sequences ,  and exist mathematically, they are unknown or unobservable, respectively.

The actual observed functional data are noisy sampled points from trajectories . Let be repeated measurements on a random sample of  experimental units, where is the response observed on the th unit at value of the variable . The observed data can be modeled as

 Yij=ηi(Xij)+σ(Xij)εij=m(Xij)+Zi(Xij)+σ(Xij)εij

for , , where are iid random errors with mean and variance , and is the variance function of the measurement errors, and ’s are independent of the ’s. For the data considered in this paper, is recorded on a regular grid in , and without loss of generality, is taken to be , and , . This type of functional data was considered in Li and Hsing (2010), Crainiceanu et al. (2009), Cao et al. (2016) and among others. Consequently, our observed data can be written as

 Yij=m(j/N)+Zi(j/N)+σ(j/N)εij, 1≤i≤n, 1≤j≤N. (2)

It would not be a far stretch if the sample points for the th subject admit the structure of a nonstationary or locally stationary time series, as in Fryzlewicz and Ombao (2009) and Sanderson et al. (2010). One may further ask if these random observations at regular grid points would even admit the structure of stationary time series. Cao et al. (2016), for instance, concluded that the Tecator near infrared spectra data is nonstationary based on the simultaneous confidence envelope for the covariance function. There are, however, interesting functional data for which the covariance function exhibits stationarity, because a closer relationship between the geometric structures and covariance function relies on the stationary assumption. In particular, the stationary random processes or fields are prominent in the analysis of 1D and 2D signals, see, for instance, the important spatial covariance model studied in Matérn random fields, the carrying out of multivariate time series such as stationary functional series that are linear and the stationary spectral-space statistics studied in physics such as Tsyrulnikov and Gayfulin (2017). As a fundamental issue, the study of covariance structure in stationary stochastic processes can be applied to a wide range of areas such as hydrosciences and geostatistics. The contribution of this paper is twofold. First, it provides methodology and asymptotic theory for the estimation of the covariance in the framework of stationary dense functional data under mild assumptions; second, the estimator of is accompanied by a procedure for constructing asymptotic simultaneous confidence band (SCB).

Our estimation procedure is based on spline approximation, where the first step involves the estimation of the th trajectory and the mean function; the second step estimates the covariance function through smoothing using the residuals of the first step. The proposed covariance estimator is smooth and as efficient as the oracle estimator when all trajectories and the mean are known.

After the estimation of the covariance function, the next question of interest is to provide an inferential tool to further examine whether a covariance function has some specific parametric form. For global inference, we construct SCB for the covariance function to test the adequacy and validity of certain covariance models, and specifically, one can test the hypothesis for some . An SCB is an intuitive and theoretically reliable tool for global inference of functions, for example, Wang et al. (2014) constructed SCB for the autoregressive error distribution function in time series, Cao et al. (2012), Ma et al. (2012), Gu et al. (2014) and Zheng et al. (2014) proposed SCBs for the mean function , Cao et al. (2016) offered a simultaneous confidence envelope for the covariance function in the framework of FDA, other contexts of work containing more theoretical results and applications of SCB may be found in Chen and Song (2015), Wang (2012), Gu and Yang (2015), Wang et al. (2016) and Zheng et al. (2016).

The rest of paper is organized as follows. In Section 2, we introduce the two-stage B-spline estimation procedure for the covariance function. Section 3 shows that the proposed estimator is as efficient as if all the trajectories and the mean function are known over the entire data range. Section 4 presents the asymptotic SCB for the covariance function, and describes the implementation procedure of the SCB. Section 5 reports findings from a simulation study to evaluate the finite sample performance of the proposed SCB. Technical lemmas and proofs are deferred to Appendices A and B. More simulation studies are carried out in Appendix C. Analysis of real data is given in Appendix D.

## 2 B-spline covariance function estimation

In this section, we describe the estimation procedure for the covariance function . If the small-scale variation of , , , , on the th trajectory could be observed, one would estimate the covariance as

 ˜C(h)=11−h∫1−h01nn∑i=1Zi(x)Zi(x+h)dx,h∈[0,h0], (3)

where is a pre-specified upper limit.

Since , , are unobserved, the above estimator is “infeasible” in practice. In this paper, we propose to estimate the covariance function based on the following residuals

 ˆZi(x)=ˆηi(x)−ˆm(x),1≤i≤n,x∈[0,1], (4)

where and are the estimators of and .

In such case, a sample-based consistent estimator can be employed, such as the spline smoother proposed in Cao et al. (2012). Denote by a sequence of equally-spaced points, , , , called interior knots, which divide the interval  into equal subintervals , , , . For any positive integer , let and be auxiliary knots. Let be the polynomial spline space of order on , , which consists of all times continuously differentiable functions on that are polynomials of degree on subintervals , . Following the notation in de Boor (2001), we denote by the th order B-spline basis functions of , hence .

The th unknown trajectory is estimated by using the following formula

 ˆηi(⋅)=argming(⋅)∈S(p−2)N∑j=1{Yij−g(xj)}2. (5)

One can then estimate the unknown mean function as

 ˆm(x)=n−1n∑i=1ˆηi(x), (6)

and obtain the covariance estimator

 ˆC(h)=11−h∫1−h01nn∑i=1ˆZi(x)ˆZi(x+h)dx,h∈[0,h0]. (7)

## 3 Asymptotic Properties

This section studies the asymptotic properties for the proposed estimators.

### 3.1 Assumptions

To study the asymptotic properties of the two-step spline estimator , one needs some assumptions. Throughout the paper, for any function defined on a domain , denote , and its th order derivative with respect to . For any integrable functions and , , define their theoretical inner product as , and the empirical inner product as . The related theoretical and empirical norms are , .

For a non-negative integer and a real number , write as the space of -Hölder continuous functions, i.e.,

 H(q,μ)[0,1]={φ:[0,1]→ R∣∣ ∣∣∥φ∥q,μ=supx,y∈R,x≠y∣∣ ∣∣φ(q)(x)−φ(q)(y)|x−y|μ∣∣ ∣∣<+∞}.

We next introduce some technical assumptions.

1. There exist an integer and a constant , such that the regression function . In the following, one denotes .

2. The standard deviation function for positive index and for some constants , , .

3. There exists a constant , such that as , , .

4. The rescaled FPCs with , ; for increasing positive integers , as , and for some .

5. There are positive constants , and iid variables , such that for the index  in Assumption (A2), in Assumption (A1), and

 Pr{max1≤k≤knmax1≤t≤n∣∣ ∣∣t∑i=1ξik−t∑i=1Uik,ξ∣∣ ∣∣>nβ1} Nβ2}
6. The iid variables are independent of . The number of distinct distributions for all FPC scores  is finite. There exist constants , , for in Assumption (A4) and in Assumption (A3), such that and are finite.

7. The spline order , the number of interior knots for some with as , and for in Assumption (A1), in Assumption (A2), in Assumption (A3), and in Assumption (A5)

 max{5θ4p∗,θ+(γ1+1+ω)−18θβ12p∗,1−ν}<γ<1−θ2−β2−θ2β1.

Assumptions (A1)–(A2) are standard in the literature, see Cao et al. (2012) and Song and Yang (2009) for instance. In particular, (A1) and (A4) control the size of the bias of the spline smoother for and . Assumption (A2) ensures the variance function is a uniformly bounded function. Assumption (A3) regulates that sample size grows as a fractional power of , the number of observations per subject. The bounded smoothness of the principal components is guaranteed in Assumption (A4). Assumption (A5) provides a strong approximation of estimation errors and FPC scores. Assumption (A5’) is an elementary assumption to guarantee the high level Assumption (A5). It is noteworthy that the smoothness of our estimator is controlled by the knots of the splines. Assumption (A6) specifies the requirement that the number of knots has to meet for the B-spline smoothing.

###### Remark 1.

The above assumptions are mild conditions that can be satisfied in many practical situations. One simple and reasonable setup for the above parameters , , , , can be the following: , , , (cubic spline), , , where means and are asymptotically equivalent. These constants are used as defaults in implementing the method; see Section 4.

### 3.2 Oracle efficiency

We now show that the proposed two-step estimator defined in (7) is oracle-efficient, i.e., it is as efficient as if all trajectories are known over the entire data range. To begin with, we first investigate the asymptotic property of the infeasible covariance estimator . Denote by , .

According to the definition of and in (1) and (3), one has

 C(h) =∞∑k=1∞∑k′=1E(ξikξik′)11−h∫1−h0ϕk(x)ϕk′(x+h)dx =11−h∫1−h0∞∑k=1ϕk(x)ϕk(x+h)dx, ˜C(h) =1n(1−h)n∑i=1∞∑k=1∞∑k′=1ξikξik′∫1−h0ϕk(x)ϕk′(x+h)dx.

Thus,

 Δ(h)=11−h∞∑k,k′=1(¯ξ⋅kk′−δkk′)∫1−h0ϕk(x)ϕk′(x+h)dx,

where , and for and otherwise.

Then the asymptotic mean squared error of the infeasible covariance estimator is provided in Theorem 1 below.

###### Theorem 1.

Under Assumptions (A1)–(A6), , in which

 Ξ(h)= ∞∑k,k′=1{11−h∫1−h0ϕk(x)ϕk′(x+h)dx}2 +∞∑k,k′=1{11−h∫1−h0ϕk(x)ϕk′(x+h)dx}{11−h∫1−h0ϕk′(x)ϕk(x+h)dx} +∞∑k=1(Eξ41k−3){11−h∫1−h0ϕk(x)ϕk(x+h)dx}2. (8)
###### Remark 2.

By rewriting , one has

 Ξ(h)= ∞∑k=1(Eξ41k−1){11−h∫1−h0ϕk(x)ϕk(x+h)dx}2 +∞∑k

Following from (3.2) in Cao et al. (2016),

 V(x,x+h)=∞∑k

thus, , . Therefore, if the covariance function is stationary, the infeasible estimator is more efficient than the covariance estimator given in Cao et al. (2016).

###### Proposition 1.

Under Assumptions (A1)–(A6), as , , where is a Gaussian process defined on such that , , with covariance function

 Ω(h,h′)= Cov(ζ(h),ζ(h′))=(1−h)−1(1−h′)−1 ×{∫1−h0∫1−h′0∞∑k,k′=1ϕk(x)ϕk(x′)ϕk′(x+h)ϕk′(x′+h′)dxdx′ +∫1−h0∫1−h′0∞∑k,k′=1ϕk(x)ϕk(x′+h′)ϕk′(x+h)ϕk′(x′)dxdx′ +∫1−h0∫1−h′0∞∑k=1(Eξ41k−3)ϕk(x)ϕk(x+h)ϕk(x′)ϕk(x′+h′)dxdx′},

for any .

The proof is deferred to the Appendix. Although the oracle smoother enjoys the desirable theoretical property, it is not a statistic since is unknown. According to Proposition 2 below, the price for using in place of in the covariance estimator is asymptotically negligible, that is, two-step estimator is as efficient as the infeasible estimator .

###### Proposition 2.

Under Assumptions (A1)–(A6), .

Combining the above two propositions, we obtain the following result.

###### Theorem 2.

Under Assumptions (A1)–(A6), .

Theorem 2 indicates that is the leading term of .

## 4 Simultaneous confidence band

In this section, we construct the SCB for the covariance function .

### 4.1 Asymptotic SCB

Next theorem presents the asymptotic behavior of the maximum of the normalized deviation of the covariance estimator , which sheds the lights on how to construct the asymptotic SCB for .

###### Theorem 3.

Under Assumptions (A1)–(A6), for any ,

 limN→∞Pr{n1/2∣∣ˆC(h)−C(h)∣∣Ξ(h)−1/2≤z1−α/2}=1−α,∀h∈[0,h0],

where is the percentile of the absolute maxima distribution of , while is denoted as the percentile of the standard normal distribution, and is the mean zero Gaussian process defined in Proposition 1.

Theorem 3 is a direct result of Propositions 1, 2 and Theorem 2, thus the proof is omitted.

###### Corollary 1.

Under Assumptions (A1)–(A6), an asymptotic exact SCB for is , . While an asymptotic pointwise confidence band for is given by , .

Note that the percentile and the variance function have to be estimated from the data. These issues are addressed in Section 4.4.

### 4.2 Knots selection

In spline smoothing, the number of knots are often treated as unknown tuning parameters, and the spline fitting can be sensitive to the knots selection. However, there is no optimal method to choose the number of knots in the literature, and the following two ways are suggested to select : (a) criterion-based selection strategies such as Generalized Cross Validation (GCV) and Bayesian Information Criterion (BIC); (b) formula based selection strategies, which is stated in Remark 1, where the smoothness order of mean function and eigenfunctions are taken as default or with a matching spline order (cubic spline). The default of parameter . Both methods are tried in the numerical studies, and they give very similar estimators and SCBs.

### 4.3 FPC analysis

We now describe how to obtain the covariance function , and its eigenfunctions and eigenvalues in the FPC analysis.

We estimate by

 ˆG(x,x′)=n−1n∑i=1ˆZi(x)ˆZi(x′)=Js+p∑s=1Js+p∑s′=1ˆβss′Bs,p(x)Bs′,p(x′), (9)

where is defined in (4) and ’s are the coefficients.

In FPC applications, it is typical to truncate the spectral decomposition at an integer to account for the some predetermined proportion of the variance. For example, in our numerical studies below, is selected as the number of eigenvalues that can explain 95% of the variation in the data. Next, let , and the design matrix for spline regression is

 B={B(1/N),…,B(N/N)}⊤=⎛⎜ ⎜⎝B1,p(1/N)⋯BJs+p,p(1/N)⋮⋯⋮B1,p(N/N)⋯BJs+p,p(N/N)⎞⎟ ⎟⎠. (10)

Then for any , we consider the following spline approximation for : , where ’s are coefficients of B-spline estimator subject to with . The estimates of eigenfunctions and eigenvalues correspond and can be obtained by solving the eigenequations,

 ∫ˆG(x,x′)ˆψk(x′)dx′=ˆλkˆψk(x),k=1,…,κ. (11)

According to (9), solving (11) is equivalent to solve the following: , , where .

By simple algebra and Lemma 3.1 in Wang and Yang (2009), one needs to solve , for any . Consider the following Cholesky decomposition: . Therefore, solving (11) is equivalent to solve , that is, and , , are the eigenvalues and unit eigenvectors of . In other words, is obtained by multiplying immediately after the unit eigenvectors of , hence is obtained. Consequently, . Then, the th FPC score of the th curve can be estimated by a numerical integration

 ˆξik=1NN∑j=1ˆλ−1k{Yij−ˆm(jN)}ˆϕk(jN).

### 4.4 Estimating the variance function Ξ and the percentile Q1−α

Notice the fact that (8) entails us to estimate the variance function by merely computing , and . In practice, the following estimator is employed

 ˆΞ(h)= κ∑k,k′=1(11−h∫1−h0ˆϕk(x)ˆϕk′(x+h)dx)2 +ˆC2(h)+κ∑k=1(Eˆξ41k−3){11−h∫1−h0ˆϕk(x)ˆϕk(x+h)dx}2.

Next, to derive the percentile , the Gaussian process is simulated as follows

 ˆζ(h)= κ∑k≠k′11−h∫1−h0ϵkk′ˆϕk(x)ˆϕk′(x+h)dx +κ∑k=111−h∫1−h0ϵkˆϕk(x)ˆϕk(x+h)(Eˆξ41k−1)1/2dx,

where and are independent standard Gaussian random variables. Hence, is a zero mean Gaussian process with variance function and covariance function

 ˆΩ (h,h′)=Cov{ˆζ(h),ˆζ(h′)} =11−h11−h′∫1−h0∫1−h′0⎧⎨⎩κ∑k,k′=1ˆϕk(x)ˆϕk(x′)ˆϕk′(x+h)ˆϕk′(x′+h′) +κ∑k=1(Eˆξ41k−3)ˆϕk(x)ˆϕk(x+h)ˆϕk(x′)ˆϕk(x′+h′)}dxdx′+ˆC(h)ˆC(h′),

for any . A large number of independent realizations of are simulated, then the maximal absolute deviation for each copy of is taken. Eventually, is estimated by the empirical percentiles of these maximum values.

## 5 Simulation studies

In this section, we conduct some simulation studies to illustrate the finite-sample performance of the proposed method. More simulation studies can be found in Appendix C.

The data are generated from the following model:

 Yij=m(j/N)+∞∑k=1ξikϕk(j/N)+