Estimation and Prediction using generalized Wendland Covariance Functions under fixed domain asymptotics

# Estimation and Prediction using generalized Wendland Covariance Functions under fixed domain asymptotics

\fnmsMoreno \snmBevilacqua\corref\thanksreft1,m1label=e1]moreno.bevilacqua@uv.cl [    \fnmsTarik \snmFaouzi\thanksrefm2label=e2]tfaouzi@ubiobio.cl [    \fnmsReinhard \snmFurrer\thanksreft3,m3label=e3]reinhard.furrer@math.uzh.ch [    \fnmsEmilio \snmPorcu\thanksreft4,m4label=e4]emilio.porcu@usm.cl [ Universidad de Valparaiso\thanksmarkm1 Universidad del BioBio\thanksmarkm2 University of Zurich\thanksmarkm3 Universidad Federico Santa Maria\thanksmarkm4, University of Newcastle\thanksmarkm4 Department of Statistics
Universidad de Valparaiso
2360102 Valparaiso, Chile
\printeade1
Department of Statistics
Universidad del BioBio
4081112 Concepcion, Chile
\printeade2
Universidad Federico Santa Maria
Department of Mathematics and
Department of Computational Science
University of Zurich
8057 Zurich, Switzerland
\printeade3
Department of Mathematics
Universidad Federico Santa Maria
2360102 Valparaiso, Chile
\printeade4
###### Abstract

We study estimation and prediction of Gaussian random fields with covariance models belonging to the generalized Wendland (GW) class, under fixed domain asymptotics. As for the Matérn case, this class allows for a continuous parameterization of smoothness of the underlying Gaussian random field, being additionally compactly supported. The paper is divided into three parts: first, we characterize the equivalence of two Gaussian measures with GW covariance function, and we provide sufficient conditions for the equivalence of two Gaussian measures with Matérn and GW covariance functions. In the second part, we establish strong consistency and asymptotic distribution of the maximum likelihood estimator of the microergodic parameter associated to GW covariance model, under fixed domain asymptotics. The third part elucidates the consequences of our results in terms of (misspecified) best linear unbiased predictor, under fixed domain asymptotics. Our findings are illustrated through a simulation study: the former compares the finite sample behavior of the maximum likelihood estimation of the microergodic parameter with the given asymptotic distribution. The latter compares the finite-sample behavior of the prediction and its associated mean square error when using two equivalent Gaussian measures with Matérn and GW covariance models, using covariance tapering as benchmark.

[
\kwd
\startlocaldefs\endlocaldefs\runtitle

Generalized Wendland Covariance Functions

{aug}

, and \thankstextt1Supported by grant FONDECYT 1160280 from Chilean government \thankstextt3Acknowledges support of URPP-GCB and SNSF-175529 \thankstextt4Supported by grant FONDECYT 1130647 from Chilean government

class=MSC] \kwd[Primary ]62M30 \kwd[; secondary ]62F12 \kwd60G25

compactly supported covariance \kwdspectral density \kwdlarge dataset \kwdmicroergodic parameter

## 1 Introduction

Covariance functions cover a central aspect of inference and prediction of Gaussian fields defined over some (compact) set of . For instance, the best linear unbiased prediction at an unobserved site depends on the knowledge of the covariance function. Since a covariance function must be positive definite, practical estimation generally requires the selection of some parametric classes of covariances and the corresponding estimation of these parameters.

The maximum likelihood (ML) estimation method is generally considered best for estimating the parameters of covariance models. The study of asymptotic properties of ML estimators is complicated by the fact that more than one asymptotic framework can be considered when observing a single realization from a Gaussian field. In particular, under fixed domain asymptotics, one supposes that the sampling domain is bounded and that the sampling set becomes increasingly dense. Under increasing domain asymptotics, the sampling domain increases with the number of observed data, and the distance between any two sampling locations is bounded away from zero.

The asymptotic behavior of ML estimators of the covariance parameters can be quite different under these two frameworks (Zhang and Zimmerman, 2005). Under increasing domain asymptotic framework and some mild regularity conditions Mardia and Marshall (1984) give a general result. Specifically, they show that ML estimators are consistent and asymptotically Gaussian, with asymptotic covariance matrix equal to the inverse of the Fisher information matrix.

Equivalence of Gaussian measures (Skorokhod and Yadrenko, 1973; Ibragimov and Rozanov, 1978) represents an essential tool to establish the asymptotic properties of both prediction and estimation of Gaussian fields under fixed domain asymptotics. In his tour de force, Stein (1988, 1990, 1993, 1999a, 2004) provides conditions under which predictions under a misspecified covariance function are asymptotically efficient, and mean square errors converge almost surely to their targets. Since Gaussian measures depend exclusively on their mean and covariance functions, practical evaluation of Stein’s conditions translates into the fact that the true and the misspecified covariances must be compatible, i.e., the induced Gaussian measures are equivalent.

Under fixed domain asymptotics, no general results are available for the asymptotic properties of ML estimators. Yet, some results have been obtained when assuming that the covariance belongs to the parametric family of Matérn covariance functions (Matérn, 1960) that has been very popular in spatial statistics for its flexibility with respect to continuous parameterization of smoothness, in the mean square sense, of the underlying Gaussian field. For a Gaussian field defined over a bounded and infinite set of , Zhang (2004) shows that when the smoothness parameter is known and fixed, not all parameters can be estimated consistently when . Instead, the ratio of variance and scale (to the power of the smoothing parameter), sometimes called microergodic parameter (Zhang and Zimmerman, 2005; Stein, 1999b), is consistently estimable. In contrast for , Anderes (2010) proved the orthogonality of two Gaussian measures with different Matérn covariance functions and hence, in this case, all the parameters can be consistently estimated under fixed-domain asymptotics. The case is still open.

Asymptotic results for ML estimator of the microergodic parameter of the Matérn model can be found in Zhang (2004), Du, Zhang and Mandrekar (2009), Wang and Loh (2011) and Kaufman and Shaby (2013). In particular, Kaufman and Shaby (2013) give strong consistency and asymptotic distribution of the microergodic parameter when estimating jointly the scale and variance parameters, generalizing previous results in Zhang (2004) and Wang and Loh (2011) where the scale parameter is assumed known and fixed. Kaufman and Shaby (2013) show by means of simulation study that asymptotic approximation using a fixed scale parameter can be problematic when applied to finite samples, even for large sample sizes. In contrast, they show that performance is improved and asymptotic approximations are applicable for smaller sample sizes when the parameters are jointly estimated.

Under the Matérn family, similar results have been obtained for the covariance tapering (CT) method of estimation, as originally proposed in Kaufman, Schervish and Nychka (2008) and consisting of setting to zero the dependence after a given distance. This is in turn achieved by multiplying the Matérn covariance with a taper function, that is, a correlation function being additionally compactly supported over a ball with given radius. Thus, the resulting covariance tapered matrix is sparse, with the level of sparseness depending on the radius of compact support. Sparse matrix algorithms can then be used to evaluate efficiently an approximate likelihood where the original covariance matrix is replaced by the tapered matrix. The results proposed in Kaufman, Schervish and Nychka (2008) have then inspired the works in Du, Zhang and Mandrekar (2009), Wang and Loh (2011) and Kaufman and Shaby (2013), where asymptotic properties of the CT estimator of the Matérn microergodic parameter are given.

Using the Matérn family, Furrer, Genton and Nychka (2006) study CT when applied to the best unbiased linear predictor and show that under fixed domain asymptotics and some specific conditions of the taper function, asymptotically efficient prediction and asymptotically correct estimation of mean square error can be achieved using a tapered Matérn covariance function. Extensions have been discussed by, e.g., Stein (2013) and Hirano and Yajima (2013). The basic message of CT method is that large data sets can be handled both for estimation and prediction exploiting sparse matrix algorithms when using the Matérn model.

Inspired by this idea, we focus on a covariance model that offers the strength of the Matérn family and allows the use of sparse matrices. Specifically, we study estimation and prediction of Gaussian fields under fixed domain asymptotics, using the generalized Wendland (GW) class of covariance functions (Gneiting, 2002a; Zastavnyi, 2006), the members of which are compactly supported over balls of with arbitrary radii, and additionally allows for a continuous parameterization of differentiability at the origin, in a similar way to the Matérn family.

In particular, we provide the following results. First, we characterize the equivalence of two Gaussian measures with covariance functions belonging to the GW class and sharing the same smoothness parameter. A consequence of this result is that, as in the Matérn case (Zhang, 2004), when the smoothness parameter is known and fixed, not all parameters can be estimated consistently under fixed domain asymptotics. Then we give sufficient conditions for the equivalence of two Gaussian measures where the state of truth is represented by a member of the Matérn family and the other measure has a GW covariance model and vice versa.

We assess the asymptotic properties of the ML estimator of the microergodic parameter associated to the GW class. Specifically, for a fixed smoothness parameter, we establish strong consistency and asymptotic distribution of the microergodic parameter estimate assuming the compact support parameter fixed and known. Then, we generalize these results when jointly estimating with ML the variance and the compact support parameter.

Finally, using results in Stein (1988, 1993), we study the implications of our results on prediction under fixed domain asymptotics. One remarkable implication is that when the true covariance belongs to the Matérn family, asymptotic efficiency prediction and asymptotically correct estimation of mean square error can be achieved using a compatible GW covariance model.

The remainder of the paper is organized as follows. In Section 2, we review some results of Matérn and GW covariance models. In Section 3, we first characterize the equivalence of Gaussian measure under the GW covariance model. Then we find a sufficient condition for the equivalence of two Gaussian measures with Matérn and GW covariance models. In Section 4, we establish strong consistency and asymptotic distribution of the ML estimator of the microergodic parameter of the GW models, under fixed domain asymptotics. Section 5 discusses the consequences of the results in Section 3 on prediction under fixed domain asymptotics. Section 6 provides two simulation studies: The first shows how well the given asymptotic distribution of the microergodic parameter applies to finite sample cases when estimating with ML a GW covariance model under fixed domain asymptotics. The second compare the finite-sample behavior of the prediction when using two compatible Matérn and GW models, using CT as a benchmark. The final section provides discussion on the consequence of our results and identifies problems for future research.

## 2 Matérn and generalized Wendland covariance models

We denote a zero mean Gaussian field on a bounded set of , with stationary covariance function . We consider the class of continuous mappings with , such that

 cov(Z(s),Z(s′))=C(s′−s)=ϕ(∥s′−s∥),

with , and denoting the Euclidean norm. Gaussian fields with such covariance functions are called weakly stationary and isotropic.

Schoenberg (1938) characterized the class as scale mixtures of the characteristic functions of random vectors uniformly distributed on the spherical shell of , with any positive measure, :

 ϕ(r)=∫∞0Ωd(rξ)F(dξ),r≥0,

with and a Bessel function of order . The class is nested, with the inclusion relation being strict, and where is the class of continuous mappings , the radial version of which is positive definite on any -dimensional Euclidean space.

The Matérn function, defined as:

 Mν,α,σ2(r)=σ221−νΓ(ν)(rα)νKν(rα),r≥0,

is a member of the class for any positive values of and . Here, is a modified Bessel function of the second kind of order , is the variance and a positive scaling parameter. The parameter characterizes the differentiability at the origin and, as a consequence, the differentiability of the associated sample paths. In particular for a positive integer , the sample paths are times differentiable, in any direction, if and only if .

When and is a nonnegative integer, the Matérn function simplifies to the product of a negative exponential with a polynomial of degree , and for tending to infinity, a rescaled version of the Matérn converges to a squared exponential model being infinitely differentiable at the origin. Thus, the Matérn function allows for a continuous parameterization of its associated Gaussian field in terms of smoothness.

We also define as the class that consists of members of being additionally compactly supported on a given interval, , . Clearly, their radial versions are compactly supported over balls of with radius .

We now define GW correlation functions as introduced by Gneiting (2002b), Zastavnyi (2006) and Chernih and Hubbert (2014). For , we define

 φμ,κ(r):={1B(2κ,μ+1)∫1ru(u2−r2)κ−1(1−u)μdu,0≤r<1,0,r≥1, (1)

with denoting the beta function. Arguments in Gneiting (2002b) and Zastavnyi (2006) show that, for a given , if and only if

 μ≥λ(d,κ):=(d+1)/2+κ. (2)

Throughout, we use instead of whenever no confusion arises. Integration by parts shows that the first part of (1) can also be written as:

 1B(1+2κ,μ)∫1r(u2−r2)κ(1−u)μ−1du,0≤r<1.

Note that is not defined because must be strictly positive. In this special case we consider the Askey function (Askey, 1973)

 Aμ(r)=(1−r)μ+={(1−r)μ,0≤r<1,0,r≥1,

where denotes the positive part. Arguments in Golubov (1981) show that if and only if and we define .

Finally, we define the GW covariance function, with compact support , variance and smoothness parameter as:

 φμ,κ,β,σ2(r):=σ2φμ,κ(r/β),r≥0, (3)

and if and only if . Accordingly, for , we define

 φμ,0,β,σ2(r):=σ2φμ,0(r/β),r≥0. (4)

When computing (3), numerical integration is obviously feasible, but could be cumbersome to (spatial) statisticians used to handle closed form parametric covariance model. Nevertheless, closed form solution of the integral in Equation (1) can be obtained when , a positive integer. In this case, , with a polynomial of order . These functions, termed (original) Wendland functions, were originally proposed by Wendland (1995). Other closed form solutions of integral (1) can be obtained when , using some results in Schaback (2011). Such solutions are called missing Wendland functions.

Recently, Porcu, Zastavnyi and Bevilacqua (2016) have shown that the GW class includes almost all classes of covariance functions with compact supports known to the geostatistical and numerical analysis communities. Not only original and Wendland functions, but also Wu’s functions (Wu, 1995), which in turn include the spherical model (Wackernagel, 2003), as well as the Trigub splines (Zastavnyi, 2006). Finally, Chernih, Sloan and Womersley (2014) show that, for tending to infinity, a rescaled version of the GW model converges to a squared exponential covariance model.

As noted by Gneiting (2002a), GW and Matérn functions exhibit the same behavior at the origin, with the smoothness parameters of the two covariance models related by the equation .

Fourier transforms of radial versions of members of , for a given , have a simple expression, as reported in Yaglom (1987) or Stein (1999b). For a member of the class , we define its isotropic spectral density as

 ˆϕ(z)=z1−d/2(2π)d∫∞0ud/2Jd/2−1(uz)ϕ(u)du,z≥0, (5)

and throughout the paper, we use the notations: , and for the radial parts of Fourier transforms of and , respectively.

A well-known result about the spectral density of the Matérn model is the following:

 ˆMν,α,σ2(z)=Γ(ν+d/2)πd/2Γ(ν)σ2αd(1+α2z2)ν+d/2,z≥0. (6)

For two given non negative functions and , with we mean that there exist two constants and such that and for each . The next result follows from Zastavnyi (2006), Chernih and Hubbert (2014), and from standard properties of Fourier transforms. Their proofs are thus omitted. Let us first define the function as:

 (1F2(a;b,c;z)=∞∑k=0(a)kzk(b)k(c)kk!,z∈R,

which is a special case of the generalized hypergeometric functions (Abramowitz and Stegun, 1970), with for , being the Pochhammer symbol.

###### Theorem 1.

Let be the function defined at Equation (3) and let as defined through Equation (2). Then, for and :

1.     for ;

2. ,  for

where , , , and

 Kς=2−κ−d+1π−d2Γ(μ+1)Γ(2κ+d)Γ(κ+d2)Γ(μ+2λ),

where .

Point 1 has been shown by Zastavnyi (2006). Points 2 and 3 can be found in Chernih and Hubbert (2014). Note that the case is not included in Theorem 1. We consider it in the following result, whose proof follows the lines of Zastavnyi (2006) and Chernih and Hubbert (2014) for the case .

###### Theorem 2.

Let as being defined at Equation (4). Then, for , :

1.     ;

2.       for ;

3. ,  for ,

with , , and defined as in Theorem 1 but with .

The spectral density and its decay for in Theorems 1 and 2 are useful when studying some geometrical properties of a Gaussian field or its associated sample paths (Adler, 1981). For instance, using Theorem 1 Point 1 or Theorem 2 Point 1, it is easy to prove that for a positive integer , the sample paths of a Gaussian field with GW function are times differentiable, in any direction, if and only if . Table 1 compares the GW for with for with the associated degree of sample paths differentiability.

## 3 Equivalence of Gaussian measures with GW models

Equivalence and orthogonality of probability measures are useful tools when assessing the asymptotic properties of both prediction and estimation for Gaussian fields. Denote with , , two probability measures defined on the same measurable space . and are called equivalent (denoted ) if for any implies and vice versa. On the other hand, and are orthogonal (denoted ) if there exists an event such that but . For a stochastic process , to define previous concepts, we restrict the event to the -algebra generated by , where . We emphasize this restriction by saying that the two measures are equivalent on the paths of .

Gaussian measures are completely characterized by their mean and covariance function. We write for a Gaussian measure with zero mean and covariance function . It is well known that two Gaussian measures are either equivalent or orthogonal on the paths of (Ibragimov and Rozanov, 1978).

Let , be two zero mean Gaussian measures with isotropic covariance function and associated spectral density , , as defined through (5). Using results in Skorokhod and Yadrenko (1973) and Ibragimov and Rozanov (1978), Stein (2004) has shown that, if for some , is bounded away from 0 and as , and for some finite and positive ,

 ∫∞czd−1{ˆρ1(z)−ˆρ0(z)ˆρ0(z)}2dz<∞, (7)

then for any bounded subset , on the paths of .

For the reminder of the paper, we denote with a zero mean Gaussian measure induced by a Matérn covariance function with associated spectral density , and with a zero mean Gaussian measure induced by a GW covariance function with associated spectral density .

Using (7) and (6), Zhang (2004) established the following characterization concerning the equivalent conditions of two Gaussian measures with Matérn covariance models.

###### Theorem 3 (Zhang, 2004).

For a given , let , , be two zero mean Gaussian measures. For any bounded infinite set , , on the paths of , if and only if

 σ20/α2ν0=σ21/α2ν1. (8)

The first relevant result of this paper concerns the characterization of the equivalence of two zero mean Gaussian measures under GW functions. The crux of the proof is the arguments in Equation (7), coupled with the asymptotic expansion of the spectral density as in Theorem 1 and 2.

###### Theorem 4.

For a given , let , , be two zero mean Gaussian measures and let , with as defined through Equation (2). For any bounded infinite set , , on the paths of if and only if

 σ20/β2κ+10=σ21/β2κ+11. (9)
###### Proof.

We first consider the case . Let us start with the sufficient part of the assertion. From Theorem 1 (Point 3), there exist two positive constants and such that

 ci≤z2λˆφμ,κ,βi,σ2i(z)≤Ci,i=0,1.

In order to prove the sufficient part, we need to find conditions such that, for some positive and finite ,

 ∫∞czd−1(ˆφμ,κ,β1,σ21(z)−ˆφμ,κ,β0,σ20(z)ˆφμ,κ,β0,σ20(z))2dz<∞. (10)

We proceed by direct construction, and, using Theorem 1 (Points 1 and 2), we find that, as ,

 ∣∣∣ ˆφμ,κ,β1,σ21(z)−ˆφμ,κ,β0,σ20(z)ˆφμ,κ,β0,σ20(z)∣∣∣≤Lςc−10z2λ∣∣∣σ21βd1[cς3(β1z)−2λ{1+O(z−2)} +cς4(zβ1)−(μ+λ){cos(β1z−cς5)+O(z−1)}] −σ20βd0[cς3(β0z)−2λ{1+O(z−2)} +cς4(zβ0)−(μ+λ){cos(β0z−cς5)+O(z−1)}]∣∣∣ ≤Lςc−10∣∣∣cς3[σ21β−(1+2κ)1−σ20β−(1+2κ)0]+O(z−2) +cς4zλ−μ[σ21β~λ1cos(β1z−cς5)−σ20β~λ0cos(β0z−cς5)] +cς4zλ−μO(z−1){σ21β~λ1−σ20β~λ0}∣∣∣,

where . Let us now write

 A(z) =cς3[σ21β−(1+2κ)1−σ20β−(1+2κ)0]+O(z−2), B(z) =cς4zλ−μ[σ21β~λ1cos(β1z−cς5)−σ20β~λ0cos(β0z−cς5)], and D(z) =cς4zλ−μO(z−1){σ21β~λ1−σ20β~λ0}.

Then, a sufficient condition for (10) is the following:

 (Lς/c0)2∫∞czd−1(A(z)+B(z)+D(z))2dz<∞. (11)

Note that is of order under Condition (9). We claim that (11) is satisfied if for , .

In fact, we have, for ,

 |B(z)| ≤cς4zλ−μ[σ21β~λ1+σ20β~λ0]≤c6zλ−μ,

and

 |D(z)| ≤cς4zλ−μO(z−1){σ21β~λ1+σ20β~λ0} ≤c7cς4zλ−μ−1{σ21β~λ1+σ20β~λ0}≤c8zλ−μ−1

with , and being positive and finite constants. Expanding (11) we notice that the dominant terms are and , independently on the cross products. These are respectively of the order and . This in turn implies that the integral (11) is finite if , for and and this implies that  (10) is satisfied under the same conditions. The sufficient part of our claim is thus proved.

The necessary part follows the proof of Zhang (2004). For and , we suppose and let . Then and define two equivalent Gaussian measures. We need to show that and define two orthogonal Gaussian measures. The rest of the proof follows the same arguments in Zhang (2004).

We omit the proof of the special case , since is similar to the case , but using the arguments in Theorem 2. ∎

An immediate consequence of Theorem 4 is that for fixed and , the and parameters cannot be estimated consistently (Zhang, 2004). Instead, the microergodic parameter is consistently estimable. In Section 4, we establish the asymptotic properties of ML estimation associated to the microergodic parameter.

The next result depicts an interesting scenario in which a GW and Matérn model are considered and gives sufficient conditions for the compatibility of these two covariance models. We offer a constructive proof, the crux of the argument being again Equation (7). We treat the cases and separately.

###### Theorem 5.

For given and , let and be two zero mean Gaussian measures. If , , with as defined through Equation (2), and

 σ20α−2ν=Cν,κ,μσ21β−(1+2κ), (12)

. where , then for any bounded infinite set , , on the paths of .

###### Proof.

In order to prove Theorem 5, we need to find conditions such that for some positive and finite ,

 ∫∞czd−1(ˆφν,κ,β,σ21(z)−ˆMν,α,σ20(z)ˆMν,α,σ20(z))2dz<∞. (13)

It is known that is bounded away from 0 and as for some (Zhang, 2004). Using (6) and Theorem 1 (Points 1 and 2), we have, as ,

 ∣∣∣ ˆφμ,κ,β,σ21(z)−ˆMν,α,σ20(z)ˆMν,α,σ20(z)∣∣∣ =∣∣∣σ21βdΓ(ν)LςΓ(ν+d/2)σ20α−2νπ−d2[cς3(βz)−2λ{1+O(z−2)} =∣∣∣σ21βdΓ(ν)LςΓ(ν+d/2)σ20α−2νπ−d2[cς3(βz)−2λ{1+O(z−2)} =∣∣∣w1z−2λ{1+O(z−2)}z2ν+d[1+(ν+d/2)(αz)−2+O(z−2)] +w2z−(μ+λ)z2ν+d[1+(ν+d/2)(αz)−2+O(z−2)]{cos(βz−cς5)+O(z−1)}−1∣∣∣,

where , . Since , we have

 ∫∞czd−1∣∣∣ˆφμ,κ,β,σ21(z)−ˆMν,α,σ20(z)ˆMν,α,σ20(z)∣∣∣2dz =∫∞czd−1∣∣∣w1z−2λO(z2ν+d−2)+{w1z2ν−(1+2κ)−1}+w1z−2λ ×{O(z2ν+d−2)+O(z2ν+d−4)}+w2z−(μ+λ){O(z2ν+d−2)+z2ν+d}{cos(βz−cς5) +O(z−1)}∣∣∣2dz.

For assessing the last integral, the following is relevant:

• if and .

• if and .

• if and .

• if and .

• if and .

• if and .

• if and .

This allows us to conclude that, for a given , if , , and , then (13) holds and thus .

Condition is equivalent to

 Lςcς3σ21β−(1+2κ)=π−d/2Γ(ν+d/2)Γ(ν)−1σ20α−2ν,

and from the definition of and , the previous condition can be rewritten as . ∎

###### Theorem 6.

Let and be two zero mean Gaussian measures. If and

 σ20α−2ν