# On LSE in regression model for long-range dependent random fields on spheres.

Vo Anh^{a,b}, Andriy Olenko^{c} and Volodymyr Vaskovych^{c}

CONTACT Andriy Olenko. Email: a.olenko@latrobe.edu.au

^{a} School of Mathematical Sciences, Queensland University of Technology, Brisbane, Queensland, 4001, Australia

^{b} Faculty of Science, Engineering and Technology, Swinburne University of Technology, Victoria, 3122, Australia

^{c} Department of Mathematics and Statistics, La Trobe University, Melbourne, 3086, Australia

###### Abstract

We study the asymptotic behaviour of least squares estimators in regression models for long-range dependent random fields observed on spheres. The least squares estimator can be given as a weighted functional of long-range dependent random fields. It is known that in this scenario the limits can be non-Gaussian. We derive the limit distribution and the corresponding rate of convergence for the estimators. The results were obtained under rather general assumptions on the random fields. Simulation studies were conducted to support theoretical findings.

Keywords: Spatial regression, LSE, Long-range dependence, Non-central limit theorems, Hermite-type distribution.

AMS classification: 62H11, 62J05, 60G60.

## 1 Introduction

Studying the asymptotic behaviour of regression parameter estimators is an important topic in Spatial Statistics, see [1], [2], [3]. A potential area of application of the results in this paper is cosmology, and more specifically, models involving the cosmic microwave background (CMB). The recent data obtained by the Wilkinson Microwave Anisotropy Probe and Planck missions, combined with data that will be obtained during future missions, will make it possible to predict the behaviour of CMB at increasing distances that are far beyond the reach of any modern telescope. Another area of application is studying the porosity of mineral resources, see, for example, [4] about data and analysis of the porosity of hydrocarbon reservoirs. Other potential applications of the model discussed in this article might involve 3D rupture models, soil fertility and paleomagnetism.

One of the first works that developed statistical theory of random fields was the book by [5]. Direct probability techniques for models involving random fields were proposed and various estimators of regression parameters were investigated. The asymptotic behaviour of some of these estimators was studied in [6]. Both Gaussian and non-Gaussian limits were obtained for particular models. No results about rates of convergence were given.

In this paper we consider the regression model $\xi(x) = a_r h(x) + \psi(x)$, where $h(x)$ is a known deterministic function, $a_r$ is an unknown parameter and the error $\psi(x)$ is a random field. For results concerning such regression models see [7] and references therein. Since there are numerous situations in practice when random processes and fields are non-Gaussian, we consider a particular case when $\psi(x)$ is a non-linear function of a Gaussian random field. Such random fields are very common in non-Gaussian modeling since they can be analyzed using Wiener chaos expansions and in many cases offer a good data approximation, see [8, 9]. In this article, we consider the underlying Gaussian random fields to be long-range dependent.

Long-range dependence is a well-established phenomenon that can be observed in various fields such as hydrology, agriculture, image analysis, earth sciences, cosmology, just to name a few. For this reason, models involving long-range dependent random fields have been an object of statistical interest for years, see [10, 11, 12, 13].

Various earth science and cosmology applications require studying random fields defined on surfaces, for example, see [14, 15]. In this paper we consider random fields that are defined on expanding spheres. One can find detailed information about such spherical random fields in [16, 5].

The regression model studied in this paper was first considered in [5]. For this model the best linear unbiased estimator (BLUE) of the unknown parameter was derived. In [17] the least squares estimator (LSE) of the unknown parameter was obtained. Its mean-square efficiency was compared to that of BLUE and it was shown that, apart from some degenerate cases, LSE is less efficient than BLUE. However, for most functions BLUE is much more difficult to compute than LSE. For this reason LSE is preferable to BLUE in practice. No results concerning limit distributions of the regression parameter estimators were obtained in [17, 5] or other literature known to the authors. This article investigates the limit distribution of LSE and its rate of convergence.

For the particular case , some approaches to study non-linear functionals of random fields on surfaces and rates of convergence were proposed in [18]. This article shows how the methodology developed in [18] can be extended to the general case of weighted non-linear functionals of random fields, i.e. the case of arbitrary functions . To reduce repetitions to a minimum, in this article we present only those parts of proofs that are new or require modifications of results in [18].

The main goal of this article is to demonstrate potential applications of the developed methodology to statistical problems. We focus on the LSE for a regression model; however, there are numerous other statistical problems that can be studied using weighted non-linear functionals of random fields. For example, by choosing the appropriate integrand functions, various characteristics of random fields, such as moments, can be represented by the considered functionals. In [18] it was shown that Minkowski functionals can be obtained by choosing indicator functions as the integrands. Another important statistical application is assigning different weights to different observation regions. In spatial statistics this is often used to control edge effects. Also, assigning weights to observations or observation regions is an important case of data tapering, which is often used to reduce the effects of less reliable observations.

Section 5 presents detailed simulation studies that back up the theory and suggest new research questions. To the best of our knowledge, simulation studies have not been done in the available literature on convergence rates in non-central limit theorems for long-range dependent random fields. The numerical study suggests that the exponential rate of convergence obtained in all known theoretical studies might be improved. For the cases of a sphere and a cube, various observation windows and weight functions were investigated. The detailed methodology for practical simulations and studying weighted functionals of random fields observed on surfaces is provided. The corresponding R codes for one-core and parallel computing are freely available on-line and can be used by researchers in this field.

To obtain the main results some fine properties of the Fourier transforms on surfaces are employed. Note that the Fourier transforms were more frequently studied on solid bodies than on surfaces. The assumptions for obtaining the required rate of decay are weaker in the case of solid bodies. Therefore, all results derived in this article can be analogously obtained for solid bodies.

As mentioned earlier, to study the asymptotic behaviour of LSE we investigate weighted non-linear functionals of Gaussian random fields. Non-linear functionals of random processes were considered in [6, 19]. In [6] only the case of short-range dependent random processes and Gaussian asymptotics for weighted functionals was studied. In [19] non-Gaussian asymptotics were obtained for long-range dependent random processes. The results in [19] can be obtained from the methodology of this article, as random processes can be considered as random fields indexed by a one-dimensional Euclidean space. For random fields and the simplest particular case it was shown that the functionals can produce non-Gaussian limits, see [20, 21, 22]. A more detailed overview of this particular case was given in [23]. In the general case, weighted linear functionals of Gaussian random fields were studied in [24]. The assumptions on weight functions used in this article are weaker than those used in [24]. Moreover, we do not impose any conditions on Fourier transforms of weight functions. The known Gaussian asymptotics are a particular case when integrand functions are linear. None of the above studies provided the rate of convergence to the obtained limits. To the best of our knowledge, this paper is the first work that provides the rate of convergence for weighted functionals and LSE of long-range dependent random fields on hypersurfaces.

The article is organized as follows. In Section 2 we recall some basic definitions of the spectral theory of random fields. Section 3 presents the model and states the results. Proofs of the results are provided in Section 4. In Section 5 some simulation studies are presented to confirm theoretical findings. Conclusions and directions for future research are stated in Section 6.

## 2 Definitions

In this section we provide the main definitions that are used in this article.

In what follows and denote the Lebesgue measure and the Euclidean distance in , , respectively. Let be a -dimensional ball with centre and radius , and let be a -dimensional sphere in with the centre at the origin and radius . We use the symbols and to denote constants which are not important for our exposition. Moreover, the same symbol may be used for different constants appearing in the same proof.

Let be a complete probability space and let be a set.

###### Definition 1.

[12] A random field is a function such that is a random variable for any . It will also be denoted as

###### Definition 2.

[12] A random field satisfying is called homogeneous in the wide sense if its mathematical expectation and covariance function are invariant with respect to the group of shifts in , that is, , for any .

It means that , and the covariance depends only on the difference .

###### Definition 3.

[12] A random field satisfying is called isotropic in the wide sense if its mathematical expectation and covariance function are invariant with respect to rotations.

In this article homogeneity and isotropy in the wide sense will be simply called homogeneity and isotropy.

Let us consider a measurable mean-square continuous zero-mean homogeneous isotropic real-valued random field defined on a probability space with the covariance function

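$$\mathrm{B}(r) := \mathrm{Cov}(\eta(x),\eta(y)) = \int_0^{\infty} Y_d(rz)\,d\Phi(z),\quad x,y\in\mathbb{R}^d,$$

where $r := \|x-y\|$, $\Phi(\cdot)$ is the isotropic spectral measure, the function $Y_d(\cdot)$ is defined by

$$Y_d(z) := 2^{(d-2)/2}\,\Gamma\!\left(\frac{d}{2}\right) J_{(d-2)/2}(z)\, z^{(2-d)/2},\quad z\ge 0,$$

and $J_{(d-2)/2}(\cdot)$ is the Bessel function of the first kind of order $(d-2)/2$.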
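
As a quick illustration of the definition of $Y_d(\cdot)$, the two lowest dimensions can be worked out by hand. The computation below is a sketch that is not part of the original derivation; it uses only the standard identities $J_{1/2}(z)=\sqrt{2/(\pi z)}\,\sin z$ and $\Gamma(3/2)=\sqrt{\pi}/2$.

$$Y_2(z) = 2^{0}\,\Gamma(1)\,J_0(z)\,z^{0} = J_0(z),\qquad Y_3(z) = 2^{1/2}\,\frac{\sqrt{\pi}}{2}\,\sqrt{\frac{2}{\pi z}}\,\sin z\; z^{-1/2} = \frac{\sin z}{z}.$$

In particular, for $d=3$ the covariance function is a mixture of the kernels $\sin(rz)/(rz)$ with respect to the spectral measure, and $Y_d(0)=1$ in both cases.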

###### Definition 4.

The random field $\eta(x)$ defined above is said to possess an absolutely continuous spectrum if there exists a function $f(\cdot)$ such that

$$\Phi(z) = 2\pi^{d/2}\,\Gamma^{-1}(d/2)\int_0^z u^{d-1}f(u)\,du,\quad z\ge 0,\quad u^{d-1}f(u)\in L_1(\mathbb{R}_+).$$

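The function $f(\cdot)$ is called the isotropic spectral density function of the field $\eta(x)$. The field $\eta(x)$ with an absolutely continuous spectrum has the following representation

$$\eta(x) = \int_{\mathbb{R}^d} e^{i\langle\lambda,x\rangle}\sqrt{f(\|\lambda\|)}\,W(d\lambda),$$

where $W(\cdot)$ is the complex Gaussian white noise random measure on $\mathbb{R}^d$.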

Let , , , be the Hermite polynomials, see [25]. These polynomials form a complete orthogonal system in the Hilbert space

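$$L_2(\mathbb{R},\phi(w)\,dw) = \left\{G : \int_{\mathbb{R}} G^2(w)\,\phi(w)\,dw < \infty\right\},\quad \phi(w) := \frac{1}{\sqrt{2\pi}}\,e^{-\frac{w^2}{2}}.$$

An arbitrary function $G(\cdot)\in L_2(\mathbb{R},\phi(w)\,dw)$ admits the mean-square convergent expansion

$$G(w) = \sum_{j=0}^{\infty}\frac{C_j H_j(w)}{j!},\quad C_j := \int_{\mathbb{R}} G(w)\,H_j(w)\,\phi(w)\,dw.$$

By Parseval's identity

$$\sum_{j=0}^{\infty}\frac{C_j^2}{j!} = \int_{\mathbb{R}} G^2(w)\,\phi(w)\,dw.$$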

It will be shown that studying asymptotics of non-linear functionals defined by a function can be done by investigating leading terms in their expansions. The following definitions and remarks provide basic notations and tools to formulate and prove these asymptotic results.

###### Definition 5.

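[21] Let $G(\cdot)\in L_2(\mathbb{R},\phi(w)\,dw)$ and assume there exists an integer $\kappa\in\mathbb{N}$ such that $C_j=0$, for all $0\le j\le\kappa-1$, but $C_\kappa\neq 0$. Then $\kappa$ is called the Hermite rank of $G(\cdot)$ and is denoted by $H\mathrm{rank}\,G$.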
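
The Hermite coefficients, Parseval's identity and the Hermite rank can be checked numerically for a concrete function. The sketch below is only an illustration: the test function $G(w)=w^3$, the helper names and the trapezoidal quadrature on a truncated interval are assumptions made here, not part of the paper. The rank is computed as the index of the first coefficient $C_j$ that does not vanish.

```python
import math


def hermite(j, w):
    """Probabilists' Hermite polynomial H_j(w), via H_{n+1}(w) = w*H_n(w) - n*H_{n-1}(w)."""
    if j == 0:
        return 1.0
    h_prev, h_curr = 1.0, w
    for n in range(1, j):
        h_prev, h_curr = h_curr, w * h_curr - n * h_prev
    return h_curr


def gaussian_expectation(func, half_width=12.0, step=0.05):
    """Trapezoidal approximation of the integral of func(w) * phi(w) over the real line."""
    n = int(round(2.0 * half_width / step))
    total = 0.0
    for i in range(n + 1):
        w = -half_width + i * step
        edge = 0.5 if i in (0, n) else 1.0
        total += edge * func(w) * math.exp(-0.5 * w * w)
    return total * step / math.sqrt(2.0 * math.pi)


def hermite_coefficients(G, max_order):
    """Coefficients C_j = E[G(w) H_j(w)] for j = 0, ..., max_order."""
    return [gaussian_expectation(lambda w, j=j: G(w) * hermite(j, w))
            for j in range(max_order + 1)]


def hermite_rank(coeffs, tol=1e-8):
    """Index of the first coefficient that is non-zero up to tol, or None."""
    for j, c in enumerate(coeffs):
        if abs(c) > tol:
            return j
    return None


# Hypothetical test function: G(w) = w^3 = H_3(w) + 3*H_1(w).
G = lambda w: w ** 3
coeffs = hermite_coefficients(G, 5)
parseval_lhs = sum(c * c / math.factorial(j) for j, c in enumerate(coeffs))
parseval_rhs = gaussian_expectation(lambda w: G(w) ** 2)
rank = hermite_rank(coeffs)
```

For this choice only $C_1$ and $C_3$ are non-zero, so the expansion has two terms and the first of them determines the rank.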

The following remark gives a well known property of Hermite polynomials of Gaussian vectors, see [12].

###### Remark 1.

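If $(\xi_1,\ldots,\xi_{2p})$ is a $2p$-dimensional zero-mean Gaussian vector with

$$\mathbf{E}\,\xi_j\xi_k = \begin{cases} 1, & \text{if } k=j,\\ r_j, & \text{if } k=j+p \text{ and } 1\le j\le p,\\ 0, & \text{otherwise}, \end{cases}$$

then

$$\mathbf{E}\prod_{j=1}^{p} H_{k_j}(\xi_j)\,H_{m_j}(\xi_{j+p}) = \prod_{j=1}^{p}\delta_{k_j}^{m_j}\,k_j!\,r_j^{k_j}, \tag{1}$$

where $\delta_{k_j}^{m_j}$ is the Kronecker delta function.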
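
Identity (1) can be verified numerically in the simplest case $p=1$, where it states that the expectation of $H_k(\xi_1)H_m(\xi_2)$ equals $k!\,r^k$ if $k=m$ and $0$ otherwise. The following sketch is an illustration under assumptions made here: the correlated pair is built as $\xi_2 = r\xi_1 + \sqrt{1-r^2}\,Z$ with an independent standard normal $Z$, and the expectation is approximated by a Riemann sum on a uniform grid (the function names are hypothetical).

```python
import math


def hermite(j, w):
    """Probabilists' Hermite polynomial H_j(w), via H_{n+1}(w) = w*H_n(w) - n*H_{n-1}(w)."""
    if j == 0:
        return 1.0
    h_prev, h_curr = 1.0, w
    for n in range(1, j):
        h_prev, h_curr = h_curr, w * h_curr - n * h_prev
    return h_curr


def hermite_cross_moment(k, m, r, half_width=10.0, step=0.25):
    """Approximate E[H_k(xi_1) H_m(xi_2)] for standard normal xi_1, xi_2 with correlation r."""
    s = math.sqrt(1.0 - r * r)
    n = int(round(2.0 * half_width / step))
    grid = [-half_width + i * step for i in range(n + 1)]
    dens = [math.exp(-0.5 * g * g) / math.sqrt(2.0 * math.pi) for g in grid]
    total = 0.0
    for x, px in zip(grid, dens):
        hk = hermite(k, x)
        for z, pz in zip(grid, dens):
            # xi_1 = x and xi_2 = r*x + s*z have unit variances and correlation r.
            total += hk * hermite(m, r * x + s * z) * px * pz
    return total * step * step


def right_hand_side(k, m, r):
    """Right-hand side of (1) for p = 1."""
    return math.factorial(k) * r ** k if k == m else 0.0


same_order = hermite_cross_moment(2, 2, 0.5)
different_order = hermite_cross_moment(2, 3, 0.5)
```

The two-dimensional sum has only a few thousand terms, so the check runs quickly even in pure Python.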

Let be a bounded set in with a boundary . Let , be the homothetic image of the set with the centre of homothety at the origin and the coefficient , that is . Let be the -dimensional Lebesgue measure on . Let and be two independent and uniformly distributed random vectors on the hypersurface . We denote by the probability density function of the distance between and Note that if Using the above notations, we obtain the representation

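$$\begin{aligned}
\int_{\partial\Delta(r)}\int_{\partial\Delta(r)} G(\|x-y\|)\,\sigma(dx)\,\sigma(dy) &= |\partial\Delta|^2\,r^{2d-2}\,\mathbf{E}\,G(\|U-V\|)\\
&= |\partial\Delta|^2\,r^{2d-2}\int_0^{\mathrm{diam}\{\partial\Delta(r)\}} G(\rho)\,\psi_{\Delta(r)}(\rho)\,d\rho.
\end{aligned} \tag{2}$$

For various hypersurfaces explicit expressions for $\psi_{\Delta(r)}(\rho)$ are presented in [12]. In the case of spheres it takes the following form.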

###### Remark 2.

If , then

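$$\psi_{\Delta(r)}(\rho) = \frac{1}{\sqrt{\pi}}\,\Gamma\!\left(\frac{d}{2}\right)\Gamma^{-1}\!\left(\frac{d-1}{2}\right) r^{1-d}\rho^{d-2}\left(1-\frac{\rho^2}{4r^2}\right)^{\frac{d-3}{2}},\quad 0<\rho<2r.$$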
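
The density in Remark 2 can be sanity-checked numerically. The sketch below is an illustration with hypothetical helper names: it integrates the density over $(0,2r)$ with a midpoint rule and compares its mean for $d=3$ with a Monte Carlo estimate of the distance between two independent uniform points on the sphere. Uniform points are generated by normalising standard Gaussian vectors, which is a standard technique assumed here rather than taken from the paper.

```python
import math
import random


def chord_density(rho, r, d):
    """Density of the distance between two independent uniform points on S(r) in R^d."""
    if not 0.0 < rho < 2.0 * r:
        return 0.0
    const = math.gamma(d / 2.0) / (math.sqrt(math.pi) * math.gamma((d - 1) / 2.0))
    return (const * r ** (1 - d) * rho ** (d - 2)
            * (1.0 - rho ** 2 / (4.0 * r ** 2)) ** ((d - 3) / 2.0))


def integrate_midpoint(func, a, b, n=20000):
    """Midpoint-rule approximation of the integral of func over (a, b)."""
    h = (b - a) / n
    return h * sum(func(a + (i + 0.5) * h) for i in range(n))


def uniform_point_on_sphere(r, d, rng):
    """Uniform point on S(r): a normalised standard Gaussian vector scaled by r."""
    v = [rng.gauss(0.0, 1.0) for _ in range(d)]
    norm = math.sqrt(sum(c * c for c in v))
    return [r * c / norm for c in v]


def mean_distance_monte_carlo(r, d, n_pairs, seed=0):
    """Monte Carlo estimate of E||U - V|| for independent uniform U, V on S(r)."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_pairs):
        u = uniform_point_on_sphere(r, d, rng)
        v = uniform_point_on_sphere(r, d, rng)
        total += math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))
    return total / n_pairs


total_mass = {d: integrate_midpoint(lambda p: chord_density(p, 1.0, d), 0.0, 2.0)
              for d in (3, 4, 5)}
mean_from_density = integrate_midpoint(lambda p: p * chord_density(p, 1.0, 3), 0.0, 2.0)
mean_simulated = mean_distance_monte_carlo(1.0, 3, 20000)
```

For $d=3$ the density reduces to $\rho/(2r^2)$ on $(0,2r)$, so its mean is $4r/3$, which the simulated value should approximate.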
###### Definition 6.

[26] A measurable function $L(\cdot)$ is said to be slowly varying at infinity if for all $t>0$

$$\lim_{\lambda\to\infty}\frac{L(\lambda t)}{L(\lambda)} = 1.$$

If $L(\cdot)$ varies slowly, then $r^{a}L(r)\to\infty$ and $r^{-a}L(r)\to 0$ for an arbitrary $a>0$ when $r\to\infty$, see Proposition 1.3.6 in [26].

###### Definition 7.

[26] A measurable function is said to be regularly varying at infinity, denoted , if there exists such that, for all it holds that

$$\lim_{\lambda\to\infty}\frac{g(\lambda t)}{g(\lambda)} = t^{\tau}.$$
###### Definition 8.

[26] Let be a measurable function and as . A slowly varying function is said to be slowly varying with remainder of type 2, or that it belongs to the class SR2, if

$$\forall\,\lambda>1:\quad \frac{L(\lambda x)}{L(x)} - 1 \sim k(\lambda)\,g(x),\quad x\to\infty,$$

for some function .

If there exists such that and for all , then for some and , where

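$$h_\tau(\lambda) = \begin{cases} \ln(\lambda), & \text{if } \tau=0,\\[4pt] \dfrac{\lambda^{\tau}-1}{\tau}, & \text{if } \tau\neq 0. \end{cases} \tag{3}$$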
###### Remark 3.

An example of a function that satisfies Definition 8 for $\tau=0$ is $L(x)=\ln(x)$. Indeed,

$$\frac{L(\lambda x)}{L(x)} - 1 = \frac{\ln(\lambda)+\ln(x)}{\ln(x)} - 1 = \ln(\lambda)\cdot\frac{1}{\ln(x)}.$$
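
The computation in Remark 3 and the function from (3) can be illustrated with a few lines of code. This is a minimal sketch with hypothetical function names; it checks that the relative increment of the logarithm factorises exactly as $\ln(\lambda)\cdot 1/\ln(x)$ and that the two branches of (3) agree in the limit as the index tends to zero.

```python
import math


def h_tau(tau, lam):
    """The function from (3): ln(lam) if tau == 0, otherwise (lam**tau - 1) / tau."""
    if tau == 0:
        return math.log(lam)
    return (lam ** tau - 1.0) / tau


def relative_increment(L, lam, x):
    """The quantity L(lam * x) / L(x) - 1 appearing in Definition 8."""
    return L(lam * x) / L(x) - 1.0


lam, x = 3.0, 1.0e6
increment = relative_increment(math.log, lam, x)
factorised = h_tau(0, lam) * (1.0 / math.log(x))
near_zero_branch = h_tau(1.0e-6, lam)
```

The increment decays only logarithmically in $x$, which is much slower than any power of $x$.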
###### Definition 9.

Let $Y_1$ and $Y_2$ be arbitrary random variables. The uniform (Kolmogorov) metric for the distributions of $Y_1$ and $Y_2$ is defined by

$$\rho(Y_1,Y_2) = \sup_{z\in\mathbb{R}}\left|P(Y_1\le z) - P(Y_2\le z)\right|.$$
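
In simulations the uniform metric is usually approximated by replacing both distribution functions with empirical ones. The sketch below shows one possible way to do this; the function name and the sample-based setting are assumptions made for illustration and are not taken from the paper. Since both empirical distribution functions are right-continuous step functions, the supremum is attained at one of the sample points.

```python
from bisect import bisect_right


def kolmogorov_distance(sample1, sample2):
    """Sup-distance between the empirical distribution functions of two samples."""
    s1, s2 = sorted(sample1), sorted(sample2)
    best = 0.0
    for z in s1 + s2:
        # bisect_right counts the observations that are <= z.
        f1 = bisect_right(s1, z) / len(s1)
        f2 = bisect_right(s2, z) / len(s2)
        best = max(best, abs(f1 - f2))
    return best


identical = kolmogorov_distance([1, 2, 3], [1, 2, 3])
separated = kolmogorov_distance([0, 1], [2, 3])
shifted = kolmogorov_distance([1, 2, 3, 4], [3, 4, 5, 6])
```

The value always lies in $[0,1]$, with $0$ for identical samples and $1$ for samples whose ranges do not overlap.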

The next result follows from Lemma 1.8 in [27].

###### Lemma 1.

If $X$, $Y$ and $Z$ are arbitrary random variables, then for any $\varepsilon>0$

$$\rho(X+Y,Z) \le \rho(X,Z) + \rho(Z+\varepsilon,Z) + P(|Y|\ge\varepsilon).$$

## 3 Model and results

Let us consider the random field

$$\xi(x) = a_r h(x) + \psi(x),\quad x\in S(r),\ r>0,$$

where $h(x)$ is a known deterministic function, $a_r$ is an unknown parameter and $\psi(x)$ is a homogeneous isotropic mean-square continuous random field with mean 0.
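
For a one-parameter linear model of this type, the least squares estimate computed from finitely many observation points is the ratio of the sum of products of observations and regressor values to the sum of squared regressor values. The sketch below illustrates this on simulated data. It rests on several simplifying assumptions that are not taken from the paper: the points are sampled uniformly on the unit sphere in three dimensions, the regressor `h_sp` is a hypothetical smooth bounded function, and the errors are independent Gaussian noise, so the long-range dependent error field studied in this article is not reproduced.

```python
import math
import random


def uniform_point_on_unit_sphere(rng, d=3):
    """Uniform point on the unit sphere: a normalised standard Gaussian vector."""
    v = [rng.gauss(0.0, 1.0) for _ in range(d)]
    norm = math.sqrt(sum(c * c for c in v))
    return [c / norm for c in v]


def h_sp(u):
    """Hypothetical smooth bounded regressor on the unit sphere (an assumption)."""
    return 2.0 + u[2]


def least_squares_estimate(observations, regressor_values):
    """Discrete LSE of a in xi_i = a * h_i + error_i: sum(xi_i * h_i) / sum(h_i ** 2)."""
    numerator = sum(x * h for x, h in zip(observations, regressor_values))
    denominator = sum(h * h for h in regressor_values)
    return numerator / denominator


def simulate_estimate(a, n_points, noise_sd, seed):
    """Simulate the model with i.i.d. Gaussian errors (a simplification) and estimate a."""
    rng = random.Random(seed)
    points = [uniform_point_on_unit_sphere(rng) for _ in range(n_points)]
    h_values = [h_sp(u) for u in points]
    xi_values = [a * h + rng.gauss(0.0, noise_sd) for h in h_values]
    return least_squares_estimate(xi_values, h_values)


noise_free = simulate_estimate(1.5, 500, 0.0, seed=1)
noisy = simulate_estimate(1.5, 5000, 1.0, seed=1)
```

Without noise the estimate recovers the parameter exactly, and with independent noise it concentrates around the true value as the number of points grows. Under long-range dependence the fluctuations of the estimator behave differently, which is the subject of the results in this section.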

Suppose that where is a radial function such that for all and is a smooth bounded function defined on the unit sphere .

In this case, the least squares estimator (LSE) of the coefficient has the explicit form, see [5],

Let where and is a random field that satisfies the following assumptions.

###### Assumption 1.

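Let $\eta(x)$, $x\in\mathbb{R}^d$, be a homogeneous isotropic Gaussian random field with $\mathbf{E}\,\eta(x)=0$ and a covariance function $B(x)$ such that

$$B(0)=1,\quad B(x) = \mathbf{E}\,\eta(0)\eta(x) = \|x\|^{-\alpha}L_0(\|x\|),$$

where $L_0(\|\cdot\|)$ is a function slowly varying at infinity.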

This assumption is a classical way to introduce hyperbolically decaying dependencies between observations, see [20, 11, 22, 12] and references therein. Random fields satisfying this assumption for are weakly-dependent. If then the long-range dependence case is considered.

###### Assumption 2.

The random field has the spectral density

$$f(\|\lambda\|) = c_2(d,\alpha)\,\|\lambda\|^{\alpha-d}\,L\!\left(\frac{1}{\|\lambda\|}\right),$$

where and is a locally bounded function slowly varying at infinity which satisfies for sufficiently large the condition

$$\left|1-\frac{L(tr)}{L(r)}\right| \le C\,g(r)\,h_\tau(t),\quad t\ge 1, \tag{4}$$

where , such that , and is defined by (3).

Long-range dependence is usually introduced by requiring the hyperbolic decay of covariance functions or a power-type singularity of the corresponding spectral densities. For many real data sets these two definitions are operationally equivalent, see also Tauberian-Abelian theorems in [28]. However, there are cases when Assumption 2 does not follow from Assumption 1, see [28]. To make all the following results rigorous, we require that both assumptions hold.

Examples of popular classes of random fields that satisfy Assumptions 1 and 2 simultaneously are Bessel, Cauchy, and Linnik random fields.
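
To see what the covariance condition looks like for one of these classes, consider a Cauchy-type covariance. The sketch below assumes the common parameterisation in which the covariance at distance $t$ equals $(1+t^2)^{-\alpha/2}$; the text does not state which parameterisation is meant, so this form and the function names are assumptions. Under it, the covariance factorises as $t^{-\alpha}$ times $(t^2/(1+t^2))^{\alpha/2}$, and the second factor tends to $1$ and is therefore slowly varying.

```python
def cauchy_covariance(t, alpha):
    """Assumed Cauchy-type covariance B(t) = (1 + t^2)^(-alpha / 2) at distance t."""
    return (1.0 + t * t) ** (-alpha / 2.0)


def slowly_varying_factor(t, alpha):
    """The factor L_0(t) = t^alpha * B(t) = (t^2 / (1 + t^2))^(alpha / 2)."""
    return (t * t / (1.0 + t * t)) ** (alpha / 2.0)


alpha = 0.6
variance = cauchy_covariance(0.0, alpha)
ratio_at_large_scale = slowly_varying_factor(2.0e4, alpha) / slowly_varying_factor(1.0e4, alpha)
decomposition_error = abs(
    cauchy_covariance(50.0, alpha) - 50.0 ** (-alpha) * slowly_varying_factor(50.0, alpha)
)
```

The ratio of the slowly varying factor at two large scales is numerically indistinguishable from $1$, as Definition 6 requires in the limit.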

###### Remark 4.

By Tauberian and Abelian theorems, see [28], for and given in Assumptions 1 and 2 it holds

In [23] the following property of slowly varying functions satisfying condition (4) was proven.

###### Remark 5.

If satisfies (4), then for any , , and sufficiently large

$$\left|1-\frac{L^{k/2}(tr)}{L^{k/2}(r)}\right| \le C\,g(r)\,h_\tau(t)\,t^{\delta},\quad t\ge 1.$$

Since the LSE has the form

Let . Our main object of interest is the random variable

$$X_{r,G} := c_h(r)\,c_r(d,\alpha)\,(\hat{a}_r - a_r),$$

where

It is straightforward to see that

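$$\begin{aligned}
X_{r,G} &= c_r(d,\alpha)\int_{S(r)} G(\eta(x))\,h_{sp}\!\left(\frac{x}{\|x\|}\right)\sigma(dx)\\
&= c_r(d,\alpha)\,\frac{C_\kappa}{\kappa!}\int_{S(r)} H_\kappa(\eta(x))\,h_{sp}\!\left(\frac{x}{\|x\|}\right)\sigma(dx)\\
&\quad + c_r(d,\alpha)\sum_{j\ge\kappa+1}\frac{C_j}{j!}\int_{S(r)} H_j(\eta(x))\,h_{sp}\!\left(\frac{x}{\|x\|}\right)\sigma(dx) =: X_{r,\kappa} + V_r.
\end{aligned} \tag{5}$$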
###### Theorem 1.

Suppose that and satisfies Assumption 1 for . If at least one of the following random variables

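$$\frac{X_{r,G}}{\sqrt{\mathrm{Var}\,X_{r,G}}},\qquad \frac{X_{r,G}}{\sqrt{\mathrm{Var}\,X_{r,\kappa}}}\qquad\text{and}\qquad \frac{X_{r,\kappa}}{\sqrt{\mathrm{Var}\,X_{r,\kappa}}},$$

has a limit distribution, then the limit distributions of the other random variables also exist and they coincide when $r\to\infty$.

By Theorem 1, to study limit distributions one can use $X_{r,\kappa}$ instead of $X_{r,G}$.

In Theorem 2 below, we show that the limit distribution of $X_{r,\kappa}$ is a Hermite-type random variable, which can be represented by a Wiener-Itô integral. Let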

$$K(x) := \int_{S(1)} e^{i\langle x,u\rangle}\,h_{sp}(u)\,\sigma(du),\quad x\in\mathbb{R}^d,$$

be the Fourier transform of the function $h_{sp}(\cdot)$ over the $(d-1)$-dimensional sphere with radius 1. Using the decay properties of the Fourier transform one can prove the following.

###### Lemma 2.

If are such positive constants, that then

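$$\int_{\mathbb{R}^{d\kappa}}\frac{|K(\lambda_1+\cdots+\lambda_\kappa)|^2\,d\lambda_1\ldots d\lambda_\kappa}{\|\lambda_1\|^{d-\tau_1}\cdots\|\lambda_\kappa\|^{d-\tau_\kappa}} < \infty. \tag{6}$$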

Let

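$$X_\kappa := \int_{\mathbb{R}^{d\kappa}}^{\prime}\frac{K(\lambda_1+\cdots+\lambda_\kappa)\,W(d\lambda_1)\ldots W(d\lambda_\kappa)}{\|\lambda_1\|^{(d-\alpha)/2}\ldots\|\lambda_\kappa\|^{(d-\alpha)/2}},$$

where $\int_{\mathbb{R}^{d\kappa}}^{\prime}$ denotes the multiple Wiener-Itô integral.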

###### Theorem 2.

Let be a homogeneous isotropic Gaussian random field with If Assumptions 1 and 2 hold, and then for the random variables converge weakly to .

To obtain the rate of convergence of to we will use some fine properties of Hermite-type distributions. We will denote the Wiener-Itô integrals of rank by , where is an integrand. For more details about Wiener-Itô integrals and admissible functions one can refer to [29, 30]. The following result was obtained in [23] for specific form of the integrand. Since the proof does not rely on the form of the integrand, this theorem can be easily generalized as follows.

###### Theorem 3.

[23] For any and an arbitrary positive it holds

$$\rho\left(I_\kappa(f),\,I_\kappa(f)+\varepsilon\right) \le C\,\varepsilon^{b},$$

where if and if .

Also, we will use the following result.

###### Lemma 3.

[18] Let and be symmetric functions in . Then,

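$$\rho\left(I_\kappa(f_1),\,I_\kappa(f_2)\right) \le C\,\|f_1-f_2\|^{\frac{1}{\kappa+1/2}},\quad\text{if } \kappa\ge 3,$$

and

$$\rho\left(I_\kappa(f_1),\,I_\kappa(f_2)\right) \le C\,\|f_1-f_2\|^{\frac{2}{3}},\quad\text{if } \kappa<3.$$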

Now we are ready to formulate the main result.

###### Theorem 4.

Let and Assumptions 1 and 2 hold for .

If then for any

$$\rho\left(X_{r,G},\,X_\kappa\right) = o(r^{-\varkappa}),\quad r\to\infty,$$

where and is the parameter from Theorem 3.

If then

$$\rho\left(X_{r,G},\,X_\kappa\right) = g^{\frac{2b}{2+b}}(r),\quad r\to\infty.$$

## 4 Proofs of the results from Section 3

In this section the results stated in Section 3 are proved. In [18] Theorems 1, 2, 4 and Lemma 2 were proved in the particular case . Since the proofs for the general setting are very similar to the ones in [18], only the parts of the proofs that differ are provided.

###### Proof of Theorem 1.

By Remark 1 it holds .

By (1) and (2) we get

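$$\begin{aligned}
\mathrm{Var}\,V_r &= c_r^2(d,\alpha)\sum_{j\ge\kappa+1}\frac{C_j^2}{j!}\int_{S(r)}\int_{S(r)}\frac{L_0^{j}(\|x-y\|)}{\|x-y\|^{\alpha j}}\,h_{sp}\!\left(\frac{x}{\|x\|}\right)h_{sp}\!\left(\frac{y}{\|y\|}\right)\sigma(dx)\,\sigma(dy)\\
&\le \frac{(\kappa!)^2\,c_2^{-\kappa}(d,\alpha)}{C_\kappa^2\,r^{2d-2-\kappa\alpha}L^{\kappa}(r)}\max_{x\in S(1)}|h_{sp}(x)|^2\sum_{j\ge\kappa+1}\frac{C_j^2}{j!}\int_{S(r)}\int_{S(r)}\frac{|L_0(\|x-y\|)|^{j}}{\|x-y\|^{\alpha j}}\,\sigma(dx)\,\sigma(dy)\\
&= \frac{(\kappa!)^2\,|S(1)|^2\,r^{2d-2}}{c_2^{\kappa}(d,\alpha)\,C_\kappa^2\,r^{2d-2-\kappa\alpha}L^{\kappa}(r)}\max_{x\in S(1)}|h_{sp}(x)|^2\sum_{j\ge\kappa+1}\frac{C_j^2}{j!}\int_0^2\frac{|L_0(rz)|^{j}}{(rz)^{\alpha j}}\,\psi_{S(1)}(z)\,dz.
\end{aligned}$$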

It follows from that

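$$\begin{aligned}
\mathrm{Var}\,V_r &\le \frac{(\kappa!)^2\,|S(1)|^2\,r^{2d-2-(\kappa+1)\alpha}}{c_2^{\kappa}(d,\alpha)\,C_\kappa^2\,r^{2d-2-\kappa\alpha}L^{\kappa}(r)}\max_{x\in S(1)}|h_{sp}(x)|^2\sum_{j\ge\kappa+1}\frac{C_j^2}{j!}\int_0^2\frac{|L_0(rz)|^{\kappa+1}}{z^{\alpha(\kappa+1)}}\,\psi_{S(1)}(z)\,dz\\
&= \frac{(\kappa!)^2\,|S(1)|^2\,|L_0(r)|^{\kappa}}{c_2^{\kappa}(d,\alpha)\,C_\kappa^2\,L^{\kappa}(r)}\max_{x\in S(1)}|h_{sp}(x)|^2\sum_{j\ge\kappa+1}\frac{C_j^2}{j!}\int_0^2 z^{-\alpha\kappa}\,\frac{|L_0(rz)|^{\kappa}}{|L_0(r)|^{\kappa}}\,\frac{|L_0(rz)|}{(rz)^{\alpha}}\,\psi_{S(1)}(z)\,dz.
\end{aligned}$$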

Considering as a new slowly varying function, one can estimate the integral above using bounds (9) and (10) in [18] to obtain

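$$\mathrm{Var}\,V_r \le C\,\frac{|L_0(r)|^{\kappa}}{L^{\kappa}(r)}\left(r^{-\beta_1(d-1-\kappa\alpha-\delta)} + o\!\left(r^{-(\alpha-\delta)(1-\beta_1)}\right)\right), \tag{7}$$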

where is an arbitrary number in .

Similar to we obtain

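$$\mathrm{Var}\,X_{r,\kappa} = \frac{1}{c_2^{\kappa}(d,\alpha)\,L^{\kappa}(r)}\int_{S(1)}\int_{S(1)}\frac{L_0^{\kappa}(r\|x-y\|)}{\|x-y\|^{\alpha\kappa}}\,h_{sp}(x)\,h_{sp}(y)\,d\sigma(x)\,d\sigma(y).$$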

Let us consider a constant and let . The function can be considered as a probability density function of some random variable defined on . Therefore,

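$$\begin{aligned}
\mathrm{Var}\,X_{r,\kappa} &= \frac{c_I^2}{c_2^{\kappa}(d,\alpha)\,L^{\kappa}(r)}\int_{S(1)}\int_{S(1)}\frac{L_0^{\kappa}(r\|x-y\|)}{\|x-y\|^{\alpha\kappa}}\left(\frac{h_{sp}(x)+c_h}{c_I}\right)\left(\frac{h_{sp}(y)+c_h}{c_I}\right)d\sigma(x)\,d\sigma(y)\\
&\quad - \frac{2c_I\,|S(1)|}{c_2^{\kappa}(d,\alpha)\,L^{\kappa}(r)}\int_{S(1)}\int_{S(1)}\frac{L_0^{\kappa}(r\|x-y\|)}{\|x-y\|^{\alpha\kappa}}\left(\frac{h_{sp}(x)+c_h}{c_I}\right)\frac{c_h}{|S(1)|}\,d\sigma(x)\,d\sigma(y)\\
&\quad - \frac{|S(1)|^2}{c_2^{\kappa}(d,\alpha)\,L^{\kappa}(r)}\int_{S(1)}\int_{S(1)}\frac{L_0^{\kappa}(r\|x-y\|)}{\|x-y\|^{\alpha\kappa}}\left(\frac{c_h}{|S(1)|}\right)^2 d\sigma(x)\,d\sigma(y)\\
&= \frac{c_I^2}{c_2^{\kappa}(d,\alpha)\,L^{\kappa}(r)}\,\mathbf{E}\left[\frac{L_0^{\kappa}(r\|X_1-X_2\|)}{\|X_1-X_2\|^{\alpha\kappa}}\right] - \frac{2c_I\,|S(1)|\,c_h}{c_2^{\kappa}(d,\alpha)\,L^{\kappa}(r)}\,\mathbf{E}\left[\frac{L_0^{\kappa}(r\|X_1-Y_1\|)}{\|X_1-Y_1\|^{\alpha\kappa}}\right]\\
&\quad - \frac{|S(1)|^2\,c_h^2}{c_2^{\kappa}(d,\alpha)\,L^{\kappa}(r)}\,\mathbf{E}\left[\frac{L_0^{\kappa}(r\|Y_1-Y_2\|)}{\|Y_1-Y_2\|^{\alpha\kappa}}\right],
\end{aligned}$$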

where are random variables distributed on with the probability density and are uniformly distributed on random variables.

Let , , and denote their corresponding probability densities as , and . Thus

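$$\begin{aligned}
\mathrm{Var}\,X_{r,\kappa} &= \frac{c_I^2}{c_2^{\kappa}(d,\alpha)\,L^{\kappa}(r)}\int_0^2 z^{-\alpha\kappa}L_0^{\kappa}(rz)\,\bar{\psi}_{S(1)}(z)\,dz - \frac{2c_I\,|S(1)|\,c_h}{c_2^{\kappa}(d,\alpha)\,L^{\kappa}(r)}\int_0^2 z^{-\alpha\kappa}L_0^{\kappa}(rz)\,\tilde{\psi}_{S(1)}(z)\,dz\\
&\quad - \frac{|S(1)|^2\,c_h^2}{c_2^{\kappa}(d,\alpha)\,L^{\kappa}(r)}\int_0^2 z^{-\alpha\kappa}L_0^{\kappa}(rz)\,\psi_{S(1)}(z)\,dz.
\end{aligned}$$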

If then by asymptotic properties of integrals of slowly varying functions (see Theorem 2.7 in [31]) we get

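$$\mathrm{Var}\,X_{r,\kappa} = \frac{\bar{c}_1(\kappa,\alpha)+\tilde{c}_1(\kappa,\alpha)+c_1(\kappa,\alpha)}{c_2^{\kappa}(d,\alpha)}\,\frac{L_0^{\kappa}(r)}{L^{\kappa}(r)}\,(1+o(1)),\quad r\to\infty, \tag{8}$$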

where , , and .

By (2) it holds

 ¯c1(κ,α)+~c1(κ,α)+c1(κ,α)=∫S(1)∫S(1)(hsp(x)+ch)(hsp(y)+ch)∥x−y∥ακdσ(x)dσ(y)
 ≤maxx|hsp(x)|2|S(1)|22∫0ψS(1)(