# On Goodness-of-fit Testing for Ergodic Diffusion Process with Shift Parameter

*This work has been partially supported by MIUR grant 2009.*

## Abstract

A problem of goodness-of-fit testing for ergodic diffusion processes is considered. Under the null hypothesis the drift of the diffusion has a known parametric form with an unknown shift parameter. Two Cramér-von Mises type test statistics are studied: the first is based on the local time estimator of the invariant density, the second on the empirical distribution function. The unknown parameter is estimated via the maximum likelihood estimator. It is shown that the limit distributions of both test statistics do not depend on the unknown parameter, so the tests are asymptotically parameter free. Some considerations on the consistency of the proposed tests and some simulation studies are also given.

Keywords: Ergodic diffusion process, goodness-of-fit test, Cramér-von Mises type test.

## 1 Introduction

We consider the problem of goodness-of-fit testing for an ergodic diffusion process when, under the null hypothesis, the process belongs to a given parametric family. We study Cramér-von Mises type statistics in two different cases: the first is based on the local time estimator of the invariant density, the second on the empirical distribution function. We show that in both cases the statistics converge to limits that do not depend on the unknown parameter, so the tests are asymptotically parameter free (APF).

Let us recall the analogous statement of the problem in the well-known case of observations of independent identically distributed random variables $X_1,\dots,X_n$. Suppose that the distribution of $X_j$ under the hypothesis $\mathscr{H}_0$ is $F(x-\vartheta)$, where $\vartheta$ is some unknown shift parameter. Then the Cramér-von Mises type test is

$$\hat\psi_n(X^n)=\mathbb{1}_{\{\omega_n^2>e_\varepsilon\}},\qquad \omega_n^2=n\int_{-\infty}^{\infty}\bigl[\hat F_n(x)-F(x-\hat\vartheta_n)\bigr]^2\,dF(x-\hat\vartheta_n),$$

where the statistic $\omega_n^2$ under the hypothesis $\mathscr{H}_0$ converges in distribution to a random variable $\omega^2$ which does not depend on $\vartheta$. Therefore the threshold $e_\varepsilon$ can be calculated as the solution of the equation

$$\mathbf{P}\{\omega^2>e_\varepsilon\}=\varepsilon.$$

The details concerning this result can be found in Darling [3]. For more general problems see the works of Kac, Kiefer and Wolfowitz [8], Durbin [4], or Martynov [12], [13].
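For numerical work, the statistic $\omega_n^2$ above admits the classical order-statistics form $\omega_n^2=\frac{1}{12n}+\sum_{i=1}^n\bigl(u_{(i)}-\frac{2i-1}{2n}\bigr)^2$ with $u_{(i)}=F(x_{(i)}-\hat\vartheta_n)$. A minimal sketch (the function name and the plug-in of an externally computed shift estimate are our illustrative choices, not part of the cited references):

```python
import numpy as np

def cvm_shift_statistic(x, cdf, shift_hat):
    """Cramer-von Mises statistic with an estimated shift plugged in:
    omega2 = 1/(12 n) + sum_i (u_(i) - (2i-1)/(2n))^2,
    where u_(i) = cdf(x_(i) - shift_hat)."""
    n = len(x)
    u = np.sort(cdf(np.sort(x) - shift_hat))  # ordered transformed sample
    i = np.arange(1, n + 1)
    return 1.0 / (12 * n) + float(np.sum((u - (2 * i - 1) / (2 * n)) ** 2))
```

When the `u` values sit exactly at the grid points $(2i-1)/(2n)$ the statistic reduces to its minimum value $1/(12n)$, which gives a quick sanity check.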

A similar problem exists for continuous-time stochastic processes, which are widely used as mathematical models in many fields. Goodness-of-fit (GoF) tests have been studied by many authors. For example, Kutoyants [9] discusses some possibilities for the construction of such tests; in particular, he considers the Kolmogorov-Smirnov and Cramér-von Mises statistics based on continuous observations. Note that the Kolmogorov-Smirnov statistic for ergodic diffusion processes was studied in Fournie [6] and in Fournie and Kutoyants [7]. However, due to the structure of the covariance of the limit process, the Kolmogorov-Smirnov statistic is not asymptotically distribution free in diffusion process models. More recently, Kutoyants [10] proposed a modification of the Kolmogorov-Smirnov statistic for diffusion models that is asymptotically distribution free. See also Dachian and Kutoyants [2], who propose GoF tests for diffusion and inhomogeneous Poisson processes with a simple basic hypothesis; these tests were shown to be asymptotically distribution free. In the case of the Ornstein-Uhlenbeck process, Kutoyants [11] showed that Cramér-von Mises type tests are asymptotically parameter free. Another test was studied by Negri and Nishiyama [15].

## 2 Main Results

Suppose that we observe an ergodic diffusion process, solution to the following stochastic differential equation

$$dX_t=S(X_t)\,dt+dW_t,\qquad X_0,\quad 0\le t\le T. \tag{2.1}$$
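Although the theory below assumes continuous observation of $X^T$, in simulation studies a path of (2.1) is typically generated on a fine time grid. A minimal Euler-Maruyama sketch (the function name and parameters are our illustrative choices):

```python
import numpy as np

def simulate_diffusion(S, x0, T, n_steps, rng):
    """Euler-Maruyama scheme for dX_t = S(X_t) dt + dW_t on [0, T]."""
    dt = T / n_steps
    x = np.empty(n_steps + 1)
    x[0] = x0
    noise = rng.normal(scale=np.sqrt(dt), size=n_steps)  # Wiener increments
    for k in range(n_steps):
        x[k + 1] = x[k] + S(x[k]) * dt + noise[k]
    return x
```

For instance, the shifted Ornstein-Uhlenbeck drift $S(x)=-(x-\vartheta)$ satisfies the conditions below and is a convenient benchmark for the tests.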

We want to test the following null hypothesis

$$\mathscr{H}_0:\ S(x)=S_*(x-\vartheta),\qquad\vartheta\in\Theta,$$

where $S_*(\cdot)$ is some known function and the shift parameter $\vartheta$ is unknown. We suppose that $\Theta=(\alpha,\beta)$. Let us introduce the family

$$\mathcal{S}(\Theta)=\bigl\{S_*(x-\vartheta),\ \vartheta\in\Theta=(\alpha,\beta)\bigr\}.$$

The alternative is defined as

$$\mathscr{H}_1:\ S(\cdot)\notin\overline{\mathcal{S}(\Theta)},$$

where $\overline{\mathcal{S}(\Theta)}$ denotes the closure of the family $\mathcal{S}(\Theta)$.

We suppose that the trend coefficients of the observed diffusion process under both hypotheses satisfy the conditions:

$\mathcal{A}_1$. The function $S(x)$ is locally bounded and, for some constant $C>0$,

$$xS(x)\le C(1+x^2),$$

and

$\mathcal{A}_2$. The function $S(x)$ satisfies

$$\varlimsup_{|x|\to\infty}\operatorname{sgn}(x)\,S(x)<0. \tag{2.2}$$

Recall that under condition $\mathcal{A}_1$, equation (2.1) has a unique weak solution (see [5]). Moreover, under condition $\mathcal{A}_2$, the diffusion process is recurrent and its invariant density under the hypothesis $\mathscr{H}_0$ can be given explicitly (see [9], Theorem 1.16):

$$f(x,\vartheta)=\frac{1}{G(\vartheta)}\exp\Bigl\{2\int_\vartheta^x S_*(y-\vartheta)\,dy\Bigr\}.$$
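For a concrete drift $S_*$ this invariant density can be evaluated numerically from the formula above; a sketch using trapezoidal quadrature on a grid (the function name is ours, and the normalizing constant is computed on the same grid, so the tails must be negligible outside it):

```python
import numpy as np

def invariant_density(S_star, theta, grid):
    """Invariant density f(x, theta) = exp{2 * int_theta^x S*(y - theta) dy} / G,
    evaluated on an increasing `grid` by trapezoidal quadrature."""
    s = S_star(grid - theta)
    # cumulative trapezoid of S*(y - theta), then anchor the integral at x = theta
    cum = np.concatenate(([0.0],
                          np.cumsum(0.5 * (s[1:] + s[:-1]) * np.diff(grid))))
    cum -= np.interp(theta, grid, cum)
    h = np.exp(2.0 * cum)                                  # unnormalized density
    G = np.sum(0.5 * (h[1:] + h[:-1]) * np.diff(grid))     # normalizing constant
    return h / G
```

For the Ornstein-Uhlenbeck drift $S_*(x)=-x$ this reproduces the $\mathcal{N}(\vartheta,1/2)$ density, and one can check numerically the shift property $f(x,\vartheta)=f(x-\vartheta)$ derived in Section 3.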

Denote by $\xi_\vartheta$ a random variable (r.v.) having this density and by $\mathbf{E}_\vartheta$ the corresponding mathematical expectation. To simplify the notation, in the case $\vartheta=0$ we denote the density function by $f(x)$, the corresponding distribution function by $F(x)$, the r.v. by $\xi_0$, and the mathematical expectation by $\mathbf{E}_0$. Denote by $\mathcal{P}$ the class of functions having polynomial majorants, i.e.

$$\mathcal{P}=\bigl\{h(\cdot):\ |h(x)|\le C(1+|x|^p)\bigr\},$$

with some constants $C>0$ and $p>0$. Let $S'_*(x)$ be the derivative of $S_*(x)$ w.r.t. $x$.

Let us fix some $\varepsilon\in(0,1)$ and denote by $\mathcal{K}_\varepsilon$ the class of tests $\psi_T$ of asymptotic size $\varepsilon$, i.e.

$$\mathbf{E}_0\,\psi_T=\varepsilon+o(1).$$

Our objective is to construct tests of this kind.

To test the hypothesis $\mathscr{H}_0$, we propose two tests. The first is based on the local time estimator (LTE) of the invariant density, which can be written as

$$\hat f_T(x)=\frac{1}{T}\bigl(|X_T-x|-|X_0-x|\bigr)-\frac{1}{T}\int_0^T\operatorname{sgn}(X_t-x)\,dX_t.$$
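On a discretely sampled path the LTE can be approximated by replacing the Itô integral with a left-point Riemann sum; a sketch (the helper is ours, and the time discretization introduces an additional error not present in the continuous-observation theory):

```python
import numpy as np

def local_time_density_estimator(path, T, x):
    """Discretized LTE of the invariant density at x:
    f_T(x) = [(|X_T - x| - |X_0 - x|) - int_0^T sgn(X_t - x) dX_t] / T,
    with the Ito integral approximated by a left-point Riemann sum."""
    dX = np.diff(path)
    ito = np.sum(np.sign(path[:-1] - x) * dX)
    return ((abs(path[-1] - x) - abs(path[0] - x)) - ito) / T
```

If the path never crosses the level $x$, the two terms cancel exactly and the estimator vanishes, reflecting that the local time at $x$ is zero.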

The unknown parameter is estimated via the maximum likelihood estimator (MLE) $\hat\vartheta_T$, which is defined as the solution of the equation

$$L(\hat\vartheta_T,X^T)=\sup_{\vartheta\in\Theta}L(\vartheta,X^T),$$

where $L(\vartheta,X^T)$ is the log-likelihood ratio

$$L(\vartheta,X^T)=\int_0^T S_*(X_t-\vartheta)\,dX_t-\frac{1}{2}\int_0^T S_*(X_t-\vartheta)^2\,dt.$$
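In simulations $L(\vartheta,X^T)$ is computed from discrete samples and maximized numerically; a crude grid-search sketch (the helper and the grid of candidate shifts are our illustrative choices, not the paper's procedure):

```python
import numpy as np

def mle_shift(path, dt, S_star, theta_grid):
    """Discretized log-likelihood
    L(theta) ~ sum S*(X_k - theta) dX_k - 1/2 sum S*(X_k - theta)^2 dt,
    maximized by brute-force search over `theta_grid`."""
    dX = np.diff(path)
    logL = []
    for theta in theta_grid:
        s = S_star(path[:-1] - theta)
        logL.append(np.sum(s * dX) - 0.5 * np.sum(s ** 2) * dt)
    return theta_grid[int(np.argmax(logL))]
```

In practice one would refine this with a root-finding step on the score equation; the grid search merely illustrates the definition of $\hat\vartheta_T$ as an argmax.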

We give the following regularity conditions, which ensure the consistency and asymptotic normality of the MLE.

Condition $\mathcal{B}$.

$\mathcal{B}_1$. The function $S_*(x)$ is continuously differentiable, the derivative $S'_*\in\mathcal{P}$, and $S'_*$ is uniformly continuous in the following sense:

$$\lim_{\nu\to0}\sup_{|\tau|<\nu}\mathbf{E}_0\bigl|S'_*(\xi_0)-S'_*(\xi_0+\tau)\bigr|^2=0.$$

$\mathcal{B}_2$. The Fisher information is positive:

$$I=\mathbf{E}_0\,S'_*(\xi_0)^2>0. \tag{2.3}$$

Moreover, for any $\nu>0$,

$$\inf_{|\tau|>\nu}\mathbf{E}_0\bigl(S_*(\xi_0)-S_*(\xi_0+\tau)\bigr)^2>0.$$

Denote the statistic based on the LTE by

$$\delta_T=T\int_{-\infty}^{\infty}\bigl(\hat f_T(x)-f(x-\hat\vartheta_T)\bigr)^2dx;$$

we will prove that under the hypothesis $\mathscr{H}_0$ it converges in distribution to

$$\delta=\int_{-\infty}^{\infty}\biggl(\int_{-\infty}^{\infty}\Bigl(2f(x)\,\frac{\mathbb{1}_{\{y>x\}}-F(y)}{\sqrt{f(y)}}-\frac{1}{I}\,S'_*(y)\sqrt{f(y)}\,f'(x)\Bigr)dW(y)\biggr)^2dx, \tag{2.4}$$

with $W(y)=W_+(y)\mathbb{1}_{\{y\ge0\}}+W_-(-y)\mathbb{1}_{\{y<0\}}$, where $W_+(\cdot)$ and $W_-(\cdot)$ are independent Wiener processes. The Cramér-von Mises type test is defined as

$$\psi_T=\mathbb{1}_{\{\delta_T>d_\varepsilon\}},$$

where $d_\varepsilon$ is the $(1-\varepsilon)$-quantile of the distribution of $\delta$, that is, the solution of the equation

$$\mathbf{P}(\delta\ge d_\varepsilon)=\varepsilon. \tag{2.5}$$

The main result for the Cramér-von Mises test based on the local time estimator is the following:

###### Theorem 2.1.

Let the conditions $\mathcal{A}_1$, $\mathcal{A}_2$ and $\mathcal{B}$ be fulfilled. Then the test $\psi_T$ belongs to $\mathcal{K}_\varepsilon$.

The theorem is proved in Section 3.

Note that neither $\delta$ nor $d_\varepsilon$ depends on the unknown parameter. This allows us to conclude that the test $\psi_T$ is APF.
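Putting the ingredients together, $\delta_T$ can be approximated on a grid from a discretely observed path; a sketch (discretized LTE as above, trapezoidal rule for the outer integral; the function name, grid and plug-in estimate $\hat\vartheta_T$ are our illustrative choices, and in practice $d_\varepsilon$ would be obtained by Monte Carlo simulation of $\delta$):

```python
import numpy as np

def cvm_lte_statistic(path, T, grid, f0, theta_hat):
    """delta_T = T * int (f_T(x) - f0(x - theta_hat))^2 dx on `grid`,
    with the local-time density estimator and the trapezoidal rule."""
    dX = np.diff(path)
    f_hat = np.array([
        ((abs(path[-1] - x) - abs(path[0] - x))
         - np.sum(np.sign(path[:-1] - x) * dX)) / T   # discretized LTE at x
        for x in grid
    ])
    d2 = (f_hat - f0(grid - theta_hat)) ** 2
    return T * np.sum(0.5 * (d2[1:] + d2[:-1]) * np.diff(grid))
```

The grid should cover the effective support of the invariant density, since the theoretical statistic integrates over the whole real line.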

The second test is based on the same MLE and on the empirical distribution function (EDF):

$$\hat F_T(x)=\frac{1}{T}\int_0^T \mathbb{1}_{\{X_t<x\}}\,dt.$$

The corresponding statistic is

$$\Delta_T=T\int_{-\infty}^{\infty}\bigl(\hat F_T(x)-F(x-\hat\vartheta_T)\bigr)^2dx,$$

which converges in distribution to

$$\Delta=\int_{-\infty}^{\infty}\biggl(\int_{-\infty}^{\infty}\Bigl(2\,\frac{F(y\wedge x)-F(y)F(x)}{\sqrt{f(y)}}-\frac{1}{I}\,S'_*(y)\sqrt{f(y)}\,f(x)\Bigr)dW(y)\biggr)^2dx. \tag{2.6}$$

Thus we propose the Cramér-von Mises type test

$$\Psi_T=\mathbb{1}_{\{\Delta_T>c_\varepsilon\}},$$

where $c_\varepsilon$ is the solution of the equation

$$\mathbf{P}(\Delta\ge c_\varepsilon)=\varepsilon. \tag{2.7}$$
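Analogously, $\Delta_T$ can be approximated from discrete observations, with the EDF replaced by the fraction of sampled points below each level; a sketch (function name and discretization are our illustrative choices):

```python
import numpy as np

def cvm_edf_statistic(path, T, grid, F0, theta_hat):
    """Delta_T = T * int (F_T(x) - F0(x - theta_hat))^2 dx on `grid`, where
    F_T(x) = (1/T) int_0^T 1{X_t < x} dt is approximated by the fraction
    of discrete observations below x; outer integral by the trapezoidal rule."""
    F_hat = np.array([np.mean(path[:-1] < x) for x in grid])
    d2 = (F_hat - F0(grid - theta_hat)) ** 2
    return T * np.sum(0.5 * (d2[1:] + d2[:-1]) * np.diff(grid))
```

Compared with the LTE version, the EDF statistic is smoother in the observations and typically less sensitive to the time-discretization step.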

The main result for the Cramér-von Mises test based on the empirical distribution function estimator is the following:

###### Theorem 2.2.

Under conditions $\mathcal{A}_1$, $\mathcal{A}_2$ and $\mathcal{B}$, the test $\Psi_T$ belongs to $\mathcal{K}_\varepsilon$.

The theorem is proved in Section 4.

## 3 Proof of Theorem 2.1

In this section we study the test $\psi_T$, where

$$\delta_T=T\int_{-\infty}^{\infty}\bigl(\hat f_T(x)-f(x-\hat\vartheta_T)\bigr)^2dx.$$

Under the basic hypothesis $\mathscr{H}_0$, the density of the invariant law can be presented as follows:

$$f(x,\vartheta)=\frac{\exp\bigl\{2\int_\vartheta^x S_*(y-\vartheta)\,dy\bigr\}}{\int_{-\infty}^{\infty}\exp\bigl\{2\int_\vartheta^y S_*(z-\vartheta)\,dz\bigr\}\,dy}=\frac{\exp\bigl\{2\int_0^{x-\vartheta} S_*(y)\,dy\bigr\}}{\int_{-\infty}^{\infty}\exp\bigl\{2\int_0^{y-\vartheta} S_*(z)\,dz\bigr\}\,dy}=f(x-\vartheta).$$

Note that the distribution function of the process satisfies

$$F(x,\vartheta)=\int_{-\infty}^x f(y-\vartheta)\,dy=\int_{-\infty}^{x-\vartheta} f(y)\,dy=F(x-\vartheta).$$

In addition, for any integrable function $h(\cdot)$,

$$\mathbf{E}_\vartheta\,h(\xi_\vartheta-\vartheta)=\int_{-\infty}^{\infty}h(x-\vartheta)f(x-\vartheta)\,dx=\int_{-\infty}^{\infty}h(x)f(x)\,dx=\mathbf{E}_0\,h(\xi_0). \tag{3.1}$$

Note that the Fisher information in our case does not depend on the unknown parameter:

$$I=\mathbf{E}_{\vartheta_0}S'_*(\xi_{\vartheta_0}-\vartheta_0)^2=\mathbf{E}_0\,S'_*(\xi_0)^2>0,$$

where $\vartheta_0$ is the true value of the unknown parameter.

From condition $\mathcal{A}_2$ it follows that there exist constants $\gamma>0$ and $A>0$ such that for all $|x|>A$,

$$\operatorname{sgn}(x)\,S_*(x)<-\gamma. \tag{3.2}$$

It can be shown that for $x>A$,

$$f(x)=\frac{1}{G(S_*)}\exp\Bigl\{2\Bigl(\int_0^A+\int_A^x\Bigr)S_*(y)\,dy\Bigr\}\le C e^{-2\gamma x}.$$

A similar bound can be deduced for $x<-A$, so we have

$$f(x)\le C e^{-2\gamma|x|},\qquad |x|>A. \tag{3.3}$$

If conditions $\mathcal{A}_2$ and $\mathcal{B}$ are fulfilled, the MLE is consistent, i.e., for any $\nu>0$,

$$\lim_{T\to\infty}\mathbf{P}_{\vartheta_0}\bigl\{|\hat\vartheta_T-\vartheta_0|>\nu\bigr\}=0;$$

it is asymptotically normal,

$$\mathcal{L}_{\vartheta_0}\bigl\{\sqrt{T}(\hat\vartheta_T-\vartheta_0)\bigr\}\Longrightarrow\mathcal{N}(0,I^{-1}); \tag{3.4}$$

and the moments converge, i.e., for any $p>0$,

$$\lim_{T\to\infty}\mathbf{E}_{\vartheta_0}\bigl|\sqrt{T}(\hat\vartheta_T-\vartheta_0)\bigr|^p=\mathbf{E}_0|\hat u|^p,$$

where $\hat u\sim\mathcal{N}(0,I^{-1})$. The proof can be found in [9], Theorem 2.8. We can define

$$\hat u=\frac{1}{I}\int_{-\infty}^{\infty}S'_*(y)\sqrt{f(y)}\,dW(y),$$

and, denoting $\hat u_T=\sqrt{T}(\hat\vartheta_T-\vartheta_0)$, the asymptotic normality (3.4) can be written as

$$\mathcal{L}_{\vartheta_0}\{\hat u_T\}\Longrightarrow\mathcal{L}\{\hat u\}. \tag{3.5}$$

We define $\eta_T(x)=\sqrt{T}\bigl(\hat f_T(x)-f(x-\vartheta_0)\bigr)$. In [9], Theorem 4.11, we can find the following representation:

$$\begin{aligned}\eta_T(x)&=\sqrt{T}\bigl(\hat f_T(x)-f(x-\vartheta_0)\bigr)\\&=\frac{2f(x-\vartheta_0)}{\sqrt{T}}\int_{X_0}^{X_T}\frac{\mathbb{1}_{\{y>x\}}-F(y-\vartheta_0)}{f(y-\vartheta_0)}\,dy-\frac{2f(x-\vartheta_0)}{\sqrt{T}}\int_0^T\frac{\mathbb{1}_{\{X_t>x\}}-F(X_t-\vartheta_0)}{f(X_t-\vartheta_0)}\,dW_t.\end{aligned} \tag{3.6}$$

Let us put

$$M(y,x)=2f(x)\,\frac{\mathbb{1}_{\{y>x\}}-F(y)}{f(y)}.$$

Then $\eta_T(x)$ can be written as

$$\eta_T(x)=\frac{1}{\sqrt{T}}\int_{X_0}^{X_T}M(y-\vartheta_0,x-\vartheta_0)\,dy-\frac{1}{\sqrt{T}}\int_0^T M(X_t-\vartheta_0,x-\vartheta_0)\,dW_t. \tag{3.7}$$

We can state

###### Lemma 3.1.

Let condition $\mathcal{A}_2$ be fulfilled. Then

$$\int_{-\infty}^{\infty}\mathbf{E}_0\Bigl(\int_0^{\xi_0}M(y,x)\,dy\Bigr)^2dx<\infty.$$

Proof. Applying the estimate (3.3), for $x>A$,

$$\begin{aligned}\mathbf{E}_0\Bigl(\int_0^{\xi_0}M(y,x)\,dy\Bigr)^2&=4f(x)^2\int_{-\infty}^{\infty}\Bigl(\int_0^z\frac{\mathbb{1}_{\{y>x\}}-F(y)}{f(y)}\,dy\Bigr)^2 f(z)\,dz\\&=4f(x)^2\Bigl(\int_{-\infty}^{-A}+\int_{-A}^{A}+\int_A^x\Bigr)\Bigl(\int_0^z\frac{-F(y)}{f(y)}\,dy\Bigr)^2 f(z)\,dz\\&\quad+4f(x)^2\int_x^{\infty}\Bigl(\int_0^x\frac{-F(y)}{f(y)}\,dy+\int_x^z\frac{1-F(y)}{f(y)}\,dy\Bigr)^2 f(z)\,dz.\end{aligned}$$

Further,

$$\begin{aligned}f(x)^2\int_{-\infty}^{-A}\Bigl(\int_0^z\frac{-F(y)}{f(y)}\,dy\Bigr)^2 f(z)\,dz&=f(x)^2\int_{-\infty}^{-A}\Bigl(\Bigl(\int_z^{-A}+\int_{-A}^0\Bigr)\frac{F(y)}{f(y)}\,dy\Bigr)^2 f(z)\,dz\\&\le f(x)^2\int_{-\infty}^{-A}\Bigl(\int_z^{-A}\int_{-\infty}^y\exp\Bigl(-2\int_u^y S_*(v)\,dv\Bigr)du\,dy+C_1\Bigr)^2 f(z)\,dz\\&\le f(x)^2\int_{-\infty}^{-A}\Bigl(C_2\int_z^{-A}\int_{-\infty}^y e^{-2\gamma(y-u)}\,du\,dy+C_1\Bigr)^2 f(z)\,dz\\&\le C f(x)^2\int_{-\infty}^{-A}(1+z)^2 f(z)\,dz\le C f(x)^2\le C e^{-4\gamma x},\end{aligned}$$

moreover

$$\begin{aligned}f(x)^2\int_A^x\Bigl(\int_0^z\frac{-F(y)}{f(y)}\,dy\Bigr)^2 f(z)\,dz&\le\int_A^x\Bigl(\Bigl(\int_0^A+\int_A^z\Bigr)\frac{f(x)}{f(y)}\,dy\Bigr)^2 f(z)\,dz\\&\le\int_A^x\Bigl(C_1 f(x)+C_2\int_A^z e^{-2\gamma(x-y)}\,dy\Bigr)^2 f(z)\,dz\\&\le\int_A^x\Bigl(C_1 e^{-2\gamma x}+C'_2 e^{-2\gamma(x-z)}-C'_2 e^{-2\gamma(x-A)}\Bigr)^2\,C e^{-2\gamma z}\,dz\\&\le e^{-4\gamma x}\int_A^x\bigl(C_3 e^{2\gamma z}+C_4 e^{-2\gamma z}\bigr)\,dz\le C e^{-2\gamma x},\end{aligned}$$

and finally

$$\begin{aligned}f(x)^2\int_x^{\infty}\Bigl(\int_x^z\frac{1-F(y)}{f(y)}\,dy\Bigr)^2 f(z)\,dz&\le C f(x)^2\int_x^{\infty}\Bigl(\int_x^z\int_y^{\infty}e^{-2\gamma(u-y)}\,du\,dy\Bigr)^2 e^{-2\gamma z}\,dz\\&\le C f(x)^2\int_x^{\infty}(z-x)^2 e^{-2\gamma z}\,dz\\&\le C f(x)^2\int_0^{\infty}s^2 e^{-2\gamma(s+x)}\,ds\le C e^{-6\gamma x}.\end{aligned}$$

Then we have

$$\mathbf{E}_0\Bigl(\int_0^{\xi_0}M(y,x)\,dy\Bigr)^2\le C e^{-2\gamma|x|}\qquad\text{for }x>A. \tag{3.8}$$

A similar estimate can be obtained for $x<-A$; therefore the bound holds for $|x|>A$. We finally obtain

$$\begin{aligned}\int_{-\infty}^{\infty}\mathbf{E}_0\Bigl(\int_0^{\xi_0}M(y,x)\,dy\Bigr)^2dx&=\Bigl(\int_{-\infty}^{-A}+\int_{-A}^{A}+\int_A^{\infty}\Bigr)\mathbf{E}_0\Bigl(\int_0^{\xi_0}M(y,x)\,dy\Bigr)^2dx\\&\le C_1\int_{-\infty}^{-A}e^{2\gamma x}\,dx+C_2+C_3\int_A^{\infty}e^{-2\gamma x}\,dx<\infty.\end{aligned}$$

This result yields directly the conditions of Theorem 4.11 in [9]:

$$\mathbf{E}_{\vartheta_0}M(\xi_{\vartheta_0}-\vartheta_0,x-\vartheta_0)^2=\mathbf{E}_0\,M(\xi_0,x-\vartheta_0)^2<\infty,$$

and

$$\mathbf{E}_{\vartheta_0}\Bigl(\int_0^{\xi_{\vartheta_0}}M(y-\vartheta_0,x-\vartheta_0)\,dy\Bigr)^2<\infty.$$

So we can deduce the convergence and asymptotic normality of $\eta_T(x)$. In fact, under condition $\mathcal{A}_2$, the LTE is consistent and asymptotically normal, that is,

$$\eta_T(x)=\sqrt{T}\bigl(\hat f_T(x)-f(x-\vartheta_0)\bigr)\Longrightarrow\eta(x-\vartheta_0),$$

where $\eta(x)\sim\mathcal{N}(0,d(x)^2)$ and

$$d(x)^2=4f(x)^2\,\mathbf{E}_0\biggl(\frac{\mathbb{1}_{\{\xi_0>x\}}-F(\xi_0)}{f(\xi_0)}\biggr)^2.$$

Moreover

$$\mathbf{E}_{\vartheta_0}\bigl(\eta_T(x)\eta_T(y)\bigr)=4f(x-\vartheta_0)f(y-\vartheta_0)\,\mathbf{E}_0\biggl(\frac{\bigl(\mathbb{1}_{\{\xi_0>x-\vartheta_0\}}-F(\xi_0)\bigr)\bigl(\mathbb{1}_{\{\xi_0>y-\vartheta_0\}}-F(\xi_0)\bigr)}{f(\xi_0)^2}\biggr).$$

We can define

$$\eta(x)=\int_{-\infty}^{\infty}M(y,x)\sqrt{f(y)}\,dW(y).$$

The distribution of $\eta(x)$ is $\mathcal{N}(0,d(x)^2)$, and we have the following convergence:

$$\eta_T(x)\Longrightarrow\eta(x-\vartheta_0). \tag{3.9}$$

For the study of $\delta_T$ and its limit $\delta$, we need more than (3.5) and the convergence (3.9) separately: we need their joint convergence.

###### Lemma 3.2.

Let conditions $\mathcal{A}_2$ and $\mathcal{B}$ be fulfilled. Then the vector $(\eta_T(x_1),\dots,\eta_T(x_k),\hat u_T)$ is asymptotically normal:

$$\mathcal{L}\bigl(\eta_T(x_1),\dots,\eta_T(x_k),\hat u_T\bigr)\Longrightarrow\mathcal{L}\bigl(\eta(x_1-\vartheta_0),\dots,\eta(x_k-\vartheta_0),\hat u\bigr),$$

for any $k\in\mathbb{N}$ and any $x_1,\dots,x_k\in\mathbb{R}$.

Proof. The first integral in (3.7) converges to zero, so we only need to verify the convergence of the Itô-integral part. Let us denote for simplicity

$$\eta^0_T(x)=\frac{1}{\sqrt{T}}\int_0^T M(X_t-\vartheta_0,x)\,dW_t.$$

It is sufficient to verify that for any $x_1,\dots,x_k$,

$$\bigl(\eta^0_T(x_1),\dots,\eta^0_T(x_k),\hat u_T\bigr)\Longrightarrow\bigl(\eta(x_1),\dots,\eta(x_k),\hat u\bigr). \tag{3.10}$$

Recall that $\hat u_T$ can be defined as follows:

$$Z_T(\hat u_T)=\sup_{u\in\mathbb{U}_T}Z_T(u),\qquad \mathbb{U}_T=\Bigl\{u:\ \vartheta_0+\frac{u}{\sqrt{T}}\in\Theta\Bigr\}, \tag{3.11}$$

where

$$Z_T(u)=\frac{d\mathbf{P}^T_{\vartheta_0+u/\sqrt{T}}}{d\mathbf{P}^T_{\vartheta_0}}(X^T)=\exp\Bigl\{u\Lambda_T-\frac{u^2}{2}\,I+r_T\Bigr\}.$$

Here $\Lambda_T$ is the normalized score at $\vartheta_0$ and $r_T\to0$ in probability. It was proved in [9], Theorem 2.8, that $Z_T(u)$ converges in distribution to $Z(u)$, where

$$Z(u)=\exp\Bigl\{u\Lambda-\frac{u^2}{2}\,I\Bigr\},$$

and $\Lambda$ is a r.v. with normal distribution $\mathcal{N}(0,I)$, which can be written as

$$\Lambda=\int_{-\infty}^{\infty}S'_*(y)\sqrt{f(y)}\,dW(y).$$

Therefore

$$\hat u_T\Longrightarrow\hat u=\frac{\Lambda}{I}.$$

Take $u_1,\dots,u_m\in\mathbb{R}$. We have to verify that the joint finite-dimensional distribution of

$$Y_T=\bigl(\eta^0_T(x_1),\eta^0_T(x_2),\dots,\eta^0_T(x_k),Z_T(u_1),Z_T(u_2),\dots,Z_T(u_m)\bigr)$$

converges to the finite-dimensional distribution of

$$Y=\bigl(\eta(x_1),\eta(x_2),\dots,\eta(x_k),Z(u_1),Z(u_2),\dots,Z(u_m)\bigr).$$

Note that the only stochastic term in $Z_T(u)$ is $\Lambda_T$, so (3.10) is equivalent to

$$\bigl(\eta^0_T(x_1),\eta^0_T(x_2),\dots,\eta^0_T(x_k),\Lambda_T\bigr)\Longrightarrow\bigl(\eta(x_1),\eta(x_2),\dots,\eta(x_k),\Lambda\bigr). \tag{3.12}$$

Take $\lambda=(\lambda_1,\dots,\lambda_{k+1})\in\mathbb{R}^{k+1}$ and put

$$h(y,x,\lambda)=\sum_{l=1}^k\lambda_l M(y,x_l)+\lambda_{k+1}S'_*(y).$$

We have

$$\begin{aligned}\mathbf{E}_{\vartheta_0}h(\xi_{\vartheta_0}-\vartheta_0,x,\lambda)^2&=\mathbf{E}_0\,h(\xi_0,x,\lambda)^2=\int_{-\infty}^{\infty}\Bigl(\sum_{l=1}^k\lambda_l M(y,x_l)+\lambda_{k+1}S'_*(y)\Bigr)^2 f(y)\,dy\\&=\int_{-\infty}^{\infty}\Bigl(\sum_{l=1}^k 2\lambda_l f(x_l)\,\frac{\mathbb{1}_{\{y>x_l\}}-F(y)}{\sqrt{f(y)}}+\lambda_{k+1}S'_*(y)\sqrt{f(y)}\Bigr)^2 dy\\&=\int_{-\infty}^{\infty}\Bigl(\sum_{l=1}^k\sum_{m=1}^k 4\lambda_l\lambda_m f(x_l)f(x_m)\,\frac{\bigl(\mathbb{1}_{\{y>x_l\}}-F(y)\bigr)\bigl(\mathbb{1}_{\{y>x_m\}}-F(y)\bigr)}{f(y)}\\&\qquad+\sum_{l=1}^k 4\lambda_l\lambda_{k+1}f(x_l)\bigl(\mathbb{1}_{\{y>x_l\}}-F(y)\bigr)S'_*(y)+\lambda_{k+1}^2 S'_*(y)^2 f(y)\Bigr)dy<\infty.\end{aligned}$$

The law of large numbers gives us

$$\frac{1}{T}\int_0^T h(X_t-\vartheta_0,x,\lambda)^2\,dt\longrightarrow\mathbf{E}_0\,h(\xi_0,x,\lambda)^2.$$

Moreover, the central limit theorem for stochastic integrals gives us

$$\frac{1}{\sqrt{T}}\int_0^T h(X_t-\vartheta_0,x,\lambda)\,dW_t\Longrightarrow\mathcal{N}\bigl(0,\mathbf{E}_0\,h(\xi_0,x,\lambda)^2\bigr).$$

In addition, $\sum_{l=1}^k\lambda_l\eta(x_l)+\lambda_{k+1}\Lambda$ is a zero-mean normal r.v. with variance

$$\mathbf{E}_0\Bigl(\sum_{l=1}^k\lambda_l\eta(x_l)+\lambda_{k+1}\Lambda\Bigr)^2=\sum_{l=1}^k\sum_{m=1}^k\lambda_l\lambda_m\,\mathbf{E}_0\bigl(\eta(x_l)\eta(x_m)\bigr)+2\sum_{l=1}^k\lambda_l\lambda_{k+1}\,\mathbf{E}_0\bigl(\eta(x_l)\Lambda\bigr)+\lambda_{k+1}^2\,\mathbf{E}_0(\Lambda)^2.$$

Furthermore

$$\mathbf{E}_0\bigl(\eta(x_l)\eta(x_m)\bigr)=4f(x_l)f(x_m)\int_{-\infty}^{\infty}\frac{\bigl(\mathbb{1}_{\{y>x_l\}}-F(y)\bigr)\bigl(\mathbb{1}_{\{y>x_m\}}-F(y)\bigr)}{f(y)}\,dy,$$

and

$$\mathbf{E}_0\bigl(\eta(x_l)\Lambda\bigr)=-2f(x_l)\int_{-\infty}^{\infty}\bigl(\mathbb{1}_{\{y>x_l\}}-F(y)\bigr)S'_*(y)\,dy,\qquad\mathbf{E}_0(\Lambda)^2=\int_{-\infty}^{\infty}S'_*(y)^2 f(y)\,dy.$$

We find that

$$\mathbf{E}_{\vartheta_0}h(\xi_{\vartheta_0}-\vartheta_0,x,\lambda)^2=\mathbf{E}_0\,h(\xi_0,x,\lambda)^2=\mathbf{E}_0\Bigl(\sum_{l=1}^k\lambda_l\eta(x_l)+\lambda_{k+1}\Lambda\Bigr)^2.$$

That is, for every $\lambda\in\mathbb{R}^{k+1}$,

$$\sum_{l=1}^k\lambda_l\,\eta^0_T(x_l)+\lambda_{k+1}\Lambda_T\Longrightarrow\sum_{l=1}^k\lambda_l\,\eta(x_l)+\lambda_{k+1}\Lambda,$$

and the Cramér-Wold device yields (3.12).