# Convergence of long-memory discrete $k$-th order Volterra processes

## Abstract

We obtain limit theorems for a class of nonlinear discrete-time processes $X(n)$ called $k$-th order Volterra processes. These are moving-average $k$-th order polynomial forms:

$$X(n)=\sum_{0<i_1,\ldots,i_k<\infty}a(i_1,\ldots,i_k)\,\epsilon_{n-i_1}\ldots\epsilon_{n-i_k},$$

where $\{\epsilon_i\}$ is i.i.d. with $E\epsilon_i=0$, $E\epsilon_i^2=1$, where $a(\cdot)$ is a nonrandom coefficient, and where the diagonals are included in the summation. We specify conditions for $X(n)$ to be well-defined in $L^2(\Omega)$, and focus on central and non-central limit theorems. We show that normalized partial sums of centered $X(n)$ obey the central limit theorem if $a(\cdot)$ decays fast enough so that $X(n)$ has short memory. We prove a non-central limit theorem if, on the other hand, $a(\cdot)$ is asymptotically some slowly decaying homogeneous function so that $X(n)$ has long memory. In the non-central case the limit is a linear combination of Hermite-type processes of different orders. This linear combination can be expressed as a centered multiple Wiener-Stratonovich integral.


Key words Long memory; Long-range dependence; Volterra process; Wiener chaos; Wiener; Stratonovich; Limit theorems

2010 AMS Classification: 60G18, 60F05

## 1 Introduction

A common assumption when analyzing a stationary time series $\{X(n)\}$ is that it is a causal linear process, that is,

$$X_1(n)=\sum_{i=1}^{\infty}a_i\,\epsilon_{n-i}, \tag{1}$$

where $\{\epsilon_i\}$ is a sequence of i.i.d. random variables with mean $0$ and variance $1$. This assumption is based on Wold's decomposition, which states that if $\{X(n)\}$ is stationary with mean $0$ and finite second moment, and is also purely non-deterministic, then the representation (1) always holds with $\{\epsilon_i\}$ a sequence of uncorrelated random variables (Brockwell and Davis [5] §5.7). The independence assumption on $\{\epsilon_i\}$ in (1) obliterates the higher-order dependence structure. In some applications, linear processes provide good approximations, while in others they do not, as in the case of the ARCH model for volatility data.

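The Volterra process extends the linear process by incorporating non-linearity. A (causal) Volterra process with highest order $K$ is of the form

$$X_K(n)=\sum_{k=1}^{K}\sum_{0<i_1,\ldots,i_k<\infty}a_k(i_1,\ldots,i_k)\,\epsilon_{n-i_1}\ldots\epsilon_{n-i_k}. \tag{2}$$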

To understand the importance of (2), suppose that the stationary process is $X(n)=f(\epsilon_{n-1},\epsilon_{n-2},\ldots)$ for some regular function $f$. Then (2) can be heuristically regarded as its $K$-th order Taylor series approximation. The homogeneous polynomial-form expansion in (2), and its continuous-time counterpart where the sums are replaced with integrals, was originally proposed by Vito Volterra (see Volterra [22]) for modeling deterministic nonlinear systems, and later extended by Norbert Wiener (see Wiener [23]) to random systems, which eventually led to the well-developed theory of Wiener chaos (see, e.g., Cameron and Martin [8], Itô [14], and the recent survey Peccati and Taqqu [19]). In the context of approximation of stationary processes, Nisio [18] shows that any stationary process can be approximated in the sense of finite-dimensional distributions by a Volterra process with Gaussian $\epsilon_i$'s. Some nonlinear time series models admit Volterra expansions (2) with $K=\infty$. For example, the LARCH($\infty$) model

$$X(n)=a+\sum_{i=1}^{\infty}b_iY(n-i),\qquad Y(n)=X(n)\epsilon_n,$$

under suitable conditions admits the following Volterra expansion (see, e.g., Theorem 2.1 of Giraitis et al. [12]):

$$X(n)=a\left(1+\sum_{k=1}^{\infty}\sum_{0<i_1<\ldots<i_k<\infty}b_{i_1}b_{i_2-i_1}\ldots b_{i_k-i_{k-1}}\,\epsilon_{n-i_1}\ldots\epsilon_{n-i_k}\right).$$
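The iteration behind such an expansion can be checked numerically. The Python sketch below is only an illustration under added assumptions: the noise is set to zero before time 1 so that all sums are finite, and the constant `a` and the coefficient function `b` are hypothetical choices. It compares the recursion $X(n)=a+\sum_i b_iX(n-i)\epsilon_{n-i}$ with the sum over increasing lags $0<i_1<\ldots<i_k$ of $b_{i_1}b_{i_2-i_1}\cdots b_{i_k-i_{k-1}}\epsilon_{n-i_1}\cdots\epsilon_{n-i_k}$.

```python
import itertools
import random

def larch_recursive(a, b, eps, n):
    """X(1..n) from X(t) = a + sum_i b(i) X(t-i) eps[t-i], assuming eps[s] = 0 for s <= 0."""
    X = {}
    for t in range(1, n + 1):
        X[t] = a + sum(b(i) * X[t - i] * eps[t - i] for i in range(1, t))
    return X

def larch_volterra(a, b, eps, n):
    """a * (1 + sum over lag sets 0 < i_1 < ... < i_k < n of the iterated products)."""
    total = 1.0
    for k in range(1, n):
        for lags in itertools.combinations(range(1, n), k):
            coef = b(lags[0])
            for prev, cur in zip(lags, lags[1:]):
                coef *= b(cur - prev)      # factor b_{i_j - i_{j-1}}
            noise = 1.0
            for i in lags:
                noise *= eps[n - i]        # factor eps_{n - i_j}
            total += coef * noise
    return a * total

random.seed(1)
n = 8
eps = {t: random.gauss(0.0, 1.0) for t in range(1, n + 1)}
a = 1.5                                    # hypothetical constant
b = lambda i: 0.4 * i ** -0.8              # hypothetical coefficients

x_rec = larch_recursive(a, b, eps, n)[n]
x_vol = larch_volterra(a, b, eps, n)
print(x_rec, x_vol)
```

Under the zero-past assumption the two values agree up to rounding, which is the finite-sample analogue of unrolling the recursion.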

We are interested here in stationary processes that have long memory, or long-range dependence. A common choice is a linear process $X(n)$ in (1) with $a_i\sim c\,i^{d-1}$ as $i\to\infty$, where $d\in(0,1/2)$ is the memory parameter, and $c$ is some constant. This is the case, for instance, when $X(n)$ is the stationary solution of the fractional difference equation

$$\Delta^dX(n)=\epsilon_{n-1},$$

where $\Delta=I-B$ is the difference operator with $I$ being the identity operator and $B$ being the backward shift operator, and $\Delta^d=(I-B)^d$ is understood as a binomial series (see, e.g., Giraitis et al. [11] Chapter 7.2). We note that such long-memory linear processes have an autocovariance decaying like $n^{2d-1}$ as $n\to\infty$, and a spectral density exploding at the origin as $|\lambda|^{-2d}$ as $\lambda\to0$.
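The hyperbolic decay of the moving-average coefficients of the fractional difference model can be checked directly. The sketch below assumes a memory parameter $d\in(0,1/2)$ and uses the standard binomial recursion for the coefficients $\psi_j$ of $(1-B)^{-d}=\sum_{j\ge0}\psi_jB^j$ (so that $a_i=\psi_{i-1}$ under the indexing of (1)); the function name and the value of `d` are illustrative choices, not taken from the paper. It compares $\psi_j$ with $j^{d-1}/\Gamma(d)$.

```python
import math

def fracdiff_ma_coeffs(d, n_terms):
    """Coefficients psi_j of (1 - B)^(-d) = sum_j psi_j B^j via psi_j = psi_{j-1} (j - 1 + d) / j."""
    psi = [1.0]
    for j in range(1, n_terms):
        psi.append(psi[-1] * (j - 1 + d) / j)
    return psi

d = 0.3                                   # illustrative memory parameter in (0, 1/2)
psi = fracdiff_ma_coeffs(d, 10001)
j = 10000
ratio = psi[j] / (j ** (d - 1) / math.gamma(d))
print(ratio)                              # close to 1 for large j
```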

If one wants to consider a nonlinear long-memory model, a natural choice is to have a Volterra process (2) with coefficients $a_k(i_1,\ldots,i_k)$ decaying slowly as $(i_1,\ldots,i_k)$ tends to infinity, so that the autocovariance has a slow hyperbolic decay. The major goal of this paper is to study the limit of normalized partial sums of some long-memory Volterra processes. When $X(n)$ is a long-memory linear process, that is, a long-memory Volterra process with $K=1$, then the limit, as is well-known, is fractional Brownian motion (Davydov [9]). When $X(n)$ is a polynomial of a long-memory linear process, that is, when $a_k(i_1,\ldots,i_k)=c_ka_{i_1}\cdots a_{i_k}$ in (2) for some constant $c_k$, and $d$ is large enough, then the limit is a Hermite process of a fixed order (Surgailis [20], Avram and Taqqu [1]). Such limit theorems involving non-Brownian motion limits are often called non-central limit theorems.

In this paper, we focus on Volterra processes of a single order $k$:

$$X(n)=\sum_{0<i_1,\ldots,i_k<\infty}a(i_1,\ldots,i_k)\,\epsilon_{n-i_1}\ldots\epsilon_{n-i_k}, \tag{3}$$

which avoids possible cancellations between terms of different orders. Note that the multiple sum (3) includes diagonals, that is, it allows $i_1,\ldots,i_k$ to be equal to each other. In the literature, one often considers multiple sums of the type (3) where summation over the diagonals is excluded, which greatly simplifies the theory. Although the exclusion of the diagonals is a typical theoretical assumption, it is, from a practical perspective, an artificial one. Expression (3) is the natural one since it includes all the terms.
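To make the role of the diagonals concrete, here is a small Python sketch for order $k=2$, truncated at a hypothetical lag `M` (the truncation, the coefficient `a` and the Gaussian noise are illustrative assumptions only). It evaluates the double sum with and without the diagonal $i_1=i_2$ and shows that the difference is exactly the diagonal contribution $\sum_i a(i,i)\epsilon_{n-i}^2$.

```python
import random

def volterra2(a, eps, n, M, include_diagonal=True):
    """Truncated second-order form sum_{1 <= i1, i2 <= M} a(i1, i2) eps[n-i1] eps[n-i2]."""
    total = 0.0
    for i1 in range(1, M + 1):
        for i2 in range(1, M + 1):
            if i1 == i2 and not include_diagonal:
                continue                   # off-diagonal version skips i1 == i2
            total += a(i1, i2) * eps[n - i1] * eps[n - i2]
    return total

random.seed(0)
M, n = 50, 60
eps = [random.gauss(0.0, 1.0) for _ in range(n)]   # eps[0], ..., eps[n-1]
a = lambda i1, i2: (i1 + i2) ** -1.2               # hypothetical slowly decaying coefficient

full = volterra2(a, eps, n, M, include_diagonal=True)
off = volterra2(a, eps, n, M, include_diagonal=False)
diag = sum(a(i, i) * eps[n - i] ** 2 for i in range(1, M + 1))
print(full, off, diag)
```

The diagonal part involves $\epsilon^2$, which has nonzero mean; this is why the diagonal terms need separate centering when studying (3).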

To obtain a non-central limit theorem for (3), we assume that the coefficient $a(\cdot)$ behaves asymptotically as a homogeneous function $g$ on $\mathbb{R}_+^k$ which is bounded excluding a neighborhood of the origin. We shall show that in this case, the limit of a normalized sum of centered $X(n)$ is a linear combination of Hermite-type processes of different orders. These Hermite-type processes that appear in the limit were first introduced in Mori and Oodaira [17], and were called generalized Hermite processes in Bai and Taqqu [2]. They live in Wiener chaos, and extend in a natural way the usual Hermite processes considered in the literature, e.g., Dobrushin and Major [10] and Taqqu [21].

The limit, which is a linear combination involving different orders of multiple Wiener-Itô integrals, can be re-expressed as a single centered multiple Wiener-Stratonovich integral with the zeroth-order term excluded. These integrals were introduced by Hu and Meyer [13]. Loosely speaking, in contrast to the usual Wiener-Itô integrals, the multiple Wiener-Stratonovich integrals include diagonals, and intuitively they are the continuous counterpart of the multiple sums in (3) which, as was noted, do include diagonals.

The paper is organized as follows. In Section 2, we introduce the generalized Hermite processes which appear in the formulation of the non-central limit theorem. In Section 3, we provide conditions for the polynomial form (3) to be well-defined in $L^2(\Omega)$. In Section 4, we introduce the class of long-memory Volterra processes of interest in the non-central limit theorem. In Section 5, we establish central limit theorems when $a(\cdot)$ in (3) decays fast enough so that $X(n)$ has short memory. In Section 6, we state a non-central limit theorem for processes in (3). Before launching into the article, the reader may want to have a look at this result, formulated as Theorem 6.2, and also at the illustrative Example 6.4. The connection between the limit and multiple Wiener-Stratonovich integrals is indicated in Section 7. Section 8 contains an extended hypercontractivity formula.

## 2 Generalized Hermite processes and kernels

We introduce here the kernels which will be used to define both the coefficient $a(\cdot)$ in (3), and the processes that will appear in the non-central limit.

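First, some notation which will be used throughout the paper. Let $\mathbb{Z}_+=\{1,2,\ldots\}$, $\mathbb{R}_+=(0,\infty)$, $\mathbf{x}=(x_1,\ldots,x_k)$, $\mathbf{i}=(i_1,\ldots,i_k)$, and let $\mathbf{1}$ denote the vector made of $1$'s. If , then , and . We write $\mathbf{x}>\mathbf{y}$ if $x_j>y_j$ for all $j$, and use the following standard notations: $\|\cdot\|$ denotes a norm in some suitable space, $\mathbf{1}_A$ is the indicator function of a set $A$, $|A|$ denotes the cardinality of a set $A$, and if $f$ and $g$ are two functions on $\mathbb{R}^p$ and $\mathbb{R}^q$ respectively, then $f\otimes g$ defines a scalar function on $\mathbb{R}^{p+q}$ as $(f\otimes g)(\mathbf{x},\mathbf{y})=f(\mathbf{x})g(\mathbf{y})$.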

The following class of functions was introduced in Bai and Taqqu [2]:

###### Definition 2.1.

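A generalized Hermite kernel (GHK) $g$ is a nonzero measurable function defined on $\mathbb{R}_+^k$ satisfying:

1. $g(\lambda\mathbf{x})=\lambda^{\alpha}g(\mathbf{x})$, $\forall\lambda>0$, $\alpha\in\left(-\frac{k+1}{2},-\frac{k}{2}\right)$;

2. $\int_{\mathbb{R}_+^k}|g(\mathbf{x})g(\mathbf{1}+\mathbf{x})|\,d\mathbf{x}<\infty$.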

###### Remark 2.2.

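As shown in Theorem 3.5 and Remark 3.6 in Bai and Taqqu [2], if $g$ is a GHK on $\mathbb{R}_+^k$, then for every $t>0$,

$$\int_0^t|g(s\mathbf{1}-\mathbf{y})|\,\mathbf{1}_{\{s\mathbf{1}>\mathbf{y}\}}\,ds<\infty$$

for a.e. $\mathbf{y}\in\mathbb{R}^k$. Furthermore,

$$h_t(\mathbf{y})=\int_0^tg(s\mathbf{1}-\mathbf{y})\,\mathbf{1}_{\{s\mathbf{1}>\mathbf{y}\}}\,ds$$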

is a.e. defined, and $h_t\in L^2(\mathbb{R}^k)$. In addition, if $g$ is nonzero, then $\|h_t\|_{L^2(\mathbb{R}^k)}>0$.

These functions were used in Bai and Taqqu [2] as defining kernels for a class of stochastic processes called generalized Hermite processes.

###### Definition 2.3.

The generalized Hermite processes are defined through the following multiple Wiener-Itô integrals:

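$$Z(t)=I_k(h_t):=\int'_{\mathbb{R}^k}\int_0^tg(s\mathbf{1}-\mathbf{x})\,\mathbf{1}_{\{s\mathbf{1}>\mathbf{x}\}}\,ds\;B(dx_1)\ldots B(dx_k), \tag{4}$$

where the prime ${}'$ indicates that one does not integrate on the diagonals $x_p=x_q$, $p\neq q$, $B(\cdot)$ is a Brownian random measure, and $g$ is a GHK defined in Definition 2.1.

The generalized Hermite processes are self-similar with Hurst exponent

$$H=\alpha+\frac{k}{2}+1\in\left(\frac12,1\right), \tag{5}$$

that is, $\{Z(\lambda t)\}_{t>0}$ has the same finite-dimensional distributions as $\{\lambda^HZ(t)\}_{t>0}$ for every $\lambda>0$, and they also have stationary increments.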

###### Example 2.4.

When $g$ takes the particular form $g(\mathbf{x})=\prod_{j=1}^kx_j^{\alpha/k}$ where $\alpha\in\left(-\frac{k+1}{2},-\frac{k}{2}\right)$, $Z(t)$ becomes the usual Hermite process obtained through a non-central limit theorem in the context of long memory (e.g., Taqqu [21], Dobrushin and Major [10], Surgailis [20]).

In Bai and Taqqu [2] the following subclass of functions $g$, called generalized Hermite kernels of Class (B), was considered.

###### Definition 2.5.

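We say that a nonzero homogeneous function $g$ on $\mathbb{R}_+^k$ having homogeneity exponent $\alpha$ is of Class (B) (abbreviated as “GHK(B)”, where “B” stands for “boundedness”), if

1. $g$ is a.e. continuous on $\mathbb{R}_+^k$;

2. $|g(\mathbf{x})|\le c\|\mathbf{x}\|^{\alpha}$ for some constant $c>0$, where $\alpha\in\left(-\frac{k+1}{2},-\frac{k}{2}\right)$ is as in Definition 2.1.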

###### Remark 2.6.

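The norm $\|\cdot\|$ in Definition 2.5 can be any norm in the finite-dimensional space $\mathbb{R}^k$ since all the norms are equivalent. For convenience, we choose throughout this paper $\|\mathbf{x}\|=\sum_{j=1}^k|x_j|$. The GHK(B) class is a subset of the GHK class, because if $g$ is a GHK(B), then it is homogeneous and hence satisfies Condition 1 of Definition 2.1. It also satisfies Condition 2 of Definition 2.1. Indeed, we have for some $C,C'>0$ that

$$|g(\mathbf{x})|\le C\|\mathbf{x}\|^{\alpha}=C\Big(\sum_{j=1}^kx_j\Big)^{\alpha}\le C'\prod_{j=1}^kx_j^{\alpha/k},\quad\mathbf{x}\in\mathbb{R}_+^k, \tag{6}$$

where the last inequality follows from the arithmetic-geometric mean inequality

$$k^{-1}\sum_{j=1}^ky_j\ge\Big(\prod_{j=1}^ky_j\Big)^{1/k}\quad\text{and}\quad\alpha<0.$$

In view of Condition 1 of Definition 2.1, since $-1<\alpha/k<-1/2$, we hence have

$$\int_{\mathbb{R}_+^k}|g(\mathbf{x})g(\mathbf{1}+\mathbf{x})|\,d\mathbf{x}\le C'\Big(\int_0^{\infty}x^{\alpha/k}(1+x)^{\alpha/k}\,dx\Big)^k<\infty.$$

###### Example 2.7.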

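As an example of a GHK(B), we can simply set $g$ equal to

$$g_1(\mathbf{x})=\|\mathbf{x}\|^{\alpha}=|x_1+\ldots+x_k|^{\alpha}=(x_1+\ldots+x_k)^{\alpha},\quad\alpha\in\left(-\frac{k+1}{2},-\frac{k}{2}\right),$$

since $\mathbf{x}\in\mathbb{R}_+^k$.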
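As a quick numerical sanity check of this example, the following Python sketch evaluates $g_1$ at random points of the positive orthant and compares it with the product bound $k^{\alpha}\prod_jx_j^{\alpha/k}$ obtained from the arithmetic-geometric mean inequality; it also checks homogeneity of degree $\alpha$. The dimension `k`, the exponent `alpha` and the sampling range are illustrative assumptions.

```python
import math
import random

def g1(x, alpha):
    """g_1(x) = (x_1 + ... + x_k)^alpha on the positive orthant."""
    return sum(x) ** alpha

def am_gm_bound(x, alpha):
    """k^alpha * prod_j x_j^(alpha/k); dominates g_1 when alpha < 0 by the AM-GM inequality."""
    k = len(x)
    return k ** alpha * math.prod(xj ** (alpha / k) for xj in x)

random.seed(0)
k, alpha = 3, -1.7                      # alpha lies in (-(k+1)/2, -k/2) = (-2, -1.5)
points = [[random.uniform(0.01, 10.0) for _ in range(k)] for _ in range(1000)]
worst_ratio = max(g1(x, alpha) / am_gm_bound(x, alpha) for x in points)
print(worst_ratio)                      # never exceeds 1 (up to rounding)
```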

###### Example 2.8.

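As another example, consider

$$g_2(\mathbf{x})=\frac{\prod_{j=1}^kx_j^{a_j}}{\sum_{j=1}^kx_j^{b}},\quad a_j>0,\ b>0,$$

and

$$\sum_{j=1}^ka_j-b\in\left(-\frac{k+1}{2},-\frac{k}{2}\right).$$

$g_2$ is continuous and homogeneous with exponent $\alpha=\sum_{j=1}^ka_j-b$. It is a GHK(B) because the functions $\prod_{j=1}^kx_j^{a_j}$ and $\big(\sum_{j=1}^kx_j^{b}\big)^{-1}$ are bounded on the $k$-dimensional unit sphere restricted to $\mathbb{R}_+^k$. For instance,

$$\Big(\sum_{j=1}^kx_j^{b}\Big)^{1/b}\le C\|\mathbf{x}\|$$

by the equivalence of norms on $\mathbb{R}^k$. Thus $|g_2(\mathbf{x})|\le c\|\mathbf{x}\|^{\alpha}$.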

###### Example 2.9.

It is easy to see that the set of GHK(B) functions on $\mathbb{R}_+^k$ with fixed homogeneity exponent $\alpha$ (with the zero function added) is closed under linear combinations and taking maximum or minimum. Thus one can consider linear combinations, maxima and minima of the $g_1$ and $g_2$ in the foregoing examples.

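In Bai and Taqqu [2], non-central limit theorems involving GHK(B) are established. These theorems involve sums of a long-memory stationary process called a discrete chaos process, defined as

$$X'(n)=\sum'_{\mathbf{i}\in\mathbb{Z}_+^k}a(i_1,\ldots,i_k)\,\epsilon_{n-i_1}\ldots\epsilon_{n-i_k}=\sum'_{\mathbf{i}\in\mathbb{Z}_+^k}a(\mathbf{i})\,\epsilon_{n-i_1}\ldots\epsilon_{n-i_k}, \tag{7}$$

where $a(\mathbf{i})=g(\mathbf{i})L(\mathbf{i})$, $g$ is a GHK(B), $L$ is some asymptotically negligible function (see (25) and the lines below), and the prime ${}'$ means that we do not sum on the diagonals $i_p=i_q$, $p\neq q$, i.e., the summation in (7) is only over unequal $i_1,\ldots,i_k$. We note that when $a(\cdot)$ is symmetric, the autocovariance of $X'(n)$ in (7) is

$$\gamma(n)=EX'(n)X'(0)=k!\sum'_{\mathbf{i}\in\mathbb{Z}_+^k}a(\mathbf{i})a(\mathbf{i}+n\mathbf{1}),\quad n\ge0.$$

###### Remark 2.10.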

The difference between the discrete chaos process defined in (7) and the Volterra process in (3) is the exclusion of the diagonals.

## 3 $L^2(\Omega)$-definiteness

In this section, we derive conditions under which a $k$-th order polynomial form with diagonals is well-defined.

The $k$-th order Volterra process $X(n)$ in (3) is a polynomial form in i.i.d. random variables $\{\epsilon_i\}$. To allow for long memory and obtain non-central limit theorems, the coefficient $a(\mathbf{i})$ in (3) must be nonzero at an infinite number of $\mathbf{i}$. Otherwise $X(n)$ is an $m$-dependent sequence and thus subject to the central limit theorem (Billingsley [4]). So the first problem is to ensure that such a polynomial form with an infinite number of terms is well-defined, that is, to determine when the following random variable is well-defined:

$$X=\sum_{0<i_1,\ldots,i_k<\infty}a(i_1,\ldots,i_k)\,\epsilon_{i_1}\ldots\epsilon_{i_k}, \tag{8}$$

where $\{\epsilon_i\}$ is an i.i.d. sequence such that

$$E\epsilon_i=0,\quad E\epsilon_i^2=1,\quad E|\epsilon_i|^k<\infty. \tag{9}$$

One can restrict $a(\cdot)$ to be a symmetric function in $\mathbf{i}$, since a permutation of the variables does not affect $X$, but we shall not do so unless indicated, because it is easier to write down non-symmetric $a(\cdot)$'s.

First, we have the following straightforward criterion for the $L^1(\Omega)$-well-definedness of $X$:

###### Proposition 3.1.

If $\sum_{0<i_1,\ldots,i_k<\infty}|a(i_1,\ldots,i_k)|<\infty$, then $X$ in (8) is well-defined in the $L^1(\Omega)$-sense.

###### Proof.

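Let

$$X_m=\sum_{0<i_1,\ldots,i_k\le m}a(i_1,\ldots,i_k)\,\epsilon_{i_1}\ldots\epsilon_{i_k},\quad m>0.$$

It suffices to check that $\{X_m\}$ is a Cauchy sequence in $L^1(\Omega)$. This is true since for any $m<n$,

$$E|X_m-X_n|\le\sum_{\substack{0<i_1,\ldots,i_k\le n\\ \max_ji_j>m}}|a(i_1,\ldots,i_k)|\,E|\epsilon_{i_1}\ldots\epsilon_{i_k}|\to0\quad\text{as }m,n\to\infty,$$

where $E|\epsilon_{i_1}\ldots\epsilon_{i_k}|$ is bounded above by a constant because of the assumption $E|\epsilon_i|^k<\infty$ in (9). ∎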

The absolute summability assumption in Proposition 3.1 is easy to work with, but it is unfortunately too restrictive for incorporating long memory. We will introduce instead a condition on $a(\cdot)$ so that $X$ is well-defined in the $L^2(\Omega)$-sense. Besides the obvious assumption $E|\epsilon_i|^{2k}<\infty$, some delicate assumptions on $a(\cdot)$ need to be imposed, which are stated in Proposition 3.3 below. We first give an outline of the idea. If $X$ in (8) is instead defined as an off-diagonal polynomial form:

$$X'=\sum'_{\mathbf{i}\in\mathbb{Z}_+^k}a(\mathbf{i})\,\epsilon_{i_1}\ldots\epsilon_{i_k}, \tag{10}$$

then due to the off-diagonality, it is easy to see that the $L^2(\Omega)$-well-definedness of $X'$ is guaranteed by the simple square-summability condition:

$$\sum'_{\mathbf{i}\in\mathbb{Z}_+^k}a(\mathbf{i})^2<\infty,$$

which equals $E(X')^2/k!$ if $a(\cdot)$ is symmetric. In fact, this $L^2(\Omega)$-definiteness criterion still holds if one has more generally

$$X'=\sum'_{\mathbf{i}\in\mathbb{Z}_+^k}a(\mathbf{i})\,\epsilon^{(1)}_{i_1}\ldots\epsilon^{(k)}_{i_k}, \tag{11}$$

where $\{\boldsymbol{\epsilon}_i=(\epsilon^{(1)}_i,\ldots,\epsilon^{(k)}_i)\}$ forms an i.i.d. sequence of $k$-dimensional vectors with mean $\mathbf{0}$ and finite variance in each component. We will need this fact below.

In order to check that the polynomial-form in (8), which includes diagonals, is well-defined, we shall decompose it into a finite number of off-diagonal polynomial forms, and check the well-definedness of each using the simple square-summability condition. In order to do this, we introduce some further notation, which will also be useful in the sequel.

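We let $\mathcal{P}_k$ denote all the partitions of $\{1,\ldots,k\}$. If $\pi\in\mathcal{P}_k$, then $|\pi|$ denotes the number of sets in the partition. If we have a variable $\mathbf{i}$, then $\mathbf{i}_\pi$ denotes a new variable where its components are identified according to $\pi$. For example, if $k=3$, and $\pi=\{\{1,2\},\{3\}\}$, then $\mathbf{i}_\pi=(i_1,i_1,i_2)$. In this case we write $\pi=\{P_1,P_2\}$ where $P_1=\{1,2\}$ and $P_2=\{3\}$. If $a(\cdot)$ is a function on $\mathbb{Z}_+^k$, then $a_\pi(i_1,\ldots,i_m):=a(\mathbf{i}_\pi)$, where $m=|\pi|$. In the preceding example, $a_\pi(i_1,i_2)=a(i_1,i_1,i_2)$ with $m=2$.

Suppose that $\pi=\{P_1,\ldots,P_m\}$, where $p_t=|P_t|$, $t=1,\ldots,m$. We suppose throughout that the $P_t$'s are ordered according to their smallest element. In the preceding example, $p_1=2$ and $p_2=1$. We define the following summation operation on a function $a_\pi$ on $\mathbb{Z}_+^m$.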

###### Definition 3.2.

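For any $T\subset\{1,\ldots,m\}$, the summation $S'_Ta_\pi$ is obtained by summing $a_\pi$ over its variables indicated by $T$ off-diagonally, yielding a function with $m-|T|$ variables.

For instance, if $\pi=\{\{1,5\},\{2\},\{3,4\}\}$, then $a_\pi(i_1,i_2,i_3)=a(i_1,i_2,i_3,i_3,i_1)$, and if $T=\{1,3\}$, then

$$(S'_Ta_\pi)(i)=\sum'_{i_1,i_3}a(i_1,i,i_3,i_3,i_1), \tag{12}$$

provided that it is well-defined. Note that in this off-diagonal sum, we require, in addition to $i_1\neq i_3$, that neither $i_1$ nor $i_3$ equals $i$. If $T=\emptyset$, $S'_T$ is understood to be the identity operator, where no summation is performed.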
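The identification of variables along a partition and the off-diagonal summation can be sketched in a few lines of Python. The code below is an illustration only: the sums are truncated to the range $1,\ldots,N$, the helper names `identify` and `off_diag_sum` are hypothetical, and the coefficient `a` is an arbitrary non-symmetric test function. It reproduces the pattern $a(i_1,i,i_3,i_3,i_1)$ for the partition $\{\{1,5\},\{2\},\{3,4\}\}$ with the first and third blocks summed out.

```python
import itertools

def identify(i_small, blocks, k):
    """Build the k-vector i_pi: every position in block P_t receives the t-th variable."""
    i_pi = [None] * k
    for t, block in enumerate(blocks):
        for q in block:
            i_pi[q - 1] = i_small[t]
    return tuple(i_pi)

def off_diag_sum(a, blocks, k, T, fixed, N):
    """Truncated (S'_T a_pi)(fixed): sum over the block variables in T, range 1..N,
    keeping all m block variables (summed and fixed) pairwise distinct.
    blocks are ordered by smallest element; T and the keys of fixed are 1-based labels."""
    T = sorted(T)
    m = len(blocks)
    total = 0.0
    for vals in itertools.product(range(1, N + 1), repeat=len(T)):
        assign = dict(fixed)
        assign.update(zip(T, vals))
        i_small = [assign[t] for t in range(1, m + 1)]
        if len(set(i_small)) < m:          # skip diagonals
            continue
        total += a(*identify(i_small, blocks, k))
    return total

blocks = [{1, 5}, {2}, {3, 4}]
a = lambda *i: 1.0 / (1 + sum(q * v for q, v in enumerate(i, start=1)))  # hypothetical
value = off_diag_sum(a, blocks, k=5, T={1, 3}, fixed={2: 2}, N=6)
print(value)
```

With `T` empty, the loop runs once and simply evaluates $a_\pi$ at the fixed variables, matching the convention that no summation is performed.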

We also need Appell polynomials, which we briefly introduce here. For more details, see, e.g., Avram and Taqqu [1] or Chapter 3.3 of Beran et al. [3]. Given a random variable $\epsilon$ with $E|\epsilon|^K<\infty$, the $p$-th order Appell polynomial $A_p(x)$ with respect to the law of $\epsilon$ is defined through the following recursive relation:

$$\frac{d}{dx}A_p(x)=pA_{p-1}(x),\quad EA_p(\epsilon)=0,\quad A_0(x)=1,\quad p=1,\ldots,K.$$

For example, if $\mu_p=E\epsilon^p$, then $A_1(x)=x-\mu_1$, $A_2(x)=x^2-2\mu_1x+2\mu_1^2-\mu_2$, etc. If in addition $\mu_1=0$, then $A_1(x)=x$, and $A_2(x)=x^2-\mu_2$. For consistency, one sets $\mu_0=1$. We will use an important property of Appell polynomials, namely, for any integer $p\ge0$,

$$x^p=\sum_{j=0}^p\binom{p}{j}\mu_{p-j}A_j(x). \tag{13}$$
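The recursion for Appell polynomials and the expansion of $x^p$ in terms of them can be verified with exact rational arithmetic. The sketch below assumes the definition $A_p'=pA_{p-1}$, $EA_p(\epsilon)=0$, $A_0=1$, and uses a hypothetical two-point law ($\epsilon=2$ with probability $1/5$, $\epsilon=-1/2$ with probability $4/5$), which has mean $0$ and variance $1$; polynomials are stored as coefficient lists, lowest degree first.

```python
from fractions import Fraction
from math import comb

def appell_polys(mu, K):
    """Appell polynomials A_0..A_K for moments mu[j] = E eps^j (mu[0] = 1)."""
    A = [[Fraction(1)]]
    for p in range(1, K + 1):
        prev = A[-1]
        # integrate p * A_{p-1}; the constant term is fixed afterwards
        coeffs = [Fraction(0)] + [Fraction(p) * c / (r + 1) for r, c in enumerate(prev)]
        # choose the constant so that E A_p(eps) = 0
        coeffs[0] = -sum(c * mu[r] for r, c in enumerate(coeffs) if r >= 1)
        A.append(coeffs)
    return A

def expansion_holds(A, mu, p):
    """Check x^p = sum_j C(p, j) mu[p-j] A_j(x) coefficient by coefficient."""
    rhs = [Fraction(0)] * (p + 1)
    for j in range(p + 1):
        w = comb(p, j) * mu[p - j]
        for r, c in enumerate(A[j]):
            rhs[r] += w * c
    return rhs == [Fraction(0)] * p + [Fraction(1)]

K = 5
# hypothetical law: eps = 2 w.p. 1/5 and eps = -1/2 w.p. 4/5
mu = [Fraction(1, 5) * Fraction(2) ** j + Fraction(4, 5) * Fraction(-1, 2) ** j for j in range(K + 1)]
A = appell_polys(mu, K)
print(A[2], all(expansion_holds(A, mu, p) for p in range(K + 1)))
```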
###### Proposition 3.3.

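The polynomial form $X$ in (8) is a random variable defined in the $L^2(\Omega)$-sense, if the following three conditions hold:

1. $E|\epsilon_i|^{2k}<\infty$;

2. $a(\cdot)$ satisfies the following: for any $\pi\in\mathcal{P}_k$ with $m=|\pi|$, we have

$$\sum'_{0<i_1,\ldots,i_m<\infty}a_\pi(i_1,\ldots,i_m)^2<\infty; \tag{14}$$

3. for any $\pi\in\mathcal{P}_k$ and any nonempty $T\subset\{1,\ldots,m\}$ satisfying $|P_t|\ge2$ for all $t\in T$, we have

$$\sum'_{0<i_s<\infty,\ s\notin T}\big[(S'_Ta_\pi)(i_s,\ s\notin T)\big]^2<\infty, \tag{15}$$

where if $T=\{1,\ldots,m\}$, (15) is understood as merely stating that the sum $S'_Ta_\pi$ converges.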

###### Remark 3.4.

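To understand the need for (14) and (15), note that, in order to use the $L^2(\Omega)$-definiteness of (11), it is necessary to center the powers of $\epsilon_i$. For example, consider

$$X=\sum_{i_1,i_2,i_3>0}a(i_1,i_2,i_3)\,\epsilon_{i_1}\epsilon_{i_2}\epsilon_{i_3}.$$

If we focus on the subset $\{i_1=i_2\neq i_3\}$, then, after relabeling $i_3$ as $i_2$, we have

$$\begin{aligned}
\sum'_{i_1,i_2>0}a(i_1,i_1,i_2)\,\epsilon_{i_1}^2\epsilon_{i_2}
&=\sum'_{i_1,i_2>0}a(i_1,i_1,i_2)(\epsilon_{i_1}^2-\mu_2)\epsilon_{i_2}+\mu_2\sum_{i_2>0}\sum_{i_1\neq i_2,\,i_1>0}a(i_1,i_1,i_2)\,\epsilon_{i_2}\\
&=\sum'_{i_1,i_2>0}a(i_1,i_1,i_2)A_2(\epsilon_{i_1})A_1(\epsilon_{i_2})+\mu_2\sum_{i_2>0}\sum_{i_1\neq i_2,\,i_1>0}a(i_1,i_1,i_2)A_1(\epsilon_{i_2}),
\end{aligned}$$

where $\mu_2=E\epsilon_i^2$. For the preceding two terms to be well-defined in $L^2(\Omega)$, we require respectively

$$\sum'_{i_1,i_2>0}a(i_1,i_1,i_2)^2<\infty$$

and

$$\sum_{i_2>0}\Big(\sum_{i_1\neq i_2,\,i_1>0}a(i_1,i_1,i_2)\Big)^2=\sum_{i_2>0}\big[(S'_Ta_\pi)(i_2)\big]^2<\infty,\quad\pi=\{\{1,2\},\{3\}\},\ T=\{1\}.$$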

An example of $a(\cdot)$ satisfying (14) but not (15) is given by:

$$a(i_1,i_2)=(i_1+i_2)^{-1}(\log i_2)^{-1}.$$

Note that $a(i_1,i_2)^2$ is summable because $\sum_ii^{-1}(\log i)^{-2}$ is finite by the integral test, while $a(i,i)=(2i\log i)^{-1}$ is not summable.
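The contrast in this example can be seen numerically. In the sketch below, which assumes indices start at $i=2$ so that $\log i>0$, the first series $\sum_i1/(i\log^2i)$ dominates the off-diagonal square sum of $a(i_1,i_2)=(i_1+i_2)^{-1}(\log i_2)^{-1}$ (since $\sum_{i_1}(i_1+i_2)^{-2}\le1/i_2$), while the second series collects the diagonal values $a(i,i)=1/(2i\log i)$. The function name and the cutoff `N` are illustrative choices.

```python
import math

def partial_sums(N):
    """Partial sums up to N of 1/(i log^2 i) and of the diagonal values 1/(2 i log i)."""
    s_sq = 0.0
    s_diag = 0.0
    for i in range(2, N + 1):
        s_sq += 1.0 / (i * math.log(i) ** 2)
        s_diag += 1.0 / (2 * i * math.log(i))
    return s_sq, s_diag

N = 10 ** 5
s_sq, s_diag = partial_sums(N)
# integral-test bounds: the first sum stays bounded, the second grows like (1/2) log log N
upper = 1.0 / (2 * math.log(2) ** 2) + 1.0 / math.log(2)
lower = 0.5 * (math.log(math.log(N + 1)) - math.log(math.log(2)))
print(s_sq, upper, s_diag, lower)
```

The divergence of the diagonal sum is very slow (of order $\log\log N$), so it is the integral-test bounds rather than the raw partial sums that make the distinction visible.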

###### Proof of Proposition 3.3.

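By collecting various diagonal cases, we express $X$ as

$$X=\sum_{\pi\in\mathcal{P}_k}\sum'_{0<i_1,\ldots,i_m<\infty}a_\pi(i_1,\ldots,i_m)\,\epsilon_{i_1}^{p_1}\ldots\epsilon_{i_m}^{p_m},$$

where $\pi=\{P_1,\ldots,P_m\}$, $m=|\pi|$, $p_t=|P_t|$, $t=1,\ldots,m$, $p_1+\ldots+p_m=k$. Since $\mathcal{P}_k$ is finite, one can focus on the $L^2(\Omega)$-definedness of each term

$$X_\pi:=\sum'_{0<i_1,\ldots,i_m<\infty}a_\pi(i_1,\ldots,i_m)\,\epsilon_{i_1}^{p_1}\ldots\epsilon_{i_m}^{p_m}.$$

Let $A_p(\cdot)$ be the $p$-th order Appell polynomial with respect to the law of $\epsilon_i$. Let

$$\mu_j=E\epsilon_i^j.$$

Then by (13),

$$\epsilon_{i_1}^{p_1}\ldots\epsilon_{i_m}^{p_m}=\sum_{j_1=0}^{p_1}\ldots\sum_{j_m=0}^{p_m}\binom{p_1}{j_1}\ldots\binom{p_m}{j_m}\mu_{p_1-j_1}\ldots\mu_{p_m-j_m}A_{j_1}(\epsilon_{i_1})\ldots A_{j_m}(\epsilon_{i_m}).$$

Thus to ensure $X_\pi\in L^2(\Omega)$, it suffices to show that

$$X_\pi^{\mathbf{j}}:=\binom{p_1}{j_1}\ldots\binom{p_m}{j_m}\mu_{p_1-j_1}\ldots\mu_{p_m-j_m}\sum'_{0<i_1,\ldots,i_m<\infty}a_\pi(i_1,\ldots,i_m)A_{j_1}(\epsilon_{i_1})\ldots A_{j_m}(\epsilon_{i_m}) \tag{17}$$

is well-defined in $L^2(\Omega)$ for any $\mathbf{j}=(j_1,\ldots,j_m)$ with $0\le j_t\le p_t$.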

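Note now the following crucial fact. Since $\mu_1=0$ by assumption, we do not need to consider $p_t-j_t=1$ in (17). Thus:

$$\text{If }j_t=0,\text{ then we need to consider only }p_t=|P_t|\ge2. \tag{18}$$

Suppose first that $j_1,\ldots,j_m\ge1$. Since $E|\epsilon_i|^{2k}<\infty$ by assumption and $EA_j(\epsilon_i)=0$ for $j\ge1$, then in view of the discussion concerning (11), it is sufficient to require (14). Now suppose that some $j_t=0$, and observe that $A_{j_t}(\epsilon_{i_t})$ is then the constant $1$. Thus if $T$ is the set of $t$'s such that $j_t=0$, then

$$X_\pi^{\mathbf{j}}=c(\mathbf{p},\mathbf{j})\sum'_{0<i_s<\infty,\ s\notin T}(S'_Ta_\pi)(i_s,\ s\notin T)\prod_{s\notin T}A_{j_s}(\epsilon_{i_s}), \tag{19}$$

where $\mathbf{p}=(p_1,\ldots,p_m)$, $\mathbf{j}=(j_1,\ldots,j_m)$, and

$$c(\mathbf{p},\mathbf{j})=\binom{p_1}{j_1}\ldots\binom{p_m}{j_m}\mu_{p_1-j_1}\ldots\mu_{p_m-j_m}. \tag{20}$$

So one can bound $E(X_\pi^{\mathbf{j}})^2$ by a constant times the sum in (15) since (19) has the form (11).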

###### Remark 3.5.

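Since $EA_j(\epsilon_i)=0$ for $j\ge1$, one can see from (17) that $EX_\pi^{\mathbf{j}}\neq0$ only when $\mathbf{j}=\mathbf{0}$, which implies

$$EX=\sum_{\pi\in\mathcal{P}_k}\sum'_{\mathbf{i}\in\mathbb{Z}_+^m}a_\pi(\mathbf{i})\,\mu_{p_1}\ldots\mu_{p_m}. \tag{21}$$

Relation (15) with $T=\{1,\ldots,m\}$ ensures that $|EX|<\infty$.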

We now state here a practical sufficient condition for Proposition 3.3:

###### Proposition 3.6.

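Let $a(\cdot)$ be a function on $\mathbb{Z}_+^k$ such that

$$|a(\mathbf{i})|\le c\prod_{j=1}^ki_j^{\gamma_j},$$

where $c>0$ is some constant and $\gamma_j<-1/2$, $j=1,\ldots,k$. Then

$$X:=\sum_{\mathbf{i}\in\mathbb{Z}_+^k}a(\mathbf{i})\,\epsilon_{i_1}\ldots\epsilon_{i_k}$$

is a well-defined random variable in $L^2(\Omega)$, where $\{\epsilon_i\}$ is i.i.d. with mean $0$ and variance $1$ and $E|\epsilon_i|^{2k}<\infty$.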

###### Proof.

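We set $\pi=\{P_1,\ldots,P_m\}\in\mathcal{P}_k$. Relation (14) holds because

$$|a_\pi(\mathbf{i})|\le c_1\prod_{j=1}^mi_j^{\beta_j} \tag{22}$$

for some $c_1>0$, where $\beta_j=\sum_{t\in P_j}\gamma_t<-1/2$, so

$$\sum'_{0<i_1,\ldots,i_m<\infty}a_\pi(i_1,\ldots,i_m)^2\le c_1^2\prod_{j=1}^m\sum_{i_j=1}^{\infty}i_j^{2\beta_j}<\infty.$$

To check (15), note that when $t\in T$, we have $|P_t|\ge2$ by (18), and so we have in addition $\beta_t<-1$ in (22). Thus for some finite $c_2,c_3>0$,

$$\big[S'_T|a_\pi|(\mathbf{i})\big]^2\le c_2\Big[\sum_{i_t,\,t\in T}\Big(\prod_{t\in T}i_t^{\beta_t}\Big)\Big]^2\Big(\prod_{s\notin T}i_s^{\beta_s}\Big)^2=c_3\Big(\prod_{s\notin T}i_s^{2\beta_s}\Big),$$

where the summation in the middle is finite, and hence

$$\sum_{0<i_s<\infty,\ s\notin T}\big[S'_T|a_\pi|(\mathbf{i})\big]^2\le c_3\prod_{s\notin T}\sum_{i_s=1}^{\infty}i_s^{2\beta_s}<\infty.$$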

## 4 Volterra processes with long memory

We introduce in this section the $k$-th order Volterra processes for which we establish non-central limit theorems in Section 6.

### 4.1 The off-diagonal process

We first introduce for convenience the following $k$-th order discrete chaos process with different noises:

$$X'(n):=\sum'_{\mathbf{i}\in\mathbb{Z}_+^k}a(\mathbf{i})\,\epsilon^{(1)}_{n-i_1}\ldots\epsilon^{(k)}_{n-i_k}, \tag{23}$$

where $\{\boldsymbol{\epsilon}_i=(\epsilon^{(1)}_i,\ldots,\epsilon^{(k)}_i)\}$ is an i.i.d. sequence of vectors, where $E\epsilon^{(j)}_i=0$ and $E(\epsilon^{(j)}_i)^2<\infty$, $j=1,\ldots,k$. This is just an extension of (7) adapted to (11). For such $X'(n)$, it is easy to show that the autocovariance satisfies

 |γ(n)|≤ck!′∑i∈Zk+˜|a|(i)˜|a|(i+