A functional limit theorem for coin tossing Markov chains

Stefan Ankirchner, Institute of Mathematics, University of Jena, Ernst-Abbe-Platz 2, 07745 Jena, Germany. Email: s.ankirchner@uni-jena.de, Phone: +49 (0)3641 946275.

Thomas Kruse, Faculty of Mathematics, University of Duisburg-Essen, Thea-Leymann-Str. 9, 45127 Essen, Germany. Email: thomas.kruse@uni-due.de, Phone: +49 (0)201 1837444.

Mikhail Urusov, Faculty of Mathematics, University of Duisburg-Essen, Thea-Leymann-Str. 9, 45127 Essen, Germany. Email: mikhail.urusov@uni-due.de, Phone: +49 (0)201 1837428.

Abstract

We prove a functional limit theorem for Markov chains that, in each step, move up or down by a possibly state dependent constant with probability $1/2$, respectively. The theorem entails that the law of every one-dimensional regular continuous strong Markov process in natural scale can be approximated arbitrarily well by such Markov chains. The functional limit theorem applies, in particular, to Markov processes that cannot be characterized as solutions to stochastic differential equations. Our results make it possible to approximate such processes with irregular behavior in practice; we illustrate this with Markov processes exhibiting sticky features, e.g., sticky Brownian motion and a Brownian motion slowed down on the Cantor set.

Keywords: one-dimensional Markov process; speed measure; Markov chain approximation; functional limit theorem; sticky Brownian motion; slow reflection; Brownian motion slowed down on the Cantor set.

2010 MSC: Primary: 60F17; 60J25; 60J60. Secondary: 60H35; 60J22.

Introduction

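Let $(\xi_k)_{k\in\mathbb{N}}$ be an iid sequence of random variables, on a probability space with a measure $P$, satisfying $P(\xi_k=\pm1)=\frac12$. Given $y\in\mathbb{R}$, $h\in(0,\infty)$ and a function $a_h\colon\mathbb{R}\to\mathbb{R}$, we denote by $(X^h_{kh})_{k\in\mathbb{N}_0}$ the Markov chain defined by

\[ X^h_0=y \quad\text{and}\quad X^h_{(k+1)h}=X^h_{kh}+a_h(X^h_{kh})\,\xi_{k+1}, \quad\text{for } k\in\mathbb{N}_0. \tag{1} \]

We choose as the Markov chain's index set the set of non-negative multiples of $h$ because we interpret $h$ as the length of a time step. We extend $(X^h_{kh})_{k\in\mathbb{N}_0}$ to a continuous-time process by linear interpolation, i.e., we set

\[ X^h_t=X^h_{\lfloor t/h\rfloor h}+\bigl(t/h-\lfloor t/h\rfloor\bigr)\bigl(X^h_{(\lfloor t/h\rfloor+1)h}-X^h_{\lfloor t/h\rfloor h}\bigr), \quad t\in[0,\infty). \tag{2} \]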

Let $\bar h\in(0,\infty)$ and let $(a_h)_{h\in(0,\bar h)}$ be a family of real functions and $(X^h)_{h\in(0,\bar h)}$ the associated family of extended Markov chains defined as in (2). A fundamental problem of probability theory is to find conditions on $(a_h)_{h\in(0,\bar h)}$ such that the laws of the processes $X^h$, $h\in(0,\bar h)$, converge in some sense as $h\to0$. In this article we provide an asymptotic condition on the family $(a_h)_{h\in(0,\bar h)}$ guaranteeing that the laws of the processes $X^h$, $h\in(0,\bar h)$, converge as $h\to0$ to the law of a one-dimensional regular continuous strong Markov process (in the sense of Section VII.3 in [18] or Section V.7 in [19]). In what follows we use the term general diffusions for the latter class of processes. Recall that a general diffusion $Y$ has a state space that is an open, half-open or closed interval $I\subseteq\mathbb{R}$. We denote by $I^\circ=(l,r)$ the interior of $I$, where $-\infty\le l<r\le\infty$. Moreover, the law of any general diffusion is uniquely characterized by its speed measure $m$, its scale function and its boundary behavior. Throughout the introduction we assume that $Y$ is in natural scale and that every accessible boundary point is absorbing (see the beginning of Section 1 and Section 6 on how to incorporate diffusions in general scale and with reflecting boundary points). This setting covers, in particular, solutions of driftless SDEs with discontinuous and fast growing diffusion coefficient (see Section 2) and also diffusions with “sticky” behavior that cannot be modeled by SDEs (see Section 7).

Our main result, Theorem 1.1, shows that if a family of functions $(a_h)_{h\in(0,\bar h)}$ satisfies, for all $h\in(0,\bar h)$ and $y\in I^\circ$, the equation

\[ \frac12\int_{(y-a_h(y),\,y+a_h(y))}\bigl(a_h(y)-|u-y|\bigr)\,m(du)=h \tag{3} \]

with a precision of order $o(h)$ uniformly in $y$ over compact subsets of $I^\circ$ (see Condition (A) below for a precise statement), then the associated family $(X^h)_{h\in(0,\bar h)}$ converges in distribution, as $h\to0$, to the general diffusion $Y$ with speed measure $m$. We show that for every general diffusion $Y$ a family of functions $(a_h)_{h\in(0,\bar h)}$ satisfying (3) exists, which implies that every general diffusion can be approximated by a Markov chain of the form (1). Equation (3) dictates how to compute the functions $(a_h)_{h\in(0,\bar h)}$ and therefore paves the way to approximate the distribution of a general diffusion numerically (see, e.g., Section 8).

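The central idea in the derivation of Equation (3) is to embed, for every $h\in(0,\bar h)$, the Markov chain $(X^h_{kh})_{k\in\mathbb{N}_0}$ into $Y$ with a sequence of stopping times. To explain this idea assume for the moment that the state space is $I=\mathbb{R}$. For every $h\in(0,\bar h)$ let $\tau^h_0=0$ and then recursively define $\tau^h_{k+1}$ as the first time $Y$ exits the interval $(Y_{\tau^h_k}-a_h(Y_{\tau^h_k}),\,Y_{\tau^h_k}+a_h(Y_{\tau^h_k}))$ after $\tau^h_k$. It follows that the discrete-time process $(Y_{\tau^h_k})_{k\in\mathbb{N}_0}$ has the same law as the Markov chain $(X^h_{kh})_{k\in\mathbb{N}_0}$. Instead of directly controlling the spatial errors $|Y_{\tau^h_k}-Y_{kh}|$, we first analyze the temporal errors $|\tau^h_k-kh|$, $k\in\mathbb{N}_0$. We show that for every $y\in\mathbb{R}$, $a\in[0,\infty)$ the expected time it takes $Y$ started in $y$ to leave the interval $(y-a,y+a)$ is equal to $\frac12\int_{(y-a,y+a)}(a-|u-y|)\,m(du)$. In particular, if $a_h$ satisfies (3) for all $y\in\mathbb{R}$, it follows that for all $k\in\mathbb{N}_0$ the time lag $\tau^h_{k+1}-\tau^h_k$ between two consecutive stopping times is in expectation equal to $h$. In this case we refer to $(X^h_{kh})_{k\in\mathbb{N}_0}$ as Embeddable Markov Chain with Expected time Lag $h$ (in short, $(X^h_{kh})_{k\in\mathbb{N}_0}\in\mathrm{EMCEL}(h)$).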

For some diffusions one can construct EMCEL approximations explicitly (see, e.g., Section 7). For cases where (3) cannot be solved in closed form, we perform a perturbation analysis and show that it suffices to find for all $h\in(0,\bar h)$, $y\in I^\circ$ a number $a_h(y)$ satisfying (3) with an error of order $o(h)$ uniformly in $y$ belonging to compact subsets of $I^\circ$. We prove that for the associated stopping times $(\tau^h_k)_{k\in\mathbb{N}_0}$ the temporal errors $|\tau^h_k-kh|$ converge to $0$ as $h\to0$ in every $L^\alpha$-space, $\alpha\in[1,\infty)$. This ultimately implies convergence of $X^h$ to $Y$ in distribution as $h\to0$.

To illustrate the benefit of the perturbation analysis, we construct in Section 8 approximations for a Brownian motion slowed down on the Cantor set (see Figure 3). Moreover, we note that our main result, Theorem 1.1, is not only applicable to perturbations of the EMCEL approximation but can also be used to derive new convergence results for other approximation methods such as, e.g., weak Euler schemes (see Corollary 2.3).

The idea to use embeddings in order to prove a functional limit theorem goes back to Skorokhod. In the seminal book [21] scaled random walks are embedded into Brownian motion in order to prove Donsker’s invariance principle. In [4] we embed Markov chains into the solution process of an SDE and prove a functional limit theorem where the limiting law is that of the SDE. In [21] and [4] the approximating Markov chains have to be embeddable with a sequence of stopping times such that the expected distance between two consecutive stopping times is exactly equal to $h$, the time discretization parameter. In contrast, in the present article we require that the expected distance between consecutive embedding stopping times is only approximately equal to $h$. We show that for the convergence of the laws it is sufficient to require that the difference between the expected distance and $h$ is of the order $o(h)$. Moreover, compared to [4], we allow for a larger class of limiting distributions. Indeed, our setting includes processes that cannot be characterized as the solution of an SDE, e.g. diffusions with sticky points.

There are further articles in the literature using random time grids to approximate a Markov process, under the additional assumption that it solves a one-dimensional SDE. In [8] the authors first fix a finite grid in the state space of the diffusion. Then they construct a Bernoulli random walk on this grid that can be embedded into the diffusion. The authors determine the expected time for attaining one of the neighboring points by solving a PDE.

[17] describes a similar approximation method for the Cox-Ingersoll-Ross (CIR) process. Also here the authors first fix a grid on and then construct a random walk on the grid that can be embedded into the CIR process. In contrast to [8], the authors in [17] compute the distributions of the embedding stopping times (and not only their expected value) by solving a parabolic PDE. In the numerical implementation of the scheme the authors then draw the random time increments from these distributions and thereby obtain a scheme that is exact along a sequence of stopping times. Note that in contrast to [8] and [17], in our approach the space grid is not fixed a priori. Instead, we approximately fix the expected time lag between the consecutive embedding stopping times.

A further scheme that uses a random time partition to approximate a diffusion $Y$ with discontinuous coefficients is suggested in [16]. In contrast to our approach the distribution of the time increments is fixed there. More precisely, the authors of [16] use the fact that the distribution of $Y$ sampled at an independent exponential random time is given by the resolvent of the process. Consequently, if it is possible to generate random variables distributed according to the resolvent kernel, one obtains an exact simulation of $Y$ at an exponentially distributed stopping time. Iterating this procedure and letting the parameter of the exponential distribution go to infinity provides an approximation of $Y$.

The article is organized as follows. In Section 1 we rigorously formulate and discuss the functional limit theorem. In Section 2 we discuss some of its implications for diffusions that can be described as solution of SDEs. In Sections 3 and 4 we explain, for a given general diffusion, how to embed an approximating coin tossing Markov chain into the diffusion and prove some properties of the embedding stopping times. Section 5 provides the proof of the functional limit theorem, where we, in particular, need the material discussed in Sections 3 and 4. The functional limit theorem is shown under the additional assumption that if a boundary point is attainable, then it is absorbing. In Section 6 we explain how one can extend the functional limit theorem to general diffusions with reflecting boundary points. In the last two sections we illustrate our main result with diffusions exhibiting some stickiness. In Section 7 we construct coin tossing Markov chains approximating sticky Brownian motion, with and without reflection, respectively. In Section 8 we first describe a Brownian motion that is slowed down on the Cantor set, and secondly we explicitly construct coin tossing Markov chains that approximate this process arbitrarily well.

1 Approximating general diffusions with Markov chains

Let $Y=(Y_t)_{t\in[0,\infty)}$ be a one-dimensional continuous strong Markov process in the sense of Section VII.3 in [18]. We refer to this class of processes as general diffusions in the sequel. We assume that the state space is an open, half-open or closed interval $I\subseteq\mathbb{R}$. We denote by $I^\circ=(l,r)$ the interior of $I$, where $-\infty\le l<r\le\infty$, and we set $\bar I=[l,r]$. Recall that by the definition we have $P_y[Y_0=y]=1$ for all $y\in I$. We further assume that $Y$ is regular. This means that for every $y\in I^\circ$ and $x\in I$ we have that $P_y[H_x(Y)<\infty]>0$, where $H_x(Y)=\inf\{t\ge0:Y_t=x\}$. If there is no ambiguity, we simply write $H_x$ in place of $H_x(Y)$. Moreover, for $a<b$ in $\bar I$ we denote by $H_{a,b}$ the first exit time of $Y$ from $(a,b)$, i.e. $H_{a,b}=H_a\wedge H_b$. Without loss of generality we suppose that the diffusion $Y$ is in natural scale. If $Y$ is not in natural scale, then there exists a strictly increasing continuous function $s\colon I\to\mathbb{R}$, the so-called scale function, such that $s(Y_t)$, $t\ge0$, is in natural scale. Let $m$ be the speed measure of the Markov process $Y$ (see VII.3.7 and VII.3.10 in [18]). Recall that for all $a<b$ in $I^\circ$ we have

\[ 0<m([a,b])<\infty. \tag{4} \]

Finally, we also assume that if a boundary point is accessible, then it is absorbing. We drop this assumption in Section 6, where we extend our approximation method to Markov processes with reflecting boundaries. The extension works for both instantaneous and slow reflection.

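Let $\bar h\in(0,\infty)$ and suppose that for every $h\in(0,\bar h)$ we are given a measurable function $a_h\colon\bar I\to[0,\infty)$ such that $a_h(l)=a_h(r)=0$ and for all $y\in I^\circ$ we have $y\pm a_h(y)\in I$. We refer to each function $a_h$ as a scale factor. We next construct a sequence of Markov chains associated to the family of scale factors $(a_h)_{h\in(0,\bar h)}$. To this end we fix a starting point $y\in I^\circ$ of $Y$. Let $(\xi_k)_{k\in\mathbb{N}}$ be an iid sequence of random variables, on a probability space with a measure $P$, satisfying $P(\xi_k=\pm1)=\frac12$. We denote by $(X^h_{kh})_{k\in\mathbb{N}_0}$ the Markov chain defined by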

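\[ X^h_0=y \quad\text{and}\quad X^h_{(k+1)h}=X^h_{kh}+a_h(X^h_{kh})\,\xi_{k+1}, \quad\text{for } k\in\mathbb{N}_0. \tag{5} \]

We extend $(X^h_{kh})_{k\in\mathbb{N}_0}$ to a continuous-time process by linear interpolation, i.e., for all $t\in[0,\infty)$, we set

\[ X^h_t=X^h_{\lfloor t/h\rfloor h}+\bigl(t/h-\lfloor t/h\rfloor\bigr)\bigl(X^h_{(\lfloor t/h\rfloor+1)h}-X^h_{\lfloor t/h\rfloor h}\bigr). \tag{6} \]

To highlight the dependence of $X^h$ on the starting point $y$ we also sometimes write $X^{h,y}$.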
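The construction in (5) and (6) can be sketched in a few lines of Python. This is only a minimal illustration under stated assumptions: the function names `simulate_chain` and `interpolate` are hypothetical, the scale factor is passed as an ordinary callable, and the usage lines pick the constant scale factor $\sqrt{h}$ (the scaled simple random walk) purely as an example.

```python
import math
import random
from typing import Callable, List


def simulate_chain(a_h: Callable[[float], float], y: float,
                   n_steps: int, rng: random.Random) -> List[float]:
    """Grid values X_0, X_h, ..., X_{n_steps*h} of the chain in (5)."""
    path = [y]
    for _ in range(n_steps):
        xi = 1.0 if rng.random() < 0.5 else -1.0  # fair coin toss xi_{k+1}
        x = path[-1]
        path.append(x + a_h(x) * xi)
    return path


def interpolate(path: List[float], h: float, t: float) -> float:
    """Linear interpolation (6) at a time t in [0, (len(path) - 1) * h]."""
    s = t / h
    # The last cell also covers its right endpoint, so cap the cell index.
    k = min(int(math.floor(s)), len(path) - 2)
    return path[k] + (s - k) * (path[k + 1] - path[k])


# Example (assumption): constant scale factor sqrt(h), started at 0.
h = 0.01
rng = random.Random(0)
path = simulate_chain(lambda x: math.sqrt(h), 0.0, 100, rng)
```

A state dependent scale factor is used in the same way by passing a different callable for `a_h`.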

To formulate our main result we need the following condition.

Condition (A) For all compact subsets $K$ of $I^\circ$ it holds that

\[ \sup_{y\in K}\Bigl|\frac12\int_{(y-a_h(y),\,y+a_h(y))}\bigl(a_h(y)-|u-y|\bigr)\,m(du)-h\Bigr|\in o(h), \quad h\to0. \tag{7} \]
Theorem 1.1.

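Assume that Condition (A) is satisfied. Then, for any $y\in I^\circ$, the distributions of the processes $(X^{h,y}_t)_{t\in[0,\infty)}$ under $P$ converge weakly to the distribution of $(Y_t)_{t\in[0,\infty)}$ under $P_y$, as $h\to0$; i.e., for every bounded and continuous functional $F\colon C([0,\infty),\mathbb{R})\to\mathbb{R}$, it holds that

\[ E[F(X^{h,y})]\to E_y[F(Y)], \quad h\to0. \tag{8} \]

(Footnote 1: As usual, we equip $C([0,\infty),\mathbb{R})$ with the topology of uniform convergence on compact intervals, which is generated, e.g., by the metric $\rho(x,y)=\sum_{n=1}^{\infty}2^{-n}\bigl(\|x-y\|_{C[0,n]}\wedge1\bigr)$, where $\|\cdot\|_{C[0,n]}$ denotes the sup norm in $C([0,n],\mathbb{R})$.)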
Remark 1.2.

It is worth noting that Condition (A) is, in fact, nearly necessary for weak convergence (8) (see Example 2.1).

Remark 1.3.

For all $h\in(0,\bar h)$, $y\in I^\circ$ it holds that

\[ \int_{(y-a_h(y),\,y+a_h(y))}\bigl(a_h(y)-|u-y|\bigr)\,m(du)=\int_I\bigl(a_h(y)-|u-y|\bigr)^+\,m(du). \]

This yields an alternative representation of Condition (A) which is occasionally used below.

It is important to note that for every speed measure $m$ there exists a family of scale factors such that Condition (A) is satisfied and hence every general diffusion can be approximated by Markov chains of the form (5). Indeed, for all $h\in(0,\bar h)$, $y\in I^\circ$ let $\hat a_h(l)=\hat a_h(r)=0$ and

\[ \hat a_h(y)=\sup\Bigl\{a\ge0:\ y\pm a\in I \text{ and } \frac12\int_{(y-a,\,y+a)}\bigl(a-|z-y|\bigr)\,m(dz)\le h\Bigr\} \tag{9} \]

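and denote by $(\hat X^h)_{h\in(0,\bar h)}$ the associated family of processes defined in (5) and (6). Then the proof of Corollary 1.4 below shows that for all compact subsets $K$ of $I^\circ$ there exists $h_0\in(0,\bar h)$ such that for all $y\in K$, $h\in(0,h_0)$ it holds that

\[ \frac12\int_{(y-\hat a_h(y),\,y+\hat a_h(y))}\bigl(\hat a_h(y)-|z-y|\bigr)\,m(dz)=h. \]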

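In particular, the family $(\hat a_h)_{h\in(0,\bar h)}$ satisfies Condition (A) and we show in Section 3 below that the Markov chain $(\hat X^h_{kh})_{k\in\mathbb{N}_0}$ is embeddable into $Y$ with a sequence of stopping times with expected time lag $h$. We refer to $\hat X^h$, $h\in(0,\bar h)$, as EMCEL approximations and write, in short, $\hat X^h\in\mathrm{EMCEL}(h)$.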
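When (9) has no closed-form solution, the scale factor can be computed numerically. The following Python sketch shows one possible way, under assumptions that are not taken from the text: the state space is the whole real line (so the constraint $y\pm a\in I$ is void), the speed measure is supplied as a Lebesgue density plus finitely many atoms, the integral is evaluated with a midpoint rule, and the supremum is located by bisection. The names `half_integral` and `emcel_scale_factor` are hypothetical.

```python
import math
from typing import Callable, Sequence, Tuple

Atoms = Sequence[Tuple[float, float]]  # (location, mass) pairs, an assumed encoding


def half_integral(a: float, y: float, density: Callable[[float], float],
                  atoms: Atoms = (), n: int = 2000) -> float:
    """Approximate 1/2 * int_{(y-a, y+a)} (a - |u - y|) m(du),
    where m(du) = density(u) du + sum of point masses (assumption)."""
    if a <= 0.0:
        return 0.0
    du = 2.0 * a / n
    total = 0.0
    for i in range(n):  # midpoint rule for the absolutely continuous part
        u = y - a + (i + 0.5) * du
        total += (a - abs(u - y)) * density(u) * du
    for x, w in atoms:  # atoms strictly inside the open interval
        if abs(x - y) < a:
            total += (a - abs(x - y)) * w
    return 0.5 * total


def emcel_scale_factor(y: float, h: float, density: Callable[[float], float],
                       atoms: Atoms = (), tol: float = 1e-10) -> float:
    """Bisection for the largest a with half_integral(a) <= h, cf. (9)."""
    lo, hi = 0.0, 1.0
    while half_integral(hi, y, density, atoms) < h:
        hi *= 2.0  # enlarge the bracket until the integral reaches h
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if half_integral(mid, y, density, atoms) <= h:
            lo = mid
        else:
            hi = mid
    return lo


# Brownian motion, m(du) = 2 du: the integral equals a^2, so the result is sqrt(h).
a_bm = emcel_scale_factor(0.0, 0.01, lambda u: 2.0)
# Illustrative atom of mass 1 at the starting point (hypothetical parameter choice).
a_atom = emcel_scale_factor(0.0, 0.01, lambda u: 2.0, atoms=[(0.0, 1.0)])
```

Bisection is applicable because the map $a\mapsto\frac12\int_{(y-a,y+a)}(a-|z-y|)\,m(dz)$ is continuous and increasing; for a state space with boundaries the bracket would additionally have to respect $y\pm a\in I$.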

Corollary 1.4.

For every $y\in I^\circ$ the distributions of the EMCEL approximations $(\hat X^{h,y}_t)_{t\in[0,\infty)}$ under $P$ converge weakly to the distribution of $Y$ under $P_y$ as $h\to0$.

Proof.

Let be a compact subset of . Without loss of generality assume that with . Let . It follows with dominated convergence that the function

\[ K\ni y\mapsto\frac12\int_I\bigl(a_0-|u-y|\bigr)^+\,m(du)\in(0,\infty) \]

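is continuous. In particular, it is bounded away from zero, i.e., there exists $h_0\in(0,\bar h)$ such that for all $y\in K$ it holds that $\frac12\int_I(a_0-|u-y|)^+\,m(du)\ge h_0$. Next observe that for all $y\in K$ the function $a\mapsto\frac12\int_I(a-|u-y|)^+\,m(du)$ is continuous and strictly increasing. Hence for all $y\in K$, $h\in(0,h_0)$, the supremum in (9) is a maximum and it holds that $\frac12\int_{(y-\hat a_h(y),\,y+\hat a_h(y))}(\hat a_h(y)-|z-y|)\,m(dz)=h$. In particular, Condition (A) is satisfied and the statement of Corollary 1.4 follows from Theorem 1.1. ∎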

2 Application to SDEs

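A particular case of our setting is the case where $Y$ is a solution to the driftless SDE

\[ dY_t=\eta(Y_t)\,dW_t, \tag{10} \]

where $\eta\colon I^\circ\to\mathbb{R}$ is a Borel function satisfying the Engelbert-Schmidt conditions

\[ \eta(x)\ne0 \quad \forall x\in I^\circ, \tag{11} \]
\[ \eta^{-2}\in L^1_{\mathrm{loc}}(I^\circ) \tag{12} \]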

($L^1_{\mathrm{loc}}(I^\circ)$ denotes the set of Borel functions locally integrable on $I^\circ$). Under (11)–(12) SDE (10) has a unique in law weak solution (see [7] or Theorem 5.5.7 in [14]). This means that there exists a pair of processes $(Y,W)$ on a filtered probability space $(\Omega,\mathcal{F},(\mathcal{F}_t)_{t\ge0},P)$, with $(\mathcal{F}_t)_{t\ge0}$ satisfying the usual conditions, such that $W$ is an $(\mathcal{F}_t)$-Brownian motion and $(Y,W)$ satisfies SDE (10). The process $Y$ possibly reaches the endpoints $l$ or $r$ in finite time. By convention we force $Y$ to stay in $l$ (resp., $r$) in this case. This can be enforced in (10) by extending $\eta$ to $\bar I$ with $\eta(l)=\eta(r)=0$. In this example $Y$ is a regular continuous strong Markov process with the state space being the interval with the endpoints $l$ and $r$ (whether $l$ and $r$ belong to the state space is determined by the behavior of $\eta$ near the boundaries). Moreover, $Y$ is in natural scale, and its speed measure on $I^\circ$ is given by the formula

\[ m(dx)=\frac{2}{\eta^2(x)}\,dx. \]

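In this situation the change of variables $u=y+a_h(y)z$ shows that for all $h\in(0,\bar h)$, $y\in I^\circ$ it holds that

\[ \int_{(y-a_h(y),\,y+a_h(y))}\bigl(a_h(y)-|u-y|\bigr)\,m(du)=2a_h(y)^2\int_{-1}^{1}\frac{1-|z|}{\eta^2(y+a_h(y)z)}\,dz. \tag{13} \]

Condition (A) hence becomes that for every compact subset $K$ of $I^\circ$ it holds that

\[ \lim_{h\to0}\Bigl(\sup_{y\in K}\Bigl|\frac{a_h(y)^2}{h}\int_{-1}^{1}\frac{1-|z|}{\eta^2(y+a_h(y)z)}\,dz-1\Bigr|\Bigr)=0. \tag{14} \]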
Example 2.1 (Brownian motion).

In the special case where $Y$ is a Brownian motion (i.e., $I=\mathbb{R}$, $\eta\equiv1$), Condition (A) requires that for all compact sets $K\subset\mathbb{R}$ it holds that $\sup_{y\in K}\bigl|\frac{a_h(y)^2}{h}-1\bigr|\to0$ as $h\to0$. In particular, Condition (A) is satisfied for the choice $a_h(y)=\sqrt{h}$, $y\in\mathbb{R}$, $h\in(0,\bar h)$, and we recover from Theorem 1.1 Donsker’s functional limit theorem for the scaled simple random walk.

Moreover, in the case of a Brownian motion it is natural to restrict ourselves to space-homogeneous (i.e., constant) scale factors $a_h(y)\equiv a_h$, $h\in(0,\bar h)$, so that Condition (A) takes the form $\lim_{h\to0}a_h^2/h=1$. It is straightforward to show that the latter condition is also necessary for the weak convergence of approximations (5)–(6) driven by space-homogeneous scale factors to the Brownian motion.

Example 2.2 (Geometric Brownian motion).

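Let $\sigma\in(0,\infty)$ and assume that $\eta$ satisfies for all $x\in(0,\infty)$ that $\eta(x)=\sigma x$. Then the solution $Y$ of (10) with positive initial value is a geometric Brownian motion. Its state space is $I=(0,\infty)$ and both boundary points are inaccessible. Note that for all $y\in(0,\infty)$, $a\in(0,y)$ it holds that

\[ \int_{-1}^{1}\frac{1-|z|}{\eta^2(y+az)}\,dz=\frac{1}{(\sigma y)^2}\int_{-1}^{1}\frac{1-|z|}{(1+az/y)^2}\,dz=-\frac{1}{(\sigma a)^2}\log\Bigl(1-\frac{a^2}{y^2}\Bigr). \]

Hence, Condition (A) requires that for all compact sets $K\subset(0,\infty)$ it holds that

\[ \lim_{h\to0}\Bigl(\sup_{y\in K}\Bigl|\frac{1}{h\sigma^2}\log\Bigl(1-\frac{a_h(y)^2}{y^2}\Bigr)+1\Bigr|\Bigr)=0. \tag{15} \]

To obtain the EMCEL approximation of $Y$ we solve, for all $h\in(0,\bar h)$, $y\in(0,\infty)$, the equation $\frac{1}{h\sigma^2}\log\bigl(1-\frac{a^2}{y^2}\bigr)+1=0$ in $a$ and obtain $\hat a_h(y)=y\sqrt{1-e^{-\sigma^2h}}$. Note that also the usual choice $a_h(y)=\sqrt{h}\,\sigma y$, $y\in(0,\infty)$, $h\in(0,\bar h)$, which corresponds to the weak Euler scheme for geometric Brownian motion, satisfies (15).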
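The two scale factors of Example 2.2 can be compared numerically. The sketch below evaluates the expression inside the absolute value in (15) at a single point; the function names and the parameter values ($y=1$, $\sigma=0.5$) are illustrative assumptions only.

```python
import math


def emcel_gbm(y: float, h: float, sigma: float) -> float:
    """EMCEL scale factor y * sqrt(1 - exp(-sigma^2 h)) from Example 2.2."""
    return y * math.sqrt(1.0 - math.exp(-sigma ** 2 * h))


def euler_gbm(y: float, h: float, sigma: float) -> float:
    """Weak Euler scale factor sqrt(h) * sigma * y."""
    return math.sqrt(h) * sigma * y


def residual_15(a: float, y: float, h: float, sigma: float) -> float:
    """| log(1 - a^2 / y^2) / (h sigma^2) + 1 |, the pointwise quantity in (15)."""
    return abs(math.log(1.0 - a * a / (y * y)) / (h * sigma ** 2) + 1.0)


y0, sigma0 = 1.0, 0.5  # illustrative values (assumption)
res_emcel = residual_15(emcel_gbm(y0, 0.01, sigma0), y0, 0.01, sigma0)
res_euler_coarse = residual_15(euler_gbm(y0, 0.01, sigma0), y0, 0.01, sigma0)
res_euler_fine = residual_15(euler_gbm(y0, 0.001, sigma0), y0, 0.001, sigma0)
```

The EMCEL residual vanishes up to rounding for every $h$, whereas the weak Euler residual is of order $h\sigma^2/2$ and only vanishes in the limit $h\to0$.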

Convergence of the weak Euler scheme

Throughout this subsection we assume that $I=\mathbb{R}$. A common method to approximate solutions of SDEs is the Euler scheme. For equations of the form (10) with initial condition $Y_0=y$ the Euler scheme with time step $h\in(0,\bar h)$ is given by

\[ X^{Eu,h}_0=y \quad\text{and}\quad X^{Eu,h}_{(k+1)h}=X^{Eu,h}_{kh}+\eta(X^{Eu,h}_{kh})\bigl(W_{(k+1)h}-W_{kh}\bigr), \quad\text{for } k\in\mathbb{N}_0. \]

Weak Euler schemes are variations of the Euler scheme, where the normal increments $W_{(k+1)h}-W_{kh}$, $k\in\mathbb{N}_0$, are replaced by an iid sequence of centered random variables with variance $h$. Therefore, with the choice $a_h(y)=\sqrt{h}\,\eta(y)$, $y\in\mathbb{R}$, $h\in(0,\bar h)$, the Markov chain $(X^h_{kh})_{k\in\mathbb{N}_0}$ defined in (5) represents a weak Euler scheme with Rademacher increments.

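In this subsection we show how Theorem 1.1 can be used to derive new convergence results for weak Euler schemes. To this end let the setting of Section 2 be given and let $a_h(y)=\sqrt{h}\,\eta(y)$, $y\in\mathbb{R}$, $h\in(0,\bar h)$. Then it follows from (14) that Condition (A) is equivalent to assuming that for every compact subset $K\subset\mathbb{R}$ we have

\[ \sup_{y\in K}\Bigl|\int_{-1}^{1}\frac{\eta^2(y)(1-|z|)}{\eta^2(y+\sqrt{h}\,\eta(y)z)}\,dz-1\Bigr|=\sup_{y\in K}\Bigl|\int_{-1}^{1}\frac{\eta^2(y)-\eta^2(y+\sqrt{h}\,\eta(y)z)}{\eta^2(y+\sqrt{h}\,\eta(y)z)}(1-|z|)\,dz\Bigr|\to0, \tag{16} \]

as $h\to0$.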

Suppose that $\eta$ is continuous, let $K\subset\mathbb{R}$ be compact and let $\varepsilon>0$. Then $\eta$ is bounded on $K$ and since every continuous function is uniformly continuous on compact sets, we obtain that there exists $h_0\in(0,\bar h)$ such that for all $h\in(0,h_0)$, $y\in K$, $z\in[-1,1]$ it holds that

\[ |\eta(y)-\eta(y+\sqrt{h}\,\eta(y)z)|\le\varepsilon. \]

By (11) and the continuity of $\eta$ the function $|\eta|$ is bounded away from $0$ on every compact subset of $\mathbb{R}$ and hence we obtain that there exists $C\in(0,\infty)$ such that for all $h\in(0,h_0)$ it holds that

\[ \sup_{y\in K}\Bigl|\int_{-1}^{1}\frac{\eta^2(y)-\eta^2(y+\sqrt{h}\,\eta(y)z)}{\eta^2(y+\sqrt{h}\,\eta(y)z)}(1-|z|)\,dz\Bigr|\le C\varepsilon. \]

It follows with (16) that Condition (A) is satisfied. Therefore we obtain the following corollary of Theorem 1.1.

Corollary 2.3.

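Assume the setting of Section 2 with $I=\mathbb{R}$ and that $\eta$ is continuous. Let $a_h$, $h\in(0,\bar h)$, satisfy for all $h\in(0,\bar h)$, $y\in\mathbb{R}$ that $a_h(y)=\sqrt{h}\,\eta(y)$. Then for all $y\in\mathbb{R}$ the distributions of the processes $(X^{h,y}_t)_{t\in[0,\infty)}$ under $P$ converge weakly to the distribution of $Y$ under $P_y$, as $h\to0$.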
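The quantity in (16) is easy to evaluate numerically for a concrete coefficient. The following Python sketch does so with a midpoint rule; the coefficient $\eta(x)=1+x^2$ (continuous, nowhere zero, faster than linear growth), the compact set $[-2,2]$ and the function names are illustrative assumptions, not taken from the text.

```python
import math
from typing import Callable, Sequence


def euler_residual(eta: Callable[[float], float], y: float, h: float,
                   n: int = 2000) -> float:
    """| int_{-1}^{1} eta(y)^2 (1 - |z|) / eta(y + sqrt(h) eta(y) z)^2 dz - 1 |,
    approximated with the midpoint rule on n cells."""
    step = math.sqrt(h) * eta(y)  # weak Euler scale factor a_h(y)
    dz = 2.0 / n
    total = 0.0
    for i in range(n):
        z = -1.0 + (i + 0.5) * dz
        total += eta(y) ** 2 * (1.0 - abs(z)) / eta(y + step * z) ** 2 * dz
    return abs(total - 1.0)


def sup_residual(eta: Callable[[float], float], grid: Sequence[float],
                 h: float) -> float:
    """Maximum of the residual over a finite grid standing in for the compact set K."""
    return max(euler_residual(eta, y, h) for y in grid)


def eta_example(x: float) -> float:
    return 1.0 + x * x  # illustrative coefficient (assumption)


grid = [-2.0 + 0.1 * i for i in range(41)]  # grid on K = [-2, 2]
res_coarse = sup_residual(eta_example, grid, 1e-2)
res_fine = sup_residual(eta_example, grid, 1e-4)
```

In this example the residual shrinks as $h$ decreases, in line with (16); a finite grid only approximates the supremum over $K$, so this is a plausibility check rather than a proof.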

Remark 2.4.

Corollary 2.3 complements convergence results for the Euler scheme obtained, for example, in [23] and [10]. Theorem 2.2 in [23] shows weak convergence of the Euler scheme if $\eta$ has at most linear growth and is discontinuous on a set of Lebesgue measure zero. Theorem 2.3 in [10] establishes almost sure convergence of the Euler scheme if $\eta$ is locally Lipschitz continuous. Moreover, [10] allows for a multidimensional setting and a drift coefficient. In contrast, Corollary 2.3 above applies to the weak Euler scheme and does not require linear growth or local Lipschitz continuity of $\eta$.

Remark 2.5.

As stated in Corollary 1.4, EMCEL approximations can be constructed for every general diffusion. In particular, they can be used in cases where $\eta$ is not continuous and where (weak) Euler schemes do not converge (see, e.g., Section 5.4 in [3]). In Sections 7 and 8 we consider further irregular examples.

3 Embedding the chains into the Markov process

In this section we construct the embedding stopping times. To this end, we need some auxiliary results. Throughout the section we assume the setting of Section 1.

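We introduce the function $q\colon I^\circ\times\bar I\to[0,\infty]$ defined by

\[ q(y,x)=\frac12 m(\{y\})\,|x-y|+\int_y^x m((y,u))\,du, \tag{17} \]

where for $u<y$ we set $m((y,u)):=-m((u,y))$. Notice that, for $y\in I^\circ$, the function $q(y,\cdot)$ is decreasing on $[l,y]$ and increasing on $[y,r]$. Recall the Feller test for explosions: for any $y\in I^\circ$,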

\[ l \text{ is accessible (i.e., } l\in I) \iff q(y,l)<\infty, \tag{18} \]
\[ r \text{ is accessible (i.e., } r\in I) \iff q(y,r)<\infty \tag{19} \]

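(see, e.g., Lemma 1.1 in [2]). Consequently, $q$ is finite on $I^\circ\times I$. Notice that for all $y,z\in I^\circ$ and $x\in\bar I$ we have

\[ q(z,x)=q(y,x)-q(y,z)-\frac{\partial^0q}{\partial x}(y,z)\,(x-z), \tag{20} \]

where $\frac{\partial^0q}{\partial x}=\frac12\bigl(\frac{\partial^+q}{\partial x}+\frac{\partial^-q}{\partial x}\bigr)$ denotes the average of the right and left partial derivatives of $q$ with respect to the second argument.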

Lemma 3.1.

Let $a<b$ in $I$ and $y\in(a,b)$. Then

\[ E_yH_{a,b}=E_y\,q(y,Y_{H_{a,b}}). \tag{21} \]
Proof.

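Recall that the speed measure satisfies

\[ E_yH_{a,b}=\int_{(a,b)}\frac{(b-x\vee y)(x\wedge y-a)}{b-a}\,m(dx) \]

(see e.g. Section VII.3 in [18]). This implies

\[
\begin{aligned}
E_yH_{a,b} &=\frac{1}{b-a}\Bigl[\int_{(a,y)}(b-y)(x-a)\,m(dx)+\int_{(y,b)}(b-x)(y-a)\,m(dx)+(b-y)(y-a)\,m(\{y\})\Bigr] \\
&=\frac{b-y}{b-a}\Bigl[\int_{(a,y)}(x-a)\,m(dx)+\frac{(y-a)\,m(\{y\})}{2}\Bigr]+\frac{y-a}{b-a}\Bigl[\int_{(y,b)}(b-x)\,m(dx)+\frac{(b-y)\,m(\{y\})}{2}\Bigr] \\
&=\frac{b-y}{b-a}\,q(y,a)+\frac{y-a}{b-a}\,q(y,b) \\
&=E_y\,q(y,Y_{H_{a,b}}).
\end{aligned}
\]
∎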

Lemma 3.2.

Let $y\in I^\circ$ and $a\ge0$ be such that $y\pm a\in I$. Then it holds that

\[ q(y,y+a)+q(y,y-a)=\int_{(y-a,\,y+a)}\bigl(a-|u-y|\bigr)\,m(du). \]
Proof.

It follows from the definition of the function $q$ that

\[ q(y,y+a)+q(y,y-a)=m(\{y\})\,a+\int_y^{y+a}m((y,u))\,du+\int_{y-a}^{y}m((u,y))\,du. \]

Using Fubini’s theorem we compute

\[ \int_y^{y+a}m((y,u))\,du=\int_y^{y+a}\int_{(y,u)}m(dz)\,du=\int_{(y,\,y+a)}(y+a-z)\,m(dz) \]

and

\[ \int_{y-a}^{y}m((u,y))\,du=\int_{y-a}^{y}\int_{(u,y)}m(dz)\,du=\int_{(y-a,\,y)}(z-y+a)\,m(dz). \]

Substituting the latter two formulas into the former one yields the result. ∎

The following result is an immediate consequence of Lemma 3.1 and Lemma 3.2.

Corollary 3.3.

Let $y\in I^\circ$ and $a\ge0$ be such that $y\pm a\in I$. Then

\[ E_y[H_{y-a,\,y+a}]=\frac12\int_{(y-a,\,y+a)}\bigl(a-|u-y|\bigr)\,m(du). \tag{22} \]
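As a worked special case of (22), added here as an illustration under the assumption that $Y$ is a standard Brownian motion on $I=\mathbb{R}$ (so that $m(du)=2\,du$, cf. Section 2 with $\eta\equiv1$), the formula reduces to the classical expected exit time from a symmetric interval:

```latex
\[
E_y[H_{y-a,\,y+a}]
  = \frac12\int_{y-a}^{y+a}\bigl(a-|u-y|\bigr)\,2\,du
  = 2\int_0^a (a-v)\,dv
  = a^2 .
\]
```

Setting $a^2=h$ gives $a=\sqrt{h}$, which is consistent with the scale factor of the scaled simple random walk in Example 2.1.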

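We need to introduce an auxiliary subset of $I^\circ$. To this end, if $l>-\infty$, we define, for all $h\in(0,\bar h)$,

\[ l_h:=l+\inf\Bigl\{a\in\Bigl(0,\frac{r-l}{2}\Bigr]:\ a<\infty \text{ and } \frac12\int_{(l,\,l+2a)}\bigl(a-|u-(l+a)|\bigr)\,m(du)\ge h\Bigr\}, \]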

where we use the convention . If , we set . Similarly, if , then we define, for all ,

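\[ r_h:=r-\inf\Bigl\{a\in\Bigl(0,\frac{r-l}{2}\Bigr]:\ a<\infty \text{ and } \frac12\int_{(r-2a,\,r)}\bigl(a-|u-(r-a)|\bigr)\,m(du)\ge h\Bigr\}. \]

If $r=\infty$, we set $r_h=\infty$.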

Equations (18) and (19), together with Lemma 3.2 above, imply that $l$ is inaccessible if and only if $l_h=l$ for all $h\in(0,\bar h)$. Similarly, $r$ is inaccessible if and only if $r_h=r$ for all $h\in(0,\bar h)$.

The auxiliary subset $I_h\subseteq I^\circ$ is defined by

\[ I_h=(l_h,r_h)\cup\{y\in I^\circ:\ y\pm a_h(y)\in I^\circ\}. \]

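Now we have everything at hand to start constructing a sequence of embedding stopping times. Suppose $Y$ starts at a point $y\in I^\circ$ and fix $h\in(0,\bar h)$. Set $\tau^h_0=0$.

Let $\sigma^h_1=H_{y-a_h(y),\,y+a_h(y)}$. Note that by Corollary 3.3 we have

\[ E_y[\sigma^h_1]=\frac12\int_{(y-a_h(y),\,y+a_h(y))}\bigl(a_h(y)-|u-y|\bigr)\,m(du). \]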

We now define $\tau^h_1$ by distinguishing two cases.

Case 1: $y\in I_h$ (i.e., $y\in(l_h,r_h)$ or $y\pm a_h(y)\in I^\circ$). In this case we set $\tau^h_1=\sigma^h_1$.

Case 2: (i.e., and ( or )). In this case we deterministically extend so as to make it have expectation . Observe that by the definition of and we have in this case . Moreover, we can assume in this case that it must hold that (only in the case , and this probability is , but we exclude this case by considering a sufficiently small , so that ; notice that Condition (A) implies that, for any , ). We define by

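\[ \tau^h_1=\sigma^h_1+2\bigl(h-E_y[\sigma^h_1]\bigr)\,1_{\{l,r\}}(Y_{\sigma^h_1}). \]

Observe that the definition implies $E_y[\tau^h_1]=h$ and that the three random variables $Y_{\tau^h_1}$, $Y_{\sigma^h_1}$ and $X^{h,y}_h$ all have the same law.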

We can proceed in a similar way to define the subsequent stopping times. Let . Suppose that we have already constructed . We first define . On the event we set . On the event we extend as follows. Note that takes only finitely many values. Let be a possible value of such that or . Consider the event . Observe that . We extend on the event by setting

\[ \tau^h_{k+1}=\sigma^h_{k+1}+2(h-c)\,1_{\{l,r\}}(Y_{\sigma^h_{k+1}}) \tag{23} \]

(notice that ). This implies that on the event . Moreover, the processes and