
# Stochastic B–series analysis of iterated Taylor methods

Kristian Debrabant Technische Universität Darmstadt, Fachbereich Mathematik, Dolivostraße 15, D-64293 Darmstadt, Germany  and  Anne Kværnø Department of Mathematical Sciences, Norwegian University of Science and Technology, N-7491 Trondheim, Norway
###### Abstract.

For stochastic implicit Taylor methods that use an iterative scheme to compute their numerical solution, stochastic B–series and corresponding growth functions are constructed. From these, convergence results based on the order of the underlying Taylor method, the choice of the iteration method, the predictor and the number of iterations are derived, for Itô and Stratonovich SDEs, and for weak as well as strong convergence. As a special case, the application of Taylor methods to ODEs is also considered. The theory is supported by numerical experiments.

###### Key words and phrases:
Stochastic Taylor method, stochastic differential equation, iterative scheme, order, Newton’s method, weak approximation, strong approximation, growth functions, stochastic B–series
###### 2000 Mathematics Subject Classification:
65C30, 60H35, 65C20, 68U20

## 1. Introduction

Besides stochastic Runge–Kutta methods, one important class of schemes to approximate the solution of stochastic differential equations (SDEs) are stochastic Taylor methods. As in the deterministic setting [1], they are especially suitable for problems of not too high dimension, particularly in the case of strong approximation, because weak approximation of low-dimensional problems can often be done more efficiently by numerically solving the corresponding deterministic PDE problem obtained by applying the Feynman-Kac formula. For solving stiff SDEs, implicit methods have to be considered, as illustrated in the following two examples.

###### Example 1.1 (see [12]).

Consider the linear Itô-SDE

 (1) $dX(t)=\mu X(t)\,dt+\sigma X(t)\,dW(t),\qquad X(0)=x_0,$

with $\mu,\sigma,x_0\in\mathbb{R}$. We assume that the exact solution is mean-square stable, i. e.

 $\lim_{t\to\infty}\mathrm{E}\bigl(|X(t)|^2\bigr)=0,$

which is the case if and only if $2\mu+\sigma^2<0$. To achieve that also the numerical approximation $Y_n$ obtained with the (explicit) Euler-Maruyama scheme with step size $h$ is mean-square stable, i. e. $\lim_{n\to\infty}\mathrm{E}(|Y_n|^2)=0$, we have to restrict the step size according to $h<-(2\mu+\sigma^2)/\mu^2$, whereas for larger step sizes the numerical approximations explode, $\lim_{n\to\infty}\mathrm{E}(|Y_n|^2)=\infty$. In contrast to this, the semi-implicit Euler scheme is mean-square stable without any step size restriction. For a numerical affirmation, see Figure 1.
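The stability statements above can be checked without simulation: for the linear test equation, one step of each scheme multiplies $\mathrm{E}(|Y_n|^2)$ by an explicit amplification factor, and the scheme is mean-square stable exactly when that factor is below one. A minimal sketch (the parameter values $\mu=-5$, $\sigma=1$ are illustrative choices of ours, not taken from the text):

```python
# Mean-square amplification factors for dX = mu*X dt + sigma*X dW.
# Euler-Maruyama:      E|Y_{n+1}|^2 = ((1 + h*mu)^2 + h*sigma^2)      * E|Y_n|^2
# semi-implicit Euler: E|Y_{n+1}|^2 = (1 + h*sigma^2)/(1 - h*mu)^2    * E|Y_n|^2
# A scheme is mean-square stable iff its factor is < 1.

def em_factor(mu, sigma, h):
    return (1.0 + h * mu) ** 2 + h * sigma ** 2

def semi_implicit_factor(mu, sigma, h):
    return (1.0 + h * sigma ** 2) / (1.0 - h * mu) ** 2

mu, sigma = -5.0, 1.0                        # 2*mu + sigma^2 = -9 < 0: exact solution stable
h_crit = -(2 * mu + sigma ** 2) / mu ** 2    # Euler-Maruyama stability bound (= 0.36 here)

for h in (0.1, 0.5, 1.0):
    print(h, em_factor(mu, sigma, h) < 1.0, semi_implicit_factor(mu, sigma, h) < 1.0)
```

Note that the semi-implicit factor is below one for every $h>0$ whenever $2\mu+\sigma^2<0$, reproducing the unconditional mean-square stability claimed above.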

###### Example 1.2.

Consider the following stochastic Van der Pol equation,

 $dX_1(t) = X_2(t)\,dt,$
 $dX_2(t) = \bigl(\mu(1-X_1(t)^2)X_2(t)-X_1(t)\bigr)\,dt + \theta(1-X_1(t)^2)X_2(t)\,dW(t),$
 $X_1(0) = x_{0,1}, \qquad X_2(0)=x_{0,2}.$

Application of the explicit Milstein scheme (see Example 1.3 with $\alpha=0$ and $\beta=0$) with a fixed step size to approximate a solution path leads to an explosion of the approximation, see Figure 2(a), whereas application of the semi-implicit Milstein scheme, given by substituting $g_0(Y_n)$ by $g_0(Y_{n+1})$ ($\alpha=1$ and $\beta=0$ in Example 1.3), yields (for the same Brownian path) the result of Figure 2(b).

Implicit stochastic Taylor methods have been considered both for strong [15, 17] and weak [15] approximation. For these methods, the approximation values are only given implicitly. However, in practice these implicit equations are solved by iterative schemes like simple iteration or Newton iteration. The “exact numerical” solution can be written in terms of B–series [8]. As we will prove in this paper, so can the iterated solution. Moreover, for each iteration scheme in question, we will define a growth function. Briefly explained, when the exact numerical and the $k$ times iterated solutions are both written in terms of B–series, then all terms of these series for which the growth function has a value not greater than $k$ coincide. Thus the growth functions give a quite exact description of the development of the iterations. B–series and corresponding growth functions for iterated solutions have been derived for Runge–Kutta methods applied to deterministic ordinary differential equations [14], differential algebraic equations [13], and SDEs [7]. Somewhat surprisingly, the growth functions are exactly the same in all these cases, and, as we will show in this paper, this also holds for implicit Taylor methods.

The outline of the paper is as follows: First, we will give the SDE to be solved and the iterated Taylor methods used for its approximation. In Section 2 stochastic B–series are introduced and some useful preliminary results are presented. The main results of the paper can be found in Section 3, where the B–series of the iterated solutions are developed and the aforementioned growth functions derived. In Section 4, these findings are interpreted in terms of the order of the overall scheme, giving concrete results on the order of the considered methods depending on the kind and number of iterations, both for SDEs and ODEs. Contrary to the results obtained for Runge–Kutta methods [7], the order of the iteration error is shown to be independent of whether Itô or Stratonovich SDEs, weak or strong convergence are considered. Finally, in Section 5 we present several numerical examples to support our theoretical findings.

Let $(\Omega,\mathcal{F},\mathrm{P})$ be a probability space. We denote by $X(t)$ the stochastic process which is the solution of a $d$-dimensional SDE defined by

 (2) $dX(t)=g_0(X(t))\,dt+\sum_{l=1}^m g_l(X(t))\star dW_l(t),\qquad X(t_0)=x_0,$

with an $m$-dimensional Wiener process $(W_1,\dots,W_m)$ and coefficient functions $g_l:\mathbb{R}^d\to\mathbb{R}^d$, $l=0,1,\dots,m$. As usual, (2) is construed as an abbreviation of

 (3) $X(t)=x_0+\int_{t_0}^t g_0(X(s))\,ds+\sum_{l=1}^m\int_{t_0}^t g_l(X(s))\star dW_l(s).$

The integral w. r. t. the Wiener process has to be interpreted e. g. as an Itô integral with $\star dW_l(s)=dW_l(s)$ or as a Stratonovich integral with $\star dW_l(s)=\circ\,dW_l(s)$. We assume that the Borel-measurable coefficients are sufficiently differentiable and chosen such that SDE (3) has a unique solution.

To simplify the presentation, we define $W_0(s)=s$, so that (3) can be written as

 (4) $X(t)=x_0+\sum_{l=0}^m\int_{t_0}^t g_l(X(s))\star dW_l(s).$

Let a discretization $t_0<t_1<\dots<t_N=T$ of the time interval $[t_0,T]$ with step sizes $h_n=t_{n+1}-t_n$ for $n=0,1,\dots,N-1$ be given. Now, we consider the general class of stochastic Taylor methods given by $Y_0=x_0$ and

 (5) $Y_{n+1}=B(\Phi^{ex},Y_n;h_n)+B(\Phi^{im},Y_{n+1};h_n)$

for $n=0,1,\dots,N-1$, where $\Phi^{ex}$ and $\Phi^{im}$ assign to each tree a random variable, with $\Phi^{ex}(\emptyset)\equiv 1$ and $\Phi^{im}(\emptyset)\equiv 0$.

###### Example 1.3.

Consider the family of Milstein schemes applied to an Itô SDE with one-dimensional noise,

 (6) $Y_{n+1}=Y_n+h_n\bigl((1-\alpha)g_0(Y_n)+\alpha g_0(Y_{n+1})\bigr)+I_{(1),h_n}\bigl((1-\beta)g_1(Y_n)+\beta g_1(Y_{n+1})\bigr)+\bigl(I_{(1,1),h_n}-\beta I_{(1),h_n}^2\bigr)(g_1'g_1)(Y_n).$

Here,

 $I_{(1),h_n}=W(t_{n+1})-W(t_n)=\Delta W_n,\qquad I_{(1,1),h_n}=\int_{t_n}^{t_{n+1}}\bigl(W(s)-W(t_n)\bigr)\,dW(s)=\tfrac12\bigl(\Delta W_n^2-h_n\bigr),$

and the parameters $\alpha,\beta\in[0,1]$ indicate the degree of implicitness. When $\alpha=\beta=0$ we have the explicit Milstein scheme, with $\alpha=1$, $\beta=0$ a semi-implicit scheme. In all cases, the method (6) can be written in the form (5) with

 $B(\Phi^{ex},Y_n;h_n)=Y_n+h_n(1-\alpha)g_0(Y_n)+I_{(1),h_n}(1-\beta)g_1(Y_n)+\bigl(I_{(1,1),h_n}-\beta I_{(1),h_n}^2\bigr)(g_1'g_1)(Y_n),$
 $B(\Phi^{im},Y_{n+1};h_n)=h_n\alpha\, g_0(Y_{n+1})+I_{(1),h_n}\beta\, g_1(Y_{n+1}).$

The terms $\Phi^{ex}(\tau)$ and $\Phi^{im}(\tau)$ refer to the time- and method-dependent part of each term: In this case

 $\Phi^{ex}(\bullet_0)=h_n(1-\alpha),\qquad \Phi^{im}(\bullet_0)=h_n\alpha,$
 $\Phi^{ex}(\bullet_1)=I_{(1),h_n}(1-\beta),\qquad \Phi^{im}(\bullet_1)=I_{(1),h_n}\beta,$
 $\Phi^{ex}([\bullet_1]_1)=I_{(1,1),h_n}-\beta I_{(1),h_n}^2.$

The notation will be explained in detail in Section 2.
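As a concrete sketch, one scalar step of the family (6) can be implemented with the implicit part resolved by simple fixed-point iteration; the function and parameter names below are ours, not the paper's, and the predictor (implicit part evaluated at the old value) is one possible choice among several:

```python
import math
import random

def milstein_family_step(y, h, g0, g1, dg1, alpha, beta, rng, its=50, tol=1e-12):
    """One scalar step of the alpha-beta Milstein family (6) for an Ito SDE with
    one-dimensional noise; the implicit part is resolved by simple iteration."""
    dW = rng.gauss(0.0, math.sqrt(h))          # Wiener increment I_(1),h
    I11 = 0.5 * (dW * dW - h)                  # Ito iterated integral I_(1,1),h
    # explicit part B(Phi_ex, y; h):
    explicit = (y + h * (1 - alpha) * g0(y) + dW * (1 - beta) * g1(y)
                + (I11 - beta * dW * dW) * dg1(y) * g1(y))
    # predictor: implicit part evaluated at the old value y
    y_new = explicit + h * alpha * g0(y) + dW * beta * g1(y)
    for _ in range(its):
        y_next = explicit + h * alpha * g0(y_new) + dW * beta * g1(y_new)
        if abs(y_next - y_new) < tol:
            return y_next
        y_new = y_next
    return y_new
```

With $\alpha=\beta=0$ the loop terminates immediately and the step reduces to the explicit Milstein scheme; with $\alpha=1$, $\beta=0$ it performs the semi-implicit step used in Example 1.2.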

Concerning the general method (5), application of an iterative Newton-type method yields

 (7) $Y_{n+1,k+1}=B(\Phi^{ex},Y_n;h_n)+B(\Phi^{im},Y_{n+1,k};h_n)+J_k\bigl(Y_{n+1,k+1}-Y_{n+1,k}\bigr)$

with some approximation $J_k$ to the Jacobian of $B(\Phi^{im},\,\cdot\,;h_n)$ and a predictor $Y_{n+1,0}$. In the following we assume that (7) can be solved uniquely at least for sufficiently small $h_n$. To simplify the presentation, we assume further that all step sizes are constant, $h_n\equiv h$.

For the approximation $J_k$ there exist several common choices. If we choose $J_k$ to be the exact Jacobian of $B(\Phi^{im},\,\cdot\,;h)$ evaluated at $Y_{n+1,k}$, then we obtain the classical Newton iteration method for solving (5), with quadratic convergence. It will be denoted in the following as full Newton iteration. If we choose instead $J_k\equiv J_0$, the Jacobian evaluated at the predictor, then we obtain the so called modified Newton iteration method, which is only linearly convergent. Here, $J_0$ is independent of the iteration number $k$, thus its computation is much cheaper and simpler than in the full Newton iteration case. The third and simplest possibility is to choose $J_k$ equal to zero. In this case we do not even have to solve a linear system for $Y_{n+1,k+1}$. This iteration method is called simple iteration method or predictor corrector method. Its disadvantage is that for stiff systems it requires very small step sizes to converge.

For most stiff problems and problems with additive noise, the use of semi-implicit methods suffices. As will be demonstrated in Section 4, these have the advantage that fewer iterations are required to obtain the correct order of the underlying method.
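The three choices of $J_k$ can be compared on a minimal scalar model: a drift-implicit Euler step $y=c+h\,g_0(y)$, i.e. $B(\Phi^{im},y;h)=h\,g_0(y)$ and the explicit part collapsed into the constant $c$. This is an illustrative sketch under those assumptions, not the paper's general vector-valued setting:

```python
def iterate_implicit_euler(c, h, g0, dg0, y0, scheme, k_iters):
    """Iteration (7) for the scalar fixed-point equation y = c + h*g0(y).
    scheme: 'full'     -> J_k = h*g0'(y_k)   (classical Newton, quadratic)
            'modified' -> J_k = h*g0'(y_0)   (frozen Jacobian, linear)
            'simple'   -> J_k = 0            (predictor-corrector)"""
    y, J0 = y0, h * dg0(y0)
    for _ in range(k_iters):
        J = {'full': h * dg0(y), 'modified': J0, 'simple': 0.0}[scheme]
        # rearranged form of y_{k+1} = c + h*g0(y_k) + J*(y_{k+1} - y_k)
        y = (c + h * g0(y) - J * y) / (1.0 - J)
    return y
```

All three variants converge to the same root of $y-c-h\,g_0(y)=0$; they differ only in how many iterations (and linear solves) are needed, which is exactly the trade-off quantified by the growth functions below.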

## 2. Stochastic B–series

B–series, symbolized by $B(\varphi,x_0;h)$, for SDEs were first constructed by Burrage and Burrage [2, 3] to study strong convergence in the Stratonovich case. In the following years, this approach has been further developed by several authors to study weak and strong convergence, for the Itô and the Stratonovich case, see e. g. [7] for an overview. A uniform theory for the construction of stochastic B–series has been presented in [7], and in [8] this approach has been used to construct order conditions for implicit Taylor methods. Following the exposition of these two papers, we define in this section stochastic B–series and present some preliminary results that will be used later.

### 2.1. Some useful definitions and preliminary results

###### Definition 2.1 (Trees).
The set of $(m+1)$-colored, rooted trees

 $T=\{\emptyset\}\cup T_0\cup T_1\cup\dots\cup T_m$

is recursively defined as follows:

a):

The graph with only one vertex of color $l$ belongs to $T_l$.

Let $[\tau_1,\dots,\tau_\kappa]_l$ be the tree formed by joining the subtrees $\tau_1,\dots,\tau_\kappa$ each by a single branch to a common root of color $l$.

b):

If $\tau_1,\dots,\tau_\kappa\in T$ then $[\tau_1,\dots,\tau_\kappa]_l\in T_l$.

Thus, $T_l$ is the set of trees with an $l$-colored root, and $T$ is the union of these sets.

###### Definition 2.2 (Elementary differentials).
For a tree $\tau\in T$ the elementary differential is a mapping $F(\tau):\mathbb{R}^d\to\mathbb{R}^d$ defined recursively by

a):

$F(\emptyset)(x_0)=x_0$,

b):

$F(\bullet_l)(x_0)=g_l(x_0)$,

c):

If $\tau=[\tau_1,\dots,\tau_\kappa]_l\in T_l$ then

 $F(\tau)(x_0)=g_l^{(\kappa)}(x_0)\bigl(F(\tau_1)(x_0),F(\tau_2)(x_0),\dots,F(\tau_\kappa)(x_0)\bigr).$

With these definitions in place, we can define the stochastic B–series:

###### Definition 2.3 (B–series).
A (stochastic) B–series is a formal series of the form

 $B(\phi,x_0;h)=\sum_{\tau\in T}\alpha(\tau)\cdot\phi(\tau)(h)\cdot F(\tau)(x_0),$

where $\phi$ assigns to each tree $\tau$ a random variable $\phi(\tau)(h)$, and $\alpha$ is given by

 $\alpha(\emptyset)=1,\qquad \alpha(\bullet_l)=1,\qquad \alpha\bigl(\tau=[\tau_1,\dots,\tau_\kappa]_l\bigr)=\frac{1}{r_1!\,r_2!\cdots r_q!}\prod_{j=1}^\kappa\alpha(\tau_j),$

where $r_1,r_2,\dots,r_q$ count equal trees among $\tau_1,\dots,\tau_\kappa$.

Note that $\alpha(\tau)$ is the inverse of the order of the automorphism group of $\tau$.
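The recursion for $\alpha(\tau)$ is easy to implement; a sketch with a colored tree encoded as a nested tuple $(l,(\text{children}))$ (our own encoding, not from the paper):

```python
from math import factorial
from collections import Counter

# A tree tau = [tau_1,...,tau_kappa]_l is encoded as (l, (child, ..., child));
# a single vertex of color l is (l, ()). Colors: 0 = drift, 1..m = diffusions.

def alpha(tau):
    """alpha(tau) from Definition 2.3: 1/(r_1!...r_q!) * prod_j alpha(tau_j),
    where r_1,...,r_q are the multiplicities of the distinct subtrees."""
    l, children = tau
    a = 1.0
    for r in Counter(children).values():   # multiplicities of equal subtrees
        a /= factorial(r)
    for child in children:
        a *= alpha(child)
    return a

bullet0 = (0, ())                        # single vertex of color 0
tau = (1, (bullet0, bullet0, bullet0))   # three equal subtrees under a color-1 root
print(alpha(tau))                        # 1/3! = 0.1666...
```

For instance, the tree with three equal single-vertex subtrees has automorphism group of order $3!=6$, so $\alpha(\tau)=1/6$, as the recursion reproduces.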

To simplify the presentation, in the following we assume that all elementary differentials exist and all considered B–series converge. Otherwise, one has to consider truncated B–series and discuss the remainder term.

The next lemma is proved in [7].

###### Lemma 2.1.
If $Y(h)=B(\phi,x_0;h)$ with $\phi(\emptyset)\equiv 1$ is some B–series and $f\in C^\infty(\mathbb{R}^d,\mathbb{R}^d)$, then $f(Y(h))$ can be written as a formal series of the form

 (8) $f(Y(h))=\sum_{u\in U_f}\beta(u)\cdot\psi_\phi(u)(h)\cdot G(u)(x_0),$

where

a):

$U_f$ is a set of trees derived from $T$ as follows: $[\emptyset]_f\in U_f$, and if $\tau_1,\dots,\tau_\kappa\in T$
then $u=[\tau_1,\dots,\tau_\kappa]_f\in U_f$,

b):

$G([\emptyset]_f)(x_0)=f(x_0)$ and
$G\bigl(u=[\tau_1,\dots,\tau_\kappa]_f\bigr)(x_0)=f^{(\kappa)}(x_0)\bigl(F(\tau_1)(x_0),\dots,F(\tau_\kappa)(x_0)\bigr)$,

c):

$\beta([\emptyset]_f)=1$ and $\beta\bigl(u=[\tau_1,\dots,\tau_\kappa]_f\bigr)=\frac{1}{r_1!\,r_2!\cdots r_q!}\prod_{j=1}^\kappa\alpha(\tau_j)$,
where $r_1,r_2,\dots,r_q$ count equal trees among $\tau_1,\dots,\tau_\kappa$,

d):

$\psi_\phi([\emptyset]_f)(h)\equiv 1$ and $\psi_\phi\bigl(u=[\tau_1,\dots,\tau_\kappa]_f\bigr)(h)=\prod_{j=1}^\kappa\phi(\tau_j)(h)$.

For notational convenience, in the following the $h$-dependency of the weight functions and the $x_0$-dependency of the elementary differentials will be suppressed whenever this is unambiguous, so $\phi(\tau)(h)$ will be written as $\phi(\tau)$ and $F(\tau)(x_0)$ as $F(\tau)$.

The next step is to present the composition rule for B–series. In the deterministic case, this is given e. g. in [10], using ordered trees. The same rule applies for multicolored trees, as in the stochastic case. But it is also possible to present the result without relying on ordered trees, as done in [8], and this is the approach that will be used in the following.

Consider triples $(\tau,\vartheta,\omega)$ consisting of some $\tau\in T$, a subtree $\vartheta$ sharing the root with $\tau$, and the remainder multiset $\omega$ of trees left over when $\vartheta$ is removed from $\tau$. We also include the empty tree as a possible subtree, in which case the triple becomes $(\tau,\emptyset,\{\tau\})$.

###### Example 2.1.
Two examples of such triples (shown as tree diagrams in the original) share the same tree $\tau$ and the same subtree $\vartheta$, but have different remainder multisets: in the first, $\omega$ consists of three single-vertex trees; in the second, of a single three-vertex tree.

So, for the same $\tau$ and $\vartheta$ there might be different $\omega$'s.

We next define $ST(\tau)$ as the set of all possible subtrees $\vartheta$ of $\tau$ together with the corresponding remainder multisets $\omega$, that is, for each $\tau\in T$ we have

 $ST(\bullet_l)=\bigl\{(\emptyset,\{\bullet_l\}),(\bullet_l,\emptyset)\bigr\},$
 $ST\bigl(\tau=[\tau_1,\dots,\tau_\kappa]_l\bigr)=\bigl\{(\vartheta,\omega):\vartheta=[\vartheta_1,\dots,\vartheta_\kappa]_l,\ \omega=\omega_1\cup\dots\cup\omega_\kappa,\ (\vartheta_i,\omega_i)\in ST(\tau_i),\ i=1,\dots,\kappa\bigr\}\cup\bigl\{(\emptyset,\{\tau\})\bigr\}.$
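The recursive definition of $ST(\tau)$ can be mirrored directly in code; a sketch encoding a colored tree as a nested tuple $(l,(\text{children}))$ (our own convention, with an empty child $\vartheta_i$ simply omitted from the subtree):

```python
from itertools import product

EMPTY = None  # stands for the empty tree

def subtrees(tau):
    """All pairs (theta, omega) in ST(tau): theta is a subtree sharing tau's
    root (possibly empty) and omega the multiset (here: tuple) of left-over trees."""
    l, children = tau
    result = [(EMPTY, (tau,))]                       # the pair (empty subtree, {tau})
    for combo in product(*(subtrees(c) for c in children)):
        kept = tuple(th for th, _ in combo if th is not EMPTY)
        rest = tuple(t for _, om in combo for t in om)
        result.append(((l, kept), rest))
    return result
```

For a single vertex this returns exactly the two pairs of the base case above, and for a root with two single-vertex children it returns five pairs (the four combinations of keeping or removing each child, plus the empty subtree).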

We also have to take care of possible equal terms in the formula presented below. This is done as follows: For a given triple $(\tau,\vartheta,\omega)$ write first $\vartheta=[\bar\vartheta_1^{m_1},\dots,\bar\vartheta_q^{m_q}]_l$, where the latter only expresses that $\vartheta$ is composed of $q$ different nonempty trees $\bar\vartheta_1,\dots,\bar\vartheta_q$, each appearing $m_i$ times, hence $\kappa_\vartheta=\sum_{i=1}^q m_i$. For $i=1,\dots,q$, each $\bar\vartheta_i$ is a subtree of some of the subtrees $\bar\tau_{ik}$ of $\tau$, with corresponding remainder multisets $\bar\omega_{ik}$. Assume that there are exactly $p_i$ different such triples $(\bar\tau_{ik},\bar\vartheta_i,\bar\omega_{ik})$, each appearing exactly $r_{ik}$ times, so that $m_i=\sum_{k=1}^{p_i}r_{ik}$. Finally, let $\bar\delta_1,\dots,\bar\delta_{p_0}$ be the distinct trees, with multiplicities $s_1,\dots,s_{p_0}$, of the remainder multiset $\omega$ which are directly connected to the root of $\vartheta$. Then, $\tau$ can be written as

 (9) $\tau=\bigl[\bar\delta_1^{s_1},\dots,\bar\delta_{p_0}^{s_{p_0}},\bar\tau_{11}^{r_{11}},\dots,\bar\tau_{1p_1}^{r_{1p_1}},\dots,\bar\tau_{q1}^{r_{q1}},\dots,\bar\tau_{qp_q}^{r_{qp_q}}\bigr]_l=\bigl[\bar\tau_1^{R_1},\dots,\bar\tau_Q^{R_Q}\bigr]_l,$

where the rightmost expression above indicates that $\tau$ is composed of $Q$ different trees $\bar\tau_1,\dots,\bar\tau_Q$, each appearing $R_i$ times.

With these definitions, we can state the following theorem, proved in [8]:

###### Theorem 2.1 (Composition of B–series).
Let $\phi_x(\emptyset)\equiv 1$. Then the B–series $B(\phi_x,x_0;h)$ inserted into $B(\phi_y,\,\cdot\,;h)$ is again a B–series,

 $B\bigl(\phi_y,B(\phi_x,x_0;h);h\bigr)=B(\phi_x\circ\phi_y,x_0;h),$

where

 $(\phi_x\circ\phi_y)(\tau)=\sum_{(\vartheta,\omega)\in ST(\tau)}\gamma(\tau,\vartheta,\omega)\,\phi_y(\vartheta)\prod_{\delta\in\omega}\phi_x(\delta),$

where $\gamma(\bullet_l,\bullet_l,\emptyset)=\gamma(\bullet_l,\emptyset,\{\bullet_l\})=1$, and

 $\gamma(\tau,\vartheta,\omega)=\frac{R_1!\cdots R_Q!}{s_1!\cdots s_{p_0}!\,r_{11}!\cdots r_{qp_q}!}\prod_{i=1}^q\prod_{k=1}^{p_i}\gamma(\bar\tau_{ik},\bar\vartheta_i,\bar\omega_{ik})^{r_{ik}}$

for $\tau=[\bar\tau_1^{R_1},\dots,\bar\tau_Q^{R_Q}]_l$ given by (9).

The combinatorial term $\gamma(\tau,\vartheta,\omega)$ gives the number of equal terms that would appear if the composition rule were written using ordered trees instead.

In general, the composition law is not commutative. It is, however, linear in its second operand. Further, if both $\phi_x(\emptyset)=\phi_y(\emptyset)=1$, then the composition law can be turned into a group operation (Butcher group, see [6, 10, 11] for the deterministic case): The inverse element $\phi^{-1}$ can be recursively computed by

 (10) $(\phi\circ\phi^{-1})(\tau)=e(\tau)\equiv\begin{cases}1&\text{if }\tau=\emptyset,\\0&\text{otherwise},\end{cases}$

and associativity is proved by

 (11) $B\bigl(\phi_z,B(\phi_y,B(\phi_x,x_0;h);h);h\bigr)=B\bigl(\phi_x\circ(\phi_y\circ\phi_z),x_0;h\bigr)=B\bigl((\phi_x\circ\phi_y)\circ\phi_z,x_0;h\bigr).$

This holds even if $\phi_z(\emptyset)\neq 1$.

The next result will be needed for the investigation of modified Newton iterations.

###### Lemma 2.2.
If $\phi_x(\emptyset)=0$, we have

 $\partial_2 B(\phi_y,x_0;h)\,B(\phi_x,x_0;h)=B(\phi_x*\phi_y,x_0;h),$

where the bi-linear operator $*$ is given by

 (12) $(\phi_x*\phi_y)(\tau)=\begin{cases}0&\text{if }\tau=\emptyset,\\\displaystyle\sum_{(\vartheta,\{\delta\})\in SP(\tau)}\gamma(\tau,\vartheta,\{\delta\})\cdot\phi_y(\vartheta)\phi_x(\delta)&\text{otherwise},\end{cases}$

with

 $SP(\tau)=\bigl\{(\vartheta,\omega)\in ST(\tau):\ \omega\text{ contains exactly one element }\delta\bigr\}.$
###### Proof.

Written in full, the statement of the lemma claims that

 (13) $\sum_{\vartheta\in T}\sum_{\delta\in T}\alpha(\vartheta)\alpha(\delta)\cdot\phi_y(\vartheta)\phi_x(\delta)\cdot\bigl(\partial F(\vartheta)F(\delta)\bigr)=\sum_{\tau\in T\setminus\{\emptyset\}}\alpha(\tau)\Bigl(\sum_{(\vartheta,\{\delta\})\in SP(\tau)}\gamma(\tau,\vartheta,\{\delta\})\cdot\phi_y(\vartheta)\phi_x(\delta)\Bigr)\cdot F(\tau).$

This is true if

 (14) $\partial F(\vartheta)F(\delta)=\sum_{\tau\in A(\vartheta,\delta)}\beta(\tau,\vartheta,\delta)F(\tau)$  and  (15) $\alpha(\vartheta)\alpha(\delta)\beta(\tau,\vartheta,\delta)=\alpha(\tau)\gamma(\tau,\vartheta,\{\delta\}),$

where $A(\vartheta,\delta)$ is the set of all $\tau$'s constructed by attaching the root of $\delta$ by a branch to one of the vertices of $\vartheta$. We will prove this by induction.

First, let $\vartheta=\emptyset$. Since $F(\emptyset)(x_0)=x_0$ we have $\partial F(\emptyset)F(\delta)=F(\delta)$, and (14) and (15) are trivially true with $A(\emptyset,\delta)=\{\delta\}$ and $\beta(\delta,\emptyset,\delta)=\gamma(\delta,\emptyset,\{\delta\})=1$. Now, let $\vartheta=\bullet_l$. Then $\partial F(\bullet_l)F(\delta)=g_l'F(\delta)=F([\delta]_l)$. As $\alpha([\delta]_l)=\alpha(\delta)$ this gives $\beta([\delta]_l,\bullet_l,\delta)=\gamma([\delta]_l,\bullet_l,\{\delta\})=1$, and again (15) is trivially true. Finally, let

 $\vartheta=\bigl[\bar\vartheta_1^{r_1},\dots,\bar\vartheta_i^{r_i},\dots,\bar\vartheta_q^{r_q}\bigr]_l$

with distinct trees $\bar\vartheta_1,\dots,\bar\vartheta_q$. Then

 $\partial F(\vartheta)F(\delta)=g_l^{(\kappa_\vartheta+1)}\bigl(F(\delta),\underbrace{F(\bar\vartheta_1),\dots,F(\bar\vartheta_1)}_{r_1\text{ times}},\dots,\underbrace{F(\bar\vartheta_q),\dots,F(\bar\vartheta_q)}_{r_q\text{ times}}\bigr)+\sum_{i=1}^q r_i\,g_l^{(\kappa_\vartheta)}\bigl(F(\bar\vartheta_1),\dots,\partial F(\bar\vartheta_i)F(\delta),\underbrace{F(\bar\vartheta_i),\dots,F(\bar\vartheta_i)}_{r_i-1\text{ times}},\dots,F(\bar\vartheta_q)\bigr),$

where $\kappa_\vartheta=\sum_{i=1}^q r_i$, so $\tau\in A(\vartheta,\delta)$ is either

 $\tau=\bigl[\delta,\bar\vartheta_1^{r_1},\dots,\bar\vartheta_i^{r_i},\dots,\bar\vartheta_q^{r_q}\bigr]_l \quad\text{with}\quad \beta(\tau,\vartheta,\delta)=1$

or

 $\tau=\bigl[\bar\vartheta_1^{r_1},\dots,\tau_i,\bar\vartheta_i^{r_i-1},\dots,\bar\vartheta_q^{r_q}\bigr]_l \quad\text{with}\quad \tau_i\in A(\bar\vartheta_i,\delta) \text{ and } \beta(\tau,\vartheta,\delta)=r_i\,\beta(\tau_i,\bar\vartheta_i,\delta).$

In the first case, if $\delta=\bar\vartheta_j$ for some $j$, then $\alpha(\tau)=\frac{1}{r_j+1}\alpha(\vartheta)\alpha(\delta)$ and $\gamma(\tau,\vartheta,\{\delta\})=r_j+1$, so that (15) holds with $\beta(\tau,\vartheta,\delta)=1$. Otherwise the same is valid with $\alpha(\tau)=\alpha(\vartheta)\alpha(\delta)$ and $\gamma(\tau,\vartheta,\{\delta\})=1$. So (15) holds. In the second case, assume that our induction hypothesis (15) is true for all $\tau_i\in A(\bar\vartheta_i,\delta)$. We obtain

 $\alpha(\tau)=\frac{r_i}{M}\,\frac{\alpha(\tau_i)}{\alpha(\bar\vartheta_i)}\,\alpha(\vartheta) \quad\text{and}\quad \gamma(\tau,\vartheta,\{\delta\})=M\,\gamma(\tau_i,\bar\vartheta_i,\{\delta\})$

with $M=r_j+1$ if $\tau_i=\bar\vartheta_j$ for some $j$ and $M=1$ otherwise. It follows that

 $\alpha(\tau)\gamma(\tau,\vartheta,\{\delta\})=r_i\,\frac{\alpha(\tau_i)}{\alpha(\bar\vartheta_i)}\,\alpha(\vartheta)\,\gamma(\tau_i,\bar\vartheta_i,\{\delta\})=\alpha(\vartheta)\alpha(\delta)\,r_i\,\beta(\tau_i,\bar\vartheta_i,\delta)=\alpha(\vartheta)\alpha(\delta)\,\beta(\tau,\vartheta,\delta). \qquad\blacksquare$

### 2.2. B–series of the exact and the numerical solutions

From the results of the previous subsection, it is possible to find the B–series of the exact and numerical solutions. Here, the proofs are only sketched; for details consult [7, 8].

###### Theorem 2.2.
The solution $X(t_0+h)$ of (4) can be written as a B–series $B(\varphi,x_0;h)$ with

 $\varphi(\emptyset)\equiv 1,\qquad \varphi(\bullet_l)(h)=W_l(h),\qquad \varphi\bigl(\tau=[\tau_1,\dots,\tau_\kappa]_l\bigr)(h)=\int_0^h\prod_{j=1}^\kappa\varphi(\tau_j)(s)\star dW_l(s).$
###### Proof.

Write the exact solution as some B–series $X(t_0+h)=B(\varphi,x_0;h)$. As $\varphi(\emptyset)\equiv 1$, apply Lemma 2.1 to $g_l$ and Definitions 2.1–2.3 to obtain

 (16) $g_l\bigl(B(\varphi,x_0;h)\bigr)=\sum_{\tau\in T_l}\alpha(\tau)\cdot\varphi_l'(\tau)(h)\cdot F(\tau)(x_0)$

in which

 $\varphi_l'(\tau)(h)=\begin{cases}1&\text{if }\tau=\bullet_l,\\\displaystyle\prod_{j=1}^\kappa\varphi(\tau_j)(h)&\text{if }\tau=[\tau_1,\dots,\tau_\kappa]_l\in T_l.\end{cases}$

Insert this into the SDE (3) and compare term by term. ∎
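As an illustration of the recursion for $\varphi$, consider the tree $[\bullet_1]_1$ appearing in Example 1.3. Over a step of length $h$, in the Itô case,

```latex
\varphi\bigl([\bullet_1]_1\bigr)(h)
  = \int_0^h \varphi(\bullet_1)(s)\,\mathrm{d}W_1(s)
  = \int_0^h W_1(s)\,\mathrm{d}W_1(s)
  = \tfrac{1}{2}\bigl(W_1(h)^2 - h\bigr)
  = I_{(1,1),h},
```

which is exactly the iterated integral multiplying $(g_1'g_1)$ in the Milstein scheme (6).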

###### Theorem 2.3.
The numerical solution given by (5) can be written as a B–series

 $Y_1=B(\Phi,x_0;h)$

with $\Phi$ recursively defined by

 (17a) $\Phi(\emptyset)\equiv 1,$
 (17b) $\Phi(\tau)=\Phi^{ex}(\tau)+(\Phi\circ\Phi^{im})(\tau).$
###### Proof.

Write $Y_1=B(\Phi,x_0;h)$ and insert this into (5). As $\Phi(\emptyset)\equiv 1$, apply Theorem 2.1, and compare term by term. ∎

To study the consistency of the numerical methods, we need to assign to each tree an order:

###### Definition 2.4 (Tree order).

The order of a tree $\tau\in T$ respectively $u\in U_f$ is defined by

 $\rho(\emptyset)=0,\qquad \rho\bigl(u=[\tau_1,\dots,\tau_\kappa]_f\bigr)=\sum_{i=1}^\kappa\rho(\tau_i),$

and

 $\rho\bigl(\tau=[\tau_1,\dots,\tau_\kappa]_l\bigr)=\sum_{i=1}^\kappa\rho(\tau_i)+\begin{cases}1&\text{for }l=0,\\\tfrac12&\text{otherwise}.\end{cases}$
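The order recursion for trees in $T$ is straightforward to compute; a sketch with a colored tree encoded as a nested tuple $(l,(\text{children}))$ (our own encoding; trees in $U_f$, whose root contributes nothing, are not covered here):

```python
def rho(tau):
    """Order of a tree tau in T by Definition 2.4: the root contributes 1 if
    its color is 0 (drift) and 1/2 otherwise (diffusion); children add recursively."""
    l, children = tau
    return (1.0 if l == 0 else 0.5) + sum(rho(c) for c in children)

# e.g. the Milstein tree [bullet_1]_1 from Example 1.3 has order 0.5 + 0.5 = 1:
bullet1 = (1, ())
print(rho((1, (bullet1,))))   # prints 1.0
```

This matches the usual convention that each drift integral contributes a full order and each Wiener integral half an order.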