Stochastic B–series analysis of iterated Taylor methods
Abstract.
For stochastic implicit Taylor methods that use an iterative scheme to compute their numerical solution, stochastic B–series and corresponding growth functions are constructed. From these, convergence results are derived that depend on the order of the underlying Taylor method, the choice of the iteration method, the predictor, and the number of iterations, for Itô and Stratonovich SDEs, and for weak as well as strong convergence. As a special case, the application of Taylor methods to ODEs is also considered. The theory is supported by numerical experiments.
Key words and phrases:
Stochastic Taylor method, stochastic differential equation, iterative scheme, order, Newton’s method, weak approximation, strong approximation, growth functions, stochastic B–series
2000 Mathematics Subject Classification:
65C30, 60H35, 65C20, 68U20
1. Introduction
Besides stochastic Runge–Kutta methods, stochastic Taylor methods form an important class of schemes for approximating the solution of stochastic differential equations (SDEs). As in the deterministic setting [1], they are especially suitable for problems of moderate dimension, and there particularly for strong approximation, because weak approximation of low-dimensional problems can often be done more efficiently by numerically solving the corresponding deterministic PDE problem obtained by applying the Feynman–Kac formula. For solving stiff SDEs, implicit methods have to be considered, as the following two examples illustrate.
Example 1.1 (see [12]).
Consider the linear Itô SDE
(1) 
with . We assume that the exact solution is mean-square stable, i. e.
which is the case if and only if . To ensure that the numerical approximation obtained with the (explicit) Euler–Maruyama scheme with step size is also mean-square stable, i. e. , we have to restrict the step size according to , whereas for the numerical approximations explode, . In contrast, the semi-implicit Euler scheme is mean-square stable without any step size restriction. For numerical confirmation, see Figure 1.
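The stability comparison of Example 1.1 can be reproduced with a short Monte Carlo sketch. Since the displayed formulas of the example are not reproduced above, the sketch assumes the standard linear test equation dX = λX dt + μX dW with the illustrative values λ = −5, μ = 1 (mean-square stable, since 2λ + μ² < 0); the scheme names follow the text, everything else is an assumption of this sketch.

```python
import numpy as np

rng = np.random.default_rng(0)
lam, mu = -5.0, 1.0     # illustrative parameters: 2*lam + mu**2 < 0, so the exact
                        # solution of dX = lam*X dt + mu*X dW is mean-square stable
T, M = 10.0, 20000      # final time and number of simulated paths

def euler_maruyama_ms(h):
    """E[X_N^2] for the explicit Euler-Maruyama scheme with step size h."""
    X = np.ones(M)
    for _ in range(int(T / h)):
        dW = rng.normal(0.0, np.sqrt(h), M)
        X = X + h * lam * X + mu * X * dW
    return np.mean(X**2)

def semi_implicit_euler_ms(h):
    """E[X_N^2] for the drift-implicit (semi-implicit) Euler scheme; the implicit
    equation is linear here and is solved in closed form."""
    X = np.ones(M)
    for _ in range(int(T / h)):
        dW = rng.normal(0.0, np.sqrt(h), M)
        X = (X + mu * X * dW) / (1.0 - h * lam)
    return np.mean(X**2)

# The explicit scheme is mean-square stable only for (1 + h*lam)**2 + h*mu**2 < 1
# (here h < 0.36), while the semi-implicit scheme is stable for every h > 0.
print(euler_maruyama_ms(0.5), semi_implicit_euler_ms(0.5))
```

With h = 0.5 the explicit mean-square norm explodes while the semi-implicit one decays, mirroring Figure 1.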
Example 1.2.
Consider the following stochastic Van der Pol equation,
Application of the explicit Milstein scheme (see Example 1.3 with and ) with step size to approximate a solution path leads to an explosion of the approximation, see Figure 2(a), whereas application of the semi-implicit Milstein scheme, obtained by substituting by ( and in Example 1.3), yields (for the same Brownian path) the result shown in Figure 2(b).
Implicit stochastic Taylor methods have been considered both for strong [15, 17] and weak [15] approximation. For these methods, the approximation values are only given implicitly. In practice, however, these implicit equations are solved by iterative schemes such as simple iteration or Newton iteration. The “exact numerical” solution can be written in terms of B–series [8]. As we will prove in this paper, so can the iterated solution. Moreover, for each iteration scheme in question, we will define a growth function. Briefly explained, when the exact numerical solution and the iterated solution are both written in terms of B–series, then all terms of these series for which the growth function has a value not greater than coincide. Thus the growth functions give a quite exact description of the development of the iterations. B–series and corresponding growth functions for iterated solutions have been derived for Runge–Kutta methods applied to deterministic ordinary differential equations [14], differential algebraic equations [13], and SDEs [7]. Somewhat surprisingly, the growth functions are exactly the same in all these cases, and, as we will show in this paper, this also holds for implicit Taylor methods.
The outline of the paper is as follows: First, we state the SDE to be solved and the iterated Taylor methods used for its approximation. In Section 2, stochastic B–series are introduced and some useful preliminary results are presented. The main results of the paper can be found in Section 3, where the B–series of the iterated solutions are developed and the aforementioned growth functions derived. In Section 4, these findings are interpreted in terms of the order of the overall scheme, giving concrete results on the order of the considered methods depending on the kind and number of iterations, both for SDEs and ODEs. In contrast to the results obtained for Runge–Kutta methods [7], the order of the iteration error is shown to be independent of whether Itô or Stratonovich SDEs, weak or strong convergence is considered. Finally, in Section 5, we present several numerical examples to support our theoretical findings.
Let be a probability space. We denote by the stochastic process which is the solution of a dimensional SDE defined by
(2) 
with an dimensional Wiener process and . As usual, (2) is understood as an abbreviation of
(3) 
The integral w. r. t. the Wiener process is to be interpreted either as an Itô integral with or as a Stratonovich integral with . We assume that the Borel-measurable coefficients are sufficiently differentiable and chosen such that SDE (3) has a unique solution.
To simplify the presentation, we define , so that (3) can be written as
(4) 
Let a discretization with of the time interval with step sizes for be given. Now, we consider the general class of stochastic Taylor methods given by and
(5) 
for with , , , .
Example 1.3.
Consider the family of Milstein schemes applied to an Itô SDE with one-dimensional noise,
(6)  
Here,
and the parameters indicate the degree of implicitness. When we have the explicit Milstein scheme, with , a semi-implicit scheme. In all cases, the method (6) can be written in the form (5) with
The terms and refer to the time-dependent and the method-dependent part of each term: in this case
The notation will be explained in detail in Section 2.
As far as the general method (5) is concerned, application of an iterative Newton-type method yields
(7) 
with some approximation to the Jacobian of and a predictor . In the following, we assume that (7) can be solved uniquely, at least for sufficiently small . To simplify the presentation, we further assume that all step sizes are constant, .
There are several common choices for the approximation . If we choose to be the exact Jacobian , we obtain the classical Newton iteration for solving (5), with quadratic convergence; in the following, it will be denoted as full Newton iteration. If we choose instead , we obtain the so-called modified Newton iteration, which is only linearly convergent. Here, is independent of the iteration number , so its computation is much cheaper and simpler than in the full Newton case. The third and simplest possibility is to choose equal to zero. In this case we do not even have to solve a linear system for . This iteration method is called simple iteration or predictor–corrector method. Its disadvantage is that for stiff systems it requires very small step sizes to converge.
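Since the symbols of (7) are not reproduced above, the following scalar sketch illustrates the three choices of the Jacobian approximation for the simplest implicit Taylor method, the drift-implicit Euler step Y = y0 + h·f(Y); the function names and the test problem are assumptions of this sketch, not part of the paper.

```python
def implicit_euler_step(f, df, y0, h, variant="full", kmax=10, tol=1e-12):
    """Solve the implicit Euler step Y = y0 + h*f(Y) by a Newton-type iteration.

    variant: "full"     -> Jacobian re-evaluated at every iterate (classical Newton),
             "modified" -> Jacobian frozen at the predictor (modified Newton),
             "simple"   -> Jacobian approximation zero (simple iteration).
    """
    Y = y0 + h * f(y0)             # explicit Euler predictor
    J_frozen = 1.0 - h * df(Y)     # Jacobian of F(Y) = Y - y0 - h*f(Y) at the predictor
    for _ in range(kmax):
        F = Y - y0 - h * f(Y)
        if variant == "full":
            J = 1.0 - h * df(Y)
        elif variant == "modified":
            J = J_frozen
        else:                      # simple iteration: no linear system to solve
            J = 1.0
        Y_new = Y - F / J
        if abs(Y_new - Y) < tol:
            return Y_new
        Y = Y_new
    return Y

# Stiff linear test problem y' = lam*y; the implicit Euler step is y0/(1 - h*lam).
lam = -50.0
f, df = (lambda y: lam * y), (lambda y: lam)
exact = 1.0 / (1.0 - 0.1 * lam)
assert abs(implicit_euler_step(f, df, 1.0, 0.1, "full") - exact) < 1e-12
assert abs(implicit_euler_step(f, df, 1.0, 0.1, "modified") - exact) < 1e-12
# The simple iteration has contraction factor |h*lam| = 5 > 1 and diverges here,
# which is why it requires very small step sizes for stiff problems.
```

On this linear problem the full and modified Newton iterations coincide and converge in one step; the differences between the variants only become visible for nonlinear drift.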
For most stiff problems and for problems with additive noise, the use of semi-implicit methods suffices. As will be demonstrated in Section 4, these have the advantage that fewer iterations are required to obtain the correct order of the underlying method.
2. Stochastic B–series
B–series, symbolized by , were first constructed for SDEs by Burrage and Burrage [2, 3] to study strong convergence in the Stratonovich case. In the following years, this approach was further developed by several authors to study weak and strong convergence, in the Itô and the Stratonovich case; see e. g. [7] for an overview. A uniform theory for the construction of stochastic B–series was presented in [7]; in [8], this approach was used to construct order conditions for implicit Taylor methods. Following the exposition of these two papers, we define stochastic B–series in this section and present some preliminary results that will be used later.
2.1. Some useful definitions and preliminary results
Definition 2.1 (Trees).
\thlabelbsa2:def:trees The set of colored, rooted trees
is recursively defined as follows:
 a):

The graph with only one vertex of color belongs to .
Let be the tree formed by joining the subtrees each by a single branch to a common root of color .
 b):

If then .
Thus, is the set of trees with an colored root, and is the union of these sets.
Definition 2.2 (Elementary differentials).
\thlabelbsa2:def:eldiff For a tree the elementary differential is a mapping defined recursively by
 a):

,
 b):

,
 c):

If then
With these definitions in place, we can define the stochastic B–series:
Definition 2.3 (B–series).
\thlabelbsa2:def:bseries A (stochastic) B–series is a formal series of the form
where assigns to each tree a random variable, and is given by
where count equal trees among .
Note that is the inverse of the order of the automorphism group of .
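Since the coefficient recursion of Definition 2.3 is not reproduced above, the following sketch computes the coefficient mentioned in the remark, the inverse of the order of the automorphism group of a rooted tree. Trees are represented as nested tuples of subtrees and colors are ignored; the representation and the function names are assumptions of this sketch.

```python
from math import factorial
from collections import Counter

def canon(tree):
    """Canonical form: sort subtrees recursively so equal trees compare equal."""
    return tuple(sorted(canon(c) for c in tree))

def alpha(tree):
    """Inverse of the order of the automorphism group of a rooted tree.

    Trees are nested tuples of subtrees; a leaf is (). Colors are ignored,
    i.e. all vertices are assumed to carry the same color.
    """
    counts = Counter(canon(c) for c in tree)
    a = 1.0
    for subtree, r in counts.items():       # r equal copies contribute alpha**r / r!
        a *= alpha(subtree) ** r / factorial(r)
    return a

leaf = ()
assert alpha(leaf) == 1.0                   # single vertex: trivial automorphism group
assert alpha((leaf, leaf)) == 1 / 2         # two equal branches can be swapped
assert alpha((leaf, leaf, leaf)) == 1 / 6   # three equal branches: 3! permutations
assert alpha(((leaf, leaf),)) == 1 / 2      # the symmetry sits inside the single branch
```

The recursion mirrors the definition: r equal subtrees contribute a factor 1/r! on top of the product of their own coefficients.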
To simplify the presentation, in the following we assume that all elementary differentials exist and all considered B–series converge. Otherwise, one has to consider truncated B–series and discuss the remainder term.
The next lemma is proved in [7].
Lemma 2.1.
\thlabelbsa2:lem:f_y If with is some B–series and , then can be written as a formal series of the form
(8) 
where
 a):

is a set of trees derived from as follows: , and if
then ,  b):

and
,  c):

and ,
where count equal trees among ,  d):

and .
For notational convenience, in the following the dependency of the weight functions and the dependency of the elementary differentials will be suppressed whenever this is unambiguous, so will be written as and as .
The next step is to present the composition rule for B–series. In the deterministic case, this is given e. g. in [10], using ordered trees. The same rule applies for multicolored trees, as in the stochastic case. However, it is also possible to present the result without relying on ordered trees, as done in [8], and this is the approach that will be used in the following.
Consider triples consisting of some , a subtree sharing the root with , and a remainder multiset of trees left over when is removed from . We also include the empty tree as a possible subtree, in which case the triple becomes .
Example 2.1.
\thlabelbsa2:ex:triples Two examples of such triples are
So, for the same and there might be different ’s.
We next define as the set of all possible subtrees of together with the corresponding remainder multiset , that is, for each we have
We also have to take care of possible equal terms in the formula presented below. This is done as follows: For a given triple write first , where the latter only expresses that is composed of distinct non-empty trees, each appearing times, hence . Let . For , each is a subtree of some of the ’s, with corresponding remainder multisets . Assume that there are exactly different such triples, each appearing exactly times, so that . Finally, let be the distinct trees with multiplicity , , of the remainder multiset which are directly connected to the root of . Then, can be written as
(9) 
where the rightmost expression above indicates that is composed of distinct trees, each appearing times.
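The decomposition into root-sharing subtrees and remainder multisets can be enumerated mechanically. The following sketch (nested-tuple trees, colors ignored, all names chosen here) lists every such pair for a given tree, with the empty subtree represented by None; duplicate pairs in the output correspond to the equal triples counted in the paragraph above.

```python
from itertools import product

def canon(tree):
    """Canonical form: sort subtrees recursively so equal trees compare equal."""
    return tuple(sorted(canon(c) for c in tree))

def _nonempty_pairs(tree):
    """(subtree, remainder-forest) pairs where the subtree is non-empty and
    shares its root with `tree`."""
    child_options = []
    for child in tree:
        options = [(None, (canon(child),))]      # cut this child off entirely
        options += _nonempty_pairs(child)        # or keep a root-sharing piece of it
        child_options.append(options)
    pairs = []
    for combo in product(*child_options):
        subtree = tuple(sorted(s for s, _ in combo if s is not None))
        remainder = tuple(sorted(sum((r for _, r in combo), ())))
        pairs.append((subtree, remainder))
    return pairs

def root_subtree_pairs(tree):
    """All (subtree, remainder) pairs, with None standing for the empty subtree."""
    return [(None, (canon(tree),))] + _nonempty_pairs(tree)

leaf = ()
cherry = (leaf, leaf)                 # root with two leaf children
pairs = root_subtree_pairs(cherry)
# empty subtree, root alone, root plus one leaf (counted twice), the whole tree
assert len(pairs) == 5
```

For the single vertex there are exactly two pairs: the empty subtree with the vertex as remainder, and the vertex itself with empty remainder.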
With these definitions, we can state the following theorem, proved in [8]:
Theorem 2.1 (Composition of B–series).
\thlabelbsa2:thm:comp Let and . Then the B–series inserted into is again a B–series,
where
, and
for given by (9).
The combinatorial term gives the number of equal terms that would appear if the composition rule were instead formulated using ordered trees.
In general, the composition law is not linear, nor is it associative. It is, however, linear in its second operand. Further, if both , then the composition law can be turned into a group operation (the Butcher group; see [6, 10, 11] for the deterministic case): the inverse element can be computed recursively by
(10) 
and associativity is proved by
(11) 
This holds even if .
The next result will be needed for the investigation of modified Newton iterations.
Lemma 2.2.
\thlabelbsa2:lem:Bfprime If we have
where the bilinear operator is given by
(12) 
with
Proof.
Written in full, the statement of the lemma claims that
(13) 
This is true if
(14)  
and  
(15) 
where is the set of all trees constructed by attaching to one of the vertices of . We will prove this by induction.
First, let . Since we have and (14) and (15) are trivially true with . Now, let . Then . As this gives , and again (15) is trivially true. Finally, let
with distinct trees . Then
where , so is either
or
In the first case, if for some , then and with . Otherwise the same is valid with . So (15) holds. In the second case, assume that our induction hypothesis (15) is true for all . We obtain
with if for some and otherwise. It follows that
∎
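The set appearing in (15), consisting of all trees obtained by attaching one tree to a vertex of another, can also be enumerated directly. The sketch below (nested-tuple trees, names chosen here) produces one result per vertex of the host tree, with duplicates whenever that tree has symmetries.

```python
def attach_everywhere(t, s):
    """All trees obtained from t by attaching s as a new child of one vertex of t.

    Trees are nested tuples of subtrees; the returned list has one entry per
    vertex of t (equal entries occur when t has symmetries).
    """
    results = [t + (s,)]                               # attach s at the root of t
    for i, child in enumerate(t):
        for new_child in attach_everywhere(child, s):  # attach s inside a child
            results.append(t[:i] + (new_child,) + t[i + 1:])
    return results

leaf = ()
cherry = (leaf, leaf)
assert len(attach_everywhere(leaf, cherry)) == 1   # one vertex to attach to
assert len(attach_everywhere(cherry, leaf)) == 3   # root plus two leaves
```

The vertex count of the host tree thus reappears as the number of summands, matching the role this set plays in the induction step of the proof.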
2.2. B–series of the exact and the numerical solutions
From the results of the previous subsection, it is possible to find the B–series of the exact and numerical solutions. Here, the proofs are only sketched; for details, consult [7, 8].
Proof.
Write the exact solution as some B–series . As , apply \threfbsa2:lem:f_y to and \threfbsa2:def:trees,bsa2:def:eldiff,bsa2:def:bseries to obtain
(16) 
in which
Insert this into the SDE (3) and compare term by term. ∎
Theorem 2.3.
\thlabelbsa2:thm:Bnum The numerical solution given by (5) can be written as a B–series
with recursively defined by
(17a)  
(17b) 
Proof.
Write and insert this into (5). As , apply \threfbsa2:thm:comp, and compare term by term. ∎
To study the consistency of the numerical methods, we need to assign to each tree an order:
Definition 2.4 (Tree order).
The order of a tree respectively is defined by
and