Application of Operator Splitting Methods in Finance

# Application of Operator Splitting Methods in Finance

Karel in ’t Hout Karel in ’t Hout Department of Mathematics and Computer Science, University of Antwerp, Middelheimlaan 1, B-2020 Antwerp, Belgium, 22email: karel.inthout@uantwerp.beJari Toivanen Institute for Computational and Mathematical Engineering, Stanford University, Stanford, CA 94305, USA,
Department of Mathematical Information Technology, FI-40014 University of Jyväskylä, Finland, 44email: toivanen@stanford.edu, jari.toivanen@jyu.fi
Jari Toivanen Karel in ’t Hout Department of Mathematics and Computer Science, University of Antwerp, Middelheimlaan 1, B-2020 Antwerp, Belgium, 22email: karel.inthout@uantwerp.beJari Toivanen Institute for Computational and Mathematical Engineering, Stanford University, Stanford, CA 94305, USA,
Department of Mathematical Information Technology, FI-40014 University of Jyväskylä, Finland, 44email: toivanen@stanford.edu, jari.toivanen@jyu.fi
###### Abstract

Financial derivatives pricing aims to find the fair value of a financial contract on an underlying asset. Here we consider option pricing in the partial differential equations framework. The contemporary models lead to one-dimensional or multidimensional parabolic problems of the convection-diffusion type and generalizations thereof. An overview of various operator splitting methods is presented for the efficient numerical solution of these problems.
Splitting schemes of the Alternating Direction Implicit (ADI) type are discussed for multidimensional problems, e.g. given by stochastic volatility (SV) models. For jump models Implicit-Explicit (IMEX) methods are considered which efficiently treat the nonlocal jump operator. For American options an easy-to-implement operator splitting method is described for the resulting linear complementarity problems.
Numerical experiments are presented to illustrate the actual stability and convergence of the splitting schemes. Here European and American put options are considered under four asset price models: the classical Black–Scholes model, the Merton jump-diffusion model, the Heston SV model, and the Bates SV model with jumps.

## 1 Introduction

In the contemporary international financial markets option products are widely traded. The average daily turnover in the global over-the-counter derivatives markets is huge. For example, in the foreign exchange market this was approximately equal to 337 billion US dollars in April 2013 BIS13 (). In addition to standard call and put options, the so-called vanilla options, a broad range of exotic derivatives exists. One of the primary goals of financial mathematics is to determine the fair values of these derivatives as well as their sensitivities to underlying variables and parameters, which are crucial for hedging. To this purpose, advanced mathematical models are employed nowadays, yielding initial-boundary value problems for time-dependent partial differential equations (PDEs) and generalizations thereof, see e.g. Andersen10 (); Clark11 (); Lipton01 (); Shreve08 (); Tavella00 (); Wilmott98 (). These problems are in general multidimensional and of the convection-diffusion kind. In some cases analytical formulas in semi-closed form for the exact solutions have been obtained in the literature. For the majority of option valuation problems, however, such formulas are not available. In view of this, one resorts to numerical methods for their approximate solution. To banks and other financial institutions, the efficient, stable, and robust numerical approximation of option values and their sensitivities is of paramount importance.

A well-known and versatile approach to the numerical solution of time-dependent convection-diffusion equations is given by the method of lines. It consists of two general, consecutive steps. In the first step the PDE is discretized in the spatial variables, e.g. by finite difference, finite volume, or finite element methods. This leads to a so-called semidiscrete system of ordinary differential equations. In the second step the obtained semidiscrete system is numerically solved by applying a suitable, implicit time-discretization method. If the PDE is multidimensional, then the latter task can be computationally very intensive when standard application of classical implicit methods, such as the Crank–Nicolson scheme, is used. In the recent years, a variety of operator splitting methods have been developed that enable a highly efficient and stable numerical solution of semidiscretized multidimensional PDEs and generalizations thereof that arise in financial mathematics.

The aim of this chapter to give an overview of main classes of operator splitting methods with applications in finance. Here we have chosen to consider a variety of, increasingly sophisticated, models that are well-known in the financial option valuation literature.

We deal in the following with two basic types of options, involving a given so-called strike price and a given maturity time , where today is always denoted by time . A European call (put) option is a contract between two parties, the holder and the writer, which gives the holder the right to buy from (sell to) the writer a prescribed asset for the price at the future date . An American call (put) option is the same, except that the holder can exercise at any time between today and the maturity date. An option is a right and not an obligation. The underlying asset can be a stock, a foreign currency, a commodity, etc. For a detailed introduction to financial options we refer to Hull11 (). Clearly, an option has value and a central question in financial mathematics is what its fair value is.

## 2 Models for Underlying Assets

### 2.1 Geometric Brownian Motion

The seminal papers by Black & Scholes Black73 () and Merton Merton73 () present a key equation for the fair values of European call and put options. In these papers the dynamics of the underlying asset price is modeled by the stochastic differential equation (SDE)

 dS(t)=μS(t)dt+σS(t)dW(t)(t≥0). (1)

Here denotes the Wiener process or standard Brownian motion, and , are given real parameters that are called the drift and the volatility, respectively. The volatility is a degree for the uncertainty of the return realized on the asset.

The SDE (1) describes a so-called geometric Brownian motion, which satisfies whenever . Under this asset price model and several additional assumptions, Black, Scholes, and Merton derived the famous partial differential equation (PDE)

 ∂u∂t=12σ2s2∂2u∂s2+rs∂u∂s−ru(s>0,0

Here represents the fair value at time of a European vanilla option if . The quantity in (2) is the risk-free interest rate and is given. A main consequence of the Black, Scholes, and Merton analysis is that the drift actually does not appear in the option pricing PDE. This observation has led to the important risk-neutral valuation theory. It is beyond the scope of the present chapter to discuss this theory in more detail, but see e.g. Hull11 (); Shreve08 ().

In formulating (2) we have chosen as the time till maturity. Thus the time runs in the opposite direction compared to (1). Accordingly, the payoff function , which defines the value of the option contract at maturity time , leads to an initial condition

 u(s,0)=ϕ(s)(s≥0). (3)

For a European vanilla option with given strike price there holds

 ϕ(s)={max(s−K,0)for  s≥0  (call),max(K−s,0)for  s≥0  (put), (4)

and at one has the Dirichlet boundary condition

 u(0,t)={0       for  0≤t≤T  (call),e−rtKfor  0≤t≤T  (put). (5)

Equation (2) is called the Black–Scholes PDE or Black–Scholes–Merton PDE. It is fully deterministic and it can be viewed as a time-dependent convection-diffusion-reaction equation. For European vanilla options, an analytical solution in semi-closed form was derived in Black73 (), constituting the well-known Black–Scholes formula.

The Black–Scholes PDE is generic in the sense that it is valid for a wide range of European-style options. The initial and boundary conditions are determined by the specific option. As an example, for a European up-and-out call option with given barrier , the PDE (2) holds whenever . In this case, the initial condition is

 u(s,0)=max(s−K,0)for  0≤s

and one has the Dirichlet boundary conditions

 u(0,t)=u(B,t)=0for  0≤t≤T.

The homogeneous condition at corresponds to the fact that, by construction, an up-and-out call option becomes worthless whenever the underlying asset price moves above the barrier.

For many types of options, including (continuous) barrier options, semi-analytical pricing formulas have been obtained in the literature in the Black–Scholes framework, see e.g. Hull11 (). At present it is well-known, however, that each of the assumptions underlying this framework are violated to a smaller or larger extent in practice. In particular, the interest rate and the volatility are not constant, but vary in time. In view of this, more advanced asset pricing models have been developed and, as a consequence, more advanced option valuation PDEs are obtained. In this chapter we do not enter into the details of the mathematical connection between asset price SDEs and option valuation PDEs, but mention that a main tool is the celebrated Feynman–Kac theorem, see e.g. Shreve08 (). In the following we discuss typical, contemporary instances of more advanced option valuation PDEs.

### 2.2 Stochastic Volatility and Stochastic Interest Rate Models

Heston Heston93 () modeled the volatility itself by a SDE. The Heston stochastic volatility model is popular especially in the foreign exchange markets. The corresponding option valuation PDE is

 ∂u∂t=12s2v∂2u∂s2+ρσsv∂2u∂s∂v+12σ2v∂2u∂v2+rs∂u∂s+κ(η−v)∂u∂v−ru (6)

for , , and . Here represents the fair value of a European-style option if at time units before maturity the asset price equals and the variance equals . We note that by definition the variance is the square of the volatility. The positive parameters and are the mean-reversion rate and long-term mean, respectively, of the variance, is the volatility-of-variance, and denotes the correlation between the two underlying Brownian motions. Equation (6) is called the Heston PDE. It can be viewed as a time-dependent convection-diffusion-reaction equation on an unbounded, two-dimensional spatial domain. If the correlation is nonzero, which almost always holds in practice, then the Heston PDE contains a mixed spatial derivative term.

For a European vanilla option under the Heston model, one has an initial condition as well as a boundary condition at that are the same as in the Black–Scholes case discussed above. In the Heston case there is also a boundary . Observe that as , then all second-order derivative terms vanish in (6). It has been proved in Ekstrom10 () that for the fair option value function the Heston PDE is fulfilled if , which constitutes the (nonstandard) boundary condition at .

For the Heston asset pricing model (which we did not explicitly formulate) the so-called Feller condition is often considered in the literature. This condition determines whether or not the variance process can attain the value zero (given a strictly positive initial variance): it cannot attain zero if and only if Feller holds. The situation where the Feller condition is violated is well-known to be challenging when numerically solving the Heston asset pricing model. For the Heston option valuation PDE (6), on the other hand, it turns out that this issue is not critical in the numerical solution.

A refinement of the Heston model is obtained by considering also a stochastic interest rate, see e.g. Grzelak11 (); Grzelak12 (); Haentjens13a (); Haentjens12 (). As an illustration we consider the case where the interest rate is described by the well-known Hull–White model Hull11 (); Hull90 (). This leads to the following so-called Heston–Hull–White (HHW) PDE for the option value function :

 ∂u∂t= 12s2v∂2u∂s2+12σ21v∂2u∂v2+12σ22∂2u∂r2+ρ12σ1sv∂2u∂s∂v+ρ13σ2s√v∂2u∂s∂r +ρ23σ1σ2√v∂2u∂v∂r+rs∂u∂s+κ(η−v)∂u∂v+a(b(T−t)−r)∂u∂r−ru (7)

for , , , and . Here , , , , and are given positive real constants and denotes a given deterministic, positive function of time. Further, there are given correlations , , . Clearly, the HHW PDE is a time-dependent convection-diffusion-reaction equation on an unbounded, three-dimensional spatial domain with three mixed derivative terms. For a European vanilla option, initial and boundary conditions are the same as in the Heston case above. Note that if , then all second-order derivative terms, apart from the term, vanish in (2.2).

The Heston and HHW models are two of many instances of asset pricing models that lead to multidimensional option valuation PDEs. Multidimensional PDEs are also obtained when considering other types of options, e.g. options on a basket of assets. Then, in the Black–Scholes framework, the dimension of the PDE is equal to the number of assets. In general, analytical solutions in (semi-)closed form to these PDEs are not available.

### 2.3 Jump Models

Sometimes the value of the underlying asset changes so rapidly that this would have very tiny probability under the above Brownian motion based models. For example, the stock price during a market crash or after a major news event can move very fast. Already in 1976, Merton proposed in Merton76 () to add a jump component in the model of the underlying asset price. In his model, the jumps are log-normally distributed and their arrival times follow a Poisson process. After a jump the value of the asset is obtained by multiplying the value before the jump by a random variable with the probability density function (PDF)

 f(y)=1yδ√2πexp(−(logy−γ)22δ2) (8)

for , where is the mean of the normal distribution and is its standard deviation. Kou proposed in Kou02 () a log-double-exponential distribution defined by the PDF

 (9)

where , and are positive constants such that . These models have finite jump activity which is denoted by here. There are also many popular infinite jump activity models like the CGMY model Carr02 (). In the following we shall consider only finite activity models.

The value of a European option satisfies the partial integro-differential equation (PIDE)

 ∂u∂t=12σ2s2∂2u∂s2+(r−λζ)s∂u∂s−(r+λ)u+λ∫∞0u(sy,t)f(y)dy (10)

for and , where is the mean jump size given by

 ζ=∫∞0(y−1)f(y)dy. (11)

For the Merton and Kou models the mean jumps are and , respectively.

Bates proposed to combine the Heston stochastic volatility model and the Merton jump model in Bates96 (). Under this model the value of a European option satisfies the PIDE

 ∂u∂t=12s2v∂2u∂s2+ρσsv∂2u∂s∂v+12σ2v∂2u∂v2+(r−λζ)s∂u∂s+κ(η−v)∂u∂v−(r+λ)u+λ∫∞0u(sy,v,t)f(y)dy (12)

for , , and , where the PDF is given by (8). For an extensive discussion on jump models in finance see e.g. Cont04 ().

## 3 Linear Complementarity Problem for American Options

Unlike European-style options, American-style options can be exercised at any time up to the maturity date. Hence, the fair value of an American option is always greater than or equal to the instantaneous payoff,

 u≥ϕ. (13)

Due to this early exercise constraint, the P(I)DE does not hold everywhere anymore. Instead, a linear complementarity problem (LCP) or partial (integro-)differential complementarity problem is obtained in general for the fair value of an American option:

 ⎧⎪ ⎪ ⎪⎨⎪ ⎪ ⎪⎩∂u∂t≥Au,u≥ϕ,(∂u∂t−Au)(u−ϕ)=0, (14)

where stands for the pertinent spatial differential operator. For example, for the Black–Scholes model,

 Au=12σ2s2∂2u∂s2+rs∂u∂s−ru.

The above inequalities and equation hold pointwise. The equation in (14) is the complementarity condition. It states that at each point one of the two inequalities has to be an equality. The paper Huang98 () discusses the LCP formulation for American-style options under various asset price models and studies the structure and properties of the obtained fully discrete LCPs.

We note that the penalty approach is a popular alternative for LCPs. Here a penalty term is added to the P(I)DE for a European option with the aim to enforce the early exercise constraint (13). The resulting problems are nonlinear and their efficient numerical solution is considered in Forsyth02 (), for example. For several other alternative formulations and approximations for LCPs, we refer to Toivanen10b ().

## 4 Spatial Discretization

In this chapter we employ finite difference (FD) discretizations for the spatial derivatives. An alternative approach would be to use finite element discretizations; see e.g. Achdou05 (); Seydel12 (). It is common practice to first truncate the infinite -domain to with a sufficiently large, real . Typically one wishes to be such that the error caused by this truncation is a small fraction of the error due to the discretization of the differential (and integral) operators. Similarly, with multidimensional models including the variance or the interest rate , their corresponding infinite domains are truncated to sufficiently large bounded domains. The truncation requires additional boundary conditions to be specified. For an actual choice of these conditions for the models considered in Sections 2, 3 we refer to Section 7.

Let the grid in the -direction be defined by the grid points . The corresponding grid sizes are denoted by ,  . For multidimensional models, we use tensor product grids. For example, in the case of a stochastic volatility model, if a grid for the variance is given by , then spatial grid points are defined by with and . In financial applications nonuniform grids are often preferable over uniform grids. The use of suitable nonuniform grids will be illustrated in Section 7.

For discretizing the first derivative and the second derivative at , we employ in this chapter the well-known central FD schemes

 ∂ui∂s≈−Δsi+1Δsi(Δsi+Δsi+1)ui−1+Δsi+1−ΔsiΔsiΔsi+1ui+Δsi(Δsi+Δsi+1)Δsi+1ui+1 (15)

and

 ∂2ui∂s2≈2Δsi(Δsi+Δsi+1)ui−1−2ΔsiΔsi+1ui+2(Δsi+Δsi+1)Δsi+1ui+1. (16)

With multidimensional models the analogous schemes are used for the other spatial directions, thus e.g. for and at . For the mixed derivative at we consider the 9-point stencil obtained by successively applying the central FD schemes for the first derivative in the - and -directions. With sufficiently smooth varying grid sizes, the above central FDs give second-order accurate approximations for the derivatives.

We mention that in financial applications other FD schemes are employed as well, such as upwind discretization for first derivative terms or alternative discretizations for mixed derivative terms.

With the jump models the integral term needs to be discretized at grid points . First the integral is divided into two parts

 ∫∞0u(siy,t)f(y)dy=∫Smax/si0u(siy,t)f(y)dy+∫∞Smax/siu(siy,t)f(y)dy,

which correspond to the values of in the computational domain and outside of it, respectively. The second part can be estimated using knowledge about in the far field . For example, for put options is usually assumed to be close to zero for and, thus, the second integral is approximated by zero in this case. The PDFs are smooth functions apart from the potential jump at in the Kou model. Due to the smoothness of the integrand the trapezoidal rule leads to second-order accuracy with respect to the grid size. This gives the approximation

 ∫Smax/si0u(siy,t)f(y)dy≈m1∑j=1Δsj2si(u(sj−1,t)f(sj−1/si)+u(sj,t)f(sj/si)).

For example, the papers Salmi11 () and Toivanen08 () describe more accurate quadrature rules for the Merton and Kou jumps models, respectively. The discretization of the integral term leads to a dense matrix. The integral can be transformed into a convolution integral and due to this FFT can be used to compute it more efficiently; see Almendral05 (); Andersen00 (); dHalluin05 (); Tavella00 (), for example. In the case of the Kou model, efficient recursion formulas can be used Carr07 (); Toivanen08 ().

## 5 Time Discretization

### 5.1 The θ-method

For any P(I)DE from Section 2, the spatial discretization outlined in Section 4 leads to an initial value problem for a system of ordinary differential equations,

 ˙U(t)=A(t)U(t)+G(t)(0≤t≤T),U(0)=U0. (17)

Here for is a given square real matrix and is a given real vector that depends on the boundary conditions. The entries of the solution vector represent approximations to the exact solution of the option valuation P(I)DE at the spatial grid points, ordered in a convenient way. The vector is given by direct evaluation of the option’s payoff function at these grid points.

The semidiscrete system (17) is stiff in general and, hence, implicit time discretization methods are natural candidates for its numerical solution. Let parameter be given. Let time step with integer and temporal grid points for integers . The -method  forms a well-known implicit time discretization method. It generates approximations to successively for by

 Un=Un−1+(1−θ)ΔtA(tn−1)Un−1+θΔtA(tn)Un+ΔtGn−1+θ, (18)

where denotes an approximation to at . This can also be written as

 (I−θΔtA(tn))Un=(I+(1−θ)ΔtA(tn−1))Un−1+ΔtGn−1+θ,

with the identity matrix of the same size as . For one obtains the first-order backward Euler method  and for the second-order Crank–Nicolson method  or  trapezoidal rule. For simplicity we consider in this chapter only constant time steps, but most of the presented time discretization methods can directly be extended to variable time steps.

When applying the Crank–Nicolson method, it is common practice in finance to first perform a few backward Euler steps to start the time stepping. This is often called Rannacher smoothing Rannacher84 (). It helps to damp high-frequency components in the numerical solution, due to the nonsmooth initial (payoff) function, which are usually not sufficiently damped by the Crank–Nicolson method itself.

Clearly, in order to compute the vector defined by (18), one has to solve a linear system of equations with the matrix . When the option valuation PDE is multidimensional, the size of this matrix is usually very large and it possesses a large bandwidth. For a PIDE, this matrix is dense. In these situations, the solution of the linear system can be computationally demanding when standard methods, like LU decomposition, are applied. Time discretization methods based on operator splitting can then form an attractive alternative. The key idea is to split the matrix into several parts, each of which is numerically handled more easily than the complete matrix itself.

### 5.2 Operator Splitting Methods Based on Direction

For multidimensional PDEs, splitting schemes of the Alternating Direction Implicit (ADI) type are often applied in financial practice. To illustrate the idea, the two-dimensional Heston PDE and three-dimensional HHW PDE, given in Section 2.2, are considered. For the Heston PDE the semidiscrete system (17) is autonomous; we split

 A=A0+A1+A2.

Next, for the HHW PDE,

 A(t)=A0+A1+A2+A3(t).

Here is chosen as the part that represents all mixed derivative terms. It is nonzero whenever (one of) the correlation factor(s) is nonzero. The parts , , and represent all spatial derivatives in the -, -, and -directions, respectively. The latter three matrices have, possibly up to permutation, all a fixed small bandwidth. The vector in the semidiscrete system is splitted in a similar way. For notational convenience, define functions by

 Fj(t,V)=AjV+Gj  (j=0,1,2)  and  F3(t,V)=A3(t)V+G3(t)

for , . Set with for Heston and for HHW. We discuss in this section four contemporary ADI-type splitting schemes:

Douglas (Do) scheme

 ⎧⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎪⎨⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎪⎩Y0=Un−1+ΔtF(tn−1,Un−1),Yj=Yj−1+θΔt(Fj(tn,Yj)−Fj(tn−1,Un−1))(j=1,2,…,k),Un=Yk. (19)

Craig–Sneyd (CS) scheme

 ⎧⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎪⎨⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎪⎩Y0=Un−1+ΔtF(tn−1,Un−1),Yj=Yj−1+θΔt(Fj(tn,Yj)−Fj(tn−1,Un−1))(j=1,2,…,k),˜Y0=Y0+12Δt(F0(tn,Yk)−F0(tn−1,Un−1)),˜Yj=˜Yj−1+θΔt(Fj(tn,˜Yj)−Fj(tn−1,Un−1))(j=1,2,…,k),Un=˜Yk. (20)

Modified Craig–Sneyd (MCS) scheme

 ⎧⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎪⎨⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎪⎩Y0=Un−1+ΔtF(tn−1,Un−1),Yj=Yj−1+θΔt(Fj(tn,Yj)−Fj(tn−1,Un−1))(j=1,2,…,k),ˆY0=Y0+θΔt(F0(tn,Yk)−F0(tn−1,Un−1)),˜Y0=ˆY0+(12−θ)Δt(F(tn,Yk)−F(tn−1,Un−1)),˜Yj=˜Yj−1+θΔt(Fj(tn,˜Yj)−Fj(tn−1,Un−1))(j=1,2,…,k),Un=˜Yk. (21)

Hundsdorfer–Verwer (HV) scheme

 ⎧⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎪⎨⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎪⎩Y0=Un−1+ΔtF(tn−1,Un−1),Yj=Yj−1+θΔt(Fj(tn,Yj)−Fj(tn−1,Un−1))(j=1,2,…,k),˜Y0=Y0+12Δt(F(tn,Yk)−F(tn−1,Un−1)),˜Yj=˜Yj−1+θΔt(Fj(tn,˜Yj)−Fj(tn,Yk))(j=1,2,…,k),Un=˜Yk. (22)

In the Do scheme (19), a forward Euler predictor step is followed by  implicit but unidirectional corrector steps that serve to stabilize the predictor step. The CS scheme (20), the MCS scheme (21), and the HV scheme (22) can be viewed as different extensions to the Do scheme. Indeed, their first two lines are identical to those of the Do scheme. They next all perform a second predictor step, followed by  unidirectional corrector steps. Observe that the CS and MCS schemes are equivalent if (and only if) .

Clearly, in all four ADI schemes the part, representing all mixed derivatives, is always treated in an explicit  fashion. In the original formulation of ADI schemes mixed derivative terms were not considered. It is a common and natural use in the literature to refer to the above, extended schemes also as ADI schemes. In the special case where , the CS scheme reduces to the Do scheme, but the MCS scheme (with ) and the HV scheme do not. Following the original ADI approach, the , , parts are treated in an implicit  fashion. In every step of each scheme, systems of linear equations need to be solved involving the matrices for as well as if . Since all these matrices have a fixed, small bandwidth, this can be done very efficiently by means of LU decomposition, cf. also Section 6.1. Because for the pertinent matrices are further independent of the step index , their LU decompositions can be computed once, beforehand, and then used in all time steps. Accordingly, for each ADI scheme, the number of floating point operations per time step is directly proportional to the number of spatial grid points, which is a highly favorable property.

By Taylor expansion one obtains (after some elaborate calculations) the classical order of consistency111That is, the order for fixed nonstiff ODE systems. of each ADI scheme. For any given , the order of the Do scheme is just one whenever is nonzero. This low order is due to the fact that the part is treated in a simple, forward Euler fashion. The CS scheme has order two provided . The MCS and HV schemes are of order two for any given . A virtue of ADI schemes, compared to other operator splitting schemes based on direction, is that the internal vectors , form consistent approximations to .

The Do scheme can be regarded as a generalization of the original ADI schemes for two-dimensional diffusion equations by Douglas & Rachford Douglas56 () and Peaceman & Rachford Peaceman55 () to the situation where mixed derivative terms are present. This generalization was first considered by McKee & Mitchell McKee70 () for diffusion equations and subsequently in McKee96 () for convection-diffusion equations.

The CS scheme was developed by Craig & Sneyd Craig88 () with the aim to obtain a stable second-order ADI scheme for diffusion equations with mixed derivative terms.

The MCS scheme was constructed by In ’t Hout & Welfert intHout09b () so as to arrive at more freedom in the choice of as compared to the second-order CS scheme.

The HV scheme was designed by Hundsdorfer Hundsdorfer02 () and Verwer et. al. Verwer99 () for the numerical solution of convection-diffusion-reaction equations arising in atmospheric chemistry, cf. also Hundsdorfer03 (). The application of the HV scheme to equations containing mixed derivative terms was first studied in intHout07 (); intHout09b ().

The Do and CS schemes are well-known for PDEs in finance, see e.g. Andersen10 (); Lipton01 (). More recently, the MCS and HV schemes have gained interest, see e.g. Clark11 (); Dang10 (); Egloff11 (); Haentjens13a (); Haentjens12 (); intHout10 (); Itkin11 ().

The formulation of the ADI schemes (19)–(22) is analogous to the type of formulation used in Hundsdorfer02 (). In the literature, ADI schemes are also sometimes referred to as Stabilizing Correction schemes, and are further closely related to Approximate Matrix Factorization methods and Implicit-Explicit (IMEX) Runge–Kutta methods, cf. e.g. Hundsdorfer03 ().

In intHout11 (); intHout13 (); intHout07 (); intHout09b () comprehensive stability results in the von Neumann sense have been derived for the four schemes (19)–(22) in the application to multidimensional convection-diffusion equations with mixed derivative terms. These results concern unconditional stability, that is, without any restriction on the time step . For each ADI scheme, lower bounds on guaranteeing unconditional stability have been obtained, depending in particular on the spatial dimension. Based on these theoretical stability results and the numerical experience in Haentjens13a (); Haentjens12 (); intHout10 () the following values are found to be useful for :

• Do scheme     with (if ) and (if )

• CS scheme     with

• MCS scheme with (if ) and (if )

• HV scheme    with .

Here , which is a measure for the relative size of the mixed derivative coefficients.

In addition to ADI schemes, there exists a variety of well-known alternative operator splitting schemes based on direction, called Locally One-Dimensional (LOD) methods, fractional step methods, or componentwise splitting schemes. These schemes originate in the 1960s in the work by Dyakonov, Marchuk, Samarskii, Yanenko, and others. Some of them are related to Strang splitting schemes, developed at the same time. For a general overview and analysis of such methods we refer to Hundsdorfer03 (); Marchuk90 (). Applications in financial mathematics of these schemes are considered in, for example, Ikonen07a (); Toivanen10a ().

### 5.3 Operator Splitting Methods Based on Operator Type

For the jump models considered in Section 2.3 the semidiscrete matrix can be written in the form

 A=D+J, (23)

where and correspond to the differential operator and integral operator, respectively. The matrix is sparse while in general is a dense matrix or has dense blocks. In view of the different nature of these two matrices it can be preferable to employ an operator splitting method based on them.

In Andersen00 (), Andersen and Andreasen describe a generalized -method

 (I−θDΔtD−θJΔtJ)Un=(I+(1−θD)ΔtD+(1−θJ)ΔtJ)Un−1 (24)

assuming here . The standard choice and corresponds to the IMEX Euler method: it treats the stiff differential part implicitly, using the backward Euler method, and the nonstiff integral part explicitly, using the forward Euler method. This choice yields first-order consistency. The benefit is that it is not necessary to solve dense linear systems involving the matrix . Instead, in each time step only one multiplication with is required. This approach has been considered and analysed in Cont05 ().

In Feng08 () an extrapolation approach is advocated based on the IMEX Euler method. Here approximations at a given fixed time are computed for a decreasing sequence of step sizes and then linearly combined so as to achieve a high order of accuracy.

In Andersen00 () second-order consistency is obtained through an alternating treatment of the and parts. They propose to take a substep with and followed by a substep with and . Here linear systems involving the dense matrix need to be solved, for which the authors employ FFT.

In dHalluin05 () the original -method is analyzed, where the linear system in each time step is solved by applying a fixed-point iteration on the jump part following an idea in Tavella00 ().

The following, second-order IMEX midpoint scheme has been considered in e.g. Feng08 (); Kwon11a (); Kwon11b (); Salmi12 (),

 (I−ΔtD)Un=(I+ΔtD)Un−2+2ΔtJUn−1+2ΔtGn−1. (25)

The scheme (25) can be viewed as obtained from the semidiscrete system (17) at by the approximations and . Two subsequent second-order IMEX methods are the IMEX–CNAB scheme

 (I−Δt2D)Un=(I+Δt2D)Un−1+Δt2J(3Un−1−Un−2)+ΔtGn−1/2 (26)

and the IMEX–BDF2 scheme

 (32I−ΔtD)Un=2Un−1−12Un−2+ΔtJ(2Un−1−Un−2)+ΔtGn. (27)

These schemes have recently been applied for option pricing in Salmi14 () and can be regarded as obtained by approximating the semidiscrete system (17) at and at , respectively.

The IMEX schemes (25), (26), and (27) were studied in a general framework, without application to option valuation, in Frank97 (). Here it was noted that such schemes can be considered as starting with an implicit method and then replacing the nonstiff part of the implicit term by an explicit formula using extrapolation based on previous time steps. An overview of IMEX methods is given in Hundsdorfer03 ().

In general, IMEX methods are only conditionally stable, that is, they are stable for a sufficiently small time step . For example, the IMEX midpoint scheme (25) and the IMEX–CNAB scheme (26) are stable whenever and the term in (10) is included in ; see Salmi14 (). Recall that denotes the jump activity.

The schemes discussed in this section are of the linear multistep type. For IMEX schemes of Runge–Kutta type applied to jump models we mention Briani07 ().

### 5.4 Operator Splitting Method for Linear Complementarity Problems

The fully discrete LCPs obtained by spatial and temporal discretization of (14) for American-style options are more difficult to solve than the corresponding systems of linear equations for the European-style counterparts. It is desirable to split these LCPs into simpler subproblems. Here we describe the operator splitting method considered in Ikonen04 (); Ikonen09 () which was motivated by splitting methods for incompressible flows Chorin68 (); Glowinski86 (). To this purpose, we reformulate LCPs with Lagrange multipliers.

The -method discretization (18) naturally gives rise to the following, fully discrete LCP

 {BUn−CUn−1−ΔtGn−1+θ≥0,Un≥U0,(BUn−CUn−1−ΔtGn−1+θ)T(Un−U0)=0, (28)

where , , and is assumed to be constant in time. By introducing a Lagrange multiplier vector , the LCP (28) takes the equivalent form

 {BUn−CUn−1−ΔtGn−1+θ=Δtλn≥0,Un≥U0,(λn)T(Un−U0)=0. (29)

The basic idea of the operator splitting method proposed in Ikonen04 () is to decouple in (29) the first line from the second line. This is accomplished by approximating the Lagrange multiplier in the first line by the previous Lagrange multiplier . This leads to the system of linear equations

 B˜Un=CUn−1+ΔtGn−1+θ+Δtλn−1. (30)

After solving this system, the intermediate solution vector and the Lagrange multiplier are updated to satisfy the (spatially decoupled) equation and complementarity conditions

 {Un−˜Un=Δt(λn−λn−1),λn≥0,Un≥U0,(λn)T(Un−U0)=0. (31)

Thus, this operator splitting method for American options leads to the solution of linear systems (30), which are essentially the same as for European options, and a simple update step (31). This update can be performed very fast, at each spatial grid point independently, with the formula

 (Un,i , λn,i)=⎧⎪⎨⎪⎩(˜Un,i−Δtλn−1,i , 0),if ˜Un,i−Δtλn−1,i>U0,i ,(U0,i , λn−1,i+1Δt(U0,i−˜Un,i)),otherwise% . (32)

The above operator splitting approach has been studied for more advanced time discretization schemes of both linear multistep and Runge–Kutta type in Ikonen04 (); Ikonen09 (). Moreover, it has recently been effectively combined with IMEX schemes in Salmi12 () for the case of jump models and with ADI schemes in Haentjens15 () for the case of the Heston model. For instance, the pertinent adaptations of the IMEX–CNAB scheme and the MCS scheme are

 (I−Δt2D)˜Un=(I+Δt2D)Un−1+Δt2J(3Un−1−Un−2)+ΔtGn−1/2+Δtλn−1,

and

 ⎧⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎪⎨⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎪⎩Y0=Un−1+ΔtF(tn−1,Un−1)+Δtλn−1,Yj=Yj−1+θΔt(Fj(tn,Yj)−Fj(tn−1,Un−1))(j=1,2,…,k),ˆY0=Y0+θΔt(F0(tn,Yk)−F0(tn−1,Un−1)),˜Y0=ˆY0+(12−θ)Δt(F(tn,Yk)−F(tn−1,Un−1)),˜Yj=˜Yj−1+θΔt(Fj(tn,˜Yj)−Fj(tn−1,Un−1))(j=1,2,…,k),˜Un=˜Yk,

respectively, followed by the update (32). The other three ADI schemes from Section 5.2 are adapted analogously. Note that only a term has been added to the first line of the MCS scheme (21). Accordingly, like for the -method, the amount of computational work per time step is essentially the same as for the corresponding European-style option.

## 6 Solvers for Algebraic Systems

The implicit time discretizations described in Section 5 lead, in each time step, to systems of linear equations of the form

 BU=Ψ (33)

or LCPs of the form

 {BU≥Ψ,U≥Φ,(BU−Ψ)T(U−Φ)=0 (34)

with given matrix and given vectors , . For models without jumps, semidiscretization by finite difference, finite volume, and finite element methods yields sparse matrices . For one-dimensional models, the central FDs (15) and (16) lead to tridiagonal . For higher dimensional models they give rise to matrices with a large bandwidth whenever classical (non-splitted) time stepping schemes are applied. On the other hand, for the operator splitting methods based on direction (cf. Section 5.2) one also acquires tridiagonal matrices (possibly after renumbering the unknowns). Wider FD stencils lead to additional nonzero diagonals. Time discretization of jump models with an implicit treatment of jumps makes dense.

### 6.1 Direct Methods

The system of linear equations (33) can be solved by a direct method using LU decomposition. This method first forms a lower triangular matrix and an upper triangular matrix such that . After this the solution vector is obtained by solving first and then .

Let denote the dimension of the matrix . For tridiagonal , or more generally matrices with a fixed small bandwidth, the LU decomposition yields optimal computational cost in the sense that the number of floating point operations is of order . Hence, it is very efficient for one-dimensional models and for higher-dimensional models when operator splitting schemes based on direction are applied.

For two-dimensional models with classical time stepping schemes, a LU decomposition can be formed by order floating point operations if a nested dissection method can be used and then the computational cost of the solution is of order , see Davis06 (); George73 (). For higher-dimensional models with classical time stepping schemes, the computational cost is less favorable.

For a general matrix , solving the LCP (34) requires iterative methods. However, in the special case that is tridiagonal, the solution vector satisfies (),  () for certain and some additional assumptions hold, the Brennan–Schwartz algorithm Brennan77 () gives a direct method to solve the LCP; see also Achdou05 (); Ikonen07b (); Jaillet90 (). After inverting the numbering of the unknowns to be from right to left, represented by a permutation matrix , this algorithm is equivalent to applying the LU decomposition method to the corresponding linear system with matrix where the projection step is carried out directly after computing each component in the back substitution step with . More precisely the back substitution step reads after the renumbering of unknowns:

 {Um=max{Vm/Um,m,Φm},Ui=max{(Vi−Ui,i+1Ui+1)/Ui,i,Φi}(i=m−1,m−2,…,1). (35)

The Brennan–Schwartz algorithm is essentially as fast as the LU decomposition method for linear systems and, thus, it has optimal computational cost.

### 6.2 Iterative Methods

There are many iterative methods for solving systems of linear equations. The two most important method categories are the stationary iterative methods and the Krylov subspace methods. Well-known Krylov subspace methods for the, typically unsymmetric, system (33) are the generalized minimal residual (GMRES) method Saad86 () and the BiCGSTAB method VanderVorst92 (). In the following we discuss a stationary iterative method in some more detail which is familiar in finance applications. The successive over-relaxation (SOR) method reads

 U(k+1)i=U(k)i+ωBi,i(Ψi−i−1∑j=1Bi,jU(k+1)j−m∑j=iBi,jU(k)j) (36)

for , where is a relaxation parameter. This method reduces to the Gauss–Seidel method in the case . The convergence rate of the iteration (36) can be improved significantly by an optimal choice of . Still the number of iterations to reach a given accuracy typically grows with , that is, when the spatial grid is refined the convergence slows down.

The SOR iteration can be generalized for LCPs by performing a projection after each update Cryer71 (); see also Glowinski84 (). This method is called the projected SOR (PSOR) method and it reads

 U(k+1)i=max{U(k)i+ωBi,i(Ψi−i−1∑j=1Bi,jU(k+1)j−m∑j=iBi,jU(k)j),Φi} (37)

. As can be expected, the PSOR method suffers from the same drawback as the SOR method mentioned above.

### 6.3 Multigrid Methods

The aim of multigrid methods for solving linear systems (33) is to render the number of iterations essentially independent of the problem size . The stationary iterative methods typically reduce high frequency errors quickly, while low frequency errors are reduced much more slowly. The idea of multigrid methods is to compute efficiently corrections to these slowly varying errors on coarser spatial grids. The multigrid methods can be divided into geometrical and algebraic methods. With the geometrical methods discretizations are explicitly constructed on a sequence of grids and transfer operators between these grids are explicitly defined. Algebraic multigrid (AMG) methods Ruge87 (); Stueben01 () build the coarse problems and the transfer operators automatically using the properties of the matrix . The details of these methods are beyond the scope of this chapter and we refer to e.g. Trottenberg01 () for details and extensive literature on this.

Several versions of multigrid methods also exist for LCPs. Brandt and Cryer introduced in Brandt83 () a projected full approximation scheme (PFAS) multigrid method for LCPs. American options under stochastic volatility were priced using the PFAS method in Clarke99 (); Oosterlee03 (). A projected multigrid (PMG) method for LCPs introduced in Reisinger04 () resembles more closely a classical multigrid method for linear problems. This method has been used to price American options in Ikonen08 (); Reisinger04 (). Recently, an AMG method was generalized for LCPs in Toivanen12 (). The resulting method is called the projected algebraic multigrid (PAMG) method and resembles the PMG method in the treatment of the complementarity conditions.

## 7 Numerical Illustrations

In the following we price European and American put options under a hierarchy of models: Black–Scholes, Merton, Heston, and Bates. The interest rate, the maturity time, and the strike price are always taken as

 r=0.03,  T=0.5,  and  K=100.

For the purpose of illustration, Fig. 1 and Fig. 2 show fair values of European and American options, respectively, under the four considered models with the model parameters described in the following sections. Figure 1: The fair values of European put options for the asset prices 75≤s≤125 and the volatility σ=0.2 (the variance v=0.04) under the four considered models. Figure 2: The fair values of American put options for the asset prices 75≤s≤125 and the volatility σ=0.2 (the variance v=0.04) under the four considered models.

### 7.1 Black–Scholes model

In the case of the Black–Scholes model, we price American put options. The volatility in the model (1) is taken as

 σ=0.2

and the following boundary conditions are employed:

 u(0,t)= K for   0

The Neumann boundary condition (39) introduces a modeling error as it is not exactly fulfilled by the actual option price function. If is taken sufficiently large, however, this error will be small in the region of interest.

For the spatial discretization of the Black–Scholes PDE (2), we apply FD formulas on nonuniform grids such that a large fraction of the grid points lie in the region of interest, that is, in the neighborhood of .

For the construction of the spatial grid we adopt Haentjens12 (). Let integer , constant , and be given. Let equidistant points be given with distance and

 ξmin =sinh−1(−Sleftc), ξint =Sright−Sleftc, ξmax =ξint+sinh−1(Smax−Srightc).

Then we define a nonuniform grid by the transformation

 si=φ(ξi)(0≤i≤m1), (40)

where

 φ(ξ)=⎧⎪⎨⎪⎩Sleft+c⋅sinh(ξ) (ξmin≤ξ≤0),Sleft+c⋅ξ (0<ξ<ξint),Sright+c⋅sinh(ξ−ξint) (ξint≤ξ≤