Application of Operator Splitting Methods in Finance
Abstract
Financial derivatives pricing aims to find the fair value of a
financial contract on an underlying asset. Here we consider
option pricing in the partial differential equations framework.
The contemporary models lead to onedimensional or multidimensional parabolic
problems of the convectiondiffusion type and generalizations thereof.
An overview of various operator splitting methods is presented for the
efficient numerical solution of these problems.
Splitting schemes of the Alternating Direction Implicit (ADI) type are
discussed for multidimensional problems, e.g. given by stochastic volatility
(SV) models. For jump models ImplicitExplicit (IMEX) methods are considered
which efficiently treat the nonlocal jump operator.
For American options an easytoimplement operator splitting method is
described for the resulting linear complementarity problems.
Numerical experiments are presented to illustrate the actual stability and
convergence of the splitting schemes.
Here European and American put options are considered under four asset price
models: the classical Black–Scholes model, the Merton jumpdiffusion model,
the Heston SV model, and the Bates SV model with jumps.
∎
1 Introduction
In the contemporary international financial markets option products are widely traded. The average daily turnover in the global overthecounter derivatives markets is huge. For example, in the foreign exchange market this was approximately equal to 337 billion US dollars in April 2013 BIS13 (). In addition to standard call and put options, the socalled vanilla options, a broad range of exotic derivatives exists. One of the primary goals of financial mathematics is to determine the fair values of these derivatives as well as their sensitivities to underlying variables and parameters, which are crucial for hedging. To this purpose, advanced mathematical models are employed nowadays, yielding initialboundary value problems for timedependent partial differential equations (PDEs) and generalizations thereof, see e.g. Andersen10 (); Clark11 (); Lipton01 (); Shreve08 (); Tavella00 (); Wilmott98 (). These problems are in general multidimensional and of the convectiondiffusion kind. In some cases analytical formulas in semiclosed form for the exact solutions have been obtained in the literature. For the majority of option valuation problems, however, such formulas are not available. In view of this, one resorts to numerical methods for their approximate solution. To banks and other financial institutions, the efficient, stable, and robust numerical approximation of option values and their sensitivities is of paramount importance.
A wellknown and versatile approach to the numerical solution of timedependent convectiondiffusion equations is given by the method of lines. It consists of two general, consecutive steps. In the first step the PDE is discretized in the spatial variables, e.g. by finite difference, finite volume, or finite element methods. This leads to a socalled semidiscrete system of ordinary differential equations. In the second step the obtained semidiscrete system is numerically solved by applying a suitable, implicit timediscretization method. If the PDE is multidimensional, then the latter task can be computationally very intensive when standard application of classical implicit methods, such as the Crank–Nicolson scheme, is used. In the recent years, a variety of operator splitting methods have been developed that enable a highly efficient and stable numerical solution of semidiscretized multidimensional PDEs and generalizations thereof that arise in financial mathematics.
The aim of this chapter to give an overview of main classes of operator splitting methods with applications in finance. Here we have chosen to consider a variety of, increasingly sophisticated, models that are wellknown in the financial option valuation literature.
We deal in the following with two basic types of options, involving a given socalled strike price and a given maturity time , where today is always denoted by time . A European call (put) option is a contract between two parties, the holder and the writer, which gives the holder the right to buy from (sell to) the writer a prescribed asset for the price at the future date . An American call (put) option is the same, except that the holder can exercise at any time between today and the maturity date. An option is a right and not an obligation. The underlying asset can be a stock, a foreign currency, a commodity, etc. For a detailed introduction to financial options we refer to Hull11 (). Clearly, an option has value and a central question in financial mathematics is what its fair value is.
2 Models for Underlying Assets
2.1 Geometric Brownian Motion
The seminal papers by Black & Scholes Black73 () and Merton Merton73 () present a key equation for the fair values of European call and put options. In these papers the dynamics of the underlying asset price is modeled by the stochastic differential equation (SDE)
(1) 
Here denotes the Wiener process or standard Brownian motion, and , are given real parameters that are called the drift and the volatility, respectively. The volatility is a degree for the uncertainty of the return realized on the asset.
The SDE (1) describes a socalled geometric Brownian motion, which satisfies whenever . Under this asset price model and several additional assumptions, Black, Scholes, and Merton derived the famous partial differential equation (PDE)
(2) 
Here represents the fair value at time of a European vanilla option if . The quantity in (2) is the riskfree interest rate and is given. A main consequence of the Black, Scholes, and Merton analysis is that the drift actually does not appear in the option pricing PDE. This observation has led to the important riskneutral valuation theory. It is beyond the scope of the present chapter to discuss this theory in more detail, but see e.g. Hull11 (); Shreve08 ().
In formulating (2) we have chosen as the time till maturity. Thus the time runs in the opposite direction compared to (1). Accordingly, the payoff function , which defines the value of the option contract at maturity time , leads to an initial condition
(3) 
For a European vanilla option with given strike price there holds
(4) 
and at one has the Dirichlet boundary condition
(5) 
Equation (2) is called the Black–Scholes PDE or Black–Scholes–Merton PDE. It is fully deterministic and it can be viewed as a timedependent convectiondiffusionreaction equation. For European vanilla options, an analytical solution in semiclosed form was derived in Black73 (), constituting the wellknown Black–Scholes formula.
The Black–Scholes PDE is generic in the sense that it is valid for a wide range of Europeanstyle options. The initial and boundary conditions are determined by the specific option. As an example, for a European upandout call option with given barrier , the PDE (2) holds whenever . In this case, the initial condition is
and one has the Dirichlet boundary conditions
The homogeneous condition at corresponds to the fact that, by construction, an upandout call option becomes worthless whenever the underlying asset price moves above the barrier.
For many types of options, including (continuous) barrier options, semianalytical pricing formulas have been obtained in the literature in the Black–Scholes framework, see e.g. Hull11 (). At present it is wellknown, however, that each of the assumptions underlying this framework are violated to a smaller or larger extent in practice. In particular, the interest rate and the volatility are not constant, but vary in time. In view of this, more advanced asset pricing models have been developed and, as a consequence, more advanced option valuation PDEs are obtained. In this chapter we do not enter into the details of the mathematical connection between asset price SDEs and option valuation PDEs, but mention that a main tool is the celebrated Feynman–Kac theorem, see e.g. Shreve08 (). In the following we discuss typical, contemporary instances of more advanced option valuation PDEs.
2.2 Stochastic Volatility and Stochastic Interest Rate Models
Heston Heston93 () modeled the volatility itself by a SDE. The Heston stochastic volatility model is popular especially in the foreign exchange markets. The corresponding option valuation PDE is
(6) 
for , , and . Here represents the fair value of a Europeanstyle option if at time units before maturity the asset price equals and the variance equals . We note that by definition the variance is the square of the volatility. The positive parameters and are the meanreversion rate and longterm mean, respectively, of the variance, is the volatilityofvariance, and denotes the correlation between the two underlying Brownian motions. Equation (6) is called the Heston PDE. It can be viewed as a timedependent convectiondiffusionreaction equation on an unbounded, twodimensional spatial domain. If the correlation is nonzero, which almost always holds in practice, then the Heston PDE contains a mixed spatial derivative term.
For a European vanilla option under the Heston model, one has an initial condition as well as a boundary condition at that are the same as in the Black–Scholes case discussed above. In the Heston case there is also a boundary . Observe that as , then all secondorder derivative terms vanish in (6). It has been proved in Ekstrom10 () that for the fair option value function the Heston PDE is fulfilled if , which constitutes the (nonstandard) boundary condition at .
For the Heston asset pricing model (which we did not explicitly formulate) the socalled Feller condition is often considered in the literature. This condition determines whether or not the variance process can attain the value zero (given a strictly positive initial variance): it cannot attain zero if and only if Feller holds. The situation where the Feller condition is violated is wellknown to be challenging when numerically solving the Heston asset pricing model. For the Heston option valuation PDE (6), on the other hand, it turns out that this issue is not critical in the numerical solution.
A refinement of the Heston model is obtained by considering also a stochastic interest rate, see e.g. Grzelak11 (); Grzelak12 (); Haentjens13a (); Haentjens12 (). As an illustration we consider the case where the interest rate is described by the wellknown Hull–White model Hull11 (); Hull90 (). This leads to the following socalled Heston–Hull–White (HHW) PDE for the option value function :
(7) 
for , , , and . Here , , , , and are given positive real constants and denotes a given deterministic, positive function of time. Further, there are given correlations , , . Clearly, the HHW PDE is a timedependent convectiondiffusionreaction equation on an unbounded, threedimensional spatial domain with three mixed derivative terms. For a European vanilla option, initial and boundary conditions are the same as in the Heston case above. Note that if , then all secondorder derivative terms, apart from the term, vanish in (2.2).
The Heston and HHW models are two of many instances of asset pricing models that lead to multidimensional option valuation PDEs. Multidimensional PDEs are also obtained when considering other types of options, e.g. options on a basket of assets. Then, in the Black–Scholes framework, the dimension of the PDE is equal to the number of assets. In general, analytical solutions in (semi)closed form to these PDEs are not available.
2.3 Jump Models
Sometimes the value of the underlying asset changes so rapidly that this would have very tiny probability under the above Brownian motion based models. For example, the stock price during a market crash or after a major news event can move very fast. Already in 1976, Merton proposed in Merton76 () to add a jump component in the model of the underlying asset price. In his model, the jumps are lognormally distributed and their arrival times follow a Poisson process. After a jump the value of the asset is obtained by multiplying the value before the jump by a random variable with the probability density function (PDF)
(8) 
for , where is the mean of the normal distribution and is its standard deviation. Kou proposed in Kou02 () a logdoubleexponential distribution defined by the PDF
(9) 
where , and are positive constants such that . These models have finite jump activity which is denoted by here. There are also many popular infinite jump activity models like the CGMY model Carr02 (). In the following we shall consider only finite activity models.
The value of a European option satisfies the partial integrodifferential equation (PIDE)
(10) 
for and , where is the mean jump size given by
(11) 
For the Merton and Kou models the mean jumps are and , respectively.
Bates proposed to combine the Heston stochastic volatility model and the Merton jump model in Bates96 (). Under this model the value of a European option satisfies the PIDE
(12) 
for , , and , where the PDF is given by (8). For an extensive discussion on jump models in finance see e.g. Cont04 ().
3 Linear Complementarity Problem for American Options
Unlike Europeanstyle options, Americanstyle options can be exercised at any time up to the maturity date. Hence, the fair value of an American option is always greater than or equal to the instantaneous payoff,
(13) 
Due to this early exercise constraint, the P(I)DE does not hold everywhere anymore. Instead, a linear complementarity problem (LCP) or partial (integro)differential complementarity problem is obtained in general for the fair value of an American option:
(14) 
where stands for the pertinent spatial differential operator. For example, for the Black–Scholes model,
The above inequalities and equation hold pointwise. The equation in (14) is the complementarity condition. It states that at each point one of the two inequalities has to be an equality. The paper Huang98 () discusses the LCP formulation for Americanstyle options under various asset price models and studies the structure and properties of the obtained fully discrete LCPs.
We note that the penalty approach is a popular alternative for LCPs. Here a penalty term is added to the P(I)DE for a European option with the aim to enforce the early exercise constraint (13). The resulting problems are nonlinear and their efficient numerical solution is considered in Forsyth02 (), for example. For several other alternative formulations and approximations for LCPs, we refer to Toivanen10b ().
4 Spatial Discretization
In this chapter we employ finite difference (FD) discretizations for the spatial derivatives. An alternative approach would be to use finite element discretizations; see e.g. Achdou05 (); Seydel12 (). It is common practice to first truncate the infinite domain to with a sufficiently large, real . Typically one wishes to be such that the error caused by this truncation is a small fraction of the error due to the discretization of the differential (and integral) operators. Similarly, with multidimensional models including the variance or the interest rate , their corresponding infinite domains are truncated to sufficiently large bounded domains. The truncation requires additional boundary conditions to be specified. For an actual choice of these conditions for the models considered in Sections 2, 3 we refer to Section 7.
Let the grid in the direction be defined by the grid points . The corresponding grid sizes are denoted by , . For multidimensional models, we use tensor product grids. For example, in the case of a stochastic volatility model, if a grid for the variance is given by , then spatial grid points are defined by with and . In financial applications nonuniform grids are often preferable over uniform grids. The use of suitable nonuniform grids will be illustrated in Section 7.
For discretizing the first derivative and the second derivative at , we employ in this chapter the wellknown central FD schemes
(15) 
and
(16) 
With multidimensional models the analogous schemes are used for the other spatial directions, thus e.g. for and at . For the mixed derivative at we consider the 9point stencil obtained by successively applying the central FD schemes for the first derivative in the  and directions. With sufficiently smooth varying grid sizes, the above central FDs give secondorder accurate approximations for the derivatives.
We mention that in financial applications other FD schemes are employed as well, such as upwind discretization for first derivative terms or alternative discretizations for mixed derivative terms.
With the jump models the integral term needs to be discretized at grid points . First the integral is divided into two parts
which correspond to the values of in the computational domain and outside of it, respectively. The second part can be estimated using knowledge about in the far field . For example, for put options is usually assumed to be close to zero for and, thus, the second integral is approximated by zero in this case. The PDFs are smooth functions apart from the potential jump at in the Kou model. Due to the smoothness of the integrand the trapezoidal rule leads to secondorder accuracy with respect to the grid size. This gives the approximation
For example, the papers Salmi11 () and Toivanen08 () describe more accurate quadrature rules for the Merton and Kou jumps models, respectively. The discretization of the integral term leads to a dense matrix. The integral can be transformed into a convolution integral and due to this FFT can be used to compute it more efficiently; see Almendral05 (); Andersen00 (); dHalluin05 (); Tavella00 (), for example. In the case of the Kou model, efficient recursion formulas can be used Carr07 (); Toivanen08 ().
5 Time Discretization
5.1 The method
For any P(I)DE from Section 2, the spatial discretization outlined in Section 4 leads to an initial value problem for a system of ordinary differential equations,
(17) 
Here for is a given square real matrix and is a given real vector that depends on the boundary conditions. The entries of the solution vector represent approximations to the exact solution of the option valuation P(I)DE at the spatial grid points, ordered in a convenient way. The vector is given by direct evaluation of the option’s payoff function at these grid points.
The semidiscrete system (17) is stiff in general and, hence, implicit time discretization methods are natural candidates for its numerical solution. Let parameter be given. Let time step with integer and temporal grid points for integers . The method forms a wellknown implicit time discretization method. It generates approximations to successively for by
(18) 
where denotes an approximation to at . This can also be written as
with the identity matrix of the same size as . For one obtains the firstorder backward Euler method and for the secondorder Crank–Nicolson method or trapezoidal rule. For simplicity we consider in this chapter only constant time steps, but most of the presented time discretization methods can directly be extended to variable time steps.
When applying the Crank–Nicolson method, it is common practice in finance to first perform a few backward Euler steps to start the time stepping. This is often called Rannacher smoothing Rannacher84 (). It helps to damp highfrequency components in the numerical solution, due to the nonsmooth initial (payoff) function, which are usually not sufficiently damped by the Crank–Nicolson method itself.
Clearly, in order to compute the vector defined by (18), one has to solve a linear system of equations with the matrix . When the option valuation PDE is multidimensional, the size of this matrix is usually very large and it possesses a large bandwidth. For a PIDE, this matrix is dense. In these situations, the solution of the linear system can be computationally demanding when standard methods, like LU decomposition, are applied. Time discretization methods based on operator splitting can then form an attractive alternative. The key idea is to split the matrix into several parts, each of which is numerically handled more easily than the complete matrix itself.
5.2 Operator Splitting Methods Based on Direction
For multidimensional PDEs, splitting schemes of the Alternating Direction Implicit (ADI) type are often applied in financial practice. To illustrate the idea, the twodimensional Heston PDE and threedimensional HHW PDE, given in Section 2.2, are considered. For the Heston PDE the semidiscrete system (17) is autonomous; we split
Next, for the HHW PDE,
Here is chosen as the part that represents all mixed derivative terms. It is nonzero whenever (one of) the correlation factor(s) is nonzero. The parts , , and represent all spatial derivatives in the , , and directions, respectively. The latter three matrices have, possibly up to permutation, all a fixed small bandwidth. The vector in the semidiscrete system is splitted in a similar way. For notational convenience, define functions by
for , . Set with for Heston and for HHW. We discuss in this section four contemporary ADItype splitting schemes:
Douglas (Do) scheme
(19) 
Craig–Sneyd (CS) scheme
(20) 
Modified Craig–Sneyd (MCS) scheme
(21) 
Hundsdorfer–Verwer (HV) scheme
(22) 
In the Do scheme (19), a forward Euler predictor step is followed by implicit but unidirectional corrector steps that serve to stabilize the predictor step. The CS scheme (20), the MCS scheme (21), and the HV scheme (22) can be viewed as different extensions to the Do scheme. Indeed, their first two lines are identical to those of the Do scheme. They next all perform a second predictor step, followed by unidirectional corrector steps. Observe that the CS and MCS schemes are equivalent if (and only if) .
Clearly, in all four ADI schemes the part, representing all mixed derivatives, is always treated in an explicit fashion. In the original formulation of ADI schemes mixed derivative terms were not considered. It is a common and natural use in the literature to refer to the above, extended schemes also as ADI schemes. In the special case where , the CS scheme reduces to the Do scheme, but the MCS scheme (with ) and the HV scheme do not. Following the original ADI approach, the , , parts are treated in an implicit fashion. In every step of each scheme, systems of linear equations need to be solved involving the matrices for as well as if . Since all these matrices have a fixed, small bandwidth, this can be done very efficiently by means of LU decomposition, cf. also Section 6.1. Because for the pertinent matrices are further independent of the step index , their LU decompositions can be computed once, beforehand, and then used in all time steps. Accordingly, for each ADI scheme, the number of floating point operations per time step is directly proportional to the number of spatial grid points, which is a highly favorable property.
By Taylor expansion one obtains (after some elaborate calculations) the classical order of consistency^{1}^{1}1That is, the order for fixed nonstiff ODE systems. of each ADI scheme. For any given , the order of the Do scheme is just one whenever is nonzero. This low order is due to the fact that the part is treated in a simple, forward Euler fashion. The CS scheme has order two provided . The MCS and HV schemes are of order two for any given . A virtue of ADI schemes, compared to other operator splitting schemes based on direction, is that the internal vectors , form consistent approximations to .
The Do scheme can be regarded as a generalization of the original ADI schemes for twodimensional diffusion equations by Douglas & Rachford Douglas56 () and Peaceman & Rachford Peaceman55 () to the situation where mixed derivative terms are present. This generalization was first considered by McKee & Mitchell McKee70 () for diffusion equations and subsequently in McKee96 () for convectiondiffusion equations.
The CS scheme was developed by Craig & Sneyd Craig88 () with the aim to obtain a stable secondorder ADI scheme for diffusion equations with mixed derivative terms.
The MCS scheme was constructed by In ’t Hout & Welfert intHout09b () so as to arrive at more freedom in the choice of as compared to the secondorder CS scheme.
The HV scheme was designed by Hundsdorfer Hundsdorfer02 () and Verwer et. al. Verwer99 () for the numerical solution of convectiondiffusionreaction equations arising in atmospheric chemistry, cf. also Hundsdorfer03 (). The application of the HV scheme to equations containing mixed derivative terms was first studied in intHout07 (); intHout09b ().
The Do and CS schemes are wellknown for PDEs in finance, see e.g. Andersen10 (); Lipton01 (). More recently, the MCS and HV schemes have gained interest, see e.g. Clark11 (); Dang10 (); Egloff11 (); Haentjens13a (); Haentjens12 (); intHout10 (); Itkin11 ().
The formulation of the ADI schemes (19)–(22) is analogous to the type of formulation used in Hundsdorfer02 (). In the literature, ADI schemes are also sometimes referred to as Stabilizing Correction schemes, and are further closely related to Approximate Matrix Factorization methods and ImplicitExplicit (IMEX) Runge–Kutta methods, cf. e.g. Hundsdorfer03 ().
In intHout11 (); intHout13 (); intHout07 (); intHout09b () comprehensive stability results in the von Neumann sense have been derived for the four schemes (19)–(22) in the application to multidimensional convectiondiffusion equations with mixed derivative terms. These results concern unconditional stability, that is, without any restriction on the time step . For each ADI scheme, lower bounds on guaranteeing unconditional stability have been obtained, depending in particular on the spatial dimension. Based on these theoretical stability results and the numerical experience in Haentjens13a (); Haentjens12 (); intHout10 () the following values are found to be useful for :

Do scheme with (if ) and (if )

CS scheme with

MCS scheme with (if ) and (if )

HV scheme with .
Here , which is a measure for the relative size of the mixed derivative coefficients.
In addition to ADI schemes, there exists a variety of wellknown alternative operator splitting schemes based on direction, called Locally OneDimensional (LOD) methods, fractional step methods, or componentwise splitting schemes. These schemes originate in the 1960s in the work by Dyakonov, Marchuk, Samarskii, Yanenko, and others. Some of them are related to Strang splitting schemes, developed at the same time. For a general overview and analysis of such methods we refer to Hundsdorfer03 (); Marchuk90 (). Applications in financial mathematics of these schemes are considered in, for example, Ikonen07a (); Toivanen10a ().
5.3 Operator Splitting Methods Based on Operator Type
For the jump models considered in Section 2.3 the semidiscrete matrix can be written in the form
(23) 
where and correspond to the differential operator and integral operator, respectively. The matrix is sparse while in general is a dense matrix or has dense blocks. In view of the different nature of these two matrices it can be preferable to employ an operator splitting method based on them.
In Andersen00 (), Andersen and Andreasen describe a generalized method
(24) 
assuming here . The standard choice and corresponds to the IMEX Euler method: it treats the stiff differential part implicitly, using the backward Euler method, and the nonstiff integral part explicitly, using the forward Euler method. This choice yields firstorder consistency. The benefit is that it is not necessary to solve dense linear systems involving the matrix . Instead, in each time step only one multiplication with is required. This approach has been considered and analysed in Cont05 ().
In Feng08 () an extrapolation approach is advocated based on the IMEX Euler method. Here approximations at a given fixed time are computed for a decreasing sequence of step sizes and then linearly combined so as to achieve a high order of accuracy.
In Andersen00 () secondorder consistency is obtained through an alternating treatment of the and parts. They propose to take a substep with and followed by a substep with and . Here linear systems involving the dense matrix need to be solved, for which the authors employ FFT.
In dHalluin05 () the original method is analyzed, where the linear system in each time step is solved by applying a fixedpoint iteration on the jump part following an idea in Tavella00 ().
The following, secondorder IMEX midpoint scheme has been considered in e.g. Feng08 (); Kwon11a (); Kwon11b (); Salmi12 (),
(25) 
The scheme (25) can be viewed as obtained from the semidiscrete system (17) at by the approximations and . Two subsequent secondorder IMEX methods are the IMEX–CNAB scheme
(26) 
and the IMEX–BDF2 scheme
(27) 
These schemes have recently been applied for option pricing in Salmi14 () and can be regarded as obtained by approximating the semidiscrete system (17) at and at , respectively.
The IMEX schemes (25), (26), and (27) were studied in a general framework, without application to option valuation, in Frank97 (). Here it was noted that such schemes can be considered as starting with an implicit method and then replacing the nonstiff part of the implicit term by an explicit formula using extrapolation based on previous time steps. An overview of IMEX methods is given in Hundsdorfer03 ().
In general, IMEX methods are only conditionally stable, that is, they are stable for a sufficiently small time step . For example, the IMEX midpoint scheme (25) and the IMEX–CNAB scheme (26) are stable whenever and the term in (10) is included in ; see Salmi14 (). Recall that denotes the jump activity.
The schemes discussed in this section are of the linear multistep type. For IMEX schemes of Runge–Kutta type applied to jump models we mention Briani07 ().
5.4 Operator Splitting Method for Linear Complementarity Problems
The fully discrete LCPs obtained by spatial and temporal discretization of (14) for Americanstyle options are more difficult to solve than the corresponding systems of linear equations for the Europeanstyle counterparts. It is desirable to split these LCPs into simpler subproblems. Here we describe the operator splitting method considered in Ikonen04 (); Ikonen09 () which was motivated by splitting methods for incompressible flows Chorin68 (); Glowinski86 (). To this purpose, we reformulate LCPs with Lagrange multipliers.
The method discretization (18) naturally gives rise to the following, fully discrete LCP
(28) 
where , , and is assumed to be constant in time. By introducing a Lagrange multiplier vector , the LCP (28) takes the equivalent form
(29) 
The basic idea of the operator splitting method proposed in Ikonen04 () is to decouple in (29) the first line from the second line. This is accomplished by approximating the Lagrange multiplier in the first line by the previous Lagrange multiplier . This leads to the system of linear equations
(30) 
After solving this system, the intermediate solution vector and the Lagrange multiplier are updated to satisfy the (spatially decoupled) equation and complementarity conditions
(31) 
Thus, this operator splitting method for American options leads to the solution of linear systems (30), which are essentially the same as for European options, and a simple update step (31). This update can be performed very fast, at each spatial grid point independently, with the formula
(32) 
The above operator splitting approach has been studied for more advanced time discretization schemes of both linear multistep and Runge–Kutta type in Ikonen04 (); Ikonen09 (). Moreover, it has recently been effectively combined with IMEX schemes in Salmi12 () for the case of jump models and with ADI schemes in Haentjens15 () for the case of the Heston model. For instance, the pertinent adaptations of the IMEX–CNAB scheme and the MCS scheme are
and
respectively, followed by the update (32). The other three ADI schemes from Section 5.2 are adapted analogously. Note that only a term has been added to the first line of the MCS scheme (21). Accordingly, like for the method, the amount of computational work per time step is essentially the same as for the corresponding Europeanstyle option.
6 Solvers for Algebraic Systems
The implicit time discretizations described in Section 5 lead, in each time step, to systems of linear equations of the form
(33) 
or LCPs of the form
(34) 
with given matrix and given vectors , . For models without jumps, semidiscretization by finite difference, finite volume, and finite element methods yields sparse matrices . For onedimensional models, the central FDs (15) and (16) lead to tridiagonal . For higher dimensional models they give rise to matrices with a large bandwidth whenever classical (nonsplitted) time stepping schemes are applied. On the other hand, for the operator splitting methods based on direction (cf. Section 5.2) one also acquires tridiagonal matrices (possibly after renumbering the unknowns). Wider FD stencils lead to additional nonzero diagonals. Time discretization of jump models with an implicit treatment of jumps makes dense.
6.1 Direct Methods
The system of linear equations (33) can be solved by a direct method using LU decomposition. This method first forms a lower triangular matrix and an upper triangular matrix such that . After this the solution vector is obtained by solving first and then .
Let denote the dimension of the matrix . For tridiagonal , or more generally matrices with a fixed small bandwidth, the LU decomposition yields optimal computational cost in the sense that the number of floating point operations is of order . Hence, it is very efficient for onedimensional models and for higherdimensional models when operator splitting schemes based on direction are applied.
For twodimensional models with classical time stepping schemes, a LU decomposition can be formed by order floating point operations if a nested dissection method can be used and then the computational cost of the solution is of order , see Davis06 (); George73 (). For higherdimensional models with classical time stepping schemes, the computational cost is less favorable.
For a general matrix , solving the LCP (34) requires iterative methods. However, in the special case that is tridiagonal, the solution vector satisfies (), () for certain and some additional assumptions hold, the Brennan–Schwartz algorithm Brennan77 () gives a direct method to solve the LCP; see also Achdou05 (); Ikonen07b (); Jaillet90 (). After inverting the numbering of the unknowns to be from right to left, represented by a permutation matrix , this algorithm is equivalent to applying the LU decomposition method to the corresponding linear system with matrix where the projection step is carried out directly after computing each component in the back substitution step with . More precisely the back substitution step reads after the renumbering of unknowns:
(35) 
The Brennan–Schwartz algorithm is essentially as fast as the LU decomposition method for linear systems and, thus, it has optimal computational cost.
6.2 Iterative Methods
There are many iterative methods for solving systems of linear equations. The two most important method categories are the stationary iterative methods and the Krylov subspace methods. Wellknown Krylov subspace methods for the, typically unsymmetric, system (33) are the generalized minimal residual (GMRES) method Saad86 () and the BiCGSTAB method VanderVorst92 (). In the following we discuss a stationary iterative method in some more detail which is familiar in finance applications. The successive overrelaxation (SOR) method reads
(36) 
for , where is a relaxation parameter. This method reduces to the Gauss–Seidel method in the case . The convergence rate of the iteration (36) can be improved significantly by an optimal choice of . Still the number of iterations to reach a given accuracy typically grows with , that is, when the spatial grid is refined the convergence slows down.
The SOR iteration can be generalized for LCPs by performing a projection after each update Cryer71 (); see also Glowinski84 (). This method is called the projected SOR (PSOR) method and it reads
(37) 
. As can be expected, the PSOR method suffers from the same drawback as the SOR method mentioned above.
6.3 Multigrid Methods
The aim of multigrid methods for solving linear systems (33) is to render the number of iterations essentially independent of the problem size . The stationary iterative methods typically reduce high frequency errors quickly, while low frequency errors are reduced much more slowly. The idea of multigrid methods is to compute efficiently corrections to these slowly varying errors on coarser spatial grids. The multigrid methods can be divided into geometrical and algebraic methods. With the geometrical methods discretizations are explicitly constructed on a sequence of grids and transfer operators between these grids are explicitly defined. Algebraic multigrid (AMG) methods Ruge87 (); Stueben01 () build the coarse problems and the transfer operators automatically using the properties of the matrix . The details of these methods are beyond the scope of this chapter and we refer to e.g. Trottenberg01 () for details and extensive literature on this.
Several versions of multigrid methods also exist for LCPs. Brandt and Cryer introduced in Brandt83 () a projected full approximation scheme (PFAS) multigrid method for LCPs. American options under stochastic volatility were priced using the PFAS method in Clarke99 (); Oosterlee03 (). A projected multigrid (PMG) method for LCPs introduced in Reisinger04 () resembles more closely a classical multigrid method for linear problems. This method has been used to price American options in Ikonen08 (); Reisinger04 (). Recently, an AMG method was generalized for LCPs in Toivanen12 (). The resulting method is called the projected algebraic multigrid (PAMG) method and resembles the PMG method in the treatment of the complementarity conditions.
7 Numerical Illustrations
In the following we price European and American put options under a hierarchy of models: Black–Scholes, Merton, Heston, and Bates. The interest rate, the maturity time, and the strike price are always taken as
For the purpose of illustration, Fig. 1 and Fig. 2 show fair values of European and American options, respectively, under the four considered models with the model parameters described in the following sections.
7.1 Black–Scholes model
In the case of the Black–Scholes model, we price American put options. The volatility in the model (1) is taken as
and the following boundary conditions are employed:
(38)  
(39) 
The Neumann boundary condition (39) introduces a modeling error as it is not exactly fulfilled by the actual option price function. If is taken sufficiently large, however, this error will be small in the region of interest.
For the spatial discretization of the Black–Scholes PDE (2), we apply FD formulas on nonuniform grids such that a large fraction of the grid points lie in the region of interest, that is, in the neighborhood of .
For the construction of the spatial grid we adopt Haentjens12 (). Let integer , constant , and be given. Let equidistant points be given with distance and
Then we define a nonuniform grid by the transformation
(40) 
where