Calibration of self-decomposable Lévy models

# Calibration of self-decomposable Lévy models

Mathias Trabs (trabs@math.hu-berlin.de), Institut für Mathematik, Humboldt-Universität zu Berlin, Unter den Linden 6, 10099 Berlin, Germany.
Dates: November 2011; July 2012.
###### Abstract

We study the nonparametric calibration of exponential Lévy models with infinite jump activity. In particular, our analysis applies to self-decomposable processes whose jump density can be characterized by the $k$-function, which is typically nonsmooth at zero. On the one hand, we consider the estimation of the drift, of the activity measure $\alpha := k(0+) + k(0-)$ and of analogous parameters for the derivatives of the $k$-function; on the other hand, we estimate the $k$-function nonparametrically. Minimax convergence rates are derived. Since the rates depend on $\alpha$, we construct estimators adapting to this unknown parameter. Our estimation method is based on spectral representations of the observed option prices and on a regularization by cutting off high frequencies. Finally, the procedure is applied to simulations and real data.


Volume 20, issue 1 (2014), pages 109–140. DOI: 10.3150/12-BEJ478


Keywords: adaptation; European option; infinite activity jump process; minimax rates; nonlinear inverse problem; self-decomposability

## 1 Introduction

Since Merton merton1976 () introduced his discontinuous asset price model, stock returns have frequently been described by exponentials of Lévy processes. A review of recent pricing and hedging results for these models is given by Tankov Tankov2011 (). The calibration of the underlying model, that is, in the case of Lévy models, the estimation of the characteristic triplet from historical asset prices, is mostly studied in parametric models only; see the survey paper of Eberlein eberlein2012 () and the references therein. Remarkable exceptions are the nonparametric penalized least squares method by Cont and Tankov contTankov2004 () and the spectral calibration procedure by Belomestny and Reiß reiss12006 (). Both articles concentrate on models of finite jump activity. Our goal is to extend their results to infinite intensity models. A class which has attracted much interest in financial modeling is given by self-decomposable Lévy processes; examples are the hyperbolic model (Eberlein, Keller and Prause eberlein1998 ()) or the variance gamma model (Madan and Seneta madan1990 (), Madan, Carr and Chang madan1998 ()). Moreover, self-decomposable distributions are discussed in the financial investigation using Sato processes (Carr et al. carr2007 (), Eberlein and Madan eberlein2009 ()). Our results can be applied in this context, too. The nonparametric calibration of Lévy models is not only relevant for stock prices; for instance, it can be used for the Libor market as well (see Belomestny and Schoenmakers belomestnySchoenmakers2011 ()). In the context of Ornstein–Uhlenbeck processes, the nonparametric inference of self-decomposable Lévy processes was considered by Jongbloed, van der Meulen and van der Vaart Jongbloed2005 ().

Owing to the infinite activity, the features of market prices can be reproduced even without a diffusion part (cf. Carr et al. carr2002 ()) and thus we study pure-jump Lévy processes. More precisely, we assume that the jump density satisfies

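 $$\nu(\mathrm{d}x) = \frac{k(x)}{|x|}\,\mathrm{d}x, \quad \text{where } k\colon \mathbb{R} \to \mathbb{R}_+ \text{ is of bounded variation.} \tag{K}$$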

When $k$ increases on $(-\infty,0)$ and decreases on $(0,\infty)$, it is called the $k$-function and the process is self-decomposable. Further examples which have property (K) are compound Poisson processes and limit distributions of branching processes as considered by Keller-Ressel and Mijatović KellerRessel2012 (). Using the bounded variation of $k$, we show that the estimation problem is only mildly ill-posed. While the Blumenthal–Getoor index, which was estimated by Belomestny belomestny2010 (), is zero in our model, the infinite activity can be described on a finer scale by the parameter

 $$\alpha := k(0+) + k(0-).$$

Since $k$ is typically nonsmooth at zero, we face two estimation problems: First, to give a proper description of $k$ at zero, we propose estimators for $\alpha$ and its analogs $\alpha_j$, with $j \geq 1$, for the derivatives of $k$ as well as for the drift $\gamma$, which can be estimated similarly. We prove convergence rates for their mean squared error which turn out to be optimal in the minimax sense up to a logarithmic factor. Second, we construct a nonparametric estimator of $k$ whose mean integrated squared error converges with nearly optimal rates. Owing to bid-ask spreads and other market frictions, we observe only noisy option prices. The definition of the estimators is based on the relation between these prices and the characteristic function of the driving process established by Carr and Madan carrMadan1999 () and on different spectral representations of the characteristic exponent. Smoothing is done by cutting off all frequencies higher than a certain value depending on a maximal permitted parameter $\bar\alpha$. The whole estimation procedure is computationally efficient and achieves good results in simulations and in real data examples. All estimators converge with a polynomial rate, where the maximal $\alpha$ determines the ill-posedness of the problem. Assuming sub-Gaussian error distributions, we provide an estimator with $\alpha$-adaptive rates. The main tool for this result is a concentration inequality for our estimator which might be of independent interest.

This work is organized as follows: In Section 2, we describe the setting of our estimation procedure and derive the necessary representations of the characteristic exponent. The estimators are described in Section 3, where we also determine the convergence rates. The construction of the $\alpha$-adaptive estimator of $\alpha$ is contained in Section 4. In view of simulations and real data, we discuss our theoretical results and the implementation of the procedure in Section 5. All proofs are given in Section 6.

## 2 The model

### 2.1 Self-decomposable Lévy processes

A real-valued random variable $X$ has a self-decomposable law if for any $b \in (0,1)$ there is a random variable $Z_b$, independent of $X$, such that $X \overset{d}{=} bX + Z_b$. Since each self-decomposable distribution is infinitely divisible (see Proposition 15.5 in sato1999 ()), we can define the corresponding self-decomposable Lévy process. Self-decomposable laws can be understood as the class of limit distributions of converging scaled sums of independent random variables (Theorem 15.3 in sato1999 ()). This characterization is of economic interest: if we understand the price of an asset as an aggregate of small independent influences and drop the specific scaling that leads to diffusion models, we automatically end up with a self-decomposable price process.

Sato sato1999 () shows that the jump measure of a self-decomposable distribution is always absolutely continuous with respect to the Lebesgue measure and that its density can be characterized through (K), where $k$ needs to be increasing on $(-\infty,0)$ and decreasing on $(0,\infty)$. Note that self-decomposability affects neither the volatility nor the drift of the Lévy process.

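Assuming zero volatility and property (K), the process has finite variation and the characteristic function of $X_T$ is given by the Lévy–Khintchine representation

 $$\varphi_T(u) := \mathbb{E}\big[e^{iuX_T}\big] = \exp\Big(T\Big(i\gamma u + \int_{-\infty}^{\infty} \big(e^{iux}-1\big)\frac{k(x)}{|x|}\,\mathrm{d}x\Big)\Big). \tag{2}$$

Motivated by a martingale argument, we will suppose the exponential moment condition $\mathbb{E}[e^{X_t}] = 1$ for all $t \geq 0$, which yields

 $$0 = \gamma + \int_{-\infty}^{\infty} \big(e^{x}-1\big)\frac{k(x)}{|x|}\,\mathrm{d}x. \tag{3}$$

In particular, we will impose $\mathbb{E}[e^{X_t}] < \infty$. In this case, $\varphi_T$ is defined on the strip $\{z \in \mathbb{C} : \operatorname{Im}(z) \in [-1,0]\}$.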
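As a concrete illustration of (2) and (3), the following minimal sketch evaluates the characteristic function for an assumed exponential-type $k$-function, $k(x) = c_+ e^{-\lambda_+ x}$ for $x > 0$ and $k(x) = c_- e^{-\lambda_- |x|}$ for $x < 0$. This family, the parameter values and all function names are illustrative assumptions and are not taken from the text; the closed form relies on the Frullani integral.

```python
import numpy as np

# Hypothetical k-function: k(x) = C_PLUS * exp(-LAM_PLUS * x) for x > 0 and
# k(x) = C_MINUS * exp(-LAM_MINUS * |x|) for x < 0, so alpha = C_PLUS + C_MINUS.
C_PLUS, C_MINUS = 1.0, 1.5
LAM_PLUS, LAM_MINUS = 3.0, 2.0  # LAM_PLUS > 1 keeps E[exp(X_t)] finite


def jump_exponent(z):
    """Closed form of int (e^{izx} - 1) k(x)/|x| dx for the k above."""
    z = np.asarray(z, dtype=complex)
    return (-C_PLUS * np.log(1 - 1j * z / LAM_PLUS)
            - C_MINUS * np.log(1 + 1j * z / LAM_MINUS))


# Drift implied by the martingale condition (3):
# 0 = gamma + int (e^x - 1) k(x)/|x| dx, i.e. gamma = -jump_exponent(-i).
GAMMA = -jump_exponent(-1j).real


def phi(z, T=1.0):
    """Characteristic function (2); z may lie in the strip Im z in [-1, 0]."""
    z = np.asarray(z, dtype=complex)
    return np.exp(T * (1j * GAMMA * z + jump_exponent(z)))


print("gamma =", GAMMA)
print("phi_T(-i) =", phi(-1j))  # should equal 1 by the martingale condition
```

Evaluating `phi` at $u - i$ for real $u$ gives the shifted characteristic function that appears in the pricing formula later in this section.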

Besides Lévy processes there is another class that is closely related to self-decomposability. Assuming self-similarity, that is, $(Y_{at})_{t \geq 0} \overset{d}{=} (a^H Y_t)_{t \geq 0}$ for all $a > 0$ and some exponent $H > 0$, instead of stationary increments, $(Y_t)$ is a Sato process. Sato sato1991 () showed that self-decomposable distributions can be characterized as the laws at unit time of these processes. From self-similarity and self-decomposability it follows for $T > 0$ that

 $$\varphi_{Y_T}(u) = \mathbb{E}\big[e^{iuY_T}\big] = \mathbb{E}\big[e^{iT^H u Y_1}\big] = \exp\Big(iT^H\gamma u + \int_{-\infty}^{\infty} \big(e^{iux}-1\big)\frac{k(T^{-H}x)}{|x|}\,\mathrm{d}x\Big).$$

Since our estimation procedure depends on the distributional structure of the underlying process only through equation (2), we can apply the estimators directly to Sato processes, with $T^H\gamma$ and $k(T^{-H}\,\cdot)$ playing the roles of $T\gamma$ and $Tk$ in (2). However, we concentrate on Lévy processes in the sequel.

For self-decomposable distributions the parameter $\alpha$ captures many of their properties, such as the smoothness of the densities of the marginal distributions (Theorem 28.4 in sato1999 ()) and the tail behavior of the characteristic function. This holds even for the more general class of Lévy processes that satisfy property (K). Recall that $k$ has bounded variation if and only if

 $$\|k\|_{TV} := \sup\Big\{\sum_{i=1}^{n} \big|k(x_i)-k(x_{i-1})\big| : n \in \mathbb{N},\ -\infty < x_0 < x_1 < \dots < x_n < \infty\Big\} < \infty.$$

In particular, bounded variation of $k$ implies that $\alpha$ is finite. Similarly to deconvolution problems, the stochastic error in our model is driven by the decay of $|\varphi_T(u-i)|$, and thus we prove the following lemma in the Appendix.

###### Lemma 2.1

Let $X_t$ have property (K) and zero volatility, and let the martingale condition (3) hold.

(i) If and then there exists a constant $C_\varphi > 0$ such that for all $u \in \mathbb{R}$ with $|u| \geq 1$ we obtain the bound

 $$|\varphi_T(u-i)| \geq C_\varphi |u|^{-T\alpha}.$$

(ii) Let then holds uniformly over all and all with and .

The value as defined in the lemma can be understood as the largest slope of $k$ near zero. If the process is self-decomposable then holds and the bounded variation norm equals . Otherwise, we can use and , assuming the derivative exists, is bounded on and integrable on . If either the finiteness of $\alpha$ or property (K) is violated, $\varphi_T$ can decay faster than any polynomial order; for example, consider self-decomposable processes with $\alpha = \infty$ (see sato1999 (), Lemma 28.5). Hence, the conditions of Lemma 2.1 are sharp.
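The polynomial decay in Lemma 2.1 can be checked numerically in a simple case. The sketch below assumes the same hypothetical exponential-type $k$-function as an illustration ($k(x) = c_\pm e^{-\lambda_\pm |x|}$, not a model from the text), for which $\log|\varphi_T(u-i)|$ has a closed form, and estimates the log-log slope of $|\varphi_T(u-i)|$ at large frequencies. The slope should be close to $-T\alpha$.

```python
import numpy as np

# Hypothetical parameters of k(x) = C_PLUS*exp(-LAM_PLUS*x) (x > 0),
# k(x) = C_MINUS*exp(-LAM_MINUS*|x|) (x < 0); alpha = k(0+) + k(0-).
T = 0.5
C_PLUS, C_MINUS = 1.0, 1.5
LAM_PLUS, LAM_MINUS = 3.0, 2.0
ALPHA = C_PLUS + C_MINUS


def log_abs_phi_shifted(u):
    """log|phi_T(u - i)| under the martingale condition (3) for the k above."""
    u = np.asarray(u, dtype=float)
    a, b = LAM_PLUS - 1.0, LAM_MINUS + 1.0
    re_psi = (-0.5 * C_PLUS * np.log1p((u / a) ** 2)
              - 0.5 * C_MINUS * np.log1p((u / b) ** 2))
    return T * re_psi


# Log-log slope between two large frequencies; compare with -T*alpha.
u = np.array([1e3, 1e4])
slope = np.diff(log_abs_phi_shifted(u)) / np.diff(np.log(u))
print("empirical slope:", slope[0], " -T*alpha:", -T * ALPHA)

# |phi_T(u - i)| * |u|^{T alpha} stays bounded away from zero for |u| >= 1.
grid = np.logspace(0, 6, 200)
ratio = np.exp(log_abs_phi_shifted(grid) + T * ALPHA * np.log(grid))
print("min of |phi_T(u-i)| |u|^{T alpha} on [1, 1e6]:", ratio.min())
```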

### 2.2 Asset prices and Vanilla options

Let $r$ be the riskless interest rate in the market and let $S_0$ denote the initial value of the asset. In an exponential Lévy model the price process is given by

 $$S_t = S_0 e^{rt + X_t},$$

where $X_t$ is a Lévy process described by its characteristic triplet. Throughout, we assume that $X_t$ has property (K) and no diffusion component. On the underlying probability space with pricing (or martingale) measure $\mathbb{P}$, the discounted process $(e^{-rt}S_t)_{t \geq 0}$ is a martingale with respect to its natural filtration. This is equivalent to $\mathbb{E}[e^{X_t}] = 1$ for all $t \geq 0$ and thus, the martingale condition (3) holds.

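At time $t = 0$ the risk-neutral price of a European call option with underlying $S$, time to maturity $T$ and strike price $K$ is given by $e^{-rT}\mathbb{E}[(S_T - K)_+]$, where $(A)_+ := \max(A, 0)$, and similarly $e^{-rT}\mathbb{E}[(K - S_T)_+]$ is the price of a European put. In terms of the negative log-forward moneyness $x := \log(K/S_0) - rT$ the prices can be expressed as

 $$C(x,T) = S_0\,\mathbb{E}\big[(e^{X_T}-e^{x})_+\big] \quad\text{and}\quad P(x,T) = S_0\,\mathbb{E}\big[(e^{x}-e^{X_T})_+\big].$$

Carr and Madan carrMadan1999 () introduced the option function $O(x) := S_0^{-1}C(x,T)$ for $x \geq 0$ and $O(x) := S_0^{-1}P(x,T)$ for $x < 0$, and set its Fourier transform $\mathcal{F}O(u) := \int_{-\infty}^{\infty} e^{iux}O(x)\,\mathrm{d}x$ in relation to the characteristic function through the pricing formula

 $$\mathcal{F}O(u) = \frac{1-\varphi_T(u-i)}{u(u-i)}, \quad u \in \mathbb{R}\setminus\{0\}. \tag{4}$$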

The properties of $O$ were studied further by Belomestny and Reiß reiss12006 (). In particular, they showed that the option function is contained in and decays exponentially under the following assumption.

###### Assumption 2.2

We assume that is finite, which is equivalent to the moment condition .

Our observations are given by

 $$O_j = O(x_j) + \delta_j\varepsilon_j, \quad j = 1,\dots,N, \tag{5}$$

where the noise $(\varepsilon_j)$ consists of independent, centered random variables with and . The noise levels $\delta_j$ are assumed to be positive and known. In practice, the uncertainty is due to market frictions such as bid-ask spreads.

### 2.3 Representation of the characteristic exponent

Using (2) and (4), the shifted characteristic exponent is given by

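 $$\begin{aligned} \psi(u) &:= \frac{1}{T}\log\big(1+iu(1+iu)\mathcal{F}O(u)\big) = \frac{1}{T}\log\big(\varphi_T(u-i)\big) && (6)\\ &= i\gamma u + \gamma + \int_{-\infty}^{\infty}\big(e^{i(u-i)x}-1\big)\frac{k(x)}{|x|}\,\mathrm{d}x && (7) \end{aligned}$$

for $u \in \mathbb{R}$. Note that the last line equals zero for $u = 0$ because of the martingale condition (3). Throughout, we choose a distinguished logarithm, that is, a version of the complex logarithm such that $\psi$ is continuous with $\psi(0) = 0$. Under the assumption that $\|(1 \vee e^x)k(x)\|_{L^1}$ is finite, where we denote $A \wedge B := \min(A,B)$ and $A \vee B := \max(A,B)$ for $A, B \in \mathbb{R}$, we can apply Fubini’s theorem to obtain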

 (8)

where the Fourier transform is well defined on . Typically, the $k$-function and its derivatives are not continuous at zero. Moreover, if $\alpha \neq 0$ the function $\operatorname{sgn}(x)k(x)$ has a jump at zero in every case. Therefore, the Fourier transform decreases very slowly. Let $k$ be smooth on $\mathbb{R}\setminus\{0\}$ and fulfill an integrability condition which will be important later:

###### Assumption 2.3

Assume $k \in C^s(\mathbb{R}\setminus\{0\})$ with all derivatives having a finite right- and left-hand limit at zero and .

To compensate for those discontinuities, we add a linear combination of the functions , for . Since for and all are smooth on , we can find , such that is contained in . This approach yields the following representation. The proof is given in the supplementary article trabs2011Supplement ().

###### Proposition 2.2

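Let $s \geq 2$. Under Assumption 2.3, there exist functions $D$ and $\rho$ such that $|u|^{s-1}\rho(u)$ is bounded in $u$ and it holds

 $$\psi(u) = D(\operatorname{sgn}(u)) + i\gamma u - \alpha_0\log(|u|) + \sum_{j=1}^{s-2}\frac{i^j (j-1)!\,\alpha_j}{u^j} + \rho(u), \quad u \neq 0. \tag{9}$$

The coefficients $\alpha_j$ are determined by the one-sided limits of the derivatives of $k$ at zero; in particular, $\alpha_0 = \alpha$ holds.

Representation (9) allows us to estimate $\gamma$ and $\alpha_j$. A plug-in approach yields estimators for . Since we only apply this representation when $\psi$ is multiplied with weight functions having roots of degree $s-1$ at zero, the poles that appear in (9) do no harm.

Proposition 2.2 covers the case $s \geq 2$. For $s = 1$ we conclude from (7), the martingale condition (3) and Assumption 2.3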

 (10)

Hence, $\psi$ is a sum of a constant from the integration, the linear drift and a remainder of order , which follows from the decay of the Fourier transform as . Corollary 8 in trabs2011Supplement () even shows that there exists no -consistent estimator of for . Therefore, we concentrate on the case $s \geq 2$ in the sequel.

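Equation (10) allows another useful observation. Defining the exponentially scaled $k$-function

 $$k_e(x) := \operatorname{sgn}(x)\,e^{x}k(x), \quad x \in \mathbb{R},$$

we obtain by differentiation

 $$\psi'(u) = \frac{1}{T}\,\frac{(i-2u)\mathcal{F}O(u) - (u+iu^2)\mathcal{F}(xO(x))(u)}{1+(iu-u^2)\mathcal{F}O(u)} = i\gamma + i\mathcal{F}k_e(u). \tag{11}$$

Using this relation, we can define an estimator of $k_e$.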
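The identity $\psi'(u) = i\gamma + i\mathcal{F}k_e(u)$ in (11) can be verified numerically for a simple example. The sketch below assumes a hypothetical exponential-type $k$-function ($k(x) = c_\pm e^{-\lambda_\pm|x|}$, chosen only because $\psi$ and $\mathcal{F}k_e$ are then available in closed form) and compares a central finite difference of $\psi$ with the right-hand side of (11).

```python
import numpy as np

# Hypothetical k-function parameters (illustration only).
C_PLUS, C_MINUS = 1.0, 1.5
LAM_PLUS, LAM_MINUS = 3.0, 2.0


def jump_exponent(z):
    """int (e^{izx} - 1) k(x)/|x| dx for k(x) = c_pm * exp(-lam_pm * |x|)."""
    z = np.asarray(z, dtype=complex)
    return (-C_PLUS * np.log(1 - 1j * z / LAM_PLUS)
            - C_MINUS * np.log(1 + 1j * z / LAM_MINUS))


GAMMA = -jump_exponent(-1j).real  # drift from the martingale condition (3)


def psi(u):
    """Shifted characteristic exponent (7): psi(u) = (1/T) log phi_T(u - i)."""
    z = np.asarray(u, dtype=complex) - 1j
    return 1j * GAMMA * z + jump_exponent(z)


def fourier_k_e(u):
    """F k_e(u) = int e^{iux} sgn(x) e^x k(x) dx in closed form."""
    u = np.asarray(u, dtype=complex)
    return (C_PLUS / (LAM_PLUS - 1.0 - 1j * u)
            - C_MINUS / (LAM_MINUS + 1.0 + 1j * u))


u = np.linspace(-30.0, 30.0, 601)
h = 1e-5
psi_prime_fd = (psi(u + h) - psi(u - h)) / (2.0 * h)   # finite difference
psi_prime_11 = 1j * GAMMA + 1j * fourier_k_e(u)        # right-hand side of (11)
print("max deviation:", np.max(np.abs(psi_prime_fd - psi_prime_11)))
```

Since $\mathcal{F}k_e = -i\psi' - \gamma$, an estimate of $\psi'$ together with an estimate of $\gamma$ determines $k_e$ by Fourier inversion.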

## 3 Estimation procedure

### 3.1 Definition of the estimators and weight functions

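Given the observations $O_1,\dots,O_N$, we fit a function $\tilde{O}$ to these data using linear B-splines

 $$b_j(x) := \frac{x-x_{j-1}}{x_j-x_{j-1}}\mathbf{1}_{[x_{j-1},x_j)}(x) + \frac{x_{j+1}-x}{x_{j+1}-x_j}\mathbf{1}_{[x_j,x_{j+1}]}(x), \quad j = 1,\dots,N,$$

and a function $\beta_0$ with $\beta_0'(0+) - \beta_0'(0-) = -1$ to take care of the jump of $O'$ at zero:

 $$\tilde{O}(x) = \beta_0(x) + \sum_{j=1}^{N} O_j b_j(x), \quad x \in \mathbb{R}.$$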
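The spline part $\sum_j O_j b_j(x)$ of $\tilde{O}$ is the piecewise linear interpolant of the observations. The following minimal sketch builds it from the hat functions $b_j$ and compares it with `numpy.interp`. The node positions, the observed values and the two padding nodes outside the observation range are made up for illustration, and the correction term $\beta_0$ is deliberately left out.

```python
import numpy as np


def hat(x, left, mid, right):
    """Linear B-spline b_j with nodes left < mid < right."""
    x = np.asarray(x, dtype=float)
    rising = (x - left) / (mid - left) * ((x >= left) & (x < mid))
    falling = (right - x) / (right - mid) * ((x >= mid) & (x <= right))
    return rising + falling


def spline_part(x, nodes, values):
    """Evaluate sum_j O_j b_j(x) for observations `values` at `nodes`."""
    nodes = np.asarray(nodes, dtype=float)
    # Hypothetical padding nodes x_0 and x_{N+1}, needed by b_1 and b_N.
    padded = np.concatenate(([2 * nodes[0] - nodes[1]], nodes,
                             [2 * nodes[-1] - nodes[-2]]))
    return sum(v * hat(x, padded[j], padded[j + 1], padded[j + 2])
               for j, v in enumerate(values))


# Made-up design points and (noisy) option function values.
nodes = np.array([-1.0, -0.4, 0.0, 0.3, 1.2])
values = np.array([0.02, 0.08, 0.15, 0.09, 0.01])

grid = np.linspace(nodes[0], nodes[-1], 501)
fitted = spline_part(grid, nodes, values)
print("max deviation from np.interp:",
      np.max(np.abs(fitted - np.interp(grid, nodes, values))))
```

Because each $b_j$ is piecewise linear, the Fourier transform of the spline part can be computed in closed form on every interval, which is what makes the procedure computationally cheap.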

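We choose with support where satisfies . Replacing $O$ with $\tilde{O}$ in the representations (6) and (11) of $\psi$ and $\psi'$, respectively, allows us to define their empirical versions through

 $$\begin{aligned} \tilde\psi(u) &:= \frac{1}{T}\log\Big(v_{\kappa(u)}\big(1+iu(1+iu)\mathcal{F}\tilde{O}(u)\big)\Big),\\ \tilde\psi'(u) &:= \frac{1}{T}\,\frac{(i-2u)\mathcal{F}\tilde{O}(u) - (u+iu^2)\mathcal{F}(x\tilde{O}(x))(u)}{v_{\kappa(u)}\big(1+iu(1+iu)\mathcal{F}\tilde{O}(u)\big)}, \quad u \in \mathbb{R}, \end{aligned}$$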

where $\kappa$ is a positive function and we apply a trimming function given by

to stabilize for large stochastic errors. A reasonable choice of $\kappa$ will be derived below. The function $\tilde\psi$ is well defined on the interval on the event

For , we set arbitrarily, for instance equal to zero. The more $\tilde{O}$ concentrates around the true function $O$, the greater is the probability of . Söhl soehl2010 () even shows that in the continuous-time Lévy model with finite jump activity the identity holds.
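In practice, the distinguished logarithm in $\psi$ and $\tilde\psi$ has to be evaluated along a frequency grid, where the principal branch of the complex logarithm jumps whenever the phase passes $\pm\pi$. A minimal sketch of one way to do this uses phase unwrapping. The grid, the value of $T$ and the exponential-type $k$-function used to generate test values are assumptions for illustration, and the trimming step is omitted.

```python
import numpy as np

# Hypothetical model used only to generate values of phi_T(u - i).
C_PLUS, C_MINUS, LAM_PLUS, LAM_MINUS = 1.0, 1.5, 3.0, 2.0
T = 10.0


def jump_exponent(z):
    z = np.asarray(z, dtype=complex)
    return (-C_PLUS * np.log(1 - 1j * z / LAM_PLUS)
            - C_MINUS * np.log(1 + 1j * z / LAM_MINUS))


GAMMA = -jump_exponent(-1j).real


def psi_exact(u):
    """Closed-form psi(u) = (1/T) log phi_T(u - i) for the model above."""
    z = np.asarray(u, dtype=complex) - 1j
    return 1j * GAMMA * z + jump_exponent(z)


def distinguished_log(values, zero_index):
    """Continuous logarithm along a grid, normalized at the grid point u = 0."""
    values = np.asarray(values, dtype=complex)
    phase = np.unwrap(np.angle(values))
    # Remove the multiple of 2*pi picked up before reaching u = 0.
    phase -= phase[zero_index] - np.angle(values[zero_index])
    return np.log(np.abs(values)) + 1j * phase


u = np.linspace(-40.0, 40.0, 8001)
zero_index = int(np.argmin(np.abs(u)))
phi_shifted = np.exp(T * psi_exact(u))  # equals 1 + iu(1+iu) FO(u) by (4)

psi_unwrapped = distinguished_log(phi_shifted, zero_index) / T
psi_principal = np.log(phi_shifted) / T  # principal branch, for comparison

print("error with unwrapping:", np.max(np.abs(psi_unwrapped - psi_exact(u))))
print("error with principal log:", np.max(np.abs(psi_principal - psi_exact(u))))
```

Unwrapping only works if the grid is fine enough that the phase changes by less than $\pi$ between neighboring frequencies, so the grid spacing should shrink as $T|\gamma|$ grows.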

In the spirit of Belomestny and Reiß reiss12006 (), we estimate the parameters $\gamma$ and $\alpha_j$, $j = 0,\dots,s-2$, as coefficients of the different powers of $u$ in equation (9). Using a spectral cut-off value $U > 0$, we define

 $$\hat\gamma := \int_{-U}^{U} \operatorname{Im}\big(\tilde\psi(u)\big)\,w_\gamma^U(u)\,\mathrm{d}u$$

and for

The weight functions $w_\gamma^U$ and $w_{\alpha_j}^U$ are chosen such that they filter the coefficients of interest. Owing to (11), the nonparametric object $k_e$ can be estimated by

applying a one-sided kernel function $W_k$ with bandwidth since we know that $k_e$ jumps only at zero. The conditions on the weights are summarized in the following:

###### Assumption 3.1

We assume:

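• $w_\gamma^U$ fulfills, for all odd $j \in \{1,\dots,s-2\}$,

 $$\int_{-U}^{U} u\,w_\gamma^U(u)\,\mathrm{d}u = 1, \quad \int_{-U}^{U} u^{-j}w_\gamma^U(u)\,\mathrm{d}u = 0 \quad\text{and}\quad \int_{0}^{U} w_\gamma^U(\pm u)\,\mathrm{d}u = 0.$$

• $w_{\alpha_0}^U$ satisfies, for all even $j \in \{1,\dots,s-2\}$,

 $$\int_{-U}^{U} \log(|u|)\,w_{\alpha_0}^U(u)\,\mathrm{d}u = -1, \quad \int_{-U}^{U} u^{-j}w_{\alpha_0}^U(u)\,\mathrm{d}u = 0 \quad\text{and}\quad \int_{0}^{U} w_{\alpha_0}^U(\pm u)\,\mathrm{d}u = 0.$$

• For $j = 1,\dots,s-2$ the weight functions $w_{\alpha_j}^U$ fulfill (where, for $a \in \mathbb{R}$, $\lfloor a \rfloor$ denotes the largest integer which is smaller than or equal to $a$)

 $$\int_{-U}^{U} u^{-j}w_{\alpha_j}^U(u)\,\mathrm{d}u = \frac{(-1)^{\lfloor j/2 \rfloor}}{(j-1)!}, \quad \int_{-U}^{U} u^{-l}w_{\alpha_j}^U(u)\,\mathrm{d}u = 0 \quad\text{and}\quad \int_{0}^{U} w_{\alpha_j}^U(\pm u)\,\mathrm{d}u = 0,$$

where $l \in \{1,\dots,s-2\}$, $l \neq j$, and $l$ is even for even $j$ and odd otherwise. For even $j$ we impose additionally

 $$\int_{-U}^{U} \log(|u|)\,w_{\alpha_j}^U(u)\,\mathrm{d}u = 0.$$

• $W_k$ is of Sobolev smoothness , that is, , has support and fulfills for

 $$\int_{\mathbb{R}} W_k(x)\,\mathrm{d}x = 1, \quad \int_{\mathbb{R}} x^{l}W_k(x)\,\mathrm{d}x = 0 \quad\text{and}\quad x^{2s-1}W_k(x) \in L^1(\mathbb{R}).$$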

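Furthermore, we assume continuity and boundedness of the functions $u^{-s+1}w_\gamma^U(u)$ and $u^{-s+1}w_{\alpha_j}^U(u)$ for $j = 0,\dots,s-2$. The integral conditions can be provided by rescaling: Let $w_\gamma^1$ satisfy Assumption 3.1 for $U = 1$. Since $\int_{-U}^{U} u\,U^{-2}w_\gamma^1(u/U)\,\mathrm{d}u = \int_{-1}^{1} v\,w_\gamma^1(v)\,\mathrm{d}v = 1$, we can choose $w_\gamma^U(u) := U^{-2}w_\gamma^1(u/U)$. Similarly, a rescaling is possible for $w_{\alpha_0}^U$. Since $\int_{-U}^{U} U^{-1}w_{\alpha_0}^1(u/U)\,\mathrm{d}u = 0$, the term involving $\log U$ vanishes and

 $$-1 = \int_{-1}^{1} \log(|u|)\,w_{\alpha_0}^1(u)\,\mathrm{d}u = \int_{-U}^{U} \log\Big(\frac{|u|}{U}\Big)U^{-1}w_{\alpha_0}^1\Big(\frac{u}{U}\Big)\,\mathrm{d}u = \int_{-U}^{U} \log(|u|)\,U^{-1}w_{\alpha_0}^1\Big(\frac{u}{U}\Big)\,\mathrm{d}u.$$

Therefore, we define $w_{\alpha_0}^U(u) := U^{-1}w_{\alpha_0}^1(u/U)$ and analogously $w_{\alpha_j}^U(u) := U^{j-1}w_{\alpha_j}^1(u/U)$. The continuity condition in Assumption 3.1 is set to take advantage of the decay of the remainder $\rho$. In combination with the rescaling it implies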

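 $$\big|w_\gamma^U(u)\big| \lesssim U^{-s-1}|u|^{s-1} \quad\text{and}\quad \big|w_{\alpha_j}^U(u)\big| \lesssim U^{-s+j}|u|^{s-1}, \quad j = 0,\dots,s-2. \tag{13}$$

Throughout, we write $A \lesssim B$ if there is a constant $C > 0$ independent of all parameters involved such that $A \leq CB$. In the sequel we assume that the weight functions satisfy Assumption 3.1 and the property (13).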
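To make the conditions on the weights concrete, here is a minimal sketch for $s = 2$ with hypothetical polynomial weights $w_\gamma^1(v) = -7.5v + 15v^3$ and $w_{\alpha_0}^1(v) = 4|v| - 8|v|^3$ on $[-1,1]$. These particular polynomials are one possible choice and are not taken from the text. The code checks the normalizations after rescaling and then applies the definition of $\hat\gamma$ to the exact $\psi$ of an assumed exponential-type $k$-function, so that only the deterministic cut-off error remains.

```python
import numpy as np

NODES, WEIGHTS = np.polynomial.legendre.leggauss(200)


def integrate(f, lo, hi):
    """Gauss-Legendre quadrature of f over [lo, hi]."""
    x = 0.5 * (hi - lo) * NODES + 0.5 * (hi + lo)
    return 0.5 * (hi - lo) * np.sum(WEIGHTS * f(x))


def w_gamma(u, U):
    """Hypothetical odd weight for s = 2: U^{-2} w^1(u/U)."""
    v = np.asarray(u, dtype=float) / U
    return np.where(np.abs(v) <= 1, -7.5 * v + 15.0 * v ** 3, 0.0) / U ** 2


def w_alpha0(u, U):
    """Hypothetical even weight for s = 2: U^{-1} w^1(u/U)."""
    v = np.abs(np.asarray(u, dtype=float)) / U
    return np.where(v <= 1, 4.0 * v - 8.0 * v ** 3, 0.0) / U


# Assumed example: k(x) = C_PLUS e^{-LAM_PLUS x} (x>0), C_MINUS e^{-LAM_MINUS|x|} (x<0).
C_PLUS, C_MINUS, LAM_PLUS, LAM_MINUS = 1.0, 1.5, 3.0, 2.0
GAMMA = C_PLUS * np.log(1 - 1 / LAM_PLUS) + C_MINUS * np.log(1 + 1 / LAM_MINUS)


def im_psi(u):
    """Imaginary part of the exact psi(u) for the example above."""
    return (GAMMA * u + C_PLUS * np.arctan(u / (LAM_PLUS - 1))
            - C_MINUS * np.arctan(u / (LAM_MINUS + 1)))


def gamma_hat(U):
    """int_{-U}^{U} Im(psi) w_gamma^U du; the integrand is even in u."""
    return 2.0 * integrate(lambda u: im_psi(u) * w_gamma(u, U), 0.0, U)


for U in (10.0, 50.0):
    first_moment = 2.0 * integrate(lambda u: u * w_gamma(u, U), 0.0, U)
    log_moment = 2.0 * integrate(lambda u: np.log(u) * w_alpha0(u, U), 0.0, U)
    print(U, first_moment, log_moment, gamma_hat(U) - GAMMA)
```

The printed error of `gamma_hat` shrinks as the cut-off grows, in line with the role of the decay of $\rho$ discussed above. With noisy data the cut-off cannot be taken arbitrarily large, which is the trade-off quantified in Section 3.2.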

We reduce the loss of $\hat{k}_e$ by truncating positive values on $(-\infty,0)$ and negative ones on $(0,\infty)$. In the self-decomposable framework there are additional shape restrictions of the $k$-function which the proposed estimator does not take into account. The monotonicity can be generated by a rearrangement of the function. To this end let , where we bounded the support with an arbitrary large constant . The rearranged estimator which is increasing on $(-\infty,0)$ and decreasing on $(0,\infty)$ is then given by

Chernozhukov, Fernández-Val and Galichon Chernozhukov2009 () show that the rearrangement weakly reduces the error for increasing target functions on compact subsets. This result carries over to our estimation problem.
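On an equidistant grid, the monotone rearrangement amounts to sorting the estimated values. The sketch below is a discretized illustration under that assumption, with a made-up decreasing target and synthetic noise on the positive half-line. It does not reproduce the exact formula of the rearranged estimator.

```python
import numpy as np


def rearrange_decreasing(values):
    """Decreasing rearrangement of grid values (used on the positive half-line)."""
    return np.sort(np.asarray(values, dtype=float))[::-1]


def rearrange_increasing(values):
    """Increasing rearrangement of grid values (used on the negative half-line)."""
    return np.sort(np.asarray(values, dtype=float))


# Made-up decreasing target on (0, C] and a noisy, non-monotone estimate.
rng = np.random.default_rng(1)
grid = np.linspace(0.05, 3.0, 60)
target = 2.0 * np.exp(-grid)
noisy = target + 0.2 * rng.standard_normal(grid.size)
rearranged = rearrange_decreasing(noisy)

step = grid[1] - grid[0]
print("squared L2 error before:", step * np.sum((noisy - target) ** 2))
print("squared L2 error after: ", step * np.sum((rearranged - target) ** 2))
```

For a monotone target on an equidistant grid, sorting can never increase the discrete $L^2$ error, which is the discrete counterpart of the weak error reduction cited above.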

### 3.2 Convergence rates

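To ensure a well-defined procedure, an exponential decay of $O$, the identity (10) and to obtain a lower bound of $|\varphi_T(u-i)|$, we consider the class $\mathcal{G}_0(R,\bar\alpha)$. Uniform convergence results for the parameters will be derived in the smoothness class $\mathcal{G}_s(R,\bar\alpha)$.

###### Definition

Let $s \in \mathbb{N}$ and $R, \bar\alpha > 0$. We define

(i) $\mathcal{G}_0(R,\bar\alpha)$ as the set of all pairs $P = (\gamma,k)$ where $k$ is of bounded variation and the corresponding Lévy process satisfies Assumption 2.2 with , the martingale condition (3) as well as

 $$\alpha \in [0,\bar\alpha] \quad\text{and}\quad \max\Big\{\sup_{x \in (0,1]}\frac{k(x)+k(-x)-\alpha}{x},\ \|k_e\|_{L^1},\ \|k\|_{TV}\Big\} \leq R,$$

(ii) $\mathcal{G}_s(R,\bar\alpha)$ as the set of all pairs $P = (\gamma,k) \in \mathcal{G}_0(R,\bar\alpha)$ satisfying additionally Assumption 2.3 with

 $$\begin{aligned} \big|k^{(l)}(0+)+k^{(l)}(0-)\big| &\leq R, \quad\text{for } l = 1,\dots,s-1,\\ \big\|(1 \vee e^{x})k^{(l)}(x)\big\|_{L^1} &\leq R, \quad\text{for } l = 0,\dots,s. \end{aligned}$$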

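In the class $\mathcal{G}_0(R,\bar\alpha)$ Lemma 2.1(ii) provides a common lower bound of $|\varphi_T(u-i)|$ for $|u| \geq 1$. Using $\|k_e\|_{L^1} \leq R$, we estimate roughly for $0 < |u| < 1$:

 $$|\varphi_T(u-i)| = \exp\Big(T\int_{-\infty}^{\infty}\frac{\cos(x)-1}{|x|}\,e^{x/|u|}k(x/|u|)\,\mathrm{d}x\Big) \geq e^{-TR}.$$

Hence, the choice

satisfies

 $$\tfrac{1}{3}\,|\varphi_T(u-i)| \geq \kappa(u), \quad u \in \mathbb{R}, \tag{15}$$

where the factor $1/3$ is used for technical reasons. As discussed above, we can restrict our investigation to the case $s \geq 2$. Since the Lévy process is only identifiable if $O$ is known on the whole real line, we consider asymptotics of a growing number of observations with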

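 $$\Delta := \max_{j=2,\dots,N}(x_j - x_{j-1}) \to 0 \quad\text{and}\quad A := \min(x_N, -x_1) \to \infty.$$

Taking into account the numerical interpolation error and the stochastic error, we analyze the risk of the estimators in terms of the abstract noise level

 $$\varepsilon := \Delta^{3/2} + \Delta^{1/2}\|\delta\|_{\ell^\infty}.$$
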
###### Theorem 3.1

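Let and assume and . We choose the cut-off value $U := \varepsilon^{-2/(2s+2T\bar\alpha+1)}$ to obtain the uniform convergence rates

 $$\begin{aligned} \sup_{P=(\gamma,k) \in \mathcal{G}_s(R,\bar\alpha)} \mathbb{E}_P\big[|\hat\gamma-\gamma|^2\big]^{1/2} &\lesssim \varepsilon^{2s/(2s+2T\bar\alpha+1)} \quad\text{and}\\ \sup_{P=(\gamma,k) \in \mathcal{G}_s(R,\bar\alpha)} \mathbb{E}_P\big[|\hat\alpha_j-\alpha_j|^2\big]^{1/2} &\lesssim \varepsilon^{2(s-1-j)/(2s+2T\bar\alpha+1)}, \quad j = 0,\dots,s-2. \end{aligned}$$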
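The quantities entering Theorem 3.1 are straightforward to compute from a given design. The following minimal sketch evaluates $\Delta$, $A$, the noise level $\varepsilon$ and the resulting rate exponents for a made-up strike grid. It assumes the cut-off $U = \varepsilon^{-2/(2s+2T\bar\alpha+1)}$, which is the choice consistent with the stated rates; the function name and the example values are hypothetical.

```python
import numpy as np


def design_summary(x, delta, s, T, alpha_bar):
    """Mesh size, range, noise level, cut-off and rates for a sorted design x."""
    x = np.asarray(x, dtype=float)
    delta = np.asarray(delta, dtype=float)
    mesh = np.max(np.diff(x))                      # Delta
    reach = min(x[-1], -x[0])                      # A
    eps = mesh ** 1.5 + np.sqrt(mesh) * np.max(np.abs(delta))
    denom = 2 * s + 2 * T * alpha_bar + 1
    cutoff = eps ** (-2.0 / denom)                 # assumed cut-off U
    rates = {"gamma": eps ** (2.0 * s / denom)}
    for j in range(s - 1):                         # j = 0, ..., s-2
        rates[f"alpha_{j}"] = eps ** (2.0 * (s - 1 - j) / denom)
    return {"Delta": mesh, "A": reach, "eps": eps, "U": cutoff, "rates": rates}


# Made-up example: 121 equidistant log-moneyness points, constant noise level.
x = np.linspace(-3.0, 3.0, 121)
summary = design_summary(x, delta=np.full(x.size, 0.01), s=3, T=0.25, alpha_bar=2.0)
for key, value in summary.items():
    print(key, value)
```

The rate for $\hat\gamma$ equals $U^{-s}$ and the rate for $\hat\alpha_j$ equals $U^{-(s-1-j)}$, which makes the loss for larger $j$ directly visible.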

As one may expect, the rates for $\hat\alpha_j$ become slower as $j$ gets closer to its maximal value $s-2$ because the profit from the smoothness of $k$ decreases. Note that the cut-off for all estimators is the same. In contrast to $\mathcal{G}_s(R,\bar\alpha)$ we assume Sobolev conditions on $k_e$ in the class $\mathcal{H}_s(R,\bar\alpha)$ in order to apply $L^2$-Fourier analysis.

###### Definition

Let and . We define $\mathcal{H}_s(R,\bar\alpha)$ as the set of all pairs $P = (\gamma,k)$ satisfying additionally , for corresponding Lévy process as well as

 $$|\gamma| \leq R \quad\text{and}\quad \big\|k_e^{(l)}\big\|_{L^2} \leq R \quad\text{for } l = 0,\dots,s.$$

In the next theorem the conditions on and are stronger than for the upper bounds of the parameters, which is due to the necessity to estimate also the derivative of . However, the estimation of $\gamma$ does not lead to a loss in the rate. As seen in (12), we need to estimate $\gamma$.

###### Theorem 3.2

Let and assume as well as . Using an estimator $\hat\gamma$ which satisfies and choosing the cut-off value $U := \varepsilon^{-2/(2s+2T\bar\alpha+5)}$, we obtain for the risk of $\hat{k}_e$ the uniform convergence rate

 $$\sup_{P=(\gamma,k) \in \mathcal{H}_s(R,\bar\alpha)} \mathbb{E}_P\big[\|\hat{k}_e-k_e\|_{L^2}^2\big]^{1/2} \lesssim \varepsilon^{2s/(2s+2T\bar\alpha+5)}.$$

###### Remark

The convergence rates in Theorems 3.1 and 3.2 are minimax optimal up to a logarithmic factor, which is shown in the supplementary article trabs2011Supplement ().

The convergence rate of our estimation procedure depends on the bound $\bar\alpha$ of the true but unknown $\alpha$. Therefore, we construct an $\alpha$-adaptive estimator. For simplicity we concentrate on the estimation of $\alpha$ itself, whereas the results can be easily extended to $\gamma$, $\alpha_j$ and $k_e$. In this section, we will require the following assumption.

###### Assumption 4

Let , and for some maximal . Furthermore, we suppose and .

These conditions only recall the setting in which the convergence rates of our parameter estimators were proven. Given a consistent preestimator $\hat\alpha_{pre}$ of $\alpha$, let $\tilde\alpha_0$ be the estimator using the data-driven cut-off value and the trimming parameter

respectively, with . If $\hat\alpha_{pre}$ is sufficiently concentrated around the true value, the adaptation does not lead to losses in the rate as the following proposition shows. Note that the condition is not restrictive since any estimator of $\alpha$ can be improved by using instead.

###### Proposition 4.1

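Under Assumption 4 let $\hat\alpha_{pre}$ be a consistent estimator which is independent of the data and fulfills for the inequality

 $$\mathbb{P}\big(|\hat\alpha_{pre}-\alpha| \geq |\log\varepsilon|^{-1}\big) \leq d\,\varepsilon^2 \tag{18}$$

with a constant $d$. Furthermore, we suppose almost surely. Then $\tilde\alpha_0$ satisfies the asymptotic risk bound

 $$\sup_{P \in \mathcal{G}_s(R,\alpha)} \mathbb{E}_{P,\hat\alpha_{pre}}\big[|\tilde\alpha_0-\alpha|^2\big]^{1/2} \lesssim \varepsilon^{2(s-1)/(2s+2T\alpha+1)},$$

where the expectation is taken with respect to the common distribution of the observations and the preestimator $\hat\alpha_{pre}$.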

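To use $\hat\alpha_0$ on an independent sample as preestimator, we establish a concentration result for the proposed procedure. We require $(\varepsilon_j)$ to be uniformly sub-Gaussian (see, e.g., van de Geer vandegeer2000 ()). That means there are constants $C_1, C_2 > 0$ such that the following concentration inequality holds for all $t > 0$, $N \in \mathbb{N}$ and $a_1,\dots,a_N \in \mathbb{R}$:

 $$\mathbb{P}\Big(\Big|\sum_{j=1}^{N} a_j\varepsilon_j\Big| \geq t\Big) \leq C_1\exp\Big(-\frac{C_2 t^2}{\sum_{j=1}^{N} a_j^2}\Big). \tag{19}$$
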
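For standard Gaussian noise, inequality (19) holds with $C_1 = 2$ and $C_2 = 1/2$, since $\sum_j a_j\varepsilon_j$ is then centered Gaussian with variance $\sum_j a_j^2$. The following Monte Carlo sketch compares empirical tail frequencies with this bound. The coefficient vector, the thresholds and the sample size are arbitrary illustrative choices.

```python
import numpy as np

rng = np.random.default_rng(0)

# Arbitrary coefficients a_1, ..., a_N and standard Gaussian noise.
a = np.array([0.5, -1.0, 2.0, 0.25, 1.5])
eps = rng.standard_normal((200_000, a.size))
sums = eps @ a

# Constants of (19) for Gaussian noise: P(|Z| >= x) <= 2 exp(-x^2 / 2).
C1, C2 = 2.0, 0.5
thresholds = np.array([1.0, 2.0, 4.0, 6.0])
empirical = np.array([np.mean(np.abs(sums) >= t) for t in thresholds])
bound = C1 * np.exp(-C2 * thresholds ** 2 / np.sum(a ** 2))

for t, freq, b in zip(thresholds, empirical, bound):
    print(f"t = {t}: empirical tail {freq:.4f}, bound {b:.4f}")
```

Bounded noise, such as errors confined to a bid-ask spread, also satisfies (19) by Hoeffding's inequality, with constants depending on the bound.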
###### Proposition 4.2

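In addition to Assumption 4 let $(\varepsilon_j)$ be uniformly sub-Gaussian fulfilling (19). Then there is a constant $c > 0$ and for all $\kappa > 0$ there is an $\varepsilon_0 > 0$ such that for all $\varepsilon < \varepsilon_0$ the estimator $\hat\alpha_0$ satisfies

 $$\mathbb{P}\big(|\hat\alpha_0-\alpha| \geq \kappa\big) \leq \big((7N+1)C_1+2\big)\exp\Big(-c\,(\kappa^2 \wedge \kappa^{1/2})\,\varepsilon^{-(s-1)/(2s+2T\bar\alpha+1)}\Big). \tag{20}$$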

Concentration (20) is stronger than needed in Proposition 4.1. To apply the proposed estimation procedure, let and be two independent samples with noise levels and as well as sample sizes and , respectively. Using