Nonparametric estimates of pricing functionals
We analyze the empirical performance of several non-parametric estimators of the pricing functional for European options, using historical put and call prices on the S&P500 during the year 2012. Two main families of estimators are considered, obtained by estimating the pricing functional directly, and by estimating the (Black-Scholes) implied volatility surface, respectively. In each case simple estimators based on linear interpolation are constructed, as well as more sophisticated ones based on smoothing kernels, à la Nadaraya-Watson. The results based on the analysis of the empirical pricing errors in an extensive out-of-sample study indicate that a simple approach based on the Black-Scholes formula coupled with linear interpolation of the volatility surface outperforms, both in accuracy and computational speed, all other methods.
Keywords: non-parametric estimation; option pricing; implied volatility.
JEL codes: G13, C14, C52.
The purpose of this work is to analyze the empirical performance of some non-parametric and semi-parametric estimators of pricing functional, with particular emphasis on the simplest contingent claims, i.e. European put and call options. The well-known idea underlying non-parametric estimation is the following: assume that the price of a contingent claim of a certain type can be written as , where is a function to be estimated, and are observable parameters. Given a sample
and , one can estimate by linear interpolation (for instance) of the function defined as
This procedure is well understood if belongs to the convex hull of (see §3.1 for more details). There exist of course many other procedures based on nonlinear interpolation, rather than linear, that may be preferable in certain situations, for instance to obtain estimators that are more regular than just continuous. The same problem has also been studied from a statistical perspective, leading to the large literature on non-parametric estimation of regression functions111The even larger literature on non-parametric density estimation treats a less general but very related problem. (see, e.g.,  and references therein). One of the most popular estimators in this literature is the so-called Nadaraya-Watson estimator (see [12, 18]), that is
where is a strictly positive continuous radial function with integral equal to and , or a slight generalization thereof.
Other functions of the parameters can clearly be estimated in the same manner. In particular, if the parameters are the usual inputs in the Black-Scholes pricing formula (except the volatility), one can estimate the implied volatility, which can then be fed back to the Black-Scholes formula to produce further estimates of the pricing functional.
One of the first works applying non-parametric regression à la Nadaraya-Watson in pricing problems is , where the authors, among other things, report impressive empirical results on the precision of semi-parametric estimators applied to pricing European call options on the S&P500. To date, a large literature deals with this approach to pricing – see, e.g.,  and references therein, as well as [6, 14] for related ideas.
Our purpose is to understand how different (but related) non-parametric approaches perform in terms of pricing precision, also in comparison with non-trivial fully parametric alternatives. A natural question, of clear relevance for practical purposes, is whether Nadaraya-Watson kernel estimation produces better estimates than elementary linear interpolation. In order to avoid rather heavy Markovianity and stationarity assumptions on the price process of the underlying index, we use observed option prices in a given day to estimate (unobserved!) option prices in the same day. It turns out that, from the empirical point of view, there does not seem to be any advantage connected with Nadaraya-Watson kernel estimators, which perform rather poorly in comparison with simple linear-interpolation estimators. The latter also consistently outperform a benchmark parametric estimator based on the (skewed) Variance-Gamma process (see ).
The focus of this work is somewhat different from that of , whose main aim is to estimate the so-called state-price density. However, such estimates are obtained by the second derivative with respect to the strike price of (estimates of) the pricing functional for European call options, and their main (practical) application seems to be pricing anyway. On the other hand, the main terms of comparison used in  are rather involved methods based on neural networks and on implied binomial trees, and empirical tests are done in a very different way: observed option prices over a period of nine months are aggregated to construct a Nadaraya-Watson estimator of the implied volatility surface, seen as a function of underlying’s futures price, strike, and time to maturity. This estimator is then used to forecast prices of European call options (with strike prices within a certain range) at five later dates, namely from one to twenty days in the future.
We have not been able to find in the literature neither empirical studies where observed option prices over different dates are not aggregated, nor a discussion about the practical aspects of non-parametric estimation of pricing functionals, for the mere purpose of pricing plain vanilla European options.222One can find, however, articles that use non-parametric methods to study related problems, without aggregating option prices across days. For instance,  estimate the risk-neutral density and  test the monotonicity of the pricing kernel. Our goal is to try and answer some basic fundamental questions in this regard, on the basis of an extensive out-of-sample analysis.
The rest of the paper is organized as follows: in Section 2, after recalling some facts about yield processes associated to dividend-paying assets that are not easily found in standard textbooks, we prove a put-call parity identity for European options written on a dividend-paying asset, under a conditional uncorrelation assumption involving the asset, dividend rate, and risk-free rate processes. These basic theoretical results are needed because the S&P500 pays dividends. The various estimators used in the empirical analysis to follow are introduced in detail in Section 3. A preliminary analysis on the data set of option prices on the S&P500 for the whole year 2012 is conducted in Section 4. Finally, Section 5 contains a detailed study of the empirical performance of the estimators introduced in Section 3.
2 Setting and preliminaries
Given a probability space endowed with a (right-continuous, complete) filtration , let the adapted processes and describe the price of two traded assets: the former is just the risk-free cash account, i.e.
where denotes the adapted, -a.s. positive, short-rate process; the latter is a risky asset with an associated adapted, -a.s. positive, dividend rate process , such that
The corresponding yield process is defined as
We assume that there exists a probability measure , equivalent to , such that, setting , the discounted yield process , defined by
is a square integrable -martingale. Setting , the integration by parts formula yields
and, by continuity of ,
is a (square integrable) -martingale , because is bounded and predictable.
As is well known, the existence of an equivalent probability measure such that the discounted yield process is a -martingale implies that the market defined by and is free of arbitrage. In general is not unique, unless the market is complete, and we assume that is just a fixed pricing measure. More precisely, we assume that we are given a pricing functional
where the random variable often goes under the name of stochastic discount factor.
We shall write, for compactness of notation,
We are going to use several times the following assumptions on , , and .
One has, for any ,
Here, for any two random variables , ,
Hypothesis (H) is equivalent to
for all .
2.1 Put-call parity with dividends and applications
To implement some estimators of the pricing functional, but also to carry out a preliminary analysis on the raw option prices, we shall need a version of put-call parity for European call and put options. This will also yield a natural estimator for (a functional of) the dividend rate process .
Assume that Hypothesis (H) holds. Then, for any ,
where and denote the prices at time of European call and put options, respectively, with maturity , strike price , and underlying price process .
Multiplying the identity
by and taking conditional expectation yields
Using Hypothesis (H) and the martingale property of the discounted yield process associated to and , one has
where , hence
Collecting terms and simplifying yields
Recall that the forward price at time of a contingent claim is defined as
(see, e.g., [10, §2.4]). Using arguments entirely analogous to those in the proof of the above Proposition, one obtains the following “spot-forward parity”.
If Hypothesis (H) holds, then
In particular, (1) could also be written as
The put-call parity identity of Proposition 2.1 implies a procedure to estimate : assuming that an estimator of is available and is denoted by the same symbol, and that prices of call and put options with the same maturity and strike price are observable, identity (1) yields
where and denote the price at time of European call and put options, respectively, with maturity and strike price . This estimate of is usually preferred in applications to option pricing to other proxies of , such as historical estimates.
3 Estimators of the pricing functional
From now on we shall assume that
the filtration is the (right-continuous, completed) filtration generated by ;
the process is Markovian, i.e., for any bounded random variable measurable with respect to , one has
These assumptions immediately imply that there exists a function such that
for all . In particular, without any further assumptions, the pricing functional will be time-dependent, hence it is wrong, in general, to use the price functional estimated at time to price options at time , or, similarly, to estimate the pricing functional aggregating data of different dates.333For a related discussion, also from an economic point of view, cf. . This procedure become meaningful if we further assume that the process is a time-homogeneous Markov process: in this case depends on and only through their difference . However, it is far from clear that a time-homogeneity assumption on the data-generating process would be appropriate, hence the empirical analysis of Section 5 is carried out with fixed. In §4.1 we also provide empirical data that does not seem to support the plausibility of a time-homogeneity assumption.
We now proceed to describe in detail the various estimators of the pricing functional for put options that will be tested in Section 5. Analogous considerations, not spelled out in detail, hold for call options and, in fact, for arbitrary European derivatives: some care has to be taken only if the payoff function is unbounded, in which case integrability conditions have to be assumed.
3.1 Linear interpolator
Let and subsets of and , and a function such that for all . Denoting the convex hull of by , the linear interpolator is a function obtained by linear interpolation of the function , where is the marked empirical measure
stands for the Dirac measure, and .
Linear interpolation here means that the Delaunay triangulation of is computed, and, for any , is obtained by interpolation of on the vertices of the simplex containing . We recall that a triangulation of the set of points is a partitioning of the convex hull into simplices whose vertices are the points . In particular, any two simplices of such a partition either do not intersect or share a common face. A Delaunay triangulation of satisfies the following additional property: if is a ball in such that all vertices of a simplex belong to its boundary, then the interior of does not contain any element of . Once a Delaunay triangulation of has been obtained, , with , is defined as follows: let the vertices of the unique simplex such that , and write
Then we set . For an extensive treatment of these topics, see, e.g., .
Let be fixed. Then, as seen above, we can write
Assuming that , adaptedness of and Markovianity imply
Denoting by the linear interpolator of , we define the normalized linear interpolator of by . This is the estimator used in the empirical analysis of Section 5.
Note that the linear interpolator is defined only on the convex hull of the couples . In other words, this estimator is unable to estimate the price of options whose parameters do not belong to . From a purely numerical point of view, the estimator could be extended outside , for instance by extrapolation. However, the estimates obtained this way are rarely reliable. Another possibility to extend the estimator outside is to add fictitious couples to the sample where the value of the function is known, e.g. for , or for “very large”. This idea, which is obviously more meaningful than a mere numerical extrapolation, may lead, in some situations, to relatively satisfactory results. Details are discussed in Section 5 below.
3.2 Nadaraya-Watson estimator
Let and be as in the previous subsection. Let be an integrable function with integral equal to , and set, for any , . With a slight abuse of notation, for any , we define the function as , i.e.
The Nadaraya-Watson (NW) estimator of , with smoothing parameters , based on the observations , is defined as
The value of the function at is thus a weighted average of the observations of the type
Usually is such that (i.e. it is symmetric), with is decreasing, so that the NW estimator effectively computes a weighted average of the observations assigning more weight to those closer to . If has compact support, the average is on a finite number of points whose distance from does not exceed a certain threshold (depending on ). The NW estimator can also be interpreted as a local constant least square approximation of the observed values , because (assuming for simplicity )
(see, e.g., [17, p. 34]). Here “local” simply refers to the weighting through that, as before, assigns more weights to observations closer to .
The explicit form of the Nadaraya-Watson estimator of we shall use is
where is the density of the standard Gaussian measure on .
As is well known (see, e.g., ), the choice of the smoothing parameter is crucial in the implementation of non-parametric regression techniques: as the estimator is “undersmoothing”, and as it is “oversmoothing”.444Roughly speaking, as the estimator reproduces the data, i.e. it is equal to for and zero elsewhere, and it converges to a constant equal to the average of the as . We are going to select by (leave-one-out) cross-validation (cf. [17, §1.4, p. 27-ff.]) as follows: let be the Nadaraya-Watson estimator of based on the sample , , and
Then we set
In the NW estimator of the pricing functional we replace above with
i.e. the smoothing parameter is chosen as to minimize the relative error rather than the absolute error.
Since the cross-validation procedure is computationally very intensive, hence rather slow, we shall also consider a much simpler choice of the smoothing parameters as a term of comparison. In particular, following [16, §3.4.2], we shall use
where and are the (sample) standard deviation and the inter-quartile range of , , respectively, and , are defined in the same way with in place of .
(a) In some applications (e.g. to estimate a function together with its derivatives) it is useful to allow to take negative values. Then is not guaranteed to be positive. The usual convention is then simply to redefine as its positive part. (b) If is supported on the whole space, the Nadaraya-Watson estimator is defined also for points that do not lie in the convex hull of . However, since this estimator is nothing else than a local average, estimates produced for points that lie outside should be taken with extreme care, if not plainly discarded.
3.3 Implied volatility estimators
Let denote the Black-Scholes price of a put option with time to maturity and strike , written on an underlying with current price , constant dividend rate and volatility , where the risk-free rate is also constant. As is well known, is strictly monotone, hence, for any , and , there exists a unique , called implied volatility, such that . We shall call by the same name also the function that is uniquely defined by the procedure just described.
If , , is a sample of observed option prices and corresponding strike prices and times to maturity in a fixed day (so that is also fixed), then for each there exists a unique positive number such that
hence can be interpreted as (estimates of the) values of the implied volatility function on the set of points , i.e., for some function , for all . This yields
which immediately suggest another procedure to estimate the pricing functional : let , where denotes the convex hull of , and define by linear interpolation of (in the sense of §3.1), hence set
In the empirical study below we shall employ a normalized estimator of the implied volatility function , in complete analogy to the construction of the normalized linear interpolator of §3.1. Namely, given , we define by linear interpolation at of the function whose value at is , .
Alternatively, one could use, in place of the linear interpolator , a Nadaraya-Watson estimator , obtaining another estimator of the pricing functional. Of course all considerations of the previous subsection regarding the choice of the smoothing parameter, as well as the lack of plausibility for estimates with , apply also in this case.
Note that, in contrast to the linear interpolator and the Nadaraya-Watson estimator, estimating requires estimates of and . While historical data on the risk-free interest rate are readily available, we use as a proxy for the implied estimator discussed at the end of Section 2 and, in more detail, in Section 4 below. Moreover, the Black-Scholes structure allows to obtain an explicit expression for the sensitivity of the implied volatility with respect to the parameter . Such result, which could have some interest in its own right, can be found in the Appendix (the derivation therein deals with call options, but the corresponding result for put options follows easily).
3.4 A parametric estimator of the Variance-Gamma class
A measure on the Borel -algebra of is called Gamma measure with parameters and if
for any Borel set . A random variable is said to be Gamma-distributed with parameters and if its law is a Gamma measure with the same parameters. Elementary calculus (and the definition of the Gamma function) shows that, for any ,
whence it immediately follows that Gamma laws are infinitely divisible. Let be a Gamma measure with . Then there exists a positive increasing Lévy process starting from zero (i.e., a subordinator) such that the law of is , and
hence the law of is Gamma with parameters and . We refer to, e.g.,  for details.
Let a standard Wiener process independent of , and consider the process defined by
where and are constants. Since is a Lévy process and is obtained by subordination of the former process, is itself a Lévy process, which goes under the name of (asymmetric) Variance-Gamma process and was introduced in .
In order to construct a pricing functional, we are going to assume that
This condition guarantees that there exists a constant such that the process is a -martingale. In fact, since is equal in law to , one has, recalling the expression for the moment-generating function of Gaussian laws,
Therefore, if , (3) implies
Since is a Lévy process, it is now easy to conclude that the process is a -martingale.
We postulate that the risk-free interest rate and the dividend rate are constant and that the price process can be written as
so that the market satisfies the no-arbitrage condition.
The hypothesis on just made is not equivalent to assuming that
In fact, setting , , the integration-by-parts formula yields
hence is given by the Doléans stochastic exponential of (see, e.g., [9, Theorem 26.8]), which in this case, recalling that is a pure-jump process with finite variation, reduces to
i.e. . Since the Variance-Gamma process has paths of finite variation, the same conclusion can of course be obtained by elementary path-wise considerations, without any recourse to stochastic calculus.
The price at time zero of a European call option expiring at time is
it is immediately seen that
A completely similar argument shows that the price of a put option with the same features can be written as
The problem of option prices is thus reduced to the evaluation of integrals against a Gamma measure. This can either be accomplished by numerical integration (the Gamma measure has an explicit density and the integrands, as functions of , are “almost” explicit), or, alternatively, by a simulation method. In particular, denoting a sequence of independent copies of by , the strong law of large numbers yields
-almost surely for any (measurable) function such that . Moreover, if , the central limit theorem implies
where . Writing , for a properly chosen , it is immediately seen that thanks to the hypothesis on the parameters , and if . Since typically (negative skew) and rarely exceeds , the condition is not restrictive (the estimated values of in our data set are always larger than ). For several representative choices of parameters, an average over (pseudo)random variates produces estimates of that are in very good agreement with those obtained by numerical integration.
Once a pricing formula for European options is available, calibration of the parameters is rather straightforward. Namely, denoting by the (theoretical) price of a put option in the above VG model (with fixed), assuming that and , , are observed parameters corresponding and option prices in a fixed day, one sets
It should be pointed out, however, that the map is not injective, hence the calibration procedure just outlined is not well posed, i.e. the (daily) estimates , , are not unique. In practice they will depend on the initialization of the minimization algorithm (we have chosen , and , respectively).
We use S&P500 index option data555The raw data are obtained from Historical Option Data, see www.historicaloptiondata.com. for the period January 3, 2012 to December 31, 2012. The sample contains observations of European call and put options. Prices are averages of bid and ask prices. Data points with time-to-maturity less than one day or volume less than are eliminated. This reduces the size of the sample to : 61% and 39% of the options are of put and call type, respectively.
During 2012 the annualized mean and standard deviation of daily returns of the S&P500 index were equal to and , respectively. During the same period the 1-year T-bill rate was very close to zero, with minimal variations: in particular, its mean was equal to , with a standard deviation equal to .
As is well known, index options on the S&P500 are very actively traded: the average daily volume is contracts, with maturities ranging from day to almost years. Descriptive statistics of the data are collected in Table 1.
This table collects some simple statistics for prices of European call and put options on the S&P500 index. The sample period is January 3, 2012 to December 31, 2012. Implied volatilities are annualized, time to maturity is expressed in days, strike and futures prices are expressed in index points.
|Implied ATM vol.||0.2||0.1||0.0||0.1||0.1||0.2||0.4||0.4||2.0|
|Time to maturity||96.7||157.0||1.0||2.0||4.0||38.0||269.0||404.0||1088.0|
It is commonly accepted that prices of in-the-money (ITM) options, because of their small trading volume, are not reliable, and that, for this reason, they should be replaced by prices computed via put-call parity whenever possible: the “new” prices are determined by prices of out-of-the-money (OTM) options, that are generally traded in larger volumes, and are hence considered to be accurately priced (cf., e.g., [1, p. 517-ff.] and ). We need therefore to check whether our data set is affected by such phenomenon. In other words, we need to check whether the recorded prices for ITM options satisfy the put-call parity relation with the corresponding OTM options. We are going to show that prices of ITM options in our data set can be considered perfectly reliable, and hence that no correction is needed. It is natural to argue that discarding prices of options whose trading volume is lower than 100 already eliminates possibly unreliable quotes.
Basic summary statistics on the volume of the options in our database according to their moneyness (see below for the precise definition we adopt) are collected in Table 2. Note that the total number of traded ITM and OTM options are rather close.
This table collects basic descriptive statistical figures on the daily trading volume of European call and put options on the S&P500 index. The sample period is January 3, 2012 to December 31, 2012. Moneyness is defined in terms of the spot price falling within a interval centered around the strike price.
Let us describe in detail the procedure to obtain “better” prices of ITM options via prices of corresponding OTM options: let , and be given, where and denote the observed prices at time of the index and of an ITM call option666The reasoning obviously holds, mutatis mutandis, also for an ITM put option. with maturity and strike price , respectively. Since the put option with the same maturity and strike is necessarily OTM, if Hypothesis (H) is satisfied (or simply assumed to hold), the put-call parity identity (1) provides an “alternative” price for the ITM call option, provided that
a put option with the same maturity and strike is traded;
estimates of and are available.
While good estimates of are easily available, estimating is in general not straightforward. In particular, since we are going to use the estimator defined in (2), which relies on put-call parity, one needs to avoid circular reasoning. This is achieved by using pairs of at-the-money (ATM) put and call options with the same maturity and strike to estimate . The latter estimates are then used in the put-call parity formula for ITM/OTM options, so that no circularity is involved, as, for any given day, the sets of ATM, ITM and OTM options are disjoint. ATM options are in general very liquid, so that their observed prices can be considered accurate. This implies that the corresponding estimates of can also be considered accurate. On the other hand, it may happen that for an ITM option with maturity there is no available estimate of , simply because no couple of ATM options with that maturity is traded. In this case we use linear interpolation, if possible, and nearest-neighbor extrapolation otherwise.
To implement the procedure just outlined, it is clearly necessary to define a measure of moneyness, so that options can be (uniquely) classified as at-the-money, in-the-money, or out-of-the-money. The simplest definition of (logarithmic spot) moneyness at time for a European call or put option with maturity and strike is . Closely related is the logarithmic forward simple moneyness, defined as , that is clearly better suited especially for options with longer maturities. However, recalling that the forward price