Nonparametric estimates of pricing functionals
Abstract
We analyze the empirical performance of several nonparametric estimators of the pricing functional for European options, using historical put and call prices on the S&P500 during the year 2012. Two main families of estimators are considered, obtained by estimating the pricing functional directly, and by estimating the (BlackScholes) implied volatility surface, respectively. In each case simple estimators based on linear interpolation are constructed, as well as more sophisticated ones based on smoothing kernels, à la NadarayaWatson. The results based on the analysis of the empirical pricing errors in an extensive outofsample study indicate that a simple approach based on the BlackScholes formula coupled with linear interpolation of the volatility surface outperforms, both in accuracy and computational speed, all other methods.
Keywords: nonparametric estimation; option pricing; implied volatility.
JEL codes: G13, C14, C52.
1 Introduction
The purpose of this work is to analyze the empirical performance of some nonparametric and semiparametric estimators of pricing functional, with particular emphasis on the simplest contingent claims, i.e. European put and call options. The wellknown idea underlying nonparametric estimation is the following: assume that the price of a contingent claim of a certain type can be written as , where is a function to be estimated, and are observable parameters. Given a sample
and , one can estimate by linear interpolation (for instance) of the function defined as
This procedure is well understood if belongs to the convex hull of (see §3.1 for more details). There exist of course many other procedures based on nonlinear interpolation, rather than linear, that may be preferable in certain situations, for instance to obtain estimators that are more regular than just continuous. The same problem has also been studied from a statistical perspective, leading to the large literature on nonparametric estimation of regression functions^{1}^{1}1The even larger literature on nonparametric density estimation treats a less general but very related problem. (see, e.g., [3] and references therein). One of the most popular estimators in this literature is the socalled NadarayaWatson estimator (see [12, 18]), that is
where is a strictly positive continuous radial function with integral equal to and , or a slight generalization thereof.
Other functions of the parameters can clearly be estimated in the same manner. In particular, if the parameters are the usual inputs in the BlackScholes pricing formula (except the volatility), one can estimate the implied volatility, which can then be fed back to the BlackScholes formula to produce further estimates of the pricing functional.
One of the first works applying nonparametric regression à la NadarayaWatson in pricing problems is [1], where the authors, among other things, report impressive empirical results on the precision of semiparametric estimators applied to pricing European call options on the S&P500. To date, a large literature deals with this approach to pricing – see, e.g., [7] and references therein, as well as [6, 14] for related ideas.
Our purpose is to understand how different (but related) nonparametric approaches perform in terms of pricing precision, also in comparison with nontrivial fully parametric alternatives. A natural question, of clear relevance for practical purposes, is whether NadarayaWatson kernel estimation produces better estimates than elementary linear interpolation. In order to avoid rather heavy Markovianity and stationarity assumptions on the price process of the underlying index, we use observed option prices in a given day to estimate (unobserved!) option prices in the same day. It turns out that, from the empirical point of view, there does not seem to be any advantage connected with NadarayaWatson kernel estimators, which perform rather poorly in comparison with simple linearinterpolation estimators. The latter also consistently outperform a benchmark parametric estimator based on the (skewed) VarianceGamma process (see [11]).
The focus of this work is somewhat different from that of [1], whose main aim is to estimate the socalled stateprice density. However, such estimates are obtained by the second derivative with respect to the strike price of (estimates of) the pricing functional for European call options, and their main (practical) application seems to be pricing anyway. On the other hand, the main terms of comparison used in [1] are rather involved methods based on neural networks and on implied binomial trees, and empirical tests are done in a very different way: observed option prices over a period of nine months are aggregated to construct a NadarayaWatson estimator of the implied volatility surface, seen as a function of underlying’s futures price, strike, and time to maturity. This estimator is then used to forecast prices of European call options (with strike prices within a certain range) at five later dates, namely from one to twenty days in the future.
We have not been able to find in the literature neither empirical studies where observed option prices over different dates are not aggregated, nor a discussion about the practical aspects of nonparametric estimation of pricing functionals, for the mere purpose of pricing plain vanilla European options.^{2}^{2}2One can find, however, articles that use nonparametric methods to study related problems, without aggregating option prices across days. For instance, [8] estimate the riskneutral density and [2] test the monotonicity of the pricing kernel. Our goal is to try and answer some basic fundamental questions in this regard, on the basis of an extensive outofsample analysis.
The rest of the paper is organized as follows: in Section 2, after recalling some facts about yield processes associated to dividendpaying assets that are not easily found in standard textbooks, we prove a putcall parity identity for European options written on a dividendpaying asset, under a conditional uncorrelation assumption involving the asset, dividend rate, and riskfree rate processes. These basic theoretical results are needed because the S&P500 pays dividends. The various estimators used in the empirical analysis to follow are introduced in detail in Section 3. A preliminary analysis on the data set of option prices on the S&P500 for the whole year 2012 is conducted in Section 4. Finally, Section 5 contains a detailed study of the empirical performance of the estimators introduced in Section 3.
2 Setting and preliminaries
Given a probability space endowed with a (rightcontinuous, complete) filtration , let the adapted processes and describe the price of two traded assets: the former is just the riskfree cash account, i.e.
where denotes the adapted, a.s. positive, shortrate process; the latter is a risky asset with an associated adapted, a.s. positive, dividend rate process , such that
The corresponding yield process is defined as
We assume that there exists a probability measure , equivalent to , such that, setting , the discounted yield process , defined by
is a square integrable martingale. Setting , the integration by parts formula yields
where
and, by continuity of ,
In particular,
is a (square integrable) martingale , because is bounded and predictable.
As is well known, the existence of an equivalent probability measure such that the discounted yield process is a martingale implies that the market defined by and is free of arbitrage. In general is not unique, unless the market is complete, and we assume that is just a fixed pricing measure. More precisely, we assume that we are given a pricing functional
or, equivalently,
where the random variable often goes under the name of stochastic discount factor.
We shall write, for compactness of notation,
where .
We are going to use several times the following assumptions on , , and .
Hypothesis (H).
One has, for any ,
Here, for any two random variables , ,
Hypothesis (H) is equivalent to
for all .
2.1 Putcall parity with dividends and applications
To implement some estimators of the pricing functional, but also to carry out a preliminary analysis on the raw option prices, we shall need a version of putcall parity for European call and put options. This will also yield a natural estimator for (a functional of) the dividend rate process .
Proposition 2.1.
Assume that Hypothesis (H) holds. Then, for any ,
(1) 
where and denote the prices at time of European call and put options, respectively, with maturity , strike price , and underlying price process .
Proof.
Multiplying the identity
by and taking conditional expectation yields
Using Hypothesis (H) and the martingale property of the discounted yield process associated to and , one has
where , hence
Similarly,
Collecting terms and simplifying yields
Recall that the forward price at time of a contingent claim is defined as
(see, e.g., [10, §2.4]). Using arguments entirely analogous to those in the proof of the above Proposition, one obtains the following “spotforward parity”.
Proposition 2.2.
If Hypothesis (H) holds, then
In particular, (1) could also be written as
The putcall parity identity of Proposition 2.1 implies a procedure to estimate : assuming that an estimator of is available and is denoted by the same symbol, and that prices of call and put options with the same maturity and strike price are observable, identity (1) yields
(2) 
where and denote the price at time of European call and put options, respectively, with maturity and strike price . This estimate of is usually preferred in applications to option pricing to other proxies of , such as historical estimates.
3 Estimators of the pricing functional
From now on we shall assume that

the filtration is the (rightcontinuous, completed) filtration generated by ;

the process is Markovian, i.e., for any bounded random variable measurable with respect to , one has
These assumptions immediately imply that there exists a function such that
for all . In particular, without any further assumptions, the pricing functional will be timedependent, hence it is wrong, in general, to use the price functional estimated at time to price options at time , or, similarly, to estimate the pricing functional aggregating data of different dates.^{3}^{3}3For a related discussion, also from an economic point of view, cf. [14]. This procedure become meaningful if we further assume that the process is a timehomogeneous Markov process: in this case depends on and only through their difference . However, it is far from clear that a timehomogeneity assumption on the datagenerating process would be appropriate, hence the empirical analysis of Section 5 is carried out with fixed. In §4.1 we also provide empirical data that does not seem to support the plausibility of a timehomogeneity assumption.
We now proceed to describe in detail the various estimators of the pricing functional for put options that will be tested in Section 5. Analogous considerations, not spelled out in detail, hold for call options and, in fact, for arbitrary European derivatives: some care has to be taken only if the payoff function is unbounded, in which case integrability conditions have to be assumed.
3.1 Linear interpolator
Let and subsets of and , and a function such that for all . Denoting the convex hull of by , the linear interpolator is a function obtained by linear interpolation of the function , where is the marked empirical measure
stands for the Dirac measure, and .
Linear interpolation here means that the Delaunay triangulation of is computed, and, for any , is obtained by interpolation of on the vertices of the simplex containing . We recall that a triangulation of the set of points is a partitioning of the convex hull into simplices whose vertices are the points . In particular, any two simplices of such a partition either do not intersect or share a common face. A Delaunay triangulation of satisfies the following additional property: if is a ball in such that all vertices of a simplex belong to its boundary, then the interior of does not contain any element of . Once a Delaunay triangulation of has been obtained, , with , is defined as follows: let the vertices of the unique simplex such that , and write
Then we set . For an extensive treatment of these topics, see, e.g., [13].
Let be fixed. Then, as seen above, we can write
Assuming that , adaptedness of and Markovianity imply
Denoting by the linear interpolator of , we define the normalized linear interpolator of by . This is the estimator used in the empirical analysis of Section 5.
Note that the linear interpolator is defined only on the convex hull of the couples . In other words, this estimator is unable to estimate the price of options whose parameters do not belong to . From a purely numerical point of view, the estimator could be extended outside , for instance by extrapolation. However, the estimates obtained this way are rarely reliable. Another possibility to extend the estimator outside is to add fictitious couples to the sample where the value of the function is known, e.g. for , or for “very large”. This idea, which is obviously more meaningful than a mere numerical extrapolation, may lead, in some situations, to relatively satisfactory results. Details are discussed in Section 5 below.
3.2 NadarayaWatson estimator
Let and be as in the previous subsection. Let be an integrable function with integral equal to , and set, for any , . With a slight abuse of notation, for any , we define the function as , i.e.
The NadarayaWatson (NW) estimator of , with smoothing parameters , based on the observations , is defined as
The value of the function at is thus a weighted average of the observations of the type
Usually is such that (i.e. it is symmetric), with is decreasing, so that the NW estimator effectively computes a weighted average of the observations assigning more weight to those closer to . If has compact support, the average is on a finite number of points whose distance from does not exceed a certain threshold (depending on ). The NW estimator can also be interpreted as a local constant least square approximation of the observed values , because (assuming for simplicity )
(see, e.g., [17, p. 34]). Here “local” simply refers to the weighting through that, as before, assigns more weights to observations closer to .
The explicit form of the NadarayaWatson estimator of we shall use is
where is the density of the standard Gaussian measure on .
As is well known (see, e.g., [17]), the choice of the smoothing parameter is crucial in the implementation of nonparametric regression techniques: as the estimator is “undersmoothing”, and as it is “oversmoothing”.^{4}^{4}4Roughly speaking, as the estimator reproduces the data, i.e. it is equal to for and zero elsewhere, and it converges to a constant equal to the average of the as . We are going to select by (leaveoneout) crossvalidation (cf. [17, §1.4, p. 27ff.]) as follows: let be the NadarayaWatson estimator of based on the sample , , and
Then we set
In the NW estimator of the pricing functional we replace above with
i.e. the smoothing parameter is chosen as to minimize the relative error rather than the absolute error.
Since the crossvalidation procedure is computationally very intensive, hence rather slow, we shall also consider a much simpler choice of the smoothing parameters as a term of comparison. In particular, following [16, §3.4.2], we shall use
where and are the (sample) standard deviation and the interquartile range of , , respectively, and , are defined in the same way with in place of .
Remark 3.1.
(a) In some applications (e.g. to estimate a function together with its derivatives) it is useful to allow to take negative values. Then is not guaranteed to be positive. The usual convention is then simply to redefine as its positive part. (b) If is supported on the whole space, the NadarayaWatson estimator is defined also for points that do not lie in the convex hull of . However, since this estimator is nothing else than a local average, estimates produced for points that lie outside should be taken with extreme care, if not plainly discarded.
3.3 Implied volatility estimators
Let denote the BlackScholes price of a put option with time to maturity and strike , written on an underlying with current price , constant dividend rate and volatility , where the riskfree rate is also constant. As is well known, is strictly monotone, hence, for any , and , there exists a unique , called implied volatility, such that . We shall call by the same name also the function that is uniquely defined by the procedure just described.
If , , is a sample of observed option prices and corresponding strike prices and times to maturity in a fixed day (so that is also fixed), then for each there exists a unique positive number such that
hence can be interpreted as (estimates of the) values of the implied volatility function on the set of points , i.e., for some function , for all . This yields
which immediately suggest another procedure to estimate the pricing functional : let , where denotes the convex hull of , and define by linear interpolation of (in the sense of §3.1), hence set
In the empirical study below we shall employ a normalized estimator of the implied volatility function , in complete analogy to the construction of the normalized linear interpolator of §3.1. Namely, given , we define by linear interpolation at of the function whose value at is , .
Alternatively, one could use, in place of the linear interpolator , a NadarayaWatson estimator , obtaining another estimator of the pricing functional. Of course all considerations of the previous subsection regarding the choice of the smoothing parameter, as well as the lack of plausibility for estimates with , apply also in this case.
Note that, in contrast to the linear interpolator and the NadarayaWatson estimator, estimating requires estimates of and . While historical data on the riskfree interest rate are readily available, we use as a proxy for the implied estimator discussed at the end of Section 2 and, in more detail, in Section 4 below. Moreover, the BlackScholes structure allows to obtain an explicit expression for the sensitivity of the implied volatility with respect to the parameter . Such result, which could have some interest in its own right, can be found in the Appendix (the derivation therein deals with call options, but the corresponding result for put options follows easily).
3.4 A parametric estimator of the VarianceGamma class
A measure on the Borel algebra of is called Gamma measure with parameters and if
for any Borel set . A random variable is said to be Gammadistributed with parameters and if its law is a Gamma measure with the same parameters. Elementary calculus (and the definition of the Gamma function) shows that, for any ,
(3) 
and
whence it immediately follows that Gamma laws are infinitely divisible. Let be a Gamma measure with . Then there exists a positive increasing Lévy process starting from zero (i.e., a subordinator) such that the law of is , and
hence the law of is Gamma with parameters and . We refer to, e.g., [15] for details.
Let a standard Wiener process independent of , and consider the process defined by
where and are constants. Since is a Lévy process and is obtained by subordination of the former process, is itself a Lévy process, which goes under the name of (asymmetric) VarianceGamma process and was introduced in [11].
In order to construct a pricing functional, we are going to assume that
This condition guarantees that there exists a constant such that the process is a martingale. In fact, since is equal in law to , one has, recalling the expression for the momentgenerating function of Gaussian laws,
Therefore, if , (3) implies
hence choosing
Since is a Lévy process, it is now easy to conclude that the process is a martingale.
We postulate that the riskfree interest rate and the dividend rate are constant and that the price process can be written as
so that the market satisfies the noarbitrage condition.
Remark 3.2.
The hypothesis on just made is not equivalent to assuming that
In fact, setting , , the integrationbyparts formula yields
hence is given by the Doléans stochastic exponential of (see, e.g., [9, Theorem 26.8]), which in this case, recalling that is a purejump process with finite variation, reduces to
i.e. . Since the VarianceGamma process has paths of finite variation, the same conclusion can of course be obtained by elementary pathwise considerations, without any recourse to stochastic calculus.
The price at time zero of a European call option expiring at time is
Setting
it is immediately seen that
where
A completely similar argument shows that the price of a put option with the same features can be written as
The problem of option prices is thus reduced to the evaluation of integrals against a Gamma measure. This can either be accomplished by numerical integration (the Gamma measure has an explicit density and the integrands, as functions of , are “almost” explicit), or, alternatively, by a simulation method. In particular, denoting a sequence of independent copies of by , the strong law of large numbers yields
almost surely for any (measurable) function such that . Moreover, if , the central limit theorem implies
where . Writing , for a properly chosen , it is immediately seen that thanks to the hypothesis on the parameters , and if . Since typically (negative skew) and rarely exceeds , the condition is not restrictive (the estimated values of in our data set are always larger than ). For several representative choices of parameters, an average over (pseudo)random variates produces estimates of that are in very good agreement with those obtained by numerical integration.
Once a pricing formula for European options is available, calibration of the parameters is rather straightforward. Namely, denoting by the (theoretical) price of a put option in the above VG model (with fixed), assuming that and , , are observed parameters corresponding and option prices in a fixed day, one sets
where . This procedure, possibly with the sum of absolute errors instead of relative errors, is widely used by practitioners as well as in academic publications (cf. e.g. [5] and [4], respectively).
Remark 3.3.
It should be pointed out, however, that the map is not injective, hence the calibration procedure just outlined is not well posed, i.e. the (daily) estimates , , are not unique. In practice they will depend on the initialization of the minimization algorithm (we have chosen , and , respectively).
4 Data
We use S&P500 index option data^{5}^{5}5The raw data are obtained from Historical Option Data, see www.historicaloptiondata.com. for the period January 3, 2012 to December 31, 2012. The sample contains observations of European call and put options. Prices are averages of bid and ask prices. Data points with timetomaturity less than one day or volume less than are eliminated. This reduces the size of the sample to : 61% and 39% of the options are of put and call type, respectively.
During 2012 the annualized mean and standard deviation of daily returns of the S&P500 index were equal to and , respectively. During the same period the 1year Tbill rate was very close to zero, with minimal variations: in particular, its mean was equal to , with a standard deviation equal to .
As is well known, index options on the S&P500 are very actively traded: the average daily volume is contracts, with maturities ranging from day to almost years. Descriptive statistics of the data are collected in Table 1.
This table collects some simple statistics for prices of European call and put options on the S&P500 index. The sample period is January 3, 2012 to December 31, 2012. Implied volatilities are annualized, time to maturity is expressed in days, strike and futures prices are expressed in index points.
Percentiles  

Variable  Mean  Std  Min  Max  
Call price  34.3  98.8  0.0  0.1  0.2  9.2  75.2  115.5  1270.0 
Put price  21.3  46.5  0.0  0.1  0.1  5.9  58.2  93.7  1197.0 
Implied vol.  0.2  0.1  0.0  0.1  0.1  0.2  0.4  0.4  2.6 
Implied ATM vol.  0.2  0.1  0.0  0.1  0.1  0.2  0.4  0.4  2.0 
Time to maturity  96.7  157.0  1.0  2.0  4.0  38.0  269.0  404.0  1088.0 
Strike price  1301.0  208.4  100.0  950.0  1075.0  1345.0  1480.0  1525.0  3000.0 
Futures price  1374.4  48.4  1207.2  1289.5  1309.1  1377.1  1435.9  1450.7  1466.8 
It is commonly accepted that prices of inthemoney (ITM) options, because of their small trading volume, are not reliable, and that, for this reason, they should be replaced by prices computed via putcall parity whenever possible: the “new” prices are determined by prices of outofthemoney (OTM) options, that are generally traded in larger volumes, and are hence considered to be accurately priced (cf., e.g., [1, p. 517ff.] and [5]). We need therefore to check whether our data set is affected by such phenomenon. In other words, we need to check whether the recorded prices for ITM options satisfy the putcall parity relation with the corresponding OTM options. We are going to show that prices of ITM options in our data set can be considered perfectly reliable, and hence that no correction is needed. It is natural to argue that discarding prices of options whose trading volume is lower than 100 already eliminates possibly unreliable quotes.
Basic summary statistics on the volume of the options in our database according to their moneyness (see below for the precise definition we adopt) are collected in Table 2. Note that the total number of traded ITM and OTM options are rather close.
This table collects basic descriptive statistical figures on the daily trading volume of European call and put options on the S&P500 index. The sample period is January 3, 2012 to December 31, 2012. Moneyness is defined in terms of the spot price falling within a interval centered around the strike price.
Percentiles  

Moneyness  Mean  Std  Min  Max  
Atthemoney  2696  4793  100  110  150  825  7824  12475  62334 
Inthemoney  1489  3972  100  100  113  490  3000  6000  65412 
Outofthemoney  1688  3191  100  108  133  640  4037  6381  101718 
Let us describe in detail the procedure to obtain “better” prices of ITM options via prices of corresponding OTM options: let , and be given, where and denote the observed prices at time of the index and of an ITM call option^{6}^{6}6The reasoning obviously holds, mutatis mutandis, also for an ITM put option. with maturity and strike price , respectively. Since the put option with the same maturity and strike is necessarily OTM, if Hypothesis (H) is satisfied (or simply assumed to hold), the putcall parity identity (1) provides an “alternative” price for the ITM call option, provided that

a put option with the same maturity and strike is traded;

estimates of and are available.
While good estimates of are easily available, estimating is in general not straightforward. In particular, since we are going to use the estimator defined in (2), which relies on putcall parity, one needs to avoid circular reasoning. This is achieved by using pairs of atthemoney (ATM) put and call options with the same maturity and strike to estimate . The latter estimates are then used in the putcall parity formula for ITM/OTM options, so that no circularity is involved, as, for any given day, the sets of ATM, ITM and OTM options are disjoint. ATM options are in general very liquid, so that their observed prices can be considered accurate. This implies that the corresponding estimates of can also be considered accurate. On the other hand, it may happen that for an ITM option with maturity there is no available estimate of , simply because no couple of ATM options with that maturity is traded. In this case we use linear interpolation, if possible, and nearestneighbor extrapolation otherwise.
To implement the procedure just outlined, it is clearly necessary to define a measure of moneyness, so that options can be (uniquely) classified as atthemoney, inthemoney, or outofthemoney. The simplest definition of (logarithmic spot) moneyness at time for a European call or put option with maturity and strike is . Closely related is the logarithmic forward simple moneyness, defined as , that is clearly better suited especially for options with longer maturities. However, recalling that the forward price