Optimum thresholding using mean and conditional mean square error

José E. Figueroa-López (Department of Mathematics, Washington University in St. Louis, MO, 63130, USA; figueroa@math.wustl.edu) and Cecilia Mancini (Department of Management and Economics, University of Florence, via delle Pandette 9, 50127; cecilia.mancini@unifi.it)
July 8, 2019

We consider a univariate semimartingale model for (the logarithm of) an asset price, containing jumps having possibly infinite activity (IA). The nonparametric threshold estimator of the integrated variance (IV) proposed in [18] is constructed using observations on a discrete time grid: it sums the squared increments of the process that are below a threshold, a deterministic function of the observation step and possibly of the coefficients of the process. All threshold functions satisfying given conditions yield asymptotically consistent estimates of IV; however, the finite-sample properties of the estimator can depend on the specific choice of the threshold. We aim here at optimally selecting the threshold by minimizing either the estimation mean square error (MSE) or the conditional mean square error (cMSE). The latter criterion yields a threshold that is optimal not in mean but for the specific volatility and jump paths at hand.

A parsimonious characterization of the optimum is established, which turns out to be asymptotically proportional to Lévy’s modulus of continuity of the underlying Brownian motion. Moreover, minimizing the cMSE enables us to propose a novel implementation scheme for approximating the optimal threshold. Monte Carlo simulations illustrate the superior performance of the proposed method.

Keywords: Threshold estimator, integrated variance, Lévy jumps, mean square error, conditional mean square error, modulus of continuity of the Brownian motion paths, numerical scheme

JEL classification codes: C6, C13

1 Introduction

The importance of including jump components in asset price models has been extensively highlighted. For instance, Huang and Tauchen [13] documented empirically that jumps account for 7% of the S&P500 market price variance, and many different tests for the presence of jumps in asset prices have been proposed and applied in the literature (see [19], Sec. 17.3, for a review of the most used tests). From an economic point of view, jumps may reflect, for instance, reactions of the market to important announcements or events. Thus semimartingale models with jumps are broadly used in a variety of financial applications, for example derivative pricing, and infinite activity jump components have also been considered (see e.g. [9], ch.15).

Separately identifying the contribution of the Brownian part (through the Integrated Variance, IV) and that of the jumps to the asset price variations, when prices can only be observed discretely, is crucial in many respects, for instance for model assessment and for improving volatility forecasting: in [6] the proposed test for the presence of jumps is obtained after having filtered out the jump component; in [2] the separation makes it possible to construct two tests for recognizing whether the jumps have finite or infinite variation; in [3] it is shown that including a separate factor accounting for the jumps in an econometric model for the realized variance substantially improves the out-of-sample volatility forecasts. The correct identification of a model has a significant impact on option pricing and on risk management, and thus on asset allocation: for instance, Carr and Wu [8] show that the asymptotic behavior of the price of an option as the time-to-maturity approaches zero is substantially different depending on whether the model for the underlying contains jumps or not, and whether the jumps have finite or infinite variation; Liu, Longstaff, and Pan [17] find that incorporating jump events in the model dramatically affects the optimal investment strategy.

With discrete (non-noisy) observations, nonparametrically disentangling the jumps from the integrated variance (IV) has mainly been done by using Multipower Variations (MPVs) and the Truncated (or Threshold) Realized Variance (TRV) (see [19], Sec. 17.2, for a review that also covers other methods). MPV relies on the observation that, when the jumps have finite activity, the probability of having jumps in adjacent sampling intervals is very small; with infinite activity jumps, however, this probability is much larger. Hence, MPV may not work well in the general case. In contrast, TRV has been shown to be consistent even in the presence of any infinite activity jump component ([18]). Further, it is efficient as soon as the jumps have finite variation.
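The simplest multipower variation, the bipower variation, can be sketched as follows. This is an illustrative sketch only: the function names, the compound Poisson jump model with Gaussian jump sizes, and all parameter values are assumptions made for the example and are not taken from the paper.

```python
import math
import random


def realized_variance(increments):
    """Sum of squared increments; estimates the total quadratic variation."""
    return sum(dx * dx for dx in increments)


def bipower_variation(increments):
    """(pi/2) * sum of |dx_i| * |dx_{i-1}| over adjacent pairs of increments."""
    pairs = zip(increments[1:], increments[:-1])
    return (math.pi / 2.0) * sum(abs(a) * abs(b) for a, b in pairs)


def simulate_increments(n, T, sigma, lam, jump_sd, rng):
    """Increments of sigma*W plus compound Poisson jumps (hypothetical model)."""
    h = T / n
    incs = [sigma * math.sqrt(h) * rng.gauss(0.0, 1.0) for _ in range(n)]
    t = rng.expovariate(lam)            # first jump time
    while t < T:
        incs[min(int(t / h), n - 1)] += rng.gauss(0.0, jump_sd)
        t += rng.expovariate(lam)       # next jump time
    return incs


rng = random.Random(1)
incs = simulate_increments(n=2000, T=1.0, sigma=0.2, lam=5.0, jump_sd=0.1, rng=rng)
print("IV  =", 0.2 ** 2 * 1.0)
print("RV  =", realized_variance(incs))
print("BPV =", bipower_variation(incs))
```

Because a jump is multiplied by a neighbouring increment that is typically jump-free, its contribution to the bipower sum is damped. This damping is what breaks down when jumps are likely in adjacent intervals.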

However, the choice of the truncation level (threshold) has an impact on the estimation performance of IV in finite samples. The estimation error is large when the threshold is either too small or too large. In the first case too many increments are discarded, including increments bearing relevant information about the Brownian part, and TRV underestimates IV. In the second case too many increments are kept within TRV, including many increments containing jumps, leading to an overestimation of IV. Many different data-driven choices of the threshold have been proposed in the literature; for instance, Ait-Sahalia and Jacod [2] (Sec. 4 therein) chose a truncation level of the form $\alpha h^{\varpi}$, where $h$ is the observation step and $\alpha$ is a multiplier of the standard deviations of the continuous martingale part of the process (other choices are described in [19], p.418). However, it is important to control the estimation error for a given time resolution $h$, and here we look for an endogenous, theoretically supported, optimal choice.

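We consider the model

$$dX_t = \sigma_t\, dW_t + dJ_t,$$

where $W$ is a standard Brownian motion, $\sigma$ is a càdlàg process, and $J$ is a pure jump semimartingale (SM) process. We assume that we have at our disposal a record $\{X_{t_0}, X_{t_1}, \dots, X_{t_n}\}$ of discrete observations of $X$ spanning the fixed time interval $[0,T]$. We also define $\Delta_i Z := Z_{t_i} - Z_{t_{i-1}}$ as the increment of any process $Z$, and a threshold function $r(\sigma, h)$ as any deterministic non-negative function of the observation step $h$, and possibly of a summary measure $\sigma$ of the realized volatility path of $X$, such that for any value of $\sigma$ the following conditions are satisfied:

$$\lim_{h \to 0} r(\sigma, h) = 0, \qquad \lim_{h \to 0} \frac{h \log\frac{1}{h}}{r(\sigma, h)} = 0.$$

We know that then TRV, given by

$$\hat{IV}_n := \sum_{i=1}^{n} (\Delta_i X)^2\, I_{\{(\Delta_i X)^2 \le r(\sigma, h_i)\}},$$

where $h_i := t_i - t_{i-1}$, is a consistent estimator of $IV := \int_0^T \sigma_s^2\, ds$ as $\sup_i h_i \to 0$, as soon as $\sigma$ is a.s. bounded away from zero on $[0,T]$. In the case where the jump process has finite variation (FV) and the observations are evenly spaced, the estimator is also asymptotically Gaussian and efficient.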
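A minimal sketch of the truncated realized variance on simulated data may help fix ideas. The function names, the compound Poisson jump model, the parameter values, and the power threshold `h ** 0.9` are all assumptions chosen for illustration; they are not prescribed by the paper.

```python
import math
import random


def truncated_realized_variance(increments, r):
    """Sum of the squared increments whose square does not exceed the threshold r."""
    return sum(dx * dx for dx in increments if dx * dx <= r)


def simulate_increments(n, T, sigma, lam, jump_sd, rng):
    """Increments of sigma*W plus compound Poisson jumps (hypothetical model)."""
    h = T / n
    incs = [sigma * math.sqrt(h) * rng.gauss(0.0, 1.0) for _ in range(n)]
    t = rng.expovariate(lam)            # first jump time
    while t < T:
        incs[min(int(t / h), n - 1)] += rng.gauss(0.0, jump_sd)
        t += rng.expovariate(lam)       # next jump time
    return incs


n, T, sigma = 2000, 1.0, 0.2
h = T / n
rng = random.Random(7)
incs = simulate_increments(n, T, sigma, lam=5.0, jump_sd=0.1, rng=rng)

# Hypothetical power threshold: tends to 0 more slowly than h*log(1/h).
r = h ** 0.9
print("IV  =", sigma ** 2 * T)
print("RV  =", sum(dx * dx for dx in incs))
print("TRV =", truncated_realized_variance(incs, r))
```

The threshold is applied to squared increments, so `r` lives on the variance scale; a threshold on the increment scale would be its square root.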

For the choice of the threshold (TH) in finite samples, we consider the following two optimality criteria: minimization of the MSE, the expected quadratic error in the estimation of IV; and minimization of the cMSE, the expected quadratic error conditional on the realized paths of the jump process $J$ and of the volatility process $\sigma$. Even though, as mentioned above, many different TH selection procedures have been proposed, the literature on optimal TH selection is rather scarce. In [11] the TH that minimizes the expected number of jump misclassifications is considered for a class of additive processes with finite activity (FA) jumps and absolutely continuous characteristics. Even though it is shown therein that the proposed criterion is asymptotically equivalent to the minimization of the MSE in the case of Lévy processes with FA jumps, the latter optimality criterion was not directly analyzed in [11]. Here we go further: we not only investigate the MSE criterion in the presence of FA jumps but also consider infinite activity jumps, and we introduce the novel cMSE criterion. The latter criterion yields a threshold that is optimal not in mean but for the specific volatility and jump paths at hand, so it is particularly appealing for non-stationary processes, for which, even if the MSE were feasible, the deviation of each realization from the unconditional mean value could be quite large, yielding a poor performance of the unconditional criterion. Moreover, minimizing the cMSE is important from a practical point of view, as will be seen in Section 5, where we propose a new TH selection method in the presence of FA jump processes.

Assuming evenly spaced observations, it turns out that for any semimartingale for which the volatility and the jump processes are independent of the underlying Brownian motion, the MSE and the cMSE are explicit functions of the TH; under each criterion an optimal TH exists and solves an explicitly given equation, which differs between the two criteria. Under certain specific assumptions we also show uniqueness of the optimal TH: for Lévy processes $X$, under the first criterion; for constant volatility processes with general FA jumps, under the second criterion.

The equation characterizing the optimal threshold depends on the observation time step $h$, and so does its solution. The optimal TH has to tend to 0 as $h$ tends to zero and, under each criterion, an asymptotic expansion with respect to $h$ is possible for some terms within the equation, which in turn implies an asymptotic expansion of the optimal TH. Under the MSE criterion, when $X$ is Lévy and has either finite activity jumps or infinite activity symmetric strictly stable jumps, the leading term of the expansion is explicit in $h$, and in both cases it is proportional to the modulus of continuity of the Brownian motion paths and to the spot volatility of $X$, with a proportionality constant that depends on the jump activity index of $X$. Thus the higher the jump activity is, the lower the optimal threshold has to be if we want to discard the higher noise represented by the jumps and to capture information about IV.
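To give a feel for the scale set by the modulus of continuity, the following sketch compares the largest absolute Brownian increment on an evenly spaced grid over $[0,1]$ with Lévy's modulus $\sqrt{2h\log(1/h)}$. The helper names and the grid sizes are assumptions made for this illustration.

```python
import math
import random


def levy_modulus(h):
    """Levy's modulus of continuity sqrt(2*h*log(1/h)), for 0 < h < 1."""
    return math.sqrt(2.0 * h * math.log(1.0 / h))


def max_abs_brownian_increment(n, T, rng):
    """Largest |W_{t_i} - W_{t_{i-1}}| over n evenly spaced steps on [0, T]."""
    h = T / n
    return max(abs(math.sqrt(h) * rng.gauss(0.0, 1.0)) for _ in range(n))


rng = random.Random(0)
ratios = {}
for n in (100, 1000, 10000):
    h = 1.0 / n
    ratios[n] = max_abs_brownian_increment(n, 1.0, rng) / levy_modulus(h)
    print(n, round(ratios[n], 3))
```

The printed ratios stay of order one, which is why a threshold proportional to the modulus sits right at the edge of the largest purely Brownian increments.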

The leading term of the optimal TH does not satisfy the classical assumptions under which the truncation method has been shown in [18] to consistently estimate IV; however, at least in the finite activity jumps case, we show herein that the threshold estimator of IV constructed with the optimal TH is still consistent.
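To see why the classical conditions fail, one can substitute a modulus-type threshold into them. The worked check below assumes that the conditions of [18] on a threshold function $r(h)$ are $r(h) \to 0$ and $h \log(1/h)/r(h) \to 0$, and that the leading term of the optimal TH has the form $\varepsilon_h = \sqrt{c\,\sigma^2 h \log(1/h)}$ for some constant $c > 0$. The symbols $r$, $\varepsilon_h$ and $c$ are notation introduced here for illustration.

```latex
\[
r(h) = \varepsilon_h^2 = c\,\sigma^2\, h \log\frac{1}{h} \xrightarrow[h \to 0]{} 0,
\qquad
\frac{h \log\frac{1}{h}}{r(h)} = \frac{1}{c\,\sigma^2} \not\to 0 .
\]
```

Under these assumptions the first condition holds but the second does not: the threshold shrinks at exactly the rate of the largest Brownian increments instead of strictly more slowly.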

The assumptions needed for the asymptotic characterization under the cMSE criterion are less restrictive, and also allow for a drift. We find that, for constant $\sigma$ and general FA jumps, the leading term of the optimal TH still has to be proportional to the modulus of continuity of the Brownian motion paths and to $\sigma$. One of the main motivations for considering the cMSE arises from a novel application of it to tune the threshold parameter. The idea consists in iteratively updating the optimal TH and the estimates of the increments of the continuous and jump components and of $\sigma$. We illustrate this method on simulated data. Minimization of the cMSE in the presence of infinite activity jumps in $X$ is a topic of ongoing research.
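The iterative idea can be illustrated with a deliberately simplified fixed-point loop. This is not the scheme of Section 5: it only alternates between a volatility estimate and a threshold of the assumed form $\sqrt{2\hat\sigma^2 h \log(1/h)}$, under constant volatility and FA jumps. The function names, the starting value, and the stopping rule are hypothetical choices.

```python
import math


def trv(increments, eps):
    """Truncated realized variance: squared increments with |dx| <= eps."""
    return sum(dx * dx for dx in increments if abs(dx) <= eps)


def iterate_threshold(increments, T, max_iter=50, tol=1e-14):
    """Alternate between a threshold update and a volatility update."""
    h = T / len(increments)
    log_term = h * math.log(1.0 / h)                 # requires 0 < h < 1
    sigma2 = sum(dx * dx for dx in increments) / T   # start from realized variance
    eps = math.sqrt(2.0 * sigma2 * log_term)
    for _ in range(max_iter):
        updated = trv(increments, eps) / T           # volatility given threshold
        converged = abs(updated - sigma2) <= tol
        sigma2 = updated
        eps = math.sqrt(2.0 * sigma2 * log_term)     # threshold given volatility
        if converged:
            break
    return eps, sigma2


# Toy data: 99 small "Brownian-like" moves of size 0.01 and one jump of size 1.
toy = [0.01 if i % 2 else -0.01 for i in range(99)] + [1.0]
eps_hat, sigma2_hat = iterate_threshold(toy, T=1.0)
print("threshold =", eps_hat, "sigma^2 estimate =", sigma2_hat)
```

Starting from the realized variance, each pass can only lower the volatility estimate and hence the threshold, so in this sketch the loop stabilizes after finitely many steps.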

The constant volatility assumption in some of our results is obviously restrictive. It is possible to allow for stochastic volatility and leverage but, since the proofs are still ongoing, we only discuss some ideas here and present some simulation experiments showing that in such contexts too our methods outperform other popular estimators appearing in the literature.

An outline of the paper is as follows. Section 2 deals with the MSE: the existence of an optimal threshold is established for a SM $X$ having volatility and jumps independent of the underlying Brownian motion; for a Lévy process $X$, uniqueness is also established (Subsection 2.1) and the asymptotic expansion of the optimal TH is found in Subsection 2.3, both for a Lévy process with finite jump activity and for one with infinite activity symmetric strictly stable jumps. In Section 3, for any finite jump activity SM $X$, consistency of the threshold estimator is verified even when the threshold function consists of the leading term of the optimal threshold, which does not satisfy the classical hypotheses. Section 4 deals with the cMSE in the case where $X$ is a SM with constant volatility and FA jumps: existence of an optimal TH is established, its asymptotic expansion is found, and then uniqueness is obtained. In Section 5 the results of Section 4 are used to construct a new method for iteratively determining the optimal threshold value in finite samples, and a reliability check is executed on simulated data. Section 6 presents a Monte Carlo study that shows the superior performance of the new methods over other methods available in the literature under stochastic volatility and leverage. Section 7 concludes and Section 8 contains the proofs of the presented results.

Acknowledgements. José Figueroa-López’s research was supported in part by the National Science Foundation grants DMS-1561141 and DMS-1613016. Cecilia Mancini’s work has benefited from support by GNAMPA (Italian Group for research in Analysis, Probability and their Applications, a subunit of INdAM, the Italian Group for research in High Mathematics, based in Rome) and EIF (Institut Europlace de Finance, a subunit of the Institut Louis Bachelier in Paris).

2 Mean Square Error

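We compute and optimize the mean square error (MSE) of $\hat{IV}_n$ passing through the conditional expectation with respect to the paths of $\sigma$ and $J$:

$$MSE := E\big[(\hat{IV}_n - IV)^2\big] = E\Big[E\big[(\hat{IV}_n - IV)^2 \,\big|\, \sigma, J\big]\Big].$$

Conditioning on $\sigma$, as well as assuming no drift in $X$, is standard in papers where MSE-optimality is sought in the absence of jumps (see e.g. [5]). We also assume evenly spaced observations over a fixed time horizon $[0,T]$, so that $t_i = ih$, for any $i = 0, \dots, n$, with $h = T/n$. Denoting by $\varepsilon$ the square root of a given threshold function, in this work we focus on the performance of the threshold estimator:

$$\hat{IV}_n := \sum_{i=1}^{n} (\Delta_i X)^2\, I_{\{|\Delta_i X| \le \varepsilon\}}.$$

We indicate the corresponding MSE by $MSE(\varepsilon)$. Note that for $\varepsilon = 0$ we have $\hat{IV}_n = 0$, so $MSE(0) = E[IV^2]$; as $\varepsilon$ increases some squared increments are included within $\hat{IV}_n$, so $\hat{IV}_n$ becomes closer to $IV$ and $MSE(\varepsilon)$ decreases. However, if $J \not\equiv 0$, for $\varepsilon \to +\infty$ the quantity $MSE(\varepsilon)$ increases again, since $\hat{IV}_n$ includes all the squared increments and thus estimates the global quadratic variation of $X$ at time $T$, and $MSE(\varepsilon)$ becomes close to the second moment of the quadratic variation of $J$ at time $T$. We look for a threshold $\varepsilon^\star$ giving

$$MSE(\varepsilon^\star) = \min_{\varepsilon \ge 0} MSE(\varepsilon).$$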

In this section we analyze the first derivative of the MSE with respect to the threshold and we find that an optimal threshold $\varepsilon^\star$ exists, in the general framework where $X$ is a semimartingale satisfying A1 below, and we furnish an equation of which $\varepsilon^\star$ is a solution, while in Section 2.1 we find that, when $X$ is Lévy, $\varepsilon^\star$ is even unique. The equation has no explicit solution, but $\varepsilon^\star$ is a function of the observation step $h$ and we can explicitly characterize the first order term of its asymptotic expansion in $h$, for $h \to 0$. Clearly we can always find an approximation of the optimal threshold with arbitrary precision by means of numerical methods.
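The shape of the MSE as a function of the threshold, and a crude numerical approximation of its minimizer, can be explored by Monte Carlo. The sketch below assumes constant volatility and compound Poisson jumps with Gaussian sizes; the function names, parameter values, and the threshold grid are illustrative assumptions, and a grid search stands in for the root-finding on the derivative discussed in the text.

```python
import math
import random


def simulate_increments(n, T, sigma, lam, jump_sd, rng):
    """Increments of sigma*W plus compound Poisson jumps (hypothetical model)."""
    h = T / n
    incs = [sigma * math.sqrt(h) * rng.gauss(0.0, 1.0) for _ in range(n)]
    t = rng.expovariate(lam)            # first jump time
    while t < T:
        incs[min(int(t / h), n - 1)] += rng.gauss(0.0, jump_sd)
        t += rng.expovariate(lam)       # next jump time
    return incs


def trv(increments, eps):
    """Threshold estimator: squared increments with |dx| <= eps."""
    return sum(dx * dx for dx in increments if abs(dx) <= eps)


def mc_mse(eps_grid, n_paths, n, T, sigma, lam, jump_sd, seed):
    """Monte Carlo estimate of E[(TRV(eps) - IV)^2] for each eps in the grid."""
    rng = random.Random(seed)
    iv = sigma ** 2 * T
    sq_err = [0.0] * len(eps_grid)
    for _ in range(n_paths):
        incs = simulate_increments(n, T, sigma, lam, jump_sd, rng)
        for k, eps in enumerate(eps_grid):
            sq_err[k] += (trv(incs, eps) - iv) ** 2
    return [s / n_paths for s in sq_err]


eps_grid = [0.0, 0.02, 0.04, 0.06, 0.1, 0.5, 10.0]
mses = mc_mse(eps_grid, n_paths=200, n=200, T=1.0,
              sigma=0.2, lam=3.0, jump_sd=0.5, seed=11)
for eps, m in zip(eps_grid, mses):
    print(f"eps = {eps:5.2f}   MSE ~ {m:.6f}")
best = eps_grid[mses.index(min(mses))]
print("grid minimizer:", best)
```

At a zero threshold the estimator vanishes and the error equals the squared integrated variance, while a very large threshold reduces the estimator to the realized variance. The minimum lies in between, and a finer grid around `best` would refine it.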

Let us denote

We assume the following

A1. A.s. for all ; ; and , are independent on .

The independence condition is needed to guarantee that $W$ remains a Brownian motion conditionally on $\sigma$ and $J$. We analyze the leverage case in our simulation study of Sec. 6. In the next theorem we compute the first derivative of the mean square error. The proof is deferred to the Appendix.

Theorem 1.

Under A1 and the finiteness of the expectations of the terms below, for fixed and , we have that , where


with and defined as

It clearly follows that if and only if and, thus, to our aim of finding an optimal threshold, it suffices to study the sign of as varies.

Notation. For brevity we sometimes omit to precise the dependence on of and .
For a function we sometimes use for
For two functions of a non-negative variable which tends to 0 (respectively to ), by , we mean that as (respectively ), by we mean that both and as (respectively ), while by we mean that as (respectively ).
We denote
h.o.t means higher order terms

Remark 1.

Under A1 and the finiteness of the expectation of the terms in MSE we have

The next Corollary states the existence of an optimal threshold (see the proof in the Appendix).

Corollary 1.

Under the same assumptions of Theorem 1 an optimal threshold exists and is solution of the equation .

To find an optimal threshold to estimate we need to find the zeroes of , which in turn depends on . Also, depends on the jump process increments , which we don’t know. An analogous problem arises when dealing with the minimization of the conditional MSE introduced in Section 4, where the optimal threshold has to satisfy the equation , with . However, when we apply our theory to the case of constant and finite activity jumps, as precisely explained in Section 5, we can proceed by estimating and iteratively. Another method yet to implement is to study the infill asymptotic behavior of in a stationary or deterministic state of . In some situations, the leading order terms of will only depend on a few summary measures of the stationary distribution or path of , which could be estimated separately or jointly with .

Remark 2.

In principle could even have many points where the absolute minimum value of MSE on is reached; also, MSE could have an infinite number of local not absolute minima.

To determine the number of solutions to , we need to study the sign of (corresponding to the convexity properties of ), but this is not easy. Define

so that

We can easily study the functions since we know that and for all . However within the joint function the presence of the terms makes it difficult even to know whether is positive.

2.1 When $X$ is Lévy

Let us assume

A2. $X$ is a Lévy process.

We now have that is constant and are i.i.d., so the equation characterizing is much simpler to analyze. Indeed, from (4), since within , the term of is independent on the terms of we have

The next result establishes uniqueness of the optimal threshold under A2. The proof is in the Appendix.

Theorem 2.

If $X$ is Lévy, equation


has a unique solution and, thus, there exists a unique optimal threshold, which is .

The equation in (5) has no explicit solution, however we can give some important indications to approximate .

2.2 Asymptotic behavior of

For the rest of Section 2, in order to emphasize the dependence of on , we write . We still are under A2, so recall that

is constant in . Note that is finite for any Lévy process , regardless of whether has bounded first moment or not. We consider two cases: the case where is a finite jump activity process and the one where it is a symmetric strictly stable process. The asymptotic characterization of will be used in Subsection 2.3 to deduce the asymptotic behavior in of the optimal threshold .

We anticipate that in Subsection 2.3 we will also see that an optimal threshold has to tend to 0 as and in such a way that

2.2.1 Finite Jump Activity Lévy process

The asymptotic characterization of in the case where has finite activity jumps is given in the following Theorem. Its proof is in the Appendix.

Theorem 3.

Let be a finite jump activity Lévy process with jump size density and with jump intensity . Suppose also that the restrictions of on and admit extensions on and , respectively. Then, for any such that and , as , we have

where above .

2.2.2 Strictly stable symmetric Lévy Jump process

Let us start by noting that

The first term above can be written as


By conditioning on and using the fact that , for all , we have

The following Lemmas state the asymptotic behavior of the above quantities under the assumption that . Their proofs are in the Appendix.

Lemma 1.

Suppose that is a symmetric -stable process with . Then, there exist constants and such that:

Lemma 2.

Suppose that is a symmetric strictly stable process with Lévy measure . Then, the following asymptotics hold:


As a consequence, the following Theorem states explicitly the asymptotic behavior of . Its proof is in the Appendix.

Theorem 4.

Let , where is a Wiener process and is a symmetric strictly stable Lévy process with Lévy measure . Then, for any such that and , as , we have

2.3 Asymptotic behavior of

We now assume

A3. The support of any jump size is .

We firstly see that an optimal threshold has to tend to 0 as and in such a way that Then we will show the asymptotic behavior of in more detail.

Remark 3.

Note that under A3, if minimizes MSE, then necessarily as . Indeed, if then on a sequence converging to we would have in probability, rather than ; since the MSE could not be minimized.

Lemma 3.

Suppose , where is a Brownian motion and is a pure-jump Lévy process of bounded variation or, more generally, such that, for some , , for a real-valued random variable . Then, , as .

Remark. If has FA jumps, drift and , then we have and, thus, the assumption in Lemma 3 is satisfied with . If is a Lévy process with Blumenthal-Getoor index , then and for any we have , and again the assumption is satisfied.

We are now ready to show more precisely the asymptotic behavior of . Proposition 1 covers the FA jumps case, while Proposition 2 tackles the case of symmetric strictly stable jumps. Their proofs are deferred to the Appendix.

Proposition 1.

Let have FA jumps and satisfy the assumptions of Theorem 3, let be the optimal threshold. Then,

Proposition 2.

Under the conditions of Theorem 4, the optimal threshold is such that

As explained in the introduction, the proportionality constant in the previous result shows that the higher the jump activity is, the lower the optimal threshold has to be in order to discard the higher noise represented by the jumps and to capture information about IV.
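The dependence on the jump activity can be made concrete with a small numerical sketch. It assumes that the leading term of the optimal threshold has the form $\sqrt{(2-Y)\,\sigma^2 h \log(1/h)}$, with $Y$ the jump activity index and $Y = 0$ standing for finite activity jumps. Since the displayed formulas are not reproduced above, this form, the function name, and the parameter values should all be read as assumptions.

```python
import math


def leading_threshold(sigma, h, activity_index=0.0):
    """Assumed leading term sqrt((2 - Y) * sigma^2 * h * log(1/h)) of the threshold."""
    if not 0.0 < h < 1.0:
        raise ValueError("h must lie in (0, 1)")
    if not 0.0 <= activity_index < 2.0:
        raise ValueError("activity index must lie in [0, 2)")
    return math.sqrt((2.0 - activity_index) * sigma ** 2 * h * math.log(1.0 / h))


sigma, h = 0.2, 0.01
thresholds = [leading_threshold(sigma, h, y) for y in (0.0, 0.5, 1.0, 1.5)]
for y, eps in zip((0.0, 0.5, 1.0, 1.5), thresholds):
    print(f"Y = {y:.1f}   threshold ~ {eps:.5f}")
```

Under this assumed form the threshold is a decreasing function of the activity index, matching the qualitative statement above.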

3 Consistency When

Under the framework described in [18], in the case of equally spaced observations, the threshold criterion allows convergence of

to when, for all , we have and is a deterministic function of s.t. , as . Here we show that, under finite activity jumps, the same estimator is also consistent in the case where on any we consider a different truncation level with suitably chosen random variables . Concretely, assume the following

A4. Let


where for a non-explosive counting process and real-valued random variables , are càdlàg and a.s. .

Recall that a.s. the paths of and of are bounded on . Define , then, the following Proposition and Corollary hold true. Their proofs are in the Appendix.

Proposition 3.

Under A4, if we choose with any such that , we have:

a.s. , for sufficiently small :

Corollary 2.

For all , we have as .

4 Conditional Mean Square Error


We now put ourselves under A1. The quantity of our interest here, , is such that and because Further, from the proof of Theorem 1, we have


We analyze the sign of : for fixed, and also are fixed, and we have since . Further we have : to see this, first note that, from the expression of , , then , as . Moreover, each , thus, for sufficiently large , is a finite sum of positive terms for some constant and fixed so as . Since is continuous, it follows that an optimal threshold exists and solves

We now assume also A3.

Remark 4.

Under A3, as in Remark 3, if minimizes cMSE, then it has to be true that as . In Proposition 4 below we again also find that under the following A4’ then necessarily .

A4’. We assume A4 with , constant and

Under FA jumps, when considering we assume to have a sufficiently small so that a.s. the number of jumps occurring during is at most 1; note that for any we have , when selecting such that . Thus, when considering a jump time , we assume that is sufficiently small so that the sign of is the same as the one of , in particular if then the increments approaching it are non-zero.
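How small the observation step must be for the "at most one jump per interval" requirement can be gauged in the special case of Poisson jump times. The Poisson assumption, the function name, and the parameter values below are illustrative; the text only requires a non-explosive counting process.

```python
import math


def prob_some_interval_has_two_jumps(lam, T, n):
    """P(at least one of n intervals of length T/n contains >= 2 Poisson(lam) jumps)."""
    x = lam * T / n
    p_two = 1.0 - math.exp(-x) * (1.0 + x)   # P(>= 2 jumps in a single interval)
    return 1.0 - (1.0 - p_two) ** n          # intervals are independent


probs = {n: prob_some_interval_has_two_jumps(lam=5.0, T=1.0, n=n)
         for n in (100, 1000, 10000)}
for n, p in probs.items():
    print(n, round(p, 5))
```

In this Poisson case the probability behaves like $\lambda^2 T h / 2$ for small $h$, so it vanishes as the grid is refined.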

4.1 Asymptotic behavior of , and

The following result ensures that, as previously announced, an optimal threshold has to tend to 0, as , but at a slower rate than Its proof is in the Appendix.

Proposition 4.

Under A1, A3, A4’, if solves and , then

We now pass to consider the asymptotic behavior of for sequences satisfying the conditions of Proposition 4.

Proposition 5.

Under A4’, if as in such a way that then where

With the notation and , we can write Note that , but , so which is the leading term between and depends on the choice of . We also remark that a solution of not necessarily is such that , however if a sequence is such that then the whole so it has to be true that is close (in a way that will become explicit later) to one of the solutions of .

Remark 5.

The asymptotic behavior of stated in Proposition 5 also holds in the presence of a nonzero drift process that has almost surely locally bounded paths (recall that any càdlàg process satisfies such a requirement) and that is independent of . This is shown in the Appendix.

4.2 Asymptotic behavior of

We show here that any cMSE optimal threshold has the same asymptotic behavior as the MSE optimal threshold The proof of the following result is given in the Appendix.

Corollary 3.

Under A1, A3, A4’ we have that

The previous result suggests an approximation for the optimal of the form , with . It is natural to wonder about other choices for . Intuitively, we should aim at making to converge to as quickly as possible: in view of (49) within the proof of Corollary 3, the only possible way is rendering and within of the same order, so we choose such that


as . For example a function of type with any continuous function tending to as , satisfies the three above conditions (we thank Andrey Sarychev for having provided such nice examples). However the quickest convergence speed of to 0 would be reached by choosing a function , which satisfies the following three more restrictive conditions, as ,


where condition 3’) means that In fact such a exists, since the following holds true (we thank Salvatore Federico for having provided such a nice result; the proof is available upon request).

Theorem 5.

There exists a unique deterministic function such that the three conditions 1), 2) and 3’) above are satisfied. Such a turns out to be differentiable and to satisfy also the ODE which entails that

We finally reach the uniqueness of the optimal threshold as a consequence of the following result, whose proof is in the Appendix. We remark that the asymptotic behavior of described in Corollary 3 is obtained after having proved just before (40) that it has to satisfy , as .

Proposition 6.

The first derivative of is such that, when evaluated at a function of satisfying , , and , then, as ,

Remark 6.

Uniqueness of . Since for any we reach that for sufficiently small we have on any sequence as in the above Proposition. That entails that for any sufficiently small the cMSE optimal is unique. Indeed, if there existed two optimal , we would necessarily have that , , and , but then, for small , on such sequences , which is a contradiction, because in order to be optimal both sequences have to satisfy

Remark 7.

The fact that the asymptotic behavior of the cMSE optimal threshold is the same as the one of the MSE optimal threshold under FA jumps is due to the fact that solves , solves , , , and the leading terms in are the ones with , which do not depend on , thus they are the same as for . It follows that, in the case of Lévy FA jumps, we have