Nonparametric tests for detecting breaks in the jump behaviour of a time-continuous process
This paper is concerned with tests for changes in the jump behaviour of a time-continuous process. Based on results on weak convergence of a sequential empirical tail integral process, asymptotics of certain test statistics for breaks in the jump measure of an Itō semimartingale are constructed. Whenever limiting distributions depend in a complicated way on the unknown jump measure, empirical quantiles are obtained using a multiplier bootstrap scheme. An extensive simulation study shows good performance of our tests in finite samples.
Keywords and Phrases: Change points; Lévy measure; multiplier bootstrap; sequential empirical processes; weak convergence.
AMS Subject Classification: 60F17, 60G51, 62G10.
Recent years have witnessed a growing interest in statistical tools for high-frequency observations of time-continuous processes. With a view to finance, the seminal paper by DelSch94 suggests modelling such a process using Itō semimartingales, say , which is why most research has focused on the estimation of (or on tests concerned with) its characteristics. Particular interest has been paid to integrated volatility or the entire quadratic variation, mostly adapting parametric procedures based on normal distributions, as the continuous martingale part of an Itō semimartingale is nothing but a time-changed Brownian motion. For an overview of methods in this field see the recent monographs by JacPro12 and AitJac14.
Still less popular is inference on the jump behaviour only, even though empirical research shows strong evidence supporting the presence of a jump component within ; see e.g. AitJac09a or AitJac09b. In this work, we will address the question of whether the jump behaviour of is time-invariant. Corresponding tests, commonly referred to as change point tests, are well known in the framework of discrete time series, but have recently also been extended to time-continuous processes; see e.g. LeeNisYos06 on changes in the drift or IacYos12 on changes in the volatility function of . However, to the best of our knowledge, no procedures are available for detecting breaks in the jump component.
Suppose that we observe an Itō semimartingale which admits a decomposition of the form
where is a standard Brownian motion, is a Poisson random measure on , and the predictable compensator satisfies . As a fairly general structural assumption, we allow the characteristics of , i.e. and to depend deterministically on time. Recall that can be interpreted as a local Lévy measure, such that
for each and denotes the average number of jumps that fall into the set over a unit time interval.
Now, we assume that we have data from the process in a high-frequency setup. Precisely, at stage , we are able to observe realizations of the process at the equidistant times for , where the mesh , while . In this situation we want to test the null hypothesis that the jump behaviour of the process is the same for all observations, i.e. there exists some measure such that for all , against alternatives involving the non-constancy of . For instance, one might consider an alternative consisting of one break point, i.e. there exists some and two Lévy measures , such that the process behind the first observations has Lévy measure and the remaining observations are taken from a process with Lévy measure . The restriction to a deterministic drift and volatility in (1.1) is merely technical here, as it allows us to use empirical process theory for independent observations later. An argument similar to that in Section 5.3 in BueVet13 proves that one might as well work with random coefficients and .
Throughout the work, we will restrict ourselves to positive jumps only. Thus, for , let denote the tail integral (or spectral measure; see RueWoe02) associated with , which determines the jump measure uniquely. For such that , define
with , which serves as an empirical tail integral based on the increments . If is a Lévy process with a Lévy measure not changing in time, FigLop08 illustrated that is a suitable estimator for the tail integral in the sense that, under regularity conditions, is -consistent for . Following the approach in Ino01, it is therefore likely that we can base tests for on suitable functionals of the process
where and . Under the null hypothesis, this expression can be expected to converge to for all and , whereas under alternatives, for instance those involving a change at as described before, should converge to an expression which is non-zero.
More precisely, we will consider the following standardized version of , namely
for and , where . An appropriate functional for testing the hypothesis of a constant Lévy measure is for instance given by a Kolmogorov-Smirnov statistic of the form
The null hypothesis of no change in the Lévy measure is rejected for large values of . The restriction to jumps larger than is important, since there might be infinitely many jumps of arbitrarily small size.
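To make the construction concrete, the following Python sketch computes a CUSUM-type process from a vector of increments and takes its maximum over a grid. The normalization is an assumption made for illustration: with hypothetical names `delta` for the mesh and `k_n = n * delta` for the effective sample size, the empirical tail integral is taken as the number of increments exceeding `z` divided by `k_n`, and the standardized process as the square root of `k_n` times the difference between the tail integral of the first `m` increments and the fraction `m / n` of the full-sample tail integral. The function names are illustrative and not part of the original text.

```python
import math

def tail_counts(increments, z):
    """Running counts #{j <= m : increment_j >= z} for m = 0, ..., n."""
    counts = [0]
    for x in increments:
        counts.append(counts[-1] + (1 if x >= z else 0))
    return counts

def cusum_process(increments, z, delta):
    """CUSUM values at m/n, m = 0, ..., n, under the assumed normalization."""
    n = len(increments)
    k_n = n * delta  # assumed effective sample size n * Delta_n
    counts = tail_counts(increments, z)
    total = counts[-1]
    # sqrt(k_n) * (counts[m]/k_n - (m/n) * total/k_n)
    return [(counts[m] - m / n * total) / math.sqrt(k_n) for m in range(n + 1)]

def ks_statistic(increments, z_grid, delta, eps):
    """Maximum absolute CUSUM value over all m and all grid points z >= eps."""
    return max(
        abs(t)
        for z in z_grid
        if z >= eps
        for t in cusum_process(increments, z, delta)
    )
```

The supremum over the jump size is approximated here by a maximum over a finite grid `z_grid`, and grid points below the threshold `eps` are discarded, in line with the restriction to jumps larger than the threshold.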
The limiting distribution of the previously mentioned test statistic will turn out to depend in a complicated way on the unknown Lévy measure . Therefore, corresponding quantiles are not easily accessible and must be obtained by suitable bootstrap approximations. Following related ideas for detecting breaks within multivariate empirical distribution functions (Ino01), we opt for using empirical counterparts based on a multiplier bootstrap scheme, frequently also referred to as wild or weighted bootstrap. The approach essentially consists of multiplying each indicator within the respective empirical tail integrals with an additional, independent and standardized multiplier. The underlying empirical process theory is for instance summarized in the monograph Kos08.
The remaining part of this paper is organized as follows: the derivation of a functional weak convergence result for the process under the null hypothesis is the content of Section 2. The asymptotic properties of can then easily be derived from the continuous mapping theorem. Section 3 is concerned with the approximation of the limiting distribution using the previously described multiplier bootstrap scheme. In Section 4, we discuss the formal derivation of several tests for a time-homogeneous jump behaviour, whereas an extensive simulation study is presented in Section 5. All proofs are deferred to the Appendix, which is Section 6.
2 Functional weak convergence of the sequential empirical tail integral
In this section, we derive a functional weak convergence result for the process defined in (1.2). For that purpose, we have to introduce an appropriate function space. We set and let denote the space of all functions which are bounded on every set for which the projection onto the second coordinate, , is bounded away from . Moreover, for , we define , and, for , we set
where Note that defines a metric on which induces the topology of uniform convergence on all sets such that its projection is bounded away from , i.e. a sequence of functions converges with respect to if and only if it converges uniformly on each (VanWel96, Chapter 1.6).
Furthermore, to establish our results on weak convergence under the null hypothesis, we impose the following conditions.
Condition 2.1.
is an Itō semimartingale with the representation in (1.1) such that
The drift and the volatility are càglàd, bounded and deterministic.
There exists some Lévy measure such that for all .
has only positive jumps, that is, the jump measure is supported on .
is absolutely continuous with respect to the Lebesgue measure on . Its density , called Lévy density, is differentiable with derivative and satisfies
for all with . ∎
The next lemma is essential for the weak convergence results. Similar statements can be found in FigLopHou09, with slightly stronger assumptions on , and in BueVet13 in the bivariate case.
Lemma 2.2.
Remark 2.3.
The limiting behaviour of the process can be deduced from the next theorem, which is a result for weak convergence of a sequential empirical tail integral process. For and set
where and denote its standardized version by
Obviously, the sample paths of are elements of .
Theorem 2.4.
Let be an Itō semimartingale that satisfies Condition 2.1. Furthermore, assume that the observation scheme has the properties:
Then, in , where is a tight mean zero Gaussian process with covariance
for . The sample paths of are almost surely uniformly continuous on each ) with respect to the semimetric
Note that we have centered around its expectation in (2.2). In most applications, however, we are interested in estimating functionals of the jump measure, and according to Lemma 2.2 we then need stronger conditions. Precisely, we consider the process
and get, as an immediate consequence of the previous two results, the following sequential generalization of Theorem 4.2 of BueVet13.
Corollary 2.5.
Theorem 2.6.
Using the continuous mapping theorem, we are now able to derive the weak convergence of various statistics allowing for the detection of breaks in the jump behaviour. The following corollary treats the statistic defined in (1.3).
Corollary 2.7.
The covariance function of the limit process in Theorem 2.6 depends on the Lévy measure of the underlying process, which is usually unknown in applications. If one only wants to detect changes in the tail integral of the Lévy measure at a fixed point , the following proposition deals with the simple transformation
of which yields a pivotal limiting distribution.
Proposition 2.8.
Let be an Itō semimartingale that satisfies Condition 2.1. Moreover, let be a real number with and suppose that the underlying observation scheme meets the assumptions from Corollary 2.5. Then, in , where denotes a standard Brownian bridge. As a consequence,
the limiting distribution being also known as the Kolmogorov-Smirnov distribution.
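Quantiles of the Kolmogorov-Smirnov distribution are easy to compute numerically, because the distribution function of the supremum of the absolute value of a standard Brownian bridge has the classical series representation P(sup |B| ≤ x) = 1 − 2 Σ_{k≥1} (−1)^{k−1} exp(−2k²x²). The following Python sketch evaluates a truncated version of this series and inverts it by bisection; the function names, the truncation at 100 terms and the bisection bounds are illustrative choices.

```python
import math

def ks_cdf(x, terms=100):
    """P(sup_{0<=s<=1} |B(s)| <= x) for a standard Brownian bridge B."""
    if x < 0.1:
        # Below 0.1 the true value is numerically indistinguishable from zero.
        return 0.0
    s = sum(
        (-1) ** (k - 1) * math.exp(-2.0 * k * k * x * x)
        for k in range(1, terms + 1)
    )
    return 1.0 - 2.0 * s

def ks_quantile(level, lo=0.0, hi=5.0, tol=1e-10):
    """Quantile of the Kolmogorov-Smirnov distribution, found by bisection."""
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if ks_cdf(mid) < level:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)
```

For example, `ks_quantile(0.95)` is approximately 1.358, the familiar critical value of the Kolmogorov-Smirnov test at the 5% level.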
Remark 2.9.
We have derived the previous results under simplified assumptions on the observation scheme in order to keep the presentation simple. A more realistic setting could involve additional microstructure noise effects or might rely on non-equidistant data. In both cases, standard techniques still yield similar results.
For example, in case of noisy observations, Vet14 has shown that a particular de-noising technique allows for virtually the same results on weak convergence as for the plain in the case without noise. For non-equidistant data, the limiting covariance functions and in general depend on the sampling scheme. The latter effect is well-known from high-frequency statistics in the case of volatility estimation; see e.g. MykZha12. ∎
3 Bootstrap approximations for the sequential empirical tail integral
We have seen in Corollary 2.7 that the distribution of the limit of the process depends in a complicated way on the unknown Lévy measure of the underlying process. However, we need the quantiles of or at least good approximations for them to obtain a feasible test procedure. Typically, one uses resampling methods to solve this problem.
Probably the most natural way to do so is to first use to obtain an estimator for the Lévy measure, and then to draw a large number of independent samples of an Itō semimartingale with Lévy measure , possibly with estimates for drift and volatility as well. Based on each sample, one might then compute the test statistic , and by doing so one obtains empirical quantiles for .
However, such a method is computationally expensive, since one has to generate independent Itō semimartingales for each stage within the bootstrap algorithm. Therefore we have decided to work with an alternative bootstrap method based on multipliers, where one only needs to generate i.i.d. random variables with mean zero and variance one (see also Ino01, who used a similar approach in the context of empirical processes).
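The multiplier idea can be sketched in a few lines of Python. The specific form used below is an assumption made for illustration: each centred indicator (indicator of an increment exceeding a fixed level `z`, minus the sample mean of these indicators) is multiplied by an independent standard normal multiplier, the products are accumulated into partial sums scaled by the square root of `k_n = n * delta`, and a CUSUM-type statistic is formed from these partial sums. Standard normal multipliers are one admissible choice with mean zero and variance one; all function and parameter names are hypothetical.

```python
import math
import random

def multiplier_replicate(increments, z, delta, rng):
    """One bootstrap replicate of a CUSUM-type statistic at a fixed level z."""
    n = len(increments)
    k_n = n * delta  # assumed effective sample size n * Delta_n
    ind = [1.0 if x >= z else 0.0 for x in increments]
    mean_ind = sum(ind) / n
    # Assumed multipliers: i.i.d. standard normal (mean zero, variance one).
    xi = [rng.gauss(0.0, 1.0) for _ in range(n)]
    # Partial sums of multiplier-weighted centred indicators.
    partial = [0.0]
    for w, i in zip(xi, ind):
        partial.append(partial[-1] + w * (i - mean_ind) / math.sqrt(k_n))
    total = partial[-1]
    # Tie the partial-sum process down at the end point and take the maximum.
    return max(abs(partial[m] - m / n * total) for m in range(n + 1))

def bootstrap_statistics(increments, z, delta, n_boot, seed=0):
    """Independent bootstrap replicates, reusing the same data each time."""
    rng = random.Random(seed)
    return [multiplier_replicate(increments, z, delta, rng) for _ in range(n_boot)]
```

Only the multipliers are redrawn in each replicate, while the observed increments stay fixed. This is what makes the scheme cheap compared with simulating new semimartingale paths.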
Precisely, the situation now is as follows: The bootstrapped processes, say , will depend on some random variables and on some random weights . The , that we consider as collected data, are defined on a probability space . The random weights are defined on a distinct probability space . Thus, the bootstrapped processes live on the product space . The following notion of conditional weak convergence will be essential. It can be found in Kos08 on pp. 19–20.
Definition 3.1.
Let be a (bootstrapped) element in some metric space depending on some random variables and some random weights . Moreover, let be a tight, Borel measurable map into . Then converges weakly to conditional on the data in probability, notationally , if and only if
Here, denotes the conditional expectation over the weights given the data , whereas is the space of all real-valued Lipschitz continuous functions on with sup-norm and Lipschitz constant . Moreover, and denote a minimal measurable majorant and a maximal measurable minorant with respect to the joint data (including the weights ), respectively. ∎
Remark 3.2.
Note that we do not use a measurable majorant or minorant in item (a) of the definition. This is justified through the fact that, in this work, all expressions , with a bootstrapped statistic and a Lipschitz continuous function , are measurable functions of the random weights.
Note that the implication “(ii) ⇒ (i)” in the proof of Theorem 2.9.6 in VanWel96 shows that, in general, conditional weak convergence implies unconditional weak convergence with respect to the product measure . ∎
Throughout this paper we denote by
the bootstrap approximation which is defined by
where . The following theorem establishes conditional weak convergence of this bootstrap approximation for the sequential empirical tail integral process .
Theorem 3.3.
Let be an Itō semimartingale that satisfies Condition 2.1 and assume that the observation scheme meets the conditions from Theorem 2.4. Furthermore, let be independent and identically distributed random variables with mean and variance , defined on a distinct probability space as described above. Then,
in , where denotes the limiting process of Theorem 2.4.
The following result establishes consistency of in the sense of Definition 3.1.
Theorem 3.4.
The distribution of the Kolmogorov-Smirnov-type test statistic defined in (1.3) can be approximated with the bootstrap statistics investigated in the following corollary. It can be proved by a simple application of Proposition 10.7 in Kos08 on an appropriate .
Corollary 3.5.
Under the assumptions of Theorem 3.3 we have, for each ,
4 The testing procedures
In order to derive a test procedure which utilizes the results on weak convergence from the previous two sections, we have to formulate our hypotheses first. Under the null hypothesis the jump behaviour of the process is constant. More precisely, this means the following:
We want to test this hypothesis versus the alternative that there is exactly one change in the jump behaviour. This means in detail:
The corresponding alternative for a fixed is then given through:
We have the situation from , but with and .
4.2 The tests and their asymptotic properties
In the sequel, let be some large number and let denote independent vectors of i.i.d. random variables, , with mean zero and variance one. As before, we assume that these random variables are generated independently from the original data. We denote by or the particular statistics calculated with respect to the data and the -th bootstrap multipliers . For a given level , we consider the following test procedures:
Reject in favor of , if , where is defined in Proposition 2.8 and where denotes the quantile of the Kolmogorov-Smirnov-(KS-)distribution, that is the distribution of with a standard Brownian bridge .
Reject in favor of , if
where denotes the -sample quantile of , and where
Choose an appropriate small and reject in favor of , if
where denotes the -sample quantile of .
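The bootstrap-based decision rules share a common pattern: compare the observed statistic with a sample quantile of the bootstrap replicates. A minimal Python sketch of this pattern follows. The rejection convention (reject when the observed value is at least the empirical quantile), the particular definition of the sample quantile and the additional bootstrap p-value are assumptions for illustration, and the function names are hypothetical.

```python
import math

def sample_quantile(values, level):
    """Empirical quantile: smallest order statistic with at least
    a fraction `level` of the sample at or below it."""
    ordered = sorted(values)
    idx = min(len(ordered) - 1, max(0, math.ceil(level * len(ordered)) - 1))
    return ordered[idx]

def bootstrap_test(observed, boot_stats, alpha=0.05):
    """Return (reject, critical value, bootstrap p-value)."""
    crit = sample_quantile(boot_stats, 1.0 - alpha)
    p_value = sum(1 for b in boot_stats if b >= observed) / len(boot_stats)
    return observed >= crit, crit, p_value
```

Here `observed` stands for the statistic computed from the data and `boot_stats` for the list of statistics computed from the bootstrap multipliers.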
Since has to be chosen prior to an application of the CP-Test, we can only detect changes in the jumps larger than . From a theoretical point of view this is not entirely satisfactory, since one is interested in distinguishing arbitrary changes in the jump behaviour. On the other hand, in most applications only the larger jumps are of particular interest, and at least the size of provides a natural bound to disentangle jumps from volatility. Thus, a practitioner can choose a minimum jump size first, and use the CP-Test to decide whether there is a change in the jumps larger than .
The following proposition shows that the three aforementioned tests keep the asymptotic level under the null hypothesis.
Proposition 4.1.
Suppose the sampling scheme meets the conditions of Corollary 2.5. Then, KSCP-Test1, KSCP-Test2 and CP-Test are asymptotic level tests for in the sense that, under , for all ,
for all such that .
The next proposition shows that the preceding tests are consistent under the fixed alternatives defined in Section 4.1. For simplicity, we only consider alternatives involving one change point, even though the results may be extended to alternatives involving multiple breaks or even continuous changes.
Proposition 4.2.
Suppose the sampling scheme meets the conditions of Corollary 2.5. Then, KSCP-Test1, KSCP-Test2 and CP-Test are consistent in the following sense: under , for all and all , we have
Under , there exists an such that, for all and all ,
4.3 Locating the change point
Let us finally discuss how to construct suitable estimators for the location of the change point. We begin with a useful proposition.
Proposition 4.3.
Suppose the sampling scheme meets the conditions of Corollary 2.5. Then, under , converges in to the function
in outer probability, with and .
Since attains its maximum in , natural estimators for the position of the change point are therefore given by
for the test problem versus and
in the setup versus . The next proposition states that these estimators are consistent.
Proposition 4.4.
Suppose the sampling scheme meets the conditions of Corollary 2.5. If is true, there exists an such that as . In the special case of , we have
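An argmax-type estimator of this kind can be sketched as follows in Python. The sketch assumes, for illustration only, that the estimator at a fixed level `z` is the relative position at which the absolute difference between the running count of increments exceeding `z` and its proportional share of the total count is largest; ties are resolved in favour of the earliest position. The function name is hypothetical.

```python
def estimate_change_point(increments, z):
    """Relative position m/n maximizing the absolute CUSUM of exceedances of z."""
    n = len(increments)
    total = sum(1 for x in increments if x >= z)
    best_m, best_val, running = 0, -1.0, 0
    for m in range(1, n + 1):
        running += 1 if increments[m - 1] >= z else 0
        val = abs(running - m / n * total)
        if val > best_val:  # strict inequality keeps the earliest maximizer
            best_m, best_val = m, val
    return best_m / n
```

The normalizing factor of the CUSUM process is omitted, since a positive constant does not change the location of the maximum.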
5 Finite-sample performance
In this section, we present results of a large scale Monte Carlo simulation study, assessing the finite-sample performance of the proposed test statistics for detecting breaks in the Lévy measure. Moreover, under the alternative of one single break, we show results on the performance of the estimator for the break point from Section 4.3.
The experimental design of the study is as follows.
We consider five different choices for the number of trading days, namely , and corresponding frequencies . Note that for any of these choices.
We consider two different models for the drift and the volatility: either, we set or , resulting in a pure jump process and a process including a continuous component, respectively.
We consider one parametric model for the tail integral, namely
(which yields a -stable subordinator in the case of ). For the parameter , we consider different choices, that is , with , ranging from to .
We consider models with one single break in the tail integral at different break points, ranging from to (note that corresponds to the null hypothesis). The tail integrals before and after the break point are chosen from the previous parametric model.
The target values of our study are, on the one hand, the empirical rejection level of the tests and, on the other hand, the empirical distribution of the estimators for the change point . To assess these target values, any combination of the previously described settings was run times, with the bootstrap tests being based on bootstrap replications. The Itō semimartingales were simulated by a straightforward modification of Algorithm 6.13 in ConTan04, where, under alternatives involving one break point, we simply merged two paths of independent semimartingales together.
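The merging of two independent paths can be illustrated with a simplified stand-in for the simulation. The Python sketch below is not Algorithm 6.13 of ConTan04. It assumes a hypothetical tail integral of the form U(z) = c z^(−a), keeps only jumps of size at least `eps` (so the infinitely many smaller jumps are ignored), and omits drift and volatility. Jumps larger than `eps` then form a compound Poisson process with intensity U(`eps`) and Pareto-distributed sizes. All names and parameter values are assumptions made for illustration.

```python
import random

def simulate_jump_increments(n, delta, c, a, eps, rng):
    """n increments of mesh delta, built from jumps >= eps of a process
    with assumed tail integral U(z) = c * z**(-a)."""
    rate = c * eps ** (-a)  # intensity of jumps of size at least eps
    increments = [0.0] * n
    horizon = n * delta
    t = rng.expovariate(rate)  # first jump time
    while t < horizon:
        # Inverse transform: P(size >= z) = (z / eps)**(-a) for z >= eps.
        size = eps * (1.0 - rng.random()) ** (-1.0 / a)
        increments[min(n - 1, int(t / delta))] += size
        t += rng.expovariate(rate)  # exponential waiting time to next jump
    return increments

def simulate_with_break(n, delta, theta0, params_before, params_after, eps, seed=0):
    """Concatenate increments of two independent paths at relative position theta0."""
    rng = random.Random(seed)
    m = int(n * theta0)
    before = simulate_jump_increments(m, delta, *params_before, eps, rng)
    after = simulate_jump_increments(n - m, delta, *params_after, eps, rng)
    return before + after
```

For instance, `simulate_with_break(2000, 0.05, 0.5, (1.0, 0.5), (4.0, 0.5), 0.01)` produces 2000 increments whose jump intensity increases fourfold after the first half of the observations.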
The simulation results under these settings are partially reported in Tables 1 and 2 (for the null hypothesis) and in Figures 1–4 (for various alternatives). More precisely, Tables 1 and 2 contain simulated rejection rates under the null hypothesis for various values of and in the KSCP-tests, for the pure jump subordinator (Table 1) and for the process involving a continuous component (Table 2). For the CP-tests, the suprema over were approximated by taking a maximum over a finite grid : we used the grids in the pure jump case, resulting in , and in the case , resulting in . In the latter case, we chose depending on since jumps of smaller size may be dominated by the Brownian component resulting in a loss of efficiency of the CP-test (see also the results in Figure 3 below). The results in the two tables reveal a rather precise approximation of the nominal level of the tests () in all scenarios. In general, KSCP-Test 1 turns out to be slightly more conservative than KSCP-Test 2.
The results presented in Figure 1 consider the CP-test for alternatives involving one fixed break point at and a varying height of the jump size, as measured through the value of in (5.1). In contrast to the results in Tables 1 and 2, for computational reasons, we subsequently used smaller grids for the case , resulting in , and for the case , resulting in . The left plot is based on the pure jump process (), whereas the right one is based on . The dashed red line indicates the nominal level of . We observe that the rejection rate of the test is increasing in (as is to be expected) and in . The latter can be explained by the fact that represents the effective sample size (interpretable as the number of trading days). Finally, the rejection rates turn out to be higher when no continuous component is involved in the underlying semimartingale.
The next two graphics in Figure 2 show the rate of rejection of the CP-Test under alternatives involving one break point from to within the model in (5.1) for varying locations of the change point . Again, the left and right plots correspond to and , respectively. In addition to the general conclusions drawn from the results in Figure 1, we observe that break points can be detected best if , and that the rejection rates are symmetric around that point.
Figure 3 shows the rejection rates of the KSCP-Test 1 and 2, evaluated at different points , for one fixed alternative model involving a single change from to at the point . The curves in the left plot are based on a pure jump process. We can see that the rejection rates are decreasing in , which can be explained by the fact that there are only very few large jumps both for and for . In the right plot, involving drift and volatility (), we observe a maximal value of the rejection rates that is increasing in the number of trading days, . For values of smaller than this maximum, the contribution of the Brownian component (an independent normally distributed term with variance within each increment ) dominates the jumps of that size and results in a decrease of the rejection rate.
Finally, in Figure 4, we depict box plots for the estimators and of the change point for certain values of and for as specified in the case of Tables 1 and 2. The results are based on two models, involving a change in from to at time point (left panel) and (right panel) for and , and with . We observe a reasonable approximation of the true value (indicated by the red line) with more accurate approximations for . For , the distribution of the estimator is skewed, giving more weight to the left tail pointing towards . This might be explained by the fact that the distribution of the argmax of the absolute value of a tied-down stochastic process indexed by gives very small weight to the boundaries of the unit interval. Moreover, as for the results presented in the right plot of Figure 3, the plots in Figure 4 reveal that the estimator behaves best for an intermediate choice of . Results for are not depicted for the sake of brevity, since they do not convey any additional insight.
6.1 Proof of Lemma 2.2
Let and pick a smooth cut-off function satisfying
We also define the function via . We use