Bootstrapping confidence intervals for the change-point of time series
We study an AMOC time series model with an abrupt change in the mean and dependent
errors that fulfill certain mixing conditions. We obtain confidence intervals for
the unknown change-point via bootstrapping methods.
More precisely, we use a block bootstrap of the estimated centered error sequence. Then we reconstruct a sequence with a change in the mean using the same estimators as before. The difference between the change-point estimator of the resampled sequence and that of the original sequence can be used as an approximation of the difference between the true change-point and its estimator. This enables us to construct confidence intervals using the empirical distribution of the resampled time series.
A simulation study shows that the resampled confidence intervals are usually closer to their target levels and at the same time shorter than the asymptotic intervals.
Keywords: confidence intervals, block bootstrap, mixing, change in mean
AMS Subject Classification 2000: 62G09, 62G15, 60G10
Acknowledgement: The work of the first author was partly supported by the grants GAČR 201/06/0186 and MSM 02162839.
Recently, a number of papers have been published on possible applications of bootstrapping or permutation methods in change-point analysis; see Hušková  for a recent survey. Most of these papers are concerned with obtaining critical values for the corresponding change-point tests. Another important issue in change-point analysis, however, is how to obtain confidence intervals for the change-point. In this paper we construct bootstrap confidence intervals for the change-point in a model with dependent data.
We consider the following At-Most-One-Change (AMOC) location model
where , may depend on . The errors are stationary and strong-mixing with a rate specified below,
The purpose of this paper is to develop and study a bootstrap suitable for approximating the distribution of the following class of change-point estimators
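As an illustration of what such a change-point estimator can look like in practice, the following minimal sketch assumes a weighted CUSUM form: the estimate is the index k maximising (n / (k (n - k)))^gamma * |S_k|, where S_k is the partial sum of the observations centered by the overall mean. The exact form, the weight parameter name `gamma`, the convention that the change takes effect after observation k, and the function name `change_point_estimate` are assumptions of this sketch.

```python
def change_point_estimate(x, gamma=0.5):
    """Weighted CUSUM change-point estimate (assumed form, for illustration).

    Returns the k in 1..n-1 that maximises
    (n / (k * (n - k)))**gamma * |sum_{i<=k} (x_i - mean(x))|,
    interpreted as "the mean changes after observation k".
    """
    n = len(x)
    if n < 2:
        raise ValueError("need at least two observations")
    mean = sum(x) / n
    best_k, best_val, partial_sum = 1, -1.0, 0.0
    for k in range(1, n):
        partial_sum += x[k - 1] - mean          # S_k, updated incrementally
        weight = (n / (k * (n - k))) ** gamma   # gamma = 0 gives the plain CUSUM
        val = weight * abs(partial_sum)
        if val > best_val:                      # first maximiser wins on ties
            best_k, best_val = k, val
    return best_k


# Noise-free example: the mean jumps from 0 to 1 after observation 30.
x = [0.0] * 30 + [1.0] * 20
print(change_point_estimate(x, gamma=0.5))
print(change_point_estimate(x, gamma=0.0))
```

Taking the first maximiser on ties is a design choice of the sketch; other tie-breaking rules are equally plausible.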
There is quite an extensive literature concerning the asymptotic behavior of change-point
estimators for independent observations.
For a survey of various results, see e.g. Dümbgen ,
Csörgő and Horváth  and Antoch et al. .
One of the first papers to derive the limit
for and independent errors under local changes was
written by Bhattacharya and Brockwell .
Dümbgen  considered a change in a general model for AMOC
with independent observations and developed a suitable
Antoch et al.  studied the asymptotic behavior of
, in the model
(1.1) with independent identically distributed
errors and developed and studied a bootstrap valid for
local changes. They also obtained various related
results, such as rates of consistency for the estimators and their
limiting distribution. Ferger and Stute  and Ferger [14, 15]
studied change-point estimators based on -statistics for i.i.d. errors.
Most of the theoretical results concerning bootstrap methods in change-point analysis (testing and estimation) have been obtained for independent observations, see e.g. Hušková . Antoch and Hušková  obtained critical values for the change-point test related to functionals of for ("no change") vs. ("there is a change in the mean") using permutation methods (or equivalently, bootstrap without replacement) for the independent case. Recently, Kirch [18, 19, 20] has developed various bootstrap approximations for critical values for the above tests of "no change" versus "there is a change in the mean" suitable for the case of dependent observations that form a linear process. The results in  can also be modified in a straightforward way for dependent observations as discussed here.
In this paper we develop and prove the validity of a circular overlapping block bootstrap for obtaining asymptotically correct confidence intervals in the case of dependent errors.
In order to prove the validity of the developed bootstrap scheme, as well as to obtain the asymptotics under the null hypothesis for the change-point estimator, we have to use results such as laws of the (iterated) logarithm or laws of large numbers for a triangular array. Therefore, in addition to assumptions (1.2), we need the following one for certain (in some cases ):
Let be a strictly stationary sequence with . Assume there are with
where is the corresponding strong mixing coefficient, i.e.
where and vary over the -fields respectively
Under this assumption we get moment inequalities (cf. Yokoyama , Theorem 1, and Serfling , Lemma B, Theorem 3.1), which in turn yield laws of large numbers. Moreover the results remain true for triangular arrays that fulfill uniformly the assumptions above. For more details we refer to Kirch , Appendix B.2.
In fact we only need this assumption in order to obtain a Donsker type central limit theorem for the partial sums of the errors (to derive the asymptotics under the null hypothesis) as well as bounds on higher order moments of certain sums of the observed error sequence. This in turn yields laws of large numbers and laws of the (iterated) logarithm. The proofs can easily be adapted to allow for errors that do not fulfill condition but do satisfy the necessary moment conditions.
Suppose that the errors form a linear process
where the innovations are i.i.d. random variables with
We suppose that the weights satisfy
Corollary 4 in Withers  gives mild conditions under which linear sequences are strong mixing and even provides the mixing coefficients. This can be used to check condition (). Causal ARMA sequences with appropriate innovations, for example, fulfill it for any , if the -moment of the innovations exists.
For the sake of simplicity we will only consider the case in the following. The results for can be obtained in a similar way as outlined in Antoch et al. . In the simulation study we will also consider other choices of , since the asymptotic method does not give such good approximations for . The reason is that the asymptotic distribution in this case depends on unknown parameters and thus in practice on estimators.
In the present paper we focus on local alternatives (i.e., , as ). Obtaining results for fixed alternatives is more complicated because the limit distribution of the estimator is determined by finite sums, which depend on the underlying distribution function. Some comments concerning the i.i.d. case can be found in the survey paper by Antoch and Hušková . Furthermore, Dümbgen  considers both local and fixed changes for independent observations in a somewhat more general setup, i.e. the parameters that are subject to change need not be location parameters.
In the following , .
2 Limit Distribution and Rate of Consistency for the Estimators
The next theorem gives the rate of consistency for the change-point estimator as well as its limit distribution for a local change. For the i.i.d. case these results have been obtained by Antoch et al. , Theorems 1 and 2. The second result has been generalized for errors that form a linear process under the additional assumption by Antoch et al.  (Theorem 2.2).
The proof is postponed to Section 5.
We would like to point out that the result in a) also remains true for a fixed change; more precisely, it suffices to have ; we do not need . In contrast, we do need to obtain the limit distribution in b), but if this is not fulfilled, the proof still shows .
It can be shown that the above limit distribution is continuous and explicitly known (see Remark 2.3 in Antoch and Hušková ). Thus the above theorem can be used to construct asymptotic confidence intervals, more precisely , where , for as in Remark 2.4 below. Note that does not depend on the unknown parameter .
For the limit distribution depends on the unknown parameter . Precisely it can be shown that the limit is
Remark 2.3 in Antoch and Hušková  gives a closed formula for the limit distribution. We would like to point out a small (but for simulations very important) misprint there: The integral is equal to and not to .
3 Bootstrap approximations
Antoch et al.  propose a bootstrap with replacement of the estimated error sequence to obtain confidence intervals for the change-point. Since in our case the error sequence is no longer independent, we have to use a slightly different approach here. We still bootstrap the estimated error sequence with replacement, but we now use a circular moving block bootstrap as suggested by Politis and Romano . It has the advantage over the regular moving block bootstrap by Künsch  that the sample mean is unbiased. Another possibility is to use non-overlapping blocks as suggested by Carlstein , but this bootstrap behaves slightly worse in simulations.
Kirch [18, 19] used block bootstrapping procedures (more precisely a block permutation method as well as a circular and non-circular block bootstrap) to get approximations for the critical values of the change-point test corresponding to the above problem.
Block bootstrapping methods split the observation sequence of length into sequences of length . Then we concatenate of them into a bootstrap sequence (i.e. ). We keep the order within the blocks. and depend on and converge to infinity with .
The idea is that, for properly chosen block-length , the block contains enough information about the dependency structure so that the estimate is close to the null hypothesis.
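The resampling step of a circular overlapping block bootstrap can be sketched as follows. This is a minimal illustration, not the exact scheme of the paper: the names `circular_block_resample`, `block_len` and `n_blocks` are hypothetical, block starts are drawn uniformly from all positions, and indices wrap around the end of the sequence.

```python
import random


def circular_block_resample(seq, block_len, n_blocks, rng):
    """Concatenate n_blocks blocks of block_len consecutive entries of seq.

    Each block starts at a uniformly drawn position; indices that run past
    the end of seq wrap around to the beginning (hence "circular").
    The order of the entries within each block is preserved.
    """
    n = len(seq)
    out = []
    for _ in range(n_blocks):
        start = rng.randrange(n)  # uniform on 0..n-1
        out.extend(seq[(start + j) % n] for j in range(block_len))
    return out


rng = random.Random(0)
resampled = circular_block_resample(list(range(10)), block_len=4, n_blocks=3, rng=rng)
print(resampled)
```

Because every position is equally likely to appear in a block, the bootstrap mean of the resampled values equals the sample mean of the original sequence, which is the unbiasedness property mentioned above.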
We assume in the following that
Let be an estimator for , for , and for , e.g.
Define the estimated residuals and the centered residuals by
respectively. Throughout the paper the following representation will turn out to be very useful
Let be i.i.d. with for independent of the observations . Take the i.i.d. bootstrap sample , where for (hence the name circular bootstrap).
Consider the bootstrap observations
We now deal with the following bootstrap estimator of the change-point
Now we are ready to present results on the asymptotic behavior of the bootstrap estimator of the change-point defined in (3.4), together with a short discussion of how to apply the result to obtain confidence intervals.
With , , we will denote probability, expectation, variance, given
Since the limit distribution (for both the bootstrap and the null asymptotics) is continuous (as pointed out in Remark 2.3), the described sampling scheme provides bootstrap approximations to the -quantile for arbitrary . Thus the bootstrap based approximation for the change-point can be constructed along the usual lines. More precisely, the -bootstrap confidence interval is given by
Usually one uses the empirical bootstrap distribution of for, say, random bootstrap samples. Further discussions on bootstrap approximations of confidence intervals (for the similar case of i.i.d. errors) can be found in Antoch and Hušková .
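The whole procedure can be sketched end to end as below. Everything specific in this sketch is an assumption made for illustration: the weighted CUSUM form of the estimator, the segment means used to estimate the level and the size of the change, the equal-tailed choice of quantiles, and all function and parameter names. The interval is obtained by treating the bootstrap difference between the resampled and the original estimate as a stand-in for the difference between the original estimate and the true change-point.

```python
import math
import random


def change_point_estimate(x, gamma=0.5):
    # Assumed weighted CUSUM estimator: argmax over k of
    # (n / (k (n - k)))**gamma * |sum_{i<=k} (x_i - mean(x))|.
    n = len(x)
    mean = sum(x) / n
    best_k, best_val, s = 1, -1.0, 0.0
    for k in range(1, n):
        s += x[k - 1] - mean
        val = (n / (k * (n - k))) ** gamma * abs(s)
        if val > best_val:
            best_k, best_val = k, val
    return best_k


def empirical_quantile(sorted_vals, p):
    # Smallest sample value whose empirical distribution function is >= p.
    idx = math.ceil(p * len(sorted_vals)) - 1
    return sorted_vals[min(len(sorted_vals) - 1, max(0, idx))]


def bootstrap_ci(x, block_len, alpha=0.1, n_boot=200, gamma=0.5, seed=0):
    rng = random.Random(seed)
    n = len(x)
    m_hat = change_point_estimate(x, gamma)
    mu_hat = sum(x[:m_hat]) / m_hat                 # mean before the change
    d_hat = sum(x[m_hat:]) / (n - m_hat) - mu_hat   # estimated size of the change
    fitted = [mu_hat + (d_hat if i >= m_hat else 0.0) for i in range(n)]
    resid = [xi - fi for xi, fi in zip(x, fitted)]
    centre = sum(resid) / n
    resid = [r - centre for r in resid]             # centered residuals
    diffs = []
    for _ in range(n_boot):
        e_star = []
        while len(e_star) < n:                      # circular overlapping blocks
            start = rng.randrange(n)
            e_star.extend(resid[(start + j) % n] for j in range(block_len))
        x_star = [f + e for f, e in zip(fitted, e_star)]  # zip truncates to n
        diffs.append(change_point_estimate(x_star, gamma) - m_hat)
    diffs.sort()
    q_lo = empirical_quantile(diffs, alpha / 2)
    q_hi = empirical_quantile(diffs, 1 - alpha / 2)
    return m_hat, (m_hat - q_hi, m_hat - q_lo)


# Toy data: AR(1) errors with coefficient 0.3, mean shift of 3 after i = 60.
rng = random.Random(1)
e, x = 0.0, []
for i in range(1, 101):
    e = 0.3 * e + rng.gauss(0.0, 1.0)
    x.append((3.0 if i > 60 else 0.0) + e)
m_hat, (lower, upper) = bootstrap_ci(x, block_len=5)
print(m_hat, lower, upper)
```

The number of resamples and the block length in the usage example are arbitrary illustrative values, not recommendations from the text.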
There are also several other possibilities of bootstrapping. For example, we can use a non-circular approach and/or non-overlapping blocks. Simulations for the bootstrap where are i.i.d. uniformly distributed on and indicate that this bootstrap does not perform quite as well as the bootstrap proposed above.
4 Simulation Study
In the previous section we have established the asymptotic validity of the bootstrap confidence intervals. The question remains how well these confidence intervals behave for small samples and also how well they behave in comparison with the asymptotic intervals.
In this section we not only consider but also . The important difference is that the asymptotic confidence intervals depend on the unknown parameter for . Not surprisingly it turns out that the asymptotic intervals behave better for , whereas in all other cases it is better to use the bootstrap intervals.
Moreover we consider changes in the mean of . The latter ones can hardly be regarded as local changes; however, we are still interested in the behavior of the bootstrap intervals, since we conjecture that the bootstrap will also be valid in those cases.
For the simulations we use an autoregressive sequence of order one as an error sequence with standard normally distributed innovations and different values of . We consider changes at . We use the estimator as in (2.4) respectively (3.2) and, for the asymptotic method, the Bartlett estimator given in (2.3) with , because in the simulation study conducted by Antoch et al.  this choice gave the best results in the AR(1) case.
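A data-generating step of this kind can be sketched as follows. The function name `simulate_amoc_ar1`, the burn-in length used to approximate stationarity, and the parameter values in the usage example are assumptions of this sketch; only the AR(1) structure with standard normal innovations and a single mean shift is taken from the description above.

```python
import random


def simulate_amoc_ar1(n, m, d, rho, mu=0.0, burn_in=200, seed=None):
    """Simulate n observations with mean mu up to index m and mu + d afterwards.

    The errors follow e_i = rho * e_{i-1} + innovation_i with standard
    normal innovations; a burn-in period is discarded so that the error
    sequence is approximately stationary (requires |rho| < 1).
    """
    rng = random.Random(seed)
    e = 0.0
    for _ in range(burn_in):
        e = rho * e + rng.gauss(0.0, 1.0)
    x = []
    for i in range(1, n + 1):
        e = rho * e + rng.gauss(0.0, 1.0)
        x.append(mu + (d if i > m else 0.0) + e)
    return x


with_change = simulate_amoc_ar1(n=80, m=40, d=2.0, rho=0.5, seed=3)
no_change = simulate_amoc_ar1(n=80, m=40, d=0.0, rho=0.5, seed=3)
print(len(with_change), with_change[39] - no_change[39], with_change[40] - no_change[40])
```

Using the same seed for both calls makes the error sequences identical, so the two series differ only by the shift after the change-point.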
The quality of confidence intervals can essentially be judged by two criteria:
1. The probability that the actual change-point is outside the (1-)-confidence interval should be close to (smaller than) .
2. The confidence intervals should be short.
We visualize the first quantity by using CoLe-Plots (Confidence-Level-Plots) and the second one by using CoIL-Plots (Confidence-Interval-Length-Plots).
In fact we have done more simulations (such as QQ-plots or tables of the quantiles of ) for a large number of different combinations of parameters as well as different possible bootstrap procedures.
The problem, however, is that for the bootstrap they only give results for one specific underlying sequence and are thus less informative. For this reason, and also due to the similarity of the results and limitations of space, we restrict ourselves to the following plots.
We explain how the plots are created using the example of asymptotic confidence intervals. The general version of Theorem 2.1 yields that the asymptotic confidence intervals are calculated using the distribution of , where is as in Remark 2.4 and . Note that
The CoLe-Plots now draw the empirical distribution function (based on observation sequences) of .
Thus for given on the -axis the plot shows the empirical probability that is outside the -confidence interval on the -axis, hence it visualizes . Optimally, the plot should be below or (even better) on the diagonal.
For the bootstrap confidence intervals the procedure works exactly the same, but now the intervals are calculated using the (empirical, based on resamples) distribution of .
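The quantity shown in a CoLe-Plot can be computed along the following lines. The sketch assumes equal-tailed intervals built from the empirical quantiles of the bootstrap differences (resampled estimate minus original estimate); the data layout `(m_true, m_hat, boot_diffs)` and all function names are hypothetical.

```python
import math


def empirical_quantile(sorted_vals, p):
    # Smallest sample value whose empirical distribution function is >= p.
    idx = math.ceil(p * len(sorted_vals)) - 1
    return sorted_vals[min(len(sorted_vals) - 1, max(0, idx))]


def is_outside(m_true, m_hat, boot_diffs, alpha):
    """True if m_true lies outside the (assumed equal-tailed) interval."""
    d = sorted(boot_diffs)
    lower = m_hat - empirical_quantile(d, 1 - alpha / 2)
    upper = m_hat - empirical_quantile(d, alpha / 2)
    return not (lower <= m_true <= upper)


def cole_curve(cases, alphas):
    """Empirical probability, per level alpha, that the true change-point
    falls outside the interval; cases holds one tuple per simulated sequence."""
    return [
        sum(is_outside(m, mh, bd, a) for m, mh, bd in cases) / len(cases)
        for a in alphas
    ]


# Two artificial "sequences" sharing the same bootstrap differences.
diffs = list(range(-10, 11))
cases = [(50, 50, diffs), (30, 50, diffs)]
print(cole_curve(cases, [0.05, 0.1, 0.2]))
```

Plotting the returned values against the levels and comparing them with the diagonal gives the kind of picture described above.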
We calculate for observation sequences the length of the confidence intervals for levels . The empirical bootstrap distribution is based on random samples as before. Then we plot the mean using a thick line (as well as the upper and lower quartiles with thin lines), linearly interpolated. So these plots visualize the length of the intervals and thus .
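The summary behind a CoIL-Plot can be sketched as below: for each level, the interval length is computed per simulated sequence and then summarised by its mean and quartiles. The equal-tailed interval, the simple order-statistic quantile rule, and the names `interval_length` and `coil_summary` are assumptions of this sketch.

```python
import math


def empirical_quantile(sorted_vals, p):
    # Smallest sample value whose empirical distribution function is >= p.
    idx = math.ceil(p * len(sorted_vals)) - 1
    return sorted_vals[min(len(sorted_vals) - 1, max(0, idx))]


def interval_length(boot_diffs, alpha):
    """Length of the (assumed equal-tailed) interval at level 1 - alpha."""
    d = sorted(boot_diffs)
    return empirical_quantile(d, 1 - alpha / 2) - empirical_quantile(d, alpha / 2)


def coil_summary(all_boot_diffs, alphas):
    """Per level: (alpha, mean length, lower quartile, upper quartile)."""
    rows = []
    for a in alphas:
        lengths = sorted(interval_length(bd, a) for bd in all_boot_diffs)
        mean_len = sum(lengths) / len(lengths)
        rows.append((a, mean_len,
                     empirical_quantile(lengths, 0.25),
                     empirical_quantile(lengths, 0.75)))
    return rows


# Bootstrap differences for two artificial sequences.
all_boot_diffs = [list(range(-10, 11)), list(range(-5, 6))]
rows = coil_summary(all_boot_diffs, [0.1, 0.2])
for row in rows:
    print(row)
```

Linear interpolation between the levels, as used in the plots, would be applied to these rows at the plotting stage.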
Note that the scale on the -axis is not the same for different pictures. This way we can better compare the asymptotic with the bootstrap method.
The plots are given in Figures 4.1-4.2 for and 4.3 for . Concerning the CoIL-Plots we only plot the means for better readability. In Figure 4.4 we give the CoIL-Plot corresponding to Figure 4.1 (2) including the quartiles to give a better idea of the distribution of the length of the confidence interval.
Concerning we see that for small the actual coverage probability of the interval is too small for both methods, yet the asymptotic interval is somewhat better than the bootstrap intervals. At the same time the length of the asymptotic interval is very large, much larger than the length of the bootstrap interval. Frequently it is even longer than the observation sequence. We did not correct upper and lower bounds of the intervals by respectively , but bootstrap intervals can also be outside that possible range.
In fact it is somewhat surprising that even though the intervals are quite long the levels are not as good. The reason is that the change-point estimator for such a small change (and relatively few observation points) is frequently not very good. A typical example is an observation sequence with a change at , where the estimator suggests a change at . This results in intervals that do not contain the actual change-point. This also leads to a wrong estimation of the parameters of the underlying asymptotic distribution, which is then highly skewed in the wrong direction. Thus the lower quantile of the interval is something around , whereas the upper quantile is far bigger than .
For more obvious changes the levels of the intervals as well as their lengths improve. This is somewhat surprising in the case of the asymptotic intervals, because for fixed changes the asymptotics are not valid. The reason is that we have an interval around the change-point estimator, which is quite good for more obvious changes.
If the changes are closer to the border of the interval, the levels for both methods deteriorate somewhat. The same holds true for stronger correlation of the underlying error sequence.
Overall the bootstrap intervals behave better than the asymptotic intervals.
However, in the case of the asymptotic distribution does not depend on unknown parameters anymore. In this case the asymptotic confidence intervals for local changes are in fact better than the bootstrap intervals. The levels of both methods are for small somewhat worse than for , but the lengths are much better, especially for the asymptotic intervals. However, for more obvious changes the bootstrap intervals are again better than the asymptotic ones. This is due to the fact that the asymptotics do not hold in this case.
It is worth noting that the performance of the bootstrap method does not seem to depend significantly on the choice of the block-length. This is in contrast to the situation where we bootstrap critical values for change-point tests (cf. Kirch ) where a larger block-length was needed when the data was more dependent.
In real-life situations we recommend using the bootstrap intervals, since they work no matter what and for both local and fixed changes.
Throughout the proofs we use the notation for .
Proof of Theorem 2.1.
We only sketch the proof, because it is very similar to the proofs of Theorems 1 and 2 in Antoch et al. . First note that
Simple calculations yield for
First we show assertion a), i.e. the rate of consistency for the change-point estimator. Theorem B.8 b) and Remark B.2 in Kirch  give
Similarly we get for , where is an arbitrary fixed constant,
Note that is increasing in for , so that
A similar argument gives
Hence assertion a) is proven.
For assertion b) we first need somewhat stronger bounds for the above sums, but only in a -stochastic sense. Theorem B.3 in Kirch  gives a Hájek-Rényi type inequality if certain moment conditions of the sums are fulfilled. This yields here ( arbitrary fixed constant)
where the last line follows because for all (in particular for ), and some
Analogously to above this yields for , where is an arbitrary fixed constant,
We will first formulate some auxiliary lemmas, which will enable us to prove the results in Section 3.
Let be a triangular array of row-wise i.i.d. random variables with and as , then
The proof is analogous to that of Theorem 16.1 in Billingsley , since the central limit theorem holds for triangular arrays and the proof of tightness also works analogously. ∎
More careful considerations concerning below even yield an almost sure rate of .
’Estimator’ is closely related to the Bartlett window estimator with parameter if for this estimator one also uses a circularly extended series, precisely