Bootstrapping confidence intervals for the changepoint of time series
Abstract
We study an AMOC time series model with an abrupt change in the mean and dependent
errors that fulfill certain mixing conditions. We obtain confidence intervals for
the unknown changepoint via bootstrapping methods.
More precisely, we use a block bootstrap of the estimated centered error sequence.
Then we reconstruct a sequence with a change in the mean using the same estimators
as before. The difference between the changepoint estimator of the
resampled sequence and the one for the original sequence can be used
as an approximation of the difference between the real changepoint and
its estimator. This enables us to construct confidence intervals using
the empirical distribution of the resampled time series.
A simulation study shows that the resampled confidence intervals are
usually closer to their target levels and at the same time smaller than the asymptotic intervals.
Keywords: confidence intervals, block bootstrap, mixing, change in mean
AMS Subject Classification 2000: 62G09, 62G15, 60G10
Acknowledgement: The work of the first author was partly supported by the grants GAČR 201/06/0186 and MSM 02162839.
1 Introduction
Recently, a number of papers have been published on possible applications of bootstrapping or permutation methods in changepoint analysis; see Hušková [17] for a recent survey. Most of these papers are concerned with obtaining critical values for the corresponding changepoint tests. Another important issue in changepoint analysis, however, is how to obtain confidence intervals for the changepoint. In this paper we construct bootstrap confidence intervals for the changepoint in a model with dependent data.
We consider the following At-Most-One-Change (AMOC) location model
(1.1) 
where , may depend on . The errors are stationary and strong mixing with a rate specified below,
(1.2) 
and
(1.3) 
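As a rough illustration of an AMOC location model with dependent errors, the following sketch simulates a series whose mean shifts once. It assumes the usual form X(i) = mu + d * 1{i > m} + e(i) for (1.1) and, as one concrete example of strong mixing errors, an AR(1) sequence with standard normal innovations. The names `simulate_amoc`, `rho` and `burn_in` are illustrative and are not taken from the paper.

```python
import random


def simulate_amoc(n, m, mu=0.0, d=1.0, rho=0.3, seed=None, burn_in=200):
    """Simulate x_1..x_n with mean mu up to index m and mu + d afterwards.

    Errors follow an AR(1) recursion e_i = rho * e_{i-1} + eps_i with
    eps_i ~ N(0, 1).  The AR(1) choice is an ASSUMPTION made for this sketch.
    """
    if not 0 < m < n:
        raise ValueError("changepoint m must satisfy 0 < m < n")
    if abs(rho) >= 1:
        raise ValueError("rho must satisfy |rho| < 1 for stationarity")
    rng = random.Random(seed)
    e = 0.0
    for _ in range(burn_in):  # discard start-up values of the recursion
        e = rho * e + rng.gauss(0.0, 1.0)
    x = []
    for i in range(1, n + 1):
        e = rho * e + rng.gauss(0.0, 1.0)
        shift = d if i > m else 0.0  # one abrupt change after observation m
        x.append(mu + shift + e)
    return x
```

For example, `simulate_amoc(80, 40, d=1.0, rho=0.3, seed=1)` returns a list of 80 observations with a unit shift in the mean after the 40th observation.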
The purpose of this paper is to develop and study a bootstrap suitable for approximating the distribution of the following class of changepoint estimators
(1.4) 
where
and .
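The display (1.4) is not reproduced in this version of the text. Purely as an illustration, the sketch below assumes the weighted CUSUM form that is common in this literature: the estimator is the arg max over k of (n / (k (n - k)))^gamma |S_k|, where S_k is the partial sum of the centered observations. The function name `changepoint_estimate` and the parameter name `gamma` are choices made for this sketch, not notation confirmed by the source.

```python
from itertools import accumulate


def changepoint_estimate(x, gamma=0.0):
    """Arg max over k = 1..n-1 of (n / (k (n - k)))**gamma * |S_k|.

    S_k = sum_{i <= k} (x_i - mean(x)).  This weighted CUSUM form is an
    ASSUMPTION standing in for the stripped display (1.4).
    """
    n = len(x)
    if n < 2:
        raise ValueError("need at least two observations")
    mean = sum(x) / n
    partial = list(accumulate(xi - mean for xi in x))  # S_1, ..., S_n
    best_k, best_val = 1, -1.0
    for k in range(1, n):
        weight = (n / (k * (n - k))) ** gamma
        val = weight * abs(partial[k - 1])
        if val > best_val:  # ties are resolved in favor of the smallest k
            best_k, best_val = k, val
    return best_k
```

With `gamma=0` every k receives the same weight; larger values of `gamma` put relatively more weight on candidate changepoints near the ends of the sample.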
There is a quite extensive literature concerning asymptotic behavior of changepoint
estimators for independent observations.
For a survey of various results, see e.g. Dümbgen [13],
Csörgő and Horváth [10] and Antoch et al. [1].
One of the first papers to derive the limit
distribution
for and independent errors under local changes was
written by Bhattacharya and Brockwell [7].
Dümbgen [13] considered a change in a general model for AMOC
with independent observations and developed a suitable
bootstrap.
Antoch et al. [4] studied the asymptotic behavior of
, in the model
(1.1) with independent identically distributed
errors and developed and studied a bootstrap valid for
local changes. They also obtained various related
results, such as rates of consistency for the estimators and their
limiting distribution. Ferger and Stute [16] and Ferger [14, 15]
studied changepoint estimators based on statistics for i.i.d. errors.
Bai [5] and Antoch et al. [3] analyzed the limit
behavior of various estimators when the error sequence forms a linear process.
However, they have not discussed bootstrapping.
Most of the theoretical results concerning bootstrap methods in changepoint analysis (testing and estimation) have been obtained for independent observations, see e.g. Hušková [17]. Antoch and Hušková [2] obtained critical values for the changepoint test related to functionals of for ("no change") vs. ("there is a change in the mean") using permutation methods (or equivalently, bootstrap without replacement) for the independent case. Recently, Kirch [18, 19, 20] has developed various bootstrap approximations for critical values for the above tests of "no change" versus "there is a change in the mean" suitable for the case of dependent observations that form a linear process. The results in [19] can also be modified in a straightforward way for dependent observations as discussed here.
In this paper we develop and prove the validity of a circular overlapping block bootstrap for obtaining asymptotically correct confidence intervals in the case of dependent errors.
In order to prove the validity of the proposed bootstrap scheme, as well as to obtain the asymptotics under the null hypothesis for the changepoint estimator, we have to use results such as laws of large numbers or laws of the (iterated) logarithm for triangular arrays. Therefore, in addition to assumptions (1.2), we need the following one for certain (in some cases ):

Let be a strictly stationary sequence with . Assume there are with
and
(1.5) where is the corresponding strong mixing coefficient, i.e.
where and vary over the σ-fields respectively
.
Under this assumption we get moment inequalities (cf. Yokoyama [26], Theorem 1, and Serfling [24], Lemma B, Theorem 3.1), which in turn yield laws of large numbers. Moreover the results remain true for triangular arrays that fulfill uniformly the assumptions above. For more details we refer to Kirch [18], Appendix B.2.
In fact we only need this assumption in order to obtain a Donsker-type central limit theorem for the partial sums of the errors (to derive the asymptotics under the null hypothesis) as well as bounds on higher-order moments of certain sums of the observed error sequence. This in turn yields laws of large numbers and laws of the (iterated) logarithm. The proofs can easily be adapted to allow for errors that do not fulfill condition but do satisfy the necessary moment conditions.
Example 1.1.
Suppose that the errors form a linear process
where the innovations are i.i.d. random variables with
We suppose that the weights satisfy
Corollary 4 in Withers [25] gives mild conditions under which linear sequences are strong mixing and even provides the mixing coefficients. This can be used to check condition (). Causal ARMA sequences with appropriate innovations, for example, fulfill it for any , if the moment of the innovations exists.
For the sake of simplicity we will only consider the case in the following. The results for can be obtained in a similar way as outlined in Antoch et al. [4]. In the simulation study we will also consider other choices of , since the asymptotic method does not give such good approximations for . The reason is that the asymptotic distribution in this case depends on unknown parameters and thus in practice on estimators.
In the present paper we focus on local alternatives (i.e., , as ). Obtaining results for fixed alternatives is more complicated because the limit distribution of the estimator is determined by finite sums, which depend on the underlying distribution function. Some comments concerning the i.i.d. case can be found in the survey paper by Antoch and Hušková [1]. Furthermore, Dümbgen [13] considers both local and fixed changes for independent observations in a somewhat more general setup, i.e. the parameters that are subject to change need not be location parameters.
In the following , .
2 Limit Distribution and Rate of Consistency for the Estimators
In this section we summarize and generalize some previous results by Antoch et al. [3, 4] that we need in the sequel.
The next theorem gives the rate of consistency for the changepoint estimator as well as its limit distribution for a local change. For the i.i.d. case these results have been obtained by Antoch et al. [4], Theorem 1 and 2. The second result has been generalized for errors that form a linear process under the additional assumption by Antoch et al. [3] (Theorem 2.2).
Theorem 2.1.
The proof is postponed to Section 5.
Remark 2.1.
We would like to point out that the result in a) also remains true for a fixed change; more precisely, it suffices to have , we do not need . In contrast, we do need to obtain the limit distribution in b), but if this is not fulfilled the proof still shows .
Remark 2.2.
Remark 2.3.
It can be shown that the above limit distribution is continuous and explicitly known (cf. Remark 2.3 in Antoch and Hušková [1]). Thus the above theorem can be used to construct asymptotic confidence intervals, more precisely , where , for as in Remark 2.4 below. Note that does not depend on the unknown parameter .
Remark 2.4.
For the limit distribution depends on the unknown parameter . Precisely it can be shown that the limit is
Remark 2.3 in Antoch and Hušková [1] gives a closed formula for the limit distribution. We would like to point out a small (but for simulations very important) misprint there: the integral is equal to and not to .
3 Bootstrap approximations
Antoch et al. [4] propose a bootstrap with replacement of the estimated error sequence to obtain confidence intervals for the changepoint. Since in our case the error sequence is no longer independent, we have to use a slightly different approach here. We still bootstrap the estimated error sequence with replacement, but we now use a circular moving block bootstrap as suggested by Politis and Romano [23]. It has the advantage over the regular moving block bootstrap by Künsch [21] that the sample mean is unbiased. Another possibility is to use nonoverlapping blocks as suggested by Carlstein [9], but this bootstrap behaves slightly worse in simulations.
Kirch [18, 19] used block bootstrapping procedures (more precisely a block permutation method as well as a circular and noncircular block bootstrap) to get approximations for the critical values of the changepoint test corresponding to the above problem.
Block bootstrapping methods split the observation sequence of length into sequences of length . Then we put of them together to a bootstrap sequence (i.e. ). We keep the order within the blocks. and depend on and converge to infinity with .
The idea is that, for a properly chosen block length , the block contains enough information about the dependency structure so that the estimate is close to the null hypothesis.
We assume in the following that
(3.1) 
Let be an estimator for , for , and for , e.g.
(3.2) 
where as in (1.4). Remark 2.2 yields that fulfills assumption (3.5).
Define the estimated residuals and the centered residuals by
respectively. Throughout the paper the following representation will turn out to be very useful
(3.3) 
Let be i.i.d. with for independent of the observations . Take the i.i.d. bootstrap sample , where for (hence the name circular bootstrap).
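The resampling step can be sketched as follows. This is a minimal illustration, assuming that block starting points are drawn uniformly from all n positions and that indices running past the end of the series wrap around to the beginning (the circular extension). The names `circular_block_resample` and `block_len` are hypothetical, and the sketch requires the block length to divide the series length, in line with the factorization of the sample size into block length times number of blocks described above.

```python
import random


def circular_block_resample(residuals, block_len, rng=None):
    """Draw one circular overlapping block bootstrap sample.

    The residuals are centered first; each block consists of `block_len`
    consecutive centered residuals, with indices taken modulo n.
    Uniform block starts on {0, ..., n-1} are an ASSUMPTION of this sketch.
    """
    rng = rng or random.Random()
    n = len(residuals)
    if block_len < 1 or n % block_len != 0:
        raise ValueError("block_len must divide the series length")
    mean = sum(residuals) / n
    centered = [r - mean for r in residuals]
    n_blocks = n // block_len
    sample = []
    for _ in range(n_blocks):
        start = rng.randrange(n)  # block start, uniform over all positions
        # keep the original order within the block; wrap around at the end
        sample.extend(centered[(start + j) % n] for j in range(block_len))
    return sample
```

Because every position is equally likely to appear at each place in a block, the expected value of the resampled mean equals the mean of the centered residuals, which is zero.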
Consider the bootstrap observations
We now deal with the following bootstrap estimator of the changepoint
(3.4) 
where
We are now ready to present results on the asymptotic behavior of the bootstrap changepoint estimator defined in (3.4), together with a short discussion of how to apply the result to obtain confidence intervals.
With , , we denote probability, expectation, and variance given .
Theorem 3.1.
Since the limit distribution (for both the bootstrap and the null asymptotics) is continuous (as pointed out in Remark 2.3), the described sampling scheme provides bootstrap approximations to the quantile for arbitrary . Thus the bootstrap-based approximation for the changepoint can be constructed along the usual lines. More precisely, the bootstrap confidence interval is given by
where
and
Usually one uses the empirical bootstrap distribution of for say random bootstrap samples. Further discussions on bootstrap approximations of confidence intervals (for the similar case of i.i.d. errors) can be found in Antoch and Hušková [1].
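The construction can be illustrated with a short sketch. It assumes the standard equal-tailed form in which the empirical quantiles of the bootstrap differences (resampled estimator minus original estimator) stand in for the quantiles of the difference between the estimator and the true changepoint; the exact displays above are not reproduced in this version, so the formula in the code is an assumption, and the names `empirical_quantile` and `bootstrap_interval` are illustrative.

```python
import math


def empirical_quantile(values, p):
    """Smallest order statistic v such that the share of values <= v is >= p."""
    ordered = sorted(values)
    idx = max(math.ceil(p * len(ordered)) - 1, 0)
    return ordered[idx]


def bootstrap_interval(m_hat, boot_diffs, alpha=0.1):
    """Equal-tailed (1 - alpha) interval for the changepoint.

    boot_diffs holds the values (bootstrap estimate - m_hat), one per
    bootstrap sample.  ASSUMED form: [m_hat - q_upper, m_hat - q_lower].
    """
    if not boot_diffs:
        raise ValueError("need at least one bootstrap replicate")
    q_lower = empirical_quantile(boot_diffs, alpha / 2)
    q_upper = empirical_quantile(boot_diffs, 1 - alpha / 2)
    return m_hat - q_upper, m_hat - q_lower
```

If the bootstrap distribution is skewed to the right, the resulting interval extends further to the left of the estimator, mirroring the fact that the estimator then tends to overshoot the true changepoint.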
Remark 3.1.
There are also several other possible bootstrap schemes. For example, we can use a noncircular approach and/or nonoverlapping blocks. Simulations for the bootstrap where are i.i.d. uniformly distributed on and indicate that this bootstrap does not perform quite as well as the bootstrap proposed above.
4 Simulation Study
In the previous section we established the asymptotic validity of the bootstrap confidence intervals. The question remains how well these confidence intervals behave for small samples, and how they compare with the asymptotic intervals.
In this section we not only consider but also . The important difference is that the asymptotic confidence intervals depend on the unknown parameter for . Not surprisingly it turns out that the asymptotic intervals behave better for , whereas in all other cases it is better to use the bootstrap intervals.
Moreover we consider changes in the mean of . The latter ones can hardly be regarded as local changes, however we are still interested in the behavior of the bootstrap intervals, since we conjecture it will also be valid in those cases.
For the simulations we use an autoregressive sequence of order one as the error sequence, with standard normally distributed innovations and different values of . We consider changes at . We use the estimator as in (2.4) respectively (3.2) and, for the asymptotic method, the Bartlett estimator given in (2.3) with , because in the simulation study conducted by Antoch et al. [3] this choice gave the best results in the AR(1) case.
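For orientation, a Bartlett-type estimator of the long-run variance can be sketched as below. Display (2.3) is not reproduced in this version, so the weighting used here (triangular weights 1 - h / window applied to the sample autocovariances of the centered input) is an assumption; other conventions such as 1 - h / (window + 1) are also in use. The names `bartlett_long_run_variance` and `window` are hypothetical.

```python
def bartlett_long_run_variance(e, window):
    """Bartlett (triangular kernel) estimate of the long-run variance.

    Computes c(0) + 2 * sum_{h=1}^{window} (1 - h / window) * c(h), where
    c(h) is the sample autocovariance of the centered input at lag h.
    The weight convention is an ASSUMPTION of this sketch.
    """
    n = len(e)
    if not 1 <= window < n:
        raise ValueError("window must satisfy 1 <= window < len(e)")
    mean = sum(e) / n
    c = [v - mean for v in e]

    def autocov(h):
        # biased sample autocovariance (divides by n, not n - h)
        return sum(c[i] * c[i + h] for i in range(n - h)) / n

    tau2 = autocov(0)
    for h in range(1, window + 1):
        weight = 1.0 - h / window  # equals zero at h = window
        tau2 += 2.0 * weight * autocov(h)
    return tau2
```

In the setting of the text the input would be the estimated residuals rather than the unobservable errors.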
The quality of confidence intervals can essentially be judged by two criteria:
(i) The probability that the actual changepoint is outside the (1 − α)-confidence interval should be close to (or smaller than) α.
(ii) The confidence intervals should be short.
We visualize the first quantity by using CoLePlots (Confidence-Level Plots) and the second one by using CoILPlots (Confidence-Interval-Length Plots).
In fact we have done more simulations (such as QQ-plots or tables of the quantiles of ) for a large number of different combinations of parameters as well as different possible bootstrap procedures.
The problem, however, is that for the bootstrap they only give results for one specific underlying sequence and are thus less informative. For this reason, and also due to the similarity of the results and limitations of space, we restrict ourselves to the following plots.
CoLePlots
We explain how the plots are created using the example of asymptotic confidence intervals. The general version of Theorem 2.1
yields that the asymptotic confidence intervals are calculated using the distribution of , where is as in Remark 2.4 and .
Note that
The CoLePlots now draw the empirical distribution function (based on observation sequences) of .
Thus for given on the axis the plot shows the empirical probability that is outside the confidence interval on the axis, hence it visualizes . Optimally, the plot should be below or (even better) on the diagonal.
For the bootstrap confidence intervals the procedure works exactly the same, but now the intervals are calculated using the (empirical, based on resamples) distribution of .
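The quantity behind such a plot can be computed along the following lines. The sketch evaluates, on a grid of nominal levels, the share of simulated series whose bootstrap interval misses the true changepoint, which corresponds to reading the plotted curve at those grid points. It assumes equal-tailed intervals of the form [estimate - upper quantile, estimate - lower quantile]; the names `cole_curve` and `misses` are illustrative.

```python
import math


def _quantile(values, p):
    """Smallest order statistic v such that the share of values <= v is >= p."""
    ordered = sorted(values)
    return ordered[max(math.ceil(p * len(ordered)) - 1, 0)]


def misses(true_m, m_hat, boot_diffs, alpha):
    """True if the (1 - alpha) interval around m_hat does not contain true_m.

    ASSUMED interval form: [m_hat - q(1 - alpha/2), m_hat - q(alpha/2)].
    """
    lower = m_hat - _quantile(boot_diffs, 1 - alpha / 2)
    upper = m_hat - _quantile(boot_diffs, alpha / 2)
    return not (lower <= true_m <= upper)


def cole_curve(true_m, m_hats, boot_diffs_per_series, alphas):
    """Empirical miss rate for each nominal level in `alphas`.

    m_hats[i] is the estimate for simulated series i and
    boot_diffs_per_series[i] its bootstrap differences.
    """
    curve = []
    for alpha in alphas:
        n_miss = sum(
            misses(true_m, m_hat, diffs, alpha)
            for m_hat, diffs in zip(m_hats, boot_diffs_per_series)
        )
        curve.append(n_miss / len(m_hats))
    return curve
```

Plotting the returned miss rates against `alphas` and comparing them with the diagonal gives the kind of picture described above: points on or below the diagonal indicate intervals that hold their nominal level.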
CoILPlots
We calculate for observation sequences the length of the confidence intervals for levels . The empirical bootstrap distribution is based on random samples as before. Then we plot the mean using a thick line (as well as the upper and lower quartiles with thin lines), linearly interpolated. So these plots visualize the length of the intervals and thus .
Note that the scale on the axis is not the same for different pictures. This way we can better compare the asymptotic with the bootstrap method.
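The summary statistics behind such a plot can be sketched as follows: for every nominal level, the lengths of the intervals obtained from the simulated series are reduced to their mean and their lower and upper quartiles. The quartile convention (the default of `statistics.quantiles`) and the name `coil_summary` are assumptions of this sketch and need not match the convention used for the figures.

```python
import statistics


def coil_summary(intervals_by_level):
    """Mean and quartiles of interval lengths for each nominal level.

    intervals_by_level maps a level alpha to a list of (lower, upper)
    interval bounds, one pair per simulated series (at least two pairs).
    Returns a dict alpha -> (mean length, lower quartile, upper quartile).
    """
    summary = {}
    for alpha, intervals in intervals_by_level.items():
        lengths = [upper - lower for lower, upper in intervals]
        # quantiles(..., n=4) returns the three cut points of the quartiles
        q1, _median, q3 = statistics.quantiles(lengths, n=4)
        summary[alpha] = (statistics.mean(lengths), q1, q3)
    return summary
```

Connecting the means across levels by straight lines gives the thick curve described above, and the two quartile series give the thin ones.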
The plots are given in Figures 4.1 and 4.2 for and 4.3 for . Concerning the CoILPlots we only plot the means for better readability. In Figure 4.4 we give the CoILPlot corresponding to Figure 4.1 (2) including the quartiles to give a better idea of the distribution of the length of the confidence interval.
Concerning we see that for small the actual coverage probability of the interval is too small for both methods, yet the asymptotic interval is somewhat better than the bootstrap intervals. At the same time the length of the asymptotic interval is very large, much larger than the length of the bootstrap interval. Frequently it is even longer than the observation sequence. We did not correct upper and lower bounds of the intervals by respectively , but bootstrap intervals can also be outside that possible range.
In fact it is somewhat surprising that even though the intervals are quite long the levels are not as good. The reason is that the changepoint estimator for such a small change (and relatively few observation points) is frequently not very good. A typical example is an observation sequence with a change at , where the estimator suggests a change at . This results in intervals that do not contain the actual changepoint. This also leads to a wrong estimation of the parameters of the underlying asymptotic distribution, which is then highly skewed in the wrong direction. Thus the lower quantile of the interval is something around , whereas the upper quantile is far bigger than .
For more obvious changes the level of the intervals as well as the length becomes better. This is somewhat surprising in case of the asymptotic intervals because for fixed changes the asymptotic is not valid. The reason is that we have an interval around the changepoint estimator, which is quite good for more obvious changes.
If the changes are closer to the border of the interval, the levels for both methods deteriorate somewhat. The same holds true for stronger correlation of the underlying error sequence.
Overall the bootstrap intervals behave better than the asymptotic intervals.
However, in the case of the asymptotic distribution does not depend on unknown parameters anymore. In this case the asymptotic confidence intervals for local changes are in fact better than the bootstrap intervals. The levels of both methods are for small somewhat worse than for , but the lengths are much better, especially for the asymptotic intervals. However, for more obvious changes the bootstrap intervals are again better than the asymptotic ones. This is due to the fact that the asymptotic does not hold in this case.
It is worth noting that the performance of the bootstrap method does not seem to depend significantly on the choice of the block length. This is in contrast to the situation where we bootstrap critical values for changepoint tests (cf. Kirch [19]), where a larger block length was needed when the data were more strongly dependent.
In real-life situations we recommend using the bootstrap intervals, since they work no matter what and for both local and fixed changes.
5 Proofs
Throughout the proofs we use the notation for .
Proof of Theorem 2.1.
We only sketch the proof, because it is very similar to the proof of Theorem 1 respectively 2 in Antoch et al. [4]. First note that
Simple calculations yield for
(5.1) 
First we show assertion a), i.e. the rate of consistency for the changepoint estimator. Theorem B.8 b) and Remark B.2 in Kirch [18] give
Similarly we get for , where is an arbitrary fixed constant,
Note that is increasing in for , so that
Thus
A similar argument gives
Hence assertion a) is proven.
For assertion b) we first need somewhat stronger bounds for the above sums, but only in a stochastic sense. Theorem B.3 in Kirch [18] gives a Hájek-Rényi type inequality if certain moment conditions of the sums are fulfilled. This yields here ( arbitrary fixed constant)
where the last line follows because for all (in particular for ), and some
Analogously to above this yields for , where is an arbitrary fixed constant,
Similarly for
where the last rate is uniform in . The proof can be finished analogously to the proof of Theorem 2 in Antoch et al. [4], where we now use Theorem 1 of Section 1.5 in Doukhan [12]. ∎
We will first formulate some auxiliary lemmas, which will enable us to prove the results in Section 3.
Lemma 5.1.
Let be a triangular array of row-wise i.i.d. random variables with and as , then
Proof.
It is analogous to that of Theorem 16.1 in Billingsley [8], since the central limit theorem holds for triangular arrays and the proof of tightness also works analogously. ∎
Lemma 5.2.
Remark 5.1.
More careful considerations concerning below even yield an almost sure rate of .
Remark 5.2.
’Estimator’ is closely related to the Bartlett window estimator with parameter if for this estimator one also uses a circularly extended series, precisely