# Bootstrapping Empirical Processes of Cluster Functionals with Application to Extremograms

## Abstract

In the extreme value analysis of time series, not only the tail behavior is of interest, but also the serial dependence plays a crucial role. Drees and Rootzén (2010) established limit theorems for a general class of empirical processes of so-called cluster functionals which can be used to analyse various aspects of the extreme value behavior of mixing time series. However, usually the limit distribution is too complex to enable a direct construction of confidence regions. Therefore, we suggest a multiplier block bootstrap analog to the empirical processes of cluster functionals. It is shown that under virtually the same conditions as used by Drees and Rootzén (2010), conditionally on the data, the bootstrap processes converge to the same limit distribution. These general results are applied to construct confidence regions for the empirical extremogram introduced by Davis and Mikosch (2009). In a simulation study, the confidence intervals constructed by our multiplier block bootstrap approach compare favorably to the stationary bootstrap proposed by Davis et al. (2012).

## 1 Introduction

Time series of observations in environmetrics, (financial) risk management and other fields often exhibit a non-negligible serial dependence between extremes. For example, stable areas of low (or high) pressure may lead to consecutive days of high precipitation (or high temperature). Likewise, large losses to a financial investment tend to occur in clusters.

The statistical analysis of the serial dependence structure between extreme observations is still a challenging task. Yet even if one is only interested in marginal parameters, like extreme quantiles, it is crucial to take into account the serial dependence when assessing the estimation error; see, e.g., Drees (2003) for a simulation study which demonstrates how misleading confidence intervals may be if the serial dependence is ignored.

In most applications, no parametric time series model for the extremal behavior suggests itself. Hence, one should resort to non-parametric procedures to avoid the risk of an unquantifiable, but potentially large modeling error. In this context, a general class of empirical processes that can capture a wide range of different aspects of the extremal behavior of time series proves a powerful tool.

To be more concrete, assume that a stationary time series with values in is observed, from which we construct blocks

$$Y_{n,j} := (X_{n,i})_{(j-1)r_n < i \le j r_n} \tag{1.1}$$

of “standardized extreme observations” , . A typical choice for univariate time series is

$$X_{n,i} := (X_i-u_n)_+/a_n := a_n^{-1}(X_i-u_n)\mathbf{1}_{\{X_i>u_n\}} \tag{1.2}$$

for suitable normalizing constants and . Later on, we will use a different notion of extreme observation in our application to the analysis of the extremogram, for a multivariate time series.
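As a minimal sketch of the construction above, the following Python snippet computes the standardized exceedances of (1.2) for a univariate series and splits them into consecutive blocks of equal length. The function names, the toy data and the choice to drop an incomplete final block are illustrative assumptions, not taken from the paper.

```python
def standardized_exceedances(x, u, a):
    """Return (x_i - u)_+ / a for each observation, as in (1.2)."""
    return [max(xi - u, 0.0) / a for xi in x]


def blocks(values, r):
    """Split values into consecutive blocks of length r.

    Assumption: observations after the last complete block are dropped.
    """
    m = len(values) // r  # number of complete blocks
    return [values[j * r:(j + 1) * r] for j in range(m)]


# Toy data (hypothetical): threshold u = 2, scale a = 1, block length r = 3.
x = [0.5, 2.5, 3.0, 1.0, 0.2, 0.1, 4.0, 1.5]
x_std = standardized_exceedances(x, u=2.0, a=1.0)
y = blocks(x_std, r=3)
print(y)
```

Most blocks of a real series would consist of zeros only, since exceedances of a high threshold are rare.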

Denote by the set of vectors of arbitrary length with components in , which is equipped with the -field induced by the Borel--fields on , . Let be a family of so-called cluster functionals, i.e. functions such that and for all where the numbers of coordinates equal to 0 in the beginning and in the end of the argument on the right-hand side can be arbitrary. Thus the value of the cluster functional depends only on the core of the argument, which is the smallest subvector of consecutive coordinates that contains all non-zero values (resp. it equals 0 if the argument only consists of zeros). Then, the pertaining empirical process of cluster functionals is defined by

$$Z_n(f) := \frac{1}{\sqrt{n v_n}} \sum_{j=1}^{m_n} \big(f(Y_{n,j}) - E f(Y_{n,j})\big), \quad f\in\mathcal{F}, \tag{1.3}$$

with . Drees and Rootzén (2010) established sufficient conditions for to converge to a Gaussian process in the space of bounded functions on . The following theorem summarizes their main results; the conditions are recalled in the appendix.

###### 1.1 Theorem
1. If the conditions (B1), (B2) and (C1)–(C3) are fulfilled, the finite-dimensional marginal distributions (fidis) of the empirical process converge to the pertaining fidis of a Gaussian process with covariance function (defined in (C3)).

2. Under the conditions (B1), (B2) and (D1)–(D4) the empirical process is asymptotically tight in . If, in addition, the conditions (C1)–(C3) are met, then weakly converges to .

3. If the assumptions (B1), (B2), (D1), (D2’), (D3) and (D5) are satisfied and, in addition, (D6) (or the more restrictive condition (D6’)) holds, then is asymptotically equicontinuous. Hence, weakly converges to in if also the conditions (C1)–(C3) hold.

For certain types of families of cluster functionals, Drees and Rootzén (2010) also gave sets of conditions that are sufficient for to converge and easier to verify than the abstract conditions listed in the appendix.
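To illustrate the notion of a cluster functional defined above, here is a small Python sketch for univariate blocks. It extracts the core of a block and defines two simple functionals that vanish on all-zero blocks and are unaffected by leading or trailing zeros. The names `core`, `cluster_length` and `exceedance_sum` are hypothetical and chosen only for this example.

```python
def core(y):
    """Smallest run of consecutive coordinates containing all non-zero values.

    Returns an empty list if y consists of zeros only.
    """
    nonzero = [i for i, v in enumerate(y) if v != 0]
    if not nonzero:
        return []
    return y[nonzero[0]:nonzero[-1] + 1]


def cluster_length(y):
    """Example cluster functional: length of the core (0 for an all-zero block)."""
    return len(core(y))


def exceedance_sum(y):
    """Example cluster functional: sum of all coordinates."""
    return sum(y)


block = [0.0, 0.5, 0.0, 1.2, 0.0]
padded = [0.0, 0.0] + block + [0.0]  # extra zeros at both ends

print(core(block))
print(cluster_length(block), cluster_length(padded))
```

Both functionals return the same value for `block` and `padded`, which is the defining invariance property.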

We will demonstrate their usefulness by improving on limit results on an empirical version of the so-called extremogram introduced by Davis and Mikosch (2009) in the framework of time series with regularly varying marginals. To be more precise, assume that is a stationary -valued time series such that for all the vector is regularly varying. Recall that a random vector is regularly varying if there exists a non-null measure on such that

$$\frac{P\{W\in xB\}}{P\{\|W\|>x\}} \longrightarrow \nu(B) < \infty$$

for all -continuity sets that are bounded away from the origin 0. Note that, while this definition of regular variation does not depend on the choice of the norm , the specific form of the limiting measure does. In any case, the limiting measure is homogeneous of order for some , the so-called index of regular variation.

Then, with denoting the quantile function of and , to each lag there exists a measure on such that

$$nP\{a_n^{-1}(X_0,X_h)\in B\} \longrightarrow \nu_{(0,h)}(B) \tag{1.4}$$

for all -continuity sets bounded away from the origin. In particular, for all bounded away from 0 such that and one has

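$$P(X_h\in a_nB \mid X_0\in a_nA) = \frac{P\{a_n^{-1}(X_0,X_h)\in A\times B\}}{P\{a_n^{-1}X_0\in A\}} \longrightarrow \frac{\nu_{(0,h)}(A\times B)}{\nu_{(0,h)}(A\times\mathbb{R}^d)} =: \rho_{A,B}(h).$$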

Davis and Mikosch (2009) called (as a function of ) the extremogram of (pertaining to ). It is worth mentioning that the extremogram is closely related to the concept of tail processes introduced by Basrak and Segers (2008).

Based on the observations , they proposed the following empirical counterpart as an estimator of :

$$\hat\rho_{A,B}(h) := \frac{\sum_{i=1}^{n-h} \mathbf{1}\{X_i\in a_kA,\ X_{i+h}\in a_kB\}}{\sum_{i=1}^{n} \mathbf{1}\{X_i\in a_kA\}}. \tag{1.5}$$

Here is a sequence that tends to at a slower rate than so that at a slower rate than , and thus the number of extreme observations used for estimation tends to . Under suitable conditions, is asymptotically normal (see Davis and Mikosch, 2009, Corollary 3.4).

This result has two serious drawbacks. First, the normalizing constants are usually unknown and must hence be replaced with an empirical counterpart, e.g. the $(\lfloor n/k\rfloor+1)$-th largest observed norm:

$$\hat a_k := \hat a_{k,n} := \|X\|_{n-\lfloor n/k\rfloor:n}. \tag{1.6}$$

It is not obvious whether this modification influences the asymptotic behavior of the empirical extremogram.
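The estimator (1.5) combined with the plug-in constant (1.6) can be sketched in Python as follows. This is only an illustration for a univariate series: the function names, the use of the absolute value as norm, the representation of the sets by membership predicates and the toy data are all assumptions made here.

```python
def estimated_threshold(x, k, norm=abs):
    """Order statistic ||X||_{n - floor(n/k) : n} of the observed norms, cf. (1.6)."""
    n = len(x)
    rank = n - n // k  # 1-based rank among the ascending order statistics
    if not 1 <= rank <= n:
        raise ValueError("k must be chosen such that 1 <= n - floor(n/k) <= n")
    norms = sorted(norm(xi) for xi in x)
    return norms[rank - 1]


def empirical_extremogram(x, h, a_k, in_A, in_B):
    """Empirical extremogram (1.5) at lag h.

    in_A, in_B are predicates deciding whether x_i / a_k lies in A resp. B.
    """
    n = len(x)
    hits_A = [in_A(xi / a_k) for xi in x]
    hits_B = [in_B(xi / a_k) for xi in x]
    numerator = sum(1 for i in range(n - h) if hits_A[i] and hits_B[i + h])
    denominator = sum(hits_A)
    if denominator == 0:
        return float("nan")  # no observation falls into a_k * A
    return numerator / denominator


# Toy data (hypothetical), with A = B = (1, infinity).
x = [3.0, 4.0, 1.0, 5.0, 1.0, 1.0, 6.0, 7.0]
a_hat = estimated_threshold(x, k=4)
rho_hat = empirical_extremogram(
    x, h=1, a_k=a_hat, in_A=lambda v: v > 1, in_B=lambda v: v > 1
)
print(a_hat, rho_hat)
```

For multivariate data one could pass a different `norm`, for instance `lambda v: max(abs(c) for c in v)`, and predicates acting on vectors.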

Secondly, the extremogram for a fixed pair of sets and conveys limited information on the extremal dependence structure, in particular in a multivariate setting, i.e. if . To get a fuller picture, one should consider the extremogram for a whole family of sets simultaneously. For example, in the case , Drees et al. (2015) considered rays and for all simultaneously. However, the techniques used by Davis and Mikosch (2009) are not applicable to infinite families of sets.

We will show that both problems can be neatly solved using the theory of empirical processes of cluster functionals. Indeed, if the families of sets and are suitably chosen and the bias of is asymptotically negligible, then the asymptotic normality of the empirical extremogram with estimated normalizing sequence follows immediately.

If one wants to construct confidence regions using this limit theorem, then estimators of the limiting covariance structure are needed. Since the direct estimation does not look promising, Davis et al. (2012) proposed to use a so-called stationary bootstrap instead. Here we follow a somewhat different approach. First, in the general setting considered by Drees and Rootzén (2010), it is shown that the convergence of a multiplier block bootstrap version of the empirical process of cluster functionals conditionally given the data follows under the same conditions as the convergence of itself. From this powerful result it is easily concluded that a multiplier block bootstrap version can be used to construct confidence regions for the extremogram.

Though in the present paper we focus on the extremogram as one possible measure for the extremal dependence structure of the time series, the same approach using empirical processes of cluster functionals can be used in a much wider context. For example, Drees (2011) analyzed block estimators of the so-called extremal index of absolutely regular time series using empirical processes of cluster functionals and suggested a bias corrected version thereof.

The paper is organized as follows. In Section 2 we introduce multiplier block bootstrap versions of the empirical process . Moreover, we give sufficient conditions under which, in probability conditional on the data, these bootstrap processes converge weakly to the same limiting process as . In Section 3, it is demonstrated that the theory developed by Drees and Rootzén (2010) yields limit theorems for the empirical extremogram with estimated normalizing sequence uniformly over suitable families of sets. In the same setup, a bootstrap result easily follows from the general theory developed in Section 2. The results of a small simulation study are reported in Section 4. All proofs are postponed to Section 5.

Throughout the paper, we will use the notation for the vector made up by the first components in the vector , if has at least components, and otherwise . The maximum norm of a vector for some is denoted by . We omit indices of random variables to denote a generic random variable with the same distribution; for example, is a generic random variable with the same distribution as and is a generic random vector with the same distribution as .

## 2 Multiplier processes

In what follows, is a row-wise stationary triangular scheme of -valued random vectors. Usually these vectors are derived from some fixed stationary time series by a transformation which depends on the stage and which sets all but the “extreme” observations to 0 in such a way that the probability that a transformed observation is non-zero tends to 0 as . For univariate time series, definition (1.2) is often used. In our application to the empirical extremogram, we instead define

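$$X^{(h,\tilde h)}_{n,i} := a_k^{-1}\Big(X_i\mathbf{1}\{X_i\notin(-\infty,a_kx^*)^d\},\ X_{i+h}\mathbf{1}\{X_{i+h}\notin(-\infty,a_kx^*)^d\},\ X_{i+\tilde h}\mathbf{1}\{X_{i+\tilde h}\notin(-\infty,a_kx^*)^d\}\Big) \tag{2.1}$$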

for some and .

According to Theorem 1.1, under suitable conditions, the empirical process of cluster functionals converges to a Gaussian process with covariance function , which is defined in (C3) as the limit of the covariance function of the cluster functionals applied to a block of consecutive “standardized extremes” . One may try to estimate this covariance function by an empirical covariance, but since most of the blocks defined in (1.1) equal 0, a bootstrap approach seems more promising.

Because the processes are defined via functionals applied to whole blocks of “standardized extremes”, a block bootstrap suggests itself. More precisely, we consider the following two versions of multiplier block bootstrap processes:

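$$
\begin{aligned}
Z_{n,\xi}(f) &:= \frac{1}{\sqrt{n v_n}} \sum_{j=1}^{m_n} \xi_j \big(f(Y_{n,j}) - E f(Y_{n,j})\big), && (2.2)\\
Z^*_{n,\xi}(f) &:= \frac{1}{\sqrt{n v_n}} \sum_{j=1}^{m_n} \xi_j \big(f(Y_{n,j}) - \overline{f(Y_n)}\big), \quad f\in\mathcal{F}, && (2.3)
\end{aligned}
$$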

where and , are i.i.d. random variables with and independent of . Note that in the definition of the multiplier process expectations are used which are usually unknown to the statistician. Hence, in some applications, it may be useful to replace them with the estimators , which leads to the bootstrap processes .
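A minimal Python sketch of the bootstrap process (2.3) for a single functional might look as follows. It rests on several assumptions that are not confirmed by the text above: the centering term is taken to be the plain average of the block values, the multipliers are drawn as standard normal variables (one possible choice with mean zero and unit variance), and the product of the sample size and `v_n` is passed in as a known number `n_vn`. All function names are hypothetical.

```python
import math
import random


def multiplier_process_star(f_values, n_vn, multipliers):
    """Evaluate the bootstrap process (2.3) for one functional f.

    f_values    : values of the cluster functional on the blocks
    n_vn        : the product n * v_n (assumed to be known here)
    multipliers : one multiplier per block
    """
    m = len(f_values)
    f_bar = sum(f_values) / m  # assumed centering: average over all blocks
    total = sum(xi * (fv - f_bar) for xi, fv in zip(multipliers, f_values))
    return total / math.sqrt(n_vn)


def bootstrap_replicates(f_values, n_vn, n_boot, seed=0):
    """Draw n_boot replicates using standard normal multipliers (an assumption)."""
    rng = random.Random(seed)
    m = len(f_values)
    replicates = []
    for _ in range(n_boot):
        xi = [rng.gauss(0.0, 1.0) for _ in range(m)]
        replicates.append(multiplier_process_star(f_values, n_vn, xi))
    return replicates


# Toy block values (hypothetical), e.g. numbers of exceedances per block.
f_values = [0.0, 2.0, 0.0, 0.0, 1.0, 0.0, 3.0, 0.0]
reps = bootstrap_replicates(f_values, n_vn=6.0, n_boot=200, seed=1)
print(min(reps), max(reps))
```

Only the data enter through `f_values`, so the replicates can be recomputed as often as needed without touching the original series.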

Our main goal is to prove weak convergence of and to in probability, conditionally on the data. To this end, as usual, we metrize weak convergence in using the bounded Lipschitz metric on the space of probability measures on . That is, for two probability measures and we define

$$d_{BL(\ell^\infty(\mathcal{F}))}(Q_1,Q_2) := \sup_{g\in BL_1(\ell^\infty(\mathcal{F}))} \Big|\int g\,dQ_1 - \int g\,dQ_2\Big|,$$

where

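$$BL_1(\ell^\infty(\mathcal{F})) := \Big\{g:\ell^\infty(\mathcal{F})\to\mathbb{R} \;\Big|\; \|g\|_\infty := \sup_{z\in\ell^\infty(\mathcal{F})}|g(z)|\le 1,\ |g(z_1)-g(z_2)| \le \|z_1-z_2\|_{\mathcal{F}} := \sup_{f\in\mathcal{F}}|z_1(f)-z_2(f)| \text{ for all } z_1,z_2\in\ell^\infty(\mathcal{F})\Big\}.$$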

Likewise, for the convergence of the fidis, we use the distance

$$d_{BL(\mathbb{R}^l)}(Q_1,Q_2) := \sup_{g\in BL_1(\mathbb{R}^l)} \Big|\int g\,dQ_1 - \int g\,dQ_2\Big|,$$

between two probability measures and on , where

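$$BL_1(\mathbb{R}^l) := \Big\{g:\mathbb{R}^l\to\mathbb{R} \;\Big|\; \sup_{v\in\mathbb{R}^l}|g(v)|\le 1,\ |g(v_1)-g(v_2)|\le\|v_1-v_2\| \text{ for all } v_1,v_2\in\mathbb{R}^l\Big\}.$$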

By (resp. ) we denote the (outer) expectation with respect to , i.e. is the expectation of the function conditionally on the observations. Likewise, we denote by the probability measure w.r.t. . (Cf. Kosorok, 2003, for a precise definition using a special construction of probability spaces.)

Our first result shows that the asymptotic behavior of the fidis of , conditionally on the data, is the same as the (unconditional) behavior of the fidis of .

###### 2.2 Theorem

Under the conditions (B1), (B2) and (C1)–(C3) one has for all

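$$\sup_{g\in BL_1(\mathbb{R}^l)} \Big| E_\xi\, g\big((Z_{n,\xi}(f_k))_{1\le k\le l}\big) - E\, g\big((Z(f_k))_{1\le k\le l}\big) \Big| \longrightarrow 0 \tag{2.4}$$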

in probability.

Since the supremum in (2.4) is bounded by 2, it readily follows that

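$$\sup_{g\in BL_1(\mathbb{R}^l)} \Big| E\, g\big((Z_{n,\xi}(f_k))_{1\le k\le l}\big) - E\, g\big((Z(f_k))_{1\le k\le l}\big) \Big| \le E \sup_{g\in BL_1(\mathbb{R}^l)} \Big| E_\xi\, g\big((Z_{n,\xi}(f_k))_{1\le k\le l}\big) - E\, g\big((Z(f_k))_{1\le k\le l}\big) \Big| \longrightarrow 0,$$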

that is, the (unconditional) weak convergence of the fidis of to the corresponding fidis of .
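The step from convergence in probability to convergence of the expectation can be spelled out as follows. This is a sketch, and the abbreviation $D_n$ is introduced here only for illustration. Write $D_n$ for the supremum in (2.4), so that $0\le D_n\le 2$ and $D_n\to 0$ in probability. Then, for every $\varepsilon>0$,

$$E D_n \le \varepsilon\, P\{D_n\le\varepsilon\} + 2\, P\{D_n>\varepsilon\} \le \varepsilon + 2\, P\{D_n>\varepsilon\} \longrightarrow \varepsilon,$$

and hence $E D_n\to 0$ because $\varepsilon$ was arbitrary. Moreover, for each fixed $g\in BL_1(\mathbb{R}^l)$, taking first the expectation with respect to the multipliers and then with respect to the data gives

$$\Big| E\, g\big((Z_{n,\xi}(f_k))_{1\le k\le l}\big) - E\, g\big((Z(f_k))_{1\le k\le l}\big) \Big| = \Big| E\Big( E_\xi\, g\big((Z_{n,\xi}(f_k))_{1\le k\le l}\big) - E\, g\big((Z(f_k))_{1\le k\le l}\big) \Big) \Big| \le E D_n .$$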

Following the ideas developed by Kosorok (2003), the following result establishes the asymptotic tightness of under a bracketing entropy condition, and thus also the weak convergence of under the same conditions as the convergence of the original empirical process in Theorem 1.1(ii).

###### 2.3 Proposition

Suppose that the conditions (B1), (B2), (D1), (D3) and (D4) hold and

1. (D2) holds and is bounded, or

2. (D2’) holds and .

Then is asymptotically tight in . Hence it converges to if, in addition, the conditions (C1)–(C3) are met.

Now a modification of the arguments given in the proof of Theorem 2 of Kosorok (2003) yields the desired convergence result for the multiplier process conditionally on the data.

###### 2.4 Theorem

If condition (D3) and convergence (2.4) hold and weakly converges to , then

$$\sup_{g\in BL_1(\ell^\infty(\mathcal{F}))} \big| E_\xi\, g(Z_{n,\xi}) - E\, g(Z) \big| \longrightarrow 0 \tag{2.5}$$

in outer probability.

A combination of this result with Theorem 2.2 and Proposition 2.3 leads to

###### 2.5 Corollary

If the conditions (B1), (B2), (C1)-(C3) and (D1)-(D4) are satisfied and is bounded, then convergence (2.5) holds.

According to Theorem 2.4, under (D3) the weak convergence of the multiplier process to conditionally on the data follows from the weak convergence of the fidis conditionally on the data and the (unconditional) convergence of to . The latter assertion may also be derived by establishing the asymptotic equicontinuity of using a metric entropy condition (instead of verifying tightness using a bracketing entropy condition as in Proposition 2.3).

###### 2.6 Proposition

Suppose that the conditions (B1), (B2), (D1), (D2’), (D3) and
(D5’)    For all and the map is measurable
are fulfilled and

1. (D6) holds and is bounded, or

2. (D6’) holds.

Then is asymptotically equicontinuous. Hence, it converges to if, in addition, the conditions (C1)–(C3) are met.

Using Theorem 2.4 and Corollary 2.6.12 of van der Vaart and Wellner (1996), we obtain as an immediate consequence

###### 2.7 Corollary

If the conditions (B1), (B2), (C1)-(C3), (D1), (D2’), (D3) and (D5’) are met, if is measurable with and is a VC-hull class, then convergence (2.5) holds.

To sum up, we have shown that, roughly under the same conditions as used in Theorem 1.1, the multiplier process shows the same asymptotic behavior conditionally on the data as the empirical process unconditionally. The following result gives conditions under which the convergence of implies the convergence of the bootstrap process conditionally on the data.

###### 2.8 Corollary

If convergence (2.4) of the fidis of holds conditionally on the data, condition (D3) is satisfied and and weakly, then

$$E_\xi \sup_{f\in\mathcal{F}} \big| Z^*_{n,\xi}(f) - Z_{n,\xi}(f) \big| \longrightarrow 0 \tag{2.6}$$

in outer probability, weakly and

$$\sup_{g\in BL_1(\ell^\infty(\mathcal{F}))} \big| E_\xi\, g(Z^*_{n,\xi}) - E\, g(Z) \big| \longrightarrow 0 \tag{2.7}$$

in outer probability. In particular, these assertions hold under the conditions of Corollary 2.5 and under the assumptions of Corollary 2.7.

###### 2.9 Remark

Note that also the normalizing factor in the definition of may be unknown. In most applications of multiplier processes, though, this is not problematic, because this factor is not needed to construct confidence regions. Nevertheless, it is noteworthy that assertion (2.7) remains valid if is replaced with some estimator that is consistent in the sense that in probability.

For specific types of cluster functionals, Drees and Rootzén (2010) gave simpler sufficient conditions for the convergence of the corresponding empirical process which carry over to the multiplier processes considered here. In the next section we will use the conditions of Corollary 3.6 of that paper, which deals with so-called generalized tail array sums, i.e. empirical processes with functionals of the form for functions such that .

## 3 Processes of Extremograms

In this section we employ the general theory to analyze the asymptotic behavior of the empirical extremogram , a version with empirical normalization and a bootstrap version thereof, uniformly over suitable families of sets and and over lags for some fixed . Throughout this section we are only interested in the behavior for vectors with at least one large component. We thus consider families of pairs of measurable subsets of such that

$$x^* := \inf_{(A,B)\in\mathcal{C}}\ \inf_{x\in A}\ \max_{1\le j\le d} x_j > 0,$$

i.e.  for all . However, the results below can be generalized to families of sets that are uniformly bounded away from 0 so that . For the sake of notational simplicity, we assume that (instead of ) -valued random vectors are observed.

###### 3.10 Remark

To keep the presentation simple, we will assume that is regularly varying on the full cone with a limiting measure which is not concentrated on ; see Theorem 3.11 below. This assumption could be weakened to the regular variation on the cone defined in the spirit of Das et al. (2013), i.e. there exists a normalizing sequence and a measure such that

$$nP\{X_0/\tilde a_n\in B\} \longrightarrow \tilde\nu_0(B)$$

for all -continuity sets bounded away from , where the limit has to be finite. Here one may choose as the -quantile of . Under the slightly more restrictive assumption used in the results below, one has

$$\frac{P\{\max_{1\le j\le d} X_{0,j}>u\}}{P\{\|X_0\|>u\}} \longrightarrow \nu_0\big(\mathbb{R}^d\setminus(-\infty,1]^d\big)$$

as , and hence and , where is the degree of homogeneity of , i.e. .

For some intermediate sequence (i.e. , ), we define the empirical extremogram to the sets and and lag as

$$\hat\rho_{n,A,B}(h) := \frac{\sum_{i=1}^{n} \mathbf{1}_{A\times B}(X_i/a_k,\ X_{i+h}/a_k)}{\sum_{i=1}^{n} \mathbf{1}_A(X_i/a_k)}.$$

Note that this is a slight modification of the definition given by Davis and Mikosch (2009) in that we do not use the maximal number of summands in the denominator. However, it is easily seen that all results given below carry over to the original definition.

The uniform asymptotic behavior of the empirical extremogram will easily follow from that of the stochastic process

$$\tilde Z_n(h,A,B) := \frac{1}{\sqrt{n v_n}} \sum_{i=1}^{n} \Big(\mathbf{1}_{A\times B}(X_i/a_k,\ X_{i+h}/a_k) - P\{X_i\in a_kA,\ X_{i+h}\in a_kB\}\Big),$$

with

$$v_n := P\{X_0\notin(-\infty,a_kx^*)^d\}.$$

This process, in turn, can be analyzed using the theory for empirical processes of cluster functionals developed by Drees and Rootzén (2010). In order to use conditions on the joint distribution of the as weak as possible, it is useful to consider such processes indexed by and just two lags . Let

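$$
\begin{aligned}
\tilde X_{n,i} &:= \frac{X_i}{a_k}\,\mathbf{1}_{\mathbb{R}^d\setminus(-\infty,x^*)^d}\Big(\frac{X_i}{a_k}\Big), && 1\le i\le n+h_0,\\
X^{(h,\tilde h)}_{n,i} &:= \big(\tilde X_{n,i},\ \tilde X_{n,i+h},\ \tilde X_{n,i+\tilde h}\big), && 1\le i\le n,\\
Y^{(h,\tilde h)}_{n,j} &:= \big(X^{(h,\tilde h)}_{n,i}\big)_{(j-1)r_n < i \le j r_n}.
\end{aligned}
$$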

Note that, for , we have and ; under the conditions of Theorem 3.11 the difference between these processes is asymptotically negligible even if .

Using Corollary 3.6 of Drees and Rootzén (2010) and Drees and Rootzén (2015), we obtain the following set of sufficient conditions for the convergence of .

###### 3.11 Theorem

Suppose that all four-dimensional marginal distributions of the stationary time series are regularly varying, i.e. for all index vectors of dimension there exists a measure such that

$$nP\{a_n^{-1}X_I\in B\} \longrightarrow \nu_I(B) < \infty \tag{3.1}$$

for all Borel sets bounded away from , and that . In addition, assume that the conditions (B1), (B2) and () are fulfilled, and . Finally, assume that there exists a bounded semi-metric on such that is totally bounded w.r.t. , and a function such that and

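$$E\Big(\sum_{i=1}^{r_n} \mathbf{1}_{(A\times B)\Delta(\tilde A\times\tilde B)}(X_i/a_k,\ X_{i+h}/a_k)\Big)^2 \le u\big(\bar\varrho((A,B),(\tilde A,\tilde B))\big)\, r_n v_n \tag{3.2}$$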

for all , , and that the conditions (D5) and (D6) hold for if , , or , , and else for some sufficiently large constant . (Here denotes the symmetric difference of the two sets and .)

Then converges weakly to a Gaussian process with covariance function

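$$\tilde c\big((h,A,B),(\tilde h,\tilde A,\tilde B)\big) := \sum_{i=-\infty}^{\infty} \frac{\nu_{(0,h,i,i+\tilde h)}(A\times B\times\tilde A\times\tilde B)}{\nu_0\big(\mathbb{R}^d\setminus(-\infty,x^*)^d\big)} < \infty.$$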

Observe that in (3.1) necessarily the following consistency condition holds: for vectors and of indices and -continuity sets bounded away from the origin one has .

Usually the moment condition (3.2) and the entropy condition (D6) are most difficult to verify. The proof of Theorem 3.11 shows that the process indexed by is asymptotically tight if and only if the empirical processes indexed by resp.  are asymptotically tight for all . Thus we may replace condition (D6) by the assumption that these families are VC-subgraph classes of functions, which in turn is equivalent to the assumption that

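$$\bar{\mathcal{F}} := \{\bar f_{A\times B} \mid (A,B)\in\mathcal{C}\} \quad\text{with}\quad \bar f_D(y_1,\dots,y_r) := \sum_{i=1}^{r} \mathbf{1}_D(y_i) \quad\text{for } y_i\in\mathbb{R}^{2d},\ 1\le i\le r, \tag{3.3}$$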

is a VC-subgraph class of functions. Likewise, one may divide the family into a finite number of subfamilies and check that is a VC-subgraph class of functions.

For applications to the asymptotic analysis of empirical extremograms, we shall consider families such that for also belongs to . The following simple example exhibits another closedness property of which is important to prove convergence of the empirical extremogram with estimated normalizing constant.

###### 3.12 Example

Fix some and measurable sets bounded away from 0 such that implies for all and likewise for . (In particular, one may choose a set such that and imply .) Then, for , the family is a VC-subgraph class of functions. To see this, note that if , i.e. the functions are linearly ordered. Hence no set of size 2 can be shattered by the subgraphs of . Likewise, the family pertaining to is a VC-subgraph class of functions.
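The shattering argument used in this example can be made explicit. The following is a sketch in generic notation introduced only here: $f_1\le f_2$ denotes two members of a linearly ordered family, and $S_f$ denotes the subgraph of $f$. If $f_1\le f_2$ pointwise, then

$$S_{f_1} := \{(y,t) \mid t < f_1(y)\} \subset \{(y,t) \mid t < f_2(y)\} =: S_{f_2},$$

so the subgraphs form a nested family of sets. Suppose that a two-point set $\{p_1,p_2\}$ were shattered. Then there would be subgraphs $S$ and $S'$ in the family with

$$S\cap\{p_1,p_2\} = \{p_1\}, \qquad S'\cap\{p_1,p_2\} = \{p_2\}.$$

By nestedness either $S\subset S'$, which forces $p_1\in S'$, or $S'\subset S$, which forces $p_2\in S$. Both cases contradict the displayed intersections, so no set of size 2 is shattered.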

Condition (3.2) can be reformulated as follows. There exists a semi-metric on such that is totally bounded w.r.t.  and and hold for all and all .

The families of sets and most widely discussed in the literature are sets of upper right orthants and complements of lower left orthants.

###### 3.13 Example

Consider the family of pairs of upper right orthants bounded away from the origin. Then condition (D6) holds for if condition (B1) is satisfied and

$$E\Big(\sum_{i=1}^{r_n} \mathbf{1}\{X_i\notin(-\infty,a_kx^*)^d\}\Big)^{2+\delta} = O(r_n v_n) \tag{3.4}$$

for some $\delta>0$ (see Section 5).

By the same arguments one can show that condition (D6) is fulfilled for the family .

From Theorem 3.11 one may easily conclude the uniform asymptotic normality of the empirical extremogram centered at the pre-asymptotic extremogram

$$\rho_{t,A,B}(h) := P(X_h/t\in B \mid X_0/t\in A).$$

###### 3.14 Corollary

Suppose that the conditions of Theorem 3.11 are met, that for all and , and that . Then