Conditional Tail-Related Risk Estimation Using Composite Asymmetric Least Squares and Empirical Likelihood
Abstract
In this article, we combine composite asymmetric least squares (CALS) and empirical likelihood in a two-step procedure to estimate the conditional value at risk (VaR) and conditional expected shortfall (ES) of GARCH series. First, we perform asymmetric least squares regressions at several significance levels to model the volatility structure and separate it from the innovation process in the GARCH model. Expectiles bridge the gap between VaR estimation and ES estimation because there exists a bijective mapping from expectiles to quantiles, and ES can be derived from an expectile through a simple formula. Second, we use the empirical likelihood method, which is data-driven and distribution-free, to determine this mapping. Theoretical results establish the asymptotic properties of the resulting estimator, namely consistency and asymptotic normality. A Monte Carlo experiment and an empirical application are conducted to evaluate the performance of the proposed method. The results indicate that the proposed method is competitive with existing tail-related risk estimation methods.
keywords:
Tail-related risks, GARCH model, Composite asymmetric least squares, Empirical likelihood
1 Introduction
The accurate assessment of exposure to market risk lies at the core of risk control and portfolio management. Value at risk (VaR), which was first introduced in the 1990s as a risk measure, has seen extensive development and wide application in finance-related fields (Jorion, 2000) owing to its conceptual simplicity and practical convenience. From a statistical perspective, VaR is a quantile of a random loss variable and measures the maximum potential loss at a given confidence level. However, although VaR is employed as the standard risk measure by Basel II, it is criticised for its lack of subadditivity, especially in portfolio management: it is generally accepted that the aggregate risk of a portfolio should not be greater than the sum of the risks of its constituents, but VaR does not guarantee this. Furthermore, Lucas and Klaassen (1998) noted that VaR ignores the extreme losses beyond itself, which may lead to uncontrollable and hazardous losses.
These shortcomings of VaR motivated the development of another risk measure, the expected shortfall (ES), which was introduced by Artzner et al. (1999). The ES risk measure is defined as the conditional expectation of the loss given that it equals or exceeds VaR at a given confidence level. ES has been studied in detail, and it has been shown to possess good properties, such as monotonicity, subadditivity, homogeneity, and translational invariance. In other words, ES is coherent; see Pflug (2000); Acerbi et al. (2001); Acerbi and Tasche (2002). This property has led ES to be increasingly widely used in finance-related fields, such as portfolio management, risk control and prediction.
Considering the respective merits of VaR and ES, such as VaR’s conceptual simplicity and ES’s coherence, ES and VaR have recently been employed simultaneously to obtain a deeper and more accurate understanding of risk management, especially in the analysis of financial time series data. Estimating (or forecasting) the conditional VaR and ES of time series is a great challenge and has attracted heated discussion for a long time; see McNeil and Frey (2000); Engle and Manganelli (2004); Cai and Xu (2009); Taylor (2008); Xiao and Koenker (2009); Kuan et al. (2009), and so on.
Engle and Manganelli (2004) provided a review of the VaR literature and divided the corresponding estimation or prediction methods into three categories: parametric, semiparametric, and nonparametric. Their summary also provides guidance on the general research framework for VaR and ES, since ES estimation follows similar patterns. Generally speaking, parametric methods need a specific parameterized distribution assumption regarding financial prices. One of the most commonly used parametric methods for time series data is the volatility-based method, in which VaR is estimated from a conditional volatility forecast together with a distribution assumption for the shape of the residuals. GARCH models (Bollerslev, 1986) are the most widely used models for forecasting volatility (Granger and Poon, 2003), and there are different choices for the residual distribution, such as the normal distribution (Bollerslev, 1986), Student or skewed Student distributions (Zhu and Galbraith, 2011), the generalized Pareto distribution (Harmantzis et al., 2006), the Johnson family (Simonato, 2011), and mixture distributions (Broda and Paolella, 2011). This type of approach to VaR estimation has the appealing advantage that it provides the structure of the data generation process, so it is convenient and has comparable accuracy for forecasting future VaR. However, this approach focuses on estimating VaR, and it is not clear how to obtain the corresponding ES estimates; this matters because in some situations, such as portfolio management, VaR is insufficient for describing the total risk. Nonparametric methods are another choice for VaR and ES estimation. Cai (2002) first applied a kernel-based method to estimate VaR. On the basis of this work, many other nonparametric methods for VaR and ES estimation have been developed, such as Scaillet (2004); Chen (2008); Cai and Wang (2008); Cai and Xu (2009).
Compared with parametric methods, kernel-based nonparametric methods do not require specification of the distribution and are thus more flexible. However, it is well known that kernel methods may lose efficiency and be impractical for financial problems because they require larger data sets to obtain comparable estimation accuracy (Fan and Yao, 2006). Semiparametric methods are a good alternative that balances estimation efficiency against the demand for distribution-free inference; see Hang Chan et al. (2007); Linton and Xiao (2013); Wang and Zhao (2016). One of the semiparametric approaches for VaR and ES estimation is based on extreme value theory (EVT). For example, for the GARCH model, McNeil and Frey (2000) proved that the distribution of the residuals standardized by GARCH conditional volatility estimates beyond some threshold can be approximated by an extreme value distribution and proposed the peaks-over-threshold EVT method to obtain the corresponding VaR and ES estimates.
The semiparametric autoregressive model is another appealing approach to VaR estimation. Engle and Manganelli (2004) proposed the conditional autoregressive value at risk (CAViaR) model for time series and adopted quantile regression for coefficient estimation. The CAViaR model deals directly with the quantile process instead of the whole distribution of financial returns. Consequently, it does not require a specification of the distribution of financial returns, which is an appealing advantage in practice. However, the CAViaR model is inconvenient for estimating characteristics other than quantiles, such as the volatility of financial data. To meet this need, Xiao and Koenker (2009) applied quantile regression to a widely used data generation process for financial data, the GARCH model, for conditional VaR estimation. However, as Taylor (2008) noted, it is still unclear how to estimate the corresponding ES from the VaR estimate in the CAViaR or GARCH models. The gap between VaR and ES estimation can be bridged by expectiles. Aigner et al. (1976) and Newey and Powell (1987) adopted the ‘asymmetric’ concept from quantile regression in a smooth manner and proposed asymmetric least squares estimation, from which the expectile originates. Efron (1991) showed that there exists a bijective mapping from expectiles to quantiles; i.e., for each quantile of a random variable, there exists a unique expectile equal to that quantile. This property lets the expectile serve as a bond between VaR and ES. The pioneering framework of ES estimation from VaR using expectiles was proposed by Taylor (2008). Applying to expectiles the key idea used to treat the quantile structure in the CAViaR model, Taylor introduced the conditional autoregressive expectile (CARE) model to estimate VaR and ES simultaneously for time series.
Since then, VaR and ES estimation using expectiles has seen wide discussion and development. Kuan et al. (2009) modified the CARE model and studied the asymptotic properties of this method. Xie et al. (2014) generalised the CARE model to situations with time-varying coefficients. Kim and Lee (2016) recently extended this idea to the nonlinear case.
In the use of CARE-type models, the relationship between VaR and ES is built on the bijective mapping between expectile and quantile, so a fundamental problem is how to determine the expectile level $\tau$ that corresponds to a fixed quantile level $\alpha$. For this problem, Taylor (2008) first calculated the expectile and quantile for a sequence of $\tau$ and $\alpha$ using historical data and determined the $\tau$ corresponding to a fixed $\alpha$ via grid search. This approach is straightforward, but it may lose estimation accuracy for the conditional tail-related risk when the historical financial data are insufficient (see our simulation results in Subsection 2.1). Kim and Lee (2016) obtained this value under the assumption that the innovation process follows the normal distribution. These methods for determining the bijective mapping thus either demand large data sets or rely on a prespecified distribution. Another notable issue is indicated in the empirical study of Kuan et al. (2009), which suggests that this value may be time-varying. In this article, we propose using the data-driven and distribution-free empirical likelihood method introduced by Owen (1990) and Qin and Lawless (1994) to determine the expectile level $\tau$ corresponding to a fixed $\alpha$ for the innovation term. Before doing so, a preprocessing step must be performed to separate the innovation process from the volatility part. For this purpose, we adopt the idea used by Xiao and Koenker (2009) and propose using composite asymmetric least squares (CALS) regression.
To summarise, in this article, we estimate the volatility and conditional tail-related risks of the financial return series under the GARCH framework. Under the GARCH framework, the induced dynamic autoregressive structure of conditional tail-related risks stems from the dynamic structure of volatility, which is different from a directly specified autoregressive risk structure, such as CAViaR in Engle and Manganelli (2004) or CARE in Taylor (2008); Kuan et al. (2009). We assume that the innovation process is an i.i.d. random sequence, which is a common assumption in the GARCH model; see Berkes et al. (2003); Hall and Yao (2003). This assumption imposes no distributional form but helps to give a complete picture of the return series. To better capture the dynamic structure of volatility, we adopt the idea of Xiao and Koenker (2009) and propose CALS. Compared to the method in Xiao and Koenker (2009), the main improvement of our proposed method is that we divide all autoregressive parameters into two parts: parameters from the dynamic structure of volatility and parameters from the conditional distribution of the innovation term. With such a representation, we gain a more intuitive understanding of the dynamic structure of the expectile, which is the fundamental concern of our method. This division of the coefficients lets us obtain the volatility structure from CALS directly without complex matrix decomposition computations. Once the volatility part is modelled, we can separate the innovation process from the return series. Then, for a fixed quantile level $\alpha$, we can determine via the empirical likelihood method the expectile level $\tau$ at which the expectile equals the $\alpha$th quantile. Combining CALS and empirical likelihood, the conditional VaR and ES in a GARCH framework can be estimated.
The article is organised as follows. Section 2 reviews three tail-related risks and their relation, which leads to some potential issues in estimating conditional tail-related risk. Based on these issues, we state the motivations and contributions of the proposal. In section 3, we introduce our method to estimate the conditional VaR and ES by combining CALS and empirical likelihood. The asymptotic properties of the method are presented in section 4. Simulation results and an empirical application of the method are given in section 5 and section 6, respectively. Finally, section 7 concludes the paper and presents further discussion. The proofs of some theorems in section 4 are provided in the appendix.
2 Review of related risks, potential issues and motivation
In this section, we first review the definition and some properties of three tail-related risks (VaR, ES and expectile), from which the estimating equations in empirical likelihood are derived. Then, we analyse the dynamic structure of tail-related risks for GARCH-type return series and state the reasons why we capture the dynamic structure using CALS. Finally, we indicate some fundamental issues in the overall estimating procedure and highlight the motivation and contribution of our proposed method.
2.1 Review of three tail-related risks
VaR and ES are two widely used risk measures in the field of risk management. Since we deal with return series, we consider the downside risk here, as in Acerbi et al. (2001); Engle and Manganelli (2004); Taylor (2008). Hence, the VaR of a random variable $Y$ with significance level $\alpha \in (0,1)$ is defined as
(1) $Q_Y(\alpha) = \inf\{y : F_Y(y) \ge \alpha\},$
where $F_Y$ is the cumulative distribution function of $Y$. If $F_Y$ is continuous, the corresponding ES is defined as
(2) $ES_Y(\alpha) = E\left[\,Y \mid Y \le Q_Y(\alpha)\,\right].$
Both risk measures have their own merits and defects (see the Introduction section). In addition, for evaluating and backtesting risk measures, elicitability is another important property. Briefly, a risk measure is elicitable if we can find a scoring function from which the optimal forecast of the measure can be obtained (Ziegel, 2016). The non-elicitability of ES brings many problems in estimation and backtesting since the corresponding M-estimator or test statistic is hard to construct directly (Acerbi and Szekely, 2014; Fissler et al., 2015; Fissler and Ziegel, 2016).
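Kuan et al. (2009) noted that the expectile is another measure for assessing tail-related risk; see also Bellini and Bernardino (2017). The expectile enjoys both elicitability and coherence. Just as the quantile arises from the asymmetric absolute loss, the expectile of a random variable $Y$ with significance level $\tau \in (0,1)$ arises from the asymmetric squared loss (Newey and Powell, 1987):
(3) $\mu_Y(\tau) = \arg\min_{m \in \mathbb{R}} E\left[\rho_\tau(Y - m)\right],$
where the asymmetric squared loss is defined as $\rho_\tau(u) = |\tau - I(u < 0)|\,u^2$. Without loss of generality, suppose that $E(Y) = 0$; if $\tau$ is such that $\mu_Y(\tau) = Q_Y(\alpha)$, then by a straightforward calculation, we have
(4) $ES_Y(\alpha) = \left(1 + \frac{\tau}{(1 - 2\tau)\alpha}\right)\mu_Y(\tau).$
Eq. (4) gives the specific relation between expectile and ES. Jones (1994) established another property of the expectile, showing that there exists a unique increasing bijective function $\tau(\alpha)$ such that $\mu_Y(\tau(\alpha)) = Q_Y(\alpha)$ for $\alpha \in (0,1)$, where $\tau(\alpha)$ is defined as
(5) $\tau(\alpha) = \frac{\alpha Q_Y(\alpha) - G_Y(Q_Y(\alpha))}{2\left[\alpha Q_Y(\alpha) - G_Y(Q_Y(\alpha))\right] + E(Y) - Q_Y(\alpha)},$
with $G_Y(q) = \int_{-\infty}^{q} y\,dF_Y(y)$ the partial moment function of $Y$. Eq. (5) establishes the close relation between expectile and quantile.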
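The relations between quantile, expectile and ES described above can be checked numerically on a sample. The following is a minimal sketch, not the authors' code: it writes $\alpha$ for the quantile level and $\tau$ for the expectile level (symbols chosen here), replaces population moments by sample averages, and uses hypothetical helper names (`expectile`, `tau_of_alpha`, `es_from_expectile`). The ES formula keeps the mean term, which vanishes when the mean is zero.

```python
import numpy as np


def expectile(y, tau, tol=1e-12, max_iter=1000):
    """Sample tau-expectile: iterate the weighted-mean form of the first-order condition."""
    y = np.asarray(y, dtype=float)
    mu = y.mean()
    for _ in range(max_iter):
        w = np.where(y < mu, 1.0 - tau, tau)  # asymmetric weights |tau - I(y < mu)|
        new_mu = np.sum(w * y) / np.sum(w)
        if abs(new_mu - mu) < tol:
            return new_mu
        mu = new_mu
    return mu


def tau_of_alpha(y, alpha):
    """Sample analogue of the quantile-to-expectile map: tau whose expectile equals q."""
    y = np.sort(np.asarray(y, dtype=float))
    k = max(int(np.ceil(alpha * y.size)), 1)
    q = y[k - 1]                                   # k-th order statistic as empirical quantile
    a_hat = np.mean(y <= q)                        # empirical P(Y <= q)
    shortfall = a_hat * q - np.mean(y * (y <= q))  # sample E[(q - Y)^+]
    tau = shortfall / (2.0 * shortfall + y.mean() - q)
    return tau, q, a_hat


def es_from_expectile(mu_tau, tau, alpha, mean=0.0):
    """ES implied by an expectile that equals the alpha-quantile (mean term kept)."""
    c = tau / ((1.0 - 2.0 * tau) * alpha)
    return (1.0 + c) * mu_tau - c * mean


rng = np.random.default_rng(0)
y = rng.standard_t(df=5, size=2000)                # illustrative heavy-tailed sample
tau, q, a_hat = tau_of_alpha(y, alpha=0.05)
es = es_from_expectile(q, tau, a_hat, mean=y.mean())
print(tau, q, es)
```

On the empirical distribution these identities hold exactly: the sample expectile at the returned `tau` coincides with `q`, and `es` equals the average of the observations not exceeding `q`.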
Based on Eq. (4) and Eq. (5), we can address the three tail-related risks simultaneously if we know the mapping from the quantile level $\alpha$ to the expectile level $\tau$. Hence, finding this mapping is an important issue. Taylor (2008) proposed a grid-search method to determine the corresponding $\tau$ for fixed $\alpha$. This approach may lose estimation accuracy when the historical financial data are insufficient (see the boxplot in Figure 1).
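A grid search of the kind described above can be sketched as follows. This is an illustration of the idea only, not Taylor's implementation: for each candidate expectile level on a grid, it computes the sample expectile and keeps the level whose empirical coverage (the share of observations at or below the expectile) is closest to the target quantile level. The grid, the sample and the function names are assumptions made for this example.

```python
import numpy as np


def expectile(y, tau, tol=1e-12, max_iter=1000):
    """Sample tau-expectile via iteratively reweighted means."""
    y = np.asarray(y, dtype=float)
    mu = y.mean()
    for _ in range(max_iter):
        w = np.where(y < mu, 1.0 - tau, tau)
        new_mu = np.sum(w * y) / np.sum(w)
        if abs(new_mu - mu) < tol:
            return new_mu
        mu = new_mu
    return mu


def grid_search_tau(y, alpha, grid):
    """Pick the grid value of tau whose expectile has coverage closest to alpha."""
    y = np.asarray(y, dtype=float)
    best_tau, best_gap = None, np.inf
    for tau in grid:
        coverage = np.mean(y <= expectile(y, tau))
        gap = abs(coverage - alpha)
        if gap < best_gap:
            best_tau, best_gap = tau, gap
    return best_tau, best_gap


rng = np.random.default_rng(1)
sample = rng.standard_normal(1000)
grid = np.arange(0.001, 0.1, 0.0005)               # hypothetical search grid
tau_grid, gap = grid_search_tau(sample, 0.05, grid)
print(tau_grid, gap)
```

Because the coverage is a step function of the expectile level, the result depends on both the grid resolution and the sample size, which is one way to see why accuracy can suffer in short samples.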
To improve the estimation accuracy, we adopt the empirical likelihood method, which is data-driven and distribution-free, to estimate the expectile level $\tau$ corresponding to a fixed quantile level $\alpha$. The empirical likelihood method is an effective and flexible nonparametric method of statistical inference (Owen, 2001). The maximum empirical likelihood method based on estimating equations is one such method, which is typically used for point estimation (Qin and Lawless, 1994). Writing $Q_Y(\alpha)$ for the $\alpha$th quantile of $Y$, the properties of the expectile described above provide us with the following two estimating equations:
(6) $E\left[I(Y \le Q_Y(\alpha)) - \alpha\right] = 0, \qquad E\left[\,|\tau - I(Y \le Q_Y(\alpha))|\,(Y - Q_Y(\alpha))\,\right] = 0,$
from which we construct the maximum empirical likelihood method to estimate $\tau$.
Here, a simple simulation in the i.i.d. case is performed to evaluate the performance of the empirical likelihood method. The data are generated from the standard normal distribution and Student’s $t$ distribution with 5 degrees of freedom, and the sample size is set to 1000, 300 and 100. The grid-search method of Taylor (2008) and the empirical likelihood method are used to estimate the value of $\tau$ corresponding to a fixed $\alpha$. We repeat each data generation and estimation procedure 100 times.
The squared error is used as the evaluation criterion to compare the estimation accuracy of the grid-search method and the empirical likelihood method. Figure 1 shows the boxplot of the squared errors of the two methods in the different cases. The simulation results indicate that the empirical likelihood method is a suitable choice for solving the $\tau$ selection problem. In addition to its competitive estimation accuracy, the empirical likelihood method is more computationally convenient than the grid-search method.
2.2 Conditional tail-related risk under the GARCH framework
In this section, we focus on how the dynamic structure of conditional tail-related risks stems from the dynamic structure of volatility under the GARCH framework. This analysis inspires us to propose the CALS method, which improves on the composite quantile method (Xiao and Koenker, 2009) for estimating volatility.
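Consider a GARCH-type return series $\{y_t\}$ given by
(7) $y_t = \sigma_t \varepsilon_t,$
with natural filtration $\{\mathcal{F}_t\}$. The regular assumptions for the GARCH model (Berkes et al., 2003; Hall and Yao, 2003) include the following: $\{\varepsilon_t\}$ is an innovation series (i.e., independent and identically distributed) from a marginal distribution $F_\varepsilon$ with zero mean and unit variance; $\varepsilon_t$ is independent of $\mathcal{F}_{t-1}$; and the volatility term $\sigma_t$ is measurable with respect to $\mathcal{F}_{t-1}$.
For quantile level $\alpha$ and expectile index $\tau$, denote $Q_\varepsilon(\alpha)$ and $ES_\varepsilon(\alpha)$ as the $\alpha$th quantile and ES of the marginal distribution $F_\varepsilon$, and let $\mu_\varepsilon(\tau)$ be its $\tau$th expectile. The conditional VaR, conditional ES and conditional expectile of $y_t$ are represented as
(8) $Q_{y_t}(\alpha \mid \mathcal{F}_{t-1}) = \sigma_t\, Q_\varepsilon(\alpha),$
(9) $ES_{y_t}(\alpha \mid \mathcal{F}_{t-1}) = \sigma_t\, ES_\varepsilon(\alpha),$
(10) $\mu_{y_t}(\tau \mid \mathcal{F}_{t-1}) = \sigma_t\, \mu_\varepsilon(\tau).$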
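Following the linear GARCH model in Xiao and Koenker (2009), the volatility has the following dynamic structure:
(11) $y_t = \sigma_t \varepsilon_t,$
(12) $\sigma_t = \beta_0 + \sum_{i=1}^{p} \beta_i \sigma_{t-i} + \sum_{j=1}^{q} \gamma_j |y_{t-j}|,$
where the coefficients $\beta_i$ and $\gamma_j$ are positive. Hence, we obtain the autoregressive specification of conditional VaR, conditional ES and conditional expectile as follows:
(13) $Q_{y_t}(\alpha \mid \mathcal{F}_{t-1}) = \beta_0^{Q} + \sum_{i=1}^{p} \beta_i\, Q_{y_{t-i}}(\alpha \mid \mathcal{F}_{t-i-1}) + \sum_{j=1}^{q} \gamma_j^{Q} |y_{t-j}|,$
where $\beta_0^{Q} = \beta_0 Q_\varepsilon(\alpha)$, $\gamma_j^{Q} = \gamma_j Q_\varepsilon(\alpha)$, $j = 1, \ldots, q$;
(14) $ES_{y_t}(\alpha \mid \mathcal{F}_{t-1}) = \beta_0^{ES} + \sum_{i=1}^{p} \beta_i\, ES_{y_{t-i}}(\alpha \mid \mathcal{F}_{t-i-1}) + \sum_{j=1}^{q} \gamma_j^{ES} |y_{t-j}|,$
where $\beta_0^{ES} = \beta_0 ES_\varepsilon(\alpha)$, $\gamma_j^{ES} = \gamma_j ES_\varepsilon(\alpha)$, $j = 1, \ldots, q$;
(15) $\mu_{y_t}(\tau \mid \mathcal{F}_{t-1}) = \beta_0^{\mu} + \sum_{i=1}^{p} \beta_i\, \mu_{y_{t-i}}(\tau \mid \mathcal{F}_{t-i-1}) + \sum_{j=1}^{q} \gamma_j^{\mu} |y_{t-j}|,$
where $\beta_0^{\mu} = \beta_0 \mu_\varepsilon(\tau)$, $\gamma_j^{\mu} = \gamma_j \mu_\varepsilon(\tau)$, $j = 1, \ldots, q$.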
The above transformation shows that the autoregressive dynamics of tail-related risks stem from the dynamics of volatility. The coefficients on the lagged risk terms stay unchanged in all of the dynamics, whether for volatility or for the different risks. The remaining coefficients vary across the risk dynamics: they are the products of the original volatility coefficients and the corresponding risk of the innovation. This fact indicates that the dynamic of a tail-related risk consists of two types of information: dynamic information from the volatility and distribution information from the innovation series.
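The way a risk recursion is inherited from the volatility recursion can be illustrated by simulation. The sketch below makes several assumptions that are not taken from the paper: a linear GARCH(1,1) of the Xiao and Koenker type, $\sigma_t = \beta_0 + \beta_1 \sigma_{t-1} + \gamma_1 |y_{t-1}|$, standard normal innovations, and arbitrary coefficient values. It computes the conditional 5% VaR once as volatility times the innovation quantile and once through the autoregressive form, in which the lag coefficient is unchanged and the other two coefficients are multiplied by the innovation quantile.

```python
import numpy as np


def simulate_linear_garch(n, beta0, beta1, gamma1, rng):
    """Simulate y_t = sigma_t * eps_t with sigma_t = beta0 + beta1*sigma_{t-1} + gamma1*|y_{t-1}|."""
    eps = rng.standard_normal(n)
    sigma = np.empty(n)
    y = np.empty(n)
    sigma[0] = beta0 / (1.0 - beta1)        # arbitrary positive starting value
    y[0] = sigma[0] * eps[0]
    for t in range(1, n):
        sigma[t] = beta0 + beta1 * sigma[t - 1] + gamma1 * abs(y[t - 1])
        y[t] = sigma[t] * eps[t]
    return y, sigma


rng = np.random.default_rng(0)
beta0, beta1, gamma1 = 0.1, 0.5, 0.3        # illustrative values only
y, sigma = simulate_linear_garch(500, beta0, beta1, gamma1, rng)

q_eps = -1.6448536269514722                 # approx. 5% quantile of N(0, 1)
var_direct = sigma * q_eps                  # volatility times innovation quantile

# Autoregressive form: same lag coefficient, rescaled intercept and |y| coefficient.
var_rec = np.empty_like(var_direct)
var_rec[0] = var_direct[0]
for t in range(1, len(y)):
    var_rec[t] = beta0 * q_eps + beta1 * var_rec[t - 1] + gamma1 * q_eps * abs(y[t - 1])

print(np.max(np.abs(var_direct - var_rec)))
```

Replacing `q_eps` by the innovation ES or an innovation expectile gives the corresponding ES and expectile recursions with the same lag coefficient.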
Based on the above analysis, there are two noteworthy points to highlight here:

We show that under the GARCH framework, the dynamics of different tail-related risks share a specific structure with some latent and common coefficients. As discussed above, ES is non-elicitable; thus, the coefficients in Eq. (14) cannot be estimated directly by solving an optimisation problem. With the dynamic structures above, we can obtain the ES dynamic from the expectile dynamic based on the properties of the expectile shown in Eqs. (4)-(5). Since the innovation series has zero expectation, if we can determine the expectile level $\tau$ at which the innovation expectile equals the $\alpha$th innovation quantile, then the $\tau$th expectile specification in Eq. (15) is equivalent to the $\alpha$th quantile specification in Eq. (13). Moreover, the ES specification can be obtained by multiplying both sides of Eq. (15) by a constant $c$, where $c = 1 + \tau / ((1 - 2\tau)\alpha)$. This is the theoretical basis of the method of Taylor (2008), which captures the ES dynamic by estimating the coefficients in the expectile dynamics. In our framework, the distribution information of the innovation series should no longer be ignored because the optimal $\tau$ and the associated constant should be determined by the distribution of the innovation.

Another insight is that the two parts of the coefficients cannot be separated within a single conditional risk dynamic. For example, we cannot recover a volatility coefficient or the innovation quantile individually, even if we have an estimator of their product from a single quantile regression. The problem can be solved if we consider a cluster of conditional risk dynamics, as in the composite quantile method (Xiao and Koenker, 2009). That method fits several quantile dynamics to obtain dynamic quantile coefficients and uses matrix decomposition to estimate the volatility dynamic coefficients. Here, we find that the conditional expectile shares a dynamic structure similar to that of the conditional quantile. Hence, the method of Xiao and Koenker (2009) can be extended to a composite expectile form, which is CALS. Later, we show that a further technical adjustment can make this method more computationally efficient.
We have clarified the dynamic structure of the tail-related risks under the GARCH framework. Next, we state some potential issues and our motivation.
2.3 Motivation: the combination of CALS and empirical likelihood
In this part, we summarise the fundamental issues in the estimating procedure and our corresponding approaches; this section describes our main motivation and the primary contributions of this work.
The first issue concerns the determination of the mapping from the quantile level $\alpha$ to the expectile level $\tau$. Under the GARCH framework, it has been shown that the dynamics of tail-related risks consist of two parts of information: dynamic information from the volatility and distribution information from the innovation series. When we obtain the ES dynamics from the expectile dynamics, the distribution information of the innovation series is necessary to determine two important values: the expectile level $\tau$ and the associated multiplicative constant. The grid-search method of Taylor (2008) is based on the return series itself, which will lead to estimation error because the conditional distribution of a return given the past information generally differs from its marginal distribution. Moreover, the time-varying nature of $\tau$ is noteworthy. Taylor (2008) selects $\tau$ based on the data from the first moving window and keeps it fixed in the rest of the estimation procedure. It is not guaranteed that the distribution of the return series is time-invariant. In fact, the issue was verified in the empirical study of Kuan et al. (2009), who noted that with a fixed probability-level conditional expectile, the corresponding tail probabilities of the conditional quantile are different in-sample and out-of-sample. Hence, a time-varying mapping is more appropriate in practice.
To overcome these issues, we separate the innovation series from the return series, which involves a volatility estimation step first. Then, the empirical likelihood method is used to determine the value of $\tau$ for a fixed quantile level $\alpha$ from the separated innovation series. The empirical likelihood method is applied in a rolling manner in each moving window so that the determined value of $\tau$ can be updated over time.
The second issue occurs in the procedure of estimating the volatility structure. To separate the innovation series from the return series, an important step is to estimate the volatility structure. The composite quantile method proposed by Xiao and Koenker (2009) is one choice for capturing the volatility structure of a conditional heteroscedastic time series. Inspired by the analysis in section 2.2, we make some technical adjustments to refine this method.
First, the method of Xiao and Koenker (2009) is extended to a composite expectile form, which is called CALS. The substitution of expectiles for quantiles is made for computational efficiency since expectile regression has better computational properties than quantile regression does; see the details in Waltrup et al. (2015). The second adjustment is keeping the two parts of the coefficients separate in the corresponding optimisation function (for example, keeping a volatility coefficient and the innovation expectile as separate parameters), rather than taking their product as a single coefficient. This adjustment avoids the complex matrix decomposition computations in Xiao and Koenker (2009).
Based on the potential issues noted above, we combine the CALS method and the empirical likelihood method to estimate the conditional VaR, conditional ES and even conditional expectile simultaneously. The main model assumption of our method is a GARCH-type series with i.i.d. innovations, which is different from the semiparametric autoregressive model. The main estimation procedure of the proposed method can be outlined as follows:

Estimating the volatility structure of the return series by CALS, and separating the innovation series from the return series;

Determining the mapping from the quantile level $\alpha$ to the expectile level $\tau$ by empirical likelihood, and estimating the quantile and ES of the innovation simultaneously;

Estimating conditional tail-related risks by combining the estimates above.
The three steps in the outline are detailed in Section 3, corresponding to the three subsections of Section 3.
The contributions of our work can be summarised as follows. First, we use a flexible nonparametric method, empirical likelihood, to determine the mapping from quantile level to expectile level from the separated innovation series, which is competitive in terms of estimation accuracy. Second, we extend the composite quantile method to CALS for volatility estimation. To avoid complex matrix decomposition computations, we keep the two parts of the coefficients separate in the corresponding optimisation function. The adjusted method is more computationally efficient and has asymptotic properties similar to those of the composite quantile method. Finally, by analysing the dynamic structure of the three conditional tail-related risks, we find that they have the same structure with common coefficients. The combined method allows us to process the coefficients of the volatility dynamic and the coefficients of the distribution separately. The consequent advantage is that we can estimate the conditional VaR, conditional ES and conditional expectile simultaneously.
3 Combination method of CALS and empirical likelihood
In the previous section, we presented an improved proposal: CALS for volatility estimation and empirical likelihood for the determination of the expectile level $\tau$ corresponding to a fixed quantile level $\alpha$. In this section, we present more methodological details of CALS and empirical likelihood and then give the complete estimation procedure of the combined method.
3.1 Volatility estimation using CALS
The idea of volatility estimation using composite expectiles is inspired by, and improves on, the composite quantile methods in Xiao and Koenker (2009) and Kai et al. (2010). We use a class of expectile specifications with common parameter constraints to fit the volatility structure.
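Consider the following linear GARCH($p$,$q$) model:
(16) $y_t = \sigma_t \varepsilon_t,$
(17) $\sigma_t = \beta_0 + \sum_{i=1}^{p} \beta_i \sigma_{t-i} + \sum_{j=1}^{q} \gamma_j |y_{t-j}|.$
Denote $\beta(L) = 1 - \sum_{i=1}^{p} \beta_i L^i$ and $\gamma(L) = \sum_{j=1}^{q} \gamma_j L^j$ (where $L$ is the lag operator), which satisfy the invertibility assumption [B1] (see section 4). With this assumption, we can obtain an ARCH($\infty$) representation of $\sigma_t$,
(18) $\sigma_t = a_0 + \sum_{j=1}^{\infty} a_j |y_{t-j}|,$
where the coefficients $a_j$ decrease geometrically, which is implied by assumption [B1] (see the detailed discussion in Koenker and Xiao (2006)). Without loss of generality, we normalise $a_0 = 1$ for identification. Substituting the foregoing ARCH($\infty$) representation into (16) and (17), we have
(19) $y_t = \left(a_0 + \sum_{j=1}^{\infty} a_j |y_{t-j}|\right)\varepsilon_t.$
Denoting the truncation parameter by $m$, we use the following truncated ARCH($m$) model as an approximation of the real model:
(20) $y_t = \left(a_0 + \sum_{j=1}^{m} a_j |y_{t-j}|\right)\varepsilon_t.$
We construct a composite method, CALS, to fit the volatility structure, and it has a class of expectile autoregressive specifications as
(21) $\mu_{y_t}(\tau_k \mid \mathcal{F}_{t-1}) = \mu_\varepsilon(\tau_k)\left(a_0 + \sum_{j=1}^{m} a_j |y_{t-j}|\right), \quad k = 1, \ldots, K,$
where $\{\tau_k\}_{k=1}^{K}$ is a class of expectile significance levels and $\mu_\varepsilon(\tau_k)$ is the $\tau_k$th expectile of the innovation. The advantage of this method is that a class of expectile autoregressive models can fully exploit the potential information about the volatility structure. Here, expectile specifications with different significance levels share the same parameter structure, with common parameters $a_1$ to $a_m$ for the volatility structure and a specific parameter $\mu_\varepsilon(\tau_k)$ for each significance level. Essentially, it imposes parameter constraints on the different expectile specifications, and this is a major difference from the composite quantile method in Xiao and Koenker (2009).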
For convenience of expression, we denote the parameters in the CALS formulas above as
(22) 
where
(23) 
(24) 
and the leading volatility coefficient is fixed as 1 for identification, as is its estimator. The level-specific parameters carry the distribution information of the innovation series, and the common parameters carry the dynamic information of the volatility structure. Here, as we discussed in Section 2, we keep them separate in the loss function.
Denote , and . We can estimate these parameters using CALS with the following expression:
(25)  
(26) 
Then, we can obtain a preliminary in-sample estimate of the volatility as
(27) 
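The composite objective and the preliminary volatility estimate described above can be sketched as follows. This is an assumption-laden illustration rather than the paper's exact formulation: the loss is taken to be the sum, over $K$ expectile levels and over time, of the asymmetric squared loss of $y_t - \mu_k\,(1 + \sum_{j=1}^{m} a_j |y_{t-j}|)$, with the leading volatility coefficient fixed at 1, common coefficients `a` and one level-specific parameter `mu[k]` per expectile level. All function and variable names are hypothetical.

```python
import numpy as np


def lagged_abs(y, m):
    """Matrix whose row for time t (t = m, ..., n-1) is (|y_{t-1}|, ..., |y_{t-m}|)."""
    abs_y = np.abs(np.asarray(y, dtype=float))
    n = abs_y.size
    return np.column_stack([abs_y[m - j: n - j] for j in range(1, m + 1)])


def preliminary_volatility(a, y):
    """Volatility proxy 1 + sum_j a_j |y_{t-j}| (leading coefficient normalised to 1)."""
    a = np.asarray(a, dtype=float)
    return 1.0 + lagged_abs(y, a.size) @ a


def cals_loss(mu, a, y, taus):
    """Composite asymmetric squared loss with shared a and level-specific mu_k."""
    y = np.asarray(y, dtype=float)
    a = np.asarray(a, dtype=float)
    scale = preliminary_volatility(a, y)
    target = y[a.size:]
    total = 0.0
    for mu_k, tau_k in zip(mu, taus):
        u = target - mu_k * scale
        w = np.where(u < 0, 1.0 - tau_k, tau_k)   # |tau_k - I(u < 0)|
        total += np.sum(w * u ** 2)
    return total


toy = np.array([1.0, -2.0, 3.0, -4.0])
print(cals_loss([0.2], [0.5], toy, [0.5]))
print(preliminary_volatility([0.5], toy))
```

In practice the loss would be minimised jointly over `mu` and `a` with a numerical optimiser; because the two blocks of parameters stay separate, the fitted `a` gives the volatility proxy directly.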
To improve estimation accuracy, we can refit the GARCH($p$,$q$) model by least squares since we already have a preliminary estimate of the volatility. The parameters of the GARCH($p$,$q$) model can then be estimated by
(28) 
The corresponding volatility estimation is
(29) 
Remark 3.1.
Compared to the method of Xiao and Koenker (2009), our method has several improvements. First, asymmetric least squares has better computational properties than asymmetric least absolute deviations; see Waltrup et al. (2015). Second, the parameter constraints in CALS free the model from the crossing problem, which often occurs in composite methods; see details in Waltrup et al. (2015). Finally, it avoids the complex matrix decomposition in Xiao and Koenker (2009), making the method more computationally efficient. Despite its lower computational complexity, the proposed method shares asymptotic properties similar to those of the method in Xiao and Koenker (2009).
3.2 Empirical likelihood for determining $\tau$
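Having obtained the in-sample volatility estimate, we can obtain a series of estimated innovations by
(30) $\hat\varepsilon_t = y_t / \hat\sigma_t,$
where $\hat\sigma_t$ is the estimated volatility at time $t$. We determine the expectile level $\tau$ corresponding to a fixed quantile level $\alpha$ from the series of estimated innovations by the method of empirical likelihood, since $\tau$ depends on the marginal distribution of the noise process $\{\varepsilon_t\}$.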
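We use the maximum empirical likelihood method to determine $\tau$, which is theoretically based on the properties of the expectile stated in Eq. (6). Applying these properties to the innovation series, and writing $Q_\varepsilon(\alpha)$ for the $\alpha$th quantile of the innovation, we have
(31) $E\left[I(\varepsilon_t \le Q_\varepsilon(\alpha)) - \alpha\right] = 0, \qquad E\left[\,|\tau - I(\varepsilon_t \le Q_\varepsilon(\alpha))|\,(\varepsilon_t - Q_\varepsilon(\alpha))\,\right] = 0.$
With the equations above, for a fixed $\alpha$, we can construct the empirical likelihood function and the maximum empirical likelihood estimate of $\tau$. For notational convenience, in this subsection, we denote the true value of the parameter pair (expectile level, innovation quantile) by $\theta_0$, and the notation $\theta = (\tau, q)^\top$ will be used in the optimisation function of empirical likelihood.
Suppose that $\hat\varepsilon_1, \ldots, \hat\varepsilon_n$ are the estimated innovations from (30). For $t = 1, \ldots, n$ and fixed $\alpha$, let
(32) $g_t(\theta) = \left(I(\hat\varepsilon_t \le q) - \alpha,\; |\tau - I(\hat\varepsilon_t \le q)|\,(\hat\varepsilon_t - q)\right)^\top.$
Then, the empirical likelihood function of $\theta$ can be expressed as
(33) $L(\theta) = \sup\left\{\prod_{t=1}^{n} p_t : p_t \ge 0,\ \sum_{t=1}^{n} p_t = 1,\ \sum_{t=1}^{n} p_t\, g_t(\theta) = 0\right\}.$
Making use of Lagrange multipliers, we can obtain
(34) $p_t = \frac{1}{n}\,\frac{1}{1 + \lambda^\top g_t(\theta)},$
where $\lambda$ is a two-dimensional vector associated with $\theta$ but has no explicit expression. The relationship of $\lambda$ and $\theta$ is as follows:
(35) $\frac{1}{n}\sum_{t=1}^{n} \frac{g_t(\theta)}{1 + \lambda^\top g_t(\theta)} = 0.$
Next, we can obtain the maximum empirical likelihood estimate for $\theta$, which is defined as
(36) $\hat\theta = \arg\min_{\theta} \sum_{t=1}^{n} \log\left(1 + \lambda^\top g_t(\theta)\right).$
Here, we use the estimated innovation series to estimate the expectile level and the innovation quantile for fixed $\alpha$ via the empirical likelihood method described above. So far, we have described the method for determining the corresponding $\tau$. It is more reasonable than the method of Taylor (2008) since our method genuinely uses information about the conditional distribution of the return. Furthermore, the empirical likelihood method is data-driven and completely free of distributional assumptions.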
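The empirical likelihood step can be sketched numerically as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the two moment functions are taken to be the quantile condition $I(\varepsilon \le q) - \alpha$ and the expectile first-order condition $|\tau - I(\varepsilon \le q)|(\varepsilon - q)$; the Lagrange multiplier is found by a damped Newton iteration on the concave dual; and the candidate $(\tau, q)$ is obtained by solving the sample versions of the two equations directly. With two equations and two unknowns, the empirical likelihood is maximised wherever the sample moments vanish, provided such a point exists. All names are hypothetical.

```python
import numpy as np


def moment_functions(eps, tau, q, alpha):
    """Rows (I(eps <= q) - alpha, |tau - I(eps <= q)| * (eps - q))."""
    below = (eps <= q).astype(float)
    return np.column_stack([below - alpha, np.abs(tau - below) * (eps - q)])


def neg2_log_el_ratio(g, max_iter=100, tol=1e-10):
    """-2 log empirical likelihood ratio; lambda solved by damped Newton on the dual."""
    n, d = g.shape
    lam = np.zeros(d)
    for _ in range(max_iter):
        denom = 1.0 + g @ lam
        grad = (g / denom[:, None]).sum(axis=0)
        if np.linalg.norm(grad) / n < tol:
            break
        hess = (g / denom[:, None] ** 2).T @ g      # sum of g g^T / denom^2
        step = np.linalg.solve(hess, grad)
        f0 = np.sum(np.log(denom))
        s = 1.0
        while s > 1e-12:                            # keep weights positive, dual non-decreasing
            new = 1.0 + g @ (lam + s * step)
            if np.all(new > 1e-10) and np.sum(np.log(new)) >= f0:
                break
            s *= 0.5
        lam = lam + s * step
    return 2.0 * np.sum(np.log(1.0 + g @ lam))


def solve_estimating_equations(eps, alpha):
    """Sample solution (tau, q) of the two moment conditions."""
    e = np.sort(np.asarray(eps, dtype=float))
    k = max(int(np.ceil(alpha * e.size)), 1)
    q = e[k - 1]
    a_hat = np.mean(e <= q)
    shortfall = a_hat * q - np.mean(e * (e <= q))
    return shortfall / (2.0 * shortfall + e.mean() - q), q


rng = np.random.default_rng(0)
eps = rng.standard_normal(1000)                     # stand-in for estimated innovations
alpha = 0.05
tau_hat, q_hat = solve_estimating_equations(eps, alpha)
print(tau_hat, q_hat)
print(neg2_log_el_ratio(moment_functions(eps, tau_hat, q_hat, alpha)))
print(neg2_log_el_ratio(moment_functions(eps, tau_hat + 0.02, q_hat, alpha)))
```

The statistic is essentially zero at the sample solution and increases as the expectile level is moved away from it, which is the behaviour the maximum empirical likelihood estimate exploits.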
3.3 Estimating conditional tailrelated risks
At the end of the section, we summarise the combination method of CALS and empirical likelihood for estimating conditional tailrelated risks. We are going to estimate the conditional VaR and ES of series conditional on information prior to time . Given a suitable length of moving window , we obtain from the observations in the moving window by the method of CALS. Additionally, the estimated innovation series is obtained by
(37) 
where . From this series of , we obtain the estimate using empirical likelihood. The quantile estimator of is then given by
(38) 
and the corresponding estimator of ES is
(39) 
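As a rough illustration of how a quantile, its matching expectile level and the ES are linked, the sketch below uses a simple plug-in moment-matching rule in place of the empirical likelihood step, so it is a stand-in and not the estimator of Eq. (36). It assumes a lower-tail convention (the quantile and ES are negative for small levels), which may differ in sign from Eqs. (38) and (39). The function name `innovation_tail_risk` and the argument `alpha` are hypothetical.

```python
import numpy as np

def innovation_tail_risk(eps, alpha):
    """Plug-in lower-tail quantile, matching expectile level, and ES.

    Assumption: the expectile level tau is chosen so that the tau-expectile
    of the sample equals its alpha-quantile; ES then follows from the
    standard expectile-ES identity for a variable with mean mu.
    """
    eps = np.sort(np.asarray(eps, dtype=float))
    n = eps.size
    k = max(int(np.floor(n * alpha + 1e-9)), 1)
    q = eps[k - 1]                                # k-th order statistic
    below = np.mean(np.maximum(q - eps, 0.0))     # mean of (q - eps)_+
    above = np.mean(np.maximum(eps - q, 0.0))     # mean of (eps - q)_+
    tau = below / (below + above)                 # level with expectile equal to q
    mu = eps.mean()
    c = tau / ((1.0 - 2.0 * tau) * alpha)
    es = (1.0 + c) * q - c * mu                   # expectile-based ES
    return q, tau, es
```

When the product of the sample size and `alpha` is an integer, the returned ES coincides with the average of the smallest observations up to the quantile, which gives a quick sanity check of the identity.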
The conditional tail-related risks of the prediction are obtained as the product of the volatility prediction and the tail-related risks of the innovation series. Since we have a preliminary estimate , a natural choice is to predict the conditional tail-related risks of by
(40) 
(41) 
where . Alternatively, we can predict the conditional tail-related risks based on another volatility estimate as follows:
(42) 
(43) 
where . In the simulation section, we show that both estimators are competitive, but the 'hat' version (Eqs. (42)-(43)) outperforms the 'tilde' version (Eqs. (40)-(41)).
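The product form of the prediction can be sketched in a few lines. The helper names below (`standardise`, `conditional_tail_risk`) and their arguments are hypothetical; the volatility values are assumed to come from whichever volatility estimator is used, and the innovation risks from a separate estimation step.

```python
import numpy as np

def standardise(y, sigma):
    """Estimated innovations: observations divided by estimated volatility."""
    return np.asarray(y, dtype=float) / np.asarray(sigma, dtype=float)

def conditional_tail_risk(sigma_next, q_eps, es_eps):
    """Scale the innovation quantile and ES by the predicted volatility."""
    sigma_next = np.asarray(sigma_next, dtype=float)
    return sigma_next * q_eps, sigma_next * es_eps
```

For example, a predicted volatility of 2.0 combined with an innovation quantile of -1.6 and an innovation ES of -2.1 gives a conditional VaR of -3.2 and a conditional ES of -4.2 under the lower-tail convention assumed here.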
This completes the description of the procedure for estimating the conditional tail-related risk with our proposed method. We now give some rules of thumb for selecting its tuning parameters. The length of the moving window, , must be determined since it is crucial to the estimation. Considering the overall asymptotic properties of the method, a larger is preferable. In practice, however, the model underlying a financial time series changes frequently, in terms of both the heteroscedasticity structure and the noise distribution, so an overly large may lead to undesirable model errors. The selection of is also sensitive to the quantile level . Based on simulations, a moving window of length 500 to 1000 is suitable when is not too extreme, and longer moving windows are necessary for a more extreme .
The truncation parameter is a value associated with the sample size . Xiao and Koenker (2009) proposed that should be a sufficiently large constant multiple of to ensure that the approximation error of is sufficiently small. As the method for preliminary estimation in our paper is essentially similar to the method of Xiao and Koenker (2009), we follow their choice, with .
The number of expectile specifications, , and the corresponding expectile index in CALS must also be chosen by the analyst. Usually, we choose a uniform grid over the interval as the class of expectile index . Although a larger K improves the estimation accuracy, it also increases the computational expense, and the estimation becomes less robust when some approach 0 or 1. As verified via simulations, a uniform grid over the interval with a length from 9 to 19 is an appropriate choice for a not-too-extreme .
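To make the grid choice concrete, the sketch below builds a uniform grid of expectile levels and computes sample expectiles by iteratively reweighted averaging, which is the unconditional special case of asymmetric least squares. The grid convention tau_k = k/(K+1) is an assumption made for illustration, as are the function names; the sketch does not implement the full CALS regression of Eq. (25).

```python
import numpy as np

def expectile_grid(K):
    """Uniform grid tau_k = k / (K + 1), k = 1..K (assumed convention)."""
    return np.arange(1, K + 1) / (K + 1)

def sample_expectile(y, tau, max_iter=100, tol=1e-12):
    """Sample tau-expectile via iteratively reweighted averaging.

    Minimises sum_i |tau - 1{y_i <= e}| * (y_i - e)^2 over e.
    """
    y = np.asarray(y, dtype=float)
    e = y.mean()
    for _ in range(max_iter):
        w = np.where(y > e, tau, 1.0 - tau)   # asymmetric weights
        e_new = np.sum(w * y) / np.sum(w)     # weighted least squares update
        if abs(e_new - e) <= tol:
            return e_new
        e = e_new
    return e
```

With K = 9 the grid is 0.1, 0.2, ..., 0.9, and the expectile at level 0.5 is the sample mean. Levels close to 0 or 1 place nearly all of the weight on a few extreme observations, which is one way to see the robustness concern raised above.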
4 Asymptotic properties of the combination estimation
In this section, we state the asymptotic properties of the proposed methods as theorems, and the corresponding proofs are all presented in the appendix.
4.1 Asymptotic properties of CALS estimation
Let be the minimiser of . Consider the estimation of derived from the CALS in Eq. (25). The consistency and asymptotic normality of the CALS estimator are given by Theorems 4.2 and 4.3. We first state the conditions required for these results.
Assumption.
A1. is strictly stationary and ergodic and has the probability density function with respect to the measure , where is continuous in for almost all and denotes the Lebesgue measure on the .
Assumption.
A2. There is a such that , where denotes the infinity norm in this article.
Assumption.
A3. , where is compact.
Assumption.
A4. is nonsingular.
These assumptions are common in asymmetric least squares regression; see Newey and Powell (1987). Under these regularity conditions, we have the following theorems about the asymptotic properties of CALS estimation.
Theorem 4.2.
Under Assumptions [A1]-[A3], there is a unique minimiser, , of the objective function , and the CALS estimator satisfies , as .
Theorem 4.3.
Under Assumptions [A1]-[A4],
as , where , with
and
More detailed presentations of and can be found in the appendix.
In fact, for volatility estimation, we need only a subset of the parameters in , namely . We restate the asymptotic property of these parameters in the following corollary.
Corollary 4.4.
Under Assumptions [A1]-[A4], the CALS estimation of satisfies
and
as , where is the principal submatrix of from the th row to the th row. Alternatively, it can be presented as , with
and
Before we provide the asymptotic properties of the volatility estimators and , we must discuss the error incurred by approximating the linear GARCH(,) model by the truncated ARCH() model and determine a suitable truncation parameter . Following Xiao and Koenker (2009), we state the assumptions needed to bound this approximation error.
Assumption.
B1. The polynomials and , where and are positive, have no common zero points; , for ; and , for .
Assumption.
B2. The truncation parameter m satisfies for some constant .
Under Assumption B1, is invertible, and the parameters in (18) decrease at a geometric rate. As a consequence, we have the following proposition given by Xiao and Koenker (2009).
Proposition 4.5.
Under Assumptions [B1]-[B2], there exists a positive constant such that admits the approximation as . If we choose the constant in Assumption B2 as , then .
With the conclusions above, we can obtain the asymptotic properties of the preliminary estimation, .
Corollary 4.6.
Under Assumptions [A1]-[A4] and [B1]-[B2], conditional on the information prior to time , the preliminary estimation, , has the following asymptotic properties:
and
as , where .
To make a consistent one-step post-sample prediction of the conditional variance, we must discuss the asymptotic properties of in (28). We first introduce some notation.
Let , and denote
since can be expressed as . Correspondingly, we write
Then, the estimator of from (28) can be rewritten as
(44) 
and has the following limiting behaviour.
Theorem 4.7.
Under Assumptions [A1]-[A4] and [B1]-[B2], the estimator of from Eq. (28) has the following asymptotic properties:
(45) 
(46) 
as . can be expressed as
(47) 
with and .
Similarly to Corollary 4.6, we obtain the asymptotic properties of .
Corollary 4.8.
Under Assumptions [A1]-[A4] and [B1]-[B2] and conditional on information prior to time , has the following asymptotic properties:
and
as , where , and can be approximated by .
4.2 Asymptotic properties of estimation of the conditional tail-related risk
Having established the asymptotic properties of CALS estimation, we now turn to the estimation of the conditional tail-related risk. Since the estimators of the conditional tail-related risks combine the volatility estimator with the empirical likelihood estimators of and , the asymptotic properties of the latter must be discussed. These asymptotic results also require some assumptions about the distribution of the innovation series , which are not restrictive for most common distributions.
Assumption.
C1. In each moving window, the innovation series is an independent identically distributed random sample with distribution .
Assumption.
C2. has expectation and finite second moment . Its derivative is bounded and satisfies .
With these two assumptions and the assumptions stated above, we can obtain the following lemma about the empirical distribution of the estimated innovation .
Lemma 4.9.
Suppose that is the empirical distribution of the estimated innovation . Under Assumptions [A1]-[A4], [B1]-[B2] and [C1]-[C2], for any given , we have
(48) 
as .
Lemma 4.9 plays an important role in deriving the asymptotic property of the empirical likelihood estimation. It implies that the empirical distribution of the estimated innovation has convergence properties similar to those of the empirical distribution of an i.i.d. sample. Based on this lemma, the asymptotic property of empirical likelihood estimation is established by the following theorem.
Theorem 4.10.
For the empirical likelihood estimation from (36), under Assumptions [A1]-[A4], [B1]-[B2] and [C1]-[C2], when , it follows that
(49) 
(50) 
as , where , with