
# Markov Switching Component GARCH Model: Stability and Forecasting

N. Alemohammad, S. Rezakhah, S. H. Alizadeh
Faculty of Mathematics and Computer Science, Amirkabir University of Technology, Tehran, Iran. Email: n-alemohammad@aut.ac.ir, rezakhah@aut.ac.ir, sasan.h.alizadeh@qiau.ac.ir
###### Abstract

This paper introduces an extension of the Markov switching GARCH model in which the volatility of each state is a convex combination of two different GARCH components with time-varying weights. This structure gives the model the flexibility to capture shocks of different sizes. The asymptotic behavior of the second moment is investigated and an appropriate upper bound for it is evaluated. Estimation of the parameters by the Bayesian method via the Gibbs sampling algorithm is studied. Finally, we illustrate the efficiency of the model by simulation and empirical analysis, and show that it provides much better volatility forecasts than the Markov switching GARCH model.

Keywords: GARCH models, Markov process, Stability, Component GARCH models, Forecasting, Bayesian inference, Griddy Gibbs sampling.

Mathematics Subject Classification: 60J10, 62M10, 62F15.

## 1 Introduction

In the past three decades, there has been growing interest in using nonlinear time series models in finance and economics. For financial time series, the ARCH and GARCH models, introduced by Engle [11] and Bollerslev [7], are surely the most popular classes of volatility models. Although these models have been applied extensively in the modeling of financial time series, they cannot adequately capture the dynamic structure of volatility. For more consistent volatility modeling, models with time-varying parameters have been introduced. One class of such models is that of smooth transition GARCH models, presented by Gonzalez-Rivera [14] and Lubrano [23] (see also Hagerud [19] and Medeiros and Veiga [25]). These models are a valuable tool for capturing asymmetric responses to negative versus positive, or small versus large, shocks in financial time series. The component GARCH models, introduced first by Ding and Granger [10], are also a generalization of the constant-parameter GARCH model. In the structure of the component GARCH model ([10]), two different GARCH components contribute to the overall conditional variance at time t: one component has high volatility (the integrated variance component) and the other has low volatility. These models have been widely applied in modeling financial time series (e.g. [24] and [12]). A generalization of the component GARCH model of Ding and Granger is the weighted GARCH model proposed by Bauwens and Storti [5], in which the weights of the GARCH components are functions of lagged values of the conditional standard deviation or of squared past observations.

Another class is that of Markov switching models, obtained by merging a (G)ARCH model with a Markov process, where each state of the Markov model allows a different (G)ARCH behavior. These models were introduced by Cai [8] and Hamilton and Susmel [18]. This feature extends the dynamic formulation of the model and potentially enables improved volatility forecasts [1]. Gray [15], Klaassen [21], and Haas, Mittnik and Paolella [17] proposed different variants of Markov switching GARCH models; see also the further studies of Abramson and Cohen [1], Alexander and Lazar [2] and Bauwens et al. [6].

In this paper we consider a Markov switching model in which the volatility of each state is a convex combination of two GARCH components with time-varying weights that depend on the previous observation. This model has the potential to switch between several regimes with various volatilities and can model time series with shocks of different sizes. The structure of the model creates a dynamic behavior in each regime that reacts differently to different kinds of shocks: for example, in the high-volatility regime the model can respond differently to very high and high shocks, and in the low-volatility regime it can react differently to moderate and low shocks. We consider a different weight function for each state, which allows the volatility in each state to react differently to shocks of equal size. As using all past observations for forecasting could increase the complexity of the model, we reduce the volume of calculations by proposing a dynamic programming algorithm. We derive necessary and sufficient conditions for stability and obtain an upper bound for the limit of the second moment using the methods of Abramson and Cohen [1] and Medeiros and Veiga [25]. For the estimation of the parameters, we use Bayesian inference via Gibbs sampling. We compare the performance of our model with the Markov switching GARCH model and find that the MS-CGARCH model forecasts the conditional variance much better than the MS-GARCH model.

The paper is organized as follows: Section 2 introduces the Markov switching component GARCH model. Section 3 investigates the statistical properties of the model. Section 4 is devoted to the estimation of the parameters. Section 5 analyzes the efficiency of the proposed model through simulation and compares its forecast errors with those of the MS-GARCH model. The empirical applications and discussion are developed in Section 6. Section 7 concludes.

## 2 Markov Switching Component GARCH Model

The Markov switching component GARCH model, MS-CGARCH, for time series is defined as

$$y_t=\varepsilon_t\sqrt{H_{t,Z_t}}, \qquad (2.1)$$

where $\{\varepsilon_t\}$ are iid standard normal variables, $\{Z_t\}$ is an irreducible and aperiodic Markov chain on a finite state space $E=\{1,\cdots,K\}$ with transition probability matrix $P$, where $p_{ij}=p(Z_t=j|Z_{t-1}=i)$, and stationary probability measure $\Pi=(\pi_1,\cdots,\pi_K)$. Also, given $Z_t=j$, $H_{t,j}$ (the conditional variance in regime $j$) is driven by

$$H_{t,j}=w_{t,j}h_{1,t,j}+(1-w_{t,j})h_{2,t,j}, \qquad (2.2)$$

where

$$h_{1,t,j}=a_{0j}+a_{1j}y_{t-1}^2+a_{2j}H_{t-1,j}, \qquad (2.3)$$
$$h_{2,t,j}=b_{0j}+b_{1j}y_{t-1}^2+b_{2j}H_{t-1,j}, \qquad (2.4)$$

and each of the weights $w_{t,j}$ is a function of the past observation:

$$w_{t,j}=\frac{1-\exp(-\gamma_j|y_{t-1}|)}{1+\exp(-\gamma_j|y_{t-1}|)},\qquad \gamma_j>0, \qquad (2.5)$$

which is bounded, $0\le w_{t,j}<1$. The parameter $\gamma_j$ is called the slope parameter and governs the speed of transition from one component to the other: the higher $\gamma_j$, the faster the transition. $H_{t,j}$ in (2.2) is the conditional variance of state $j$ at time $t$, which is a combination of the conditional variances of both components in state $j$. Since $0\le w_{t,j}<1$, when the absolute value of $y_{t-1}$ increases, the impact of $h_{1,t,j}$ increases and consequently the effect of $h_{2,t,j}$ decreases, and vice versa. Another attractive feature of our model is that it overcomes the problem of path dependency, which is common in some kinds of MS-GARCH processes (path dependency happens when the volatility of each regime at time t depends on the entire sequence of past regimes because of the recursive property of GARCH processes).
If $w_{t,j}$ becomes a constant value (for example when $\gamma_j$ tends to zero or infinity), the MS-CGARCH model reduces to a MS-GARCH model. In the case of a single regime, our model generalizes the smooth transition GARCH model introduced by Lubrano [23].

It is assumed that $\varepsilon_t$ and $Z_t$ are independent. Sufficient conditions to guarantee a strictly positive conditional variance are that $a_{0j}$ and $b_{0j}$ be positive and the remaining coefficients nonnegative.
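To make the recursions concrete, the weight function (2.5) and the regime variance (2.2)-(2.4) can be sketched in Python; the function names and any parameter values passed to them are illustrative assumptions, not quantities from the paper:

```python
import math

def weight(y_prev, gamma):
    """Time-varying weight w_{t,j} of equation (2.5); bounded in [0, 1)."""
    e = math.exp(-gamma * abs(y_prev))
    return (1.0 - e) / (1.0 + e)

def cond_variance(y_prev, H_prev, a, b, gamma):
    """One step of the regime-j recursion (2.2)-(2.4), where a = (a0, a1, a2)
    and b = (b0, b1, b2) are the coefficients of the two GARCH components."""
    w = weight(y_prev, gamma)
    h1 = a[0] + a[1] * y_prev ** 2 + a[2] * H_prev   # first component (2.3)
    h2 = b[0] + b[1] * y_prev ** 2 + b[2] * H_prev   # second component (2.4)
    return w * h1 + (1.0 - w) * h2                   # convex combination (2.2)
```

A large $|y_{t-1}|$ pushes the weight toward 1, so the first component dominates after big shocks, while small shocks shift the mix toward the second component.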

Let $I_t=\{y_1,\cdots,y_t\}$ be the observation set up to time $t$. The conditional density function of $y_t$ given past observations is obtained as follows:

$$f(y_t|I_{t-1})=\sum_{j=1}^{K}f(y_t,Z_t=j|I_{t-1})=\sum_{j=1}^{K}p(Z_t=j|I_{t-1})\,f(y_t|I_{t-1},Z_t=j)=\sum_{j=1}^{K}\alpha_j^{(t)}\frac{1}{\sqrt{H_{t,j}}}\,\phi\!\left(\frac{y_t}{\sqrt{H_{t,j}}}\right), \qquad (2.6)$$

in which $\alpha_j^{(t)}=p(Z_t=j|I_{t-1})$ (obtained in the next section) and $\phi(\cdot)$ is the probability density function of the standard normal distribution.

## 3 Statistical Properties of the model

In this section, the statistical properties of the MS-CGARCH model are investigated and the conditional variance of the process is obtained. We show that the model, under some conditions on the coefficients and transition probabilities, is asymptotically stable in the second moment. An appropriate upper bound for the limiting value of the second moment is obtained.

### 3.1 Forecasting

The forecast volatility (conditional variance) of the MS-CGARCH model is given by

$$\mathrm{Var}(y_t|I_{t-1})=\sum_{j=1}^{K}\alpha_j^{(t)}H_{t,j}=\sum_{j=1}^{K}\alpha_j^{(t)}\big(w_{t,j}h_{1,t,j}+(1-w_{t,j})h_{2,t,j}\big). \qquad (3.7)$$

This relation shows that the conditional variance of this model is affected by the changes in states, the volatility of the components and the weight functions in each state.
At each time $t$, $\alpha_j^{(t)}$ (in equations (2.6) and (3.7)) can be obtained by a dynamic programming method based on the forward recursion algorithm proposed in Remark 3.1.

###### Remark 3.1

The value of $\alpha_j^{(t)}=p(Z_t=j|I_{t-1})$ is obtained recursively by

$$\alpha_j^{(t)}=\frac{\sum_{m=1}^{K}f(y_{t-1}|Z_{t-1}=m,I_{t-2})\,p(Z_{t-1}=m|I_{t-2})\,p_{m,j}}{\sum_{m=1}^{K}f(y_{t-1}|Z_{t-1}=m,I_{t-2})\,p(Z_{t-1}=m|I_{t-2})}. \qquad (3.8)$$
###### Proof 3.1

As the hidden variables have a Markov structure in the MS-CGARCH model,

$$\alpha_j^{(t)}=p(Z_t=j|I_{t-1})=\sum_{m=1}^{K}p(Z_t=j,Z_{t-1}=m|I_{t-1})=\sum_{m=1}^{K}p(Z_t=j|Z_{t-1}=m)\,p(Z_{t-1}=m|I_{t-1})=\frac{\sum_{m=1}^{K}f(I_{t-1},Z_{t-1}=m)\,p_{m,j}}{\sum_{m=1}^{K}f(I_{t-1},Z_{t-1}=m)}, \qquad (3.9)$$

where

$$f(y_{t-1}|Z_{t-1}=m,I_{t-2})=\frac{1}{\sqrt{H_{t-1,m}}}\,\phi\!\left(\frac{y_{t-1}}{\sqrt{H_{t-1,m}}}\right).$$
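A minimal Python sketch of the forward recursion (3.8), assuming Gaussian predictive densities as in (2.6); the helper names and the small two-state inputs are illustrative:

```python
import math

def normal_pdf(y, H):
    """Density of N(0, H) evaluated at y."""
    return math.exp(-y * y / (2.0 * H)) / math.sqrt(2.0 * math.pi * H)

def update_alpha(alpha_prev, y_prev, H_prev, P):
    """One step of recursion (3.8): alpha_prev[m] = p(Z_{t-1}=m | I_{t-2}),
    y_prev = y_{t-1}, H_prev[m] = H_{t-1,m}, P[m][j] = p_{m,j}.
    Returns alpha[j] = p(Z_t=j | I_{t-1})."""
    K = len(alpha_prev)
    # f(y_{t-1} | Z_{t-1}=m, I_{t-2}) * p(Z_{t-1}=m | I_{t-2}) for each m
    lik = [normal_pdf(y_prev, H_prev[m]) * alpha_prev[m] for m in range(K)]
    total = sum(lik)
    return [sum(lik[m] * P[m][j] for m in range(K)) / total for j in range(K)]
```

Because each update uses only the previous filter probabilities, a path of length $T$ is processed in $O(TK^2)$ operations instead of enumerating all $K^T$ regime sequences.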

### 3.2 Stability

In this subsection, we investigate the stability of the second moment of the MS-CGARCH model; in particular, we look for an upper bound for it. The second moment of the model can be written as

$$E(y_t^2)=\sum_{z_t=1}^{K}\pi_{z_t}E_{t-1}(H_{t,Z_t}|z_t). \qquad (3.10)$$

$E_{t-1}(\cdot)$ denotes the expectation with respect to the information up to time $t-1$. Also, for brevity, we write $z_t$ for the event $Z_t=z_t$ and $H_{t,m}$ for the conditional variance in state $m$, where $z_t$ is the realization of the state at time $t$. We investigate the conditional variance given the chain state, $z_t=m$, as follows:

$$\begin{aligned}E_{t-1}(H_{t,m}|z_t)&=E_{t-1}[w_{t,m}(a_{0m}+a_{1m}y_{t-1}^2+a_{2m}H_{t-1,m})|z_t]+E_{t-1}[(1-w_{t,m})(b_{0m}+b_{1m}y_{t-1}^2+b_{2m}H_{t-1,m})|z_t]\\&=b_{0m}+\underbrace{b_{1m}E_{t-1}[y_{t-1}^2|z_t]}_{\mathrm{I}}+\underbrace{(a_{0m}-b_{0m})E_{t-1}[w_{t,m}|z_t]}_{\mathrm{II}}+\underbrace{b_{2m}E_{t-1}(H_{t-1,m}|z_t)}_{\mathrm{III}}\\&\quad+\underbrace{(a_{1m}-b_{1m})E_{t-1}[w_{t,m}y_{t-1}^2|z_t]}_{\mathrm{IV}}+\underbrace{(a_{2m}-b_{2m})E_{t-1}(w_{t,m}H_{t-1,m}|z_t)}_{\mathrm{V}}.\end{aligned} \qquad (3.11)$$

The term (I) in (3.11) can be expanded as follows:

$$E_{t-1}[y_{t-1}^2|z_t]=\sum_{z_{t-1}=1}^{K}\int_{S_{I_{t-1}}}y_{t-1}^2\,p(I_{t-1}|z_t,z_{t-1})\,p(z_{t-1}|z_t)\,dI_{t-1}=\sum_{z_{t-1}=1}^{K}p(z_{t-1}|z_t)\,E_{t-1}[y_{t-1}^2|z_{t-1},z_t], \qquad (3.12)$$

where $S_{I_{t-1}}$ is the support of $I_{t-1}$. Since the expected value of $y_{t-1}^2$ is independent of any future state,

$$E_{t-1}[y_{t-1}^2|z_{t-1},z_t]=E_{t-1}[y_{t-1}^2|z_{t-1}]. \qquad (3.13)$$

Also using the tower property of the conditional expectation, [see Grimmett and Stirzaker (2001, p. 69)], we have

$$E_{t-1}[y_{t-1}^2|z_{t-1}]=E_{t-2}[E_{t-1}(y_{t-1}^2|I_{t-2},z_{t-1})|z_{t-1}]=E_{t-2}[H_{t-1,Z_{t-1}}|z_{t-1}]. \qquad (3.14)$$

The terms (II), (IV) and (V) cannot easily be evaluated in closed form; for this reason we derive upper bounds for them.

Upper bound to II. As $0\le w_{t,m}<1$, an upper bound for the term (II) in (3.11) is obtained by

$$(a_{0m}-b_{0m})E_{t-1}[w_{t,m}|z_t]\le|a_{0m}-b_{0m}|<\infty. \qquad (3.15)$$

The term (III) can be specified as

$$b_{2m}E_{t-1}(H_{t-1,m}|z_t)=b_{2m}\int_{S_{I_{t-1}}}H_{t-1,m}\,p(I_{t-1}|z_t)\,dI_{t-1}=b_{2m}\sum_{z_{t-1}=1}^{K}p(z_{t-1}|z_t)\,E_{t-2}(H_{t-1,m}|z_{t-1}). \qquad (3.16)$$

Upper bound to IV. Let $M>0$ be a constant, so

$$E_{t-1}[w_{t,z_t}y_{t-1}^2|z_t]=E_{t-1}[w_{t,z_t}y_{t-1}^2\,\mathbb{1}_{\{|y_{t-1}|<M\}}|z_t]+E_{t-1}[w_{t,z_t}y_{t-1}^2\,\mathbb{1}_{\{|y_{t-1}|\ge M\}}|z_t],$$

in which $\mathbb{1}_{A}$ denotes the indicator function of the event $A$. As $0\le w_{t,z_t}<1$ by (2.5), and $y_{t-1}^2\,\mathbb{1}_{\{|y_{t-1}|<M\}}\le M^2$,

$$E_{t-1}[w_{t,z_t}y_{t-1}^2|z_t]\le M^2+E_{t-1}[w_{t,z_t}y_{t-1}^2\,\mathbb{1}_{\{|y_{t-1}|\ge M\}}|z_t],$$

also

$$E_{t-1}[w_{t,z_t}y_{t-1}^2\,\mathbb{1}_{\{|y_{t-1}|\ge M\}}|z_t]=\int_{S_{I_{t-2}},\,y_{t-1}\le-M}y_{t-1}^2\,w_{t,z_t}\,p(I_{t-1}|z_t)\,dI_{t-1}+\int_{S_{I_{t-2}},\,y_{t-1}\ge M}y_{t-1}^2\,w_{t,z_t}\,p(I_{t-1}|z_t)\,dI_{t-1}.$$

By (2.5),

$$\lim_{y_{t-1}\to+\infty}w_{t,z_t}=1,\qquad\lim_{y_{t-1}\to-\infty}w_{t,z_t}=1, \qquad (3.17)$$

therefore, according to the definition of the limit at infinity, for a small number $\delta>0$ there exists a finite constant $M$ such that $w_{t,z_t}\le\delta+1$ whenever $|y_{t-1}|\ge M$. Hence

$$E_{t-1}[w_{t,z_t}y_{t-1}^2\,\mathbb{1}_{\{|y_{t-1}|\ge M\}}|z_t]\le(\delta+1)\int_{S_{I_{t-2}},\,y_{t-1}\le-M}y_{t-1}^2\,p(I_{t-1}|z_t)\,dI_{t-1}+(\delta+1)\int_{S_{I_{t-2}},\,y_{t-1}\ge M}y_{t-1}^2\,p(I_{t-1}|z_t)\,dI_{t-1}.$$

Since the distribution of $y_{t-1}$ is symmetric,

$$(\delta+1)\int_{S_{I_{t-2}},\,y_{t-1}\le-M}y_{t-1}^2\,p(I_{t-1}|z_t)\,dI_{t-1}\le(\delta+1)\int_{S_{I_{t-2}},\,-\infty<y_{t-1}<0}y_{t-1}^2\,p(I_{t-1}|z_t)\,dI_{t-1}$$

and

$$(\delta+1)\int_{S_{I_{t-2}},\,y_{t-1}\ge M}y_{t-1}^2\,p(I_{t-1}|z_t)\,dI_{t-1}\le(\delta+1)\int_{S_{I_{t-2}},\,0\le y_{t-1}<\infty}y_{t-1}^2\,p(I_{t-1}|z_t)\,dI_{t-1}.$$

Therefore

$$(a_{1m}-b_{1m})E_{t-1}[w_{t,z_t}y_{t-1}^2|z_t]\le|a_{1m}-b_{1m}|\big(M^2+(\delta+1)E_{t-1}[y_{t-1}^2|z_t]\big).$$

Upper bound to V. Since $0\le w_{t,m}<1$,

$$(a_{2m}-b_{2m})E_{t-1}(w_{t,m}H_{t-1,m}|z_t)\le|a_{2m}-b_{2m}|\,E_{t-1}(H_{t-1,m}|z_t). \qquad (3.18)$$

By substituting the obtained upper bounds and relations (3.12)-(3.14) into (3.11), the upper bound for $E_{t-1}(H_{t,m}|z_t)$ is acquired as

$$E_{t-1}(H_{t,m}|z_t)\le a_{0m}+|a_{1m}-b_{1m}|M^2+\sum_{z_{t-1}=1}^{K}\big[b_{1m}+|a_{1m}-b_{1m}|(\delta+1)\big]p(z_{t-1}|z_t)\,E_{t-2}[H_{t-1,Z_{t-1}}|z_{t-1}]+\sum_{z_{t-1}=1}^{K}a_{2m}\,p(z_{t-1}|z_t)\,E_{t-2}[H_{t-1,m}|z_{t-1}], \qquad (3.19)$$

in which, by Bayes' rule,

$$p(z_{t-1}|z_t)=\frac{\pi_{z_{t-1}}}{\pi_{z_t}}\,p_{z_{t-1}z_t},$$

where $P=\{p_{z_{t-1}z_t}\}$ is the transition probability matrix. Let

$$\Omega=\big[a_{01}+|a_{11}-b_{11}|M^2,\;\cdots,\;a_{0K}+|a_{1K}-b_{1K}|M^2\big]', \qquad (3.20)$$

be a vector with $K$ components, and let $C$ denote a $K$-by-$K$ block matrix

$$C=\begin{pmatrix}C_{11}&C_{21}&\cdots&C_{K1}\\C_{12}&C_{22}&\cdots&C_{K2}\\\vdots&&&\vdots\\C_{1K}&C_{2K}&\cdots&C_{KK}\end{pmatrix} \qquad (3.21)$$

with each block given by

$$C_{jk}=p(Z_{t-1}=j|Z_t=k)\big(u\,e_j'+v\big),\qquad j,k=1,\cdots,K, \qquad (3.22)$$

where $e_j$ is a $K$-by-1 vector of all zeros except its $j$th element, which is one, $v$ is a diagonal $K$-by-$K$ matrix, and $u$ and the diagonal elements of $v$ are built from the coefficients $b_{1m}+|a_{1m}-b_{1m}|(\delta+1)$ and $a_{2m}$ appearing in (3.19).
Let $A_{t,m}$, $m=1,\cdots,K$, be the $K$-by-1 vector of the conditional expectations $E_{t-1}(H_{t,m}|z_t)$, and let $A_t$ be the vector made of the $K$ vectors $A_{t,m}$.

Hence, by (3.20)-(3.22), we have the following recursive vector inequality for $A_t$:

$$A_t\le\dot{\Omega}+CA_{t-1},\qquad t\ge0, \qquad (3.23)$$

with some initial condition $A_0$, where $\dot{\Omega}$ is the stacked version of $\Omega$ conformable with $A_t$.
Let $\rho(A)$ denote the spectral radius of a matrix $A$; then we have the following theorem for the stationarity condition of the MS-CGARCH model.

###### Theorem 3.1

Let $\{y_t\}$ follow the MS-CGARCH model defined by (2.1)-(2.5). The process is asymptotically stable in variance, and $\lim_{t\to\infty}E(y_t^2)<\infty$, if and only if $\rho(C)<1$.

###### Proof 3.2

Following [1], by the recursive inequality (3.23),

$$A_t\le\sum_{i=0}^{t-1}C^i\dot{\Omega}+C^tA_0:=B_t. \qquad (3.24)$$

By the matrix convergence theorem [22], a necessary and sufficient condition for the convergence of $B_t$ is $\rho(C)<1$ (the value of $\delta$ can be considered small enough to be negligible). Under this condition, $C^tA_0$ converges to zero as $t$ goes to infinity and $\sum_{i=0}^{t-1}C^i$ converges to $(I-C)^{-1}$ provided that the matrix $I-C$ is invertible. So if $\rho(C)<1$,

$$\lim_{t\to\infty}A_t\le(I-C)^{-1}\dot{\Omega}.$$

By (3.10), the upper bound for the asymptotic behavior of the unconditional variance is given by

$$\lim_{t\to\infty}E(y_t^2)\le\Pi'(I-C)^{-1}\dot{\Omega}.$$

If $\rho(C)\ge1$, $B_t$ grows without bound as $t$ increases, and no upper bound for the unconditional second moment of the model can be found.
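Assuming the matrix $C$ of (3.21)-(3.22) and the stacked vectors $\dot{\Omega}$ and $\Pi$ have already been assembled, the condition of Theorem 3.1 and the second-moment bound can be checked numerically; this is a minimal sketch with an assumed function name:

```python
import numpy as np

def stability_check(C, Omega_dot, Pi):
    """Return (rho(C), bound), where bound = Pi' (I - C)^{-1} Omega_dot
    when rho(C) < 1 (Theorem 3.1), and None when the condition fails."""
    rho = float(max(abs(np.linalg.eigvals(C))))
    if rho >= 1.0:
        return rho, None  # second moment diverges, no finite bound
    bound = Pi @ np.linalg.solve(np.eye(len(C)) - C, Omega_dot)
    return rho, float(bound)
```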

## 4 Estimation

In this section we describe the estimation of the parameters of the MS-CGARCH model. We use a Bayesian MCMC method based on the Gibbs algorithm, following the methods for sampling a hidden Markov process ([9] and [20]) and for the MS-GARCH and weighted GARCH models ([5] and [6]).

Let $Y=(y_1,\cdots,y_T)$ and $Z=(z_1,\cdots,z_T)$. For the case of two states, the transition probabilities are $\eta=(\eta_{11},\eta_{22})$ and the parameters of the model are $\theta=(\theta_1,\theta_2)$, where $\theta_j=(a_{0j},a_{1j},a_{2j},b_{0j},b_{1j},b_{2j},\gamma_j)$ for $j=1,2$.

The purpose of Bayesian inference is to simulate from the distributions of the parameters and the state variables given the observations. The posterior density of our model is:

$$p(\theta,\eta,Z|Y)\propto p(\theta,\eta)\,p(Z|\theta,\eta)\,f(Y|\theta,\eta,Z), \qquad (4.25)$$

in which $p(\theta,\eta)$ is the prior of the parameters. The conditional probability mass function of $Z$ given the parameters is independent of $\theta$, so

$$p(Z|\theta,\eta)=p(Z|\eta_{11},\eta_{22})=\prod_{t=1}^{T-1}p(z_{t+1}|z_t,\eta_{11},\eta_{22})=\eta_{11}^{n_{11}}(1-\eta_{11})^{n_{12}}\,\eta_{22}^{n_{22}}(1-\eta_{22})^{n_{21}}, \qquad (4.26)$$

where $n_{ij}$ denotes the number of transitions from state $i$ to state $j$ in $Z$. The conditional density function of $Y$ given the realization of $Z$ and the parameters is factorized in the following way:

$$f(Y|\eta,\theta,Z)=\prod_{t=1}^{T}f(y_t|\theta,z_t=k,Y_{t-1}),\qquad k=1,2, \qquad (4.27)$$

where the one-step-ahead predictive densities are:

$$f(y_t|\theta,z_t=k,Y_{t-1})=\frac{1}{\sqrt{2\pi H_{t,k}}}\exp\!\left(-\frac{y_t^2}{2H_{t,k}}\right). \qquad (4.28)$$

Since the posterior density (4.25) is not standard, we cannot sample from it in a straightforward manner. The Gibbs sampler of Gelfand and Smith [13] is an iterative algorithm that samples consecutively from the posterior distribution. Under regularity conditions, the simulated distribution converges to the posterior distribution (see e.g. Robert and Casella [26]). The blocks of parameters are $\theta$, $\eta$ and the realizations of $Z$.
A brief description of the Gibbs algorithm: let the superscript $(r)$ on $Z$, $\eta$ and $\theta$ denote the draws at the $r$-th iteration of the algorithm. Each iteration of the algorithm consists of three steps:
(i) Drawing a random sample $Z^{(r)}$ of the state variables given $\eta^{(r-1)}$ and $\theta^{(r-1)}$.
(ii) Drawing a random sample $\eta^{(r)}$ of the transition probabilities given $Z^{(r)}$.
(iii) Drawing a random sample $\theta^{(r)}$ given $Z^{(r)}$ and $\eta^{(r)}$.

These steps are repeated until convergence is obtained. In what follows, the sampling of each block is explained.

### 4.1 Sampling zt

The purpose of this step is to obtain a sample of $Z$; this is performed as in Chib [9] (see also [20]). Supposing that the initial distribution of the chain is its stationary distribution, the filter probabilities satisfy

$$p(z_t|\eta,\theta,Y_t)\propto f(y_t|\theta,z_t=k,Y_{t-1})\,p(z_t|\eta,\theta,Y_{t-1}), \qquad (4.29)$$

where the predictive density is calculated by relation (4.28) and $p(z_t|\eta,\theta,Y_{t-1})$ is given by the law of total probability:

$$p(z_t|\eta,\theta,Y_{t-1})=\sum_{z_{t-1}=1}^{K}p(z_{t-1}|\eta,\theta,Y_{t-1})\,\eta_{z_{t-1}z_t}. \qquad (4.30)$$

Given the filter probabilities $p(z_t|\eta,\theta,Y_t)$, $t=1,\cdots,T$, we run a backward algorithm, starting from $z_T$, which is drawn from $p(z_T|\eta,\theta,Y_T)$. For $t=T-1,\cdots,1$, the sample $z_t$ is drawn from $p(z_t|z_{t+1},\cdots,z_T,\theta,\eta,Y)$, which is obtained by

$$p(z_t|z_{t+1},\cdots,z_T,\theta,\eta,Y)\propto p(z_t|\eta,\theta,Y_t)\,\eta_{z_t,z_{t+1}}.$$

Drawing $z_t$ from $p(z_t|z_{t+1},\cdots,z_T,\theta,\eta,Y)$ is done by sampling from the conditional probabilities which, for two states for example, are given by

$$p(Z_t=1|Z_t\ge1,\cdot)=\frac{p_1}{\sum_{l=1}^{2}p_l},$$

where $p_l$ denotes the unnormalized probability of state $l$. After generating a uniform $(0,1)$ number $u$, if $u\le p(Z_t=1|Z_t\ge1,\cdot)$ then $z_t=1$, otherwise $z_t=2$.
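The backward pass above can be sketched as follows (states coded 0,…,K−1; the filter probabilities are assumed to have been computed from (4.29)-(4.30), and the function name is illustrative):

```python
import random

def backward_sample(filter_probs, eta, rng=random):
    """Backward sampling of the state path z_1, ..., z_T.
    filter_probs[t][k] = p(z_t=k | eta, theta, Y_t); eta[k][l] = eta_{k,l}."""
    T, K = len(filter_probs), len(filter_probs[0])
    z = [0] * T
    # draw z_T from p(z_T | eta, theta, Y_T)
    z[T - 1] = rng.choices(range(K), weights=filter_probs[T - 1])[0]
    # for t = T-1, ..., 1: p(z_t | z_{t+1}, .) ∝ p(z_t | Y_t) * eta_{z_t, z_{t+1}}
    for t in range(T - 2, -1, -1):
        w = [filter_probs[t][k] * eta[k][z[t + 1]] for k in range(K)]
        z[t] = rng.choices(range(K), weights=w)[0]
    return z
```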

### 4.2 Sampling η

This stage is devoted to sampling from the posterior probability $p(\eta|Z)$, which is independent of $\theta$ and $Y$. We consider independent Beta prior densities for $\eta_{11}$ and $\eta_{22}$. For example,

$$p(\eta_{11}|Z)\propto p(\eta_{11})\,p(Z|\eta_{11})=\eta_{11}^{c_{11}+n_{11}-1}(1-\eta_{11})^{c_{12}+n_{12}-1},$$

where $c_{11}$ and $c_{12}$ are the parameters of the Beta prior and $n_{1j}$ is the number of transitions from state 1 to state $j$. In the same way the sample of $\eta_{22}$ is obtained.
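A sketch of this draw for a two-state path (states coded 1 and 2); the default prior parameters below are hypothetical choices, not values used in the paper:

```python
import random

def sample_eta11(z, c11=2.0, c12=2.0, rng=random):
    """Draw eta_11 from its Beta posterior Beta(c11 + n11, c12 + n12),
    where n11, n12 count the transitions 1 -> 1 and 1 -> 2 in the path z."""
    n11 = sum(1 for a, b in zip(z, z[1:]) if a == 1 and b == 1)
    n12 = sum(1 for a, b in zip(z, z[1:]) if a == 1 and b == 2)
    return rng.betavariate(c11 + n11, c12 + n12)
```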

### 4.3 Sampling θ

The posterior density of $\theta$ given the prior $p(\theta)$ is:

$$p(\theta|Y,Z,\eta)\propto p(\theta)\prod_{t=1}^{T}f(y_t|\theta,z_t=k,Y_{t-1})=p(\theta)\prod_{t=1}^{T}\frac{1}{\sqrt{2\pi H_{t,k}}}\exp\!\left(-\frac{y_t^2}{2H_{t,k}}\right), \qquad (4.31)$$

which is independent of $\eta$. The conditional distribution of each element of $\theta$ does not have a closed form ($H_{t,k}$ depends nonlinearly on that element through the recursions (2.2)-(2.4), so the conditional posterior cannot be a normal density), and direct Gibbs sampling in this situation may be complicated. The Griddy Gibbs algorithm, introduced by Ritter and Tanner (1992), is a solution to this problem and has been widely applied in the literature (for example [4], [5] and [6]).

Given the samples at iteration $r$, the Griddy Gibbs at iteration $r+1$ proceeds as follows (illustrated for the element $a_{0i}$ of $\theta$):

1. Select a grid of points $a_{0i}^1,\cdots,a_{0i}^G$. Using (4.31), evaluate the conditional posterior density function $\kappa(a_{0i}|\theta_{-a_{0i}}^{(r)},Z^{(r)},Y)$ over the grid points to obtain the vector $(\kappa_1,\cdots,\kappa_G)$.
2. By a deterministic integration rule using the $G$ points, compute $\Phi_j$ with

$$\Phi_j=\int_{a_{0i}^1}^{a_{0i}^j}\kappa(a_{0i}|\theta_{-a_{0i}}^{(r)},Z^{(r)},Y)\,da_{0i},\qquad j=2,\cdots,G. \qquad (4.32)$$

3. Simulate $u\sim U(0,\Phi_G)$ and invert $\Phi$ by numerical interpolation to obtain a sample $a_{0i}^{(r+1)}$ from the conditional posterior.
4. Repeat steps 1-3 for the other parameters.
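Steps 1-3 can be sketched with a trapezoidal integration rule; the kernel argument is assumed to evaluate the conditional posterior (4.31) as a function of the single parameter being updated, and the function name is illustrative:

```python
import bisect
import random

def griddy_gibbs_draw(kernel, grid, rng=random):
    """One Griddy Gibbs draw: evaluate the kernel on the grid (step 1),
    integrate it with the trapezoidal rule to build the CDF values Phi_j
    (step 2), then invert the CDF by linear interpolation (step 3)."""
    k = [kernel(x) for x in grid]
    Phi = [0.0]
    for i in range(1, len(grid)):
        Phi.append(Phi[-1] + 0.5 * (k[i] + k[i - 1]) * (grid[i] - grid[i - 1]))
    u = rng.uniform(0.0, Phi[-1])
    j = bisect.bisect_left(Phi, u)
    if j == 0:
        return grid[0]
    frac = (u - Phi[j - 1]) / (Phi[j] - Phi[j - 1])
    return grid[j - 1] + frac * (grid[j] - grid[j - 1])
```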

For the prior densities of all elements of $\theta$, independent uniform densities over finite intervals can be considered.

## 5 Simulation Results

In this section we provide simulation results for the MS-CGARCH model defined by equations (2.1)-(2.5) with two states. We simulate 300 observations from the following MS-CGARCH model:

$$y_t=\varepsilon_t\sqrt{H_{t,Z_t}}, \qquad (5.33)$$

where $\{\varepsilon_t\}$ is an iid sequence of standard normal variables and $\{Z_t\}$ is a Markov chain on a two-element state space with transition probability matrix

$$P=\begin{pmatrix}.85&.15\\.05&.95\end{pmatrix},$$

and

$$H_{t,1}=\frac{1-\exp(-2|y_{t-1}|)}{1+\exp(-2|y_{t-1}|)}\big(2.2+.75y_{t-1}^2+.15H_{t-1,1}\big)+\left[1-\frac{1-\exp(-2|y_{t-1}|)}{1+\exp(-2|y_{t-1}|)}\right]\big(.7+.3y_{t-1}^2+.2H_{t-1,1}\big),$$
$$H_{t,2}=\frac{1-\exp(-.5|y_{t-1}|)}{1+\exp(-.5|y_{t-1}|)}\big(.4+.15y_{t-1}^2+.1H_{t-1,2}\big)+\left[1-\frac{1-\exp(-.5|y_{t-1}|)}{1+\exp(-.5|y_{t-1}|)}\right]\big(.2+.1y_{t-1}^2+.2H_{t-1,2}\big). \qquad (5.34)$$

The first state implies a higher conditional variance than the second one, and within each state the first component has higher volatility than the second component.
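The design (5.33)-(5.34) can be simulated with a short script; the seed, the initial state and the initial variances are arbitrary choices not specified in the paper:

```python
import math
import random

def simulate_ms_cgarch(T, seed=0):
    """Simulate T observations from the two-state MS-CGARCH of (5.33)-(5.34)."""
    rng = random.Random(seed)
    P = [[0.85, 0.15], [0.05, 0.95]]            # transition matrix
    gamma = [2.0, 0.5]                          # slope parameters
    a = [(2.2, 0.75, 0.15), (0.4, 0.15, 0.1)]   # first components per state
    b = [(0.7, 0.3, 0.2), (0.2, 0.1, 0.2)]      # second components per state
    z, y_prev, H = 0, 0.0, [1.0, 1.0]
    ys = []
    for _ in range(T):
        H_new = []
        for j in range(2):                      # update both regime variances
            e = math.exp(-gamma[j] * abs(y_prev))
            w = (1.0 - e) / (1.0 + e)
            h1 = a[j][0] + a[j][1] * y_prev ** 2 + a[j][2] * H[j]
            h2 = b[j][0] + b[j][1] * y_prev ** 2 + b[j][2] * H[j]
            H_new.append(w * h1 + (1.0 - w) * h2)
        H = H_new
        z = 0 if rng.random() < P[z][0] else 1  # Markov transition
        y_prev = rng.gauss(0.0, 1.0) * math.sqrt(H[z])
        ys.append(y_prev)
    return ys
```

Note that both regime variances are updated at every step, which is exactly the property that keeps the model free of path dependency.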

Table 1 reports summary statistics for the simulated data, and Figure 1 shows the plot of the simulated time series.

Using Bayesian inference, we estimate the parameters of the MS-CGARCH model. The prior density of each parameter is assumed to be uniform over a finite interval (except for $\eta_{11}$ and $\eta_{22}$, which are drawn from Beta distributions). Table 2 demonstrates the performance of the estimation method; the results show that the standard deviations are small in most cases.

To clarify the performance of the MS-CGARCH model relative to the MS-GARCH model, we compare the forecast volatility $\mathrm{Var}(y_t|I_{t-1})$ of each model with the squared observations. Figure 2 shows that the forecast volatility of the MS-CGARCH model tracks the squared observations much more closely than that of the MS-GARCH model, and the absolute forecast error (the difference between the forecast volatility and the squared observations) of our model is usually smaller. The root mean squared errors of the MS-GARCH and MS-CGARCH models are 0.738 and 0.483, respectively, and their mean absolute errors are 0.510 and 0.3804.
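The two loss measures quoted above can be computed as follows; this sketch assumes aligned lists of one-step volatility forecasts and observed returns:

```python
import math

def forecast_errors(vol_forecasts, returns):
    """RMSE and MAE of volatility forecasts against squared observations."""
    errs = [v - y * y for v, y in zip(vol_forecasts, returns)]
    rmse = math.sqrt(sum(e * e for e in errs) / len(errs))
    mae = sum(abs(e) for e in errs) / len(errs)
    return rmse, mae
```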

## 6 Empirical Applications

We use the daily Dow Jones Industrial Average (DJIA) stock market index from 07/10/2009 to 14/12/2010 (300 observations) and from 12/12/2006 to 22/02/2008 (300 observations) for estimation. Figure 3 shows the stock market index and the percentage returns.