Modeling Financial Volatility in the Presence of Abrupt Changes
Abstract
The volatility of financial instruments is rarely constant and usually varies over time. This creates a phenomenon called volatility clustering, where large price movements on one day are followed by similarly large movements on successive days, creating temporal clusters. The GARCH model, which treats volatility as a drift process, is commonly used to capture this behavior. However, research suggests that volatility is often better described by a structural break model, where the volatility undergoes abrupt jumps in addition to drift. Most efforts to integrate these jumps into the GARCH methodology have resulted in models that are either very computationally demanding or make problematic assumptions about the distribution of the instruments, often assuming that they are Gaussian. We present a new approach which uses ideas from nonparametric statistics to identify structural break points without making such distributional assumptions, and then models drift separately within each identified regime. Using our method, we investigate the volatility of several major stock indexes, and find that our approach can potentially give an improved fit compared to more commonly used techniques.
1 Introduction
Volatility clustering is often observed in the return series of financial instruments Oh2008 (); Tsenga2011 (). This phenomenon is best illustrated by an example. Let $P_t$ denote the price of some financial instrument at a set of equally spaced discrete time points $t$, and let the return series be the log-increments $r_t = \log P_t - \log P_{t-1}$. The volatility of the instrument is defined as the standard deviation of these returns. A typical example of a financial return series can be seen in Figure 1, which shows the daily returns of the Dow Jones stock index over the period from January 1991 to August 2011. It can be observed that the standard deviation is not constant, but instead varies over time. In particular, note that the period from 2003 to 2007 seems to have noticeably lower volatility than the periods immediately before or after. Similarly, there are stretches in which many extreme return values occur in close succession, pointing to abnormally high volatility during those periods.
Volatility clustering refers to this notion that large/small returns tend to be followed by similarly large/small values, which results in extended regimes of abnormally high or low volatility. This has been empirically observed in many different financial time series, and poses a problem for traditional financial models, which have typically assumed that the volatility is roughly constant over time. The last 25 years have seen an increasing number of attempts to model the time-varying nature of volatility, and the generalized autoregressive conditional heteroskedasticity (GARCH) model Bollerslev1986 (), along with its many variants, is now the de facto standard. The idea behind GARCH is that the volatility undergoes a stochastic drift process, where the conditional volatility at time $t$ is a random variable, with a conditional distribution which depends on the long term volatility, the volatility during the most recent period, and the most recent values of the return series.
However, the gradual drift process underlying the GARCH model seems to be empirically violated in many real financial series. In some cases, volatility seems to behave more like a jump process, where it fluctuates around some value for an extended period of time, before undergoing an abrupt change, after which it fluctuates around a new value. This can be seen in Figure 1, where the volatility spontaneously increases for a period of several years before dropping to a lower value. Since the standard GARCH model does not contain the possibility of these sudden jumps, it tends to overestimate the degree of long term volatility persistence. This has prompted the development of regime-switching GARCH processes which can incorporate jumps Gray1996 (); He2010 (). In these models, the return series is allowed to contain multiple change points which segment it into regimes, with the GARCH model having different parameters within each segment.
However, such models can be hard to estimate. Although there are computationally efficient procedures for estimating multiple change points in simpler ARCH models Hamilton1994 (); Kokoszka2000 (); Lee2004 (), the long-range dependence introduced by the GARCH formulation makes such approaches difficult to apply. Standard techniques for fitting multiple change point models to data assume independence between segments Fearnhead2006 (), which is not the case in the GARCH framework. Although there have been some recent attempts to fit such models He2010 (), it remains a difficult numerical procedure. Therefore, the most popular strategy is to instead use the approximate procedure introduced by Aggarwal1999 (), where the model is fitted in stages: the abrupt change points are first located using the iterated cumulative sum of squares (ICSS) algorithm Inclan1994 (), and a GARCH model is then estimated conditional on these change points. This ICSS-GARCH algorithm has been used to study a wide variety of financial time series. For example, Covarrubias2006 () uses it to study the volatility of the US dollar exchange rate against several different currencies, Malik2005 () studies the returns of the Canadian stock exchange, Kang2009 () does likewise for the Japanese and Korean exchanges, and Kang2011 () analyses the market for crude oil.
Although ICSS-GARCH is simple to implement and has been shown to give improved results compared to standard GARCH models, it is not without its problems Sanso2004 (); Rapach2008 (). The parameters of the ICSS algorithm are usually designed under the assumption that the financial returns follow a Gaussian distribution, and it can produce many spurious jump points if this assumption is violated. We will show later that applying ICSS to heavy-tailed series can give poor results, since extreme observations are misinterpreted as being regime shifts. Unfortunately, it is now well established that financial data is very rarely Gaussian, and return series typically exhibit heavy-tailed behaviour Stanley2008 (); Liu1999 (); Podobnik2009 (); Plerou1999 (); Gopikrishnan1999 (). Similar heavy-tailed behaviour has been observed in many financial series and is not limited to asset returns Podobnik2011 ().
This limitation of the ICSS-GARCH methodology has meant that it is usually only used to detect change points in the weekly returns of financial instruments, i.e. where successive price observations are one week apart Fernandez2007 (); Kang2009 (); Kang2011 (); Malik2005 (); Aggarwal1999 (). Using the algorithm on daily returns can generate too many false positives for it to be useful, due to the number of extreme values. This is a problem since the daily returns are more fine-grained and hence using them should allow more accurate volatility modelling. Therefore, it is desirable to find a way to use this data when it is available.
In this paper we present an alternative to the ICSS-GARCH algorithm which is better suited for dealing with the heavy-tailed, non-Gaussian data which is typical in finance. We replace the ICSS segmentation step of ICSS-GARCH with an alternative technique based on nonparametric statistics, which does not make any assumptions about the true returns distribution. This allows it to entirely avoid the Gaussianity assumption and to be deployed on daily returns. Our approach is based on the nonparametric change point model framework described in Rosstechnometrics (); Rossjqt (), and we hence refer to it as NPCPM-GARCH. Using this technique, we analyse several stock indexes for volatility change points, specifically focusing on the Dow Jones Industrial Average, the German DAX, the VIX volatility index, and the Japanese Nikkei 225. We compare our results with those of ICSS-GARCH, and find that our method generally gives a better fit to the data when measured using standard criteria. This suggests that it could be a widely useful tool for modelling volatility in other contexts.
The remainder of the paper proceeds as follows. We begin in Section 2 by describing the ICSS step of the ICSS-GARCH algorithm. We explain why it gives poor performance when used with heavy-tailed data, and give a simulated example using Student-t data to show this. In Section 3 we introduce our new nonparametric approach. We then briefly review the GARCH stage of the algorithm in Section 3.3, and in Section 4 we present an empirical evaluation of our method on a range of stock index series.
2 The ICSS-GARCH Algorithm
The ICSS-GARCH methodology has two stages. Given a financial returns series, the ICSS algorithm is first used to detect any change points in the volatility, and the series is then segmented around these points. Then, a separate GARCH model is fitted to each segment. We begin by giving an overview of the ICSS stage, before pointing out its problems and introducing our alternative. Next, we review the GARCH estimation stage.
2.1 Stage 1: Change Point Detection using ICSS
The Iterated Cumulative Sum of Squares algorithm is based on the work of Inclan1994 (), who proposed a retrospective technique for detecting changes in the variance of a financial time series. Given a series of financial returns $r_1, \ldots, r_n$ with mean $0$, define the cumulative sum of squares as $C_k = \sum_{t=1}^{k} r_t^2$, and let
$$D_k = \frac{C_k}{C_n} - \frac{k}{n}, \qquad k = 1, \ldots, n.$$
If the sequence has constant variance, then the value of $D_k$ will oscillate around 0. However, if the variance undergoes an abrupt change at some point, then the value of $D_k$ will exhibit extreme behaviour around this point, with its magnitude becoming unusually large. Change detection is carried out by defining a threshold $\gamma$ and comparing the maximum value of $|D_k|$ to it. Specifically, a change is flagged if:
$$\max_{k} \sqrt{n/2}\, |D_k| > \gamma, \qquad (1)$$
where the factor $\sqrt{n/2}$ is included for standardization purposes. If the threshold is exceeded then the estimate of the change point, which we denote $\hat{\tau}$, is located at the value of $k$ which gave the maximum value of $|D_k|$, i.e. $\hat{\tau} = \arg\max_k |D_k|$.
In cases where the series may contain multiple change points, the above procedure can be iterated. The ICSS algorithm is first applied to the full series. If a change is flagged, the series is split into two segments, A and B, around the estimated change point. The ICSS algorithm is then recursively applied to segments A and B separately, in the same manner as before. If a change point is flagged in segment A, then segment A is further subdivided into two segments around its estimated location, and the ICSS algorithm is applied to these new segments, and so on. The same procedure is likewise applied to segment B. This produces a sequence of estimated change points.
Deploying the ICSS algorithm requires specifying the threshold $\gamma$. In the original paper Inclan1994 (), this is chosen in order to control the probability of mistakenly concluding that a change has occurred when in fact there is no change. Let $\alpha$ be the probability of this occurring. The authors show that, if the observations are Gaussian, $\gamma$ can be chosen so that a desired value of $\alpha$ is obtained asymptotically. This is the choice which has typically been used by other papers employing the ICSS-GARCH algorithm Aggarwal1999 (); Kang2009 (); Kang2011 (). Note that if the observations are not Gaussian, then the actual value of $\alpha$ obtained for this choice of $\gamma$ may be radically different from the intended one; this is the crux of the problem with the ICSS algorithm in the context of financial data.
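As an illustration of the single change point step described above, the following Python sketch computes the standardised cumulative sum of squares statistic and flags a change when it exceeds the threshold. The function names are hypothetical, the returns are assumed to have mean zero, and the default threshold of 1.358 is the asymptotic 5% critical value usually quoted for this statistic under Gaussianity; it is an assumption here rather than a value taken from the text.

```python
import numpy as np


def icss_statistic(returns):
    """Return (max standardised |D_k|, 1-based argmax k) for a zero-mean series."""
    r = np.asarray(returns, dtype=float)
    n = r.size
    c = np.cumsum(r ** 2)                  # cumulative sum of squares C_k
    k = np.arange(1, n + 1)
    d = c / c[-1] - k / n                  # D_k; equals 0 at k = n
    stat = np.sqrt(n / 2.0) * np.abs(d)    # standardised statistic
    k_hat = int(np.argmax(stat)) + 1       # convert to a 1-based location
    return float(stat[k_hat - 1]), k_hat


def icss_detect(returns, threshold=1.358):
    """Return the estimated change point, or None if the threshold is not exceeded.

    The default threshold is an assumed value (asymptotic, Gaussian case).
    """
    stat, k_hat = icss_statistic(returns)
    return k_hat if stat > threshold else None
```

Applying `icss_detect` recursively to the two segments on either side of a flagged change point would give the iterated version; that recursion is omitted here for brevity.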
2.1.1 Non-Gaussian Data
The ICSS algorithm is very easy to implement and does not require many computational resources, which is one of the reasons why it has been widely adopted. However, its reliance on the Gaussian distribution when specifying the threshold is problematic, since financial data is known to be non-Gaussian and can exhibit heavy-tailed behaviour. The justification for the Gaussian assumption comes from the central limit theorem; if there are no change points, then the cumulative sum of squares is asymptotically Gaussian since it is a sum of independent and identically distributed random variables. However, asymptotic arguments often fail in practice, where we are concerned with finite length return series. The basic problem is that, because financial return series are heavy-tailed, there will occasionally be large values generated which are interpreted as change points, even though they should more correctly be classed as outliers.
We illustrate this by deploying the ICSS algorithm on a simulated series of Student-t random variables, which is a standard distribution used to model heavy-tailed behaviour. The series consists of independent observations arranged in three consecutive blocks. The observations in the first block have a standard Student-t distribution. Those in the next block come from a scaled Student-t distribution with the same mean but a different variance. Finally, the observations in the last block are again standard Student-t. The series hence consists of three regimes, separated by volatility shifts. We stress that by design the volatility between the change points is constant, therefore any change points flagged in these regions are false positives.
We simulated repeated realisations of such a series, and applied the ICSS algorithm to each. On average, the number of regimes detected by ICSS in each sequence was almost three times the true number. A typical realisation of the series is shown in Figure 2(a), with the change points discovered by ICSS plotted as red lines, and it can be seen that many spurious change points are generated. The problem is that the ICSS algorithm is based on the squared magnitudes of the returns. When the observations are heavy-tailed, extreme values can be produced even though nothing in the series has changed. The ICSS algorithm incorrectly interprets these extreme values as being jumps in volatility. This suggests that the ICSS algorithm will not work on daily financial return series which exhibit similar heavy-tailed behaviour, and our analysis in Section 4 will confirm this.
3 NPCPM: A Nonparametric Alternative to ICSS
The limitation of the ICSS algorithm is hence the assumption that the returns are Gaussian. We therefore propose replacing the ICSS stage of the ICSS-GARCH algorithm with a new technique which makes no such distributional assumptions, based on the sequential change detection work in Rosstechnometrics (). We call this approach the Nonparametric Change Point Model (NPCPM). Recall from the discussion of ICSS that it was configured to have a probability $\alpha$ of incurring a false positive, given that there are no changes in the return series and that the Gaussian assumption holds. We wish to retain this fixed false positive probability, regardless of the true return distribution. This can be done by adopting the idea of rank tests from the field of nonparametric statistics. These tests are easy to understand and implement, yet are powerful and able to maintain a fixed rate of false positives regardless of the true distribution of the return series. We first review a standard nonparametric test for comparing two samples of observations and testing whether they have the same variance, when their distribution is unknown. Then, we show how this test can be extended to detect changes in volatility.
3.1 Two Sample Testing
Suppose that we have two samples of observations, $A$ of size $n_A$ and $B$ of size $n_B$, with an unknown heavy-tailed distribution, and we wish to test whether they have equal variance. One commonly used method for this is the Mood test Mood1954 (). This consists of replacing each observation with its rank, which is defined as the number of observations in the combined sample which it is greater than or equal to. More formally, writing $N = n_A + n_B$ for the combined sample size and $I$ for the indicator function, the rank of each observation $x_i$ is:
$$r(x_i) = \sum_{j=1}^{N} I(x_i \ge x_j).$$
So for example, if the first sample contains the observations and the second sample contains then the observation has rank , the observation has rank , and so on. The key point in the theory of rank tests is that if both samples have the same distribution, then each observation is equally likely to have any of the possible ranks. This is true regardless of what the true distribution is, and no matter how heavy tailed it is. Therefore, any test statistic which depends only on the ranks of the observations will not depend on their distribution.
The Mood test for equal variance measures the extent to which the rank of each observation deviates from the median rank. Writing $n_A$ and $n_B$ for the two sample sizes and $N = n_A + n_B$, the median rank is simply $(N+1)/2$. If the observations all have the same distribution, then we would expect the ranks to be roughly equally split between the two samples. However, if the variance of the samples differs, then the one with higher variance will typically contain significantly more extreme observations than the other. This leads to a test statistic based on summing the squared rank deviations from either of the samples, and comparing it to a threshold:
$$M = \sum_{x_i \in A} \left( r(x_i) - \frac{N+1}{2} \right)^2,$$
where $r(x_i)$ is the rank of $x_i$ in the combined sample. The expected value and standard deviation of this statistic depend on the sample sizes $n_A$ and $n_B$. To make it easier to compare values evaluated on different sample sizes, we standardise it by subtracting its mean and dividing by its standard deviation. From Mood1954 () these can be shown to be:
$$\mu_M = \frac{n_A (N^2 - 1)}{12}, \qquad \sigma_M^2 = \frac{n_A n_B (N+1)(N^2 - 4)}{180},$$
giving the standardised statistic $M' = (M - \mu_M)/\sigma_M$. Finally, if $|M'| > h$ for some appropriately chosen threshold $h$, then we conclude that the two samples have unequal variance. The value of $h$ can again be chosen to (e.g.) give a probability $\alpha$ of falsely concluding that the samples have unequal variance when they are in fact equal. Unlike the ICSS approach, this threshold can be chosen in a way that allows this probability to hold regardless of how the returns are distributed.
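The two-sample test above can be sketched in a few lines of Python. This is a minimal illustration rather than a reference implementation: the function name is hypothetical, ranks are taken to run from 1 to the combined sample size, and ties are assumed not to occur (reasonable for continuous returns). The mean and variance are the standard null moments of the Mood statistic.

```python
import numpy as np


def mood_statistic(x, y):
    """Standardised Mood statistic computed from the ranks of the first sample."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    m = x.size
    N = m + y.size
    combined = np.concatenate([x, y])
    # Ranks 1..N in the combined sample (assumes no ties).
    ranks = np.empty(N)
    ranks[np.argsort(combined)] = np.arange(1, N + 1)
    # Sum of squared deviations from the median rank, over the first sample.
    raw = np.sum((ranks[:m] - (N + 1) / 2.0) ** 2)
    mean = m * (N ** 2 - 1) / 12.0
    var = m * (N - m) * (N + 1) * (N ** 2 - 4) / 180.0
    return float((raw - mean) / np.sqrt(var))
```

A large negative value indicates that the first sample is concentrated around the middle ranks (lower variance), and a large positive value indicates the opposite. Because only ranks enter the calculation, applying any strictly increasing transformation to the pooled data leaves the statistic unchanged.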
3.2 Change Detection
Given the return series $r_1, \ldots, r_n$, we wish to test whether there is a change in volatility. Assuming for now that there is at most a single change point, we can think of this as being a compound problem; we first test whether there is a change point immediately after the second observation, then test if there is a change point after the third observation, and so on. More formally, we wish to decide between the hypothesis that there is no change point in the series, and the hypothesis that there is a change point at observation $k$ for some unknown value of $k$.
For each possible value of $k$, split the observations into two samples $\{r_1, \ldots, r_k\}$ and $\{r_{k+1}, \ldots, r_n\}$. Then, the Mood test can be applied to these two samples in order to test whether they have equal variance as before. Let $M'_k$ be the computed value of the standardised statistic. By repeating this procedure over all values of $k$, the following maximized test statistic can be defined:
$$M_{\max} = \max_{k} |M'_k|.$$
The test then consists of comparing this maximized statistic to a threshold $h_n$. As before, if $M_{\max} > h_n$ for an appropriate threshold, we conclude that a change has occurred, with the best estimate of the change point then being $\hat{\tau} = \arg\max_k |M'_k|$. Note the similarity between this and the ICSS statistic in equation (1). In both cases, we are essentially performing a test at each individual point in the sequence, and picking out the value which maximises it.
Table 1: Thresholds for the maximized test statistic, for various series lengths n.
n  10  20  50  100  200  500  1000  5000  10000  20000
Threshold  2.48  2.65  2.88  2.99  3.09  3.20  3.25  3.35  3.37  3.42
The final step is specifying the value of the threshold $h_n$. Similar to the ICSS algorithm, we wish to choose this so that the probability of incurring a false positive is equal to a chosen value $\alpha$; it should therefore be chosen as the upper $\alpha$ percentile of the distribution of the maximized statistic when there is no change point. Unlike the ICSS algorithm, doing this will guarantee a false positive probability of $\alpha$ regardless of the return distribution. These values can be easily found using Monte Carlo simulation. In Table 1 we list the values which give the chosen false positive probability, for various lengths $n$ of the financial series.
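The Monte Carlo calculation mentioned above could look roughly like the following Python sketch. Several details are assumptions rather than choices stated in the text: the false positive probability of 0.05, the restriction of split points to leave at least two observations on each side, the number of replications, and the function names. Because the statistic depends only on ranks, any continuous distribution can be used to generate the no-change series; standard Gaussian draws are used purely for convenience.

```python
import numpy as np


def max_mood(series):
    """Maximum absolute standardised Mood statistic over all split points.

    Requires at least 4 observations so that the set of split points is non-empty.
    """
    x = np.asarray(series, dtype=float)
    N = x.size
    ranks = np.empty(N)
    ranks[np.argsort(x)] = np.arange(1, N + 1)      # ranks 1..N, no ties assumed
    dev2 = (ranks - (N + 1) / 2.0) ** 2
    k = np.arange(2, N - 1)                          # first-sample sizes 2..N-2
    raw = np.cumsum(dev2)[k - 1]                     # Mood sum over the first k points
    mean = k * (N ** 2 - 1) / 12.0
    sd = np.sqrt(k * (N - k) * (N + 1) * (N ** 2 - 4) / 180.0)
    return float(np.max(np.abs((raw - mean) / sd)))


def simulate_threshold(n, alpha=0.05, reps=5000, seed=0):
    """Estimate the upper-alpha percentile of the maximized statistic under no change."""
    rng = np.random.default_rng(seed)
    stats = [max_mood(rng.standard_normal(n)) for _ in range(reps)]
    return float(np.quantile(stats, 1.0 - alpha))
```

Estimates from a sketch like this will vary with the number of replications and with the exact set of split points considered, so they should not be expected to reproduce tabulated thresholds to the second decimal place.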
In cases where the series may contain multiple change points, we use the same recursive approach as in the ICSS algorithm. We first run our method on the whole series, and compare the maximized statistic to the threshold. If the threshold is exceeded, the change point is estimated as the location which maximises the statistic. The observations are then split into two samples around this estimate, and the change detection algorithm is recursively applied to each sample until the threshold is no longer exceeded.
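The recursive segmentation just described can be sketched as follows. This is an illustrative Python outline under stated assumptions: the function names and the `min_size` guard are hypothetical, a single fixed threshold is used for every segment (in practice it would depend on the segment length), and ties in the data are assumed not to occur.

```python
import numpy as np


def best_split(x):
    """Return (max |standardised Mood statistic|, split size k) for one segment."""
    N = x.size
    ranks = np.empty(N)
    ranks[np.argsort(x)] = np.arange(1, N + 1)       # ranks 1..N, no ties assumed
    dev2 = (ranks - (N + 1) / 2.0) ** 2
    k = np.arange(2, N - 1)                           # candidate first-sample sizes
    mean = k * (N ** 2 - 1) / 12.0
    sd = np.sqrt(k * (N - k) * (N + 1) * (N ** 2 - 4) / 180.0)
    z = (np.cumsum(dev2)[k - 1] - mean) / sd
    i = int(np.argmax(np.abs(z)))
    return float(abs(z[i])), int(k[i])


def find_change_points(x, threshold, min_size=10, offset=0):
    """Recursively split the series wherever the maximized statistic exceeds the threshold.

    A returned value c means the change is estimated to occur immediately
    after observation c (1-based) of the original series.
    """
    x = np.asarray(x, dtype=float)
    if x.size < 2 * min_size:                         # too short to test reliably
        return []
    stat, k = best_split(x)
    if stat <= threshold:
        return []
    left = find_change_points(x[:k], threshold, min_size, offset)
    right = find_change_points(x[k:], threshold, min_size, offset + k)
    return left + [offset + k] + right
```

For example, `find_change_points(returns, threshold=3.2)` would return a sorted list of estimated change locations for a return series stored in the hypothetical array `returns`.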
To illustrate the advantage of our approach over ICSS when working with non-Gaussian data, we applied it to the same heavy-tailed Student-t data discussed in the previous section. Based on repeated simulations, the average number of change points detected per sequence by the NPCPM algorithm is far closer to the true number than the average found by the ICSS algorithm. This highlights that the NPCPM approach is much more accurate when working with non-Gaussian data. Figure 2(b) shows the volatility change points which are identified in a typical realisation of the Student-t series. Unlike the ICSS algorithm, our approach does not typically generate false positives even though the observations are heavy-tailed. This shows that it is better able to cope with heavy-tailed observations, and should be better suited to financial data.
3.3 Stage 2: GARCH Modelling
After the jump points have been found using either ICSS or NPCPM, the next step is to model the volatility drift in the segments between each pair of change points. This is done using the GARCH(p,q) model Bollerslev1986 (), where the conditional variance of the returns obeys an autoregressive moving average process, with $p$ and $q$ denoting the time lags. In practice, by far the most common version of this model is the GARCH(1,1), which we also use. A financial time series is said to be GARCH(1,1) if its volatility has the following time-varying form:
$$r_t = \sigma_t \epsilon_t, \qquad \sigma_t^2 = \omega + \alpha_1 r_{t-1}^2 + \beta_1 \sigma_{t-1}^2,$$
where $\epsilon_t$ is a sequence of independent and identically distributed random variables. In other words, the volatility at time $t$ is a function of the long term volatility ($\omega$), the variance at the previous time point ($\sigma_{t-1}^2$), and the squared previous return ($r_{t-1}^2$). This reliance on previous values leads naturally to the volatility clustering effect as seen in Figure 1. The distribution of $\epsilon_t$ is often taken to be Gaussian, but we will also consider the case where the variables have a Student-t distribution with $\nu$ degrees of freedom as in Bollerslev1987 (), in order to model heavy-tailed behaviour.
One limiting feature of the GARCH model is that the volatility is mean reverting and fluctuates around a fixed value. As discussed in the Introduction, it is often more realistic to use a regime-switching/change point formulation where the parameters of the GARCH model, and hence the long-run volatility, can take different values in each segment. In this case, the segment boundaries are the change points found by the ICSS or NPCPM algorithms. We consider two different GARCH models; the first is the one used in Aggarwal1999 (); Malik2005 (); Kang2011 () where only the parameter $\omega$ undergoes change, i.e:
$$\sigma_t^2 = \omega_t + \alpha_1 r_{t-1}^2 + \beta_1 \sigma_{t-1}^2,$$
where $\omega_t$ is equal to some constant until the first change point, before switching to a different constant until the next change point, and so on. In the second regime-switching model all three parameters are allowed to vary between regimes, i.e:
$$\sigma_t^2 = \omega_t + \alpha_{1,t}\, r_{t-1}^2 + \beta_{1,t}\, \sigma_{t-1}^2.$$
This gives a more flexible model, at the risk of overparameterization. We will refer to the model where only the long-run term changes as the restricted change point GARCH model, and the one where all parameters change as the full change point GARCH model. Note that when using the models with Student-t errors, we treat the degrees of freedom as a free parameter which is estimated along with the GARCH coefficients.
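To make the GARCH(1,1) recursion of this section concrete, the following Python sketch simulates a single-regime series with Gaussian innovations. The function name, the parameter names `omega`, `alpha1` and `beta1`, and the choice to start the recursion at the long-run variance are illustrative assumptions; a change point version would simply switch the parameter values at each segment boundary.

```python
import numpy as np


def simulate_garch11(n, omega, alpha1, beta1, seed=0):
    """Simulate n returns from a Gaussian GARCH(1,1) process.

    Returns the simulated returns and their conditional variances.
    """
    if alpha1 + beta1 >= 1:
        raise ValueError("alpha1 + beta1 must be below 1 for a finite long-run variance")
    rng = np.random.default_rng(seed)
    eps = rng.standard_normal(n)                     # iid innovations
    r = np.empty(n)
    sigma2 = np.empty(n)
    sigma2[0] = omega / (1.0 - alpha1 - beta1)       # start at the long-run variance
    r[0] = np.sqrt(sigma2[0]) * eps[0]
    for t in range(1, n):
        # Conditional variance depends on the previous squared return
        # and the previous conditional variance.
        sigma2[t] = omega + alpha1 * r[t - 1] ** 2 + beta1 * sigma2[t - 1]
        r[t] = np.sqrt(sigma2[t]) * eps[t]
    return r, sigma2
```

Plotting the returned series for, say, `simulate_garch11(2000, 0.1, 0.1, 0.8)` should show the gradual clustering of large and small returns that the model is designed to capture, without any abrupt regime shifts.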
4 Empirical Results
Our change point study uses four world stock indexes, which are 1) the Dow Jones Industrial Average, which consists of 30 large companies in the United States, 2) the Deutscher Aktien Index (DAX), which consists of 30 large German companies, 3) the Nikkei 225, which consists of 225 Japanese companies, and 4) the VIX volatility index, which measures the implied volatility of the S&P 500. We obtained daily closing prices for each series between January 1991 and October 2011. Figure 3 shows a plot of each of these four series.
Dow Jones  DAX  Nikkei  VIX  
Mean  
Standard Dev  
Skew  0.15  0.33  0.44  
Kurtosis  8.09  5.12  4.16  
Shapiro Wilk  
Ljung Box  0.67  
Ljung Box 
For each series, we analyze the daily log price differences, defined as $r_t = \log P_t - \log P_{t-1}$ where $P_t$ is the closing price on day $t$. Table 2 displays some summary statistics for each sequence of differences. It can be seen that all have a mean near 0, as should be expected. All series exhibit kurtosis far in excess of what would be expected if they followed a Gaussian distribution. To test this further, we show the p-values of the standard Shapiro-Wilk test Shapiro1965 () for Gaussianity. The small p-values show that the Gaussian hypothesis should be rejected for all series. Finally, we give the p-values associated with the Ljung-Box test for autocorrelation, in both the original series of differences and their squared values. If the volatility of each series was constant then we would expect there to be no autocorrelation in the squared differences; the low p-values obtained for all series show that this hypothesis should be rejected, and that the volatility is not constant.
We next investigate the change points which are found by the ICSS and NPCPM algorithms in these series. After this, we will fit the full GARCH model, and compare these two methods more formally.
4.1 Change Point Analysis
We begin by investigating the change points which are discovered by both the ICSS and NPCPM algorithms. We configured both algorithms to use the significance level discussed previously. In Figure 4 we show the change points which were detected by the ICSS algorithm. It can be seen that a very large number of change points are detected for every index, including 31, 30, and 22 for the three indexes other than the DAX. Most of these do not seem to correspond to genuine long term changes in the volatility; as we would expect from our discussion in Section 2.1.1, many seem to be false positives flagged in response to the extreme values of the daily differences which sometimes occur.
In Figure 5 we show the results of the NPCPM algorithm applied to the same series. In contrast to the ICSS, there are fewer change points detected, suggesting that this is giving a better fit to the data. Unlike ICSS, the change points found by NPCPM do not seem to correspond to the outlying observations, suggesting robustness. In the following section we will use standard model fitting criteria to give a more quantitative determination of which algorithm is more accurately finding change points.
Dow Jones  DAX  Nikkei  VIX  
L  AIC  BIC  L  AIC  BIC  L  AIC  BIC  L  AIC  BIC  
GARCH, ICSS, Gaussian  16908  33761  33584  15561  31024  30702  7480  14895  14685  13691  27311  27082 
GARCH, ICSS, Studentt  16336  32617  32433  15057  30014  29685  7002  13938  13721  13845  27618  27382 
GARCH, NPCPM, Gaussian  16862  33688  33570  15245  30435  30251  7573  15113  15008  13756  27463  27306 
GARCH, NPCPM, Studentt  16362  32685  32560  14691  29325  29134  7090  14146  14034  13755  27461  27297 
GARCH, ICSS, Gaussian  17229  34277  33679  15958  31549  30346  14855  29440  28556  7761  15276  14468 
GARCH, ICSS, Studentt  17333  34483  33885  15992  31619  30416  14903  29536  28653  7885  15525  14717 
GARCH, NPCPM, Gaussian  17178  34255  33920  15836  31499  30927  14830  29485  28916  7630  15173  14891 
GARCH, NPCPM, Studentt  17307  34512  34177  15944  31714  31143  14878  29582  29012  7823  15560  15278 
GARCH, GICSS, Gaussian  17162  34294  34195  15800  31537  31334  14693  29380  29360  7654  15239  15009 
GARCH, GICSS, Studentt  17260  34490  34391  15872  31681  31477  14768  29529  29509  7809  15548  15319 
GARCH, GNPCPM, Gaussian  17140  34267  34221  15713  31405  31332  14693  29380  29360  7564  15113  15067 
GARCH, GNPCPM, Studentt  17246  34478  34432  15838  31655  31582  14768  29529  29509  7716  15419  15373 
Having completed our preliminary analysis of the change points, we now fit the change point GARCH models. As discussed in Section 3.3, we consider several different types of models. As a benchmark, we fit a GARCH model with no change points, using both the Gaussian and Student-t distributions for the errors. Next, we fit the two change point GARCH models, where the segment boundaries correspond to the change points found by the ICSS and NPCPM algorithms.
As a final modeling remark, it is possible that the large number of change points found by the ICSS and NPCPM algorithms is an artifact of the two-stage process we are using to fit the models. Both of these change point detection algorithms assume that the observations are independent; however, since we are applying these algorithms before the GARCH model is fit, it is possible for the autocorrelation in the volatility to cause an unusually high number of false positives. We therefore also considered a three-stage model fitting procedure of the following form: first, a GARCH(1,1) model is fit to the return series and the conditional variance $\hat{\sigma}_t^2$ is estimated on each day. This is then used to standardize the observations via the transformation $\tilde{r}_t = r_t / \hat{\sigma}_t$. If the GARCH model correctly fits the data and does not contain any change points, then these transformed variables should be independent with variance $1$. The ICSS and NPCPM algorithms are then applied to these transformed variables to find any change points. Finally, separate GARCH models are fit within each of the discovered regimes. When using this procedure, the number of change points found drops substantially, with the ICSS procedure finding 3, 7, 0 and 8 change points in the four indexes respectively, and the NPCPM finding 1, 2, 0 and 1, both of which are substantial reductions compared to the number found when running the algorithms on the raw sequences. In the following discussion we will refer to the models fit in this manner as GICSS and GNPCPM respectively, to denote the fact that the algorithms are applied to the residuals from an initial GARCH fit rather than to the raw data.
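The standardization step of the three-stage procedure can be sketched as below. The sketch assumes that the GARCH(1,1) coefficients have already been estimated by some fitting routine and are passed in as `omega`, `alpha1` and `beta1` (hypothetical names), and it initialises the variance recursion at the sample variance, which is one common convention rather than a choice stated in the text.

```python
import numpy as np


def garch11_standardise(returns, omega, alpha1, beta1):
    """Filter conditional variances from a fitted GARCH(1,1) and standardise the returns.

    Returns (standardised returns, conditional variances).
    """
    r = np.asarray(returns, dtype=float)
    sigma2 = np.empty(r.size)
    sigma2[0] = np.var(r)                  # assumed initial value for the recursion
    for t in range(1, r.size):
        sigma2[t] = omega + alpha1 * r[t - 1] ** 2 + beta1 * sigma2[t - 1]
    return r / np.sqrt(sigma2), sigma2
```

The standardised series returned by this function is what a change point detector would then be run on in the second stage.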
4.2 GARCH Model Fitting
In order to compare which models best describe the data, we use the standard Akaike Information Criterion (AIC) and Bayesian Information Criterion (BIC) model comparison metrics Burnham2004 (); Yang2005 (). Both of these measure how well a model fits the observed data, based on the likelihood of the data under the model, with a penalization for the number of parameters in the model. This penalization is necessary in order to prevent overfitting, and to balance out the increase in the likelihood that an overparameterized model will generally have. The AIC is defined as:
$$\mathrm{AIC} = 2k - 2\log L,$$
where $L$ is the likelihood of the model under the MLE parameter estimate, and $k$ is the number of parameters in the model. Similarly, the BIC is defined as:
$$\mathrm{BIC} = k \log n - 2\log L,$$
where $L$ and $k$ are as before, and $n$ is the number of observations. With both measures a low value indicates a better fit. The practical difference between the two criteria is that the BIC penalizes model parameters to a greater degree than the AIC. There is some controversy over which of the two criteria is more appropriate; see Burnham2004 () for a review. We choose to report both, and the results are shown in Table 3. We can draw several conclusions:

The models using the Student-t error distribution consistently outperform the Gaussian models. This suggests that even with the time-varying volatility allowed by the GARCH process, the Gaussian distribution still cannot adequately capture the heavy-tailed nature of these series. Similar results have been noted by He2010 ().

The best fits in terms of likelihood are given by the various change point models. This shows the advantage of incorporating structural breaks into the GARCH framework.

The full models which allow all parameters to vary generally outperform the more parsimonious models, even when factoring in the likelihood penalty imposed by BIC/AIC.

According to the AIC performance measure, the model using NPCPM change points is the best fitting model for every stock index. Although the ICSS methods give comparable values for the likelihood, the exaggerated number of change points they produce means that they give a poorer fit overall.

When the BIC performance measure is used instead, the three-stage model where the NPCPM algorithm is applied to the residuals of an initial GARCH fit gives the best BIC for three of the four indexes. The exception is the Nikkei index, for which no change points were found by either the ICSS or NPCPM methods when applying these to the GARCH residuals, in which case their BIC results are tied.
In summary, the fact that the NPCPM algorithm does not assume Gaussianity means that it is more robust to outliers than the ICSS, and this results in more parsimonious change detection when using daily returns. This is reflected in the performance criteria used to assess model fit, which shows that it gives improved overall results. We would hence generally recommend the model using the Studentt error distribution in conjunction with the NPCPM algorithm when modeling volatility.
We note that we could also have compared these different methods for volatility estimating by treating them as predictive models and comparing their outofsample forecasting errors according to some standard criteria such as the mean squared error (MSE). However the question of whether change point models are appropriate for shortterm volatility forecasting is still controversial, and there is some evidence Covarrubias2006 () that standard GARCH models may perform better for this purpose depending on the precise performance measure which is used. Because our main concern is volatility estimation rather than forecasting, we prefer to avoid this issue and use performance measures which relate only to (penalized) model fit.
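As a concrete illustration of the penalized-fit criteria used in this section, the following minimal Python sketch computes the AIC and BIC from a maximized log-likelihood using the standard definitions. The function names and all numerical values are hypothetical and chosen only for illustration; they are not taken from Table 3.

```python
import math


def aic(log_likelihood, num_params):
    """Standard AIC: 2k - 2 ln(L), where ln(L) is the maximized log-likelihood."""
    return 2 * num_params - 2 * log_likelihood


def bic(log_likelihood, num_params, num_obs):
    """Standard BIC: k ln(n) - 2 ln(L)."""
    return num_params * math.log(num_obs) - 2 * log_likelihood


# Hypothetical example (made-up numbers): a single GARCH fit with few
# parameters versus a segmented fit with a higher likelihood but many
# more parameters, both fitted to the same n observations.
n = 5000
single = {"loglik": 16000.0, "k": 4}
segmented = {"loglik": 16040.0, "k": 20}

for name, m in [("single", single), ("segmented", segmented)]:
    print(name, aic(m["loglik"], m["k"]), bic(m["loglik"], m["k"], n))
```

With these made-up inputs the segmented model has the lower AIC while the single model has the lower BIC, which shows how the heavier BIC penalty can reverse a ranking when the likelihood gain per extra parameter is modest.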
4.3 Further Analysis
After detecting the change points in the above section, it is potentially interesting to investigate whether they correspond to events in the real world. In this section we examine the Dow Jones index in more detail. Using the NPCPM GARCH algorithm, 12 change points were detected. For each of the dates at which the changes occurred, we searched through news headlines from the week immediately before and after to find whether any major events occurred which may be related. For five of the change points, we found significant economic events which occurred within several days and may have been the cause of the volatility shifts:

26th July 2003: two days earlier on the 24th July, the S&P credit rating agency cut the rating of California bonds from A to BBB.

19th July 2007: the following week, the Dow Jones index experienced a substantial 2.3% drop over concerns about the housing and credit markets. The volatility increase may have anticipated this.

15th September 2008: on this date Lehman Brothers filed for bankruptcy, the event which signaled the start of the recent financial crisis.

2nd June 2009: on the previous day, General Motors filed for bankruptcy.

8th August 2011: three days earlier, S&P downgraded the credit rating of the United States.
| Regime | Volatility |
| --- | --- |
| 2nd January 1991 to 16th May 1991 | 0.011 |
| 17th May 1991 to 30th December 1996 | 0.007 |
| 31st December 1996 to 16th June 2002 | 0.012 |
| 17th June 2002 to 23rd September 2002 | 0.020 |
| 24th September 2002 to 17th October 2002 | 0.028 |
| 18th October 2002 to 25th July 2003 | 0.013 |
| 26th July 2003 to 16th August 2006 | 0.007 |
| 17th August 2006 to 18th July 2007 | 0.006 |
| 19th July 2007 to 14th September 2008 | 0.013 |
| 15th September 2008 to 9th December 2008 | 0.042 |
| 10th December 2008 to 1st June 2009 | 0.020 |
| 2nd June 2009 to 7th August 2011 | 0.010 |
| 8th August 2011 to 16th November 2011 | 0.019 |
For the remaining change points we did not find any specific events. For reference, these were 17th May 1991, 31st December 1996, 17th June 2002, 24th September 2002, 18th October 2002, 17th August 2006, and 10th December 2008. This seems slightly puzzling since the first two change points in 1991 and 1996 correspond to very clear structural change points which can be seen in Figure (a)a, with the second one marking a pronounced switch from a period of low volatility to high volatility. Since there are no specific associated news events, it is possible that these changes occurred in response to longer term trends in the markets or economic system rather than being responses to specific events. For example, the bursting of the internet bubble caused a prolonged stock market downturn during 2002, during which the Dow Jones lost almost 17% of its value, with most of this occurring between May and October. It seems probable that the change points found by NPCPM in June and September are caused by this, even though there are no high profile news events around these dates specifically.
To investigate further, Table 4 shows the unconditional volatility in each of the segments. It can be seen that the most volatile period was, unsurprisingly, the three-month period immediately following the bankruptcy of Lehman Brothers in late 2008. After this, volatility decreased but was still high compared to the historical average. The other sustained period of high volatility occurred during 2002 and lasted from June to October. Since this corresponds quite closely to the stock market downturn, it seems reasonable to assume that the downturn was an underlying factor which may have caused the discovered change points around the start and end of this period.
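The per-regime unconditional volatilities discussed above can be computed along the following lines. This is a minimal sketch, assuming the return series is held in a plain Python list and each change point is given as the index of the first observation of a new regime; the function name `segment_volatility`, the use of the sample standard deviation, and the toy data are all illustrative assumptions rather than the exact procedure used to produce Table 4.

```python
import math
import statistics


def segment_volatility(returns, change_points):
    """Sample standard deviation of the returns within each regime.

    `change_points` holds the index at which each new regime starts
    (an assumed convention), so k change points give k + 1 regimes.
    """
    boundaries = [0] + sorted(change_points) + [len(returns)]
    vols = []
    for start, end in zip(boundaries[:-1], boundaries[1:]):
        segment = returns[start:end]
        if len(segment) < 2:
            raise ValueError("each regime needs at least two observations")
        vols.append(statistics.stdev(segment))
    return vols


# Toy return series (made-up values): a calm regime followed by a
# turbulent one, with a single change point at index 4.
toy_returns = [0.01, -0.01, 0.01, -0.01, 0.05, -0.05, 0.05, -0.05]
toy_vols = segment_volatility(toy_returns, [4])
print(toy_vols)
```

Because the estimate is computed independently within each regime, a single extreme segment (such as late 2008 in the table) does not inflate the volatility estimates of neighbouring regimes.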
5 Conclusions
Many financial applications require an accurate estimate of the historical volatility of specified financial instruments. For example, certain types of derivatives are priced using the realized volatility; the popular Merton model, which is used to price Credit Default Swaps, is usually estimated using the volatility of the stock price as an input variable Jones1984 (). Volatility calculations also feature extensively in risk management, with GARCH models finding regular use within traditional Value-at-Risk (VaR) analysis Engle2001 (). Similarly, accurate volatility estimation is the first step in computing the correlation between financial instruments, which is a central task in portfolio optimization Elton2009 ().
The ICSS-GARCH algorithm has been widely used to model the time-varying volatility commonly found in financial returns. In this methodology, the ICSS algorithm is first used to segment the series based on discovered change points, before a GARCH model is fit to each segment. However, ICSS is very sensitive to heavy-tailed data, and can flag spurious change points when used in this setting. This is unfortunate since heavy-tailed behaviour is typical in financial data, and it has limited the use of the algorithm to the study of weekly returns, where large daily price movements are smoothed out. In order to work with daily data, we have introduced an alternative algorithm in which we replace ICSS with a test utilizing ideas from nonparametric statistics. Our experimental analysis shows that this generally gives a better fit to daily data, as measured by several standard model selection techniques.
References
 [1] R. Aggarwal, C. Inclan, and R. Leal. Volatility in emerging stock markets. The Journal of Financial and Quantitative Analysis, 34(1):33–55, 1999.
 [2] T. Bollerslev. Generalized autoregressive conditional heteroskedasticity. Journal of Econometrics, pages 307–327, 1986.
 [3] T. Bollerslev. A conditional heteroskedastic time series model for speculative prices and rates of return. Review of Economics and Statistics, 69:542–547, 1987.
 [4] K. P. Burnham and D. R. Anderson. Multimodel inference: Understanding AIC and BIC in model selection. Sociological Methods & Research, 33(2):261–304, 2004.
 [5] G. Covarrubias, B. T. Ewing, S. E. Hein, and M. A. Thompson. Modeling volatility changes in the 10-year treasury. Physica A, 369:737–744, 2006.
 [6] E. J. Elton, M. J. Gruber, S. J. Brown, and W. N. Goetzmann. Modern portfolio theory and investment analysis. John Wiley & Sons, 2009.
 [7] R. Engle. GARCH 101: The use of ARCH/GARCH models in applied econometrics. The Journal of Economic Perspectives, 15(4):157–168, 2001.
 [8] P. Fearnhead. Exact and efficient Bayesian inference for multiple changepoint problems. Statistics and Computing, 16:203–213, 2006.
 [9] V. Fernandez and B. M. Lucey. Portfolio management under sudden changes in volatility and heterogeneous investment horizons. Physica A: Statistical Mechanics and its Applications, 375(2):612–624, 2007.
 [10] P. Gopikrishnan, V. Plerou, L. A. N. Amaral, M. Meyer, and H. E. Stanley. Scaling of the distributions of fluctuations of financial market indices. Physical Review E, 60:5305–5316, 1999.
 [11] S. F. Gray. Modeling the conditional distribution of interest rates as a regime-switching process. Journal of Financial Economics, 42(1):27–62, 1996.
 [12] J. D. Hamilton and R. Susmel. Autoregressive conditional heteroskedasticity and changes in regime. Journal of Econometrics, 64:307–333, 1994.
 [13] Z. He and J. M. Maheu. Real time detection of structural breaks in GARCH models. Computational Statistics and Data Analysis, 54(11):2628–2640, 2010.
 [14] C. Inclan and G. C. Tiao. Use of cumulative sums of squares for retrospective detection of changes of variance. Journal of the American Statistical Association, 89(427):913–923, 1994.
 [15] E. P. Jones, S. P. Mason, and E. Rosenfeld. Contingent claims analysis of corporate capital structure: An empirical investigation. Journal of Finance, 39, 1984.
 [16] S. H. Kang, C. Cheong, and S.-M. Yoon. Structural changes and volatility transmission in crude oil markets. Physica A: Statistical Mechanics and its Applications, 390(23–24):4317–4324, 2011.
 [17] S. H. Kang, H.-G. Cho, and S.-M. Yoon. Modeling sudden volatility changes: evidence from Japanese and Korean stock markets. Physica A: Statistical Mechanics and its Applications, 388(17):3543–3550, 2009.
 [18] P. Kokoszka and R. Leipus. Change-point estimation in ARCH models. Bernoulli, 6:513–539, 2000.
 [19] S. Lee, Y. Tokutsu, and K. Maekawa. The cusum test for parameter change in regression models with ARCH errors. Journal of Japan Statistics Society, pages 173–188, 2004.
 [20] Y. Liu, P. Gopikrishnan, P. Cizeau, M. Meyer, C.-K. Peng, and H. E. Stanley. Statistical properties of the volatility of price fluctuations. Physical Review E, 60(2):1390–1400, 1999.
 [21] F. Malik, B. T. Ewing, and J. E. Payne. Measuring volatility persistence in the presence of sudden changes in the variance of Canadian stock returns. Canadian Journal of Economics, 38(3):1037–1056, 2005.
 [22] A. Mood. On the asymptotic efficiency of certain nonparametric two-sample tests. Annals of Mathematical Statistics, 25:514–533, 1954.
 [23] G. Oh, S. Kim, and C. Eom. Long-term memory and volatility clustering in high-frequency price changes. Physica A: Statistical Mechanics and its Applications, 387(5):1247–1254, 2008.
 [24] V. Plerou, P. Gopikrishnan, L. A. N. Amaral, M. Meyer, and H. E. Stanley. Scaling of the distribution of price fluctuations of individual companies. Physical Review E, 60:6519–6529, 1999.
 [25] B. Podobnik, D. Horvatic, A. M. Petersen, and H. E. Stanley. Cross-correlations between volume change and price change. Proc. Natl. Acad. Sci. USA, 106:22079–22084, 2009.
 [26] B. Podobnik, A. Valentincic, D. Horvatic, and H. E. Stanley. Asymmetric Lévy flight in financial ratios. Proc. Natl. Acad. Sci. USA, 108:17883–17888, 2011.
 [27] D. E. Rapach and J. K. Strauss. Structural breaks and GARCH models of exchange rate volatility. Journal of Applied Econometrics, 23(1):65–90, 2008.
 [28] G. J. Ross and N. M. Adams. Two nonparametric control charts for detecting arbitrary distribution changes. Journal of Quality Technology, 44(2):102–116, 2012.
 [29] G. J. Ross, D. K. Tasoulis, and N. M. Adams. A nonparametric change point model for streaming data. Technometrics, 53(4), 2011.
 [30] A. Sanso, V. Arago, and J. L. Carrion. Testing for changes in the unconditional variance of financial time series. Revista de Economia Financiera, 4:32–53, 2004.
 [31] S. S. Shapiro and M. B. Wilk. An analysis of variance test for normality (complete samples). Biometrika, 52(3–4):591–611, 1965.
 [32] H. E. Stanley, V. Plerou, and X. Gabaix. A statistical physics view of financial fluctuations: Evidence for scaling and universality. Physica A: Statistical Mechanics and its Applications, 387:3967–3981, 2008.
 [33] J.-J. Tseng and S.-P. Li. Asset returns and volatility clustering in financial time series. Physica A: Statistical Mechanics and its Applications, 390(7):1300–1314, 2011.
 [34] Y. Yang. Can the strengths of AIC and BIC be shared? a conflict between model indentification and regression estimation. Biometrika, 92(4):937–950, 2005.