Approximated maximum likelihood estimation in multifractal random walks
Abstract
We present an approximated maximum likelihood method for the multifractal random walk processes of [E. Bacry et al., Phys. Rev. E 64, 026103 (2001)]. The likelihood is computed using a Laplace approximation and a truncation in the dependency structure for the latent volatility. The procedure is implemented as a package in the R computer language. Its performance is tested on synthetic data and compared to an inference approach based on the generalized method of moments. The method is applied to estimate parameters for various financial stock indices.
pacs:
05.40.a, 02.50.r, 47.53.+n, 95.75.WxI Introduction
Multifractal models were first introduced in the 1960s by the socalled “Russian school” in turbulence theory (Obukhov, 1962; Kolmogorov, 1962). In turbulence, multifractality can be conceived as a weakening of the spatial selfsimilarity in the velocity field implicitly assumed in Kolmogorov’s 1941theory (Kolmogorov, 1941). This generalization is called the KolmogorovObukhov model and entails modeling the spatial variability of the energy dissipation rate as a random measure with certain multiscaling properties. The KolmogorovObukhov model is treated rigorously by Kahane Kahane (1985) and this construction is known as Gaussian multiplicative chaos.
In recent years multifractal random processes and multifractal random measures have received increased attention and are widely used in physics, geophysics and complex systems theory. Examples include phenomena as diverse as internet traffic Riedi and Ribeiro (1999), geomagnetic activity Rypdal and Rypdal (2010, 2011) and rainfall patterns Pathirana and Herath (2002). In addition, multifractal processes provide natural models for the longrange volatility persistence observed in financial time series. This was first discovered by Ghashghaie Ghashghaie et al. (1996) and Mandelbrot Mandelbrot et al. (1997), and since the late 1990s much work has been done on multifractal modeling of financial markets (Di Matteo, 2007; Calvet and Fisher, 2001). Logarithmic returns of assets are modeled as , where are continuoustime processes with stationary increments and multifractal scaling. The latter means that the moments of are powerlaws as functions of time;
(1) 
either in some interval or asymptotically as . The scaling function is linear for selfsimilar processes, but may in general be concave. Processes satisfying equation (1) with strictly concave scaling functions are generally referred to as multifractal.
Two wellknown “stylized facts” of financial time series are that logreturns are uncorrelated and nonGaussian. Based on this, Mandelbrot Mandelbrot (1963) deduced that if prices are described as selfsimilar processes, then these processes must be socalled Lévy flights, i.e. stable Lévy processes with . However, if one allows nonlinear scaling functions, then one can maintain uncorrelated logreturns by simply imposing the condition . The concave shape of implies that the variables are increasingly leptokurtic with decreasing , and consequently nonGaussian. Moreover, as opposed to Lévy flights, multifractal processes have strongly dependent increments and can therefore describe a third “stylized fact” of financial time series, namely volatility clustering.
Notwithstanding that multifractal processes provide accurate and parsimonious descriptions of temporal financial fluctuations, the models are rarely implemented for forecasting and riskanalysis in financial institutions. This is partially due to a lack of accurate, stable and efficient inference methods for multifractal processes. Parameter estimation has so far mostly been made using various momentbased estimators, such as the generalized method of moments (GMM). Alternatively, one can fit the estimated scaling functions to theoretical expressions of . However, as pointed out in e.g. Lux (2003) and Chapman et al. (2005), the standard estimators of scaling exponents have large mean square errors for time series of length comparable to those typically available in econometrics.
An exception to the statements above is the MarkovSwitching Multifractal (MSM) model (Calvet and Fisher, 2001) where maximum likelihood estimation is feasible. In discrete time MSM implies that the increments are described by a stochastic volatility model on the form
(2) 
Here are independent variables and the volatility is a product on the form
where (for each time step ) are updated from a distribution with a probability . In this model however, maximum likelihood estimation is only possible in the case where is defined on a discrete state space, and there is a limitation on the magnitude of which should not exceed (Lux, 2008). These restrictions not only limit flexibility with respect to the distribution of returns, but also the possible range in the volatility dependency.
This paper concerns parametric inference for the multifractal random walk (MRW) introduced by Bacry et al. Bacry et al. (2001). The increment process is still a discretetime stochastic process described by equation (2), but now the volatility is modeled as , where is a stationary and centered Gaussian process with the covariance structure
(3) 
where .
Here is called the correlation range
Let denote the parameter vector and a fixed time series. The main result of this paper is the development of a method for efficiently computing approximations to the likelihood function
where is the probability density function of a random vector produced by the MRW model with parameters . Using the likelihood function, parameters can be determined by means of the maximum likelihood (ML) estimator:
Our method exploits that the discrete MRW model has a construction similar to simple volatility (SV) models. The distinguishing feature is that the processes are autoregressive in SV models. By truncating the dependency structure in the logarithmic volatility , the computation of the likelihood function is mapped on to a similar problem for SV models, and hence existing techniques for further approximations are available.
To our knowledge the present paper is the first to present results on ML estimation for multifractal models with continuous state spaces for the volatility. Such estimates may be of great practical importance, since accurate parameter estimation is essential for volatility forecasts and risk estimates. In the MRW model this degree of accuracy is particularly important for the intermittency parameter which determines the peakedness of the return distributions on all time scales. In applications other than finance, accurate estimates of can be used as supplements to the empirical scaling functions, and thereby the ML estimator can provide a tool for quantifying multifractality in data.
The paper is organized as follows: In section II we briefly explain the construction of the continuoustime process for which the model given by equations (2) and (3) is a discretization. There exists a large class of multifractal processes which are related to a construction known as infinitely divisible cascades (IDC). In general the random walk models associated with IDC processes have logartithmic volatility with infinitely divisible distributions, and the MRW model considered in this paper is a discrete approximation to the random walk model obtained in the special case when the logarithmic volatility is Gaussian.
Ii Motivation of the model
There exist several popular models for multifractal stochastic processes with uncorrelated increments. All of these models can be written either on the form , where is a Brownian motion and is the distribution function of a multifractal random measure on the time axis, or as
where as . The meaning of is discussed below. These two types of constructions are equivalent as long as is a Brownian motion. (This is not the case for fractional Brownian motions with .)
The differences between the various multifractal models are then related to the construction of the random measure . The lognormal MRW model is on one hand based on a particular construction of known as multiplicative chaos, and on the other hand it can be seen as a special case of the more general IDC constructions.
In multiplicative chaos, which was first developed rigorously in Kahane (1985), one considers a sequence of measures defined via random densities on the form
where , and are centered Gaussian processes with covariance structures that converge to some expression in the limit . Kahane Kahane (1985) showed that if is positive, meaning that there are positive and positive definite functions such that
then the sequence converges weakly to a Borel measure which depends only on the function . One can therefore informally think of as being on the form where and is a “Gaussian” process with covariance . Then, if one makes the choice
(4) 
one easily obtains the relation , where are independent of and distributed according to . It follows that we for and have the scaling relation
(5) 
with . See proposition 3.3 in (Robert and Vargas, 2010) for a rigorous proof of (5), and see example 2.3 of the same paper for a verification that the function in equation (4) is positive. By using the wellknown formula for the th moments of lognormal variables together with equation (5), we easily verify the multifractality of the process : Denote and observe that
where . Since a Brownian motion is selfsimilar with the scaling function of is given by .
Alternatively the model defined by equations (2) and (3) can be motivated by considering the more general class of IDC models. Here we briefly mention the main ideas and results in this theory, and we refer to Bacry and Muzy (2003); Muzy and Bacry (2002) for details. At the base of this construction is an object called an independently scattered infinitely divisible random measure defined on the halfplane . The defining properties of the random measure are: (1) for any measurable set , the random variable is infinitely divisible with characteristic function
where . (2) for any finite sequence of disjoint and measurable sets, the corresponding random variables are independent. If we assume that , the random measure induces a family of centered and stationary stochastic processes through the equation
where are conelike domains defined by
with for and for . The time correlations in the processes are characterized by the functions
In fact, the covariance of is given by
where .
Random measures are defined by , where . The corresponding distribution functions are and corresponding random walks are
By using the relation one can show that
for and , where are independent of and have characteristic functions . Consequently the limit process has scaling function
In the case that are Gaussian, i.e. , the covariance is on the form
for , and hence it can be approximated by the process defined by equation (3). In this case the scaling function is
We note that for the process is reduced to a Brownian motion and .
We point out that this paper presents a ML estimator for the discretetime process defined by equations (2) and (3). This is sufficient for the purpose of modeling and forecasting volatility in financial time series, since the discretetime MRW model is directly comparable to GARCHtype models. In other applications, such as modeling the velocity field in turbulence, one is interested in the continuoustime process . Since is an approximation to the continuoustime process , our method can also be interpreted as an estimator for this process. In this case one must be aware that the increment process is not proportional (in law) to , and that this is only an approximation in the limit . See appendix A.1. in (Bacry et al., 2008). In the case of strong intermittency, the estimator for the continuoustime process may therefore depend significantly on the timescale for which the data is sampled. An analysis of how our method performs as an estimator for will require extensive Monte Carlo simulations (with varying and ), and this is beyond the scope of this paper.
We also remark that it in some applications is relevant to estimate the parameters of the measure , for instance when modeling the energy dissipation fields in turbulence. In the discretetime approximation this corresponds to the process , where is described by equation (3). Since is Gaussian, this problem is much easier than the one considered in this paper. The ML estimator for can be constructed using standard methods McLeod et al. (2007) and no approximations are required.
Iii Approximated maximum likelihood
In this section we explain our method of approximated maximum likelihood estimation. Let and be the processes defined by (2) and (3). Denote and . The first step is to write
(6) 
The first factor in the integrand is computed by noting that, when conditioned on , the variables are independent and Gaussian. In fact,
(7) 
For the second factor we use that is a centered Gaussian process with a specified covariance structure . First we decompose the density into onedimensional marginals;
(8) 
where we have used the notation
Denote by the covariance matrix of the vector , and let . The covariance matrix can be written on the block form:
By performing standard computations of conditional marginals in multivariable Gaussian distributions we deduce that is a Gaussian with mean
and variance
As usual it is convenient to introduce vectors defined by . This allows us to write the mean as and the variance as . Then from equation (8) we have
(9) 
Combining equation (9) with equation (7) we get an expression for the full likelihood:
(10) 
We keep in mind that depends on and through the relation
.
Approximation 1: By comparing covariances the process can be written as
(11) 
where are independent Gaussian variables with zero mean and variances equal to . As approximations to we can consider processes obtained by truncating the sum in equation (11). We fix a parameter , and for we replace equation (11) with
(12) 
where are independent Gaussian variables with zero mean and variances equal to . Note that in equation (11) has the same distribution as obtained from (12), namely a Gaussian with mean and variance . In effect we have approximated the distribution of , by truncating the dependency after a lag . As a result of this approximation equation (9) becomes
(13) 
In order to compute the expression in equation (13) we need to solve the equations
This is done efficiently using the DurbinLevinson algorithm (Levinson, 1946; Trench, 1964). We remark that for the expression in equation (13) is exact.
Approximation 2: The second approximation is the socalled Laplace’s method, which is frequently used for approximation of likelihoods in SV models, see e.g. Skaug and Yu (2009); Martino et al. (2011). We write equation (6) on the form
(14) 
where
(15) 
Laplace’s method is to assume that has a global maximum in , which we denote by . When is large the contribution to the integral in equation (14) is concentrated around , and therefore we make a second order Taylor approximation to around this point. Since is also a local maximum we have
where
is the Hessian matrix of at the point . The approximation now reads
The maximum is found by computing the partial derivatives of with respect to , setting them equal to zero and solving the corresponding system of equations numerically using the algorithm DFSANE (Cruz et al., 2006), which is implemented in R package “BB” Varadhan and Gilbert (2009). The matrix is obtained using analytical expressions for the second derivatives. This matrix is banddiagonal with bandwidth equal to the truncation parameter , and in the R software such matrices are efficiently stored and manipulated using the package “Matrix” (Bates and Maechler, 2011).
ML  GMM  

Index  (time period)  (years)  (years)  
CAC 40  (1990–2011)  
S&P 500  (1950–2011)  
DAX  (1990–2011)  
Nikkei 225  (1984–2011)  
Hang Seng  (1986–2011)  
FTSE 100  (1984–2011) 
ML  GMM  
2500  10  
50  
100  
5000 
10  
50  
100  
10000 
10  
50  
100  

Iv Estimator comparisons
In this section the ML estimator is compared with an GMM approach which is similar to the one used in Bacry et al. (2008). This GMM version is essentially a leastsquare fitting of the autocorrelation function for the logarithmic volatility, and we briefly explain this method in the following: Denote and observe that
where are independent and identically distributed. We can use the sample standard deviation to normalize so that it has unit variance. Then, if we let denote the mean of , the autocorrelation function of has the form
For we have
and and can be found by linear regression of the autocorrelation function versus .
We begin testing the approximated ML estimator by applying it to various stock market indices. We use daily logreturns and in all of the estimates the truncation parameter is set to days. The results are presented in table 1. We observe that the intermittency parameter varies from to for the different indices and time periods. We also observe that the correlation range parameter varies by roughly one order of magnitude, in the range 1.412.2 years. If we compare with the GMM we see that, for all the indices, the estimates of are lower using the ML method. For the parameter the estimates using ML and GMM are more or less consistent, but with quite large variations between the two estimators.
To further test the performance of the proposed ML estimator we run a smallsample Monte Carlo study. We have used three different sample lengths , and for each sample length we simulated 500 sample realizations. The parameter vector considered is , and . For the truncation parameter we have considered the cases , and in the GMM method we use a maximum time lag days in the autocorrelation function of .
The results are presented in table 2. For both the GMM and the ML methods the estimates of are highly unstable. This is also pointed out in Bacry et al. (2008). However, the processes only depend on the through expressions on the form . Therefore, in order to have an estimator which is comparable to , we should consider the variable . The estimators of behave reasonably well, even though there are significant mean square errors and some bias. We see that both the ML and GMM method underestimate and that the errors are roughly the same for the two estimators.
On the other hand we observe that the ML estimates of have a standard deviations which are much smaller than the corresponding standard deviation for the GMM estimate, especially for . This can also be seen from figures 2 and 3 where the probability density functions for the GMM estimates and the ML estimates are presented. Based on this we conclude that the ML estimator performs better than the GMM. Moreover, if one allows longer computing times, the truncation parameter can be increased to obtain even more accurate estimates. For a time series of data points, an ML estimate with takes a few minutes on a personal computer.
In figure 4 we have plotted the mean square errors (MSE) for the ML estimator with and the GMM estimator. We see that for both the estimators the MSE is roughly invserly proportional to the sample length. However, from table 2 we see that is a slight negative bias in for the ML estimator. This bias decreases with increasing , and we suspect the estimator to be asymptotically unbiased in the limit .
V Concluding remarks
In this paper we have presented an approximate ML estimator for MRW processes. The method is implemented and tested in a Monte Carlo study, and the results show significant improvements over existing methods for the intermittency parameter .
The methods of this paper represent a suitable starting point for two important generalizations. The first generalization is to allow for correlated innovations, for instance by letting be a fractional Gaussian noise. This has several important applications, for instance in modeling of geomagnetic activity Rypdal and Rypdal (2010, 2011) and electricity spot prices Malo (2006). Another interesting generalization is to consider the nonGaussian IDC models referred to in section II.
We also point out that the techniques presented in section III can be used to calculate conditional densities on the form
. At time , such an expression provides a complete forecast over the next time steps.
Forecasting and risk analysis based on the MRW model and the methods in this paper is a promising topic that will be pursued in future work.
Acknowledgment. This project was supported by Sparebank 1 NordNorge and the Norwegian Research Council (project number 208125). We thank K. Rypdal and the anonymous referee for useful comments and suggestions.
Footnotes
 In turbulence corresponds to the integral scale.
References
 A. M. Obukhov, J. Geophys. Res. 67, 3011 (1962).
 A. N. Kolmogorov, J. Fluid. Mech. 13, 83 (1962).
 A. N. Kolmogorov, Dokl. Akad. Nauk. SSSR 31, 301 (1941).
 J. P. Kahane, Ann. Sci. Math. Quebec 9, 105 (1985).
 R. H. Riedi and V. J. Ribeiro, IEEE T. Inform. Theory 45, 992 (1999).
 M. Rypdal and K. Rypdal, J. Geophys. Res. 115, A11216 (2010).
 M. Rypdal and K. Rypdal, J. Geophys. Res. 116, A02202 (2011).
 A. Pathirana and S. Herath, Hydrol. Earth Syst. Sc. 6, 695 (2002).
 S. Ghashghaie, S. Brewmann, W. Peinke, J. Talkner, and Y. Dodge, Nature 381, 767 (1996).
 B. Mandelbrot, A. Fisher, and L. Calvet, Cowles Foundation for Research in Economics Working Papers (1997).
 T. Di Matteo, Quant. Finance 7, 21 (2007).
 L. Calvet and A. Fisher, J. Econometrics 105, 27 (2001).
 B. B. Mandelbrot, J. Bus. 36, 394 (1963).
 T. Lux, Economics Working Papers, ChristianAlbrechtsUniversity of Kiel (2003).
 S. C. Chapman, B. Hnat, G. Rowlands, and N. W. Watkins, Nonlinear Proc. Geoph. 12, 767 (2005).
 T. Lux, J. Bus. Econ. Stat. 26, 194 (2008).
 E. Bacry, J. Delour, and J. F. Muzy, Phys. Rev. E 64, 026103 (2001).
 Note1, in turbulence corresponds to the integral scale.
 E. Bacry, A. Kozhemyak, and J.F. Muzy, J. Econ. Dyn. Control 32, 156 (2008).
 R Development Core Team (2011), URL http://www.Rproject.org/.
 O. Løvsletten and M. Rypdal, R package (2011), URL http://complexityandplasmas.net/nordforsk/Papers_files/MLE%20MRW%20R%20package.zip.
 R. Robert and V. Vargas, Ann. Probab. 38, 605 (2010).
 E. Bacry and J. F. Muzy, Commun. Math. Phys. 236, 449 (2003).
 J. F. Muzy and E. Bacry, Phys. Rev. E 66, 056121 (2002).
 I. A. McLeod, H. Yu, and Z. L. Krougly, J. Stat. Softw. 23, 1 (2007).
 N. Levinson, J. Math. Phys. 25, 261 (1946).
 W. F. Trench, J. Soc. Indust. Appl. Math. 12, 515 (1964).
 H. Skaug and J. Yu, Research Collection School of Economics, Singapore Management University (2009).
 S. Martino, K. Aas, O. Lindqvist, L. R. Neef, and H. Rue, Europ. J. Finance 17, 487 (2011).
 W. L. Cruz, J. Martínez, and M. Raydan, Math. Comput. 75, 1429 (2006).
 R. Varadhan and P. Gilbert, J. Stat. Softw. 32, 1 (2009).
 D. Bates and M. Maechler, R package (2011), URL http://CRAN.Rproject.org/package=Matrix.
 URL http://finance.yahoo.com/.
 P. Malo, Helsinki School of Economics Working Papers (2006).