Semiparametric inference on the fractal index of Gaussian and conditionally Gaussian time series data
Abstract
We study a well-known estimator of the fractal index of a stochastic process. Our framework is very general and encompasses many models of interest; we show how to extend the theory of the estimator to a large class of non-Gaussian processes. Particular focus is on clarity and ease of implementation of the estimator and the associated asymptotic results, making it easy for practitioners to apply the methods. We additionally show how measurement noise in the observations will bias the estimator, potentially resulting in the practitioner erroneously finding evidence of fractal characteristics in a time series. We propose a new estimator which is robust to such noise and construct a formal hypothesis test for the presence of noise in the observations. Finally, the methods are illustrated on two empirical data sets: one of turbulent velocity flows and one of financial prices.
Keywords: Fractal index; roughness; estimation; inference; fractional Brownian motion; stochastic volatility.
JEL Classification: C12, C22, C51, G12
MSC 2010 Classification: 60G10, 60G15, 60G17, 60G22, 62M07, 62M09, 65C05
1 Introduction
Fractal-like models are used in a wide array of applications such as in the characterization of surface smoothness/roughness (Constantine and Hall, 1994), in the study of turbulence (Corcuera et al., 2013), and many others (e.g., Burrough, 1981; Mandelbrot, 1982; Falconer, 1990). Most recently, these models have attracted attention in mathematical finance, as models of stochastic volatility (e.g., Gatheral et al., 2018; Bayer et al., 2016; Bennedsen et al., 2017a, b; Jacquier et al., 2017). In such applications, it is imperative to be able to estimate and conduct inference on the key parameter in these models, the fractal index. Many estimators of this parameter exist (see Gneiting et al., 2012, for a survey); however, the underlying assumptions behind the various estimators, as well as their asymptotic properties, are often different and rarely stated in a clear and concise manner. These facts can make analysis difficult for the practitioner, as well as for the researcher.
This paper aims at making empirical analysis in applications, such as the ones mentioned above, easier. We clearly lay out a large and coherent framework – including the valid underlying assumptions – for analysing time series data which are potentially fractal-like. We focus on a specific estimator, which is arguably the most widely used in practice and which in our experience is the most accurate. Further, the estimator is easy to implement – it relies on a simple OLS regression – and its asymptotic properties are easy to apply. Our hope is that this will provide a transparent guide to analysing fractal data using sound statistical methods.
The main contribution of the paper is to lay out the theory of the estimator and provide its theoretical underpinnings, stating the results in a manner that makes their application straightforward. For this, we rely heavily on earlier theoretical work on the increments of fractal processes, most notably Barndorff-Nielsen et al. (2009) and Barndorff-Nielsen et al. (2011). We further investigate the estimator numerically to gauge its properties when applied to data, leading to a number of practical recommendations for implementation. Most importantly, we advocate a different choice of bandwidth parameter for the estimator than what is generally accepted practice in the literature, cf. Section 3.1.
In their survey of the asymptotic theory of various estimators of the fractal index, Gneiting et al. (2012, Section 3.1) report that “a general non-Gaussian theory remains lacking”. The second contribution of this paper is to extend the estimation theory beyond the Gaussian paradigm. We accomplish this by volatility modulation, which turns out to be a convenient way of extending the theory to a large class of non-Gaussian processes. As will be seen, this results in conditionally Gaussian processes for which the fractal theory continues to hold. Again we clearly lay out the relevant assumptions and focus on the interpretation of the results and implementation of the methods.
The final contribution of the paper is an in-depth study of the case where the data are contaminated by noise, such as measurement noise. We prove that noise will bias estimates of the fractal index downwards, thereby making noise-contaminated data look more rough than the underlying process actually is. We go on to propose a novel way to construct an estimator which is robust to noise in the observations. The new estimator also relies on an OLS regression and is just as easy to implement as the standard (non-robust) estimator studied in the first part of the paper. We present the asymptotic theory concerning the robust estimator and propose a hypothesis test, which can be used to formally test for the presence of noise in the observations.
The rest of the paper is structured as follows. Section 2 presents the mathematical setup and assumptions and gives some examples of the kind of processes we have in mind. The section then goes on to consider some extensions to the basic setup, most notably the extension to non-Gaussian processes. Section 3 presents the semiparametric estimator of the fractal index and its asymptotic properties. Then, in Section 3.2, we consider the case where the observations have been contaminated by noise and present asymptotic theory for a new estimator in this case; Section 3.2.1 presents a formal test for the presence of noise. Section 4 contains small simulation studies, illustrating the finite sample properties of the asymptotic results presented in the paper. Finally, Section 5 contains two illustrations of the methods: the first using measurements of the longitudinal component of a turbulent velocity field, and the second using a time series of financial prices. Section 6 concludes and gives some directions for future study. Proofs of technical results and some mathematical derivations are given in an appendix.
2 Setup
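Let $(\Omega, \mathcal{F}, \mathbb{P})$ be a probability space satisfying the usual assumptions and supporting $X$, a one-dimensional, zero-mean stochastic process with stationary increments. Define the $p$'th order variogram of $X$:
$$\gamma_p(h) := \mathbb{E}\left[|X_{t+h} - X_t|^p\right], \quad h \in \mathbb{R}.$$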
As we intend to make use of the theory developed in Barndorff-Nielsen et al. (2009, 2011), we adopt the assumptions of those papers. The assumptions are standard in the literature on fractal processes and are as follows.
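(A1) For some $\alpha \in (-1/2, 1/2)$,
$$\gamma_2(x) = x^{2\alpha+1} L(x), \quad x \in (0, \infty), \tag{2.1}$$
where $L : (0, \infty) \to [0, \infty)$ is continuously differentiable and bounded away from zero in a neighborhood of $x = 0$. The function $L$ is assumed to be slowly varying at zero, in the sense that $\lim_{x \downarrow 0} L(tx)/L(x) = 1$ for all $t > 0$.

(A2) $\gamma_2''(x) = x^{2\alpha-1} L_2(x)$ for some slowly varying (at zero) function $L_2$, which is continuous on $(0, \infty)$.

(A3) There exists $b \in (0, 1)$ with
$$\limsup_{x \downarrow 0} \sup_{y \in [x, x^b]} \left| \frac{L_2(y)}{L(x)} \right| < \infty.$$
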
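(A4) There exists a constant $C > 0$ such that the derivative $L'$ of $L$ satisfies
$$|L'(x)| \leq C\left(1 + x^{-\delta}\right), \quad x \in (0, 1],$$
for some $\delta \in (0, 1/2)$.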
Remark 2.1.
Remark 2.2.
The technical assumption (4) is only needed for the asymptotic normality of the estimator of $\alpha$ and not for consistency.
The parameter $\alpha$ is termed the fractal index because, under mild assumptions, it is related to the fractal dimension of the sample paths of the process (Falconer, 1990; Gneiting et al., 2012). It is also referred to as the roughness index of $X$, since the value of $\alpha$ is reflected in the pathwise properties of $X$, as the following result formalizes.
Proposition 2.1.
Let $X$ be a Gaussian process with stationary increments satisfying (1) with fractal index $\alpha$. Then there exists a modification of $X$ which has locally Hölder continuous trajectories of order $\phi$ for all $\phi \in (0, \alpha + 1/2)$.
Proposition 2.1 shows that $\alpha$ controls the degree of (Hölder) continuity of $X$. In particular, negative values of $\alpha$ correspond to $X$ having rough paths, while positive values of $\alpha$ correspond to smooth paths. It is well known that the Brownian motion has $\alpha = 0$. In Table 1 we give some parametric examples of the kind of processes we have in mind and comment on how they fit into the setup of the present paper; the examples are taken from Table 1 in Gneiting et al. (2012).
Class  Autocorrelation function  Slowly varying function  Parameters 

fBm  
Matérn  
Powered exp.  
Cauchy  ,  
Dagum  , 
Parametric examples of Gaussian fractal processes. “fBm” is the fractional Brownian motion; “Powered exp.” is the powered exponential process. is a scale parameter and is the fractal index. The processes fulfill assumptions (1)–(3) for the parameter ranges given in the rightmost column; a letter superscript denotes whether the parameter ranges are different under (4). : (4) valid for . : (4) valid for . : (4) valid for .
To get an intuitive understanding of how the trajectories of the fractal processes look, and in particular how the value of $\alpha$ is reflected in the roughness of the paths, Figure 1 plots three simulated trajectories of the Matérn process. It is evident how negative values of $\alpha$ correspond to very rough paths, while the paths become smoother as $\alpha$ increases.
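The link between the fractal index and path roughness can also be checked numerically. The sketch below is only an illustration and not the simulation design behind Figure 1: it swaps in the fBm for the Matérn process (because its increments have a simple closed-form covariance), uses a plain Cholesky factorization, and the function name `simulate_fbm`, the grid size and the seed are choices made here.

```python
import numpy as np

def simulate_fbm(n, alpha, rng):
    """Simulate an fBm on the grid 1/n, 2/n, ..., 1 with Hurst index alpha + 1/2."""
    hurst = alpha + 0.5
    lags = np.abs(np.subtract.outer(np.arange(n), np.arange(n))).astype(float)
    # Autocovariance of the increments (fractional Gaussian noise) at spacing 1/n.
    acov = 0.5 * ((lags + 1) ** (2 * hurst) - 2 * lags ** (2 * hurst)
                  + np.abs(lags - 1) ** (2 * hurst)) / n ** (2 * hurst)
    increments = np.linalg.cholesky(acov) @ rng.standard_normal(n)
    return np.cumsum(increments)

rng = np.random.default_rng(0)
n = 500
for alpha in (-0.3, 0.0, 0.3):
    path = simulate_fbm(n, alpha, rng)
    # Sum of squared increments: large for rough paths, small for smooth ones.
    print(alpha, np.sum(np.diff(path, prepend=0.0) ** 2))
```

For the fBm the expected sum of squared increments is $n^{-2\alpha}$, so the printed values should fall sharply as $\alpha$ moves from negative to positive.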
The processes in Table 1 are all Gaussian. However, in many applications it is preferable to have rough processes which are both fractal and non-Gaussian (Gneiting et al., 2012, Section 3.1). In the following section we suggest an extension to the above setup that explicitly results in non-Gaussian processes with fractal properties, by considering processes which are volatility modulated.
2.1 Extension to stochastic volatility processes
A flexible way to introduce non-Gaussianity in processes for which the theory of the fractal index continues to hold is through volatility modulation. Following Barndorff-Nielsen et al. (2009), consider processes of the form
$$X_t = X_0 + \int_0^t \sigma_s \, dG_s, \quad t \geq 0, \tag{2.2}$$
where $X_0 \in \mathbb{R}$, $\sigma$ is a stochastic volatility process, and $G$ is a zero-mean Gaussian process with stationary increments satisfying (1)–(4), e.g. one of the processes from Table 1. The modulation of the increments of $G$ by the stochastic volatility process $\sigma$ is a convenient way of introducing non-Gaussianity. To see this, note that the marginal distribution of $X_t$, conditional on the past of the stochastic volatility process and the starting value $X_0$, is
In other words, the marginal distribution of $X_t$ is a normal mean-variance mixture distribution, where the distribution of the stochastic process $\sigma$ and initial value $X_0$ determine the mixture.
For the integral in (2.2) to be well defined (in a pathwise Riemann-Stieltjes sense), we require that $\sigma$ has finite $q$-variation for some $q < 1/(1/2 - \alpha)$. Intuitively, this means that the “more rough” $G$ is, the “less rough” $\sigma$ can be. Under these conditions on $\sigma$ and $G$, the process $X$ in (2.2) will inherit the fractal properties of the driving process $G$, as shown in Barndorff-Nielsen et al. (2009).
For the central limit theorems developed below to hold, we further require another assumption on $\sigma$.
(SV) For any $q > 0$, it holds that
$$\mathbb{E}\left[|\sigma_t - \sigma_s|^q\right] \leq C_q |t - s|^{\gamma q}$$
for some $\gamma > 0$ and $C_q > 0$.
As pointed out in Bennedsen et al. (2017a), the requirement that $\sigma$ has finite $q$-variation for some $q < 1/(1/2 - \alpha)$ can be quite restrictive. For instance, if $\alpha < 0$ (i.e., $G$ is rough) then $\sigma$ cannot be driven by a standard Brownian motion. A very convenient process, which does not have these restrictions and which is very tractable, is the Brownian semistationary process, which we consider next.
The Brownian semistationary process
Consider $X$, the (volatility modulated) Brownian semistationary ($\mathcal{BSS}$) process (Barndorff-Nielsen and Schmiegel, 2007, 2009), defined as
$$X_t = \int_{-\infty}^t g(t-s) \sigma_s \, dW_s, \quad t \in \mathbb{R}, \tag{2.3}$$
where $W$ is a Brownian motion on $\mathbb{R}$, $\sigma$ a stationary process, and $g$ a Borel measurable function such that $\int_{-\infty}^t g(t-s)^2 \sigma_s^2 \, ds < \infty$ a.s. See, e.g., Bennedsen et al. (2017a) for further details of the $\mathcal{BSS}$ process. The $\mathcal{BSS}$ process is also a normal mean-variance mixture:
$$X_t \mid (\sigma_s)_{s \leq t} \sim N\left(0, \int_0^\infty g(x)^2 \sigma_{t-x}^2 \, dx\right).$$
It is interesting to note that Barndorff-Nielsen et al. (2013) show that for a particular choice of kernel function $g$ and stochastic volatility process $\sigma$, $X$ will have a marginal distribution of the ubiquitous Normal Inverse Gaussian type.
We need to impose some technical assumptions on the kernel function $g$. They are as follows.

It holds that

, where is slowly varying at zero.

, where is slowly varying at zero, and, for any , we have . Also, for some , is nonincreasing on the interval .
For any ,

The kernel function $g$ gives the $\mathcal{BSS}$ framework great flexibility. A particularly useful kernel function which has been applied in a number of studies, e.g. Barndorff-Nielsen et al. (2013) and Bennedsen (2017), is the so-called gamma kernel.
Example 2.1 ( process).
Remark 2.3.
In Bennedsen et al. (2017b) it was shown that $\mathcal{BSS}$ processes satisfying (1)–(3), (1), and (1) will have the same fractal and continuity properties as their Gaussian counterparts: for such a process Proposition 2.1 continues to hold. In other words, $X$ will have a modification with Hölder continuous trajectories of order $\phi$ for all $\phi \in (0, \alpha + 1/2)$.
2.2 Extension to processes with nonstationary increments
When the increments of $X$ are nonstationary, an approach similar to the one in Bennedsen et al. (2017b) can be adopted as follows. Define the time-dependent variogram
and, analogously to (2.1), assume that
(2.4) 
where again , , and is a slowly varying function at zero. The methods considered in this paper apply – mutatis mutandis – also to such processes. An example is the truncated Brownian semistationary process.
Example 2.2 (Truncated $\mathcal{BSS}$ process, Bennedsen et al. (2017b)).
Let
where , $W$ is a Brownian motion, and $\sigma$ a stochastic volatility process. Bennedsen et al. (2017b) call such a process a truncated $\mathcal{BSS}$ ($\mathcal{TBSS}$) process. When satisfies (1)–(3) and (1), Bennedsen et al. (2017b) show that $\alpha$ is indeed the fractal index of $X$, in the sense of satisfying (2.4). We note that processes similar to the $\mathcal{TBSS}$ process (with for all ) have recently been proposed as models of stochastic log-volatility of financial assets, e.g., Gatheral et al. (2018); Bayer et al. (2016).
2.3 Summary of assumptions
Above we introduced a number of processes, differing in important ways, most notably through their distributional properties. In spite of these differences, the results presented in this paper will apply equally to all of them. To ease notation, we briefly summarize the assumptions here.
The first set of assumptions is required for consistency of the estimator of the fractal index $\alpha$.
(LLN)
The second set of assumptions is required for asymptotic normality of the estimator of the fractal index $\alpha$.
(CLT)

Suppose that one of the following holds:
Remark 2.4.
As seen from the assumptions, the central limit theorems will not be applicable for $\alpha \geq 1/4$. In fact, a central limit theorem does hold in this case, but with a different convergence rate and limiting distribution from what we derive below. When $\alpha = 1/4$, the convergence rate is $\sqrt{n/\log n}$ and the limiting distribution is zero-mean Gaussian with an asymptotic variance different from when $\alpha < 1/4$. When $\alpha \in (1/4, 1/2)$ the convergence rate is $n^{1-2\alpha}$ and the limiting distribution is of the Rosenblatt type, see Taqqu (1979). If one is interested in the range $\alpha \in [1/4, 1/2)$ and desires asymptotic normality results similar to what we have below, we recommend using gaps between the observations as in Corcuera et al. (2013, Remark 4.4); the downside of this approach is that one is forced to throw away observations. Given the results presented below, filling in the details of this approach is straightforward, albeit notationally cumbersome. Since the case of very smooth processes, i.e. $\alpha \geq 1/4$, seems of limited practical value, we do not pursue this further here.
3 Semiparametric estimation of, and inference on, the fractal index
Consider $n$ equidistant observations of the stochastic process $X$, observed over a fixed time interval, which we without loss of generality take to be the unit interval, so that the time between observations is $1/n$. As $n \to \infty$, this gives rise to the so-called infill asymptotics. In what follows, suppose that the process $X$ satisfies the assumptions (1)–(3).
When $X$ is Gaussian, it holds, by standard properties of the (absolute) moments of the Gaussian distribution and (2.1), that
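$$\gamma_p(h) = C_p \, \gamma_2(h)^{p/2} = C_p \, |h|^{(2\alpha+1)p/2} L_p(h), \tag{3.1}$$
where $p > 0$, the function $L_p := L^{p/2}$ is slowly varying at zero, and $C_p := \mathbb{E}[|U|^p]$ with $U \sim N(0,1)$ is a constant. This motivates the regression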
(3.2) 
where $m$ is a bandwidth parameter,
The variogram is estimated straightforwardly as
(3.3) 
The OLS estimator of the parameter is naturally
with “T” denoting the transpose of a vector and being the vector
while
Given an estimate of , our estimate of the fractal index is
(3.4) 
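The regression-based estimator can be sketched in a few lines. This is a minimal illustration under assumptions and not the paper's own code: it takes the empirical variogram to be the sample mean of the $p$'th absolute power of lag-$k$ increments, regresses its logarithm on $\log(k/n)$ for $k = 1, \ldots, m$, and maps the slope $a$ to $\hat{\alpha} = a/p - 1/2$, which is what the scaling $\gamma_p(h) \propto |h|^{(2\alpha+1)p/2}$ implies. The function name `estimate_alpha` and the default `m=5` are hypothetical.

```python
import numpy as np

def estimate_alpha(x, m=5, p=2.0):
    """OLS estimate of the fractal index from the log-variogram regression.

    x : observations at equidistant times i/n, i = 1, ..., n
    m : bandwidth, i.e. number of lags used in the regression (m >= 2)
    p : power of the absolute increments
    """
    x = np.asarray(x, dtype=float)
    n = x.size
    lags = np.arange(1, m + 1)
    # Empirical p'th order variogram at lags k/n, k = 1, ..., m.
    gamma_hat = np.array([np.mean(np.abs(x[k:] - x[:-k]) ** p) for k in lags])
    # Regress log(gamma_hat) on log(k/n); the slope estimates (2*alpha + 1)*p/2.
    slope, _ = np.polyfit(np.log(lags / n), np.log(gamma_hat), 1)
    return slope / p - 0.5

rng = np.random.default_rng(0)
n = 5000
brownian = np.cumsum(rng.standard_normal(n)) / np.sqrt(n)
print(estimate_alpha(brownian, m=5))  # should be close to 0 for Brownian motion
```

With `m=2` the regression reduces to a straight line through the two points closest to the origin.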
This estimator is well known and much used in the literature, e.g. Gneiting and Schlather (2004); Gatheral et al. (2018); Bennedsen et al. (2017a). The following proposition shows the consistency of the OLS estimator of .
Proposition 3.1.
A number of studies have considered the asymptotic properties of the OLS estimates coming from (3.4), e.g. Constantine and Hall (1994), Davies and Hall (1999), and Coeurjolly (2001, 2008). For a brief summary of this literature, see Gneiting et al. (2012), Section 3.1. The following theorem presents the details in the context of this paper.
Theorem 3.1.
Remark 3.1.
Perhaps surprisingly, Theorem 3.1 shows that the asymptotic distribution of the OLS estimator does not depend on the precise structure of the underlying process $X$, but only on the value of the fractal index $\alpha$, through the correlation structure of the increments of a fractional Brownian motion (fBm) with Hurst index $H = \alpha + 1/2$, and possibly the “heteroskedasticity factor” . The reason for this is that the increments of a process fulfilling assumption (1) have the same small-scale behavior as the increments of the fBm. To see this, write
(3.7) 
by assumption (1) and the properties of slowly varying functions. We recognize (3.7) as the correlation function of the increments of an fBm with Hurst index . As shown in the proof of Theorem 3.1, this will imply that the asymptotic variance of the estimator, , is the same for all Gaussian processes fulfilling assumptions (1)–(4), including the fBm. However, as the theorem also shows, the asymptotic distribution proves to be slightly different when we consider conditionally Gaussian processes. In this case, the stochastic volatility component introduces heteroskedasticity, which results in the extra factor in the central limit theorem. To make inference feasible in practice, we need to estimate this factor. For this, define
(3.8) 
where is given in (3.3). We can prove the following.
Proposition 3.2.
(i) Suppose (a) holds. Let . Now,
Proposition 3.2 shows that of (3.8) is a suitable estimator for our purpose: when is Gaussian, the factor is asymptotically irrelevant, while when is non-Gaussian (volatility modulated) it provides the correct normalization. This justifies including the factor , whether or not one believes the data are Gaussian, at least when any potential non-Gaussianity is volatility induced. In fact, the following corollary is a straightforward consequence of Theorem 3.1, Proposition 3.2, and the properties of stable convergence; the corollary has obvious applications to feasibly conducting inference and making confidence intervals for .
Corollary 3.1.
Suppose the assumptions of Theorem 3.1 hold. Now,
where “d” denotes convergence in distribution and denotes the asymptotic variance calculated using the estimate .
Remark 3.2.
When using Corollary 3.1 for hypothesis testing, we recommend calculating using the value of under the null, instead of .
To apply the above results we need to calculate the factor , which boils down to calculating the entries of the matrix given in equation (3.5). Unfortunately, this is only feasible when and becomes increasingly cumbersome as increases. (The already tedious calculation for is given in Bennedsen et al., 2016, Appendix B.) For this reason, we recommend Monte Carlo estimation of ; in fact, we suggest using the finite sample analogue of this factor. The procedure is detailed in Appendix B; in the next section we present an example of the output, when we study the effect of the choice of bandwidth, .
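The Monte Carlo idea can be illustrated as follows. This is a rough sketch and not the procedure of Appendix B: it assumes a Gaussian fBm as the data generating process, simulates it by a Cholesky factorization, and approximates the finite sample standard deviation of the OLS estimator by the spread of the estimates across replications. It ignores the heteroskedasticity factor, and all function names, the design values and the normal-approximation interval are hypothetical choices made here.

```python
import numpy as np

def fgn_cholesky(n, alpha):
    """Cholesky factor of the covariance of fBm increments at spacing 1/n."""
    hurst = alpha + 0.5
    lags = np.abs(np.subtract.outer(np.arange(n), np.arange(n))).astype(float)
    acov = 0.5 * ((lags + 1) ** (2 * hurst) - 2 * lags ** (2 * hurst)
                  + np.abs(lags - 1) ** (2 * hurst)) / n ** (2 * hurst)
    return np.linalg.cholesky(acov)

def estimate_alpha(x, m, p=2.0):
    """OLS slope of the log-variogram regression, mapped to the fractal index."""
    n = x.size
    lags = np.arange(1, m + 1)
    gamma_hat = np.array([np.mean(np.abs(x[k:] - x[:-k]) ** p) for k in lags])
    slope, _ = np.polyfit(np.log(lags / n), np.log(gamma_hat), 1)
    return slope / p - 0.5

def monte_carlo_std(n, alpha, m, n_rep, rng, p=2.0):
    """Standard deviation of the estimator across simulated fBm paths."""
    chol = fgn_cholesky(n, alpha)
    draws = np.array([
        estimate_alpha(np.cumsum(chol @ rng.standard_normal(n)), m, p)
        for _ in range(n_rep)
    ])
    return draws.std(ddof=1)

rng = np.random.default_rng(0)
alpha_hat, n, m = -0.2, 200, 3  # hypothetical point estimate and design
std = monte_carlo_std(n, alpha_hat, m, n_rep=200, rng=rng)
# Approximate 95% interval based on the simulated standard deviation.
print(alpha_hat - 1.96 * std, alpha_hat + 1.96 * std)
```

For hypothesis testing, the same function can be called with the value of the fractal index under the null in place of the point estimate.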
3.1 Choosing the bandwidth parameter
The choice of the bandwidth parameter $m$ is, in general, an open problem. Standard practice in the literature is to set $m = 2$ (Gneiting et al., 2012, Section 2.3). Indeed, Constantine and Hall (1994) argue that the bias of the estimator increases with $m$, and Davies and Hall (1999) present simulation evidence that the optimal value, in terms of mean squared error, is $m = 2$. Setting $m = 2$ amounts to estimating $\alpha$ by drawing a straight line between only the two points closest to the origin when running the OLS regression in (3.2). While tempting from a bias viewpoint, we conjecture that this can result in increased variance of the estimator, by relying on just two points in the regression. In what follows, we examine this in more depth. To be specific, we consider the effect that the bandwidth has on the estimator of the fractal index; first on the theoretical (finite sample) variance of the estimator, as derived in Theorem 3.1 (Figure 2), and then on the finite sample bias and mean squared error of the estimator when applied to simulated paths of the various processes of Table 1 (Figure 3). For these investigations, we consider both a rough case ($\alpha < 0$) and a smooth case ($\alpha > 0$).
Figure 2 studies the effect that the choice of bandwidth has on the variance of the estimator of $\alpha$: we plot the approximation of the finite sample variance of , , which is approximately equal to , cf. Theorem 3.1. From the figure, we see that the choice of bandwidth indeed has an effect on the variance of the OLS estimator of $\alpha$. Interestingly, the effect is very different in the rough case, as compared to the smooth case. In the former, it is evident from the top left plot of Figure 2 that the variance is minimized by an intermediate value of $m$ such as or . To further investigate this, the top right plot shows the ratio between the finite sample variance when and when . Numbers less than one indicate that the variance of the estimator with is greater than the variance of the corresponding estimator with , and vice versa. These ratios seem quite stable as a function of sample size and it is evident that, from a variance standpoint, it is preferable to choose an intermediate $m$; indeed, the variance of the estimator is reduced by approximately when going from to . These conclusions get turned on their heads when we consider the smooth case, , in the bottom row: here it seems that $m = 2$ is optimal.
We further investigate this through simulations as in Davies and Hall (1999): Figure 3 plots the bias (left) and mean squared error (right) of the estimator (3.4), as a function of bandwidth $m$, for the five parametric processes of Table 1. To calculate the finite sample bias and mean squared error of the estimator, we simulate instances of each process, each with observations; the true value of the fractal index in this exercise is (top row) and (bottom row). The scale parameter is set to . For the Cauchy and Dagum processes we additionally set and , respectively. When looking at the rough case, , the same conclusion as above emerges: even though the bias does increase, as expected, with increasing $m$, it is clear that the mean squared error is minimized for an $m > 2$. In this case, i.e. for these parameter values and this sample size, the minimum is attained between and for all five processes. We again conclude that an intermediate value for the bandwidth is preferable in finite samples when $\alpha < 0$. The smooth case, , also matches what we found above: indeed, we find that both bias and mean squared error increase with increasing $m$, so here $m = 2$ seems optimal.
In conclusion, the evidence of this section suggests that when the underlying process is rough, the optimal choice of bandwidth is some $m > 2$ and we recommend an intermediate value such as . In contrast, when the process is smooth, $m = 2$ is preferable. Although setting $m = 2$ seems to be accepted practice in the literature, we believe that the rough case of $\alpha < 0$ is arguably more relevant in empirical applications. For this reason we suggest using an intermediate value for the bandwidth parameter, unless one has reason to believe the underlying data to be smooth.
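A bandwidth experiment of this kind is easy to set up for one's own sample size. The sketch below is a small-scale illustration only: it uses the fBm rather than all five processes of Table 1, and the sample size, number of replications, bandwidth grid and values of the fractal index are hypothetical choices, not those behind Figure 3.

```python
import numpy as np

def fgn_cholesky(n, alpha):
    """Cholesky factor of the covariance of fBm increments at spacing 1/n."""
    hurst = alpha + 0.5
    lags = np.abs(np.subtract.outer(np.arange(n), np.arange(n))).astype(float)
    acov = 0.5 * ((lags + 1) ** (2 * hurst) - 2 * lags ** (2 * hurst)
                  + np.abs(lags - 1) ** (2 * hurst)) / n ** (2 * hurst)
    return np.linalg.cholesky(acov)

def estimate_alpha(x, m, p=2.0):
    """OLS slope of the log-variogram regression, mapped to the fractal index."""
    n = x.size
    lags = np.arange(1, m + 1)
    gamma_hat = np.array([np.mean(np.abs(x[k:] - x[:-k]) ** p) for k in lags])
    slope, _ = np.polyfit(np.log(lags / n), np.log(gamma_hat), 1)
    return slope / p - 0.5

def bias_and_mse(n, alpha, bandwidths, n_rep, rng):
    """Monte Carlo bias and mean squared error for each bandwidth."""
    chol = fgn_cholesky(n, alpha)
    est = np.empty((n_rep, len(bandwidths)))
    for r in range(n_rep):
        path = np.cumsum(chol @ rng.standard_normal(n))
        est[r] = [estimate_alpha(path, m) for m in bandwidths]
    errors = est - alpha
    return errors.mean(axis=0), (errors ** 2).mean(axis=0)

rng = np.random.default_rng(0)
bandwidths = (2, 3, 5, 10)
for alpha in (-0.3, 0.3):  # one rough and one smooth case
    bias, mse = bias_and_mse(256, alpha, bandwidths, n_rep=100, rng=rng)
    for m, b, e in zip(bandwidths, bias, mse):
        print(f"alpha={alpha:+.1f} m={m:2d} bias={b:+.4f} mse={e:.5f}")
```

Every bandwidth is evaluated on the same simulated paths, so differences across $m$ are not driven by simulation noise between runs.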
3.2 Asymptotic theory in the presence of additive noise
Consider now the situation where the observations of $X$, satisfying (1)–(3), are contaminated by additive noise; that is, instead of observing $X$, we observe the process , given by
(3.9) 
where is a constant and is a Gaussian iid noise sequence with mean zero and variance . (When we mean that the noise is absent from the observations)
Since we observe , and not , what is relevant for us is the “contaminated”, or “noisy”, variogram, i.e. the variogram of the observation process :
(3.10) 
where the last equality follows from Assumption (1). From this we see that when , will not be linear in , hence the estimator (3.4) of $\alpha$ will not be applicable; indeed, this estimator will be downwards biased in the presence of noise, i.e. when applied to . More precisely, the following is true.
Proposition 3.3.
Proposition 3.3 shows that if the data are contaminated by noise, then estimates of the parameter $\alpha$ will be biased downwards towards $-1/2$, i.e. the lowest permissible value for $\alpha$. In other words, if the data are contaminated by noise, then the estimator of $\alpha$ considered above will lead one to conclude that the data are more rough than what is actually the case for the underlying process $X$. This is an important point to note for the practitioner: when finding evidence of roughness (i.e. $\alpha < 0$) in data, it is crucial to consider whether this is due to an intrinsic property of the underlying data generating mechanism or whether it could simply be the product of noise, e.g. measurement noise.
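The mechanism behind the downward bias can be seen in a short calculation. The notation here is assumed for this sketch and may differ from the paper's: $Y$ denotes the observed noisy process, $u$ the noise sequence, taken to be independent of $X$, and $\sigma_u^2$ the total variance of the noise.

```latex
\begin{align*}
\mathbb{E}\big[|Y_{t+h} - Y_t|^2\big]
  &= \mathbb{E}\big[|X_{t+h} - X_t|^2\big] + \mathbb{E}\big[|u_{t+h} - u_t|^2\big] \\
  &= \gamma_2(h) + 2\sigma_u^2
   = h^{2\alpha+1} L(h) + 2\sigma_u^2, \qquad h > 0 .
\end{align*}
```

Since $2\alpha + 1 > 0$, the first term vanishes as $h \downarrow 0$ while the second does not. The log-variogram of the noisy observations is therefore asymptotically flat near the origin, the regression slope tends to zero, and the implied estimate of the fractal index tends to $-1/2$.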
Fortunately, it is possible to account for the noise when estimating to arrive at a consistent estimator. For instance, Bennedsen et al. (2017a) suggest a noise-robust estimator based on a non-linear least squares regression – however, this estimator does not allow for the slowly varying function and requires the interval over which the process is observed to grow. Presently, therefore, we propose an alternative noise-robust estimator which is valid in our infill asymptotics setup and again relies on a simple OLS regression.
First, for an integer define the function
From (3.10) and assumption (1), we have
(3.11) 
where it is easy to show that the function
is slowly varying at zero. From this, it is clear that the logarithm of is – up to the slowly varying function – linear in . This motivates a linear regression as the one in (3.2) with in place of :
(3.12) 
where
is the empirical estimate of the function , which is feasible to calculate from the observations . Define the noise robust estimate of as
(3.13) 
where is the OLS estimate of from the linear regression (3.12), analogous to (3.4) with in place of . We can prove the following.
Proposition 3.4.
Remark 3.3.
Proposition 3.4 allows for i.e. for there to be no noise in the observations. In other words, the robust estimator is a consistent estimator of , also in the absence of noise.
In Figure 4 we illustrate the use of Proposition 3.4 by calculating the bias and root mean squared error (RMSE) of the two OLS estimators given in (3.4) and (3.13), when applied to a process with . The details are provided in the caption of the figure. The former estimator is not robust to the noise in , while the latter estimator is, per Proposition 3.4. It is clear how this manifests itself in a large bias in the OLS estimator (3.4). In fact, although the true value of the fractal index of the underlying process is , the mean of the OLS estimates coming from the non-robust estimator is , i.e. almost at the lowest permissible value of $-1/2$. This is of course a consequence of Proposition 3.3. In contrast, the robust estimator (3.13) proposed in this section is practically unbiased for most values of the parameter , at least when .
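The contrast between the two estimators can be reproduced in a small experiment. The sketch below rests on an assumption about the construction, since the defining formulas are not reproduced here: the robust estimator is taken to difference the second-order variogram of the noisy observations at lags $\kappa k/n$ and $k/n$, which cancels the constant noise term, and then to run the same log-log OLS regression. The names `estimate_alpha_robust` and `kappa`, the defaults, and the Brownian motion design are hypothetical and do not replicate Figure 4.

```python
import numpy as np

def variogram(y, lag):
    """Empirical second-order variogram at the given integer lag."""
    return np.mean((y[lag:] - y[:-lag]) ** 2)

def estimate_alpha(y, m=5):
    """Standard (non-robust) OLS estimator based on the log-variogram."""
    n = y.size
    lags = np.arange(1, m + 1)
    gamma_hat = np.array([variogram(y, k) for k in lags])
    slope, _ = np.polyfit(np.log(lags / n), np.log(gamma_hat), 1)
    return slope / 2 - 0.5

def estimate_alpha_robust(y, m=5, kappa=2):
    """OLS estimator based on differenced variograms (assumed construction)."""
    n = y.size
    lags = np.arange(1, m + 1)
    # Differencing the variogram at lags kappa*k and k cancels the noise term.
    f_hat = np.array([variogram(y, kappa * k) - variogram(y, k) for k in lags])
    if np.any(f_hat <= 0):
        raise ValueError("non-positive variogram difference; try another m or kappa")
    slope, _ = np.polyfit(np.log(lags / n), np.log(f_hat), 1)
    return slope / 2 - 0.5

rng = np.random.default_rng(0)
n = 50_000
x = np.cumsum(rng.standard_normal(n)) / np.sqrt(n)  # Brownian motion, alpha = 0
y = x + np.sqrt(1.0 / n) * rng.standard_normal(n)   # add iid Gaussian noise
print(estimate_alpha(y), estimate_alpha_robust(y))
```

In this design the standard estimate should come out clearly negative although the true fractal index is zero, while the robust estimate should stay close to zero. The guard against non-positive differences matters in small samples, where the logarithm would otherwise be undefined.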
Although the results of this section hold for all integer , the actual finite sample performance of the results can be quite sensitive to this tuning parameter, as also witnessed in Figure 4. The optimal choice of seems to depend on the number of observations and the variance of the noise ; an investigation into the exact way this is the case is beyond the scope of the present paper. In practice, we recommend that the researcher run some numerical experiments on simulated data under conditions similar to those of the practical experiment; simulation experiments such as the one in Figure 4 for example. We provide an example of how one can construct such a simulation experiment to arrive at a reasonable value for in Section 5.2, where we apply the robust estimator to a time series of financial prices.
The next result provides the central limit theorem, as it relates to the robust estimator.
Theorem 3.2.
Remark 3.4.
As shown in 2 of Theorem 3.2, the presence of the noise will unfortunately result in a variance of , which decays slower than ; indeed, the exact distribution of is difficult to derive and even harder to feasibly estimate.
A test for the presence of noise
Using the above, we can now construct a test for whether the observed time series contains noise or not. To be specific, we are interested in testing the null hypothesis
(3.14) 
Tests of this kind, in the context of time series of asset prices, were considered in Aït-Sahalia and Xiu (2018), where the authors develop a test for the presence of market microstructure noise in high frequency data. The test proposed here is similar in spirit to the test of Aït-Sahalia and Xiu (2018) and in Section 5.2 we briefly consider testing for the presence of market microstructure noise in high frequency asset prices as well.
To devise the test, we consider the difference between the robust estimator from (3.13) and the usual (non-robust) estimator from (3.4). From Propositions 3.1 and 3.4 it is immediately clear that under
while under , Proposition 3.3 additionally implies that
Analogously to Theorem 3.2, we can also prove the following.
Theorem 3.3.
Corollary 3.2.
Let be as in (3.15). Now,

Under : as .

Under : as .
The applicability of Corollary 3.2 for testing whether a fractal process is contaminated by noise is obvious.
Remark 3.5.
Above we have assumed that the noise sequence is Gaussian. However, one can show that all the results of sections 3.2 and 3.2.1 apply for general iid noise sequences with finite variance when . In other words, if the Gaussian assumption on the noise sequence is not fulfilled – or seems too restrictive – then one should choose and go ahead and apply the results of these sections.
4 Simulation studies
To examine the finite sample properties of the central limit results presented above, we here conduct three small simulation studies and collect the results in Tables 2–4. In each study we will let $X$ be an fBm with Hurst index $H = \alpha + 1/2$, for various values of $\alpha$, and simulate observations on the interval . Additional information on the exact simulation setups is given in the captions of the tables.
For a value , Corollary 3.1 allows us to test the null hypothesis