Fractional Gaussian noise: Prior specification and model comparison

Fractional Gaussian noise (fGn) is a self-similar stochastic process used to model anti-persistent or persistent dependency structures in observed time series. Properties of the autocovariance function of fGn are characterised by the Hurst exponent (H), which in Bayesian contexts typically has been assigned a uniform prior on the unit interval. This paper argues why a uniform prior is unreasonable and introduces the use of a penalised complexity (PC) prior for H. The PC prior is computed to penalise divergence from the special case of white noise, and is invariant to reparameterisations. An immediate advantage is that the exact same prior can be used for the autocorrelation coefficient of a first-order autoregressive process AR(1), as this model also reflects a flexible version of white noise. Within the general setting of latent Gaussian models, this allows us to compare an fGn model component with AR(1) using Bayes factors, avoiding confounding effects of prior choices for the hyperparameters. Among others, this is useful in climate regression models where inference for underlying linear or smooth trends depends heavily on the assumed noise model.

Key words: Autoregressive process; Bayes factor; long-range dependence; PC prior; R-INLA

1 Introduction

Many real time series exhibit statistical properties which can be modelled by self-similar stochastic processes. Such processes have probability distributions which are invariant to changes of scale in time (or space), and are particularly useful to model long-range dependency structures. Long-memory is observed within a wide range of fields, among others economics, climatology, geophysics and network engineering (Beran et al., 2013). Of specific use is fractional Brownian motion (fBm) (Mandelbrot and Ness, 1968), which is the only self-similar continuous-time Gaussian process with stationary increments. Its discrete-time increment process, referred to as fractional Gaussian noise (fGn), has autocovariance function characterised by the self-similarity parameter . This parameter is often referred to as the Hurst exponent, known to quantify the Hurst phenomenon (Hurst, 1951).

Several methods to estimate have been proposed in the literature, among others heuristic approaches like the rescaled range method (Hurst, 1951), detrended fluctuation analysis (Peng et al., 1994) and the rescaled variance method (Giraitis et al., 2003). Alternative approaches include use of wavelets (McCoy and Walden, 1996; Abry and Veitch, 1998), and maximum likelihood and Whittle estimation, see Beran et al. (2013) for a comprehensive overview. Using a Bayesian framework, has typically been assigned a uniform prior (Benhmehdi et al., 2011; Makarava et al., 2011; Makarava and Holschneider, 2012). This is computationally simple and in lack of prior knowledge, a non-informative prior might seem like a good choice. However, one major drawback is that the uniform prior is not invariant to reparameterisations, and as seen in Section 2 this can give surprising results when the prior is transformed.

This paper introduces the use of penalised complexity (PC) priors (Simpson et al., 2016), in estimating the Hurst exponent . We also use a PC prior for the precision of the process. PC priors represent a new principle-based approach to compute priors for a wide range of hyperparameters in hierarchical models, where a given model component can be seen as a flexible version of a simple base model. The simplest base model in the case of fGn is to have no random effect, corresponding to infinite precision. For a fixed precision, the fGn process represents a flexible version of white noise (). The PC prior is computed to penalise model component complexity. This is achieved by assigning a prior to a measure of distance from the flexible model to the base model, which is then transformed to give a prior for the hyperparameter of interest. The informativeness of the prior is adjusted by an intuitive and interpretable used-defined scaling criterion.

The framework of PC priors has several beneficial properties, including robustness, invariance to reparameterisations, and it also provides meaningful priors with a clear interpretation (Simpson et al., 2016; Riebler et al., 2016). This paper utilises the invariance property to compare fGn with a first-order autoregressive process, AR(1), using Bayes factor. Similarly to fGn, the AR(1) process represents a flexible version of uncorrelated white noise, where the first-lag autocorrelation coefficient . This implies that the PC priors for the hyperparameters and can be chosen as transformations of the exact same prior assigned to the distance measure. This eliminates confounding effects of prior choices in the calculation of Bayes factor which is very important as the Bayes factor is known to be sensitive to prior choices (Kass and Raftery, 1995). A comparison of the AR and fGn models is for example relevant in analysing climatic time series (Løvsletten and Rypdal, 2016), identifying short versus long-range dependency structures in temperature series at different temporal and/or spatial scales. However, the given ideas could also be used in comparing other models as long as these represent flexible versions of the same base model.

The structure of this paper is as follows. Section 2 gives a brief review of the principles underlying PC priors and utilise these to derive the PC prior for . Specifically, the model complexity of fGn is seen to be non-symmetric around which implies that the prior for should also be non-symmetric. Section 2 also illustrates how Beta priors on would shift the mode of the prior for distance away from the natural simple base model. In section 3, we focus on model comparison of fGn and the first-order AR process using Bayes factor. We perform a simulation study and also apply the given ideas to check for fGn structure in global land and sea-surface temperatures, respectively. Both the estimation and the comparison of models can be computed within the general framework of latent Gaussian models, using the methodology of integrated nested Laplace approximations (INLA) (Rue et al., 2009). This implies that inference for fGn models is made easily available and that fGn can be combined with other model components in constructing an additive predictor, for example including covariates and non-linear or oscillatory random effects. Concluding remarks are given in section 4, while Appendix A describes the implementation of fGn using the R-interface R-INLA, combined with generic functions.

2 Penalised complexity priors for the parameters of fractional Gaussian noise

Prior selection for hyperparameters is a difficult issue in Bayesian statistics and often subject to rather ad-hoc choices. A computationally simple choice is to adopt flat, non-informative priors. In the case of fGn, the Hurst parameter has typically been assigned a uniform prior, argued for in terms of having no knowledge about the parameter (Benhmehdi et al., 2011; Makarava et al., 2011; Makarava and Holschneider, 2012). Also, it is suggested to use the Jeffrey’s prior for the marginal standard deviation of the model, i.e. In this section, we derive the penalised complexity prior (Simpson et al., 2016) for and suggest to use a PC prior also for the precision . We also describe drawbacks in using a uniform, or more generally a Beta prior, for .

2.1 Penalising divergence from a base model

Define fractional Gaussian noise as a zero-mean multinormal vector ). The correlation matrix is Toeplitz with first-row elements

where is referred to as the Hurst exponent, while denotes the precision parameter. Asymptotically, the autocorrelation function of fGn is seen to have a power-law decay,

This implies that fGn reduces to uncorrelated white noise when . When , the process has positive correlation reflecting a persistent autocorrelation structure. Similarly, the autocorrelation is negative when , and the resulting process is then referred to as being anti-persistent.

The specific structure of the autocorrelation function of fGn implies that the process can be seen as a flexible version of white noise, where represents a flexibility parameter. The framework of penalised complexity priors (Simpson et al., 2016) takes advantage of this inherit nested structure, which is seen in many model components where hyperparameters can be used to define flexible versions of a simple base model structure. A key idea is to assign a prior to the divergence from the flexible model to the base model and transform this to give a prior for the hyperparameter of interest. Specifically, the prior is calculated based on four principles, stated in Simpson et al. (2016). These are as follows:

  1. The PC prior is calculated to support Occam’s razor, emphasising that model simplicity should be preferred over model complexity. Specifically, a model component is seen as a flexible version of a simple base model specification. In the fGn case, let represent the flexible version of a white noise base model , where is the identity matrix. The PC prior for is then designed to give shrinkage to the fixed value .

  2. The second principle in deriving a PC prior implies penalisation of model complexity using a measure of distance from the flexible version of a model component to its base model. This is achieved by making use of the Kullback-Leibler divergence, which in the zero-mean Gaussian case simplifies to . For the fGn process, the determinant will be of order and we can define a distance measure from to as


    which is invariant to .

  3. The third principle used to derive PC priors assumes constant rate penalisation. This implies that the relative change in the prior is constant, independent of the actual value of the distance measure. Consequently, the prior assigned to the distance will be exponential. In the fGn case, the PC prior is calculated separately for the two distances and ,

    where and denotes the rate parameter. The prior for is then obtained by an ordinary change of variable transformation,


    where the derivative is found numerically. In practice, truncation due to the bounded interval of is avoided by calculating the prior for the logit transformation , which also has a variance-stabilising effect.

  4. The last principle in deriving PC priors states that the rate , controlling shrinkage of the prior, should be governed by a user-defined and interpretable scaling criterion. This needs to be chosen for each specific model component of interest. In the given case, a natural and easily-implemented choice is to define a tail probability, like


    where is a specified small probability. Alternatively, the given probability statement could for example define the median of the prior. In any of these cases, the corresponding rate parameter is

2.2 Implications using Beta versus PC priors

The calculated distance function in (1) is non-symmetric around (Figure 1), clearly illustrating that the properties of fractional Gaussian noise depend on the value of . Specifically, the distance measure for volatility or anti-persistent behaviour () is seen to decrease more slowly than in the case of having positive correlation and long memory properties (). Also, notice that the given distance measure increases rapidly as and , reflecting the major change in the properties of the fGn process as it approaches its limiting non-stationary cases.

Figure 1: The distance measure , from a general fractional Gaussian process to the white noise base model ().

By transforming the exponential prior on the given distance measure, the resulting PC prior for will preserve the properties of the distance function. Specifically, the PC prior will always have its mode at the white noise base model (), independent of the parameter specifications for and in (3). Also, it preserves the non-symmetry of the distance function and it goes to infinity when and . The PC prior for is illustrated in Figure 2 (a), using three different tail probabilities in  (3). Specifically is set equal to 0.10, 0.15 and 0.20, in which the last two cases express an increased prior belief in persistent versus anti-persistent behaviour. Alternatively, if we knew in advance that a given process has long memory properties, we could consider to define the prior only for the interval .

Using the given tail probabilities, the corresponding values for the rate parameter of the exponential distribution are approximately equal to 1.70, 1.27 and 0.97. Figure 2 (b) illustrates corresponding Beta priors, where the shape parameters are chosen to give the same tail probability statements as for the PC priors. Naturally, this can be obtained choosing the shape parameters of the Beta distribution in numerous way. Here, we just make a specific choice in which the first shape parameter of the Beta distribution is set equal to 1.0,1.8 and 2.5, respectively. Specifically, the first case gives the uniform distribution which equals the Beta(1,1) case.

Figure 2: Panel a) and b) illustrate PC priors and Beta priors for using three different scalings, in which is set equal to 0.1 (solid), 0.15 (dashed) and 0.2 (dotted), respectively. Panel c) and d) give the corresponding priors for the distance measure .

In order to understand the implications of different priors on , we consider the corresponding priors for the distance measure , see Figure 2 (c) and (d). Here, we use negative distance to illustrate the priors when . Notice that when the parameters of the PC prior are changed, the mode of the prior for distance is still at the base model, implying that we would not impose a more complicated model if the true data are in fact white noise. However, we do change the rate of shrinkage to the base model and increase the probability of large positive distances. In changing the parameters of the Beta prior, the mode of the prior for distance is shifted and pulled away from the base model. Especially, by assigning a uniform prior to , the corresponding prior on distance is shifted to the left of the base model. The given negative distance corresponds to a mode at around . This seems like an unreasonable assumption, expect for cases where prior knowledge actually supports this. For example, theoretical results suggest that the Hurst exponent is around for various turbulence processes (Kang and Vidakovic, 2016).

2.3 The PC prior for the precision

The PC prior for is derived assuming a fixed precision parameter . An alternative base model for fGn is to have no random effect (). Using the same principles as above, the resulting PC prior for is the type-2 Gumbel distribution (Simpson et al., 2016)


which prescribes the size of the marginal precision of the fGn process. This density corresponds to using an exponential prior on the standard deviation.

Adopting the strategy of Simpson et al. (2016), the rate parameter is inferred using the probability statement , where is a small probability while specifies an upper limit for the marginal standard deviation . For example, if , the marginal standard deviation is , after the precision is integrated out. In practice, the PC prior for the precision is assigned to the log-precision , which is represented in closed form as


3 Identify long versus short-range dependency using Bayes factor

In analysing real time series, it is of major importance to model the true underlying dependency structure correctly as this will influence the inference made. Consider a stochastic regression model,


where represents an underlying smooth trend while the noise term models random fluctuations. In analysing temperature data for different geographical regions, significance of a linear trend is seen to depend heavily on whether the noise term is modelled by fGn or a first-order autoregressive process (Løvsletten and Rypdal, 2016).

3.1 Imposing identical priors for the hyperparameters of fGn and first-order AR model

Autoregressive (AR) processes represent a commonly applied class of stochastic processes, used to model a one-step or Markovian dependency structure in time series. Specifically, assume a first-order AR(1) process defined by

where the first-lag autocorrelation coefficient . The innovations are assumed Gaussian, , and has zero-mean and precision . The resulting covariance matrix of is Toeplitz and defined by the autocorrelation function for lags .

Similarly to fGn, the first-order AR process represents a flexible version of uncorrelated white noise (). The Kullback-Leibler divergence is of order and an invariant distance measure from AR(1) to white noise is

The resulting PC prior for (Simpson et al., 2016) is then expressed as


where .

In general, let and denote two different model components for the noise term in (6). Within a Bayesian framework, the two models can be compared using Bayes factor which quantifies the evidence in favour of one statistical model. The Bayes factor is defined as as the ratio of the marginal likelihoods


where and represent priors for the two model components. In general, Bayes factors are sensitive to the choices of and (Kass and Raftery, 1995). When the aim is to compare a model with an fGn component to a model with an AR(1) component, the Bayes factor will depend on the prior distributions for the precision parameters of the models and the priors for the parameters and . The precision parameters have the same interpretation for both fGn and AR(1), hence we will use the same prior distribution for the precision of both models. Since we are using PC priors for both and , these parameters are just a reparameterisation of the distance from the (unit variance) AR(1) and fGn model, to the same base model. By choosing the same rate parameter for the PC priors of both and , the prior distributions will be the same even if the models and hyperparameters are different. This is a very convenient feature of PC priors which we make use of in the following examples.

3.2 Simulation results

To illustrate the capability of Bayes factor to identify fGn versus AR(1) structure, we perform a simulation study and generate fGn processes of different lengths , for specified values of the Hurst parameter, scaled to have variance 1. The selected values for the Hurst parameter are , and . These all give persistent fGn processes and are selected to mimic realistic values of the Hurst parameter in analysing long-memory processes. The PC prior for is implemented using the scaling criterion . This implies that the rate parameter of the exponential distribution of the distance measure is . This rate parameter is used to find the corresponding prior for the first-lag correlation coefficient in fitting an AR(1) model to the data. The marginal precisions of both the fGn and AR(1) models are assigned the PC prior using and in (5).

Bayes factor: Strength of evidence
False No conclusion Positive Strong Very Strong Total
100 0.7 0.226 0.685 0.077 0.009 0.003 0.089
200 0.7 0.232 0.493 0192 0.063 0.020 0.275
500 0.7 0.097 0.224 0.220 0.186 0.273 0.679
100 0.8 0.366 0.513 0.105 0.015 0.001 0.121
200 0.8 0.260 0.353 0.219 0.123 0.045 0.387
500 0.8 0.072 0.099 0.121 0.153 0.555 0.829
100 0.9 0.521 0.352 0.096 0.029 0.002 0.127
200 0.9 0.316 0.252 0.189 0.146 0.097 0.432
500 0.9 0.060 0.073 0.091 0.104 0.672 0.867
Table 1: Estimated proportion of Bayes factors in each of five groups, showing strength of evidence of the underlying process being fGn. Results are based on 1000 simulations for each value of the Hurst parameter and for each time series length .

In evaluating the evidence given by Bayes factors, we have chosen to apply the categories given in Kass and Raftery (1995), in which Bayes factor should be larger than 3 to give a positive evidence in favour of one model. Note that these categories could also be specified in terms of twice the natural logarithm of the Bayes factor, which would have the same scale as the likelihood ratio test statistic. Table 1 illustrates that rather long time series are needed in order for the Bayes factor to give correct identification of the underlying fGn process. This seems natural as the long-memory properties might not be clearly apparent if the time series is short. We also notice that the results improve with higher values of the Hurst parameter, in which the long-memory structure of fGn would get more apparent. Especially, if and the length of the time series is , we have very strong evidence of fGn in two-thirds of the cases. Naturally, the given results could be improved upon using a prior for that puts more probability mass to the upper tail of the distribution.

3.3 Temperature data example

In this section, we analyse a real temperature data set to investigate whether we find evidence of fGn compared with AR(1) structure, using Bayes factor. The dataset ”NOAA-CIRES 20th Century Reanalysis V2c”, downloaded from, contains reanalysed data from the period 1851 - 2014. The data are available for a 2 degree latitude times a 2 degree longitude global grid covering the earth and combine temperature measurements with model estimated data.

We assume a regression model for temperature as given in (6), where the trend is assumed to be either linear or non-linear. The non-linear trend is modelled using an intrinsic CAR model (Besag et al., 1991), also referred to as a second-order intrinsic Gaussian Markov random field (IGMRF) defined on a line, see Rue and Held (2005, Ch. 3). The precision parameter of the model is assigned the penalised complexity prior in (4), using scaling parameters in (5). By choosing a small value of , the prior imposes only small deviation from a straight line. Note that the given IGMRF model needs to be scaled to have a generalised variance equal to 1, such that we can use the same prior for independently of the time scale used for the analysis (Sørbye and Rue, 2014). The hyperprior specifications for the AR(1) and fGn components used to model climate noise, are chosen to be the same as in the simulation study of Section 3.2.

Trend parameter estimates fGn parameter estimates Evidence fGn
Time series CI CI CI Category
Land (a) 0.006 (0.004, 0.008) 0.76 (0.67, 0.84) 0.22 (0.19, 0.26) Strong
Land (q) 0.006 (0.004, 0.008) 0.77 (0.72, 0.82) 0.31 (0.29, 0.34) Very strong
Sea (a) 0.004 (0.003, 0.006) 0.95 (0.91, 0.99) 0.21 (0.13, 0.37) Strong
Sea (q) 0.004 (0.003, 0.006) 0.99 (0.97,1.00) 0.39 (0.21, 0.74) Very strong
Table 2: The mean estimates, credible intervals and the evidence of fGn versus AR(1) given by Bayes factor, fitting a regression model with linear trend to global land and sea temperatures. We use both annual (a) and quarterly (q) time scales for the years 1851 - 2014.
Trend parameter estimates fGn parameter estimates Evidence fGn
Time series CI CI CI Category
Land (a) 0.047 (0.015, 0.102) 0.69 (0.58, 0.79) 0.20 (0.18, 0.23) No conclusion
Land (q) 0.038 (0.008, 0.093) 0.75 (0.69, 0.80) 0.30 (0.28, 0.33) Positive
Sea (a) 0.041 (0.012, 0.095) 0.92 (0.83, 0.98) 0.15 (0.10, 0.23) No conclusion
Sea (q) 0.039 (0.012, 0.091) 0.98 (0.95, 1.00) 0.34 (0.16, 0.66) No conclusion
Table 3: The mean estimates, credible intervals and the evidence of fGn versus AR(1) given by Bayes factor, fitting a regression model with a smooth non-linear trend to global land and sea temperatures. We use both annual (a) and quarterly (q) time scales for the years 1851 - 2014.

Using the given data, it would be interesting to study evidence of fGn for different geographical regions, but here we only report results for aggregated data over land areas and sea-surfaces. We calculate the Bayes factor (8) in each case, using both an annual and quarterly time scale. The results are displayed in Table 2-3 using a linear and non-linear trend, respectively. The results clearly illustrate that the evidence of fGn is very sensitive to the trend model. Using a linear trend, the evidence of fGn compared with AR(1) is strong () or very strong (). Using a non-linear trend, the test using Bayes factor is inconclusive, except for the quarterly land data where there is weak evidence of fGn structure. It is well-known that non-linear trends can easily be interpreted incorrectly as long memory (Beran et al., 2013). The given results illustrate that a non-linear trend, even a very weak one (Figure 3), typically captures some of the long-range dependency structure of the time series, influencing the Bayes factor. However, it does not change the estimate of , radically. We also notice that the quarterly data give higher values for the Hurst parameter and stronger evidence of fGn, than the annual data, which among others could be explained by the length of these series being four times longer than for the annual data.

The estimated Hurst parameter is seen to be higher for the sea-surface temperatures than the land temperatures, both using different trend models and different time scales. In fact, in the cases where the Hurst parameter is very close to 1 this could indicate that more components should be included in the model, for example to explain oscillatory behaviour. It is well-known that estimation of the Hurst parameter is not robust against departures from stationarity and trends and maybe also other models for the climate noise should be considered. Alternatively, the variability of the trend model could be adjusted to explain more of the fluctuations of the temperatures. A big advantage of using PC priors is that such adjustments are easily made, simply choosing a larger upper value of in (5).

Figure 3: The upper panels show annual global land temperature (a) and annual global sea-surface temperature (b) in the period 1851 - 2014, with estimated linear (red) and non-linear (blue) trends. The lower panels show the corresponding quarterly global land temperature (c) and sea-surface temperature (d).

4 Concluding remarks

The framework of penalised complexity priors (Simpson et al., 2016) represents a new approach to calculate priors for hyperparameters in Bayesian hierarchical models. The PC priors are intuitive in the sense that they are designed to shrink towards a well-defined simple base model and will always have a local mode at this base model where the distance measure is 0. Also, the parameters of a PC prior have a clear interpretation as they govern the rate of shrinkage to the base model and thereby the informativeness of the prior. In estimating the Hurst parameter of fractional Gaussian noise, the argument of having no knowledge about the parameter has encouraged the use of a non-informative prior. If we accept that a given prior on should give reasonable results also on the distance scale, the uniform prior needs be ruled out. Similarly, changes in the parameters of Beta priors would shift the mode at distance scale and pull estimates away from the base model, making interpretation of these parameters non-intuitive and non-transparent.

In calculating Bayes factor, we take advantage of the fact that PC priors are invariant to reparameterisations. This implies that priors for two different hyperparameters simply represent two different transformations of the same prior on distance scale, as long as the corresponding model components represent flexible versions of the same base model. This is very useful in calculation of Bayes factors as these are known to be rather sensitive to prior choices. In the current analysis, we have focused on identifying the underlying noise term in a regression model as either AR(1) or fGn but the given ideas could also be applied for other model components as long as these share the same base model. The given analysis demonstrates that it is difficult to separate the trend from the model noise and that potential non-linearity in the trend could easily be interpreted as long memory. This makes it important to control the smoothness of the underlying trend, in which a linear model is considered as the base model and deviation from the base model is controlled by a PC prior on the precision parameter. In general, the use of PC priors is very helpful to control the effect of different random effects and make the model components identifiable.

The option to combine generic functions with R-INLA, is a flexible and easy way to incorporate an fGn model component within the general class of latent Gaussian models. This makes it possible to analyse also more complex additive regression models than the ones presented here, in which an fGn component can be combined with for example smooth effects of covariates, seasonal and oscillatory effects, non-linear trend models and also spatial model components. However, to make full use of the computational power of R-INLA, the fGn model should be approximated to have Markov properties as this would give a sparse precision matrix of the underlying latent field. One alternative to achieve this is to approximate fGn as a mix of a large number of AR(1) processes, originally suggested in Granger and Joyeux (1980). Preliminary results (Myrvoll-Nilsen, 2016) indicate that such an approximation is promising, also using only a few AR(1) components. In fact, we have already seen that aggregation of just three AR(1) processes give an excellent approximation of fGn, as long as the weights and coefficients of the AR components are chosen in a clever way. In addition, such a decomposition of fGn can potentially be linked to linear multibox energy balance models (Fredriksen and Rypdal, 2016), in analysing temperature data. This requires further study and will be reported elsewhere.


The 20th Century Reanalysis V2c data is provided by the NOAA/OAR/ESRL PSD, Boulder, Colorado, USA, from their Web site at The authors wish to thank Hege-Beate Fredriksen for valuable discussions and for processing the data to give aggregated land and sea-surface temperatures.


  • Abry and Veitch (1998) Abry, P. and Veitch, D. (1998). Wavelet analysis of long range dependent traffic. IEEE Transactions on Information Theory, 44:2–15.
  • Benhmehdi et al. (2011) Benhmehdi, S., Makarava, N., Menhamidouche, N., and Holschneider, M. (2011). Bayesian estimation of the self-similarity exponent of the Nile River fluctuation. Nonlinear Processes in Geophysics, 18:441–446.
  • Beran et al. (2013) Beran, J., Feng, Y., Ghosh, S., and Kulik, R. (2013). Long-Memory Processes: Probabilistic Properties and Statistical Methods. Springer, Heidelberg.
  • Besag et al. (1991) Besag, J., York, J., and Mollié, A. (1991). Bayesian image restoration, with two applications in spatial statistics. Annals of the Institute of Statistical Mathematics, 43:1–59.
  • Fredriksen and Rypdal (2016) Fredriksen, H.-B. and Rypdal, M. (2016). Long-range persistence in global surface temperatures explained by linear multi-box energy balance models. Manuscript in progress.
  • Giraitis et al. (2003) Giraitis, L., Kokoszka, P., Leipus, R., and Teyssiere, T. (2003). Rescaled variance and related tests for long memory in volatility and levels. Journal of Econometrics, 112:265–294.
  • Granger and Joyeux (1980) Granger, C. W. J. and Joyeux, R. (1980). An introduction to long-memory time series models and fractional differencing. Journal of Time Series Analysis, 1:15–29.
  • Hurst (1951) Hurst, H. E. (1951). Long term storage capacities of reservoirs. Transactions of the American Society for Civil Engineers, 116:776–808.
  • Kang and Vidakovic (2016) Kang, M. and Vidakovic, B. (2016). A note on Bayesian wavelet-based estimation of scaling. arXiv:1605.01146v.
  • Kass and Raftery (1995) Kass, R. E. and Raftery, A. E. (1995). Bayes factors. Journal of the American Statistical Association, 90:773–795.
  • Løvsletten and Rypdal (2016) Løvsletten, O. and Rypdal, M. (2016). Statistics of regional surface temperatures after 1900: Long-range versus short-range dependence and significance of warming trends. Journal of Climate, 29:4057–4068.
  • Makarava et al. (2011) Makarava, N., Benmehdi, S., and Holschneider, M. (2011). Bayesian estimation of self-similarity exponent. Physical Review E, 84:021109.
  • Makarava and Holschneider (2012) Makarava, N. and Holschneider, M. (2012). Estimation of the Hurst exponent from noisy data: a Bayesian approach. The European Physical Journal B, 85:1–6.
  • Mandelbrot and Ness (1968) Mandelbrot, B. B. and Ness, J. W. V. (1968). Fractional Brownian motions, fractional noises and applications. SIAM Review, 10:422–437.
  • Martins et al. (2013) Martins, T. G., Simpson, D., Lindgren, F., and Rue, H. (2013). Bayesian computing with INLA: New features. Computational Statistics & Data Analysis, 67:68–83.
  • McCoy and Walden (1996) McCoy, E. and Walden, A. (1996). Wavelet analysis and synthesis of stationary long-memory processes. Water Resources Research, 5:26–56.
  • Myrvoll-Nilsen (2016) Myrvoll-Nilsen, E. (2016). Computationally efficient Bayesian approximation of fractional Gaussian noise using AR1 processes. Master thesis, Norwegian University of Science and Technology.
  • Peng et al. (1994) Peng, C.-K., Buldyrev, S., Havlin, S., Simons, M., Stanley, H. E., and Goldberger, A. (1994). Mosaic organization of DNA nucleotides. Physical Review E, 49:1685–1689.
  • Riebler et al. (2016) Riebler, A., Sørbye, S. H., Simpson, D., and Rue, H. (2016). An intuitive Bayesian spatial model for disease mapping that accounts for scaling. Statistical Methods in Medical Research, 25:1145–1165.
  • Rue and Held (2005) Rue, H. and Held, L. (2005). Gaussian Markov Random Fields: Theory and Applications, volume 104 of Monographs on Statistics and Applied Probability. Chapman & Hall, London.
  • Rue et al. (2009) Rue, H., Martino, S., and Chopin, N. (2009). Approximate Bayesian inference for latent Gaussian models by using integrated nested Laplace approximations (with discussion). Journal of the Royal Statistical Society, Series B, 71:319–392.
  • Simpson et al. (2016) Simpson, D. P., Rue, H., Martins, T. G., Riebler, A., and Sørbye, S. H. (2016). Penalising model component complexity: A principled, practical approach to constructing priors (with discussion). To appear in Statistical Science.
  • Sørbye and Rue (2014) Sørbye, S. H. and Rue, H. (2014). Scaling intrinsic Gaussian Markov random field priors in spatial modelling. Spatial Statistics, 8:39–51.

Appendix A Appendix: Incorporating fGn models within latent Gaussian models

We have implemented the fGn model within the R-INLA package (see which uses nested Laplace approximations to do Bayesian inference for the class of latent Gaussian models (Rue et al., 2009; Martins et al., 2013). We used the generic model “rgeneric” allowing a (Gaussian) latent model component to be defined using R and it is passed into the inla-program which is written in C. This is a convenient feature that allows us to prototype new models easily, despite the fact that the fGn model does not have any Markov properties. In general, Markov properties would improve the computational efficiency significantly, see Rue and Held (2005, Ch. 2).

The rgeneric code to represent the fGn model is given below. We make use of the Toeplitz structure of the covariance matrix to speed up the computations. The models is defined as a function returning properties of the Gaussian model, like its graph, the precision matrix, the normalising constant, and so on.

rgeneric.fgn = function (cmd = c("graph", "Q", "initial",
                                 "log.norm.const", "log.prior",
                                 "quit", "mu", ""),
                         theta = NULL,
                         args = NULL)
    stopifnot(require(ltsa) && require(FGN) && require(HKprocess)) = function(...) {
        return(list(prec = function (x) exp(x),
                       H = function(x) exp(x)/(1 + exp(x))))
    interpret.theta = function(n, theta) {
        fun =
        return(list(prec = fun[[1]](theta[1]),
                       H = fun[[2]](theta[2])))
    graph = function(n, theta) {
        return (, n, n)))
    Q = function(n, theta) {
        param = interpret.theta(n, theta)
        S = toeplitz(acvfFGN(param$H, n-1))
        S =$prec * TrenchInverse(S))
    log.norm.const = function(n, theta) {
        param = interpret.theta(n, theta)
        r = acvfFGN(param$H, n-1)
        S = toeplitz(r) / param$prec
        val = ltzc(r, rep(0, n))
        val2 = c(-n/2 * log(2*pi) -0.5*val[2] + n/2 * log(param$prec))
    log.prior = function(n, theta) {
        param = interpret.theta(n, theta)
        lprior = inla.pc.dprec(param$prec, u=1, alpha=0.01, log=TRUE)
                 + log(param$prec)
        lprior = lprior + PCPRIOR.H(theta[2])
        return (lprior)
    mu = function(n, theta) return (numeric(0))
    initial = function(n, theta) return(c(4, 0.))
    quit = function(n, theta) return(invisible())

    ## the PCPRIOR.H-function must be provided
    stopifnot(exists("PCPRIOR.H") && is.function(PCPRIOR.H))
    cmd = match.arg(cmd)
    val =, args = list(n = as.integer(args$n),
                      theta = theta))


We can now include the given fGn model component together with other latent components. This is an easy example, where the function returning the log-prior for (the internal representation of ) is defined in the file fgn-prior.R (not included here).

n=200; H=0.8
model = inla.rgeneric.define(rgeneric.fgn, n=n,
                             R.init = "fgn-prior.R")
y = scale(SimulateFGN(n,H))
r = inla(y ~ -1 + f(idx, model=model), verbose=TRUE,
    data = data.frame(y = y, idx = 1:n, iidx = 1:n), = list(hyper = list(
                           prec = list(initial = 12, fixed=TRUE))))
Comments 0
Request Comment
You are adding the first comment!
How to quickly get a good reply:
  • Give credit where it’s due by listing out the positive aspects of a paper before getting into which changes should be made.
  • Be specific in your critique, and provide supporting evidence with appropriate references to substantiate general statements.
  • Your comment should inspire ideas to flow and help the author improves the paper.

The better we are at sharing our knowledge with each other, the faster we move forward.
The feedback must be of minimum 40 characters and the title a minimum of 5 characters
Add comment
Loading ...
This is a comment super asjknd jkasnjk adsnkj
The feedback must be of minumum 40 characters
The feedback must be of minumum 40 characters

You are asking your first question!
How to quickly get a good answer:
  • Keep your question short and to the point
  • Check for grammar or spelling errors.
  • Phrase it like a question
Test description