Abstract
Fractional Gaussian noise (fGn) is a self-similar stochastic process used to model persistent or anti-persistent dependency structures in observed time series. Properties of the autocovariance function of fGn are characterised by the Hurst exponent (H), which in Bayesian contexts typically has been assigned a uniform prior on the unit interval. This paper argues why a uniform prior is unreasonable and introduces the use of a penalised complexity (PC) prior for H. The PC prior is computed to penalise divergence from the special case of white noise and is invariant to reparameterisations. An immediate advantage is that the exact same prior can be used for the autocorrelation coefficient of a first-order autoregressive process, AR(1), as this model also reflects a flexible version of white noise. Within the general setting of latent Gaussian models, this allows us to compare an fGn model component with AR(1) using Bayes factors, avoiding confounding effects of prior choices for the hyperparameters. Among others, this is useful in climate regression models, where inference for underlying linear or smooth trends depends heavily on the assumed noise model.
Key words: Autoregressive process; Bayes factor; long-range dependence; PC prior; R-INLA
1 Introduction
Many real time series exhibit statistical properties which can be modelled by self-similar stochastic processes. Such processes have probability distributions which are invariant to changes of scale in time (or space), and are particularly useful to model long-range dependency structures. Long memory is observed within a wide range of fields, among others economics, climatology, geophysics and network engineering (Beran et al., 2013). Of specific use is fractional Brownian motion (fBm) (Mandelbrot and Ness, 1968), which is the only self-similar continuous-time Gaussian process with stationary increments. Its discrete-time increment process, referred to as fractional Gaussian noise (fGn), has an autocovariance function characterised by the self-similarity parameter H. This parameter is often referred to as the Hurst exponent, known to quantify the Hurst phenomenon (Hurst, 1951).
Several methods to estimate H have been proposed in the literature, among others heuristic approaches like the rescaled range method (Hurst, 1951), detrended fluctuation analysis (Peng et al., 1994) and the rescaled variance method (Giraitis et al., 2003). Alternative approaches include the use of wavelets (McCoy and Walden, 1996; Abry and Veitch, 1998), and maximum likelihood and Whittle estimation, see Beran et al. (2013) for a comprehensive overview. Using a Bayesian framework, H has typically been assigned a uniform prior (Benhmehdi et al., 2011; Makarava et al., 2011; Makarava and Holschneider, 2012). This is computationally simple and, in the absence of prior knowledge, a non-informative prior might seem like a good choice. However, one major drawback is that the uniform prior is not invariant to reparameterisations, and as seen in Section 2 this can give surprising results when the prior is transformed.
This paper introduces the use of penalised complexity (PC) priors (Simpson et al., 2016) in estimating the Hurst exponent H. We also use a PC prior for the precision of the process. PC priors represent a new principle-based approach to compute priors for a wide range of hyperparameters in hierarchical models, where a given model component can be seen as a flexible version of a simple base model. The simplest base model in the case of fGn is to have no random effect, corresponding to infinite precision. For a fixed precision, the fGn process represents a flexible version of white noise (H = 1/2). The PC prior is computed to penalise model component complexity. This is achieved by assigning a prior to a measure of distance from the flexible model to the base model, which is then transformed to give a prior for the hyperparameter of interest. The informativeness of the prior is adjusted by an intuitive and interpretable user-defined scaling criterion.
The framework of PC priors has several beneficial properties, including robustness and invariance to reparameterisations, and it also provides meaningful priors with a clear interpretation (Simpson et al., 2016; Riebler et al., 2016). This paper utilises the invariance property to compare fGn with a first-order autoregressive process, AR(1), using Bayes factors. Similarly to fGn, the AR(1) process represents a flexible version of uncorrelated white noise, where the first-lag autocorrelation coefficient φ = 0. This implies that the PC priors for the hyperparameters H and φ can be chosen as transformations of the exact same prior assigned to the distance measure. This eliminates confounding effects of prior choices in the calculation of the Bayes factor, which is very important as the Bayes factor is known to be sensitive to prior choices (Kass and Raftery, 1995). A comparison of the AR and fGn models is for example relevant in analysing climatic time series (Løvsletten and Rypdal, 2016), identifying short- versus long-range dependency structures in temperature series at different temporal and/or spatial scales. However, the given ideas could also be used in comparing other models, as long as these represent flexible versions of the same base model.
The structure of this paper is as follows. Section 2 gives a brief review of the principles underlying PC priors and utilises these to derive the PC prior for H. Specifically, the model complexity of fGn is seen to be non-symmetric around H = 1/2, which implies that the prior for H should also be non-symmetric. Section 2 also illustrates how Beta priors on H would shift the mode of the prior for distance away from the natural simple base model. In Section 3, we focus on model comparison of fGn and the first-order AR process using Bayes factors. We perform a simulation study and also apply the given ideas to check for fGn structure in global land and sea-surface temperatures, respectively. Both the estimation and the comparison of models can be computed within the general framework of latent Gaussian models, using the methodology of integrated nested Laplace approximations (INLA) (Rue et al., 2009). This implies that inference for fGn models is made easily available and that fGn can be combined with other model components in constructing an additive predictor, for example including covariates and nonlinear or oscillatory random effects. Concluding remarks are given in Section 4, while Appendix A describes the implementation of fGn using the R interface R-INLA, combined with generic functions.
2 Penalised complexity priors for the parameters of fractional Gaussian noise
Prior selection for hyperparameters is a difficult issue in Bayesian statistics and often subject to rather ad-hoc choices. A computationally simple choice is to adopt flat, non-informative priors. In the case of fGn, the Hurst parameter H has typically been assigned a uniform prior, argued for in terms of having no knowledge about the parameter (Benhmehdi et al., 2011; Makarava et al., 2011; Makarava and Holschneider, 2012). Also, it is suggested to use the Jeffreys prior for the marginal standard deviation σ of the model, i.e. π(σ) ∝ 1/σ. In this section, we derive the penalised complexity prior (Simpson et al., 2016) for H and suggest to use a PC prior also for the precision τ. We also describe drawbacks in using a uniform, or more generally a Beta prior, for H.
2.1 Penalising divergence from a base model
Define fractional Gaussian noise as a zero-mean multinormal vector x = (x_1, ..., x_n) ~ N(0, τ^(-1)Σ(H)). The correlation matrix Σ(H) is Toeplitz with first-row elements

ρ(k) = (1/2)(|k+1|^(2H) - 2|k|^(2H) + |k-1|^(2H)), k = 0, 1, ..., n-1,

where H ∈ (0,1) is referred to as the Hurst exponent, while τ denotes the precision parameter. Asymptotically, the autocorrelation function of fGn is seen to have a power-law decay,

ρ(k) ≈ H(2H-1)k^(2H-2), as k → ∞.

This implies that fGn reduces to uncorrelated white noise when H = 1/2. When H > 1/2, the process has positive correlation, reflecting a persistent autocorrelation structure. Similarly, the autocorrelation is negative when H < 1/2, and the resulting process is then referred to as being anti-persistent.
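The behaviour of the autocorrelation function can be checked directly; the following is an illustrative Python sketch (the paper's own implementation is in R), showing that H = 1/2 gives zero correlation, H > 1/2 positive and H < 1/2 negative correlation, and that the power-law approximation is accurate at large lags.

```python
def fgn_acf(k, H):
    """fGn autocorrelation at lag k: (1/2)(|k+1|^2H - 2|k|^2H + |k-1|^2H)."""
    return 0.5 * (abs(k + 1) ** (2 * H) - 2 * abs(k) ** (2 * H)
                  + abs(k - 1) ** (2 * H))

def fgn_acf_asymptotic(k, H):
    """Leading-order power-law decay H(2H-1) k^(2H-2) for large lags k."""
    return H * (2 * H - 1) * k ** (2 * H - 2)

print(fgn_acf(1, 0.5))   # white noise: zero correlation
print(fgn_acf(1, 0.9))   # persistent: positive correlation
print(fgn_acf(1, 0.2))   # anti-persistent: negative correlation
```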
The specific structure of the autocorrelation function of fGn implies that the process can be seen as a flexible version of white noise, where H represents a flexibility parameter. The framework of penalised complexity priors (Simpson et al., 2016) takes advantage of this inherent nested structure, which is seen in many model components where hyperparameters can be used to define flexible versions of a simple base model structure. A key idea is to assign a prior to the divergence from the flexible model to the base model and transform this to give a prior for the hyperparameter of interest. Specifically, the prior is calculated based on four principles, stated in Simpson et al. (2016). These are as follows:

The PC prior is calculated to support Occam's razor, emphasising that model simplicity should be preferred over model complexity. Specifically, a model component is seen as a flexible version of a simple base model specification. In the fGn case, let N(0, τ^(-1)Σ(H)) represent the flexible version of a white noise base model N(0, τ^(-1)I), where I is the identity matrix. The PC prior for H is then designed to give shrinkage to the fixed value H = 1/2.

The second principle in deriving a PC prior implies penalisation of model complexity, using a measure of distance from the flexible version of a model component to its base model. This is achieved by making use of the Kullback–Leibler divergence, which in the zero-mean Gaussian case simplifies to

KLD = (1/2)(tr(Σ_0^(-1)Σ_1) - n - ln(|Σ_1|/|Σ_0|)),

where Σ_1 and Σ_0 denote the covariance matrices of the flexible and base models, respectively. For the fGn process, the Kullback–Leibler divergence will be of order n, and we can define a distance measure from N(0, τ^(-1)Σ(H)) to the base model N(0, τ^(-1)I) as

d(H) = sqrt(2 KLD(H)/n),   (1)

which is invariant to n.
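Under this reading of (1), the distance can be computed numerically from the autocorrelation function; a Python sketch (for a correlation matrix, tr(Σ) = n, so the KLD reduces to -0.5 log|Σ|; the sqrt(2 KLD/n) normalisation is our reconstruction).

```python
import numpy as np
from scipy.linalg import toeplitz

def fgn_acf(k, H):
    # fGn autocorrelation at lag k
    return 0.5 * (abs(k + 1) ** (2 * H) - 2 * abs(k) ** (2 * H)
                  + abs(k - 1) ** (2 * H))

def distance(H, n):
    """d(H) = sqrt(2 KLD / n) from N(0, Sigma(H)) to the white-noise base N(0, I)."""
    Sigma = toeplitz([fgn_acf(k, H) for k in range(n)])
    sign, logdet = np.linalg.slogdet(Sigma)
    kld = -0.5 * logdet      # tr(Sigma) - n vanishes for a correlation matrix
    return np.sqrt(2 * kld / n)

print(distance(0.5, 200))                          # base model: distance 0
print(distance(0.7, 200), distance(0.9, 200))      # grows away from H = 1/2
print(distance(0.8, 100), distance(0.8, 300))      # roughly invariant to n
```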

The third principle used to derive PC priors assumes constant rate penalisation. This implies that the relative change in the prior is constant, independent of the actual value of the distance measure. Consequently, the prior assigned to the distance will be exponential. In the fGn case, the PC prior is calculated separately for the two distances d(H), H ≤ 1/2 and H ≥ 1/2, assigning probability 1/2 to each side of the base model,

π(d) = λ exp(-λd), d ≥ 0,

where λ denotes the rate parameter. The prior for H is then obtained by an ordinary change of variable transformation,

π(H) = (1/2) λ exp(-λ d(H)) |∂d(H)/∂H|,   (2)

where the derivative is found numerically. In practice, truncation due to the bounded interval H ∈ (0,1) is avoided by calculating the prior for the logit transformation θ_H = ln(H/(1-H)), which also has a variance-stabilising effect.

The last principle in deriving PC priors states that the rate λ, controlling shrinkage of the prior, should be governed by a user-defined and interpretable scaling criterion. This needs to be chosen for each specific model component of interest. In the given case, a natural and easily implemented choice is to define a tail probability, like

P(H > U) = α,   (3)

where α is a specified small probability and U is an upper reference value for H. Alternatively, the given probability statement could for example define the median of the prior. In any of these cases, using that each side of the base model is assigned probability 1/2, the corresponding rate parameter is

λ = -ln(2α)/d(U).
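Assuming the criterion takes the form P(H > U) = α with half of the exponential's mass on each side of H = 1/2 (our reading of (3)), the rate can be backed out numerically in a self-contained Python sketch:

```python
import numpy as np
from scipy.linalg import toeplitz

def fgn_acf(k, H):
    # fGn autocorrelation at lag k
    return 0.5 * (abs(k + 1) ** (2 * H) - 2 * abs(k) ** (2 * H)
                  + abs(k - 1) ** (2 * H))

def distance(H, n=200):
    # numerical distance to the white-noise base model, sqrt(2 KLD / n)
    Sigma = toeplitz([fgn_acf(k, H) for k in range(n)])
    return np.sqrt(-np.linalg.slogdet(Sigma)[1] / n)

def rate_from_tail(U, alpha, n=200):
    """lambda such that P(H > U) = 0.5 * exp(-lambda * d(U)) = alpha."""
    return -np.log(2 * alpha) / distance(U, n)

rates = [rate_from_tail(0.9, a) for a in (0.10, 0.15, 0.20)]
print(rates)   # larger alpha (heavier upper tail) gives a smaller rate
```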
2.2 Implications using Beta versus PC priors
The calculated distance function in (1) is non-symmetric around H = 1/2 (Figure 1), clearly illustrating that the properties of fractional Gaussian noise depend on the value of H. Specifically, the distance measure for volatile or anti-persistent behaviour (H < 1/2) is seen to decrease more slowly than in the case of having positive correlation and long-memory properties (H > 1/2). Also, notice that the given distance measure increases rapidly as H → 0 and H → 1, reflecting the major change in the properties of the fGn process as it approaches its limiting non-stationary cases.
By transforming the exponential prior on the given distance measure, the resulting PC prior for H will preserve the properties of the distance function. Specifically, the PC prior will always have its mode at the white noise base model (H = 1/2), independent of the parameter specifications for U and α in (3). Also, it preserves the non-symmetry of the distance function and it goes to infinity as H → 0 and H → 1. The PC prior for H is illustrated in Figure 2 (a), using three different tail probabilities in (3). Specifically, α is set equal to 0.10, 0.15 and 0.20, in which the last two cases express an increased prior belief in persistent versus anti-persistent behaviour. Alternatively, if we knew in advance that a given process has long-memory properties, we could consider defining the prior only on the interval (1/2, 1).
Using the given tail probabilities, the corresponding values for the rate parameter λ of the exponential distribution are approximately equal to 1.70, 1.27 and 0.97. Figure 2 (b) illustrates corresponding Beta priors, where the shape parameters are chosen to give the same tail probability statements as for the PC priors. Naturally, this can be obtained by choosing the shape parameters of the Beta distribution in numerous ways. Here, we just make a specific choice in which the first shape parameter of the Beta distribution is set equal to 1.0, 1.8 and 2.5, respectively. Specifically, the first case gives the uniform distribution, which equals the Beta(1,1) case.
In order to understand the implications of different priors on H, we consider the corresponding priors for the distance measure d(H), see Figure 2 (c) and (d). Here, we use negative distance to illustrate the priors when H < 1/2. Notice that when the parameters of the PC prior are changed, the mode of the prior for distance is still at the base model, implying that we would not impose a more complicated model if the true data are in fact white noise. However, we do change the rate of shrinkage to the base model and increase the probability of large positive distances. In changing the parameters of the Beta prior, the mode of the prior for distance is shifted and pulled away from the base model. Especially, by assigning a uniform prior to H, the corresponding prior on distance is shifted to the left of the base model. The given negative distance corresponds to a mode at a value of H below 1/2. This seems like an unreasonable assumption, except for cases where prior knowledge actually supports this. For example, theoretical results suggest that the Hurst exponent is around 1/3 for various turbulence processes (Kang and Vidakovic, 2016).
2.3 The PC prior for the precision
The PC prior for H is derived assuming a fixed precision parameter τ. An alternative base model for fGn is to have no random effect (τ = ∞). Using the same principles as above, the resulting PC prior for τ is the type-2 Gumbel distribution (Simpson et al., 2016)

π(τ) = (λ/2) τ^(-3/2) exp(-λ τ^(-1/2)), τ > 0, λ > 0,   (4)

which prescribes the size of the marginal precision of the fGn process. This density corresponds to using an exponential prior, with rate λ, on the standard deviation σ = τ^(-1/2).
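The correspondence with an exponential prior on the standard deviation can be verified numerically; a Python check of the density in (4), with λ chosen from the tail criterion P(σ > U) = α used below (U = 1, α = 0.01 are example values).

```python
import numpy as np
from scipy.integrate import quad

U, alpha = 1.0, 0.01
lam = -np.log(alpha) / U          # rate so that P(sigma > U) = alpha

def pc_prec(tau):
    """Type-2 Gumbel density (lam/2) tau^(-3/2) exp(-lam tau^(-1/2))."""
    return 0.5 * lam * tau ** (-1.5) * np.exp(-lam / np.sqrt(tau))

total = quad(pc_prec, 0, 1)[0] + quad(pc_prec, 1, np.inf)[0]   # should be 1
tail = quad(pc_prec, 0, U ** -2)[0]   # P(tau < 1/U^2) = P(sigma > U) = alpha
marg_sd = np.sqrt(2) / lam            # sqrt(E[sigma^2]) for sigma ~ Exp(lam)
print(total, tail, marg_sd)
```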
Adopting the strategy of Simpson et al. (2016), the rate parameter λ is inferred using the probability statement P(σ > U) = α, where α is a small probability while U specifies an upper limit for the marginal standard deviation σ. This gives λ = -ln(α)/U. For example, if U = 1 and α = 0.01, the marginal standard deviation is 0.31, after the precision is integrated out. In practice, the PC prior for the precision is assigned to the log-precision θ = ln τ, which is represented in closed form as

π(θ) = (λ/2) exp(-θ/2 - λ exp(-θ/2)).   (5)
3 Identifying long- versus short-range dependency using Bayes factors
In analysing real time series, it is of major importance to model the true underlying dependency structure correctly, as this will influence the inference made. Consider a stochastic regression model,

y_t = μ_t + ε_t, t = 1, ..., n,   (6)

where μ_t represents an underlying smooth trend while the noise term ε_t models random fluctuations. In analysing temperature data for different geographical regions, significance of a linear trend is seen to depend heavily on whether the noise term is modelled by fGn or a first-order autoregressive process (Løvsletten and Rypdal, 2016).
3.1 Imposing identical priors for the hyperparameters of fGn and the first-order AR model
Autoregressive (AR) processes represent a commonly applied class of stochastic processes, used to model a one-step or Markovian dependency structure in time series. Specifically, assume a first-order AR(1) process defined by

x_t = φ x_{t-1} + ε_t, t = 2, ..., n,

where the first-lag autocorrelation coefficient φ ∈ (-1, 1). The innovations are assumed Gaussian, ε_t ~ N(0, (1 - φ²)τ^(-1)), and x_1 has zero mean and precision τ. The resulting covariance matrix of x = (x_1, ..., x_n) is τ^(-1) times a Toeplitz correlation matrix, defined by the autocorrelation function ρ(k) = φ^|k| for lags k = 0, ..., n-1.
Similarly to fGn, the first-order AR process represents a flexible version of uncorrelated white noise (φ = 0). The Kullback–Leibler divergence is of order n and an invariant distance measure from AR(1) to white noise is

d(φ) = sqrt(-ln(1 - φ²)).
The resulting PC prior for φ (Simpson et al., 2016) is then expressed as

π(φ) = (λ/2) exp(-λ d(φ)) |φ| / ((1 - φ²) d(φ)), |φ| < 1,   (7)

where d(φ) = sqrt(-ln(1 - φ²)) and λ denotes the rate parameter of the exponential prior assigned to the distance.
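This reading of (7), with half of the prior mass on each sign of φ, can be sanity-checked numerically: the density integrates to one and reproduces the exponential tail on the distance scale. A Python sketch (λ = 1.7 is an example rate):

```python
import numpy as np
from scipy.integrate import quad

lam = 1.7   # example rate parameter

def d_ar1(phi):
    """Distance from AR(1) to white noise: sqrt(-log(1 - phi^2))."""
    return np.sqrt(-np.log(1.0 - phi ** 2))

def pc_ar1(phi):
    """Assumed form of the PC prior (7), half the mass on each sign of phi."""
    if abs(phi) < 1e-8:
        return 0.5 * lam          # continuous limit at phi = 0
    d = d_ar1(phi)
    return 0.5 * lam * np.exp(-lam * d) * abs(phi) / ((1.0 - phi ** 2) * d)

eps = 1e-6
interior = (quad(pc_ar1, -1 + eps, 0, limit=200)[0]
            + quad(pc_ar1, 0, 1 - eps, limit=200)[0])
cut_mass = np.exp(-lam * d_ar1(1 - eps))    # analytic mass beyond both cuts
total = interior + cut_mass                  # should be 1
tail = quad(pc_ar1, 0.8, 1 - eps, limit=200)[0] + 0.5 * cut_mass
print(total, tail, 0.5 * np.exp(-lam * d_ar1(0.8)))   # tail matches exp form
```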
In general, let M_1 and M_2 denote two different model components for the noise term ε_t in (6). Within a Bayesian framework, the two models can be compared using the Bayes factor, which quantifies the evidence in favour of one statistical model. The Bayes factor is defined as the ratio of the marginal likelihoods,

BF_12 = ∫ π(y | θ_1, M_1) π(θ_1) dθ_1 / ∫ π(y | θ_2, M_2) π(θ_2) dθ_2,   (8)

where π(θ_1) and π(θ_2) represent priors for the hyperparameters of the two model components. In general, Bayes factors are sensitive to the choices of π(θ_1) and π(θ_2) (Kass and Raftery, 1995). When the aim is to compare a model with an fGn component to a model with an AR(1) component, the Bayes factor will depend on the prior distributions for the precision parameters of the models and the priors for the parameters H and φ. The precision parameters have the same interpretation for both fGn and AR(1), hence we will use the same prior distribution for the precision of both models. Since we are using PC priors for both H and φ, these parameters are just reparameterisations of the distance from the (unit-variance) AR(1) and fGn models to the same base model. By choosing the same rate parameter for the PC priors of both H and φ, the prior distributions will be the same even if the models and hyperparameters are different. This is a very convenient feature of PC priors which we make use of in the following examples.
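As a toy numerical illustration of (8) in Python (the paper itself works in R-INLA): with the precision fixed at 1 and simple discrete uniform grid priors over H and φ standing in for the PC priors, each marginal likelihood is a prior-weighted average of Gaussian likelihoods.

```python
import numpy as np
from scipy.linalg import toeplitz
from scipy.special import logsumexp
from scipy.stats import multivariate_normal

rng = np.random.default_rng(1)
n = 100
y = rng.standard_normal(n)   # simulated data: white noise, the shared base model

def fgn_acf(k, H):
    # fGn autocorrelation at lag k
    return 0.5 * (abs(k + 1) ** (2 * H) - 2 * abs(k) ** (2 * H)
                  + abs(k - 1) ** (2 * H))

def loglik(y, corr_row):
    # Gaussian log-likelihood under a unit-variance Toeplitz correlation matrix
    return multivariate_normal(mean=np.zeros(len(y)),
                               cov=toeplitz(corr_row)).logpdf(y)

H_grid = np.linspace(0.5, 0.95, 10)     # discrete uniform "prior" for H
phi_grid = np.linspace(0.0, 0.9, 10)    # discrete uniform "prior" for phi

lml_fgn = logsumexp([loglik(y, [fgn_acf(k, H) for k in range(n)])
                     for H in H_grid]) - np.log(len(H_grid))
lml_ar1 = logsumexp([loglik(y, phi ** np.arange(n))
                     for phi in phi_grid]) - np.log(len(phi_grid))

log_bf = lml_fgn - lml_ar1   # log Bayes factor in favour of fGn
print(log_bf)   # both grids contain the true white-noise case
```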
3.2 Simulation results
To illustrate the capability of the Bayes factor to identify fGn versus AR(1) structure, we perform a simulation study and generate fGn processes of different lengths n ∈ {100, 200, 500}, for specified values of the Hurst parameter, scaled to have variance 1. The selected values for the Hurst parameter are H = 0.7, 0.8 and 0.9. These all give persistent fGn processes and are selected to mimic realistic values of the Hurst parameter in analysing long-memory processes. The PC prior for H is implemented using the scaling criterion P(H > 0.9) = 0.1. This implies that the rate parameter of the exponential distribution of the distance measure is λ ≈ 1.70. This rate parameter is used to find the corresponding prior for the first-lag correlation coefficient φ in fitting an AR(1) model to the data. The marginal precisions of both the fGn and AR(1) models are assigned the PC prior using U = 1 and α = 0.01 in (5).
Table 1: Strength of evidence given by the Bayes factor in favour of fGn over AR(1), for simulated fGn series of length n and Hurst parameter H. Entries are proportions of simulations; the last column sums the Positive, Strong and Very strong categories.

  n    H    False   No conclusion   Positive   Strong   Very strong   Total
 100  0.7   0.226       0.685         0.077     0.009      0.003      0.089
 200  0.7   0.232       0.493         0.192     0.063      0.020      0.275
 500  0.7   0.097       0.224         0.220     0.186      0.273      0.679
 100  0.8   0.366       0.513         0.105     0.015      0.001      0.121
 200  0.8   0.260       0.353         0.219     0.123      0.045      0.387
 500  0.8   0.072       0.099         0.121     0.153      0.555      0.829
 100  0.9   0.521       0.352         0.096     0.029      0.002      0.127
 200  0.9   0.316       0.252         0.189     0.146      0.097      0.432
 500  0.9   0.060       0.073         0.091     0.104      0.672      0.867
In evaluating the evidence given by Bayes factors, we have chosen to apply the categories given in Kass and Raftery (1995), in which the Bayes factor should be larger than 3 to give positive evidence in favour of one model. Note that these categories could also be specified in terms of twice the natural logarithm of the Bayes factor, which would have the same scale as the likelihood ratio test statistic. Table 1 illustrates that rather long time series are needed in order for the Bayes factor to give correct identification of the underlying fGn process. This seems natural, as the long-memory properties might not be clearly apparent if the time series is short. We also notice that the results improve with higher values of the Hurst parameter, for which the long-memory structure of fGn is more apparent. Especially, if H = 0.9 and the length of the time series is n = 500, we have very strong evidence of fGn in two-thirds of the cases. Naturally, the given results could be improved upon using a prior for H that puts more probability mass on the upper tail of the distribution.
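The categorisation used in Table 1 follows the thresholds of Kass and Raftery (1995); a small helper makes them explicit (the label "false" for a Bayes factor below 1 mirrors the table, since the data were simulated from fGn).

```python
def evidence_category(bf):
    """Kass & Raftery (1995) evidence categories for a Bayes factor bf
    in favour of model 1 (here: fGn over AR(1))."""
    if bf < 1:
        return "false"          # evidence points the other way
    if bf < 3:
        return "no conclusion"  # "not worth more than a bare mention"
    if bf < 20:
        return "positive"
    if bf < 150:
        return "strong"
    return "very strong"

print(evidence_category(5), evidence_category(200))
```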
3.3 Temperature data example
In this section, we analyse a real temperature data set to investigate whether we find evidence of fGn compared with AR(1) structure, using the Bayes factor. The dataset "NOAA-CIRES 20th Century Reanalysis V2c", downloaded from http://www.esrl.noaa.gov/psd/data/20thC_Rean, contains reanalysed data from the period 1851–2014. The data are available on a 2-degree latitude by 2-degree longitude global grid covering the earth, and combine temperature measurements with model-estimated data.
We assume a regression model for temperature as given in (6), where the trend is assumed to be either linear or nonlinear. The nonlinear trend is modelled using an intrinsic CAR model (Besag et al., 1991), also referred to as a second-order intrinsic Gaussian Markov random field (IGMRF) defined on a line, see Rue and Held (2005, Ch. 3). The precision parameter of the model is assigned the penalised complexity prior in (4), using the scaling parameters U and α in (5). By choosing a small value of U, the prior imposes only small deviations from a straight line. Note that the given IGMRF model needs to be scaled to have a generalised variance equal to 1, such that we can use the same prior for the precision independently of the time scale used for the analysis (Sørbye and Rue, 2014). The hyperprior specifications for the AR(1) and fGn components used to model climate noise are chosen to be the same as in the simulation study of Section 3.2.
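The scaling step (Sørbye and Rue, 2014) fixes the generalised variance, defined through the geometric mean of the marginal variances under a generalised inverse, to 1. A Python sketch for a second-order random walk on a line, using the Moore–Penrose inverse (which corresponds to constraining the null space of constant and linear trends):

```python
import numpy as np

n = 50
# structure matrix Q = D'D of a second-order random walk (RW2) on a line,
# where D is the (n-2) x n second-difference matrix
D = np.zeros((n - 2, n))
for i in range(n - 2):
    D[i, i:i + 3] = [1.0, -2.0, 1.0]
Q = D.T @ D      # rank n-2: null space spanned by constant and linear trends

# marginal variances under the generalised inverse of Q
marg_var = np.diag(np.linalg.pinv(Q))
sigma2_ref = np.exp(np.mean(np.log(marg_var)))   # geometric mean
Q_scaled = sigma2_ref * Q                        # generalised variance now 1

check = np.exp(np.mean(np.log(np.diag(np.linalg.pinv(Q_scaled)))))
print(sigma2_ref, check)
```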
Table 2: Trend and fGn parameter estimates with credible intervals (CI), and evidence of fGn versus AR(1), using a linear trend. (a) and (q) denote annual and quarterly time scales.

Time series   Trend est.   CI              H     CI            σ     CI            Evidence
Land (a)      0.006        (0.004, 0.008)  0.76  (0.67, 0.84)  0.22  (0.19, 0.26)  Strong
Land (q)      0.006        (0.004, 0.008)  0.77  (0.72, 0.82)  0.31  (0.29, 0.34)  Very strong
Sea (a)       0.004        (0.003, 0.006)  0.95  (0.91, 0.99)  0.21  (0.13, 0.37)  Strong
Sea (q)       0.004        (0.003, 0.006)  0.99  (0.97, 1.00)  0.39  (0.21, 0.74)  Very strong
Table 3: Trend and fGn parameter estimates with credible intervals (CI), and evidence of fGn versus AR(1), using a nonlinear trend. (a) and (q) denote annual and quarterly time scales.

Time series   Trend est.   CI              H     CI            σ     CI            Evidence
Land (a)      0.047        (0.015, 0.102)  0.69  (0.58, 0.79)  0.20  (0.18, 0.23)  No conclusion
Land (q)      0.038        (0.008, 0.093)  0.75  (0.69, 0.80)  0.30  (0.28, 0.33)  Positive
Sea (a)       0.041        (0.012, 0.095)  0.92  (0.83, 0.98)  0.15  (0.10, 0.23)  No conclusion
Sea (q)       0.039        (0.012, 0.091)  0.98  (0.95, 1.00)  0.34  (0.16, 0.66)  No conclusion
Using the given data, it would be interesting to study evidence of fGn for different geographical regions, but here we only report results for aggregated data over land areas and sea surfaces. We calculate the Bayes factor (8) in each case, using both an annual and a quarterly time scale. The results are displayed in Tables 2 and 3, using a linear and a nonlinear trend, respectively. The results clearly illustrate that the evidence of fGn is very sensitive to the trend model. Using a linear trend, the evidence of fGn compared with AR(1) is strong (annual data) or very strong (quarterly data). Using a nonlinear trend, the test using the Bayes factor is inconclusive, except for the quarterly land data where there is weak evidence of fGn structure. It is well known that nonlinear trends can easily be interpreted incorrectly as long memory (Beran et al., 2013). The given results illustrate that a nonlinear trend, even a very weak one (Figure 3), typically captures some of the long-range dependency structure of the time series, influencing the Bayes factor. However, it does not change the estimate of H radically. We also notice that the quarterly data give higher values for the Hurst parameter and stronger evidence of fGn than the annual data, which could partly be explained by the length of these series being four times that of the annual data.
The estimated Hurst parameter is seen to be higher for the sea-surface temperatures than the land temperatures, both using different trend models and different time scales. In fact, in the cases where the Hurst parameter is very close to 1, this could indicate that more components should be included in the model, for example to explain oscillatory behaviour. It is well known that estimation of the Hurst parameter is not robust against departures from stationarity and trends, and perhaps other models for the climate noise should also be considered. Alternatively, the variability of the trend model could be adjusted to explain more of the fluctuations of the temperatures. A big advantage of using PC priors is that such adjustments are easily made, simply by choosing a larger upper limit U in (5).
4 Concluding remarks
The framework of penalised complexity priors (Simpson et al., 2016) represents a new approach to calculate priors for hyperparameters in Bayesian hierarchical models. The PC priors are intuitive in the sense that they are designed to shrink towards a well-defined simple base model and will always have a local mode at this base model, where the distance measure is 0. Also, the parameters of a PC prior have a clear interpretation, as they govern the rate of shrinkage to the base model and thereby the informativeness of the prior. In estimating the Hurst parameter of fractional Gaussian noise, the argument of having no knowledge about the parameter has encouraged the use of a non-informative prior. If we accept that a given prior on H should give reasonable results also on the distance scale, the uniform prior needs to be ruled out. Similarly, changes in the parameters of Beta priors would shift the mode on the distance scale and pull estimates away from the base model, making interpretation of these parameters non-intuitive and non-transparent.
In calculating Bayes factors, we take advantage of the fact that PC priors are invariant to reparameterisations. This implies that priors for two different hyperparameters simply represent two different transformations of the same prior on the distance scale, as long as the corresponding model components represent flexible versions of the same base model. This is very useful in the calculation of Bayes factors, as these are known to be rather sensitive to prior choices. In the current analysis, we have focused on identifying the underlying noise term in a regression model as either AR(1) or fGn, but the given ideas could also be applied to other model components as long as these share the same base model. The given analysis demonstrates that it is difficult to separate the trend from the model noise and that potential nonlinearity in the trend could easily be interpreted as long memory. This makes it important to control the smoothness of the underlying trend, in which a linear model is considered as the base model and deviation from the base model is controlled by a PC prior on the precision parameter. In general, the use of PC priors is very helpful to control the effect of different random effects and make the model components identifiable.
The option to combine generic functions with R-INLA is a flexible and easy way to incorporate an fGn model component within the general class of latent Gaussian models. This makes it possible to analyse more complex additive regression models than the ones presented here, in which an fGn component can be combined with, for example, smooth effects of covariates, seasonal and oscillatory effects, nonlinear trend models and also spatial model components. However, to make full use of the computational power of R-INLA, the fGn model should be approximated to have Markov properties, as this would give a sparse precision matrix of the underlying latent field. One alternative to achieve this is to approximate fGn as a mix of a large number of AR(1) processes, originally suggested in Granger and Joyeux (1980). Preliminary results (Myrvoll-Nilsen, 2016) indicate that such an approximation is promising, also using only a few AR(1) components. In fact, we have already seen that aggregation of just three AR(1) processes gives an excellent approximation of fGn, as long as the weights and coefficients of the AR components are chosen in a clever way. In addition, such a decomposition of fGn can potentially be linked to linear multi-box energy balance models (Fredriksen and Rypdal, 2016), in analysing temperature data. This requires further study and will be reported elsewhere.
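The idea of approximating fGn by a weighted aggregate of a few AR(1) processes can be illustrated by matching autocorrelation functions. The sketch below uses a hypothetical, roughly log-spaced choice of AR(1) coefficients with weights fitted by least squares, not the fitted values referred to above.

```python
import numpy as np

def fgn_acf(k, H):
    # fGn autocorrelation at lag k
    return 0.5 * (abs(k + 1) ** (2 * H) - 2 * abs(k) ** (2 * H)
                  + abs(k - 1) ** (2 * H))

H = 0.8
lags = np.arange(51)
target = np.array([fgn_acf(k, H) for k in lags])

# mix of three AR(1) autocorrelations phi^k; the phi values are an ad-hoc,
# roughly log-spaced choice for illustration
phis = np.array([0.3, 0.85, 0.99])
basis = phis[np.newaxis, :] ** lags[:, np.newaxis]    # shape (51, 3)
weights, *_ = np.linalg.lstsq(basis, target, rcond=None)

approx = basis @ weights
rmse = np.sqrt(np.mean((approx - target) ** 2))
print(weights, rmse)   # a few AR(1) terms already track the power-law decay
```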
Acknowledgement
The 20th Century Reanalysis V2c data is provided by the NOAA/OAR/ESRL PSD, Boulder, Colorado, USA, from their Web site at http://www.esrl.noaa.gov/psd/. The authors wish to thank HegeBeate Fredriksen for valuable discussions and for processing the data to give aggregated land and seasurface temperatures.
References
 Abry and Veitch (1998) Abry, P. and Veitch, D. (1998). Wavelet analysis of long range dependent traffic. IEEE Transactions on Information Theory, 44:2–15.
 Benhmehdi et al. (2011) Benhmehdi, S., Makarava, N., Menhamidouche, N., and Holschneider, M. (2011). Bayesian estimation of the selfsimilarity exponent of the Nile River fluctuation. Nonlinear Processes in Geophysics, 18:441–446.
 Beran et al. (2013) Beran, J., Feng, Y., Ghosh, S., and Kulik, R. (2013). LongMemory Processes: Probabilistic Properties and Statistical Methods. Springer, Heidelberg.
 Besag et al. (1991) Besag, J., York, J., and Mollié, A. (1991). Bayesian image restoration, with two applications in spatial statistics. Annals of the Institute of Statistical Mathematics, 43:1–59.
 Fredriksen and Rypdal (2016) Fredriksen, H.B. and Rypdal, M. (2016). Longrange persistence in global surface temperatures explained by linear multibox energy balance models. Manuscript in progress.
 Giraitis et al. (2003) Giraitis, L., Kokoszka, P., Leipus, R., and Teyssiere, T. (2003). Rescaled variance and related tests for long memory in volatility and levels. Journal of Econometrics, 112:265–294.
 Granger and Joyeux (1980) Granger, C. W. J. and Joyeux, R. (1980). An introduction to longmemory time series models and fractional differencing. Journal of Time Series Analysis, 1:15–29.
 Hurst (1951) Hurst, H. E. (1951). Long-term storage capacity of reservoirs. Transactions of the American Society for Civil Engineers, 116:776–808.
 Kang and Vidakovic (2016) Kang, M. and Vidakovic, B. (2016). A note on Bayesian wavelet-based estimation of scaling. arXiv:1605.01146.
 Kass and Raftery (1995) Kass, R. E. and Raftery, A. E. (1995). Bayes factors. Journal of the American Statistical Association, 90:773–795.
 Løvsletten and Rypdal (2016) Løvsletten, O. and Rypdal, M. (2016). Statistics of regional surface temperatures after 1900: Longrange versus shortrange dependence and significance of warming trends. Journal of Climate, 29:4057–4068.
 Makarava et al. (2011) Makarava, N., Benmehdi, S., and Holschneider, M. (2011). Bayesian estimation of selfsimilarity exponent. Physical Review E, 84:021109.
 Makarava and Holschneider (2012) Makarava, N. and Holschneider, M. (2012). Estimation of the Hurst exponent from noisy data: a Bayesian approach. The European Physical Journal B, 85:1–6.
 Mandelbrot and Ness (1968) Mandelbrot, B. B. and Ness, J. W. V. (1968). Fractional Brownian motions, fractional noises and applications. SIAM Review, 10:422–437.
 Martins et al. (2013) Martins, T. G., Simpson, D., Lindgren, F., and Rue, H. (2013). Bayesian computing with INLA: New features. Computational Statistics & Data Analysis, 67:68–83.
 McCoy and Walden (1996) McCoy, E. and Walden, A. (1996). Wavelet analysis and synthesis of stationary long-memory processes. Journal of Computational and Graphical Statistics, 5:26–56.
 Myrvoll-Nilsen (2016) Myrvoll-Nilsen, E. (2016). Computationally efficient Bayesian approximation of fractional Gaussian noise using AR(1) processes. Master's thesis, Norwegian University of Science and Technology.
 Peng et al. (1994) Peng, C.K., Buldyrev, S., Havlin, S., Simons, M., Stanley, H. E., and Goldberger, A. (1994). Mosaic organization of DNA nucleotides. Physical Review E, 49:1685–1689.
 Riebler et al. (2016) Riebler, A., Sørbye, S. H., Simpson, D., and Rue, H. (2016). An intuitive Bayesian spatial model for disease mapping that accounts for scaling. Statistical Methods in Medical Research, 25:1145–1165.
 Rue and Held (2005) Rue, H. and Held, L. (2005). Gaussian Markov Random Fields: Theory and Applications, volume 104 of Monographs on Statistics and Applied Probability. Chapman & Hall, London.
 Rue et al. (2009) Rue, H., Martino, S., and Chopin, N. (2009). Approximate Bayesian inference for latent Gaussian models by using integrated nested Laplace approximations (with discussion). Journal of the Royal Statistical Society, Series B, 71:319–392.
 Simpson et al. (2016) Simpson, D. P., Rue, H., Martins, T. G., Riebler, A., and Sørbye, S. H. (2016). Penalising model component complexity: A principled, practical approach to constructing priors (with discussion). To appear in Statistical Science.
 Sørbye and Rue (2014) Sørbye, S. H. and Rue, H. (2014). Scaling intrinsic Gaussian Markov random field priors in spatial modelling. Spatial Statistics, 8:39–51.
Appendix A: Incorporating fGn models within latent Gaussian models
We have implemented the fGn model within the R-INLA package (see www.r-inla.org), which uses integrated nested Laplace approximations to do Bayesian inference for the class of latent Gaussian models (Rue et al., 2009; Martins et al., 2013). We use the generic model "rgeneric", which allows a (Gaussian) latent model component to be defined using R and passed into the inla program, which is written in C. This is a convenient feature that allows us to prototype new models easily, despite the fact that the fGn model does not have any Markov properties. In general, Markov properties would improve the computational efficiency significantly, see Rue and Held (2005, Ch. 2).
The rgeneric code to represent the fGn model is given below. We make use of the Toeplitz structure of the covariance matrix to speed up the computations. The model is defined as a function returning properties of the Gaussian model, like its graph, the precision matrix, the normalising constant, and so on.
rgeneric.fgn = function(
    cmd = c("graph", "Q", "initial", "log.norm.const",
            "log.prior", "quit", "mu", "theta.fun"),
    theta = NULL, args = NULL)
{
    stopifnot(require(ltsa) && require(FGN) && require(HKprocess))

    ## internal scale: theta[1] = log-precision, theta[2] = logit(H)
    theta.fun = function(...) {
        return(list(prec = function(x) exp(x),
                    H = function(x) exp(x) / (1 + exp(x))))
    }

    interpret.theta = function(n, theta) {
        fun = theta.fun()
        return(list(prec = fun[[1]](theta[1]), H = fun[[2]](theta[2])))
    }

    graph = function(n, theta) {
        return(inla.as.sparse(matrix(1, n, n)))
    }

    Q = function(n, theta) {
        param = interpret.theta(n, theta)
        S = toeplitz(acvfFGN(param$H, n-1))
        S = inla.as.sparse(param$prec * TrenchInverse(S))
        return(S)
    }

    log.norm.const = function(n, theta) {
        param = interpret.theta(n, theta)
        r = acvfFGN(param$H, n-1)
        val = ltzc(r, rep(0, n))
        val2 = c(-n/2 * log(2*pi) - 0.5*val[2] + n/2 * log(param$prec))
        return(val2)
    }

    log.prior = function(n, theta) {
        param = interpret.theta(n, theta)
        ## PC prior for the precision; log(prec) is the log-Jacobian for theta[1]
        lprior = inla.pc.dprec(param$prec, u = 1, alpha = 0.01, log = TRUE) +
            log(param$prec)
        lprior = lprior + PCPRIOR.H(theta[2])
        return(lprior)
    }

    mu = function(n, theta) return(numeric(0))
    initial = function(n, theta) return(c(4, 0))
    quit = function(n, theta) return(invisible())

    ## the PCPRIOR.H function must be provided
    stopifnot(exists("PCPRIOR.H") && is.function(PCPRIOR.H))
    cmd = match.arg(cmd)
    val = do.call(cmd, args = list(n = as.integer(args$n), theta = theta))
    return(val)
}
We can now include the given fGn model component together with other latent components. This is an easy example, where the function returning the log-prior for θ_H (the internal representation of H) is defined in the file fgnprior.R (not included here).
library(FGN)
n = 200
H = 0.8
source("fgn.R")
model = inla.rgeneric.define(rgeneric.fgn, n = n, R.init = "fgnprior.R")
y = scale(SimulateFGN(n, H))
r = inla(y ~ 1 + f(idx, model = model),
         verbose = TRUE,
         data = data.frame(y = y, idx = 1:n, iidx = 1:n),
         control.family = list(hyper = list(
             prec = list(initial = 12, fixed = TRUE))))
summary(r)