The role of noise modeling in the estimation of restingstate brain effective connectivity
Abstract
Causal relations among neuronal populations of the brain are studied through the socalled effective connectivity (EC) network. The latter is estimated from EEG or fMRI measurements, by inverting a generative model of the corresponding data. It is clear that the goodness of the estimated network heavily depends on the underlying modeling assumptions. In this present paper we consider the EC estimation problem using fMRI data in restingstate condition. Specifically, we investigate on how to model endogenous fluctuations driving the neuronal activity.
First]G. Prando First]M. Zorzi First]A. Bertoldo First]A. Chiuso
ystem identification, Estimation and filtering, Neuroscience
1 Introduction
A deeper understanding of the complex interactions which take place inside our brain, which currently represents a major open research challenge, could possibly open new avenues to novel terapeutic treatments as well as early diagnosis tools. It is thus not surprising that this area is attracting the interest of scientists from many different fields, as it calls for diverse expertise ranging from medical science to physics, from biology to engineering and statistics. In particular, the estimation of effective connectivity, accounting for the causal relationships among different neuronal populations, has been the subject of several research contributions in the last decade (ValdesSosa et al., 2011; Razi and Friston, 2016).
Effective connectivity is typically inferred from EEG or fMRI measurements, by inverting a generative model of the corresponding data. With the language of System Identification, estimating an effective connectivity model can be framed as the problem of inferring a dynamic network model from indirect measurements. In fact, such models should not only describe the causal interactions taking place at neuronal level (described by the socalled effective connectivity), but also capture the mapping between such activity and the available measurements. Specifically, when the fMRI technique is adopted, this mapping is called hemodynamic response and links the underlying synaptic dynamics to the socalled BOLD (Blood Oxygenation Level Dependent) signal, which is measured through fMRI. This signal reflects an increase of the neuronal activity on a change of the relative levels of oxyhemoglobin and deoxyhemoglobin.
The neuroscience literature has proposed different characterizations of the aforementioned generative models, ranging from linear to nonlinear models, from discretetime (DT) to continuoustime (CT) frameworks (see ValdesSosa et al. (2011) for a comprehensive review). In this work we consider the Dynamical Causal Model (DCM) proposed by Friston et al. (2003), which is nowadays a widespread approach in the neuroscience community (Daunizeau et al., 2011). A DCM is a CT nonlinear statespace model, where the hemodynamic response is described by the socalled BalloonWindkessel model (Buxton et al., 1998). Whereas the original version of DCM was suited for fMRI data recorded while the subject is performing a task (Friston et al., 2003), the DCM has also been recently extended to account for restingstate data (Friston et al., 2014). While in taskdependent DCM, neuronal dynamics is driven by deterministic inputs (external stimuli), in restingstate DCM it is elicited by stochastic sources, representing brain endogenous fluctuations. In the present paper, we will focus on DCM for restingstate data.
In Friston et al. (2014) and Razi et al. (2015), brain endogenous fluctuations are assumed to have a powerlaw spectrum. Such choice not only leads to a low frequencyconcentrated spectrum, but also accounts for the scalefree properties of neuronal fluctuations which have been observed in previous works (Stam and De Bruin, 2004; LinkenkaerHansen et al., 2001; Shin and Kim, 2006). Other contributions by the same authors (Friston, 2008; Friston et al., 2008) suggest the use of nonMarkovian noise processes when modeling biophysical fluctuations, such as those taking place in the brain.
In our previous work (Prando et al., 2017) we tackled the estimation of restingstate effective connectivity by adopting a linearized DCM and postulating that neuronal activity is driven by white Gaussian noise; the effective connectivity matrix was inferred by imposing a sparsity inducing prior (Wipf and Nagarajan, 2010). The remarkable advantage of such a choice is that we are able to avoid the combinatorial bottleneck which typically characterizes the DCM inversion algorithms. The main contribution of this work is twofold: first, following the modeling assumptions for the endogenous fluctuations suggested in the aforementioned literature, we extend the model used in Prando et al. (2017) allowing for a CT autoregressive (AR) model of the random fluctuations driving the neuronal activity. Second, using both real and synthetic data, we compare white vs. AR(1) noise models as sources for the restingstate neuronal activity. This comparison is carried out both studying the effect of noise models in capturing functional correlations as well as evaluating the difference between effective connectivity estimated under different modeling assumptions. Our preliminary results suggest that, indeed, a lowpass AR(1) noise is more suited in our setup.
The manuscript is organized as follows. Modeling aspects are treated in Sec. 2, with a specific focus on the modeling of brain endogenous fluctuations. An extensive simulation study is reported in Sec. 3. Concluding remarks and future investigations are outlined in Sec. 4.
Notation: denotes the identity matrix of size and the zero matrix; denotes the (block)diagonal operator. indicates the tranpose of matrix .
2 Effective Connectivity Modeling
Inference of brain effective connectivity requires the definition of a generative model of the observed BOLD timeseries measured by fMRI scanners. Here we use the DCMs introduced in (Friston et al., 2003, 2014); these are nonlinear statespace models which jointly describe the neuronal activity and the map from the neural activity to the BOLD signal. Denoting with the activity of neuronal populations at time , its evolution is described as
(1) 
where is the effective connectivity matrix. The process represents random fluctuations, whose modeling is the main focus of this work. Previous contributions (Friston et al., 2014; Razi et al., 2015) have postulated that has a powerlaw power spectrum:
(2) 
with and being parameters to be inferred and denoting the angular frequency.
The DCM specification is completed by the socalled BalloonWindkessel model (Buxton et al., 1998), a 4thorder nonlinear SISO system which takes as input each and outputs the measured BOLD signal :
(3)  
(4) 
The model depends on the parameters that are characterized in the literature by empirical priors and have to be estimated from data (Friston et al., 2003). The states are biophysical quantities that are affected by the neuronal activity: denotes the vasodilatatory signal, is the blood inflow, and are respectively the blood volume and the deoxyhemoglobin content. The output equation (4) includes the measurement noise , while the signal component depends on the resting blood volume fraction (typically ) and on the constants , and . We refer the reader to Stephan et al. (2007) for more details.
In our previous work (Prando et al., 2017) we proposed an alternative formulation of the DCM in Eqs. (1)(4). First, since fMRI data are acquired with a sampling time 2 s., we adopt a sampled version of Eq. (1) and model as a CT white noise with intensity . Then, defining , we discretize Eq. (1) as
(5) 
where the noise process satisfies:
(6) 
Second, we have proposed a statistical linearization of the BalloonWindkessel model, exploiting the empirical priors for the hemodynamic parameters . Specifically, we have reformulated model (3)(4) as an FIR model
(7) 
with a prior on derived by the prior on . The length of the impulse response is chosen large enough to capture all the relevant dynamics (see Sec. III in Prando et al. (2017)). Using the linearisation (7) and defining
(8)  
(9)  
(10)  
the resting state DCM becomes a linear stochastic statespace model; namely:
(11) 
Exploiting the linearity of model (11) as well as a suitable sparsity inducing prior on , in Prando et al. (2017) we have introduced an EMtype iterative procedure which is much less computationally demanding than variational methods typically employed to invert classical DCMs. For reasons of space we refer the interested reader to Prando et al. (2017) for details on the estimation algorithm.
Choosing the most suitable statistical description of the endogenous fluctuations in Eqs. (1) and (5) is subject of current research and debate in the scientific community. The simplest choice, which we have followed in Prando et al. (2017), is to model as CT white noise. Instead, motivated by the findings of LinkenkaerHansen et al. (2001); Stam and De Bruin (2004), Friston et al. (2014) and Razi et al. (2015) postulate that the power spectrum of is concentrated at low frequencies. Following this line of work, in this paper we reformulate system (11) assuming to be a 1storder CT autoregressive process.
2.1 Vector Autoregressive (VAR) Process Noise
According to the discussion in the previous section, we could either model in (1) as a CT AR process, or we could directly consider the discrete time counterpart Eq. (5) and model as DT coloured noise. Following the first option we model using the CT model
(12) 
where is a CT white noise with intensity and is an Hurwitzstable matrix. Alternatively, can be described by the discretetime model
(13) 
with Schurstable matrix and a DT white noise with covariance . These two options lead in general to different models. However, in the special case , the models in (12)(13) are equivalent under a proper choice of and . Indeed, sampling Eq. (12), we get
(14) 
Using (2.1), we have
(15)  
Now, assuming and recalling Eq. (6), we can rewrite Eq. (15) as the DT AR model
(16) 
where has covariance
(17) 
which directly depends on the dynamics of model (1). Hence, the two DT AR models in Eqs. (13) and (16) coincide only if and is set equal to .
Given previous neuroscience studies (Stam and De Bruin, 2004) and the fact that directly represent the brain endogenous fluctuations, in the following we shall work with the AR(1) CT noise model (12) where , . Under this hypothesis Eq. (1) can be rewritten as
(18) 
where the last equation defines and . The CT model (18) can be discretised obtaining:
(19) 
where, again, the notation has been used (analogously for and ) and is a stationary white noise with covariance
(20) 
Using the statistical linearization of the BalloonWindkessel model as above, the linear stochastic state space model
(21) 
is obtained where, letting ,
(22) 
The process noise is white with covariance . It should be noticed that the order of model (21) has increased only by , w.r.t. that of (11). The algorithm in Prando et al. (2017) can also be applied to this scenario with obvious modifications. We thus refer the reader to Prando et al. (2017) for details. {rem} We should stress that modeling as an AR(1) process coincides with a specific instance of the hierarchical model proposed by Friston (2008) (see Eqs. (2)(3)), which arises when the dynamics is expressed only in terms of the first two generalized coordinates of motion.
3 Experiments
This section aims at validating (or invalidating) the assumption that the endogenous fluctuations should be modeled as an autoregressive process. To this purpose we shall perform experiments on both synthetic as well as real data. Concerning the synthetic experiment, we shall simulate fMRI signals exciting the simple brain model (1)(4) with endogenous brain fluctuations which are either white or AR(1) noise. Both for simulated and real data we estimate the effective connectivity (EC), using either model (11) and (21) (assuming diagonal). We shall then compare the results with the goal of understanding which modelling assumption is most suited to describe the real fMRI data.
3.1 Synthetic Data.
Synhetic neural responses are simulated using the discretizations (5) and (19) with s, with the “true” EC matrix given by
(23) 
The BOLD signal is then obtained giving these simulated neural responses as inputs to the BalloonWindkessel model (3)(4). The output is then downsampled to s (the sampling rate commonly used in fMRI scanners), obtaining estimation () and test () datasets , , each containing a BOLD timeseries of length . For each dataset the Balloon model parameters are drawn from their empirical priors reported in Friston et al. (2003) and the measurement noise variance (see Eq. (4)) is chosen so as to guarantee SNR=10. When using AR input noise (see (19)) the coefficients are drawn for each MonteCarlo run from a uniform distribution .
3.2 Real Data.
Real BOLD timeseries from 333 brain regions (ROIs) are measured with s on two subjects at rest on 10 minutes scanning sessions, corresponding to BOLD timeseries of length .
The 333 ROIs are then clustered in socalled functional networks. Here we focus on the Visual Network (VIS), responsible for the human vision and on the Default Mode Network (DMN), which collects the most active areas during restingstate condition. In Sec. 3.5 we will use the BOLD timeseries in VIS and DMN (denoted with and respectively) to infer the EC restricted to these networks.
Finally, the 333 BOLD time series are reduced to 53 components by averaging on functionally homogenous areas, obtaining the dataset which is used to infer the wholebrain EC.
3.3 Modeling Assumptions
We estimate the EC matrix under three different modeling assumptions (denoted below with ); in brackets the corresponding symbol used for the estimated EC:

White noise ()

AR(1) noise with ()

AR(1) noise with ()
3.4 Evaluation Metrics
The performance are evaluated according to two metrics: one directly considers the estimated EC, while the other involves an indirect measure based on the socalled functional connectivity (FC). As we shall see below, FC is nothing but the correlation between BOLD signals measured in different regions, and thus is a measure of statistical dependencies among brain regions. The interested reader is referred to Razi and Friston (2016) for further details on the distinction and relation between EC and FC.
Effective Connectivity Metrics.
In the synthetic case, where groundtruth EC is available, we can evaluate how close the estimated EC is to the “true” one using:

The RootMeanSquaredError (RMSE)
(24) where the notation denotes the matrix with its diagonal set to 0
^{2} . 
The number of errors on the sparsity pattern of the EC matrix, that is
(25) where , if , otherwise.
On real data, no groundtruth is available and thus the two metrics above cannot be computed. Instead, we can quantify the dissimilarity arising from the different modeling assumptions listed in Sec. 3.3 by means of the Pearson’s correlation coefficient among the estimated ECs. For two generic vectors and , this is defined as
(26) 
where and analogously for .
Hence, for each pair of estimates within the set we compute
(27) 
where denotes the vectorization of . For synthetic data the same dissimilarity measure can also be computed w.r.t. the groundtruth, i.e. between the true and the estimated .
Functional Connectivity Metrics.
FC between two brain regions is the Pearson’s correlation coefficient between the corresponding BOLD timeseries. Hence, the FC matrix computed from a dataset containing timeseries , , is given by
(28) 
with as defined in Eq. (26). We also consider the FC matrix obtained from the output correlations of the estimated models, where the EC has been estimated according to modeling assumption , namely
(29) 
where and is the solution of the Lyapunov equation if model (11) is used. Alternatively, if model (21) is adopted, is the solution of .
To assess how well the estimated EC (and thus the model), under the modeling assumption , is able to capture the empirical FC, we evaluate the following Pearson’s correlation coefficient
(30) 
where and respectively denote the vectorization of the upper diagonal parts of and of the FC matrix computed with the estimation data .
When dealing with the synthetic experiment we can also use the test dataset to compute .
3.5 Results
Synthetic Data.
Figs. 1 and 2 show the results obtained in the synthetic datasets in terms of metrics (24) and (25), respectively. The three modeling assumptions give rise to similar performance, independently of how the data have been generated. The unique exception is observed on the data generated from VAR process noise when a white noise is postulated for the brain endogenous fluctuations : the RMSE is significantly larger than that obtained assuming a colored process noise (that is, under assumptions AR and VAR). However, such poor performance does not negatively impact the metrics based on FC, that is and , as can be noticed in Figs. 34. The similarity of the performance produced by the three modeling assumptions is particularly evident in Fig. 4, which reports the correlation between the output FC of the estimated model and the empirical FC computed on test dataset. These outcomes show that all the estimated models are equally able to reproduce both estimation and test data, even if the EC matrix is not correctly inferred, thus suggesting a possible identifiability problem. Furthermore, these results seem to discredit the reliability of metrics in assessing the goodness of the estimated EC. However, such performance index is commonly used for this purpose when a ground truth is not available, that is when real data are treated.
If we consider the correlation among the estimated ECs the situation is quite different: when data are generated by a white process noise (Fig. 5(a)), takes values which are very close to 1, regardless of the modeling assumption. On the other hand, when a VAR process is responsible for the excitation of model (21) (Fig. 5(b)), the EC estimated under hypothesis is much less similar to those obtained from hypothesis and . In this case is still large, but smaller than what is reported in Fig. 5(a). The correlations between the true EC matrix and the estimated ones (first row of plots in Fig. 5) reflect what we observe in Figs. 1 and 2: while , , are very similar in Fig. 5(a), in Fig. 5(b) the ECs estimated postulating a white process noise appear quite dissimilar from the true one, w.r.t. to those returned by and modeling assumptions.
Real Data.
Considering real data, we have to rely only on the metrics and since a ground truth is not available. Table 1 reports the correlations between the FCs () computed from the BOLD timeseries and the model FCs () produced by the systems estimated under the modeling assumptions . Not surprisingly, in all the three datasets, these hypothesis give rise to equal values, in agreement with the results obtained on synthetic data. However, if we inspect the correlation between the estimated ECs (Fig. 6), the situation more closely resembles what reported in Fig. 5(b) for synthetic data. Indeed, the inferred ECs result to be quite different, according to the similarity metrics we adopt. This could suggest that modeling the brain endogenous fluctuations as a VAR process may be a more realistic hypothesis. However, recalling that the synthetic scenario considers a network of 7 brain regions, while the simulations on real data refer to 50 regions, we should stress that the previous statement requires further numerical experiments, as well as sound statistical analysis, to be validated. Moreover, we should observe how takes much smaller values in the real data scenario, w.r.t. what reported in the synthetic setup. Hence, further analyses are required also to understand the reasons of this disagreement.
Dataset  Modeling Assumption  

W  AR  VAR  
0.93  0.92  0.92  
0.94  0.95  0.94  
0.92  0.92  0.92 
4 Conclusions
In this work we have tackled the estimation of brain effective connectivity from both synthetic and real restingstate fMRI data. Following our previous work (Prando et al., 2017), we describe the dynamics of the neuronal activity by means of a linear statespace model, which also includes a linearized hemodynamic response, in charge of mapping the neuronal activity to the socalled BOLD signal, measured by fMRI. At rest, brain activity is elicited by endogenous fluctuations, which play the role of process noise in the aforementioned statespace system. While in Prando et al. (2017), this was considered white, here we reformulate our generative model in order to account for the presence of colored process noise. Specifically, we assume it to be a 1storder autoregressive process. Such hypothesis is motivated by previous works in the neuroscience community, where brain fluctuations are characterized as lowfrequency signals (Friston et al., 2014; LinkenkaerHansen et al., 2001).
Not surprisingly, using an AR process noise in the generative model seems a more robust choice, independently of the noise type which has actually generated the data. Furthermore, despite its widespread use in neuroscience, we argue that functional connectivity does not allow to discriminate between the modeling assumptions of white or AR process noise. This is to be expected since FC is just a zero lag output correlation, which is largely insufficient to identify the model parameters.
Finally, evaluating the correlation among the ECs estimated under different modeling assumptions, we find a partial agreement between the results on the real data and those obtained on synthetic data generated by an AR process noise. However, we believe that deeper investigations are required to validate such statement. To this purpose we are conducting bootstrap simulations to compare synthetic and real experiments on solid statistical grounds.
Footnotes
 thanks: This work has been partially supported by the project BIRD162411/16 and the FIRB project RBFR12M3AC funded by MIUR.
 The diagonal elements of (i.e. the selfconnections) do not give any useful information on EC.
References
 Buxton, R.B., Wong, E.C., and Frank, L.R. (1998). Dynamics of blood flow and oxygenation changes during brain activation: the balloon model. Magnetic resonance in medicine, 39(6), 855–864.
 Daunizeau, J., David, O., and Stephan, K.E. (2011). Dynamic causal modelling: a critical review of the biophysical and statistical foundations. Neuroimage, 58(2), 312–322.
 Friston, K. (2008). Hierarchical models in the brain. PLoS computational biology, 4(11), e1000211.
 Friston, K.J., Harrison, L., and Penny, W. (2003). Dynamic causal modelling. Neuroimage, 19(4), 1273–1302.
 Friston, K.J., Kahan, J., Biswal, B., and Razi, A. (2014). A dcm for resting state fmri. Neuroimage, 94, 396–407.
 Friston, K.J., TrujilloBarreto, N., and Daunizeau, J. (2008). Dem: a variational treatment of dynamic systems. Neuroimage, 41(3), 849–885.
 LinkenkaerHansen, K., Nikouline, V.V., Palva, J.M., and Ilmoniemi, R.J. (2001). Longrange temporal correlations and scaling behavior in human brain oscillations. Journal of Neuroscience, 21(4), 1370–1377.
 Prando, G., Zorzi, M., Bertoldo, A., and Chiuso, A. (2017). Estimating effective connectivity in linear brain network models. In 56th IEEE Conference on Decision and Control.
 Razi, A. and Friston, K.J. (2016). The connected brain: causality, models, and intrinsic dynamics. IEEE Signal Processing Magazine, 33(3), 14–35.
 Razi, A., Kahan, J., Rees, G., and Friston, K.J. (2015). Construct validation of a dcm for resting state fmri. Neuroimage, 106, 1–14.
 Shin, C.W. and Kim, S. (2006). Selforganized criticality and scalefree properties in emergent functional neural networks. Physical Review E, 74(4), 045101.
 Stam, C.J. and De Bruin, E.A. (2004). Scalefree dynamics of global functional connectivity in the human brain. Human brain mapping, 22(2), 97–109.
 Stephan, K.E., Weiskopf, N., Drysdale, P.M., Robinson, P.A., and Friston, K.J. (2007). Comparing hemodynamic models with dcm. Neuroimage, 38(3), 387–401.
 ValdesSosa, P.A., Roebroeck, A., Daunizeau, J., and Friston, K. (2011). Effective connectivity: influence, causality and biophysical modeling. Neuroimage, 58(2), 339–361.
 Wipf, D. and Nagarajan, S. (2010). Iterative reweighted and methods for finding sparse solutions. IEEE Journal of Selected Topics in Signal Processing, 4(2), 317–329.