Bayesian Joint Modeling of Multiple Brain Functional Networks
Abstract
Brain function is organized in coordinated modes of spatiotemporal activity (functional networks) exhibiting an intrinsic baseline structure with variations under different experimental conditions. Existing approaches for uncovering such network structures typically do not explicitly model shared and differential patterns across networks, thus potentially reducing the detection power. We develop an integrative modeling approach for jointly modeling multiple brain networks across experimental conditions. The proposed Bayesian Joint Network Learning approach develops flexible priors on the edge probabilities involving a common intrinsic baseline structure and differential effects specific to individual networks. Conditional on these edge probabilities, connection strengths are modeled under a Bayesian spike and slab prior on the offdiagonal elements of the inverse covariance matrix. The model is fit under a posterior computation scheme based on Markov chain Monte Carlo. Numerical simulations illustrate that the proposed joint modeling approach has increased power to detect true differential edges while providing adequate control on false discoveries and achieving greater accuracy in the estimation of edge strengths compared to existing methods. An application of the method to fMRI Stroop task data provides unique insights into brain network alterations between cognitive conditions which existing graphical modeling techniques failed to reveal.
Keywords: Brain networks; Dirichlet process; multiple graphical models; spike and slab prior; Stroop task.
1 Introduction
The study of the neural bases of human cognition has made rapid progress with the advent of modern brain imaging techniques. Among these, functional Magnetic Resonance Imaging (fMRI) has allowed researchers to associate specific sets of brain regions with the performance of a variety of cognitive, sensory, and motor tasks by detecting activationrelated local changes in the bloodoxygenleveldependent (BOLD) signal. In recent years, the interest of the neuroimaging community has somewhat shifted from localized brain activation to functional connectivity, that is, how different regions of the brain change their activity together as a functionally coherent circuit (see Smith et al., 2011, for a review).
Research on functional connectivity involves investigations of resting state fMRI and taskbased fMRI seeking to identify coherent patterns of spatiotemporal activity during rest and task, respectively (Biswal et al., 1995, 1997). This has had an important impact in clinical and translational research, with applications in clinical diagnosis and surgical planning (Lee et al., 2013). Functional connectivity is most often assessed in terms of brain networks, i.e. sets of connections between different brain regions under a graphtheoretic approach, where each connection represents a path of information transmission. Several approaches have been proposed for modeling the brain network in terms of a graph; these include pairwise correlation analysis (Stam and Reijneveld, 2007; Supekar et al., 2008), partial correlation analysis (Salvador et al., 2005; Wang et al., 2016), and sparse inverse covariance or precision matrix estimation (Huang et al., 2010; Wang et al., 2016). Smith et al. (2011) showed the precision matrix approach and the related partial correlations method to be very successful in distinguishing a true, direct functional connection between two nodes from an apparent one mediated by a third common cause.
Many researchers are interested in comparing brain networks across cognitive states induced by experimental conditions with the aim of identifying functional connections whose strengths reflect differences or commonalities between these conditions (Fair et al., 2007; Fox et al., 2007; Greicius et al., 2003; Buckner et al., 2013; Hermundstad et al., 2013; Mennes et al., 2013). Under a graphtheoretic approach, edges featuring differential strengths correspond to brain connections that are more activated or suppressed during one experimental condition as compared to others. Such edges are potentially of great clinical and translational significance. On the other hand, connections shared across experimental conditions may represent an intrinsic functional network architecture which is common across varying cognitive states (Fox et al., 2007; Vincent et al., 2007; Cole et al., 2014). The comparison of brain networks across multiple conditions may be performed on a single subject or, as in our case, at a group level (Huang et al., 2010; Smith et al., 2011), with the latter being able to average out subjectspecific idiosyncrasies and potentially providing greater power to detect underlying biological differences and similarities (Smith et al., 2013; Kim et al., 2014).
Existing approaches for comparing multiple brain networks typically estimate each brain network separately and then use massunivariate hypothesis testing to infer significant differences between these networks while controlling the familywise error rate (Genovese et al., 2002; Eguiluz et al., 2005; van den Heuvel et al., 2008). These approaches, although valuable, may have reduced power to detect true differences (Fornito et al., 2013) and suffer from inflated false positive rates (Zalesky et al., 2010) due to a massive number of multiple comparisons, since they do not borrow information across networks. Alternatively, network metrics and statistics can be used to avoid testing every possible connection (Zalesky et al., 2010; Fornito et al., 2013; Meskaldji et al., 2011). These techniques improve statistical power but come at the cost of reduced explanatory value since they typically do not provide inferences at the level of individual connections.
Recently, there has been a limited growth in the development of penalized approaches for the joint estimation of multiple graphical models. These approaches (Guo et al., 2011; Danaher et al., 2014; Zhu et al., 2014) rely on penalization to enforce sparsity and typically smooth over the strength of connections across networks to enforce shared edges, which is a useful modeling assumption but may not be supported in practical brain network applications. With the exception of a recent work by Belilovsky et al. (2016), who developed a penalized neighborhood selection approach to obtain point estimates for brain networks, very few existing penalized methods have been vetted for estimating multiple brain networks, to our knowledge. Unfortunately, the approach by Belilovsky et al. (2016) cannot be used to obtain positive definite precision matrices, precluding accurate quantification of edge strengths in terms of partial correlations. Moreover, the above penalized approaches only report point estimates; they do not provide measures of uncertainty for the brain networks, which are often desirable in accounting for the underlying heterogeneity in a group level analyses, as well as quantifying estimation errors in brain imaging studies. Kim et al. (2015) suggested that comparing brain networks based on penalized approaches may result in misleading inferences since the estimated network differences may be artifacts resulting from estimation errors under point estimates.
On the other hand, while Bayesian approaches have proven extremely useful in estimating individual brain networks (see Mumford and Ramsey (2014) for a review), few attempts have been made to develop Bayesian methods for the joint estimation of multiple graphical networks. Some existing Bayesian methods for jointly estimating multiple graphs include the approach by Yajima et al. (2012), who focused on multiple directed acyclic graphs, and the Bayesian Markov random field approach by Peterson et al. (2015) for estimating multiple proteinprotein interaction networks. The former cannot be used to obtain undirected brain networks which is the focus of this article, and the latter is only applicable to examples involving a small number of nodes. We note that there is some recent work on jointly estimating multiple temporally dependent brain networks (Qiu et al., 2016; Lin et al., 2017), but these approaches cannot be directly generalized for the integrative analysis of multiple brain networks across different experimental conditions or cohorts. The above discussion suggests a clear need for developing flexible Bayesian approaches for joint estimation of multiple brain networks which pool information across graphs to provide more accurate inferences.
In this article, we develop a Bayesian Gaussian graphical modeling approach for estimating multiple networks. This approach models the probability of a connection as a parametric function of a baseline component shared across networks and differential components unique to each network. The shared and differential effects are modeled under a Dirichlet process mixture of Gaussians prior (Müller et al., 1996), and the edge probabilities are estimated by pooling information across experimental conditions, thereby resulting in the joint estimation of multiple brain networks. The role of the edge probabilities is twofold  they characterize uncertainty in network estimation and enable direct testing of shared and differential patterns across networks after multiplicity corrections. The connection strengths are encapsulated via network specific precision matrices, which are modeled separately for each network under a spike and slab Bayesian graphical lasso prior informed by the above edge probabilities. Adopting a joint modeling approach results in more accurate estimation of edge strengths and significantly improved power to detect true differential connections while ensuring adequate control for false discoveries, as demonstrated via extensive numerical experiments. The approach, denoted as Bayesian Joint Network Learning (BJNL), is implemented via a fully Gibbs posterior computation scheme which proceeds via Markov chain Monte Carlo (MCMC).
Our method was applied to a fMRI Stroop task experiment (Stroop, 1935) in which data were collected under blocks of passive fixation and blocks of task performance, requiring the participants to alternately execute the same task with a maximum or minimum degree of voluntary effort investment. We sought to test the ability of the method to assess varying degrees of network differences between passive fixation and task performance, as well as between effortful and relaxed task performance, while also estimating the commonalities across the conditions. Results from BJNL provide insights into how the brain network reorganizes under different cognitive conditions. Specifically, we found that brain connections are dramatically different when subjects are actively engaged in the task as compared with the rest condition. In addition to the identification of a considerable number of connections that were altered under the task vs. passive fixation, we found the topological characteristics of the brain networks to be different. For example, the rest condition was found to be associated with more efficient information transmission. On the other hand, a comparison of brain networks under the two task conditions corresponding to a varying degree of cognitive exertion revealed similar topological features but also demonstrated alterations in brain networks that are associated with high level cognitive processing. In addition, our approach characterized the uncertainty inherent in the group level analysis in terms of edge probabilities and posterior distributions for different graph metrics. In contrast, an alternate analysis involving separate estimation of multiple brain networks using the graphical lasso approach (Friedman et al., 2008), followed by permutation testing, did not reveal any significant differences as elaborated in Section 5.
This paper is organized as follows. In Section 2 we describe the fMRI data set and illustrate the proposed approach. Section 3 provides details about the posterior computation strategy for implementing the BJNL. Section 4 reports extensive numerical studies comparing the proposed BJNL to competing approaches. Section 5 discusses the results of the application of the method to the fMRI data set, and Section 6 summarizes our findings. The Supplementary Materials contain explicit posterior computation steps, as well as additional numerical results. For ease of use and as a starting point for further extension of the method, we provide a Matlab GUI implementation of our method in the Supplementary Materials as well.
2 Methodology
2.1 Description of the fMRI data set
Fortyfive volunteers participated in the study. All subjects were right handed with an average age of 21.9 (SD = 2.2) years. MRI scanning was performed at the N.O.C.S.A.E Hospital in Baggiovara (MO), Italy, using a 3T Philips Achieva scanner. For each subject, the imaging session consisted of the collection of 6 echoplanar imaging (EPI) runs (112 volumes each, TR=2.5s, 25 axial slice, mm voxels) and a T1weighted highresolution volume (180 sagittal slices, 1mm isotropic voxels) for anatomical reference. While in the scanner, subjects performed a 4color version of the Stroop task with a buttonpress response modality (Gianaros et al., 2005). In this task, subjects are presented with a color word displayed in colored fonts in the center of a computer screen and are asked to press a button on a response device corresponding to the font color of the stimulus. There are two types of trials: congruent trials, where the font color matches the text (e.g., the word ‘RED’ in red fonts), and incongruent trials, where the font color does not match the text (e.g., the word ‘RED’ in green fonts). The ‘Stroop effect’ refers to a significant slowing of response times to the incongruent trials compared to the congruent ones (Stroop, 1935). Figure 1 illustrates the Stroop task experiment.
Stimuli were presented in (task) blocks of 30s containing 6 congruent and 6 incongruent trials appearing in a pseudorandom order with a 2.5s intertrial interval. Each task block was alternated with 25sblocks of passive fixation on a centrally presented cross. Six fMRI runs were collected for each subject, with each run consisting of 4 blocks of task and 5 blocks of passive fixation appearing in ABABABABA order (A=passive fixation, B=task). Crucially, subjects were instructed to perform oddnumbered runs “with maximum exertion” (EXR condition) and evennumbered runs “as relaxed as possible” (RLX condition). This scheme was reversed for a subset of volunteers to check for potential order effects. A major aim of the study was to compare the brain connectivity under REST (passive fixation) and the two TASK conditions (RLX and EXR).
2.2 Bayesian modeling of multiple networks
We develop a novel Bayesian approach for jointly estimating multiple grouplevel brain functional networks from multisubject fMRI data. For each subject, the data are demeaned and prewhitened across time points, where the prewhitened fMRI observations are considered statistically independent. The prewhitened fMRI data over nodes or regions of interest (ROI) for the th subject and th experimental condition at time point is denoted by . Our goal is to jointly estimate multiple networks denoted by using Gaussian graphical models characterized by sparse inverse covariance matrices. The graph is defined by the vertex set containing nodes and the edge set containing all edges/connections in the graph .
The prewhitened fMRI measurements for th experimental condition are modeled as , where
(1) 
where denotes the prior distribution, and denote the strength and probability of the functional connection between nodes and for network respectively, denotes the space of all positive definite matrices, denotes the indicator function, is the intractable normalizing constant for the prior on the precision matrix, denotes a variate Gaussian distribution with mean and covariance , and and denote the exponential and double exponential distributions with scale parameters and respectively. Small values of the scale parameters and in equation (3) result in a spike and slab prior (George and McCulloch, 1993) on the precision offdiagonals, so that is denoted as the spike and slab Bayesian graphical lasso. The spike and slab prior shrinks the values corresponding to absent edges toward zero and encourages values away from zero for important connections. The slab component is modeled under a Gaussian distribution having thick tails under small values of the precision parameter, while the spike component is modeled under a double exponential distribution having a sharp spike at zero under a large value of . It is straightforward to show that so that the prior in model (1) is proper using the results in Wang et al. (2012).
Pooling Information Across Experimental Conditions: Information is pooled across experimental conditions to estimate the edge weights , leading to joint estimation of multiple networks. Note that by pooling information to model the edge probabilities instead of the edge strengths, we are able to jointly model multiple brain networks without constraining the edge strengths in separate networks to be similar. The prior weights represent the unknown probabilities of having functional connections, and are modeled via a parametric link function comprising unknown shared and differential effects as described below
(2) 
for where is the parametric link function relating the probability for edge in network to the network specific differential effect () and common effect () across all networks, and denotes a Dirichlet process mixture prior defined by the precision parameter and base measure . The Dirichlet process mixture prior induces a flexible class of distributions on the edge probabilities, and also results in clusters of edges having the same prior inclusion probabilities, which enforces parsimony in the number of model parameters. The number of clusters and the cluster sizes are unknown and controlled via the precision parameter (Antoniak, 1974).
Under specification (2), the baseline effect represent the shared feature for edge which is estimated by pooling information across experimental conditions, resulting in the joint estimation of multiple networks. The baseline effect controls the overall probability of having an edge across all networks, while the differential effects contribute to the network specific variations which are estimated using the information from individual experimental conditions. For example, large differences between and potentially imply a differential status for edge between and . On the other hand when , the model specifies equal probability for edge in networks and . For ease in interpretability we choose a logistic form link in (2) as , so that can be interpreted as the log odds of having the edge in the network , and the log odds ratio of having edge in the brain network versus can be expressed as (). A schematic representation of the proposed model is illustrated in Figure 2.
Note that the parameters in (2) are not identifiable since for any real constant . However, the quantities of interest such as the log odds (), the logodds ratio (), and the edge probabilities themselves are clearly identifiable, which is adequate for our purposes. The proposed specification (2) is purposely overcomplete, which is an issue routinely arising in Bayesian models. Such overcompleteness ensures computational efficiency and interpretability and is designed to avoid any problems in identifiability of parameters and functionals of interest  refer, for example, to Ghosh and Dunson (2009).
3 Posterior Computation
We design a block Gibbs sampler in order to fit the proposed model (1). The sampler enables data adaptive shrinkage by introducing latent scale parameters to sample the precision matrix offdiagonals corresponding to the spike component under a scale mixture representation of Gaussians while defining conjugate priors on the precision parameters in the slab component. Define edge inclusion indicators as if edge is included in , and otherwise, where . The augmented likelihood for equation (1) can be written as
(3) 
where , , corresponds to a Gamma distribution with mean , and is the intractable normalizing constant which cancels out in the expression for to yield a marginal prior as in (1) after integrating out . In our implementation we prespecify to ensure a sharp spike at zero leading to strong shrinkage for precision offdiagonals corresponding to absent edges. On the other hand, we choose and such that is small, enabling adaptive thick tails for the Gamma prior on the latent scale parameters corresponding to the slab component.
We choose a logistic link function in (2) for our purposes, although more general link functions can also be used. For implementing a fully Gibbs sampler, we rely on an approximation to the logistic function using a probit link, which employs a data augmentation scheme as in O’brien and Dunson (2004). In particular,
where denotes a tdistribution, corresponds to a inverse Gamma distribution, , and is the Gaussian latent variable used for data augmentation. This approximation results in sampling from a posterior that is approximately equal to the posterior under specification (1)(2) using a logistic link function. Although such an approximation is used, we note that the resulting posterior computation is fully Gibbs since all MCMC samples are drawn from exact posterior distributions. Alternatively, one could adapt the Polyagamma data augmentation in Polson et al. (2013) for Bayesian logistic regression. However, the approximation in O’brien and Dunson (2004) works reasonably well in a wide variety of numerical studies in our experience.
Moreover, the stickbreaking representation (Sethuraman, 1994) is used for the Dirichlet process mixture prior in (2), which facilitates posterior computation and can be written as
(4) 
where represents a Beta distribution. The slice sampling technique (Walker, 2007) is used to sample the atoms from the infinite mixture in (2), which significantly expedites computation. A stepbystep description of the posterior computation is provided in the Supplementary Materials.
Edge Detection: The important edges in the graph under the proposed approach can be estimated by either including edges with high marginal inclusion probabilities or those with nonnegligible absolute values for the precision offdiagonal elements, lying above a chosen threshold. We propose a strategy to choose such thresholds in a manner which controls the false discovery rate. Denoting as the marginal posterior exclusion probability for edge in network , one can compute the false discovery rate (FDR) as in Newton et al. (2004) or Peterson et al. (2015) as
(5) 
depending on whether the edges are included based on posterior inclusion probabilities or edge strengths. Clearly the FDR increases with , and one can choose a suitable threshold to control the FDR. In our numerical experiments we found that choosing the edges based on whether the absolute edge strengths were greater than results in overall good numerical performance and adequate control of FDR across a wide spectrum of scenarios. Hence we recommend this as a default threshold for estimating the network under our approach, and we note that the corresponding threshold for posterior probability for edge selection can be obtained as one which yields similar FDR as computed using expression (5). Once all the networks have been estimated using this strategy, the differential edges are identified as those which show up in one network but not the others.
The proposed BJNL also provides a natural framework for testing differences in edge strengths between experimental conditions. As opposed to penalized likelihood approaches, which can only provide point estimates for the partial correlations, the MCMC samples from BJNL can be used to obtain the posterior distribution for differences in partial correlations. These differences can be Ztransformed using Fisher’s method to obtain normally distributed values, which can then be tested for significance using ttests after controlling for false discoveries (Benjamini and Hochberg, 1995). In our experience based on extensive numerical studies we find that the differential edges based on the FDRcorrected criteria and the partial correlation based approach are overwhelmingly common; however, there could be some additional differences detected under the ttest due to significant variations in edge strengths for important edges.
4 Numerical Studies
4.1 Simulation Setup
We conducted a series of simulations to compare group level network estimation between BJNL and two popular penalized network estimation methods: the graphical lasso (GL) (Friedman et al., 2008), which estimates networks separately, and the Joint Graphical Lasso (JGL) (Danaher et al., 2014) under a fused lasso penalty, which pools information across graphs to jointly estimate multiple networks. The JGL and the graphical lasso were implemented using the JGL and glasso packages in R, respectively. Our method was implemented in Matlab, version 8.3.0.532 (R2014a), and a GUI implementing the method has been submitted as a Supplemental Material.
The data for the simulation study was generated under a Gaussian graphical model for =60 subjects with =300 time points each and for dimensions . Each subject had data corresponding to two experimental conditions having networks with shared and differential patterns. We considered three different network structures: (a) ErdosRenyi networks which randomly generate edges with equal probabilities, (b) smallworld networks generated under the WattsStrogatz model (Watts and Strogatz, 1998), and (c) scalefree networks generated using the preferential attachment model (Barabási and Albert, 1999) resulting in a hub network. For each type of network, we obtained an adjacency matrix corresponding to the first experimental condition, and then flip a proportion of the edges in this adjacency matrix to obtain the second network, adding edges where there were no edges and removing an equal number of edges. The proportion of flipped edges was set to 25%(low), 50%(medium), and 75%(high), which correspond to varying levels of discordance between the experimental conditions.
After generating the networks, the corresponding precision matrices were constructed as follows. For each edge, we generated the corresponding offdiagonal element from a Uniform(1,1) distribution and fixed the diagonal elements to be one and the offdiagonals corresponding to absent edges as zero. In order to ensure that the resulting precision matrices were positive definite, we subtracted the minimum of the eigenvalues from each diagonal element of the generated precision matrix. To enable a group level comparison for each scenario, all subjects had the same network across all time points within each experimental condition and the same precision matrices for each network.
Tuning: We used BJNL with 1000 burnin iterations and 5000 MCMC iterations. We specified the tuning parameters as follows. We chose and with and in prior specification (3) to enforce a sharp spike at zero and thick tails for the slab component. The stick breaking weights in the mixture distribution in (4) were modeled as , where , and were chosen to encourage a small number of edge clusters for a parsimonious representation. Our experience in extensive numerical studies suggests that the performance of the approach is not sensitive to the choice of as long as it is large enough (); however, extremely large values of can result in numerical instability. Moreover, performance is fairly robust to the choice of the hyperparameters in the prior for the precision parameter of the slab component in (3), as long as the ratio .
The joint graphical lasso depends on two tuning parameters: a lasso penalty and a fused lasso penalty. We searched a 3030 grid over for both parameters to find the best combination of tuning parameters using a AIC criteria as recommended in Danaher et. al (2014). The graphical lasso was run independently for each network over a grid of regularization parameter values, and the optimal graph was selected for each network using a BIC criteria as described in Yuan and Lin (2007).
Performance metrics: We assessed the performance of the three algorithms in terms of the ability to estimate the individual networks, as measured by the area under the receiver operating characteristic (ROC) curve, the accuracy in estimating the strength of connections, as measured by the error in estimating the precision matrix, and the power to detect differential edges and control over false discoveries, as measured via sensitivity and 1specificity for differential edges. For all the metrics, we performed pairwise comparisons using Wilcoxon signed rank tests in order to assess whether one approach performed significantly better than the others. For edge detection, point estimates for the penalized networks were obtained by choosing the threshold for the absolute offdiagonal elements as 0.005, while for BJNL we computed thresholds controlling for false discoveries as described in Section 3.
4.2 Simulation Results
Figure 3 displays the ROC curves, Figure 4 displays box plots of the reported metrics for the ErdosRenyi case, and Table 1 reports results for the 100 node simulations. The box plots for the other networks, as well as the Table for the 40 node case are reported in the Supplementary Materials due to space constraints. The results across the three network types are relatively consistent. First, we note that the degree of dissimilarity between the networks does not appear to have a major effect on the relative performance of the algorithms, although we conjecture that the differences could be more pronounced for lower sample sizes. For all settings involving ErdosRenyi graphs, the proposed BJNL approach outperformed JGL and GL uniformly across all metrics under the Wilcoxon signed rank test. Notably, the proposed approach simultaneously achieved significantly a higher TPR and a significantly lower FDR for differential edges, indicating that it was both better able to detect significant differences and less likely to incorrectly classify an edge as differential. Similarly, for the smallworld and scalefree network simulations, BJNL had a significantly better performance than the frequentist methods in terms of AUC, Error, and TPR for differential edges. However, in terms of FDR we see that the JGL outperformed the BJNL, a result that seems to be due to the reluctance of the penalized approach to classify edges as differential, as indicated by its low TPR. For both the smallworld and scalefree network simulations, the GL had the highest FDR, which is likely due to its separate estimation of each network.
Figure 4 displays box plots of the reported metrics for the ErdosRenyi case, which provides a clear picture of the significant improvements under the proposed approach. Moreover, the boxplots for the Erdos Renyi case, together with the boxplots for the other two networks provided in the Supplementary Materials, suggest a greater power to detect true differential edges with an adequate control over false discoveries across all network types. Further, an increased improvement of the TPR over competing approaches and relative stability of the FDR for differential edges for versus indicates a clear advantage of the proposed joint estimation approach for increasing dimensions.
Moreover, the BJNL has a higher AUC which indicates an improved power to detect important connections across varying sparsity levels, and the corresponding low error reflects its accuracy in estimating connection strengths. On the other hand, the significantly higher error under the JGL potentially points to the perils of smoothing over edge strengths under penalized approaches. In particular, assigning similar magnitudes for precision matrix offdiagonals for shared edges may adversely affect the identification of differential edges, as well as the estimation of varying edge strengths for common edges across networks.
5 Stroop task analysis
5.1 Description of Analysis
We applied the proposed BJNL to the fMRI Stroop task study to investigate similarities and differences in the brain network under the two experimental conditions and passive fixation (REST). The first analysis was aimed at comparing the mental states of task performance (TASK) and passive fixation (REST), with the hypothesis that the brain networks exhibit major differences between these two grossly different conditions. The TASK data consisted of the subjectwise concatenation of the prewhitened fMRI time courses acquired during the EXR and RLX blocks, while the REST data consisted of the subjectwise concatenation of the prewhitened fMRI time courses acquired during the passive fixation blocks. The second analysis aimed to detect finer differences in connectivity between the mental states of EXR and RLX task performance. The study hypothesized that the mental states should be similar between the two task conditions with some fine differences in the network. In this case, the subjectwise prewhitened fMRI time courses were concatenated for the EXR blocks and also separately for the RLX blocks, corresponding to the two experimental conditions to be compared.
We performed a brain network analysis based on region of interest (ROI) level data, adopting the 90 node Automated Anatomical Labeling (AAL) cortical parcellation scheme described in TzourioMazoyer et al. (2002). For each ROI, we estimated the representative BOLD time series by performing a singular value decomposition on the time series of the voxels within the ROI and extracting the first principal time series. This resulted in 90 time courses of fMRI measurements, one for each ROI, which were then demeaned. We classified each ROI into one of nine functional modules as defined in Smith et al. (2009) using the technique described in Kemmer et al. (2015). We performed standard preprocessing including slicetiming correction, warping to standard Talairach space, blurring, demeaning, and prewhitening. Further details are provided in the Supplementary Materials. The proposed BJNL was run using the same tuning parameters as in the simulations.
Graph metrics: We analyzed the brain’s connectivity structure during the different mental states in terms of four graph metrics: global efficiency, local efficiency, clustering coefficient, and characteristic path length. Efficiency measures how effectively information is transmitted from nodetonode in a network. Global efficiency measures information transmission across the entire graph and is calculated by taking the average across all ROIs of the inverse shortest path lengths between ROIs. Thus, large values of global efficiency indicate that, on average, the number of steps required to transmit information from one node to another is small. Local efficiency measures information transmission between an ROI and its neighbors and is calculated for each ROI by taking the average of the inverse shortest path lengths between ROIs in the relevant neighborhood, where the relevant neighborhood is the collection of ROIs with a connection to the selected ROI. The clustering coefficient measures the interconnectedness of the graph and is calculated for each ROI by examining how many of its neighbors are also neighbors to each other. Finally, characteristic path length is the average across ROIs of the shortest path length in the networks, with smaller values indicating a more efficient network. All graph metrics were calculated using the Matlab Brain Connectivity Toolbox (Rubinov and Sporns, 2010). Differences in the graph metrics values across mental states were computed at each MCMC iteration, and the central tendency and dispersion of their distributions were statistically assessed by ttests and KolmogorovSmirnov tests.
5.2 Results
TASK vs REST Conditions: The analysis revealed a large contingent of edges with significantly different strengths in the two mental states. These results provide evidence supporting the study hypothesis that there are major differences in the brain networks due to the manifest phenomenological and procedural dissimilarity of task performance and rest. Ttests (, FDRcorrected) of the Fisher Ztransformed differences in partial correlation at each MCMC iteration revealed 763 significantly different edges, with 618 out of the 763 edges lying within our adopted functional module partition. Figure 6 displays circle plots of the sum of the strengths of the significant connections between modules, and Table 2 (A) displays the significant edge counts by functional module. Moreover, our examination of network metrics revealed significant differences in the mean and the posterior distributions for all network metrics between the two conditions (Figure 5).
EXR vs. RLX conditions of task performance: Compared with the relatively large network differences between TASK and REST, the network structures corresponding to the two task conditions, i.e. EXR and RLX, were much more similar to each other, as hypothesized by the investigator. Ttests (, FDRcorrected) of the Fisher’s Ztransformed partial correlation differences at each MCMC iteration revealed 247 significantly different edges between the EXR and RLX conditions, 226 of which lay within our functional module partition. Circle plots of the sum of the strengths of the significant connections between modules are provided in Figure 6 separately for positive and negative connections, and Table 2 (B) displays the significant edge counts by functional module. Also, none of the differences between the graph metrics for the two states were significant, providing evidence that the overall network features such as efficiency and path lengths between the two states are quite similar (Figure 5).
Interpretation of Findings: The above analyses provide an application of BJNL to both grossly and subtly different cognitive conditions. As expected, we identified drastic differences between Stroop task performance and passive fixation (REST). Descriptively, the TASK condition was associated with stronger positive connections involving frontoparietal circuits, DMN, sensorimotor, and visual cortices compared to the passive fixation condition. We also found significant differences in all graph metrics we calculated, which suggests highly different patterns of information transmission between task performance and passive fixation. These differences outline a picture of an overall state of more efficient brain connectivity during REST. For example the characteristic path length was significantly smaller during passive fixation than during task performance, indicating that information transmission between any two nodes requires fewer steps. This finding provides exciting new insight into the neurophysiological differences between the brain’s intrinsic network architecture under the REST condition and the taskrelated network in which the brain requires modification of some connections in order to perform the task.
The analysis of EXR vs. RLX revealed fewer dissimilarities, as would be expected due to the only difference between conditions being in the executive stance with which the subjects were instructed to perform the task (either with maximum effort, or in a relaxed fashion). Our results showed there were no global topological differences between the EXR and RLX conditions. However, BJNL did reveal some fine differences in the functional modules including the EC and FPL. These networks are involved in high level cognitive function. In general, the relaxed task performance condition featured significantly more negative connections between regions, which was especially evident in the executive control network.
As an illustrative alternative analysis, we used a permutation testing procedure to assess the differences in edge strengths. The labels for EXR and RLX were randomly permuted, and the graphical lasso was used to estimate two separate graphs. We then saved the edgewise difference in the graphs for each permutation. This procedure was repeated 10,000 times, and empirical values were calculated for the difference in edge strengths under the true labels. The pvalues were then FDRcorrected, as with the BJNL analysis, and none of the resulting pvalues were significant. The result suggests the proposed BNJL method has much better statistical power to detect differences in brain networks under different cognitive states compared to an approach involving independent estimation of networks.
6 Discussion
In this paper we introduced a novel Bayesian approach to joint estimation of multiple group level brain networks under different experimental conditions which pools information across networks to estimate shared patterns while also being able to detect differential edges. To our knowledge, ours is one of the first Bayesian approaches to address this important problem. Through a wide variety of simulation studies, we demonstrated clear and significant advantages of the proposed joint estimation approach over commonly used penalized approaches, with such improvements becoming more evident as the number of nodes increases. The method was applied to a Stroop task data, and the analysis reveal important dissimilarities between the task and rest conditions, but more subdued differences between the two task conditions. In contrast, an alternate analysis involving separate estimation of task networks followed by permutation testing did not reveal any significant differences, thus highlighting the advantages of joint estimation of multiple networks.
Although the proposed Bayesian approach typically results in more accurate estimation of multiple group level networks and provides measures of uncertainty, it is slower than the penalized methods and the total computation time increases with the number of experimental conditions as well as the number of nodes. However, the latter is true of any alternate graphical modeling approach. It is important to note that given a fixed number of nodes, the proposed method can be made considerably faster by adopting a parallel computation scheme which samples the precision matrices in parallel given the networks . Future work should investigate the scalability of BJNL to larger number of experimental conditions while taking into account the dynamic nature of the brain networks over time. In this paper, we demonstrated BJNL for estimating networks using fMRI data because they are the most prevalent type of functional images. However, our method can also be generalized to data from other imaging modalities such as positron emission tomography in a straightforward manner. Going beyond multiple experimental conditions, our approach can also be used to jointly model networks across multiple cohorts, such as healthy individuals, subjects with mild cognitive disorder, and those with Alzheimer’s disease.
7 Supplementary Materials
The Supplementary Materials contain the posterior computation steps in detail, the fMRI data preprocessing steps, the results of the 40 node simulation studies, and the boxplots for performance metrics under the small world and scalefree networks simulation scenarios.
References
 (1)
 Antoniak (1974) Antoniak, C. E. (1974), ‘Mixtures of dirichlet processes with applications to bayesian nonparametric problems’, The annals of statistics pp. 1152–1174.
 Barabási and Albert (1999) Barabási, A.L. and Albert, R. (1999), ‘Emergence of scaling in random networks’, science 286(5439), 509–512.
 Belilovsky et al. (2016) Belilovsky, E., Varoquaux, G. and Blaschko, M. B. (2016), Testing for differences in gaussian graphical models: applications to brain connectivity, in ‘Advances in Neural Information Processing Systems’, pp. 595–603.
 Benjamini and Hochberg (1995) Benjamini, Y. and Hochberg, Y. (1995), ‘Controlling the false discovery rate: a practical and powerful approach to multiple testing’, Journal of the royal statistical society. Series B (Methodological) pp. 289–300.
 Biswal et al. (1997) Biswal, B. B., Kylen, J. V. and Hyde, J. S. (1997), ‘Simultaneous assessment of flow and bold signals in restingstate functional connectivity maps’, NMR in Biomedicine 10(45), 165–170.
 Biswal et al. (1995) Biswal, B., Zerrin Yetkin, F., Haughton, V. M. and Hyde, J. S. (1995), ‘Functional connectivity in the motor cortex of resting human brain using echoplanar mri’, Magnetic resonance in medicine 34(4), 537–541.
 Buckner et al. (2013) Buckner, R. L., Krienen, F. M. and Yeo, B. T. (2013), ‘Opportunities and limitations of intrinsic functional connectivity mri’, Nature neuroscience 16(7), 832–837.
 Cole et al. (2014) Cole, M. W., Bassett, D. S., Power, J. D., Braver, T. S. and Petersen, S. E. (2014), ‘Intrinsic and taskevoked network architectures of the human brain’, Neuron 83(1), 238–251.
 Danaher et al. (2014) Danaher, P., Wang, P. and Witten, D. M. (2014), ‘The joint graphical lasso for inverse covariance estimation across multiple classes’, Journal of the Royal Statistical Society: Series B (Statistical Methodology) 76(2), 373–397.
 Eguiluz et al. (2005) Eguiluz, V. M., Chialvo, D. R., Cecchi, G. A., Baliki, M. and Apkarian, A. V. (2005), ‘Scalefree brain functional networks’, Physical review letters 94(1), 018102.
 Fair et al. (2007) Fair, D. A., Schlaggar, B. L., Cohen, A. L., Miezin, F. M., Dosenbach, N. U., Wenger, K. K., Fox, M. D., Snyder, A. Z., Raichle, M. E. and Petersen, S. E. (2007), ‘A method for using blocked and eventrelated fmri data to study âresting stateâ functional connectivity’, Neuroimage 35(1), 396–405.
 Fornito et al. (2013) Fornito, A., Zalesky, A. and Breakspear, M. (2013), ‘Graph analysis of the human connectome: promise, progress, and pitfalls’, Neuroimage 80, 426–444.
 Fox et al. (2007) Fox, M. D., Snyder, A. Z., Vincent, J. L. and Raichle, M. E. (2007), ‘Intrinsic fluctuations within cortical systems account for intertrial variability in human behavior’, Neuron 56(1), 171–184.
 Friedman et al. (2008) Friedman, J., Hastie, T. and Tibshirani, R. (2008), ‘Sparse inverse covariance estimation with the graphical lasso’, Biostatistics 9(3), 432–441.
 Genovese et al. (2002) Genovese, C. R., Lazar, N. A. and Nichols, T. (2002), ‘Thresholding of statistical maps in functional neuroimaging using the false discovery rate’, Neuroimage 15(4), 870–878.
 George and McCulloch (1993) George, E. I. and McCulloch, R. E. (1993), ‘Variable selection via gibbs sampling’, Journal of the American Statistical Association 88(423), 881–889.
 Ghosh and Dunson (2009) Ghosh, J. and Dunson, D. B. (2009), ‘Default prior distributions and efficient posterior computation in bayesian factor analysis’, Journal of Computational and Graphical Statistics 18(2), 306–320.
 Gianaros et al. (2005) Gianaros, P. J., Derbtshire, S. W., May, J. C., Siegle, G. J., Gamalo, M. A. and Jennings, J. R. (2005), ‘Anterior cingulate activity correlates with blood pressure during stress’, Psychophysiology 42(6), 627–635.
 Greicius et al. (2003) Greicius, M. D., Krasnow, B., Reiss, A. L. and Menon, V. (2003), ‘Functional connectivity in the resting brain: a network analysis of the default mode hypothesis’, Proceedings of the National Academy of Sciences 100(1), 253–258.
 Guo et al. (2011) Guo, J., Levina, E., Michailidis, G., Zhu, J. et al. (2011), ‘Joint estimation of multiple graphical models’, Biometrika 98(1), 1–15.
 Hermundstad et al. (2013) Hermundstad, A. M., Bassett, D. S., Brown, K. S., Aminoff, E. M., Clewett, D., Freeman, S., Frithsen, A., Johnson, A., Tipper, C. M., Miller, M. B. et al. (2013), ‘Structural foundations of restingstate and taskbased functional connectivity in the human brain’, Proceedings of the National Academy of Sciences 110(15), 6169–6174.
 Huang et al. (2010) Huang, S., Li, J., Sun, L., Ye, J., Fleisher, A., Wu, T., Chen, K., Reiman, E., Initiative, A. D. N. et al. (2010), ‘Learning brain connectivity of alzheimer’s disease by sparse inverse covariance estimation’, NeuroImage 50(3), 935–949.
 Kemmer et al. (2015) Kemmer, P. B., Guo, Y., Wang, Y. and Pagnoni, G. (2015), ‘Networkbased characterization of brain functional connectivity in zen practitioners’, Frontiers in psychology 6, 603.
 Kim et al. (2015) Kim, J., Pan, W., Initiative, A. D. N. et al. (2015), ‘Highly adaptive tests for group differences in brain functional connectivity’, NeuroImage: Clinical 9, 625–639.
 Kim et al. (2014) Kim, J., Wozniak, J. R., Mueller, B. A., Shen, X. and Pan, W. (2014), ‘Comparison of statistical tests for group differences in brain functional networks’, NeuroImage 101, 681–694.
 Lee et al. (2013) Lee, M. H., Smyser, C. D. and Shimony, J. S. (2013), ‘Restingstate fmri: a review of methods and clinical applications’, American Journal of Neuroradiology 34(10), 1866–1872.

Lin et al. (2017)
Lin, Z., Wang, T., Yang, C. and Zhao, H. (2017), ‘On joint estimation of gaussian graphical models for
spatial and temporal data’, Biometrics pp. n/a–n/a.
http://dx.doi.org/10.1111/biom.12650  Mennes et al. (2013) Mennes, M., Kelly, C., Colcombe, S., Castellanos, F. X. and Milham, M. P. (2013), ‘The extrinsic and intrinsic functional architectures of the human brain are not equivalent’, Cerebral cortex 23(1), 223–229.
 Meskaldji et al. (2011) Meskaldji, D. E., Ottet, M.C., Cammoun, L., Hagmann, P., Meuli, R., Eliez, S., Thiran, J. P. and Morgenthaler, S. (2011), ‘Adaptive strategy for the statistical analysis of connectomes’, PloS one 6(8), e23009.
 Müller et al. (1996) Müller, P., Erkanli, A. and West, M. (1996), ‘Bayesian curve fitting using multivariate normal mixtures’, Biometrika 83(1), 67–79.
 Mumford and Ramsey (2014) Mumford, J. A. and Ramsey, J. D. (2014), ‘Bayesian networks for fmri: a primer’, Neuroimage 86, 573–582.
 Newton et al. (2004) Newton, M. A., Noueiry, A., Sarkar, D. and Ahlquist, P. (2004), ‘Detecting differential gene expression with a semiparametric hierarchical mixture method’, Biostatistics 5(2), 155–176.
 O’brien and Dunson (2004) O’brien, S. M. and Dunson, D. B. (2004), ‘Bayesian multivariate logistic regression’, Biometrics 60(3), 739–746.
 Peterson et al. (2015) Peterson, C., Stingo, F. C. and Vannucci, M. (2015), ‘Bayesian inference of multiple gaussian graphical models’, Journal of the American Statistical Association 110(509), 159–174.
 Polson et al. (2013) Polson, N. G., Scott, J. G. and Windle, J. (2013), ‘Bayesian inference for logistic models using pólya–gamma latent variables’, Journal of the American statistical Association 108(504), 1339–1349.
 Qiu et al. (2016) Qiu, H., Han, F., Liu, H. and Caffo, B. (2016), ‘Joint estimation of multiple graphical models from high dimensional time series’, Journal of the Royal Statistical Society: Series B (Statistical Methodology) 78(2), 487–504.
 Rubinov and Sporns (2010) Rubinov, M. and Sporns, O. (2010), ‘Complex network measures of brain connectivity: uses and interpretations’, Neuroimage 52(3), 1059–1069.
 Salvador et al. (2005) Salvador, R., Suckling, J., Coleman, M. R., Pickard, J. D., Menon, D. and Bullmore, E. (2005), ‘Neurophysiological architecture of functional magnetic resonance images of human brain’, Cerebral cortex 15(9), 1332–1342.
 Sethuraman (1994) Sethuraman, J. (1994), ‘A constructive definition of dirichlet priors’, Statistica sinica pp. 639–650.
 Smith et al. (2013) Smith, S. M., Beckmann, C. F., Andersson, J., Auerbach, E. J., Bijsterbosch, J., Douaud, G., Duff, E., Feinberg, D. A., Griffanti, L., Harms, M. P. et al. (2013), ‘Restingstate fmri in the human connectome project’, Neuroimage 80, 144–168.
 Smith et al. (2009) Smith, S. M., Fox, P. T., Miller, K. L., Glahn, D. C., Fox, P. M., Mackay, C. E., Filippini, N., Watkins, K. E., Toro, R., Laird, A. R. et al. (2009), ‘Correspondence of the brain’s functional architecture during activation and rest’, Proceedings of the National Academy of Sciences 106(31), 13040–13045.
 Smith et al. (2011) Smith, S. M., Miller, K. L., SalimiKhorshidi, G., Webster, M., Beckmann, C. F., Nichols, T. E., Ramsey, J. D. and Woolrich, M. W. (2011), ‘Network modelling methods for fmri’, Neuroimage 54(2), 875–891.
 Stam and Reijneveld (2007) Stam, C. J. and Reijneveld, J. C. (2007), ‘Graph theoretical analysis of complex networks in the brain’, Nonlinear biomedical physics 1(1), 3.
 Stroop (1935) Stroop, J. R. (1935), ‘Studies of interference in serial verbal reactions.’, Journal of experimental psychology 18(6), 643.
 Supekar et al. (2008) Supekar, K., Menon, V., Rubin, D., Musen, M. and Greicius, M. D. (2008), ‘Network analysis of intrinsic functional brain connectivity in alzheimer’s disease’, PLoS Comput Biol 4(6), e1000100.
 TzourioMazoyer et al. (2002) TzourioMazoyer, N., Landeau, B., Papathanassiou, D., Crivello, F., Etard, O., Delcroix, N., Mazoyer, B. and Joliot, M. (2002), ‘Automated anatomical labeling of activations in spm using a macroscopic anatomical parcellation of the mni mri singlesubject brain’, Neuroimage 15(1), 273–289.
 van den Heuvel et al. (2008) van den Heuvel, M. P., Stam, C. J., Boersma, M. and Pol, H. H. (2008), ‘Smallworld and scalefree organization of voxelbased restingstate functional connectivity in the human brain’, Neuroimage 43(3), 528–539.
 Vincent et al. (2007) Vincent, J. L., Patel, G. H., Fox, M. D., Snyder, A. Z., Baker, J. T., Van Essen, D. C., Zempel, J. M., Snyder, L. H., Corbetta, M. and Raichle, M. E. (2007), ‘Intrinsic functional architecture in the anaesthetized monkey brain’, Nature 447(7140), 83–86.
 Walker (2007) Walker, S. G. (2007), ‘Sampling the dirichlet mixture model with slices’, Communications in StatisticsâSimulation and Computation® 36(1), 45–54.
 Wang et al. (2012) Wang, H. et al. (2012), ‘Bayesian graphical lasso models and efficient posterior computation’, Bayesian Analysis 7(4), 867–886.
 Wang et al. (2016) Wang, Y., Kang, J., Kemmer, P. B. and Guo, Y. (2016), ‘An efficient and reliable statistical method for estimating functional connectivity in large scale brain networks using partial correlation’, Frontiers in neuroscience 10.
 Watts and Strogatz (1998) Watts, D. J. and Strogatz, S. H. (1998), ‘Collective dynamics of âsmallworldânetworks’, nature 393(6684), 440–442.
 Yajima et al. (2012) Yajima, M., Telesca, D., Ji, Y. and Muller, P. (2012), ‘Differential patterns of interaction and gaussian graphical models’.
 Zalesky et al. (2010) Zalesky, A., Fornito, A. and Bullmore, E. T. (2010), ‘Networkbased statistic: identifying differences in brain networks’, Neuroimage 53(4), 1197–1207.
 Zhu et al. (2014) Zhu, Y., Shen, X. and Pan, W. (2014), ‘Structural pursuit over multiple undirected graphs’, Journal of the American Statistical Association 109(508), 1683–1696.