Multi-View Multi-Graph Embedding for Brain Network Clustering Analysis
Abstract
Network analysis of human brain connectivity is critically important for understanding brain function and disease states. Embedding a brain network as a whole graph instance into a meaningful low-dimensional representation can be used to investigate disease mechanisms and inform therapeutic interventions. Moreover, by exploiting information from multiple neuroimaging modalities or views, we are able to obtain an embedding that is more useful than the embedding learned from an individual view. Therefore, multiview multigraph embedding becomes a crucial task. Currently, only a few studies have been devoted to this topic, and most of them focus on vector-based strategies, which lose the structural information contained in the original graphs. As a novel attempt to tackle this problem, we propose Multiview Multigraph Embedding (M2E), which stacks multigraphs into multiple partially-symmetric tensors and uses tensor techniques to simultaneously leverage the dependencies and correlations among multiview and multigraph brain networks. Extensive experiments on real HIV and bipolar disorder brain network datasets demonstrate the superior performance of M2E on clustering brain networks by leveraging the multiview multigraph interactions.
Index terms— Brain Network Embedding, Multigraph Embedding, Tensor Factorization, Multiview Learning
1 Introduction
Benefiting from modern neuroimaging technology, there is an increasing amount of graph data representing the human brain, called brain networks, derived from modalities such as functional magnetic resonance imaging (fMRI) and diffusion tensor imaging (DTI). These data have complex structure and are inherently represented as graphs with a set of nodes and links. Moreover, the linkage structures extracted from different modalities can often be treated as multiview data. The connections in fMRI brain networks encode correlations among brain regions in terms of functional activity, while the connections in DTI networks capture the white matter fiber pathways that connect different brain regions. Even if these individual views might be sufficient on their own for a given learning task, they often provide complementary information to each other, which can lead to improved performance on the learning task [Ma et al. 2017a; Sun et al. 2017]. As labeled data are difficult to obtain, it is critical to leverage the multiview information to obtain an effective embedding for the clustering task. Therefore, in this study, we focus on the multiview multigraph embedding problem for brain network clustering analysis. Specifically, we aim to learn the latent embedding representation of multiple brain networks from multiple views. In recent years, there has been increasing interest in single-graph node embedding [Mousazadeh and Cohen 2015; Ou et al. 2016], yet it is challenging to extend these methods to multigraph embedding, where we consider the embedding of multiple graph instances together to obtain a discriminative representation for each graph. By exploring the consistency and complementary properties of different views, multiview learning [Liu et al. 2013a; Cao et al. 2014] can be more effective and generalize better than single-view learning.
Although there have been numerous works on single-graph embedding and multiview learning, to the best of our knowledge, there is no embedding method that preserves multigraph structures across multiple views.
There are several major challenges in the multiview multigraph embedding problem. First, the complex graph structure makes it difficult for conventional methods to capture subtle local topological information [Jie et al. 2014]. For subgraph-based methods [Cao et al. 2015], the number of subgraphs is exponential in the size of the graphs, so the subgraph enumeration process is both time- and memory-consuming. Besides, simply preserving pairwise distances, as many spectral methods do, is insufficient for capturing the structure of multiple graphs. Moreover, preserving both local distances and graph topology is crucial for producing effective low-dimensional representations of brain network data. Furthermore, traditional normalization strategies cannot generate meaningful clustering results.
To address the aforementioned issues, in this paper we propose a novel one-step Multiview Multigraph Embedding (M2E) approach for brain network analysis. The goal of M2E is to find low-dimensional representations from multiview multigraph data that reveal patterns and structures among the brain networks. The conceptual view of M2E is shown in Figure 1. Our main contributions are summarized as follows:

For each view, we stack all the brain networks into a tensor and use tensor and matrix techniques to simultaneously leverage the dependencies and correlations among multiview and multigraph data in a unified framework. This provides a new perspective on the analysis of brain network structures.

In order to reflect the latent clustering structure shared by different views and different graphs, we require the coefficient matrices learned from different views to be softly regularized towards a consensus.

We present an effective optimization strategy to solve the M2E problem that takes into account the symmetric structure of brain networks.

Through extensive experiments on HIV and bipolar disorder brain network datasets that contain fMRI and DTI views, we demonstrate that M2E can significantly boost embedding performance. Furthermore, we visualize the derived factors, which could be informative for investigating disease mechanisms.
2 Related Work
Brain network embedding makes it possible to characterize brain disorders at the whole-brain connectivity level, thus providing a new direction for brain disease clustering. The goal of graph embedding is to find low-dimensional representations of nodes that preserve the important structure and properties of graphs [Ma et al. 2016]. In particular, Mousazadeh and Cohen [2015] proposed a graph embedding algorithm based on a Laplacian-type operator on a manifold, which can be applied to recover the geometry of data and to extend a function to new data points. Recently, Ou et al. [2016] established a general formulation of high-order proximity measurements and applied it with generalized SVD for graph embedding. In brain network neuroscience, most existing works aim to learn the structure from a specific kind of brain network [Kong and Yu 2014; Kuo et al. 2015]. In contrast to node embedding on a single graph, we aim to learn an effective graph embedding on multiview multigraph brain networks, such as fMRI brain networks together with DTI brain networks.
Our approach is also closely related to the literature on multiview clustering and multiview embedding. Kumar and Daumé [2011] and Kumar, Rai, and Daumé [2011] were the first to solve the multiview clustering problem via spectral projection. Nie, Li, and Li [2016] extended multiview spectral clustering to a parameter-free auto-weighted method. Matrix factorization based methods [Liu et al. 2013a] form another category, which mainly uses nonnegative matrix factorization (NMF) to integrate multiview data. Additionally, Shao, He, and Yu [2015] proposed a method based on CP factorization and norm regularization for multiview incomplete clustering. Ma et al. [2017b] coupled spectral clustering with norm regularization to discriminate the hubs and to reduce their potential influence on graph clustering. However, no available method takes multiple graphs as input and considers multigraph structures; thus existing multiview learning methods cannot solve the brain network embedding problem well.
3 Preliminaries
In this section, we introduce some related concepts and notation about tensors. Table 1 lists basic symbols that will be used throughout the paper.
Tensors are higher-order arrays that generalize the notion of vectors and matrices. The order of a tensor is the number of dimensions (a.k.a. modes or ways). An $N$th-order tensor is represented as $\mathcal{X} \in \mathbb{R}^{I_1 \times \cdots \times I_N}$, where $I_n$ is the cardinality of its $n$th mode, $n \in [1:N]$. All vectors are column vectors unless otherwise specified. For an arbitrary matrix $\mathbf{X} \in \mathbb{R}^{I \times J}$, its $i$th row and $j$th column vector are denoted by $\mathbf{x}^{i}$ and $\mathbf{x}_{j}$, respectively.
Definitions of partially symmetric tensor, mode-$n$ matricization and CP factorization are given below, which will be used to present our model.
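Definition 1 (Partial Symmetric Tensor).
An $N$th-order tensor $\mathcal{X} \in \mathbb{R}^{I_1 \times \cdots \times I_N}$ is a rank-one partial symmetric tensor if it is partial symmetric on modes $i_1, \ldots, i_j \in [1:N]$ and can be written as the tensor product of $N$ vectors, i.e.,
$$\mathcal{X} = \mathbf{x}^{(1)} \circ \cdots \circ \mathbf{x}^{(N)}, \qquad (1)$$
where $\mathbf{x}^{(i_1)} = \cdots = \mathbf{x}^{(i_j)}$.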
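Definition 2 (Mode-$n$ Matricization).
The mode-$n$ matricization of a tensor $\mathcal{X} \in \mathbb{R}^{I_1 \times \cdots \times I_N}$ is denoted by $\mathbf{X}_{(n)} \in \mathbb{R}^{I_n \times J}$, where $J = \prod_{k=1, k \neq n}^{N} I_k$. Each tensor element with indices $(i_1, \ldots, i_N)$ maps to a matrix element $(i_n, j)$ such that
$$j = 1 + \sum_{k=1, k \neq n}^{N} (i_k - 1) J_k, \quad \text{with } J_k = \prod_{m=1, m \neq n}^{k-1} I_m. \qquad (2)$$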
Table 1: List of basic symbols.
Symbol  Definition and Description
$x$  each lowercase letter represents a scalar
$\mathbf{x}$  each boldface lowercase letter represents a vector
$\mathbf{X}$  each boldface uppercase letter represents a matrix
$\mathcal{X}$  each calligraphic letter represents a tensor, set or space
$[1:N]$  a set of integers in the range of $1$ to $N$ inclusively
$\circ$  denotes the outer product
$\odot$  denotes the Khatri-Rao product
$[\![\cdot]\!]$  denotes the CP factorization
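As an illustration of the mode matricization in Definition 2, the following NumPy sketch unfolds a small third-order tensor. The function name `unfold`, the 0-based mode index, and the column ordering (the earliest remaining mode varies fastest, a common convention) are assumptions made for this example rather than details taken from the paper.

```python
import numpy as np

def unfold(tensor, mode):
    """Mode-`mode` matricization with a 0-based mode index.

    Rows are indexed by the chosen mode. The remaining modes are flattened
    into columns in Fortran order, so the earliest remaining mode varies fastest.
    """
    moved = np.moveaxis(tensor, mode, 0)
    return moved.reshape(tensor.shape[mode], -1, order="F")

# A 2 x 3 x 4 tensor whose entries encode their own position.
X = np.arange(24).reshape(2, 3, 4)

X0 = unfold(X, 0)  # shape (2, 12): columns indexed by (mode 1, mode 2)
X2 = unfold(X, 2)  # shape (4, 6):  columns indexed by (mode 0, mode 1)

# Element (i, j, k) of X lands in row i, column j + 3 * k of X0,
# and in row k, column i + 2 * j of X2.
i, j, k = 1, 2, 3
same_entry = X0[i, j + 3 * k] == X[i, j, k] == X2[k, i + 2 * j]
```

With 1-based indices, the column positions used above correspond to the index mapping of Definition 2 under this ordering convention.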
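Definition 3 (CP Factorization).
For a general tensor $\mathcal{X} \in \mathbb{R}^{I_1 \times \cdots \times I_N}$, its CANDECOMP/PARAFAC (CP) factorization is
$$\mathcal{X} \approx \sum_{r=1}^{R} \mathbf{x}_r^{(1)} \circ \cdots \circ \mathbf{x}_r^{(N)} \equiv [\![\mathbf{X}^{(1)}, \ldots, \mathbf{X}^{(N)}]\!], \qquad (3)$$
where for $n \in [1:N]$, $\mathbf{X}^{(n)} = [\mathbf{x}_1^{(n)}, \ldots, \mathbf{x}_R^{(n)}]$ are latent factor matrices of size $I_n \times R$, $R$ is the number of factors, and $[\![\cdot]\!]$ is used for shorthand.
To obtain the CP factorization $[\![\mathbf{X}^{(1)}, \ldots, \mathbf{X}^{(N)}]\!]$, the objective is to minimize the following estimation error:
$$\mathcal{L} = \min_{\mathbf{X}^{(1)}, \ldots, \mathbf{X}^{(N)}} \left\| \mathcal{X} - [\![\mathbf{X}^{(1)}, \ldots, \mathbf{X}^{(N)}]\!] \right\|_F^2. \qquad (4)$$
However, $\mathcal{L}$ is not jointly convex w.r.t. $\mathbf{X}^{(1)}, \ldots, \mathbf{X}^{(N)}$. A widely used optimization technique is the Alternating Least Squares (ALS) algorithm, which alternately minimizes $\mathcal{L}$ for each variable while fixing the others, that is,
$$\mathbf{X}^{(n)} \leftarrow \arg\min_{\mathbf{X}^{(n)}} \left\| \mathbf{X}_{(n)} - \mathbf{X}^{(n)} \left( \odot_{i \neq n}^{N} \mathbf{X}^{(i)} \right)^{T} \right\|_F^2, \qquad (5)$$
where $\odot_{i \neq n}^{N} \mathbf{X}^{(i)} = \mathbf{X}^{(N)} \odot \cdots \odot \mathbf{X}^{(n+1)} \odot \mathbf{X}^{(n-1)} \odot \cdots \odot \mathbf{X}^{(1)}$.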
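The ALS scheme described above can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the implementation used in the paper: the helper names (`unfold`, `khatri_rao`, `cp_reconstruct`, `cp_als`), the random initialization, the fixed iteration count and the use of a pseudo-inverse are all choices made for the example.

```python
import numpy as np

def unfold(tensor, mode):
    """Mode-`mode` matricization (0-based), earliest remaining mode varies fastest."""
    return np.moveaxis(tensor, mode, 0).reshape(tensor.shape[mode], -1, order="F")

def khatri_rao(A, B):
    """Column-wise Kronecker product of A (I x R) and B (J x R), shape (I*J, R)."""
    R = A.shape[1]
    return (A[:, None, :] * B[None, :, :]).reshape(-1, R)

def chain_khatri_rao(mats):
    """Khatri-Rao product of a list of matrices, taken from the last to the first."""
    kr = mats[-1]
    for M in reversed(mats[:-1]):
        kr = khatri_rao(kr, M)
    return kr

def cp_reconstruct(factors):
    """Rebuild the full tensor from its CP factor matrices."""
    shape = tuple(F.shape[0] for F in factors)
    unfolded = factors[0] @ chain_khatri_rao(factors[1:]).T  # mode-0 unfolding
    return unfolded.reshape(shape, order="F")

def cp_als(X, rank, n_iter=200, seed=0):
    """Alternating least squares: update one factor matrix at a time."""
    rng = np.random.default_rng(seed)
    factors = [rng.standard_normal((dim, rank)) for dim in X.shape]
    for _ in range(n_iter):
        for n in range(X.ndim):
            others = [factors[i] for i in range(X.ndim) if i != n]
            kr = chain_khatri_rao(others)
            # Gram matrix kr.T @ kr, computed as a Hadamard product of small Grams.
            gram = np.ones((rank, rank))
            for F in others:
                gram *= F.T @ F
            # Least-squares solution of  unfold(X, n) ~ factors[n] @ kr.T
            factors[n] = unfold(X, n) @ kr @ np.linalg.pinv(gram)
    return factors

# Synthetic check on a tensor that has an exact rank-2 factorization.
rng = np.random.default_rng(1)
true_factors = [rng.standard_normal((dim, 2)) for dim in (4, 5, 6)]
X = cp_reconstruct(true_factors)
estimated = cp_als(X, rank=2)
rel_err = np.linalg.norm(X - cp_reconstruct(estimated)) / np.linalg.norm(X)
print(f"relative reconstruction error: {rel_err:.2e}")
```

Each inner update is an ordinary least-squares problem, so the estimation error cannot increase from one update to the next. Convergence to a global minimum is not guaranteed, because the joint problem is not convex.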
4 Methodology
In this section, we first define the problem of interest. Then we formulate the proposed Multiview Multigraph Embedding (M2E) method. Finally, we introduce an effective optimization approach to solve the proposed formulation.
4.1 Problem Definition
We study the problem of multiview multigraph embedding for brain network clustering analysis. Suppose that the problem includes subjects with views, where each view has a set of symmetric brain networks corresponding to subjects. Specifically, each brain network is represented as a weighted undirected graph, i.e., a symmetric affinity matrix where denotes the number of nodes and each element reflects connectivity between nodes. There exists a one-to-one mapping between nodes in different graphs, which means that all the graphs have a common node set . Thus, for the th view, we have graphs associated with affinity matrices, denoted as . We use to represent the multiview multigraph instances.
The goal of this work is to learn a common embedding across all brain networks, denoted as , where is the embedding dimension and each row of corresponds to an embedding of a brain network as a whole. More specifically, we aim at finding by simultaneously leveraging the dependencies and correlations among multiple views and multiple graphs in , and taking into account the symmetric property of brain networks. In particular, we investigate the use of learned embedding for clustering brain networks. Let the number of clusters be . So, we cluster brain networks into groups.
4.2 M2E Approach
Solving the challenging multiview multigraph embedding problem requires the use of “complex” structured models, i.e., those incorporating relationships between multiple views and multiple graphs. The multimode structure of a tensor provides a natural way to encode the underlying multiple correlations in data [He et al. 2017]. Inspired by the success of tensor analysis on many structured learning problems, here we explore the use of tensor techniques to consider all possible dependence relationships among different views and different graphs.
Given a multiview multigraph dataset , in order to capture the multigraph structures directly, we concatenate the affinity matrices of different subjects for each view to form a third-order tensor comprising three modes: nodes, nodes, and subjects, denoted as . Notice that since each brain network is symmetric, the resulting tensor is a partial symmetric tensor.
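The stacking step for a single view can be sketched as follows. The function names `stack_view` and `is_partially_symmetric`, as well as the toy random matrices, are hypothetical and serve only to illustrate the nodes x nodes x subjects layout described above.

```python
import numpy as np

def stack_view(affinity_matrices):
    """Stack symmetric (nodes x nodes) matrices into a (nodes, nodes, subjects) tensor."""
    mats = [np.asarray(A, dtype=float) for A in affinity_matrices]
    for A in mats:
        if A.ndim != 2 or A.shape[0] != A.shape[1]:
            raise ValueError("each affinity matrix must be square")
        if not np.allclose(A, A.T):
            raise ValueError("each affinity matrix must be symmetric")
    return np.stack(mats, axis=-1)

def is_partially_symmetric(tensor):
    """True if swapping the two node modes leaves the tensor unchanged."""
    return np.allclose(tensor, np.transpose(tensor, (1, 0, 2)))

# Toy data: 5 subjects, 4 nodes, random symmetric "connectivity" matrices.
rng = np.random.default_rng(0)
graphs = []
for _ in range(5):
    W = rng.random((4, 4))
    graphs.append((W + W.T) / 2)

view_tensor = stack_view(graphs)          # shape (4, 4, 5)
symmetric_on_node_modes = is_partially_symmetric(view_tensor)
```

Repeating this for every view yields one third-order tensor per view, all sharing the same node set and subject ordering.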
A tensor provides a natural and efficient representation for multigraph data, but there is no guarantee that such a representation will be good for subsequent learning, since learning will only be successful if the regularities that underlie the data can be discerned by the model [He et al. 2014]. Previous work found that CP factorization is particularly effective at capturing the connections and finding valuable features in tensor data [Van Loan 2016]. Motivated by these observations, in the following we investigate how to exploit the benefits of CP factorization to find an effective embedding for multiview partial symmetric tensors .
A simple method is to learn a view-independent multiview representation from the tensors , and then feed it into a conventional multiview embedding method. This can be formulated as follows:
(6) 
where and are the latent factor matrices obtained by CP factorization. A graphical representation of this process in one view is given in Figure 2. can be viewed as common features of nodes among all graphs involved in the th view, while are treated as embedded features of each graph in the th view.
Based on the above obtained multiview features , we can directly establish the following multiview model to learn a common embedding :
(7) 
where are the weight parameters reflecting the importance of different views.
However, this two-step method, referred to as M2E-TS, is not guaranteed to produce an optimal clustering result, because multiple views and multiple graphs are explored separately. For clustering, we assume that a data point in different views would be assigned to the same cluster with high probability. Therefore, in terms of tensor factorization, we require the coefficient matrices learned from different views to be softly regularized towards a common consensus. This consensus matrix is considered to reflect the latent clustering structure shared by different views and different graphs. Based on this idea, we combine Eq. (6) and Eq. (7) and obtain the following optimization problem for the M2E method:
(8)  
Notice that the first term is used to explore the dependencies among multiple graphs, and the second term is used to explore the consensus correlations among multiple views. not only tune the relative weight among different views, but also between the first term and the second term. is the final embedding solution used for multiview multigraph brain network clustering. To induce groupings on , we simply use k-means [Hartigan and Wong 1979].
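The two-term structure of the objective described above can be sketched in NumPy as follows. The names `B` (node factors of a view), `F` (per-view graph factors), `F_star` (consensus embedding) and `lambdas` (view weights) are illustrative labels, and the use of squared Frobenius norms for both terms is an assumption of this sketch.

```python
import numpy as np

def reconstruct(B, F):
    """Partially symmetric CP model: X_hat[i, j, k] = sum_r B[i, r] * B[j, r] * F[k, r]."""
    return np.einsum("ir,jr,kr->ijk", B, B, F)

def m2e_objective(tensors, Bs, Fs, F_star, lambdas):
    """Sum over views of reconstruction error plus weighted distance to the consensus."""
    total = 0.0
    for X, B, F, lam in zip(tensors, Bs, Fs, lambdas):
        total += np.sum((X - reconstruct(B, F)) ** 2)   # dependencies among graphs
        total += lam * np.sum((F - F_star) ** 2)        # consensus among views
    return total

# Toy setting: 2 views, 6 nodes, 8 subjects, 3 factors.
rng = np.random.default_rng(0)
n_nodes, n_subjects, rank = 6, 8, 3
F_star = rng.standard_normal((n_subjects, rank))
Bs = [rng.standard_normal((n_nodes, rank)) for _ in range(2)]
Fs = [F_star.copy(), F_star.copy()]
tensors = [reconstruct(B, F) for B, F in zip(Bs, Fs)]
lambdas = [0.5, 2.0]

value_at_truth = m2e_objective(tensors, Bs, Fs, F_star, lambdas)
value_shifted = m2e_objective(tensors, Bs, Fs, F_star + 1.0, lambdas)
```

In this toy setting the objective vanishes when every view reproduces its tensor exactly and all per-view factors agree with the consensus. Moving the consensus away from the per-view factors is penalized in proportion to the view weights.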
In order to verify the effectiveness of the soft regularization in Eq. (8), we propose M2E-DS as a comparison method, which learns the latent embedding representation by using directly shared coefficient matrices for all views [Liu et al. 2013b]; its objective function is
(9) 
In this formulation, different views are treated equally. However, in reality, different views may have different effects. The details are discussed in Section 5.
4.3 Optimization Framework
The model parameters in Eq. (8) that have to be estimated include , and . Since the optimization problem is not convex with respect to and together, there is no closed-form solution. We introduce an effective iterative method to solve this problem. The main idea is to decouple the parameters using an Alternating Direction Method of Multipliers (ADMM) approach [Boyd et al. 2011]. Specifically, the following three steps are repeated until convergence.
Fixing and , compute
Note that is a partially symmetric tensor and the objective function in Eq. (8) involves a fourth-order term, which is difficult to optimize directly. To circumvent this problem, we use a variable substitution technique and minimize the following objective function
(10) 
where are auxiliary variables.
The augmented Lagrangian function for the problem in Eq. (10) is
(11) 
where are Lagrange multipliers, and is the penalty parameter, which can be adjusted efficiently according to [Lin, Liu, and Su 2011].
To compute , the optimization problem in Eq. (11) can be formulated as
(12) 
where is the mode matricization of , and .
Problem (13) is a univariate optimization problem and can be solved easily. An effective approach is the proximal gradient method [Parikh, Boyd, et al. 2014], which updates by
(14) 
where is the Lipschitz coefficient of Eq. (13), which equals the maximum eigenvalue of .
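The update above is a gradient step whose step size is the reciprocal of a Lipschitz constant. As a generic illustration, and not a transcription of Eq. (14), the sketch below applies such a step to the least-squares problem of minimizing ||X - A H^T||_F^2 over A, where `X`, `A` and `H` are placeholder names. For this objective the gradient is 2(A H^T - X)H and its Lipschitz constant is twice the largest eigenvalue of H^T H; the factor of two depends on how the objective is scaled. With no non-smooth term, the proximal step reduces to a plain gradient step.

```python
import numpy as np

def lipschitz_constant(H):
    """Lipschitz constant of the gradient of ||X - A @ H.T||_F^2 with respect to A."""
    # eigvalsh returns eigenvalues in ascending order, so the last one is the largest.
    return 2.0 * np.linalg.eigvalsh(H.T @ H)[-1]

def gradient_step(X, A, H):
    """One gradient step on A with step size 1 / L."""
    grad = 2.0 * (A @ H.T - X) @ H
    return A - grad / lipschitz_constant(H)

def loss(X, A, H):
    return np.sum((X - A @ H.T) ** 2)

rng = np.random.default_rng(0)
X = rng.standard_normal((5, 12))
H = rng.standard_normal((12, 3))
A = rng.standard_normal((5, 3))

before = loss(X, A, H)
A_next = gradient_step(X, A, H)
after = loss(X, A_next, H)   # never larger than `before` for this step size
```

A step size of 1/L guarantees that the loss does not increase, which is why the Lipschitz constant, rather than a hand-tuned learning rate, appears in the update.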
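To efficiently compute , we consider the following property of the Khatri-Rao product of two matrices $\mathbf{A}$ and $\mathbf{B}$ with the same number of columns:
$$(\mathbf{A} \odot \mathbf{B})^{T} (\mathbf{A} \odot \mathbf{B}) = \mathbf{A}^{T}\mathbf{A} * \mathbf{B}^{T}\mathbf{B}, \qquad (15)$$
where $*$ denotes the Hadamard (element-wise) product of two matrices.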
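The property referred to here is taken to be the standard identity (A ⊙ B)^T (A ⊙ B) = (A^T A) * (B^T B), which avoids forming the tall Khatri-Rao product explicitly. The following sketch checks it numerically; the matrices `A` and `B` and the helper `khatri_rao` are placeholders chosen for the example.

```python
import numpy as np

def khatri_rao(A, B):
    """Column-wise Kronecker product of A (I x R) and B (J x R), shape (I*J, R)."""
    R = A.shape[1]
    return (A[:, None, :] * B[None, :, :]).reshape(-1, R)

rng = np.random.default_rng(0)
A = rng.standard_normal((7, 3))
B = rng.standard_normal((5, 3))

# Left-hand side: build the 35 x 3 Khatri-Rao product, then its Gram matrix.
lhs = khatri_rao(A, B).T @ khatri_rao(A, B)

# Right-hand side: Hadamard product of two small 3 x 3 Gram matrices.
rhs = (A.T @ A) * (B.T @ B)

identity_holds = np.allclose(lhs, rhs)
```

The right-hand side only needs the two small Gram matrices, so its cost grows with the sum of the row counts of `A` and `B` rather than with their product.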
Then the auxiliary matrix can be optimized successively in a similar way
(16) 
where and . And and is the mode-2 matricization of .
Moreover, we update the Lagrange multipliers using the gradient descent method by
(17) 
Fixing and , compute
By fixing and , we minimize the following objective function
(18) 
where is the mode-3 matricization of tensor and .
Such an optimization problem can be solved in a similar way as Eq. (12), from which we get the update rule of as follows
(19) 
where , , and is the maximum eigenvalue of 2.
Fixing and , minimize over
When and are fixed, the problem in Eq. (8) reduces to a convex optimization problem with respect to . By taking the derivative of the objective function in Eq. (8) with respect to and setting it to zero, we get
(20) 
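If, as the description of Eq. (8) suggests, the consensus matrix appears only in weighted squared-distance terms to the per-view coefficient matrices, setting the derivative to zero yields a weighted average of those matrices. The sketch below makes that assumption explicit. `Fs`, `lambdas` and `consensus_update` are illustrative names, and the closed form is an assumption of this sketch rather than a transcription of Eq. (20).

```python
import numpy as np

def consensus_update(Fs, lambdas):
    """Minimizer over F of  sum_v lambdas[v] * ||Fs[v] - F||_F^2  (positive weights)."""
    lambdas = np.asarray(lambdas, dtype=float)
    if np.any(lambdas <= 0):
        raise ValueError("view weights must be positive")
    stacked = np.stack(Fs, axis=0)                  # (views, subjects, rank)
    return np.tensordot(lambdas, stacked, axes=1) / lambdas.sum()

def consensus_loss(Fs, lambdas, F):
    return sum(lam * np.sum((Fv - F) ** 2) for Fv, lam in zip(Fs, lambdas))

rng = np.random.default_rng(0)
Fs = [rng.standard_normal((8, 3)) for _ in range(2)]   # two views, 8 subjects, rank 3
lambdas = [0.5, 2.0]

F_star = consensus_update(Fs, lambdas)

# The gradient  sum_v 2 * lambda_v * (F - F_v)  vanishes at the weighted average.
gradient = sum(2.0 * lam * (F_star - Fv) for Fv, lam in zip(Fs, lambdas))
```

Under this reading, views with larger weights pull the consensus embedding closer to their own coefficient matrices.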
Based on the above analysis, we outline the optimization framework for multiview multigraph brain network embedding in Algorithm 1.
Computational Analysis
Algorithm 1 iteratively solves Eq. (8). In each iteration, solving , and all requires , solving requires , and solving requires . Overall, the time complexity involves , which is linear in the number of nodes , so the proposed method is applicable to larger-scale brain networks.
5 Experiments and Evaluation
In order to empirically evaluate the performance of the proposed M2E approach for multiview multigraph brain network clustering analysis, we test our model on two real datasets, HIV and bipolar disorder (BP), each with fMRI and DTI brain networks, and compare it with several state-of-the-art multiview clustering methods.
Table 2: Clustering results of each method (columns) on the HIV and BP datasets. The number in parentheses is the rank of the method for that measure.
Dataset  Measure    SEC       convexSub  AMGL      multiNMF  CoRegSc   MIC       SCMV-3DT  M2E-TS    M2E-DS    M2E
HIV      Accuracy   50.00(8)  52.86(7)   52.86(7)  57.23(4)  57.14(5)  55.72(6)  64.29(3)  52.86(7)  68.57(2)  71.43(1)
HIV      F1         49.74(8)  53.52(7)   53.52(7)  56.18(6)  60.53(4)  58.28(5)  66.67(3)  49.23(9)  68.57(2)  72.22(1)
HIV      Precision  50.00(9)  52.77(7)   52.77(7)  57.37(5)  56.10(5)  61.12(4)  62.50(3)  53.25(6)  68.57(2)  69.73(1)
HIV      Recall     49.86(8)  54.29(7)   54.29(7)  55.98(5)  65.71(4)  56.28(6)  71.43(2)  45.57(9)  68.57(3)  75.00(1)
BP       Accuracy   54.64(7)  52.57(9)   52.77(8)  58.52(4)  56.70(6)  61.86(2)  54.64(7)  57.73(5)  60.82(3)  68.04(1)
BP       F1         56.86(7)  28.46(8)   28.13(9)  66.99(3)  60.38(6)  72.59(1)  59.26(6)  64.34(4)  61.22(5)  68.69(2)
BP       Precision  58.00(9)  74.78(1)   74.83(2)  58.34(8)  59.25(5)  59.03(6)  57.14(9)  58.73(7)  65.22(4)  72.34(3)
BP       Recall     55.76(7)  17.84(8)   17.71(9)  80.40(2)  61.53(5)  94.23(1)  61.53(5)  71.15(3)  57.69(6)  65.38(4)
5.1 Data Collection and Preprocessing

Human Immunodeficiency Virus Infection (HIV): Because the original dataset is unbalanced, we randomly sampled 35 patients and 35 controls from the dataset for performance evaluation. A detailed description of the data acquisition is available in [Cao et al. 2015]. For fMRI data, we used the DPARSF and SPM toolboxes for preprocessing. We construct each graph with nodes where links are created based on the correlations between different brain regions. For DTI data, we used the FSL toolbox for preprocessing and parcellated each DTI image into regions using AAL [Tzourio-Mazoyer et al. 2002].
Bipolar Disorder (BP): This dataset consists of 52 bipolar subjects who are currently in euthymia and 45 healthy controls. For fMRI data, we used the CONN toolbox to construct the fMRI brain networks of the BP dataset [Whitfield-Gabrieli and Nieto-Castanon 2012]. Using the FreeSurfer-generated labels of cortical/subcortical gray matter regions, functional brain networks were derived using pairwise BOLD signal correlations. For DTI, as for fMRI, we parcellated each DTI image into regions.
5.2 Baselines and Metrics
We compare the proposed M2E with nine other methods (seven existing baselines and two variants of M2E) for multiview clustering on brain networks. We adopt accuracy, F1 score, precision and recall as our evaluation metrics.
SEC is a single-view spectral embedding clustering framework [Nie et al. 2011]. convexSub is a convex subspace representation learning method [Guo 2013]. CoRegSc is a co-regularization based multiview spectral clustering framework [Kumar, Rai, and Daumé 2011]. MultiNMF is an NMF-based multiview clustering method that searches for a factorization giving compatible clustering solutions across multiple views [Liu et al. 2013a]. MIC first uses the kernel matrices to form an initial tensor across all the multiple sources [Shao, He, and Yu 2015]. AMGL is a recently proposed parameter-free multiview spectral learning framework [Nie, Li, and Li 2016]. SCMV-3DT uses the t-product in third-order tensor space and represents multiview data by a t-linear combination with sparse and low-rank penalties [Yin et al. 2016]. M2E-TS is the two-step variant of M2E. M2E-DS is the directly shared variant of M2E described in Section 4.
Since SEC is designed for single-view data, we first concatenate all the views and then apply SEC to the concatenated views. For all the spectral clustering based methods, we construct the RBF kernel matrices with the kernel width set to the median distance among all the brain network samples. Following [Von Luxburg 2007], we construct each graph by selecting nearest neighbors among raw data. We tune the parameters of each baseline method using the strategy described in the corresponding paper. There are three main parameters in our model, namely , and , where and are the weight parameters reflecting the importance of different views, and is the number of factors representing the embedding dimension. We apply grid search to determine the optimal values of these three parameters. In particular, we empirically select and from , and is selected from . For evaluation, since each brain network sample in both the HIV and BP datasets has one of two possible labels, patient or control, we set the number of clusters to 2 and test how well our method can group the brain networks of patients and normal controls into two different clusters.
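The kernel construction used for the spectral baselines can be sketched as follows. The sketch assumes that each brain network sample has been flattened into a vector, that distances are Euclidean, and that the kernel takes the form exp(-d^2 / (2 sigma^2)); the function name `rbf_kernel_median` and these scaling details are assumptions, since the text only states that the kernel width is the median distance.

```python
import numpy as np

def rbf_kernel_median(samples):
    """RBF kernel whose width sigma is the median pairwise Euclidean distance."""
    X = np.asarray(samples, dtype=float)
    sq_norms = np.sum(X ** 2, axis=1)
    # Squared distances; clip tiny negatives caused by floating-point rounding.
    d2 = np.maximum(sq_norms[:, None] + sq_norms[None, :] - 2.0 * X @ X.T, 0.0)
    upper = np.triu_indices(len(X), k=1)        # each unordered pair once
    sigma = np.median(np.sqrt(d2[upper]))
    return np.exp(-d2 / (2.0 * sigma ** 2)), sigma

# Three toy samples on a line: pairwise distances are 5, 10 and 5.
points = np.array([[0.0, 0.0], [3.0, 4.0], [6.0, 8.0]])
K, sigma = rbf_kernel_median(points)
```

Choosing the median distance as the width ties the kernel scale to the data, so that typical pairs of samples receive neither near-zero nor near-one similarity.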
In order to make a fair comparison, we apply the “Litekmeans” function in Matlab [Cai 2011] for all the compared methods during their k-means clustering step. We repeat this k-means clustering procedure 20 times with random initialization, as “Litekmeans” greatly depends on initialization. For the evaluation, we run each clustering method 20 times and report the average accuracy, F1 score, precision and recall as the results.
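The evaluation protocol above can be sketched for the two-cluster case as follows. Cluster ids are arbitrary, so the sketch tries both ways of mapping clusters to classes and keeps the more accurate one; this alignment rule, the label coding (1 for patient, 0 for control) and the function names are assumptions of the example, as the text does not state how clusters are matched to labels.

```python
import numpy as np

def binary_clustering_scores(true_labels, cluster_ids):
    """Score a two-cluster result against binary labels (1 = patient, 0 = control).

    Cluster ids must be 0 or 1. Both cluster-to-class mappings are tried and
    the one with the higher accuracy is reported.
    """
    y = np.asarray(true_labels, dtype=int)
    c = np.asarray(cluster_ids, dtype=int)
    best = None
    for pred in (c, 1 - c):
        accuracy = float(np.mean(pred == y))
        if best is not None and accuracy <= best["accuracy"]:
            continue
        tp = int(np.sum((pred == 1) & (y == 1)))
        fp = int(np.sum((pred == 1) & (y == 0)))
        fn = int(np.sum((pred == 0) & (y == 1)))
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
        best = {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}
    return best

def average_scores(true_labels, runs):
    """Average each score over several clustering runs, e.g. random restarts."""
    scores = [binary_clustering_scores(true_labels, run) for run in runs]
    return {key: float(np.mean([s[key] for s in scores])) for key in scores[0]}

# Toy example: three patients, three controls, one clustering run.
labels = [1, 1, 1, 0, 0, 0]
clusters = [0, 0, 1, 1, 1, 1]
scores = binary_clustering_scores(labels, clusters)
```

A method that places nearly all samples in one cluster can reach high recall while its accuracy stays near chance, which is why the four measures are reported together.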
5.3 Clustering Results
Table 2 shows the clustering results. In terms of accuracy, the proposed M2E method performs better than all the baseline methods on both the HIV and BP datasets. The single-view clustering method SEC does not distinguish the features from different views, which leads to poorer performance than the multiview methods. Compared with the subspace method convexSub, M2E achieves better performance, which may be because our method can capture the complex multiway relations in brain networks. The three multiview clustering methods AMGL, multiNMF and CoRegSc share the property that the features they learn for each view are based on vector representations. However, for graph instances, structural information is hardly preserved by flattened vector representations, which could be the underlying reason that these three methods cannot outperform M2E. Moreover, by using tensor techniques to model the multiview multigraph data collectively, M2E can learn discriminative latent representations and graph-specific features. While MIC and SCMV-3DT also use tensors to represent multiview data, their performance does not exceed that of M2E. This is mainly because they still learn a vector representation of each graph, which makes them fail to capture the multigraph structure. M2E-TS cannot explore multiple views and multiple graphs simultaneously; therefore it obtains a worse clustering result than M2E. Besides, compared with M2E-DS, the proposed M2E, which uses the constraint to regularize the clustering solutions obtained from multiple views towards a consensus solution, can find the true clustering more effectively than simply concatenating all the features together.
Note that in terms of F1 score, M2E is the best on the HIV dataset, while MIC is the best on the BP dataset. This is because MIC clusters a large number of samples into the same class, while M2E produces relatively balanced results. The precision and recall results also explain why M2E does not outperform MIC in terms of F1 score. Other methods such as convexSub, AMGL and multiNMF have the same problem as MIC, so their precision or recall is higher than that of M2E. However, in our context, it is not reasonable to cluster all samples into one class, since such a result cannot distinguish patients from controls.
5.4 Parameter Sensitivity Analysis
In this section, to evaluate how the parameters of M2E affect performance, we study the sensitivity of the three main parameters, including , and , where is the parameter of the DTI view, is the parameter of the fMRI view and is the embedding dimension. For evaluating the regularization parameters and , we set to its optimal value.
According to Figures 2(a) and 2(b), we observe that the result is relatively sensitive to the change of and , which shows that different views have different effects on the performance. Besides, when and are very large, the performance gets worse. This shows that and are also important for balancing the first term and the second term of the objective function in Eq. (8).
Figures 3(a) and 3(b) show the performance of M2E with the value varying from to . We can observe that the embedding dimension has a significant effect on the accuracy. The highest accuracy is achieved when equals on the HIV dataset and on the BP dataset. Generally speaking, the performance fluctuates greatly as the rank changes. However, in most cases the optimal value of lies in a small range of values, as demonstrated in [Hao et al. 2013], and it is not time-consuming to find it using grid search in practical applications.
5.5 Factor Analysis
M2E extracts consisting of , for and consisting of , where these factors indicate the signatures of sources in vertex and subject domain, respectively. Due to limited space, we only show the factors on HIV dataset here. We visualize the learning results of and on fMRI and DTI dataset in Figures 4(a) and 4(b).
We show the largest factors in terms of magnitude for fMRI and DTI in Figures 4(a) and 4(b). The left panel shows the node embedded feature . The coordinate system represents neuroanatomy and the color shows the activity intensity of the brain region. The right panel shows the graph embedded feature , which represents the factor strengths for both patients and controls. Based on our objective function in Eq. (8), is used to preserve the individual information of each view and the dependencies among multiple graphs. As we can see from the left panels of Figures 4(a) and 4(b), the embedded neuroanatomy learned from fMRI data and that learned from DTI data differ widely. However, is learned by forcing each view towards the consensus correlation . From the right panels of Figures 4(a) and 4(b), results from both views show that the controls have a relatively positive correlation with the node embedded feature, while the patients have a relatively negative correlation. Moreover, these neuroimaging findings in HIV generally support clinical observations of functional impairments in attention, psychomotor speed, memory, and executive function. In particular, regions identified in our current study are consistent with those reported in structural and functional MRI studies of HIV-associated neurocognitive disorder (HAND), including regions within the frontal and parietal lobes [Risacher and Saykin 2013].
6 Conclusion
We present a novel multiview multigraph embedding framework based on partially-symmetric tensor factorization for brain network analysis. The proposed M2E method not only takes advantage of the complementary and dependent information among multiple views and multiple graphs, but also exploits the graph structures. In particular, we first model the multiview multigraph data as multiple partially-symmetric tensors, and then learn the consensus graph embedding via the integration of tensor factorization and a multiview embedding method. We apply our approach to two real datasets, HIV and BP, each with an fMRI view and a DTI view, for unsupervised brain network analysis. Extensive experimental results demonstrate the effectiveness of M2E for multiview multigraph embedding on brain networks.
Acknowledgements
This work is supported in part by NSF grants No. IIS-1526499 and CNS-1626432, NIH grant No. R01MH080636, and NSFC grants No. 61503253 and 61672313.
References
 Boyd, S.; Parikh, N.; Chu, E.; Peleato, B.; and Eckstein, J. 2011. Distributed optimization and statistical learning via the alternating direction method of multipliers. Foundations and Trends® in Machine Learning 3(1):1–122.
 Cai, D. 2011. Litekmeans: the fastest Matlab implementation of k-means. Software available at: http://www.zjucadcg.cn/dengcai/Data/Clustering.html.
 Cao, B.; He, L.; Kong, X.; Yu, P. S.; Hao, Z.; and Ragin, A. B. 2014. Tensor-based multiview feature selection with applications to brain diseases. In ICDM, 40–49. IEEE.
 Cao, B.; Kong, X.; Zhang, J.; Yu, P. S.; and Ragin, A. B. 2015. Identifying HIV-induced subgraph patterns in brain networks with side information. Brain Informatics 2(4):211–223.
 Guo, Y. 2013. Convex subspace representation learning from multiview data. In AAAI, volume 1, 2.
 Hao, Z.; He, L.; Chen, B.; and Yang, X. 2013. A linear support higher-order tensor machine for classification. IEEE Transactions on Image Processing 22(7):2911–2920.
 Hartigan, J. A., and Wong, M. A. 1979. Algorithm AS 136: A k-means clustering algorithm. Journal of the Royal Statistical Society. Series C (Applied Statistics) 28(1):100–108.
 He, L.; Kong, X.; Yu, P. S.; Yang, X.; Ragin, A. B.; and Hao, Z. 2014. Dusk: A dual structure-preserving kernel for supervised tensor learning with applications to neuroimages. In SDM, 127–135. SIAM.
 He, L.; Lu, C.-T.; Ma, G.; Wang, S.; Shen, L.; Yu, P. S.; and Ragin, A. B. 2017. Kernelized support tensor machines. In ICML, 1442–1451.
 Jie, B.; Zhang, D.; Gao, W.; Wang, Q.; Wee, C.-Y.; and Shen, D. 2014. Integration of network topological and connectivity properties for neuroimaging classification. IEEE Transactions on Biomedical Engineering 61(2):576–589.
 Kong, X., and Yu, P. S. 2014. Brain network analysis: a data mining perspective. ACM SIGKDD Explorations Newsletter 15(2):30–38.
 Kumar, A., and Daumé, H. 2011. A co-training approach for multiview spectral clustering. In ICML, 393–400.
 Kumar, A.; Rai, P.; and Daumé, H. 2011. Co-regularized multiview spectral clustering. In NIPS, 1413–1421.
 Kuo, C.-T.; Wang, X.; Walker, P.; Carmichael, O.; Ye, J.; and Davidson, I. 2015. Unified and contrasting cuts in multiple graphs: application to medical imaging segmentation. In KDD, 617–626. ACM.
 Lin, Z.; Liu, R.; and Su, Z. 2011. Linearized alternating direction method with adaptive penalty for low-rank representation. In NIPS, 612–620.
 Liu, J.; Wang, C.; Gao, J.; and Han, J. 2013a. Multiview clustering via joint nonnegative matrix factorization. In SDM, 252–260. SIAM.
 Liu, W.; Chan, J.; Bailey, J.; Leckie, C.; and Ramamohanarao, K. 2013b. Mining labelled tensors by discovering both their common and discriminative subspaces. In SDM, 614–622. SIAM.
 Ma, G.; He, L.; Cao, B.; Zhang, J.; Yu, P. S.; and Ragin, A. B. 2016. Multigraph clustering based on interior-node topology with applications to brain networks. In ECML-PKDD, 476–492. Springer.
 Ma, G.; He, L.; Lu, C.-T.; Shao, W.; Yu, P. S.; Leow, A. D.; and Ragin, A. B. 2017a. Multiview clustering with graph embedding for connectome analysis. In CIKM.
 Ma, G.; Lu, C.-T.; He, L.; Yu, P. S.; and Ragin, A. B. 2017b. Multiview graph embedding with hub detection for brain network analysis. In ICDM.
 Mousazadeh, S., and Cohen, I. 2015. Embedding and function extension on directed graph. Signal Processing 111:137–149.
 Nie, F.; Zeng, Z.; Tsang, I. W.; Xu, D.; and Zhang, C. 2011. Spectral embedded clustering: A framework for in-sample and out-of-sample spectral clustering. IEEE Transactions on Neural Networks 22(11):1796–1808.
 Nie, F.; Li, J.; and Li, X. 2016. Parameter-free auto-weighted multiple graph learning: A framework for multiview clustering and semi-supervised classification. In IJCAI.
 Ou, M.; Cui, P.; Pei, J.; Zhang, Z.; and Zhu, W. 2016. Asymmetric transitivity preserving graph embedding. In KDD, 1105–1114.
 Parikh, N.; Boyd, S.; et al. 2014. Proximal algorithms. Foundations and Trends® in Optimization 1(3):127–239.
 Risacher, S. L., and Saykin, A. J. 2013. Neuroimaging biomarkers of neurodegenerative diseases and dementia. In Seminars in Neurology, volume 33, 386–416. Thieme Medical Publishers.
 Shao, W.; He, L.; and Yu, P. S. 2015. Clustering on multisource incomplete data via tensor modeling and factorization. In PAKDD, 485–497. Springer.
 Sun, L.; Wang, Y.; Cao, B.; Yu, P. S.; Srisa-an, W.; and Leow, A. D. 2017. Sequential keystroke behavioral biometrics for mobile user identification via multiview deep learning. In ECML-PKDD.
 Tzourio-Mazoyer, N.; Landeau, B.; Papathanassiou, D.; Crivello, F.; Etard, O.; Delcroix, N.; Mazoyer, B.; and Joliot, M. 2002. Automated anatomical labeling of activations in SPM using a macroscopic anatomical parcellation of the MNI MRI single-subject brain. Neuroimage 15(1):273–289.
 Van Loan, C. F. 2016. Structured matrix problems from tensors. In Exploiting Hidden Structure in Matrix Computations: Algorithms and Applications. Springer. 1–63.
 Von Luxburg, U. 2007. A tutorial on spectral clustering. Statistics and Computing 17(4):395–416.
 Whitfield-Gabrieli, S., and Nieto-Castanon, A. 2012. Conn: a functional connectivity toolbox for correlated and anticorrelated brain networks. Brain Connectivity 2(3):125–141.
 Yin, M.; Gao, J.; Xie, S.; and Guo, Y. 2016. Low-rank multiview clustering in third-order tensor space. arXiv preprint arXiv:1608.08336.