Revisiting convolutional neural network on graphs with polynomial approximations of Laplace-Beltrami spectral filtering
Abstract
This paper revisits the spectral graph convolutional neural networks (graph-CNNs) given in Defferrard et al. (2016) and develops the Laplace-Beltrami CNN (LB-CNN) by replacing the graph Laplacian with the LB operator. We then define spectral filters via the LB operator on a graph. We explore the feasibility of Chebyshev, Laguerre, and Hermite polynomials to approximate LB-based spectral filters and define an update of the LB operator for pooling in the LB-CNN. We employ the brain image data from the Alzheimer’s Disease Neuroimaging Initiative (ADNI) and demonstrate the use of the proposed LB-CNN. Based on the cortical thickness of the ADNI dataset, we show that the LB-CNN did not improve classification accuracy compared to the spectral graph-CNN. The three polynomials had similar computational costs and showed comparable classification accuracy in the LB-CNN and spectral graph-CNN. Our findings suggest that even though the shapes of the three polynomials are different, the deep learning architecture allows us to learn spectral filters such that the classification performance is not dependent on the type of the polynomials or the operators (graph Laplacian and LB operator).
Keywords:
Graph convolutional neural network · Signals on surfaces · Chebyshev polynomial · Hermite polynomial · Laguerre polynomial · Laplace-Beltrami operator
Declarations
Funding
This research/project is supported by the National Science Foundation MDS-2010778, National Institutes of Health R01 EB022856, EB02875, and National Research Foundation, Singapore under its AI Singapore Programme (AISG Award No: AISG-GC-2019-002). Additional funding is provided by the Singapore Ministry of Education (Academic Research Fund Tier 1; NUHSRO/2017/052/T1SRP-Partnership/01) and the NUS Institute of Data Science. This research was also supported by the A*STAR Computational Resource Centre through the use of its high-performance computing facilities.
Conflicts of interest/Competing interests
The authors declare that they have no conflict of interest.
Availability of data and material
Data used in preparation of this article were obtained from the Alzheimer’s Disease Neuroimaging Initiative (ADNI) database (adni.loni.ucla.edu).
Code availability
1 Introduction
Graph convolutional neural networks (graph-CNNs) are deep learning techniques that apply to graph-structured data. Graph-structured data are in general complex, which imposes significant challenges on existing convolutional neural network algorithms. Graphs are irregular and have a variable number of unordered vertices with different topology at each vertex. This makes important algebraic operations, such as convolution and pooling, challenging to apply to the graph domain. Hence, existing research on graph-CNNs has focused on defining the convolution and pooling operations.
There are two types of approaches for defining convolution on a graph: one through the spatial domain and the other through the spectral domain Bronstein et al. (2017); Zhang et al. (2020). Existing spatial approaches, such as diffusion-convolutional neural networks (DCNNs) Atwood and Towsley (2015), PATCHY-SAN Niepert et al. (2016); Duvenaud et al. (2015), gated graph sequence neural networks Li et al. (2015), DeepWalk Perozzi et al. (2014), and the message-passing neural network (MPNN) Gilmer et al. (2017), develop convolution in different ways to process the vertices of a graph whose neighborhoods have different sizes and connections. An alternative approach is to take into account the geometry of a graph and to map individual patches of a graph to a representation that is more amenable to classical convolution, including 2D polar coordinate representations Masci et al. (2015), local windowed spectral representations Boscaini et al. (2015), anisotropic variants of heat kernel diffusion filters Boscaini et al. (2016b, a), and Gaussian mixture-model kernels Monti et al. (2016).
On the other hand, several graph-CNN methods, collectively called "spectral graph-CNNs," define convolution in the spectral domain Bruna et al. (2013); Defferrard et al. (2016); Henaff et al. (2015); Kipf and Welling (2016); Yi et al. (2017); Ktena et al. (2017); Shuman et al. (2016). The advantage of spectral graph-CNN methods lies in the analytic formulation of the convolution operation. Based on spectral graph theory, Bruna et al. (2013) proposed convolution on graph-structured data in the spectral domain via the graph Fourier transform. However, the eigendecomposition of the graph Laplacian for building the graph Fourier transform is computationally intensive when a graph is large. Moreover, the spectral filters in Bruna et al. (2013) are non-localized in the spatial domain. Defferrard et al. (2016) addressed these problems by proposing Chebyshev polynomials to parametrize spectral filters such that the resulting convolution is approximated by polynomials of the graph Laplacian. Kipf and Welling (2016) adopted the first-order polynomial filter and stacked more spectral convolutional layers to replace higher-order polynomial expansions. In Defferrard et al. (2016); Shuman et al. (2016), it is shown that the $K$th-order Chebyshev polynomial approximation of graph Laplacian filters performs a $K$-ring filtering operation.
In this study, we revisited the spectral graph-CNN based on the graph Laplacian Defferrard et al. (2016); Shuman et al. (2016) and developed the LB-CNN, where spectral filters are designed via the Laplace-Beltrami (LB) operator on a graph. We call these filters LB-based spectral filters. We investigated whether the proposed LB-CNN is superior to the graph-CNN Defferrard et al. (2016); Shuman et al. (2016), since the LB operator incorporates the intrinsic geometry of a graph better than the graph Laplacian Perrault-Joncas et al. (2017). We further explored the feasibility of Chebyshev, Laguerre, and Hermite polynomials to approximate LB-based spectral filters in the LB-CNN. We chose Laguerre and Hermite polynomials beyond Chebyshev polynomials since these polynomials have the potential to approximate the heat kernel convolution on a graph, as shown in Tan and Qiu (2015); Huang et al. (2020). We employed the brain image data from the Alzheimer’s Disease Neuroimaging Initiative (ADNI) and demonstrated the use of the proposed LB-CNN. We compared the computational time and classification performance of the LB-CNN with those of the spectral graph-CNN Defferrard et al. (2016); Shuman et al. (2016) when Chebyshev, Laguerre, and Hermite polynomials were used.
This study contributes to:
- providing the approximation of LB spectral filters using Chebyshev, Laguerre, and Hermite polynomials and their implementation in the LB-CNN;
- updating the LB operator for pooling in the LB-CNN;
- demonstrating the feasibility of using the LB operator and different polynomials for graph-CNNs.
2 Methods
Similar to the classical CNN and spectral graph-CNN Defferrard et al. (2016), the LB-CNN has three major components: convolution, the rectified linear unit (ReLU), and pooling. In the following, we first describe the LB spectral filters in the convolutional layer and then how to define a pooling operation via coarsening a graph and updating the LB operator.
2.1 Laplace-Beltrami spectral filters
Polynomial approximation of LB spectral filters
Consider the Laplace-Beltrami (LB) operator $\Delta$ on a surface $\mathcal{M}$. Let $\psi_j$ be the $j$th eigenfunction of the LB operator with eigenvalue $\lambda_j$:

(1) $\Delta \psi_j = -\lambda_j \psi_j \, ,$

where $0 = \lambda_0 \leq \lambda_1 \leq \lambda_2 \leq \cdots$. A signal $f$ on the surface $\mathcal{M}$ can be represented as a linear combination of the LB eigenfunctions:

(2) $f(x) = \sum_{j=0}^{\infty} c_j \, \psi_j(x) \, ,$

where $c_j = \langle f, \psi_j \rangle$ is the coefficient associated with the eigenfunction $\psi_j$.

We now consider an LB spectral filter $h$ on $\mathcal{M}$ with spectrum $h(\lambda)$ as

(3) $h(x, y) = \sum_{j=0}^{\infty} h(\lambda_j) \, \psi_j(x) \, \psi_j(y) \, .$

Based on Eq. (2), the convolution of a signal $f$ with the filter $h$ can be written as

(4) $f * h \, (x) = \sum_{j=0}^{\infty} h(\lambda_j) \, c_j \, \psi_j(x) \, .$
As suggested in Defferrard et al. (2016); Wee et al. (2019); Coifman and Maggioni (2006); Hammond et al. (2011); Kim et al. (2012); Tan and Qiu (2015), the filter spectrum $h(\lambda)$ in Eq. (4) can be approximated as an expansion of Chebyshev polynomials $T_k$, $k = 0, 1, \ldots, K$, such that

(5) $h(\lambda) \approx \sum_{k=0}^{K} \theta_k \, T_k(\lambda) \, ,$

where $\theta_k$ is the expansion coefficient associated with the $k$th Chebyshev polynomial. $T_k$ is the Chebyshev polynomial of the form $T_k(\cos\theta) = \cos(k\theta)$. The left panel of Fig. 1 shows the shapes of the Chebyshev polynomials up to order 6. We can rewrite the convolution in Eq. (4) as

(6) $f * h \, (x) \approx \sum_{k=0}^{K} \theta_k \, T_k(\Delta) f(x) \, .$

Likewise, $h(\lambda)$ in Eq. (4) can also be approximated using other polynomials, such as Laguerre or Hermite polynomials Olver et al. (2010). $T_k$ in Eq. (6) can then be replaced by the Laguerre, $L_k$, or Hermite, $H_k$, polynomials, where

(7) $L_k(\lambda) = \frac{e^{\lambda}}{k!} \frac{d^k}{d\lambda^k} \left( e^{-\lambda} \lambda^k \right) \, , \qquad H_k(\lambda) = (-1)^k e^{\lambda^2} \frac{d^k}{d\lambda^k} e^{-\lambda^2} \, .$

In this paper, we adopt the following normalized definition of the Hermite polynomials:

(8) $\widetilde{H}_k(\lambda) = \frac{1}{\sqrt{2^k \, k!}} \, H_k(\lambda) \, ,$

where the inner product of $\widetilde{H}_k$ with itself (with respect to the Hermite weight $e^{-\lambda^2}$) is independent of $k$. The last two panels of Fig. 1 show the shapes of the Laguerre and Hermite polynomials up to order 6, respectively.
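For reference, the following is a minimal sketch of evaluating the three polynomial families through their classical three-term recurrences, which is one way to reproduce the curves in Fig. 1. The recurrence coefficients used here are the standard ones from Olver et al. (2010); insofar as Table 1 is not reproduced above, treating them as the paper's exact Table 1 entries is an assumption.

```python
import numpy as np

def poly_eval(kind, K, x):
    """Evaluate P_0..P_K at points x via classical three-term recurrences."""
    x = np.asarray(x, dtype=float)
    P = np.zeros((K + 1,) + x.shape)
    P[0] = 1.0
    if K == 0:
        return P
    if kind == "chebyshev":    # T_k = 2x T_{k-1} - T_{k-2}
        P[1] = x
        for k in range(2, K + 1):
            P[k] = 2 * x * P[k - 1] - P[k - 2]
    elif kind == "laguerre":   # k L_k = (2k-1-x) L_{k-1} - (k-1) L_{k-2}
        P[1] = 1.0 - x
        for k in range(2, K + 1):
            P[k] = ((2 * k - 1 - x) * P[k - 1] - (k - 1) * P[k - 2]) / k
    elif kind == "hermite":    # H_k = 2x H_{k-1} - 2(k-1) H_{k-2}
        P[1] = 2 * x
        for k in range(2, K + 1):
            P[k] = 2 * x * P[k - 1] - 2 * (k - 1) * P[k - 2]
    return P

# e.g., the Laguerre curves up to order 6 on [0, 1], as plotted in Fig. 1:
curves = poly_eval("laguerre", 6, np.linspace(0, 1, 200))
```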
Numerical implementation of LB spectral filters via polynomial approximations
We now discretize the surface as a triangulated mesh $\mathcal{M}$ with a set of triangles and $n$ vertices $\{p_1, p_2, \ldots, p_n\}$. For the implementation of the LB spectral filters, we adopt the discretization scheme of the LB operator in Tan and Qiu (2015). The element of the LB operator $\Delta$ on $\mathcal{M}$ can be computed as

(9) $\Delta f(p_i) = \frac{1}{2 A_i} \sum_{p_j \in N(p_i)} \left( \cot \alpha_{ij} + \cot \beta_{ij} \right) \left( f(p_j) - f(p_i) \right) \, ,$

where $N(p_i)$ is the set of vertices adjacent to $p_i$, $\alpha_{ij}$ and $\beta_{ij}$ are the two angles opposite the edge connecting $p_i$ and $p_j$, and $A_i$ is estimated by the Voronoi area of non-obtuse triangles Meyer et al. (2003) and the Heron's area of obtuse triangles containing $p_i$ Tan and Qiu (2015); Meyer et al. (2003). The off-diagonal entries of the resulting matrix are defined as $\Delta_{ij} = \left( \cot \alpha_{ij} + \cot \beta_{ij} \right) / (2 A_i)$ if $p_i$ and $p_j$ form an edge, and $\Delta_{ij} = 0$ otherwise. The diagonal entries are computed as $\Delta_{ii} = -\sum_{j \neq i} \Delta_{ij}$. Other cotan discretizations of the LB operator are discussed in Chung and Taylor (2004); Qiu et al. (2006); Chung et al. (2015). When the number of vertices on $\mathcal{M}$ is large, the computation of the LB eigenfunctions can be costly Huang et al. (2020).
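The following is a compact sketch of assembling the discrete operator in Eq. (9) as a sparse matrix, assuming vertex coordinates `verts` (n x 3) and triangle indices `tris` (m x 3). For simplicity it uses one third of the adjacent triangle areas for $A_i$, a common stand-in for the Voronoi/Heron mixed areas described above, so it is an approximation of the paper's exact scheme.

```python
import numpy as np
from scipy import sparse

def cotan_laplacian(verts, tris):
    """Sparse discrete LB operator on a triangulated mesh (cotan scheme).
    Simplified: vertex area A_i is one third of the adjacent triangle areas,
    in place of the Voronoi/Heron mixed areas used in the paper."""
    n = len(verts)
    W = sparse.lil_matrix((n, n))
    A = np.zeros(n)
    for i, j, k in tris:
        for (a, b, c) in [(i, j, k), (j, k, i), (k, i, j)]:
            u, v = verts[b] - verts[a], verts[c] - verts[a]
            area = 0.5 * np.linalg.norm(np.cross(u, v))
            cot = np.dot(u, v) / (2 * area)  # cotangent of the angle at vertex a
            W[b, c] += 0.5 * cot             # edge (b, c) is opposite vertex a
            W[c, b] += 0.5 * cot
            A[a] += area / 3.0
    W = W.tocsr()
    D = sparse.diags(np.asarray(W.sum(axis=1)).ravel())
    # Delta = A^{-1} (W - D): off-diagonal w_ij / A_i, diagonal -sum_j w_ij / A_i
    return sparse.diags(1.0 / A) @ (W - D)
```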
For the sake of simplicity, we denote the $k$th-order polynomial as $P_k$, where $P_k$ can represent the Chebyshev, Laguerre, or Hermite polynomial. We take advantage of the recurrence relations of these polynomials (Table 1) and compute the LB spectral filters recursively as follows (a code sketch is given after the list):

1. compute $\Delta$ based on Eq. (9) for the triangulated mesh $\mathcal{M}$;

2. compute the maximum eigenvalue $\lambda_{\max}$ of $\Delta$. For standardization across surface meshes, we normalize $\Delta$ as $\widetilde{\Delta} = 2\Delta / \lambda_{\max} - I$ such that the eigenvalues are mapped from $[0, \lambda_{\max}]$ to $[-1, 1]$ for the Chebyshev polynomials Defferrard et al. (2016); Huang et al. (2020), where $I$ is the identity matrix. For the Laguerre and Hermite polynomials, we normalize $\Delta$ as $\widetilde{\Delta} = \Delta / \lambda_{\max}$, which maps the eigenvalues from $[0, \lambda_{\max}]$ to $[0, 1]$;

3. for a signal $f$, compute $\bar{f}_k = P_k(\widetilde{\Delta}) f$ recursively by

(10) $\bar{f}_k = \left( a_k \widetilde{\Delta} + b_k I \right) \bar{f}_{k-1} + c_k \bar{f}_{k-2} \, ,$

with the initial conditions $\bar{f}_0 = f$ and $\bar{f}_1 = P_1(\widetilde{\Delta}) f$, where the coefficients $a_k$, $b_k$, and $c_k$ follow from the recurrence relations of the different polynomials given in Table 1. Step 3 is repeated from $k = 2$ till $k = K$.
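A minimal sketch of steps 1–3 for the Chebyshev case is given below. It assumes `Delta` is a symmetric positive semidefinite sparse discretization (so that `eigsh` applies); `theta` stands for the learned expansion coefficients of Eq. (5), which here are simply passed in as an array.

```python
import numpy as np
from scipy import sparse
from scipy.sparse.linalg import eigsh

def chebyshev_filter(Delta, f, theta):
    """Apply the spectral filter sum_k theta[k] T_k(Delta_tilde) f following
    steps 1-3: normalize Delta to [-1, 1], then run the Chebyshev recurrence."""
    n = Delta.shape[0]
    lmax = eigsh(Delta, k=1, return_eigenvectors=False)[0]  # largest eigenvalue
    Dt = (2.0 / lmax) * Delta - sparse.identity(n)          # eigenvalues -> [-1, 1]
    f0, f1 = f, Dt @ f                                      # bar f_0, bar f_1
    out = theta[0] * f0 + theta[1] * f1
    for k in range(2, len(theta)):
        f0, f1 = f1, 2 * (Dt @ f1) - f0                     # T_k = 2x T_{k-1} - T_{k-2}
        out += theta[k] * f1
    return out
```

Only the recurrence step changes for the Laguerre and Hermite variants; the normalization then uses $\widetilde{\Delta} = \Delta / \lambda_{\max}$ as in step 2.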
Localization of spectral filters based on polynomial approximations
Analogous to the spatial localization property of the Chebyshev polynomial approximation of graph Laplacian spectral filters Defferrard et al. (2016), we can show that the Chebyshev, Laguerre, or Hermite polynomial approximation of LB spectral filters also has this localization property. We consider the discretization of $\Delta$ given in Eq. (9). Consider two vertices $p_i$ and $p_j$ on $\mathcal{M}$. We can define the shortest distance between $p_i$ and $p_j$, denoted by $d_{\mathcal{M}}(p_i, p_j)$, as the minimum number of edges on the path connecting $p_i$ and $p_j$. Hence,

(11) $\left( \Delta^k \right)_{ij} = 0 \quad \text{if} \quad d_{\mathcal{M}}(p_i, p_j) > k \, ,$

where $\Delta^k$ denotes the $k$th power of the LB operator Tan and Qiu (2015). In other words, the coverage of $\Delta^k$ is localized in the ball with radius $k$ from the central vertex.

$P_k(\widetilde{\Delta})$ can be represented in terms of $I, \Delta, \ldots, \Delta^k$, so $\left( P_k(\widetilde{\Delta}) \right)_{ij} = 0$ if $d_{\mathcal{M}}(p_i, p_j) > k$ according to Eq. (11). The spectral filter composed of $P_0(\widetilde{\Delta}), P_1(\widetilde{\Delta}), \ldots, P_K(\widetilde{\Delta})$ is a spatially localized filter with localization property given by

(12) $\left( \sum_{k=0}^{K} \theta_k P_k(\widetilde{\Delta}) \right)_{ij} = 0 \quad \text{if} \quad d_{\mathcal{M}}(p_i, p_j) > K \, .$

In practice, we can also show the spatial localization of a filter composed of $P_0(\widetilde{\Delta}), P_1(\widetilde{\Delta}), \ldots, P_K(\widetilde{\Delta})$ by applying it to an impulse signal $\delta_i$ with value 1 at vertex $p_i$ and 0 at the others. The filter output is then $h(\widetilde{\Delta}) \delta_i = \sum_{k=0}^{K} \theta_k P_k(\widetilde{\Delta}) \delta_i$. For vertices $p_j$ satisfying $d_{\mathcal{M}}(p_i, p_j) > K$, since $\left( \Delta^k \right)_{ij} = 0$ for all $k \leq K$, we have

(13) $\left[ h(\widetilde{\Delta}) \delta_i \right](p_j) = 0 \, .$
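This impulse-response argument can be checked empirically. The sketch below is a hypothetical helper: it assumes `Dt` is the normalized discrete LB operator as a scipy sparse matrix, sets all polynomial coefficients to 1 for illustration, and verifies that the output vanishes beyond $K$ rings, as in Eq. (13).

```python
import numpy as np
from scipy import sparse
from scipy.sparse.csgraph import shortest_path

def check_localization(Dt, K, i=0):
    """Empirical check of Eq. (13): an order-K polynomial filter applied to an
    impulse at vertex i is zero beyond K edges (rings) from i."""
    n = Dt.shape[0]
    delta = np.zeros(n)
    delta[i] = 1.0
    # impulse response of a generic order-K polynomial in Dt (coefficients = 1)
    out, fk = np.zeros(n), delta
    for _ in range(K + 1):
        out += fk
        fk = Dt @ fk
    adj = sparse.csr_matrix((abs(Dt) > 0).astype(float))
    hops = shortest_path(adj, unweighted=True, indices=i)
    assert np.allclose(out[hops > K], 0.0)   # support confined to K rings
    return out
```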
2.2 Rectified Linear Unit
Similar to the classical CNN, the rectified linear unit (ReLU) in the LB-CNN can be represented by any of many nonlinear activation functions. The activation function $\sigma$ is a map from $\mathbb{R}$ to $\mathbb{R}$ that does not involve any geometric property of a triangulated mesh. In our proposed LB-CNN on a mesh, we adopt the well-known ReLU, $\sigma(x) = \max(0, x)$, applied element-wise to the filter outputs.
2.3 Mesh coarsening and pooling
For the LB-CNN, the pooling layer involves mesh coarsening, pooling of signals, and an update of the LB operator. First, we adopt the Graclus multilevel clustering algorithm Dhillon et al. (2007) to coarsen a graph based on the graph Laplacian. This algorithm builds on METIS Karypis and Kumar (1998) to cluster similar vertices of a given graph by a greedy algorithm. At each coarsening level, two neighboring vertices with the maximum local normalized cut are matched until all vertices are explored Shi and Malik (2000). In our case, the discrete LB operator in Eq. (9) is used, and the local normalized cut on a mesh is computed by $w_{ij} \left( 1/d_i + 1/d_j \right)$, where $w_{ij}$ is the edge weight from the off-diagonal entries of Eq. (9) and $d_i = \sum_j w_{ij}$. The coarsening process is repeated until the coarsest level is achieved. After coarsening, a balanced binary tree is generated where each node has either one (i.e., singleton) or two child nodes. Fake nodes are added to pair with the singletons, and the weights of the edges involving fake nodes are set to 0. The pooling on this binary tree can then be efficiently implemented as a simple 1-dimensional pooling of size 2.
We now discuss the update of the LB operator for a coarsened mesh. When two matched vertices are merged at a coarser level, the weights of the edges involving the two vertices are merged by summing them. By doing so, each coarsened mesh has its own updated LB operator $\Delta$, as sketched below.
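A minimal sketch of this weight update follows. The `matching` array, which maps each fine vertex to its coarse cluster id (with fake singletons receiving their own id and zero-weight edges), is a hypothetical input produced by the Graclus matching step.

```python
import numpy as np
from scipy import sparse

def coarsen_weights(W, matching):
    """Update edge weights after one coarsening level: the weights of edges
    incident to two matched (merged) vertices are summed, per Section 2.3."""
    n_fine = len(matching)
    n_coarse = matching.max() + 1
    # P[i, c] = 1 if fine vertex i belongs to coarse vertex c
    P = sparse.csr_matrix((np.ones(n_fine), (np.arange(n_fine), matching)),
                          shape=(n_fine, n_coarse))
    return P.T @ W @ P   # summed edge weights between merged clusters
```

The updated LB operator of the coarsened mesh is then rebuilt from the merged weights as in Eq. (9).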
2.4 LB-CNN Architecture
We are now equipped with all the components of an LB-CNN network. The LB-CNN network is composed of $S$ connected stages in total. The first $S - 1$ stages perform feature extraction. Each stage contains three sequentially concatenated layers: (1) a convolutional layer with multiple LB spectral filters; (2) a ReLU layer; (3) an average-pooling layer with stride 2 or higher. In the last stage, a fully connected layer followed by a softmax function is employed to make a decision, and the output layer contains the classification labels.
Fig. 2 illustrates one of the LB-CNN architectures, which is analogous to a classical CNN for image data defined on equispaced grids. In this example, each convolutional layer is composed of LB spectral filters that can be approximated using Chebyshev, Laguerre, or Hermite polynomials, followed by a ReLU and an average pooling whose stride equals the pooling size. In the fully connected layer, there are multiple hidden nodes, and a norm regularization of the weights is applied to prevent overfitting.
All the networks can be trained by the backpropagation algorithm with a fixed number of epochs, a fixed mini-batch size, an initial learning rate with periodic decay, momentum, and no dropout. A sketch of one convolutional layer is given below.
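The following is a minimal PyTorch sketch of one LB spectral convolutional layer with a ReLU, under the assumptions that `Dt` is the normalized LB operator as a sparse tensor and that the Chebyshev variant is used; `PolyConv` and all sizes are illustrative names and placeholders, not the paper's exact implementation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class PolyConv(nn.Module):
    """One LB-CNN convolutional layer: y = sum_k theta_k P_k(Dt) x, with the
    polynomial recurrence unrolled at run time (Chebyshev variant shown)."""
    def __init__(self, in_ch, out_ch, K):
        super().__init__()
        # theta[k]: (in_ch, out_ch) trainable coefficients for order k
        self.theta = nn.Parameter(torch.randn(K + 1, in_ch, out_ch) * 0.01)

    def forward(self, x, Dt):   # x: (n_vertices, in_ch), Dt: sparse (n, n)
        f0, f1 = x, torch.sparse.mm(Dt, x)
        out = f0 @ self.theta[0] + f1 @ self.theta[1]
        for k in range(2, self.theta.shape[0]):
            f0, f1 = f1, 2 * torch.sparse.mm(Dt, f1) - f0  # Chebyshev recurrence
            out = out + f1 @ self.theta[k]
        return F.relu(out)
```

Stacking such layers with the pooling of Section 2.3 and a final fully connected softmax layer yields the architecture of Fig. 2.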
2.5 MRI data acquisition and preprocessing
We utilized the structural T1-weighted MRI from the ADNI2 cohort (adni.loni.ucla.edu). The aim of this study was to illustrate the use of the LB-CNN and spectral graph-CNN via healthy control (HC) vs. Alzheimer's disease (AD) classification, since this task has been well studied using T1-weighted image data (e.g., Cuingnet et al. (2011); Liu et al. (2013); Hosseini-Asl et al. (2016); Korolev et al. (2017); Liu et al. (2018); Islam and Zhang (2018); Basaia et al. (2019); Wee et al. (2019)). Hence, this study involved 643 subjects with HC or AD scans (392 subjects had HC scans; 253 subjects had AD scans). Eight subjects fell into both groups due to conversion from HC to AD. In total, there were 1,122 scans for HC and 587 for AD.
The MRI data of the ADNI2 cohort were acquired from participants aged 55 to 90 years using either 1.5T or 3T scanners. The T1-weighted images were segmented using FreeSurfer (version 5.3.0) Fischl et al. (2002). The white and pial cortical surfaces were generated at the boundary between white and gray matter and at the boundary between gray matter and CSF, respectively. Cortical thickness was computed as the distance between the white and pial cortical surfaces; it represents the depth of the cortical ribbon. We represented cortical thickness on the mean surface, i.e., the average of the white and pial cortical surfaces. We employed large deformation diffeomorphic metric mapping (LDDMM) Zhong et al. (2010); Du et al. (2011) to align individual cortical surfaces to the atlas and transferred the cortical thickness of each subject to the atlas. The cortical atlas surface was represented as a triangulated mesh with 655,360 triangles and 327,684 vertices. At each surface vertex, a spline regression implemented by piecewise step functions James et al. (2013) was performed to regress out the effects of age and gender (a sketch is given below). The residuals from the regression were used in the spectral graph-CNN and LB-CNN below.
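A small sketch of the piecewise step-function regression follows. The age-bin edges, variable names, and the simple dummy coding are illustrative assumptions; the paper's exact knot placement is not specified above.

```python
import numpy as np

def regress_out(thickness, age, sex, bins=(55, 65, 75, 85, 91)):
    """Per-vertex regression via piecewise step functions (dummy-coded age
    bins plus sex), returning residuals. Bin edges here are illustrative."""
    idx = np.digitize(age, bins)                 # age-bin membership per scan
    X = np.column_stack([np.ones_like(age, float), sex.astype(float)] +
                        [(idx == b).astype(float) for b in np.unique(idx)[1:]])
    # thickness: (n_scans, n_vertices); solve all vertices at once
    beta, *_ = np.linalg.lstsq(X, thickness, rcond=None)
    return thickness - X @ beta                  # residuals fed to the CNNs
```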
3 Results
3.1 Spatial localization of the LB spectral filters via polynomial approximations
Fig. 3 shows the localization property of the spectral filters using the Chebyshev, Laguerre, and Hermite polynomials. The input signal is 1 at only one vertex and 0 at all other vertices of the hippocampus. The filtered impulse $P_k(\widetilde{\Delta}) \delta_i$ is strictly localized in a ball of radius $k$, i.e., $k$ rings from the central vertex.

Now consider a signal having 1 on a small patch (see the first panel in Fig. 4) and 0 on the rest of a hippocampus surface mesh with 1,184 vertices and 2,364 triangles. Fig. 4 shows the convolutions of this signal with spectral filters approximated by the Chebyshev, Laguerre, and Hermite polynomials. The spectral filters designed by the different polynomials show different impacts on the signal in the spatial domain.
3.2 Comparison of spectral graph-CNN and LB-CNN
Table 2: Accuracy, sensitivity, specificity, and G-Mean of the spectral graph-CNN and LB-CNN in classifying AD and HC.

Spectral CNN | Polynomial | Layers | Order | ACC (%) | SEN (%) | SPE (%) | G-Mean (%)
Graph | Chebyshev | 4 | 6 | | | |
Graph | Laguerre | 5 | 7 | | | |
Graph | Hermite | 3 | 7 | | | |
LB | Chebyshev | 5 | 7 | | | |
LB | Laguerre | 5 | 7 | | | |
LB | Hermite | 4 | 7 | | | |

ACC: accuracy; SEN: sensitivity; SPE: specificity; G-Mean: geometric mean.
We compared the computational cost and classification accuracy of the spectral graph-CNN Defferrard et al. (2016); Wee et al. (2019) and the LB-CNN on the cortical thickness of the HC subjects and AD patients when Chebyshev, Laguerre, and Hermite polynomials were used to approximate the spectral filters.
In our experiments, the architectures of the spectral graph-CNN and LB-CNN were the same as shown in Fig. 2, except for the number of layers. Ten-fold cross-validation was applied to the dataset. One fold of real data was left out for testing, and the remaining nine folds were further separated randomly into training and validation sets. To prevent potential data leakage in the ten-fold cross-validation, we constructed non-overlapping training, validation, and testing sets with respect to subjects instead of MRI scans (a sketch is given below). This ensured that scans from the same subject were in the same set. The data splitting was done for the HC and AD groups separately so that the ratio of the number of subjects in the two groups was similar in all sets.
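A sketch of such a subject-level split is given below using scikit-learn's GroupKFold; the use of this particular library, the function name, and the validation ratio are illustrative assumptions, since the paper's exact split ratios are not reproduced above.

```python
import numpy as np
from sklearn.model_selection import GroupKFold

def subject_level_folds(subject_ids, n_splits=10, seed=0):
    """Ten-fold split where scans of the same subject (group) never appear in
    more than one of the training/validation/test sets."""
    subject_ids = np.asarray(subject_ids)
    n_scans = len(subject_ids)
    gkf = GroupKFold(n_splits=n_splits)
    rng = np.random.default_rng(seed)
    for train_val, test in gkf.split(np.zeros((n_scans, 1)), groups=subject_ids):
        # further split train_val into training and validation by subject
        subs = np.unique(subject_ids[train_val])
        rng.shuffle(subs)
        val_subs = set(subs[: max(1, len(subs) // 9)])   # illustrative ratio
        is_val = np.isin(subject_ids[train_val], list(val_subs))
        yield train_val[~is_val], train_val[is_val], test
```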
Computational Cost
The computational cost of the LB-CNN was similar to that of the spectral graph-CNN, since the two only differ in the edge weights between vertices. For instance, when the network with 3 convolutional layers and the 6th-order Hermite approximation was used, the training times of the spectral graph-CNN and LB-CNN over the ten-fold cross-validation showed no significant difference (two-sample $t$-test).
Next, we compared the computational cost of the LB-CNN with 3 convolutional layers using the different polynomial approximations. Fig. 5 shows the training time of the LB-CNNs using the Chebyshev, Laguerre, and Hermite approximations of increasing order $K$. Regardless of which polynomial was used, the training time increased as $K$ increased, since more trainable parameters were needed to characterize the spectral filters. Given the same $K$, the three polynomial approximation methods had similar computational cost.
Classification Performance
To compare the classification performance of the spectral graph-CNN and LB-CNN on HC and AD, the number of convolutional layers and the polynomial approximation order were tuned for each CNN independently to achieve the best classification accuracy and geometric mean (G-Mean) on the validation set. The spectral graph-CNNs with the Chebyshev, Laguerre, and Hermite approximations respectively required 4 convolutional layers with a polynomial order of 6, 5 layers with order 7, and 3 layers with order 7. The LB-CNNs with the Chebyshev and Laguerre approximations needed 5 layers with order 7, while the LB-CNN with the Hermite approximation required 4 layers with order 7. Table 2 lists the accuracy, sensitivity, specificity, and G-Mean of all these CNNs in classifying AD and HC.
A two-sample $t$-test found no significant difference in classification accuracy between the spectral graph-CNN and LB-CNN. For instance, when Chebyshev polynomials were used to approximate the spectral filters, the classification accuracies of the spectral graph-CNN and the LB-CNN were statistically indistinguishable. Likewise, there were no group differences in classification accuracy between the spectral graph-CNN and LB-CNN when the Laguerre and Hermite polynomial approximations were used. Hence, the LB-CNN classification performance can be viewed as equivalent to that of the spectral graph-CNN.
As for the comparisons among the three different polynomials, the classification performance of the Laguerre approximation was comparable to that of the Chebyshev approximation for both the graph-CNN and the LB-CNN. However, the classification performance of both the Chebyshev and Laguerre polynomial approximations was greater than that of the Hermite polynomial approximation. Consistent with this, in Huang et al. (2020), the Hermite polynomial approximation showed slower convergence to the heat kernel compared to the Chebyshev and Laguerre polynomial approximations.
4 Discussion

In this study, we revisited the spectral graph-CNN Defferrard et al. (2016); Shuman et al. (2016) and developed the LB-CNN by replacing the graph Laplacian with the LB operator. We also employed Chebyshev, Laguerre, and Hermite polynomials to approximate the LB spectral filters in the LB-CNN and spectral graph-CNN. Based on the cortical thickness of the ADNI dataset, we showed that the LB-CNN did not improve classification accuracy compared to the spectral graph-CNN Defferrard et al. (2016); Shuman et al. (2016). The three polynomials had similar computational costs and showed comparable classification accuracy in both the LB-CNN and the spectral graph-CNN Defferrard et al. (2016); Shuman et al. (2016). Our findings suggest that even though the shapes of the three polynomials are different, the deep learning architecture allows us to learn spectral filters such that the classification performance does not depend on the type of polynomial or operator (graph Laplacian or LB operator).
References
 Atwood J, Towsley D (2015) Diffusion-convolutional neural networks. arXiv preprint arXiv:1511.02136
 Basaia S, Agosta F, Wagner L, Canu E, Magnani G, Santangelo R, Filippi M, Initiative ADN, et al. (2019) Automated classification of Alzheimer's disease and mild cognitive impairment using a single MRI and deep neural networks. NeuroImage: Clinical 21:101645
 Boscaini D, Masci J, Melzi S, Bronstein MM, Castellani U, Vandergheynst P (2015) Learning classspecific descriptors for deformable shapes using localized spectral convolutional networks. Computer Graphics Forum 34(5):13–23
 Boscaini D, Masci J, Rodolà E, Bronstein M (2016a) Learning shape correspondence with anisotropic convolutional neural networks. In: NIPS'16 Proceedings of the 30th International Conference on Neural Information Processing Systems, ACM, pp 3197–3205
 Boscaini D, Masci J, Rodolà E, Bronstein MM, Cremers D (2016b) Anisotropic diffusion descriptors. Computer Graphics Forum 35(2):431–441
 Bronstein M, Bruna J, LeCun Y, Szlam A, Vandergheynst P (2017) Geometric deep learning: going beyond euclidean data. IEEE Signal Processing Magazine 34(4):18–42
 Bruna J, Zaremba W, Szlam A, LeCun Y (2013) Spectral networks and locally connected networks on graphs. arXiv preprint arXiv:1312.6203
 Chung M, Taylor J (2004) Diffusion smoothing on brain surface via finite element method. In: Proceedings of IEEE International Symposium on Biomedical Imaging (ISBI), vol 1, pp 432–435
 Chung M, Qiu A, Seo S, Vorperian H (2015) Unified heat kernel regression for diffusion, kernel smoothing and wavelets on manifolds and its application to mandible growth modeling in CT images. Medical Image Analysis 22:63–76
 Coifman RR, Maggioni M (2006) Diffusion wavelets. Applied and Computational Harmonic Analysis 21(1):53–94
 Cuingnet R, Gerardin E, Tessieras J, Auzias G, Lehéricy S, Habert MO, Chupin M, Benali H, Colliot O (2011) Automatic classification of patients with Alzheimer’s disease from structural MRI: a comparison of ten methods using the ADNI database. Neuroimage 56(2):766–781
 Defferrard M, Bresson X, Vandergheynst P (2016) Convolutional neural networks on graphs with fast localized spectral filtering. In: Proceedings of the 30th International Conference on Neural Information Processing Systems, NIPS, pp 3844–3852
 Dhillon IS, Guan Y, Kulis B (2007) Weighted graph cuts without eigenvectors: a multilevel approach. IEEE Transactions on Pattern Analysis and Machine Intelligence 29(11):1944–1957
 Du J, Younes L, Qiu A (2011) Whole brain diffeomorphic metric mapping via integration of sulcal and gyral curves, cortical surfaces, and images. NeuroImage 56(1):162–173
 Duvenaud DK, Maclaurin D, Aguilera-Iparraguirre J, Gomez-Bombarelli R, Hirzel T, Aspuru-Guzik A, Adams RP (2015) Convolutional networks on graphs for learning molecular fingerprints. arXiv preprint arXiv:1509.09292
 Fischl B, Salat DH, Busa E, Albert M, Dieterich M, Haselgrove C, Van Der Kouwe A, Killiany R, Kennedy D, Klaveness S, et al. (2002) Whole brain segmentation: automated labeling of neuroanatomical structures in the human brain. Neuron 33(3):341–355
 Gilmer J, Schoenholz SS, Riley PF, Vinyals O, Dahl GE (2017) Neural message passing for quantum chemistry. arXiv preprint arXiv:1704.01212
 Hammond DK, Vandergheynst P, Gribonval R (2011) Wavelets on graphs via spectral graph theory. Applied and Computational Harmonic Analysis 30(2):129 – 150
 Henaff M, Bruna J, LeCun Y (2015) Deep convolutional networks on graph-structured data. arXiv preprint arXiv:1506.05163
 Hosseini-Asl E, Keynton R, El-Baz A (2016) Alzheimer's disease diagnostics by adaptation of 3D convolutional network. In: 2016 IEEE International Conference on Image Processing (ICIP), IEEE, pp 126–130
 Huang SG, Lyu I, Qiu A, Chung MK (2020) Fast polynomial approximation of heat kernel convolution on manifolds and its application to brain sulcal and gyral graph pattern analysis. IEEE Transactions on Medical Imaging 39(6):2201–2212
 Islam J, Zhang Y (2018) Brain MRI analysis for Alzheimer's disease diagnosis using an ensemble system of deep convolutional neural networks. Brain Informatics 5(2):2
 James G, Witten D, Hastie T, Tibshirani R (2013) An introduction to statistical learning, vol 112. Springer
 Karypis G, Kumar V (1998) A fast and high quality multilevel scheme for partitioning irregular graphs. SIAM Journal on Scientific Computing 20(1):359–392
 Kim WH, Pachauri D, Hatt C, Chung MK, Johnson S, Singh V (2012) Wavelet based multiscale shape features on arbitrary surfaces for cortical thickness discrimination. In: Advances in Neural Information Processing Systems, pp 1241–1249
 Kipf TN, Welling M (2016) Semi-supervised classification with graph convolutional networks. arXiv preprint arXiv:1609.02907
 Korolev S, Safiullin A, Belyaev M, Dodonova Y (2017) Residual and plain convolutional neural networks for 3D brain MRI classification. In: 2017 IEEE 14th International Symposium on Biomedical Imaging (ISBI 2017), IEEE, pp 835–838
 Ktena SI, Parisot S, Ferrante E, Rajchl M, Lee M, Glocker B, Rueckert D (2017) Distance metric learning using graph convolutional networks: Application to functional brain networks. arXiv preprint arXiv:1703.02161
 Li Y, Tarlow D, Brockschmidt M, Zemel R (2015) Gated graph sequence neural networks. arXiv preprint arXiv:1511.05493
 Liu M, Zhang J, Adeli E, Shen D (2018) Landmarkbased deep multiinstance learning for brain disease diagnosis. Medical image analysis 43:157–168
 Liu X, Tosun D, Weiner MW, Schuff N, Initiative ADN, et al. (2013) Locally linear embedding (LLE) for MRI based Alzheimer's disease classification. Neuroimage 83:148–157
 Masci J, Boscaini D, Bronstein MM, Vandergheynst P (2015) Geodesic convolutional neural networks on Riemannian manifolds. In: Computer Vision (ICCV), 2015 IEEE International Conference on, IEEE, pp 832–840
 Meyer M, Desbrun M, Schröder P, Barr AH (2003) Discrete differentialgeometry operators for triangulated 2manifolds. In: Visualization and mathematics III, Springer, pp 35–57
 Monti F, Boscaini D, Masci J, Rodolà E, Svoboda J, Bronstein MM (2016) Geometric deep learning on graphs and manifolds using mixture model CNNs. arXiv preprint arXiv:1611.08402
 Niepert M, Ahmed M, Kutzkov K (2016) Learning convolutional neural networks for graphs. In: Proceedings of the 33rd International Conference on Machine Learning, ACM, pp 2014–2023
 Olver FWJ, Lozier DW, Boisvert RF, Clark CW (2010) NIST handbook of mathematical functions. Cambridge University Press
 Perozzi B, Al-Rfou R, Skiena S (2014) DeepWalk: Online learning of social representations. In: Proceedings of the 20th ACM SIGKDD, ACM, pp 701–710
 Perrault-Joncas DC, Meilă M, McQueen J (2017) Improved graph Laplacian via geometric consistency. In: Proceedings of the 31st International Conference on Neural Information Processing Systems, pp 4460–4469
 Qiu A, Bitouk D, Miller M (2006) Smooth functional and structural maps on the neocortex via orthonormal bases of the LaplaceBeltrami operator. IEEE Transactions on Medical Imaging 25:1296–1396
 Shi J, Malik J (2000) Normalized cuts and image segmentation. IEEE Transactions on Pattern Analysis and Machine Intelligence 22(8):888–905
 Shuman DI, Ricaud B, Vandergheynst P (2016) Vertex-frequency analysis on graphs. Applied and Computational Harmonic Analysis 40(2):260–291
 Tan M, Qiu A (2015) Spectral LaplaceBeltrami wavelets with applications in medical images. IEEE Transactions on Medical Imaging 34:1005–1017
 Wee CY, Liu C, Lee A, Poh JS, Ji H, Qiu A, Initiative ADN (2019) Cortical graph neural network for AD and MCI diagnosis and transfer learning across populations. NeuroImage: Clinical 23:101929
 Yi L, Su H, Guo X, Guibas L (2017) SyncSpecCNN: Synchronized spectral CNN for 3D shape segmentation. In: Computer Vision and Pattern Recognition (CVPR), Conference on, IEEE, pp 6584–6592
 Zhang Z, Cui P, Zhu W (2020) Deep learning on graphs: A survey. IEEE Transactions on Knowledge and Data Engineering
 Zhong J, Phua DYL, Qiu A (2010) Quantitative evaluation of lddmm, freesurfer, and caret for cortical surface mapping. Neuroimage 52(1):131–141