Bridging the Gap between Spatial and Spectral Domains: A Survey on Graph Neural Networks
Abstract
The success of deep learning has been widely recognized in many machine learning tasks during the last decade, ranging from image classification and speech recognition to natural language understanding. As an extension of deep learning, graph neural networks (GNNs) are designed to solve non-Euclidean problems on graph-structured data that can hardly be handled by general deep learning techniques. Existing GNNs under various mechanisms, such as random walk, PageRank, graph convolution, and heat diffusion, are designed for different types of graphs and problems, which makes it difficult to compare them directly. Previous GNN surveys focus on categorizing current models into independent groups, lacking analysis of their internal connections. This paper proposes a unified framework and provides a novel perspective that fits a wide range of existing GNNs into our framework methodologically. Specifically, we survey and categorize existing GNN models into the spatial and spectral domains, and reveal the connections among subcategories in each domain. Further analysis establishes a strong link across the spatial and spectral domains.
1 Introduction
The effectiveness of deep learning [28] has been widely recognized in various machine learning tasks [42, 43, 22, 49, 35] during the last decades, achieving remarkable success on Euclidean data. Recent decades have witnessed a great number of emerging applications where effective information analysis generally boils down to the non-Euclidean geometry of data represented by a graph, such as social networks [27], transportation networks [4], the spread of epidemic disease [38], the brain's neuronal networks [36], gene data on biological regulatory networks [12], telecommunication networks [14], and knowledge graphs [32]. Such non-Euclidean problems on graph-structured data can hardly be handled by general deep learning techniques. Modeling data by graphs is challenging because graph data is irregular: each graph has a variable number of nodes, and each node in a graph has a different number of neighbors, rendering operations such as convolution not directly applicable to the graph structure. Recently, there has been increasing interest in extending deep learning to graph data. Inspired by the success of deep learning, ideas are borrowed from deep learning models to handle the intrinsic complexity of graphs. This rising trend attracts increasing interest in the machine learning community, and a large number of GNN models have been developed based on various theories [8, 25, 13, 19, 2, 46].
Although GNNs have dominated graph representation learning in recent years, there is still limited understanding of their representational power and physical meaning. This lack of understanding significantly hinders the comparison and improvement of state-of-the-art methods. The gap also makes it challenging to extend GNNs to domains such as business intelligence or drug development, since black-box models may be associated with uncontrollable risks. Therefore, there is a pressing need to demystify GNNs, which motivates researchers to explore a generalized framework for GNNs [53, 16, 55]. However, these works can only explain a few GNNs, and an interpretation for the majority of GNNs is still missing.
There exist a large number of different mechanisms among current GNNs, such as random walk, PageRank, attention models, low-pass filtering, message passing, and so on. These methods can be classified into several coarse-grained groups [37, 20, 56, 58, 50], such as the spectral [8, 25, 13] and spatial [19, 2, 46] domains. However, the current taxonomies fail to provide an understanding of the connections among different GNN models. Elucidating the underlying mechanisms of GNNs and understanding the connections among all types of GNNs is still at the forefront of GNN research [53, 48, 31, 30]. This work is not trivial, since the mechanisms behind existing GNNs are not inherently consistent, so their internal connections remain unclear. This gap incurs difficulty in understanding GNNs and comparing emerging methods. Previous surveys of GNNs [37, 20, 56, 58, 50] focus on categorizing current models into independent groups and expounding each group separately, without analysis of their relationship.
The objective of this paper is to provide a unified framework that generalizes GNNs, bridging the gap between the existing works in the spatial and spectral domains, which are currently deemed independent. The main focus of this work is the connection among GNNs from a theoretical perspective, going beyond existing taxonomies and designing a new scheme for GNNs. Our research is unique in how it links present works from various categories of GNNs. First, we briefly introduce the proposed framework, including the spatial and spectral domains, and present their internal connection. Then, detailed subcategories of the spatial and spectral domains are provided, and several popular GNN examples are used to illustrate our taxonomies in each subcategory. The contributions of this paper are summarized as follows:

- Proposing a taxonomy for summarizing GNN approaches in the spatial domain. This paper unifies GNN methods in the spatial domain by formulating the spatial operation on graph connectivity. GNNs can then be treated as the same function of a graph matrix with different configurations.

- Providing a taxonomy for summarizing GNN methods in the spectral domain. This survey categorizes GNN models by their frequency response function in the spectral domain, and applies approximation theory to illuminate their generalization and specialization relationships.

- Incorporating spatial and spectral models into a unified framework. By comparing the analytical forms, the proposed framework links the frequency response function of the spectral domain and the node aggregation function of the spatial domain.
The rest of this survey is organized as follows. In Section 2, we introduce the problem setup and necessary preliminaries. Then we present our proposed taxonomy in Section 3. Sections 4 and 5 elaborate the details of the proposed taxonomy. We conclude the survey in Section 6.
2 Problem Setup and Preliminary
This section outlines the background of graph neural networks and problem setup.
Definition 2.1 (Graph)
A graph is defined as $\mathcal{G} = (\mathcal{V}, \mathcal{E})$, where $\mathcal{V}$ is a set of $n$ nodes and $\mathcal{E}$ represents the edges. An entry $v \in \mathcal{V}$ denotes a node, and $e_{ij} \in \mathcal{E}$ indicates an edge between nodes $i$ and $j$. The adjacency matrix $\mathbf{A} \in \mathbb{R}^{n \times n}$ is defined by $\mathbf{A}_{ij} = 1$ iff there is a link between nodes $i$ and $j$, and $\mathbf{A}_{ij} = 0$ otherwise. The graph signal or node attributes $\mathbf{X} \in \mathbb{R}^{n \times d}$ is a feature matrix, with each row $\mathbf{x}_v$ representing the feature vector on node $v$.
Learning node-level embeddings can be summarized as

$\mathbf{H} = f(\mathbf{A}, \mathbf{X}; \mathbf{W})$, (1)

where $\mathbf{W}$ indicates the parameters of the model. We aim to find a function $f$ that integrates the graph structure and the original node attributes, outputting a new node embedding $\mathbf{H}$. Since this review focuses on two major categories, i.e., spectral and spatial methods, two related definitions are listed below for understanding this paper.
Definition 2.2 (Spatial Method)
Treating the graph Laplacian [11] as the spatial connectivity among nodes, spatial methods integrate the connectivity (denoted $\bar{\mathbf{A}}$ in its normalized adjacency form) and the signal $\mathbf{X}$:

$\mathbf{H} = f_{\text{spatial}}(\bar{\mathbf{A}}, \mathbf{X})$. (2)

Therefore, spatial methods focus on finding a function $f_{\text{spatial}}$.
Definition 2.3 (Spectral Method)
The graph Laplacian is defined as $\mathbf{L} = \mathbf{D} - \mathbf{A}$, where $\mathbf{D}$ is the degree matrix. Due to its generalization ability [7], the normalized Laplacian is defined as $\hat{\mathbf{L}} = \mathbf{I} - \mathbf{D}^{-1/2}\mathbf{A}\mathbf{D}^{-1/2}$. The Laplacian is diagonalized by the Fourier basis $\mathbf{U}$ (i.e., graph Fourier transform) [44, 59]: $\hat{\mathbf{L}} = \mathbf{U}\boldsymbol{\Lambda}\mathbf{U}^{\top}$, where $\boldsymbol{\Lambda}$ is the diagonal matrix whose diagonal elements are the corresponding eigenvalues (i.e., $\boldsymbol{\Lambda}_{ii} = \lambda_i$), and the columns of $\mathbf{U}$ are the eigenvectors. The graph Fourier transform of a signal $\mathbf{x}$ is defined as $\hat{\mathbf{x}} = \mathbf{U}^{\top}\mathbf{x}$, and its inverse as $\mathbf{x} = \mathbf{U}\hat{\mathbf{x}}$. A graph convolution operation is defined in the Fourier domain such that

$\mathbf{x} *_{\mathcal{G}} \mathbf{y} = \mathbf{U}\big((\mathbf{U}^{\top}\mathbf{x}) \odot (\mathbf{U}^{\top}\mathbf{y})\big)$, (3)

where $\odot$ is the element-wise product and $\mathbf{x}$, $\mathbf{y}$ are two signals defined on the node domain. It follows that a node signal $\mathbf{x}$ is filtered by a spectral signal as:

$\mathbf{H} = \mathbf{U}\,g_{\theta}(\boldsymbol{\Lambda})\,\mathbf{U}^{\top}\mathbf{x}$, (4)

where $g_{\theta}(\cdot)$ is known as the frequency response function of the filter. Therefore, spectral methods are defined as learning $g_{\theta}$.
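The definitions above can be made concrete with a small numerical sketch; the 4-node path graph, the sample signal, and the low-pass response below are illustrative assumptions, not taken from the survey.

```python
import numpy as np

# Toy graph: a 4-node path. Build the normalized Laplacian of Definition 2.3.
A = np.array([[0, 1, 0, 0],
              [1, 0, 1, 0],
              [0, 1, 0, 1],
              [0, 0, 1, 0]], dtype=float)
d = A.sum(axis=1)
D_inv_sqrt = np.diag(1.0 / np.sqrt(d))
L_norm = np.eye(4) - D_inv_sqrt @ A @ D_inv_sqrt   # normalized Laplacian

# Eigendecomposition yields the graph Fourier basis U and frequencies lam.
lam, U = np.linalg.eigh(L_norm)

x = np.array([1.0, -1.0, 1.0, -1.0])               # a node signal
x_hat = U.T @ x                                    # graph Fourier transform
x_back = U @ x_hat                                 # inverse transform recovers x

# Spectral filtering: scale each frequency component by g(lambda);
# g(lam) = 1 - lam/2 is an example low-pass response.
g = 1.0 - lam / 2.0
x_filtered = U @ (g * x_hat)
```

Note that filtering with $g(\lambda) = 1 - \lambda/2$ is identical to multiplying by $\mathbf{I} - \hat{\mathbf{L}}/2$ in the node domain, a first glimpse of the spatial-spectral link developed later.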
3 The proposed taxonomy
As shown in Fig. (1), the proposed framework categorizes GNNs into the spatial (A0) and spectral (B0) groups, each of which is further divided into three subcategories. By transforming the analytical forms of these subcategories, we find three equivalence relations, as below:
(A1) Local Aggregation ⇔ (B1) Frequency Aggregation: through graph and matrix theory, Local Aggregation, which adjusts weights on a set of neighbor nodes, corresponds to adjusting weights on frequency components in Frequency Aggregation.

(A2) Connectivity Order ⇔ (B2) Approximation Order: accumulating different orders of neighbors in Connectivity Order can be rewritten as a sum over different orders of frequency components, which is exactly the analytical form of Approximation Order.

(A3) Propagation Type ⇔ (B3) Approximation Type: Propagation Type defines label propagation with or without reverse propagation, while Approximation Type adjusts the filter function with or without a non-trivial denominator (i.e., a denominator other than 1). In this way, they share the same formula after a simple transformation.
Sections 4 and 5 discuss the details of the proposed taxonomy and illustrate each subcategory with several GNN examples.
4 Spatial-based GNNs (A0)
Several important aspects are often discussed in the existing literature on spatial methods, such as self-loops, normalization, high-order neighbors, and aggregation and combination among nodes. Based on these operations, we propose a new taxonomy of graph neural networks, categorizing spatial-based GNNs into three groups:
4.1 Local Aggregation (A1)
A number of works [41, 53, 52, 16, 19, 46] can be treated as learning an aggregation scheme among first-order neighbors (i.e., direct neighbors). This aspect focuses on adjusting the weights of a node and its neighbors to reveal the pattern relevant to the supervision signal. Formally, the updated node embedding $\mathbf{h}_i$ can be written as:

$\mathbf{h}_i = \phi_0(\mathbf{x}_i) + \sum_{j \in \mathcal{N}(i)} \phi_1(\mathbf{x}_j)$, (5)

where $j \in \mathcal{N}(i)$ denotes a neighbor of node $i$, $\mathbf{x}_j$ is its representation, and $\phi_0$, $\phi_1$ indicate the weight functions. The first term on the right-hand side denotes the representation of node $i$, while the second represents the update from its neighbors. Applying random walk normalization (i.e., dividing each neighbor's contribution by the degree of the current node), Eq. (5) can be written as:

$\mathbf{h}_i = \phi_0(\mathbf{x}_i) + \sum_{j \in \mathcal{N}(i)} \frac{1}{d_i}\phi_1(\mathbf{x}_j)$, (6)

or with symmetric normalization:

$\mathbf{h}_i = \phi_0(\mathbf{x}_i) + \sum_{j \in \mathcal{N}(i)} \frac{1}{\sqrt{d_i d_j}}\phi_1(\mathbf{x}_j)$, (7)

where $d_i$ represents the degree of node $i$. Normalization has better generalization capacity, which is supported not only by implicit evidence but also by a theoretical proof of performance improvement [24]. In a simplified configuration, the weights for all neighbors ($\phi_1$) are the same. Therefore, the above can be rewritten in matrix form as:

$\mathbf{H} = \phi_0\mathbf{X} + \phi_1\mathbf{D}^{-1}\mathbf{A}\mathbf{X}$, (8)

or

$\mathbf{H} = \phi_0\mathbf{X} + \phi_1\mathbf{D}^{-1/2}\mathbf{A}\mathbf{D}^{-1/2}\mathbf{X}$, (9)

where $\phi_0$ and $\phi_1$ are the weights. Eq. (8) and (9) can be generalized into the same form:

$\mathbf{H} = (\phi_0\mathbf{I} + \phi_1\bar{\mathbf{A}})\mathbf{X}$, (10)

where $\bar{\mathbf{A}}$ denotes the normalized $\mathbf{A}$, which can be implemented by random walk or symmetric normalization. Several state-of-the-art methods are selected to illustrate this schema:
(1) Graph Convolutional Network (GCN) [25] is a simplification of ChebNet [13]. GCN adds a self-loop to each node, i.e., $\tilde{\mathbf{A}} = \mathbf{A} + \mathbf{I}$, and applies a renormalization trick that changes the degree matrix from $\mathbf{D}$ to $\tilde{\mathbf{D}}$, the degree matrix of $\tilde{\mathbf{A}}$. Specifically, GCN can be written as:

$\mathbf{H} = \tilde{\mathbf{D}}^{-1/2}\tilde{\mathbf{A}}\tilde{\mathbf{D}}^{-1/2}\mathbf{X}\mathbf{W}$,

where $\mathbf{W}$ is the trainable weight and $\tilde{\mathbf{D}}^{-1/2}\tilde{\mathbf{A}}\tilde{\mathbf{D}}^{-1/2}$ is the normalized adjacency matrix with self-loops. Therefore, the equation above is equivalent to Eq. (10) when setting $\phi_0 = \phi_1 = 1$ with the renormalization trick, and the result of GCN is exactly the sum of the current node and the average of its neighbors.
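A minimal sketch of GCN's renormalization trick, assuming a toy 3-node graph and omitting the trainable weights and nonlinearity:

```python
import numpy as np

# Toy graph: node 0 connected to nodes 1 and 2.
A = np.array([[0, 1, 1],
              [1, 0, 0],
              [1, 0, 0]], dtype=float)
X = np.array([[1.0, 0.0],
              [0.0, 1.0],
              [1.0, 1.0]])

A_tilde = A + np.eye(3)                      # add self-loops
d_tilde = A_tilde.sum(axis=1)                # degrees computed AFTER adding self-loops
D_inv_sqrt = np.diag(1.0 / np.sqrt(d_tilde))
A_hat = D_inv_sqrt @ A_tilde @ D_inv_sqrt    # renormalized adjacency

H = A_hat @ X                                # one propagation step
```

The renormalized matrix stays symmetric and its largest eigenvalue is 1, which keeps repeated propagation numerically stable.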
(2) GraphSAGE [19] with the mean aggregator averages a node with its neighbors by:

$\mathbf{h}_i = \mathrm{mean}\big(\{\mathbf{x}_i\} \cup \{\mathbf{x}_j,\ j \in \mathcal{N}(i)\}\big)$, (11)

where $\mathbf{x}_j$ indicates the representation of node $j$ and $\mathcal{N}(i)$ denotes the neighbor nodes of $i$. Eq. (11) can be written in matrix form:

$\mathbf{H} = \tilde{\mathbf{D}}^{-1}(\mathbf{I} + \mathbf{A})\mathbf{X}$, (12)

which is equivalent to Eq. (10) with $\phi_0 = \phi_1 = 1$ up to normalization. Note that the key difference between GCN and GraphSAGE is the normalization: the former uses symmetric normalization, the latter random walk normalization.
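The node-wise mean aggregator and its matrix form can be checked against each other on an assumed toy graph:

```python
import numpy as np

# Toy graph: a triangle (every node adjacent to every other node).
A = np.array([[0, 1, 1],
              [1, 0, 1],
              [1, 1, 0]], dtype=float)
X = np.array([[1.0, 2.0],
              [3.0, 4.0],
              [5.0, 6.0]])
n = A.shape[0]

# Node-wise form: average a node's feature with its neighbors' features.
H_nodewise = np.zeros_like(X)
for i in range(n):
    neigh = np.nonzero(A[i])[0]
    H_nodewise[i] = (X[i] + X[neigh].sum(axis=0)) / (len(neigh) + 1)

# Matrix form: random-walk normalization of the self-loop graph.
A_tilde = A + np.eye(n)
H_matrix = np.diag(1.0 / A_tilde.sum(axis=1)) @ A_tilde @ X
```

On this complete toy graph every node averages all features, so both forms yield identical rows.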
4.2 Order of Connectivity (A2)
To collect richer local structure, several studies [2, 13, 48, 45, 17] involve higher orders of neighbors, since direct neighbors (i.e., first-order neighbors) are not always sufficient for representing a node. On the other hand, a large order usually averages all node representations, causing an over-smoothing issue and losing the focus on the local neighborhood [30]. This motivates many models to tune the aggregation scheme over different orders of neighbors; proper constraint and flexibility of orders are therefore critical for node representation. High orders of neighbors have been proved to characterize challenging signals such as Gabor-like filters [1]. Formally, this type of work can be written as:

$\mathbf{h}_i = \sum_{n=0}^{N}\sum_{j \in \mathcal{N}^{n}(i)} \phi_n(\mathbf{x}_j)$, (14)

where $\mathcal{N}^{n}(i)$ indicates the $n$th-order neighbors of node $i$. Eq. (14) can be rewritten in matrix form:

$\mathbf{H} = f(\mathbf{A})\mathbf{X}$, (15)

where $f$ is a polynomial function. Applying normalization, Eq. (15) can be rewritten as:

$\mathbf{H} = \sum_{n=0}^{N}\phi_n\bar{\mathbf{A}}^{n}\mathbf{X}$, (16)

where $\bar{\mathbf{A}} = \mathbf{D}^{-1/2}\mathbf{A}\mathbf{D}^{-1/2}$; $\mathbf{A}$ could also be normalized by random walk normalization, $\bar{\mathbf{A}} = \mathbf{D}^{-1}\mathbf{A}$. Several existing works are analyzed below, showing that they are variants of Eq. (15) or (16):
(1) ChebNet. Hammond et al. [21] first introduced the truncated Chebyshev polynomial for estimating wavelets in graph signal processing. Based on this polynomial approximation, Defferrard et al. [13] designed ChebNet, which embeds a novel neural network layer for the convolution operator. Specifically, ChebNet is written as:

$\mathbf{H} = \sum_{k=0}^{K}\theta_k T_k(\hat{\mathbf{L}})\mathbf{X}$, (17)

where $T_k$ denotes the Chebyshev polynomial, $\theta_k$ is the Chebyshev coefficient, and $\theta'_k$ below is the coefficient after expansion and reorganization. Since $\hat{\mathbf{L}} = \mathbf{I} - \bar{\mathbf{A}}$, we have:

$\mathbf{H} = \sum_{k=0}^{K}\theta_k T_k(\mathbf{I} - \bar{\mathbf{A}})\mathbf{X}$, (18)

which can be reorganized as:

$\mathbf{H} = \sum_{k=0}^{K}\theta'_k\bar{\mathbf{A}}^{k}\mathbf{X}$, (19)

which is exactly Eq. (16).
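The Chebyshev recurrence behind Eq. (17) can be sketched as follows; the toy graph, the rescaling of the Laplacian, and the coefficients are illustrative assumptions:

```python
import numpy as np

# Toy graph: a 3-node path; build its normalized Laplacian.
A = np.array([[0, 1, 0],
              [1, 0, 1],
              [0, 1, 0]], dtype=float)
d = A.sum(axis=1)
L = np.eye(3) - np.diag(1 / np.sqrt(d)) @ A @ np.diag(1 / np.sqrt(d))
L_rescaled = L - np.eye(3)          # shift spectrum from [0, 2] into [-1, 1]

X = np.array([[1.0], [0.0], [-1.0]])
theta = [0.5, 0.3, 0.2]             # illustrative Chebyshev coefficients

# Chebyshev recurrence: T_0 X = X, T_1 X = L X, T_k = 2 L T_{k-1} - T_{k-2}.
Tx = [X, L_rescaled @ X]
for k in range(2, len(theta)):
    Tx.append(2 * L_rescaled @ Tx[-1] - Tx[-2])
H = sum(t * T for t, T in zip(theta, Tx))
```

The recurrence avoids forming explicit matrix powers: each order costs one sparse matrix-vector product.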
(2) DeepWalk [41] is a random walk based model integrated with deep learning techniques. DeepWalk first draws a group of random paths from the graph and applies the skip-gram algorithm to extract node features. Assuming the number of samples is large enough, the transfer probability of a random walk on a graph can be written as:

$\mathbf{P} = \mathbf{D}^{-1}\mathbf{A}$, (20)

with random walk normalization. Let the window size of skip-gram be $t$ and the current node be the $(t{+}1)$th one; then the farthest neighbor the current node can reach is a $t$th-order one. If the training is sufficient and the samples are adequate, the node representation will converge to those of its neighbors. Therefore, the updated representation is:

$\mathbf{H} = \frac{1}{t}\sum_{k=1}^{t}(\mathbf{D}^{-1}\mathbf{A})^{k}\mathbf{X}$. (21)
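The random-walk transition matrix and its powers, which underlie the aggregation in Eq. (21), can be sketched numerically (toy graph assumed):

```python
import numpy as np

# Toy graph with four nodes of unequal degree.
A = np.array([[0, 1, 1, 0],
              [1, 0, 1, 0],
              [1, 1, 0, 1],
              [0, 0, 1, 0]], dtype=float)
P = np.diag(1.0 / A.sum(axis=1)) @ A   # random-walk normalization D^{-1} A

t = 3
P_t = np.linalg.matrix_power(P, t)     # t-step transition probabilities
```

Each power of $\mathbf{P}$ remains row-stochastic, so every row of `P_t` is a valid probability distribution over $t$-step destinations.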
(3) Diffusion convolutional neural networks (DCNN) [2] consider a degree-normalized transition matrix, i.e., $\mathbf{P} = \mathbf{D}^{-1}\mathbf{A}$:

$\mathbf{H} = f(\mathbf{W} \odot \mathbf{P}^{*}\mathbf{X})$, (22)

where $\mathbf{P}^{*}$ denotes a tensor containing the power series of $\mathbf{P}$, and the operator $\odot$ represents element-wise multiplication. It can be transformed as:

$\mathbf{H} = f\Big(\sum_{k} w_k(\mathbf{D}^{-1}\mathbf{A})^{k}\mathbf{X}\Big)$. (23)
(4) Node2Vec [17] defines a 2nd-order random walk to control the balance between BFS (breadth-first search) and DFS (depth-first search). Consider a random walk that has just traversed edge $(t, v)$ and now resides at node $v$. The unnormalized transition probability $\alpha$ to the next stop $x$ from node $v$ is defined as:

$\alpha(t, x) = 1/p$ if $d_{tx} = 0$; $\alpha(t, x) = 1$ if $d_{tx} = 1$; $\alpha(t, x) = 1/q$ if $d_{tx} = 2$, (24)

where $d_{tx}$ denotes the shortest path distance between nodes $t$ and $x$. $d_{tx} = 0$ indicates that the 2nd-order random walk returns to its source node (i.e., $x = t$), while $d_{tx} = 1$ means that the walk goes to a BFS node, and $d_{tx} = 2$ to a DFS node. The parameters $p$ and $q$ control the distribution of these three cases. Assuming the random walk is sufficiently sampled, Node2Vec can be rewritten in matrix form:

$\mathbf{H} \propto \Big(\frac{1}{p}\mathbf{I} + \bar{\mathbf{A}} + \frac{1}{q}\bar{\mathbf{A}}^{2}\Big)\mathbf{X}$, (25)

which can be transformed and reorganized as:

$\mathbf{H} = \big(\phi_0\mathbf{I} + \phi_1\bar{\mathbf{A}} + \phi_2\bar{\mathbf{A}}^{2}\big)\mathbf{X}$, with $\phi_0 \propto 1/p$ and $\phi_2 \propto 1/q$, (26)

where the transition probability matrix $\bar{\mathbf{A}} = \mathbf{D}^{-1}\mathbf{A}$ is the random walk normalized adjacency matrix.
(5) Simple Graph Convolution (SGC) [48] removes the nonlinear functions between neighboring graph convolution layers and combines the graph propagation into one single layer:

$\mathbf{H} = \tilde{\mathbf{A}}^{K}\mathbf{X}\mathbf{W}$, (27)

where $\tilde{\mathbf{A}} = \tilde{\mathbf{D}}^{-1/2}(\mathbf{A} + \mathbf{I})\tilde{\mathbf{D}}^{-1/2}$ is the renormalized adjacency matrix and $\tilde{\mathbf{D}}$ is the degree matrix with self-loops. Therefore, it can easily be rewritten as an instance of Eq. (16) in which only the $K$th-order coefficient is nonzero:

$\mathbf{H} = \phi_K\tilde{\mathbf{A}}^{K}\mathbf{X}$, with $\phi_K$ absorbing $\mathbf{W}$. (28)
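A sketch of SGC's collapsed propagation, assuming a toy graph and K = 2:

```python
import numpy as np

# Toy graph: a 3-node path.
A = np.array([[0, 1, 0],
              [1, 0, 1],
              [0, 1, 0]], dtype=float)
n, K = A.shape[0], 2

A_tilde = A + np.eye(n)
D_inv_sqrt = np.diag(1.0 / np.sqrt(A_tilde.sum(axis=1)))
S = D_inv_sqrt @ A_tilde @ D_inv_sqrt        # renormalized adjacency

X = np.array([[1.0], [0.0], [1.0]])
H_power = np.linalg.matrix_power(S, K) @ X   # S^K X precomputed in one shot

# Equivalent to K successive propagation steps without nonlinearities.
H_steps = X.copy()
for _ in range(K):
    H_steps = S @ H_steps
```

Because $\mathbf{S}^{K}\mathbf{X}$ contains no trainable parameters, SGC can precompute it once and then train a plain linear classifier on top.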
4.3 Propagation Directions (A3)
Most works merely consider label propagation from a node to its neighbors (i.e., gathering information from its neighbors) but ignore propagation in the reverse direction. Reverse propagation means that labels or attributes can be propagated back to the node itself with some probability, or that propagation can restart with a certain probability. This reverse behavior can avoid the over-smoothing issue [26]. Note that [A2] can also alleviate over-smoothing by manually adjusting the order number, while [A3] can automatically fit the proper order number. Several works explicitly or implicitly implement reverse propagation by applying a rational function to the adjacency matrix [10, 26, 31, 33, 23, 29, 6]. Since general label propagation is implemented by multiplying by the graph Laplacian, reverse propagation can be implemented by multiplying by an inverse of the graph Laplacian, as:

$\mathbf{H} = f_2(\bar{\mathbf{A}})^{-1}f_1(\bar{\mathbf{A}})\mathbf{X}$, (29)

where $f_1$ and $f_2$ are two different polynomial functions, and the bias (constant term) of $f_2$ is often set to 1.
(1) Auto-regressive label propagation (LP) [60, 57, 5] is a widely used methodology for graph-based learning. The objective of LP is twofold: one part forces the extracted embeddings to match the labels, and the other forces them to be similar to those of neighboring vertices. The labels can be treated as part of the node attributes, so we have:

$\mathbf{H} = (\mathbf{I} + \alpha\hat{\mathbf{L}})^{-1}\mathbf{X}$, (30)

which is equivalent to the form of Eq. (29), i.e., $f_1 = 1$ and $f_2(\bar{\mathbf{A}}) = \mathbf{I} + \alpha(\mathbf{I} - \bar{\mathbf{A}})$.
(2) Personalized PageRank (PPNP) [26] obtains a node's representation via a teleport (restart) probability $\alpha$, which is the ratio of keeping the original representation $\mathbf{X}$ (i.e., no propagation), while $(1-\alpha)$ is the ratio of performing normal label propagation:

$\mathbf{H} = \alpha\big(\mathbf{I} - (1-\alpha)\hat{\mathbf{A}}\big)^{-1}\mathbf{X}$, (31)

where $\hat{\mathbf{A}}$ is the random walk normalized adjacency matrix with self-loops. Eq. (31) is Eq. (29) with a rational function whose numerator is a constant.
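PPNP's closed form can be sketched and cross-checked against its power-iteration (teleport) formulation; the graph and α below are assumptions:

```python
import numpy as np

# Toy graph: node 0 connected to nodes 1 and 2.
A = np.array([[0, 1, 1],
              [1, 0, 0],
              [1, 0, 0]], dtype=float)
n, alpha = A.shape[0], 0.2

A_tilde = A + np.eye(n)
A_hat = np.diag(1.0 / A_tilde.sum(axis=1)) @ A_tilde   # random-walk norm., self-loops

X = np.eye(n)                                          # one-hot "label" per node
H = alpha * np.linalg.inv(np.eye(n) - (1 - alpha) * A_hat) @ X

# The matrix inverse equals the limit of the teleport iteration
# H_{k+1} = (1 - alpha) A_hat H_k + alpha X.
H_iter = X.copy()
for _ in range(200):
    H_iter = (1 - alpha) * A_hat @ H_iter + alpha * X
```

With row-stochastic propagation and one-hot inputs, each row of the result is itself a probability distribution (the personalized PageRank vector of that node).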
(3) ARMA filter [6] utilizes an ARMA (auto-regressive moving average) filter for approximating any desired filter response function, which can be written in the spatial domain as:

$\mathbf{H} = a\big(\mathbf{I} - b\hat{\mathbf{A}}\big)^{-1}\mathbf{X}$. (32)

Note that the ARMA filter is an unnormalized version of PPNP: when $a + b = 1$, ARMA becomes PPNP.
(4) RationalNet [10] proposes a general rational function, optimized by the Remez algorithm; its analytic form is exactly Eq. (29).

(5) CayleyNets [29] apply a rational filtering function in the complex domain.
4.4 Connection among spatial methods
The three groups of spatial methods introduced above (i.e., A1, A2, A3) are strongly connected under a generalization and specialization relationship, as shown in Fig. (1): (1) Generalization: Local Aggregation can be extended to Order of Connectivity by adding more neighbors of higher order; Order of Connectivity can be upgraded to Propagation Direction by adding reverse propagation. (2) Specialization: Local Aggregation is a special case of Order of Connectivity obtained by setting the order to 1; Order of Connectivity is a special case of Propagation Direction obtained by removing reverse propagation.
5 Spectral-based GNNs (B0)

Spectral-based GNN models are built on spectral graph theory, which applies eigendecomposition and analyzes the weight-adjusting function (i.e., filter function) on the eigenvalues of graph matrices. The weights yielded by the filter function are assigned to frequency components (eigenvectors) for reconstructing the target signal. Based on the spectral operation, we propose a new taxonomy of graph neural networks, categorizing spectral-based GNNs into three subgroups:
5.1 Frequency Aggregation (B1)
There exist numerous works that can be boiled down to adjusting the weights of frequency components in the spectral domain. The goal of the filter function is to adjust the eigenvalues (i.e., the weights of the eigenvectors) to fit the target output. Many of these filters have been proved to be low-pass [31], which means that only low-frequency components are emphasized, i.e., the first few eigenvalues are enlarged and the others are reduced. Specifically, a linear function of $\lambda$ is employed:

$\mathbf{H} = \sum_{i=1}^{k} g_{\theta}(\lambda_i)\,\mathbf{u}_i\mathbf{u}_i^{\top}\mathbf{X}$, (33)

where $\mathbf{u}_i$ is the $i$th eigenvector, $g_{\theta}$ is the frequency filter function controlled by parameters $\theta$, and $k$ selects the lowest frequency components. The goal of $g_{\theta}$ is to change the weights of the eigenvalues to fit the target output. Several state-of-the-art methods introduced in the last section are analyzed to illustrate this scheme:
(1) Graph Convolutional Network (GCN) [25] can be rewritten in the spectral domain as:

$\mathbf{H} = \mathbf{U}(\mathbf{I} - \boldsymbol{\Lambda})\mathbf{U}^{\top}\mathbf{X}\mathbf{W}$, (34)

Therefore, the frequency response function is $g(\lambda) = 1 - \lambda$, which is a low-pass filter: a smaller eigenvalue is adjusted to a larger weight, and small eigenvalues correspond to low-frequency components.
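This spatial-spectral equivalence can be verified numerically on an assumed toy graph:

```python
import numpy as np

# Toy graph: a 4-cycle.
A = np.array([[0, 1, 0, 1],
              [1, 0, 1, 0],
              [0, 1, 0, 1],
              [1, 0, 1, 0]], dtype=float)
n = A.shape[0]

A_tilde = A + np.eye(n)
D_is = np.diag(1.0 / np.sqrt(A_tilde.sum(axis=1)))
A_hat = D_is @ A_tilde @ D_is          # renormalized adjacency (spatial view)
L_hat = np.eye(n) - A_hat              # matching normalized Laplacian

lam, U = np.linalg.eigh(L_hat)
X = np.random.default_rng(0).normal(size=(n, 2))

H_spatial = A_hat @ X                              # spatial propagation
H_spectral = U @ np.diag(1 - lam) @ U.T @ X        # spectral filtering, g = 1 - lam
```

Both views produce the same output because $\hat{\mathbf{A}} = \mathbf{U}(\mathbf{I}-\boldsymbol{\Lambda})\mathbf{U}^{\top}$ exactly.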
(2) GraphSAGE [19] with the mean aggregator of Eq. (12) admits, analogously, the low-pass response $g(\lambda) = 1 - \lambda$ under random walk normalization. (35)

(3) Graph Isomorphism Network (GIN) [53] can be rewritten as:

$\mathbf{H} = \big((1 + \epsilon)\mathbf{I} + \mathbf{A}\big)\mathbf{X}$, (36)

GIN can be seen as a generalization of GCN or GraphSAGE without the normalized adjacency matrix. Writing the unnormalized adjacency as $\mathbf{A} = \mathbf{U}'\boldsymbol{\Lambda}'\mathbf{U}'^{\top}$, the frequency response function is $g(\lambda') = 1 + \epsilon + \lambda'$.
5.2 Order of Approximation (B2)
Considering higher orders of frequency, the filter function can approximate any smooth filter function, because this is equivalent to applying polynomial approximation. Therefore, introducing higher orders of frequencies boosts the representation power of the filter function in simulating spectral signals. Formally, this type of work can be written as:

$\mathbf{H} = \sum_{i} f(\lambda_i)\,\mathbf{u}_i\mathbf{u}_i^{\top}\mathbf{X} = \mathbf{U}f(\boldsymbol{\Lambda})\mathbf{U}^{\top}\mathbf{X}$, (37)

where $f(\lambda) = \sum_{k}\theta_k\lambda^{k}$ is a polynomial function.
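The polynomial correspondence between the two domains can be checked numerically; the toy graph and coefficients below are assumptions:

```python
import numpy as np

# Toy graph with four nodes of unequal degree.
A = np.array([[0, 1, 1, 0],
              [1, 0, 1, 0],
              [1, 1, 0, 1],
              [0, 0, 1, 0]], dtype=float)
n = A.shape[0]
D_is = np.diag(1.0 / np.sqrt(A.sum(axis=1)))
A_bar = D_is @ A @ D_is                 # symmetric normalization
L_hat = np.eye(n) - A_bar
lam, U = np.linalg.eigh(L_hat)

theta = [0.4, 0.3, 0.3]                 # illustrative polynomial coefficients
# Spatial side: a polynomial in the normalized adjacency.
H_spatial = sum(t * np.linalg.matrix_power(A_bar, k) for k, t in enumerate(theta))
# Spectral side: the same polynomial evaluated at 1 - lambda.
g = sum(t * (1 - lam) ** k for k, t in enumerate(theta))
H_spectral = U @ np.diag(g) @ U.T
```

Any polynomial in $\bar{\mathbf{A}}$ therefore acts as the frequency response $f(1-\lambda)$, which is the bridge between subcategories A2 and B2.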
(1) ChebNet [13] can be rewritten as:

$\mathbf{H} = \mathbf{U}\sum_{k=0}^{K}\theta_k T_k(\boldsymbol{\Lambda})\mathbf{U}^{\top}\mathbf{X}$, (38)

where $T_k$ is the Chebyshev polynomial and $\theta_k$ is the Chebyshev coefficient. Letting $\theta'_k$ be the coefficient after expansion and reorganization, we can rewrite it as:

$\mathbf{H} = \mathbf{U}\sum_{k=0}^{K}\theta'_k\boldsymbol{\Lambda}^{k}\mathbf{U}^{\top}\mathbf{X}$, (39)

where $f(\lambda) = \sum_{k=0}^{K}\theta'_k\lambda^{k}$.
(2) DeepWalk [41] updates the representation as:

$\mathbf{H} = \frac{1}{t}\sum_{k=1}^{t}\mathbf{U}(\mathbf{I} - \boldsymbol{\Lambda})^{k}\mathbf{U}^{\top}\mathbf{X}$,

where $\bar{\mathbf{A}} = \mathbf{U}(\mathbf{I} - \boldsymbol{\Lambda})\mathbf{U}^{\top}$ is substituted into Eq. (21), and all parameters are determined by the predefined step size $t$.
(3) Diffusion convolutional neural networks (DCNN) [2] can be rewritten as:

$\mathbf{H} = f\Big(\sum_{k} w_k\,\mathbf{U}(\mathbf{I} - \boldsymbol{\Lambda})^{k}\mathbf{U}^{\top}\mathbf{X}\Big)$. (40)
(4) Node2Vec [17] can be transformed and reorganized after substituting $\bar{\mathbf{A}} = \mathbf{U}(\mathbf{I} - \boldsymbol{\Lambda})\mathbf{U}^{\top}$ as:

$\mathbf{H} \propto \mathbf{U}\Big(\frac{1}{p}\mathbf{I} + (\mathbf{I} - \boldsymbol{\Lambda}) + \frac{1}{q}(\mathbf{I} - \boldsymbol{\Lambda})^{2}\Big)\mathbf{U}^{\top}\mathbf{X}$, (41)

Therefore, Node2Vec's frequency response function, $g(\lambda) = \frac{1}{p} + (1-\lambda) + \frac{1}{q}(1-\lambda)^{2}$, is a second-order function of $\lambda$ with predefined parameters, i.e., $1/p$ and $1/q$.
(5) Simple Graph Convolution (SGC) [48] can easily be rewritten after substituting $\tilde{\mathbf{A}} = \mathbf{U}(\mathbf{I} - \boldsymbol{\Lambda})\mathbf{U}^{\top}$ as:

$\mathbf{H} = \mathbf{U}(\mathbf{I} - \boldsymbol{\Lambda})^{K}\mathbf{U}^{\top}\mathbf{X}\mathbf{W}$,

where $g(\lambda) = (1-\lambda)^{K}$ is a polynomial function of $\lambda$.
5.3 Approximation Type (B3)
Although polynomial approximation is widely used and empirically effective, it only works well when applied to a smooth signal in the spectral domain; there is no guarantee that a real-world signal is smooth. Therefore, rational approximation is introduced to improve the accuracy of non-smooth signal modeling. Rational-kernel-based methods can be written as:

$\mathbf{H} = \mathbf{U}\,\frac{f_1(\boldsymbol{\Lambda})}{f_2(\boldsymbol{\Lambda})}\,\mathbf{U}^{\top}\mathbf{X}$, (42)

where $f_1/f_2$ is a rational function, and $f_1$, $f_2$ are independent polynomial functions.
(1) Auto-regressive label propagation (LP) [60, 57, 5] of Eq. (30) corresponds to the rational response $g(\lambda) = \frac{1}{1 + \alpha\lambda}$, i.e., $f_1 = 1$ and $f_2(\lambda) = 1 + \alpha\lambda$.

(2) Personalized PageRank (PPNP) [26] can be rewritten in the spectral domain, substituting $\hat{\mathbf{A}} = \mathbf{U}(\mathbf{I} - \boldsymbol{\Lambda})\mathbf{U}^{\top}$, as:

$\mathbf{H} = \alpha\,\mathbf{U}\big(\mathbf{I} - (1-\alpha)(\mathbf{I} - \boldsymbol{\Lambda})\big)^{-1}\mathbf{U}^{\top}\mathbf{X}$, (44)

where $g(\lambda) = \frac{\alpha}{\alpha + (1-\alpha)\lambda}$.
(3) ARMA filter [6] can be rewritten in the spectral domain, substituting $\hat{\mathbf{A}} = \mathbf{U}(\mathbf{I} - \boldsymbol{\Lambda})\mathbf{U}^{\top}$, as:

$\mathbf{H} = a\,\mathbf{U}\big(\mathbf{I} - b(\mathbf{I} - \boldsymbol{\Lambda})\big)^{-1}\mathbf{U}^{\top}\mathbf{X}$, (45)

i.e., $g(\lambda) = \frac{a}{(1-b) + b\lambda}$. Note that the ARMA filter is an unnormalized version of PPNP: when $a + b = 1$, the ARMA filter becomes PPNP.
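The rational response can be cross-checked against the spatial closed form on an assumed toy graph, using the symmetric renormalized adjacency so both views share one eigenbasis:

```python
import numpy as np

# Toy graph: a 3-node path.
A = np.array([[0, 1, 0],
              [1, 0, 1],
              [0, 1, 0]], dtype=float)
n, alpha = A.shape[0], 0.15

A_tilde = A + np.eye(n)
D_is = np.diag(1.0 / np.sqrt(A_tilde.sum(axis=1)))
A_hat = D_is @ A_tilde @ D_is           # symmetric renormalized adjacency
L_hat = np.eye(n) - A_hat
lam, U = np.linalg.eigh(L_hat)

# Spatial view: PPNP's closed-form inverse.
H_spatial = alpha * np.linalg.inv(np.eye(n) - (1 - alpha) * A_hat)
# Spectral view: rational frequency response g(lam) = alpha / (alpha + (1-alpha) lam).
g = alpha / (alpha + (1 - alpha) * lam)
H_spectral = U @ np.diag(g) @ U.T
```

The identity follows from $\mathbf{I} - (1-\alpha)\hat{\mathbf{A}} = \alpha\mathbf{I} + (1-\alpha)\hat{\mathbf{L}}$, whose eigenvalues are exactly $\alpha + (1-\alpha)\lambda$.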
5.4 Connection among spectral methods
There is a strong tie among the above-mentioned three groups of spectral methods from the perspective of generalization and specialization, as shown in Fig. (1): (1) Generalization: Frequency Aggregation can be extended to Order of Approximation by adding higher orders of eigenvalues, i.e., $\lambda^{1}$ up to $\lambda^{k}$; Order of Approximation can be upgraded to Approximation Type if the denominator of the filter function is not 1. (2) Specialization: Frequency Aggregation is a special case of Order of Approximation obtained by setting the highest order to 1; Order of Approximation is a special case of Approximation Type obtained by setting the denominator of the filter function to 1.
6 Conclusion
In this paper, we propose a unified framework that summarizes the state-of-the-art GNNs, providing a new perspective for understanding GNNs of different mechanisms. By analytically categorizing current GNNs into the spatial and spectral domains and further dividing them into subcategories, our analysis reveals that the subcategories are not only strongly connected by generalization and specialization relations within their domain, but also by equivalence relations across the domains. We demonstrate the generalization power of the proposed framework by reformulating numerous existing GNN models. This survey of state-of-the-art graph neural networks shows that GNNs remain a young research area. The increasing number of emerging GNN models [15, 54, 9, 47, 51, 40] makes theoretical understanding [34, 39] an urgent need. Therefore, next-generation GNNs are expected to be more interpretable and transparent to applications [18, 3, 55].
References
(2019) MixHop: higher-order graph convolution architectures via sparsified neighborhood mixing. In International Conference on Machine Learning. Cited by: §4.2.
(2016) Diffusion-convolutional neural networks. In Advances in Neural Information Processing Systems, pp. 1993–2001. Cited by: §1, §1, §4.2, §4.2, §5.2.
 (2019) Explainability techniques for graph convolutional networks. arXiv preprint arXiv:1905.13686. Cited by: §6.
 (1997) Transportation network analysis. Cited by: §1.
(2006) Label propagation and quadratic criterion. Cited by: §4.3, §5.3.
 (2019) Graph neural networks with convolutional arma filters. CoRR. Cited by: §4.3, §4.3, §4.3, §5.3.
 (2004) Extremal graph theory. Courier Corporation. Cited by: Definition 2.3.
(2014) Spectral networks and locally connected networks on graphs. In International Conference on Learning Representations (ICLR 2014). Cited by: §1, §1.
 (2019) Reinforcement learning based graphtosequence model for natural question generation. arXiv preprint arXiv:1908.04942. Cited by: §6.
 (2018) Rational neural networks for approximating jump discontinuities of graph convolution operator. ICDM. Cited by: §4.3, §4.3.
 (1997) Spectral graph theory. American Mathematical Soc.. Cited by: Definition 2.2.
 (2002) A genomic regulatory network for development. science 295 (5560), pp. 1669–1678. Cited by: §1.
 (2016) Convolutional neural networks on graphs with fast localized spectral filtering. In Advances in neural information processing systems, pp. 3844–3852. Cited by: §1, §1, §4.1, §4.2, §4.2.
(2008) Diagnosing fault patterns in telecommunication networks. Google Patents. Note: US Patent 7,428,300. Cited by: §1.
 (2019) A fair comparison of graph neural networks for graph classification. arXiv preprint arXiv:1912.09893. Cited by: §6.
 (2017) Neural message passing for quantum chemistry. In Proceedings of the 34th International Conference on Machine LearningVolume 70, pp. 1263–1272. Cited by: §1, §4.1.
 (2016) Node2vec: scalable feature learning for networks. In Proceedings of the 22nd ACM SIGKDD international conference on Knowledge discovery and data mining, pp. 855–864. Cited by: §4.2, §4.2, §5.2.
 (2020) Learning individual treatment effects from networked observational data. ACM International Conference on Web Search and Data Mining. Cited by: §6.
 (2017) Inductive representation learning on large graphs. In Advances in Neural Information Processing Systems, pp. 1024–1034. Cited by: §1, §1, §4.1, §4.1, §5.1.
 (2017) Representation learning on graphs: methods and applications. arXiv preprint arXiv:1709.05584. Cited by: §1.
 (2011) Wavelets on graphs via spectral graph theory. Applied and Computational Harmonic Analysis 30 (2), pp. 129–150. Cited by: §4.2, §5.2.
 (2012) Deep neural networks for acoustic modeling in speech recognition. IEEE Signal processing magazine 29. Cited by: §1.
(2017) Autoregressive moving average graph filtering. IEEE Transactions on Signal Processing 65 (2), pp. 274–288. Cited by: §4.3.
(2007) On the effectiveness of Laplacian normalization for graph semi-supervised learning. Journal of Machine Learning Research 8 (Jul), pp. 1489–1517. Cited by: §4.1.
(2017) Semi-supervised classification with graph convolutional networks. ICLR. Cited by: §1, §1, §4.1, §5.1.
 (2018) Predict then propagate: graph neural networks meet personalized pagerank. Cited by: §4.3, §4.3, §4.3, §5.3.
 (2009) Life in the network: the coming age of computational social science. Science (New York, NY) 323 (5915), pp. 721. Cited by: §1.
 (2015) Deep learning. nature 521 (7553), pp. 436. Cited by: §1.
(2018) CayleyNets: graph convolutional neural networks with complex rational spectral filters. IEEE Transactions on Signal Processing 67 (1), pp. 97–109. Cited by: §4.3, §4.3, §4.3.
(2018) Deeper insights into graph convolutional networks for semi-supervised learning. In Thirty-Second AAAI Conference on Artificial Intelligence. Cited by: §1, §4.2.
(2019) Label efficient semi-supervised learning via graph filtering. In The IEEE Conference on Computer Vision and Pattern Recognition (CVPR). Cited by: §1, §4.3, §5.1.
 (2015) Learning entity and relation embeddings for knowledge graph completion.. In AAAI, Vol. 15, pp. 2181–2187. Cited by: §1.
(2015) Distributed autoregressive moving average graph filters. IEEE Signal Processing Letters 22 (11), pp. 1931–1935. Cited by: §4.3.
 (2019) What graph neural networks cannot learn: depth vs width. arXiv preprint arXiv:1907.03199. Cited by: §6.
 (2015) Effective approaches to attentionbased neural machine translation. In Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing, pp. 1412–1421. Cited by: §1.
(2012) High-throughput anatomy: charting the brain's networks. Nature 490 (7419), pp. 293. Cited by: §1.
 (2017) Geometric deep learning on graphs and manifolds using mixture model cnns. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 5115–5124. Cited by: §1.
 (2002) Spread of epidemic disease on networks. Physical review E 66 (1), pp. 016128. Cited by: §1.
(2020) Graph neural networks exponentially lose expressive power for node classification. In International Conference on Learning Representations. Cited by: §6.
(2019) Developing an improved crystal graph convolutional neural network framework for accelerated materials discovery. arXiv preprint arXiv:1906.05267. Cited by: §6.
(2014) DeepWalk: online learning of social representations. In Proceedings of the 20th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pp. 701–710. Cited by: §4.1, §4.2, §5.2.
(2016) You only look once: unified, real-time object detection. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 779–788. Cited by: §1.
(2015) Faster R-CNN: towards real-time object detection with region proposal networks. In Advances in Neural Information Processing Systems, pp. 91–99. Cited by: §1.
(2013) The emerging field of signal processing on graphs: extending high-dimensional data analysis to networks and other irregular domains. IEEE Signal Processing Magazine 30 (3), pp. 83–98. Cited by: Definition 2.3.
(2015) LINE: large-scale information network embedding. In Proceedings of the 24th International Conference on World Wide Web, pp. 1067–1077. Cited by: §4.2.
 (2017) Graph attention networks. arXiv preprint arXiv:1710.10903. Cited by: §1, §1, §4.1.
 (2019) Antimoney laundering in bitcoin: experimenting with graph convolutional networks for financial forensics. arXiv preprint arXiv:1908.02591. Cited by: §6.
 (2019) Simplifying graph convolutional networks. In International Conference on Machine Learning, pp. 6861–6871. Cited by: §1, §4.2, §4.2, §5.2.
 (2016) Google’s neural machine translation system: bridging the gap between human and machine translation. arXiv preprint arXiv:1609.08144. Cited by: §1.
 (2019) A comprehensive survey on graph neural networks. arXiv preprint arXiv:1901.00596. Cited by: §1.
(2018) Crystal graph convolutional neural networks for an accurate and interpretable prediction of material properties. Phys. Rev. Lett. 120, pp. 145301. Cited by: §6.
 (2018) How powerful are graph neural networks?. arXiv preprint arXiv:1810.00826. Cited by: §4.1.
(2019) How powerful are graph neural networks? In International Conference on Learning Representations. Cited by: §1, §1, §4.1, §4.1, §5.1.
 (2017) Differentiable learning of logical rules for knowledge base reasoning. In Advances in Neural Information Processing Systems, pp. 2319–2328. Cited by: §6.
(2019) GNNExplainer: a tool for post-hoc explanation of graph neural networks. Cited by: §1, §6.
 (2018) Deep learning on graphs: a survey. arXiv preprint arXiv:1812.04202. Cited by: §1.
 (2004) Learning with local and global consistency. In Advances in neural information processing systems, pp. 321–328. Cited by: §4.3, §5.3.
 (2018) Graph neural networks: a review of methods and applications. arXiv preprint arXiv:1812.08434. Cited by: §1.
 (2012) Approximating signals supported on graphs. In Acoustics, Speech and Signal Processing (ICASSP), 2012 IEEE International Conference on, pp. 3921–3924. Cited by: Definition 2.3.
(2003) Semi-supervised learning using Gaussian fields and harmonic functions. In Proceedings of the 20th International Conference on Machine Learning (ICML-03), pp. 912–919. Cited by: §4.3, §5.3.