Bridging the Gap between Spatial and Spectral Domains: A Survey on Graph Neural Networks


Abstract

The success of deep learning has been widely recognized in many machine learning tasks over the last decades, ranging from image classification and speech recognition to natural language understanding. As an extension of deep learning, graph neural networks (GNNs) are designed to solve non-Euclidean problems on graph-structured data that can hardly be handled by general deep learning techniques. Existing GNNs are built on various mechanisms, such as random walk, PageRank, graph convolution, and heat diffusion, and are designed for different types of graphs and problems, which makes them difficult to compare directly. Previous GNN surveys focus on categorizing current models into independent groups and lack analysis of their internal connections. This paper proposes a unified framework and provides a novel perspective that can fit most existing GNNs into our framework methodologically. Specifically, we survey and categorize existing GNN models into the spatial and spectral domains, and reveal connections among the subcategories in each domain. Further analysis establishes a strong link across the spatial and spectral domains.

1 Introduction

The effectiveness of deep learning [28] has been widely recognized in various machine learning tasks [42, 43, 22, 49, 35] during the last decades, achieving remarkable success on Euclidean data. Recent decades have also witnessed a great number of emerging applications where effective information analysis boils down to the non-Euclidean geometry of data represented by a graph, such as social networks [27], transportation networks [4], the spread of epidemic disease [38], the brain's neuronal networks [36], gene data on biological regulatory networks [12], telecommunication networks [14], and knowledge graphs [32]. Such non-Euclidean problems on graph-structured data can hardly be handled by general deep learning techniques. Modeling data with graphs is challenging because graph data is irregular: each graph has a variable number of nodes, and each node has a different number of neighbors, rendering operations such as convolution not directly applicable to the graph structure. Recently, there has been increasing interest in extending deep learning to graph data. Inspired by the success of deep learning, ideas are borrowed from deep learning models to handle the intrinsic complexity of graphs. This rising trend attracts increasing interest in the machine learning community, and a large number of GNN models have been developed based on various theories [8, 25, 13, 19, 2, 46].

Although GNNs have dominated graph representation learning in recent years, there is still limited understanding of their representational power and physical meaning. This lack of understanding significantly hinders the comparison and improvement of state-of-the-art methods. It also makes it challenging to extend GNNs to many domains such as business intelligence or drug development, since black-box models may carry uncontrollable risks. Therefore, there is a pressing need to demystify GNNs, which motivates researchers to explore generalized frameworks for GNNs [53, 16, 55]. However, these works can only explain a few GNNs, and an interpretation for the majority of GNNs is still missing.

There exist many different mechanisms among current GNNs, such as random walk, PageRank, attention models, low-pass filters, message passing, and so on. These methods can be classified into several coarse-grained groups [37, 20, 56, 58, 50], such as the spectral [8, 25, 13] and spatial domains [19, 2, 46]. However, current taxonomies fail to provide an understanding of the connections among different GNN models. Elucidating the underlying mechanisms of GNNs and understanding the connections among all types of GNNs remain at the forefront of GNN research [53, 48, 31, 30]. This task is not trivial since the mechanisms behind existing GNNs are not inherently consistent, so their internal connection remains unclear. This gap makes it difficult to understand GNNs and to compare emerging methods. Previous surveys of GNNs [37, 20, 56, 58, 50] focus on categorizing current models into independent groups and expounding each group separately, without analyzing their relationships.

The objective of this paper is to provide a unified framework that generalizes GNNs, bridging the gap between existing works in the spatial and spectral domains, which are currently deemed independent. The main focus of this work is the connection among GNNs from a theoretical perspective, going beyond existing taxonomies and designing a new scheme for GNNs. Our research is unique in how it links works from various categories of GNNs. Firstly, we briefly introduce the proposed framework, including the spatial and spectral domains, and present their internal connection. Then detailed subcategories of the spatial and spectral domains are provided, and several popular GNN examples are used to illustrate our taxonomies in each subcategory. The contributions of this paper are summarized as follows:

  1. Proposing a taxonomy for summarizing GNN approaches in the spatial domain. This paper unifies GNN methods in the spatial domain by formulating the spatial operation on graph connectivity. GNNs can then be treated as the same function of the graph matrix with different configurations.

  2. Providing a taxonomy for summarizing GNN methods in the spectral domain. This survey categorizes GNN models by frequency response function in the spectral domain, and applies approximation theory to illuminate their generalization and specialization relationship.

  3. Incorporating spatial and spectral models into a unified framework. By comparing the analytical forms, the proposed framework links the frequency response function of the spectral domain and node aggregation function in the spatial domain.

The rest of this survey is organized as follows. In Section 2, we introduce the problem setup and necessary preliminaries. We then present our proposed taxonomy in Section 3; Sections 4 and 5 elaborate its spatial and spectral categories, respectively. We conclude the survey in Section 6.

2 Problem Setup and Preliminary

This section outlines the background of graph neural networks and problem setup.

Definition 2.1 (Graph)

A graph is defined as $\mathcal{G} = (\mathcal{V}, \mathcal{E})$, where $\mathcal{V}$ is a set of $n$ nodes and $\mathcal{E}$ represents edges. An entry $v_i \in \mathcal{V}$ denotes a node, and $e_{ij} = (v_i, v_j) \in \mathcal{E}$ indicates an edge between nodes $v_i$ and $v_j$. The adjacency matrix $\mathbf{A} \in \mathbb{R}^{n \times n}$ is defined by $\mathbf{A}_{ij} = 1$ iff there is a link between nodes $v_i$ and $v_j$, and $\mathbf{A}_{ij} = 0$ otherwise. The graph signal or node attribute matrix $\mathbf{X} \in \mathbb{R}^{n \times d}$ is a feature matrix whose $i$-th row $\mathbf{x}_i$ represents the feature vector of node $v_i$.

Figure 1: Illustration of major graph neural operations and their relationships. Spatial and spectral methods are each divided into three groups. Groups A1, A2, and A3 are strongly correlated by generalization and specialization, as are groups B1, B2, and B3. The equivalence relationships among them are marked in the same color.

Learning node-level embeddings can be summarized as

$$\mathbf{H} = f(\mathbf{A}, \mathbf{X}; \mathbf{\Theta}), \qquad (1)$$

where $\mathbf{\Theta}$ indicates the parameters of the model. We aim to find a function $f$ which can integrate the graph structure $\mathbf{A}$ and the original node attributes $\mathbf{X}$, outputting a new node embedding $\mathbf{H}$. Since this review focuses on two major categories, i.e., spectral and spatial methods, two related definitions are listed below for understanding this paper.

Definition 2.2 (Spatial Method)

Treating the graph Laplacian [11] (or, equivalently, the adjacency matrix) as the spatial connectivity among nodes, a spatial method integrates the connectivity $\mathbf{A}$ and the signal $\mathbf{X}$:

$$\mathbf{H} = f_{spatial}(\mathbf{A})\,\mathbf{X}. \qquad (2)$$

Therefore, spatial methods focus on finding a function $f_{spatial}(\mathbf{A})$.

Definition 2.3 (Spectral Method)

The graph Laplacian is defined as $\mathbf{L} = \mathbf{D} - \mathbf{A}$, where $\mathbf{D}$ is the degree matrix with $\mathbf{D}_{ii} = \sum_j \mathbf{A}_{ij}$. Due to its better generalization ability [7], the normalized Laplacian is defined as $\tilde{\mathbf{L}} = \mathbf{D}^{-\frac{1}{2}} \mathbf{L} \mathbf{D}^{-\frac{1}{2}} = \mathbf{I} - \mathbf{D}^{-\frac{1}{2}} \mathbf{A} \mathbf{D}^{-\frac{1}{2}}$. The Laplacian is diagonalized by the Fourier basis $\mathbf{U}$ (i.e., the graph Fourier transform) [44, 59]: $\tilde{\mathbf{L}} = \mathbf{U} \mathbf{\Lambda} \mathbf{U}^{\top}$, where $\mathbf{\Lambda}$ is the diagonal matrix of the corresponding eigenvalues (i.e., $\mathbf{\Lambda}_{ii} = \lambda_i$), and the columns of $\mathbf{U}$ are the eigenvectors. The graph Fourier transform of a signal $\mathbf{x}$ is defined as $\hat{\mathbf{x}} = \mathbf{U}^{\top} \mathbf{x}$ and its inverse as $\mathbf{x} = \mathbf{U} \hat{\mathbf{x}}$. A graph convolution operation is defined in the Fourier domain such that

$$f_1 * f_2 = \mathbf{U}\left[(\mathbf{U}^{\top} f_1) \odot (\mathbf{U}^{\top} f_2)\right], \qquad (3)$$

where $\odot$ is the element-wise product and $f_1$, $f_2$ are two signals defined on the node domain. It follows that a node signal $f_2$ is filtered by a spectral signal $\hat{f}_1 = \mathbf{U}^{\top} f_1 = g_{\theta}$ as:

$$f_{out} = \mathbf{U}\, g_{\theta}(\mathbf{\Lambda})\, \mathbf{U}^{\top} f_2, \qquad (4)$$

where $g_{\theta}$ is known as the frequency response function of the filter. Therefore, spectral methods are defined as learning $g_{\theta}(\mathbf{\Lambda})$.
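To make the spectral pipeline above concrete, the following is a minimal NumPy sketch (not taken from the survey) of Definition 2.3: it builds the normalized Laplacian of a small assumed toy graph, computes the graph Fourier basis, and filters a signal with a hand-picked frequency response $g(\lambda)$.

```python
import numpy as np

A = np.array([[0, 1, 1, 0],
              [1, 0, 1, 0],
              [1, 1, 0, 1],
              [0, 0, 1, 0]], dtype=float)      # adjacency matrix (toy graph, assumed)
X = np.random.randn(4, 2)                      # node signal with 2 features (assumed)

d = A.sum(axis=1)
L = np.diag(d) - A                             # unnormalized Laplacian L = D - A
D_inv_sqrt = np.diag(1.0 / np.sqrt(d))
L_norm = D_inv_sqrt @ L @ D_inv_sqrt           # normalized Laplacian

lam, U = np.linalg.eigh(L_norm)                # eigenvalues Lambda and Fourier basis U
X_hat = U.T @ X                                # graph Fourier transform of the signal

g = 1.0 / (1.0 + lam)                          # example low-pass frequency response g(lambda), assumed
X_filtered = U @ (g[:, None] * X_hat)          # filter in the spectral domain, then invert
```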

3 The proposed taxonomy

As shown in Fig. (1), the proposed framework categorizes GNNs into the spatial (A0) and spectral (B0) groups, each of which is further divided into three subcategories. By transforming the analytical forms of these subcategories, we find three equivalence relations, listed below:

(A1) Local Aggregation $\Leftrightarrow$ (B1) Frequency Aggregation: through graph and matrix theory, Local Aggregation adjusts weights on a set of neighboring nodes, which corresponds to adjusting weights on frequency components in Frequency Aggregation.

(A2) Connectivity Order $\Leftrightarrow$ (B2) Approximation Order: accumulating different orders of neighbors in Connectivity Order can be rewritten as a sum over different orders of frequency components, which is exactly the analytical form of Approximation Order.

(A3) Propagation Type $\Leftrightarrow$ (B3) Approximation Type: Propagation Type defines a label propagation with or without reverse propagation, while Approximation Type adjusts the filter function with or without a trivial denominator (i.e., 1). In this way, they share the same formula after a simple transformation.

Sections 4 and 5 discuss the details of the proposed taxonomy, and illustrate each subcategory with several GNN examples.

4 Spatial-based GNNs (A0)

Several important aspects are often discussed in the existing literature of spatial methods, such as self-loop, normalization, high-order neighbors, aggregation, and combination among nodes. Based on these operations, we propose a new taxonomy of graph neural networks, categorizing spatial-based GNNs into three groups:

4.1 Local Aggregation (A1)

A number of works [41, 53, 52, 16, 19, 46] can be treated as learning an aggregation scheme among first-order neighbors (i.e., direct neighbors). This aspect focuses on adjusting the weights of a node and its neighbors to reveal the pattern regarding the supervision signal. Formally, the updated node embedding $\mathbf{h}_i$ can be written as:

$$\mathbf{h}_i = w_0\,\mathbf{x}_i + \sum_{j \in \mathcal{N}(i)} w_1\,\mathbf{x}_j, \qquad (5)$$

where $j \in \mathcal{N}(i)$ denotes a neighbor of node $i$, $\mathbf{x}_j$ is its representation, and $w_0$, $w_1$ indicate the weight functions. The first term on the right-hand side denotes the representation of node $i$, while the second represents the update from its neighbors. Applying random walk normalization (i.e., dividing neighbors by the degree of the current node), Eq. (5) can be written as:

$$\mathbf{h}_i = w_0\,\mathbf{x}_i + \sum_{j \in \mathcal{N}(i)} \frac{w_1}{d_i}\,\mathbf{x}_j, \qquad (6)$$

or, with symmetric normalization:

$$\mathbf{h}_i = w_0\,\mathbf{x}_i + \sum_{j \in \mathcal{N}(i)} \frac{w_1}{\sqrt{d_i d_j}}\,\mathbf{x}_j, \qquad (7)$$

where $d_i$ represents the degree of node $i$. Normalization has better generalization capacity, supported not only by empirical evidence but also by a theoretical proof of performance improvement [24]. In a simplified configuration, the weights for all neighbors ($w_1$) are the same. Therefore, Eqs. (6) and (7) can be rewritten in matrix form as:

$$\mathbf{H} = w_0\,\mathbf{X} + w_1\,\mathbf{D}^{-1}\mathbf{A}\,\mathbf{X}, \qquad (8)$$

or

$$\mathbf{H} = w_0\,\mathbf{X} + w_1\,\mathbf{D}^{-\frac{1}{2}}\mathbf{A}\mathbf{D}^{-\frac{1}{2}}\,\mathbf{X}, \qquad (9)$$

where $w_0$ and $w_1$ are the weights. Eqs. (8) and (9) can be generalized into the same form:

$$\mathbf{H} = \left(w_0\,\mathbf{I} + w_1\,\bar{\mathbf{A}}\right)\mathbf{X}, \qquad (10)$$

where $\bar{\mathbf{A}}$ denotes the normalized $\mathbf{A}$, which can be implemented by random walk ($\mathbf{D}^{-1}\mathbf{A}$) or symmetric ($\mathbf{D}^{-\frac{1}{2}}\mathbf{A}\mathbf{D}^{-\frac{1}{2}}$) normalization. Several state-of-the-art methods are selected below to illustrate this schema.
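Before turning to the individual models, here is a minimal NumPy sketch (not from the survey) of the unified form in Eq. (10); the normalization mode, the weights $w_0$, $w_1$, and the helper names are assumptions for illustration.

```python
import numpy as np

def normalize_adj(A, mode="sym"):
    """Normalize an adjacency matrix by random-walk or symmetric normalization."""
    d = A.sum(axis=1)
    if mode == "rw":                        # random-walk normalization D^{-1} A
        return np.diag(1.0 / d) @ A
    D_inv_sqrt = np.diag(1.0 / np.sqrt(d))  # symmetric normalization D^{-1/2} A D^{-1/2}
    return D_inv_sqrt @ A @ D_inv_sqrt

def local_aggregate(A, X, w0=1.0, w1=1.0, mode="sym"):
    """Eq. (10): H = (w0 * I + w1 * A_norm) X."""
    A_norm = normalize_adj(A, mode)
    return w0 * X + w1 * (A_norm @ X)
```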

(1) Graph Convolutional Network (GCN) [25] is a simplification of ChebNet [13]. GCN adds a self-loop to each node, $\mathbf{A} + \mathbf{I}$, and applies a renormalization trick which changes the degree matrix from $\mathbf{D}$ to $\hat{\mathbf{D}}$ with $\hat{\mathbf{D}}_{ii} = \sum_j (\mathbf{A} + \mathbf{I})_{ij}$. Specifically, GCN can be written as:

$$\mathbf{H} = \hat{\mathbf{A}}\,\mathbf{X} = \hat{\mathbf{D}}^{-\frac{1}{2}}(\mathbf{A} + \mathbf{I})\,\hat{\mathbf{D}}^{-\frac{1}{2}}\,\mathbf{X},$$

where $\hat{\mathbf{A}} = \hat{\mathbf{D}}^{-\frac{1}{2}}(\mathbf{A} + \mathbf{I})\,\hat{\mathbf{D}}^{-\frac{1}{2}}$ is the normalized adjacency matrix with self-loops. Therefore, this update is equivalent to Eq. (10) with $w_0 = w_1 = 1$ under the renormalization trick, and the result of GCN is exactly the sum of the current node and the average of its neighbors.
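The renormalization trick can be written as a short sketch (assumed toy implementation; the learnable weight matrix and nonlinearity are omitted):

```python
import numpy as np

def gcn_propagate(A, X):
    A_tilde = A + np.eye(A.shape[0])                 # add self-loops: A + I
    d_tilde = A_tilde.sum(axis=1)                    # renormalized degrees
    D_inv_sqrt = np.diag(1.0 / np.sqrt(d_tilde))
    A_hat = D_inv_sqrt @ A_tilde @ D_inv_sqrt        # D^{-1/2} (A + I) D^{-1/2}
    return A_hat @ X                                 # node plus the average of its neighbors
```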

(2) GraphSAGE [19] with the mean aggregator averages a node with its neighbors by:

$$\mathbf{h}_i = \mathrm{mean}\left(\{\mathbf{x}_i\} \cup \{\mathbf{x}_j : j \in \mathcal{N}(i)\}\right), \qquad (11)$$

where $\mathbf{x}_i$ indicates the representation of node $i$ and $\mathcal{N}(i)$ denotes its neighbor nodes. Eq. (11) can be written in matrix form:

$$\mathbf{H} = \hat{\mathbf{D}}^{-1}(\mathbf{A} + \mathbf{I})\,\mathbf{X}, \qquad (12)$$

which is equivalent to Eq. (10) with $w_0 = w_1 = 1$ and random walk normalization of the self-looped adjacency matrix. Note that the key difference between GCN and GraphSAGE is the normalization: the former uses symmetric normalization and the latter uses random walk normalization.

(3) Graph Isomorphism Network (GIN) [53] updates node representations as:

$$\mathbf{h}_i = (1 + \epsilon)\,\mathbf{x}_i + \sum_{j \in \mathcal{N}(i)} \mathbf{x}_j, \qquad (13)$$

which is equivalent to Eq. (10) with $w_0 = 1 + \epsilon$ and $w_1 = 1$. Note that GIN does not perform normalization.
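As a compact summary, the three examples above are simply different configurations of the shared form in Eq. (10); the sketch below records them as data (the GIN value of $\epsilon$ is an assumption):

```python
# Each model corresponds to a choice of (w0, w1, normalization) in Eq. (10).
GNN_CONFIGS = {
    "GCN":       {"w0": 1.0,       "w1": 1.0, "norm": "symmetric, with self-loops"},
    "GraphSAGE": {"w0": 1.0,       "w1": 1.0, "norm": "random walk, with self-loops"},
    "GIN":       {"w0": 1.0 + 0.1, "w1": 1.0, "norm": "none"},   # epsilon = 0.1 assumed
}
```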

4.2 Order of Connectivity (A2)

To collect richer local structure, several studies [2, 13, 48, 45, 17] involve higher orders of neighbors, since direct neighbors (i.e., first-order neighbors) are not always sufficient for representing a node. On the other hand, a large order tends to average all node representations, causing an over-smoothing issue and losing the focus on the local neighborhood [30]. This motivates many models to tune the aggregation scheme over different orders of neighbors. Therefore, proper constraint and flexibility of the orders are critical for node representation. High orders of neighbors have been shown to characterize challenging signals such as Gabor-like filters [1]. Formally, this type of work can be written as:

$$\mathbf{h}_i = \sum_{n=0}^{N} \sum_{j \in \mathcal{N}^{n}(i)} w_n\,\mathbf{x}_j, \qquad (14)$$

where $\mathcal{N}^{n}(i)$ indicates the $n$-th order neighbors of node $i$. Eq. (14) can be rewritten in matrix form:

$$\mathbf{H} = \sum_{n=0}^{N} w_n\,\mathbf{A}^{n}\,\mathbf{X} = f_{poly}(\mathbf{A})\,\mathbf{X}, \qquad (15)$$

where $f_{poly}$ is a polynomial function. Applying normalization, Eq. (15) can be rewritten as:

$$\mathbf{H} = f_{poly}(\bar{\mathbf{A}})\,\mathbf{X} = \sum_{n=0}^{N} w_n\,\bar{\mathbf{A}}^{n}\,\mathbf{X}, \qquad (16)$$

where $\bar{\mathbf{A}} = \mathbf{D}^{-\frac{1}{2}}\mathbf{A}\mathbf{D}^{-\frac{1}{2}}$, and $\mathbf{A}$ could also be normalized by random walk normalization, $\bar{\mathbf{A}} = \mathbf{D}^{-1}\mathbf{A}$. Several existing works, analyzed below, are variants of Eq. (15) or (16).
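A minimal sketch (assumed, not from the survey) of the polynomial form in Eq. (16), accumulating powers of a pre-normalized adjacency matrix with a list of weights:

```python
import numpy as np

def poly_aggregate(A_norm, X, w):
    """Eq. (16): H = sum_n w[n] * (A_norm)^n X, with A_norm already normalized."""
    H = np.zeros_like(X)
    P = np.eye(A_norm.shape[0])          # (A_norm)^0
    for w_n in w:                        # accumulate w_0 I, w_1 A_norm, w_2 A_norm^2, ...
        H += w_n * (P @ X)
        P = P @ A_norm
    return H
```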

(1) ChebNet: Hammond et al. [21] first introduced the truncated Chebyshev polynomial for estimating wavelets in graph signal processing. Based on this polynomial approximation, Defferrard et al. [13] designed ChebNet, which embeds a novel neural network layer for the convolution operator. Specifically, ChebNet is written as:

$$\mathbf{H} = \sum_{k=0}^{K} \theta_k\,T_k(\tilde{\mathbf{L}})\,\mathbf{X} = \sum_{k=0}^{K} \theta'_k\,\tilde{\mathbf{L}}^{k}\,\mathbf{X}, \qquad (17)$$

where $T_k$ denotes the Chebyshev polynomial, $\theta_k$ is the Chebyshev coefficient, and $\theta'_k$ is the coefficient after expansion and reorganization. Since $\tilde{\mathbf{L}} = \mathbf{I} - \bar{\mathbf{A}}$, we have:

$$\mathbf{H} = \sum_{k=0}^{K} \theta'_k\,(\mathbf{I} - \bar{\mathbf{A}})^{k}\,\mathbf{X}, \qquad (18)$$

which can be reorganized as:

$$\mathbf{H} = \sum_{k=0}^{K} \theta''_k\,\bar{\mathbf{A}}^{k}\,\mathbf{X}, \qquad (19)$$

which is exactly Eq. (16).
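In practice, ChebNet evaluates $T_k(\tilde{\mathbf{L}})\mathbf{X}$ with the Chebyshev recurrence rather than explicit matrix powers; the sketch below (assumed coefficients and a crude rescaling of the Laplacian) illustrates that computation:

```python
import numpy as np

def chebnet_filter(L_norm, X, theta):
    """Apply sum_k theta[k] * T_k(L_hat) X using the recurrence T_k = 2 L_hat T_{k-1} - T_{k-2}."""
    L_hat = L_norm - np.eye(L_norm.shape[0])   # crude rescaling toward [-1, 1], assuming lambda_max ~ 2
    T_prev, T_curr = X, L_hat @ X              # T_0 X and T_1 X
    H = theta[0] * T_prev
    if len(theta) > 1:
        H = H + theta[1] * T_curr
    for k in range(2, len(theta)):
        T_next = 2.0 * (L_hat @ T_curr) - T_prev   # Chebyshev recurrence applied to the signal
        H = H + theta[k] * T_next
        T_prev, T_curr = T_curr, T_next
    return H
```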

(2) DeepWalk [41] is a random-walk-based model integrated with deep learning techniques. DeepWalk first draws a group of random paths from the graph and applies a skip-gram algorithm to extract node features. Assuming the number of samples is large enough, the transfer probability of a random walk on the graph can be written as:

$$\mathbf{P} = \mathbf{D}^{-1}\mathbf{A}, \qquad (20)$$

i.e., with random walk normalization. Let the window size of skip-gram be $t$ and the current node be the $(t+1)$-th one; then the farthest neighbor the current node can reach is a $t$-th order one. If the training is sufficient and the samples are adequate, the node representation converges to that of its neighbors. Therefore, the updated representation is:

$$\mathbf{H} = \frac{1}{t}\sum_{k=1}^{t} \mathbf{P}^{k}\,\mathbf{X}, \qquad (21)$$

which is a polynomial of the normalized adjacency matrix, as in Eq. (16).

(3) Diffusion convolutional neural networks (DCNN) [2] consider using a degree-normalized transition matrix, i.e., the renormalized adjacency matrix $\mathbf{P} = \mathbf{D}^{-1}\mathbf{A}$:

$$\mathbf{H} = f\left(\mathbf{W}^{c} \odot \mathbf{P}^{*}\mathbf{X}\right), \qquad (22)$$

where $\mathbf{P}^{*}$ denotes a tensor containing the power series of $\mathbf{P}$, i.e., $\{\mathbf{P}, \mathbf{P}^{2}, \ldots, \mathbf{P}^{K}\}$, and the operator $\odot$ represents element-wise multiplication. It can be transformed as:

$$\mathbf{H} = \sum_{k=1}^{K} w_k\,\mathbf{P}^{k}\,\mathbf{X}, \qquad (23)$$

which again matches Eq. (16).

(4) Node2Vec [17] defines a 2nd-order random walk to control the balance between BFS (breadth-first search) and DFS (depth-first search). Consider a random walk that has just traversed edge $(t, v)$ and now resides at node $v$. The unnormalized transition probability to the next stop $x$ from node $v$ is defined as:

$$\pi_{vx} = \begin{cases} \frac{1}{p}, & \text{if } d_{tx} = 0 \\ 1, & \text{if } d_{tx} = 1 \\ \frac{1}{q}, & \text{if } d_{tx} = 2 \end{cases} \qquad (24)$$

where $d_{tx}$ denotes the shortest path distance between nodes $t$ and $x$. $d_{tx} = 0$ indicates that the 2nd-order random walk returns to its source node (i.e., $x = t$), $d_{tx} = 1$ means that the walk goes to a BFS node, and $d_{tx} = 2$ to a DFS node. The parameters $p$ and $q$ control the distribution among these three cases. Assuming the random walk is sufficiently sampled, Node2Vec can be rewritten in matrix form:

$$\mathbf{H} \propto \left(\frac{1}{p}\,\mathbf{I} + \mathbf{P} + \frac{1}{q}\,\mathbf{P}^{2}\right)\mathbf{X}, \qquad (25)$$

which can be transformed and reorganized as:

$$\mathbf{H} = \left(w_0\,\mathbf{I} + w_1\,\mathbf{P} + w_2\,\mathbf{P}^{2}\right)\mathbf{X}, \qquad (26)$$

where the transition probability matrix $\mathbf{P} = \mathbf{D}^{-1}\mathbf{A}$ is the random walk normalized adjacency matrix.
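A minimal sketch (assumed helper, not the reference implementation) of the biased transition rule in Eq. (24), computing the normalized probabilities for the next step given the previous node $t$ and current node $v$:

```python
import numpy as np

def node2vec_probs(A, t, v, p, q):
    """Return the neighbors of v and their 2nd-order transition probabilities."""
    neighbors = np.flatnonzero(A[v])
    weights = []
    for x in neighbors:
        if x == t:
            weights.append(1.0 / p)          # d(t, x) = 0: return to the source node
        elif A[t, x] > 0:
            weights.append(1.0)              # d(t, x) = 1: BFS-like step
        else:
            weights.append(1.0 / q)          # d(t, x) = 2: DFS-like step
    weights = np.asarray(weights)
    return neighbors, weights / weights.sum()  # normalized transition probabilities
```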

(5) Simple Graph Convolution (SGC) [48] removes the non-linear functions between neighboring graph convolution layers and combines the graph propagation into a single layer:

$$\mathbf{H} = \hat{\mathbf{A}}^{K}\,\mathbf{X}, \qquad (27)$$

where $\hat{\mathbf{A}}$ is the renormalized adjacency matrix, i.e., $\hat{\mathbf{A}} = \hat{\mathbf{D}}^{-\frac{1}{2}}(\mathbf{A} + \mathbf{I})\,\hat{\mathbf{D}}^{-\frac{1}{2}}$, where $\hat{\mathbf{D}}$ is the degree matrix with self-loops. Therefore, it can be easily rewritten as a special case of Eq. (16):

$$\mathbf{H} = \sum_{n=0}^{K} w_n\,\hat{\mathbf{A}}^{n}\,\mathbf{X}, \quad \text{with } w_K = 1 \text{ and } w_n = 0 \text{ for } n < K. \qquad (28)$$
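A minimal sketch (assumed) of SGC's collapsed propagation in Eq. (27), applying the renormalized adjacency matrix $K$ times without intermediate nonlinearities:

```python
import numpy as np

def sgc_propagate(A, X, K=2):
    """Eq. (27): H = A_hat^K X, with A_hat the renormalized adjacency matrix."""
    A_tilde = A + np.eye(A.shape[0])
    D_inv_sqrt = np.diag(1.0 / np.sqrt(A_tilde.sum(axis=1)))
    A_hat = D_inv_sqrt @ A_tilde @ D_inv_sqrt
    H = X
    for _ in range(K):                 # repeated linear propagation, no nonlinearity in between
        H = A_hat @ H
    return H
```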

4.3 Propagation Directions (A3)

Most works merely consider label propagation from a node to its neighbors (i.e., gathering information from its neighbors) but ignore propagation in the reverse direction. Reverse propagation means that labels or attributes can be propagated back to the node itself with some probability, or that propagation restarts with a certain probability. This reverse behavior can avoid the over-smoothing issue [26]. Note that [A2] can also alleviate over-smoothing by manually adjusting the order number, while [A3] can automatically fit the proper order. Several works explicitly or implicitly implement reverse propagation by applying a rational function to the adjacency matrix [10, 26, 31, 33, 23, 29, 6]. Since general label propagation is implemented by multiplying by the graph matrix, reverse propagation can be implemented by multiplying by its inverse:

$$\mathbf{H} = f_2(\bar{\mathbf{A}})^{-1}\,f_1(\bar{\mathbf{A}})\,\mathbf{X}, \qquad (29)$$

where $f_1$ and $f_2$ are two different polynomial functions, and the bias (constant term) of $f_2$ is often set to 1.
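A minimal sketch (assumed helpers) of the rational form in Eq. (29): the polynomials $f_1$ and $f_2$ are given as coefficient lists (constant term first), and the explicit inverse is replaced by a linear solve:

```python
import numpy as np

def polyval_matrix(coeffs, M):
    """Evaluate c0*I + c1*M + c2*M^2 + ... for a square matrix M."""
    out = np.zeros_like(M)
    P = np.eye(M.shape[0])
    for c in coeffs:
        out += c * P
        P = P @ M
    return out

def rational_propagate(A_norm, X, f1_coeffs, f2_coeffs):
    """Eq. (29): H = f2(A_norm)^{-1} f1(A_norm) X."""
    numer = polyval_matrix(f1_coeffs, A_norm)
    denom = polyval_matrix(f2_coeffs, A_norm)     # constant term of f2 is usually 1
    return np.linalg.solve(denom, numer @ X)      # solve instead of inverting explicitly
```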

(1) Auto-regressive label propagation (LP) [60, 57, 5] is a widely used methodology for graph-based learning. The objective of LP is two-fold: one goal is to extract embeddings that match the labels, the other is to make each node similar to its neighboring vertices. The label can be treated as part of the node attributes, so we have:

$$\mathbf{H} = \left(\mathbf{I} + \alpha\,\tilde{\mathbf{L}}\right)^{-1}\mathbf{X}, \qquad (30)$$

which is equivalent to the form of Eq. (29), i.e., $f_1 = 1$ and $f_2(\bar{\mathbf{A}}) = (1 + \alpha)\,\mathbf{I} - \alpha\,\bar{\mathbf{A}}$.

(2) Personalized PageRank (PPNP) [26] obtains a node's representation via a teleport (restart) probability $\alpha$, which is the ratio of keeping the original representation $\mathbf{X}$, i.e., no propagation; $(1 - \alpha)$ is the ratio of performing the normal label propagation:

$$\mathbf{H} = \alpha\left(\mathbf{I} - (1 - \alpha)\,\hat{\mathbf{A}}\right)^{-1}\mathbf{X}, \qquad (31)$$

where $\hat{\mathbf{A}}$ is the normalized adjacency matrix with self-loops. Eq. (31) is an instance of Eq. (29) with a rational function whose numerator is a constant.

(3) The ARMA filter [6] utilizes an auto-regressive moving average filter to approximate any desired filter response function, which can be written in the spatial domain as:

$$\mathbf{H} = a\left(\mathbf{I} - b\,\hat{\mathbf{A}}\right)^{-1}\mathbf{X}. \qquad (32)$$

Note that the ARMA filter is an unnormalized version of PPNP: when $a + b = 1$, ARMA becomes PPNP.

(4) RationalNet [10] proposes a general rational function optimized by the Remez algorithm; its analytical form is exactly Eq. (29).

(5) CayleyNets [29] applies a rational filter function in the complex domain.

Remark: The computational cost of [A3] is expensive since it involves matrix inversion. A typical solution is to apply iterative algorithms [6, 26, 29].
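As an illustration of the iterative workaround, the sketch below (assumed, in the spirit of the propagation scheme of [26]) approximates the PPNP solution in Eq. (31) by a fixed-point iteration instead of an explicit inverse:

```python
import numpy as np

def ppnp_iterative(A_hat, X, alpha=0.1, num_iters=10):
    """Approximate alpha * (I - (1 - alpha) A_hat)^{-1} X by fixed-point iteration."""
    H = X
    for _ in range(num_iters):
        H = (1.0 - alpha) * (A_hat @ H) + alpha * X   # teleport back to X with probability alpha
    return H
```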

4.4 Connection among spatial methods

The three groups of spatial methods introduced above (i.e., A1, A2, A3) are strongly connected through generalization and specialization relationships, as shown in Fig. (1): (1) Generalization: Local Aggregation can be extended to Order of Connectivity by adding neighbors of higher order; Order of Connectivity can be upgraded to Propagation Direction by adding reverse propagation. (2) Specialization: Local Aggregation is a special case of Order of Connectivity obtained by setting the order to 1; Order of Connectivity is a special case of Propagation Direction obtained by removing reverse propagation.

5 Spectral-based GNNs (B0)

Spectral-based GNN models are built on spectral graph theory which applies eigen-decomposition and analyzes the weight-adjusting function (i.e., filter function) on eigenvalues of graph matrices. The weights yielded by filter function are assigned to frequency components (eigenvectors) for reconstructing the target signal. Based on spectral operation, we propose a new taxonomy of graph neural networks, categorizing spectral-based GNNs into three subgroups:

5.1 Frequency Aggregation (B1)

A large number of works can be boiled down to adjusting the weights of frequency components in the spectral domain. The goal of the filter function is to adjust the eigenvalues (i.e., the weights of the eigenvectors) to fit the target output. Many of these filters have been shown to be low-pass [31], which means that only low-frequency components are emphasized, i.e., the first few eigenvalues are enlarged and the others are reduced. Specifically, a linear function of $\lambda$ is employed:

$$\mathbf{H} = \sum_{i=1}^{k} g_{\theta}(\lambda_i)\,\mathbf{u}_i\,\mathbf{u}_i^{\top}\,\mathbf{X}, \qquad (33)$$

where $\mathbf{u}_i$ is the $i$-th eigenvector, $g_{\theta}(\cdot)$ is the frequency filter function controlled by parameters $\theta$ (here linear in $\lambda$), and $k$ selects the lowest frequency components. The goal of $g_{\theta}$ is to change the weights of the eigenvalues to fit the target output. Several state-of-the-art methods introduced in the last section are analyzed below to illustrate this scheme.
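A minimal sketch (assumed) of frequency aggregation as in Eq. (33): keep only the $k$ lowest-frequency components and reweight them with a chosen response $g(\lambda)$:

```python
import numpy as np

def frequency_aggregate(L_norm, X, k=3, g=lambda lam: 1.0 - lam):
    """Eq. (33): reweight the k lowest-frequency components with g(lambda)."""
    lam, U = np.linalg.eigh(L_norm)          # eigenvalues in ascending order (low frequency first)
    U_k, lam_k = U[:, :k], lam[:k]           # truncate to the k lowest-frequency components
    return U_k @ (g(lam_k)[:, None] * (U_k.T @ X))
```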

(1) Graph Convolutional Network (GCN) [25] can be rewritten in the spectral domain as:

$$\mathbf{H} = \hat{\mathbf{A}}\,\mathbf{X} = \mathbf{U}\left(\mathbf{I} - \mathbf{\Lambda}\right)\mathbf{U}^{\top}\,\mathbf{X}, \qquad (34)$$

where $\mathbf{\Lambda}$ contains the eigenvalues of the Laplacian of the renormalized (self-looped) graph. Therefore, the frequency response function is $g(\lambda) = 1 - \lambda$, which is a low-pass filter: a smaller eigenvalue is adjusted to a larger weight, and small eigenvalues correspond to low-frequency components.
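The spatial/spectral equivalence claimed here can be checked numerically; the sketch below (toy graph and signal assumed) confirms that propagating with $\hat{\mathbf{A}}$ coincides with filtering by $g(\lambda) = 1 - \lambda$ on the eigenvalues of $\mathbf{I} - \hat{\mathbf{A}}$:

```python
import numpy as np

A = np.array([[0, 1, 1], [1, 0, 1], [1, 1, 0]], dtype=float)   # toy triangle graph
X = np.random.randn(3, 2)

A_tilde = A + np.eye(3)
D_inv_sqrt = np.diag(1.0 / np.sqrt(A_tilde.sum(axis=1)))
A_hat = D_inv_sqrt @ A_tilde @ D_inv_sqrt
L_hat = np.eye(3) - A_hat                      # Laplacian of the self-looped, renormalized graph

lam, U = np.linalg.eigh(L_hat)
spatial = A_hat @ X                                   # spatial propagation
spectral = U @ ((1.0 - lam)[:, None] * (U.T @ X))     # spectral filtering with g(lambda) = 1 - lambda
print(np.allclose(spatial, spectral))                 # True: the two views coincide
```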

(2) GraphSAGE [19] can be written in matrix form as:

$$\mathbf{H} = \hat{\mathbf{D}}^{-1}(\mathbf{A} + \mathbf{I})\,\mathbf{X} = \mathbf{U}\left(\mathbf{I} - \mathbf{\Lambda}\right)\mathbf{U}^{-1}\,\mathbf{X}, \qquad (35)$$

so the frequency response function is again $g(\lambda) = 1 - \lambda$, where $\lambda$ now denotes an eigenvalue of the random walk normalized Laplacian of the self-looped graph.

(3) Graph Isomorphism Network (GIN) [53] can be rewritten as:

$$\mathbf{H} = \left((1 + \epsilon)\,\mathbf{I} + \mathbf{A}\right)\mathbf{X} = \mathbf{U}\left((1 + \epsilon)\,\mathbf{I} + \mathbf{\Lambda}_{A}\right)\mathbf{U}^{\top}\,\mathbf{X}, \qquad (36)$$

where $\mathbf{\Lambda}_{A}$ contains the eigenvalues of the unnormalized adjacency matrix $\mathbf{A} = \mathbf{U}\mathbf{\Lambda}_{A}\mathbf{U}^{\top}$. GIN can be seen as a generalization of GCN or GraphSAGE without the normalized adjacency matrix. The frequency response function is $g(\lambda) = 1 + \epsilon + \lambda$.

5.2 Order of Approximation (B2)

By considering higher orders of the frequency, the filter function can approximate any smooth filter function, since this is equivalent to applying a polynomial approximation. Therefore, introducing higher orders of frequencies boosts the representational power of the filter function in simulating spectral signals. Formally, this type of work can be written as:

$$\mathbf{H} = \mathbf{U}\,f_{poly}(\mathbf{\Lambda})\,\mathbf{U}^{\top}\,\mathbf{X} = \mathbf{U}\left(\sum_{n=0}^{N} w_n\,\mathbf{\Lambda}^{n}\right)\mathbf{U}^{\top}\,\mathbf{X}, \qquad (37)$$

where $f_{poly}$ is a polynomial function.

(1) ChebNet [13] can be rewritten as:

$$\mathbf{H} = \mathbf{U}\left(\sum_{k=0}^{K} \theta_k\,T_k(\mathbf{\Lambda})\right)\mathbf{U}^{\top}\,\mathbf{X} = \mathbf{U}\left(\sum_{k=0}^{K} \theta'_k\,\mathbf{\Lambda}^{k}\right)\mathbf{U}^{\top}\,\mathbf{X}, \qquad (38)$$

where $T_k$ is the Chebyshev polynomial, $\theta_k$ is the Chebyshev coefficient, and $\theta'_k$ is the coefficient after expansion and reorganization. Therefore, we can rewrite it as:

$$\mathbf{H} = \mathbf{U}\,g_{\theta}(\mathbf{\Lambda})\,\mathbf{U}^{\top}\,\mathbf{X}, \qquad (39)$$

where $g_{\theta}(\lambda) = \sum_{k=0}^{K} \theta'_k\,\lambda^{k}$.

(2) DeepWalk [41] updates representations as:

$$\mathbf{H} = \mathbf{U}\left(\frac{1}{t}\sum_{k=1}^{t}\left(\mathbf{I} - \mathbf{\Lambda}\right)^{k}\right)\mathbf{U}^{\top}\,\mathbf{X},$$

where $\mathbf{I} - \mathbf{\Lambda}$ results from substituting $\mathbf{P} = \mathbf{I} - \tilde{\mathbf{L}}$ (up to the choice of normalization), and all parameters are determined by the predefined step size $t$.

(3) Diffusion convolutional neural networks (DCNN) [2] can be rewritten as:

$$\mathbf{H} = \mathbf{U}\left(\sum_{k=1}^{K} w_k\left(\mathbf{I} - \mathbf{\Lambda}\right)^{k}\right)\mathbf{U}^{\top}\,\mathbf{X}. \qquad (40)$$

(4) Node2Vec [17] can be transformed and reorganized after substituting $\mathbf{P} = \mathbf{I} - \tilde{\mathbf{L}}$ as:

$$\mathbf{H} = \mathbf{U}\left(w_0\,\mathbf{I} + w_1\left(\mathbf{I} - \mathbf{\Lambda}\right) + w_2\left(\mathbf{I} - \mathbf{\Lambda}\right)^{2}\right)\mathbf{U}^{\top}\,\mathbf{X}. \qquad (41)$$

Therefore, Node2Vec's frequency response function $g(\lambda)$ is a second-order function of $\lambda$ with predefined parameters, i.e., parameters determined by $p$ and $q$.

(5) Simple Graph Convolution (SGC) [48] can be easily rewritten after substituting $\hat{\mathbf{A}} = \mathbf{I} - \hat{\mathbf{L}}$ as:

$$\mathbf{H} = \mathbf{U}\left(\mathbf{I} - \mathbf{\Lambda}\right)^{K}\mathbf{U}^{\top}\,\mathbf{X},$$

where $g(\lambda) = (1 - \lambda)^{K}$ is a polynomial function of $\lambda$.

5.3 Approximation Type (B3)

Although polynomial approximation is widely used and empirically effective, it only works well when applied to a smooth signal in the spectral domain. However, there is no guarantee that a real-world signal is smooth. Therefore, rational approximation is introduced to improve the accuracy of modeling non-smooth signals. Rational-kernel-based methods can be written as:

$$\mathbf{H} = \mathbf{U}\,\frac{f_1(\mathbf{\Lambda})}{f_2(\mathbf{\Lambda})}\,\mathbf{U}^{\top}\,\mathbf{X} = \mathbf{U}\,f_2(\mathbf{\Lambda})^{-1}\,f_1(\mathbf{\Lambda})\,\mathbf{U}^{\top}\,\mathbf{X}, \qquad (42)$$

where $f_1/f_2$ is a rational function and $f_1$, $f_2$ are independent polynomial functions. Spectral methods process the graph as a signal in the frequency domain.
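A minimal sketch (assumed) of a rational spectral filter as in Eq. (42), applying $g(\lambda) = f_1(\lambda)/f_2(\lambda)$ elementwise to the eigenvalues:

```python
import numpy as np

def rational_spectral_filter(L_norm, X, f1_coeffs, f2_coeffs):
    """Eq. (42): filter X with g(lambda) = f1(lambda) / f2(lambda); coefficients are constant-first."""
    lam, U = np.linalg.eigh(L_norm)
    g = np.polyval(f1_coeffs[::-1], lam) / np.polyval(f2_coeffs[::-1], lam)  # g(lambda) per eigenvalue
    return U @ (g[:, None] * (U.T @ X))
```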

(1) The auto-regressive filter [60, 57, 5] can be rewritten as:

$$\mathbf{H} = \mathbf{U}\left(\mathbf{I} + \alpha\,\mathbf{\Lambda}\right)^{-1}\mathbf{U}^{\top}\,\mathbf{X}, \qquad (43)$$

where $g(\lambda) = \frac{1}{1 + \alpha\lambda}$.

(2) Personalized PageRank (PPNP) [26] can be rewritten in the spectral domain after substituting $\hat{\mathbf{A}} = \mathbf{I} - \hat{\mathbf{L}}$ as:

$$\mathbf{H} = \mathbf{U}\,\alpha\left(\mathbf{I} - (1 - \alpha)\left(\mathbf{I} - \mathbf{\Lambda}\right)\right)^{-1}\mathbf{U}^{\top}\,\mathbf{X}, \qquad (44)$$

where $g(\lambda) = \frac{\alpha}{\alpha + (1 - \alpha)\lambda}$.

(3) The ARMA filter [6] can be rewritten in the spectral domain after substituting $\hat{\mathbf{A}} = \mathbf{I} - \hat{\mathbf{L}}$ as:

$$\mathbf{H} = \mathbf{U}\,a\left(\mathbf{I} - b\left(\mathbf{I} - \mathbf{\Lambda}\right)\right)^{-1}\mathbf{U}^{\top}\,\mathbf{X}, \qquad (45)$$

where $g(\lambda) = \frac{a}{(1 - b) + b\lambda}$. Note that the ARMA filter is an unnormalized version of PPNP: when $a + b = 1$, the ARMA filter becomes PPNP.

5.4 Connection among spectral methods

There are strong ties among the above-mentioned three groups of spectral methods from the perspective of generalization and specialization, as shown in Fig. (1): (1) Generalization: Frequency Aggregation can be extended to Order of Approximation by adding higher orders of eigenvalues, i.e., from $\lambda^{1}$ to $\lambda^{k}$; Order of Approximation can be upgraded to Approximation Type if the denominator of the filter function is not 1. (2) Specialization: Frequency Aggregation is a special case of Order of Approximation obtained by setting the highest order to 1; Order of Approximation is a special case of Approximation Type obtained by setting the denominator of the filter function to 1.

6 Conclusion

In this paper, we propose a unified framework that summarizes state-of-the-art GNNs, providing a new perspective for understanding GNNs of different mechanisms. By analytically categorizing current GNNs into the spatial and spectral domains and further dividing them into subcategories, our analysis reveals that the subcategories are not only strongly connected by generalization and specialization relations within their domain, but also by equivalence relations across the domains. We demonstrate the generalization power of the proposed framework by reformulating numerous existing GNN models. This survey of the state-of-the-art also shows that GNNs remain a young research area. The increasing number of emerging GNN models [15, 54, 9, 47, 51, 40] makes theoretical understanding [34, 39] an urgent need. Therefore, the next-generation GNNs are expected to be more interpretable and transparent to applications [18, 3, 55].

References

  1. S. Abu-El-Haija, B. Perozzi, A. Kapoor, H. Harutyunyan, N. Alipourfard, K. Lerman, G. V. Steeg and A. Galstyan (2019) Mixhop: higher-order graph convolution architectures via sparsified neighborhood mixing. International Conference on Machine Learning. Cited by: §4.2.
  2. J. Atwood and D. Towsley (2016) Diffusion-convolutional neural networks. In Advances in Neural Information Processing Systems, pp. 1993–2001. Cited by: §1, §1, §4.2, §4.2, §5.2.
  3. F. Baldassarre and H. Azizpour (2019) Explainability techniques for graph convolutional networks. arXiv preprint arXiv:1905.13686. Cited by: §6.
  4. M. G. Bell and Y. Iida (1997) Transportation network analysis. Cited by: §1.
  5. Y. Bengio, O. Delalleau and N. Le Roux (2006) Label propagation and quadratic criterion. In Semi-Supervised Learning. Cited by: §4.3, §5.3.
  6. F. M. Bianchi, D. Grattarola, L. Livi and C. Alippi (2019) Graph neural networks with convolutional arma filters. CoRR. Cited by: §4.3, §4.3, §4.3, §5.3.
  7. B. Bollobás (2004) Extremal graph theory. Courier Corporation. Cited by: Definition 2.3.
  8. J. Bruna, W. Zaremba, A. Szlam and Y. Lecun (2014) Spectral networks and locally connected networks on graphs. In International Conference on Learning Representations (ICLR 2014). Cited by: §1, §1.
  9. Y. Chen, L. Wu and M. J. Zaki (2019) Reinforcement learning based graph-to-sequence model for natural question generation. arXiv preprint arXiv:1908.04942. Cited by: §6.
  10. Z. Chen, F. Chen, R. Lai, X. Zhang and C. Lu (2018) Rational neural networks for approximating jump discontinuities of graph convolution operator. ICDM. Cited by: §4.3, §4.3.
  11. F. R. Chung (1997) Spectral graph theory. American Mathematical Soc.. Cited by: Definition 2.2.
  12. E. H. Davidson, J. P. Rast, P. Oliveri, A. Ransick, C. Calestani, C. Yuh, T. Minokawa, G. Amore, V. Hinman and C. Arenas-Mena (2002) A genomic regulatory network for development. science 295 (5560), pp. 1669–1678. Cited by: §1.
  13. M. Defferrard, X. Bresson and P. Vandergheynst (2016) Convolutional neural networks on graphs with fast localized spectral filtering. In Advances in neural information processing systems, pp. 3844–3852. Cited by: §1, §1, §4.1, §4.2, §4.2.
  14. J. H. Drew and H. Liu (2008-September 23) Diagnosing fault patterns in telecommunication networks. Google Patents. Note: US Patent 7,428,300 Cited by: §1.
  15. F. Errica, M. Podda, D. Bacciu and A. Micheli (2019) A fair comparison of graph neural networks for graph classification. arXiv preprint arXiv:1912.09893. Cited by: §6.
  16. J. Gilmer, S. S. Schoenholz, P. F. Riley, O. Vinyals and G. E. Dahl (2017) Neural message passing for quantum chemistry. In Proceedings of the 34th International Conference on Machine Learning-Volume 70, pp. 1263–1272. Cited by: §1, §4.1.
  17. A. Grover and J. Leskovec (2016) Node2vec: scalable feature learning for networks. In Proceedings of the 22nd ACM SIGKDD international conference on Knowledge discovery and data mining, pp. 855–864. Cited by: §4.2, §4.2, §5.2.
  18. R. Guo, J. Li and H. Liu (2020) Learning individual treatment effects from networked observational data. ACM International Conference on Web Search and Data Mining. Cited by: §6.
  19. W. Hamilton, Z. Ying and J. Leskovec (2017) Inductive representation learning on large graphs. In Advances in Neural Information Processing Systems, pp. 1024–1034. Cited by: §1, §1, §4.1, §4.1, §5.1.
  20. W. L. Hamilton, R. Ying and J. Leskovec (2017) Representation learning on graphs: methods and applications. arXiv preprint arXiv:1709.05584. Cited by: §1.
  21. D. K. Hammond, P. Vandergheynst and R. Gribonval (2011) Wavelets on graphs via spectral graph theory. Applied and Computational Harmonic Analysis 30 (2), pp. 129–150. Cited by: §4.2, §5.2.
  22. G. Hinton, L. Deng, D. Yu, G. Dahl, A. Mohamed, N. Jaitly, A. Senior, V. Vanhoucke, P. Nguyen and B. Kingsbury (2012) Deep neural networks for acoustic modeling in speech recognition. IEEE Signal processing magazine 29. Cited by: §1.
  23. E. Isufi, A. Loukas, A. Simonetto and G. Leus (2017-01) Autoregressive moving average graph filtering. IEEE Transactions on Signal Processing 65 (2), pp. 274–288. External Links: Document, ISSN 1053-587X Cited by: §4.3.
  24. R. Johnson and T. Zhang (2007) On the effectiveness of laplacian normalization for graph semi-supervised learning. Journal of Machine Learning Research 8 (Jul), pp. 1489–1517. Cited by: §4.1.
  25. T. N. Kipf and M. Welling (2017) Semi-supervised classification with graph convolutional networks. ICLR. Cited by: §1, §1, §4.1, §5.1.
  26. J. Klicpera, A. Bojchevski and S. Günnemann (2018) Predict then propagate: graph neural networks meet personalized pagerank. Cited by: §4.3, §4.3, §4.3, §5.3.
  27. D. Lazer, A. S. Pentland, L. Adamic, S. Aral, A. L. Barabasi, D. Brewer, N. Christakis, N. Contractor, J. Fowler and M. Gutmann (2009) Life in the network: the coming age of computational social science. Science (New York, NY) 323 (5915), pp. 721. Cited by: §1.
  28. Y. LeCun, Y. Bengio and G. Hinton (2015) Deep learning. nature 521 (7553), pp. 436. Cited by: §1.
  29. R. Levie, F. Monti, X. Bresson and M. M. Bronstein (2018) Cayleynets: graph convolutional neural networks with complex rational spectral filters. IEEE Transactions on Signal Processing 67 (1), pp. 97–109. Cited by: §4.3, §4.3, §4.3.
  30. Q. Li, Z. Han and X. Wu (2018) Deeper insights into graph convolutional networks for semi-supervised learning. In Thirty-Second AAAI Conference on Artificial Intelligence, Cited by: §1, §4.2.
  31. Q. Li, X. Wu, H. Liu, X. Zhang and Z. Guan (2019-06) Label efficient semi-supervised learning via graph filtering. In The IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Cited by: §1, §4.3, §5.1.
  32. Y. Lin, Z. Liu, M. Sun, Y. Liu and X. Zhu (2015) Learning entity and relation embeddings for knowledge graph completion.. In AAAI, Vol. 15, pp. 2181–2187. Cited by: §1.
  33. A. Loukas, A. Simonetto and G. Leus (2015-11) Distributed autoregressive moving average graph filters. IEEE Signal Processing Letters 22 (11), pp. 1931–1935. External Links: Document, ISSN 1070-9908 Cited by: §4.3.
  34. A. Loukas (2019) What graph neural networks cannot learn: depth vs width. arXiv preprint arXiv:1907.03199. Cited by: §6.
  35. T. Luong, H. Pham and C. D. Manning (2015) Effective approaches to attention-based neural machine translation. In Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing, pp. 1412–1421. Cited by: §1.
  36. V. Marx (2012) High-throughput anatomy: charting the brain’s networks. Nature 490 (7419), pp. 293. Cited by: §1.
  37. F. Monti, D. Boscaini, J. Masci, E. Rodola, J. Svoboda and M. M. Bronstein (2017) Geometric deep learning on graphs and manifolds using mixture model cnns. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 5115–5124. Cited by: §1.
  38. M. E. Newman (2002) Spread of epidemic disease on networks. Physical review E 66 (1), pp. 016128. Cited by: §1.
  39. K. Oono and T. Suzuki (2020) Graph neural networks exponentially lose expressive power for node classification. In International Conference on Learning Representations. Cited by: §6.
  40. C. W. Park and C. Wolverton (2019) Developing an improved crystal graph convolutional neural network framework for accelerated materials discovery. External Links: 1906.05267 Cited by: §6.
  41. B. Perozzi, R. Al-Rfou and S. Skiena (2014) Deepwalk: online learning of social representations. In Proceedings of the 20th ACM SIGKDD international conference on Knowledge discovery and data mining, pp. 701–710. Cited by: §4.1, §4.2, §5.2.
  42. J. Redmon, S. Divvala, R. Girshick and A. Farhadi (2016) You only look once: unified, real-time object detection. In Proceedings of the IEEE conference on computer vision and pattern recognition, pp. 779–788. Cited by: §1.
  43. S. Ren, K. He, R. Girshick and J. Sun (2015) Faster r-cnn: towards real-time object detection with region proposal networks. In Advances in neural information processing systems, pp. 91–99. Cited by: §1.
  44. D. I. Shuman, S. K. Narang, P. Frossard, A. Ortega and P. Vandergheynst (2013) The emerging field of signal processing on graphs: extending high-dimensional data analysis to networks and other irregular domains. IEEE Signal Processing Magazine 30 (3), pp. 83–98. Cited by: Definition 2.3.
  45. J. Tang, M. Qu, M. Wang, M. Zhang, J. Yan and Q. Mei (2015) Line: large-scale information network embedding. In Proceedings of the 24th international conference on world wide web, pp. 1067–1077. Cited by: §4.2.
  46. P. Veličković, G. Cucurull, A. Casanova, A. Romero, P. Lio and Y. Bengio (2017) Graph attention networks. arXiv preprint arXiv:1710.10903. Cited by: §1, §1, §4.1.
  47. M. Weber, G. Domeniconi, J. Chen, D. K. I. Weidele, C. Bellei, T. Robinson and C. E. Leiserson (2019) Anti-money laundering in bitcoin: experimenting with graph convolutional networks for financial forensics. arXiv preprint arXiv:1908.02591. Cited by: §6.
  48. F. Wu, A. Souza, T. Zhang, C. Fifty, T. Yu and K. Weinberger (2019) Simplifying graph convolutional networks. In International Conference on Machine Learning, pp. 6861–6871. Cited by: §1, §4.2, §4.2, §5.2.
  49. Y. Wu, M. Schuster, Z. Chen, Q. V. Le, M. Norouzi, W. Macherey, M. Krikun, Y. Cao, Q. Gao and K. Macherey (2016) Google’s neural machine translation system: bridging the gap between human and machine translation. arXiv preprint arXiv:1609.08144. Cited by: §1.
  50. Z. Wu, S. Pan, F. Chen, G. Long, C. Zhang and P. S. Yu (2019) A comprehensive survey on graph neural networks. arXiv preprint arXiv:1901.00596. Cited by: §1.
  51. T. Xie and J. C. Grossman (2018-04) Crystal graph convolutional neural networks for an accurate and interpretable prediction of material properties. Phys. Rev. Lett. 120, pp. 145301. External Links: Document, Link Cited by: §6.
  52. K. Xu, W. Hu, J. Leskovec and S. Jegelka (2018) How powerful are graph neural networks?. arXiv preprint arXiv:1810.00826. Cited by: §4.1.
  53. K. Xu, W. Hu, J. Leskovec and S. Jegelka (2019) How powerful are graph neural networks?. In International Conference on Learning Representations, External Links: Link Cited by: §1, §1, §4.1, §4.1, §5.1.
  54. F. Yang, Z. Yang and W. W. Cohen (2017) Differentiable learning of logical rules for knowledge base reasoning. In Advances in Neural Information Processing Systems, pp. 2319–2328. Cited by: §6.
  55. R. Ying, D. Bourgeois, J. You, M. Zitnik and J. Leskovec (2019) GNN explainer: a tool for post-hoc explanation of graph neural networks. Cited by: §1, §6.
  56. Z. Zhang, P. Cui and W. Zhu (2018) Deep learning on graphs: a survey. arXiv preprint arXiv:1812.04202. Cited by: §1.
  57. D. Zhou, O. Bousquet, T. N. Lal, J. Weston and B. Schölkopf (2004) Learning with local and global consistency. In Advances in neural information processing systems, pp. 321–328. Cited by: §4.3, §5.3.
  58. J. Zhou, G. Cui, Z. Zhang, C. Yang, Z. Liu and M. Sun (2018) Graph neural networks: a review of methods and applications. arXiv preprint arXiv:1812.08434. Cited by: §1.
  59. X. Zhu and M. Rabbat (2012) Approximating signals supported on graphs. In Acoustics, Speech and Signal Processing (ICASSP), 2012 IEEE International Conference on, pp. 3921–3924. Cited by: Definition 2.3.
  60. X. Zhu, Z. Ghahramani and J. D. Lafferty (2003) Semi-supervised learning using gaussian fields and harmonic functions. In Proceedings of the 20th International conference on Machine learning (ICML-03), pp. 912–919. Cited by: §4.3, §5.3.