GNNExplainer: Generating Explanations for Graph Neural Networks


Rex Ying (Stanford University), Dylan Bourgeois (EPFL), Jiaxuan You (Stanford University), Marinka Zitnik (Stanford University), Jure Leskovec (Stanford University)

Graph Neural Networks (GNNs) are a powerful tool for machine learning on graphs. GNNs combine node feature information with the graph structure by recursively passing neural messages along edges of the input graph. However, incorporating both graph structure and feature information leads to complex models and explaining predictions made by GNNs remains unsolved. Here we propose GnnExplainer, the first general, model-agnostic approach for providing interpretable explanations for predictions of any GNN-based model on any graph-based machine learning task. Given an instance, GnnExplainer identifies a compact subgraph structure and a small subset of node features that have a crucial role in GNN’s prediction. Further, GnnExplainer can generate consistent and concise explanations for an entire class of instances. We formulate GnnExplainer as an optimization task that maximizes the mutual information between a GNN’s prediction and distribution of possible subgraph structures. Experiments on synthetic and real-world graphs show that our approach can identify important graph structures as well as node features, and outperforms baselines by 17.1% on average. GnnExplainer provides a variety of benefits, from the ability to visualize semantically relevant structures to interpretability, to giving insights into errors of faulty GNNs.

1 Introduction

In many real-world applications, including social, information, chemical, and biological domains, data can be naturally modeled as graphs Cho et al. (2011); Zitnik et al. (2018); You et al. (2018). Graphs are powerful data representations but are challenging to work with because they require modeling structural information as well as node feature information Zhou et al. (2018); Zhang et al. (2018). To address this challenge, Graph Neural Networks (GNNs) have emerged as state of the art for machine learning on graphs, due to their ability to recursively incorporate information from neighboring nodes in the graph, naturally capturing both graph structure and node features Kipf and Welling (2016); Hamilton et al. (2017); Zhang and Chen (2018); Ying et al. (2018b).

Despite their strengths, GNNs lack transparency as they do not easily allow for a human-intelligible explanation of their predictions. Yet, the ability to understand GNN’s predictions is important and useful for several reasons: (i) it can increase trust in the GNN model, (ii) it improves model’s transparency in a growing number of decision-critical applications pertaining to fairness, privacy and other safety challenges Doshi-Velez and Kim (2017), and (iii) it allows practitioners to get an understanding of the network characteristics, identify and correct systematic patterns of mistakes made by models before deploying them in the real world.

Figure 1: GnnExplainer provides interpretable explanations for predictions made by any GNN model on any graph-based machine learning task. Shown is a hypothetical node classification task where a GNN model $\Phi$ is trained on a social interaction graph to predict future sport activities. Given a trained GNN $\Phi$ and a prediction $\hat{y}_i$ = “Basketball” for person $v_i$, GnnExplainer generates an explanation by identifying a small subgraph of the input graph together with a small subset of node features (shown on the right) that are most influential for $\hat{y}_i$. Examining the explanation for $\hat{y}_i$, we see that many friends in one part of $v_i$’s social circle enjoy ball games, and so the GNN predicts that $v_i$ will like basketball. Similarly, examining the explanation for $\hat{y}_j$, we see that $v_j$’s friends and friends of his friends enjoy water and beach sports, and so the GNN predicts $\hat{y}_j$ = “Sailing.”

While currently there are no methods for explaining GNNs, recent approaches for explaining other types of neural networks have taken one of two main routes. One line of work locally approximates models with a simpler surrogate model, which itself can be probed for explanations Ribeiro et al. (2016); Schmitz et al. (1999); Lakkaraju et al. (2017). The other line of methods carefully examine models for relevant components and identify, for example, relevant features in the input data Erhan et al. (2009); Chen et al. (2018b); Sundararajan et al. (2017); Lundberg and Lee (2017) and influential input instances Koh and Liang (2017); Yeh et al. (2018). However, these approaches fall short in their ability to incorporate structural information, the essence of graphs. Since this aspect is crucial for the success of machine learning on graphs, any explanation of GNN’s predictions should leverage rich relational information provided by the graph as well as node features.

Here, we propose GnnExplainer, an approach to explain predictions from GNNs. GnnExplainer takes a trained GNN model and its prediction(s), and it returns an explanation in the form of a small subgraph of the input graph together with a small subset of node features that are most influential for the prediction(s) (Figure 1). The approach is model-agnostic and can explain predictions of any GNN-based model on any machine learning task for graphs, including node classification, link prediction, and graph classification. It handles single- as well as multi-instance explanations. In the case of single-instance explanations, GnnExplainer explains the GNN’s prediction for one particular instance (i.e., a node label, a new link, a graph-level label), and in the case of multi-instance explanations, GnnExplainer provides a consistent explanation for a set of GNN’s predictions.

GnnExplainer formalizes an explanation as a rich subgraph of the entire graph the GNN was trained on, chosen so that the subgraph maximizes the mutual information with the GNN’s prediction(s). This is achieved by formulating a mean-field variational approximation and learning a real-valued graph mask that selects the important subgraph of the GNN’s computation graph. Simultaneously, GnnExplainer also learns a feature mask that masks out unimportant node features (Figure 1).

We validate GnnExplainer on synthetic and real-world graphs. On synthetic graphs with planted network motifs important for prediction, we show that GnnExplainer can outperform baselines by up to 43.0% in explanation accuracy. Furthermore, we show how GnnExplainer can robustly identify important graph structures and node features that influence a GNN’s prediction the most. On two real-world datasets, i.e., molecular graphs and social interaction networks, we show that GnnExplainer can identify graph structures that a GNN learned to use for prediction, such as chemical groups or ring structures in molecules, and star structures in Reddit threads.

2 Related work

Although the problem of explaining GNNs is not well-studied, the related problems of interpretability and neural debugging recently received substantial attention in machine learning. At a high level, we can group these interpretability methods for non-graph neural networks into two main families.

Methods in the first family formulate a simple proxy model for a full neural network model. This can be done in a model-agnostic way, usually by learning a locally faithful approximation around the prediction, for example with a linear model Ribeiro et al. (2016) or a set of rules, representing sufficient conditions on the prediction Augasta and Kathirvalavakumar (2012); Zilke et al. (2016); Lakkaraju et al. (2017). Methods in the second family identify important aspects of the computation, for example, through feature gradients Erhan et al. (2009); Zeiler and Fergus (2014), backpropagation of neurons’ contributions to the input features Sundararajan et al. (2017); Shrikumar et al. (2017); Chen et al. (2018b), and counterfactual reasoning Kang et al. (2019). However, the saliency maps Zeiler and Fergus (2014) produced by these methods have been shown to be misleading in some instances Adebayo et al. (2018) and prone to issues like gradient saturation Sundararajan et al. (2017); Shrikumar et al. (2017). These issues are exacerbated on discrete inputs such as graph adjacency matrices since the gradient values can be very large but only on a very small interval. Because of that, such approaches are not suitable for explaining predictions made by neural networks on graphs.

Instead of creating new, inherently interpretable models, post-hoc interpretability methods Koh and Liang (2017); Yeh et al. (2018); Guidotti and others (2018); Adadi and Berrada (2018); Fisher et al. (2018); Hooker (2004) consider the model as a black box and then probe it for relevant information. However, no work has been done to leverage relational structures like graphs. The lack of methods for explaining predictions on graph-structured data is problematic, as in many cases, predictions on graphs are induced by a complex composition of nodes and their paths. For example, in some tasks, an edge could be important only when another alternative path exists to form a cycle, which determines the class of the node Debnath and others (1991); Duvenaud and others (2015), and thus their joint contribution cannot be modeled well using linear combinations of individual contributions.

Finally, recent GNN models augment the interpretability of GNNs via attention mechanisms Xie and Grossman (2018); Neil and others (2018); Velickovic et al. (2018). However, although the edge attention values can serve as an indication of graph structure importance, the values are the same for predictions on all nodes. Thus, this contradicts with many applications where an edge is essential for predicting a label of one node but not the label of another node. Furthermore, these approaches are limited to specific GNN architectures and cannot explain predictions using graph structure and node features jointly.

3 Formulating explanations for graph neural networks

Figure 2: A. GNN computation graph $G_c$ (green and orange) for making a prediction $\hat{y}$ at node $v$. Some edges in $G_c$ form important neural message-passing pathways (green), which allow useful node information to be propagated across $G_c$ and aggregated at $v$ for prediction, while other edges do not (orange). However, the GNN needs to aggregate important as well as unimportant messages to form a prediction at node $v$, which can dilute the signal accumulated from $v$’s neighborhood. The goal of GnnExplainer is to identify a small set of important features and pathways (green) that are crucial for prediction. B. In addition to $G_S$ (in green), GnnExplainer identifies what feature dimensions of $G_S$’s nodes are important for prediction by learning a node feature mask.

Let $G$ denote a graph on edges $E$ and nodes $V$ that are associated with $d$-dimensional node features $\mathcal{X} = \{x_1, \ldots, x_n\}$, $x_i \in \mathbb{R}^d$. Without loss of generality, we consider the problem of explaining a node classification task (see Section 4.4 for other tasks). Let $f$ denote a label function on nodes, $f : V \mapsto \{1, \ldots, C\}$, that maps every node in $V$ to one of $C$ classes. The GNN model $\Phi$ is optimized on all nodes in the training set and can then be used to approximate $f$ on nodes in the test set.

3.1 Background

For each layer $l$, the update of the GNN model $\Phi$ involves three key computations Battaglia et al. (2018); Zhang et al. (2018); Zhou et al. (2018). (1) First, the model computes neural messages between every pair of nodes. The message for node pair $(v_i, v_j)$ is a function Msg of $v_i$’s and $v_j$’s representations $h_i^{l-1}$ and $h_j^{l-1}$ in the previous layer, and of the relation $r_{ij}$ between the nodes: $m_{ij}^{l} = \text{Msg}(h_i^{l-1}, h_j^{l-1}, r_{ij})$. (2) Second, for each node $v_i$, the GNN aggregates messages from $v_i$’s neighborhood $\mathcal{N}_{v_i}$ and calculates an aggregated message $M_i^{l}$ via an aggregation method Agg Hamilton et al. (2017); Xu et al. (2019): $M_i^{l} = \text{Agg}(\{m_{ij}^{l} \mid v_j \in \mathcal{N}_{v_i}\})$, where $\mathcal{N}_{v_i}$ is the neighborhood of node $v_i$, whose definition depends on the particular GNN variant. (3) Finally, the GNN takes the aggregated message $M_i^{l}$ along with $v_i$’s representation $h_i^{l-1}$ from the previous layer, and it non-linearly transforms them to obtain $v_i$’s representation $h_i^{l}$ at layer $l$: $h_i^{l} = \text{Update}(M_i^{l}, h_i^{l-1})$. The final embedding for node $v_i$ after $L$ layers of computation is $z_i = h_i^{L}$. GnnExplainer provides explanations for any GNN that can be formulated in terms of Msg, Agg, and Update computations.
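
The three computations above can be illustrated with a minimal sketch of a single message-passing layer. The concrete choices here are assumptions made only for illustration and are not taken from the paper: Msg simply forwards the neighbor's representation (the edge relation is ignored), Agg is an element-wise mean, and Update is a ReLU applied to the sum of the aggregated message and the previous representation. The function names are hypothetical.

```python
from typing import Dict, List

Vector = List[float]


def msg(h_i: Vector, h_j: Vector) -> Vector:
    """Msg: forward the neighbor's previous-layer representation (relation ignored)."""
    return list(h_j)


def agg(messages: List[Vector], dim: int) -> Vector:
    """Agg: element-wise mean over incoming messages (zero vector if there are none)."""
    if not messages:
        return [0.0] * dim
    return [sum(m[k] for m in messages) / len(messages) for k in range(dim)]


def update(m_i: Vector, h_i: Vector) -> Vector:
    """Update: ReLU of the aggregated message plus the previous representation."""
    return [max(0.0, a + b) for a, b in zip(m_i, h_i)]


def gnn_layer(h: Dict[int, Vector], neighbors: Dict[int, List[int]]) -> Dict[int, Vector]:
    """Apply one Msg -> Agg -> Update round to every node."""
    new_h = {}
    for i, h_i in h.items():
        messages = [msg(h_i, h[j]) for j in neighbors.get(i, [])]
        new_h[i] = update(agg(messages, len(h_i)), h_i)
    return new_h


# Path graph 0 - 1 - 2 with 2-dimensional node features.
h0 = {0: [1.0, 0.0], 1: [0.0, 1.0], 2: [-2.0, 4.0]}
nbrs = {0: [1], 1: [0, 2], 2: [1]}
h1 = gnn_layer(h0, nbrs)
```

Stacking this layer $L$ times gives each node access to information from its $L$-hop neighborhood, which is why the explanation only needs to look at that neighborhood.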

3.2 GnnExplainer: Problem formulation

Our key insight is that the information used by the GNN model $\Phi$ to compute a prediction $\hat{y}$ at node $v$ is completely described by $v$’s computation graph, which is defined by the GNN’s neighborhood-based aggregation (Figure 2). Let us denote the computation graph that generates the embedding $z$ of node $v$ by $G_c(v)$, the associated binary adjacency matrix by $A_c(v) \in \{0,1\}^{n \times n}$, and the associated feature set by $X_c(v) = \{x_j \mid v_j \in G_c(v)\}$. The GNN model $\Phi$ learns a conditional distribution $P_\Phi(Y \mid G_c, X_c)$, where $Y$ is a random variable representing the output class labels $\{1, \ldots, C\}$, indicating the probability of a node belonging to each of the $C$ classes.

In effect, this implies that a GNN’s prediction is given by $\hat{y} = \Phi(G_c(v), X_c(v))$, meaning that we only need to consider structural information in $G_c(v)$ and node feature information in $X_c(v)$ to explain $\hat{y}$ (Figure 2A). Formally, GnnExplainer provides an explanation for $\hat{y}$ as $(G_S, X_S^F)$, where $G_S$ is a small subgraph of the computation graph and $X_S^F$ is a small subset of node features that are most important for $\hat{y}$ (Figure 2B).
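
To make the notion of a computation graph concrete, here is a small sketch that extracts it for one target node. It assumes a standard message-passing GNN with `num_layers` layers whose neighborhoods are the one-hop graph neighbors, and an undirected graph stored as a symmetric adjacency list with integer node ids. The function name `computation_graph` is hypothetical. An edge is kept only if a message sent across it can still reach the target within the layer budget.

```python
from collections import deque
from typing import Dict, List, Set, Tuple


def computation_graph(
    adj: Dict[int, List[int]], target: int, num_layers: int
) -> Tuple[Set[int], Set[Tuple[int, int]]]:
    """Return the nodes and undirected edges that can influence `target`."""
    # Breadth-first search, stopping at distance num_layers.
    dist = {target: 0}
    queue = deque([target])
    while queue:
        u = queue.popleft()
        if dist[u] == num_layers:
            continue
        for v in adj.get(u, []):
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    nodes = set(dist)
    # Keep edge (u, v) only if its closer endpoint is fewer than num_layers hops away.
    edges = {
        (u, v)
        for u in nodes
        for v in adj.get(u, [])
        if v in dist and u < v and min(dist[u], dist[v]) < num_layers
    }
    return nodes, edges


path = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3]}
path_nodes, path_edges = computation_graph(path, target=0, num_layers=2)

triangle = {0: [1, 2], 1: [0, 2], 2: [0, 1]}
tri_nodes, tri_edges = computation_graph(triangle, target=0, num_layers=1)
```

In the triangle example with a single layer, the edge between the two neighbors is excluded because a message sent along it would need a second layer to reach the target.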

4 GnnExplainer

Next we describe our approach, GnnExplainer. Given a trained GNN model $\Phi$ and a prediction (i.e., single-instance explanation, Sections 4.1 and 4.2) or a set of predictions (i.e., multi-instance explanations, Section 4.3), GnnExplainer generates an explanation by identifying a subgraph of the computation graph and a subset of node features that are most influential for the model $\Phi$’s prediction. When explaining a set of predictions, GnnExplainer aggregates the individual explanations in the set and summarizes them with a prototype. We conclude with a discussion of how GnnExplainer can be used for any machine learning task on graphs, including link prediction and graph classification (Section 4.4).

4.1 Single-instance explanations

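Our goal is to identify a subgraph $G_S \subseteq G_c$ and the associated features $X_S = \{x_j \mid v_j \in G_S\}$ that are important for the GNN’s prediction $\hat{y}$. We formalize the notion of importance using mutual information $MI$ and formulate it in an optimization framework as follows:

$$\max_{G_S} MI\left(Y, (G_S, X_S)\right) = H(Y) - H(Y \mid G = G_S, X = X_S). \tag{1}$$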

For the target node $v$, $MI$ quantifies the change in the probability of prediction $\hat{y} = \Phi(G_c, X_c)$ if $v$’s computation graph and the associated features are limited to the explanation subgraph $G_S$ and $X_S$, respectively. For example, consider the situation where $v_j \in G_c(v_i)$, $v_j \neq v_i$. Then, if removing $v_j$ from $G_c(v_i)$ strongly decreases the probability of prediction $\hat{y}_i$, the node $v_j$ is a good counterfactual explanation of the prediction at $v_i$. Similarly, consider the situation where $(v_j, v_k) \in G_c(v_i)$, $v_j, v_k \neq v_i$. Then, if removing the edge between $v_j$ and $v_k$ strongly decreases the probability of the prediction $\hat{y}_i$, it is the absence of a link between $v_j$ and $v_k$ that is a good counterfactual explanation of the prediction at $v_i$. Examining Eq. (1), we see that the entropy term $H(Y)$ is constant because $\Phi$ is a fixed, trained GNN. As a result, maximizing the mutual information between the predicted label distribution $Y$ and the explanation $(G_S, X_S)$ is equivalent to minimizing the conditional entropy $H(Y \mid G = G_S, X = X_S)$, which can be expressed as follows:

$$H(Y \mid G = G_S, X = X_S) = -\mathbb{E}_{Y \mid G_S, X_S}\left[\log P_\Phi(Y \mid G = G_S, X = X_S)\right]. \tag{2}$$

The explanation for prediction $\hat{y}$ is thus a subgraph $G_S$ that minimizes the uncertainty of $\Phi$ when the GNN computation is limited to $G_S$. In effect, $G_S$ maximizes the probability of $\hat{y}$ (Figure 2). To obtain a compact explanation, we impose a constraint on $G_S$’s size as $|G_S| \leq K_M$, so that $G_S$ has at most $K_M$ nodes. In effect, this implies that GnnExplainer aims to denoise $G_c$ by taking the $K_M$ edges that give the highest mutual information with the prediction.

GnnExplainer’s optimization framework. Direct optimization of GnnExplainer’s objective is not tractable, as $G_c$ has exponentially many subgraphs $G_S$ that are candidate explanations for $\hat{y}$. We thus consider a fractional adjacency matrix for subgraphs $G_S$, i.e., $A_S \in [0,1]^{n \times n}$, and enforce the subgraph constraint as $A_S[j,k] \leq A_c[j,k]$ for all $j, k$. (Footnote 1: For typed edges, we define $G_S \in [0,1]^{C_e \times n \times n}$, where $C_e$ is the number of edge types.) This continuous relaxation can be interpreted as a variational approximation of the distribution of subgraphs of $G_c$. In particular, if we treat $G_S \sim \mathcal{G}$ as a random graph variable, the objective in Eq. (2) becomes:

$$\min_{\mathcal{G}} \mathbb{E}_{G_S \sim \mathcal{G}}\, H(Y \mid G = G_S, X = X_S), \tag{3}$$

which, by Jensen’s inequality, has the following upper bound:

$$\min_{\mathcal{G}} H\left(Y \mid G = \mathbb{E}_{\mathcal{G}}[G_S], X = X_S\right). \tag{4}$$

To estimate $\mathbb{E}_{\mathcal{G}}[G_S]$, we use the mean-field variational approximation and decompose $\mathcal{G}$ into a multivariate Bernoulli distribution as $P_{\mathcal{G}}(G_S) = \prod_{(j,k) \in G_c} A_S[j,k]$. This allows us to estimate the expectation with respect to the mean-field approximation and obtain $A_S$, in which the $(j,k)$-th entry represents the expectation on whether edge $(v_j, v_k)$ exists. We observed empirically that this approximation, together with a regularizer for promoting discreteness Ying et al. (2018b), converges to good local minima despite the non-convexity of GNNs. Thus, a computationally efficient version of GnnExplainer’s objective, which we optimize using gradient descent, is as follows:

$$\min_{M} -\sum_{c=1}^{C} \mathbb{1}[y = c] \log P_\Phi\left(Y = y \mid G = A_c \odot \sigma(M), X = X_c\right), \tag{5}$$

where $M \in \mathbb{R}^{n \times n}$ denotes the mask that we need to learn, $\odot$ denotes element-wise multiplication, and $\sigma$ denotes the sigmoid that maps the mask to $[0,1]^{n \times n}$. Finally, we remove low values in $M$ through thresholding and compute the element-wise multiplication of $\sigma(M)$ and $A_c$ to arrive at the explanation $G_S$ for the GNN’s prediction $\hat{y}$ at node $v_i$.
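
The mask-learning procedure can be sketched on a deliberately tiny problem. Everything concrete below is an assumption made for illustration rather than the authors' implementation: the "trained GNN" is replaced by a fixed toy model that predicts $P(y=1)$ at the target node as a sigmoid of the masked sum of scalar neighbor features, the gradient is written out by hand, and a small penalty on the total mask size (hypothetical parameter `size_weight`) stands in for the compactness constraint. The names `predict` and `learn_edge_mask` are hypothetical.

```python
import math
from typing import List


def sigmoid(z: float) -> float:
    return 1.0 / (1.0 + math.exp(-z))


def predict(mask_logits: List[float], adj_row: List[int], feats: List[float]) -> float:
    """Toy stand-in for a trained GNN: P(y=1) at the target from masked neighbor features."""
    agg = sum(sigmoid(m) * a * x for m, a, x in zip(mask_logits, adj_row, feats))
    return sigmoid(agg)


def learn_edge_mask(
    adj_row: List[int],
    feats: List[float],
    steps: int = 300,
    lr: float = 0.5,
    size_weight: float = 0.05,
) -> List[float]:
    """Gradient descent on -log P(y=1) + size_weight * sum(sigmoid(M)) over mask logits M."""
    mask_logits = [0.0] * len(adj_row)
    for _ in range(steps):
        p = predict(mask_logits, adj_row, feats)
        for j, (a, x) in enumerate(zip(adj_row, feats)):
            if a == 0:
                continue  # no edge to the target, nothing to learn
            s = sigmoid(mask_logits[j])
            grad_nll = -(1.0 - p) * a * x * s * (1.0 - s)  # d(-log p) / dM_j
            grad_size = size_weight * s * (1.0 - s)        # d(size penalty) / dM_j
            mask_logits[j] -= lr * (grad_nll + grad_size)
    # Soft explanation: sigmoid(M) multiplied element-wise with the adjacency row.
    return [sigmoid(m) * a for m, a in zip(mask_logits, adj_row)]


# Target node 0 with neighbors 1, 2, 3; neighbor 1 supports the prediction,
# neighbor 2 contradicts it, neighbor 3 is nearly irrelevant.
adj_row = [0, 1, 1, 1]
feats = [0.0, 2.0, -1.5, 0.1]
soft_mask = learn_edge_mask(adj_row, feats)
explanation = [j for j, m in enumerate(soft_mask) if m > 0.5]  # thresholding step
```

In this toy setting the edge that supports the prediction is pushed towards 1, the contradicting edge towards 0, and the size penalty removes the weakly informative edge. A real application would backpropagate through the actual trained GNN instead of using a hand-derived gradient.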

4.2 Joint consideration of structural and node feature information

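To identify what node features are most important for prediction $\hat{y}$, GnnExplainer learns a feature selector $F$ for nodes in explanation $G_S$. Instead of defining $X_S$ to consist of all node features, i.e., $X_S = \{x_j \mid v_j \in G_S\}$, GnnExplainer considers $X_S^F$ as a subset of features of nodes in $G_S$, which are defined through a binary feature selector $F \in \{0,1\}^d$ (Figure 2B):

$$X_S^F = \{x_j^F \mid v_j \in G_S\}, \quad x_j^F = [x_{j,t_1}, \ldots, x_{j,t_k}] \ \text{for} \ F_{t_i} = 1, \tag{6}$$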

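where $x_j^F$ has node features that are not masked out by $F$. Explanation $(G_S, X_S^F)$ is then jointly optimized for maximizing the mutual information objective:

$$\max_{G_S, F} MI\left(Y, (G_S, F)\right) = H(Y) - H(Y \mid G = G_S, X = X_S^F), \tag{7}$$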

which represents a modified objective function from Eq. (1) that considers structural and node feature information to generate an explanation for prediction $\hat{y}$.

Learning binary feature selector $F$. We specify $X_S^F$ as $X_S \odot F$, where $F$ acts as a feature mask that we need to learn. Intuitively, if a particular feature is not important, the corresponding weights in the GNN’s weight matrix take values close to zero. In effect, this implies that masking the feature out does not decrease the predicted probability for $\hat{y}$. Conversely, if the feature is important, then masking it out would decrease the predicted probability. However, the problem with this approach is that it ignores features that are important for prediction but take values close to zero. To address this issue, we marginalize over all feature subsets and use a Monte Carlo estimate to sample from the empirical marginal distribution for nodes in $X_S$ during training Zintgraf et al. (2017). Further, we use a reparametrization trick Kingma and Welling (2013) to backpropagate gradients in Eq. (7) to the feature mask $F$. In particular, to backpropagate through a $d$-dimensional random variable $X$, we reparametrize $X$ as $X = Z + (X_S - Z) \odot F$ s.t. $\sum_j F_j \leq K_F$, where $Z$ is a $d$-dimensional random variable sampled from the empirical distribution and $K_F$ is a parameter representing the maximum number of features to be kept in the explanation.
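
The reparametrization $X = Z + (X_S - Z) \odot F$ can be sketched for a single node as follows. This is an illustration under stated assumptions: $Z$ is drawn per feature dimension by copying the value of a randomly chosen node (one simple reading of "empirical distribution"), the mask $F$ is given rather than learned, and the budget on the number of kept features is not enforced. The function names are hypothetical.

```python
import random
from typing import List


def sample_empirical(feature_rows: List[List[float]], rng: random.Random) -> List[float]:
    """Draw Z: for each feature dimension, copy the value of a randomly chosen node."""
    d = len(feature_rows[0])
    return [rng.choice(feature_rows)[k] for k in range(d)]


def masked_features(x: List[float], feature_mask: List[float], z: List[float]) -> List[float]:
    """Compute Z + (X_S - Z) * F element-wise for one node's feature vector."""
    return [zk + (xk - zk) * fk for xk, fk, zk in zip(x, feature_mask, z)]


rows = [[1.0, 10.0], [2.0, 20.0], [3.0, 30.0]]  # features of three nodes
rng = random.Random(0)
z = sample_empirical(rows, rng)

# Keep feature 0 of the first node, replace feature 1 by a sampled value.
mixed = masked_features(rows[0], [1.0, 0.0], z)
```

Where the mask is 1 the original feature value passes through unchanged, and where it is 0 the value is replaced by a sample, so a feature that happens to be near zero is not automatically treated as unimportant.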

Integrating additional constraints into explanations. To impose further properties on the explanation, we can extend GnnExplainer’s objective function in Eq. (7) with regularization terms. For example, we use element-wise entropy to encourage structural and node feature masks to be discrete. Further, GnnExplainer can encode domain-specific constraints through techniques like Laplacian regularization, equivalent to the Lagrange multiplier of constraints. Finally, it is important to note that each explanation must be a valid computation graph. In particular, explanation $(G_S, X_S)$ needs to allow the GNN’s neural messages to flow towards node $v$ such that the GNN can make prediction $\hat{y}$. Importantly, GnnExplainer automatically provides explanations that represent valid computation graphs because it optimizes structural masks across entire computation graphs. Even if a disconnected edge is important for neural message-passing, it will not be selected for explanation as it cannot influence the GNN’s prediction. In effect, this implies that $G_S$ tends to be a small connected subgraph.
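
As a small illustration of the element-wise entropy regularizer mentioned above, the following sketch computes the mean binary entropy of a mask with values in $[0,1]$. The clipping constant `eps` and the use of a mean rather than a sum are assumptions made here; the function name `mask_entropy` is hypothetical.

```python
import math
from typing import List


def mask_entropy(mask: List[float], eps: float = 1e-12) -> float:
    """Mean element-wise binary entropy of mask values in [0, 1]."""
    total = 0.0
    for m in mask:
        m = min(max(m, eps), 1.0 - eps)  # clip to avoid log(0)
        total += -m * math.log(m) - (1.0 - m) * math.log(1.0 - m)
    return total / len(mask)


undecided = mask_entropy([0.5, 0.5])   # largest possible value, log(2)
discrete = mask_entropy([0.0, 1.0])    # close to zero
```

Adding this term to the objective penalizes mask entries that hover around 0.5, which nudges the learned masks towards values near 0 or 1.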

4.3 Multi-instance explanations through graph prototypes

The output of a single-instance explanation (Sections 4.1 and 4.2) is a small subgraph of the input graph and a small subset of associated node features that are most influential for a single prediction. To answer questions like “How did a GNN predict that a given set of nodes all have label $c$?”, we need to obtain a global explanation of class $c$. Our goal here is to provide insight into how the identified subgraph for a particular node relates to a graph structure that explains an entire class.

To this end, GnnExplainer provides multi-instance explanations that are based on graph alignments and prototypes. For a given class $c$ (or any set of predictions that we want to explain), we first choose a reference node $v_c$, for example, by computing the mean embedding of all nodes assigned to $c$. We then take the explanation $G_S(v_c)$ for reference $v_c$ and align it to the explanations of other nodes assigned to class $c$. Technically, we use relaxed graph matching to find correspondences between nodes in the reference subgraph $G_S(v_c)$ and subgraphs of other nodes Ying et al. (2018b) (see Appendix for details). Finding an optimal matching of large graphs is challenging in practice. However, the single-instance explainer generates small graphs (Section 4.2), and thus near-optimal pairwise graph matchings can be efficiently computed. This procedure gives us adjacency matrices for all nodes predicted to belong to $c$, where the matrices are aligned with respect to the node ordering in the adjacency matrix of the reference node $v_c$. Finally, we can aggregate the aligned adjacency matrices into a graph prototype $A_{\text{proto}}$ using, for example, a robust median-based approach. Prototype $A_{\text{proto}}$ gives insights into graph patterns shared between nodes that belong to the same class. One can then study the prediction for a particular node by comparing the explanation for that node’s prediction (i.e., returned by the single-instance explanation approach) to the prototype (see Appendix for more information and experiments).
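
The final aggregation step can be sketched as an element-wise median over adjacency matrices. This sketch assumes the alignment to the reference node ordering has already been done (the graph matching itself is not shown) and that all matrices have the same size; the function name `prototype` is hypothetical and the median is only one possible reading of "a robust median-based approach".

```python
from statistics import median
from typing import List

Matrix = List[List[int]]


def prototype(aligned_adjs: List[Matrix]) -> Matrix:
    """Element-wise median of equally sized, already aligned adjacency matrices."""
    n = len(aligned_adjs[0])
    return [[median(a[r][c] for a in aligned_adjs) for c in range(n)] for r in range(n)]


# Three aligned explanations on the same three node slots.
path_graph = [[0, 1, 0], [1, 0, 1], [0, 1, 0]]  # edges (0,1), (1,2)
triangle = [[0, 1, 1], [1, 0, 1], [1, 1, 0]]    # adds edge (0,2)
single = [[0, 1, 0], [1, 0, 0], [0, 0, 0]]      # only edge (0,1)

proto = prototype([path_graph, triangle, single])
```

An edge survives in the prototype only if it appears in a majority of the aligned explanations, so the extra edge in one explanation and the missing edge in another are both smoothed away.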

4.4 GnnExplainer model extensions

Any machine learning task on graphs. In addition to explaining node classification, GnnExplainer provides explanations for link prediction and graph classification with no change to its optimization algorithm. When predicting a link $(v_j, v_k)$, GnnExplainer learns two masks $X_S(v_j)$ and $X_S(v_k)$ for both endpoints of the link. When classifying a graph, the adjacency matrix in Eq. (5) is the union of adjacency matrices for all nodes in the graph whose label we want to explain.

Any GNN model. Modern GNNs are based on message passing architectures on the input graph. The message passing computation graphs can be composed in many different ways and GnnExplainer can account for all of them. Thus, GnnExplainer can be applied to: Graph Convolutional Networks Kipf and Welling (2016), Gated Graph Sequence Neural Networks Li et al. (2015), Jumping Knowledge Networks Xu et al. (2018), Attention Networks Velickovic et al. (2018), Graph Networks Battaglia et al. (2018), GNNs with various node aggregation schemes Chen et al. (2018c, a); Huang et al. (2018); Hamilton et al. (2017); Ying et al. (2018b, a); Xu et al. (2019), Line-Graph NNs Chen et al. (2019), position-aware GNN You et al. (2019), and many other GNN architectures.

Computational complexity. The number of parameters in GnnExplainer’s optimization depends on the size of the computation graph $G_c$ for the node $v$ whose prediction we aim to explain. In particular, the size of $G_c(v)$’s adjacency matrix $A_c(v)$ is equal to the size of the mask $M$, which needs to be learned by GnnExplainer. However, since computation graphs are typically relatively small compared to the size of exhaustive $L$-hop neighborhoods (e.g., 2-3 hop neighborhoods Kipf and Welling (2016), sampling-based neighborhoods Ying et al. (2018a), neighborhoods with attention Velickovic et al. (2018)), GnnExplainer can effectively generate explanations even when input graphs are large.

5 Experiments

We begin by describing the graphs, baselines, and experimental setup. We then present experiments on explaining GNNs for node classification and graph classification tasks. Our qualitative and quantitative analysis demonstrates that GnnExplainer is accurate and effective in identifying explanations, both in terms of graph structure and node features.

Synthetic datasets. We construct four kinds of node classification datasets (Table 1). (1) In BA-Shapes, we start with a base Barabási-Albert (BA) graph on 300 nodes and a set of 80 five-node house-structured network motifs, which are attached to randomly selected nodes of the base graph. The resulting graph is further perturbed by adding random edges. Nodes are assigned to 4 classes based on their structural roles. (2) BA-Community dataset is a union of two BA-Shapes graphs. Nodes have normally distributed feature vectors and are assigned to one of 8 classes based on their structural roles and community memberships. (3) In Tree-Cycles, we start with a base 8-level balanced binary tree and 80 six-node cycle motifs, which are attached to random nodes of the base graph. (4) Tree-Grid is the same as Tree-Cycles except that 3-by-3 grid motifs are attached to the base tree graph in place of cycle motifs.
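
A rough sketch of how house motifs might be attached to a base graph is shown below. Several details are assumptions and not taken from the text: the base graph is a simple ring rather than a Barabási-Albert graph, the house is wired as a four-node square plus a roof node, each house is attached through one bottom corner, the role ids (0 = base, 1 = middle, 2 = bottom, 3 = top) are hypothetical, and the random-edge perturbation step is omitted.

```python
import random
from typing import List, Tuple

Edge = Tuple[int, int]

# Five-node house: square 0-1-2-3 plus roof node 4 connected to nodes 0 and 1.
HOUSE_EDGES: List[Edge] = [(0, 1), (1, 2), (2, 3), (3, 0), (0, 4), (1, 4)]
HOUSE_ROLES: List[int] = [1, 1, 2, 2, 3]  # hypothetical ids: 1 middle, 2 bottom, 3 top


def attach_houses(
    base_edges: List[Edge], num_base_nodes: int, num_houses: int, rng: random.Random
) -> Tuple[List[Edge], List[int]]:
    """Attach house motifs to random base nodes; return all edges and per-node role labels."""
    edges = list(base_edges)
    roles = [0] * num_base_nodes  # role 0: node of the base graph
    for _ in range(num_houses):
        offset = len(roles)  # id of the first node of the new house
        edges += [(offset + u, offset + v) for u, v in HOUSE_EDGES]
        roles += HOUSE_ROLES
        anchor = rng.randrange(num_base_nodes)
        edges.append((anchor, offset + 2))  # link a bottom corner to the base graph
    return edges, roles


ring = [(i, (i + 1) % 10) for i in range(10)]  # stand-in base graph on 10 nodes
edges, roles = attach_houses(ring, 10, 3, random.Random(0))
```

With this labeling there are four structural roles in total, which matches the number of classes described for BA-Shapes, and the motif edges double as the ground-truth explanation for nodes inside a house.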

Real-world datasets. We consider two graph classification datasets: (1) Mutag is a dataset of molecule graphs labeled according to their mutagenic effect on the Gram-negative bacterium S. typhimurium Debnath and others (1991). (2) Reddit-Binary is a dataset of graphs, each representing an online discussion thread on Reddit. In each graph, nodes are users participating in a thread, and edges indicate that one user replied to another user’s comment. Graphs are labeled according to the type of user interactions in the thread: r/IAmA and r/AskReddit contain Question-Answer interactions, while r/TrollXChromosomes and r/atheism contain Online-Discussion interactions Yanardag and Vishwanathan (2015).

Table 1: Illustration of synthetic datasets (see “Synthetic datasets” for details) together with performance evaluation of GnnExplainer and baseline explainability methods.

Baselines. Many explainability methods cannot be directly applied to graphs (see Section 2). Nevertheless, we here consider the following baselines that can provide insights into predictions made by GNNs: (1) Grad is a gradient-based method. We compute gradient of the GNN’s loss function with respect to the adjacency matrix and the associated node features, similar to a saliency map approach. (2) Att is a graph attention GNN (GAT) Velickovic et al. (2018) that learns attention weights for edges in the computation graph, which we use as a proxy measure of edge importance. While Att does consider graph structure, it does not use node features and can only explain GAT models, not all GNNs. Furthermore, in Att it is not obvious which attention weights need to be used for edge importance, since a 1-hop neighbor of a node can also be a 2-hop neighbor of the same node due to cycles. Each edge’s importance is thus computed as the average attention weight across all layers.

Setup and implementation details. For each dataset, we first train a single GNN and use Grad and GnnExplainer to explain the predictions made by the GNN. Note that the Att baseline requires using a graph attention architecture like GAT Velickovic et al. (2018). We thus train a separate GAT model on the same dataset and use the learned edge attention weights for explanation. Hyperparameters $K_M$ and $K_F$ control the size of the subgraph and feature explanations, respectively, and are informed by prior knowledge about the dataset. For synthetic datasets, we set $K_M$ to be the size of the ground truth. On real-world datasets, we set $K_M = 10$. We set $K_F = 5$ for all datasets. We further fix our weight regularization hyperparameters across all node and graph classification experiments. We refer readers to the Appendix for more training details. Datasets, source code for GnnExplainer, and baselines will be made public at publication time.

Results. We investigate questions: Does GnnExplainer provide sensible explanations? How do explanations compare to the ground-truth knowledge? How does GnnExplainer perform on various graph-based prediction tasks? Can it explain predictions made by different GNNs?

Figure 3: Evaluation of single-instance explanations. A-B. Shown are exemplar explanation subgraphs for node classification task on four synthetic datasets. Each method provides explanation for the red node’s prediction.
Figure 4: Evaluation of single-instance explanations. A-B. Shown are exemplar explanation subgraphs for graph classification task on two datasets, Mutag and Reddit-Binary.

1) Quantitative analyses. Results on node classification datasets are shown in Table 1. We have ground-truth explanations for synthetic datasets and we use them to calculate explanation accuracy for all explanation methods. Specifically, we formalize the explanation problem as a binary classification task, where edges in the ground-truth explanation are treated as labels and importance weights given by explainability method are viewed as prediction scores. A better explainability method predicts high scores for edges that are in the ground-truth explanation, and thus achieves higher explanation accuracy. Results show that GnnExplainer outperforms baselines by 17.1% on average. Further, GnnExplainer achieves up to 43.0% higher accuracy on the hardest Tree-Grid dataset.
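
The evaluation described above treats edge importance as a binary classification score. One common way to summarize such scores is the area under the ROC curve; the text does not state which metric is used, so the choice of ROC AUC here is an assumption, and the function name `roc_auc` is hypothetical. The sketch computes it directly as the fraction of (ground-truth edge, other edge) pairs that are ranked correctly.

```python
from typing import List


def roc_auc(labels: List[int], scores: List[float]) -> float:
    """Probability that a random ground-truth edge outscores a random other edge (ties count 1/2)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one ground-truth edge and one other edge")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# 1 marks edges in the ground-truth motif, 0 marks all other edges.
labels = [1, 1, 0, 0]
perfect = roc_auc(labels, [0.9, 0.8, 0.3, 0.1])
partial = roc_auc(labels, [0.9, 0.2, 0.3, 0.1])
```

A score of 1.0 means every motif edge is ranked above every non-motif edge, while 0.5 corresponds to a random ranking.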

2) Qualitative analyses. Results are shown in Figures 3-5. In a topology-based prediction task with no node features, e.g., BA-Shapes and Tree-Cycles, GnnExplainer correctly identifies network motifs that explain node labels, i.e., structural labels (Figure 3). As illustrated in the figures, house, cycle and tree motifs are identified by GnnExplainer but not by baseline methods. In Figure 4, we investigate explanations for the graph classification task. In the Mutag example, colors indicate node features, which represent atoms (hydrogen H, carbon C, etc.). GnnExplainer correctly identifies the carbon ring as well as the chemical groups $NH_2$ and $NO_2$, which are known to be mutagenic Debnath and others (1991).

Further, in the Reddit-Binary example, we see that Question-Answer graphs (2nd row in Figure 4B) have 2-3 high-degree nodes that simultaneously connect to many low-degree nodes, which makes sense because in QA threads on Reddit we typically have 2-3 experts who all answer many different questions Kumar et al. (2018). Conversely, we observe that discussion patterns commonly exhibit tree-like patterns (2nd row in Figure 4A), since a thread on Reddit is usually a reaction to a single topic Kumar et al. (2018). On the other hand, Grad and Att methods give incorrect or incomplete explanations. For example, both baseline methods miss cycle motifs in the Mutag dataset and more complex grid motifs in the Tree-Grid dataset. Furthermore, although edge attention weights in Att can be interpreted as importance scores for message passing, the weights are shared across all nodes in the input graph, and as such Att fails to provide high-quality single-instance explanations.

An essential criterion for explanations is that they must be interpretable, i.e., provide a qualitative understanding of the relationship between the input nodes and the prediction. Such a requirement implies that explanations should be easy to understand while remaining exhaustive. This means that a GNN explainer should take into account both the structure of the underlying graph as well as the associated features when they are available. Figure 5 shows results of an experiment in which GnnExplainer jointly considers structural information as well as information from a small number of feature dimensions. (Footnote 2: Feature explanations are shown for the two datasets with node features, i.e., Mutag and BA-Community.) While GnnExplainer indeed highlights a compact feature representation in Figure 5, gradient-based approaches struggle to cope with the added noise, giving high importance scores to irrelevant feature dimensions.

Further experiments on multi-instance explanations using graph prototypes are in Appendix.

Figure 5: Visualization of features that are important for a GNN’s prediction. A. Shown is a representative molecular graph from Mutag dataset (top). Importance of the associated graph features is visualized with a heatmap (bottom). In contrast with baselines, GnnExplainer correctly identifies features that are important for predicting the molecule’s mutagenicity, i.e. C, O, H, and N atoms. B. Shown is a computation graph of a red node from BA-Community dataset (top). Again, GnnExplainer successfully identifies the node feature that is important for predicting the structural role of the node but baseline methods fail.

6 Conclusion

In this paper, we present GnnExplainer, a novel method for explaining predictions of any GNN on any graph-based machine learning task without requiring modification of the underlying GNN architecture or re-training. We show how GnnExplainer can leverage the recursive neighborhood-aggregation scheme of graph neural networks to identify important graph pathways as well as highlight relevant node feature information that is passed along edges of the pathways. While the problem of explainability of machine-learning predictions has received substantial attention in recent literature, our work is unique in the sense that it presents an approach that operates on relational structures (graphs with rich node features) and provides a straightforward interface for making sense of GNN predictions, debugging GNN models, and identifying systematic patterns of mistakes.

References


  • [1] A. Adadi and M. Berrada (2018) Peeking Inside the Black-Box: A Survey on Explainable Artificial Intelligence (XAI). IEEE Access 6, pp. 52138–52160. External Links: ISSN 2169-3536 Cited by: §2.
  • [2] J. Adebayo, J. Gilmer, M. Muelly, I. Goodfellow, M. Hardt, and B. Kim (2018) Sanity checks for saliency maps. In NeurIPS, Cited by: §2.
  • [3] M. G. Augasta and T. Kathirvalavakumar (2012) Reverse Engineering the Neural Networks for Rule Extraction in Classification Problems. Neural Processing Letters 35 (2), pp. 131–150. External Links: ISSN 1573-773X Cited by: §2.
  • [4] P. W. Battaglia, J. B. Hamrick, V. Bapst, A. Sanchez-Gonzalez, V. Zambaldi, M. Malinowski, A. Tacchetti, D. Raposo, A. Santoro, R. Faulkner, et al. (2018) Relational inductive biases, deep learning, and graph networks. arXiv:1806.01261. Cited by: §3.1, §4.4.
  • [5] J. Chen, J. Zhu, and L. Song (2018) Stochastic training of graph convolutional networks with variance reduction. In ICML, Cited by: §4.4.
  • [6] J. Chen, L. Song, M. J. Wainwright, and M. I. Jordan (2018) Learning to explain: an information-theoretic perspective on model interpretation. arXiv preprint arXiv:1802.07814. Cited by: §1, §2.
  • [7] J. Chen, T. Ma, and C. Xiao (2018) FastGCN: fast learning with graph convolutional networks via importance sampling. In ICLR, Cited by: §4.4.
  • [8] Z. Chen, L. Li, and J. Bruna (2019) Supervised community detection with line graph neural networks. In ICLR, Cited by: §4.4.
  • [9] E. Cho, S. Myers, and J. Leskovec (2011) Friendship and mobility: user movement in location-based social networks. In KDD, Cited by: §1.
  • [10] A. Debnath et al. (1991) Structure-activity relationship of mutagenic aromatic and heteroaromatic nitro compounds. correlation with molecular orbital energies and hydrophobicity. Journal of Medicinal Chemistry 34 (2), pp. 786–797. Cited by: §2, §5, §5.
  • [11] F. Doshi-Velez and B. Kim (2017) Towards A Rigorous Science of Interpretable Machine Learning. arXiv:1702.08608. Cited by: §1.
  • [12] D. Duvenaud et al. (2015) Convolutional networks on graphs for learning molecular fingerprints. In NIPS, Cited by: §2.
  • [13] D. Erhan, Y. Bengio, A. Courville, and P. Vincent (2009) Visualizing higher-layer features of a deep network. University of Montreal 1341 (3), pp. 1. Cited by: §1, §2.
  • [14] A. Fisher, C. Rudin, and F. Dominici (2018) All Models are Wrong but many are Useful: Variable Importance for Black-Box, Proprietary, or Misspecified Prediction Models, using Model Class Reliance. arXiv:1801.01489. Cited by: §2.
  • [15] R. Guidotti et al. (2018) A Survey of Methods for Explaining Black Box Models. ACM Comput. Surv. 51 (5), pp. 93:1–93:42. Cited by: §2.
  • [16] W. Hamilton, Z. Ying, and J. Leskovec (2017) Inductive representation learning on large graphs. In NIPS, Cited by: §1, §3.1, §4.4.
  • [17] G. Hooker (2004) Discovering additive structure in black box functions. In KDD, Cited by: §2.
  • [18] W.B. Huang, T. Zhang, Y. Rong, and J. Huang (2018) Adaptive sampling towards fast graph representation learning. In NeurIPS, Cited by: §4.4.
  • [19] B. Kang, J. Lijffijt, and T. De Bie (2019) ExplaiNE: an approach for explaining network embedding-based link predictions. arXiv:1904.12694. Cited by: §2.
  • [20] D. P. Kingma and M. Welling (2013) Auto-encoding variational bayes. NeurIPS. Cited by: §4.2.
  • [21] T. N. Kipf and M. Welling (2016) Semi-supervised classification with graph convolutional networks. In ICLR, Cited by: §1, §4.4, §4.4.
  • [22] P. W. Koh and P. Liang (2017) Understanding black-box predictions via influence functions. In ICML, Cited by: §1, §2.
  • [23] S. Kumar, W. L. Hamilton, J. Leskovec, and D. Jurafsky (2018) Community interaction and conflict on the web. In WWW, pp. 933–943. Cited by: §5.
  • [24] H. Lakkaraju, E. Kamar, R. Caruana, and J. Leskovec (2017) Interpretable & Explorable Approximations of Black Box Models. Cited by: §1, §2.
  • [25] Y. Li, D. Tarlow, M. Brockschmidt, and R. Zemel (2015) Gated graph sequence neural networks. arXiv:1511.05493. Cited by: §4.4.
  • [26] S. Lundberg and S. Lee (2017) A Unified Approach to Interpreting Model Predictions. In NIPS, Cited by: §1.
  • [27] D. Neil et al. (2018) Interpretable Graph Convolutional Neural Networks for Inference on Noisy Knowledge Graphs. In ML4H Workshop at NeurIPS, Cited by: §2.
  • [28] M. Ribeiro, S. Singh, and C. Guestrin (2016) Why should i trust you?: explaining the predictions of any classifier. In KDD, Cited by: §1, §2.
  • [29] G. J. Schmitz, C. Aldrich, and F. S. Gouws (1999) ANN-DT: an algorithm for extraction of decision trees from artificial neural networks. IEEE Transactions on Neural Networks. Cited by: §1.
  • [30] A. Shrikumar, P. Greenside, and A. Kundaje (2017) Learning Important Features Through Propagating Activation Differences. In ICML, Cited by: §2.
  • [31] M. Sundararajan, A. Taly, and Q. Yan (2017) Axiomatic Attribution for Deep Networks. In ICML, Cited by: §1, §2.
  • [32] P. Velickovic, G. Cucurull, A. Casanova, A. Romero, P. Liò, and Y. Bengio (2018) Graph attention networks. In ICLR, Cited by: §2, §4.4, §4.4, §5, §5.
  • [33] T. Xie and J. Grossman (2018) Crystal graph convolutional neural networks for an accurate and interpretable prediction of material properties. In Phys. Rev. Lett., Cited by: §2.
  • [34] K. Xu, W. Hu, J. Leskovec, and S. Jegelka (2019) How powerful are graph neural networks?. In ICLR, Cited by: §3.1, §4.4.
  • [35] K. Xu, C. Li, Y. Tian, T. Sonobe, K. Kawarabayashi, and S. Jegelka (2018) Representation learning on graphs with jumping knowledge networks. In ICML, Cited by: §4.4.
  • [36] P. Yanardag and S. Vishwanathan (2015) Deep graph kernels. In KDD, pp. 1365–1374. Cited by: §5.
  • [37] C. Yeh, J. Kim, I. Yen, and P. Ravikumar (2018) Representer point selection for explaining deep neural networks. In NeurIPS, Cited by: §1, §2.
  • [38] R. Ying, R. He, K. Chen, P. Eksombatchai, W. Hamilton, and J. Leskovec (2018) Graph convolutional neural networks for web-scale recommender systems. In KDD, Cited by: §4.4, §4.4.
  • [39] Z. Ying, J. You, C. Morris, X. Ren, W. Hamilton, and J. Leskovec (2018) Hierarchical graph representation learning with differentiable pooling. In NeurIPS, Cited by: §1, §4.1, §4.3, §4.4.
  • [40] J. You, B. Liu, R. Ying, V. Pande, and J. Leskovec (2018) Graph convolutional policy network for goal-directed molecular graph generation. Cited by: §1.
  • [41] J. You, R. Ying, and J. Leskovec (2019) Position-aware graph neural networks. In ICML, Cited by: §4.4.
  • [42] M. Zeiler and R. Fergus (2014) Visualizing and Understanding Convolutional Networks. In ECCV, Cited by: §2.
  • [43] M. Zhang and Y. Chen (2018) Link prediction based on graph neural networks. In NIPS, Cited by: §1.
  • [44] Z. Zhang, P. Cui, and W. Zhu (2018) Deep Learning on Graphs: A Survey. arXiv:1812.04202. Cited by: §1, §3.1.
  • [45] J. Zhou, G. Cui, Z. Zhang, C. Yang, Z. Liu, and M. Sun (2018) Graph Neural Networks: A Review of Methods and Applications. arXiv:1812.08434. Cited by: §1, §3.1.
  • [46] J. Zilke, E. Loza Mencia, and F. Janssen (2016) DeepRED - Rule Extraction from Deep Neural Networks. In Discovery Science, Cited by: §2.
  • [47] L. Zintgraf, T. Cohen, T. Adel, and M. Welling (2017) Visualizing deep neural network decisions: prediction difference analysis. In ICLR, Cited by: §4.2.
  • [48] M. Zitnik, M. Agrawal, and J. Leskovec (2018) Modeling polypharmacy side effects with graph convolutional networks. Bioinformatics 34. Cited by: §1.