Adversarial Attack on Graph Structured Data
Abstract
Deep learning on graph structures has shown exciting results in various applications. However, little attention has been paid to the robustness of such models, in contrast to the extensive research on adversarial attack and defense for images and text. In this paper, we focus on adversarial attacks that fool the model by modifying the combinatorial structure of the data. We first propose a reinforcement learning based attack method that learns a generalizable attack policy while requiring only prediction labels from the target classifier. Variants of genetic algorithms and gradient methods are also presented for scenarios where prediction confidence or gradients are available. We use both synthetic and real-world data to show that a family of Graph Neural Network models is vulnerable to these attacks, in both graph-level and node-level classification tasks. We also show that such attacks can be used to diagnose the learned classifiers.
1 Introduction
Graph structure plays an important role in many real-world applications. Representation learning on structured data with deep learning methods has shown promising results in various applications, including drug screening (Duvenaud et al., 2015), protein analysis (Hamilton et al., 2017), and knowledge graph completion (Trivedi et al., 2017).
Despite the success of deep graph networks, the lack of interpretability and robustness of these models makes them risky for some financial or security related applications. As analyzed in Akoglu et al. (2015), graph information has proven important in the area of risk management. A graph-sensitive evaluation model will typically take the user-user relationship into consideration: a user who connects with many high-credit users may also have high credit. Such heuristics learned by deep graph methods often yield good predictions, but can also put the model at risk. A criminal could try to disguise himself by connecting with other people on Facebook or LinkedIn. Such an ‘attack’ on the credit prediction model is quite cheap, but the consequences could be severe. Due to the large number of transactions happening every day, even if only one-millionth of the transactions are fraudulent, fraudsters can still obtain a huge benefit. However, little attention has been paid to domains involving graph structures, despite the recent advances in adversarial attacks and defenses for other domains like images (Goodfellow et al., 2014) and text (Jia & Liang, 2017).
In this paper, we therefore focus on graph adversarial attacks for a set of graph neural network (GNN) (Scarselli et al., 2009) models. These are a family of supervised (Dai et al., 2016) models that have achieved state-of-the-art results in many transductive tasks (Kipf & Welling, 2016) and inductive tasks (Hamilton et al., 2017). Through experiments on both node classification and graph classification problems, we show that adversarial samples do exist for such models, and that GNN models can be quite vulnerable to such attacks.
However, effectively attacking graph structures is a nontrivial problem. Unlike images, where the data is continuous, graphs are discrete. The combinatorial nature of graph structures also makes the problem much more difficult than for text. Inspired by the recent advances in combinatorial optimization (Bello et al., 2016; Dai et al., 2017), we propose a reinforcement learning based attack method that learns to modify the graph structure with only the prediction feedback from the target classifier. The modification is done by sequentially adding or dropping edges from the graph. A hierarchical method is also used to decompose the quadratic action space, in order to make the training feasible. Figure 1 illustrates this approach. We show that such a learned agent can also propose adversarial attacks for new instances without access to the classifier.
Several different adversarial attack settings are considered in our paper. When more information from the target classifier is accessible, a variant of the gradient based method and a genetic algorithm based method are also presented. Here we mainly focus on the following three settings:

white box attack (WBA): in this case, the attacker is allowed to access any information of the target classifier, including the prediction, gradient information, etc.

practical black box attack (PBA): in this case, only the prediction of the target classifier is available. When the prediction confidence is accessible, we denote this setting as PBAC; if only the discrete prediction label is allowed, we denote the setting as PBAD.

restricted black box attack (RBA): this setting is one step further than PBA. In this case, we can only do black-box queries on some of the samples, and the attacker is asked to create adversarial modifications to other samples.
As we can see, regarding the amount of information the attacker can obtain from the target classifier, we can sort the above settings as WBA > PBAC > PBAD > RBA. For simplicity, we focus on the non-targeted attack, though it is easy to extend to the targeted attack scenario.
In Sec 2, we first present the background about GNNs and two supervised learning tasks. Then in Sec 3 we formally define the graph adversarial attack problem. Sec 3.1 presents the attack method RLS2V that learns a generalizable attack policy over the graph structure. We also propose other attack methods with different levels of access to the target classifier in Sec 3.2. We experimentally show the vulnerability of GNN models in Sec 4, and also present a way of defending against such attacks.
2 Background
A set of graphs is denoted by $\mathcal{G} = \{G_i\}_{i=1}^{N}$, where $|\mathcal{G}| = N$. Each graph $G_i = (V_i, E_i)$ is represented by the set of nodes $V_i$ and edges $E_i$. Here the tuple $(u, v) \in V_i \times V_i$ represents the edge between node $u$ and $v$. In this paper, we focus on undirected graphs, but it is straightforward to extend to directed ones. Optionally, the nodes or edges can have associated features. We denote them as $x(v)$ and $w(u, v)$, respectively.
This paper works on attacking the graph supervised classification algorithms. Here two different supervised learning settings are considered:
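Inductive Graph Classification: We associate each graph $G_i \in \mathcal{G}$ with a label $y_i \in \mathcal{Y} = \{1, 2, \ldots, Y\}$, where $Y$ is the number of categories. The dataset $\mathcal{D}^{(G)} = \{(G_i, y_i)\}_{i=1}^{N}$ is represented by pairs of graph instances and graph labels. This setting is inductive since the test instances will never be seen during training. Examples of such tasks include classifying drug molecule graphs according to their functionality. In this case, the classifier $f^{(G)}: \mathcal{G} \to \mathcal{Y}$ is optimized to minimize the following loss:
(1) $\mathcal{L}^{(G)} = \frac{1}{N} \sum_{i=1}^{N} L\big(f^{(G)}(G_i), y_i\big)$
where $L(\cdot, \cdot)$ is the cross entropy by default.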
Transductive Node Classification: In the node classification setting, a target node $c_i \in V_i$ of graph $G_i$ is associated with a corresponding node label $y_i \in \mathcal{Y}$. The classification is on the nodes, instead of the entire graph. We here focus on the transductive setting, where only a single graph $G_0 = (V_0, E_0)$ is considered in the entire dataset. That is to say, $G_i = G_0$ for all $i$. It is transductive since test nodes (but not their labels) are also observed during training. Examples in this case include classifying papers in a citation database like Citeseer, or entities in a social network like Facebook. Here the dataset is represented as $\mathcal{D}^{(N)} = \{(G_0, c_i, y_i)\}_{i=1}^{M}$, and the classifier $f^{(N)}$ minimizes the following loss:
(2) $\mathcal{L}^{(N)} = \frac{1}{M} \sum_{i=1}^{M} L\big(f^{(N)}(G_0, c_i), y_i\big)$
When not causing confusion, we will overload the notation $\mathcal{D} = \{(G_i, c_i, y_i)\}_{i=1}^{N}$ to represent the dataset, and $f$ to represent the classifier, in either setting. In this case, $c_i$ is implicitly omitted in the inductive graph classification setting, while in the transductive node classification setting, $G_i$ always refers to $G_0$ implicitly.
GNN family models
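Graph Neural Networks (GNNs) define a general architecture for neural networks on a graph $G = (V, E)$. This architecture obtains the vector representation of nodes through an iterative process:
(3) $\mu_v^{(k)} = h^{(k)}\Big(\big\{w(u, v), x(u), \mu_u^{(k-1)}\big\}_{u \in \mathcal{N}(v)},\; x(v),\; \mu_v^{(k-1)}\Big), \quad k \in \{1, 2, \ldots, K\}$
where $\mathcal{N}(v)$ specifies the neighborhood of node $v \in V$. The initial node embedding $\mu_v^{(0)}$ is set to zero. For simplicity, we denote the outcome node embedding as $\mu_v = \mu_v^{(K)}$. To obtain the graph-level embedding from node embeddings, a global pooling is applied over the node embeddings.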
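As an illustration of the iterative embedding process described above, here is a minimal sketch in Python with NumPy. It assumes one particular choice of update function (a ReLU over a linear map of the node feature plus the sum of neighbor embeddings) and sum pooling; the function names `gnn_embed` and `graph_embed` and the weight arguments are hypothetical, not taken from the paper.

```python
import numpy as np

def gnn_embed(adj, x, w_self, w_nbr, num_steps):
    """Run `num_steps` rounds of neighbor aggregation.

    adj:    (n, n) binary adjacency matrix
    x:      (n, f) node features
    w_self: (d, f) weight applied to the node's own feature (assumed form)
    w_nbr:  (d, d) weight applied to the summed neighbor embeddings (assumed form)
    """
    n = adj.shape[0]
    d = w_self.shape[0]
    mu = np.zeros((n, d))                 # initial embeddings are set to zero
    for _ in range(num_steps):
        agg = adj @ mu                    # sum of neighbor embeddings per node
        mu = np.maximum(0.0, x @ w_self.T + agg @ w_nbr.T)
    return mu

def graph_embed(mu):
    """Global sum pooling over node embeddings (one possible pooling choice)."""
    return mu.sum(axis=0)
```

With `num_steps` propagation rounds, each node embedding can only depend on nodes at most `num_steps` hops away, which is why the number of propagation steps matters for tasks that depend on long-range structure.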
3 Graph adversarial attack
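Given a learned classifier $f$ and an instance $(G, c, y)$ from the dataset $\mathcal{D}$, the graph adversarial attacker $g$ asks to modify the graph $G = (V, E)$ into $\tilde{G} = (\tilde{V}, \tilde{E})$, such that
(4) $\max_{\tilde{G}} \; \mathbb{I}\big(f(\tilde{G}, c) \neq y\big) \quad \text{s.t.} \quad \tilde{G} = g\big(f, (G, c, y)\big), \;\; \mathcal{I}(G, \tilde{G}, c) = 1.$
Here $\mathcal{I}(\cdot, \cdot, \cdot) \in \{0, 1\}$ is an equivalency indicator that tells whether two graphs $G$ and $\tilde{G}$ are equivalent under the classification semantics.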
In this paper, we focus on modifications to the discrete structure. The attacker $g$ is allowed to add or delete edges from $G$ to construct the new graph. Such actions are rich enough, since adding or deleting nodes can be performed by a series of modifications to the edges. Modifying the edges is also harder than modifying the nodes, since choosing a node only requires $O(|V|)$ complexity, while naively choosing an edge requires $O(|V|^2)$.
Since the attacker aims at fooling the classifier $f$, instead of actually changing the true label of the instance, the equivalency indicator should be defined first to restrict the modifications an attacker can perform. We use two ways to define the equivalency indicator:

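Explicit semantics. In this case, a gold standard classifier $f^*$ is assumed to be accessible. Thus the equivalency indicator is defined as:
(5) $\mathcal{I}(G, \tilde{G}, c) = \mathbb{I}\big(f^*(G, c) = f^*(\tilde{G}, c)\big)$
where $\mathbb{I}(\cdot) \in \{0, 1\}$ is an indicator function.

Small modifications. In many cases when explicit semantics is unknown, we will ask the attacker to make as few modifications as possible within a neighborhood graph:
(6) $\mathcal{I}(G, \tilde{G}, c) = \mathbb{I}\big(|(E \setminus \tilde{E}) \cup (\tilde{E} \setminus E)| \le m\big) \cdot \mathbb{I}\big(\tilde{E} \subseteq \mathcal{N}(G, b)\big)$
In the above equation, $m$ is the maximum number of edges that are allowed to be modified, and $\mathcal{N}(G, b) = \{(u, v) : u, v \in V,\ d^{(G)}(u, v) \le b\}$ defines the $b$-hop neighborhood graph, where $d^{(G)}(u, v)$ is the distance between two nodes in graph $G$.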
Take friendship networks as an example: a suspicious behavior would be adding or deleting many friends in a short period, or creating a friendship with someone who doesn’t share any common friend. The “small modification” constraint rules out both behaviors, so as to regulate the behavior of the attacker $g$. With either of the two realizations of the equivalency indicator $\mathcal{I}$, it is easy to constrain the attacker: each time an invalid modification is proposed, the move can simply be ignored.
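The “small modifications” check can be sketched as follows. This is a minimal illustration, assuming undirected graphs given as edge lists, an inclusive edit budget, and hop distances measured in the original graph; the function names and arguments are hypothetical.

```python
from collections import deque

def hop_distance(edges, src, dst):
    """Breadth-first search distance between two nodes (None if unreachable)."""
    nbrs = {}
    for u, v in edges:
        nbrs.setdefault(u, set()).add(v)
        nbrs.setdefault(v, set()).add(u)
    seen, queue = {src}, deque([(src, 0)])
    while queue:
        node, dist = queue.popleft()
        if node == dst:
            return dist
        for nxt in nbrs.get(node, ()):
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, dist + 1))
    return None

def is_small_modification(edges, new_edges, max_changes, max_hops):
    """True if the edit stays within the budget and the hop neighborhood."""
    old = {frozenset(e) for e in edges}
    new = {frozenset(e) for e in new_edges}
    if len(old ^ new) > max_changes:      # edges added or deleted (assumed inclusive bound)
        return False
    for edge in new - old:                # original edges are trivially within 1 hop
        u, v = tuple(edge)
        dist = hop_distance(edges, u, v)
        if dist is None or dist > max_hops:
            return False
    return True
```

An attack loop can call such a check on every proposed move and simply discard moves for which it returns `False`.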
Below we first introduce our main algorithm, RLS2V, for learning attacker in Section 3.1. Then in Section 3.2, we present other possible attack methods under different scenarios.
3.1 Attacking as hierarchical reinforcement learning
Given an instance $(G, c, y)$ and a target classifier $f$, we model the attack procedure as a Finite Horizon Markov Decision Process $\mathcal{M}^{(m)}(f, G, c, y)$. The definition of such an MDP is as follows:

Action As we mentioned in Sec 3, the attacker is allowed to add or delete edges in the graph. So a single action at time step $t$ is $a_t \in \mathcal{A} \subseteq V \times V$. However, simply performing actions in the $O(|V|^2)$ space is too expensive. We will shortly show how to use hierarchical actions to decompose this action space.

State The state $s_t$ at time $t$ is represented by the tuple $(\hat{G}_t, c)$, where $\hat{G}_t$ is a partially modified graph with some of the edges added/deleted from $G$.

Reward The purpose of the attacker is to fool the target classifier. So the nonzero reward is only received at the end of the MDP, with reward being
(7) $r\big((\tilde{G}, c)\big) = \begin{cases} 1, & f(\tilde{G}, c) \neq y \\ -1, & f(\tilde{G}, c) = y \end{cases}$
In the intermediate steps of modification, no reward will be received. That is to say, $r(s_t, a_t) = 0$ for all $t = 1, 2, \ldots, m - 1$. In the PBAC setting, where the prediction confidence of the target classifier is accessible, we can also use the classification loss $L\big(f(\tilde{G}, c), y\big)$ as the reward.

Terminal Once the agent modifies $m$ edges, the process stops. For simplicity, we focus on the MDP with fixed length. In the case when fewer modifications are enough, we can simply let the agent modify dummy edges.
Given the above settings, a sample trajectory from this MDP will be $(s_1, a_1, r_1, \ldots, s_m, a_m, r_m)$, where $s_1 = (G, c)$, $s_t = (\hat{G}_t, c)$ for $t \in \{2, \ldots, m\}$, and $a_t \in \mathcal{A} \subseteq V \times V$. The last step will have reward $r_m = r\big((\tilde{G}, c)\big)$ and all other intermediate rewards are zero: $r_t = 0$ for $t \in \{1, 2, \ldots, m - 1\}$. Since this is a discrete optimization problem with a finite horizon, we use Q-learning to learn the MDPs. In our preliminary experiments we also tried policy optimization methods like Advantage Actor Critic, but found that Q-learning is more stable. So below we focus on the modeling with Q-learning.
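The MDP above can be pictured as a small environment in which each step toggles one edge and the only nonzero reward arrives after the last allowed modification. The sketch below is illustrative only: the class name `AttackEnv`, the callable `classifier`, and the terminal reward values of +1 and -1 are assumptions made for this example.

```python
class AttackEnv:
    """Fixed-horizon edge-flipping environment (illustrative sketch)."""

    def __init__(self, edges, label, classifier, budget):
        self.edges = {frozenset(e) for e in edges}  # current, partially modified graph
        self.label = label                          # true label of the instance
        self.classifier = classifier                # hypothetical black box: edge set -> label
        self.budget = budget                        # number of edge modifications allowed
        self.steps = 0

    def step(self, u, v):
        """Toggle edge (u, v): add it if absent, delete it if present."""
        if self.steps >= self.budget:
            raise RuntimeError("episode already finished")
        self.edges ^= {frozenset((u, v))}
        self.steps += 1
        done = self.steps == self.budget
        reward = 0                                  # intermediate steps give no reward
        if done:
            fooled = self.classifier(self.edges) != self.label
            reward = 1 if fooled else -1            # assumed terminal reward values
        return reward, done
```

Only the discrete prediction of the classifier is queried here, which matches the information available in the label-only black-box setting.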
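Q-learning is an off-policy optimization method that fits the Bellman optimality equation directly, as below:
(8) $Q^*(s_t, a_t) = r(s_t, a_t) + \gamma \max_{a'} Q^*(s_{t+1}, a')$
This implicitly suggests a greedy policy:
(9) $\pi(a_t \mid s_t; Q^*) = \arg\max_{a_t} Q^*(s_t, a_t)$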
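In our finite horizon case, $\gamma$ is fixed to 1. Note that directly operating the actions in the $O(|V|^2)$ space is too expensive for large graphs. Thus we propose to decompose the action $a_t \in V \times V$ into $a_t = (a_t^{(1)}, a_t^{(2)})$, where $a_t^{(1)}, a_t^{(2)} \in V$. Thus a single edge action is decomposed into the two ends of this edge. The hierarchical Q-function is then modeled as below:
(10) $Q^{1*}(s_t, a_t^{(1)}) = \max_{a_t^{(2)}} Q^{2*}(s_t, a_t^{(1)}, a_t^{(2)}), \qquad Q^{2*}(s_t, a_t^{(1)}, a_t^{(2)}) = r\big(s_t, a_t = (a_t^{(1)}, a_t^{(2)})\big) + \max_{a_{t+1}^{(1)}} Q^{1*}(s_{t+1}, a_{t+1}^{(1)})$
In the above formulation, $Q^{1*}$ and $Q^{2*}$ are two functions that implement the original $Q^*$. An action is considered completed only when a pair $(a_t^{(1)}, a_t^{(2)})$ is chosen. Thus the reward will only be valid after $a_t^{(2)}$ is made. It is easy to see that such a decomposition has the same optimality structure as in Eq (8), but making an action would only require $O(2 \times |V|) = O(|V|)$ complexity. Figure 1 illustrates this process.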
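The effect of the decomposition on action selection can be sketched as two successive argmax passes over the node set. In this illustration `q1` and `q2` are hypothetical scoring callables standing in for the two Q-functions; in the method itself they would be learned networks rather than the toy functions used in the usage note.

```python
def select_edge(nodes, q1, q2):
    """Pick an edge with two passes over the nodes instead of one pass over all pairs."""
    first = max(nodes, key=q1)                       # choose the first end point
    second = max((v for v in nodes if v != first),   # then the second, given the first
                 key=lambda v: q2(first, v))
    return first, second
```

If `q1(a)` equals the maximum of `q2(a, b)` over `b`, this returns the same edge as a flat search over all node pairs, while scoring only about twice the number of nodes.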
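Taking a further look at Eq (10): since only the reward in the last time step is nonzero and the modification budget $m$ is given, we can explicitly unroll the Bellman equations as:
(11) $\begin{aligned} Q^*_{1,1}(s_1, a_1^{(1)}) &= \max_{a_1^{(2)}} Q^*_{1,2}(s_1, a_1^{(1)}, a_1^{(2)}) \\ Q^*_{1,2}(s_1, a_1^{(1)}, a_1^{(2)}) &= \max_{a_2^{(1)}} Q^*_{2,1}(s_2, a_2^{(1)}) \\ &\;\;\vdots \\ Q^*_{m,1}(s_m, a_m^{(1)}) &= \max_{a_m^{(2)}} Q^*_{m,2}(s_m, a_m^{(1)}, a_m^{(2)}) \\ Q^*_{m,2}(s_m, a_m^{(1)}, a_m^{(2)}) &= r\big((\tilde{G}, c)\big) \end{aligned}$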
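To keep the notation compact, we still use $Q^*$ to denote the Q-function. Since each sample in the dataset defines an MDP, it is possible to learn a separate Q-function for each MDP $\mathcal{M}_i^{(m)}(f, G_i, c_i, y_i)$, $i = 1, \ldots, N$. However, we here focus on a more practical and challenging setting, where only one $Q^*$ is learned. The learned Q-function is thus asked to generalize or transfer over all the MDPs:
(12) $\max_{\theta} \sum_{i=1}^{N} \mathbb{E}_{t,\; a = \arg\max_{a_t} Q^*(a_t \mid s_t; \theta)} \Big[ r\big((\tilde{G}_i, c_i)\big) \Big]$
where $Q^*$ is parameterized by $\theta$. Below we present the parameterization for such a $Q^*$ that generalizes over MDPs.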
3.1.1 Parameterization of $Q^*$
From the above, we can see that the most flexible parameterization would be implementing $2m$ time-dependent Q-functions. However, we found that two distinct parameterizations are typically enough, i.e., $Q^*_{t,1} = Q^{1*}$ and $Q^*_{t,2} = Q^{2*}$ for all $t$.
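Since the Q-function is scoring the nodes in the state graph, it is natural to use GNN family models for parameterization, in order to learn a generalizable attacker. Specifically, $Q^{1*}$ is parameterized as:
(13) $Q^{1*}(s_t, a_t^{(1)}) = W_{Q_1}^{(1)} \sigma\Big( W_{Q_1}^{(2)\top} \big[ \mu_{a_t^{(1)}}, \mu(s_t) \big] \Big)$
where $\mu_{a_t^{(1)}}$ is the embedding of node $a_t^{(1)}$ in graph $\hat{G}_t$, obtained by structure2vec (S2V) (Dai et al., 2016):
(14) $\mu_v^{(k)} = \mathrm{relu}\Big( W_{Q_1}^{(3)} x(v) + W_{Q_1}^{(4)} \sum_{u \in \mathcal{N}(v)} \mu_u^{(k-1)} \Big)$
where $\mu_v^{(0)} = 0$ and $\mu_v = \mu_v^{(K)}$. Also $\mu(s_t) = \mu(\hat{G}_t, c)$ is the representation of the entire state tuple:
(15) $\mu(s_t) = \begin{cases} \sum_{v \in \hat{V}_t} \mu_v, & \text{graph attack} \\ \big[ \sum_{v \in \hat{V}_t} \mu_v,\; \mu_c \big], & \text{node attack} \end{cases}$
In the node attack scenario, the state embedding is taken from the $b$-hop neighborhood of node $c$. The parameter set of $Q^{1*}$ is $\theta_1 = \{W_{Q_1}^{(i)}\}_{i=1}^{4}$. $Q^{2*}$ is parameterized similarly with parameter $\theta_2$, with an extra consideration of the chosen node $a_t^{(1)}$:
(16) $Q^{2*}(s_t, a_t^{(1)}, a_t^{(2)}) = W_{Q_2}^{(1)} \sigma\Big( W_{Q_2}^{(2)\top} \big[ \mu_{a_t^{(1)}}, \mu_{a_t^{(2)}}, \mu(s_t) \big] \Big)$
We denote this method as RLS2V since it learns a Q-function parameterized by S2V to perform the attack.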
3.2 Other attacking methods
RLS2V is suitable for black-box attack and transfer. However, for different attack scenarios, other algorithms might be preferred. We first introduce RandSampling, which requires the least information, in Sec 3.2.1. Then in Sec 3.2.2, a white-box attack, GradArgmax, is proposed. Finally, GeneticAlg, which is a kind of evolutionary computing, is proposed in Sec 3.2.3.
3.2.1 Random sampling
This is the simplest attack method: it randomly adds or deletes edges from graph $G$. When an edge modification action is sampled, we only accept it when it satisfies the semantic constraint $\mathcal{I}(\cdot, \cdot, \cdot)$. It requires the least information for attack. Despite its simplicity, it can sometimes achieve a good attack rate.
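A minimal sketch of this random sampling procedure is given below. It assumes the constraint is supplied as a callable `is_valid(original_edges, candidate_edges)`; that callable, the `max_tries` cap, and the function name are hypothetical conveniences for this example.

```python
import random

def rand_sampling_attack(nodes, edges, budget, is_valid, rng=None, max_tries=1000):
    """Randomly toggle edges, keeping only moves accepted by the constraint."""
    rng = rng or random.Random()
    original = {frozenset(e) for e in edges}
    current = set(original)
    changes, tries = 0, 0
    while changes < budget and tries < max_tries:
        tries += 1
        u, v = rng.sample(nodes, 2)               # two distinct nodes
        candidate = current ^ {frozenset((u, v))} # add if absent, delete if present
        if is_valid(original, candidate):         # rejected moves are simply ignored
            current = candidate
            changes += 1
    return current
```

No query to the target classifier is needed to propose the modification, which is why this baseline remains usable in the most restricted setting.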
3.2.2 Gradient based white box attack
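Gradients have been successfully used for modifying continuous inputs, e.g., images. However, taking the gradient with respect to a discrete structure is nontrivial. Recalling the general iterative embedding process defined in Eq (3), we associate a coefficient $\alpha_{u,v}$ with each pair $(u, v) \in V \times V$:
(17) $\mu_v^{(k)} = h^{(k)}\Big(\big\{\alpha_{u,v} \big[w(u, v), x(u), \mu_u^{(k-1)}\big]\big\}_{u \in V},\; x(v),\; \mu_v^{(k-1)}\Big), \quad k \in \{1, 2, \ldots, K\}$
Let $\alpha_{u,v} = \mathbb{I}\big(u \in \mathcal{N}(v)\big)$. That is to say, $\alpha$ itself is the binary adjacency matrix. It is easy to see that the above formulation has the same effect as Eq (3). However, such additional coefficients give us the gradient information with respect to each edge (either existing or non-existing):
(18) $\frac{\partial L}{\partial \alpha_{u,v}} = \sum_{k=1}^{K} \left( \frac{\partial L}{\partial \mu_v^{(k)}} \right)^{\top} \frac{\partial \mu_v^{(k)}}{\partial \alpha_{u,v}}$
In order to attack the model, we could perform gradient ascent, i.e., $\alpha_{u,v} \leftarrow \alpha_{u,v} + \eta \frac{\partial L}{\partial \alpha_{u,v}}$. However, the attack is on a discrete structure, where only $m$ edges are allowed to be added or deleted. So here we need to solve a combinatorial optimization problem:
(19) $\max_{\{(u_t, v_t)\}_{t=1}^{m}} \sum_{t=1}^{m} \left| \frac{\partial L}{\partial \alpha_{u_t, v_t}} \right| \quad \text{s.t.} \quad \tilde{G} = \mathrm{Modify}\big(G, \{\alpha_{u_t, v_t}\}_{t=1}^{m}\big), \;\; \mathcal{I}(G, \tilde{G}, c) = 1.$
We simply use a greedy algorithm to solve the above optimization. Here the modification of $G$ given a set of coefficients $\{\alpha_{u_t, v_t}\}_{t=1}^{m}$ is performed by sequentially modifying edges $(u_t, v_t)$ of graph $G_t$:
(20) $G_{t+1} = \begin{cases} \big(V_t,\; E_t \setminus \{(u_t, v_t)\}\big), & \frac{\partial L}{\partial \alpha_{u_t, v_t}} < 0 \\ \big(V_t,\; E_t \cup \{(u_t, v_t)\}\big), & \frac{\partial L}{\partial \alpha_{u_t, v_t}} > 0 \end{cases}$
That is to say, we modify the edges that are most likely to cause a change in the objective. Depending on the sign of the gradient, we either add or delete the edge. We name this method GradArgmax since it does greedy selection based on gradient information.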
The attack procedure is shown in Figure 2. Since this approach requires gradient information, we consider it a white-box attack method. Also, since the gradient considers all pairs of nodes in a graph, the computation cost is at least $O(|V|^2)$, excluding the back-propagation of gradients in Eq (18). Without further approximation, this approach cannot scale to large graphs.
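The greedy selection step can be sketched as follows, assuming the gradient of the loss with respect to every node pair has already been computed and stored in a matrix `grad` (only its upper triangle is read here). The function name, the returned flip list, and the rule of skipping pairs whose gradient sign cannot be acted on (for example a negative gradient on an absent edge) are assumptions of this sketch.

```python
import numpy as np

def grad_argmax(adj, grad, budget):
    """Flip up to `budget` edges, largest gradient magnitude first."""
    adj = adj.copy()                           # leave the caller's graph untouched
    n = adj.shape[0]
    rows, cols = np.triu_indices(n, k=1)       # each undirected pair once
    score = grad[rows, cols]
    flips = []
    for idx in np.argsort(-np.abs(score)):     # descending magnitude
        if len(flips) == budget:
            break
        u, v, g = int(rows[idx]), int(cols[idx]), score[idx]
        if g > 0 and adj[u, v] == 0:           # positive gradient: add the edge
            adj[u, v] = adj[v, u] = 1
            flips.append((u, v, "add"))
        elif g < 0 and adj[u, v] == 1:         # negative gradient: delete the edge
            adj[u, v] = adj[v, u] = 0
            flips.append((u, v, "delete"))
    return adj, flips
```

The dense pairwise score vector makes the quadratic cost in the number of nodes visible directly in the code.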
3.2.3 Genetic algorithm
Evolutionary computing has been successfully applied in many zero-order optimization scenarios, including neural architecture search (Real et al., 2017; Miikkulainen et al., 2017) and adversarial attack for images (Su et al., 2017). We here propose a black-box attack method that implements a type of genetic algorithm. Given an instance $(G, c, y)$ and the target classifier $f$, such an algorithm involves five major components, as elaborated below:

Population: the population refers to a set of candidate solutions. Here we denote it as $\mathcal{P}^{(r)} = \{\hat{G}_j^{(r)}\}_{j=1}^{|\mathcal{P}^{(r)}|}$, where each $\hat{G}_j^{(r)}$ is a valid modification solution to the original graph $G$. $r = 1, 2, \ldots, R$ is the index of the generation and $R$ is the maximum number of evolutions allowed.

Fitness: each candidate solution in the current population will get a score that measures the quality of the solution. We use the loss function of the target model as the score function. A good attack solution should increase this loss. Since the fitness is a continuous score, it is not applicable in the PBAD setting, where only the classification label is accessible.

Selection: given the fitness scores of the current population, we can either do weighted sampling or greedy selection to select the ‘breeding’ population $\mathcal{P}_s^{(r)}$ for the next generation.

Crossover: after the selection of $\mathcal{P}_s^{(r)}$, we randomly pick two candidates $\hat{G}_1, \hat{G}_2 \in \mathcal{P}_s^{(r)}$ and do the crossover by mixing the edges from these two candidates:
(21) $\hat{G}' = \Big(V,\; (\hat{E}_1 \cap \hat{E}_2) \cup \mathrm{rp}(\hat{E}_1 \setminus \hat{E}_2) \cup \mathrm{rp}(\hat{E}_2 \setminus \hat{E}_1)\Big)$
Here $\mathrm{rp}(\cdot)$ means randomly picking a subset.

Mutation: the mutation process is also biology inspired. For a candidate solution $\hat{G} \in \mathcal{P}^{(r)}$, suppose the modified edges are $\delta E = \{(u_t, v_t)\}_{t=1}^{m}$. Then for each edge $(u_t, v_t)$, we have a certain probability of changing it to either $(u_t, v')$ or $(u', v_t)$.
The population size $|\mathcal{P}^{(r)}|$, the probability of crossover used in $\mathrm{rp}(\cdot)$, the mutation probability and the number of evolutions $R$ are all hyperparameters that can be tuned. Due to the limitation of the fitness function, this method can only be used in the PBAC setting. Also, since we need to execute the target model to get fitness scores, the computation cost of such a genetic algorithm is $O(|V| + |E|)$, which is mainly made up of the computation cost of GNNs. The overall procedure is illustrated in Figure 3. We simply name it GeneticAlg since it is an instantiation of the general genetic algorithm framework.
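The crossover and mutation operators described above can be sketched as follows. The sketch assumes edges are stored as tuples in sets or lists, that the random subset is drawn by keeping each non-shared edge independently with probability `keep_prob`, and that a mutated end point is replaced by a node different from both original end points; these details and the function names are assumptions of this example.

```python
import random

def crossover(edges_a, edges_b, keep_prob, rng):
    """Keep edges shared by both parents and a random subset of the others."""
    child = set(edges_a & edges_b)
    for edge in sorted(edges_a ^ edges_b):      # sorted for reproducibility
        if rng.random() < keep_prob:
            child.add(edge)
    return child

def mutate(modified, nodes, mut_prob, rng):
    """With probability `mut_prob`, replace one end point of each modified edge."""
    out = []
    for u, v in modified:
        if rng.random() < mut_prob:
            w = rng.choice([n for n in nodes if n not in (u, v)])
            u, v = (w, v) if rng.random() < 0.5 else (u, w)
        out.append((u, v))
    return out
```

For example, `crossover({(0, 1), (1, 2)}, {(0, 1), (2, 3)}, 0.5, random.Random(0))` always contains the shared edge `(0, 1)` and may contain either of the other two.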
WBA  PBAC  PBAD  RBA  Cost  

RandSampling  
GradArgmax  
GeneticAlg  
RLS2V 
4 Experiment
attack test set I  15-20 nodes  40-50 nodes  90-100 nodes  

Settings  Methods  
(unattacked)  93.20%  98.20%  98.87%  99.07%  92.60%  96.20%  97.53%  97.93%  94.60%  97.47%  98.73%  98.20%  
RBA  RandSampling  78.73%  92.27%  95.13%  97.67%  73.60%  78.60%  82.80%  85.73%  74.47%  74.13%  80.93%  82.80% 
WBA  GradArgmax  69.47%  64.60%  95.80%  97.67%  73.93%  64.80%  70.53%  75.47%  72.00%  66.20%  67.80%  68.07% 
PBAC  GeneticAlg  39.87%  39.07%  65.33%  85.87%  59.53%  55.67%  53.70%  42.48%  65.47%  63.20%  61.60%  61.13% 
PBAD  RLS2V  42.93%  41.93%  70.20%  91.27%  61.00%  59.20%  58.73%  49.47%  66.07%  64.07%  64.47%  64.67% 
Restricted black-box attack on test set II  
(unattacked)  94.67%  97.33%  98.67%  97.33%  94.67%  97.33%  98.67%  98.67%  96.67%  98.00%  99.33%  98.00%  
RBA  RandSampling  78.00%  91.33%  94.00%  98.67%  75.33%  84.00%  86.00%  87.33%  69.33%  73.33%  76.00%  80.00% 
RBA  RLS2V  44.00%  40.00%  67.33%  92.00%  58.67%  60.00%  58.00%  44.67%  62.67%  62.00%  62.67%  61.33% 
For GeneticAlg, we set the population size and the number of rounds . We tune the crossover rate and mutation rate in . For RLS2V, we tune the number of propagations of its S2V model . There is no parameter tuning for GradArgmax and RandSampling.
We use the proposed attack methods to attack the graph classification model in Sec 4.1 and node classification model in Sec 4.2. In each scenario, we first show the attack rate when queries are allowed for target model, then we show the generalization ability of the RLS2V for RBA setting.
4.1 Graphlevel attack
[Figure 4: panels (a), (b), (c), each labeled with the number of connected components (# comp) of the example graph]
In this set of experiments, we use synthetic data, where the gold classifier $f^*$ is known. Thus the explicit semantics is used for the equivalency indicator $\mathcal{I}$. The dataset we constructed contains 15,000 graphs, generated with the Erdős-Rényi random graph model. It is a three-class graph classification task, where each class contains 5,000 graphs. The classifier is asked to tell how many connected components there are in the corresponding undirected graph $G$. The label set is $\mathcal{Y} = \{1, 2, 3\}$, so there can be up to 3 components in a graph. See Figure 4 for illustration. The gold classifier $f^*$ is obtained by performing a one-time traversal of the entire graph. The dataset is divided into a training set and two test sets. Test set I contains 1,500 graphs, while test set II contains 150 graphs. Each set contains the same number of instances from the different classes.
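A gold classifier of this kind, which counts connected components with a single traversal, can be sketched as below. The function name and the convention that nodes are numbered from 0 are assumptions of this example.

```python
def count_components(num_nodes, edges):
    """Count connected components of an undirected graph with one traversal."""
    nbrs = [[] for _ in range(num_nodes)]
    for u, v in edges:
        nbrs[u].append(v)
        nbrs[v].append(u)
    seen = [False] * num_nodes
    components = 0
    for start in range(num_nodes):
        if seen[start]:
            continue
        components += 1                 # found a node not reached so far
        seen[start] = True
        stack = [start]
        while stack:                    # iterative depth-first search
            node = stack.pop()
            for nxt in nbrs[node]:
                if not seen[nxt]:
                    seen[nxt] = True
                    stack.append(nxt)
    return components
```

Under the explicit semantics, an edge modification is acceptable only if this count is the same before and after the change.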
We choose structure2vec as the target model for attack. We also tune its number of propagation parameter . Table 2 shows the results with different settings. For test set I, we can see that structure2vec achieves very high accuracy in distinguishing the number of connected components. Increasing the number of propagation steps $K$ also seems to improve the generalization in most cases. However, under the practical black-box attack scenario, GeneticAlg and RLS2V can bring down the accuracy to . In attacking the graph classification algorithm, GradArgmax seems not to be very effective. One reason could be the last pooling step in S2V when obtaining the graph-level embedding. During back propagation, the pooling operation dispatches the gradient to every node embedding, which makes the gradient $\partial L / \partial \alpha$ look similar in most entries.
For the restricted black-box attack on test set II (see the lower half of Table 2), the attacker is asked to propose adversarial samples without any access to the target model. Since RLS2V is learned on test set I, it is able to transfer its learned policy to test set II. This suggests that the target classifier makes some form of consistent mistake.
This experiment shows that (1) adversarial examples do exist for supervised graph problems; (2) a model with good generalization ability can still suffer from adversarial attacks; (3) RLS2V can learn a transferable adversarial policy to attack unseen graphs.
4.2 Nodelevel attack
In this experiment, we inspect adversarial attacks on node classification problems. Different from Sec 4.1, here the setting is transductive, where the test samples (but not their labels) are also seen during training. We use four real-world datasets, namely Citeseer, Cora, Pubmed and Finance. The first three are small-scale citation networks commonly used for node classification, where each node is a paper with corresponding bag-of-words features. The last one is a large-scale dataset that contains transactions from an e-commerce platform within one day, where the node set contains buyers, sellers and credit cards. The classifier is asked to distinguish normal transactions from abnormal ones. The statistics of each dataset are shown in Table 3. The nodes also contain features with different dimensions. For the full table please refer to Kipf & Welling (2016). We use GCN (Kipf & Welling, 2016) as the target model to attack. Here the “small modifications” constraint is used to regulate the attacker. That is to say, given a graph $G$ and target node $c$, the adversarial samples are limited to deleting a single edge within 2 hops of node $c$.
Dataset  Nodes  Edges  Classes  Train/Test I/Test II 

Citeseer  3,327  4,732  6  120/1,000/500 
Cora  2,708  5,429  7  140/1,000/500 
Pubmed  19,717  44,338  3  60/1,000/500 
Finance  2,382,980  8,101,757  2  317,041/812/800 
Table 4 shows the results. Although deleting a single edge is the minimum modification one can make to the graph, the attack rate is still about 10% on the small graphs, and 4% on the Finance dataset. We also ran an exhaustive attack as a sanity check, which is the best any algorithm can do under the attack budget. The classifier accuracy would drop to 60% or lower if two-edge modification were allowed. However, considering that the average degree in the graph is not large, deleting two or more edges would violate the “small modification” constraint. We need to be careful to only create adversarial samples, instead of actually changing the true label of that sample.
In this case, GradArgmax performs quite well, which is different from the graph-level attack. Here the gradient with respect to the adjacency matrix is no longer averaged, which makes it easier to distinguish the useful modifications. For the restricted black-box attack on test set II, RLS2V still learns an attack policy that generalizes to unseen samples. Though we do not have a gold classifier for the real-world datasets, it is highly likely that the proposed adversarial samples are valid: (1) the structure modification is tiny and within 2 hops; (2) we did not modify the node features.
Method  Citeseer  Cora  Pubmed  Finance 

(unattacked)  71.60%  81.00%  79.90%  88.67% 
RBA, RandSampling  67.60%  78.50%  79.00%  87.44% 
WBA, GradArgmax  63.00%  71.30%  72.4%  86.33% 
PBAC, GeneticAlg  63.70%  71.20%  72.30%  85.96% 
PBAD, RLS2V  62.70%  71.20%  72.80%  85.43% 
Exhaust  62.50%  70.70%  71.80%  85.22% 
Restricted black-box attack on test set II  
(unattacked)  72.60%  80.20%  80.40%  91.88% 
RandSampling  68.00%  78.40%  79.00%  90.75% 
RLS2V  66.00%  75.00%  74.00%  89.10% 
Exhaust  62.60%  70.80%  71.00%  88.88% 
4.3 Inspection of adversarial samples
[Figure: panels (a), (b), (c), each labeled with the target classifier's prediction (pred)]
In this section, we visualize the adversarial samples proposed by different attackers. The solutions proposed by RLS2V for the graph-level classification problem are shown in Figure 5. The ground truth labels are 1, 2, 3, while the target classifier mistakenly predicts 2, 1, 2, respectively. In Figure 5(b) and (c), the RL agent connects two nodes that are 4 hops away from each other (before the red edge is added). This shows that, although the target classifier structure2vec is trained with , it didn’t capture the 4-hop information efficiently. Figure 5(a) also shows that, even when connecting nodes that are just 2 hops apart, the classifier makes a mistake.
Figure 6 shows the solutions proposed by GradArgmax. The orange node is the target node for attack. Edges in blue are suggested to be added by GradArgmax, while black ones are suggested to be deleted. Black nodes have the same node label as the orange node, while white nodes do not. The thicker the edge, the larger the magnitude of the gradient. Figure 6(b) deletes one neighbor with the same label, but the target node still has other black nodes connected. In this case, the GCN is oversensitive. The mistake made in Figure 6(c) is reasonable, since although the red edge does not connect two nodes with the same label, it connects to a large community of nodes from the same class within 2-hop distance. In this case, the prediction made by GCN is reasonable.
[Figure: panels (a), (b), (c), each labeled with the target classifier's prediction (pred)]
4.4 Defense against attacks
Method  Citeseer  Cora  Pubmed  Finance 

(unattacked)  71.30%  81.70%  79.50%  88.55% 
RBA, RandSampling  67.70%  79.20%  78.20%  87.44% 
WBA, GradArgmax  63.90%  72.50%  72.40%  87.32% 
PBAC, GeneticAlg  64.60%  72.60%  72.50%  86.45% 
PBAD, RLS2V  63.90%  72.80%  72.90%  85.80% 
Different from images, the number of possible graph structures is finite given the number of nodes. So by adding the adversarial samples back for further training, an improvement in the target model’s robustness can be expected. For example, in the experiment of Sec 4.1, adding adversarial samples for training is equivalent to increasing the size of the training set, which will definitely be helpful. So here we seek to use a cheap method for adversarial training: simply doing edge drop during training for defense.
Dropping edges during training is different from Dropout (Srivastava et al., 2014). Dropout operates on the neurons in the hidden layers, while edge drop modifies the discrete structure. It is also different from simply dropping the entire hidden vector, since deleting a single edge can affect more than just one edge. For example, GCN computes the normalized graph Laplacian. So after deleting a single edge, the normalized graph Laplacian needs to be recomputed for some entries. This approach is similar to Hamilton et al. (2017), who sample a fixed number of neighbors during training for efficiency. Here we drop the edges globally at random, during each training step.
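The point that deleting one edge changes more than one entry of the normalized matrix can be illustrated with a short sketch. It assumes the symmetric normalization with self-loops commonly used with GCN, $D^{-1/2}(A + I)D^{-1/2}$, and a hypothetical per-edge drop probability; the function names are illustrative.

```python
import numpy as np

def drop_edges(edges, drop_prob, rng):
    """Keep each edge independently with probability 1 - drop_prob."""
    keep = rng.random(len(edges)) >= drop_prob
    return [e for e, k in zip(edges, keep) if k]

def normalized_adjacency(num_nodes, edges):
    """Symmetrically normalized adjacency with self-loops."""
    a = np.eye(num_nodes)                      # self-loops
    for u, v in edges:
        a[u, v] = a[v, u] = 1.0
    d_inv_sqrt = 1.0 / np.sqrt(a.sum(axis=1))  # degrees are >= 1 thanks to self-loops
    return a * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]
```

On the path graph 0-1-2, removing edge (1, 2) also changes the entry for edge (0, 1), because the degree of node 1 changes. In a training loop, one would call `drop_edges` with a fresh random draw at every step and rebuild the normalized matrix from the surviving edges.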
The new results after adversarial training are presented in Table 5. We can see from the table that, though the accuracy of target model remains similar, the attack rate of various methods decreases about . Though the scale of the improvement is not significant, it shows some effectiveness with such cheap adversarial training.
5 Related work
adversarial attack in continuous and discrete space:
In recent years, adversarial attacks on deep learning models have drawn increasing attention from researchers. Some methods focus on the white-box adversarial attack using gradient information, like box-constrained L-BFGS (Szegedy et al., 2013), Fast Gradient Sign (Goodfellow et al., 2014), deep fool (Moosavi-Dezfooli et al., 2016), etc. When the full information of the target model is not accessible, one can train a substitute model (Papernot et al., 2017), or use a zero-order optimization method (Chen et al., 2017).
There are also some works on attacks with discrete functions (Buckman et al., 2018), but not on combinatorial structures. The one-pixel attack (Su et al., 2017) modifies the image by only several pixels using differential evolution; Jia & Liang (2017) attack the text reading comprehension system with the help of rules and human effort. Zügner et al. (2018) studied the problem of adversarial attack over graphs in parallel to our work, although with very different methods.
combinatorial optimization:
Modifying the discrete structure to fool the target classifier can be treated as a combinatorial optimization problem. Recently, there has been some exciting work using reinforcement learning to learn to solve general sequential decision problems (Bello et al., 2016) or graph combinatorial problems (Dai et al., 2017). These are closely related to RLS2V. RLS2V extends the previous approach by using a hierarchical way to decompose the quadratic action space, in order to make the training feasible.
6 Conclusion
In this paper, we study adversarial attacks on graph structured data. To perform efficient attacks, we proposed three methods, namely RLS2V, GradArgmax and GeneticAlg, for three different attack settings, respectively. We show that a family of GNN models is vulnerable to such attacks. By visualizing the attack samples, we can also inspect the target classifier. We also discussed defense methods through experiments. Our future work includes developing more effective defense algorithms.
Acknowledgements
This project was supported in part by NSF IIS-1218749, NIH BIGDATA 1R01GM108341, NSF CAREER IIS-1350983, NSF IIS-1639792 EAGER, NSF CNS-1704701, ONR N00014-15-1-2340, Intel ISTC, NVIDIA and Amazon AWS. Tian Tian and Jun Zhu were supported by the National NSF of China (No. 61620106010) and Beijing Natural Science Foundation (No. L172037). We thank Bo Dai for valuable suggestions, and the anonymous reviewers who gave useful comments.
References
 Akoglu et al. (2015) Akoglu, Leman, Tong, Hanghang, and Koutra, Danai. Graph based anomaly detection and description: a survey. Data Mining and Knowledge Discovery, 29(3):626–688, 2015.
 Bello et al. (2016) Bello, Irwan, Pham, Hieu, Le, Quoc V, Norouzi, Mohammad, and Bengio, Samy. Neural combinatorial optimization with reinforcement learning. arXiv preprint arXiv:1611.09940, 2016.
 Buckman et al. (2018) Buckman, Jacob, Roy, Aurko, Raffel, Colin, and Goodfellow, Ian. Thermometer encoding: One hot way to resist adversarial examples. In International Conference on Learning Representations, 2018. URL https://openreview.net/forum?id=S18SuCW.
 Chen et al. (2017) Chen, Pin-Yu, Zhang, Huan, Sharma, Yash, Yi, Jinfeng, and Hsieh, Cho-Jui. Zoo: Zeroth order optimization based black-box attacks to deep neural networks without training substitute models. In Proceedings of the 10th ACM Workshop on Artificial Intelligence and Security, pp. 15–26. ACM, 2017.
 Dai et al. (2016) Dai, Hanjun, Dai, Bo, and Song, Le. Discriminative embeddings of latent variable models for structured data. In ICML, 2016.
 Dai et al. (2017) Dai, Hanjun, Khalil, Elias B, Zhang, Yuyu, Dilkina, Bistra, and Song, Le. Learning combinatorial optimization algorithms over graphs. arXiv preprint arXiv:1704.01665, 2017.
 Duvenaud et al. (2015) Duvenaud, David K, Maclaurin, Dougal, Iparraguirre, Jorge, Bombarell, Rafael, Hirzel, Timothy, Aspuru-Guzik, Alán, and Adams, Ryan P. Convolutional networks on graphs for learning molecular fingerprints. In Advances in Neural Information Processing Systems, pp. 2215–2223, 2015.
 Gilmer et al. (2017) Gilmer, Justin, Schoenholz, Samuel S, Riley, Patrick F, Vinyals, Oriol, and Dahl, George E. Neural message passing for quantum chemistry. arXiv preprint arXiv:1704.01212, 2017.
 Goodfellow et al. (2014) Goodfellow, Ian J, Shlens, Jonathon, and Szegedy, Christian. Explaining and harnessing adversarial examples. arXiv preprint arXiv:1412.6572, 2014.
 Hamilton et al. (2017) Hamilton, William L, Ying, Rex, and Leskovec, Jure. Inductive representation learning on large graphs. arXiv preprint arXiv:1706.02216, 2017.
 Jia & Liang (2017) Jia, Robin and Liang, Percy. Adversarial examples for evaluating reading comprehension systems. arXiv preprint arXiv:1707.07328, 2017.
 Kipf & Welling (2016) Kipf, Thomas N and Welling, Max. Semi-supervised classification with graph convolutional networks. arXiv preprint arXiv:1609.02907, 2016.
 Lei et al. (2017) Lei, Tao, Jin, Wengong, Barzilay, Regina, and Jaakkola, Tommi. Deriving neural architectures from sequence and graph kernels. arXiv preprint arXiv:1705.09037, 2017.
 Li et al. (2015) Li, Yujia, Tarlow, Daniel, Brockschmidt, Marc, and Zemel, Richard. Gated graph sequence neural networks. arXiv preprint arXiv:1511.05493, 2015.
 Miikkulainen et al. (2017) Miikkulainen, Risto, Liang, Jason, Meyerson, Elliot, Rawal, Aditya, Fink, Dan, Francon, Olivier, Raju, Bala, Navruzyan, Arshak, Duffy, Nigel, and Hodjat, Babak. Evolving deep neural networks. arXiv preprint arXiv:1703.00548, 2017.
 Moosavi-Dezfooli et al. (2016) Moosavi-Dezfooli, Seyed-Mohsen, Fawzi, Alhussein, and Frossard, Pascal. Deepfool: a simple and accurate method to fool deep neural networks. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 2574–2582, 2016.
 Papernot et al. (2017) Papernot, Nicolas, McDaniel, Patrick, Goodfellow, Ian, Jha, Somesh, Celik, Z Berkay, and Swami, Ananthram. Practical black-box attacks against machine learning. In Proceedings of the 2017 ACM on Asia Conference on Computer and Communications Security, pp. 506–519. ACM, 2017.
 Real et al. (2017) Real, Esteban, Moore, Sherry, Selle, Andrew, Saxena, Saurabh, Suematsu, Yutaka Leon, Le, Quoc, and Kurakin, Alex. Large-scale evolution of image classifiers. arXiv preprint arXiv:1703.01041, 2017.
 Scarselli et al. (2009) Scarselli, Franco, Gori, Marco, Tsoi, Ah Chung, Hagenbuchner, Markus, and Monfardini, Gabriele. The graph neural network model. Neural Networks, IEEE Transactions on, 20(1):61–80, 2009.
 Srivastava et al. (2014) Srivastava, N., Hinton, G., Krizhevsky, A., Sutskever, I., and Salakhutdinov, R. Dropout: A simple way to prevent neural networks from overfitting. The Journal of Machine Learning Research, 15(1):1929–1958, 2014.
 Su et al. (2017) Su, Jiawei, Vargas, Danilo Vasconcellos, and Kouichi, Sakurai. One pixel attack for fooling deep neural networks. arXiv preprint arXiv:1710.08864, 2017.
 Szegedy et al. (2013) Szegedy, Christian, Zaremba, Wojciech, Sutskever, Ilya, Bruna, Joan, Erhan, Dumitru, Goodfellow, Ian, and Fergus, Rob. Intriguing properties of neural networks. arXiv preprint arXiv:1312.6199, 2013.
 Trivedi et al. (2017) Trivedi, Rakshit, Dai, Hanjun, Wang, Yichen, and Song, Le. Knowevolve: Deep temporal reasoning for dynamic knowledge graphs. In ICML, 2017.
 Zügner et al. (2018) Zügner, Daniel, Akbarnejad, Amir, and Günnemann, Stephan. Adversarial attacks on neural networks for graph data. In KDD, 2018.