# Fast Detection of Maximum Common Subgraph via Deep Q-Learning

## Abstract

Detecting the Maximum Common Subgraph (MCS) between two input graphs is fundamental for applications in biomedical analysis, malware detection, cloud computing, etc. This is especially important in the task of drug design, where the successful extraction of common substructures in compounds can reduce the number of experiments that humans need to conduct. However, MCS computation is NP-hard, and state-of-the-art exact MCS solvers have no worst-case time complexity guarantee and cannot handle large graphs in practice. Designing learning-based models that find the MCS between two graphs in an approximate yet accurate way while using as few labeled MCS instances as possible remains a challenging task. Here we propose RlMcs, a Graph Neural Network based model for MCS detection through reinforcement learning. Our model uses an exploration tree to extract subgraphs in two graphs one node pair at a time, and is trained to optimize subgraph extraction rewards via Deep Q-Networks. A novel graph embedding method is proposed to generate state representations for nodes and extracted subgraphs jointly at each step. Experiments on real graph datasets demonstrate that our model compares favorably with exact MCS solvers and supervised neural graph matching network models in terms of accuracy and efficiency.

## 1 Introduction

Due to the flexible and expressive nature of graphs, designing machine learning approaches to solve graph tasks is gaining increasing attention from researchers. Among various graph tasks such as link prediction Zhang and Chen (2018), graph classification Ying et al. (2018) and generation You et al. (2018), detecting the largest subgraph that is commonly present in both input graphs, known as Maximum Common Subgraph (MCS) (Bunke and Shearer, 1998) (as shown in Figure 1), is relatively novel and less explored.

MCS naturally encodes the degree of similarity between two graphs, is domain-agnostic, and has various versions of definitions Duesbury et al. (2018), and thus has applications in many domains such as software analysis Park et al. (2013), graph database systems Yan et al. (2005) and cloud computing platforms Cao et al. (2011). In drug design, the manual testing of the effects of a new drug is known to be a major bottleneck, and the identification of compounds that share common or similar subgraphs which tend to have similar properties can effectively reduce the manual labor Ehrlich and Rarey (2011).

The main challenge in MCS detection is its NP-hard nature, which causes state-of-the-art exact MCS detection algorithms to run in exponential time in the worst case McCreesh et al. (2017); Hoffmann et al. (2017) and makes them very hard to scale to large graphs in practice. The usefulness of MCS detection, combined with the inefficiency of exact MCS solvers, calls for the design of learning-based approximate solvers, not only because of their learning ability, which can potentially yield good accuracy, but also because many such methods, such as deep learning models, are efficient.

However, existing machine learning approaches to graph matching either do not address the MCS detection task directly or rely on labeled data requiring the pre-computation of MCS results by running exact solvers. For example, there is a large body of work addressing image matching which turns images into graphs whose matching results have semantic meanings and thus do not satisfy the general-domain MCS constraints Zanfir and Sminchisescu (2018); Wang et al. (2019); Yu et al. (2020). The graph alignment/matching task aims to find the node-node correspondence between two graphs, yet the result is unrelated to MCS Xu et al. (2019b, a). Graph similarity computation Bai et al. (2019a); Li et al. (2019); Bai et al. (2020a) is more closely related, but only predicts a scalar score for two graphs instead of the node-node correspondence.

To the best of our knowledge, the only existing model that addresses MCS directly is NeuralMcs Bai et al. (2020b). Although performing MCS detection in an efficient way, NeuralMcs must be trained in a completely supervised fashion which may potentially overfit and requires a large amount of labeled MCS instances. In practice, when labeled MCS results are scarce, how to design a machine learning approach that efficiently and effectively extracts the MCS remains a challenge.

In this paper, we present RlMcs, a general framework for MCS detection suited both when training pairs exist and when training pairs are unavailable. The model utilizes a novel Joint Subgraph-Node Embedding (JSNE) network to perform graph representation learning, a Deep Q-Network (DQN) Mnih et al. (2015) to predict action distributions, and a novel exploration tree based on beam search to perform subgraph extraction iteratively. The entire model is trained end-to-end in the reinforcement learning (RL) framework. Besides, an approximate graph isomorphism checking algorithm is proposed specifically for the iterative MCS extraction procedure, named as Fast Iterative Graph Isomorphism (FIGI).

Experiments on synthetic and real graph datasets demonstrate that the proposed model significantly outperforms state-of-the-art exact MCS detection algorithms in terms of efficiency and exhibits competitive accuracy over other learning based graph matching models.

## 2 Problem Definition

We denote a graph as where and denote the vertex and edge set. An induced subgraph is defined as where preserves all the edges between nodes in , i.e. , if and only if . For example, in Figure 1, the five-member ring is an induced subgraph of and because all the five edges between the five nodes are included in the subgraph.

In this paper, we aim at detecting the Maximum Common induced Subgraph (MCS) between an input graph pair, denoted as , which is the largest induced subgraph that is contained in both and . In addition, we require to be a connected subgraph. We allow the nodes of input graphs to be labeled, in which case the labels of nodes in the MCS must match, as shown in Figure 1.
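
The two structural requirements above (induced and connected) can be illustrated with a minimal Python sketch. The function names and the toy graph are assumptions made for illustration only, and the sketch assumes simple undirected graphs without self-loops.

```python
def induced_subgraph(edges, nodes):
    """Return the edge set of the subgraph induced by `nodes`."""
    keep = set(nodes)
    return {frozenset((u, v)) for u, v in edges if u in keep and v in keep}


def is_connected(nodes, edges):
    """Depth-first search check that `nodes` form one connected component."""
    nodes = set(nodes)
    if not nodes:
        return True
    adj = {n: set() for n in nodes}
    for edge in edges:
        u, v = tuple(edge)
        adj[u].add(v)
        adj[v].add(u)
    seen, stack = set(), [next(iter(nodes))]
    while stack:
        node = stack.pop()
        if node not in seen:
            seen.add(node)
            stack.extend(adj[node] - seen)
    return seen == nodes


# Toy graph: a five-member ring 0-1-2-3-4-0 with one extra node 5 attached to node 0.
toy_edges = [(0, 1), (1, 2), (2, 3), (3, 4), (4, 0), (0, 5)]
ring = induced_subgraph(toy_edges, [0, 1, 2, 3, 4])
print(len(ring), is_connected([0, 1, 2, 3, 4], ring))
```

Selecting the five ring nodes keeps all five ring edges, so the result is a connected induced subgraph; selecting nodes 1, 3 and 5 instead yields an induced subgraph with no edges, which violates the connectivity requirement.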

###### Lemma 2.1.

For a given input graph pair , the number of nodes in their MCS is bounded by the size of the smaller of the two graphs, .

###### Proof.

Suppose . However, by definition MCS is a subgraph that is contained in and , which cannot be larger than either or . By contradiction, . ∎

when is subgraph isomorphic to or is subgraph isomorphic to . The task of subgraph isomorphism (checking if one graph is contained in another graph) can be regarded as a special case of MCS detection.

## 3 Proposed Method

In this section we formulate the problem of MCS detection as learning an RL agent that iteratively grows the extracted subgraphs by adding new node pairs to the current subgraphs in a graph-structure-aware environment. We first describe the environment setup, then describe our proposed Joint Subgraph-Node Embedding (JSNE) network and the Deep Q-Network (DQN), which together provide actions for our agent to grow the subgraphs in a tree search context. We also describe how to leverage supervised data, when available, via imitation learning.

### 3.1 MCS Detection as Markov Decision Process

Fundamentally different from image-based semantic graph matching Zanfir and Sminchisescu (2018) and other forms of graph alignment Xu et al. (2019a), graph matching for MCS detection yields two subgraphs that must be isomorphic to each other. Since subgraph isomorphism is a hard constraint the detection result must satisfy, instead of extracting two subgraphs in one shot, we design an RL agent which explores the input graph pair and sequentially grows the extracted two subgraphs one node pair at a time as shown in Figure 2. This not only allows the agent to capture the naturally occurring dependency between different extraction steps but also allows the environment to check whether two subgraphs are isomorphic across steps.

The iterative subgraph extraction process can be described by a Markov Decision Process , where is the set of states consisting of all the possible subgraphs extracted for input graph pairs, is the set of actions representing the selection of new node pairs added to the current subgraphs, is the transition dynamics that gives the transition probabilities between states, , which equals to under the Markov property assumption, and is the reward the agent receives after reaching . The MCS extraction procedure can then be formulated as a sequence , where represents empty subgraphs and represents the two final extracted subgraphs.

### 3.2 Subgraph Extraction Environment

In this section we give more details on the environment in which our RL agent extracts subgraphs from an input graph pair .

#### State Space

We define the state of the environment as the intermediate extracted subgraphs and from input graph pair at time step , which is fully observable by our RL agent. Figure 2 (c) shows an example graph pair from which the agent extracts subgraphs by following sequences which are part of the exploration tree which will be described in Section 3.4.

###### Proposition 3.1.

For a given input graph pair , the size of the state space is exponential in the input graph size, .

###### Proof.

At each step , denote the size of the extracted two subgraphs as , the whole graph sizes as and . In this paper we consider connected induced subgraph, so the largest possible number of subgraphs extracted from one of the two input graphs of size is . For example, in two complete graphs with unlabeled nodes, choosing any number of nodes along with the edges between these nodes would lead to a valid connected induced subgraph. Similarly, the maximum amount of subgraph pairs is .

According to Lemma 2.1, . Denote . Since our RL agent grows one node pair at each step, i.e. , the initial subgraph is empty, i.e. and the final subgraph size at most is , the total maximum amount of subgraph pairs that could occur is

(1) |

∎

#### Action Space

At any given step , our RL agent maintains a subgraph for each of the two input graphs, denoted as and . The agent “grows” both and by adding one new node pair as well as the induced edges between the new nodes and the selected nodes in the subgraphs, so . Since MCS requires and to be connected as defined in Section 2, the action space can be reduced to only choosing the nodes that are directly connected to and .

Intuitively, these candidate nodes are at the “frontier” of the searching and subgraph-growing procedure, and are called “frontier” nodes. Formally, we define the candidate node sets in and to be and . In Figure 1 (a), there are four frontier nodes highlighted in red color. For labeled graphs, since MCS requires node labels to match, we further reduce by removing the node pairs with different labels. We will discuss in detail the policy for selecting a node pair in Section 3.3.1.
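
The frontier-based action space can be sketched as follows. This is an illustrative sketch, not the authors' implementation: the names `frontier` and `candidate_pairs` are hypothetical, graphs are assumed to be adjacency dictionaries, and treating every node as a candidate when nothing is selected yet is an assumption.

```python
def frontier(adj, selected):
    """Nodes outside `selected` that touch at least one selected node."""
    if not selected:
        return set(adj)  # assumption: with empty subgraphs every node is a candidate
    return {v for u in selected for v in adj[u]} - set(selected)


def candidate_pairs(adj1, adj2, sel1, sel2, labels1=None, labels2=None):
    """Cross product of the two frontiers, filtered by label equality if labels exist."""
    pairs = []
    for u in sorted(frontier(adj1, sel1)):
        for v in sorted(frontier(adj2, sel2)):
            if labels1 is None or labels1[u] == labels2[v]:
                pairs.append((u, v))
    return pairs


path = {0: {1}, 1: {0, 2}, 2: {1}}            # 0 - 1 - 2
triangle = {0: {1, 2}, 1: {0, 2}, 2: {0, 1}}  # 3-cycle
unlabeled = candidate_pairs(path, triangle, {0}, {0})
labeled = candidate_pairs(path, triangle, {0}, {0},
                          {0: "C", 1: "N", 2: "C"}, {0: "C", 1: "N", 2: "O"})
print(unlabeled, labeled)
```

In the toy example, label filtering shrinks the candidate set from two node pairs to one, which is the reduction described above for labeled graphs.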

#### Transition Dynamics

In the MCS detection environment, the MCS constraints impose rules under which certain actions proposed by the agent must be rejected, leaving the state unchanged. The subgraph connectivity constraint is ensured by restricting the candidate nodes to be from the frontier node sets as described in Section 3.2.2. However, the isomorphism constraint, i.e. the final extracted and must be isomorphic to each other, needs to be checked by the environment, possibly via an exact or approximate graph isomorphism algorithm Cordella et al. (2001).

Leveraging the fact that at each step , , and that graph isomorphism checking needs to be performed at each step, we propose our own Fast Iterative Graph Isomorphism (FIGI) algorithm, which incurs only linear additional time in the number of nodes. At the root node, the two empty subgraphs are isomorphic. At step , assume is already isomorphic to . If we check the new node pair proposed by the policy and ensure that and at are still isomorphic, then the final result satisfies the isomorphism constraint by induction. In implementation, this is achieved by maintaining a one-to-one node mapping at each step, , such that and . Specifically, we denote the proposed node pair as . We check whether the nodes connected to and the nodes connected to match by first obtaining the nodes and and then checking if and . Since the mapping can be implemented as a hash table and iteratively updated, and since the MCS size is bounded as in Lemma 2.1, the running time of FIGI in each iteration is only in .
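
The incremental check can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions rather than the authors' code: `figi_check`, `grow` and `mapping` are hypothetical names, graphs are adjacency dictionaries, and the node mapping is a plain `dict` playing the role of the hash table.

```python
def figi_check(adj1, adj2, mapping, u, v):
    """Check that adding the pair (u, v) keeps the two induced subgraphs isomorphic.

    `mapping` sends each selected node of the first graph to its matched node
    in the second graph.
    """
    if u in mapping or v in mapping.values():
        return False  # each node may be matched at most once
    selected2 = set(mapping.values())
    # Images (under the mapping) of the already-selected neighbours of u ...
    image_of_u_nbrs = {mapping[w] for w in adj1[u] if w in mapping}
    # ... must coincide with the already-selected neighbours of v.
    v_nbrs = {w for w in adj2[v] if w in selected2}
    return image_of_u_nbrs == v_nbrs


def grow(adj1, adj2, mapping, u, v):
    """Return the extended mapping if (u, v) is accepted, else the unchanged one."""
    if figi_check(adj1, adj2, mapping, u, v):
        return {**mapping, u: v}
    return mapping


path = {0: {1}, 1: {0, 2}, 2: {1}}            # 0 - 1 - 2
triangle = {0: {1, 2}, 1: {0, 2}, 2: {0, 1}}  # 3-cycle
m = grow(path, triangle, {}, 0, 0)
m = grow(path, triangle, m, 1, 1)
rejected = grow(path, triangle, m, 2, 2)      # triangle node 2 also touches node 0
accepted = grow(path, path, m, 2, 2)
print(rejected, accepted)
```

Only the edges between the new node pair and the already-selected nodes are inspected; edges among previously selected nodes were verified at earlier steps, which is the inductive argument made above.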

#### Reward

Since MCS detection aims to find the largest common subgraph, we define the reward our RL agent receives at each step to be 1, i.e. for, , where is the last step when the agent cannot further grow and without violating the MCS constraints, e.g. at the terminal tree node 15, 16, 21, or 22 in Figure 1 (c). The RL agent is trained to achieve the largest accumulated reward, i.e. the predicted MCS size.

### 3.3 Policy Network

Having illustrated the subgraph extraction environment, we outline the architecture of our policy network consisting of a graph embedding network and a Deep Q-Network (DQN) Mnih et al. (2015), which is learned by the RL agent to act in the environment. This DQN-based policy network takes the intermediate extracted subgraphs and as well as the original graphs and as inputs, and outputs the action , which predicts a new node pair, as described in Section 3.2.

#### Embedding Generation: JSNE

Existing Graph Neural Networks (GNN) either aim to embed each node in a graph into a vector Duvenaud et al. (2015); Kipf and Welling (2016); Hamilton et al. (2017); Xu et al. (2019c) or an entire graph into a vector (typically via pooling) Ying et al. (2018); Zhang et al. (2018); Bai et al. (2019b); Hermsdorff and Gunderson (2019). However, for MCS detection, the node embeddings at each step should be conditioned on the current extracted subgraphs and . What is worse, most GNNs embed for a single graph, with exceptions such as Graph Matching Networks (GMN) Zanfir and Sminchisescu (2018); Li et al. (2019), which however do not take subgraph information into account and run in at least quadratic time complexity .

Here we present our JSNE network, which jointly embeds two graphs conditioned on the selected subgraphs in an elegant and efficient way within a unified model. As illustrated in Figure 2, at each step , we add a “pseudo subgraph node” connected to every node in and to perform graph comparison via cross-graph communication through the pseudo node serving as an information-exchanging bridge. This resembles some earlier models Frasconi et al. (1998); Scarselli et al. (2009) which connect a “supersource” node to all the nodes in one graph to generate graph-level embedding. However, our goal is different and connects to and which grow across steps.

In each iteration, by adopting any existing message passing based GNN, e.g. GAT Velickovic et al. (2018), JSNE produces the embeddings of all the nodes and the subgraph jointly. Running in only ) time, JSNE approximates the quadratic GMN models which explicitly compare nodes in two graphs. Multiple JSNE layers are sequentially stacked similar to typical GNN architecture. Node embeddings are one-hot encoded initially according to their node labels (for unlabeled graphs a constant vector is used), and the subgraph embedding is initialized to zeros. The embeddings are iteratively updated as the subgraphs grow.
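
The role of the pseudo subgraph node as a bridge between the two graphs can be illustrated with a small sketch. It is not the JSNE layer itself: plain mean aggregation stands in for a GAT-style layer, nodes are assumed to be numbered from 0 in each graph, and the names `build_joint_graph` and `mean_message_pass` are hypothetical.

```python
def build_joint_graph(adj1, adj2, sel1, sel2):
    """Merge two graphs and link one pseudo node to every selected node."""
    offset = len(adj1)            # nodes of the second graph are shifted by this
    pseudo = offset + len(adj2)   # index of the pseudo subgraph node
    joint = {u: set(nbrs) for u, nbrs in adj1.items()}
    for v, nbrs in adj2.items():
        joint[v + offset] = {w + offset for w in nbrs}
    joint[pseudo] = set()
    for node in list(sel1) + [v + offset for v in sel2]:
        joint[node].add(pseudo)
        joint[pseudo].add(node)
    return joint, pseudo


def mean_message_pass(joint, feats):
    """One round of mean aggregation over each node and its neighbours."""
    out = {}
    for node, nbrs in joint.items():
        group = [feats[node]] + [feats[n] for n in nbrs]
        dim = len(feats[node])
        out[node] = [sum(vec[d] for vec in group) / len(group) for d in range(dim)]
    return out


edge = {0: {1}, 1: {0}}                       # a single edge, used for both graphs
joint, pseudo = build_joint_graph(edge, edge, sel1={0}, sel2={1})
feats = {n: [0.0] for n in joint}
feats[0] = [1.0]                              # signal placed on a node of the first graph
round1 = mean_message_pass(joint, feats)
round2 = mean_message_pass(joint, round1)
print(round2[3])                              # selected node of the second graph
```

After two rounds, the signal that started on a node of the first graph has reached the selected node of the second graph through the pseudo node, while the cost of each round stays proportional to the number of edges plus the number of selected nodes.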

#### Action Prediction: DQN

Once the node and subgraph embeddings are generated by the multi-layered JSNE network, we use a Multilayer Perceptron (MLP) to produce a distribution of actions over the candidate node pairs (which is further reduced for labeled graphs as mentioned in Section 3.2.2), :

(2) |

where and denote node embeddings where and , denotes the subgraph embeddings , and denote graph-level embeddings generated by aggregating the node embeddings using an aggregation function such as and .
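
A minimal sketch of how the DQN input for one candidate node pair might be assembled is given below. It assumes, based on the description above and the 320-dimensional input reported in Section 4.2, that five 64-dimensional vectors are concatenated; the sum aggregation, the single linear layer standing in for the MLP, and all names are assumptions for illustration.

```python
import random

EMBED_DIM = 64  # embedding size reported in Section 4.2


def aggregate(node_embeddings):
    """Sum node embeddings into one graph-level vector (assumed aggregation)."""
    return [sum(column) for column in zip(*node_embeddings)]


def q_input(h_u, h_v, h_sub, embeddings1, embeddings2):
    """Concatenate the five vectors that describe one candidate node pair."""
    return h_u + h_v + h_sub + aggregate(embeddings1) + aggregate(embeddings2)


def q_value(x, weights, bias):
    """Single linear layer standing in for the MLP."""
    return sum(w * xi for w, xi in zip(weights, x)) + bias


rng = random.Random(0)
g1 = [[rng.random() for _ in range(EMBED_DIM)] for _ in range(4)]  # 4 nodes
g2 = [[rng.random() for _ in range(EMBED_DIM)] for _ in range(5)]  # 5 nodes
h_sub = [0.0] * EMBED_DIM                  # subgraph embedding, initialized to zeros
x = q_input(g1[0], g2[1], h_sub, g1, g2)
weights = [0.01] * len(x)
print(len(x))  # 5 * 64 = 320
```

Scoring every candidate pair this way and taking the highest-scoring pairs gives the action choices used by the search procedure.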

### 3.4 Subgraph Exploration Tree

So far we have described how to generate a sequence of actions in RlMcs. However, according to Proposition 3.1, the search space is exponential in the graph sizes and is thus intractable to search thoroughly. Besides, for MCS detection, it is very likely to obtain a suboptimal solution, i.e. a predicted subgraph size smaller than the true MCS size, e.g. state nodes 15 and 16 in Figure 2, which is unwanted considering the nature of the task, though such suboptimal solutions still satisfy the MCS constraints.

In order to address these issues, we propose a novel subgraph exploration tree, inspired by beam search, a dominant strategy for approximate decoding in structured prediction Negrinho et al. (2018) such as machine translation Sutskever et al. (2014), and also a well-known graph search algorithm Baranchuk et al. (2019). With a budget hyperparameter, the beam size, the agent is allowed to transition to at most beam size best new states at any given state. In Figure 2, the beam size is 3, so each level of the tree can have up to 3 state nodes. The proposed algorithm is shown in Algorithm 1.
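
The effect of the beam size can be illustrated with a generic beam search sketch. It is not Algorithm 1: the function `explore`, the callbacks `actions` and `q_value`, and the toy search tree are hypothetical, and the sketch caps each tree level at beam size states.

```python
def explore(initial, actions, q_value, beam_size):
    """Level-wise beam search; returns one deepest state that was reached.

    A state is a tuple of the actions taken so far. `actions(state)` lists the
    actions the environment allows, and `q_value(state, action)` scores them.
    """
    level, best = [initial], initial
    while level:
        scored = [(q_value(s, a), s + (a,)) for s in level for a in actions(s)]
        scored.sort(key=lambda item: item[0], reverse=True)
        level = [state for _, state in scored[:beam_size]]
        if level:
            best = level[0]  # all states on a level have the same depth
    return best


# Toy tree: action "a" looks best at the root but is a dead end.
toy_tree = {(): ["a", "b"], ("b",): ["c"]}
toy_q = {"a": 1.0, "b": 0.9, "c": 0.5}


def toy_actions(state):
    return toy_tree.get(state, [])


def toy_q_value(state, action):
    return toy_q[action]


greedy = explore((), toy_actions, toy_q_value, beam_size=1)
wider = explore((), toy_actions, toy_q_value, beam_size=2)
print(greedy, wider)
```

With a beam size of 1 the search commits to the highest-scoring first action and stops at depth one, whereas a beam size of 2 keeps the second option alive and reaches depth two, mirroring how a wider beam reduces the chance of returning a suboptimal subgraph.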

###### Proposition 3.2.

For a given input graph pair , the maximum depth of the subgraph exploration tree is linear in the smaller of the two input graph sizes, .

###### Proof.

At each state node in the search tree, the intermediate subgraphs satisfy the MCS constraints by definition. Thus, the final predicted MCS size cannot be larger than the true MCS size . Defining the tree depth as the largest number of steps starting from root to terminal node, is then equal to the predicted MCS size. Thus according to Lemma 2.1. ∎

### 3.5 Overall Training

We adopt the standard Deep Q-learning framework Mnih et al. (2013). For each , the agent performs the subgraph exploration tree search according to Algorithm 1, after which the parameters in the JSNE and DQN are updated by performing mini-batch gradient descent over the mean squared error loss. Since imitation learning is known to help with training stability and performance Levine and Koltun (2013), we allow the agent to follow expert sequences generated by ground-truth MCS solvers, e.g. McSplit McCreesh et al. (2017), during tree search, extending the exploration tree to an exploration-imitation tree with a hyperparameter denoting the percentage of pairs utilizing such labeled MCS instances. More details are shown in the Supplementary Material.
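
For readers less familiar with Deep Q-learning, the mean squared error objective can be sketched as below. This follows the textbook one-step target rather than any detail specific to this paper: the discount factor `gamma`, the function names, and the omission of a replay buffer and target network are all simplifying assumptions.

```python
def td_target(reward, next_q_values, gamma):
    """One-step Q-learning target; terminal states have no next actions."""
    if not next_q_values:
        return reward
    return reward + gamma * max(next_q_values)


def mse_loss(predictions, targets):
    """Mean squared error between predicted q values and their targets."""
    return sum((p - t) ** 2 for p, t in zip(predictions, targets)) / len(predictions)


# Each accepted node pair yields reward 1, as defined in Section 3.2.
target_mid = td_target(1.0, [2.0, 3.0], gamma=0.5)   # non-terminal state
target_end = td_target(1.0, [], gamma=0.5)           # terminal state
loss = mse_loss([1.0, 2.0], [1.0, 4.0])
print(target_mid, target_end, loss)
```

Because every step contributes a reward of 1, the accumulated target grows with the number of node pairs that can still be added, which is what ties the learned q values to the predicted MCS size.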

### 3.6 Complexity Analysis

At each state, the agent needs to generate embeddings and the q values which have worst-case time complexity ) due to the action space consisting of all the node pairs. Overall the tree depth is bounded by according to Proposition 3.2 and each level of the tree has at most beam size state nodes. Thus, the overall time complexity for each forward pass is . It is noteworthy that in contrast, state-of-the-art exact MCS computation algorithms Hoffmann et al. (2017); McCreesh et al. (2017) do not have worst-case time complexity guarantee, and as shown next, RlMcs strikes a good balance between speed and accuracy.

## 4 Experiments

We evaluate RlMcs against two state-of-the-art exact MCS detection algorithms and a series of approximate graph matching methods from various domains. We conduct experiments on a variety of synthetic and real-world datasets. The code and datasets are provided as part of the Supplementary Material. All the baseline implementations are provided as well.

### 4.1 Baseline Methods

There are three groups of methods: exact solvers including McSplit McCreesh et al. (2017) and k↓ Hoffmann et al. (2017), supervised models including I-pca Wang et al. (2019), gmn Li et al. (2019) and NeuralMcs Bai et al. (2020b), and unsupervised models including gw-qap Xu et al. (2019a) and our proposed RlMcs.

McSplit and k↓ are given a time budget of 100 seconds for each pair, and their results on training graph pairs are used to train the supervised models I-pca, gmn and NeuralMcs. Specifically, the true MCS results are returned by the solvers as node-node mappings for nodes included in the MCS as illustrated in Figure 1. For each graph pair, I-pca, gmn and NeuralMcs generate a matching matrix indicating the likelihood of each node pair being matched, which is fed into the binary cross entropy loss function against the true matching matrix generated by exact solvers, replacing their original domain-specific loss functions. gw-qap performs Gromov-Wasserstein discrepancy Peyré et al. (2016) based optimization for each graph pair and outputs a matching matrix , which does not require supervision by true MCS results.

During testing, necessary adaptation is performed to I-pca, gmn and gw-qap, which are designed for other tasks. Since they all yield a for each graph pair indicating the likelihood of node pairs being matching, we feed the predicted into our proposed subgraph exploration tree as detailed in Section 3.4. Specifically, we use as the q value for node pair instead of calling as in Algorithm 1. All other aspects including the definition of frontier nodes, checking if a selection of node pair is allowed by environment, the value of beam size, etc., are set the same way or value as our model RlMcs. A more detailed description can be found in the Supplementary Material.

Method | BA (Core=32) | BA (Core=40) | ER (Core=32) | ER (Core=40) | WS (Core=32) | WS (Core=40) | Reddit | Aids |
---|---|---|---|---|---|---|---|---|
McSplit * | 0* | 0* | 0* | 0* | 0* | 0* | 100* | 100* |
k↓ * | 0* | 0* | 0* | 0* | 0* | 0* | 100* | 100* |
I-pca | 57.656 | 46.550 | 77.844 | 62.000 | 92.406 | 82.500 | 30.719 | 93.332 |
gmn | 56.750 | 45.550 | 77.906 | 61.900 | 78.125 | 82.500 | 87.272 | 97.182 |
gw-qap | 35.938 | 27.825 | 16.688 | 13.075 | 40.625 | 40.425 | 34.618 | 57.919 |
NeuralMcs | 57.188 | 47.256 | 77.094 | 61.000 | 84.375 | 80.000 | 99.562 | 99.626 |
RlMcs | 58.750 | 68.750 | 81.438 | 62.650 | 96.875 | 84.000 | 93.187 | 98.467 |

### 4.2 Parameter Settings

For our model, we utilize 3 layers of JSNE each with 64 dimensions for the embeddings. We use as our activation function. We set beam size to 5. We ran all experiments with Intel i7-6800K CPU and one Nvidia Titan GPU. For DQN, we use MLP layers to project concatenated embeddings from 320 dimensions to a scalar. We observe better performance using for real datasets and for synthetic datasets. For training, we set the learning rate to 0.001, the number of iterations to 600 for synthetic datasets and 2000 for real datasets, and use the Adam optimizer Kingma and Ba (2015). All experiments were implemented with the PyTorch and PyTorch Geometric libraries Fey and Lenssen (2019).

### 4.3 Evaluation Metrics

For each testing graph pair, we collect the extracted subgraphs , and measure the accuracy and running time (msec in wall time), which are then averaged across all the testing pairs. The definition of MCS detection accuracy is , where .
If and are returned within the time limit and are connected, induced, and isomorphic to each other, . Otherwise . When checking isomorphism, we set a timeout (10 seconds) for exact isomorphism checking Cordella et al. (2001), and switch to an approximate isomorphism check.

### 4.4 Results on Synthetic Data

The key property of RlMcs is its ability to extract MCS without much supervised data in an efficient and relatively accurate way. We generate three types of synthetic datasets using three popular graph generation models: Barabási-Albert (BA) Barabási and Albert (1999), Erdős-Rényi (ER) Gilbert (1959) and Watts-Strogatz (WS) Watts and Strogatz (1998). For each model, we first generate 1000 graph pairs for training and 100 for testing. When generating , we first generate a “common core” and then “grow” upon the core to obtain and where . We vary the core size to obtain several datasets. The exact procedure is shown in the Supplementary Material. This way, we obtain which we know must contain but can be larger, essentially allowing the accuracy score to be above 1. Notice that by incorporating the hard checking ( described in Section 4.3), all the predictions satisfy the MCS constraints, and the larger the accuracy, the larger the predicted subgraphs.

We see that our model is able to outperform the baselines on instances where the ground-truth MCS is difficult to obtain. The exact solvers fail to return within the time limit on all pairs, which is not surprising considering the NP-hard nature of the task and the lack of time complexity guarantee of McSplit and k↓. Notice that in the original paper of McSplit, the time limit is set to 1000 seconds (around 17 minutes per pair), an unaffordable time budget considering that most machine learning models finish within 10 seconds as shown in Figure 3. The failure of exact solvers under the more realistic time budget setting implies that purely supervised models (I-pca, gmn and NeuralMcs) cannot even be used in practice due to a severe lack of labeled instances. In these experiments, we give these supervised models an advantage by feeding as “fuzzy” ground-truth MCS into their training stage. In practice, the most reasonable choice would be gw-qap and our model RlMcs, with our model being much more accurate.

### 4.5 Results on Real Datasets

In addition to the synthetic datasets, we show that with the presence of supervision data, NeuralMcs is competitive in accuracy against baselines. We use two datasets: (1) Reddit Yanardag and Vishwanathan (2015) is a collection of 11929 online discussion threads represented as user interaction graphs, which is commonly used in graph classification tasks; (2) Aids Zeng et al. (2009) is a popular dataset used in graph similarity computation and search. We observe that McSplit and k↓ fail to solve large graphs in these datasets most of the time, so we randomly sample graphs with fewer than 17 nodes, forming 3556 graph pairs with an average graph size of 11.8 nodes for Reddit, and randomly sample 29610 graph pairs whose average graph size is 8.7 nodes for Aids.

Notice that under this setting, RlMcs still uses zero labeled instances, i.e. it is purely unsupervised, relying on subgraph exploration and DQN training to extract the MCS. In contrast, I-pca, gmn, and NeuralMcs rely on the exact solvers to provide ground-truth instances. For these relatively small graphs, exact solvers successfully return the correct results within 100 seconds, but are still much slower than machine learning approaches as shown in Figure 3. From Table 1, we see that the heavy reliance on supervised data indeed brings a performance gain to supervised models, especially NeuralMcs, and that the unsupervised gw-qap performs relatively poorly. However, our RL agent still yields the second best accuracy and performs better than I-pca and gmn.

### 4.6 Contribution of Each Module

#### Contribution of JSNE

We replace the JSNE module with two other graph embedding modules to study the effect of JSNE. Specifically, we use GAT Velickovic et al. (2018) and GMN Li et al. (2019). As shown in Table 2, with GAT and GMN, our agent can no longer perform conditional embeddings based on subgraph extraction status, and instead uses the same embeddings without considering subgraph growth, leading to worse performance.

Method | Accuracy |
---|---|
RlMcs w/ GAT | 82.685 |
RlMcs w/ GMN | 72.026 |
RlMcs w/ JSNE (full) | 93.187 |

#### Contribution of Subgraph Exploration with Search Tree

As mentioned in Section 4.1, we adapt the baseline models to tackle MCS detection by feeding the matching matrix into our Algorithm 1. To investigate the effectiveness of our search tree, we feed generated by these methods into a simpler strategy based on thresholding and the Hungarian Algorithm Kuhn (1955): we remove nodes whose overall matching scores, computed as the summation across rows and columns of , are lower than a tunable threshold. Since the number of remaining nodes in and may be different, we then run the Hungarian Algorithm for Linear Assignment Problems (LAP) on and , yielding two subgraphs of the same size returned as the prediction. We also try running the Hungarian Algorithm on the original to obtain two subgraphs as the final prediction, and report the better of the two results.

It is noteworthy that NeuralMcs uses its own search method called Guided Subgraph Extraction (GSE) Bai et al. (2020b), which can be roughly considered as a simpler variant of our proposed search with . More details on comparing these strategies can be found in the Supplementary Material. As shown in Table 3, with this simpler alternative strategy, the performance of Graph Matching Networks and NeuralMcs drops by a large amount, while the performance of I-pca and gw-qap increases slightly.

Method | Accuracy |
---|---|
I-pca w/o Search | 33.987 |
I-pca w/ Search | 30.719 |
gmn w/o Search | 43.137 |
gmn w/ Search | 87.272 |
gw-qap w/o Search | 35.948 |
gw-qap w/ Search | 34.618 |
NeuralMcs w/o Search | 29.669 |
NeuralMcs w/ GSE | 99.562 |

#### Contribution of FIGI

The Fast Iterative Graph Isomorphism (FIGI) algorithm proposed in Section 3.2.3 is used by the environment to check if a selection of a new node pair is allowed. Here we compare FIGI with two alternatives: (1) the exact graph isomorphism used in evaluation as described in Section 4.3; (2) the Subgraph Isomorphism Network (SIN) proposed in NeuralMcs Bai et al. (2020b), which essentially performs the Weisfeiler-Lehman (WL) graph isomorphism test Shervashidze et al. (2011) using node embeddings generated at each step . As shown in Table 4, on Reddit FIGI successfully ensures that all the returned predictions satisfy the isomorphism constraint posed by the MCS definition, and is much faster than the other approaches. In fact, we observe that under all the settings in our experiments, FIGI exhibits perfect isomorphism detection accuracy. A more detailed discussion on FIGI can be found in the Supplementary Material.

Method | Iso % | Running Time (msec) |
---|---|---|
RlMcs w/ Exact GI | 100 | 1014.945 |
RlMcs w/ SIN GI | 100 | 665.457 |
RlMcs w/ FIGI | 100 | 594.687 |

## 5 Related Work

MCS detection is NP-hard, with existing methods based on constraint programming (Vismara and Valery, 2008; McCreesh et al., 2016), branch and bound (McCreesh et al., 2017; Liu et al., 2019), mathematical programming (Bahiense et al., 2012), conversion to maximum clique detection (Levi, 1973; McCreesh et al., 2016), etc. Closely related to MCS detection is Graph Edit Distance (GED) computation (Bunke, 1983), which in the most general form refers to finding a series of edit operations that transform one graph to another and has also been adopted in many tasks where the matching or similarity between graphs is necessary. There is a growing trend of using machine learning approaches to approximate graph matching and similarity score computation, but these works either do not address MCS detection specifically and must be adapted Zanfir and Sminchisescu (2018); Wang et al. (2019); Yu et al. (2020); Xu et al. (2019b, a); Bai et al. (2019a, 2020a); Li et al. (2019); Ling et al. (2020), or rely on labeled instances Bai et al. (2020b).

## 6 Conclusion and Future Work

We have proposed a reinforcement learning method which unifies graph representation learning, deep Q-learning and imitation learning into a single framework. We show that the resulting model shows superior performance on various graph datasets. In the future, we plan to extend our method to subgraph matching Sun et al. (2012), which requires the matching and retrieval of all subgraphs contained in a large graph. Additionally, to improve the scalability of our method, we will explore new graph search and matching algorithms.

## Appendix A Dataset Description

The datasets have been deposited in a data repository preserving anonymity Anonymous (2020) and can be found at http://doi.org/10.5281/zenodo.3676334. The code including the implementation of our model and the baseline methods as well as the code for graph generation will be made available.

### A.1 Details of Graph Generation

As mentioned in the main text, we adapt popular graph generation algorithms for generating graph pairs each sharing a common induced subgraph (common core) used as synthetic datasets. Therefore, for each graph pair, the MCS is at least as large as their common core, which can be used for evaluation as described in the main text. The challenge is to ensure the common core graph is subgraph isomorphic to both parent graphs while following the procedure of an underlying well-known graph generation algorithm. This section details the generation procedure with three underlying generation algorithms.

We denote the graph pair to generate as and . We denote the nodes of the common core as , whose size is , the nodes that are not in the common core as and for and respectively. Thus, is equivalent to , and is equivalent to . We denote the total number of nodes in the two graphs as N, i.e. . We denote as a one-to-one function mapping one node from to one node from .

Barabási-Albert (BA) Barabási and Albert (1999) generates graphs by successively adding and randomly connecting new nodes to the previously added nodes. In our case, we generate a graph pair by connecting the first nodes in the common core to the same previously added nodes, and then for the next nodes follow the BA framework independently. This is detailed in Algorithm 2. We set the edge density (which is an integer) to 2 in the experiments.
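The BA-based construction can be illustrated with a minimal sketch. It is not a reproduction of Algorithm 2: the function name, the integer node ids (core nodes are `0..core_size-1` in both graphs, so the core mapping is the identity), and the repeated-endpoint pool used for degree-proportional sampling are all assumptions made for illustration.

```python
import random


def ba_pair_with_common_core(core_size, n1, n2, m=2, seed=0):
    """Return (edges1, edges2, core_edges) for two BA-style graphs sharing a core.

    Hypothetical helper: requires core_size >= m. Edges are stored as (u, v)
    tuples with u < v.
    """
    rng = random.Random(seed)

    def grow(edges, pool, start, stop):
        # Attach each new node to m distinct existing nodes, sampled roughly
        # proportionally to degree through the repeated-endpoint pool.
        for new in range(start, stop):
            targets = set()
            while len(targets) < m:
                targets.add(rng.choice(pool))
            for t in targets:
                edges.add((t, new))  # t already exists, hence t < new
                pool.extend([t, new])

    # Shared phase: the core is grown once and then copied into both graphs.
    core_edges, core_pool = set(), list(range(m))
    grow(core_edges, core_pool, m, core_size)

    # Independent phase: every later edge touches a non-core node, so the
    # subgraph induced by the core nodes is left unchanged.
    edges1, pool1 = set(core_edges), list(core_pool)
    grow(edges1, pool1, core_size, n1)
    edges2, pool2 = set(core_edges), list(core_pool)
    grow(edges2, pool2, core_size, n2)
    return edges1, edges2, core_edges


g1, g2, core = ba_pair_with_common_core(core_size=8, n1=16, n2=16, m=2, seed=1)
```

Because non-core nodes are only ever attached after the core is complete, restricting either graph to the core nodes recovers exactly `core`.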

Erdős-Rényi (ER) Gilbert (1959) generates graphs by randomly adding edges to isolated nodes where all edges have equal probability to be generated with an edge density parameter, . In our case, we first generate the random core graph of nodes. Then, to ensure the newly added edges do not modify the already generated common core, for each new edge, we only add the edge if the two nodes in the edge are not in the common core graph. As this entire process does not ensure that the generated graphs are connected, we repeat it until , and the common core graph are connected. This is detailed in Algorithm 3. We set to 0.07 in the experiments.
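A minimal sketch of the ER-based procedure follows. It is not Algorithm 3 itself. It assumes that "not in the common core" means that the two endpoints are not both core nodes, and the helper names and the `max_tries` guard are hypothetical. The usage example takes a larger edge probability than the 0.07 used in the experiments so that the retry loop ends quickly on tiny graphs.

```python
import itertools
import random


def is_connected(nodes, edges):
    """Depth-first search connectivity check on an undirected edge set."""
    nodes = list(nodes)
    adj = {v: set() for v in nodes}
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    seen, stack = {nodes[0]}, [nodes[0]]
    while stack:
        for w in adj[stack.pop()] - seen:
            seen.add(w)
            stack.append(w)
    return len(seen) == len(nodes)


def er_pair_with_common_core(core_size, n1, n2, p, seed=0, max_tries=10_000):
    """Return (edges1, edges2, core_edges); core nodes are 0..core_size-1."""
    rng = random.Random(seed)
    core = range(core_size)

    def sample(n, fixed):
        edges = set(fixed)
        for u, v in itertools.combinations(range(n), 2):
            # u < v, so v >= core_size means at least one endpoint is outside
            # the core; pairs inside the core are never touched again.
            if v >= core_size and rng.random() < p:
                edges.add((u, v))
        return edges

    for _ in range(max_tries):
        core_edges = {e for e in itertools.combinations(core, 2) if rng.random() < p}
        g1, g2 = sample(n1, core_edges), sample(n2, core_edges)
        # Repeat the whole process until both graphs and the core are connected.
        if (is_connected(core, core_edges) and is_connected(range(n1), g1)
                and is_connected(range(n2), g2)):
            return g1, g2, core_edges
    raise RuntimeError("no connected pair found; try a larger p or max_tries")


g1, g2, core = er_pair_with_common_core(core_size=5, n1=10, n2=10, p=0.5)
```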

Watts-Strogatz (WS) Watts and Strogatz (1998) generates graphs by starting with a ring lattice, a graph where each node is connected to a fixed number of neighbors, and then randomly re-wiring the edges of the lattice. In our case, we first generate and as two ring lattice graphs which are identical to each other. Then, we select nodes to be the common core, and perform random rewiring with rewiring probability on the two graphs ensuring the common core is subgraph isomorphic to and . As this entire process does not ensure that the generated graphs are connected, we repeat it until , and the common core graph are connected. This is detailed in Algorithm 4. We set the ring density to 4 and to 0.2 and the rewiring probability, , to in the experiments.

### A.2 Details of Real Graph Datasets

#### Aids

Aids is a collection of antivirus screen chemical compounds, obtained from the Developmental Therapeutics Program at NCI/NIH. The Aids dataset has been used by many works in graph matching (Zeng et al., 2009; Wang et al., 2012; Zheng et al., 2013; Zhao et al., 2013; Liang and Zhao, 2017; Bai et al., 2019a). These chemical compounds have node labels representing chemical elements (e.g., Carbon, Nitrogen, Chlorine, etc.) and edges denoting bonds between atoms. There are a total of 700 graphs, from which we sample 29610 graph pairs. The average graph size is 8.664 nodes, with the largest graph having 10 nodes.

#### Reddit

Reddit is a collection of online discussion networks from the Reddit online discussion website (Yanardag and Vishwanathan, 2015). The nodes in this dataset are unlabeled: nodes represent users in a thread and edges represent whether users interact in the discussion. In total, there are 7112 graphs, from which we sample 3556 pairs. The average graph size is 11.8 nodes and the largest graph has 16 nodes.

## Appendix B Differences between RlMcs and NeuralMcs

The major difference between RlMcs proposed in this paper and NeuralMcs Bai et al. (2020b) is that RlMcs uses reinforcement learning to perform MCS detection while NeuralMcs is purely supervised, trained on the ground-truth MCS matching matrix. Another obvious difference is the proposed JSNE module for graph representation generation used in RlMcs versus the GMN module used in NeuralMcs. As shown in the main text, replacing JSNE with GMN in our model leads to worse performance, which can be largely attributed to the fact that our reinforcement learning approach requires node embeddings to be conditioned on the subgraph extraction state. In contrast, NeuralMcs generates the matching between all node pairs in one shot as a matching matrix, and therefore does not necessarily require the node embeddings to change dynamically as subgraph extraction proceeds.

As for the subgraph extraction procedure, it is noteworthy that RlMcs uses the proposed Subgraph Exploration Tree method, while NeuralMcs uses a simpler procedure called “Guided Subgraph Extraction” Bai et al. (2020b). There are two major differences between Subgraph Exploration Tree and Guided Subgraph Extraction. First, the Subgraph Exploration Tree uses our proposed FIGI algorithm for isomorphism checking, while the Guided Subgraph Extraction uses a Subgraph Isomorphism Network (SIN) for that purpose. Second, Subgraph Exploration Tree has a tunable beam size parameter, while in Guided Subgraph Extraction the search is much greedier, which is equivalent to the beam size always being 1.

Since the differences between the proposed Subgraph Exploration Tree and Guided Subgraph Extraction are quite subtle, we conduct the following experiments to study the two differences in more detail.

### B.1 SIN vs FIGI

In this set of experiments, we make the choice of graph isomorphism checking algorithm a tunable parameter in both RlMcs and NeuralMcs while letting both models use the same beam size. As shown in Table 5, NeuralMcs performs the same no matter which algorithm is used, while RlMcs performs better using our proposed FIGI instead of the SIN method.

Iso Checking | Method | Acc |
---|---|---|
SIN | NeuralMcs | 99.644 |
FIGI | NeuralMcs | 99.644 |
SIN | RlMcs | 89.661 |
FIGI | RlMcs | 93.187 |

In fact, both FIGI and SIN are fast approximate graph isomorphism checking algorithms. For FIGI, at step , the node-node mapping is updated once a new node pair is selected. This guarantees that there are no “false positives” but there can be “false negatives”, i.e. if FIGI returns true for two graphs being isomorphic to each other, they must be isomorphic, but if FIGI returns false, they could be either isomorphic or not. There are no false positives because the node-node mapping at each step ensures the isomorphism between two graphs. There can be false negatives for the following reason. When a new node pair is selected, is updated by adding (which implies ), leading to . However, notice at step the nodes and do not have to match each other, and the already-mapped nodes in can be remapped to the newly selected nodes and , e.g. an already-mapped node pair can be potentially remapped to and . Since there are possible node-node mappings for node pairs, FIGI simply assumes the mapping at does not change and is passed to the mapping at by assuming matches and adding . Therefore, even when FIGI returns false for the two new subgraphs at leading to the environment rejecting the proposed action , the inclusion of nodes and may lead to subgraphs at isomorphic to each other, via some unfound node-node re-mappings.
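The false-negative behaviour can be made concrete with a small sketch. It assumes that the incremental check verifies the adjacency of the new node pair against the already-mapped pairs and never revisits them. The function name and the toy graphs are hypothetical, and node labels are ignored.

```python
def consistent_extension(adj1, adj2, mapping, u, v):
    """Check whether adding the pair (u, v) keeps `mapping` an isomorphism
    between the two induced subgraphs, without revisiting earlier pairs."""
    return all((a in adj1[u]) == (b in adj2[v]) for a, b in mapping.items())


# G1 is the path 3 - 1 - 2 and G2 is the path x - y - z.
adj1 = {1: {2, 3}, 2: {1}, 3: {1}}
adj2 = {"x": {"y"}, "y": {"x", "z"}, "z": {"y"}}
mapping = {1: "x", 2: "y"}  # current subgraphs: edge 1-2 and edge x-y

# With the mapping frozen, (3, "z") is rejected: 3-1 is an edge, z-x is not.
rejected = not consistent_extension(adj1, adj2, mapping, 3, "z")

# Remapping the earlier pairs to {1: "y", 2: "x"} accepts the same new pair,
# so the two 3-node subgraphs are in fact isomorphic (a false negative).
remapped = {1: "y", 2: "x"}
found_by_remapping = consistent_extension(adj1, adj2, remapped, 3, "z")
```

A check of this form costs time linear in the number of mapped pairs per step, at the price of missing isomorphisms that require changing earlier pairs.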

SIN, proposed in NeuralMcs Bai et al. (2020b), guarantees no false negatives but may produce false positives, i.e. if SIN returns false for two graphs being isomorphic to each other, the conclusion must be correct, but if SIN returns true for two graphs being isomorphic, in reality they may not be isomorphic. The fundamental reason is that SIN mimics the Weisfeiler-Lehman (WL) graph isomorphism test Shervashidze et al. (2011), which assigns labels to nodes in the two graphs and compares whether the two node label sets are different. In SIN, the label assignment is implemented as embedding aggregation, i.e. for each node, the label is iteratively updated as the aggregation of the embeddings of the neighboring nodes^{2}^{3}.
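The one-sided error of WL-style checks can be seen in a minimal sketch of the classical 1-dimensional WL color refinement. This is the discrete test that SIN is said to mimic, not SIN itself, which aggregates learned embeddings; the function name and the number of rounds are illustrative assumptions.

```python
from collections import Counter


def wl_could_be_isomorphic(adj1, adj2, rounds=3):
    """1-dimensional WL color refinement; False is conclusive, True is not."""
    graphs = [adj1, adj2]
    colors = [{v: 0 for v in adj1}, {v: 0 for v in adj2}]
    for _ in range(rounds):
        # Signature: own color plus the sorted multiset of neighbor colors.
        sigs = [
            {v: (c[v], tuple(sorted(c[w] for w in g[v]))) for v in g}
            for g, c in zip(graphs, colors)
        ]
        # Shared compression table so colors are comparable across graphs.
        distinct = sorted(set(sigs[0].values()) | set(sigs[1].values()))
        table = {s: i for i, s in enumerate(distinct)}
        colors = [{v: table[s] for v, s in sig.items()} for sig in sigs]
        # Compare the two graphs' color multisets.
        if Counter(colors[0].values()) != Counter(colors[1].values()):
            return False
    return True


two_triangles = {0: {1, 2}, 1: {0, 2}, 2: {0, 1}, 3: {4, 5}, 4: {3, 5}, 5: {3, 4}}
hexagon = {i: {(i - 1) % 6, (i + 1) % 6} for i in range(6)}
path3 = {0: {1}, 1: {0, 2}, 2: {1}}
triangle = {0: {1, 2}, 1: {0, 2}, 2: {0, 1}}

false_positive = wl_could_be_isomorphic(two_triangles, hexagon)  # not isomorphic
true_negative = wl_could_be_isomorphic(path3, triangle)
```

Two disjoint triangles and a 6-cycle are both 2-regular on six nodes, so refinement never separates them even though one graph is disconnected and the other is not.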

In conclusion, both FIGI and SIN are inexact, and in this set of experiments, using FIGI instead of SIN brings a performance gain to RlMcs on Reddit.

### B.2 Beam Size

Table 6 shows the effect of using different beam sizes for both NeuralMcs and RlMcs. Note that we use FIGI consistently, so the only difference in the search process is the beam size. A larger beam size helps increase the performance of both models, which is not surprising given that a larger beam enlarges the search space.

Beam Size | Method | Acc |
---|---|---|
1 | NeuralMcs | 99.434 |
5 | NeuralMcs | 99.644 |
1 | RlMcs | 86.375 |
5 | RlMcs | 93.187 |

## Appendix C Details on Training RlMcs

We adopt the standard Deep Q-learning framework Mnih et al. (2013). For each , the agent performs the subgraph exploration tree search, after which the parameters in the JSNE and DQN are updated. Notice each state is represented as a node in the subgraph exploration tree, and in each state, the agent tries to pick a new node pair. Since the Q approximation is not yet well trained at the beginning of training, and random behavior may be better, we adopt the epsilon-greedy method, switching between a random policy and the Q policy using a probability hyperparameter . This probability is tuned to decay slowly as the agent learns to play the game, eventually stabilizing at a fixed probability. We set the starting epsilon to 0.7, decaying to 0.001.
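A minimal sketch of epsilon-greedy node pair selection is given below. The start and end values 0.7 and 0.001 come from the text; the multiplicative schedule, the decay rate 0.999 and both function names are assumptions, since the exact schedule is not specified here.

```python
import random


def epsilon_at(step, eps_start=0.7, eps_end=0.001, decay=0.999):
    """Multiplicative decay from eps_start, clipped at eps_end.

    The decay rate and the schedule shape are illustrative assumptions.
    """
    return max(eps_end, eps_start * decay ** step)


def select_action(q_values, epsilon, rng=random):
    """q_values maps each candidate node pair to its predicted Q value."""
    if rng.random() < epsilon:
        return rng.choice(list(q_values))  # explore: random node pair
    return max(q_values, key=q_values.get)  # exploit: highest Q value


candidates = {("a", "x"): 0.1, ("b", "y"): 0.9}
greedy_choice = select_action(candidates, epsilon=0.0)
```

With `epsilon=0.0` the selection is purely greedy, while early in training the high epsilon makes most selections random.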

We denote the DQN as a function which generates a q value for each state-action pair. For each graph pair, after its subgraph extraction tree process is over, we collect all the transitions, i.e. 4-tuples in the form of where is 1 if is a terminal state^{4}

To stabilize our training, we adopt a target network which is a copy of the DQN network and use it for computing . This target network is synchronized with the DQN periodically, every 100 iterations.
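The periodic synchronization can be sketched as follows. The interval of 100 iterations comes from the text. The class and function names are hypothetical, parameters are represented as a plain dictionary, and the target computation is the standard DQN form from Mnih et al. (2013), which is assumed rather than taken from this paper; the discount factor is a free parameter here.

```python
import copy


class TargetSync:
    """Keeps a periodically refreshed copy of the online parameters."""

    def __init__(self, online_params, sync_every=100):
        self.online = online_params
        self.target = copy.deepcopy(online_params)
        self.sync_every = sync_every
        self.iteration = 0

    def step(self):
        """Call once per training iteration, after the online update."""
        self.iteration += 1
        if self.iteration % self.sync_every == 0:
            self.target = copy.deepcopy(self.online)


def td_target(reward, done, next_q_values, gamma):
    """Standard DQN target: r + gamma * max_a' Q_target(s', a'), or r if terminal.

    next_q_values are assumed to be computed by the target network.
    """
    bootstrap = 0.0 if done or not next_q_values else max(next_q_values)
    return reward + gamma * bootstrap


params = {"w": [1.0]}
sync = TargetSync(params, sync_every=100)
params["w"][0] = 2.0  # the online network changes immediately
for _ in range(99):
    sync.step()
stale = sync.target["w"][0]  # the target copy still holds the old value
sync.step()
fresh = sync.target["w"][0]  # refreshed at iteration 100
```

Between synchronizations the regression targets stay fixed, which is the usual motivation for keeping a separate target network.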

### C.1 Leveraging Labeled MCS Instances

As mentioned in the main text, the main advantage of RlMcs is that it does not require any labeled MCS instances, and thus can achieve better performance on larger graph datasets. It is noteworthy, however, that the subgraph exploration stage in RlMcs can naturally incorporate the ground-truth MCS results by extending the exploration tree into an exploration-imitation tree. This is accomplished by running the subgraph exploration procedure a second time, where the initial pair selection is taken from the ground truth and, on each iteration, only nodes from the ground truth can be selected, ultimately producing another imitation tree. As the ground truth may provide several correct sequences of node pair selections, we allow the beam size of this second exploration-imitation tree to be tuned. This allows for more fine-grained tuning of exploration versus exploitation. By leveraging the labeled instances, the model can try better actions earlier on, improving the learning process. As shown in Table 7, by incorporating the second imitation tree, whose beam size is also set to 5, we are able to achieve higher performance.

Method | Acc |
---|---|
RlMcs w/o sup | 93.187 |
RlMcs w sup | 95.934 |

## Appendix D Scalability Study

We conduct the following additional experiment to verify the scalability of RlMcs on larger graphs. The total number of nodes increases to 96 (compared to 64 as used in the main text). This time we also increase the time budget for the exact solvers McSplit and k↓ from 100 seconds, as used in the main text, to 1000 seconds, which is the largest time limit used in the original McSplit paper McCreesh et al. (2017). This corresponds to almost 17 minutes given to each graph pair in the testing set. As shown in Table 8, the exact solvers still fail to yield results for all the 100 testing graph pairs within the time limit, while RlMcs performs reasonably well, with above 90% accuracy, and solves each pair in approximately 33 seconds on average due to its guaranteed worst-case time complexity.

Method | Acc (WS: Core=48, Tot=96) | Running Time in sec (WS: Core=48, Tot=96) |
---|---|---|
McSplit * | 0* | 1000* |
k↓ * | 0* | 1000* |
RlMcs | 91.416 | 33.378 |

## Appendix E Result Visualization

We plot 15 graph pairs from the smallest dataset Aids and 8 graph pairs from the largest dataset WS in Figures 4 and 5. For Aids, all the extraction results satisfy the node label constraints, i.e. only nodes with the same label can be matched to each other in the detected MCS. Notice that for many graph pairs in WS, RlMcs achieves larger than 100% accuracy due to the definition of accuracy and the fact that the ground truth is the common core, which is “fuzzy”. Specifically, since the common core only gives a lower bound of the true MCS between these large graphs, if the model extracts two subgraphs for a given graph pair which satisfy the MCS constraints and whose size is larger than the common core, the accuracy for that graph pair would be larger than 100%.
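As a worked illustration, assume that accuracy is the size of the extracted common subgraph divided by the size of the ground-truth common core, expressed in percent. The exact definition is given in the main text, so this form, the function name and the example sizes are assumptions.

```python
def core_relative_accuracy(extracted_size, core_size):
    """Accuracy in percent, measured against the common core size (assumed form)."""
    return 100.0 * extracted_size / core_size


# Hypothetical WS pair with a 48-node common core.
exact_core = core_relative_accuracy(48, 48)   # the model recovers exactly the core
beyond_core = core_relative_accuracy(50, 48)  # the model finds a larger valid subgraph
```

Under this assumption, any valid extraction larger than the planted core scores above 100%, which matches the behaviour described for WS.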

### Footnotes

- https://networkx.github.io/documentation/stable/reference/algorithms/generated/networkx.algorithms.isomorphism.could_be_isomorphic.html
- In WL graph isomorphism test, the label assignment is essentially also neighbor aggregation, but uses an additional hash function to compress the node labels.
- In WL graph isomorphism test, the node label set is simply represented as a multiset of all the node labels in the graph to check isomorphism. Multiset is used since multiple nodes can share the same label.
- The terminal state is reached either when there are no frontier node pairs to select from, i.e. a graph has been fully explored, or when all frontier nodes lead to non-isomorphic subgraphs.

### References

- Cited by: Appendix A.
- The maximum common edge subgraph problem: a polyhedral investigation. Discrete Applied Mathematics 160 (18), pp. 2523–2541. Cited by: §5.
- SimGNN: a neural network approach to fast graph similarity computation. WSDM. Cited by: §A.2.1, §1, §5.
- Learning-based efficient graph similarity computation via multi-scale convolutional set matching. AAAI. Cited by: §1, §5.
- Unsupervised inductive whole-graph embedding by preserving graph proximity. IJCAI. Cited by: §3.3.1.
- Neural maximum common subgraph detection with guided subgraph extraction. Cited by: §B.1, Appendix B, Appendix B, §1, §4.1, §4.6.2, §4.6.3, §5.
- Emergence of scaling in random networks. Science 286 (5439), pp. 509–512. Cited by: §A.1, §4.4.
- Learning to route in similarity graphs. ICML. Cited by: §3.4.
- A graph distance metric based on the maximal common subgraph. Pattern recognition letters 19 (3-4), pp. 255–259. Cited by: §1.
- What is the distance between graphs. Bulletin of the EATCS 20, pp. 35–39. Cited by: §5.
- Privacy-preserving query over encrypted graph-structured data in cloud computing. In 2011 31st International Conference on Distributed Computing Systems, pp. 393–402. Cited by: §1.
- An improved algorithm for matching large graphs. In 3rd IAPR-TC15 workshop on graph-based representations in pattern recognition, pp. 149–159. Cited by: §3.2.3, §4.3.
- The weisfeiler-lehman method and graph isomorphism testing. arXiv preprint arXiv:1101.5211. Cited by: §B.1.
- Comparison of maximum common subgraph isomorphism algorithms for the alignment of 2d chemical structures. ChemMedChem 13 (6), pp. 588–598. Cited by: §1.
- Convolutional networks on graphs for learning molecular fingerprints. In NIPS, pp. 2224–2232. Cited by: §3.3.1.
- Maximum common subgraph isomorphism algorithms and their applications in molecular science: a review. Wiley Interdisciplinary Reviews: Computational Molecular Science 1 (1), pp. 68–79. Cited by: §1.
- Fast graph representation learning with PyTorch Geometric. In ICLR Workshop on Representation Learning on Graphs and Manifolds, Cited by: §4.2.
- A general framework for adaptive processing of data structures. IEEE transactions on Neural Networks 9 (5), pp. 768–786. Cited by: §3.3.1.
- Random graphs. The Annals of Mathematical Statistics 30 (4), pp. 1141–1144. Cited by: §A.1, §4.4.
- Inductive representation learning on large graphs. In NIPS, pp. 1024–1034. Cited by: §3.3.1.
- A unifying framework for spectrum-preserving graph sparsification and coarsening. In NeurIPS, pp. 7734–7745. Cited by: §3.3.1.
- Between subgraph isomorphism and maximum common subgraph. In AAAI, Cited by: §1, §3.6, §4.1.
- Adam: a method for stochastic optimization. ICLR. Cited by: §4.2.
- Semi-supervised classification with graph convolutional networks. ICLR. Cited by: §3.3.1.
- The hungarian method for the assignment problem. Naval research logistics quarterly 2 (1-2), pp. 83–97. Cited by: §4.6.2.
- A note on the derivation of maximal common subgraphs of two directed or undirected graphs. Calcolo 9 (4), pp. 341. Cited by: §5.
- Guided policy search. In ICML, pp. 1–9. Cited by: §3.5.
- Graph matching networks for learning the similarity of graph structured objects. ICML. Cited by: §1, §3.3.1, §4.1, §4.6.1, §5.
- Similarity search in graph databases: a multi-layered indexing approach. In ICDE, pp. 783–794. Cited by: §A.2.1.
- Hierarchical graph matching networks for deep graph similarity learning. Cited by: §5.
- A learning based branch and bound for maximum common subgraph problems. IJCAI. Cited by: §5.
- Clique and constraint models for maximum common (connected) subgraph problems. In International Conference on Principles and Practice of Constraint Programming, pp. 350–368. Cited by: §5.
- A partitioning algorithm for maximum common subgraph problems. Cited by: Appendix D, §1, §3.5, §3.6, §4.1, §5.
- Playing atari with deep reinforcement learning. NeurIPS Deep Learning Workshop 2013. Cited by: Appendix C, §3.5.
- Human-level control through deep reinforcement learning. Nature 518 (7540), pp. 529–533. Cited by: §1, §3.3.
- Learning beam search policies via imitation learning. In NeurIPS, pp. 10652–10661. Cited by: §3.4.
- Deriving common malware behavior through graph clustering. Computers & Security 39, pp. 419–430. Cited by: §1.
- Gromov-wasserstein averaging of kernel and distance matrices. In ICML, pp. 2664–2672. Cited by: §4.1.
- The graph neural network model. IEEE Transactions on Neural Networks 20 (1), pp. 61–80. Cited by: §3.3.1.
- Weisfeiler-lehman graph kernels. JMLR 12 (Sep), pp. 2539–2561. Cited by: §B.1, §4.6.3.
- Efficient subgraph matching on billion node graphs. VLDB. Cited by: §6.
- Sequence to sequence learning with neural networks. In NeurIPS, pp. 3104–3112. Cited by: §3.4.
- Graph attention networks. ICLR. Cited by: §3.3.1, §4.6.1.
- Finding maximum common connected subgraphs using clique detection or constraint satisfaction algorithms. In International Conference on Modelling, Computation and Optimization in Information Systems and Management Sciences, pp. 358–368. Cited by: §5.
- Learning combinatorial embedding networks for deep graph matching. ICCV. Cited by: §1, §4.1, §5.
- An efficient graph indexing method. In ICDE, pp. 210–221. Cited by: §A.2.1.
- Collective dynamics of ‘small-world’ networks. Nature 393 (6684), pp. 440. Cited by: §A.1, §4.4.
- Scalable gromov-wasserstein learning for graph partitioning and matching. In NeurIPS, pp. 3046–3056. Cited by: §1, §3.1, §4.1, §5.
- Gromov-wasserstein learning for graph matching and node embedding. ICML. Cited by: §1, §5.
- How powerful are graph neural networks?. ICLR. Cited by: §3.3.1.
- Substructure similarity search in graph databases. In SIGMOD, pp. 766–777. Cited by: §1.
- Deep graph kernels. In SIGKDD, pp. 1365–1374. Cited by: §A.2.2, §4.5.
- Hierarchical graph representation learning with differentiable pooling. arXiv preprint arXiv:1806.08804. Cited by: §1, §3.3.1.
- Graph convolutional policy network for goal-directed molecular graph generation. In NeurIPS, pp. 6410–6421. Cited by: §1.
- Learning deep graph matching with channel-independent embedding and hungarian attention. In ICLR, Cited by: §1, §5.
- Deep learning of graph matching. In CVPR, pp. 2684–2693. Cited by: §1, §3.1, §3.3.1, §5.
- Comparing stars: on approximating graph edit distance. PVLDB 2 (1), pp. 25–36. Cited by: §A.2.1, §4.5.
- Link prediction based on graph neural networks. In NeurIPS, pp. 5165–5175. Cited by: §1.
- An end-to-end deep learning architecture for graph classification. In AAAI, Cited by: §3.3.1.
- A partition-based approach to structure similarity search. PVLDB 7 (3), pp. 169–180. Cited by: §A.2.1.
- Graph similarity search with edit distance constraint in large graph databases. In CIKM, pp. 1595–1600. Cited by: §A.2.1.