A Learning based Branch and Bound
for Maximum Common Subgraph Problems
Abstract
Branch-and-bound (BnB) algorithms are widely used to solve combinatorial problems, and their performance crucially depends on the branching heuristic. In this work, we consider the maximum common subgraph (MCS) problem and propose a branching heuristic inspired by reinforcement learning, whose goal is to reach a leaf of the search tree as early as possible so as to greatly reduce the search tree size. Extensive experiments show that our method is beneficial and outperforms the current best BnB algorithm for the MCS.
Yanli Liu, Chu-Min Li, Hua Jiang†, Kun He†
†Contact Author
Huazhong University of Science and Technology, China
University of Picardie Jules Verne, France
Wuhan University of Science and Technology, China
{yanlil2008,jh_hgt}@163.com, chu-min.li@u-picardie.fr, Brooklet60@hust.edu.cn
1 Introduction
A graph is a logical model describing a set of objects, and the relationships among those objects, abstracted from real-world applications. Given two graphs $G$ and $H$, it is often crucial to determine the similarities or differences of $G$ and $H$. For this purpose, one should find a graph with as many vertices as possible that is isomorphic to an induced subgraph of both $G$ and $H$. This problem is called Maximum Common induced Subgraph (MCS). It is NP-hard and widely occurs in applications such as image or video analysis [?; ?], information retrieval [?], biochemistry [?; ?; ?], and pattern recognition [?; ?; ?].
Due to the importance of the MCS, many approaches have been designed to solve it; see, e.g., [?; ?; ?; ?; ?; ?; ?]. In this paper, we focus on the branch-and-bound (BnB) scheme for the MCS. Examples of existing BnB algorithms for the MCS can be found in [?; ?; ?].
It is well-known that the performance of a BnB algorithm crucially depends on its branching heuristic. The branching heuristic in McSplit [?], the best BnB algorithm for the MCS to our knowledge, aims at minimizing the number of branches at the current branching point and uses vertex degrees to rank vertices. Unfortunately, due to the NP-hardness of the MCS, the BnB search tree size may not be predictable in general using vertex degrees or any other superficial feature of a graph.
In this paper, we propose to use reinforcement learning to discover which branching choice yields the greatest reduction of the search tree, by trying choices out during the search. Concretely, we consider the BnB algorithm as an agent and each branching choice as an action. When the agent takes an action, it receives a reward determined by the consequence of this action, which in our context is the reduction of the search space. The score of an action depends on the rewards it has accumulated in the past. Then, at every branching point, the agent selects, among the actions resulting in the minimum number of branches, the action with the greatest score to branch on, ties being broken in favor of the vertex with the maximum degree.
We implemented our branching heuristic on top of McSplit. The new algorithm, called McSplit+RL, is extensively evaluated on 21,743 MCS instances from diverse applications (biochemical reactions, image analysis, 2D, 3D and 4D objects, complex networks), including the large instances used in [?] for evaluating McSplit. Empirical results show that McSplit+RL solves 130 instances more than McSplit on large graphs, illustrating the effectiveness of combining reinforcement learning with the design of a branching heuristic for BnB search.
This paper is organized as follows. Section 2 introduces notation and defines the MCS problem. Section 3 presents the general BnB scheme for the MCS. Section 4 presents our branching heuristic inspired by reinforcement learning. Section 5 empirically compares McSplit and McSplit+RL and gives insight into the effectiveness of the new branching heuristic. Section 6 concludes.
2 Problem Definition
For a simple (unweighted, undirected), labelled graph $G = (V, E, L)$, $V$ is a finite set of vertices, $E \subseteq V \times V$ is a set of edges, and $L$ is a label function that assigns, to each vertex $v \in V$, a label value $L(v)$. If the labels are the same for all vertices, then the labelled graph reduces to an unlabelled graph. Two vertices $u$ and $v$ are adjacent iff $(u, v) \in E$. The degree of a vertex is the number of its adjacent vertices. A subset $S \subseteq V$ induces a subgraph $G[S] = (S, E_S, L)$ of $G$, where $E_S = \{(u, v) \in E \mid u \in S, v \in S\}$.
Given a pattern graph $P = (V_P, E_P, L_P)$ and a target graph $T = (V_T, E_T, L_T)$, the Maximum Common induced Subgraph (MCS) problem is to find a subset $S_P \subseteq V_P$ and a subset $S_T \subseteq V_T$ of the greatest cardinality and a bijection $\phi: S_P \rightarrow S_T$ such that: (1) $|S_P| = |S_T|$, (2) $L_P(v) = L_T(\phi(v))$ for every $v \in S_P$, and (3) given any $u, v \in S_P$, $u$ and $v$ are adjacent in $P$ if and only if $\phi(u)$ and $\phi(v)$ are adjacent in $T$. In other words, the MCS is to find a maximum subgraph $P[S_P]$ and a maximum subgraph $T[S_T]$ such that $P[S_P]$ and $T[S_T]$ are isomorphic. $P[S_P]$ or $T[S_T]$ is called a maximum common induced subgraph of $P$ and $T$. The vertex pair $(v, \phi(v))$ is called a match.
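The three conditions above can be checked mechanically. The following Python sketch (the graph representation, function name, and helper structure are ours for illustration, not from the paper) tests whether a candidate bijection defines a common induced subgraph of two labelled graphs:

```python
def is_common_induced_subgraph(adj_p, lab_p, adj_t, lab_t, phi):
    """Check conditions (2) and (3) of the MCS definition for a
    candidate bijection phi from pattern vertices to target vertices.
    Graphs are given as adjacency sets plus a label dict."""
    # Condition (2): matched vertices must carry the same label.
    for v, w in phi.items():
        if lab_p[v] != lab_t[w]:
            return False
    # Condition (3): adjacency must be preserved in both directions.
    verts = list(phi)
    for i, u in enumerate(verts):
        for v in verts[i + 1:]:
            if (v in adj_p[u]) != (phi[v] in adj_t[phi[u]]):
                return False
    return True
```

For instance, mapping two vertices of a triangle onto an edge of a path satisfies the conditions, whereas mapping all three vertices does not, since the path lacks one of the triangle's edges.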
Let $k = |S_P|$; an optimal solution of the MCS is denoted as a set of matches $\{(v_1, \phi(v_1)), (v_2, \phi(v_2)), \ldots, (v_k, \phi(v_k))\}$.
3 Branch and Bound for the MCS
Given two graphs $P$ and $T$, the BnB algorithm depicted in Algorithm 1 gradually constructs and proves an optimal solution using a depth-first search. During the search, the algorithm maintains two variables: $M$, the solution under construction; and $M^*$, the best solution found so far. In addition, every vertex $v$ of $P$ is associated with a subset $D(v)$ of vertices of $T$ that can be matched with $v$. At the beginning, $M$ and $M^*$ are initialized to $\emptyset$, and $D(v)$ is initialized to the set of all vertices of $T$ having the same label as $v$. The first call to MCS returns a maximum common subgraph of $P$ and $T$.
At each branching point, the algorithm first computes an upper bound UB on the cardinality of the best solution that can be found from this branching point, by calling the overestimate function. It then compares UB with $|M^*|$. If UB $\le |M^*|$, a solution better than $M^*$ cannot be found from this branching point, so the algorithm prunes the current branch and backtracks. Otherwise, it selects a not-yet-matched vertex $v$ of $P$ and tries to match it with every vertex $w$ in $D(v)$ in turn. As a consequence of matching $v$ with $w$, $(v, w)$ is added into $M$, and for each not-yet-matched vertex $v'$ of $P$, $D(v')$ is updated as follows: if $v'$ is adjacent to $v$, remove all vertices non-adjacent to $w$ from $D(v')$; otherwise, remove all vertices adjacent to $w$ from $D(v')$.
Note that after updating the domains, for every $w' \in D(v')$, $v'$ is adjacent to $v$ in $P$ iff $w'$ is adjacent to $w$ in $T$, so that the match $(v', w')$ can be further added into $M$ to yield a feasible solution. If a solution better than $M^*$ is found at a leaf of the search tree, where no further match is possible, the algorithm updates $M^*$ to be $M$ before backtracking to find an even better solution.
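As an illustration, the recursion just described can be sketched in Python for unlabelled graphs, using explicit per-vertex domains rather than McSplit's label classes. The function names and the simple one-match-per-non-empty-domain bound are our own simplifications, not the paper's implementation (whose label-class bound is tighter):

```python
def mcs(adj_p, adj_t):
    """Return a maximum common induced subgraph of two unlabelled
    graphs (adjacency-set dicts) as a dict of matches {v: w}.
    A sketch of the BnB scheme of Algorithm 1."""
    best = {}

    def search(m, domains):
        nonlocal best
        # Loose bound: |M| plus at most one match per non-empty domain.
        ub = len(m) + sum(1 for d in domains.values() if d)
        if ub <= len(best):
            return                      # prune: cannot beat best
        if len(m) > len(best):
            best = dict(m)              # new incumbent found
        cand = [v for v, d in domains.items() if d]
        if not cand:
            return                      # leaf: no further match possible
        v = min(cand, key=lambda x: len(domains[x]))
        for w in domains[v]:
            # Update the other domains according to adjacency with (v, w).
            new_dom = {}
            for v2, d in domains.items():
                if v2 == v:
                    continue
                if v2 in adj_p[v]:      # keep only vertices adjacent to w
                    new_dom[v2] = d & adj_t[w]
                else:                   # keep only vertices non-adjacent to w
                    new_dom[v2] = d - adj_t[w] - {w}
            m[v] = w
            search(m, new_dom)
            del m[v]
        # Also explore leaving v unmatched, since the MCS need not use it.
        rest = dict(domains)
        rest.pop(v)
        search(m, rest)

    search({}, {v: set(adj_t) for v in adj_p})
    return best
```

For example, running `mcs` on a triangle against a 3-vertex path returns a solution of size 2 (an edge), the largest induced subgraph common to both.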
An important issue for implementing Algorithm 1 is how to implement the domains $D$, which determines how to select the vertex $v$ in line 11 and how to design the overestimate function. A natural way is to explicitly create a list of vertices of $T$ for each vertex of $P$. With this implementation, Ndiaye and Solnon [?] represent each vertex of $P$ as a variable whose domain is a set of vertices of $T$. Then, they select the vertex with the smallest domain in line 11, and use a soft alldifferent constraint in the overestimate function to compute a bound. The difficulty with this implementation is that, given a vertex $w$ of $T$, it is not straightforward to know the number of variables whose domain contains $w$. If the domain of a vertex $v$ is the smallest and contains a vertex $w$ of $T$, but $w$ also occurs in the domains of many other vertices of $P$, branching on $v$ may not be the best choice to minimize the search tree size.
McCreesh et al. [?] use an elegant way to represent the domains, based on the fact that many vertices of $P$ have the same domain during the search. Thus, the vertices of $P$ having the same domain are put together to obtain a compact representation. The following example illustrates this representation and its use in Algorithm 1.
Example 1
Figure 1 shows two undirected and unlabelled graphs and , = , = . Initially, for each vertex of , represented using the pair .
Then, vertex 0 is chosen in line 11 for branching. The first match added into is . Consequently, is split into and , and is split into and , because vertices , and are adjacent to the matched vertex , while vertex is not; and vertices , , and are adjacent to the matched vertex , while vertex is not. The updated is then represented by , saying that and .
Note that the splitting of and is equivalent to removing , , and from , and from , and .
More generally, a pair $\langle V', W' \rangle$ is called a label class in [?], where $V'$ ($W'$) is a subset of $V_P$ ($V_T$), meaning that $D(v) = W'$ for each $v \in V'$. Let $LC$ be the set of label classes. When a new match $(v, w)$ is added into $M$, Algorithm 1 splits each label class $\langle V', W' \rangle$ in $LC$ into two label classes $\langle V'_1, W'_1 \rangle$ and $\langle V'_2, W'_2 \rangle$ in lines 14–21, so that the vertices in $V'_1$ ($W'_1$) are all adjacent to $v$ ($w$) and the vertices in $V'_2$ ($W'_2$) are all non-adjacent to $v$ ($w$). Note that $V'_1$ and $V'_2$, as well as $W'_1$ and $W'_2$, are disjoint.
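The splitting of label classes on a new match can be sketched as follows (a Python simplification with our own names; McSplit's actual implementation uses a more compact in-place encoding):

```python
def split_label_classes(classes, v, w, adj_p, adj_t):
    """Split every label class <V', W'> on the new match (v, w):
    the vertices adjacent to v (resp. w) form one new class and the
    non-adjacent ones another; classes with an empty side are dropped,
    since they can contribute no further match."""
    new_classes = []
    for vs, ws in classes:
        v_adj = [x for x in vs if x in adj_p[v]]
        v_non = [x for x in vs if x not in adj_p[v] and x != v]
        w_adj = [y for y in ws if y in adj_t[w]]
        w_non = [y for y in ws if y not in adj_t[w] and y != w]
        if v_adj and w_adj:
            new_classes.append((v_adj, w_adj))
        if v_non and w_non:
            new_classes.append((v_non, w_non))
    return new_classes
```

Splitting the single initial class of a triangle and a 3-vertex path on the match of vertex 0 with vertex 0, for example, yields the one class pairing the triangle's remaining vertices with the path vertex adjacent to 0; the non-adjacent side is dropped because it is empty on the pattern side.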
This representation of the domains enables the following branching heuristic and bound computation in [?].

Given a label class $\langle V', W' \rangle$, there are $|V'| \times |W'|$ matches to try. So, McCreesh et al. first select a label class $\langle V', W' \rangle$ such that $\max(|V'|, |W'|)$ is the smallest, and then the vertex $v$ in $V'$ with the greatest degree in line 11 for branching, which is very similar to choosing a label class with the smallest $|V'| \times |W'|$ and then breaking ties using vertex degrees. This heuristic is better than the heuristic used in [?], which consists in selecting a label class with the smallest $|W'|$.

Let $LC$ be the set of label classes at a branching point. A label class $\langle V', W' \rangle$ can offer at most $\min(|V'|, |W'|)$ matches to $M$. So, the overestimate function in [?] computes and returns $|M| + \sum_{\langle V', W' \rangle \in LC} \min(|V'|, |W'|)$, which is equivalent to the bound given by the soft alldifferent constraint in [?] but is much simpler to compute.
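Under the label-class representation, this bound is a one-line computation (the helper name is ours, for illustration):

```python
def overestimate(m_size, classes):
    """Upper bound |M| + sum over label classes <V', W'> of
    min(|V'|, |W'|): each class can contribute at most that many
    further matches, since matches within a class are one-to-one."""
    return m_size + sum(min(len(vs), len(ws)) for vs, ws in classes)
```

For instance, with $|M| = 2$ and two label classes of shapes $(2, 1)$ and $(1, 2)$, the bound is $2 + 1 + 1 = 4$.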
Nevertheless, the branching heuristic in [?] depends heavily on vertex degrees and may not result in the smallest search tree. In the next section, we will propose a new branching heuristic inspired by reinforcement learning.
4 Learning Rewards for Branching
In reinforcement learning, there is an agent and an uncertain environment. The agent is a learner and decision maker and has one or more goals. To achieve its goals, the agent interacts with the environment by taking actions and observing their impact on the environment. Concretely, when the agent takes an action, it receives from the environment a reward related to its goal. It has a value function that transforms the cumulative rewards received over time into a score for the action, representing a prediction of the future rewards of this action. So, at each step, the agent should take the action with the maximum score among all available actions to achieve its goal. For a comprehensive presentation of reinforcement learning and its applications, see, e.g., [?].
Inspired by reinforcement learning, we regard Algorithm 1 as an agent. It has a goal: reach a search tree leaf as early as possible, so as to reduce the search tree size as much as possible. To achieve this goal, the agent successively adds matches $(v, w)$ into $M$. However, it usually has many choices of $(v, w)$ at each step and does not know which choice is better. We therefore regard each choice as an action. The key issue is then how to define the reward function and the value function.
As can be seen in Algorithm 1, the algorithm reaches a leaf when UB $\le |M^*|$. So, reducing UB quickly allows a leaf to be reached quickly. Therefore, we define the reward of an action to be the decrease of UB after taking this action. Concretely, let $LC$ be the set of label classes before taking the action $(v, w)$ and $LC'$ be the set of label classes obtained by splitting the label classes in $LC$ according to their adjacency to $v$ and $w$ in lines 14–21 of Algorithm 1. The reward can then be quickly computed as $r(v, w) = \sum_{\langle V', W' \rangle \in LC} \min(|V'|, |W'|) - \sum_{\langle V', W' \rangle \in LC'} \min(|V'|, |W'|)$.
Our value function maintains a score $score(v)$ ($score(w)$) for each vertex $v \in V_P$ ($w \in V_T$), initialized to 0. Each time $r(v, w)$ is computed, $score(v)$ and $score(w)$ are updated as follows: $score(v) \leftarrow score(v) + r(v, w)$ and $score(w) \leftarrow score(w) + r(v, w)$.
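The reward and score bookkeeping can be sketched as follows (a minimal illustration with our own variable names and plain dictionaries; McSplit+RL's internal bookkeeping may differ in detail):

```python
from collections import defaultdict

score_p = defaultdict(int)  # score(v) for vertices of the pattern graph
score_t = defaultdict(int)  # score(w) for vertices of the target graph

def bound_term(classes):
    """Sum of min(|V'|, |W'|) over a set of label classes."""
    return sum(min(len(vs), len(ws)) for vs, ws in classes)

def reward_and_update(v, w, classes_before, classes_after):
    """Reward of action (v, w): the decrease of the label-class bound
    caused by the splitting; both endpoints accumulate the reward."""
    r = bound_term(classes_before) - bound_term(classes_after)
    score_p[v] += r
    score_t[w] += r
    return r
```

A splitting that shrinks the bound term from 2 to 1 thus yields a reward of 1, credited to both endpoints of the match.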
At each branching point (line 11 of Algorithm 1), our algorithm first selects a label class $\langle V', W' \rangle$ with the smallest $\max(|V'|, |W'|)$, and then the vertex $v$ in $V'$ with the greatest score $score(v)$. Then, for each $w$ in $W'$ in decreasing order of $score(w)$, the algorithm matches $v$ and $w$, and recursively continues the search after adding the match $(v, w)$ into $M$. All ties are broken in favor of the vertex with the maximum degree.
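This selection rule can be sketched as follows (illustrative names; degrees serve only as tie-breakers within the score comparisons):

```python
def select_branching(classes, score_p, score_t, degree_p, degree_t):
    """Pick the label class minimizing max(|V'|, |W'|), then the
    pattern vertex with the greatest score (ties broken by degree),
    and order the target vertices by decreasing score (then degree)."""
    vs, ws = min(classes, key=lambda c: max(len(c[0]), len(c[1])))
    v = max(vs, key=lambda x: (score_p[x], degree_p[x]))
    w_order = sorted(ws, key=lambda y: (score_t[y], degree_t[y]),
                     reverse=True)
    return v, w_order
```

With all scores at their initial value 0, this rule degenerates to degree-based branching, so early in the search the heuristic behaves like McSplit's and diverges from it as rewards accumulate.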
Note that both the original algorithm presented in [?] and our approach select the label class with the smallest $\max(|V'|, |W'|)$. The difference lies in how $v$ and $w$ are chosen from $\langle V', W' \rangle$: in [?], $v$ and $w$ are chosen according to their degrees; in our approach, according to their scores. We will show the impact of this difference on MCS solving in the next section.
5 Empirical Evaluation
Experiments were performed on Intel Xeon E5-2680 v4 CPUs at 2.40 GHz under Linux with 4 GB of memory. The cutoff time is 1800 seconds per instance. We first present the algorithms and benchmarks used in the experiments, and then present and analyze the experimental results.
5.1 Solvers and Benchmark
The following algorithms (also called solvers) are used in our experiments.
McSplit [?]: An implementation of Algorithm 1 using the label class representation of the domains. It is more than an order of magnitude faster than the previous state of the art on unlabelled and undirected MCS instances, and is also extended to labelled graphs.
McSplit↓ [?]: A variant of McSplit using a top-down strategy: it calls the main McSplit method to search for a solution of a given cardinality, backtracks when the bound is strictly less than the required cardinality, and terminates when a solution of the required cardinality is found. This strategy is similar to the $k{\downarrow}$ algorithm [?]. McSplit↓ is specially designed for the large subgraph isomorphism instances on which McSplit is beaten by $k{\downarrow}$.
McSplit+RL: An implementation, on top of McSplit, of Algorithm 1 with the reinforcement-learning-based branching heuristic presented in this paper. In other words, the only difference between McSplit+RL and McSplit is the branching heuristic of Section 4.
McSplit+RL↓: A variant of McSplit+RL using the top-down strategy of McSplit↓.
The benchmark consists of two sets of instances.
Biochemical reaction instances, describing the biochemical reaction networks from biomodels.net¹ (¹available at http://liris.cnrs.fr/csolnon/SIP.html). All 136 graphs are directed, unlabelled bipartite graphs with 9 to 386 vertices. Every pair of graphs gives an MCS instance, resulting in 9316 Bio instances (including 136 self-match pairs).
Large subgraph isomorphism and MCS instances [?; ?]. This benchmark set includes real-world graphs and graphs generated using random models, such as segmented images, meshes modelling 3D objects, and scale-free networks¹. Pattern graphs range from 4 to 900 vertices; target graphs range from 10 to 6,671 vertices. There are 12,427 instances in total, including: 6278 image instances from images-CVIU11 (43 pattern graphs and 146 target graphs, each pair of a pattern graph and a target graph resulting in an instance); 1225 LV instances given by each pair of graphs (including identical pairs) among the 49 graphs selected in [?] from the LV set; 3430 LargerLV instances with the above 49 LV graphs as patterns and the remaining 70 graphs of the LV set as targets; and 200 Mesh, 24 PR15, 100 Scalefree and 1170 Si instances used in [?] to evaluate McSplit.
5.2 Performance of the new approach
Table 1 compares the performance of the different algorithms on the various instance sets. The first column gives the instance set name and the number of instances. The other columns give the number of instances solved within the 1800-second cutoff by the corresponding algorithm and the average runtime (in parentheses, in seconds) over these solved instances. McSplit+RL and McSplit+RL↓ solve 130 and 117 instances more than McSplit and McSplit↓, respectively. Note that the four solvers share the same implementation, and the only difference between McSplit and McSplit+RL, as well as between McSplit↓ and McSplit+RL↓, is the branching heuristic.
Instance set (#inst)  McSplit  McSplit+RL  McSplit↓  McSplit+RL↓
Bio(9316)  6655(45.8)  6729(40.7)  6818(43.2)  6884(38.8) 
Images(6278)  1245(87.1)  1280(87.9)  1266(86.0)  1283(95.6) 
LV(1225)  400(59.4)  418(64.8)  410(44.4)  425(39.1) 
LargerLV(3430)  578(77.0)  584(119.8)  633(103.7)  650(102.2) 
Mesh(200)  –  –  1(57.4)  1(754.8) 
PR15(24)  24(12.9)  24(13.5)  24(0.1)  24(0.1) 
Scalefree(100)  13(0.0)  13(0.0)  80(7.3)  80(9.0) 
Si(1170)  419(22.6)  416(19.8)  1157(3.7)  1159(5.2) 
total(21,743)  9,334  9,464  10,389  10,506 
As Figure 3 shows, if we exclude the instances that are solved by both McSplit and McSplit+RL within 1s (5s, 10s), McSplit+RL solves 4.27% (6.26%, 7.54%) more instances than McSplit. The harder the instances, the greater the effectiveness of reinforcement learning.
Recall that McSplit and McSplit↓ are already highly efficient. The results show the effectiveness of the learning approach in designing a branching heuristic for a BnB algorithm for the MCS, and the compatibility of the new branching heuristic with the top-down strategy of McSplit↓.
5.3 Analysis
The search of a BnB algorithm can be divided into two phases. In phase 1, the algorithm finds an optimal solution $M^*$; in phase 2, it proves the optimality of $M^*$ by showing that no better solution exists. Table 2 shows, for a set of representative instances, the total solving time (phase 1 + phase 2) and the time for finding the optimal solution (phase 1) of McSplit and McSplit+RL, respectively. McSplit+RL usually finds the optimal solution more quickly than McSplit, allowing it to prune the search more easily in line 8 of Algorithm 1. Note that when the time for finding the optimal solution is smaller, the total time for solving an MCS instance is usually also smaller. As Figure 4 shows, McSplit+RL needs less time than McSplit to find the optimal solutions of 184 instances.
Instance  |V_P|  |V_T|  McSplit: t_total  t_find  σ_P(σ_T)  McSplit+RL: t_total  t_find  σ_P(σ_T)
Bio030.txt061.txt  50  73  372.30  371.60  623.74(204.04)  184.65  183.30  187.68(50.20) 
Bio022.txt046.txt  38  31  238.65  32.42  309.68(79.55)  1119.95  247.70  1227.46(458.42) 
Bio001.txt018.txt  46  79  1452.47  0.29  5214.98(545.06)  905.52  0.09  1720.57(214.56) 
Imagespattern11target10  15  3506  0.15  0.07  0.02(0.00)  0.24  0.17  0.02(0.00) 
Imagespattern43target113  89  2877  513.94  513.92  83.54(0.19)  171.95  171.94  16.49(0.07) 
Imagespattern24target119  21  5376  17.51  15.48  2.81(0.00)  9.64  7.20  0.95(0.00) 
Imagespattern29target120  22  4301  9.05  7.88  1.28(0.00)  8.15  7.52  0.61(0.00) 
LVg10g18  41  64  1103.46  854.83  1002.25(1.28)  1066.48  898.92  1419.91(1.78) 
LVg12g19  48  64  296.51  296.43  953.03(180.30)  18.60  7.85  13.22(5.98) 
LVg11g21  42  64  2.06  0.00  4.10(1.00)  3.11  0.00  3.26(0.99) 
LVg10g17  41  64  203.83  158.45  241.36(20.52)  149.38  131.21  271.18(5.43) 
LargerLVg11g78  42  627  6.23  6.23  2.33(0.27)  2.13  2.13  0.43(0.07) 
LargerLVg12g55  48  256  1471.98  1423.98  3580.15(73.53)  562.62  246.93  345.12(30.99) 
LargerLVg13g70  49  501  213.09  213.08  234.17(0.61)  401.03  197.28  187.58(1.66) 
LargerLVg6g72  19  561  18.46  18.46  33.90(0.09)  0.02  0.02  0.02(0.00) 
LargerLVg6g71  19  501  0.01  0.01  0.02(0.00)  0.14  0.02  0.14(0.00) 
PR15pattern1target  83  4838  9.74  9.68  0.27(0.02)  23.31  23.25  0.54(0.02) 
PR15pattern9target  68  4838  5.56  5.49  0.24(0.00)  5.09  4.99  0.16(0.00) 
Sisi2_b03m_m200.05  40  200  2.08  2.08  3.19(0.94)  0.09  0.09  0.16(0.03) 
Sisi2_m4Dr2_m256.02  51  256  1516.22  1516.22  1169.12(127.65)  1156.89  1156.89  460.27(70.89) 
Let $n_v$ ($n_w$) denote the number of times a vertex $v$ of the pattern graph $P$ (a vertex $w$ of the target graph $T$) is used for branching at line 11 (line 12) of Algorithm 1. Table 2 also gives the standard deviation $\sigma_P$ ($\sigma_T$) of $n_v$ ($n_w$) when McSplit and McSplit+RL solve these representative instances. This standard deviation is usually significantly smaller with McSplit+RL than with McSplit, meaning that more vertices participate in branching in McSplit+RL than in McSplit.
This phenomenon might be explained as follows. McSplit always branches on the vertex with the maximum degree given a label class, concentrating the branching on a small subset of vertices with high degree. However, McSplit+RL can also branch on vertices with lower degree in a label class, because reinforcement learning discovers that these vertices allow a big bound decrease.
In summary, the search of McSplit+RL is more diversified while leading to quick pruning, explaining why McSplit+RL usually finds the optimal solution more quickly than McSplit.
6 Conclusion
We proposed a branching heuristic inspired by reinforcement learning for a BnB algorithm for the MCS, regarding the algorithm as an agent and a match as an action. The reward of an action is the decrease of the upper bound, and the score of a vertex is the sum of the rewards of the actions in which it has participated. The algorithm then uses these scores to select the action to branch on. Extensive experiments show that this branching heuristic solves significantly more instances, because it yields a more diversified search.
Our results suggest that reinforcement learning is a very promising tool for solving NP-hard problems. In the future, we will improve the reward and value function definitions to further strengthen our branching heuristic for the MCS.
References
 [Bahiense et al., 2012] Laura Bahiense, Gordana Manic, Breno Piva, and Cid C. de Souza. The maximum common edge subgraph problem: A polyhedral investigation. Discrete Applied Mathematics, 160(18):2523–2541, 2012.
 [Bunke and Messmer, 1995] Horst Bunke and Bruno T. Messmer. Efficient attributed graph matching and its application to image analysis. In Image Analysis and Processing, 8th International Conference, ICIAP, volume 974 of LNCS, pages 45–55. Springer, 1995.
 [Cao et al., 2011] Ning Cao, Zhenyu Yang, Cong Wang, Kui Ren, and Wenjing Lou. Privacy-preserving query over encrypted graph-structured data in cloud computing. In 2011 International Conference on Distributed Computing Systems, ICDCS, pages 393–402. IEEE Computer Society, 2011.
 [Conte et al., 2004] Donatello Conte, Pasquale Foggia, Carlo Sansone, and Mario Vento. Thirty years of graph matching in pattern recognition. International Journal of Pattern Recognition and Artificial Intelligence, 18(3):265–298, 2004.
 [Cootes et al., 2007] Adrian P. Cootes, Stephen H. Muggleton, and Michael J. E. Sternberg. The identification of similarities between biological networks: application to the metabolome and interactome. Journal of Molecular Biology, 369(4):1126–1139, 2007.
 [Cordella et al., 2004] Luigi P. Cordella, Pasquale Foggia, Carlo Sansone, and Mario Vento. A (sub)graph isomorphism algorithm for matching large graphs. IEEE Trans. Pattern Anal. Mach. Intell., 26(10):1367–1372, 2004.
 [Faccioli et al., 2005] P. Faccioli, P. Provero, C. Herrmann, A. M. Stanca, C. Morcia, and V. Terzi. From single genes to co-expression networks: extracting knowledge from barley functional genomics. Plant Molecular Biology, 58(5):739–750, 2005.
 [Giugno et al., 2013] Rosalba Giugno, Vincenzo Bonnici, Nicola Bombieri, Alfredo Pulvirenti, Alfredo Ferro, and Dennis Shasha. Grapes: a software for parallel searching on biological graphs targeting multicore architectures. PLOS ONE, 8(10):e76911, 2013.
 [Hoffmann et al., 2017] Ruth Hoffmann, Ciaran McCreesh, and Craig Reilly. Between subgraph isomorphism and maximum common subgraph. In AAAI, pages 3907–3914. AAAI Press, 2017.
 [Levi, 1973] G. Levi. A note on the derivation of maximal common subgraphs of two directed or undirected graphs. Calcolo, 9(4):341–352, 1973.
 [Liu and Lee, 2001] Jianzhuang Liu and Yong Tsui Lee. A graph-based method for face identification from a single 2D line drawing. IEEE Trans. Pattern Anal. Mach. Intell., 23(10):1106–1119, 2001.
 [McCreesh et al., 2016] Ciaran McCreesh, Samba Ndojh Ndiaye, Patrick Prosser, and Christine Solnon. Clique and constraint models for maximum common (connected) subgraph problems. In CP, volume 9892 of Lecture Notes in Computer Science, pages 350–368. Springer, 2016.
 [McCreesh et al., 2017] Ciaran McCreesh, Patrick Prosser, and James Trimble. A partitioning algorithm for maximum common subgraph problems. In IJCAI, pages 712–719. ijcai.org, 2017.
 [McGregor, 1982] James J. McGregor. Backtrack search algorithms and the maximal common subgraph problem. Softw., Pract. Exper., 12(1):23–34, 1982.
 [Ndiaye and Solnon, 2011] Samba Ndojh Ndiaye and Christine Solnon. CP models for maximum common subgraph problems. In CP, volume 6876 of Lecture Notes in Computer Science, pages 637–644. Springer, 2011.
 [Solnon et al., 2015] Christine Solnon, Guillaume Damiand, Colin de la Higuera, and JeanChristophe Janodet. On the complexity of submap isomorphism and maximum common submap problems. Pattern Recognition, 48(2):302–316, 2015.
 [Solnon, 2010] Christine Solnon. Alldifferentbased filtering for subgraph isomorphism. Artif. Intell., 174(1213):850–864, 2010.

 [Sutton and Barto, 2018] Richard S. Sutton and Andrew G. Barto. Reinforcement Learning: An Introduction (2nd Ed.). MIT Press, 2018. http://www.incompleteideas.net/book/the-book-2nd.html.
 [Vismara and Valery, 2008] Philippe Vismara and Benoît Valery. Finding maximum common connected subgraphs using clique detection or constraint satisfaction algorithms. Volume 14 of Communications in Computer and Information Science, pages 358–368. Springer, 2008.