# The Power of Data Reduction for Matching

George B. Mertzios, School of Engineering and Computing Sciences, Durham University, UK, george.mertzios@durham.ac.uk

André Nichterlein and Rolf Niedermeier, Institut für Softwaretechnik und Theoretische Informatik, TU Berlin, Germany, {andre.nichterlein,rolf.niedermeier}@tu-berlin.de

André Nichterlein was supported by a postdoc fellowship of the German Academic Exchange Service (DAAD) while at Durham University.

###### Abstract

Finding maximum-cardinality matchings in undirected graphs is arguably one of the most central graph primitives. For $m$-edge and $n$-vertex graphs, it is well known to be solvable in $O(m\sqrt{n})$ time; however, for several applications this running time is still too slow. We investigate how linear-time (and almost linear-time) data reduction (used as preprocessing) can alleviate the situation. More specifically, we focus on (almost) linear-time kernelization. We start a deeper and systematic study both for general graphs and for bipartite graphs. Our data reduction algorithms easily comply (in the form of preprocessing) with every solution strategy (exact, approximate, heuristic), thus making them attractive in various settings.

## 1 Introduction

“Matching is a powerful piece of algorithmic magic” [18]. In Matching, given a graph, one has to compute a maximum-cardinality set of nonoverlapping edges. Matching is arguably among the most fundamental graph-algorithmic primitives allowing for a polynomial-time algorithm. More specifically, on an $n$-vertex and $m$-edge graph a maximum matching can be found in $O(m\sqrt{n})$ time [17]. Improving this upper time bound, even for bipartite graphs, has resisted decades of research. Recently, however, Duan and Pettie [8] presented a linear-time algorithm that computes a $(1-\epsilon)$-approximate maximum-weight matching, where the running time dependency on $\epsilon$ is $\epsilon^{-1}\log(\epsilon^{-1})$. For the unweighted case, the $O(m\sqrt{n})$ algorithm of Micali and Vazirani [17] implies a linear-time $(1-\epsilon)$-approximation, where in this case the running time dependency on $\epsilon$ is $\epsilon^{-1}$ [8]. We take a different route: First, we do not give up the quest for optimal solutions. Second, we focus on efficient data reduction rules, that is, not solving an instance but significantly shrinking its size before actually solving the problem (in doing so, however, we focus here on the unweighted case). In the context of decision problems, in parameterized algorithmics this is known as kernelization, a particularly active area of algorithmic research on NP-hard problems.

The spirit behind our approach is thus closer to the identification of efficiently (i.e. linearly) solvable special cases of Matching. There is quite some body of work in this direction. For instance, since an augmenting path can be found in linear time [10], the standard augmenting path-based algorithm runs in $O(s(n+m))$ time, where $s$ is the number of edges in the maximum matching. Yuster [20] developed an $O(rn^2 \log n)$-time algorithm, where $r$ is the difference between maximum and minimum vertex degree of the input graph. Moreover, there are linear-time algorithms for computing maximum matchings in special graph classes, including convex bipartite [19], strongly chordal [7], and chordal bipartite graphs [6].

All this and the more general spirit of “parameterization for polynomial-time solvable problems” [12] (also referred to as “FPT in P” or “FPTP” for short) form the starting point of our research. Remarkably, Fomin et al. [9] recently developed an algorithm to compute a maximum matching in graphs of treewidth $k$ in $O(k^4 n \log n)$ randomized time.

Following the paradigm of kernelization, that is, provably effective and efficient data reduction, we provide a systematic exploration of the power of polynomial-time data reduction for Matching. Thus, our aim (fitting within FPTP) is to devise problem kernels that are computable in (almost) linear time. A particular motivation for this is that with such very efficient kernelization algorithms it is possible to transform multiplicative ($O(f(k)(n+m))$) into additive ($O(f(k)+n+m)$) “(almost) linear-time FPTP” algorithms. Furthermore, kernelization algorithms (typically based on data reduction rules) can be used as preprocessing for heuristics or approximation algorithms with the goal of getting larger matchings.

As kernelization is usually defined for decision problems, we use in the remainder of the paper the decision version of Matching. In a nutshell, a kernelization of a decision problem instance is an algorithm that produces an equivalent instance whose size can solely be upper-bounded by a function in the parameter (preferably a polynomial). The focus on decision problems is justified by the fact that all our results, although formulated for the decision version, in a straightforward way extend to the corresponding optimization version.

(Maximum-Cardinality) Matching

Input: An undirected graph $G = (V, E)$ and a nonnegative integer $s$.

Question: Is there a size-$s$ subset $M_G \subseteq E$ of nonoverlapping (i.e. disjoint) edges?

Since solving the given instance and returning a trivial yes- or no-instance always produces a constant-size kernel in polynomial time, we are looking for kernelization algorithms that are faster than the algorithms solving the problem. For NP-hard problems, each kernelization algorithm, since running in polynomial time, is (presumably) faster than any solution algorithm. This is, of course, no longer true when applying kernelization to a polynomial-time solvable problem like Matching. While the focus of classical kernelization for NP-hard problems is mostly on improving the size of the kernel, we particularly emphasize that for polynomially solvable problems it now becomes crucial to also focus on the running time of the kernelization algorithm. Moreover, the parameterized complexity analysis framework can also be applied to the kernelization algorithm itself. For example, a kernelization algorithm running in  time ($k$ is the “problem specific” parameter) might be preferable to another one running in  time. In this paper, we present kernelization algorithms for Matching which run in linear time (see Sections 2.1 and 3) or in almost linear time (i.e. in $O(kn)$ time, see Section 2.2).

##### Our contributions.

In this paper we present three efficiently computable kernels for Matching (see Table 1 for an overview).

All our parameterizations can be categorized as “distance to triviality” [13]. They are motivated as follows. First, note that maximum-cardinality matchings can trivially be found in linear time on trees (or forests). So we consider the corresponding edge deletion distance (feedback edge number) and vertex deletion distance (feedback vertex number). Notably, there is a trivial linear-time algorithm for computing the feedback edge number and there is a linear-time factor-4 approximation algorithm for the feedback vertex number [1]. We mention in passing that the parameter vertex cover number, which is lower-bounded by the feedback vertex number, has been frequently studied for kernelization [3, 4]. In particular, Gupta and Peng [14] (implicitly) provided a quadratic-size kernel for Matching with respect to the parameter vertex cover number. Coming to bipartite graphs, our parameterization by vertex deletion distance to chain graphs is motivated as follows. First, chain graphs form one of the most obvious easy cases for bipartite graphs where Matching can be solved in linear time [19]. Second, we show that the vertex deletion distance of any bipartite graph to a chain graph can be 2-approximated in linear time. Moreover, vertex deletion distance to chain graphs lower-bounds the vertex cover number of a bipartite graph.

An overview of our main results is given in Table 1. We study kernelization for Matching parameterized by the feedback vertex number, that is, the vertex deletion distance to a forest (see Section 2). As a warm-up we first show that a subset of our data reduction rules for the “feedback vertex set kernel” also yields a linear-time computable linear-size kernel for the typically much larger parameter feedback edge number (see Section 2.1). As for Bipartite Matching no faster algorithm is known than on general graphs, we kernelize Bipartite Matching with respect to the vertex deletion distance to chain graphs (see Section 3).

Seen from a high level, our two main results employ the same algorithmic strategy, namely upper-bounding (as a function of the parameter) the number of neighbors in the appropriate vertex deletion set $X$; that is, in the feedback vertex set or in the deletion set to chain graphs, respectively. To achieve this we develop new “irrelevant edge techniques” tailored to these two kernelization problems. More specifically, whenever a vertex $v$ of the deletion set $X$ has large degree, we efficiently detect edges incident to $v$ whose removal does not change the size of the maximum matching. Then the remaining graph can be further shrunk by scenario-specific data reduction rules. While this approach of removing irrelevant edges is natural, the technical details and the proofs of correctness can become quite involved and combinatorially challenging. In particular, for the case of feedback vertex number we could only upper-bound the number of neighbors of each vertex in $X$ by $2^{O(k)}$.

As a technical side remark, we emphasize that in order to achieve an (almost) linear-time kernelization algorithm, we often need to use suitable data structures and to carefully design the appropriate data reduction rules to be exhaustively applicable in linear time, making this form of “algorithm engineering” much more relevant than in the classical setting of mere polynomial-time data reduction rules.

##### Notation and Observations.

We use standard notation from graph theory. In particular, all paths we consider are simple paths. Two paths in a graph are called internally vertex-disjoint if they are either completely vertex-disjoint or they overlap only in their endpoints. A matching in a graph is a set of pairwise disjoint edges. Let $G = (V, E)$ be a graph and let $M \subseteq E$ be a matching in $G$. The degree of a vertex $v$ is denoted by $\deg(v)$. A vertex $v \in V$ is called matched with respect to $M$ if there is an edge in $M$ containing $v$, otherwise $v$ is called free with respect to $M$. If the matching $M$ is clear from the context, then we omit “with respect to $M$”. An alternating path with respect to $M$ is a path in $G$ such that every second edge of the path is in $M$. An augmenting path is an alternating path whose endpoints are free. It is well known that a matching $M$ is maximum if and only if there is no augmenting path for it. Let $M$ and $M'$ be two matchings in $G$. We denote by $G(M, M') := (V, M \triangle M')$ the graph containing only the edges in the symmetric difference of $M$ and $M'$, that is, $M \triangle M' := (M \cup M') \setminus (M \cap M')$. Observe that every vertex in $G(M, M')$ has degree at most two.
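The symmetric-difference graph of two matchings can be made concrete with a small sketch. The function names `normalize`, `is_matching`, and `symmetric_difference_graph` are hypothetical helpers introduced only for illustration, and edges are assumed to be given as pairs of hashable vertex labels in a simple graph.

```python
from collections import defaultdict


def normalize(edges):
    """Store each undirected edge as a frozenset so (u, v) equals (v, u)."""
    return {frozenset(e) for e in edges}


def is_matching(edges):
    """Return True if no two edges share an endpoint."""
    seen = set()
    for e in normalize(edges):
        if seen & e:
            return False
        seen |= e
    return True


def symmetric_difference_graph(m1, m2):
    """Adjacency sets of the graph whose edges lie in exactly one matching."""
    adj = defaultdict(set)
    for e in normalize(m1) ^ normalize(m2):
        u, v = tuple(e)
        adj[u].add(v)
        adj[v].add(u)
    return adj


# Path a-b-c-d: M = {bc} is a matching, M' = {ab, cd} is a maximum matching.
M = [("b", "c")]
M_prime = [("a", "b"), ("c", "d")]
adj = symmetric_difference_graph(M, M_prime)
# Every vertex is touched by at most one edge of each matching.
assert max(len(ns) for ns in adj.values()) <= 2
```

In this example the symmetric difference is the whole path, which is an augmenting path for the smaller matching.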

For a matching $M \subseteq E$ for $G$ we denote by $M^{\max}_G(M)$ a maximum matching in $G$ with the largest possible overlap (in number of edges) with $M$. That is, $M^{\max}_G(M)$ is a maximum matching in $G$ such that for each maximum matching $M'$ for $G$ it holds that $|M \triangle M'| \ge |M \triangle M^{\max}_G(M)|$. Observe that, if $M$ is a maximum matching for $G$, then $M^{\max}_G(M) = M$. Furthermore observe that $G(M, M^{\max}_G(M))$ consists of only odd-length paths and isolated vertices, and each of these paths is an augmenting path for $M$. Moreover, the paths in $G(M, M^{\max}_G(M))$ are as short as possible:

###### Observation 1.1.

For any path $v_1, v_2, \ldots, v_p$ in $G(M, M^{\max}_G(M))$ it holds that $\{v_{2i-1}, v_{2j}\} \notin E$ for every $1 \le i < j \le p/2$.

###### Proof.

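Assume that $\{v_{2i-1}, v_{2j}\} \in E$. Then $v_1, \ldots, v_{2i-1}, v_{2j}, \ldots, v_p$ is a shorter path which is also an augmenting path for $M$ in $G$. The corresponding maximum matching $M'$ satisfies $|M \triangle M'| < |M \triangle M^{\max}_G(M)|$, a contradiction to the definition of $M^{\max}_G(M)$. ∎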

###### Observation 1.2.

Let $G = (V, E)$ be a graph with a maximum matching $M_G$, let $X \subseteq V$ be a vertex subset of size $k$, and let $M_{G-X}$ be a maximum matching for $G - X$. Then, $|M_{G-X}| \le |M_G| \le |M_{G-X}| + k$.
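The observation is stated without proof. A short justification, written here as a sketch in the notation assumed above ($M_G$ a maximum matching of $G$, $M_{G-X}$ a maximum matching of $G - X$, and $|X| = k$), uses only the fact that each vertex of $X$ lies in at most one edge of $M_G$:

```latex
% Every matching of G - X is a matching of G:
|M_{G-X}| \le |M_G|.
% Dropping the edges of M_G that touch X removes at most |X| = k edges
% and leaves a matching of G - X:
|M_{G-X}| \;\ge\; \bigl|\{\, e \in M_G : e \cap X = \emptyset \,\}\bigr|
          \;\ge\; |M_G| - k .
```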

##### Kernelization.

A parameterized problem is a set of instances $(I, k)$ where $I \in \Sigma^*$ for a finite alphabet $\Sigma$, and $k \in \mathbb{N}$ is the parameter. We say that two instances $(I, k)$ and $(I', k')$ of parameterized problems $P$ and $P'$ are equivalent if $(I, k)$ is a yes-instance for $P$ if and only if $(I', k')$ is a yes-instance for $P'$. A kernelization is an algorithm that, given an instance $(I, k)$ of a parameterized problem $P$, computes in polynomial time an equivalent instance $(I', k')$ of $P$ (the kernel) such that $|I'| + k' \le f(k)$ for some computable function $f$. We say that $f$ measures the size of the kernel, and if $f(k) \in k^{O(1)}$, we say that $P$ admits a polynomial kernel. Often, a kernel is achieved by applying polynomial-time executable data reduction rules. We call a data reduction rule $R$ correct if the new instance $(I', k')$ that results from applying $R$ to $(I, k)$ is equivalent to $(I, k)$. An instance is called reduced with respect to some data reduction rule if further application of this rule has no effect on the instance.

## 2 Kernelization for Matching on General Graphs

In this section, we investigate the possibility of efficient and effective preprocessing for Matching. As a warm-up, we first present in Section 2.1 a simple, linear-size kernel for Matching with respect to the parameter “feedback edge set”. Exploiting the data reduction rules and ideas used for this kernel, we then present in Section 2.2 the main result of this section: an exponential-size kernel for the smaller parameter “feedback vertex number”.

### 2.1 Warm-up: Parameter feedback edge number

We provide a linear-time computable linear-size kernel for Matching parameterized by the feedback edge number, that is, the size of a minimum feedback edge set. Observe that a minimum feedback edge set can be computed in linear time via a simple depth-first search or breadth-first search. The kernel is based on the next two simple data reduction rules due to Karp and Sipser [16]. They deal with vertices of degree at most two.
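The remark that a minimum feedback edge set is found by a simple graph search can be sketched as follows: the edges outside any spanning forest form a minimum feedback edge set, so its size is $m - n + c$ for a graph with $c$ connected components. The function name `feedback_edge_set` and the input convention (vertices numbered `0..n-1`, edges as pairs) are assumptions made for this illustration.

```python
def feedback_edge_set(n, edges):
    """Return the edges that are not part of a spanning forest found by a graph search."""
    adj = [[] for _ in range(n)]
    for idx, (u, v) in enumerate(edges):
        adj[u].append((v, idx))
        adj[v].append((u, idx))

    visited = [False] * n
    tree_edge = [False] * len(edges)
    for root in range(n):
        if visited[root]:
            continue
        visited[root] = True
        stack = [root]
        while stack:
            u = stack.pop()
            for v, idx in adj[u]:
                if not visited[v]:
                    # First time v is reached: edge idx joins the spanning forest.
                    visited[v] = True
                    tree_edge[idx] = True
                    stack.append(v)
    return [edges[i] for i in range(len(edges)) if not tree_edge[i]]


# A triangle with a pendant vertex has feedback edge number 1.
print(len(feedback_edge_set(4, [(0, 1), (1, 2), (2, 0), (2, 3)])))
```

Each vertex and each edge is inspected a constant number of times, which matches the linear running time claimed in the text.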

###### Reduction Rule 2.1.

Let $v \in V$. If $\deg(v) = 0$, then delete $v$. If $\deg(v) = 1$, then delete $v$ and its neighbor and decrease the solution size $s$ by one ($v$ is matched with its neighbor).

###### Reduction Rule 2.2.

Let $v$ be a vertex of degree two and let $u, w$ be its neighbors. Then remove $v$, merge $u$ and $w$, and decrease the solution size $s$ by one.

The correctness was stated by Karp and Sipser [16]. For completeness, we give a proof.

###### Lemma 2.1.

Reduction Rules 2.2 and 2.1 are correct.

###### Proof.

If $v$ has degree zero, then clearly $v$ cannot be in any matching and we can remove $v$.

If $v$ has degree one, then let $u$ be its single neighbor. Let $M$ be a maximum matching of size at least $s$ for $G$. Then $u$ is matched in $M$ since otherwise adding the edge $\{u, v\}$ would increase the size of the matching. Thus, a maximum matching in $G' = G - u - v$ is of size at least $s - 1$. Conversely, a matching of size $s - 1$ in $G'$ can easily be extended by the edge $\{u, v\}$ to a matching of size $s$ in $G$.

If $v$ has degree two, then let $u$ and $w$ be its two neighbors. Let $M$ be a maximum matching of size at least $s$. If $v$ is not matched in $M$, then $u$ and $w$ are matched since otherwise adding the edge $\{u, v\}$ resp. $\{v, w\}$ would increase the size of the matching. Thus, deleting $v$ and merging $u$ and $w$ decreases the size of $M$ by one ($M$ loses either the edge incident to $v$ or one of the edges incident to $u$ and $w$). Hence, the resulting graph $G'$ has a maximum matching of size at least $s - 1$. Conversely, let $M'$ be a matching of size at least $s - 1$ for $G'$. If the merged vertex $uw$ is free, then $M' \cup \{\{u, v\}\}$ is a matching of size $s$ in $G$. Otherwise, $uw$ is matched to some vertex $y$. Then matching $y$ in $G$ with either $u$ or $w$ (at least one of the two vertices is a neighbor of $y$) and matching $v$ with the other vertex yields a matching of size $s$ for $G$. ∎

Although Reduction Rules 2.1 and 2.2 are correct, it is not clear whether Reduction Rule 2.2 can be exhaustively applied in linear time. However, for our purpose it suffices to consider the following restricted version which we can exhaustively apply in linear time.

###### Reduction Rule 2.3.

Let $v$ be a vertex of degree two and $u, w$ be its neighbors with $u$ and $w$ having degree at most two. Then remove $v$, merge $u$ and $w$, and decrease $s$ by one.

###### Lemma 2.2.

Reduction Rules 2.3 and 2.1 can be exhaustively applied in $O(n+m)$ time.

###### Proof.

We give an algorithm which exhaustively applies Reduction Rules 2.3 and 2.1 in linear time. First, using bucket sort, sort the vertices by degree and keep three lists containing all degree-zero/one/two vertices. Then one applies Reduction Rules 2.3 and 2.1 in a straightforward way. When a neighbor of a vertex is deleted, then check if the vertex has now degree zero, one, or two. If yes, then add the vertex to the corresponding list.

We next show that this algorithm runs in linear time. First, observe that the deletion of each degree-zero vertex can be done in constant time as no further vertices are affected. Second, consider a degree-one vertex $v$ with a neighbor $u$ and observe that deleting $u$ and $v$ can be done in $O(\deg(u))$ time since one needs to update the degrees of all neighbors of $u$. Furthermore, decreasing $s$ by one can be done in constant time for each deleted degree-one vertex. Finally, consider a degree-two vertex $v$ with two neighbors $u$ and $w$, each of degree at most two. Deleting $v$ takes constant time. To merge $u$ and $w$, iterate over all neighbors of $u$ and add them to the neighborhood of $w$. If a neighbor $u'$ of $u$ is already a neighbor of $w$, then decrease the degree of $u'$ by one. Then, relabel $w$ to be the new contracted vertex $uw$.

Overall, the worst-case running time to apply Reduction Rules 2.3 and 2.1 exhaustively can be upper-bounded by $O(n+m)$. ∎
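The worklist idea from this proof can be sketched as follows. This is an illustrative sketch rather than the authors' implementation: the function name `reduce_low_degree` and the dict-of-sets graph representation are assumptions, and the code aims at exhaustive application of Reduction Rules 2.1 and 2.3, not at a tuned linear-time data structure.

```python
def reduce_low_degree(adj, s):
    """Exhaustively apply Reduction Rules 2.1 and 2.3.

    adj: dict mapping each vertex to the set of its neighbors (simple
         undirected graph); it is modified in place.
    s:   requested matching size; the decreased value is returned.
    """
    work = [v for v in adj if len(adj[v]) <= 2]

    def enqueue(x):
        # Re-examine x and, if x now has low degree, also its neighbors,
        # because Rule 2.3 on a neighbor depends on the degree of x.
        work.append(x)
        if len(adj[x]) <= 2:
            work.extend(adj[x])

    while work:
        v = work.pop()
        if v not in adj:
            continue  # v was deleted or merged away earlier
        if len(adj[v]) == 0:
            del adj[v]  # Rule 2.1: isolated vertex
        elif len(adj[v]) == 1:
            (u,) = adj[v]  # Rule 2.1: match v with its only neighbor u
            touched = adj[u] - {v}
            for x in touched:
                adj[x].discard(u)
            del adj[u], adj[v]
            s -= 1
            for x in touched:
                enqueue(x)
        elif len(adj[v]) == 2:
            u, w = adj[v]
            if len(adj[u]) > 2 or len(adj[w]) > 2:
                continue  # Rule 2.3 does not apply (yet)
            adj[u].discard(v)
            adj[w].discard(v)
            del adj[v]
            for x in adj[u]:  # merge u into w, dropping loops and duplicates
                adj[x].discard(u)
                if x != w:
                    adj[x].add(w)
                    adj[w].add(x)
            del adj[u]
            s -= 1
            for x in list(adj[w]):
                enqueue(x)
            enqueue(w)
    return s


# A path on four vertices has a matching of size 2: the rules consume it entirely.
path = {"a": {"b"}, "b": {"a", "c"}, "c": {"b", "d"}, "d": {"c"}}
remaining = reduce_low_degree(path, 2)
```

After the call the graph is empty and the remaining target is zero, so the reduced instance is a trivial yes-instance.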

###### Theorem 2.3.

Matching admits a linear-time computable linear-size kernel with respect to the parameter “feedback edge number” $k$.

###### Proof.

Apply Reduction Rules 2.3 and 2.1 exhaustively in linear time (see Lemma 2.2). We claim that the reduced graph $G = (V, E)$ has less than $12k$ vertices and $13k$ edges. Denote with $X \subseteq E$ a feedback edge set for $G$, $|X| \le k$. Furthermore, denote with $V^1_{G-X}$, $V^2_{G-X}$, and $V^{\ge 3}_{G-X}$ the vertices that have degree one, two, and more than two in $G - X$. Thus, $|V^1_{G-X}| \le 2k$ as each leaf in $G - X$ has to be incident to an edge in $X$. Next, since $G - X$ is a forest (or tree), we have $|V^{\ge 3}_{G-X}| < |V^1_{G-X}|$ and thus $|V^{\ge 3}_{G-X}| < 2k$. Finally, each degree-two vertex in $G$ needs at least one neighbor of degree at least three since $G$ is reduced with respect to Reduction Rule 2.3. Thus, the vertices in $V^2_{G-X}$ are either incident to an edge in $X$ or adjacent to one of the at most $2k$ vertices in $G - X$ that have degree at least three. Since the sum over all degrees of vertices in $V^{\ge 3}_{G-X}$ is at most $|V^1_{G-X}| + 2|V^{\ge 3}_{G-X}| < 6k$, it follows that $|V^2_{G-X}| < 2k + 6k = 8k$. Thus, the number of vertices in $G$ is less than $12k$. Since $G - X$ is a forest, it follows that $G$ has at most $|V| + k < 13k$ edges. ∎

Applying the $O(m\sqrt{n})$-time algorithm for Matching [17] on the kernel yields:

###### Corollary 2.4.

Matching can be solved in $O(n + m + k^{1.5})$ time, where $k$ is the feedback edge number.
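The arithmetic behind this corollary can be spelled out in one line. The sketch below assumes that the kernel of Theorem 2.3 has $n' = O(k)$ vertices and $m' = O(k)$ edges and that the kernel is solved with the $O(m\sqrt{n})$-time algorithm [17]; the symbols $n'$ and $m'$ are introduced here only for illustration.

```latex
\underbrace{O(n + m)}_{\text{kernelization}}
\;+\;
\underbrace{O\bigl(m' \sqrt{n'}\bigr)}_{\text{solving the kernel}}
\;=\; O(n + m) + O\bigl(k \sqrt{k}\bigr)
\;=\; O\bigl(n + m + k^{1.5}\bigr).
```

This is an instance of turning a multiplicative dependence on the parameter into an additive one.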

### 2.2 Parameter feedback vertex number

We next provide for Matching a kernel of size $2^{O(k)}$ computable in $O(kn)$ time, where $k$ is the “feedback vertex number”. Using a known linear-time factor-4 approximation algorithm [1], we can approximate a feedback vertex set and use it in our kernelization algorithm.

Roughly speaking, our kernelization algorithm extends the linear-time computable kernel with respect to the parameter “feedback edge set”. Thus, Reduction Rules 2.3 and 2.1 play an important role in the kernelization. Compared to the other kernels presented in this paper, the kernel presented here comes at the price of higher running time $O(kn)$ and bigger kernel size (exponential size). It remains open whether Matching parameterized by the “feedback vertex number” admits a linear-time computable kernel (possibly of exponential size), and whether it admits a polynomial kernel computable in $O(kn)$ time.

Subsequently, we describe our kernelization algorithm which keeps in the kernel all vertices in the given feedback vertex set $X$ and shrinks the size of $G - X$. Before doing so, we need some further notation. In this section, we assume that each tree is rooted at some arbitrary (but fixed) vertex such that we can refer to the parent and children of a vertex. A leaf in $G - X$ is called a bottommost leaf either if it has no siblings or if all its siblings are also leaves. (Here, bottommost refers to the subtree with the root being the parent of the considered leaf.) The outline of the algorithm is as follows (we assume throughout that  since otherwise the input instance is already a kernel of size ):

1. Reduce $G$ with respect to Reduction Rules 2.1 and 2.3.

2. Compute a maximum matching $M_{G-X}$ in $G - X$.

3. Modify $M_{G-X}$ in linear time such that only the leaves of $G - X$ are free (Section 2.2.1).

4. Bound the number of free leaves in $G - X$ by $k^2$ (Section 2.2.2).

5. Bound the number of bottommost leaves in  by (Section 2.2.3).

6. Bound the degree of each vertex in  by . Then, use Reduction Rules 2.1 and 2.3 to provide the kernel of size (Section 2.2.4).
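The notion of a bottommost leaf used in this outline can be illustrated with a small sketch. It assumes the rooted forest is given as a hypothetical `children` dictionary (vertex to list of children) together with the list of roots; a leaf is taken to be a vertex without children, and roots themselves are never reported.

```python
def bottommost_leaves(children, roots):
    """Return all leaves whose siblings (if any) are leaves as well."""
    result = []
    stack = list(roots)
    while stack:
        v = stack.pop()
        kids = children.get(v, [])
        # The children of v are bottommost leaves exactly if all of them are leaves.
        if kids and all(not children.get(c) for c in kids):
            result.extend(kids)
        stack.extend(kids)
    return result


# r has children a and b; a has the leaf children c and d.
# b is a leaf but not bottommost, because its sibling a is not a leaf.
tree = {"r": ["a", "b"], "a": ["c", "d"]}
print(sorted(bottommost_leaves(tree, ["r"])))
```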

Whenever we reduce the graph at some step, we also show that the reduction is correct. That is, the given instance is a yes-instance if and only if the reduced one is a yes-instance. The correctness of our kernelization algorithm then follows by the correctness of each step. We discuss in the following some details of each step.

#### 2.2.1 Items 1, 2 and 3

By Lemma 2.2 we can perform Item 1 in linear time. By Lemma 2.1 this step is correct.

A maximum matching in Item 2 can be computed by repeatedly matching a free leaf to its neighbor and by removing both vertices from the graph (thus effectively applying Reduction Rule 2.1 to $G - X$). By Lemma 2.2, this can be done in linear time.
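The leaf-matching procedure for forests can be sketched as follows. The function name `forest_maximum_matching` and the dict-of-sets input are assumptions for illustration; the input is assumed to be a forest and is not modified.

```python
def forest_maximum_matching(adj):
    """Match a leaf with its neighbor, remove both, and repeat."""
    deg = {v: len(ns) for v, ns in adj.items()}
    removed = set()
    leaves = [v for v, d in deg.items() if d == 1]
    matching = []
    while leaves:
        v = leaves.pop()
        if v in removed or deg[v] != 1:
            continue  # v was matched already or has become isolated
        u = next(x for x in adj[v] if x not in removed)
        matching.append((v, u))
        removed.update((u, v))
        for x in adj[u]:
            if x not in removed:
                deg[x] -= 1
                if deg[x] == 1:
                    leaves.append(x)  # x became a leaf of the remaining forest
    return matching


path = {"a": {"b"}, "b": {"a", "c"}, "c": {"b", "d"}, "d": {"c"}}
print(len(forest_maximum_matching(path)))
```

Vertices that become isolated during the process stay free, which is consistent with applying Reduction Rule 2.1 to degree-zero vertices.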

Item 3 can be done in $O(n+m)$ time by traversing each tree in $G - X$ in a BFS manner starting from the root: If a visited inner vertex $v$ is free, then observe that all children are matched since $M_{G-X}$ is maximum. Pick an arbitrary child $u$ of $v$ and match it with $v$. The vertex $w$ that was previously matched to $u$ is now free and since it is a child of $u$, it will be visited in the future. Observe that Items 2 and 3 do not change the graph but only the auxiliary matching $M_{G-X}$, and thus these steps are correct.
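The BFS step of Item 3 can be sketched as follows. The representation is an assumption made for illustration: `children` maps each vertex of the rooted forest to its list of children, and `mate` maps each matched vertex to its partner (free vertices are absent). The sketch assumes that `mate` encodes a maximum matching of the forest, as the text requires.

```python
from collections import deque


def push_free_vertices_to_leaves(children, roots, mate):
    """Rematch top-down so that afterwards every free vertex is a leaf."""
    queue = deque(roots)
    while queue:
        v = queue.popleft()
        kids = children.get(v, [])
        if kids and v not in mate:
            u = kids[0]   # arbitrary child; it is matched because the matching is maximum
            w = mate[u]   # w is a child of u, since the parent v of u was free
            del mate[w]   # w becomes free and is visited later in the BFS
            mate[u] = v
            mate[v] = u
        queue.extend(kids)
    return mate


# Path r - a - b - c - d rooted at r, with matching {a,b}, {c,d}: r is a free inner vertex.
children = {"r": ["a"], "a": ["b"], "b": ["c"], "c": ["d"]}
mate = {"a": "b", "b": "a", "c": "d", "d": "c"}
push_free_vertices_to_leaves(children, ["r"], mate)
```

In the example the matching becomes $\{r,a\}, \{b,c\}$, and the only free vertex is the leaf `d`; the size of the matching does not change.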

#### 2.2.2 Item 4

Recall that our goal is to upper-bound the number of edges between vertices of $X$ and $V \setminus X$, since we can then use a simple analysis as for the parameter “feedback edge set”. Furthermore, recall that by Observation 1.2 the size of any maximum matching in $G$ is at most $k$ plus the size of $M_{G-X}$. Now, the crucial observation is that if a vertex $v \in X$ has at least $k$ neighbors in $V \setminus X$ that are free wrt. $M_{G-X}$, then there exists a maximum matching where $v$ is matched to one of these $k$ vertices since at most $k-1$ can be “blocked” by other matching edges. This means we can delete all other edges incident to $v$. Formalizing this idea, we obtain the following reduction rule.

###### Reduction Rule 2.4.

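Let $G = (V, E)$ be a graph, let $X \subseteq V$ be a subset of size $k$, and let $M_{G-X}$ be a maximum matching for $G - X$. If there is a vertex $v \in X$ with at least $k$ free neighbors $V_v = \{u_1, \ldots, u_k\} \subseteq V \setminus X$, then delete all edges from $v$ to vertices in $V \setminus V_v$.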

###### Lemma 2.5.

Reduction Rule 2.4 is correct and can be exhaustively applied in $O(n+m)$ time.

###### Proof.

We first discuss the correctness and then the running time. Denote by $s$ the size of a maximum matching in the input graph $G = (V, E)$ and by $s'$ the size of a maximum matching in the new graph $G' = (V, E')$, where some edges incident to $v$ are deleted. We need to show that $s = s'$. Since any matching in $G'$ is also a matching in $G$, we easily obtain $s \ge s'$. It remains to show $s \le s'$. To this end, let $M_G := M^{\max}_G(M_{G-X})$ be a maximum matching for $G$ with the maximum overlap with $M_{G-X}$ (see the notation in Section 1). If $v$ is free wrt. $M_G$ or if $v$ is matched to a vertex $u$ that is also in $G'$ a neighbor of $v$, then $M_G$ is also a matching in $G'$ ($M_G \subseteq E'$) and thus we have in this case $s = s'$. Hence, consider the remaining case where $v$ is matched to some vertex $u$ such that $\{u, v\} \notin E'$, that is, the edge $\{u, v\}$ was deleted by Reduction Rule 2.4. Hence, $v$ has $k$ neighbors $V_v = \{u_1, \ldots, u_k\}$ in $G - X$ such that each of these neighbors is free wrt. $M_{G-X}$ and none of the edges $\{u_i, v\}$, $i \in \{1, \ldots, k\}$, was deleted. Observe that by the choice of $M_G$, the graph $G(M_{G-X}, M_G)$ (the graph over vertex set $V$ and the edges that are either in $M_{G-X}$ or in $M_G$, see the notation in Section 1) contains exactly $s - |M_{G-X}|$ paths (we do not consider isolated vertices as paths). Each of these paths is an augmenting path for $M_{G-X}$. By Observation 1.2, we have $s - |M_{G-X}| \le k$. Observe that $\{u, v\}$ is an edge in one of these augmenting paths; denote this path with $P$. Thus, there are at most $k-1$ paths in $G(M_{G-X}, M_G)$ that do not contain $v$. Also, each of these paths contains exactly two vertices that are free wrt. $M_{G-X}$: the endpoints of the path. This means that no vertex in $V_v$ is an inner vertex on such a path. Furthermore, since $M_{G-X}$ is a maximum matching, it follows that for each path at most one of these two endpoints is in $V_v$. Hence, at most $k-1$ vertices of $V_v$ are contained in the $k-1$ paths of $G(M_{G-X}, M_G)$ except $P$. Therefore, one of these vertices, say $u_i$, is free wrt. $M_G$ and can be matched with $v$. Thus, by reversing the augmentation along $P$ and adding the edge $\{u_i, v\}$ we obtain another matching $M'_G$ of size $s$. Observe that $M'_G$ is a matching for $G$ and for $G'$ and thus we have $s = s'$. This completes the proof of correctness.

Now we come to the running time. We exhaustively apply the data reduction rule as follows. First, initialize for each vertex $v \in X$ a counter with zero. Second, iterate over all free vertices in $G - X$ in an arbitrary order. For each free vertex $u \in V \setminus X$ iterate over its neighbors in $X$. For each neighbor $v \in X$ do the following: if the counter is less than $k$, then increase the counter by one and mark the edge $\{u, v\}$ (initially all edges are unmarked). Third, iterate over all vertices in $X$. If the counter of the currently considered vertex $v$ is $k$, then delete all unmarked edges incident to $v$. This completes the algorithm. Clearly, it deletes edges incident to a vertex $v \in X$ only if $v$ has $k$ free neighbors in $G - X$ and the edges to these $k$ neighbors are kept. The running time is $O(n+m)$: When iterating over all free vertices in $G - X$ we consider each edge at most once. Furthermore, when iterating over the vertices in $X$, we again consider each edge at most once. ∎
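The counter-and-marking procedure from this proof can be sketched as follows. The function name `apply_rule_2_4`, the dict-of-sets graph, and the explicit list `free` of vertices outside the deletion set that are free with respect to the auxiliary maximum matching are all assumptions for illustration; the threshold defaults to the size of the deletion set and can be overridden.

```python
def apply_rule_2_4(adj, X, free, k=None):
    """Keep, for each vertex of X with enough free neighbors, only edges to k of them.

    adj:  dict vertex -> set of neighbors, modified in place.
    X:    the vertex deletion set.
    free: vertices outside X that are free w.r.t. a maximum matching of G - X.
    """
    X = set(X)
    if k is None:
        k = len(X)
    counter = {v: 0 for v in X}
    marked = {v: set() for v in X}  # marked[v] holds the free neighbors kept for v

    for u in free:
        for v in adj[u]:
            if v in X and counter[v] < k:
                counter[v] += 1
                marked[v].add(u)

    for v in X:
        if counter[v] == k:
            for x in adj[v] - marked[v]:
                adj[x].discard(v)  # delete the unmarked edge {v, x}
            adj[v] = set(marked[v])
    return adj


adj = {
    "x": {"f1", "f2", "f3", "a"}, "y": {"f1"},
    "f1": {"x", "y"}, "f2": {"x"}, "f3": {"x"},
    "a": {"x", "b"}, "b": {"a"},
}
apply_rule_2_4(adj, X={"x", "y"}, free=["f1", "f2", "f3"])
```

In this example `x` keeps only its edges to `f1` and `f2`, while `y` has fewer than two free neighbors and is left unchanged.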

To finish Item 4, we exhaustively apply Reduction Rule 2.4 in linear time. Afterwards, there are at most $k^2$ free (wrt. $M_{G-X}$) leaves in $G - X$ that have at least one neighbor in $X$ since each of the $k$ vertices in $X$ is adjacent to at most $k$ free leaves. Thus, applying Reduction Rule 2.1 we can remove the remaining free leaves that have no neighbor in $X$. However, since for each degree-one vertex also its neighbor is removed, we might create new free leaves and need to again apply Reduction Rule 2.4 and update the matching (see Item 3). This process of alternating application of Reduction Rules 2.4 and 2.1 stops after at most $k$ rounds since the neighborhood of each vertex in $X$ can be changed by Reduction Rule 2.4 at most once. This shows the running time $O(k(n+m))$. We next show how to improve this to $O(n+m)$ and arrive at the final lemma of this subsection.

###### Lemma 2.6.

Given a matching instance $(G, s)$ and a feedback vertex set $X$, Algorithm 1 computes in linear time an instance $(G', s')$ with feedback vertex set $X$ and a maximum matching $M_{G'-X}$ in $G' - X$ such that the following holds.

• There is a matching of size $s$ in $G$ if and only if there is a matching of size $s'$ in $G'$.

• Each vertex that is free wrt. $M_{G'-X}$ is a leaf in $G' - X$.

• There are at most $k^2$ free leaves in $G' - X$.

###### Proof.

In the following, we explain Algorithm 1, which reduces the graph with respect to Reduction Rules 2.4 and 2.1 and updates the matching $M_{G-X}$ as described in Item 3. In its first lines, the algorithm performs Items 1, 2 and 3. As described in the previous section, this can be done in linear time. Next, Reduction Rule 2.4 is applied using the approach described in the proof of Lemma 2.5: For each vertex $v$ in $X$ a counter is maintained. When iterating over the free leaves in $G - X$, these counters will be updated. If the counter of $v$ reaches $k$, then the algorithm knows that $v$ has $k$ fixed free neighbors and according to Reduction Rule 2.4 the edges to all other vertices can be deleted (see Algorithm 1). Observe that once the counter of $v$ reaches $k$, the vertex $v$ will never be considered again by the algorithm since its only remaining neighbors are free leaves in $G - X$ that already have been popped from the stack of free leaves. The only difference from the description in the proof of Lemma 2.5 is that the algorithm reacts if the degree of some vertex $u$ in $G$ is decreased to one (see Algorithm 1). If $u$ is matched, then simply remove $u$ and its matched neighbor from $G$ and $M_{G-X}$. Otherwise, add $u$ to the list of unmatched degree-one vertices and defer dealing with $u$ to a later stage of the algorithm.

Observe that the matching $M_{G-X}$ still satisfies the property that each free vertex in $G - X$ is a leaf since only matched vertex pairs were deleted so far. When deleting unmatched degree-one vertices and their respective neighbor, the maximum matching $M_{G-X}$ needs to be updated to satisfy this property. The algorithm does this in its second part: Let $u$ be an entry in the list of unmatched degree-one vertices such that $u$ has degree one at this point of Algorithm 1, that is, $u$ is a free leaf in $G - X$ and has no neighbors in $X$. Then, following Reduction Rule 2.1, delete $u$ and its neighbor $v$ and decrease the solution size $s$ by one (see Algorithm 1). Let $w$ denote the previously matched neighbor of $v$. Since $v$ was removed, $w$ is now free. If $w$ is a leaf in $G - X$, then we can simply add it to the list and in this way deal with it later. If $w$ is not a leaf, then we need to update $M_{G-X}$ since only leaves are allowed to be free. To this end, take an arbitrary alternating path $P$ from $w$ to a leaf of the subtree with root $w$ and augment along $P$ (see Algorithm 1). This can be done as follows: Pick an arbitrary child $w_1$ of $w$. Let $w_2$ be the matched neighbor of $w_1$. Since $w$ is the parent of $w_1$, it follows that $w_2$ is a child of $w_1$. Now, remove $\{w_1, w_2\}$ from $M_{G-X}$ and add $\{w, w_1\}$. If $w_2$ is a leaf, then the alternating path $P$ is found with $w_2$ as its endpoint and augmented. Otherwise, repeat the above procedure with $w_2$ taking the role of $w$. This completes the algorithm. Its correctness follows from the fact that it only deletes edges and vertices according to Reduction Rules 2.4 and 2.1.

It remains to show the running time of $O(n+m)$. To this end, we prove that the algorithm considers each edge in $E$ only two times. First, consider the edges incident to a vertex $v \in X$. Each of these edges will be inspected at most twice by the algorithm: once when it is marked (see Algorithm 1), and a second time when it is deleted. This bounds the running time in the first part of Algorithm 1.

Now consider the remaining edges within . To this end, observe that the algorithm performs two actions on the edges: deleting the edges (Algorithm 1) and finding and augmenting along an alternating path (Algorithms 1 and 1). Clearly, after deleting an edge it will no longer be considered, so it remains to show that edge is part of at most one alternating path used in Algorithms 1 and 1. Assume toward a contradiction that the algorithm augments along an edge twice or more. From all the edges that are augmented twice or more let  be one that is closest to the root of the tree  is contained in, that is, there is no edge closer to a root. Let  and  be the endpoints of the first augmenting path  containing  and  and  the endpoints of the second augmenting path  containing . Observe that for each augmenting path chosen in Algorithm 1 it holds that one endpoint is a leaf and the other endpoint is an ancestor of this leaf. Assume without loss of generality that  and  are the leaves and  and  are their respective ancestors. Let  and  ( and ) be the vertices deleted in Algorithm 1 which in turn made  () free. Observe that  does not contain any of these four vertices  since before augmenting  () the vertices  and  ( and ) are deleted. Since  is contained in both paths, either  is an ancestor of  or vice versa: the case cannot happen since for the second augmenting path the endpoint  would not be matched to ; a contradiction (see Algorithm 1).

We next consider the case that  is an ancestor of  (the other case will be handled subsequently). Denote with  the neighbor of  on . Observe that  since  is chosen as being closest to the root. We next distinguish the two cases whether or not  is initially matched. If  is initially free, then is matched after augmenting along . Then, by choice of , is not changed until the augmentation along . This, however, is a contradiction since augmenting along  only happens after the matched edge  is deleted. Since and  is matched all the time until  and  are deleted, this means that  would be matched to two vertices. Thus, consider the case that  is initially matched. Then, after augmenting along , is free and  is matched to its parent . As a consequence, is not matched to , neither before nor after the augmentation of . Since the algorithm augments along  only after it deleted  and  where  is matched to , it follows that the edge  is augmented before the algorithm augments along . Denote with  the augmenting path containing the edge . Since  is apparently not a free leaf, it follows that  needs to contain the matched neighbor of , which is . This means that the edge  is augmented at least twice (through  and ). However, is closer to the root than , a contradiction to the choice of . This completes the case that  is an ancestor of .

We now consider the remaining case where  is an ancestor of . In this case we have  where  is the neighbor of  on . Observe that  is a child of . Furthermore, observe that after the augmentation along  the leaf  is free and can be reached by an alternating path from . Hence, before and after the augmentation along  it holds that  can reach exactly one free leaf via an alternating path ( and ). Observe that this is true even if the algorithm removes  since then a new free leaf will be created. Thus, before deleting  and  (right before the augmentation along ), there is an augmenting path in  from  to  and to the free leaf reachable from . This is a contradiction to the fact that the matching  is maximum.

We conclude that each edge in  will be augmented at most once. Thus, the algorithm considers each edge at most twice (when augmenting it and when deleting it). Hence, the algorithm runs in linear time. ∎

Summarizing, in Item 4 we apply Algorithm 1 in order to obtain an instance with at most  free vertices in  that are all leaves. By Lemma 2.6 this can be done in linear time. Furthermore, Lemma 2.6 also shows that the step is correct.

#### 2.2.3 Item 5

In this step we reduce the graph in  time so that at most  bottommost leaves will remain in the forest . We will restrict ourselves to consider leaves that are matched with their parent vertex in  and that do not have a sibling. We call these bottommost leaves interesting. Any sibling of a bottommost leaf is by definition also a leaf. Thus, at most one of these leaves (the bottommost leaf or its siblings) is matched with respect to  and all other leaves are free. Recall that in the previous step we upper-bounded the number of free leaves with respect to  by . Hence there are at most  bottommost leaves that are not interesting.

Our general strategy for this step is to extend the idea behind Reduction Rule 2.4: We want to keep for each pair of vertices  at most  different internally vertex-disjoint augmenting paths from  to . (For ease of notation we keep  paths although keeping  is sufficient.) In this step, we only consider augmenting paths of the form  where  is a bottommost leaf and  is ’s parent in . Assume that the parent  of  is adjacent to some vertex . Observe that in this case any augmenting path starting with the two vertices  and  has to continue to  and end in a neighbor of . Thus, the edge  can be only used in augmenting paths of length three. Furthermore, all these length-three augmenting paths are clearly internally vertex-disjoint. If we do not need the edge  because we kept augmenting paths from  already, then we can delete . Furthermore, if we deleted the last edge from  to  (or had no neighbors in  in the beginning), then  is a degree-two vertex in  and can be removed by applying Reduction Rule 2.2. As the child  of  is a leaf in , it follows that  has at most  neighbors in . We show below (Lemma 2.7) that the application of Reduction Rule 2.2 to remove  takes  time. As we remove at most  vertices, at most  time is spent on Reduction Rule 2.2 in this step.

We now show that after a simple preprocessing one application of Reduction Rule 2.2 in the algorithm above can indeed be performed in  time.

###### Lemma 2.7.

Let  be a leaf in the tree , be its parent, and let  be the parent of . If  has degree two in , then applying Reduction Rule 2.2 to  (deleting , contracting  and , and setting ) can be done in  time plus  time for an initial preprocessing.

###### Proof.

The preprocessing is to simply create a partial adjacency matrix for  with the vertices in  in one dimension and  in the other dimension. This adjacency matrix has size  and can clearly be computed in  time.

Now apply Reduction Rule 2.2 to . Deleting  takes constant time. To merge  and , iterate over all neighbors of . If a neighbor  of  is already a neighbor of , then decrease the degree of  by one; otherwise, add  to the neighborhood of . Then, relabel  to be the new merged vertex .

Since  is a leaf in  and its only neighbor in , namely , is deleted, it follows that all remaining neighbors of  are in . Thus, using the above adjacency matrix, one can check in constant time whether  is a neighbor of . Hence, the above algorithm runs in  time. ∎
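The merge step from the proof above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the graph is assumed to be stored as a dictionary of neighbor sets, the feedback vertex set is assumed to be given as a list `X`, and all function and variable names (`build_partial_matrix`, `apply_degree_two_rule`, `u`, `v`, `w`) are hypothetical. The bookkeeping of the solution size mentioned in the lemma is left to the caller.

```python
def build_partial_matrix(adj, forest_vertices, X):
    """Preprocessing: Boolean matrix with forest vertices as rows and X as columns."""
    col = {x: j for j, x in enumerate(X)}
    row = {t: i for i, t in enumerate(forest_vertices)}
    matrix = [[False] * len(X) for _ in forest_vertices]
    for t in forest_vertices:
        for nb in adj[t]:
            if nb in col:
                matrix[row[t]][col[nb]] = True
    return matrix, row, col


def apply_degree_two_rule(adj, matrix, row, col, u, v, w):
    """Delete the degree-two vertex v and merge its leaf child u into w.

    Assumes u is a leaf of the forest, v is its parent, w is the parent of v,
    and v has no neighbors other than u and w.
    """
    # Deleting v touches only its two incident edges.
    adj[u].discard(v)
    adj[w].discard(v)
    del adj[v]
    # All remaining neighbors of u lie in X, so each lookup is one matrix access.
    for x in adj[u]:
        adj[x].discard(u)
        if not matrix[row[w]][col[x]]:
            # x was not yet a neighbor of w: add the edge to the merged vertex.
            matrix[row[w]][col[x]] = True
            adj[w].add(x)
            adj[x].add(w)
        # Otherwise the parallel edge is dropped and x simply loses a neighbor.
    del adj[u]
    return w  # w now plays the role of the merged vertex
```

With hash-based neighbor sets the matrix is not strictly necessary in Python; it is kept here only to mirror the constant-time lookup used in the proof.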

The above ideas are used in Algorithm 2 which we use for this step (Item 5).

The algorithm is explained in the proof of the following lemma stating the correctness and the running time of Algorithm 2.

###### Lemma 2.8.

Let  be a matching instance, let be a feedback vertex set, and let  be a maximum matching for  with at most  free vertices in  that are all leaves. Then, Algorithm 2 computes in  time an instance  with feedback vertex set  and a maximum matching  in  such that the following holds.

• There is a matching of size  in  if and only if there is a matching of size  in .

• There are at most  bottommost leaves in .

• There are at most  free vertices in  and they are all leaves.

###### Proof.

We start by describing the basic idea of the algorithm. To this end, let  be an edge such that  is an interesting bottommost leaf, that is, without siblings and matched to its parent  by . Counting for each pair  and  one augmenting path gives in a simple worst-case analysis  time per edge, which is too slow for our purposes. Instead, we count for each pair consisting of a vertex  and a set  one augmenting path. In this way, we know that for each  there is one augmenting path from  to  without iterating through all . This comes at the price of considering up to  such pairs. However, we will show that we can do the computations in  time per considered edge in . The main reason for this improved running time is a simple preprocessing that allows for a bottommost vertex  to determine  in constant time.

The preprocessing is as follows (see Algorithms 2 to 2): First, fix an arbitrary bijection  from the set of all subsets of  to the numbers . This can be done, for example, by representing a set  by a length- binary string (a number) where the  position is 1 if and only if . Given a set , such a number can be computed in  time in a straightforward way. Thus, Algorithms 2 to 2 can be performed in  time. Furthermore, since we assume that  (otherwise the input instance is already an exponential kernel), we have that  for each . Thus, reading and comparing these numbers can be done in constant time. Furthermore, in Algorithm 2 the algorithm precomputes for each vertex the number corresponding to its neighborhood in .
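One way to realize such a bijection is a bitmask, sketched below. The sketch assumes that the feedback vertex set is given as an ordered list and that bit j of the code stands for the j-th vertex of that list; the names `neighborhood_code` and `position` are hypothetical and chosen only for illustration.

```python
def neighborhood_code(neighbors, position):
    """Encode the neighbors that lie in the feedback vertex set as an integer.

    `position` maps each feedback vertex to its bit index; vertices that are
    not in `position` (forest vertices) are ignored.
    """
    code = 0
    for nb in neighbors:
        j = position.get(nb)
        if j is not None:
            code |= 1 << j  # set the bit belonging to this feedback vertex
    return code


# Usage: three feedback vertices 'a', 'b', 'c' and one forest vertex 't'.
position = {x: j for j, x in enumerate(['a', 'b', 'c'])}
code = neighborhood_code({'a', 'c', 't'}, position)  # bits 0 and 2 are set
```

Two vertices have the same neighborhood in the feedback vertex set exactly when their codes are equal, so the comparison reduces to a single integer comparison.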

After the preprocessing, the algorithm uses a table  where it counts an augmenting path from a vertex  to a set  whenever a bottommost leaf  has exactly  as neighborhood in  and the parent of  is adjacent to  (see Algorithms 2 to 2). To do this in  time, the algorithm proceeds as follows: First, it computes in Algorithm 2 the set  which contains all parents of interesting bottommost leaves. Clearly, this can be done in linear time. Next, the algorithm processes the vertices in . Observe that further vertices might be added to  (see Algorithm 2) during this processing. Let  be the currently processed vertex of , let  be its child vertex, and let  be the neighborhood of  in . For each neighbor , the algorithm checks whether there are already  augmenting paths between  and  with a table lookup in  (see Algorithm 2). If not, then the table entry is incremented by one (see Algorithm 2) since  and  provide another augmenting path. If yes, then the edge  is deleted in Algorithm 2 (we show below that this does not change the maximum matching size). If  has degree two after processing all neighbors of  in , then by applying Reduction Rule 2.2, we can remove  and contract its two neighbors  and . It follows from Lemma 2.7 that this application of Reduction Rule 2.2 can be done in  time. Hence, Algorithm 2 runs in  time.

Recall that all vertices in  that are free wrt.  are leaves. Thus, the changes to  by applying Reduction Rule 2.2 in Algorithm 2 are as follows: first, the edge  is removed, and second, the edge  is replaced by  for some . Hence, the matching  after running Algorithm 2 still has at most  free vertices, and all of them are leaves.

It remains to prove that (a) the deletion of the edge  in Algorithm 2 results in an equivalent instance and (b) that the resulting instance has at most  bottommost leaves. First, we show (a). To this end, assume towards a contradiction that the new graph  has a smaller maximum matching than  (clearly, cannot have a larger maximum matching). Thus, any maximum matching  for  has to contain the edge . This implies that the child  of  in  is matched in  with one of its neighbors (except ): If  is free wrt. , then deleting  from  and adding  yields another maximum matching not containing , a contradiction. Recall that  where  since  is a leaf in . Thus, each maximum matching  for  contains for some  the edge . Observe that Algorithm 2 deletes  only if there are at least  other interesting bottommost leaves  in  such that their respective parent is adjacent to  and  (see Algorithms 2 to 2). Since , it follows by the pigeonhole principle that at least one of these vertices, say , is not matched to any vertex in . Thus, since  is an interesting bottommost leaf, it is matched to its only remaining neighbor: its parent  in . This implies that there is another maximum matching

$$M'_G := \bigl(M_G \setminus \{\{v,y\},\{x,u\},\{u_i,v_i\}\}\bigr) \cup \{\{v_i,y\},\{x,u_i\},\{u,v\}\},$$

a contradiction to the assumption that all maximum matchings for  have to contain .

We next show (b) that the resulting instance has at most  bottommost leaves. To this end, recall that there are at most  bottommost leaves that are not interesting (see discussion at the beginning of this subsection). Hence, it remains to upper-bound the number of interesting bottommost leaves. Observe that each parent  of an interesting bottommost leaf has to be adjacent to a vertex in  since otherwise  would have been deleted in Algorithm 2. Furthermore, after running Algorithm 2, each vertex  is adjacent to at most  parents of interesting bottommost leaves (see Algorithms 2 to 2). Thus, the number of interesting bottommost leaves is at most . Therefore the number of bottommost leaves is upper-bounded by . ∎
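The counting scheme analyzed in this proof can be sketched as follows. The sketch makes several assumptions that are not taken from the text: the graph is a dictionary of neighbor sets, the feedback vertex set is an ordered list `X`, the parents of interesting bottommost leaves and their children are passed in explicitly, and the threshold on the number of kept augmenting paths is a parameter `limit` rather than a fixed bound. The subsequent application of Reduction Rule 2.2 to parents that end up with degree two, and the resulting additions to the set of parents, are omitted.

```python
from collections import defaultdict


def prune_parent_edges(adj, X, parents, child_of, limit):
    """Keep at most `limit` counted paths per (feedback vertex, neighborhood) pair.

    Returns the list of deleted edges (parent, feedback vertex).
    """
    position = {x: j for j, x in enumerate(X)}
    table = defaultdict(int)  # (feedback vertex, neighborhood code) -> count
    deleted = []
    for v in parents:
        u = child_of[v]
        # Integer code of the child's neighborhood inside X.
        code = 0
        for nb in adj[u]:
            if nb in position:
                code |= 1 << position[nb]
        # Snapshot the X-neighbors of v because edges are removed in the loop.
        for x in [nb for nb in adj[v] if nb in position]:
            if table[(x, code)] < limit:
                table[(x, code)] += 1  # count one more augmenting path
            else:
                adj[v].discard(x)      # enough paths kept: drop the edge
                adj[x].discard(v)
                deleted.append((v, x))
    return deleted
```

Each edge between a processed parent and a feedback vertex costs one table lookup, which matches the constant work per edge claimed in the analysis once the neighborhood codes are available.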

#### 2.2.4 Item 6

In this subsection, we provide the final step of our kernelization algorithm. Recall that in the previous steps we have upper-bounded the number of bottommost leaves in  by  and computed a maximum matching  for  such that at most  vertices are free wrt.  and all free vertices are leaves in . Using this, we next show how to reduce  to a graph of size . To this end we need some further notation. A leaf in  that is not bottommost is called a pendant. We define  to be the pendant-free tree (forest) of , that is, the tree (forest) obtained from  by removing all pendants. The next observation shows that  is not much larger than . This allows us to restrict ourselves in the following to giving an upper bound on the size of .

###### Observation 2.9.

Let  be as described above with vertex set  and let  be the pendant-free tree (forest) of  with vertex set . Then, .

###### Proof.

Observe that  is the union of all pendants in  and . Thus, it suffices to show that  contains at most  pendants. To this end, recall that we have a maximum matching for  with at most  free leaves. Thus, there are at most  leaves in  that have a sibling which is also a leaf since from two leaves with the same parent at most one can be matched. Hence, all but at most  pendants in  have pairwise different parent vertices. Since all these parent vertices are in , it follows that the number of pendants in  is . ∎

We use the following observation to provide an upper bound on the number of leaves of .

###### Observation 2.10.

Let  be a forest, let  be the pendant-free forest of , and let  be the set of all bottommost leaves in . Then, the set of leaves in  is exactly .

###### Proof.

First observe that each bottommost leaf of  is a leaf of  since we only remove vertices to obtain  from . Thus, it remains to show that each leaf  in  is a bottommost leaf in .

We distinguish two cases of whether or not  is a leaf in : First, assume that  is not a leaf in . Thus, all of its child vertices have been removed. Since we only remove pendants to obtain  from  and since each pendant is a leaf, it follows that  is in  the parent of one or more leaves . Thus, by definition, all these leaves  are bottommost leaves, a contradiction to the fact that they were deleted when creating .

Second, assume that  is a leaf in . If  is a bottommost leaf, then we are done. Thus, assume that  is not a bottommost leaf and therefore a pendant. However, since we remove all pendants to obtain  from , it follows that  is not contained in , a contradiction. ∎
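The statement of Observation 2.10 can be checked on small examples with the following sketch. It assumes a rooted forest given as a dictionary mapping every vertex to the list of its children, and it takes a bottommost leaf to be a leaf all of whose siblings are leaves, in line with the remark in Section 2.2.3; the function name `pendant_free_forest` is hypothetical.

```python
def pendant_free_forest(children):
    """Split the leaves into bottommost leaves and pendants, then drop the pendants.

    `children` maps every vertex of a rooted forest to the list of its children.
    Returns (pruned children map, bottommost leaves, pendants).
    """
    leaves = {v for v, ch in children.items() if not ch}
    parent = {c: p for p, ch in children.items() for c in ch}
    bottommost = set()
    for leaf in leaves:
        p = parent.get(leaf)
        # A leaf is bottommost if it has no parent or all its siblings are leaves.
        if p is None or all(s in leaves for s in children[p]):
            bottommost.add(leaf)
    pendants = leaves - bottommost
    pruned = {
        v: [c for c in ch if c not in pendants]
        for v, ch in children.items()
        if v not in pendants
    }
    return pruned, bottommost, pendants


# Usage: r has children a and b; a has the leaf children c and d; b is a leaf.
forest = {'r': ['a', 'b'], 'a': ['c', 'd'], 'b': [], 'c': [], 'd': []}
pruned, bottommost, pendants = pendant_free_forest(forest)
leaves_after = {v for v, ch in pruned.items() if not ch}
```

In this example, `b` is the only pendant, and the leaves of the pruned forest coincide with the bottommost leaves `c` and `d`.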

From Observation 2.10 it follows that the set  of bottommost leaves in  is exactly the set of leaves in . In the previous step we reduced the graph such that . Thus, has at most  vertices of degree one and, since  is a tree (a forest), also has at most  vertices of degree at least three. Let  be the vertices of degree two in  and let  be the remaining vertices in . From the above it follows that . Hence, it remains to bound the size of . To this end, we will upper-bound the degree of each vertex in  by  and then use Reduction Rules 2.3 and 2.1. We will check for each edge  with  and