A generalization of Hopcroft-Karp algorithm for semimatchings and covers in bipartite graphs
Abstract
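An $(f,g)$-semimatching in a bipartite graph $G=(U \cup V, E)$ is a set of edges $M \subseteq E$ such that each vertex $u \in U$ is incident with at most $f(u)$ edges of $M$, and each vertex $v \in V$ is incident with at most $g(v)$ edges of $M$. In this paper we give an algorithm that, for a graph with $n$ vertices and $m$ edges, $n \le m$, constructs a maximum $(f,g)$-semimatching in running time $O\bigl(m \cdot \min\bigl(\sqrt{\sum_{u \in U} f(u)}, \sqrt{\sum_{v \in V} g(v)}\bigr)\bigr)$. Using the reduction of [5], our result on the maximum $(f,g)$-semimatching problem directly implies an algorithm for the optimal semimatching problem with running time $O(\sqrt{n}\, m \log n)$.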
1 Introduction
We consider finite undirected graphs without loops and multiple edges. In general we use standard concepts and notation of graph theory. In particular, $\deg_G(v)$ denotes the degree of a vertex $v$ in $G$. If $M \subseteq E(G)$, then $\deg_M(v)$ denotes the number of edges of $M$ incident with $v$. If $f$ is an integer-valued function defined for all vertices of $G$ and $X \subseteq V(G)$, then $f(X)$ stands for the sum $\sum_{x \in X} f(x)$.
Let $G=(U \cup V, E)$ be a bipartite graph with $n$ vertices and $m$ edges (throughout the paper we consider only the nontrivial case with no isolated vertices, i.e. $n = O(m)$). A semimatching of $G$ is a set of edges $M \subseteq E$ such that each vertex of $U$ is incident with exactly one edge of $M$.
The semimatching is a natural generalization of the classical matching in bipartite graphs. Although the name semimatching was introduced only recently in [7], semimatchings appear in many problems and were studied as early as the 1970s [9], with applications in wireless sensor networks [1, 13, 14, 15, 17] and a wide area of scheduling problems [3, 6, 10, 11, 18]. For the weighted case of the problem we refer to [4, 6, 12, 19].
The problem of finding an optimal semimatching (see [7]) is motivated by the following offline load balancing scenario: We are given a set of tasks and a set of machines, each of which can process a subset of the tasks. Each task requires one unit of processing time and must be assigned to some machine that can process it. The tasks have to be assigned in a manner that minimizes a given optimization objective. One natural goal is to process all tasks with the minimum total completion time. Another goal is to minimize the average completion time, or total flow time, which is the sum of time units necessary for completion of all jobs (including the units while a job is waiting in the queue).
Let $M$ be a semimatching. The cost of $M$, denoted by $cost(M)$, is defined as follows:
$$cost(M) = \sum_{v \in V} \frac{\deg_M(v)\,(\deg_M(v)+1)}{2}.$$
A semimatching is optimal if its cost is the smallest among the costs of all admissible semimatchings. The problem of computing an optimal semimatching was first studied by Horn [9] and Bruno et al. [3], where an algorithm was presented. The problem has received considerable attention in the past few years. Harvey et al. [7] showed that by minimizing the cost of a semimatching one simultaneously minimizes the maximum number of tasks assigned to a machine, the flow time and the variance of loads. The same authors also provided a characterization of an optimal assignment based on cost-reducing paths and an algorithm for finding an optimal semimatching in time . It constructs an optimal semimatching step by step, starting with an empty semimatching, and in each iteration finds an augmenting path from a free vertex to a vertex in $V$ with the smallest possible degree.
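As a small illustration of the cost, the following sketch assumes the total flow time reading described above: a machine that receives $d$ unit tasks contributes $1 + 2 + \dots + d = d(d+1)/2$. The function name and the task and machine labels are hypothetical and serve only the example.

```python
from collections import Counter

def semimatching_cost(matching):
    """Total flow time of an assignment given as (task, machine) pairs.

    Assumption: a machine with load d contributes 1 + 2 + ... + d.
    """
    load = Counter(machine for _, machine in matching)
    return sum(d * (d + 1) // 2 for d in load.values())

# Two assignments of four tasks to two machines (hypothetical labels).
balanced = [("t1", "m1"), ("t2", "m1"), ("t3", "m2"), ("t4", "m2")]
skewed = [("t1", "m1"), ("t2", "m1"), ("t3", "m1"), ("t4", "m2")]

print(semimatching_cost(balanced))  # loads 2 and 2
print(semimatching_cost(skewed))    # loads 3 and 1
```

The balanced assignment has the smaller cost, which matches the intuition that minimizing the cost also evens out the loads.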
Semimatchings were generalized to quasimatchings by Bokal et al. [2]. They consider an integer-valued function $g$ defined on the vertex set $V$ and require that each vertex $v \in V$ is connected to at least $g(v)$ vertices of $U$.
An $(f,g)$-quasimatching in a bipartite graph $G=(U \cup V, E)$ is a set of edges $M \subseteq E$ such that each vertex $u \in U$ is incident with at most $f(u)$ edges of $M$, and each vertex $v \in V$ is incident with at least $g(v)$ edges of $M$. The authors provided a property of a lexicographically minimum quasimatching and showed that the lexicographically minimum quasimatching equals an optimal semimatching. Moreover, they also designed an algorithm to compute an optimal (lexicographically minimum) quasimatching in running time .
Similarly, [2] defines an $(f,g)$-semimatching of $G$, which is a set of edges $M \subseteq E$ such that every element $u$ of $U$ has at most $f(u)$ incident edges from $M$, and every element $v$ of $V$ has at most $g(v)$ incident edges from $M$. A maximum $(f,g)$-semimatching is one with as many edges as possible.
The complexity bound for computing an optimum semimatching was further improved by Fakcharoenphol et al. [4], who presented an $O(\sqrt{n}\, m \log n)$ algorithm for the optimal semimatching problem. The algorithm uses a reduction to the min-cost flow problem and exploits the structure of the graphs and cost functions to eliminate many negative cycles in a single iteration.
Recently, a reduction from the optimum semimatching problem to the maximum semimatching problem was presented in [5], which shows that an optimal semimatching of $G$ can be computed in time where , , and is the time complexity of an algorithm for computing a maximum semimatching with . By a result of [16], the algorithm designed in [5] yields a randomized algorithm for optimal semimatching with a running time of , where $\omega$ is the exponent of the best known matrix multiplication algorithm. Since , this algorithm broke through the barrier for computing an optimal semimatching in dense graphs [5].
In this paper we present an algorithm for finding a maximum $(f,g)$-semimatching in running time $O\bigl(m \cdot \min(\sqrt{f(U)}, \sqrt{g(V)})\bigr)$. For the problem of computing an $(f,g)$-quasimatching it gives an algorithm with running time $O(m \sqrt{g(V)})$. For the maximum $(1,g)$-semimatching we get a complexity upper bound of $O(\sqrt{n}\, m)$, which implies an $O(\sqrt{n}\, m \log n)$ bound for computing an optimal semimatching by the algorithm presented in [5].
2 Augmenting paths and semimatchings
In this section we introduce concepts that will be used throughout the remaining part of the paper.
Definition 1
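Let $f: U \to \mathbb{N}$ and $g: V \to \mathbb{N}$ be mappings. An $(f,g)$-semimatching in a bipartite graph $G=(U \cup V, E)$ is a set of edges $M \subseteq E$ such that $\deg_M(u) \le f(u)$ for each vertex $u \in U$, and $\deg_M(v) \le g(v)$ for each vertex $v \in V$.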
Definition 2
An $(f,g)$-semimatching $M$ of a graph $G$ is called maximum if $|M| \ge |M'|$ holds for each $(f,g)$-semimatching $M'$ of $G$. An $(f,g)$-semimatching is called perfect, if .
Note that a $(1,1)$-semimatching is a matching in a bipartite graph.
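The degree conditions of Definition 1 can be checked directly. The sketch below is illustrative only: it assumes edges are stored as `(u, v)` pairs with `u` from $U$ and `v` from $V$, and that `f` and `g` are dictionaries of capacities; the function name is hypothetical.

```python
from collections import Counter

def is_fg_semimatching(edges, matching, f, g):
    """Check that `matching` is a subset of `edges` respecting both capacity maps."""
    matching = set(matching)
    if not matching <= set(edges):
        return False
    deg_u = Counter(u for u, _ in matching)
    deg_v = Counter(v for _, v in matching)
    return (all(deg_u[u] <= f[u] for u in deg_u)
            and all(deg_v[v] <= g[v] for v in deg_v))

edges = [("a", "x"), ("b", "x"), ("b", "y")]
M = [("a", "x"), ("b", "x")]
f = {"a": 1, "b": 1}

print(is_fg_semimatching(edges, M, f, {"x": 2, "y": 1}))  # x may take two edges
print(is_fg_semimatching(edges, M, f, {"x": 1, "y": 1}))  # the (1,1) case is an ordinary matching
```

With $g(x) = 2$ the edge set is admissible, while with all capacities equal to one it is rejected, since two edges meet in `x`.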
Definition 3
Let $G=(U \cup V, E)$ be a bipartite graph and $M \subseteq E$. A path $P$ is called an $M$-alternating path if each internal vertex of $P$ is incident with exactly one edge of $P$ that belongs to $M$.
Definition 4
Let $G=(U \cup V, E)$ be a bipartite graph and $M \subseteq E$. An $M$-augmenting path $P$ is an $M$-alternating path with the first and last vertex of $P$ not incident with an edge of $P$ that belongs to $M$.
Definition 5
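Let $G=(U \cup V, E)$ be a bipartite graph, $M \subseteq E$, $P$ be an $M$-alternating path and $E_P$ be the edge set of $P$. We define an operator $\oplus$ as follows:
$$M \oplus P = (M \setminus E_P) \cup (E_P \setminus M).$$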
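A minimal sketch of the operator from Definition 5, assuming it is the usual symmetric difference of the edge set with the edges of the path. Edges are written as `(u, v)` pairs with `u` from $U$; the function name and the example graph are hypothetical.

```python
def augment(matching, path_edges):
    """Flip the membership of every path edge (symmetric difference)."""
    return set(matching) ^ set(path_edges)

# Current edge set M and an augmenting path c - x - b - y,
# assuming capacities f = 1 on {a, b, c}, g(x) = 2 and g(y) = 1.
M = {("a", "x"), ("b", "x")}
P = [("c", "x"), ("b", "x"), ("b", "y")]

M2 = augment(M, P)
print(sorted(M2))
```

The degrees of the internal vertices `x` and `b` are unchanged, while the endvertices `c` and `y` each gain one edge, so the result has one more edge than `M`.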
The next theorem provides a characterisation of maximum semimatching.
Theorem 2.1
Let and be an semimatching of a graph , . Then there exists an augmenting path with endvertices , and such that .
Proof
We proceed by an induction on the size of . Evidently, the assertion of the theorem is true for the smallest cases. Now, we may assume that , otherwise the assertion follows from the induction hypothesis. Let us put
Let be the set of vertices of for which there exists an alternating path starting in a vertex of with an edge of . Here a path of length is considered to be an alternating path, therefore .
Let be the set of vertices of for which there exists an alternating path starting in a vertex of with an edge of .
Let us put and . For sets and we introduce parameters and .
From the definition of we get and the definition of yields (otherwise the existence of such an edge implies the existence of an alternating path starting at a vertex of by an edge of ). This is depicted in Figure 1.
Since , we have . Moreover and which gives
(1) 
Since and , we get the inequality
(2) 
Trivially, we have the following
(4) 
From the inequality (5) we can conclude that contains a vertex with . By the definition of , this implies the existence of an augmenting path with endvertex and an endvertex from .
Theorem 2.2
An $(f,g)$-semimatching $M$ of a graph $G$ is maximum if and only if there exists no $M$-augmenting path with endvertices $u \in U$, $v \in V$ such that $\deg_M(u) < f(u)$ and $\deg_M(v) < g(v)$.
Proof
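Suppose to the contrary that there is a maximum $(f,g)$-semimatching $M$ and an $M$-augmenting path $P$ with endvertices $u$ and $v$ such that $\deg_M(u) < f(u)$ and $\deg_M(v) < g(v)$. Then obviously $M \oplus P$ is an $(f,g)$-semimatching with $|M \oplus P| > |M|$, a contradiction.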
The opposite direction comes from Theorem 2.1.
The next theorem provides more information about the structure of augmenting paths.
Theorem 2.3
Let and be semimatchings of a bipartite graph such that . Then there exist edge-disjoint augmenting paths such that
Proof
We prove the theorem by induction on the size of the graph . The assertion obviously holds for the smallest possible cases. If , then and , is an instance of the theorem of smaller size and the claim follows from the induction hypothesis.
Suppose now . Using Theorem 2.1, there exists an augmenting path such that its edges alternately belong to and . Therefore and . Consider now the graph and edge sets , . From the induction hypothesis there exist edge-disjoint paths such that . Clearly, is edge-disjoint with and
Corollary 1
Let and be semimatchings of a bipartite graph such that . Then there exist augmenting paths such that and , for each .
Proof
It follows from Theorem 2.3 and the fact , that no two of those augmenting paths may overlap in a vertex .
Let be an semimatching of a bipartite graph . Denote by . We set to be the length of a shortest alternating path starting in any vertex of and ending in . If no such alternating path exists, we put .
Theorem 2.4
Let be an semimatching of a bipartite graph and be a shortest augmenting path. Then for each vertex .
Proof
Assume to the contrary that there exists at least one vertex such that . Let us choose such a vertex with the smallest possible value of . It means that for each vertex with the inequality is valid.
Clearly cannot be , because in such a case is a vertex of for which and that is why must be zero as well.
Thus, is at least . Let be the predecessor of in a shortest alternating path starting in a vertex of . Obviously . It also holds that (otherwise was not chosen correctly), which together with the previous equation gives . Together with the initial inequality for we obtain . This implies that the edge was changed, i.e. (otherwise the edge could be used to violate the inequality ). Let us now distinguish two cases:
Case 1. and . As is the predecessor of in an alternating path starting at , it implies that the edge and . Now let us consider the path . The path was the shortest alternating path starting at . Since and the path must visit the vertex before . However, in such a case, by the definition of an alternating path starting at , the edge going from to must be unmatched, a contradiction.
Case 2. and . As is a predecessor of in an alternating path starting at , it implies that , consequently . The path was the shortest alternating path starting at . Since and the path must first visit the vertex and then . However, in such a case, from the definition of an alternating path starting at , the edge going from to must be matched, a contradiction.
3 The algorithm for finding a maximum semimatching
In this section we describe an algorithm for solving the following problem:
Problem 1
Given a bipartite graph $G=(U \cup V, E)$ and two mappings $f: U \to \mathbb{N}$ and $g: V \to \mathbb{N}$. Find a maximum $(f,g)$-semimatching of $G$.
In order to simplify the notation, for an semimatching of a bipartite graph and for each vertex of we introduce the parameter as follows:
We denote by augmenting path an augmenting path with endvertices , , such that and .
Our algorithm applies the same scheme as the well-known algorithm of Hopcroft and Karp [8]. We start with an empty semimatching $M$ and in each iteration we extend $M$ by several augmenting paths. The length of a shortest augmenting path increases after each iteration, and each iteration of the algorithm consumes $O(m)$ time.
One iteration of the algorithm finds a smallest number for which an augmenting path of length exists. Next, the algorithm extends by several augmenting paths in a single iteration, while there is an augmenting path of length . More precisely:

1. Let $M = \emptyset$.

2. In terms of the Breadth-First Search algorithm, classify the vertices of into layers such that . This can be implemented as follows:
   For each do

3. Let be a smallest odd number such that there exists . If no such exists, then by Theorem 2.2 there is no augmenting path; the algorithm stops and is a maximum semimatching. Otherwise it continues with step 4.

4. For each vertex while do:

   (a) Find an arbitrary augmenting path of length starting in such that .

   (b) If such a path exists, set and recalculate the values of along the path .

Theorem 3.1
The length of the shortest augmenting path increases after each iteration of the algorithm.
Proof
An iteration which processes an semimatching stops when there is no augmenting path consisting of vertices of . It remains to prove, that after such an iteration there is no augmenting path of length in the graph (a path of length less than cannot appear due to Theorem 2.4 and the fact that all vertices in layers have zero capacity).
Suppose to the contrary, that after the iteration there is an augmenting path of order in . Since all the vertices of are located in , . Since is an alternating path starting by a vertex of , then , for each . According to Theorem 2.4, the value of cannot decrease after iteration, i.e. for each . Hence, each vertex of appears in and such an augmenting path was not processed during the iteration of the algorithm, which is a contradiction.
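The layered scheme of this section can be sketched in code as follows. This is an illustration under stated assumptions, not the authors' implementation: it assumes `adj` maps every vertex of $U$ to a list of its neighbours in $V$, that $U$ and $V$ use disjoint labels, that every neighbour is a key of `g`, and that `f` and `g` are dictionaries of capacities. All names are hypothetical. Each phase runs one breadth-first search for the layers and then a depth-first search with per-vertex edge pointers, so that every edge is scanned a bounded number of times per phase; the recursion depth grows with the path length, so very long paths may require an iterative rewrite.

```python
from collections import deque

def max_fg_semimatching(adj, f, g):
    """Return a maximum (f, g)-semimatching as a set of (u, v) pairs."""
    adj = {u: list(vs) for u, vs in adj.items()}
    radj = {v: [] for v in g}
    for u, vs in adj.items():
        for v in vs:
            radj[v].append(u)
    matched = set()                                  # current edge set M
    deg = {x: 0 for x in list(adj) + list(radj)}     # degrees in M

    def layers():
        # BFS from every u with spare capacity: leave U along edges outside M
        # and leave V along edges of M, so every BFS path is alternating.
        dist = {u: 0 for u in adj if deg[u] < f[u]}
        queue = deque(dist)
        while queue:
            x = queue.popleft()
            if x in adj:
                nxt = [v for v in adj[x] if (x, v) not in matched]
            else:
                nxt = [u for u in radj[x] if (u, x) in matched]
            for y in nxt:
                if y not in dist:
                    dist[y] = dist[x] + 1
                    queue.append(y)
        return dist

    while True:
        dist = layers()
        ends = [dist[v] for v in radj if v in dist and deg[v] < g[v]]
        if not ends:
            return matched                 # no augmenting path is left
        k = min(ends)                      # length of a shortest augmenting path
        ptr = {x: 0 for x in dist}         # next untried edge of each vertex

        def extend(u):
            # Try to complete an alternating path from u to a free v in layer k.
            while ptr[u] < len(adj[u]):
                v = adj[u][ptr[u]]
                if (u, v) not in matched and dist.get(v) == dist[u] + 1:
                    if dist[v] == k:
                        if deg[v] < g[v]:
                            matched.add((u, v))
                            deg[v] += 1
                            return True
                    else:
                        while ptr[v] < len(radj[v]):
                            w = radj[v][ptr[v]]
                            if ((w, v) in matched and dist.get(w) == dist[v] + 1
                                    and extend(w)):
                                matched.remove((w, v))   # w trades (w, v) for a new edge
                                matched.add((u, v))
                                return True
                            ptr[v] += 1
                ptr[u] += 1
            return False

        for u in adj:
            while deg[u] < f[u] and extend(u):
                deg[u] += 1

adj = {"a": ["x"], "b": ["x", "y"], "c": ["x"]}
f = {"a": 1, "b": 1, "c": 1}
M = max_fg_semimatching(adj, f, {"x": 2, "y": 1})
print(len(M))
```

In the example the first phase uses paths of length one, and the second phase reroutes `b` from `x` to `y` along a path of length three so that `c` can also be served.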
3.1 The running time
Let $n$ be the number of vertices in a given graph and $m$ be the number of its edges; assume that $n = O(m)$, since isolated vertices can be erased from the graph in linear time.
The algorithm starts with an empty semimatching $M$ and then iterates as long as at least one augmenting path is found. In the search loop, the algorithm classifies the vertices into layers and modifies $M$ by augmenting paths using vertices of . This step consumes $O(m)$ time, since each edge is manipulated at most once during one iteration. No further iteration is performed once no augmenting path was found in the current loop.
The key part of the complexity analysis is to bound the number of loops of the algorithm. Let be the size of a maximum semimatching . After performing iterations of the algorithm, according to Theorem 3.1, the shortest augmenting path consists of at least vertices. According to Theorem 2.3 there exist edge-disjoint augmenting paths that can simultaneously extend to size and those paths consist only of edges of . As each such path must be of length at least and is at most , these imply that . Since in each loop the algorithm finds at least one augmenting path, the algorithm surely stops after at most loops. Hence, the total number of performed loops is and the algorithm runs in time .
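Moreover, the size of a maximum $(f,g)$-semimatching is at most $f(U)$ and at most $g(V)$, so the algorithm computes a maximum $(f,g)$-semimatching in running time $O\bigl(m \cdot \min(\sqrt{f(U)}, \sqrt{g(V)})\bigr)$. For the case of a $(1,g)$-semimatching this gives the complexity upper bound $O(\sqrt{n}\, m)$.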
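The counting step behind the bound on the number of loops can be written out as a short derivation. This is a sketch under assumed notation, not the paper's own display: $M^*$ stands for a maximum semimatching, $s = |M^*|$, $M$ for the semimatching after $k$ iterations, and the constants are only indicative.

```latex
\begin{align*}
  &\text{each augmenting path has at least } k \text{ edges after } k \text{ iterations (Theorem 3.1)},\\
  &|M^*| - |M| \text{ edge-disjoint augmenting paths exist, all inside } M \cup M^* \text{ (Theorem 2.3)},\\
  &k\,\bigl(|M^*| - |M|\bigr) \;\le\; |M \cup M^*| \;\le\; 2s
     \quad\Longrightarrow\quad |M^*| - |M| \;\le\; \frac{2s}{k},\\
  &k = \lceil \sqrt{s}\, \rceil
     \quad\Longrightarrow\quad \text{at most } \lceil \sqrt{s}\, \rceil + 2\sqrt{s} = O(\sqrt{s}) \text{ loops in total}.
\end{align*}
```

Each loop after the first $\lceil \sqrt{s}\, \rceil$ adds at least one edge, which is why the number of remaining loops is bounded by the number of missing edges.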
To find an arbitrary $(f,g)$-quasimatching one can use the algorithm for the maximum $(f,g)$-semimatching problem, which computes a maximum $(f,g)$-semimatching $M$. Clearly, if $|M| < g(V)$ then no $(f,g)$-quasimatching exists; otherwise $M$ is an $(f,g)$-quasimatching. Moreover, for an $(f,g)$-quasimatching we may assume $g(V) \le f(U)$ (otherwise no quasimatching exists), so we get an algorithm with running time $O(m \sqrt{g(V)})$.
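The final test in this reduction is a plain saturation check. The sketch below assumes its input is already a maximum semimatching with respect to the same capacities, given as `(u, v)` pairs, and that `g` is a dictionary on $V$; the function name is hypothetical.

```python
from collections import Counter

def as_quasimatching(max_semimatching, g):
    """Return the edge set if every v in V receives exactly g(v) edges, else None.

    Assumption: the input is a maximum semimatching, so its degrees on V
    never exceed g and its size equals g(V) exactly when all v are saturated.
    """
    deg = Counter(v for _, v in max_semimatching)
    if all(deg[v] == g[v] for v in g):
        return set(max_semimatching)
    return None

g = {"x": 2, "y": 1}
print(as_quasimatching([("a", "x"), ("c", "x"), ("b", "y")], g))
print(as_quasimatching([("a", "x"), ("b", "y")], g))
```

The second call returns `None` because `x` receives only one of its two required edges; this certifies that no quasimatching exists only under the stated assumption that the input is maximum.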
Footnotes
 email: jan.katrenic@upjs.sk, gabriel.semanisin@upjs.sk
References
[1] P. Biró and E. McDermid, Matching with sizes (or scheduling with processing set restrictions), Electronic Notes in Discrete Mathematics 36 (2010) 335–342.
[2] D. Bokal, B. Brešar, and J. Jerebic, A generalization of Hungarian method and Hall’s theorem with applications in wireless sensor networks, Discrete Appl. Math. 160 (2012) 460–470.
[3] J. Bruno, E. G. Coffman, Jr., and R. Sethi, Scheduling independent tasks to reduce mean finishing time, Commun. ACM 17 (1974) 382–387.
[4] J. Fakcharoenphol, B. Laekhanukit, and D. Nanongkai, Faster algorithms for semimatching problems, in: S. Abramsky, C. Gavoille, C. Kirchner, F. M. auf der Heide, P. G. Spirakis (Eds.), ICALP 2010, Lecture Notes in Computer Science 6198, Springer, 2010, pp. 176–187.
[5] F. Galčík, J. Katrenič, and G. Semanišin, On computing an optimal semimatching, in: P. Kolman and J. Kratochvíl (Eds.), WG 2011, Lecture Notes in Computer Science 6986, Springer, 2011, pp. 250–261.
[6] T. Gu, L. Chang, and Z. Xu, A novel symbolic algorithm for maximum weighted matching in bipartite graphs, IJCNS 4 (2011) 111–121.
[7] N. J. A. Harvey, R. E. Ladner, L. Lovász, and T. Tamir, Semimatchings for bipartite graphs and load balancing, J. Algorithms 59 (2006) 53–78.
[8] J. E. Hopcroft and R. M. Karp, An $n^{5/2}$ algorithm for maximum matchings in bipartite graphs, SIAM J. Comput. 2 (1973) 225–231.
[9] W. A. Horn, Minimizing average flow time with parallel machines, Operations Research 21 (1973) 846–847.
[10] S. Kravchenko and F. Werner, Parallel machine problems with equal processing times: a survey, Journal of Scheduling 14 (2011) 435–444.
[11] K. Lee, J. Leung, and M. Pinedo, Scheduling jobs with equal processing times subject to machine eligibility constraints, Journal of Scheduling 14 (2011) 27–38.
[12] K. Lee, J. Y.-T. Leung, and M. Pinedo, A note on “an approximation algorithm for the load-balanced semimatching problem in weighted bipartite graphs”, Inf. Process. Lett. 109 (2009) 608–610.
[13] D. Luo, X. Zhu, X. Wu, and G. Chen, Maximizing lifetime for the shortest path aggregation tree in wireless sensor networks, in: K. Gopalan and A. D. Striegel (Eds.), INFOCOM 2011, IEEE, 2011, pp. 1566–1574.
[14] R. Machado and S. Tekinay, A survey of game-theoretic approaches in wireless sensor networks, Computer Networks 52 (2008) 3047–3061.
[15] B. Malhotra, I. Nikolaidis, and M. A. Nascimento, Aggregation convergecast scheduling in wireless sensor networks, Wirel. Netw. 17 (2011) 319–335.
[16] M. Mucha and P. Sankowski, Maximum matchings via Gaussian elimination, in: E. Upfal (Ed.), FOCS 2004, IEEE Computer Society, 2004, pp. 248–255.
[17] N. Sadagopan, M. Singh, and B. Krishnamachari, Decentralized utility-based sensor network design, Mobile Networks and Applications 11 (2006) 341–350.
[18] L.H. Su, Scheduling on identical parallel machines to minimize total completion time with deadline and machine eligibility constraints, The International Journal of Advanced Manufacturing Technology 40 (2009) 572–581.
[19] H. Yuta, O. Hirotaka, S. Kunihiko, and Y. Masafumi, Optimal balanced semimatchings for weighted bipartite graphs, IPSJ Digital Courier 3 (2007) 693–702.