# A generalization of Hopcroft-Karp algorithm for semi-matchings and covers in bipartite graphs

## Abstract

An $(f,g)$-semi-matching in a bipartite graph $G=(U\cup V,E)$ is a set of edges $M\subseteq E$ such that each vertex $u\in U$ is incident with at most $f(u)$ edges of $M$, and each vertex $v\in V$ is incident with at most $g(v)$ edges of $M$. In this paper we give an algorithm that, for a graph with $n$ vertices and $m$ edges, $n\le m$, constructs a maximum $(f,g)$-semi-matching in running time $O(m\cdot\min(\sqrt{\sum_{u\in U}f(u)},\sqrt{\sum_{v\in V}g(v)}))$. Using the reduction of [5], our result on the maximum $(f,g)$-semi-matching problem directly implies an algorithm for the optimal semi-matching problem with running time $O(\sqrt{n}\,m\log n)$.

## 1 Introduction

We consider finite non-oriented graphs without loops and multiple edges. In general we use standard concepts and notation of graph theory. In particular, $\deg_G(v)$ denotes the degree of a vertex $v$ in $G$. If $M\subseteq E(G)$ then $\deg_M(v)$ denotes the number of edges of $M$ incident with $v$. If $f$ is an integer valued function defined for all vertices of $G$ and $X\subseteq V(G)$ then $f(X)$ stands for the sum $\sum_{v\in X}f(v)$.

Let $G=(U\cup V,E)$ be a bipartite graph with $n$ vertices and $m$ edges (throughout the paper we consider only the non-trivial case with no isolated vertices, i.e. ). A semi-matching $M$ of $G$ is a set of edges $M\subseteq E$ such that each vertex of $U$ is incident with exactly one edge of $M$.

Semi-matching is a natural generalization of the classical matching in bipartite graphs. Although the name semi-matching was introduced recently in [7], semi-matchings appear in many problems and were studied as early as the 1970s [9], with applications in wireless sensor networks [1, 13, 14, 15, 17] and a wide area of scheduling problems [3, 6, 10, 11, 18]. For the weighted case of the problem we refer to [4, 6, 12, 19].

The problem of finding an optimal semi-matching (see [7]) is motivated by the following off-line load balancing scenario: we are given a set of tasks and a set of machines, each of which can process a subset of the tasks. Each task requires one unit of processing time and must be assigned to some machine that can process it. The tasks have to be assigned in a manner that minimizes a given optimization objective. One natural goal is to process all tasks with the minimum total completion time. Another goal is to minimize the average completion time, or total flow time, which is the sum of time units necessary for completion of all jobs (including the units while a job is waiting in the queue).

Let $M$ be a semi-matching. The cost of $M$, denoted by $cost(M)$, is defined as follows:

$$cost(M)=\sum_{v\in V}\frac{\deg_M(v)\cdot(\deg_M(v)+1)}{2}.$$
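As a small illustration of the cost formula, the following Python sketch computes the cost of a semi-matching given as a list of (task, machine) pairs. The function name `semi_matching_cost` and the sample instance are hypothetical and only serve the example.

```python
from collections import Counter

def semi_matching_cost(matching):
    """Return cost(M) for a semi-matching given as (task, machine) pairs."""
    # deg_M(v): number of matched edges incident with machine v.
    load = Counter(machine for _task, machine in matching)
    # A machine with d unit tasks contributes 1 + 2 + ... + d = d(d+1)/2.
    return sum(d * (d + 1) // 2 for d in load.values())

# Hypothetical instance: three tasks, two of them on machine "x".
M = [("t1", "x"), ("t2", "x"), ("t3", "y")]
print(semi_matching_cost(M))  # 3 + 1 = 4
```

The term $d(d+1)/2$ is the total flow time of $d$ unit tasks processed one after another on the same machine, which is why the cost corresponds to the flow-time objective mentioned above.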

A semi-matching is optimal if its $cost$ is the smallest among the costs of all admissible semi-matchings. The problem of computing an optimal semi-matching was first studied by Horn [9] and Bruno et al. [3], where an algorithm was presented. The problem has received considerable attention in the past few years. Harvey et al. [7] showed that by minimizing the $cost$ of a semi-matching one simultaneously minimizes the maximum number of tasks assigned to a machine, the flow time and the variance of loads. The same authors also provided a characterization of an optimal assignment based on cost-reducing paths and an algorithm for finding an optimal semi-matching in time . It constructs an optimal semi-matching step by step, starting with an empty semi-matching, and in each iteration finds an augmenting path from a free $U$-vertex to a vertex in $V$ with the smallest possible degree.

The semi-matchings were generalized to the quasi-matchings by Bokal et al. [2]. They consider an integer valued function defined on the vertex set and require that each vertex is connected to at least vertices of .

An $(f,g)$-quasi-matching in a bipartite graph $G=(U\cup V,E)$ is a set of edges $M\subseteq E$ such that each vertex $u\in U$ is incident with at most $f(u)$ edges of $M$, and each vertex $v\in V$ is incident with at least $g(v)$ edges of $M$. The authors provided a property of lexicographically minimum -quasi-matching and showed that the lexicographically minimum -quasi-matching equals an optimal semi-matching. Moreover, they also designed an algorithm to compute an optimal (lexicographically minimum) -quasi-matching in running time .

Similarly, an $(f,g)$-semi-matching of $G$ was defined in [2] as a set of edges $M\subseteq E$ such that every element $u\in U$ has at most $f(u)$ incident edges from $M$, and every element $v\in V$ has at most $g(v)$ incident edges from $M$. A maximum $(f,g)$-semi-matching is one with as many edges as possible.

The complexity bound for computing an optimum semi-matching was further improved by Fakcharoenphol et al. [4], who presented an $O(\sqrt{n}\,m\log n)$ algorithm for the optimal semi-matching problem. The algorithm uses a reduction to the min-cost flow problem and exploits the structure of the graphs and cost functions to eliminate many negative cycles in a single iteration.

Recently, a reduction from the optimum semi-matching problem to the maximum -semi-matching problem was presented in [5]; it shows that an optimal semi-matching of $G$ can be computed in time where , , and is the time complexity of an algorithm for computing a maximum -semi-matching with . By a result of [16], the algorithm designed in [5] yields a randomized algorithm for optimal semi-matching with a running time of , where $\omega$ is the exponent of the best known matrix multiplication algorithm. Since , this algorithm broke through the barrier for computing an optimal semi-matching in dense graphs [5].

In this paper we present an algorithm for finding a maximum $(f,g)$-semi-matching in running time . For the problem of computing an $(f,g)$-quasi-matching it gives an algorithm with running time . For the maximum -semi-matching we get a complexity upper bound , which implies a bound for computing an optimal semi-matching by the algorithm presented in [5].

## 2 Augmenting paths and (f,g)-semi-matchings

In this section we introduce concepts that will be used throughout the remaining part of the paper.

###### Definition 1

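Let $f:U\to\mathbb{N}$ and $g:V\to\mathbb{N}$ be mappings. An $(f,g)$-semi-matching in a bipartite graph $G=(U\cup V,E)$ is a set of edges $M\subseteq E$ such that $\deg_M(u)\le f(u)$ for each vertex $u\in U$, and $\deg_M(v)\le g(v)$ for each vertex $v\in V$.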

###### Definition 2

An $(f,g)$-semi-matching $M$ of a graph $G$ is called maximum if $|M'|\le|M|$ holds for each $(f,g)$-semi-matching $M'$ of $G$. An $(f,g)$-semi-matching is called perfect, if .

Note that a $(1,1)$-semi-matching is a matching in a bipartite graph.
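The degree conditions of Definition 1 (at most $f(u)$ edges of $M$ at each $u\in U$, at most $g(v)$ edges of $M$ at each $v\in V$) can be checked mechanically. The following is a minimal Python sketch; the function name `is_fg_semi_matching` and the representation of edges as `(u, v)` pairs with disjoint labels for $U$ and $V$ are assumptions made for the example.

```python
from collections import Counter

def is_fg_semi_matching(edges, M, f, g):
    """Check deg_M(u) <= f(u) on U and deg_M(v) <= g(v) on V.

    edges, M: collections of (u, v) pairs with u in U and v in V.
    f, g: dicts giving the capacity of every vertex that occurs in M.
    """
    M = set(M)
    if not M <= set(edges):          # M must be a subset of E
        return False
    deg_u = Counter(u for u, _ in M)
    deg_v = Counter(v for _, v in M)
    return (all(d <= f[u] for u, d in deg_u.items())
            and all(d <= g[v] for v, d in deg_v.items()))

E = [("a", "x"), ("b", "x"), ("b", "y")]
ones = {"a": 1, "b": 1, "x": 1, "y": 1}
print(is_fg_semi_matching(E, [("a", "x"), ("b", "y")], ones, ones))  # True
print(is_fg_semi_matching(E, [("a", "x"), ("b", "x")], ones, ones))  # False: deg(x) = 2
```

With all capacities equal to one the check reduces to the usual matching condition, in line with the note above.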

###### Definition 3

Let $G=(U\cup V,E)$ be a bipartite graph and $H\subseteq E$. A path $P$ is called an $H$-alternating path, if each internal vertex of $P$ is incident with exactly one edge of .

###### Definition 4

Let $G=(U\cup V,E)$ be a bipartite graph and $H\subseteq E$. An $H$-augmenting path $P$ is an $H$-alternating path with the first and last vertex of $P$ not incident with an edge of .

###### Definition 5

Let $G=(U\cup V,E)$ be a bipartite graph, $H\subseteq E$, $P$ be an $H$-alternating path and $E(P)$ be the edge set of $P$. We define an operator $\oplus$ as follows:

$$H\oplus P=(H\cup E(P))\setminus(E(P)\cap H).$$
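The operator of Definition 5 is the symmetric difference of $H$ and the edge set of $P$. A minimal Python sketch, with the hypothetical helper `augment` and edges stored as `(u, v)` pairs:

```python
def augment(H, path_edges):
    """Return H ⊕ P, i.e. (H | E(P)) - (E(P) & H), the symmetric difference."""
    return set(H) ^ set(path_edges)

# Path u2 - v1 - u1 - v2; only its middle edge belongs to H.
H = {("u1", "v1")}
P = [("u2", "v1"), ("u1", "v1"), ("u1", "v2")]
print(sorted(augment(H, P)))  # [('u1', 'v2'), ('u2', 'v1')]
```

In this example the path has one more edge outside $H$ than inside it, so applying the operator increases the number of edges by one while the degrees of the internal vertices stay unchanged.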

The next theorem provides a characterisation of a maximum $(f,g)$-semi-matching.

###### Theorem 2.1

Let $M$ and $M'$ be $(f,g)$-semi-matchings of a graph $G$, $|M|<|M'|$. Then there exists an $M$-augmenting path $P$ with endvertices $u\in U$, $v\in V$ and such that .

###### Proof

We proceed by an induction on the size of . Evidently, the assertion of the theorem is true for the smallest cases. Now, we may assume that , otherwise the assertion follows from the induction hypothesis. Let us put

 A={v∈V:degM(v)

Let $V_A$ be the set of vertices of $V$ for which there exists an $M$-alternating path starting in a vertex of $A$ with an edge of . Here a path of length $0$ is considered to be an $M$-alternating path, therefore $A\subseteq V_A$.

Let $U_A$ be the set of vertices of $U$ for which there exists an $M$-alternating path starting in a vertex of $A$ with an edge of .

Let us put and . For sets and we introduce parameters and .

From the definition of we get and the definition of yields (otherwise the existence of such an edge implies an existence of an -alternating path starting at a vertex of by edge of ). This is depicted on Figure 1.

Since , we have . Moreover and which gives

 m(UA,VA)+m(UB,VA)+m(UB,VB)

Since and , we get the inequality

$$m(U_B,V_B)\ge m'(U_A,V_B)+m'(U_B,V_B). \tag{2}$$

By (1) and (2) we get

 m(UA,VA)+m(UB,VA)

Trivially, we have the following

$$m(U_B,V_A)\ge -m'(U_A,V_B). \tag{4}$$

Combining (3) and (4) we obtain

 m(UA,VA)

From the inequality (5) we can conclude that contains a vertex with . By the definition of , it implies an existence of an -augmenting path with endvertex and an endvertex from .

###### Theorem 2.2

An $(f,g)$-semi-matching $M$ of a graph $G$ is maximum if and only if there exists no $M$-augmenting path with endvertices $u\in U$, $v\in V$ and $\deg_M(u)<f(u)$, $\deg_M(v)<g(v)$.

###### Proof

Suppose to the contrary that there is a maximum -semi-matching and -augmenting path with endvertices and , . Then obviously is an -semi-matching with .

The opposite direction comes from Theorem 2.1.

###### Theorem 2.3

Let and be -semi-matchings of a bipartite graph such that . Then there exist edge-disjoint -augmenting paths such that

###### Proof

We prove the theorem by induction on the size of the graph . The assertion obviously holds for the smallest possible cases. If , then and , is an instance of theorem of smaller size and the claim follows from induction hypothesis.

Suppose now . Using Theorem 2.1, there exists an -augmenting path such that its edges alternatively belongs to and . Therefore and . Consider now the graph and edge sets , . From the induction hypothesis there exist edge disjoint paths such that . Clearly, is edge disjoint with and

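$$M' = (M'\cap E(P))\cup(M'\setminus E(P)) = ((M\oplus P)\cap E(P))\cup((M\setminus E(P))\oplus P_1\oplus\dots\oplus P_{k-1}) = M\oplus P_1\oplus\dots\oplus P_{k-1}\oplus P.$$

###### Corollary 1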

Let and be an -semi-matchings of a bipartite graph such that . Then there exist -augmenting paths such that and , for each .

It follows from Theorem 2.3 and the fact , that no two of those -augmenting paths may overlap in a vertex .

Let be an -semi-matching of a bipartite graph . Denote by . We set to be the length of a shortest -alternating path starting in any vertex of and ending in . If no such -alternating path exists, we put .

###### Theorem 2.4

Let be an -semi-matching of a bipartite graph and be a shortest -augmenting path. Then for each vertex .

###### Proof

Assume to the contrary that there exists at least one vertex such that . Let us choose such a vertex with the smallest possible value of . It means that for each vertex with the inequality is valid.

Clearly cannot be , because in such a case is a vertex of for which and that is why must be zero as well.

Thus, is at least . Let be the predecessor of in a shortest -alternating path starting in a vertex of . Obviously . It also holds that (otherwise was not chosen correctly), what together with the previous equation gives . Together with the initial inequality for we obtain . This implies that the edge was changed, i.e. (otherwise the edge could be used to violate the inequality ). Let us distinguish now two cases:

Case 1. and . As is the predecessor of in an -alternating path starting at , it implies that the edge and . Now let us consider the path . The path was the shortest -alternating path starting at . Since and the path must visit the vertex before . However, in such a case, by the definition of an alternating path starting at , the edge going from to must be unmatched, a contradiction.

Case 2. and . As is a predecessor of in an -alternating path started at , it implies that , consequently . The path was the shortest -alternating path started at . Since and the path must first visit the vertex and then . However, in such a case, from the definition of an alternating path starting at , the edge going from to must be matched, a contradiction.

## 3 The algorithm for finding a maximum (f,g)-semi-matching

In this section we describe an algorithm for solving the following problem:

###### Problem 1

Given a bipartite graph $G=(U\cup V,E)$ and two mappings $f:U\to\mathbb{N}$ and $g:V\to\mathbb{N}$, find a maximum $(f,g)$-semi-matching of $G$.

In order to simplify the notation, for an $(f,g)$-semi-matching $M$ of a bipartite graph $G$ and for each vertex $u$ of $G$ we introduce the parameter $c_M(u)$ as follows:

$$c_M(u)=\begin{cases} f(u)-\deg_M(u) & \text{if } u\in U,\\ g(u)-\deg_M(u) & \text{if } u\in V. \end{cases}$$

We denote by -augmenting path an $M$-augmenting path with endvertices $u\in U$, $v\in V$, such that $c_M(u)>0$ and $c_M(v)>0$.

Our algorithm applies the same scheme as the well-known algorithm of Hopcroft and Karp [8]. We start with an empty $(f,g)$-semi-matching $M$ and in each iteration we extend $M$ by several augmenting paths. The length of a shortest -augmenting path increases after each iteration and each iteration of the algorithm consumes $O(m)$ time.

One iteration of the algorithm finds a smallest number for which an -augmenting path of length exists. Next, the algorithm extends by several augmenting paths in a single iteration, while there is an augmenting path of length . More precisely:

1. Let

2. In terms of Breadth-First Search algorithm, classify vertices of into layers such that . This can be implemented as follows:
For each do

3. Let be a smallest odd number such that there exists . If no such exists, by Theorem 2.2 there is no -augmenting path. The algorithm stops and is a maximum -semi-matching, otherwise continues by step 4.

4. For each vertex while do:

• Find arbitrary -augmenting path of length starting in such that .

• If such a path exists, set and recalculate values of along the path .
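The iteration scheme in steps 1 to 4 can be sketched in Python as follows. This is a minimal sketch under stated assumptions and not the authors' implementation: the name `max_fg_semi_matching`, the dictionaries `cap`, `level` and `nxt`, and the use of a depth-first search with per-vertex edge pointers (as in blocking-flow algorithms) are choices made for the example. It assumes a simple graph, disjoint labels for $U$ and $V$, and capacities given as dictionaries `f` and `g`.

```python
from collections import deque

def max_fg_semi_matching(U, V, edges, f, g):
    """Return a maximum (f,g)-semi-matching as a set of (u, v) pairs."""
    in_U = set(U)
    adj = {x: [] for x in list(U) + list(V)}
    for u, v in edges:
        adj[u].append(v)
        adj[v].append(u)
    M = set()
    cap = {u: f[u] for u in U}              # residual capacity c_M
    cap.update({v: g[v] for v in V})

    def usable(x, y):
        # Leave U along unmatched edges, leave V along matched edges.
        return (x, y) not in M if x in in_U else (y, x) in M

    while True:
        # Steps 1-2: BFS layers, layer 0 = U-vertices with free capacity.
        level = {u: 0 for u in U if cap[u] > 0}
        queue = deque(level)
        while queue:
            x = queue.popleft()
            for y in adj[x]:
                if y not in level and usable(x, y):
                    level[y] = level[x] + 1
                    queue.append(y)
        # Step 3: nearest layer containing a V-vertex with free capacity.
        free = [level[v] for v in V if v in level and cap[v] > 0]
        if not free:
            return M                        # no augmenting path is left
        k = min(free)
        nxt = {x: 0 for x in adj}           # per-vertex edge pointer

        def dfs(x):
            if x not in in_U and cap[x] > 0:
                cap[x] -= 1                 # the path ends in this V-vertex
                return True
            while nxt[x] < len(adj[x]):
                y = adj[x][nxt[x]]
                nxt[x] += 1                 # each edge is tried once per phase
                if (level.get(y) == level[x] + 1 and level[y] <= k
                        and usable(x, y) and dfs(y)):
                    if x in in_U:
                        M.add((x, y))       # unmatched edge becomes matched
                    else:
                        M.discard((y, x))   # matched edge becomes unmatched
                    return True
            return False

        # Step 4: augment from every layer-0 vertex while capacity remains.
        for u in U:
            while level.get(u) == 0 and cap[u] > 0 and dfs(u):
                cap[u] -= 1

E = [("a", "x"), ("a", "y"), ("b", "x")]
one = {"a": 1, "b": 1, "x": 1, "y": 1}
print(sorted(max_fg_semi_matching(["a", "b"], ["x", "y"], E, one, one)))
# [('a', 'y'), ('b', 'x')]
```

In the usage example the first phase matches `a` with `x`, and the second phase finds the augmenting path `b, x, a, y` of length 3. The recursion depth of `dfs` is bounded by the path length plus one, so for large inputs an iterative search would be the safer choice.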

###### Theorem 3.1

The length of the shortest augmenting path increases after each iteration of the algorithm.

###### Proof

An iteration which processes an -semi-matching stops when there is no -augmenting path consisting of vertices of . It remains to prove, that after such an iteration there is no augmenting path of length in the graph (a path of length less than cannot appear due to Theorem 2.4 and the fact that all vertices in layers have zero capacity).

Suppose to the contrary, that after the iteration there is an -augmenting path of order in . Since all the vertices of are located in , . Since is an alternating path starting by a vertex of , then , for each . According to Theorem 2.4, the value of cannot decrease after iteration, i.e.  for each . Hence, each vertex of appears in and such an augmenting path was not processed during the iteration of the algorithm, which is a contradiction.

### 3.1 The running time

Let $n$ be the number of vertices in a given graph and $m$ be the number of its edges; assume that , since isolated vertices can be erased from the graph in linear time.

The algorithm starts with an empty $(f,g)$-semi-matching $M$ and then iterates as long as at least one augmenting path is found. In each iteration, the algorithm classifies the vertices into layers and modifies $M$ by augmenting paths using vertices of . This step consumes $O(m)$ time, since each edge is manipulated at most once during one iteration. No further iteration is performed once no augmenting path is found in the current loop.

The key part of the complexity analysis is to enumerate the number of loops of the algorithm. Let be the size of a maximum -semi-matching . After performing iterations of the algorithm, according to Theorem 3.1, the shortest -augmenting path consists of at least vertices. According to Theorem 2.3 there exist edge disjoint -augmenting paths that can simultaneously extend to size and those paths consist only of edges of . As each such a path must be of length at least and is at most , these imply that . Since in each loop the algorithm finds at least one augmenting path, the algorithm surely stops after at most loops. Hence, the total number of performed loops is and the algorithm runs in time .
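The counting step of this argument can be spelled out as follows. The symbols are assumptions introduced for this illustration: $M^*$ denotes a maximum $(f,g)$-semi-matching, $s=|M^*|$, and $M_k$ denotes the semi-matching after $k$ iterations. The sketch assumes that the edge-disjoint augmenting paths of Theorem 2.3 use only edges of $M_k\cup M^*$ and that each of them has length at least $k$.

$$
\bigl(|M^*|-|M_k|\bigr)\cdot k \;\le\; |M_k\cup M^*| \;\le\; 2s
\quad\Longrightarrow\quad
|M^*|-|M_k| \;\le\; \frac{2s}{k}.
$$

Choosing $k=\lceil\sqrt{s}\,\rceil$ gives $|M^*|-|M_k|\le 2\sqrt{s}$. Every further iteration adds at least one edge, so under these assumptions the total number of iterations is at most

$$
\lceil\sqrt{s}\,\rceil + 2\sqrt{s} \;=\; O(\sqrt{s}),
$$

and, with $O(m)$ time per iteration, the total running time is $O(m\sqrt{s})$.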

Moreover and and we get that the algorithm computes a maximum semi-matching in running time . For the case of -semi-matching this gives the complexity upper bound .

To find an arbitrary -quasi-matching one can use the algorithm for maximum -semi-matching problem which computes a maximum -semi-matching . Clearly, if then no -quasi-matching exists, otherwise is an -quasi-matching. Moreover, for an -quasi-matching we may assume (otherwise no -quasi matching exists), we get the algorithm with running time .

### Footnotes

1. email: jan.katrenic@upjs.sk, gabriel.semanisin@upjs.sk

### References

1. P. Biró and E. McDermid. Matching with sizes (or scheduling with processing set restrictions). Electronic Notes in Discrete Mathematics 36 (2010), 335–342.
2. D. Bokal, B. Brešar, and J. Jerebic. A generalization of Hungarian method and Hall’s theorem with applications in wireless sensor networks. Discrete Appl. Math. 160 (2012), 460–470.
3. J. Bruno, E. G. Coffman, Jr., and R. Sethi. Scheduling independent tasks to reduce mean finishing time. Commun. ACM 17 (1974), 382–387.
4. J. Fakcharoenphol, B. Laekhanukit, and D. Nanongkai. Faster algorithms for semi-matching problems. In: S. Abramsky, C. Gavoille, C. Kirchner, F. M. auf der Heide, P. G. Spirakis (eds.), ICALP 2010, Lecture Notes in Computer Science 6198, Springer (2010), 176–187.
5. F. Galčík, J. Katrenič, and G. Semanišin. On computing an optimal semi-matching. In: P. Kolman and J. Kratochvíl (eds.), WG 2011, Lecture Notes in Computer Science 6986, Springer (2011), 250–261.
6. T. Gu, L. Chang, and Z. Xu. A novel symbolic algorithm for maximum weighted matching in bipartite graphs. IJCNS 4 (2011), 111–121.
7. N. J. A. Harvey, R. E. Ladner, L. Lovász, and T. Tamir. Semi-matchings for bipartite graphs and load balancing. J. Algorithms 59 (2006), 53–78.
8. J. E. Hopcroft and R. M. Karp. An $n^{5/2}$ algorithm for maximum matchings in bipartite graphs. SIAM J. Comput. 2 (1973), 225–231.
9. W. A. Horn. Minimizing average flow time with parallel machines. Operations Research 21 (1973), 846–847.
10. S. Kravchenko and F. Werner. Parallel machine problems with equal processing times: a survey. Journal of Scheduling 14 (2011), 435–444.
11. K. Lee, J. Leung, and M. Pinedo. Scheduling jobs with equal processing times subject to machine eligibility constraints. Journal of Scheduling 14 (2011), 27–38.
12. K. Lee, J. Y.-T. Leung, and M. Pinedo. A note on “an approximation algorithm for the load-balanced semi-matching problem in weighted bipartite graphs”. Inf. Process. Lett. 109 (2009), 608–610.
13. D. Luo, X. Zhu, X. Wu, and G. Chen. Maximizing lifetime for the shortest path aggregation tree in wireless sensor networks. In: K. Gopalan and A.D. Striegel (eds.), INFOCOM 2011, IEEE (2011), 1566–1574.
14. R. Machado and S. Tekinay. A survey of game-theoretic approaches in wireless sensor networks. Computer Networks 52 (2008), 3047–3061.
15. B. Malhotra, I. Nikolaidis, and M. A. Nascimento. Aggregation convergecast scheduling in wireless sensor networks. Wirel. Netw. 17 (2011), 319–335.
16. M. Mucha and P. Sankowski. Maximum matchings via gaussian elimination. In: E. Upfal (ed.), FOCS 2004, IEEE Computer Society (2004), 248–255.
17. N. Sadagopan, M. Singh, and B. Krishnamachari. Decentralized utility-based sensor network design. Mobile Networks and Applications 11 (2006), 341–350.
18. L.-H. Su. Scheduling on identical parallel machines to minimize total completion time with deadline and machine eligibility constraints. The International Journal of Advanced Manufacturing Technology 40 (2009), 572–581.
19. H. Yuta, O. Hirotaka, S. Kunihiko, and Y. Masafumi. Optimal balanced semi-matchings for weighted bipartite graphs. IPSJ Digital Courier 3 (2007), 693–702.