Online Bipartite Matching with Amortized $O(\log^2 n)$ Replacements
Abstract
In the online bipartite matching problem with replacements, all the vertices on one side of the bipartition are given, and the vertices on the other side arrive one by one with all their incident edges. The goal is to maintain a maximum matching while minimizing the number of changes (replacements) to the matching. We show that the greedy algorithm that always takes the shortest augmenting path from the newly inserted vertex (denoted the SAP protocol) uses at most amortized $O(\log^2 n)$ replacements per insertion, where $n$ is the total number of vertices inserted. This is the first analysis to achieve a polylogarithmic number of replacements for any replacement strategy, almost matching the $\Omega(\log n)$ lower bound. The previous best strategy known achieved amortized $O(\sqrt{n})$ replacements [Bosek, Leniowski, Sankowski, Zych, FOCS 2014]. For the SAP protocol in particular, nothing better than the trivial $O(n)$ bound was known except in special cases. Our analysis immediately implies the same upper bound of $O(\log^2 n)$ reassignments for the capacitated assignment problem, where each vertex on the static side of the bipartition is initialized with the capacity to serve a number of vertices.
We also analyze the problem of minimizing the maximum server load. We show that if the final graph has maximum server load $L$, then the SAP protocol makes amortized $O(\min\{L \log^2 n, \sqrt{n} \log n\})$ reassignments. We also show that this is close to tight because $\Omega(\min\{L, \sqrt{n}\})$ reassignments can be necessary.
1 Introduction
In the online bipartite matching problem, the vertices on one side are given in advance (we call these the servers $S$), while the vertices on the other side (the clients $C$) arrive one at a time with all their incident edges. In the standard online model the arriving client can only be matched immediately upon arrival, and the matching cannot be changed later. Because of this irreversibility, the final matching might not be maximum; no algorithm can guarantee better than a $(1 - 1/e)$-approximation [22]. But in many settings the irreversibility assumption is too strict: rematching a client is expensive but not impossible. This motivates the online bipartite matching problem with replacements, where the goal is to match as many clients as possible at all times, while minimizing the number of changes to the matching. Applications include hashing, job scheduling, web hosting, streaming content delivery, and data storage; see [8] for more details.
In several of the applications above, a server can serve multiple clients, which raises the question of online bipartite assignment with reassignments. There are two ways of modeling this:
 Capacitated assignments.

Each server $s$ comes with the capacity to serve some number of clients $u(s)$, where each $u(s)$ is given in advance. Clients should be assigned to a server, and at no time should the capacity of a server be exceeded. There exists an easy reduction showing that this problem is equivalent to online matching with replacements [2]. A more formal description is given in Section 6.1.
 Minimize max load.

There is no limit on the number of clients a server can serve, but we want to (at all times) distribute the clients as “fairly” as possible, while still serving all the clients. Defining the load on a server as the number of clients assigned to it, the task is to minimize the maximum server load at all times, with as few reassignments as possible. A more formal description is given in Section 6.2.
While the primary goal is to minimize the number of replacements, special emphasis has been placed on analyzing the SAP protocol in particular, which always augments down a shortest augmenting path from the newly arrived client to a free server (breaking ties arbitrarily). This is the most natural replacement strategy, and shortest augmenting paths are already of great interest in graph algorithms: they occur, for example, in Dinitz' and in Edmonds and Karp's algorithms for maximum flow [9, 10], and in Hopcroft and Karp's algorithm for maximum matching in bipartite graphs [19].
Throughout the rest of the paper, we let $n$ be the number of clients in the final graph, and we consider the total number of replacements during the entire sequence of insertions; this is exactly $n$ times the amortized number of replacements. The reason for studying the vertex-arrival model (where each client arrives with all its incident edges) instead of the (perhaps more natural) edge-arrival model is the existence of a trivial lower bound of $\Omega(n^2)$ total replacements in the latter model: start with a single edge and, maintaining at all times that the current graph is a path, add edges to alternating sides of the path. Every pair of insertions causes the entire path to be augmented, leading to a total of $\Omega(n^2)$ replacements.
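The edge-arrival lower bound can be illustrated with a small simulation. The sketch below is only an illustration under stated assumptions: edges of the path are labeled by consecutive integers (a hypothetical encoding, not from the paper), and it uses the fact that a path with an odd number of edges has a unique perfect matching consisting of every second edge, starting with the first.

```python
def edge_arrival_replacements(rounds: int) -> int:
    """Count matched edges that must be dropped when a path grows by one
    edge on each side per round (edge-arrival model)."""
    lo = hi = 0          # the path consists of edges lo, lo+1, ..., hi
    matching = {0}       # the single starting edge is matched
    total = 0
    for _ in range(rounds):
        lo, hi = lo - 1, hi + 1                   # one new edge on each side
        new_matching = set(range(lo, hi + 1, 2))  # unique perfect matching
        total += len(matching - new_matching)     # old matched edges removed
        matching = new_matching
    return total

# After r rounds the path has 2r + 1 edges and the count grows quadratically.
print(edge_arrival_replacements(100))
```

Each round removes every previously matched edge, so the running total after $r$ rounds is $1 + 2 + \dots + r$, which is quadratic in the number of inserted edges.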
1.1 Previous work
The problem of online bipartite matchings with replacements was introduced in 1995 by Grove, Kao, Krishnan, and Vitter [13], who showed matching upper and lower bounds of $\Theta(n \log n)$ replacements for the case where each client has degree two. In 2009, Chaudhuri, Daskalakis, Kleinberg, and Lin [8] showed that for any arbitrary underlying bipartite graph, if the client vertices arrive in a random order, the expected number of replacements (in their terminology, the switching cost) is $\Theta(n \log n)$ using SAP, which they also show is tight. They also show that if the bipartite graph remains a forest, there exists an algorithm (not SAP) with $O(n \log n)$ replacements, and a matching lower bound. Bosek, Leniowski, Sankowski and Zych later analyzed the SAP protocol for forests, giving an upper bound of $O(n \log^2 n)$ replacements [6], later improved to the optimal $O(n \log n)$ total replacements [7]. For general bipartite graphs, no analysis of SAP is known that shows better than the trivial $O(n^2)$ total replacements. Bosek et al. [5] showed a different algorithm that achieves a total of $O(n\sqrt{n})$ replacements. They also show how to implement this algorithm in total time $O(m\sqrt{n})$, which matches the best performing combinatorial algorithm for computing a maximum matching in a static bipartite graph (Hopcroft and Karp [19]).
The lower bound of $\Omega(n \log n)$ by Grove et al. [13] has not been improved since, and is conjectured by Chaudhuri et al. [8] to be tight, even for SAP, in the general case. We take a giant leap towards closing that conjecture.
For the problem of minimizing maximum load, [15] and [2] showed an approximation solution: with only amortized changes per client insertion they maintain an assignment such that at all times the maximum load is within a factor of of optimum.
The model of online algorithms with replacements – alternatively referred to as online algorithms with recourse – has also been studied for a variety of problems other than matching. The model is similar to that of online algorithms, except that instead of trying to maintain the best possible approximation without making any changes, the goal is to maintain an optimal solution while making as few changes to the solution as possible. This model encapsulates settings in which changes to the solution are possible but expensive. The model originally goes back to online Steiner trees [20], and there have been several recent improvements for online Steiner tree with recourse [25, 14, 17, 24]. There are many papers on online scheduling that try to minimize the number of job reassignments [26, 31, 1, 27, 29, 11]. The model has also been studied in the context of flows [31, 15], and there is a very recent result on online set cover with recourse [16].
1.2 Our results
Theorem 1.
SAP makes at most $O(n \log^2 n)$ total replacements when $n$ clients are added.
This is a huge improvement over the $O(n\sqrt{n})$ bound by [5], and is only a log factor from the lower bound of $\Omega(n \log n)$ by [13]. It is also a huge improvement of the analysis of SAP; previously no better upper bound than $O(n^2)$ replacements for SAP was known. To attain the result we develop a new tool for analyzing matching-related properties of graphs (the balanced flow in Sections 3 and 4) that is quite general, and that we believe may be of independent interest.
Although SAP is an obvious way of serving the clients as they come, it does not immediately allow for an efficient implementation. Finding an augmenting path may take up to $O(m)$ time, where $m$ denotes the total number of edges in the final graph. Thus, the naive implementation takes $O(mn)$ total time. However, short augmenting paths can be found substantially faster, and using the new analytical tools developed in this paper, we are able to exploit this in a data structure that finds the augmenting paths efficiently:
Theorem. There is an implementation of the SAP protocol that runs in total time $O(m\sqrt{n}\sqrt{\log n})$.
Note that this is only an $O(\sqrt{\log n})$ factor from the offline algorithm of Hopcroft and Karp [19]. This offline algorithm had previously been matched in the online setting by the algorithm of Bosek et al. [5], which has total running time $O(m\sqrt{n})$. Our result has the advantage of combining multiple desired properties in a single algorithm: few replacements ($O(n \log^2 n)$ vs. $O(n\sqrt{n})$ in [5]), fast implementation ($O(m\sqrt{n}\sqrt{\log n})$ vs. $O(m\sqrt{n})$ in [5]), and the most natural augmentation protocol (shortest augmenting path).
Extending our result to the case where each server can have multiple clients, we use that the capacitated assignment problem is equivalent to that of matching (see Section 6.1) to obtain:
Theorem. SAP uses at most $O(n \log^2 n)$ reassignments for the capacitated assignment problem, where $n$ is the number of clients.
In the case where we wish to minimize the maximum load, such a small number of total reassignments is not possible. Let $L(G)$ denote the minimum possible maximum load in graph $G$. We present a lower bound showing that when $L(G) = L$ we may need as many as $\Omega(nL)$ reassignments, as well as a nearly matching upper bound.
theoremlowerbound For any positive integers and divisible by there exists a graph with and , along with an ordering in which the clients in are inserted, such that any algorithm for the exact online assignment problem requires a total of changes. This lower bound holds even if the algorithm knows the entire graph in advance, as well as the order in which the clients are inserted.
Theorem. Let $C$ be the set of all clients inserted, let $n = |C|$, and let $L = L(G)$ be the minimum possible maximum load in the final graph $G = (C \cup S, E)$. SAP at all times maintains an optimal assignment while making a total of $O(n \min\{L \log^2 n, \sqrt{n} \log n\})$ reassignments.
1.3 High level overview of techniques
Consider the standard setting in which we are given the entire graph from the beginning and want to compute a maximum matching. The classic shortest-augmenting-paths algorithm constructs a matching by at every step picking a shortest augmenting path in the graph. We now show a very simple argument that the total length of all these augmenting paths is $O(n \log n)$. Recall the well-known fact that if all augmenting paths in the matching have length $\ge h$, then the current matching is at most $2n/h$ edges from optimal [19]. Thus the algorithm augments down at most $2n/h$ augmenting paths of length $\ge h$. Let $P_1, \dots, P_k$ denote all the paths augmented down by the algorithm in decreasing order of $|P_i|$; then $k \le n$, and $|P_i| \ge h$ implies $i \le 2n/h$. But then $|P_i| \le 2n/i$, so $\sum_{i=1}^{k} |P_i| \le 2n \sum_{i=1}^{n} 1/i = O(n \log n)$.
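The harmonic-sum step of this argument can be checked numerically. The following sketch assumes the per-path bound $|P_i| \le 2n/i$ on the $i$-th longest path (the constant 2 is part of that assumption) and compares the resulting sum against the closed form $2n(1 + \ln n)$; the function names are illustrative only.

```python
import math

def path_length_bound(n: int) -> float:
    """Sum of the assumed per-path bounds 2n/i for i = 1..n."""
    return sum(2 * n / i for i in range(1, n + 1))

def harmonic_upper(n: int) -> float:
    """Closed-form bound 2n(1 + ln n), using H_n <= 1 + ln n."""
    return 2 * n * (1 + math.log(n))

for n in (10, 1_000, 100_000):
    # The summed bound never exceeds the O(n log n) closed form.
    assert path_length_bound(n) <= harmonic_upper(n)
```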
In the online setting, the algorithm does not have access to the entire graph. It can only choose the shortest augmenting path from the arriving client $c$. We are nonetheless able to show a similar bound for this setting:
Lemma. Consider the following protocol for constructing a matching: For each client $c \in C$ in arbitrary order, augment along the shortest augmenting path from $c$ (if one exists). Given any $h$, this protocol augments down a total of at most $4n \ln(n)/h$ augmenting paths of length $> h$.
The proof of our main theorem then follows directly from the lemma.
Proof of Theorem 1.
Note that the SAP protocol exactly follows the condition of Lemma 1.3. Now, given any integer $i \ge 0$, we say that an augmenting path is at level $i$ if its length is in the interval $[2^i, 2^{i+1})$. By Lemma 1.3, the SAP protocol augments down at most $O(n \log(n)/2^i)$ paths of level $i$. Since each of those paths contains at most $2^{i+1}$ edges, the total length of augmenting paths of level $i$ is at most $O(n \log n)$. Summing over all $O(\log n)$ levels yields the desired $O(n \log^2 n)$ bound. ∎
The entirety of Sections 3 and 4 is devoted to proving Lemma 1.3. Previous algorithms attempted to bound the total number of reassignments by tracking how some property of the matching changes over time. For example, the analysis of Gupta et al. [15] keeps track of changes to the “height” of vertices, while the algorithm with $O(n\sqrt{n})$ reassignments [5] takes a more direct approach, and uses a non-SAP protocol whose changes to the matching depend on how often each particular client has already been reassigned.
Unfortunately such arguments have had limited success because the matching can change quite erratically. This is especially true under the SAP protocol, which is why it has only been analyzed in very restrictive settings [8, 13, 6]. We overcome this difficulty by showing that it is enough to analyze how new clients change the structure of the graph $G$, without reference to any particular matching.
Intuitively, our analysis keeps track of how “necessary” each server $s$ is (denoted $\alpha(s)$ below). So for example, if there is a complete bipartite graph with 10 servers and 10 clients, then all servers are completely necessary. But if the complete graph has 20 servers and 10 clients, then while any matching has 10 matched servers and 10 unmatched ones, it is clear that if we abstract away from the particular matching every server is 1/2-necessary. Of course in more complicated graphs different servers might have different necessities, and some necessities might be very close to 1 (say ). Note that server necessities depend only on the graph, not on any particular matching. Note also that our algorithm never computes the server necessities, as they are merely an analytical tool.
We relate necessities to the number of reassignments with two crucial arguments. 1. Server necessities only increase as clients are inserted, and once a server $s$ has $\alpha(s) = 1$, then regardless of the current matching, no future augmenting path will go through $s$. 2. If, in any matching, the shortest augmenting path from a new client $c$ is long, then the insertion of $c$ will increase the necessity of servers that already had high necessity. We then argue that this cannot happen too many times before the servers involved have necessity 1, and thus do not partake in any future augmenting paths.
1.4 Paper outline
In Section 2, we introduce the terminology necessary to understand the paper. In Section 3, we introduce and reason about the abstraction of a balanced server flow, a number that reflects the necessity of each server. In Section 4, we use the balanced server flow to prove Lemma 1.3, which proves our main theorem that SAP makes a total of $O(n \log^2 n)$ replacements. In Section 5, we give an efficient implementation of SAP. Finally, in Section 6, we present our results on capacitated online assignment, and for minimizing maximum server load in the online assignment problem.
2 Preliminaries and notation
Let $(C, S)$ be the vertices, and $E$ be the edges of a bipartite graph. We call $C$ the clients, and $S$ the servers. Clients arrive, one at a time, and we must maintain an explicit maximum matching of the clients. For simplicity of notation, we assume for the rest of the paper that $C \ne \emptyset$. For any vertex $v$, let $N(v)$ denote the neighborhood of $v$, and for any $V \subseteq C \cup S$ let $N(V) = \bigcup_{v \in V} N(v)$.
Theorem 2 (Hall's Marriage Theorem [18]).
There is a matching of size $|C|$ if and only if $|K| \le |N(K)|$ for all $K \subseteq C$.
Definition 3.
Given any matching $M$ in a graph $G$, an alternating path is one which alternates between unmatched and matched edges. An augmenting path is an alternating path that starts and ends with an unmatched vertex. Given any augmenting path $P$, “flipping” the matched status of every edge on $P$ gives a new larger matching. We call this process augmenting down $P$.
Denote by SAP the algorithm that upon the arrival of a new client $c$ augments down the shortest augmenting path from $c$; ties can be broken arbitrarily, and if no augmenting path from $c$ exists the algorithm does nothing. Chaudhuri et al. [8] showed that if the final graph contains a perfect matching, then the SAP protocol also returns a perfect matching. We now generalize this as follows:
Observation 4.
Because of the nature of augmenting paths, once a client or a server is matched by the SAP protocol, it will remain matched during all future client insertions. On the other hand, if a client $c$ arrives and there is no augmenting path from $c$ to a free server, then during the entire sequence of client insertions $c$ will never be matched by the SAP protocol; no alternating path can go through $c$ because it is not incident to any matched edges.
Lemma 5.
The SAP protocol always maintains a maximum matching in the current graph $G$.
Proof.
Consider for contradiction the first client $c$ such that after the insertion of $c$, the matching $M$ maintained by the SAP protocol is not a maximum matching. Let $C'$ be the set of clients before $c$ was inserted. Since $M$ is maximum in the graph induced by $C' \cup S$ but not in $G$, it is clear that $c$ is matched in a maximum matching $M'$ of $G$ but not in $M$. But this contradicts the well-known property of augmenting paths that the symmetric difference $M \oplus M'$ contains an augmenting path in $M$ from $c$ to a free server. ∎
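To make the SAP protocol concrete, here is a minimal sketch of a single client insertion using breadth-first search over alternating paths. It is the naive approach, not the efficient implementation of Section 5, and the names `sap_insert`, `adj`, `match_c`, and `match_s` are hypothetical choices made for this illustration.

```python
from collections import deque

def sap_insert(adj, match_c, match_s, c):
    """Insert client c and augment down a shortest augmenting path from c.

    adj:     dict client -> list of neighboring servers
    match_c: dict client -> matched server (mutated in place)
    match_s: dict server -> matched client (mutated in place)
    Returns the number of previously matched clients that were reassigned,
    or None if no augmenting path from c exists.
    """
    parent = {}            # server -> client from which BFS first reached it
    queue = deque([c])
    free = None
    while queue and free is None:
        u = queue.popleft()
        for s in adj[u]:
            if s in parent:
                continue
            parent[s] = u
            if s not in match_s:       # free server: shortest path found
                free = s
                break
            queue.append(match_s[s])   # follow the matched edge back to a client
    if free is None:
        return None
    replaced = 0
    s = free
    while True:                        # flip edges from the free server back to c
        u = parent[s]
        previous = match_c.get(u)      # server that u used before (None for c)
        match_c[u] = s
        match_s[s] = u
        if u == c:
            return replaced
        replaced += 1
        s = previous

adj = {"c2": ["s1", "s2"], "c1": ["s1"]}
match_c, match_s = {}, {}
sap_insert(adj, match_c, match_s, "c2")   # c2 takes s1 directly
sap_insert(adj, match_c, match_s, "c1")   # c1 takes s1, c2 moves to s2
```

Because clients are processed in order of their alternating distance from `c`, the first free server discovered ends a shortest augmenting path; ties are broken by the order of the adjacency lists.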
3 The server flow abstraction
3.1 Defining the Server Flow
We now formalize the notion of server necessities from Section 1.3 by using a flow-based notation. The necessity of a server $s$ will be the value $\alpha(s)$ of a balanced server flow $\alpha$. We will now go on to define a server flow, define what it means for a server flow to be balanced, and then show that the balanced server flow is unique.
Definition 6.
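Given any graph $G = (C \cup S, E)$, define a server flow $\alpha$ as any map from $S$ to the nonnegative reals such that there exist nonnegative $(x_e)_{e \in E}$ with:
$$\forall c \in C: \sum_{s \in N(c)} x_{cs} = 1, \qquad \forall s \in S: \sum_{c \in N(s)} x_{cs} = \alpha(s).$$
We say that such a set of $x$-values realize the server flow.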
A server flow can be thought of as a fractional assignment from $C$ to $S$; note, however, that it is not necessarily a fractional matching, since servers may have a load greater than $1$. Note also that the same server flow may be realized in more than one way. Furthermore, if $N(c) = \emptyset$ for some $c \in C$ then $\sum_{s \in N(c)} x_{cs} = 0 \ne 1$, so no server flow is possible. So suppose (unless otherwise noted) that $N(c) \ne \emptyset$ for all $c \in C$.
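As a small illustration of this definition, the sketch below checks whether a set of edge values realizes a server flow and computes the resulting loads. The representation (a dict `adj` from clients to neighbor lists and a dict `x` keyed by `(client, server)` pairs) is an assumption made for this example, and exact rationals are used to avoid rounding issues.

```python
from fractions import Fraction

def server_loads(servers, x):
    """Return alpha(s): the total flow x[(c, s)] arriving at each server s."""
    alpha = {s: Fraction(0) for s in servers}
    for (_, s), value in x.items():
        alpha[s] += value
    return alpha

def realizes_server_flow(adj, x):
    """Check that x is nonnegative, supported on edges of the graph, and
    that every client sends exactly one unit of flow in total."""
    for (c, s), value in x.items():
        if value < 0 or s not in adj.get(c, ()):
            return False
    return all(
        sum(x.get((c, s), Fraction(0)) for s in nbrs) == 1
        for c, nbrs in adj.items()
    )

# Two clients that can only use the same server: a valid server flow
# in which that server carries load 2, so it is not a fractional matching.
adj = {"c1": ["s1"], "c2": ["s1"]}
x = {("c1", "s1"): Fraction(1), ("c2", "s1"): Fraction(1)}
assert realizes_server_flow(adj, x)
assert server_loads(["s1"], x)["s1"] == 2
```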
The following lemma can be seen as a generalization of Hall's Marriage Theorem:
Lemma 7.
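If $\max_{\emptyset \subset K \subseteq C} \frac{|K|}{|N(K)|} = \frac{p}{q}$, then there exists a server flow where every server $s \in S$ has $\alpha(s) \le \frac{p}{q}$.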
Proof.
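Let $C^*$ be the original set $C$ but with $q$ copies of each client. Similarly let $S^*$ contain $p$ copies of each server, and let $E^*$ consist of all $pq$ edges between copies of the endpoints of each edge in $E$.
Now let $K^* \subseteq C^*$, and let $K \subseteq C$ be the originals that the vertices in $K^*$ are copies of. Then $|K^*| \le q|K| \le p|N(K)| = |N(K^*)|$, so the graph $(C^* \cup S^*, E^*)$ satisfies Hall's theorem and thus it has some matching $M$ in which every client in $C^*$ is matched. Now, for $cs \in E$ let
$$x_{cs} = \frac{1}{q}\,\bigl|\{c^* s^* \in M : c^* \text{ is a copy of } c \text{ and } s^* \text{ is a copy of } s\}\bigr|.$$
Since for each $c \in C$ all $q$ copies of $c$ are matched, $\sum_{s \in N(c)} x_{cs} = 1$ for all $c \in C$. Similarly, since for each $s \in S$ at most $p$ copies of $s$ are matched, $\sum_{c \in N(s)} x_{cs} \le \frac{p}{q}$. Thus, $x$ realizes the desired server flow. ∎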
Definition 8.
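We say that a server flow $\alpha$ is balanced, if additionally:
$$\forall c \in C,\ s \in N(c): \quad x_{cs} > 0 \implies \alpha(s) = \min_{s' \in N(c)} \alpha(s').$$
That is, if each client only sends flow to its least loaded neighbours.
We call the set $A(c) = \arg\min_{s \in N(c)} \alpha(s)$ the active neighbors of $c$, and we call an edge $cs$ active when $s \in A(c)$. We extend the definition to sets of clients in the natural way, so for $K \subseteq C$, $A(K) = \bigcup_{c \in K} A(c)$.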
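The balance condition is easy to test mechanically on small instances. The following sketch assumes the edge values `x` already realize a server flow on the graph `adj` (both are hypothetical dict representations chosen for this example) and checks that every client sends flow only to neighbors of minimum load.

```python
from fractions import Fraction

def loads(adj, x):
    """Server loads alpha(s) induced by the edge values x."""
    alpha = {s: Fraction(0) for nbrs in adj.values() for s in nbrs}
    for (_, s), value in x.items():
        alpha[s] += value
    return alpha

def active_neighbors(adj, alpha, c):
    """A(c): the neighbors of client c with minimum load."""
    least = min(alpha[s] for s in adj[c])
    return {s for s in adj[c] if alpha[s] == least}

def is_balanced(adj, x):
    """True if every edge carrying positive flow is active."""
    alpha = loads(adj, x)
    return all(
        s in active_neighbors(adj, alpha, c)
        for (c, s), value in x.items() if value > 0
    )

adj = {"c1": ["s1"], "c2": ["s1", "s2"]}
balanced = {("c1", "s1"): Fraction(1), ("c2", "s2"): Fraction(1)}
skewed = {("c1", "s1"): Fraction(1),
          ("c2", "s1"): Fraction(1, 2), ("c2", "s2"): Fraction(1, 2)}
assert is_balanced(adj, balanced)      # both servers have load 1
assert not is_balanced(adj, skewed)    # c2 sends flow to the heavier s1
```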
3.2 Uniqueness of Loads in a Balanced Server Flow
Note that while there may be more than one server flow, we will show that the balanced server flow $\alpha$ is unique, although there may be many possible $x$-values that realize $\alpha$.
Lemma 9.
A unique balanced server flow exists if and only if $N(c) \ne \emptyset$ for all $c \in C$.
Clearly, it is necessary for all clients to have at least one neighbor for a server flow to exist, so the “only if” part is obvious. We dedicate the rest of this section to proving that this condition is sufficient. In fact, we provide two different proofs of uniqueness, the first of which is simpler but provides less intuition for what the unique $\alpha(s)$ values signify about the structure of the graph.
3.2.1 Short proof of Lemma 9 via convex optimization
It is not hard to prove uniqueness by showing that a balanced server flow corresponds to the solution to a convex program. (The authors thank Seffi Naor for pointing this out.) Consider the convex optimization problem where the constraints are those of a not necessarily balanced server flow (Definition 6), and the objective function we seek to minimize is the sum of the squares of the server loads.
To be precise, the convex program contains a variable $\alpha_s$ for each server $s \in S$, and a variable $x_{cs}$ for each edge $cs$ in the graph. Its objective is to minimize the function $\sum_{s \in S} \alpha_s^2$ subject to the constraints:
$$x_{cs} \ge 0 \quad \forall cs \in E, \qquad \sum_{s \in N(c)} x_{cs} = 1 \quad \forall c \in C, \qquad \sum_{c \in N(s)} x_{cs} = \alpha_s \quad \forall s \in S.$$
It is easy to check that because we introduce a separate variable for each server load, the objective function is strictly convex, so the convex program has a unique minimum with respect to the server loads (but not the edge flows).
We now observe that this unique solution is a balanced server flow: the constraints of the convex program ensure that it is a server flow, and were it not balanced, there would be some client $c$ that sends nonzero flow to a server $s$ while having a neighbor $s'$ with $\alpha_{s'} < \alpha_s$, which would be a contradiction because we can decrease the objective function by slightly increasing $x_{cs'}$ and decreasing $x_{cs}$. We have thus proved the existence of a balanced server flow.
We must now prove uniqueness, i.e. that all balanced server flows have the same server loads. We will do this by showing that any balanced server flow optimizes the objective function of the convex program. There are many standard approaches for proving this claim, but the simplest one we know of is based on existing literature on load balancing with selfish agents. In particular, we rely on the following simple auxiliary lemma, which is a simplified version of Lemma 2.2 in [30].
Lemma 10.
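Consider any balanced server flow realized by $x$, and let $\alpha$ be the resulting server loads. Let $x'$ be any feasible (not necessarily balanced) server flow, and let $\alpha'$ be the resulting server loads. Then, we always have:
$$\sum_{s \in S} \alpha(s)^2 \le \sum_{s \in S} \alpha(s)\,\alpha'(s).$$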
Proof.
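For any client $c$, let $m_c$ ($m$ for minimum) be the minimum server load neighboring $c$ in the balanced solution $x$. That is, $m_c = \min_{s \in N(c)} \alpha(s)$. We then have
$$\sum_{s \in S} \alpha(s)^2 = \sum_{s \in S} \alpha(s) \sum_{c \in N(s)} x_{cs} = \sum_{cs \in E} x_{cs}\,\alpha(s) = \sum_{cs \in E} x_{cs}\, m_c = \sum_{c \in C} m_c,$$
where the last equality follows from the fact that each client sends one unit of flow, and the second-to-last equality follows from the fact that the flow is balanced, so for any edge $cs$ with $x_{cs} > 0$ we have $\alpha(s) = m_c$.
From the definition of $m_c$ it follows that for any edge $cs \in E$, we have $\alpha(s) \ge m_c$. This yields:
$$\sum_{s \in S} \alpha(s)\,\alpha'(s) = \sum_{s \in S} \alpha(s) \sum_{c \in N(s)} x'_{cs} = \sum_{cs \in E} x'_{cs}\,\alpha(s) \ge \sum_{cs \in E} x'_{cs}\, m_c = \sum_{c \in C} m_c.$$
We thus have $\sum_{s \in S} \alpha(s)^2 = \sum_{c \in C} m_c$ and $\sum_{s \in S} \alpha(s)\,\alpha'(s) \ge \sum_{c \in C} m_c$, which yields the lemma. ∎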
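We now argue that any balanced flow is an optimal solution to the convex program, and is thus unique. Consider any balanced flow $x$ with loads $\alpha(s)$. To show that $x$ is optimum, we need to show that for any feasible solution $x'$ with loads $\alpha'(s)$ we have $\sum_{s \in S} \alpha(s)^2 \le \sum_{s \in S} \alpha'(s)^2$. Equivalently, let $\alpha$ and $\alpha'$ be the vectors of server loads in the two solutions. We want to show that $\|\alpha\|^2 \le \|\alpha'\|^2$. Lemma 10 is equivalent to $\|\alpha\|^2 \le \alpha \cdot \alpha'$, and by the Cauchy-Schwarz inequality $\alpha \cdot \alpha' \le \|\alpha\|\,\|\alpha'\|$, so $\|\alpha\| \le \|\alpha'\|$ as desired.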
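A tiny numeric instance may help make the two inequalities tangible. The example below is an assumption chosen for illustration: a graph with clients $c_1, c_2$ and servers $s_1, s_2$, where $c_1$ is adjacent only to $s_1$ and $c_2$ to both servers. The balanced loads are $(1, 1)$, and splitting $c_2$ evenly gives the feasible but unbalanced loads $(3/2, 1/2)$.

```python
from fractions import Fraction

def dot(u, v):
    """Inner product of two load vectors indexed by server."""
    return sum(u[s] * v[s] for s in u)

alpha = {"s1": Fraction(1), "s2": Fraction(1)}               # balanced loads
alpha_prime = {"s1": Fraction(3, 2), "s2": Fraction(1, 2)}   # feasible, unbalanced

# Lemma 10: sum of alpha^2 is at most the inner product with any feasible loads.
assert dot(alpha, alpha) <= dot(alpha, alpha_prime)
# Consequence: the balanced loads minimize the sum of squares.
assert dot(alpha, alpha) <= dot(alpha_prime, alpha_prime)
```

Here the first inequality holds with equality (both sides are 2), while the sum of squares of the unbalanced loads is $5/2$.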
3.2.2 Longer combinatorial proof of uniqueness
Although the reduction to convex programming is the most direct proof of uniqueness, it has the disadvantage of not providing any insight into what the unique $\alpha(s)$ values actually correspond to. We thus provide a more complicated combinatorial proof which shows that the $\alpha(s)$ correspond to a certain hierarchical decomposition of the graph.
The following lemma will help us upper and lower bound the sum of flow to a subset of servers.
Lemma 11.
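If $\alpha$ is a balanced server flow, then
$$\forall T \subseteq S: \quad \bigl|\{c \in C : A(c) \subseteq T\}\bigr| \le \sum_{s \in T} \alpha(s) \le \bigl|\{c \in C : A(c) \cap T \ne \emptyset\}\bigr|.$$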
Proof.
The first inequality is true because each client in the first set contributes exactly one to the sum (but there may be other contributions). The second inequality is true because every client contributes exactly one to $\sum_{s \in S} \alpha(s)$, and the inequality counts every client that contributes anything to $\sum_{s \in T} \alpha(s)$ as contributing one. ∎
The first step to proving that every graph has a unique balanced server flow is to show that the maximum value $\bar\alpha = \max_{s \in S} \alpha(s)$ is uniquely defined. We start by showing that the generalization of Hall's Marriage Theorem from Lemma 7 is “tight” for a balanced server flow in the sense that there does indeed exist a set of $p$ clients with neighbourhood of size $q$ realizing the maximum value $\bar\alpha = p/q$. In fact, the maximally necessary servers and their active neighbours (defined below) form such a pair of sets:
Lemma 12.
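Let $\alpha$ be a balanced server flow, let $\bar\alpha = \max_{s \in S} \alpha(s)$ be the maximal necessity, let $\bar S = \{s \in S : \alpha(s) = \bar\alpha\}$ be the maximally necessary servers, and let $\bar K = \{c \in C : A(c) \cap \bar S \ne \emptyset\}$ be their active neighbours. Then $N(\bar K) = \bar S$ and $|\bar K| = \bar\alpha\,|\bar S|$.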
Proof.
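Since $C \ne \emptyset$ we have $\bar\alpha > 0$, so every $s \in \bar S$ receives flow from some client $c$ with $x_{cs} > 0$; because the flow is balanced, $s \in A(c)$, hence $c \in \bar K$ and $\bar S \subseteq N(\bar K)$. However, we also have $N(\bar K) \subseteq \bar S$: by definition of $\bar K$, every $c \in \bar K$ has an active neighbour of load $\bar\alpha$, so $\min_{s \in N(c)} \alpha(s) = \bar\alpha$, and since $\bar\alpha$ is the maximum load, $\alpha(s) = \bar\alpha$ for every $s \in N(c)$. Thus, $N(\bar K) = \bar S$ and $A(c) = N(c) \subseteq \bar S$ for every $c \in \bar K$. Now, note that by Lemma 11
$$|\bar K| = \bigl|\{c \in C : A(c) \subseteq \bar S\}\bigr| \le \sum_{s \in \bar S} \alpha(s) = \bar\alpha\,|\bar S| \le \bigl|\{c \in C : A(c) \cap \bar S \ne \emptyset\}\bigr| = |\bar K|,$$
so $|\bar K| = \bar\alpha\,|\bar S|$. ∎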
We can thus show that $\bar\alpha$ exactly equals the maximal quotient $\frac{|K|}{|N(K)|}$ over subsets $K$ of clients.
Lemma 13.
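Let $\alpha$ be a balanced server flow, and let $\bar\alpha = \max_{s \in S} \alpha(s)$. Then
$$\bar\alpha = \max_{\emptyset \subset K \subseteq C} \frac{|K|}{|N(K)|}.$$
Furthermore, for any $K \subseteq C$, if $\frac{|K|}{|N(K)|} = \bar\alpha$, then $\alpha(s) = \bar\alpha$ for all $s \in N(K)$.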
Proof.
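By definition of server flow, for nonempty $K \subseteq C$, $|K| \le \sum_{s \in N(K)} \alpha(s) \le \bar\alpha\,|N(K)|$, so $\frac{|K|}{|N(K)|} \le \bar\alpha$. Let $\bar K$ be defined as in Lemma 12. Then $\frac{|\bar K|}{|N(\bar K)|} = \bar\alpha$. Finally, if $\frac{|K|}{|N(K)|} = \bar\alpha$ then $\alpha(s) \le \bar\alpha$ for all $s \in N(K)$ implies $\alpha(s) = \bar\alpha$ for all $s \in N(K)$. ∎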
Corollary 14.
If $\frac{|C|}{|S|} = \max_{\emptyset \subset K \subseteq C} \frac{|K|}{|N(K)|}$, there is a unique balanced server flow.
Proof.
We are now ready to give a combinatorial proof of uniqueness. We will do so by showing that the $\alpha(s)$ in fact express a very nice structural property of the graph, which can be thought of as a hierarchy of “tightness” for the Hall constraint. As shown in Lemma 13, the maximum value $\bar\alpha$ corresponds to the tightest Hall constraint, i.e. the maximum possible value of $\frac{|K|}{|N(K)|}$. Now, there may be many sets $K$ with $\frac{|K|}{|N(K)|} = \bar\alpha$, so let $\bar K$ be a maximal such set; we will show that $\bar K$ is in fact the union of all sets $K$ with $\frac{|K|}{|N(K)|} = \bar\alpha$. Now, by Lemma 13, every server $s \in N(\bar K)$ has $\alpha(s) = \bar\alpha$. We will show that in fact, because $\bar K$ captured all sets with tightness $\bar\alpha$, all servers $s \notin N(\bar K)$ have $\alpha(s) < \bar\alpha$. Thus, because the flow is balanced, all active edges incident to $\bar K$ or $N(\bar K)$ will be between $\bar K$ and $N(\bar K)$; there will be no active edges coming from the outside.
For this reason, any balanced server flow on $G$ can be thought of as the union of two completely independent server flows: the first (unique) flow assigns $\bar\alpha$ to all $s \in N(\bar K)$, while the second is a balanced server flow on the remaining graph $G \setminus (\bar K \cup N(\bar K))$. Since this remaining graph is smaller, we can use induction on the size of the graph to claim that this second balanced server flow has unique $\alpha$-values, which completes the proof of uniqueness.
If we follow through the entire inductive chain, we end up with a hierarchy of $\alpha$-values, which can be viewed as the result of the following peeling procedure: first find the (maximally large) set $K_1$ that maximizes $\alpha_1 = \frac{|K_1|}{|N(K_1)|}$ and assign every server $s \in N(K_1)$ a value of $\alpha_1$; then peel off $K_1$ and $N(K_1)$, find the (maximally large) set $K_2$ in the remaining graph that maximizes $\alpha_2 = \frac{|K_2|}{|N(K_2)|}$, and assign every server $s \in N(K_2)$ value $\alpha_2$; peel off $K_2$ and $N(K_2)$ and continue in this fashion, until every server has some value. These values assigned to each server are precisely the unique $\alpha(s)$ in a balanced server flow.
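The peeling procedure can be sketched directly as a brute-force computation. The code below is only an illustration for tiny graphs (it enumerates all subsets of the remaining clients, so it takes exponential time), and the function name and dict-based graph representation are assumptions of this example. It takes the union of all ratio-maximizing client sets in each round, assigns their ratio to the neighboring servers, and removes both; servers that are never peeled receive value 0.

```python
from fractions import Fraction
from itertools import combinations

def balanced_loads_by_peeling(adj, servers):
    """Compute server values by repeatedly peeling the tightest Hall set.

    adj: dict client -> collection of neighboring servers (each nonempty).
    """
    clients_left = set(adj)
    servers_left = set(servers)
    alpha = {}
    while clients_left:
        best_ratio, best_set = Fraction(0), set()
        ordered = sorted(clients_left)
        for size in range(1, len(ordered) + 1):
            for subset in combinations(ordered, size):
                nbrs = set().union(*(adj[c] for c in subset)) & servers_left
                if not nbrs:
                    raise ValueError("a client has no remaining neighbor")
                ratio = Fraction(len(subset), len(nbrs))
                if ratio > best_ratio:
                    best_ratio, best_set = ratio, set(subset)
                elif ratio == best_ratio:
                    best_set |= set(subset)     # union of all tightest sets
        peeled = set().union(*(adj[c] for c in best_set)) & servers_left
        for s in peeled:
            alpha[s] = best_ratio
        clients_left -= best_set
        servers_left -= peeled
    for s in servers_left:                      # servers receiving no flow
        alpha[s] = Fraction(0)
    return alpha

# Two clients compete for s1 (value 2); the third spreads over s2 and s3.
adj = {"c1": {"s1"}, "c2": {"s1"}, "c3": {"s1", "s2", "s3"}}
print(balanced_loads_by_peeling(adj, {"s1", "s2", "s3"}))
```

In this example the first round peels $\{c_1, c_2\}$ with ratio 2, and the second round peels $c_3$ with ratio $1/2$ on the remaining servers, matching the balanced flow in which $c_3$ avoids the overloaded server.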
Remark 15.
We were unaware of this when submitting the extended abstract, but a similar hierarchical decomposition was used earlier to compute an approximate matching in the semistreaming setting: see [12], [21]. Note that unlike those papers, we do not end up relying on this decomposition for our main arguments. We only present it here to give a combinatorial alternative to the convex optimization proof above: regardless of which proof we use, once uniqueness is established, the rest of our analysis is expressed purely in terms of balanced server flows.
Proof of Lemma 9.
As already noted, for all is a necessary condition. We will prove that it is sufficient by induction on . If , the flow for is trivially the unique balanced server flow. Suppose now that and that it holds for all . Now let and let
Note that for any we have
(by definition of )  
(since )  
(by definition of )  
(since is submodular) 
so and thus and . If then (otherwise ) and by Corollary 14 we are done, so suppose . Consider the subgraph induced by and the subgraph induced by .
By Corollary 14, has a unique balanced server flow with for all .
By our induction hypothesis, also has a unique balanced server flow .
We proceed to show that the combination of with constitutes a unique balanced flow of the entire graph , defined as follows:
Note first that is a balanced server flow for , because both and have a set of values that realize them, and by construction these values (together with zeroes for each edge between and ) realize a balanced server flow for .
For uniqueness, note that by Lemma 13 any balanced server flow for must have for . We now show that for any