We study the problem of online weighted bipartite matching in which we want to find a maximum weighted matching between two sets of entities, e.g. matching impressions in online media to advertisers.
Karp et al. [14] designed the elegant algorithm Ranking with competitive ratio $1-1/e$ for the unweighted case.
Without the commonly accepted Free Disposal assumption
In light of the hardness result of Kapralov et al. [12] that restricts beating the competitive ratio $1/2$ for the Monotone Submodular Welfare Maximization problem, our result can be seen as evidence that solving the weighted matching problem is strictly easier than submodular welfare maximization in online settings.
1 Introduction
Matchings are fundamental structures in graph theory, and there has been a lot of interest in designing efficient algorithms to find maximum matchings in terms of either the cardinality or total weight of the allocation.
In particular, matchings in bipartite graphs have found numerous applications in any setting where coupling individuals from one set of entities with another set of entities is desired, e.g. students to schools, doctors to hospitals, computing tasks to servers, and impressions (in online media) to advertisers, just to name a few.
Due to the outstanding growth of matching markets in digital domains,
online matching algorithms are in great demand today. In particular, search engine companies have introduced opportunities for online allocation and matching algorithms to have significant impacts in multibillion dollar advertising markets.
Motivated by these applications, we consider the problem of matching a set of impressions that arrive one by one to a set of advertisers that are given in advance. When an impression arrives, its edges to advertisers are revealed, and an irrevocable decision has to be made on which advertiser the impression should be assigned to
Free disposal assumption. In display advertising applications, advertisers are only happier if they receive more impressions
1.1 Our contributions
More than 25 years after [14], it is still an open problem to find an online algorithm that beats the $1/2$ competitive ratio of the classic greedy algorithm for weighted bipartite matching. We present the first online algorithm for weighted bipartite matching (under the free disposal assumption) with competitive ratio $1/2+\epsilon$ for some constant $\epsilon > 0$. In Section 2, we present our algorithm, whose competitive ratio guarantee is proved in Theorem 3.4. In Section 4, we slightly change the algorithm and propose an optimized variant that achieves a better competitive ratio with a tighter analysis. In light of the hardness result of Kapralov et al. [12] that restricts beating the competitive ratio $1/2$ for the Monotone Submodular Welfare Maximization problem, our algorithm can be seen as strong evidence that solving the weighted matching problem is strictly easier than submodular welfare maximization in online settings.
1.2 Related work
Most of the literature on online weighted matching algorithms is devoted to achieving competitive ratios better than $1/2$ (usually $1-1/e$ or $1-\epsilon$) by assuming either that the advertisers have large capacities or that some stochastic information regarding the arrival process of impressions is known. Due to the large number of interesting papers in this area, we only name a few leading works and refer interested readers to the survey [20].
Large capacities.
Exploiting the large capacities assumption to beat the $1/2$ competitive ratio dates back two decades [11].
Feldman et al. present a primal dual algorithm [5] with competitive ratio $1-1/e$, assuming each advertiser has some large capacity that denotes the target number of impressions that can be assigned to it (the Display Ads problem). Under similar assumptions, the same competitive ratio was achieved [21, 1] for the Budgeted Allocation problem, in which advertisers have a budget constraint on the total weight that can be assigned to them instead of on the number of impressions.
From the theoretical standpoint, perhaps the most interesting open problem in the online matching literature is to provide algorithms with competitive ratio above $1/2$ without making any assumption on capacities, which is the focus of our work. Without loss of generality, we can assume that each advertiser has capacity one.
Stochastic arrivals. If one can derive some information on the patterns in which impressions arrive, much better algorithms can be designed. Typical stochastic assumptions are that the impressions are drawn from some known or unknown distribution [6, 19, 13, 3, 9, 10] or that the impressions arrive in random order [8, 2, 4, 18, 22]. These works achieve either a $1-\epsilon$ competitive ratio if the large capacities assumption holds on top of the stochastic assumptions, or at least a $1-1/e$ competitive ratio with arbitrary capacities. Korula et al. [15] show that Greedy is $0.505$ competitive for the more general problem of Submodular Welfare Maximization if the impressions arrive in a random order, without making any assumptions on capacities. The random order assumption is particularly justified here, as Kapralov et al. [12] show that beating $1/2$ for Submodular Welfare Maximization in the worst case arrival model is equivalent to proving $\mathsf{NP} = \mathsf{RP}$.
Although there has been great progress on predicting patterns of impressions over time, we are far from coming up with a distribution that matches the future patterns of traffic. This challenge is especially highlighted by unpredictable patterns of impressions and traffic spikes [22], and it limits the applicability of known optimization methods for stochastic settings [23]. Therefore we focus on the general weighted matching problem in which no assumption is made on the types of impressions or their arrival sequence.
1.3 Preliminaries
Let $A$ be the set of advertisers, $I$ be the set of impressions, and $w_{i,a}$ denote the nonnegative weight of the edge between impression $i \in I$ and advertiser $a \in A$. To simplify notation, we put an edge of weight zero between $i$ and $a$ if there is no edge between them. The set of advertisers is given in advance, and the impressions arrive one by one at time steps $1, 2, \dots, n$, where time $n+1$ represents the end of the algorithm. We do not assume that $n$ is known to the algorithm. Let $t_i$ be the time that $i$ arrives. The weights of all edges incident to $i$, namely $w_{i,a}$ for all $a \in A$, are revealed to the algorithm at time $t_i$, and the algorithm has to assign $i$ to one of the advertisers at this point. This is an irrevocable decision and cannot be changed later. At the end of the algorithm, if more than one impression is assigned to a single advertiser, only the impression with the maximum weight is kept and the rest are not counted towards the total weight of the allocation. The objective is to maximize the total weight of the maximum edges assigned to each advertiser, i.e. $\sum_{a \in A} \max_{i \in I_a} w_{i,a}$ where $I_a$ is the set of impressions assigned to $a$.
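As a small illustration of the objective under free disposal, the following sketch computes the value of a finished allocation by keeping only the heaviest edge per advertiser. The function name `allocation_weight` and the triple-based input format are assumptions made for this example and are not part of the paper.

```python
from collections import defaultdict


def allocation_weight(assignments):
    """Total weight of an allocation under free disposal.

    `assignments` is an iterable of (impression, advertiser, weight) triples
    (a hypothetical input format). For every advertiser only the
    maximum-weight assigned impression counts towards the objective.
    """
    best = defaultdict(float)  # advertiser -> heaviest assigned weight, 0 if none
    for _impression, advertiser, weight in assignments:
        best[advertiser] = max(best[advertiser], weight)
    return sum(best.values())


# Two impressions land on advertiser "a"; only the heavier one is kept.
example = [("i1", "a", 3.0), ("i2", "a", 5.0), ("i3", "b", 2.0)]
total = allocation_weight(example)  # 5.0 from "a" plus 2.0 from "b"
```

Note that the impression of weight 3 contributes nothing once the impression of weight 5 is assigned to the same advertiser, which is exactly the effect the marginal gains defined next are meant to track.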
We present randomized online Algorithm (pseudocode provided as Algorithm 2.2) to solve this problem in Section 2. Following we present notations used in and some intuition behind the main ideas of this algorithm. We provide figures and examples to make our algorithm more comprehensible. To evaluate how much marginal value (increment in the objective function) can be achieved by assigning some impression to an advertiser at every point throughout the algorithm, one needs to keep track of the maximum weight assigned to so far by Algorithm . So we let denote the maximum weight assigned to by Algorithm up to time (including ) for any . Given that is a randomized algorithm, are random variables. Since assignments are done at times , we define to be zero. Now we define random variable to be the marginal gain of assigning impression to advertiser when arrives, i.e. where is defined to be for any real number , and is the arrival time of impression . We note that is a random variable that depends on the choices makes up to time (before arrives). In Figure 1, we show an execution of Algorithm . The arrows show the assignments of the impressions to advertisers which depended on a) the bipartite graph between impressions and advertisers, b) the arrival order of impressions, and c) the results of coin tosses during the execution of the algorithm. Impressions are arriving in order . In this execution, is equal to where is the arrival time of impression . The marginal gain of assigning impression to a, , is equal to .
We let be the expected total weight achieved by . Since only the maximum weight edges assigned to each advertiser contribute to the total weight of the allocation, is . Since is a randomized algorithm, is a random variable that depends on the coin tosses up to time . We can also interpret the total weight by letting where is how much the assignment of impression increases the total weight of the allocation in expectation. We note that is not a random variable; instead, it is the expected value of some random variable (the increment of the objective caused by the assignment of ).
We let be the maximum weight matching of the instance, and let be the advertiser impression is assigned in . We slightly abuse notation of , and let it be the weight of this allocation as well, i.e. . For the sake of analysis, we can add large enough number of dummy impressions/advertisers with edges of weight zero to all advertisers/impressions, so we can assume that all nodes (impressions and advertisers) are matched in the optimum solution. For the sake of Algorithm , we also add another dummy impression with weight for each for initialization. The competitive ratio of is defined to be the ratio .
For a random variable and some event , we define to be where is the expected value of conditioned on event . We use the following properties of operators and throughout this work. The proofs of these properties are simple and deferred to Section 5.
Lemma 1.1.
For any three real numbers , and , we have:
Lemma 1.2.
For any disjoint events , and a nonnegative random variable , we have: . If these disjoint events span the probability space , the inequality can be replaced by equality even if is not necessarily nonnegative.
1.4 When there is only one good candidate
It is folklore that algorithm Greedy achieves a competitive ratio of $1/2$, and its proof essentially shows that Greedy achieves at least by assigning impression . So we should look at as a baseline for assigning , and in case we can beat this baseline, we can hope to break the $1/2$ competitive ratio barrier. Although it is hard to find the optimum match , one can find the advertiser with maximum marginal gain for , i.e. . Assigning to this advertiser yields the possibility of beating the $1/2$ barrier by some amount proportional to the gap . Now let us show a situation in which we can exploit this possibility. Assume that upon arrival of , there is a gap between the maximum and second maximum . Formally assume for some constant that is at least where is . We note that all variables can be computed, including the maximum and second maximum values. In this special case, assigning to indeed helps us beat the $1/2$ barrier by some constant proportional to . The reason is that

either is not equal to which means we are beating the baseline by at least some multiplicative factor,

or which means our assignment is consistent with the optimum solution's assignment of impression . We provide simple arguments to show that consistency with the optimum solution also helps us beat the $1/2$ barrier by some amount proportional to how much we are consistent with .
Figure 2 illustrates a situation in which the maximum for impression is twice the second maximum for it.
The above discussion suggests that we can focus on cases in which there are at least two reasonable candidate advertisers (the two highest values) to match impression to. Next we show how having more than one option provides some flexibility to be adaptive and gain more value.
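The Greedy baseline discussed in this subsection can be sketched as follows. This is a minimal illustration, not the paper's pseudocode: the function name `greedy`, the dictionary-based input format, and the tie-breaking rule (first advertiser in the list wins) are assumptions of the example. The small instance at the end is one on which Greedy collects only half of the optimum.

```python
def greedy(advertisers, arrivals):
    """Online Greedy under free disposal (illustrative sketch).

    `advertisers` is a list of advertiser names and `arrivals` is a list of
    dicts mapping advertiser -> edge weight, one dict per impression in
    arrival order (hypothetical input format). Missing edges have weight 0.
    """
    max_w = {a: 0.0 for a in advertisers}  # heaviest weight held so far
    for weights in arrivals:
        # Marginal gain of each advertiser: (w - max_w)^+ = max(w - max_w, 0).
        gains = {a: max(weights.get(a, 0.0) - max_w[a], 0.0) for a in advertisers}
        # Assumed tie-breaking: max() returns the first maximizer in list order.
        best = max(advertisers, key=lambda a: gains[a])
        max_w[best] = max(max_w[best], weights.get(best, 0.0))
    return sum(max_w.values())


# The first impression ties between "a" and "b" and goes to "a"; the second
# impression only has an edge to "a", so its marginal gain is zero.
greedy_value = greedy(["a", "b"], [{"a": 1.0, "b": 1.0}, {"a": 1.0}])
optimum_value = 2.0  # first impression to "b", second impression to "a"
```

With this adversarial tie-breaking the ratio `greedy_value / optimum_value` is exactly one half, which is the baseline the rest of the paper sets out to beat.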
1.5 Adaptivity by exploiting multiple good candidates
We cannot always hope that there is a significant gap between the highest and second highest values of a newly arrived impression . We should have an alternative solution in case the two highest values are close. Having two good options helps us make a randomized decision and introduces some flexibility to the system. For instance, assume that we flip a coin and assign to each of the top two advertisers with probability $1/2$. We show how this kind of randomization might help by an example in Figure 3. All six edges in this instance have weight . The three impressions arrive from left to right. Impression is assigned to or each with probability $1/2$, and is assigned to or similarly. Since we are making randomized assignments, we can and will choose the top advertiser candidates for a newly arrived impression based on the expected value of instead of their current values, which depend on the coin tosses so far. For instance, impression has two options and each with expected marginal gain where could be either of or .
It is not hard to prove that the standard competitive ratio proof of Greedy goes through if we use expected values instead of their actual realized values. But using expected value has one main advantage. If we have two options with high expected values of , we can occasionally look at their realized values in a controlled way and exploit the gap between them to get a better result. For instance, in the example of Figure 3, there is a good chance () that no impression is assigned to at least one of or when arrives. So impression can exploit this opportunity and improve its gain from the average to by looking at the previous coin tosses and the allocations made so far. We call these types of extra marginal gains Adaptive Gains denoted by .
Since making an adaptive decision brings many new challenges, such as unexpected changes in the distribution of many values when conditioning on some specific event, we should be very careful about how and when we make an adaptive decision. We elaborate more in Subsection 1.6.
1.6 Making adaptive decisions
We start by showing an example of how making arbitrary adaptive decisions can lead us to complex situations in which we cannot achieve much extra marginal gain. In Figure 4, we may be tempted to assign impression to advertiser conditional on the event that was assigned to (and therefore not ). This might seem like a good idea since in the absence of , advertiser might have more value to offer. But suppose in assigning impression in the past, we also made an adaptive decision. This means that we did assign to for a reason. For instance, impression may have been assigned to before arrived, and therefore advertiser was more appealing to than advertiser . At the end, it is not clear whether conditioning on the event that is assigned to does not have a negative effect on let alone improving it.
These types of cascading effects are very natural and are one of the main challenges in proving competitive ratios for online matching problems. Our solution is simple. After choosing the two top choices and for impression , we flip a coin and choose each of the following three options with probability $1/3$. We assign to

its first choice nonadaptively (without looking at the actual values and previous coin tosses),

its second choice nonadaptively,

one of the two choices adaptively, by looking at some of the nonadaptive assignments made in the past, with the remaining probability.
We do not have the cascading effect with this change since our adaptive choice depends on some nonadaptive assignment. For example, conditioning on the event that is assigned to in the first case does not change the distribution of variables for any where is the arrival time of . Note that in the first two cases and we do not look at the results of the coin tosses, so any allocation in these two cases does not change the decisions of the algorithm up to this point.
Now suppose that in assigning , we end up in case and make an adaptive decision based on some event of the form: impression (that has arrived before ) was assigned in case . Conditioning on this event does not change the distribution of the allocations before impression arrived. The problem is that there could be impressions that arrive between and that interfere, change the distribution of and , and destroy the potential we are looking for. To resolve this issue, we need to introduce some locking policy for advertisers. In particular, whenever an impression makes an adaptive decision, it will lock its two choices and . We do this by setting and to in the algorithm. It is possible for future impressions to acquire possession of or , but they must pass some specific requirements to be eligible. This locking policy helps us control the behaviour of impressions between and and consequently yields the opportunity of gaining the extra adaptive gain.
2 Algorithm
We propose randomized algorithm whose pseudocode is presented as Algorithm 2.2. takes two input parameters and which we set later in the analysis section. To simplify the description of the algorithm (given its many variables and details), we focus on the main components and elaborate on each variable later.
At time , impression arrives. The main goal is to find two candidate advertisers and to achieve some average of their expected values and also exploit adaptivity to get some extra value if possible. We defer the computation of these expected values to Section 5 (Lemma 5.1) and assume that they are available. We should also note that since we can run multiple independent simulations of our randomized algorithm simultaneously, we can estimate expected values of any bounded random variable including any variable with arbitrary precision. So readers that are satisfied with accurate estimations of variables can skip Lemma 5.1. We start by setting a benchmark of maximum expected marginal gain, . Please note that if we assign to the advertiser that maximizes this expression, our expected gain will be . Therefore we should not consider options that provide value much less than this threshold, e.g. say less than .
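The remark about estimating expected values through independent simulations can be illustrated with a plain Monte Carlo average. The sketch below is only a toy: `estimate_expectation` and `toy_gain` are hypothetical names, and the toy random variable (a unit-weight edge whose advertiser is already occupied with probability one half) stands in for a marginal gain of the actual algorithm, which is not reproduced here.

```python
import random


def estimate_expectation(run_once, num_runs=20000, seed=0):
    """Average a bounded random variable over independent simulated runs.

    `run_once(rng)` must return one realization of the random variable using
    the supplied random number generator (hypothetical interface).
    """
    rng = random.Random(seed)
    total = 0.0
    for _ in range(num_runs):
        total += run_once(rng)
    return total / num_runs


def toy_gain(rng):
    """Toy marginal gain of a weight-1 edge to one advertiser.

    Assumption for illustration: an earlier impression of weight 1 was sent
    to this advertiser with probability 1/2, leaving gain (1 - 1)^+ = 0;
    otherwise the advertiser is empty and the gain is (1 - 0)^+ = 1.
    """
    occupied = rng.random() < 0.5
    return 0.0 if occupied else 1.0


estimate = estimate_expectation(toy_gain)  # close to the true expectation 1/2
```

Since the variable is bounded, standard concentration bounds imply that the error of such an estimate shrinks as the number of runs grows, which is the sense in which arbitrary precision is attainable.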
If we find two candidates and , and assign to each with probability, we get expected value. We define some notion of adaptive gain (set in lines of ) to capture how much extra value on top of this average is achievable if we assign adaptively. We will elaborate later on why we have such a complex formula for the definition of in line of Algorithm 2.2. We will also explain later why we enforce the condition:
(1) 
For now, it is important to note that when impression arrives, we can compute for any advertiser . This is a well defined computable quantity that does not depend on any of the coin tosses as elaborated later in Remark 2.1. Therefore unlike , variable is not a random variable. The second very important thing to note is that if we manage to find two candidates and and assign to them in lines , the value we achieve is at least . This is proved formally in Lemma 3.5. In line of our algorithm, we form a set of good candidates that satisfy condition 1 and also have . The latter condition is intentionally not consistent with the lower bound proved in Lemma 3.5 to compensate for the extra gain we need in the analysis for some of the impressions.
If there are at least two candidates in , we assign in lines . Otherwise we jump to line , and assign in lines . Let us focus on the former case.
2.1 Adaptive choices in lines
In lines , and are chosen as the top two candidates in terms of . We need to explain what each of the variables , , and represent to fully understand the logic of the assignment in lines .

Partner variables: Whenever an impression is assigned in lines to advertisers and , we make these two advertisers partners of each other. We set and . So right before arrives, represents the last advertiser that was coupled with , and some impression was assigned to these two in lines . The same holds for . For the sake of completeness of the algorithm, we initialize each with , indicating that advertisers have no partners at the start of the algorithm.

Color variables: The color of an advertiser can be either or . Color for an advertiser means no adaptive gain can be achieved on it. That is why we set to zero if is in line . We start by setting all colors to because at the beginning no advertiser has received any impression, so clearly there exists no opportunity of adaptive gain. But this is not the only case in which we cannot count on any adaptive gain from an advertiser. Let us explain this by an example. Suppose and became partners when impression was assigned. Later on arrives and we try to assign to and in lines . We claim that after the arrival of , we cannot gain the from advertiser until it partners with some other advertiser. Note that conditioning on the assignment of to (or the complement of this event) will affect the distribution of as well, since is also making an adaptive decision. This could potentially have cascading effects which we want to avoid. Therefore when is assigned to and in lines we change the color of their partners to in line to make sure that the values of and are set to zero until they partner with some other advertiser. One special case is when and were partners of each other before arrived, in which case their colors should remain (this is taken care of in the if condition of line ).
With this definition of color variable, an advertiser is if since the last time it was partnered with some other advertiser in line , its partner, , was not selected as a candidate to be assigned to in lines .

variables: Whenever we assign an impression to and in lines , we set and to indicating that was the last impression that chose them for a possibly adaptive assignment. We say possibly because in lines , we may or may not assign adaptively depending on the coin tosses and some other conditions.
We are ready to explain the allocation steps in lines . The allocation itself occurs in lines . All the updates of variables , and in lines are to make sure that we keep the values updated and therefore we can use them in the if condition of line . We basically use as a filter to make sure that is set to zero for any advertiser . We also use variable in some other lines but and variables are not used in any other part of the algorithm except the color filtering of line . We have already explained how we keep , and variables updated in lines . So we focus on the allocation in lines .
Variable is picked uniformly at random from range . We start by allocation rules of lines and which are simpler to explain. With probability (associated with ), we assign to . We note that this step is done without looking at any of the coin tosses in the past. So it is a nonadaptive allocation. It will be useful to save for future steps that we have made a nonadaptive assignment of to . Therefore we set variable to . Formally means that the last time some impression chose as one of its two candidates to match in lines , a nonadaptive choice was made and the impression was assigned to . Setting to is in some sense complement of this case. means the last time an impression chose to match in lines , it made a nonadaptive choice, however the impression was matched to and not .
With another probability of (associated with ), impression is assigned to nonadaptively, and the Mark variables are set accordingly.
With the remaining probability (associated with ), impression is assigned adaptively. We first note that and are both set to indicating that an adaptive choice was made here. To simplify the description of the algorithm, assume . This means is set to in line . The assignment of is conditioned on the allocation of where is the last impression that chose to match in lines . We note that is essentially right before arrives. This is how we make our assignment conditioned on some past events. If was matched nonadaptively to , we make an adaptive choice here, and assign to . We note that this case is associated with . This intuitively makes sense since knowing was assigned to decreases the potential gain of assigning to therefore we should consider as a better option. Similarly, we assign to if is equal to . The last case is when is . Since we do not want to deal with cascading conditional probability events, we do not make an adaptive choice in this case. We assign to either of or each with probability using a separate coin toss independent of . We do the same if is zero which is just a corner case. We note that means is also zero, therefore we do not need to achieve any extra gain, and just achieving the average suffices which can be done with a simple assignment to and .
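The allocation rule described in this subsection can be summarized by the sketch below. It is a hedged illustration rather than the paper's Algorithm 2.2: the function name `choose_advertiser`, the three mark labels, the cut points 1/3 and 2/3 on the random number `r`, and the choice to condition on the mark of the first candidate are all assumptions made for the example. The bookkeeping of the partner, color and index variables is omitted.

```python
import random

# Hypothetical labels for the three states of a Mark variable.
ASSIGNED = "assigned"          # last choice was nonadaptive and went to this advertiser
NOT_ASSIGNED = "not_assigned"  # last choice was nonadaptive and went to its partner
ADAPTIVE = "adaptive"          # last choice was adaptive


def choose_advertiser(r, a1, a2, mark_a1, adaptive_gain, rng):
    """Pick a1 or a2 for the arriving impression.

    `r` is uniform in [0, 1); `mark_a1` is the mark left on a1 by the last
    impression that chose it; `adaptive_gain` is the precomputed adaptive
    gain. Returns (chosen advertiser, new mark of a1, new mark of a2).
    """
    if r < 1 / 3:
        # Nonadaptive assignment to the first candidate.
        return a1, ASSIGNED, NOT_ASSIGNED
    if r < 2 / 3:
        # Nonadaptive assignment to the second candidate.
        return a2, NOT_ASSIGNED, ASSIGNED
    # Remaining probability: possibly adaptive assignment.
    if adaptive_gain > 0 and mark_a1 == ASSIGNED:
        chosen = a2  # a1 already received the earlier impression
    elif adaptive_gain > 0 and mark_a1 == NOT_ASSIGNED:
        chosen = a1  # a1 was left empty by the earlier impression
    else:
        # Earlier choice was adaptive, or there is no adaptive gain:
        # use a fresh coin that is independent of r.
        chosen = a1 if rng.random() < 0.5 else a2
    return chosen, ADAPTIVE, ADAPTIVE


rng = random.Random(0)
nonadaptive = choose_advertiser(0.1, "a1", "a2", ADAPTIVE, 1.0, rng)
adaptive = choose_advertiser(0.9, "a1", "a2", ASSIGNED, 1.0, rng)
```

The point of the design is visible in the last branch: an adaptive decision only ever conditions on a mark that records a nonadaptive assignment, so conditioning never chains through earlier adaptive decisions.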
2.2 When there are not enough choices to be adaptive: lines
If set has at most one advertiser, we have no choice but to make some nonadaptive assignments. The high level idea is that we want to choose at most two candidate advertisers to match and ensure that both of the following conditions are met:

The advertiser with maximum expected gain is chosen as one of the candidates.

If is not empty, and the only advertiser in has expected gain at least , we want it to be chosen.
We note that with the definition of and in lines and , their union consists of all advertisers with expected gain at least . Therefore if the condition of line (the if condition) holds, the advertiser is the only advertiser meeting this threshold. We assign to this advertiser and will not consider a second option. Otherwise, we pick the only advertiser in (if any exists) as a candidate and pick one or two other advertisers with the highest expected gains to form two candidates and . We assign to each of and with probability. The only remaining detail is variable , which keeps the expected marginal gains assigned to since the last time was chosen as a candidate in lines . In particular, in lines , we reset and to zero, and in lines we increment it according to the expected gain that is achieved by assigning to it. We note that the coefficient in line is consistent with the fact that is assigned to each of and with probability, and that this coefficient is in line . This concludes the description of Algorithm 2.2.
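Algorithm 2.2 (pseudocode listing; the symbols of the listing did not survive, only its structure and comments). The listing initializes the per-advertiser variables and then processes each arriving impression: it computes the candidate set, draws a uniformly random real number, and branches on whether there are enough choices to exploit adaptivity. In that branch it updates the partner and color variables and then either makes an adaptive decision (assigning to one of the two candidates depending on the Mark variable, or to each with some probability, and setting both Mark variables accordingly) or assigns to one of the two candidates nonadaptively and sets the Mark variables. Otherwise (a case in which one of the quantities used in the first branch does not exist) it assigns the impression nonadaptively, either to the only qualifying advertiser or to one of two candidates.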
2.3 Intuition behind , constraint 1 and the analysis
Variable represents how much extra marginal gain we can achieve by making our allocation adaptive. Suppose Algorithm chooses as one of the two candidate advertisers for impression and assigns in lines . Also suppose that is the top choice in set , i.e. and is also equal to . If we assign to without any knowledge of the past allocations, we achieve value. But we make an adaptive choice with probability . The adaptive choice for impression depends on the value of . In particular, in the adaptive case, we assign to if . We note that indicates that was assigned to (and not ) in a nonadaptive manner. Knowing that was not assigned to can increase the expected of up to the amount of gain was receiving from . Formally the expected value of conditioned on can increase by some value proportional to . This is why we have as the first term in the definition of . The coefficients are not very important here and they are the result of how we are making each decision with a certain probability. All details are provided in the proof of Lemma 3.5.
One simple case that makes this point very clear is when impressions and arrive consecutively. In this case, knowing that is not assigned to will indeed increase the random variable by at least in expectation if . First of all, we know that impression is assigned to with probability at least (look at the nonadaptive allocation in lines ). So conditioning on not being assigned to will introduce at least extra value available for future impressions, including , to exploit on advertiser . We should also note that this value is fully realized only if the weight of the edge from to is at least as high as the weight of the edge from to . For example, in Figure 5, if is much larger than , knowing that is not assigned to will not help us achieve extra value proportional to or .
The extra gain for will be limited to its weight (possibly much lower than ). This is the reason we have the negative term in line to discount for cases that has a smaller weight to than . Our adaptive choices are conditioned on nonadaptive cases like or . Therefore we can limit the potential cascading effects of these events, and prove that the extra is indeed achievable. The only remaining assignments that might interfere are the impressions that choose to match in lines between the times that and arrive. In Figure 5, the two impressions and have chosen as one of their candidate advertiser(s) to match in lines . This interference is tracked by variable (the aggregate values attributed to ). We discount this interference in the definition of by deducting some term.
Remark 2.1.
We want to elaborate on the coin tosses and what they can change throughout the algorithm. First of all, note that , the most important component of the algorithm, is the expectation of some random variable. So by definition it does not depend on the coin tosses. We also know that the coin tosses (random number ) only affect our decisions in lines and also . By induction on time , we can show that variables and and also set do not depend on the coin tosses. To clarify, note that at time depends on values of time ; set at time depends on at time , and at time depends on set at time , which makes the induction feasible. So these quantities are not random variables and can be determined without knowledge of the coin tosses.
So far we have explained why the definition of represents the potential extra gain we could achieve on top of the values. But one important question remains: why should we enforce Constraint 1? The answer to this question sheds some light on how our analysis works, so we explain it here before going through the details in Section 3.
For every impression , the threshold is how much we should achieve to get the competitive ratio. So we have to beat this bound by some constant factor to be able to beat the