Online Bipartite Matching with Decomposable Weights

# Online Bipartite Matching with Decomposable Weights

## Abstract

We study a weighted online bipartite matching problem: the input is a weighted bipartite graph in which one side (the offline vertices) is known beforehand and the vertices of the other side arrive online. The goal is to match the online vertices, as they arrive, to offline vertices so as to maximize the sum of weights of edges in the matching. If assignments to offline vertices cannot be changed, no bounded competitive ratio is achievable. We study the weighted online matching problem with free disposal, where offline vertices can be assigned multiple times, but only get credit for the maximum weight edge assigned to them over the course of the algorithm. For this problem, the greedy algorithm is 0.5-competitive, and determining whether a better competitive ratio is achievable is a well-known open problem.

We identify an interesting special case where the edge weights are decomposable as the product of two factors, one corresponding to each end point of the edge. This is analogous to the well-studied related machines model in the scheduling literature, although the objective functions are different. For this case of decomposable edge weights, we design a 0.5664-competitive randomized algorithm in complete bipartite graphs. We show that such instances with decomposable weights are non-trivial by establishing upper bounds of 0.618 for deterministic and 0.8 for randomized algorithms.

A tight competitive ratio of $1-1/e$ was known previously for both the 0-1 case and the case where edge weights depend on the offline vertices only, but for these cases, reassignments cannot change the quality of the solution. Beating 0.5 for weighted matching where reassignments are necessary has been a significant challenge. We thus give the first online algorithm with competitive ratio strictly better than 0.5 for a non-trivial case of weighted matching with free disposal.


## 1 Introduction

In recent years, online bipartite matching problems have been intensely studied. Matching itself is a fundamental optimization problem with several applications, such as matching medical students to residency programs, matching men and women, matching packets to outgoing links in a router and so on. There is a rich body of work on matching problems, yet there are basic problems we don’t understand, and we study one such question in this work. The study of the online setting goes back to the seminal work of Karp, Vazirani and Vazirani [26], who gave an optimal $(1-1/e)$-competitive algorithm for the unweighted case. Here the input is a bipartite graph in which one side (the offline vertices) is known beforehand and the vertices of the other side arrive online. The goal of the algorithm is to match the online vertices, as they arrive, to offline vertices so as to maximize the size of the matching.

In the weighted case, edges have weights and the goal is to maximize the sum of weights of edges in the matching. In the application of assigning ad impressions to advertisers in display advertisement, the weights could represent the (expected) value of an ad impression to an advertiser, and the objective function for the maximum matching problem encodes the goal of assigning ad impressions to advertisers so as to maximize total value. If assignments to offline vertices cannot be changed and if edge weights depend on the online node to which they are adjacent, it is easy to see that no competitive ratio bounded away from 0 is achievable.

Feldman et al. [18] introduced the free disposal setting for weighted matching, where offline vertices can be assigned multiple times, but only get credit for the maximum weight edge assigned to them over the course of the algorithm. (On the other hand, an online vertex can only be assigned at the time that it arrives, with no later reassignments permitted.) [18] argues that this is a realistic model for assigning ad impressions to advertisers. The greedy algorithm is 0.5-competitive for the online weighted matching problem with free disposal. They study the weighted matching problem with capacities: each offline vertex is associated with a capacity and gets credit for that many of the largest edge weights among the online vertices assigned to it. They designed an algorithm with competitive ratio approaching $1-1/e$ as the capacities approach infinity. Specifically, if all capacities are at least $k$, their algorithm gets competitive ratio $1-1/e_k$ where $e_k = (1+1/k)^k$. If all capacities are 1, their algorithm is 0.5-competitive.

Aggarwal et al. [1] considered the online weighted bipartite matching problem where edge weights depend only on the offline end point, i.e. each offline vertex has a weight and every edge incident on it has that weight. This is called the vertex weighted setting. They designed a $(1-1/e)$-competitive algorithm. Their algorithm can be viewed as a generalization of the Ranking algorithm of [26].

It is remarkable that some basic questions about a fundamental problem such as matching are still open in the online setting. Our work is motivated by the following tantalizing open problem: is it possible to achieve a competitive ratio better than 0.5 for weighted online matching? Currently no upper bound better than $1-1/e$ is known for the setting of general weights; in fact this bound holds even for the setting of 0-1 weights. On the other hand, no algorithm with competitive ratio better than 0.5 (achieved by the greedy algorithm) is known for this problem. By the results of [18], the case where the capacities are all 1 seems to be the hardest case, and this is what we focus on.

### 1.1 Our results

Our algorithm uses a now standard randomized doubling technique [8, 20, 12, 23]; however the analysis is novel and non-trivial. We perform a recursive analysis where each step proceeds as follows: We lower bound the profit that the algorithm derives from the fastest machine (i.e. the load of the largest job placed on it) relative to the difference between two optimum solutions - one corresponding to the original instance and the other corresponding to a modified instance obtained by removing this machine and all the jobs assigned to it. This is somewhat reminiscent of, but different from the local ratio technique used to design approximation algorithms. Finally, to exploit the randomness used by the algorithm we need to establish several structural properties of the worst case sequence of jobs – this is a departure from previous applications of this randomized doubling technique. While all previous online matching algorithms were analyzed using a local, step-by-step analysis, we use a global technique, i.e. we reason about the entire sequence of jobs at once. This might be useful for solving the case of online weighted matching for general weights. The algorithm and analysis is presented in Section 3 and an outline of the analysis is presented in Section 3.1.

A priori, it may seem that the setting of decomposable weights ought to be a much easier case of weighted online matching since it does not capture the well-studied setting of 0-1 weights. We show that such instances with decomposable weights are non-trivial by establishing an upper bound of 0.618 on the competitive ratio of deterministic algorithms (Section 4) and an upper bound of 0.8 on the competitive ratio of randomized algorithms (Section 5). The deterministic upper bound constructs a sequence of jobs that is the solution to a certain recurrence relation. Crucial to the success of this approach is a delicate choice of parameters to ensure that the solution of the recurrence is oscillatory (i.e. the roots are complex). In contrast to the setting with capacities, for which a deterministic algorithm with competitive ratio approaching $1-1/e$ exists [18], our upper bound of 0.618 for deterministic algorithms shows that no such competitive ratio can be achieved for the decomposable case with unit capacities. Note that the upper bound of $1-1/e$ for the unweighted case [26] is for randomized algorithms and does not apply to the setting of decomposable weights that we study here.

In contrast to the vertex weighted setting (and the special case of 0-1 weights), where reassignments to offline vertices cannot improve the quality of the solution, any algorithm for the decomposable weight setting must necessarily exploit reassignments in order to achieve a competitive ratio bounded away from 0. For this class of instances, we give an upper bound approaching 0.5 for the competitive ratio of the greedy algorithm. This shows that for decomposable weights greedy’s performance cannot be better than for general weights, where it is 0.5-competitive (Section 2).

### 1.2 Related work

Goel and Mehta [21] and Birnbaum and Mathieu [10] simplified the analysis of the Ranking algorithm considerably. Devanur et al. [15] recently gave an elegant randomized primal-dual interpretation of [26]; their framework also applies to the generalization to the vertex weighted setting by [1]. Haeupler et al. [24] studied online weighted matching in the stochastic setting where the online vertices are drawn from a known distribution. The stochastic setting had been previously studied in the context of unweighted bipartite matching in a sequence of papers [19, 28]. Recent work has also studied the random arrival model (for unweighted matching), where the order of arrival of the online vertices is assumed to be a random permutation: in this setting, Karande et al. [25] and Mahdian and Yan [27] showed that the Ranking algorithm of [26] achieves a competitive ratio better than $1-1/e$. A couple of recent papers analyze the performance of a randomized greedy algorithm and an analog of the Ranking algorithm for matching in general graphs [32, 22]. Another recent paper introduces a stochastic model for online matching where the goal is to maximize the number of successful assignments (where success is governed by a stochastic process) [30].

A related model allowing cancellation of previously accepted online nodes was studied in [13, 7, 4] and optimal deterministic and randomized algorithms were given. In their setting the weight of an edge depends only on the online node. Additionally in their model they decide in an online fashion only which online nodes to accept, not how to match these nodes to offline nodes. If a previously accepted node is later rejected, a non-negative cost is incurred. Since the actual matching is only determined after all online nodes have been seen, their model is very different from ours: Even if the cost of rejection of a previously accepted node is set to 0, the key difference is that they do not commit to a matching at every step and the intended matching can change dramatically from step to step. Thus, it does not solve the problem that we are studying.

A related problem that has been studied is online matching with preemption [29, 3, 16]. Here, the edges of a graph arrive online and the algorithm is required to maintain a subset of edges that form a matching. Previously selected edges can be rejected (preempted) in favor of newly arrived edges. This problem differs from the problem we study in two ways: (1) the graph is not necessarily bipartite, and (2) edges arrive one by one. In our (classic) case, vertices arrive online and all edges incident to a newly arrived vertex are revealed when that vertex arrives.

Another generalization of online bipartite matching is the Adwords problem [31, 14]. In addition, several online packing problems have been studied with applications to the Adwords and Display Advertisement problem [11, 21, 2].

### 1.3 Notation and preliminaries

We consider the following variant of the online bipartite matching problem. The input is a complete bipartite graph along with two weight functions, one on the offline vertices and one on the online vertices. The weight of each edge is the product of the weights of its two end points. At the beginning, only the offline side and its weight function are given to the algorithm. Then, the online vertices arrive one by one. When a new vertex arrives, its weight is revealed and the algorithm has to match it to an offline vertex. At the end, the reward of each offline vertex is its own weight times the maximum weight of an online vertex assigned to it. The goal of the algorithm is to maximize the sum of the rewards. To simplify the presentation we will call the offline vertices machines and the online vertices jobs. The weight of a machine will be called the speed of the machine and the weight of a job is called the size of the job. Thus, the goal of the online algorithm is to assign jobs to machines. However, we are not studying the “classic” variant of the scheduling problem since we are using a different optimization function, motivated by display advertisements.
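As a minimal sketch of the objective just described, the following Python function evaluates an assignment under free disposal with decomposable weights. The function name and the dict/list encoding of machines and jobs are illustrative assumptions, not notation from the paper.

```python
from collections import defaultdict


def free_disposal_reward(speeds, sizes, assignment):
    """Total reward of an assignment with decomposable weights.

    speeds:     dict mapping machine id -> speed (assumed encoding)
    sizes:      list of job sizes in arrival order
    assignment: list of machine ids, assignment[j] is the machine of job j
    Each machine is credited only for the largest job it ever received.
    """
    largest = defaultdict(float)  # largest job size seen per machine
    for size, machine in zip(sizes, assignment):
        largest[machine] = max(largest[machine], size)
    # Edge weight is speed * size, so the reward is speed * largest size.
    return sum(speeds[m] * w for m, w in largest.items())
```

For example, with speeds `{"a": 2.0, "b": 1.0}`, job sizes `[3, 5, 4]` and assignment `["a", "a", "b"]`, machine `a` is credited only for the job of size 5 and machine `b` for the job of size 4.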

## 2 Upper bound for the greedy algorithm

We begin by addressing an obvious question, namely how well a greedy approach would solve our problem, and we use the proof to provide some intuition for our algorithm in the next section. We analyze here the following simple greedy algorithm: when a job arrives, the algorithm computes for every machine the difference between the weight of the edge joining the machine to the new job and the weight of the edge joining the machine to the job currently credited to it (or 0 if the machine is empty). If this difference is positive for at least one machine, the job is assigned to a machine with maximum difference.

###### Theorem 1.

The competitive ratio of the greedy algorithm is at most for any .

###### Proof.

Consider the following instance. consists of a vertex with and vertices with . consists of the following vertices arriving in the same order where . We will prove by induction that all vertices are assigned to . When arrives, nothing is assigned so it is assigned to . Assume that all the first vertices are assigned to when arrives. The gain by assigning to is . The gain by assigning to some is . Thus, the algorithm can assign to . The total reward of the algorithm is . The optimal solution is to assign to and the rest to ’s, getting . Thus, the competitive ratio is at most . ∎
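The behaviour described in the proof can be reproduced numerically. The sketch below simulates greedy on one instance of the kind the proof describes: a fast machine of speed 1, `n` slow machines of speed `eps`, and job sizes growing by a factor `1/(1 - eps)`. These concrete values are an assumption made here for illustration (they make the gain on the fast machine tie with the gain on a slow machine), and ties are broken in favour of the fast machine. Exact rational arithmetic is used so that the ties are not disturbed by rounding.

```python
from fractions import Fraction


def greedy(speeds, jobs):
    """Greedy with free disposal: each job goes to a machine of maximum
    marginal gain, if that gain is positive. Returns the final reward."""
    largest = [0] * len(speeds)  # largest job currently credited per machine
    for size in jobs:
        gains = [s * (size - b) for s, b in zip(speeds, largest)]
        i = max(range(len(speeds)), key=lambda k: gains[k])  # first maximum wins ties
        if gains[i] > 0:
            largest[i] = size
    return sum(s * b for s, b in zip(speeds, largest))


def hard_instance(n, eps):
    """Hypothetical instance: machine 0 is fast, machines 1..n are slow."""
    speeds = [Fraction(1)] + [eps] * n
    growth = 1 / (1 - eps)
    jobs = [growth ** i for i in range(1, n + 1)]  # increasing job sizes
    return speeds, jobs


def ratio_on_hard_instance(n, eps):
    speeds, jobs = hard_instance(n, eps)
    alg = greedy(speeds, jobs)
    # Offline solution from the proof: last job on the fast machine,
    # every other job on its own slow machine.
    opt = jobs[-1] + eps * sum(jobs[:-1])
    return alg / opt
```

On this instance every job lands on the fast machine, so greedy is credited only for the last job, and the ratio tends to `1 / (2 - eps)` as `n` grows.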

The instance used in the proof above suggests some of the complications an algorithm has to deal with in the setting of decomposable weights: in order to have competitive ratio bounded away from 0.5, an online algorithm must necessarily place some jobs on the slow machines. In fact it is possible to design an algorithm with competitive ratio bounded away from 0.5 for the specific set of machines used in this proof (for any sequence of jobs). The idea is to ensure that a job is placed on the fast machine only if its size exceeds the size of the largest job currently on the fast machine by a fixed multiplicative factor (for an appropriately chosen value of this factor). Such a strategy works for any set of machines consisting of one fast machine and several slow machines of the same speed. However, we do not know how to generalize this approach to an arbitrary set of machines. Still, this strategy (i.e. ensuring that jobs placed on a machine increase in size geometrically) was one of the motivations behind the design of the randomized online algorithm to be presented next.

## 3 Randomized algorithm

We now describe our randomized algorithm, which uses a parameter $c$ we will specify later. The algorithm picks values $r_i \in [0,1)$ uniformly at random, independently for each machine $i$. Each job of weight $w$ considered by machine $i$ is placed in the unique interval $[c^{k+r_i}, c^{k+1+r_i})$ containing $w$, where $k$ ranges over all integers. When a new job $w$ arrives, the algorithm checks the machines in the order of decreasing speed (with ties broken in an arbitrary but fixed way). For machine $i$ it first determines the unique interval into which $w$ falls, which depends on its choice of $r_i$. If the machine currently does not have a job in this or a bigger interval (with larger $k$), $w$ is assigned to $i$ and the algorithm stops; otherwise the algorithm checks the next machine.
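The procedure above can be sketched in a few lines of Python. This is an illustrative sketch, assuming interval boundaries of the form $c^{k+r_i}$ with each offset $r_i$ drawn uniformly from $[0,1)$, which is one reading of the description; the function name and argument layout are hypothetical.

```python
import math
import random


def randomized_doubling(speeds, jobs, c, rng=None):
    """Assign jobs online using randomly shifted geometric intervals.

    speeds: list of machine speeds; jobs: positive job sizes in arrival
    order; c > 1 is the interval ratio. Returns the total reward.
    """
    rng = rng or random.Random()
    # Decreasing speed; sorted() is stable, so ties are broken by index.
    order = sorted(range(len(speeds)), key=lambda i: -speeds[i])
    offsets = [rng.random() for _ in speeds]  # one random shift per machine
    top = [None] * len(speeds)   # highest interval index held per machine
    largest = [0.0] * len(speeds)
    for w in jobs:
        for i in order:
            # w lies in [c^(k + r_i), c^(k + 1 + r_i)) for this k.
            k = math.floor(math.log(w, c) - offsets[i])
            if top[i] is None or k > top[i]:
                top[i] = k
                largest[i] = w  # a higher interval means a larger job
                break           # job placed; stop scanning machines
    return sum(s * b for s, b in zip(speeds, largest))
```

A job that falls in an interval already occupied on a machine (or in a lower one) is passed on to the next machine, so the jobs kept by any single machine lie in strictly increasing intervals.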

The following function arises in our analysis:

###### Definition 2.

Define
where and is the Lambert W function (i.e. inverse of ).

We will prove the following theorem:

###### Theorem 3.

For , the randomized algorithm has competitive ratio . In particular, for , the randomized algorithm has a competitive ratio .

### 3.1 Analysis Outline

We briefly outline the analysis strategy before describing the details. An instance of the problem consists of a set of jobs and a set of machines. The (offline) optimal solution to an instance is obtained by ordering machines from fastest to slowest, ordering jobs from largest to smallest and assigning the $i$th largest job to the $i$th fastest machine. Say the machines are numbered $1, \ldots, n$, from fastest to slowest. Let $\mathrm{OPT}_i$ denote the value of the optimal solution for the instance seen by the machines from $i$ onwards, i.e. the instance consisting of machines $i, \ldots, n$ and the set of jobs passed by the $(i-1)$st machine to the $i$th machine in the online algorithm. Then $\mathrm{OPT}_1 = \mathrm{OPT}$, the value of the optimal solution for the original instance. Even though we defined $\mathrm{OPT}_i$ to be the value of the optimal solution, we will sometimes use $\mathrm{OPT}_i$ to denote the optimal assignment; the meaning will be clear from context. Define $\mathrm{OPT}_{n+1}$ to be 0. For $i \ge 2$, $\mathrm{OPT}_i$ is a random variable that depends on the random values picked by the algorithm for machines $1, \ldots, i-1$. In the analysis, we will define random variables $\Delta_i$ such that $\mathrm{OPT}_i - \mathrm{OPT}_{i+1} \le \Delta_i$ (see Lemma 4 later). Let $A_i$ denote the profit of the online algorithm derived from machine $i$ (i.e. the size of the largest job assigned to machine $i$ times the speed of machine $i$). Let $A = \sum_{i=1}^{n} A_i$ be the value of the solution produced by the online algorithm. We will prove that for $1 \le i \le n$,

$$\mathbb{E}[A_i] \ge \alpha\,\mathbb{E}[\mathrm{OPT}_i - \mathrm{OPT}_{i+1}] \tag{1}$$

for a suitable choice of $\alpha$. The expectations in (1) are taken over the random choices of machines $1, \ldots, i$. Note that $\mathrm{OPT}_i - \mathrm{OPT}_{i+1}$ is a random variable, but the sum of these quantities for $1 \le i \le n$ is $\mathrm{OPT}_1 - \mathrm{OPT}_{n+1} = \mathrm{OPT}$, a deterministic quantity. Summing up (1) over $i$, we get $\mathbb{E}[A] \ge \alpha\,\mathrm{OPT}$, proving that the algorithm gives an $\alpha$-approximation.

Inequality (1) applies to a recursive application of the algorithm to the subinstance consisting of machines $i, \ldots, n$ and the jobs passed from machine $i-1$ to machine $i$. The subinstance is a function of the random choices made by the first $i-1$ machines. We will prove that for any instance of the random choices made by the first $i-1$ machines,

$$\mathbb{E}[A_i] \ge \alpha\,\mathbb{E}[\mathrm{OPT}_i - \mathrm{OPT}_{i+1}] \tag{2}$$

Here, the expectation is taken over the random choice of machine $i$. (2) immediately implies (1) by taking expectation over the random choices made by the first $i-1$ machines.

We need to establish (2). In fact, it suffices to do this for $i = 1$, and the proof applies to all values of $i$ since (2) is a statement about a recursive application of the algorithm. Wlog, we normalize so that the fastest machine has speed 1 and the largest job is $c$. Note that this is done by simply multiplying all machine speeds by a suitable factor and all job sizes by a suitable factor; both the LHS and the RHS of (2) are scaled by the same quantity.
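The offline optimum used throughout this outline (fastest machine gets the largest job, and so on) is easy to compute. The sketch below implements it and, as a sanity check on small inputs, compares it with exhaustive search over all assignments under free disposal. Both function names are illustrative assumptions.

```python
from itertools import product


def offline_opt(speeds, sizes):
    """Sort-and-pair optimum: i-th largest job on the i-th fastest machine."""
    fast_first = sorted(speeds, reverse=True)
    large_first = sorted(sizes, reverse=True)
    # zip stops at the shorter list: surplus jobs or machines earn nothing.
    return sum(s * w for s, w in zip(fast_first, large_first))


def brute_force_opt(speeds, sizes):
    """Try every assignment of jobs to machines; each machine is credited
    only for the largest job it receives (free disposal)."""
    best = 0
    for assignment in product(range(len(speeds)), repeat=len(sizes)):
        credited = [0] * len(speeds)
        for size, m in zip(sizes, assignment):
            credited[m] = max(credited[m], size)
        best = max(best, sum(s * w for s, w in zip(speeds, credited)))
    return best
```

The brute-force routine is exponential in the number of jobs and is only meant for checking tiny instances.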

In order to compare the optimum with the profit of the algorithm, we decompose the instance into a convex combination of simpler threshold instances in Lemma 5. Here, the speeds are either all the same or take only two different values, 0 and 1. It suffices to compare the profit of the algorithm to OPT on such threshold instances.

Intuitively, if there are so few fast machines that even a relatively large job (job of weight at least 1) got assigned to a slow machine in OPT, then the original instance is mostly comparable to the threshold instance where only a few machines have speed 1 and the rest have speed 0. Even if the fastest machine gets jobs assigned to machines of speed 0 in OPT, this does not affect the profit of the algorithm relative to OPT because OPT does not profit from these jobs either. Thus we only care about jobs of weight at least 1. Because a single machine can get at most two jobs of value in the range $[1, c]$, handling this case only requires analyzing at most two jobs. The proof for this case is contained in Lemma 7.

On the other hand, if there are a lot of fast machines so that all large jobs are assigned to fast machines in OPT, then the original instance is comparable to the threshold instance where all machines have speed 1. In this case, the fastest machine can get assigned many jobs that all contribute to OPT. However, because all speeds are the same, we can deduce the worst possible sequence of jobs: after the first few jobs, all other jobs have weights forming a geometric sequence. The rest of the proof is to analyze the algorithm on this specific sequence. The detailed proof is contained in Lemma 9.

The proofs of both Lemmata 7 and 9 use the decomposable structure of the edge weights.

### 3.2 Analysis Details

Recall that $\mathrm{OPT}_1$ is the value of the optimal solution for the instance, and $\mathrm{OPT}_2$ is the value of the optimal solution for the subinstance seen by machine 2 onwards. Assume wlog that all job sizes are distinct (by perturbing job sizes infinitesimally). For $x \ge 0$, consider the largest job of size at most $x$, if such a job exists. Let $s(x)$ be the speed of the machine to which the optimal solution assigns this job, or 0 if no such job exists. In particular, if there is a job of size $x$ then $s(x)$ is the speed of the machine in the optimal solution that this job is assigned to. Note that $s(x)$ is monotone increasing with $x$. We refer to the function $s$ as the speed profile. Note that $s$ is not a random variable. Let the assignment sequence $\mathbf{w} = (w, w_1, w_2, \ldots)$ denote the set of jobs assigned to the fastest machine by the algorithm, where $w > w_1 > w_2 > \cdots$. Let $\max(\mathbf{w})$ denote the maximum element in the sequence $\mathbf{w}$, i.e. $\max(\mathbf{w}) = w$. In Lemma 4, we bound $\mathrm{OPT}_1 - \mathrm{OPT}_2$ by a function that depends only on $\mathbf{w}$, $s$, and $c$. Such a bound is possible because any job can be assigned to any machine, i.e. the graph is a complete graph. The value we take for the aforementioned random variable $\Delta_1$ turns out to be exactly this bound.

###### Lemma 4.

$$\mathrm{OPT}_1 - \mathrm{OPT}_2 \le c - (c - w)\,s(w) + \sum_{k \ge 1} w_k \cdot s(w_k).$$

###### Proof.

Consider the instances corresponding to $\mathrm{OPT}_1$ and $\mathrm{OPT}_2$. The second instance is obtained from the first by removing the fastest machine and the set of jobs that are assigned to the fastest machine by the algorithm. Let us consider changing the first instance into the second in two steps: (1) Remove the fastest machine and the largest job $w$ assigned by the algorithm to the fastest machine. (2) Remove the jobs $w_1, w_2, \ldots$. For each step, we will bound the change in the value of the optimal solution by exhibiting a feasible solution for the resulting instance and computing its value; this will be a lower bound for $\mathrm{OPT}_2$.

First we analyze Step 1: $\mathrm{OPT}_1$ assigns the largest job $c$ to the fastest machine, contributing $c$ to its value. The algorithm assigns $w$ to the fastest machine instead of $c$. In $\mathrm{OPT}_1$, $w$ was assigned to a machine of speed $s(w)$. When we remove $w$ and the fastest machine from the instance, one possible assignment to the resulting instance is obtained by placing $c$ on the machine of speed $s(w)$. The value of the resulting solution is lower by exactly $c - (c - w)\,s(w)$.

Next, we analyze Step 2: Jobs $w_k$, $k \ge 1$, were assigned to machines of speeds $s(w_k)$ in $\mathrm{OPT}_1$. When we remove these jobs, one feasible assignment for the resulting instance is simply not to assign any jobs to the machines that held them, and keep all other assignments unchanged. The value of the solution drops by exactly $\sum_{k \ge 1} w_k \cdot s(w_k)$.

Thus we exhibited a feasible solution to the second instance of value $V$ where

$$\mathrm{OPT}_1 - V = c - (c - w)\,s(w) + \sum_{k \ge 1} w_k \cdot s(w_k).$$

But $\mathrm{OPT}_2 \ge V$. Hence, the lemma follows. ∎

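We define the random variable $\Delta_1$, a function of the assignment sequence $\mathbf{w}$ and the speed profile $s$, to be

$$\Delta_1(\mathbf{w}, s) = c - (c - w)\,s(w) + \sum_{k \ge 1} w_k \cdot s(w_k).$$

As defined, $\mathrm{OPT}_1 - \mathrm{OPT}_2 \le \Delta_1(\mathbf{w}, s)$. We note that even though $\mathrm{OPT}_1$ and $\mathrm{OPT}_2$ are functions of all the jobs in the instance, $\Delta_1$ only depends on the subset of jobs assigned to the fastest machine by the algorithm. Our goal is to show $\mathbb{E}[\max(\mathbf{w})] \ge \alpha\,\mathbb{E}[\Delta_1(\mathbf{w}, s)]$.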

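First, we argue that it suffices to restrict our analysis to a simple set of step function speed profiles $s_t$, $t \in [0,1]$: For $t \in (0,1]$, $s_t(x) = 1$ for $x \ge c^t$ and $s_t(x) = 0$ for $x < c^t$. For $t = 0$, $s_0(x) = 1$ for all $x$.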

###### Lemma 5.

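Suppose that for $t = 0$ and for all $t \in (0,1]$ such that there exists a job of weight $c^t$, we have

$$\mathbb{E}[\max(\mathbf{w})] \ge \alpha\,\mathbb{E}[\Delta_1(\mathbf{w}, s_t)] \tag{3}$$

Then, $\mathbb{E}[\max(\mathbf{w})] \ge \alpha\left(\mathbb{E}[\mathrm{OPT}_1] - \mathbb{E}[\mathrm{OPT}_2]\right)$.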

###### Proof.

Consider function defined as follows: for and for . Note that is not a random variable. We claim that . Since the largest job assigned to the fastest machine is , the term is unchanged in going from to . Further, the terms in are the corresponding terms in .

It is easy to see that is a convex combination of the step functions , . More specifically, for suitably chosen coefficients such that (a) and (b) if and no job with weight exists.

For a fixed assignment sequence , note that . Hence, for a distribution over assignment sequences ,

$$\mathbb{E}[\Delta_1(\mathbf{w}, s')] = \mathbb{E}\Big[\Delta_1\Big(\mathbf{w}, \sum_t p_t s_t\Big)\Big] = \sum_t p_t \cdot \mathbb{E}[\Delta_1(\mathbf{w}, s_t)]$$

Now, suppose that for all such that there exists a job of weight and for

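$$\mathbb{E}[\max(\mathbf{w})] \ge \alpha\,\mathbb{E}[\Delta_1(\mathbf{w}, s_t)].$$

This implies that

$$\mathbb{E}[\max(\mathbf{w})] \ge \alpha \sum_t p_t \cdot \mathbb{E}[\Delta_1(\mathbf{w}, s_t)] = \alpha\,\mathbb{E}[\Delta_1(\mathbf{w}, s')] \ge \alpha\,\mathbb{E}[\Delta_1(\mathbf{w}, s)] \ge \alpha\left(\mathbb{E}[\mathrm{OPT}_1] - \mathbb{E}[\mathrm{OPT}_2]\right)$$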

Note that since we scaled job sizes, the thresholds (i.e interval boundaries) should also be scaled by the same quantity (say ). After scaling, let be such that is the unique threshold from the set in . Since is uniformly distributed in , is also uniformly distributed in . Having defined thus, the interval boundaries picked by the algorithm for the fastest machine are for integers .

We prove (3) for in two separate lemmata, one for the case (Lemma 7) and the other for the case (Lemma 9). Recall that the expression for only depends on the subset of jobs assigned to the fastest machine. We call a job a local maximum if it is larger than all jobs preceding it. Since the algorithm assigns a new job to the fastest machine if and only if it falls in a larger interval than the current largest job, it follows that any job assigned to the fastest machine must be a local maximum.

Define to be the minimum job in the sequence of all local maxima in the range , i.e., the first job larger than and at most , if such a job exists and otherwise. We use in two ways. (1) We define . Note that is not a random variable. We use in Lemma 7 to prove the desired statement for . Specifically, we use to compute (i) a lower bound for as a function of (and not of any other jobs) and (ii) an upper bound for as a function of . Combining (i) and (ii) we prove that the desired inequality holds for all . (2) In Lemma 9 we bound by a sum of over suitable values of . This simplifies the analysis since the elements in the subsequence of all local maxima are not random variables, while the values in are random variables.

We first prove some simple properties of that we will use:

(1) and (2) .

###### Proof.

as is the minimum element in the sequence of all local maxima in and is the element from the interval picked by the algorithm.

is the minimum element in the sequence of local maxima in the range for and a non-negative integer. Either , or also falls into and follows from the fact that is the smallest local maximum in this range, while is an arbitrary local maximum in this range. ∎

The next lemmata conclude our algorithm analysis.

###### Lemma 7.

For , such that there exists a job of weight , and , we have

$$\alpha\,\mathbb{E}[\Delta_1(\mathbf{w}, s_t)] \le \mathbb{E}[\max(\mathbf{w})]$$

###### Proof.

Because there is a job with weight , it must be the case that . As is the job placed by the algorithm on the fastest machine, is in the same interval as for any choice of the random value . Thus, . As it follows that for all and, thus, for all . Hence . To analyze we have to consider two cases, depending on whether (and hence might contribute to ) or whether (and, thus, and does not contribute to ).

Case 1: . Since it holds that for all choices of . Thus we have

$$\mathbb{E}[c - (c - w)\,s_t(w)] = \mathbb{E}[w]$$

As discussed above, and, thus, the only contribution to is from . Additionally only if , and this only happens when is chosen such that . Thus,

$$\mathbb{E}\Big[\sum_{k \ge 1} w_k \cdot s_t(w_k)\Big] \le (1 - t)\,c^t$$

Note that . Thus we have

$$\mathbb{E}[w] \ge \int_t^1 c^x\,dx + \int_0^t c^t\,dx \ge \frac{c - c^t}{\ln c} + t\,c^t$$

In this case, we want to show

$$\alpha\,\mathbb{E}[\Delta_1] \le \mathbb{E}[w].$$

This holds if

$$\alpha\left(c^t + \frac{c - c^t}{\ln c}\right) \le \frac{c - c^t}{\ln c} + t\,c^t$$

Since , this inequality follows for all from Inequality 5 below.

Case 2: . As for , in this case, for all choices of , all speeds so . Thus, it suffices to show that , or equivalently that .

Let be the greatest local maximum that is smaller than . If , then and, thus, . If , then is the first local maximum greater than , while is a local maximum greater than . Thus, it is either equal to or a later local maximum, which by the definition of local maximum implies that it is larger than . Hence, and thus, . Therefore,

$$\mathbb{E}[c\,(1 - s_t(w))] \le c \int_0^z 1\,dx = c\,z$$

We also have

 E\displaylimits[(1−αst(w))w] ≥(1−α)∫1tcxdx+(1−α)∫tzctdx+∫zlogcu0cxdx+∫logcu00u0dx =(1−α)c−ctlnc+(1−α)ct(t−z)lnc+cz−u0lnc+u0logcu0 =:V(t,u0)

Thus it suffices to show that . For any fixed and , the value of minimizing is . After fixing , we have

 V(t,1)=(1−α)c−ctlnc+(1−α)ct(t−z)lnc+cz−1lnc

Therefore,

 ∂V(t,1)∂t=(1−α)ct+tct/lnc−ct/lnc−zct/lnclnc

Notice that is non-negative for all if . Therefore, for any , it suffices to consider only and prove that

$$\alpha\left(c \cdot t + \frac{c - c^t}{\ln c}\right) \le \frac{c - 1}{\ln c}$$

for as large as possible. The following claim shows that this inequality holds for .

###### Claim 8.

For ,

$$\frac{\dfrac{c - 1}{\ln c}}{c \cdot t + \dfrac{c - c^t}{\ln c}} \ge \frac{c - 1}{c \ln c} \qquad \forall t \in [0,1]$$

###### Proof.

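Consider $g(t) = c \cdot t + \frac{c - c^t}{\ln c}$. We have $g'(t) = c - c^t \ge 0$ for $t \in [0,1]$. Thus, the maximum is achieved when $t = 1$ and $g(1) = c$. Therefore,

$$\frac{\dfrac{c - 1}{\ln c}}{c \cdot t + \dfrac{c - c^t}{\ln c}} \ge \frac{c - 1}{c \ln c} \qquad \forall t \in [0,1]$$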

Thus, altogether the lemma holds for . ∎
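The inequality of Claim 8 can be checked numerically. The sketch below assumes the reading in which the denominator on the left-hand side is $c \cdot t + (c - c^t)/\ln c$ (a product in the first term, a power in the second) and evaluates both sides on a grid of $t \in [0,1]$. The function names and the sample values of $c$ are illustrative assumptions.

```python
import math


def claim8_lhs(c, t):
    """((c - 1) / ln c) divided by (c*t + (c - c^t) / ln c)."""
    numerator = (c - 1) / math.log(c)
    denominator = c * t + (c - c ** t) / math.log(c)
    return numerator / denominator


def claim8_rhs(c):
    """(c - 1) / (c ln c), the value of the left-hand side at t = 1."""
    return (c - 1) / (c * math.log(c))


def claim8_holds(c, steps=100, tol=1e-12):
    """Check lhs >= rhs (up to rounding) on an evenly spaced grid in [0, 1]."""
    return all(
        claim8_lhs(c, k / steps) >= claim8_rhs(c) - tol
        for k in range(steps + 1)
    )
```

Since the denominator is non-decreasing in $t$ on $[0,1]$, the left-hand side is smallest at $t = 1$, where the two sides coincide.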

###### Lemma 9.

For and , we have .

###### Proof.

Since for all choices of , it holds that . Thus we need to show . As we have the following lower bound for :

$$\mathbb{E}[w] \ge \int_{\log_c u_0}^{1} c^x\,dx + \int_0^{\log_c u_0} u_0\,dx = \frac{c - u_0}{\ln c} + u_0 \log_c u_0$$

Now, to prove the inequality, we only need to bound from above for a fixed . We can write in terms of as follows.

$$\mathbb{E}\Big[\sum_{k \ge 1} w_k\Big] \le \sum_{i=1}^{\infty} \int_0^1 m_S(c^{-i+x})\,dx = \int_{-\infty}^{0} m_S(c^x)\,dx =: B_S$$

The following claims analyze the structure of the jobs smaller than in the worst case, i.e., if maximizes .

###### Claim 10.

For any sequence of all local maxima where there are 2 consecutive local maxima with there is a sequence with at least as large and no such pair of consecutive local maxima.

###### Proof.

Add a new local maximum of weight to to form . Notice that . This argument can be repeated until there is no pair of consecutive local maxima with ratio greater than . ∎

###### Claim 11.

Consider a sequence of all local maxima with 3 consecutive local maxima where . After removing , the resulting sequence has .

###### Proof.

For all , we have . For all , we have . Thus, . ∎

###### Claim 12.

Consider a sequence of all local maxima containing satisfying

(1)

(2)

(3) , and

(4)

Then either one of the following conditions applies

(1) , or

(2) , or

there is a sequence with at most the same number of local maxima and .

###### Proof.

Assume that none of the conditions applies. We will show it is possible to move the jobs to form a sequence with .

We consider the effect of moving while maintaining the relation . We have

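$$\begin{aligned} B_S &= \int_{-\infty}^{\log_c w'_v} m_S(x)\,dx + \int_{\log_c w'_u}^{1} m_S(x)\,dx + \int_{\log_c w'_v}^{\log_c w'_u} m_S(x)\,dx \\ &= T + \int_{\log_c z}^{\log_c w'_u} w'_u\,dx + \sum_{j=0}^{v-u-3} \int_{\log_c c^{-j-1} z}^{\log_c c^{-j} z} c^{-j} z\,dx + \int_{\log_c w'_v}^{\log_c c^{u+2-v} z} c^{u+2-v} z\,dx \end{aligned}$$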

where is a function that does not depend on . Furthermore, we have

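$$\frac{\partial B_S}{\partial z} = -w'_u\,\frac{1}{z \ln c} + \frac{1 - c^{u+2-v}}{1 - c^{-1}} + c^{u+2-v}\left(u + 2 - v + \frac{1 + \ln z}{\ln c} - \frac{\ln w'_v}{\ln c}\right)$$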

Notice that is monotonically increasing so the maximum of is achieved at an extreme point, which is either . When or , there are 2 jobs of the same weight and we can remove one without changing . If , the first condition in the lemma holds. If , then the second condition () holds as by the assumptions of the lemma. Thus, the conclusion follows from an inductive argument on the number of jobs. ∎

By the above claims, the only sequences we need to consider to prove Lemma 9 are of the form

 u0,u0/c,…,u0/cm,v/cm+1,v/cm+2,…

where , i.e., all pairs of consecutive jobs have ratio exactly except for possibly one pair. Thus it holds that

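$$\begin{aligned} B_S &= \int_{\log_c u_0}^{1} u_0\,dx + \sum_{i=1}^{m} \int_0^1 u_0 c^{-i}\,dx + \int_{\log_c v}^{\log_c u_0} u_0 c^{-m}\,dx + \sum_{i=m+1}^{\infty} \int_0^1 v c^{-i}\,dx \\ &= (1 - \log_c u_0)\,u_0 + \frac{(1 - c^{-m})\,u_0}{c - 1} + u_0 c^{-m} \log_c(u_0 / v) + \frac{v c^{-m}}{c - 1} \end{aligned}$$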

Notice that is monotonically increasing so the choice of maximizing is either or . The following lemma proves that the value of when is larger than the value of when . Thus is maximized when , i.e., all pairs of consecutive jobs less than have ratio exactly .

###### Claim 13.

$$\frac{u_0}{c - 1} \ge u_0 \log_c(u_0 / c) + \frac{c}{c - 1} \qquad \forall u_0 \in [1, c]$$

###### Proof.

Let . We have and . Also notice that

$$f'(x) = \frac{1}{c - 1} - \frac{\ln x + 1}{\ln c}$$

is monotonically decreasing in so the minimum of is achieved at the extreme points. In other words, . ∎
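Claim 13 can also be checked numerically. The sketch below evaluates the gap between the two sides of the claimed inequality, read as $\frac{u_0}{c-1} \ge u_0 \log_c(u_0/c) + \frac{c}{c-1}$, on a grid of $u_0 \in [1, c]$. The function names and the sample values of $c$ are illustrative assumptions.

```python
import math


def claim13_gap(c, u0):
    """Left-hand side minus right-hand side of the claimed inequality."""
    lhs = u0 / (c - 1)
    rhs = u0 * math.log(u0 / c, c) + c / (c - 1)
    return lhs - rhs


def claim13_holds(c, steps=200, tol=1e-9):
    """Check that the gap is non-negative (up to rounding) on [1, c]."""
    grid = (1 + (c - 1) * k / steps for k in range(steps + 1))
    return all(claim13_gap(c, u0) >= -tol for u0 in grid)
```

The gap vanishes at both end points $u_0 = 1$ and $u_0 = c$, which matches the statement in the proof that the minimum is attained at the extreme points.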

It follows that . Thus, we need to show for

 (4)

or equivalently

$$\alpha\left(\frac{u_0\,c}{c - 1} + \frac{c - u_0}{\ln c}\right) \le \frac{c - u_0 + u_0 \ln u_0}{\ln c} \tag{5}$$

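Let $\beta = \frac{c \ln c}{c - 1} - 1$. Multiplying both sides by $\ln c$, we can rewrite the above inequality as

$$\alpha(\beta u_0 + c) \le c - u_0 + u_0 \ln u_0$$
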
###### Claim 14.

For ,

 α(βu0+c)≤c−u