Closing the Gap for Makespan Scheduling
via Sparsification Techniques
This work was partially supported by DFG Project "Entwicklung und Analyse von effizienten polynomiellen Approximationsschemata für Scheduling- und verwandte Optimierungsprobleme", Ja 612/14-2, by FONDECYT project 3130407, and by Núcleo Milenio Información y Coordinación en Redes ICM/FIC RC130003.
Makespan scheduling on identical machines is one of the most basic and fundamental packing problems studied in the discrete optimization literature. It asks for an assignment of jobs to a set of identical machines that minimizes the makespan. The problem is strongly NP-hard, and thus we do not expect a $(1+\varepsilon)$-approximation algorithm with a running time that depends polynomially on $1/\varepsilon$. Furthermore, Chen et al. recently showed that a running time of $2^{(1/\varepsilon)^{1-\delta}}\cdot\mathrm{poly}(n)$ for any $\delta>0$ would imply that the Exponential Time Hypothesis (ETH) fails. A long sequence of algorithms has been developed in an effort to keep the dependence on $1/\varepsilon$ low, the best of which achieves a running time of $2^{\tilde O(1/\varepsilon^{2})}+n^{O(1)}$. In this paper we obtain an algorithm with a running time of $2^{\tilde O(1/\varepsilon)}+n^{O(1)}$, which is tight under ETH up to logarithmic factors in the exponent.
Our main technical contribution is a new structural result on the configuration-IP. More precisely, we show the existence of a highly symmetric and sparse optimal solution, in which all but a constant number of machines are assigned a configuration with small support. This structure can then be exploited by integer programming techniques and enumeration. We believe that our structural result is of independent interest and should find applications in other settings. In particular, we show how the structure can be applied to the minimum makespan problem on related machines and to a larger class of objective functions on parallel machines. For all these cases we obtain an efficient PTAS with running time $2^{\tilde O(1/\varepsilon)}+n^{O(1)}$.
Minimum makespan scheduling is one of the foundational problems in the literature on approximation algorithms [7, 8]. In the identical machine setting the problem asks for an assignment of a set $J$ of jobs to a set $M$ of identical machines. Each job $j$ is characterized by a non-negative processing time $p_j$. The load of a machine is the total processing time of the jobs assigned to it, and our objective is to minimize the makespan, that is, the maximum machine load. This problem is usually denoted $P||C_{\max}$. It is well known to admit a polynomial time approximation scheme (PTAS), and there have been many subsequent works improving the running time or deriving PTASs for more general settings. The fastest previously known PTAS for $P||C_{\max}$ computes $(1+\varepsilon)$-approximate solutions in time $2^{\tilde O(1/\varepsilon^{2})}+n^{O(1)}$. Very recently, Chen et al. showed that, assuming the Exponential Time Hypothesis (ETH), there is no PTAS that yields $(1+\varepsilon)$-approximate solutions for $P||C_{\max}$ with running time $2^{(1/\varepsilon)^{1-\delta}}\cdot\mathrm{poly}(n)$ for any $\delta>0$.
Given a guess $T$ on the optimal makespan, which can be found with binary search, the problem reduces to deciding the existence of a packing of the jobs into $m$ machines (or bins) of capacity $T$. If we aim for a $(1+\varepsilon)$-approximate solution, for some $\varepsilon>0$, we can assume that all processing times are integral and that $T$ is a constant, namely $T=O(1/\varepsilon^{2})$. This can be achieved with well known rounding and scaling techniques [1, 2, 9] which will be specified later. Let $\pi_1,\dots,\pi_d$ be the job sizes appearing in the instance after rounding, and let $n_i$ denote the number of jobs of size $\pi_i$; we write $b=(n_1,\dots,n_d)$. The mentioned rounding procedure implies that the number $d$ of different job sizes is $O(1/\varepsilon\log(1/\varepsilon))$. Hence, for large $n$ we obtain a highly symmetric problem where several jobs will have the same processing time. Consider the knapsack polytope $P=\{c\in\mathbb{R}^{d}_{\ge0}:\pi\cdot c\le T\}$, where $\pi=(\pi_1,\dots,\pi_d)$. A packing on one machine can be expressed as a vector $C\in P\cap\mathbb{Z}^{d}$, where $C_i$ denotes the number of jobs of size $\pi_i$ assigned to the machine. Elements in $P\cap\mathbb{Z}^{d}$ are called configurations. Considering a variable $x_C$ that decides the multiplicity of configuration $C$ in the solution, our problem reduces to solving the following linear integer program (ILP):

[conf-IP] $\quad\sum_{C\in P\cap\mathbb{Z}^{d}}x_C\cdot C=b$, $\quad\sum_{C}x_C=m$, $\quad x_C\in\mathbb{Z}_{\ge0}$ for all $C\in P\cap\mathbb{Z}^{d}$.
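To make configurations concrete, the following sketch (our own illustration; the sizes and capacity are hypothetical) enumerates all configurations, i.e., the integral points of the knapsack polytope, by brute force. In the algorithmic applications the number of distinct sizes is only $O(1/\varepsilon\log(1/\varepsilon))$, but the enumeration is still exponential and serves purely as illustration.

```python
from itertools import product

def configurations(sizes, T):
    """Enumerate all integer vectors C (one entry per job size) with
    sum_i C[i] * sizes[i] <= T -- the integral points of the knapsack
    polytope, i.e. the feasible single-machine configurations."""
    bounds = [T // s for s in sizes]   # at most T // s jobs of size s fit
    confs = []
    for c in product(*(range(bd + 1) for bd in bounds)):
        if sum(ci * si for ci, si in zip(c, sizes)) <= T:
            confs.append(c)
    return confs
```

For instance, with two rounded sizes 2 and 3 and capacity 5, the configurations are exactly (0,0), (1,0), (2,0), (0,1), and (1,1).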
In this article we derive new insights on this ILP that help us to design faster algorithms for $P||C_{\max}$ and other more general problems. These include makespan scheduling on related machines, $Q||C_{\max}$, and a more general class of objective functions on parallel machines. We show that all these problems admit a PTAS with running time $2^{\tilde O(1/\varepsilon)}+n^{O(1)}$. Hence, our algorithm is best possible up to polylogarithmic factors in the exponent, assuming ETH.
1.1 Literature Review
There is a long chain of approximation algorithms for $P||C_{\max}$, starting with the seminal work by Graham [7, 8]. The first PTAS was given by Hochbaum and Shmoys and had a running time of $(n/\varepsilon)^{O(1/\varepsilon^{2})}$. This was subsequently improved by Leung. Subsequent articles improve the running time further. In particular, Hochbaum and Shmoys (see ) and Alon et al. [1, 2] obtain an efficient PTAS (EPTAS), that is, a PTAS with running time $f(1/\varepsilon)\cdot\mathrm{poly}(|I|)$, where $|I|$ is the encoding size of the input and $f$ is some function. Alon et al. [1, 2] consider general techniques that work for several objective functions, including all $\ell_p$-norms of the loads and maximizing the minimum machine load.
The fastest PTAS known to date for $P||C_{\max}$ achieves a running time of $2^{\tilde O(1/\varepsilon^{2})}+n^{O(1)}$. More generally, this work gives an EPTAS for the case of related (uniform) machines, $Q||C_{\max}$, where each machine $i$ has a speed $s_i$ and assigning job $j$ to machine $i$ implies a processing time of $p_j/s_i$. For this more general case the running time is slightly larger. For the simpler case of $P||C_{\max}$, the ILP can be solved directly since the number of variables is constant for fixed $\varepsilon$. This can be done with Lenstra's algorithm, or even with Kannan's algorithm, which gives an improved running time. This technique yields a running time that is doubly exponential in $1/\varepsilon$. This was, in essence, the approach by Alon et al. [1, 2] and Hochbaum and Shmoys. To lower the dependency on $1/\varepsilon$, Jansen uses a result by Eisenbrand and Shmonin that implies the existence of a solution with support of size $O(1/\varepsilon\log^{2}(1/\varepsilon))$. First guessing the support and then solving the ILP restricted to the guessed integer variables with Kannan's algorithm yields the desired running time of $2^{\tilde O(1/\varepsilon^{2})}+n^{O(1)}$.
The configuration ILP has recently been studied in the context of the (1-dimensional) cutting stock problem. In this case, the dimension $d$ is constant, and the size vector $\pi$ is rational. Moreover, $T$ and the multiplicities $n_i$ are part of the input. Goemans and Rothvoß obtain an optimal solution in time polynomial in $\log\Delta$ (for constant $d$), where $\Delta$ is the largest number appearing in the denominators of the sizes or in the multiplicities $n_i$. This is achieved by first showing that there exists a pre-computable set of configurations with polynomially many elements, such that there exists a solution that gives all but a constant (depending only on $d$) amount of weight to this set. We remark that applying this result to a rounded instance of $P||C_{\max}$ yields a running time that is doubly exponential in $1/\varepsilon$.
1.2 Our Contributions
Our main contribution is a new insight on the structure of the solutions of [conf-IP]. These properties are especially tailored to problems in which $T$ is bounded by a constant, which in the case of $P||C_{\max}$ can be guaranteed by rounding and scaling. The same holds for $Q||C_{\max}$ after a more involved rounding and case analysis.
We first classify configurations by their support. We say that a configuration is simple if its support is of size at most $\log(T+1)$, and complex otherwise. Our main structural result states that there exists a solution in which all but a logarithmic amount of weight is given to simple configurations, the support is bounded (as implied by Eisenbrand and Shmonin), and no complex configuration has weight larger than 1. We remark the resemblance of this structure to the result by Goemans and Rothvoß. Indeed, similar to their result, we can precompute a subset of configurations such that all but a constant amount of weight of the solution is given to this set. In their case the set is of cardinality polynomial in the input and is constructed by covering the integral solutions of the knapsack polytope by parallelepipeds; in our case, all but a bounded amount of weight is given to simple configurations.
Theorem 1 (Thin solutions). Assume that [conf-IP] is feasible. Then there exists a feasible solution $x$ to [conf-IP] such that:
1. if $x_C\ge2$ then the configuration $C$ is simple,
2. the support of $x$ satisfies $|\mathrm{supp}(x)|\le2d\log(4dT)$, and
3. $\sum_{C\in\mathcal{C}_c}x_C\le2d\log(4dT)$, where $\mathcal{C}_c$ denotes the set of complex configurations.
We call a solution satisfying the properties of the theorem thin. The key ingredient is a sparsification lemma showing that if a solution gives a weight of two or more to a complex configuration, then we can replace this partial solution by two configurations with smaller support. The sparsification lemma is shown by a simple application of the pigeonhole principle. The theorem then follows by combining this technique with the theorem of Eisenbrand and Shmonin and a potential function argument.
As an application of our main structural theorem, we derive a PTAS for $P||C_{\max}$ by first guessing the jobs assigned to complex configurations. An optimal solution for this subinstance can be derived by a dynamic program. For the remaining instance we know that there exists a solution using only simple configurations. Then we can guess the support of such a solution and solve the corresponding [conf-IP] restricted to the guessed variables. The main benefit of having simple configurations is that we can guess the support of the solution much faster, as the number of simple configurations is (asymptotically) much smaller than the total number of configurations. The complete procedure takes time $2^{\tilde O(1/\varepsilon)}+O(n\log n)$. Moreover, using the rounding and case analysis of Jansen, we derive a mixed integer linear program that can be suitably decomposed in order to apply our structural result iteratively. This yields a PTAS with a running time of $2^{\tilde O(1/\varepsilon)}+n^{O(1)}$ for $Q||C_{\max}$.
Similarly, we can extend our results to derive PTASs for a larger family of objective functions, as considered by Alon et al. [1, 2]. Let $L_i$ denote the load of machine $i$, that is, the total processing time of the jobs assigned to machine $i$ in a given solution. Our techniques then give a PTAS with the same running time for the problem of minimizing the $\ell_p$-norm of the loads (for fixed $p$), and for maximizing the minimum machine load $\min_i L_i$, among others. To solve these problems, we can round the instance and state an IP analogous to [conf-IP] but with an objective function. However, the objective function prevents us from using the main theorem as it is stated. To get around this issue, we study several ILPs. In each ILP we consider $x_C$ to be a variable only if $C$ has a given load, and fix the remaining variables to the values of some optimal solution. Applying Theorem 1.2 to each such ILP, plus some extra ideas, yields an analogous structural theorem. Afterwards, an algorithm similar to the one for makespan minimization yields the desired PTAS.
From a structural point of view, our sparsification lemma has further consequences for the structure of the knapsack polytope and the LP-relaxation of [conf-IP]. More precisely, we can show that every vertex of the convex hull of $P\cap\mathbb{Z}^{d}$ must be simple. This, for example, helps us to upper bound the number of vertices by the number of simple configurations. Moreover, we can show that if the configuration-LP, obtained by replacing the integrality restriction in [conf-IP] by $x_C\ge0$, is feasible, then it admits a solution whose support consists purely of simple configurations. Due to space limitations we leave many details and proofs to the appendix.
We will use the following notation throughout the paper. By default $\log=\log_2$, unless stated otherwise. Given two sets $X$ and $Y$, we denote by $Y^{X}$ the set of all vectors indexed by $X$ with entries in $Y$. Moreover, for $y\in\mathbb{R}^{X}$, we denote the support of $y$ by $\mathrm{supp}(y)=\{x\in X:y_x\neq0\}$.
We consider an arbitrary knapsack polytope $P=\{c\in\mathbb{R}^{d}_{\ge0}:\pi\cdot c\le T\}$, where $\pi$ is a non-negative integral (row) vector and $T$ is a positive integer. We assume without loss of generality that each coordinate of $\pi$ is upper bounded by $T$ (otherwise the corresponding coordinate is $0$ for all integral points of $P$). We focus on the set of integral vectors in $P$, which we denote by $P\cap\mathbb{Z}^{d}$. We call an element $C\in P\cap\mathbb{Z}^{d}$ a configuration. Given $b\in\mathbb{Z}^{d}_{\ge0}$, consider the problem of decomposing $b$ as a conic integral combination of configurations. That is, our aim is to find a feasible solution to [conf-IP], defined above.
A crucial property of [conf-IP] is that there is always a solution with a support of small cardinality. This follows from a Carathéodory-type bound obtained by Eisenbrand and Shmonin. Since we will need the argument later, we state the result applied to our case and review its (very elegant) proof. We split the proof into two lemmas.
Lemma 2 (Eisenbrand and Shmonin ).
Let $y$ be a vector such that $|\mathrm{supp}(y)|\ge2d\log(4dT)$. Then there exist two disjoint non-empty sets $S_1,S_2\subseteq\mathrm{supp}(y)$ such that $\sum_{C\in S_1}C=\sum_{C\in S_2}C$.
Let $k=|\mathrm{supp}(y)|$. Each coordinate of a configuration is at most $T$. Hence, for any $S\subseteq\mathrm{supp}(y)$, each coordinate of $\sum_{C\in S}C$ is no larger than $kT$. Thus, $\sum_{C\in S}C$ belongs to $\{0,\dots,kT\}^{d}$, and hence there are at most $(kT+1)^{d}$ different possibilities for this vector, over all possible subsets $S\subseteq\mathrm{supp}(y)$. On the other hand, there are $2^{k}$ different subsets of $\mathrm{supp}(y)$.

We claim that $2^{k}>(kT+1)^{d}$; this follows from the assumed lower bound on $k$ by taking logarithms on both sides.

Hence, by the pigeonhole principle there are two distinct subsets $S_1',S_2'\subseteq\mathrm{supp}(y)$ such that $\sum_{C\in S_1'}C=\sum_{C\in S_2'}C$. We can now define $S_1=S_1'\setminus S_2'$ and $S_2=S_2'\setminus S_1'$ and obtain $\sum_{C\in S_1}C=\sum_{C\in S_2}C$. It remains to show that $S_1,S_2\neq\emptyset$. Notice that if $S_1=\emptyset$ then $S_1'\subseteq S_2'$, and the last equality implies that $\sum_{C\in S_2}C=0$. Since every configuration in $\mathrm{supp}(y)$ is non-zero and non-negative, this forces $S_2=\emptyset$ and hence $S_1'=S_2'$, a contradiction. We conclude that $S_1\neq\emptyset$. The proof that $S_2\neq\emptyset$ is analogous. ∎
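The pigeonhole argument above can be carried out mechanically. The following sketch (our own illustration; the function name and the brute-force search over bitmasks are not from the paper) enumerates all subsets of a given family of configuration vectors and returns two disjoint index sets with identical component-wise sums, exactly as in the proof.

```python
def equal_sum_subsets(vectors):
    """Find two disjoint, non-empty index sets S1, S2 over `vectors`
    (assumed to be non-zero, non-negative integer tuples) such that the
    component-wise sums over S1 and S2 coincide.  Returns None if no such
    pair exists; the lemma guarantees one once the number of subsets
    exceeds the number of possible sum vectors."""
    n, dim = len(vectors), len(vectors[0])
    seen = {}  # sum vector -> first subset (as bitmask) achieving it
    for mask in range(1 << n):
        s = tuple(sum(vectors[i][j] for i in range(n) if mask >> i & 1)
                  for j in range(dim))
        if s in seen:
            a, b = seen[s], mask
            common = a & b
            a, b = a & ~common, b & ~common   # equal sums survive this step
            return ({i for i in range(n) if a >> i & 1},
                    {i for i in range(n) if b >> i & 1})
        seen[s] = mask
    return None
```

For example, with the vectors (1,0), (0,1), and (1,1), the subsets {0,1} and {2} have the same sum (1,1).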
Lemma 3 (Eisenbrand and Shmonin ).
If [conf-IP] is feasible, then there exists a feasible solution $x$ such that $|\mathrm{supp}(x)|\le2d\log(4dT)$.
Let $x$ be a solution to [conf-IP] that minimizes $|\mathrm{supp}(x)|$. Assume by contradiction that the support is larger than claimed. We show that we can find another solution $x'$ to [conf-IP] with $|\mathrm{supp}(x')|<|\mathrm{supp}(x)|$, contradicting the minimality of $x$. By Lemma 2, there exist two disjoint non-empty subsets $S_1,S_2\subseteq\mathrm{supp}(x)$ such that $\sum_{C\in S_1}C=\sum_{C\in S_2}C$. Moreover, let $\lambda=\min_{C\in S_1}x_C$ and define $x'=x+\lambda(\chi_{S_2}-\chi_{S_1})$, where $\chi_S$ denotes the indicator vector of $S$. Vector $x'$ is also a solution to [conf-IP] and has a strictly smaller support, since a configuration $C^{*}\in\arg\min_{C\in S_1}x_C$ satisfies $x'_{C^{*}}=0$. ∎
3 Structural Results
Recall that we call a configuration $C$ simple if $|\mathrm{supp}(C)|\le\log(T+1)$, and complex otherwise. An important observation towards Theorem 1.2 is that if $C$ is a complex configuration, then $2C$ can be written as the sum of two configurations of smaller support. This is shown by the following Sparsification Lemma.
Lemma 4 (Sparsification Lemma).
Let $C$ be a complex configuration. Then there exist two configurations $C_1,C_2$ such that (i) $2C=C_1+C_2$, (ii) $C_1\neq C_2$, and (iii) $\mathrm{supp}(C_1),\mathrm{supp}(C_2)\subsetneq\mathrm{supp}(C)$.
Consider for each subset $A\subseteq\mathrm{supp}(C)$ the configuration $C_A$ defined by $(C_A)_i=C_i$ if $i\in A$ and $(C_A)_i=0$ otherwise. As the number of subsets of $\mathrm{supp}(C)$ is $2^{|\mathrm{supp}(C)|}$, and $C_A=C_B$ if and only if $A=B$, the collection of vectors $\{C_A\}_{A\subseteq\mathrm{supp}(C)}$ has cardinality $2^{|\mathrm{supp}(C)|}$.

On the other hand, for any such vector it holds that $\pi\cdot C_A\le\pi\cdot C\le T$. Hence, $\pi\cdot C_A$ can take only $T+1$ different values. Using that $C$ is a complex configuration, and hence $2^{|\mathrm{supp}(C)|}>T+1$, the pigeonhole principle ensures that there are two different non-empty subsets $A,B\subseteq\mathrm{supp}(C)$ with $\pi\cdot C_A=\pi\cdot C_B$. By removing the intersection, we can assume w.l.o.g. that $A$ and $B$ are disjoint. We define $C_1=C-C_A+C_B$ and $C_2=C+C_A-C_B$, which satisfy the properties of the lemma, as $\pi\cdot C_1=\pi\cdot C_2=\pi\cdot C\le T$ and $C_1+C_2=2C$.

Since $\mathrm{supp}(C_1)\subseteq\mathrm{supp}(C)\setminus A$ and $\mathrm{supp}(C_2)\subseteq\mathrm{supp}(C)\setminus B$, property (iii) is satisfied. ∎
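The construction in this proof can be run directly as a sanity check. The sketch below (our own illustration; the function name, the bitmask search, and the example sizes are hypothetical) looks for two disjoint subsets $A,B$ of the support with equal load $\pi\cdot C_A=\pi\cdot C_B$ and returns $C_1=C-C_A+C_B$ and $C_2=C+C_A-C_B$.

```python
def split_complex(conf, sizes):
    """Sparsification step: given a configuration `conf` (tuple of
    multiplicities, one per size in `sizes`), search for disjoint subsets
    A, B of its support whose jobs have equal total size, and return
    (C1, C2) with C1 + C2 = 2*conf, equal loads, and strictly smaller
    supports.  Returns None if no such pair exists (the lemma guarantees
    one when 2**|supp| > T + 1)."""
    supp = [i for i, k in enumerate(conf) if k > 0]
    n = len(supp)
    seen = {}  # load of sub-configuration C_A -> bitmask of A
    for mask in range(1 << n):
        load = sum(conf[supp[i]] * sizes[supp[i]]
                   for i in range(n) if mask >> i & 1)
        if load in seen:
            a, b = seen[load], mask
            common = a & b
            a, b = a & ~common, b & ~common   # make A and B disjoint
            c1, c2 = list(conf), list(conf)
            for i in range(n):
                if a >> i & 1:               # zero out A in C1, double it in C2
                    c1[supp[i]] -= conf[supp[i]]
                    c2[supp[i]] += conf[supp[i]]
                if b >> i & 1:               # and vice versa for B
                    c1[supp[i]] += conf[supp[i]]
                    c2[supp[i]] -= conf[supp[i]]
            return tuple(c1), tuple(c2)
        seen[load] = mask
    return None
```

For example, with sizes (1, 2, 3) and the configuration (1, 1, 1), the subsets {1, 2} and {3} have equal load 3, producing the split C1 = (0, 0, 2) and C2 = (2, 2, 0), both with the same load 6 and strictly smaller support.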
With Lemma 4 we are ready to show Theorem 1.2. For the proof it is tempting to apply the lemma iteratively, replacing any complex configuration that is used twice or more by two configurations with smaller support. This can be repeated until no complex configuration is taken multiple times. Then we could apply the technique of Lemma 3 to the obtained solution to bound the cardinality of the support. However, this last step might break the obtained structure if the solution implied by Lemma 3 uses a complex configuration more than once. In order to avoid this issue we consider a potential function. We show that a vector minimizing the chosen potential uses each complex configuration at most once, and that the number of complex configurations in its support is bounded. Finally, we apply the techniques from Lemma 3 restricted to the variables corresponding to simple configurations.
Proof of Theorem 1.2.
Consider the following potential function of a solution $x$ of [conf-IP]: $\Phi(x)=\sum_{C}x_C\cdot|\mathrm{supp}(C)|$.
Let $x$ be a solution of [conf-IP] with minimum potential $\Phi(x)$, which is well defined since the set of feasible solutions has finite cardinality. We show two properties of $x$.
P1: $x_C\le1$ for each complex configuration $C$.
Assume otherwise, i.e., $x_C\ge2$ for some complex configuration $C$. Consider the two configurations $C_1$ and $C_2$ implied by the previous lemma. We define a new solution $x'$ by $x'_C=x_C-2$, $x'_{C_1}=x_{C_1}+1$, and $x'_{C_2}=x_{C_2}+1$. Since $|\mathrm{supp}(C_1)|<|\mathrm{supp}(C)|$ and $|\mathrm{supp}(C_2)|<|\mathrm{supp}(C)|$, we obtain that $\Phi(x')<\Phi(x)$, which contradicts the minimality of $\Phi(x)$.
P2: The number of complex configurations in $\mathrm{supp}(x)$ is at most $2d\log(4dT)$.
Let $z$ be the vector defined by $z_C=x_C$ if $C$ is complex, and $z_C=0$ if $C$ is simple. If the number of complex configurations in $\mathrm{supp}(x)$ exceeded the bound, Lemma 2 would imply that there exist two disjoint subsets $S_1,S_2$ of complex configurations such that $\sum_{C\in S_1}C=\sum_{C\in S_2}C$. Thus, the solutions $x+\chi_{S_2}-\chi_{S_1}$ and $x+\chi_{S_1}-\chi_{S_2}$ are feasible for [conf-IP]. By linearity, the potentials of the new solutions are $\Phi(x)-\Phi(\chi_{S_1})+\Phi(\chi_{S_2})$ and $\Phi(x)+\Phi(\chi_{S_1})-\Phi(\chi_{S_2})$, respectively. If $\Phi(\chi_{S_1})\neq\Phi(\chi_{S_2})$ then one of the two new solutions has smaller potential, contradicting our assumption on the minimality of $\Phi(x)$. We conclude that $\Phi(\chi_{S_1})=\Phi(\chi_{S_2})$, and thus both new solutions also have minimum potential. By construction, the solution $x+\chi_{S_2}-\chi_{S_1}$ gives weight $x_C+1\ge2$ to each complex configuration $C\in S_2$. Having multiplicity at least 2 for a complex configuration, we can proceed as in P1 to find a new solution with decreased potential, which yields a contradiction.
Given these two properties, to conclude the theorem it suffices to upper bound the number of simple configurations in the support. Suppose this bound is violated; then we find two disjoint sets of simple configurations (see Lemma 2) with equal sums and proceed as in Lemma 3. Since this step only modifies variables of simple configurations, properties P1 and P2 continue to hold, and the theorem follows. ∎
Our techniques, in particular our Sparsification Lemma, imply two corollaries on the structure of the knapsack polytope and the LP-relaxation of [conf-IP].
Every vertex of $\mathrm{conv}(P\cap\mathbb{Z}^{d})$ is a simple configuration. Moreover, the total number of simple configurations in $P\cap\mathbb{Z}^{d}$ is upper bounded by $(d+1)^{\log(T+1)}\cdot(T+1)^{\log(T+1)}$, and thus the same expression upper bounds the number of vertices of $\mathrm{conv}(P\cap\mathbb{Z}^{d})$.
Consider a complex configuration $C$. By Lemma 4 we know that there exist $C_1\neq C_2$ with $2C=C_1+C_2$. Hence, $C$ is not a vertex of $\mathrm{conv}(P\cap\mathbb{Z}^{d})$, as it can be written as the convex combination $C=\frac{1}{2}C_1+\frac{1}{2}C_2$.
To bound the number of simple configurations, fix a set $S\subseteq\{1,\dots,d\}$. Notice that the number of configurations $C$ with $\mathrm{supp}(C)\subseteq S$ is at most $(T+1)^{|S|}$. For simple configurations it suffices to take $S$ with cardinality at most $\log(T+1)$. Since the number of subsets of cardinality at most $\log(T+1)$ is at most $(d+1)^{\log(T+1)}$, we obtain that the number of simple configurations is at most $(d+1)^{\log(T+1)}\cdot(T+1)^{\log(T+1)}$.
The following corollary follows as each complex configuration can be represented by a convex combination of simple configurations.
Let [conf-LP] be the LP relaxation of [conf-IP], obtained by relaxing each constraint $x_C\in\mathbb{Z}_{\ge0}$ to $x_C\ge0$ for all $C$. If the LP is feasible, then there exists a solution $x$ such that each configuration in $\mathrm{supp}(x)$ is simple.
Consider a solution $x$ of [conf-LP]. Assume that there exists $C^{*}$ such that $C^{*}$ is complex and $x_{C^{*}}>0$. Then, by the previous corollary, configuration $C^{*}$ can be written as $C^{*}=\sum_{C}\lambda_C C$, where $\lambda_C\ge0$, $\sum_C\lambda_C=1$, and $\lambda_C=0$ if $C$ is complex. Consider a new solution $x'$ defined as $x'_{C^{*}}=0$ and $x'_C=x_C+\lambda_C x_{C^{*}}$ for $C\neq C^{*}$.
This new solution is also feasible for [conf-LP]. As $x'_{C^{*}}=0$, the number of complex configurations in the support of the solution is reduced by one. This procedure can be repeated until we have a solution whose support contains only simple configurations. ∎
4 Applications to Scheduling on Parallel Machines
In what follows we show how to exploit the structural insights of the previous section to derive faster algorithms for parallel machine scheduling problems. We start by considering $P||C_{\max}$, where we seek to assign a set $J$ of $n$ jobs with processing times $p_j$ to a set $M$ of $m$ machines. For a given assignment, we define the load of a machine as the total processing time of the jobs assigned to it, and the makespan as the maximum load over all machines, which is the minimum time needed to complete the execution of all jobs on the $m$ processors. The goal is to find an assignment that minimizes the makespan.
We first follow well known rounding techniques [1, 2, 10, 9]. Consider an error tolerance $\varepsilon>0$ such that $1/\varepsilon$ is an integer. To get an estimation of the optimal makespan, we follow the standard dual approximation approach. First, we use, e.g., the 2-approximation algorithm by Graham to get an initial guess of the optimal makespan. Using binary search, we can then estimate the optimal makespan within a factor of $(1+\varepsilon)$ in $O(\log(1/\varepsilon))$ iterations. Therefore, it remains to give an algorithm that, for a given makespan $T$, either decides that there exists an assignment with makespan at most $(1+O(\varepsilon))T$ or reports that there exists no assignment with makespan at most $T$.
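The dual approximation framework can be sketched as follows (our own illustration). The `decider` argument is a placeholder for the packing procedure developed below; in the test we simply reuse Graham's greedy rule as a stand-in decider.

```python
def graham_list_schedule(jobs, m):
    """Greedy list scheduling: assign each job to the currently least
    loaded of the m machines; a classical 2-approximation for makespan."""
    loads = [0] * m
    for p in jobs:
        i = loads.index(min(loads))
        loads[i] += p
    return max(loads)

def dual_approximation(jobs, m, eps, decider):
    """Binary search over the makespan guess T.  `decider(jobs, m, T)`
    is assumed to return True if it finds a schedule of makespan at most
    (1 + O(eps)) * T, and False if no schedule of makespan T exists."""
    hi = graham_list_schedule(jobs, m)   # 2-approximation: opt in [hi/2, hi]
    lo = hi / 2
    while hi - lo > eps * lo:
        T = (lo + hi) / 2
        if decider(jobs, m, T):
            hi = T
        else:
            lo = T
    return hi
```

After $O(\log(1/\varepsilon))$ iterations the interval $[lo, hi]$ has relative width at most $\varepsilon$, as claimed above.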
For a given makespan $T$, we define the set of big jobs $J_{\mathrm{big}}=\{j:p_j>\varepsilon T\}$ and the set of small jobs $J_{\mathrm{small}}=\{j:p_j\le\varepsilon T\}$. The following lemma shows that the small jobs can be removed from the instance after adding big placeholder jobs, each of size $\varepsilon T$. Let $P_{\mathrm{small}}$ be the sum of processing times of jobs in $J_{\mathrm{small}}$, and let $\lceil P_{\mathrm{small}}\rceil_{\varepsilon T}$ denote the value of $P_{\mathrm{small}}$ rounded up to the next multiple of $\varepsilon T$. We define a new instance containing only big jobs by $J'=J_{\mathrm{big}}\cup J_{\mathrm{place}}$, where $J_{\mathrm{place}}$ contains $\lceil P_{\mathrm{small}}\rceil_{\varepsilon T}/(\varepsilon T)$ jobs of size $\varepsilon T$.
Given a feasible assignment of the jobs in $J$ with makespan $T$, there exists a feasible assignment of $J'$ with makespan at most $(1+\varepsilon)T$. Similarly, an assignment of the jobs in $J'$ of makespan $T$ can be transformed into an assignment of $J$ of makespan at most $(1+\varepsilon)T$.
We modify the assignment of jobs in $J$ by replacing the set of small jobs on each machine by jobs in $J_{\mathrm{place}}$. Let $w_i$ be the total processing time of small jobs assigned to machine $i$. Then these small jobs are replaced by (at most) $\lceil w_i\rceil_{\varepsilon T}/(\varepsilon T)$ jobs in $J_{\mathrm{place}}$, where $\lceil w_i\rceil_{\varepsilon T}$ denotes the value of $w_i$ rounded up to the next multiple of $\varepsilon T$. As $\sum_i\lceil w_i\rceil_{\varepsilon T}\ge\lceil P_{\mathrm{small}}\rceil_{\varepsilon T}$, the new solution processes all jobs in $J_{\mathrm{place}}$, and the load on each machine increases by at most $\varepsilon T$. Conversely, having an assignment for the jobs in $J'$, we can easily obtain a schedule of the jobs in $J$ by greedily packing the small jobs into the space occupied by the placeholder jobs. ∎
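The instance transformation in this lemma is simple enough to state as code. The sketch below (our own illustration; the function and variable names are hypothetical) returns the big jobs together with the placeholder jobs of size $\varepsilon T$ that round up the total small-job volume.

```python
import math

def replace_small_jobs(jobs, T, eps):
    """Replace all small jobs (processing time <= eps*T) by placeholder
    jobs of size exactly eps*T, whose number is the total small-job
    volume rounded up to the next multiple of eps*T, divided by eps*T."""
    threshold = eps * T
    big = [p for p in jobs if p > threshold]
    small_total = sum(p for p in jobs if p <= threshold)
    placeholders = math.ceil(small_total / threshold)
    return big + [threshold] * placeholders
```

For instance, with jobs (5, 1, 1, 1), makespan guess T = 10 and eps = 0.2, the three small jobs of total size 3 are replaced by two placeholders of size 2.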
By scaling the processing times of jobs in $J'$, we can assume that the target makespan has value $T=1/\varepsilon^{2}$. Also notice that we can assume that $p_j\le T$ for all $j$, as otherwise we cannot pack all jobs within makespan $T$. This implies that each job has a processing time in $[\varepsilon T,T]=[1/\varepsilon,1/\varepsilon^{2}]$. In the following we give a transformation of the big jobs in $J'$ by rounding their processing times. We first round each processing time up to the next power of $(1+\varepsilon)$, and thus all rounded processing times belong to $\{(1+\varepsilon)^{k}:k\in\mathbb{Z}_{\ge0}\}$. We further round the processing times up to the next integer and obtain new processing times $\bar p_j$. Notice that the rounded instance only contains integral processing times, all lying in $[1/\varepsilon,(1+\varepsilon)/\varepsilon^{2}]$.
If there is a feasible schedule of the jobs with processing times $p_j$ on $m$ machines with makespan $T$, then there is also a feasible schedule of the jobs with rounded processing times $\bar p_j$ with a makespan of at most $(1+3\varepsilon)T$. Furthermore, the number of different processing times in the rounded instance is $d=O(1/\varepsilon\log(1/\varepsilon))$.
Consider a feasible schedule of the jobs with processing times $p_j$ on $m$ machines with makespan $T$. Let $J_i$ be the set of jobs processed on machine $i$, i.e., $\sum_{j\in J_i}p_j\le T$. Rounding up to the next power of $(1+\varepsilon)$ increases each processing time by a factor of at most $(1+\varepsilon)$, so the same assignment yields a makespan of at most $(1+\varepsilon)T$. Since $p_j\ge\varepsilon T$ for every job in $J'$, every machine processes at most $(1+\varepsilon)/\varepsilon\le2/\varepsilon$ jobs. Hence, rounding the processing times up to the next integer increases the load on each machine by at most $2/\varepsilon=2\varepsilon T$ (recalling that $T=1/\varepsilon^{2}$). We obtain a feasible schedule with makespan at most $(1+3\varepsilon)T$. ∎
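The two-stage rounding of the big jobs can be sketched as follows (our own illustration; it assumes, as above, that the instance is already scaled so that every big job has size at least $1/\varepsilon$, and uses floating-point logarithms, which may behave differently at exact powers of $1+\varepsilon$).

```python
import math

def round_big_jobs(big_jobs, eps):
    """Round each big processing time up to the next power of (1 + eps),
    then up to the next integer.  Each job grows by a factor of at most
    (1 + eps) plus an additive 1 <= eps * p (since p >= 1/eps)."""
    rounded = []
    for p in big_jobs:
        k = math.ceil(math.log(p) / math.log(1 + eps))  # next power exponent
        rounded.append(math.ceil((1 + eps) ** k))
    return rounded
```

Because the rounded values form a geometric grid intersected with the integers, only $O(1/\varepsilon\log(1/\varepsilon))$ distinct sizes remain.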
In what follows we give an algorithm that decides, in the claimed running time, the existence of a solution for the rounded instance with processing times $\bar p_j$ and makespan $T$. Let $\pi_1<\dots<\pi_d$ denote the distinct rounded processing times and let $\pi=(\pi_1,\dots,\pi_d)$ be the vector of rounded processing times. We consider configurations to be vectors in $P\cap\mathbb{Z}^{d}$, where $P$ is the knapsack polytope defined by $\pi$ and $T$ (see Section 3). As before, we say that a configuration $C$ is simple if $|\mathrm{supp}(C)|\le\log(T+1)$, and complex otherwise. For a given assignment of jobs to machines, we say that a machine follows a configuration $C$ if, for each $i$, exactly $C_i$ jobs of size $\pi_i$ are assigned to the machine. We denote by $\mathcal{C}_c$ the set of complex configurations and by $\mathcal{C}_s$ the set of simple configurations.
Let $n_i$ be the number of jobs of size $\pi_i$ in the instance (with processing times $\bar p_j$). Consider an ILP with an integer variable $x_C$ for each configuration $C$, which denotes the number of machines that follow configuration $C$. With these parameters, the problem of scheduling all jobs in a solution of makespan $T$ is equivalent to finding a solution to [conf-IP]. To solve the ILP we use, among other techniques, Kannan's algorithm, which is an improvement on the algorithm by Lenstra. The algorithm has a running time of $n_v^{O(n_v)}\cdot s^{O(1)}$, where $n_v$ is the number of variables and $s$ is the number of bits used to encode the input of the ILP in binary.
By Theorem 1.2, if [conf-IP] is feasible then there exists a thin solution $x$. In particular, if a configuration $C$ is used by more than one machine then $C$ is simple, and the total number of used configurations is at most $2d\log(4dT)$. Additionally, the number of machines following complex configurations is at most $2d\log(4dT)$. We consider the following strategy to decide the existence of a schedule of makespan $T$.
1. For each processing time $\pi_i$, guess the number of jobs of size $\pi_i$ covered by complex configurations.
2. Find a minimum number of machines to schedule the guessed jobs with makespan $T$.
3. Guess the support $S\subseteq\mathcal{C}_s$ of the simple configurations used by a thin solution, with $|S|\le2d\log(4dT)$.
4. Solve the ILP restricted to configurations in $S$: find an integral $x\ge0$ with $\sum_{C\in S}x_C\cdot C=b'$ and $\sum_{C\in S}x_C\le m'$, where $b'$ counts the jobs not covered in step 1 and $m'$ is the number of machines remaining after step 2.
One of the key observations to prove the running time of the algorithm is that the number of simple configurations is bounded by a quasi-polynomial term: $|\mathcal{C}_s|\le2^{O(\log^{2}(1/\varepsilon))}$. This follows easily from Corollary 5, using that $d=O(1/\varepsilon\log(1/\varepsilon))$ and $T=O(1/\varepsilon^{2})$.
Algorithm 9 can be implemented with a running time of $2^{\tilde O(1/\varepsilon)}$.
In step 1, the algorithm guesses which jobs are processed on machines following complex configurations. Since each configuration contains at most $O(1/\varepsilon)$ jobs and a thin solution uses at most $2d\log(4dT)$ complex configurations, there are at most $O((1/\varepsilon)d\log(dT))=\tilde O(1/\varepsilon)$ jobs assigned to such machines. For each size $\pi_i$, we guess the number of jobs of size $\pi_i$ assigned to such machines. Hence, we can enumerate all possibilities for jobs assigned to complex machines in time $(\tilde O(1/\varepsilon))^{d}=2^{\tilde O(1/\varepsilon)}$. After guessing the jobs, we can assign them to a minimum number of machines in step 2 (with makespan $T$) with a simple dynamic program that stores vectors $(k,u_1,\dots,u_d)$, with $u_i$ being the number of jobs of size $\pi_i$ used on the first $k$ processors. For any such vector, determining whether it corresponds to a feasible solution can be done by checking all possible configurations for the $k$-th machine. Since the number of guessed jobs of each size is $\tilde O(1/\varepsilon)$, the size of the dynamic programming table is at most $m\cdot(\tilde O(1/\varepsilon))^{d}$, and step 2 can be implemented with running time $2^{\tilde O(1/\varepsilon)}$.
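The minimum-machines computation of step 2 can be illustrated with a memoized brute force over count vectors (our own stand-in for the table-based dynamic program described above; names and the recursion are hypothetical, and this is only meant for tiny instances).

```python
from functools import lru_cache
from itertools import product

def min_machines(counts, sizes, T):
    """Minimum number of machines needed to schedule counts[i] jobs of
    size sizes[i] with makespan at most T.  Tries, for the next machine,
    every non-empty configuration that fits, and recurses on the
    remaining count vector (memoized)."""
    @lru_cache(maxsize=None)
    def solve(rem):
        if not any(rem):
            return 0
        best = float("inf")
        ranges = [range(min(r, T // s) + 1) for r, s in zip(rem, sizes)]
        for conf in product(*ranges):
            if any(conf) and sum(c * s for c, s in zip(conf, sizes)) <= T:
                best = min(best, 1 + solve(tuple(r - c for r, c in zip(rem, conf))))
        return best
    return solve(tuple(counts))
```

For example, two jobs of size 3 and two of size 2 fit on two machines of capacity 5 (one job of each size per machine), while three jobs of size 3 and capacity 3 require three machines.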
In step 3, our algorithm guesses the support of a thin solution restricted to simple configurations. Recall that if $x$ is thin then $|\mathrm{supp}(x)|\le2d\log(4dT)=\tilde O(1/\varepsilon)$. Let $s=|\mathcal{C}_s|$. Then this guess can be done in time $\binom{s}{\tilde O(1/\varepsilon)}\le s^{\tilde O(1/\varepsilon)}=2^{\tilde O(1/\varepsilon)}$.
It is for this step that thin solutions are particularly useful. Indeed, guessing the support among all configurations, of which there can be as many as $(T+1)^{d}=2^{\tilde O(1/\varepsilon)}$, would take time $2^{\tilde O(1/\varepsilon^{2})}$.
In step 4, the number of variables of the restricted ILP is $|S|\le2d\log(4dT)=\tilde O(1/\varepsilon)$. Moreover, the encoding size of the ILP is polynomial in $1/\varepsilon$ and $\log m$. Running Kannan's algorithm to solve the ILP therefore takes time $2^{\tilde O(1/\varepsilon)}\cdot\mathrm{poly}(\log m)$. Hence, the total running time of our algorithm can be bounded by $2^{\tilde O(1/\varepsilon)}$. ∎
Putting all pieces together, we conclude with the following theorem.
The minimum makespan problem on identical parallel machines admits an EPTAS with running time $2^{\tilde O(1/\varepsilon)}+O(n\log n)$.
Consider a scheduling instance with job set $J$, processing times $p_j$ for $j\in J$, and machine set $M$. The greedy algorithm by Graham used to obtain a 2-approximation can be implemented in time $O(n\log n)$. After guessing the makespan $T$, the processing times are sorted and rounded as described in Lemma 8. The rounding step can easily be implemented in $O(n\log n)$ time. Applying Algorithm 9 after the rounding needs, according to Theorem 10, a running time of $2^{\tilde O(1/\varepsilon)}$. Since there are at most $O(\log(1/\varepsilon))$ guessing rounds for the makespan, we obtain a total running time of $2^{\tilde O(1/\varepsilon)}+O(n\log n)$.
If then the running time is upper bounded by , otherwise, the running time is at most . In any case, the running time can be bounded by . ∎
4.1 Extension to other objectives
We consider a fixed function $f$ and the following general objectives:
(I) minimize $\sum_{i\in M}f(L_i)$;  (II) minimize $\max_{i\in M}f(L_i)$,
where $L_i$ denotes the load of machine $i$. Analogously, we study the maximization versions of the problems
(I') maximize $\sum_{i\in M}f(L_i)$;  (II') maximize $\min_{i\in M}f(L_i)$.
For the minimization versions of the problem we assume that $f$ is convex, while for (I') and (II') we assume it is concave. Moreover, we will need that the function $f$ satisfies the following sensitivity condition:
For all $\varepsilon>0$ there exists $\delta>0$ such that for all $x,y\ge0$ with $|y-x|\le\delta x$, it holds that $|f(y)-f(x)|\le\varepsilon f(x)$.
Alon et al. showed that each problem in this family admits a PTAS with running time $g(1/\varepsilon)+O(n\log n)$, where $g(1/\varepsilon)$ is a constant term that depends only on $\varepsilon$. Moreover, if the $\delta$ in the sensitivity condition can be chosen with $1/\delta$ polynomial in $1/\varepsilon$, the running time is doubly exponential in $1/\varepsilon$. In what follows we show how to improve this dependency. Since $\delta$ depends only on $\varepsilon$, we know that, for small enough $\varepsilon$, there exists a constant $c$ (independent of $n$ and $m$) such that $1/\delta\le(1/\varepsilon)^{c}$. Moreover, we can assume w.l.o.g. that $\delta\le\varepsilon$, and thus $1/\varepsilon\le1/\delta$.
It is worth noticing that many interesting functions belong to this family. In particular, (II) with $f(x)=x$ corresponds to the minimum makespan problem, and (I) with $f(x)=x^{p}$, for constant $p>1$, corresponds to a problem equivalent to minimizing the $\ell_p$-norm of the vector of loads. Similarly, (II') with $f(x)=x$ corresponds to maximizing the minimum machine load. Notice that for all these objectives the sensitivity condition holds with $1/\delta$ polynomial in $1/\varepsilon$.
The techniques of Alon et al. are based on a rounding method followed by solving an ILP. We base our results on the same rounding techniques. Consider an arbitrary instance of a scheduling problem on identical machines with objective function (I), (II), (I') or (II'). Their first observation is that if $\bar L$ denotes the average machine load, then any job whose processing time exceeds a suitable threshold (a constant multiple of $\bar L$) is scheduled alone on a machine in some optimal solution. Hence, we can remove such a job together with one machine from the instance. In what follows, we assume without loss of generality that all processing times are below this threshold. For the sake of brevity, we summarize the rounding techniques of Alon et al. in the following theorem.
Theorem 13 (Alon et al. ).
Consider an instance $I$ of the scheduling problem with job set $J$, identical machines $M$, and processing times $p_j$ for $j\in J$ that are below the threshold above. There exists a linear time algorithm that creates a new instance $I'$ with job set $J'$, machine set $M$, and processing times $p'_j$. Moreover, there is an integer $\ell$, bounded in terms of $1/\varepsilon$ and $1/\delta$, such that the new instance satisfies the following:
Each job in $J'$ has processing time at least $\ell$, and each processing time is an integer multiple of $\delta\ell$.
If then .
Let Opt and Opt$'$ be the optimal values of the instances $I$ and $I'$, respectively. Then $\mathrm{Opt}'\le(1+O(\varepsilon))\,\mathrm{Opt}$.
There exists a linear time algorithm that transforms a feasible solution for instance $I'$ with objective value $v'$ into a feasible solution for $I$ with objective value at most $(1+O(\varepsilon))\,v'$.
Given this result, it suffices to find a -approximate solution for instance . To do so, we further round the processing times as in the previous section by defining as the value rounded up to the next multiple of for all . Notice that