A Mazing $2+\epsilon$ Approximation for Unsplittable Flow on a Path


Aris Anagnostopoulos (Sapienza University of Rome, aris@dis.uniroma1.it), Fabrizio Grandoni (IDSIA, University of Lugano, fabrizio@idsia.ch), Stefano Leonardi (Sapienza University of Rome, leon@dis.uniroma1.it), and Andreas Wiese (Max-Planck-Institut für Informatik, Saarbrücken, awiese@mpi-inf.mpg.de; this work was partially supported by a fellowship within the Postdoc-Programme of the German Academic Exchange Service (DAAD)).
Abstract

We study the unsplittable flow on a path problem (UFP) where we are given a path with non-negative edge capacities and tasks, which are characterized by a subpath, a demand, and a profit. The goal is to find the most profitable subset of tasks whose total demand does not violate the edge capacities. This problem naturally arises in many settings such as bandwidth allocation, resource constrained scheduling, and interval packing.

A natural task classification defines the size of a task $i$ to be the ratio $\delta$ between the demand of $i$ and the minimum capacity of any edge used by $i$. If all tasks have sufficiently small $\delta$, the problem is already well understood and there is a $1+\epsilon$ approximation. For the complementary setting, that is, instances whose tasks all have large $\delta$, much remains unknown, and the best known polynomial-time procedure gives only (for any constant $\delta>0$) an approximation ratio of $6+\epsilon$.

In this paper we present a polynomial time $1+\epsilon$ approximation for the latter setting. Key to this result is a complex geometrically inspired dynamic program. Here each task is represented as a segment underneath the capacity curve, and we identify a proper maze-like structure so that each passage of the maze is crossed by only $O(1)$ tasks in the computed solution. In combination with the known PTAS for $\delta$-small tasks, our result implies a $2+\epsilon$ approximation for UFP, improving on the previous best $7+\epsilon$ approximation [Bonsma et al., FOCS 2011]. We remark that our improved approximation factor matches the best known approximation ratio for the considerably easier special case of uniform edge capacities.

1 Introduction

In the unsplittable flow on a path problem (UFP) we are given a set of $n$ tasks $T$ and a path $G=(V,E)$. Each edge $e\in E$ has a capacity $u_e$. Each task $i\in T$ is specified by a subpath $P(i)$ between the start (i.e. leftmost) vertex $s(i)$ and the end (i.e. rightmost) vertex $t(i)$, a demand $d(i)$, and a profit (or weight) $w(i)$. For each edge $e$, denote by $T_e$ all tasks $i$ using $e$, i.e. such that $e\in P(i)$. For a subset of tasks $T'\subseteq T$, let $w(T'):=\sum_{i\in T'}w(i)$ and $d(T'):=\sum_{i\in T'}d(i)$. The goal is to select a subset of tasks $T'$ with maximum profit $w(T')$ such that $d(T'\cap T_e)\le u_e$, for each edge $e$.
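The problem statement above can be made concrete with a small sketch. The code below is only an illustration under stated assumptions: vertices are labeled $0,1,2,\dots$ from left to right, edge `e` joins vertices `e` and `e+1`, and the names `Task`, `is_feasible`, and `brute_force_optimum` are hypothetical helpers, not part of the paper. The exhaustive search is exponential and serves only to define the objective on toy instances.

```python
from itertools import combinations
from typing import List, NamedTuple, Sequence, Tuple


class Task(NamedTuple):
    start: int   # leftmost vertex s(i)
    end: int     # rightmost vertex t(i), with end > start
    demand: int  # d(i)
    profit: int  # w(i)


def load_per_edge(tasks: Sequence[Task], num_edges: int) -> List[int]:
    """Total demand routed over each edge; edge e joins vertices e and e+1."""
    load = [0] * num_edges
    for task in tasks:
        for e in range(task.start, task.end):
            load[e] += task.demand
    return load


def is_feasible(tasks: Sequence[Task], capacities: Sequence[int]) -> bool:
    """A task set is feasible if no edge capacity is exceeded."""
    loads = load_per_edge(tasks, len(capacities))
    return all(l <= u for l, u in zip(loads, capacities))


def brute_force_optimum(tasks: Sequence[Task],
                        capacities: Sequence[int]) -> Tuple[int, Tuple[Task, ...]]:
    """Try every subset (exponential time); only meant for tiny examples."""
    best: Tuple[int, Tuple[Task, ...]] = (0, ())
    for r in range(1, len(tasks) + 1):
        for subset in combinations(tasks, r):
            if is_feasible(subset, capacities):
                profit = sum(t.profit for t in subset)
                if profit > best[0]:
                    best = (profit, subset)
    return best


# Toy instance: three edges (0,1), (1,2), (2,3).
capacities = [4, 2, 5]
tasks = [Task(0, 3, 2, 5), Task(0, 1, 2, 3), Task(1, 3, 1, 2), Task(2, 3, 3, 4)]
best_profit, best_subset = brute_force_optimum(tasks, capacities)
```

On this toy instance selecting all four tasks overloads the middle edge, so the search has to drop at least one task.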

The problem and variations of it are motivated by several applications in settings such as bandwidth allocation, interval packing, multicommodity flow, and scheduling. For example, edge capacities might model a given resource whose supply varies over a time horizon. Here, tasks correspond to jobs with given start- and end-times and each job has a fixed demand for the mentioned resource. The goal is then to select the most profitable subset of jobs whose total demand at any time can be satisfied with the available resources.

When studying the problem algorithmically, a natural classification of the tasks is the following: W.l.o.g., let us assume that edge capacities are all distinct (this can be achieved by slight perturbations and scaling, see [4]). For each task $i$ define its bottleneck capacity $b(i):=\min\{u_e : e\in P(i)\}$. Let also the bottleneck edge $e(i)$ of $i$ be the edge of $P(i)$ with capacity $b(i)$. W.l.o.g. we can assume $d(i)\le b(i)$, otherwise task $i$ can be discarded. For any value $\delta\in(0,1]$ we say that a task $i$ is $\delta$-large if $d(i)>\delta\cdot b(i)$ and $\delta$-small otherwise.
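The classification can be sketched in a few lines. This is a minimal illustration, assuming 0-based vertex labels (edge `e` joins vertices `e` and `e+1`) and assuming that "large" means demand strictly greater than $\delta$ times the bottleneck capacity; the function names are hypothetical.

```python
from fractions import Fraction
from typing import Sequence, Tuple


def bottleneck(start: int, end: int, capacities: Sequence[int]) -> Tuple[int, int]:
    """Return (bottleneck capacity, bottleneck edge index) of the subpath start..end."""
    edge = min(range(start, end), key=lambda e: capacities[e])
    return capacities[edge], edge


def is_delta_large(demand: int, start: int, end: int,
                   capacities: Sequence[int], delta: Fraction) -> bool:
    """Assumption: a task is delta-large if demand > delta * bottleneck capacity."""
    b, _ = bottleneck(start, end, capacities)
    return demand > delta * b


capacities = [4, 2, 5]
half = Fraction(1, 2)
spans_all = is_delta_large(2, 0, 3, capacities, half)   # bottleneck 2 on edge 1
last_edge = is_delta_large(1, 2, 3, capacities, half)   # bottleneck 5 on edge 2
```

Exact rationals are used for $\delta$ so that the comparison is not affected by floating-point rounding.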

If all tasks are $\delta$-small ($\delta$-small instances), then the problem is well understood. In particular, by applying the LP-rounding and grouping techniques from [4], we immediately obtain the following result (see the appendix for its proof).

Lemma 1.

For any $\epsilon>0$ there is a $\delta>0$ such that in polynomial time one can compute a $(1+\epsilon)$-approximation for $\delta$-small instances of UFP.

However, much remains unclear for the complementary case where all tasks are $\delta$-large ($\delta$-large instances), even if $\delta$ is very close to $1$. Importantly, for such instances the canonical LP has an integrality gap of $\Omega(n)$ [6]. The best known approximation factor for this setting is $2k$, where $k\in\mathbb{N}$ is such that $\delta>1/k$ (and in particular $2k\ge 4$) [4]. Bonsma et al. [4] reduce the problem to an instance of maximum independent set of rectangles and this approach inherently loses a factor of $4$ in the approximation ratio. The best known $(6+\epsilon)$-approximation algorithm for $\delta$-large instances, for any $\delta>0$, combines the approach above (with $\delta=1/2$) with another algorithm, which is $(2+\epsilon)$-approximate for instances that are $1/2$-small and $\delta$-large at the same time. Combining this $(6+\epsilon)$-approximation with the result from Lemma 1, one obtains the currently best $(7+\epsilon)$-approximation algorithm for UFP on general instances [4].

1.1 Related Work

As said above, the best known polynomial time approximation algorithm for unsplittable flow on a path achieves an approximation factor of $7+\epsilon$ [4]. This result improves on the previously best known polynomial time $O(\log n)$-approximation algorithm designed by Bansal et al. [3]. When allowing more running time, there is a quasi-PTAS, that is, a $(1+\epsilon)$-approximation running in time $2^{\mathrm{polylog}(n)}$ that additionally assumes a quasi-polynomial bound on the edge capacities and the demands of the input instance [2]. In terms of lower bounds, the problem is strongly NP-hard, even in the case of uniform edge capacities and unit profits [4, 10, 11].

The canonical LP-relaxation suffers from an $\Omega(n)$ integrality gap [6]. Adding further constraints, Chekuri, Ene, and Korula give an LP relaxation with an integrality gap of only $O(\log^2 n)$ [8], which was recently improved to $O(\log n)$ [7].

Because of the difficulty of the general problem, researchers have studied special cases. A very common assumption is the no-bottleneck assumption (NBA), which requires that $\max_i d(i)\le\min_e u_e$. Chekuri, Mydlarz, and Shepherd [9] give the currently best known $(2+\epsilon)$-approximation algorithm under NBA. Note that this matches the best known result for the further restricted case of uniform edge capacities of Calinescu et al. [5].

When generalizing the problem to trees, Chekuri et al. [8] give an $O(\log(1/\gamma)/\gamma^3)$-approximation algorithm for the case that all tasks are $(1-\gamma)$-small. Under the NBA, Chekuri et al. [9] design a 48-approximation algorithm. Note that on trees the problem becomes APX-hard, even for unit demands, edge capacities being either 1 or 2, and trees with depth three [12].

On arbitrary graphs, UFP generalizes the well-known Edge Disjoint Path Problem (EDPP). On directed graphs, there is an $O(\sqrt{|E|})$ approximation algorithm by Kleinberg [14], which matches the lower bound of $\Omega(|E|^{1/2-\epsilon})$ by Guruswami et al. [13]. When assuming the NBA, Azar and Regev [1] give an $O(\sqrt{|E|})$ approximation algorithm for UFP. On the other hand, they show that without the NBA the problem is NP-hard to approximate within a factor better than $|E|^{1-\epsilon}$.

1.2 Our Contribution

In this paper, we present a polynomial-time approximation scheme for $\delta$-large instances of UFP (for any constant $\delta>0$), improving on the previous best $6+\epsilon$ approximation [4]. We remark that instances with only $\delta$-large tasks might be relevant in practice.

Furthermore, in combination with the algorithm from Lemma 1, our PTAS implies a $2+\epsilon$ approximation for arbitrary UFP instances, without any further assumptions (such as the NBA or restrictions on edge capacities). This improves on the previous best $7+\epsilon$ approximation for the problem [4]. Note that our $2+\epsilon$ approximation matches the best known approximation factor for the considerably easier special case of uniform edge-capacities [5], where, in particular, the canonical LP has an integrality gap of a small constant and the $\delta$-large tasks can be handled easily with a straightforward dynamic program (DP). Therefore, we close the gap in terms of (known) polynomial-time approximation ratios between the uniform and general case.

Figure 1: An example of a capacity curve, with segments associated to some tasks (dashed) and m-tasks (bold). Note that the depicted maze pair is 2-thin.

When solving general UFP, large tasks are difficult to handle: as mentioned above, the canonical LP-relaxation suffers from an integrality gap of $\Omega(n)$ [6]. Also, in contrast to the NBA-case there is no canonical polynomial time dynamic program for them since the number of large tasks per edge can be up to $\Omega(n)$ (whereas under the NBA it is $O(1/\delta^2)$, see [9]).

In [4] a dynamic program for large tasks was presented; however, their reduction to maximum independent set in rectangle intersection graphs inherently loses a factor of 4 in the approximation ratio, already for $1/2$-large tasks (and even a factor of $2k$ for $1/k$-large tasks).

Our PTAS for $\delta$-large instances is also a dynamic program, but it deviates substantially from the DP-approaches above. We exploit the following geometric viewpoint. We represent capacities with a closed curve on the 2D plane (the capacity curve) as follows: Let us label nodes from $1$ to $|V|$ going from left to right. For each edge $e=(v,v+1)$, we draw a horizontal line segment (or segment for short) $[v,v+1]\times\{u_e\}$. Then we add a horizontal segment $[1,|V|]\times\{0\}$, and vertical segments in a natural way to obtain a closed curve. We represent each task $i$ as a horizontal segment $[s(i),t(i)]\times\{b(i)\}$. In particular, this segment is contained in the capacity curve, and touches the horizontal segment corresponding to its bottleneck edge (see Figure 1).
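The geometric viewpoint can be sketched as follows. This is only an illustration under assumptions: vertices are labeled from $0$ here (not as in the text), edge `e` spans the $x$-range `[e, e+1]`, and all function names are hypothetical. A task is drawn at the height of its bottleneck capacity, so its segment lies under the curve and touches the curve at its bottleneck edge.

```python
from typing import List, Sequence, Tuple

Segment = Tuple[Tuple[int, int], int]  # ((x_left, x_right), y)


def capacity_curve_segments(capacities: Sequence[int]) -> List[Segment]:
    """One horizontal segment per edge, drawn at the height of its capacity."""
    return [((v, v + 1), u) for v, u in enumerate(capacities)]


def task_segment(start: int, end: int, capacities: Sequence[int]) -> Segment:
    """A task is drawn over its subpath at the height of its bottleneck capacity."""
    return ((start, end), min(capacities[start:end]))


def lies_under_curve(segment: Segment, capacities: Sequence[int]) -> bool:
    """The segment never rises above the capacity of an edge it spans."""
    (x_left, x_right), y = segment
    return all(y <= capacities[e] for e in range(x_left, x_right))


def touches_bottleneck(segment: Segment, capacities: Sequence[int]) -> bool:
    """The segment meets the curve on at least one edge (its bottleneck edge)."""
    (x_left, x_right), y = segment
    return any(y == capacities[e] for e in range(x_left, x_right))


capacities = [4, 2, 5]
curve = capacity_curve_segments(capacities)
seg = task_segment(0, 3, capacities)
```

Note that the demand of a task does not appear in this picture; only its subpath and bottleneck capacity determine the segment.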

Canonically, one might want to traverse the path from left to right and introduce DP-cells encoding all possible choices for certain subpaths. Instead, we traverse the area underneath the capacity profile using the above geometric representation, going from the root in the bottom left to the dead-ends of the maze (the leaves). To guide this traversal, we use some tasks which we call maze tasks below. Those tasks fulfill two functions. First, they structure the area within the capacity curve into a maze with a tree topology such that, intuitively, each passage of the maze is crossed by only $O(1)$ tasks (see Figure 1). We will refer to this property as $k$-thinness later. It will be crucial for bounding the number of DP-cells.

One remaining difficulty is that when traversing the maze in higher regions, we cannot afford to remember precisely which tasks were selected in lower regions. To this end, the maze tasks have a second function. We use them to make it affordable to “forget” some decisions while moving from the root to the leaves. Before returning a solution we remove all maze tasks from the computed set. By constructing our algorithm carefully, we ensure that the capacity of each maze task compensates for the information we allowed ourselves to “forget” due to it. We call a solution weakly feasible if it balances the latter correctly. So our DP computes a weakly feasible $k$-thin pair $(T',M')$ where $T'$ is a set of tasks and $M'$ is a set of maze tasks. The final output consists only of $T'$, whose weight we seek to maximize, and the DP computes the optimal solution among all such pairs $(T',M')$.

Since at the end we will remove the maze tasks of a computed solution, we need to ensure that there is in fact a solution $(T',M')$ where the weight of $T'$ is close to the optimum $w(OPT)$. This is proved by a non-trivial sequence of reductions where eventually the tasks of $OPT$ are mapped into directed paths of a proper rooted tree. On those paths we define a min-flow LP where each integral solution induces a $k$-thin pair $(T',M')$ in which the tasks of $OPT$ are partitioned into those kept in $T'$ and those replaced by the m-tasks in $M'$. The objective is to minimize the weight of the replaced tasks. The claim then follows by showing that there exists a cheap fractional solution of weight at most $\epsilon\cdot w(OPT)$, and that the LP matrix is totally unimodular.

2 Overview of the Algorithm

In this section we describe our methodology, which results in a polynomial-time $(1+\epsilon)$-approximation algorithm for $\delta$-large UFP instances (for any two given constants $\epsilon,\delta>0$). After running a polynomial time preprocessing routine we can assume that each vertex is either the start or the end vertex of exactly one task in $T$ (similarly as in [2]). Thus, the number of nodes in the graph is $2n$.

Now we define the maze tasks, or m-tasks for short, which we use to structure our solution. For each pair of tasks and  that share the same bottleneck edge (possibly ), we define an m-task with . Analogously to regular tasks, we set and . Furthermore, we define and . Let be the m-tasks with . Note that, by the above preprocessing, no two m-tasks with different bottleneck capacity share the same endpoint.

Our goal is to search for solutions in the form of maze pairs $(T',M')$, where $T'$ is a set of tasks and $M'$ is a set of m-tasks, and where we require for any two different m-tasks $m,m'\in M'$ that $e(m)\neq e(m')$. Let $k$ be a proper integer constant to be defined later. We restrict our attention to maze pairs that are $k$-thin and weakly feasible, as defined below.

Intuitively, a maze pair $(T',M')$ is $k$-thin if, for any edge $e$, between two consecutive line segments associated to m-tasks from $M'$ there are at most $k$ segments associated to tasks in $T'$ (see Figure 1).

Definition 2 (-thinness).

A maze pair is -thin if for every edge and every set with there is an m-task such that .

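In Section 3 we prove that, for large enough $k$, there exists a $k$-thin maze pair $(T',M')$ so that $T'$ is a good approximation to the optimum and $T'\cup M'$ is feasible (i.e., on each edge $e$ the total demand of the tasks and m-tasks using $e$ is at most $u_e$).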

Lemma 3.

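For any $\epsilon,\delta>0$ there is a $k\in\mathbb{N}$, such that for any $\delta$-large instance of UFP, there exists a $k$-thin maze pair $(T',M')$ such that $w(T')\ge(1-\epsilon)\cdot w(OPT)$ and $T'\cup M'$ is feasible.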

However, we are not able to compute the most profitable $k$-thin maze pair in polynomial time. For this reason we relax the notion of feasibility of a maze pair $(T',M')$ so that $T'\cup M'$ might not be feasible, but $T'$ alone is still feasible (which is sufficient for our purposes). We need some definitions first. For every m-task $m$ and any subset of tasks $T'$, we partition the set of tasks of $T'$ sharing some edge with $m$ into three (disjoint) subsets:

  • (above tasks) .

  • (critical tasks) .

  • (subcritical tasks) .

We also define , and we define analogously and .

Definition 4 (Weak feasibility).

We define a maze pair to be weakly feasible if for every edge it holds that , where is the m-task in of largest bottleneck capacity, or , if .

Next we show that weak feasibility of a maze pair $(T',M')$ implies feasibility of $T'$.

Lemma 5.

Let $(T',M')$ be a weakly feasible maze pair. Then $T'$ is feasible.

Proof.

We order the edges by their capacities in non-decreasing order. Assume w.l.o.g. that this order is given by $e_1,\dots,e_{|E|}$. We prove the claim by induction on the index $j$. More precisely, we prove that $d(T'\cap T_{e_j})\le u_{e_j}$ for all $j$.

Consider first . If there is no m-task using , then the claim is true by definition. Otherwise let be the (only) m-task in . All tasks must have (in particular, they are critical for ). Thus .

Now suppose by induction that there is a value such that for all . Consider the edge . Once again if there is no m-task using , then the claim is true by definition. Otherwise, let . Consider the subcritical tasks . By definition, is not the bottleneck edge of any task in . We partition into the sets and , containing the tasks with bottleneck edge on the left of and on the right of , respectively. Consider the set . Let be a task with maximum bottleneck capacity in and let all tasks in use and . Using the induction hypothesis on , we obtain that . Similarly, we obtain that . Since the m-task compensates for all tasks in , that is, . Hence

where the last inequality follows from the weak feasibility of . ∎

Note that, by definition, the maze pair obtained in Lemma 3 is also weakly feasible. In Section 4 we present a polynomial-time dynamic program that computes the weakly feasible $k$-thin maze pair with highest profit.

Lemma 6.

For any constants $k\in\mathbb{N}$ and $\delta>0$, there is a polynomial-time dynamic program that computes a weakly feasible $k$-thin maze pair $(T',M')$ of largest profit $w(T')$.

A crucial property that we exploit in the design of our dynamic program is that for each m-task in a weakly feasible maze pair the number of critical tasks is bounded by a constant depending only on $\delta$.

Lemma 7.

Let be a weakly feasible maze pair and . It holds that .

Proof.

First recall that by Lemma 5, is a feasible solution. Consider the tasks with . Because all tasks are -large, there can be at most such tasks. The remaining tasks have and must use the leftmost edge of or the rightmost edge of (or both). Consider the tasks of the first type: we will show that . A symmetric argument holds for the remaining tasks , hence giving the claim. Consider the task that has the largest . By the definition of and all tasks in must use and . Each task is critical for and thus . Also, is -large and so . Therefore, there can be at most such tasks. ∎

By combining Lemmas 3, 5 and 6 we obtain the main theorem of this paper.

Theorem 8.

For any constant $\delta>0$, there is a PTAS for $\delta$-large instances of UFP.

Combining Theorem 8 with Lemma 1, we obtain the following corollary.

Corollary 9.

There is a polynomial time $(2+\epsilon)$ approximation algorithm for UFP.

3 A Thin Profitable Maze Pair

In this section we prove Lemma 3: we present a procedure that, given an (optimal) solution $OPT$, carefully selects some of the tasks from $OPT$ and replaces them with m-tasks that use the same edges. The tasks are selected in such a way that the tasks removed from $OPT$ have small weight, and at the same time the resulting maze pair is $k$-thin. Hence, our proof is constructive and even leads to a polynomial time algorithm; note however that for our purposes a non-constructive argument would be sufficient.

Figure 2: Construction of the maze. (a) Initial instance; the numbers identify some of the tasks. (b) Splitting of the line segments into two families, one drawn in bold and the other dashed. (e) The tree created from the decomposition of (d); for representation issues, arc directions are omitted and some nodes are contracted. A and B indicate the nodes corresponding to the splittings in (d).

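Let $OPT$ be the optimal solution for the instance under consideration. Let $\ell(i)$ denote the line segment associated to each task $i\in OPT$. Define $L:=\{\ell(i) : i\in OPT\}$ and $w(\ell(i)):=w(i)$. We say that a segment $\ell(i)$ contains an edge $e=(v,v+1)$ if $[v,v+1]\subseteq[s(i),t(i)]$, where we assume that the vertices of the graph are labeled by $1,\dots,|V|$ from left to right.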

We want to select a subset $L'\subseteq L$ such that $w(L')$ is at most $\epsilon\cdot w(L)$ and any vertical segment intersecting more than $k$ segments in $L$ intersects at least one segment in $L'$. We call a set $L'$ with the latter property $k$-thin for $L$.
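The thinness condition on segment sets can be checked directly on small examples. The sketch below is an illustration under assumptions, not the paper's procedure: segments are triples `(x_left, x_right, y)` with integer endpoints, vertical probes are placed in the interior of a unit edge `[e, e+1]`, and `is_k_thin` is a hypothetical helper name. A vertical probe meets exactly the segments over that edge whose height falls in some range, so it suffices to test every height range spanned by two segments.

```python
from typing import List, Sequence, Set, Tuple

Seg = Tuple[int, int, int]  # (x_left, x_right, y)


def is_k_thin(segments: Sequence[Seg], chosen: Set[int], k: int) -> bool:
    """Check that every vertical probe meeting more than k segments meets a chosen one.

    `chosen` holds indices into `segments` (the candidate thin set).
    """
    if not segments:
        return True
    left = min(s[0] for s in segments)
    right = max(s[1] for s in segments)
    for e in range(left, right):  # unit edge [e, e+1]
        crossing: List[Tuple[int, int]] = sorted(
            (y, idx) for idx, (x0, x1, y) in enumerate(segments)
            if x0 <= e and e + 1 <= x1
        )
        for a in range(len(crossing)):
            for b in range(a, len(crossing)):
                lo, hi = crossing[a][0], crossing[b][0]
                hit = [idx for (y, idx) in crossing if lo <= y <= hi]
                if len(hit) > k and not any(idx in chosen for idx in hit):
                    return False
    return True


segments = [(0, 3, 1), (0, 3, 2), (0, 2, 3), (1, 3, 4)]
ok_with_second = is_k_thin(segments, {1}, 2)
fails_with_top = is_k_thin(segments, {3}, 2)
```

In the toy instance, the segment at height 2 spans every edge, so choosing it alone hits every probe that crosses three or more segments; the topmost segment does not reach the leftmost edge and therefore does not suffice.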

As we will show, for proving Lemma 3 it suffices to find a $k$-thin set $L'$ for $L$ because of the following transformation of $L'$ into a maze-pair $(T',M')$. We define $T':=\{i\in OPT : \ell(i)\notin L'\}$. For constructing $M'$ we group the lines in $L'$ according to the bottleneck edges of their corresponding tasks. For each edge $e$, we define $L'_e:=\{\ell(i)\in L' : e(i)=e\}$. Now for each edge $e$ with $L'_e\neq\emptyset$ we add an m-task $m$ into $M'$ whose endpoints are the leftmost and rightmost node of the path $P(i_L)\cup P(i_R)$ for the task $i_L$ with leftmost start vertex and the task $i_R$ with rightmost end vertex among the tasks with segments in $L'_e$ (in a sense, we glue $i_L$ and $i_R$ together to form an m-task). Observe that, as required in the definition of a maze pair, $e(m)\neq e(m')$ for any two distinct $m,m'\in M'$.

Lemma 10.

If a set is -thin for , then the maze pair is -thin and is feasible.

Proof.

Consider any edge , and any set of tasks . Let be of them with lowest bottleneck capacity, in non-decreasing order of bottleneck capacity. Since is feasible, and since the tasks in are -large, there cannot be more than tasks in of bottleneck capacity equal to . It follows that for all . Consider a vertical line segment with -coordinate that intersects . Since is -thin, must intersect some segment . The corresponding task then induces an m-task with . Hence is -thin.

For the feasibility of recall that is feasible and all tasks in are -large. Hence, on every edge each m-task uses at most as much capacity as the tasks from whose segments are in (the latter tasks in a sense were replaced by ).∎

Next we reduce the problem of finding a -thin set with low weight to the case that each segment starts at and either goes only to the right or only to the left. See Figures 2(a) and 2(b). Formally, we split each segment into two segments and such that contains the edges of between and the right vertex of the bottleneck edge and symmetrically for . So and overlap on . We set . We define and . The next lemma shows that it suffices to find low weight -thin sets for and .

Lemma 11.

Given -thin sets for and for , there is a -thin set for with .

Proof.

We add a line segment to if and only if or . It follows directly that . Now any vertical segment crossing at least segments in must either cross segments from or segments from . Thus, crosses a segment in or a segment in , and hence crosses a segment in . ∎

Consider now only the segments (a symmetric argument holds for ). The next step is to reduce the problem to the case where, intuitively speaking, the edge capacities are strictly increasing and all segments contain the leftmost edge of the graph. To simplify the description of the step after this one, we also enforce that new segments have different -coordinates. Formally, let us assume that task labels are integers between and (in any order). For each , we construct a segment , which we denote by . Here and (so that ). Define and . (See Figure 2(c).)

Lemma 12.

Given a -thin set for , there is a -thin set for with . A symmetric claim holds for and .

Proof.

We prove the first claim only, the proof of the second one being symmetric. Let . Clearly . Consider any vertical segment that intersects at least segments from . Let be such segments of lowest capacity, breaking ties according to the lowest label of the corresponding tasks. To prove the lemma it suffices to show that at least one such segment belongs to .

W.l.o.g., assume that for any , is equal to or to the left of , and if . Then by construction , where is the -coordinate of segment . Consider a vertical segment . For small enough, we can assume that intersects precisely the segments . Hence for some . It follows that as required. ∎

It remains to prove that there is a -thin set for whose weight is bounded by . We do this by reducing this problem to a min-flow problem in a directed tree network.

Let be even. We consider the following hierarchical decomposition of the segments in , which corresponds to a (directed) rooted out-tree (see Figures 2(d) and 2(e)). We construct iteratively, starting from the root. Each node of is labelled with a triple , where is an edge in , is an interval, and contains all segments that contain and whose -coordinate is in (the representative segments of ). Let be the rightmost edge that is contained in at least segments. We let the root of be labelled with . For any constructed node , if is the leftmost edge of the graph, then is a leaf. Otherwise, consider the edge to the left of , and let be the segments in that contain . Note that, by the initial preprocessing of the instance, each edge can be the rightmost edge of at most one segment (task), hence . If , we append to a child (with a directed arc ) with label . Otherwise (i.e., if ), we append to two children and , which are labelled as follows. Let be the segments in , sorted increasingly by -coordinate. We partition into and . Let be a value such that all segments in have a -coordinate strictly less than . We label and with and , respectively.

Consider a given segment , and the nodes of that have as one of their representative segments . Then the latter nodes induce a directed path in . To see this, observe that if , then either is a leaf or for exactly one child of . Furthermore, each belongs to for some leaf of (i.e., no is empty).

We call a set of segments a segment cover if for each node of it holds that .

Lemma 13.

If is a segment cover then is -thin for . A symmetric claim holds for .

Proof.

We prove the first claim only, the proof of the second one being symmetric. Consider any vertical segment crossing at least segments from , and let be such segments of lowest -coordinate. Let also be the edge such that , and be the segments containing edge in increasing order of coordinate. Observe that segments induce a subsequence of . Furthermore, the representative sets of nodes such that partition into subsequences, each one containing between and segments. It follows that there must be one node such that . Since by assumption, it follows that for some . ∎

It remains to show that there is a segment cover with small weight.

Lemma 14.

There exists a segment cover with (where is the parameter used in the construction of ).

Proof.

We can formulate the problem of finding a satisfying the claim as a flow problem. We augment by appending a dummy node to each leaf node with a directed edge (so that all the original nodes are internal) and extend the paths consequently (so that each path contains exactly one new edge ).

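We define a min-flow problem, specified by a linear program. For each directed path $P_i$ we define a variable $x_i$, and we write $w_i$ for the weight of the segment that defines $P_i$. Let $A$ denote the set of all arcs in the tree. For each arc $a\in A$ denote by $I_a$ all values $i$ such that $P_i$ uses $a$. We solve the following LP:

$$\min \sum_{i} w_i\, x_i \qquad \text{s.t.} \qquad \sum_{i\in I_a} x_i \ge 1 \quad \forall a\in A, \qquad x_i \ge 0 \quad \forall i.$$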

By the construction of every arc is used by at least paths. Hence, the linear program has a fractional solution of weight , which is obtained by setting for each . Since the underlying network is a directed tree and all paths follow the direction of the arcs, the resulting network flow matrix is totally unimodular, see [15]. Therefore, there exists also an integral solution with at most the same weight. This integral solution induces the set . ∎
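The covering argument in this proof can be illustrated on a toy instance. The sketch below is not the paper's algorithm: it uses exhaustive search instead of an LP solver, the tree is a simple directed chain, and all names (`min_weight_cover`, `uniform_fractional_weight`) are hypothetical. It only shows the two quantities being compared: if every arc lies on at least $q$ paths, setting every variable to $1/q$ is a feasible fractional solution, and on such tree instances total unimodularity guarantees an integral cover that is no more expensive.

```python
from fractions import Fraction
from itertools import combinations
from typing import FrozenSet, Optional, Sequence, Tuple


def min_weight_cover(paths: Sequence[FrozenSet[int]], weights: Sequence[int],
                     arcs: Sequence[int]) -> Tuple[Optional[int], Optional[Tuple[int, ...]]]:
    """Cheapest set of paths such that every arc lies on a selected path (brute force)."""
    best_weight, best_cover = None, None
    for r in range(1, len(paths) + 1):
        for cover in combinations(range(len(paths)), r):
            covered = set().union(*(paths[i] for i in cover))
            if covered >= set(arcs):
                weight = sum(weights[i] for i in cover)
                if best_weight is None or weight < best_weight:
                    best_weight, best_cover = weight, cover
    return best_weight, best_cover


def uniform_fractional_weight(paths: Sequence[FrozenSet[int]], weights: Sequence[int],
                              arcs: Sequence[int]) -> Fraction:
    """Weight of the fractional solution x_i = 1/q, where q is the minimum arc usage."""
    q = min(sum(1 for p in paths if a in p) for a in arcs)
    return Fraction(sum(weights), q)


# Toy chain with arcs 0 -> 1 -> 2; each path is a contiguous set of arcs.
arcs = [0, 1, 2]
paths = [frozenset({0, 1, 2}), frozenset({0, 1}), frozenset({1, 2}), frozenset({2})]
weights = [4, 1, 3, 2]
integral_weight, integral_cover = min_weight_cover(paths, weights, arcs)
fractional_weight = uniform_fractional_weight(paths, weights, arcs)
```

Here every arc lies on at least two paths, so the uniform fractional solution costs half the total weight, and the cheapest integral cover is indeed no more expensive than that.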

Now the proof of Lemma 3 follows from the previous reductions.

Proof of Lemma 3.

Suppose we are given the optimal solution . As described above, we construct the sets , , , , and . We compute segment covers for and for as described in the proof of Lemma 14. By Lemma 13 they are -thin for and , respectively. By Lemma 12 we obtain -thin sets and for and , respectively, with and . By Lemma 11 this yields a -thin set for whose weight is bounded by . Finally, set . This maze pair is feasible by definition. Furthermore, by Lemma 10, it is -thin and its weight is bounded by . ∎

4 The Dynamic Program

In this section we present a dynamic program computing the weakly feasible $k$-thin maze pair with maximum profit (where $k$ will correspond to the constant of Lemma 3). Thus, we prove Lemma 6.

Let . To simplify the description and analysis of our DP, we introduce the following assumptions and notations. For having a clearly defined root in the DP, we add an edge to the left of with (note that is used by no task). For notational convenience, we add to two special dummy m-tasks and . The paths of and span all the edges of the graph, and they both have demand zero. Furthermore, and . In particular, with these definitions we have that , , and . We let be the rightmost edge of the graph, and we leave unspecified. However, when talking about weak-feasibility and -thinness of a maze pair we will ignore dummy tasks, that is, we will implicitly consider .

For any , , and any two m-tasks and with , the boundary tasks in for the triple are the tasks . Intuitively, boundary tasks are the tasks using edge such that the segment corresponding to  is sandwiched between the segments corresponding to and .

In our dynamic programming table we introduce a cell for each entry of the form where:

  • is an edge;

  • and , ;

  • and , with ;