Sparsest Cut on Bounded Treewidth Graphs: Algorithms and Hardness Results
We give a 2-approximation algorithm for Non-Uniform Sparsest Cut that runs in time $n^{O(k)}$, where $k$ is the treewidth of the graph. This improves on the previous $2^{2^k}$-approximation in time $\operatorname{poly}(n) \cdot 2^{O(k)}$ due to Chlamtáč et al. [CKR10].
To complement this algorithm, we show the following hardness results: If the Non-Uniform Sparsest Cut problem has a $\rho$-approximation for series-parallel graphs (where $\rho \ge 1$), then the MaxCut problem has an algorithm with approximation factor arbitrarily close to $1/\rho$. Hence, even for such restricted graphs (which have treewidth 2), the Sparsest Cut problem is NP-hard to approximate better than $17/16 - \varepsilon$ for every $\varepsilon > 0$; assuming the Unique Games Conjecture the hardness becomes $1/\alpha_{GW} - \varepsilon$. For graphs with large (but constant) treewidth, we show a hardness result of $2 - \varepsilon$ assuming the Unique Games Conjecture.
Our algorithm rounds a linear program based on (a subset of) the Sherali-Adams lift of the standard Sparsest Cut LP. We show that even for treewidth-2 graphs, the LP has an integrality gap close to 2 even after polynomially many rounds of Sherali-Adams. Hence our approach cannot be improved even on such restricted graphs without using a stronger relaxation.
The Sparsest Cut problem takes as input a “supply” graph $G = (V, E_G)$ with positive edge capacities $c_e$, and a “demand” graph $H = (V, E_H)$ (on the same set of vertices $V$) with demand values $D_{ij}$, and aims to determine
$$\Phi^* \;:=\; \min_{\emptyset \neq S \subsetneq V} \;\frac{\sum_{e \in \partial_G(S)} c_e}{\sum_{(i,j) \in \partial_H(S)} D_{ij}},$$
where $\partial_F(S)$ denotes the edges crossing the cut $(S, V \setminus S)$ in graph $F$. When $H = K_n$ with $D_{ij} = 1$ for all pairs $i, j$, the problem is called Uniform Demands Sparsest Cut, or simply Uniform Sparsest Cut. Our results all hold for the non-uniform demands case.
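To make the objective concrete, here is a small brute-force sketch (the helper names `sparsity` and `sparsest_cut` are ours, not from the paper): it enumerates all cuts of a 4-cycle supply graph with a single demand pair and returns a minimum-sparsity cut.

```python
from itertools import combinations

def sparsity(S, supply, demand):
    """Capacity of supply edges crossing (S, V \\ S) divided by the demand crossing it."""
    cap = sum(c for (u, v), c in supply.items() if (u in S) != (v in S))
    dem = sum(d for (u, v), d in demand.items() if (u in S) != (v in S))
    return float('inf') if dem == 0 else cap / dem

def sparsest_cut(V, supply, demand):
    """Exact sparsest cut by enumerating all nontrivial cuts (tiny graphs only)."""
    cuts = (set(S) for r in range(1, len(V)) for S in combinations(V, r))
    best = min(cuts, key=lambda S: sparsity(S, supply, demand))
    return best, sparsity(best, supply, demand)

# Supply graph: the 4-cycle 0-1-2-3-0 with unit capacities.
# Demand graph: a single unit demand between the antipodal pair (0, 2).
V = [0, 1, 2, 3]
supply = {(0, 1): 1.0, (1, 2): 1.0, (2, 3): 1.0, (3, 0): 1.0}
demand = {(0, 2): 1.0}
S, phi = sparsest_cut(V, supply, demand)
print(phi)  # 2.0 -- every cut separating 0 from 2 crosses two supply edges
```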
The Sparsest Cut problem is known to be NP-hard due to a result of Matula and Shahrokhi [MS90], even for unit capacity edges and uniform demands. The best algorithm for Uniform Sparsest Cut on general graphs is an $O(\sqrt{\log n})$-approximation due to Arora, Rao, and Vazirani [ARV09]; for Non-Uniform Sparsest Cut the best factor is $O(\sqrt{\log n}\,\log\log n)$ due to Arora, Lee and Naor [ALN08]. An older $O(\sqrt{\log n})$-approximation for Non-Uniform Sparsest Cut is known for all excluded-minor families of graphs [Rao99], and constant-factor approximations exist for more restricted classes of graphs [GNRS04, CGN06, CJLV08, LR10, LS09, CSW10]. Constant-factor approximations are known for Uniform Sparsest Cut for all excluded-minor families of graphs [KPR93, Rab03]. [GS13] give a $(1+\varepsilon)$-approximation algorithm for Non-Uniform Sparsest Cut that runs in time depending on the generalized spectrum of the graphs $G$ and $H$. All the above results, except [GS13], consider either the standard linear or SDP relaxations. The integrality gaps of convex relaxations of Sparsest Cut are intimately related to questions of embeddability of finite metric spaces into $\ell_1$; see, e.g., [LLR95, GNRS04, KV05, KR09, LN06, CKN09, LS11, CKN11] and the many references therein. Integrality gaps for LPs/SDPs obtained from lift-and-project techniques appear in [CMM09, KS09, RS09, GSZ12]. [GNRS04] conjectured that metrics supported on graphs excluding a fixed minor embed into $\ell_1$ with distortion $O(1)$ (depending on the excluded minor, but independent of the graph size); this would imply $O(1)$-approximations to Non-Uniform Sparsest Cut on instances where the supply graph $G$ excludes a fixed minor. This conjecture has been verified for several classes of graphs, but remains open (see, e.g., [LS09] and references therein).
The starting point of this work is the paper of Chlamtáč et al. [CKR10], who consider Non-Uniform Sparsest Cut on graphs of treewidth $k$. (We emphasize that only the supply graph has bounded treewidth; the demand graphs are unrestricted.) They ask if one can obtain good algorithms for such graphs without answering the [GNRS04] conjecture; in particular, they look at the Sherali-Adams hierarchy. In their paper, they give a $2^{2^k}$-approximation in time $\operatorname{poly}(n) \cdot 2^{O(k)}$ by solving an $r$-round Sherali-Adams linear program (for suitable $r$), and ask whether one can achieve an algorithm whose approximation ratio is independent of the treewidth $k$. We answer this question in the affirmative.
Theorem 1.1 (Easiness)
There is an algorithm for the Non-Uniform Sparsest Cut problem that, given any instance $(G, H)$ where the supply graph $G$ has treewidth $k$, outputs a 2-approximation in time $n^{O(k)}$.
Graphs that exclude some planar graph as a minor have bounded treewidth, and $H$-minor-free graphs have treewidth $O(\sqrt{n})$. This implies a 2-approximation in polynomial time for graphs excluding a fixed planar minor, and in time $n^{O(\sqrt{n})}$ for general minor-free graphs. In fact, for the above theorem to apply we only need that $G$ has a recursive vertex separator decomposition in which each separator has $O(k)$ vertices.
Our algorithm is also based on solving an LP relaxation, one whose constraints form a subset of the $O(k \log n)$-round Sherali-Adams lift of the standard LP, and then rounding it via a natural propagation rounding procedure. We show that further applications of the Sherali-Adams operator (even for a polynomial number of rounds) cannot do better:
Theorem 1.2 (Tight Integrality Gap)
For every $\varepsilon > 0$, there are instances of the Non-Uniform Sparsest Cut problem with the supply graph $G$ having treewidth 2 (a.k.a. series-parallel graphs) for which the integrality gap after applying $r$ rounds of the Sherali-Adams hierarchy still remains $2 - \varepsilon$, even when $r = n^{\delta}$ for some constant $\delta = \delta(\varepsilon) > 0$.
This result extends the integrality gap lower bound for the basic LP on series-parallel graphs shown by Lee and Raghavendra [LR10], for which Chekuri, Shepherd and Weibel gave a different proof [CSW10].
On the hardness side, Ambühl et al. [AMS11] showed that if Uniform Sparsest Cut admits a PTAS, then SAT has a randomized sub-exponential time algorithm. Chawla et al. [CKK06] and Khot and Vishnoi [KV05] showed that Non-Uniform Sparsest Cut is hard to approximate to within any constant factor, assuming the Unique Games Conjecture. The only Apx-hardness result (based on $P \neq NP$) for Non-Uniform Sparsest Cut is recent, due to Chuzhoy and Khanna [CK09, Theorem 1.4]. Their reduction from MaxCut shows that the problem is Apx-hard even when the supply graph is $K_{2,n}$, and hence of treewidth (in fact, even pathwidth) 2. (This reduction was rediscovered by Chlamtáč, Krauthgamer, and Raghavendra [CKR10].) We extend their reduction to show the following hardness result for the Non-Uniform Sparsest Cut problem:
Theorem 1.3 (Improved NP-Hardness)
For every constant $\varepsilon > 0$, the Non-Uniform Sparsest Cut problem is hard to approximate better than $17/16 - \varepsilon$ unless $P = NP$, and hard to approximate better than $1/\alpha_{GW} - \varepsilon$ assuming the Unique Games Conjecture, even on graphs with treewidth 2 (series-parallel graphs).
Our proof of this result gives us a hardness-of-approximation that is essentially the same as that for MaxCut (up to an additive loss of $\varepsilon$). Hence, improvements in the NP-hardness for MaxCut would translate into better NP-hardness for Non-Uniform Sparsest Cut as well.
If we allow instances of larger treewidth, we get a Unique Games-based hardness that matches our algorithmic guarantee:
Theorem 1.4 (Tight UG Hardness)
For every constant $\varepsilon > 0$, it is UG-hard to approximate Non-Uniform Sparsest Cut on bounded treewidth graphs better than $2 - \varepsilon$. I.e., the existence of a family of algorithms, one for each treewidth $k$, that run in time $n^{f(k)}$ (for an arbitrary function $f$) and give $(2 - \varepsilon)$-approximations for Non-Uniform Sparsest Cut would disprove the Unique Games Conjecture.
1.1 Other Related Work
There is much work on algorithms for bounded treewidth graphs: many NP-hard problems can be solved exactly on such graphs in polynomial time (see, e.g., [RS86]). Bienstock and Ozbay [BO04] show, e.g., that the stable set polytope on treewidth-$k$ graphs is integral after $k$ levels of Sherali-Adams; Magen and Moharrami [MM09] use their result to show that a constant number of rounds of Sherali-Adams (depending on $\varepsilon$) are enough to $(1+\varepsilon)$-approximate stable set and vertex cover on minor-free graphs. Wainwright and Jordan [WJ04] show conditions under which Sherali-Adams and Lasserre relaxations are integral for combinatorial problems based on the treewidth of certain hypergraphs. In contrast, our lower bounds show that the Sparsest Cut problem is Apx-hard even on treewidth-2 supply graphs, and the integrality gap stays close to 2 even after a polynomial number of rounds of Sherali-Adams.
2 Preliminaries and Notation
We use $[n]$ to denote the set $\{1, 2, \ldots, n\}$. For a set $S$ and element $x$, we use $S + x$ to denote $S \cup \{x\}$.
2.1 Cuts and MaxCut Problem
All the graphs we consider are undirected. For a graph $G = (V, E)$ and a set $S \subseteq V$, let $\partial_G(S)$ be the edges with exactly one endpoint in $S$; we drop the subscript when $G$ is clear from context. Given a vertex set $V$ and special vertices $s, t \in V$, a cut $(S, V \setminus S)$ is $s$-$t$-separating if exactly one of $s, t$ lies in $S$.
In the (unweighted) MaxCut problem, we are given a graph $G = (V, E)$ and want to find a set $S \subseteq V$ that maximizes $|\partial(S)|$; the weighted version has weights on the edges and seeks to maximize the total weight of the crossing edges. The approximability of the weighted and unweighted versions of MaxCut differ only by a $(1 + o(1))$-factor [CST01], and henceforth we only consider the unweighted case.
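A minimal illustration of the unweighted problem, with a brute-force helper of our own naming (exponential, so only meant for tiny graphs):

```python
from itertools import combinations

def cut_size(S, edges):
    """|delta(S)|: number of edges with exactly one endpoint in S."""
    S = set(S)
    return sum(1 for u, v in edges if (u in S) != (v in S))

def max_cut(V, edges):
    """Exact MaxCut by enumerating all subsets -- only for tiny graphs."""
    return max(cut_size(S, edges)
               for r in range(len(V) + 1) for S in combinations(V, r))

# A 5-cycle: it is not bipartite, so at most 4 of its 5 edges can be cut.
V = [0, 1, 2, 3, 4]
edges = [(0, 1), (1, 2), (2, 3), (3, 4), (4, 0)]
print(max_cut(V, edges))  # 4
```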
2.2 Tree Decompositions and Treewidth
Given a graph $G = (V, E)$, a tree decomposition consists of a tree $T = (I, F)$ and a collection of vertex subsets $\{B_i \subseteq V\}_{i \in I}$ called “bags”, such that the bags containing any vertex $v \in V$ form a connected subtree of $T$, and each edge of $G$ lies within some bag in the collection. The width of such a tree decomposition is $\max_i |B_i| - 1$, and the treewidth of $G$ is the smallest width of any tree decomposition for $G$. See, e.g., [Die00, Bod98] for more details and references.
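The defining properties can be checked mechanically. The following sketch (all helper names are ours) verifies a width-2 decomposition of the 4-cycle, consistent with the fact that series-parallel graphs have treewidth 2:

```python
def is_tree_decomposition(vertices, graph_edges, tree_adj, bags):
    """Verify the tree-decomposition properties (bags: tree node -> vertex set,
    tree_adj: tree node -> list of neighbouring tree nodes)."""
    # (1) every vertex of G appears in some bag
    if any(all(v not in B for B in bags.values()) for v in vertices):
        return False
    # (2) every edge of G lies within some bag
    if any(not any(u in B and v in B for B in bags.values())
           for (u, v) in graph_edges):
        return False
    # (3) for each vertex, the bags containing it form a connected subtree
    for v in vertices:
        nodes = {i for i, B in bags.items() if v in B}
        seen, stack = set(), [next(iter(nodes))]
        while stack:
            i = stack.pop()
            if i not in seen:
                seen.add(i)
                stack.extend(j for j in tree_adj[i] if j in nodes)
        if seen != nodes:
            return False
    return True

def width(bags):
    return max(len(B) for B in bags.values()) - 1

# A width-2 decomposition of the 4-cycle 0-1-2-3-0: two bags on a 2-node tree.
bags = {0: {0, 1, 3}, 1: {1, 2, 3}}
tree_adj = {0: [1], 1: [0]}
print(is_tree_decomposition([0, 1, 2, 3],
                            [(0, 1), (1, 2), (2, 3), (3, 0)],
                            tree_adj, bags),
      width(bags))  # True 2
```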
The notion of treewidth is intimately connected to the underlying graph having small vertex separators. Indeed, say graph $G$ admits (weighted) vertex separators of size $s$ if for every assignment of positive weights $w_v$ to the vertices $v \in V$, there is a set $X$ of at most $s$ vertices such that no component of $G \setminus X$ contains more than half of the total weight $\sum_v w_v$. For example, planar graphs admit weighted vertex separators of size $O(\sqrt{n})$. It is known (see, e.g., [Ree92, Theorem 1]) that if $G$ has treewidth $k$ then it admits weighted vertex separators of size at most $k + 1$; conversely, if $G$ admits weighted vertex separators of size at most $s$ then it has treewidth at most $O(s)$. (The former statement is easy. An easy weaker version of the latter implication, with treewidth $O(s \log n)$, is obtained as follows. Find an unweighted vertex separator $X$ of $G$ of size $s$ to get subgraphs $G_1, G_2, \ldots$ each with at most half of the nodes. Recurse on the subgraphs to get decomposition trees $T_1, T_2, \ldots$. Attach a new empty bag $B_r$, connecting it to the “root” bag in each $T_i$, to get the decomposition tree $T$; add the vertices of $X$ to all the bags in $T$, and designate $B_r$ as its root. Note that $T$ has height $O(\log n)$ and width $O(s \log n)$. In fact, this tree decomposition can be used instead of the one from Theorem 3.1 for our algorithm in Section 3 to get the same asymptotic guarantees.)
2.3 The Sherali-Adams Operator
For a graph $G$ with $|V| = n$, we now define the Sherali-Adams polytope. We can strengthen an LP by adding all variables $x_{(S,T)}$ such that $S, T \subseteq V$ with $S \cap T = \emptyset$ and $|S \cup T| \le r + 2$. The variable $x_{(S,T)}$ has the “intended solution” that the chosen cut $U$ satisfies $S \subseteq U$ and $T \cap U = \emptyset$. (In some uses of Sherali-Adams, variables are intended to mean that $S \subseteq U$ alone; this is not the case here.) We can then define the $r$-round Sherali-Adams polytope (starting with the trivial LP), denoted $SA_r(n)$, to be the set of all vectors $x$ satisfying the following constraints:
$$x_{(\emptyset,\emptyset)} = 1, \qquad x_{(S,T)} \ge 0, \tag{2.2}$$
$$x_{(S,T)} = x_{(S+v,\,T)} + x_{(S,\,T+v)} \quad \text{for all disjoint } S, T \text{ with } |S \cup T| \le r + 1 \text{ and } v \notin S \cup T. \tag{2.3}$$
We will refer to (2.3) as consistency constraints. These constraints immediately imply that the variables satisfy the following useful property:
For every pair of disjoint sets $S, T$ and any $W \supseteq S \cup T$ with $|W| \le r + 2$, we have:
$$x_{(S,T)} \;=\; \sum_{U:\; S \subseteq U \subseteq W \setminus T} x_{(U,\, W \setminus U)}.$$
This follows by repeated use of (2.3). ∎
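To see why the intended solutions satisfy the consistency constraints, note that any probability distribution over cuts $U$ induces values $x_{(S,T)} = \Pr[S \subseteq U,\, T \cap U = \emptyset]$ that are feasible. A quick numerical check, with a hand-picked distribution (all names below are ours):

```python
from itertools import chain, combinations

V = [0, 1, 2]

def subsets(xs):
    return chain.from_iterable(combinations(xs, r) for r in range(len(xs) + 1))

# Any distribution over cuts U induces feasible values
#   x[(S, T)] = Pr[S subset of U and T disjoint from U]   (the "intended solutions").
dist = {frozenset(): 0.1, frozenset({0}): 0.4,
        frozenset({1, 2}): 0.3, frozenset({0, 2}): 0.2}

def x(S, T):
    return sum(p for U, p in dist.items()
               if set(S) <= U and not (set(T) & U))

# Verify x(empty, empty) = 1 and the consistency constraints
#   x_{(S,T)} = x_{(S+v, T)} + x_{(S, T+v)}  for every v outside S and T.
assert abs(x((), ()) - 1.0) < 1e-12
for S in subsets(V):
    rest = set(V) - set(S)
    for T in subsets(rest):
        for v in rest - set(T):
            lhs = x(S, T)
            rhs = x(set(S) | {v}, T) + x(S, set(T) | {v})
            assert abs(lhs - rhs) < 1e-12
print("all consistency constraints hold")
```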
We can now use $SA_r(n)$ to write an LP relaxation for an instance $G = (V, E)$ of MaxCut:
$$\max \;\sum_{(u,v) \in E} \big( x_{(\{u\},\{v\})} + x_{(\{v\},\{u\})} \big) \quad \text{s.t.} \quad x \in SA_r(n). \tag{2.5}$$
We can also define an LP relaxation for an instance $(G, H)$ of Non-Uniform Sparsest Cut:
$$\min \;\frac{\sum_{(u,v) \in E_G} c_{uv} \big( x_{(\{u\},\{v\})} + x_{(\{v\},\{u\})} \big)}{\sum_{(i,j) \in E_H} D_{ij} \big( x_{(\{i\},\{j\})} + x_{(\{j\},\{i\})} \big)} \quad \text{s.t.} \quad x \in SA_r(n). \tag{2.6}$$
Note that the Sparsest Cut objective function is a ratio, so this is not actually an LP as stated. Instead, we could add the constraint that the denominator equals some guessed value $\widehat{D}$, minimize the numerator, and use binary search to find the correct value of $\widehat{D}$. In Section 3, we will use (a slight weakening of) this relaxation in our approximation algorithm for Sparsest Cut on bounded-treewidth graphs, and in Section 6 we will show that Sherali-Adams integrality gaps for the MaxCut LP (2.5) can be translated into integrality gaps for the Sparsest Cut LP (2.6).
3 An Algorithm for Bounded Treewidth Graphs
In this section, we present a 2-approximation algorithm for Sparsest Cut that runs in time $n^{O(k)}$. Consider an instance $(G, H)$ of Sparsest Cut, where $G$ has treewidth $k$, but there are no constraints on the demand graph $H$. We assume that we are also given an initial tree decomposition of width $O(k)$ for $G$. This is without loss of generality, since such a tree decomposition can be found, e.g., in time $n^{O(k)}$ [ACP87] or time $2^{O(k^3)}\, n$ [Bod96]; a tree decomposition of width $O(k)$ can be found in time $2^{O(k)} \operatorname{poly}(n)$ [Ami10].
3.1 Balanced Tree Decompositions and the Linear Program
We start with a result of Bodlaender [Bod89, Theorem 4.2] which converts the initial tree decomposition into a “nice” one, while increasing the width only by a constant factor:
Theorem 3.1 (Balanced Tree Decomp.)
Given a graph $G$ on $n$ vertices and a tree decomposition for $G$ with width at most $k$, there is a tree decomposition $(T, \{B_v\})$ for $G$ such that
$T$ is a binary tree of depth at most $O(\log n)$, and
each bag size $|B_v|$ is at most $3(k + 1)$, and hence the width is at most $3k + 2$.
Moreover, given $G$ and the initial decomposition, such a decomposition can be found in time $k^{O(1)}\, n$.
From this point on, we will work with the balanced tree decomposition $(T, \{B_v\})$, whose root node is denoted by $r$. Let $P_{uv}$ denote the set of nodes on the tree path in $T$ between nodes $u$ and $v$ (inclusive), and let $Q_{uv} := \bigcup_{w \in P_{uv}} B_w$ be the union of the bags $B_w$'s along this $u$-$v$ tree path; we abbreviate $Q_v := Q_{rv}$. Note that $|Q_v| = O(k \log n)$.
Recall the Sherali-Adams linear program (2.6), with variables $x_{(S,T)}$ for disjoint $S, T \subseteq V$ having the intended meaning that the chosen cut $U$ satisfies $S \subseteq U$ and $T \cap U = \emptyset$. We want to use this LP with the number of rounds being $\Theta(k \log n)$, but solving this LP would require time $n^{\Theta(k \log n)}$, which is undesirable. Hence, we write an LP that uses only some of the variables from $SA_{O(k \log n)}(n)$. For every tree node $v$, and every subset $S \subseteq Q_v$, we retain the variable $x_{(S,\, Q_v \setminus S)}$ in the LP, and drop all the others. There are at most $O(n)$ nodes in $T$, and hence sets $Q_v$; each of these has at most $2^{O(k \log n)} = n^{O(k)}$ many subsets. This results in an LP with $n^{O(k)}$ variables and a similar number of constraints.
Finally, as mentioned above, to take care of the non-linear objective function in (2.6), we guess the value $\widehat{D}$ of the denominator (the demand separated by the optimal cut), and add the constraint
$$\sum_{(i,j) \in E_H} D_{ij} \big( x_{(\{i\},\{j\})} + x_{(\{j\},\{i\})} \big) \;=\; \widehat{D}$$
as an additional constraint to the LP, thereby just minimizing the numerator. For the rest of the discussion, let $x$ be an optimal solution to the resulting LP.
3.2 The Rounding Algorithm
The rounding algorithm is a very natural top-down propagation rounding procedure. We start with the root $r$; note that $Q_r = B_r$ in this case. Since $\sum_{S \subseteq B_r} x_{(S,\, B_r \setminus S)} = 1$ by the constraints (2.2) of the LP, the variables define a probability distribution over subsets of $B_r$. We sample a subset $S_r \subseteq B_r$ from this distribution.
In general, for any node $v$ with parent $u$, suppose we have already sampled a subset $S_w$ for each of its ancestor nodes $w$, and the union of these sampled sets is $A = S_u \subseteq Q_u$. Now, let $\mathcal{F}_v := \{ S \subseteq Q_v : S \cap Q_u = A \}$; i.e., the family of subsets of $Q_v$ whose intersection with $Q_u$ is precisely $A$. By Lemma 2.1, we have
$$\sum_{S \in \mathcal{F}_v} x_{(S,\, Q_v \setminus S)} \;=\; x_{(A,\, Q_u \setminus A)}.$$
Thus the values $x_{(S,\, Q_v \setminus S)} / x_{(A,\, Q_u \setminus A)}$ define a probability distribution over $\mathcal{F}_v$. We now sample a set $S_v$ from this distribution. Note that this rounding only uses sets we retained in our pared-down LP, so we can indeed implement this rounding. Moreover, this set $S_v$ agrees with the previously sampled sets on $Q_u$. Finally, we take the union of all the sets
$$U \;:=\; \bigcup_{v \in T} S_v,$$
and output the cut $(U, V \setminus U)$. The following lemma is immediate:
For any node $v$ and any $W \subseteq Q_v$, we get $\Pr[U \cap W = S] = x_{(S,\, W \setminus S)}$ for all $S \subseteq W$.
First, we claim that $\Pr[S_v = S'] = x_{(S',\, Q_v \setminus S')}$ for all $S' \subseteq Q_v$. This is a simple induction on the depth of $v$: the base case $v = r$ is directly from the algorithm. For $v$ with parent node $u$, writing $A = S' \cap Q_u$,
$$\Pr[S_v = S'] \;=\; \Pr[S_u = A] \cdot \frac{x_{(S',\, Q_v \setminus S')}}{x_{(A,\, Q_u \setminus A)}} \;=\; x_{(A,\, Q_u \setminus A)} \cdot \frac{x_{(S',\, Q_v \setminus S')}}{x_{(A,\, Q_u \setminus A)}} \;=\; x_{(S',\, Q_v \setminus S')},$$
as claimed. Now we prove the statement of the lemma: Since every subsequently sampled set agrees with $S_v$ on $Q_v$, we know that $U \cap Q_v = S_v$, because none of the future steps can add any other vertices from $Q_v$ to $U$. Moreover,
$$\Pr[U \cap W = S] \;=\; \sum_{S' \subseteq Q_v:\; S' \cap W = S} \Pr[S_v = S'] \;=\; \sum_{S' \subseteq Q_v:\; S' \cap W = S} x_{(S',\, Q_v \setminus S')},$$
the last equality using the claim above. Defining $T := W \setminus S$, this equals $\sum_{S':\, S \subseteq S' \subseteq Q_v \setminus T} x_{(S',\, Q_v \setminus S')}$, which by Lemma 2.1 equals $x_{(S,\, T)} = x_{(S,\, W \setminus S)}$, as desired. ∎
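The propagation rounding just described can be sketched as follows; the tree structure `children`, the path unions `Q`, and the LP oracle `x(S, T)` are assumed inputs, and the demo feeds in values coming from an actual cut distribution so that the consistency assumptions hold:

```python
import random

def propagation_round(children, root, Q, x, rng=random.Random(0)):
    """Top-down propagation rounding (a sketch; Q and x are assumed inputs).

    children: tree node -> list of child nodes of the decomposition tree
    Q[v]:     union of the bags on the root-to-v path (frozenset of vertices)
    x(S, T):  LP value, intended as Pr[S subset of U, T disjoint from U]
    """
    def subsets(xs):
        xs = list(xs)
        for mask in range(1 << len(xs)):
            yield frozenset(xs[i] for i in range(len(xs)) if (mask >> i) & 1)

    S = {}
    # Sample S_root from the marginal distribution on subsets of Q[root] ...
    fam = list(subsets(Q[root]))
    S[root] = rng.choices(fam, weights=[x(A, Q[root] - A) for A in fam])[0]
    stack = [root]
    while stack:
        u = stack.pop()
        for v in children.get(u, []):
            A = S[u]
            # ... then condition each child's sample to agree with its parent on Q[u].
            fam = [B for B in subsets(Q[v]) if B & Q[u] == A]
            S[v] = rng.choices(fam, weights=[x(B, Q[v] - B) for B in fam])[0]
            stack.append(v)
    return set().union(*S.values())

# Tiny demo: two tree nodes; LP values come from the uniform cut distribution.
Q = {'r': frozenset({0}), 'c': frozenset({0, 1})}
cuts = [frozenset(), frozenset({0}), frozenset({1}), frozenset({0, 1})]
def x(S, T):
    return sum(0.25 for U in cuts if S <= U and not (T & U))
print(propagation_round({'r': ['c']}, 'r', Q, x) <= {0, 1})  # True
```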
The probability of an edge $(u, v) \in E_G$ being cut by the algorithm equals $x_{(\{u\},\{v\})} + x_{(\{v\},\{u\})}$. (Indeed, each edge of $G$ lies within some bag $B_w \subseteq Q_w$, so we can apply Lemma 3.2 with $W = \{u, v\}$.)
Thus the expected capacity cut equals the numerator of the objective function.
The probability of a demand pair $(a, b) \in E_H$ being cut by the algorithm is at least $\frac{1}{2}\big( x_{(\{a\},\{b\})} + x_{(\{b\},\{a\})} \big)$.
Let $u$ and $v$ denote the (least depth) nodes in $T$ such that $a \in B_u$ and $b \in B_v$ respectively; for simplicity, assume that the least common ancestor of $u$ and $v$ is the root $r$. (An identical argument works when the least common ancestor is not the root.) We can assume that neither of $u, v$ is an ancestor of the other, or else we can use Lemma 3.2 (with $W = \{a, b\}$) to claim that the probability $a, b$ are separated is exactly $x_{(\{a\},\{b\})} + x_{(\{b\},\{a\})}$.
Consider the set $W := Q_u \cap Q_v$, and consider the set-valued random variable (taking on values from the power set of $W$) defined by $R := U \cap W$. Denote the distribution of $R$ by $\mu$, and note that this is just the distribution specified by the Sherali-Adams LP restricted to $W$. Let $Y_a$ and $Y_b$ denote the indicator random variables of the events $a \in U$ and $b \in U$ respectively under the LP distribution; these variables are dependent in general. For a set $R \subseteq W$, let $Y_a^R$ and $Y_b^R$ be indicators for the corresponding events conditioned on $U \cap W = R$. Then by definition,
$$x_{(\{a\},\{b\})} + x_{(\{b\},\{a\})} \;=\; \mathbb{E}_R\big[ \Pr[Y_a^R \neq Y_b^R] \big],$$
where the expectation is taken over outcomes of $R \sim \mu$.
Let $\mathcal{A}$ denote the distribution on cuts defined by the algorithm. Let $Z_a$ and $Z_b$ denote events that $a \in U$ and $b \in U$ respectively under $\mathcal{A}$, and let $Z_a^R$ and $Z_b^R$ denote these events conditioned on $U \cap W = R$. Thus the probability that $a$ and $b$ are separated by the algorithm is
$$\mathbb{E}_R\big[ \Pr[Z_a^R \neq Z_b^R] \big],$$
where the expectation is taken over the distribution of $R$; by Lemma 3.2 this distribution is the same as that for the LP.
It thus suffices to prove that for any $R$,
$$\Pr[Z_a^R \neq Z_b^R] \;\ge\; \tfrac{1}{2}\, \Pr[Y_a^R \neq Y_b^R]. \tag{3.9}$$
Now observe that $Z_a^R$ is distributed identically to $Y_a^R$ (with both being 1 with probability $p_a := \Pr[a \in U \mid U \cap W = R]$), and similarly for $Z_b^R$ and $Y_b^R$ (with probability $p_b$). However, since $u$ and $v$ lie in different subtrees, $Z_a^R$ and $Z_b^R$ are independent, whereas $Y_a^R$ and $Y_b^R$ are dependent in general.
We can assume that at least one of $p_a, p_b$ is at most $1/2$; if not, we can do the following analysis with the complementary events $\bar{Y}_a^R, \bar{Y}_b^R, \bar{Z}_a^R, \bar{Z}_b^R$, since (3.9) depends only on random variables being unequal. Moreover, suppose
$$p_a \;\le\; p_b$$
(else we can interchange $a$ and $b$ in the following argument). Define the distribution $\widetilde{\mathcal{D}}$ where we draw $(Y_a^R, Y_b^R)$ from the LP distribution conditioned on $R$, set $Z_b^R$ equal to $Y_b^R$, and draw $Z_a^R$ independently from its marginal. By construction, the distributions of $(Z_a^R, Z_b^R)$ in $\widetilde{\mathcal{D}}$ and in $\mathcal{A}$ are identical, as are the distributions of $(Y_a^R, Y_b^R)$ in $\widetilde{\mathcal{D}}$ and in the LP. We claim that
$$\Pr_{\widetilde{\mathcal{D}}}[Z_a^R \neq Z_b^R] \;\ge\; \tfrac{1}{2}\, \Pr_{\widetilde{\mathcal{D}}}[Y_a^R \neq Y_b^R]. \tag{3.10}$$
Indeed, if $Y_a^R \neq Y_b^R$ and $Z_a^R = Y_a^R$, then $Z_a^R \neq Z_b^R$ as well, since $Z_b^R = Y_b^R$. Writing $q := \Pr[Y_a^R = 1 \mid Y_a^R \neq Y_b^R]$, to prove (3.10) it thus suffices to show that $\Pr[Z_a^R = Y_a^R \mid Y_a^R \neq Y_b^R] = p_a q + (1 - p_a)(1 - q) \ge 1/2$ (recall here that $Z_a^R$ is chosen independently of the other variables). This holds if $p_a \le 1/2$ and $q \le 1/2$, which follows from our assumptions on $p_a, p_b$ above: $p_a \le p_b$ implies $\Pr[Y_a^R = 1, Y_b^R = 0] \le \Pr[Y_a^R = 0, Y_b^R = 1]$, i.e., $q \le 1/2$, and $p_a = \min(p_a, p_b) \le 1/2$. Finally,
$$\Pr_{\mathcal{A}}[Z_a^R \neq Z_b^R] \;=\; \Pr_{\widetilde{\mathcal{D}}}[Z_a^R \neq Z_b^R] \;\ge\; \tfrac{1}{2}\, \Pr[Y_a^R \neq Y_b^R]. \; ∎$$
By Lemmas 3.3 and 3.4, a random cut chosen by our algorithm cuts an expected capacity of exactly the LP numerator, whereas the expected demand cut is at least half the LP denominator $\widehat{D}$. This shows the existence of a cut in the distribution whose sparsity is within a factor of two of the LP value. Such a cut can be found using the method of conditional expectations; we defer the details to the next section. Moreover, the analysis of the integrality gap is tight: Section 6 shows that for any constant $\varepsilon > 0$, the Sherali-Adams LP for Sparsest Cut has an integrality gap of at least $2 - \varepsilon$, even after $n^{\delta}$ rounds.
In this section, we use the method of conditional expectations to derandomize our rounding algorithm, which allows us to efficiently find a cut with sparsity at most twice the LP value. We will think of the set $U$ as a $\{0,1\}$-assignment/labeling for the nodes in $V$, where $U$ is the set of nodes labeled 1.
In the above randomized process, let $X_{ab}$ be the indicator random variable for whether the pair $(a, b)$ is separated. We showed that for $(u, v) \in E_G$, $\mathbb{E}[X_{uv}] = x_{(\{u\},\{v\})} + x_{(\{v\},\{u\})}$, and for all other pairs $(a, b)$, $\mathbb{E}[X_{ab}] \ge \frac{1}{2}\big( x_{(\{a\},\{b\})} + x_{(\{b\},\{a\})} \big)$. Now if we let $C := \sum_{(u,v) \in E_G} c_{uv} X_{uv}$ be the r.v. denoting the edge capacity cut by the process and $D := \sum_{(a,b) \in E_H} D_{ab} X_{ab}$ be the r.v. denoting the demand separated, then the analysis of the previous section shows that
$$\mathbb{E}[C] \;=\; \lambda \qquad \text{and} \qquad \mathbb{E}[D] \;\ge\; \widehat{D}/2,$$
where $\lambda$ is the optimal value of the LP. (Recall that $\widehat{D}$ was the “guessed” value of the total demand separated by the actual sparsest cut.) Equivalently, defining $\alpha := \lambda/(\widehat{D}/2)$, and
$$\Phi \;:=\; C - \alpha D,$$
we know that $\mathbb{E}[\Phi] \le 0$.
The algorithm is the natural one: for the root $r$, enumerate over all assignments for the bag $B_r$, and choose the assignment minimizing the conditional expectation of $\Phi := C - \alpha D$ (with $\alpha := \lambda/(\widehat{D}/2)$ as above). Since $\mathbb{E}[\Phi] \le 0$, it must be the case that the chosen conditional expectation is at most 0, by averaging. Similarly, given the choices for a set of tree nodes $N$ such that $N$ induces a connected tree containing $r$, choose any unprocessed node $v$ whose parent lies in $N$, and choose an assignment for the yet-unlabeled nodes in $B_v$ so that the new conditional expectation of $\Phi$ remains at most 0. The final assignment will satisfy $\Phi \le 0$, which would give us a cut with sparsity at most $\alpha = 2\lambda/\widehat{D}$, as desired.
It remains to show that we can compute the conditional expectation of $\Phi$ given any partial labeling corresponding to a subset $N$ of tree nodes containing the root $r$, such that $N$ is connected. Let $L \subseteq V$ be the set of vertices already labeled. For any vertex $x$, let $h(x)$ be the highest node in $T$ such that $x \in B_{h(x)}$. If $x$ is yet unlabeled, then $h(x) \notin N$, and hence let $a(x)$ be the lowest ancestor of $h(x)$ that lies in $N$. In other words, we have chosen an assignment for the bag $B_{a(x)}$, but not for $x$ itself; by the properties of our algorithm, the conditional probability of any label for $x$ is determined by the LP values conditioned on the labels chosen for $Q_{a(x)}$. Moreover, if $x, y$ are both unlabeled such that their highest bags share a root-leaf path in $T$, then the pair is labeled according to the LP distribution conditioned on the labels of $Q_w$, where $w$ is the lowest ancestor of both that has been labeled. If $x, y$ are yet unlabeled, but we have chosen an assignment for their least common ancestor bag, then $x$ and $y$ will be labeled independently given that conditioning. Finally, if $x, y$ are unlabeled, and we have not yet chosen an assignment for their least common ancestor bag, then the probability of the pair being cut is an average, over the possible assignments to that bag, of products of such conditional probabilities, each computable from the retained LP values. There are at most $n^{O(k)}$ terms in the sum, and hence we can compute this in the claimed time bound. Now, we can compute the conditional expectation of $\Phi$ using the above expressions in time $n^{O(k)}$, which completes the proof.
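The one-bag-at-a-time derandomization can be illustrated on a much simpler potential: derandomizing the uniformly random cut for MaxCut, where the conditional expectation has an easy closed form. This is only an analogy for the method, not the potential $\Phi$ used above:

```python
def derandomized_cut(vertices, edges):
    """Method of conditional expectations for MaxCut: fix vertices one by one,
    always keeping the conditional expected cut size from decreasing, so the
    final cut has size at least the initial expectation |E|/2."""
    side = {}
    def expected_cut():
        # Edges with both endpoints fixed contribute 0 or 1; others contribute 1/2.
        total = 0.0
        for u, v in edges:
            if u in side and v in side:
                total += 1.0 if side[u] != side[v] else 0.0
            else:
                total += 0.5
        return total
    for v in vertices:
        side[v] = 0
        e0 = expected_cut()
        side[v] = 1
        e1 = expected_cut()
        side[v] = 0 if e0 >= e1 else 1
    return {v for v in vertices if side[v] == 1}

edges = [(0, 1), (1, 2), (2, 3), (3, 4), (4, 0)]  # a 5-cycle, m = 5
S = derandomized_cut(range(5), edges)
cut = sum(1 for u, v in edges if (u in S) != (v in S))
print(cut)  # 4
```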
3.3.1 Embedding into $\ell_1$
Our algorithm and analysis also imply an $O(1)$-approximation to the minimum distortion $\ell_1$-embedding of a treewidth-$k$ graph in time $n^{O(k)}$. We will describe an algorithm that, given a target $c$, either finds an embedding with distortion $O(c)$ or certifies that any embedding of $G$ into $\ell_1$ requires distortion more than $c$. It is easy to use such a subroutine to get an $O(1)$-approximation to the minimum distortion embedding problem.
Towards this end, we write a relaxation for the distortion-$c$ embedding problem as follows. Given $G$ with treewidth $k$, we start with the $O(k \log n)$-round Sherali-Adams polytope with the pared-down variable set of Section 3.1. Writing $y_{uv} := x_{(\{u\},\{v\})} + x_{(\{v\},\{u\})}$, we add the additional set of constraints $d_G(u,v)/c \le y_{uv} \le d_G(u,v)$ (after a suitable normalization), for every pair of vertices $u, v$. The cut characterization of $\ell_1$ metrics implies that this linear program is feasible whenever there is a distortion-$c$ embedding. Given a solution to the linear program, we round it using the rounding algorithm of the last section. It is immediate from our analysis that a random cut sampled by the algorithm separates each pair $u, v$ with probability at least $y_{uv}/2 \ge d_G(u,v)/(2c)$.
Moreover, since the analysis of the rounding algorithm only uses equality constraints on the expectations of random variables, we can use the approach of Karger and Koller [KK97] to get an explicit sample space of size $\operatorname{poly}(n)$ that satisfies all these constraints. Indeed, each of the points of this sample space gives us a cut embedding of the vertices of the graph. We can concatenate all these embeddings and scale down suitably in time $\operatorname{poly}(n)$ to get an $\ell_1$-embedding $f$ with the properties that (a) $\|f(u) - f(v)\|_1 \ge d_G(u,v)/(2c)$ for all $u, v$, and (b) $\|f(u) - f(v)\|_1 \le d_G(u,v)$ for all $u, v$. Scaling by a factor of $2c$ gives an embedding with distortion $O(c)$.
4 The Hardness Result
In this section, we prove the Apx-hardness claimed in Theorem 1.3. In particular, we show the following reduction from the MaxCut problem to the Non-Uniform Sparsest Cut problem.
For every constant $\varepsilon > 0$, a $\rho$-approximation algorithm for Non-Uniform Sparsest Cut on series-parallel graphs (with arbitrary demand graphs) that runs in time $T(n)$ implies a $(\frac{1}{\rho} - \varepsilon)$-approximation to MaxCut on general graphs running in time polynomial in $n$ and $T(\operatorname{poly}(n))$.
The current best hardness-of-approximation results for MaxCut are: (a) the factor $16/17 + \varepsilon$ hardness (assuming $P \neq NP$) due to Håstad [Hås01] (using the gadgets from Trevisan et al. [TSSW00]), and (b) the factor $\alpha_{GW} + \varepsilon$ hardness (assuming the Unique Games Conjecture) due to Khot et al. [KKMO07, MOO10], where $\alpha_{GW} \approx 0.878$ is the constant obtained in the hyperplane rounding for the MaxCut SDP. Combined with Theorem 4.1, these imply hardness results of $17/16 - \varepsilon$ and $1/\alpha_{GW} - \varepsilon$ respectively for Non-Uniform Sparsest Cut and prove Theorem 1.3.
The proof of Theorem 4.1 proceeds by taking the hard MaxCut instances and using them to construct the demand graphs in a Sparsest Cut instance, where the supply graph is the familiar fractal obtained from the base graph. (Such fractals have been used for lower bounds on the distortion incurred by tree embeddings [GNRS04], Euclidean embeddings [NR03], and low-dimensional embeddings in $\ell_1$ [BC05, LN04, Reg12]. Moreover, the fractal for $K_{2,2}$ shows the integrality gap of 2 for the natural metric relaxation for Sparsest Cut [LR10, CSW10].) The base case of this recursive construction is in Section 4.1, and the full construction is in Section 4.2. The analysis of the latter is based on a generic powering lemma, which will be useful for showing tight Unique Games hardness for bounded treewidth graphs in Section 5 and the Sherali-Adams integrality gap in Section 6.
4.1 The Basic Building Block
Given a connected (unweighted) MaxCut instance $G = (V, E)$, let $n := |V|$ and $m := |E|$. Let the supply graph be $\widehat{G}$, with vertices $\widehat{V} := V \cup \{s, t\}$ and edges $\widehat{E} := \{(s, v) : v \in V\} \cup \{(v, t) : v \in V\}$. Define the capacities $c_{sv} = c_{vt} := \deg_G(v)$. Define the demands thus: $D_{st} := m$, and for $u, v \in V$, let $D_{uv} := 1$ if $(u,v) \in E$ (i.e., $u, v$ have demand between them if $(u,v)$ is an edge in $G$, and zero otherwise). Let this setting of demands be denoted $\widehat{I}$. (The hardness results in Chuzhoy and Khanna [CK09] and Chlamtáč et al. [CKR10] used the same graph $\widehat{G}$, but with a different choice of capacities and demands.)
The sparsest cuts in $\widehat{I}$ are $s$-$t$-separating, and have sparsity $\frac{2m}{m + \operatorname{maxcut}(G)}$.
For $S \subseteq V$, the cut $(S \cup \{s\},\, (V \setminus S) \cup \{t\})$ has sparsity
$$\frac{\sum_{v \in V} \deg_G(v)}{D_{st} + |\partial_G(S)|} \;=\; \frac{2m}{m + |\partial_G(S)|},$$
since each vertex $v$ contributes exactly one of its two incident capacity edges to the cut. The cut $(S, \widehat{V} \setminus S)$ with $s, t \notin S$ has sparsity
$$\frac{2 \sum_{v \in S} \deg_G(v)}{|\partial_G(S)|} \;\ge\; 2,$$
which is at least the sparsity of any $s$-$t$-separating cut, and strictly worse than the best one. Hence the sparsest cut is the $s$-$t$-separating cut that maximizes $|\partial_G(S)|$. ∎
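The following sketch instantiates the building block and verifies the lemma by brute force on a triangle. The degree capacities and the choice $D_{st} = m$ are one self-consistent normalization used here for illustration, and the helper names are ours:

```python
from itertools import combinations

def building_block(V, E, D_st):
    """Supply graph on V + {s, t}: edges (s, v) and (v, t) with capacity deg(v);
    demands: 1 per MaxCut edge, plus D_st between s and t (assumed normalization)."""
    deg = {v: sum(1 for e in E if v in e) for v in V}
    supply = {}
    for v in V:
        supply[('s', v)] = float(deg[v])
        supply[(v, 't')] = float(deg[v])
    demand = {e: 1.0 for e in E}
    demand[('s', 't')] = float(D_st)
    return supply, demand

def sparsity(S, supply, demand):
    cap = sum(c for (u, v), c in supply.items() if (u in S) != (v in S))
    dem = sum(d for (u, v), d in demand.items() if (u in S) != (v in S))
    return float('inf') if dem == 0 else cap / dem

# MaxCut instance: a triangle, whose maximum cut has size 2 (m = 3 edges).
V, E = [0, 1, 2], [(0, 1), (1, 2), (2, 0)]
supply, demand = building_block(V, E, D_st=len(E))
nodes = V + ['s', 't']
cuts = (set(S) for r in range(1, len(nodes)) for S in combinations(nodes, r))
best = min(cuts, key=lambda S: sparsity(S, supply, demand))
print(('s' in best) != ('t' in best),   # True: the sparsest cut separates s, t
      sparsity(best, supply, demand))   # 1.2, i.e. 2m / (m + maxcut) = 6/5
```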
Given a hardness of distinguishing MaxCut instances with maximum cut at least $c \cdot m$ from those with maximum cut at most $s \cdot m$, this gives us a $\frac{2}{1+c}$-vs-$\frac{2}{1+s}$ hardness for Sparsest Cut. However, we can do better using a recursive “fractal” construction, as we show next. Before we proceed further, we remark that if we remove the $s$-$t$ demand from the instance $\widehat{I}$, we obtain an instance with the following properties.
The instance $\widehat{I}'$ constructed by removing $D_{st}$ from $\widehat{I}$ satisfies:
If $G$ has a cut of size $d$, then there is an $s$-$t$-separating cut of capacity $2m$ that separates $d$ demand.
Any $s$-$t$-separating cut has capacity at least $2m$.
If the maximum cut in $G$ has size $d^*$, then every $s$-$t$-separating cut has sparsity at least $2m/d^*$.
Any cut that does not separate $s$ and $t$ has sparsity at least 2.
While $\widehat{I}'$ by itself is not a hard instance of Sparsest Cut, the above properties will make it a useful building block in the powering operation below.
4.2 An Instance Powering Operation
In this section, we describe a powering operation on Sparsest Cut instances that we use to boost the hardness result. This is the natural fractal construction. We start with an instance $I$ of the Sparsest Cut problem. In other words, we have a Sparsest Cut instance with two designated vertices $s$ and $t$. (For concreteness, think of the instance $\widehat{I}'$ from the previous section, but any such instance would do.)
For $\ell > 1$, consider the instance $I^{\ell}$ obtained by taking $I$ and replacing each capacity edge in it with a copy of $I^{\ell-1}$ in the natural way. In other words, for every edge $e = (u, v)$ of $I$, we create a copy $I^{\ell-1}_e$ of $I^{\ell-1}$, and identify its vertex $s$ with $u$ and its $t$ with $v$. Moreover, $I^{\ell-1}_e$ is scaled down by $c_e$. Thus if edge $f$ has capacity $c_f$ in $I^{\ell-1}$, then the corresponding edge in $I^{\ell-1}_e$ has capacity $c_e \cdot c_f$; the demands in $I^{\ell-1}_e$ are also scaled by the same factor. In addition to the scaled demands from the copies of $I^{\ell-1}$, $I^{\ell}$ contains new level-$\ell$ demands from the base instance $I$. Note that this instance contains the vertices of $I$ in its vertex set and will have the same $s$ and $t$ as its designated vertices.
The following properties are immediate.
If $I$ has $\nu$ vertices and $\mu$ capacity edges, then $I^{\ell}$ has at most $\nu \mu^{\ell-1}$ vertices and exactly $\mu^{\ell}$ capacity edges. Moreover, if the supply graph in $I$ has treewidth $k$, then the supply graph of $I^{\ell}$ also has treewidth $k$.
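The vertex and edge counts follow the recurrences $V_\ell = \nu + \mu (V_{\ell-1} - 2)$ and $E_\ell = \mu E_{\ell-1}$, which a few lines of code can tabulate (the base parameters below are illustrative):

```python
def fractal_size(nu, mu, levels):
    """Vertex/edge counts of the powered instance: each capacity edge of the
    base (nu vertices, mu edges) is replaced by a copy of the previous level,
    whose s and t are identified with the edge's endpoints."""
    V, E = nu, mu  # level 1 is the base instance itself
    for _ in range(levels - 1):
        V = nu + mu * (V - 2)  # each copy shares its s, t with the base edge
        E = mu * E
    return V, E

# Example base: a 4-cycle, nu = 4 vertices and mu = 4 edges.
print(fractal_size(4, 4, 1))  # (4, 4)
print(fractal_size(4, 4, 2))  # (12, 16)
print(fractal_size(4, 4, 3))  # (44, 64)
```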
We next argue “completeness” and “soundness” properties of this operation. We will distinguish between cuts that separate $s$ and $t$, and those that do not. We call the former cuts admissible and the latter inadmissible.
If $I$ has an admissible cut that cuts capacity $a$ and demand $b$, then there exists an admissible cut in $I^{\ell}$ of capacity $a^{\ell}$ that cuts $b(1 + a + \cdots + a^{\ell-1})$ demand.
The proof is by induction on $\ell$. The base case $\ell = 1$ is an assumption of the lemma. Assume the claim holds for $\ell - 1$. Let $B$ denote the admissible cut in $I^{\ell-1}$ satisfying the induction hypothesis and let $A$ denote an admissible cut in $I$ cutting capacity $a$ and demand $b$. Recall that $I^{\ell}$ is created by replacing the edges of $I$ by copies of $I^{\ell-1}$. Define the cut $A'$ in the natural way: Start with $A' \cap V(I) = A$. Then for each edge $e = (u, v)$ such that $u, v \in A$, we place all of $I^{\ell-1}_e$ in $A'$; similarly if $u, v \notin A$ then place all of $I^{\ell-1}_e$ outside $A'$. For $e = (u, v)$ such that $u \in A$ and $v \notin A$, we cut $I^{\ell-1}_e$ according to $B$: i.e., the copy of a vertex $w$ is placed in $A'$ if $w \in B$. Similarly, if $v \in A$ and $u \notin A$, we put the copy of $w$ in $A'$ if $w \notin B$. This defines the cut $A'$.
The capacity of the cut $A'$ can be computed as follows: For each edge $e$ of $I$ cut by $A$, the corresponding copy of $I^{\ell-1}$ contributes $c_e \cdot a^{\ell-1}$ to the cut, where we used the inductive hypothesis for $I^{\ell-1}$. For edges not cut by $A$, the corresponding copy is uncut and contributes 0. Thus
$$\operatorname{cap}(A') \;=\; \sum_{e \in \partial_I(A)} c_e \cdot a^{\ell-1} \;=\; a \cdot a^{\ell-1} \;=\; a^{\ell}.$$
Similarly, the demand from copies of $I^{\ell-1}$ cut by $A'$ is exactly
$$\sum_{e \in \partial_I(A)} c_e \cdot b(1 + a + \cdots + a^{\ell-2}) \;=\; a \cdot b(1 + a + \cdots + a^{\ell-2}) \;=\; b(a + a^2 + \cdots + a^{\ell-1}).$$