# Vertex Sparsifiers and Abstract Rounding Algorithms

Moses Charikar (moses@cs.princeton.edu), Center for Computational Intractability, Department of Computer Science, Princeton University. Supported by NSF awards MSPA-MCS 0528414, CCF 0832797, and AF 0916218.

Tom Leighton (ftl@math.mit.edu), Mathematics Department, MIT and Akamai Technologies, Inc.

Shi Li (shili@cs.princeton.edu), Center for Computational Intractability, Department of Computer Science, Princeton University. Supported by NSF awards MSPA-MCS 0528414, CCF 0832797, and AF 0916218.

Ankur Moitra (moitra@mit.edu), Computer Science and Artificial Intelligence Laboratory, MIT. This research was supported in part by a Fannie and John Hertz Foundation Fellowship. Part of this work was done while the author was visiting Princeton University.

###### Abstract

The notion of vertex sparsification (in particular cut-sparsification) was introduced in [18], where it was shown that for any graph $G = (V, E)$ and a subset of $k$ terminals $K \subset V$, there is a polynomial time algorithm to construct a graph $H = (K, E_H)$ on just the terminal set so that simultaneously for all cuts $(A, K - A)$, the value of the minimum cut in $G$ separating $A$ from $K - A$ is approximately the same as the value of the corresponding cut in $H$. Then approximation algorithms can be run directly on $H$ as a proxy for running on $G$, yielding approximation guarantees independent of the size of the graph. In this work, we consider how well cuts in the sparsifier $H$ can approximate the minimum cuts in $G$, and whether algorithms that use such reductions need to incur a multiplicative penalty in the approximation guarantee depending on the quality of the sparsifier.

We give the first super-constant lower bounds for how well a cut-sparsifier $H$ can simultaneously approximate all minimum cuts in $G$. We prove a lower bound of $\Omega(\log^{1/4} k)$; this is polynomially related to the known upper bound of $O(\log k / \log\log k)$. This is an exponential improvement on the $\Omega(\log\log k)$ bound given in [15], which in fact was for a stronger vertex sparsification guarantee and did not apply to cut-sparsifiers.

Despite this negative result, we show that for many natural problems, we do not need to incur a multiplicative penalty for our reduction. Roughly, we show that any rounding algorithm which also works for the 0-extension relaxation can be used to construct good vertex-sparsifiers for which the optimization problem is easy. Using this, we obtain optimal $O(\log k)$-competitive Steiner oblivious routing schemes, which generalize the results in [21]. We also demonstrate that for a wide range of graph packing problems (which include maximum concurrent flow, maximum multiflow and multicast routing, among others, as special cases), the integrality gap of the linear program is always at most $O(\log k)$ times the integrality gap restricted to trees. This result helps to explain the ubiquity of the $O(\log k)$ guarantees for such problems. Lastly, we use our ideas to give an efficient construction for vertex-sparsifiers that match the current best existential results; this was previously open. Our algorithm makes novel use of Earth-mover constraints.

## 1 Introduction

### 1.1 Background

The notion of vertex sparsification (in particular cut-sparsification) was introduced in [18]: Given a graph $G = (V, E)$ and a subset of terminals $K \subset V$, the goal is to construct a graph $H = (K, E_H)$ on just the terminal set so that simultaneously for all cuts $(A, K - A)$, the value of the minimum cut in $G$ separating $A$ from $K - A$ is approximately the same as the value of the corresponding cut in $H$. If for all cuts $(A, K - A)$, the value of the cut in $H$ is at least the value of the corresponding minimum cut in $G$ and is at most $\alpha$ times this value, then we call $H$ a cut-sparsifier of quality $\alpha$.

The motivation for considering such questions is in obtaining approximation algorithms with guarantees that are independent of the size of the graph. For many graph partitioning and multicommodity flow questions, the value of the optimum solution can be approximated given just the values of the minimum cut separating $A$ from $K - A$ in $G$ (for every $A \subset K$). As a result the value of the optimum solution is approximately preserved when mapping the optimization problem to $H$. So approximation algorithms can be run on $H$ as a proxy for running directly on $G$, and because the size (number of nodes) of $H$ is $k$, any approximation algorithm that achieves a $\mathrm{poly}(\log n)$-approximation guarantee in general will achieve a $\mathrm{poly}(\log k)$ approximation guarantee when run on $H$ (provided that the quality $\alpha$ is also $\mathrm{poly}(\log k)$). Feasible solutions in $H$ can also be mapped back to feasible solutions in $G$ for many of these problems, so polynomial time constructions for good cut-sparsifiers yield black box techniques for designing approximation algorithms with guarantees $\mathrm{poly}(\log k)$ (and independent of the size of the graph).

In addition to being useful for designing approximation algorithms with improved guarantees, the notion of cut-sparsification is also a natural generalization of many methods in combinatorial optimization that attempt to preserve certain cuts in $G$ (as opposed to all minimum cuts) in a smaller graph, for example Gomory-Hu Trees and Mader’s Theorem. Here we consider a number of questions related to cut-sparsification:

1. Is there a super-constant lower bound on the quality of cut-sparsifiers? Do the best (or even near-best) cut-sparsifiers necessarily result from (a distribution on) contractions?

2. Do we really need to pay a price (in the approximation guarantee) when applying vertex sparsification to an optimization problem?

3. Can we construct (in polynomial time) cut-sparsifiers with quality as good as the current best existential results?

We resolve all of these questions in this paper. In the following subsections, we will describe what is currently known about each of these questions, our results, and our techniques. (Recently, it has come to our attention that, independent of and concurrent to our work, Makarychev and Makarychev, and independently, Englert, Gupta, Krauthgamer, Raecke, Talgam and Talwar obtained results similar to some in this paper.)

### 1.2 Super-Constant Lower Bounds and Separations

In [18], it is proven that in general there are always cut-sparsifiers of quality at most $O(\log k / \log\log k)$. In fact, if $G$ excludes any fixed minor then this bound improves to $O(1)$. Yet prior to this work, no super-constant lower bound was known for the quality of cut-sparsifiers in general. We prove:

###### Theorem 1.

There is an infinite family of graphs that admits no cut-sparsifiers of quality better than $\Omega(\log^{1/4} k)$.

Some results are known in more general settings. In particular, one could require that the graph $H$ not only approximately preserve minimum cuts but also approximately preserve the congestion of all multicommodity flows (with demand endpoints restricted to be in the terminal set). This notion of vertex-sparsification is referred to as flow-sparsification (see [15]) and admits a similar definition of quality. [15] gives a lower bound of $\Omega(\log\log k)$ for the quality of flow-sparsifiers. However, this does not apply to cut-sparsifiers and in fact, for the example given in [15], there is an $O(1)$-quality cut-sparsifier!

Additionally, there are examples in which cuts can be preserved within a constant factor, yet flows cannot: Benczur and Karger [3] proved that given any graph $G$ on $n$ nodes, there is a sparse (weighted) graph that approximates all cuts in $G$ within a multiplicative $(1 + \epsilon)$ factor, but one provably cannot preserve the congestion of all multicommodity flows within a factor better than  on a sparse graph (consider the complete graph $K_n$). So here the limits of sparsification are much different for cuts than for flows.

In this paper, we give a super-constant lower bound on the quality of cut-sparsifiers in general and in fact this implies a stronger lower bound than is given in [15]. Our bound is polynomially related to the current best upper bound, which is $O(\log k / \log\log k)$.

We note that the current best upper bound is actually a reduction from the upper bound on the integrality gap of a particular LP relaxation for the 0-extension problem [6], [8]. The integrality gap of this LP relaxation is known to be $\Omega(\sqrt{\log k})$. Yet, the best lower bound we are able to obtain here is $\Omega(\log^{1/4} k)$. This leads us to our next question: Do integrality gaps for the 0-extension LP immediately imply lower bounds for cut-sparsification? This question, as we will see, is essentially equivalent to the question of whether or not the best cut-sparsifiers necessarily come from a distribution on contractions.

Lower bounds on the quality of cut-sparsifiers (in this paper) and flow-sparsifiers ([15]) are substantially more complicated than integrality gap examples for the 0-extension LP relaxation. If the best cut-sparsifiers or flow-sparsifiers were actually always generated from some distribution on contractions in the original graph, then via strong duality (see Section 3.1) any integrality gap would immediately imply a lower bound for cut-sparsification or flow-sparsification. But as we demonstrate here, this is not the case:

###### Theorem 2.

There is an infinite family of graphs so that the ratio of the best quality cut-sparsifier to the best quality cut-sparsifier that can be achieved through a distribution on contractions is

We also note that in order to prove this result we establish a somewhat surprising connection between cut-sparsification and the harmonic analysis of Boolean functions. The particular cut-sparsifier that we construct in order to prove this result is inspired by the noise stability operator, and as a result, we can use tools from harmonic analysis (Bourgain’s Junta Theorem [5] and the Hypercontractive Inequality [4], [2]) to analyze the quality of the cut-sparsifier. Casting this question of bounding the quality as a question in harmonic analysis allows us to reason about many cuts simultaneously without worrying about the messy details of the combinatorics.

### 1.3 Abstract Integrality Gaps and Rounding Algorithms

As described earlier, running an approximation algorithm on the sparsifier $H$ as a proxy for the graph $G$ pays an additional price in the approximation guarantee that corresponds to how well $H$ approximates $G$. Here we consider the question of whether this loss can be avoided.

As a motivating example, consider the problem of Steiner oblivious routing [18]. Previous techniques for constructing Steiner oblivious routing schemes [18], [15] first construct a flow-sparsifier $H$ for $G$, construct an oblivious routing scheme in $H$ and then map this back to a Steiner oblivious routing scheme in $G$. Any such approach must pay a price in the competitive ratio, and cannot achieve an $O(\log k)$-competitive guarantee because (for example) expanders do not admit constant factor flow-sparsifiers [15].

So black box reductions pay a price in the competitive ratio, yet here we present a technique for combining the flow-sparsification techniques in [15] and the oblivious routing constructions in [21] into a single step, and we prove that there are $O(\log k)$-competitive Steiner oblivious routing schemes, which is optimal. This result is a corollary of a more general idea:

The constructions of flow-sparsifiers given in [15] (which is an extension of the techniques in [18]) can be regarded as a dual to the rounding algorithm in [8] for the 0-extension problem. What we observe here is: Suppose we are given a rounding algorithm that is used to round the fractional solution of some relaxation to an integral solution for some optimization problem. If this rounding algorithm also works for the relaxation for the 0-extension problem given in [12] (and also used in [6], [8]), then we can use the techniques in [18], [15] to obtain stronger flow-sparsifiers which are not only good quality flow-sparsifiers, but also for which the optimization problem is easy. So in this way we do not need to pay an additional price in the approximation guarantee in order to replace the dependence on $n$ with a dependence on $k$. With these ideas in mind, what we observe is that the rounding algorithm in [9], which embeds metric spaces into distributions on dominating tree-metrics, can also be used to round the 0-extension relaxation. This allows us to construct flow-sparsifiers that have $O(\log k)$-quality, and also can be explicitly written as a convex combination of 0-extensions that are tree-like. On trees, oblivious routing is easy, and so this gives us a way to simultaneously construct good flow-sparsifiers and good oblivious routing schemes on the sparsifier in one step!

Of course, the rounding algorithm in [9] for embedding metric spaces into distributions on dominating tree-metrics is a very common first step in rounding fractional relaxations of graph partitioning, graph layout and clustering problems. So for all problems that use this embedding as the main step, we are able to replace the dependence on $n$ with dependence on $k$, and we do not introduce any additional poly-logarithmic factors as in previous work! One can also interpret our result as giving a generalization of the hierarchical decompositions given in [21] for approximating the cuts in a graph on trees. We state our results more formally below, and we refer to such a statement as an Abstract Integrality Gap.

###### Definition 1.

We call a fractional packing problem a graph packing problem if the goal of the dual covering problem is to minimize the ratio of the total units of distance capacity allocated in the graph divided by some monotone increasing function of the distances between terminals.

This definition is quite general, and captures maximum concurrent flow, maximum multiflow, and multicast routing as special cases, in addition to many other common optimization problems. The integral dual problems are generalized sparsest cut, multicut and requirement cut respectively. (The notion of what constitutes an integral solution depends on the problem. In some cases, it translates to the requirement that the distances are all $0$ or $1$, and in other cases it can mean something else. The important point is that the notion of integral just defines a class of admissible metrics, as opposed to arbitrary metrics which can arise in the packing problem.)

###### Theorem 3.

For any graph packing problem $P$, the maximum ratio of the integral dual to the fractional primal is at most $O(\log k)$ times the maximum ratio restricted to trees.

For a packing problem that fits into this class, this theorem allows us to reduce bounding the integrality gap in general graphs to bounding the integrality gap on trees, which is often substantially easier than for general graphs (e.g. for the example problems given above). We believe that this result helps to explain the intrinsic robustness of fractional packing problems in undirected graphs, in particular the ubiquity of the $O(\log k)$ bound for the flow-cut gap for a wide range of multicommodity flow problems.

We also give a polynomial time algorithm to reduce any graph packing problem $P$ to a corresponding problem on a tree. Again, let $K$ be the set of terminals.

###### Definition 2.

Let $OPT(P, G)$ be the optimal value of the fractional graph packing problem $P$ on the graph $G$.

###### Theorem 4.

There is a polynomial time algorithm to construct a distribution $\mu$ on (a polynomial number of) trees on the terminal set $K$, s.t.

$$\mathbb{E}_{T \leftarrow \mu}\left[OPT(P, T)\right] \le O(\log k)\, OPT(P, G)$$

and such that any valid integral dual of cost $C$ (for any tree $T$ in the support of $\mu$) can be immediately transformed into a valid integral dual in $G$ of cost at most $C$.

As a corollary, given an approximation algorithm that achieves an approximation ratio of $C$ for the integral dual to a graph packing problem on trees, we obtain an approximation algorithm with a guarantee of $O(C \log k)$ for general graphs. We will refer to this last result as an Abstract Rounding Algorithm.

We also give a polynomial time construction of $O(\log k / \log\log k)$ quality flow-sparsifiers (and consequently cut-sparsifiers as well). These were previously only known to exist, and finding a polynomial time construction was still open. We accomplish this by performing a lifting (inspired by Earth-mover constraints) on an appropriate linear program. This lifting allows us to implicitly enforce a constraint that previously was difficult to enforce, and required an approximate separation oracle rather than an exact separation oracle. We give the details in Section 5.

## 2 Maximum Concurrent Flow

An instance of the maximum concurrent flow problem consists of an undirected graph $G = (V, E)$, a capacity function $c : E \rightarrow \mathbb{R}^+$ that assigns a non-negative capacity to each edge, and a set of demands $\{(s_i, t_i, f_i)\}$ where $s_i, t_i \in V$ and $f_i$ is a non-negative demand. We denote $K = \cup_i \{s_i, t_i\}$. The maximum concurrent flow question asks, given such an instance, what is the largest fraction of the demand that can be simultaneously satisfied? This problem can be formulated as a polynomial-sized linear program, and hence can be solved in polynomial time. However, a more natural formulation of the maximum concurrent flow problem can be written using an exponential number of variables.

For any $s, t \in V$ let $P_{s,t}$ be the set of all (simple) paths from $s$ to $t$ in $G$. Then the maximum concurrent flow problem and the corresponding dual can be written as:

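$$
\begin{array}{llll}
\max & \lambda & \min & \sum_e d(e)\, c(e) \\
\text{s.t.} & & \text{s.t.} & \\
& \sum_{P \in P_{s_i,t_i}} x(P) \ge \lambda f_i & & \forall P \in P_{s_i,t_i}: \ \sum_{e \in P} d(e) \ge D(s_i,t_i) \\
& \sum_{P \ni e} x(P) \le c(e) & & \sum_i D(s_i,t_i)\, f_i \ge 1 \\
& x(P) \ge 0 & & d(e) \ge 0,\ D(s_i,t_i) \ge 0
\end{array}
$$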

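For a maximum concurrent flow problem, let $\lambda^*$ denote the optimum.

Let $|K| = k$. Then for a given set of demands $\{(s_i, t_i, f_i)\}$, we associate a vector $\vec{f} \in \mathbb{R}^{\binom{k}{2}}$ in which each coordinate corresponds to a pair $(x, y) \in \binom{K}{2}$ and the value $\vec{f}_{x,y}$ is defined as the demand $f_i$ for the terminal pair $s_i = x$, $t_i = y$.

###### Definition 3.

We denote $cong_G(\vec{f}) = \frac{1}{\lambda^*}$.

Or equivalently $cong_G(\vec{f})$ is the minimum $C$ s.t. $\vec{f}$ can be routed in $G$ and the total flow on any edge is at most $C$ times the capacity of the edge.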

Throughout we will use the notation that graphs (on the same node set) are "summed" by taking the union of their edge set (and allowing parallel edges).

### 2.1 Cut Sparsifiers

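Suppose we are given an undirected, capacitated graph $G = (V, E)$ and a set $K \subset V$ of terminals of size $k$. Let $h : 2^V \rightarrow \mathbb{R}^+$ denote the cut function of $G$: $h(U) = \sum_{(u,v) \in E \text{ s.t. } u \in U,\, v \in V - U} c(u,v)$. We define the function $h_K : 2^K \rightarrow \mathbb{R}^+$, which we refer to as the terminal cut function on $K$: $h_K(A) = \min_{U \subset V \text{ s.t. } U \cap K = A} h(U)$.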

###### Definition 4.

$H$ is a cut-sparsifier for the graph $G = (V, E)$ and the terminal set $K$ if $H$ is a graph on just the terminal set $K$ (i.e. $H = (K, E_H)$) and if the cut function $h' : 2^K \rightarrow \mathbb{R}^+$ of $H$ satisfies (for all $U \subset K$)

$$h_K(U) \le h'(U)$$

We can define a notion of quality for any particular cut-sparsifier:

###### Definition 5.

The quality of a cut-sparsifier $H$ is defined as

$$\max_{U \subset K} \frac{h'(U)}{h_K(U)}$$

We will abuse notation and define $\frac{0}{0} = 1$ so that when $U$ is disconnected from $K - U$ in $G$, or if $U = \emptyset$ or $U = K$, the ratio of the two cut functions is $1$. We ignore these cases when computing the worst-case ratio and consequently the quality of a cut-sparsifier.
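
To make Definitions 4 and 5 concrete, the following sketch computes the terminal cut function and the quality of a candidate sparsifier by brute force on a toy instance. It is only an illustration: the function names, the edge-list representation, and the star example are assumptions of this sketch rather than anything taken from the paper, and the enumeration is exponential in the number of nodes.

```python
from itertools import combinations


def subsets(items):
    """Yield every subset of `items` as a set."""
    items = list(items)
    for r in range(len(items) + 1):
        for combo in combinations(items, r):
            yield set(combo)


def cut_value(edges, side):
    """Capacity of edges (u, v, c) with exactly one endpoint in `side`."""
    return sum(c for u, v, c in edges if (u in side) != (v in side))


def terminal_cut(edges, nodes, terminals, A):
    """h_K(A): cheapest cut h(U) over all U with U ∩ K = A (brute force)."""
    steiner = [v for v in nodes if v not in terminals]
    return min(cut_value(edges, set(A) | extra) for extra in subsets(steiner))


def quality(edges, nodes, terminals, sparsifier_edges):
    """Return max h'(A) / h_K(A), or None if some h'(A) < h_K(A)."""
    worst = 0.0
    for A in subsets(terminals):
        if not A or len(A) == len(terminals):
            continue  # the empty set and the full set are ignored
        h_k = terminal_cut(edges, nodes, terminals, A)
        h_prime = cut_value(sparsifier_edges, A)
        if h_k == 0:
            continue  # A is disconnected from K - A: ignored as well
        if h_prime < h_k - 1e-9:
            return None  # H is not a cut-sparsifier
        worst = max(worst, h_prime / h_k)
    return worst


# Toy instance (hypothetical): a star with centre "s" and three terminals.
nodes = ["s", "a", "b", "c"]
terminals = ["a", "b", "c"]
star = [("a", "s", 1.0), ("b", "s", 1.0), ("c", "s", 1.0)]
triangle = [("a", "b", 0.5), ("b", "c", 0.5), ("a", "c", 0.5)]
heavy = [("a", "b", 1.0), ("b", "c", 1.0), ("a", "c", 1.0)]
```

On this star, every proper nonempty terminal set has minimum cut 1, so the triangle with capacities 1/2 reproduces all terminal cuts exactly, while the triangle with unit capacities is a valid but worse sparsifier.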

### 2.2 0-Extensions

###### Definition 6.

$f : V \rightarrow K$ is a 0-extension if for all $a \in K$, $f(a) = a$.

So a 0-extension is a clustering of the nodes in $V$ into $k$ sets, with the property that each set contains exactly one terminal.

###### Definition 7.

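Given a graph $G = (V, E)$, a set $K \subset V$, and a 0-extension $f$, $G_f = (K, E_f)$ is a capacitated graph in which for all $a, b \in K$, the capacity $c_f(a, b)$ of edge $(a, b) \in E_f$ is

$$\sum_{(u,v) \in E \text{ s.t. } f(u) = a,\, f(v) = b} c(u,v)$$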
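
As a small illustration of Definitions 6 and 7, the sketch below checks that a map is a 0-extension and builds the capacities of the contracted graph. The function names, the dictionary encoding of $f$, and the path example are assumptions made for this sketch only.

```python
from collections import defaultdict


def is_zero_extension(f, terminals):
    """A 0-extension fixes every terminal and maps every node to a terminal."""
    fixes_terminals = all(f[t] == t for t in terminals)
    return fixes_terminals and set(f.values()) <= set(terminals)


def contract(edges, f):
    """Capacities of G_f: sum c(u, v) over edges whose endpoints map to {a, b}."""
    capacity = defaultdict(float)
    for u, v, c in edges:
        a, b = f[u], f[v]
        if a != b:  # edges inside one cluster disappear
            capacity[frozenset((a, b))] += c
    return dict(capacity)


# Hypothetical example: the path a - x - y - b with capacities 1, 2, 3.
path = [("a", "x", 1.0), ("x", "y", 2.0), ("y", "b", 3.0)]
f = {"a": "a", "x": "a", "y": "b", "b": "b"}
g_f = contract(path, f)
```

Here `x` joins the cluster of `a` and `y` joins the cluster of `b`, so only the middle edge survives, and the single edge of the contracted graph has capacity 2.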

## 3 Lower Bounds for Cut Sparsifiers

Consider the following construction for a graph $G$. Let $Y$ be the hypercube of size $2^d$ for $d = \log k$. Then for every node $y_s \in Y$ (i.e. $s \in \{0,1\}^d$), we add a terminal $z_s$ and connect the terminal $z_s$ to $y_s$ using an edge of capacity $\sqrt{d}$. All the edges in the hypercube are given capacity $1$. We’ll use this instance to show two lower bounds, one for 0-extension cut-sparsifiers and the other for arbitrary cut-sparsifiers.
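
The construction can be sketched in code as follows. This is a minimal illustration that assumes terminal edges of capacity $\sqrt{d}$ and hypercube edges of capacity $1$ (the values consistent with the minimum-cut calculations in Section 3.2); the function name, the bitmask encoding of $s$, and the node labels `("y", s)` and `("z", s)` are hypothetical.

```python
import math


def hypercube_instance(d, terminal_capacity=None, cube_capacity=1.0):
    """Edge list (u, v, capacity) on cube nodes ("y", s) and terminals ("z", s)."""
    if terminal_capacity is None:
        terminal_capacity = math.sqrt(d)  # assumed value, see the text above
    edges = []
    for s in range(2 ** d):
        # one terminal z_s attached to each hypercube node y_s
        edges.append((("z", s), ("y", s), terminal_capacity))
        for bit in range(d):
            t = s ^ (1 << bit)  # flip one coordinate of s
            if s < t:           # add each hypercube edge only once
                edges.append((("y", s), ("y", t), cube_capacity))
    return edges


edges = hypercube_instance(3)
```

For $d = 3$ this gives $2^d = 8$ terminal edges and $d \cdot 2^{d-1} = 12$ hypercube edges.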

### 3.1 Lower bound for Cut Sparsifiers from 0-extensions

In this subsection, we give an $\Omega(\sqrt{\log k})$ integrality gap for the semi-metric relaxation of the 0-extension problem on this graph, even when the semi-metric (actually on all of $V$) is $\ell_1$. Such a bound is actually implicit in the work of [11] too. Also, we show a strong duality between the worst case integrality gap for the semi-metric relaxation (when the semi-metric on $K$ must be $\ell_1$) and the quality of the best cut-sparsifier that can result from contractions. This gives an $\Omega(\sqrt{\log k})$ lower bound on how well a distribution on 0-extensions can approximate the minimum cuts in $G$.

Also, given the graph $G = (V, E)$, a set $K \subset V$ of terminals, and a semi-metric $D$ on $K$, we define the 0-extension problem as:

###### Definition 8.

The 0-Extension Problem is defined as

$$\min_{\text{0-extensions } f} \ \sum_{(u,v) \in E} c(u,v)\, D(f(u), f(v))$$

We denote by $OPT$ the value of this optimum.

###### Definition 9.

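Let $\Delta_U : V \times V \rightarrow \mathbb{R}^+$ denote the cut-metric in which $\Delta_U(u, v) = 1$ if exactly one of $u, v$ is in $U$, and $\Delta_U(u, v) = 0$ otherwise.

Also, given a partition $\mathcal{P}$ of $V$, we will refer to $\Delta_{\mathcal{P}}$ as the partition metric (induced by $\mathcal{P}$), which is $1$ if $u$ and $v$ are contained in different subsets of the partition $\mathcal{P}$, and is $0$ otherwise.

$$
\begin{array}{ll}
\min & \sum_{(u,v) \in E} c(u,v)\, \delta(u,v) \\
\text{s.t.} & \delta \text{ is a semi-metric on } V \\
& \forall t, t' \in K: \ \delta(t,t') = D(t,t')
\end{array}
$$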

We refer to this linear program as the Semi-Metric Relaxation. For a particular instance $(G, K, D)$ of the 0-extension problem, we denote the optimal value of this linear program as $OPT_{sm}(G, K, D)$.

###### Theorem 5.

[8]

$$OPT_{sm}(G,K,D) \le OPT \le O\left(\frac{\log k}{\log\log k}\right) OPT_{sm}(G,K,D)$$

If we are given a semi-metric $D$ which is $\ell_1$, we can additionally define a stronger (exponentially) sized linear program.

$$
\begin{array}{ll}
\min & \sum_U \delta(U)\, h(U) \\
\text{s.t.} & \forall t, t' \in K: \ \sum_U \delta(U)\, \Delta_U(t,t') = D(t,t')
\end{array}
$$

We will refer to this linear program as the Cut-Cut Relaxation. For a particular instance $(G, K, D)$ of the 0-extension problem, we denote the optimal value of this linear program as $OPT_{cc}(G, K, D)$.

The value of this linear program is that an upper bound on the integrality gap of this linear program (for a particular graph $G$ and a set of terminals $K$) gives an upper bound on the quality of cut-sparsifiers. In fact, a stronger statement is true, and the quality of the best cut-sparsifier that can be achieved through contractions will be exactly equal to the maximum integrality gap of this linear program. The upper bound is given in [18], and here we exhibit a strong duality:

###### Definition 10.

The Contraction Quality of $G, K$ is defined to be the minimum $\alpha$ such that there is a distribution $\gamma$ on 0-extensions and $H = \sum_f \gamma(f)\, G_f$ is an $\alpha$-quality cut-sparsifier.

###### Lemma 1.

Let $\nu$ be the maximum integrality gap of the Cut-Cut Relaxation for a particular graph $G = (V, E)$ and a particular set $K \subset V$ of terminals, over all $\ell_1$ semi-metrics $D$ on $K$. Then the Contraction Quality of $G, K$ is exactly $\nu$.

###### Proof.

Let $\alpha$ be the Contraction Quality of $G, K$. Then implicitly in [18], $\alpha \le \nu$. Suppose $\gamma$ is a distribution on 0-extensions s.t. $H = \sum_f \gamma(f)\, G_f$ is an $\alpha$-quality cut-sparsifier. Given any $\ell_1$ semi-metric $D$ on $K$, we can solve the Cut-Cut Linear Program given above. Notice that any cut $U \subset V$ that is assigned positive weight in an optimal solution must be the minimum cut separating $U \cap K = A$ from $K - A$ in $G$. If not, we could replace this cut with the minimum cut separating $A$ from $K - A$ without affecting the feasibility and simultaneously reducing the cost of the solution. So for all $U$ for which $\delta(U) > 0$, $h(U) = h_K(U \cap K)$.

Consider then the cost of the semi-metric $D$ against the cut-sparsifier $H$, which is defined to be $\sum_{(a,b)} c_H(a,b)\, D(a,b)$ and which is just the average cost of $D$ against $G_f$ where $f$ is sampled from the distribution $\gamma$. The Cut-Cut Linear Program gives a decomposition of $D$ into a weighted sum of cut-metrics, i.e. $D(a,b) = \sum_U \delta(U)\, \Delta_U(a,b)$. Also, the cost of $D$ against $H$ is linear in $D$, so this implies that

$$\sum_{(a,b)} c_H(a,b)\, D(a,b) = \sum_{(a,b)} \sum_U c_H(a,b)\, \delta(U)\, \Delta_U(a,b) = \sum_U \delta(U)\, h'(U \cap K)$$

In the last step, we use $\sum_{(a,b)} c_H(a,b)\, \Delta_U(a,b) = h'(U \cap K)$. Then

$$\sum_{(a,b)} c_H(a,b)\, D(a,b) \le \sum_U \delta(U)\, \alpha\, h_K(U \cap K) = \alpha\, OPT_{cc}(G,K,D)$$

In the inequality, we have used the fact that $H$ is an $\alpha$-quality cut-sparsifier, and in the last step we have used that $\delta(U) > 0$ implies that $h(U) = h_K(U \cap K)$. This completes the proof because the average cost of $D$ against $G_f$ where $f$ is sampled from $\gamma$ is at most $\alpha\, OPT_{cc}(G,K,D)$, so there must be some $f$ s.t. the cost of $D$ against $G_f$ is at most $\alpha\, OPT_{cc}(G,K,D)$. ∎

We will use this strong duality between the Cut-Cut Relaxation and the Contraction Quality to show that for the graph $G$ given above, no distribution on 0-extensions gives better than an $\Omega(\sqrt{\log k})$ quality cut-sparsifier, and all we need to accomplish this is to demonstrate an integrality gap on the example for the Cut-Cut Relaxation.

Let’s repeat the construction of $G$ here. Let $Y$ be the hypercube of size $2^d$ for $d = \log k$. Then for every node $y_s \in Y$ (i.e. $s \in \{0,1\}^d$), we add a terminal $z_s$ and connect the terminal $z_s$ to $y_s$ using an edge of capacity $\sqrt{d}$. All the edges in the hypercube are given capacity $1$.

Then consider the distance assignment to the edges: Each edge connecting a terminal to a node in the hypercube - i.e. an edge of the form is assigned distance and every other edge in the graph is assigned distance . Then let be the shortest path metric on given these edge distances.

###### Claim 1.

is an semi-metric on , and in fact there is a weighted combination of cuts s.t. and

###### Proof.

We can take for any cut s.t. - i.e. is the axis-cut corresponding to the bit. We also take for each . This set of weights will achieve , and also there are axis cuts each of which has capacity and there are singleton cuts of weight and capacity so the total cost is .

Yet if we take equal to the restriction of on , then :

###### Proof.

Consider any 0-extension $f$. We can define the weight of any terminal  as . Then  because each node in  is assigned to some terminal. We can define a terminal as heavy with respect to $f$ if  and light otherwise. Obviously,  so the sum of the sizes of either all heavy terminals or of all light terminals is at least .

Suppose that . For any pair of terminals , . Also for any light terminal , is a subset of the Hypercube of at most  nodes, and the small-set expansion of the Hypercube implies that the number of edges out of this set is at least . Each such edge pays at least  cost, because  for all pairs of terminals. So this implies that the total cost of the 0-extension is at least .

Suppose that . Consider any heavy terminal , and consider any  and . Then the edge  has capacity  and pays a total distance of . Consider any set  of  nodes in the Hypercube. If we attempt to pack these nodes so as to minimize  for some fixed node , then the packing that minimizes the quantity is an appropriately sized Hamming ball centered at . In a Hamming ball centered at the node  of at least  total nodes, the average distance from  is , and so this implies that . Each such edge has capacity  so the total cost of the 0-extension is at least

And of course using our strong duality result, this integrality gap implies that any cut-sparsifier that results from a distribution on 0-extensions has quality at least $\Omega(\sqrt{\log k})$. This matches the current best lower bound on the integrality gap of the Semi-Metric Relaxation for 0-extension, so in principle this could be the best lower bound we could hope for (if the integrality gap of the Semi-Metric Relaxation is in fact $O(\sqrt{\log k})$ then there are always cut-sparsifiers that result from a distribution on 0-extensions that have quality at most $O(\sqrt{\log k})$).

### 3.2 Lower bounds for Arbitrary Cut sparsifiers

We will in fact use the above example to give a lower bound on the quality of any cut-sparsifier. We will show that for the above graph, no cut-sparsifier achieves quality better than $\Omega(\log^{1/4} k)$, and this gives an exponential improvement over the previous lower bound on the quality of flow-sparsifiers (which is even a stronger requirement for sparsifiers, and hence a weaker lower bound).

The particular example $G$ that we gave above has many symmetries, and we can use these symmetries to justify considering only symmetric cut-sparsifiers. Because these cut-sparsifiers can be assumed without loss of generality to have nice symmetry properties, any such cut-sparsifier is characterized by a much smaller set of variables rather than one variable for every pair of terminals. In fact, we will be able to reduce the number of variables from $\binom{k}{2}$ to $\log k$. This in turn will allow us to consider a much smaller family of cuts in $G$ in order to derive that the system is infeasible. In fact, we will only consider sub-cube cuts (cuts in which $U = \{z_s \cup y_s \mid s_1 = s_2 = \dots = s_j = 0\}$ for some $j$) and the Hamming ball $U = \{z_s \cup y_s \mid d(s, \vec{0}) \le \frac{d}{2}\}$.

###### Definition 11.

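The operation $J_s$ for some $s \in \{0,1\}^d$ is defined as $J_s(y_t) = y_{t + s \bmod 2}$ and $J_s(z_t) = z_{t + s \bmod 2}$. Also let $J_s(U) = \cup_{u \in U} J_s(u)$.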

###### Definition 12.

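For any permutation $\pi : [d] \rightarrow [d]$, let $\pi(s) = [s_{\pi(1)}, s_{\pi(2)}, \dots, s_{\pi(d)}]$. Then the operation $J_\pi$ for any permutation $\pi$ is defined as $J_\pi(y_t) = y_{\pi(t)}$ and $J_\pi(z_t) = z_{\pi(t)}$. Also let $J_\pi(U) = \cup_{u \in U} J_\pi(u)$.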

###### Claim 2.

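For any subset $U \subset V$ and any $s \in \{0,1\}^d$, $h(U) = h(J_s(U))$.

###### Claim 3.

For any subset $U \subset V$ and any permutation $\pi : [d] \rightarrow [d]$, $h(U) = h(J_\pi(U))$.

Both of these operations are automorphisms of the weighted graph $G$ and also send the set $K$ to $K$.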
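
The two families of operations can be checked mechanically on a small cube. The sketch below, with hypothetical helper names and labels encoded as 0/1 tuples, verifies that coordinate-wise shifts and coordinate permutations are bijections of $\{0,1\}^d$ that preserve Hamming distance. Since each operation acts on the labels of hypercube nodes and terminals in the same way, preserving Hamming distance means hypercube edges map to hypercube edges and terminal edges map to terminal edges, which is the content of Claims 2 and 3.

```python
from itertools import permutations, product


def hamming(s, t):
    """Number of coordinates in which s and t differ."""
    return sum(a != b for a, b in zip(s, t))


def shift(s, t):
    """J_s on labels: add s to t coordinate-wise mod 2."""
    return tuple((a + b) % 2 for a, b in zip(s, t))


def permute(pi, t):
    """J_pi on labels: coordinate i of the image is t[pi[i]]."""
    return tuple(t[pi[i]] for i in range(len(t)))


def preserves_distances(op, d):
    """True if `op` is a bijection of {0,1}^d preserving Hamming distance."""
    cube = list(product((0, 1), repeat=d))
    if len({op(t) for t in cube}) != len(cube):
        return False
    return all(hamming(a, b) == hamming(op(a), op(b)) for a in cube for b in cube)


d = 3
all_shifts_ok = all(
    preserves_distances(lambda t, s=s: shift(s, t), d)
    for s in product((0, 1), repeat=d)
)
all_perms_ok = all(
    preserves_distances(lambda t, pi=pi: permute(pi, t), d)
    for pi in permutations(range(d))
)
```

The check is exhaustive only for the small dimension chosen here; it is meant as a sanity check of the definitions, not as a proof.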

###### Lemma 3.

If there is a cut-sparsifier $H$ for $G$ which has quality $\alpha$, then there is a cut-sparsifier $H'$ which has quality at most $\alpha$ and is invariant under the automorphisms of the weighted graph $G$ that send $K$ to $K$.

###### Proof.

Given the cut-sparsifier $H$, we can apply an automorphism $J$ to $G$, and because $h(U) = h(J(U))$, this implies that $h_K(A) = \min_{U \text{ s.t. } U \cap K = A} h(U) = \min_{U \text{ s.t. } U \cap K = A} h(J(U))$. Also $J(U \cap K) = J(U) \cap J(K) = J(U) \cap K$, so we can re-write this last line as

$$\min_{U \text{ s.t. } U \cap K = A} h(J(U)) = \min_{U' \text{ s.t. } J(U') \cap K = J(A)} h(J(U'))$$

And if we set $U = J(U')$ then this last line becomes equivalent to

$$\min_{U' \text{ s.t. } J(U') \cap K = J(A)} h(J(U')) = \min_{U \text{ s.t. } U \cap K = J(A)} h(U) = h_K(J(A))$$

So the result is that $h_K(A) = h_K(J(A))$, and this implies that if we do not re-label $H$ according to $J$, but we do re-label $G$, then for any subset $A$, we are checking whether the minimum cut in $G$ re-labeled according to $J$ that separates $A$ from $K - A$ is close to the cut in $H$ that separates $A$ from $K - A$. The minimum cut in the re-labeled $G$ that separates $A$ from $K - A$ is just the minimum cut in $G$ that separates $J^{-1}(A)$ from $K - J^{-1}(A)$ (because the set $J^{-1}(A)$ is the set that is mapped to $A$ under $J$). So $H$ is an $\alpha$-quality cut-sparsifier for the re-labeled $G$ iff for all $A$:

$$h_K(A) = h_K(J^{-1}(A)) \le h'(A) \le \alpha\, h_K(J^{-1}(A)) = \alpha\, h_K(A)$$

which is of course true because $H$ is an $\alpha$-quality cut-sparsifier for $G$.

So alternatively, we could have applied the automorphism $J^{-1}$ to $H$ and not re-labeled $G$, and this resulting graph would also be an $\alpha$-quality cut-sparsifier for $G$. Also, since the set of $\alpha$-quality cut-sparsifiers is convex (it is defined by a system of inequalities), we can find a cut-sparsifier $H'$ that has quality at most $\alpha$ and is a fixed point of the group of automorphisms, and hence invariant under the automorphisms of $G$ as desired. ∎

###### Corollary 1.

If $\alpha$ is the quality of the best cut-sparsifier for the above graph $G$, then there is an $\alpha$-quality cut-sparsifier $H$ in which the capacity between two terminals $z_s$ and $z_t$ is only dependent on the Hamming distance $Hamm(s, t)$.

###### Proof.

Given any quadruple $z_s, z_t$ and $z_{s'}, z_{t'}$ s.t. $Hamm(s, t) = Hamm(s', t')$, there is a concatenation of operations from $J_s$, $J_\pi$ that sends $s$ to $s'$ and $t$ to $t'$. This concatenation of operations is in the group of automorphisms that send $K$ to $K$, and hence we can assume that $H$ is invariant under this operation, which implies that $c_H(z_s, z_t) = c_H(z_{s'}, z_{t'})$. ∎

One can regard any cut-sparsifier (not just ones that result from contractions) as a set of $\binom{k}{2}$ variables, one for the capacity of each edge in $H$. Then the constraints that $H$ be an $\alpha$-quality cut-sparsifier are just a system of inequalities, one for each subset $A \subset K$ that enforces that the cut in $H$ is at least as large as the minimum cut in $G$ (i.e. $h'(A) \ge h_K(A)$) and one enforcing that the cut is not too large (i.e. $h'(A) \le \alpha\, h_K(A)$). Then in general, one can derive lower bounds on the quality of cut-sparsifiers by showing that if $\alpha$ is not large enough, then this system of inequalities is infeasible, meaning that there is no cut-sparsifier achieving quality $\alpha$. Unlike the argument in Section 3.1, this form of a lower bound is much stronger and does not assume anything about how the cut-sparsifier is generated.

Theorem 1. For $\alpha = o(\log^{1/4} k)$, there is no cut-sparsifier $H$ for $G$ which has quality at most $\alpha$.

Proof (sketch): Assume that there is a cut-sparsifier $H'$ of quality at most $\alpha$. Then using the above corollary, there is a cut-sparsifier $H$ of quality at most $\alpha$ in which the weight from $a$ to $b$ is only a function of $Hamm(a, b)$. Then for each $i \in [d]$, we can define a variable $w_i$ as the total weight of edges of length $i$ incident to any terminal. I.e. $w_i = \sum_{b \text{ s.t. } Hamm(a,b) = i} c_H(a, b)$.

For simplicity, here we will assume that all cuts in the sparsifier $H$ are at most the cost of the corresponding minimum cut in $G$ and at least $\frac{1}{\alpha}$ times the corresponding minimum cut. This of course is an identical set of constraints to the one we get from dividing the standard definition that we use in this paper for $\alpha$-quality cut-sparsifiers by $\alpha$.

We need to derive a contradiction from the system of inequalities that characterize the set of $\alpha$-quality cut-sparsifiers for $G$. As we noted, we will consider only the sub-cube cuts (cuts in which $U = \{z_s \cup y_s \mid s_1 = s_2 = \dots = s_j = 0\}$ for some $j$) and the Hamming ball $U = \{z_s \cup y_s \mid d(s, \vec{0}) \le \frac{d}{2}\}$, which we refer to as the Majority Cut.

Consider the Majority Cut: There are terminals on each side of the cut, and most terminals have Hamming weight close to . In fact, we can sort the terminals by Hamming weight and each weight level around Hamming weight has roughly a fraction of the terminals. Any terminal of Hamming weight has roughly a constant fraction of their weight crossing the cut in , because choosing a random terminal Hamming distance from any such terminal corresponds to flipping coordinates at random, and throughout this process there are almost an equal number of s and s so this process is well-approximated by a random walk starting at on the integers, which equally likely moves forwards and backwards at each step for total steps, and asking the probability that the walk ends at a negative integer.

In particular, for any terminal of Hamming weight , the fraction of the weight that crosses the Majority Cut is . So the total weight of length edges (i.e. edges connecting two terminals at Hamming distance ) cut by the Majority Cut is because each weight level close to the boundary of the Majority Cut contains roughly a fraction of the terminals. So the total weight of edges crossing the Majority Cut in is

And the total weight crossing the minimum cut in separating from is . And because the cuts in are at least times the corresponding minimum cut in , this implies

Next, we consider the set of sub-cube cuts. For , let . Then the minimum cut in separating from is , because each node in the Hypercube which has the first coordinates as zero has edges out of the sub-cube, and when , we would instead cut each terminal from the graph directly by cutting the edge .

Also, for any terminal in , the fraction of length edges that cross the cut is approximately . So the constraints that each cut in be at most the corresponding minimum cut in give the inequalities

We refer to the above constraint as . Multiplying each constraint by and adding up the constraints yields a linear combination of the variables on the left-hand side. The coefficient of any is

$$\sum_{j=1}^{d-1} \frac{\min\left(\frac{ij}{d}, 1\right)}{j^{3/2}} \;\geq\; \sum_{j=1}^{d/i} \frac{\frac{ij}{d}}{j^{3/2}}$$

And using the Integration Rule this is .
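As a rough numerical sanity check of this step, the following sketch evaluates both sums for a few values of the edge length. It assumes that the bound obtained from the Integration Rule is on the order of $\sqrt{i/d}$, which is the quantity that appears in the chain of inequalities used later in the proof; the function names `coefficient` and `truncated` and the choice `d = 4096` are illustrative and not taken from the paper.

```python
import math

def coefficient(i, d):
    """Full sum: sum_{j=1}^{d-1} min(i*j/d, 1) / j^(3/2)."""
    return sum(min(i * j / d, 1.0) / j ** 1.5 for j in range(1, d))

def truncated(i, d):
    """Truncated sum: sum_{j=1}^{d/i} (i*j/d) / j^(3/2).

    The upper limit is capped at d-1 (an assumption of this sketch) so that
    every term of the truncated sum also appears in the full sum.
    """
    upper = min(d // i, d - 1)
    return sum((i * j / d) / j ** 1.5 for j in range(1, upper + 1))

d = 4096  # illustrative dimension
lengths = (1, 4, 16, 64, 256, 1024)
# Ratio of the truncated sum to sqrt(i/d); it should stay bounded away
# from zero and infinity if the sum is Theta(sqrt(i/d)).
ratios = [truncated(i, d) / math.sqrt(i / d) for i in lengths]

for i, r in zip(lengths, ratios):
    print(f"i={i:5d}  full={coefficient(i, d):.4f}  "
          f"truncated={truncated(i, d):.4f}  truncated/sqrt(i/d)={r:.3f}")
```

The ratio in the last column staying within a constant range across very different values of the edge length is what the asymptotic claim predicts; this is only an empirical illustration, not a substitute for the integral estimate.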

This implies that the coefficients of the constraint resulting from adding up times each for each are at least a constant times the coefficient of in the Majority Cut Inequality. So we get

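$$\sum_{j=1}^{d-1} \frac{1}{j^{3/2}} \min\left(j, \sqrt{d}\right) \;\geq\; \Omega\left(\sum_{j=1}^{d-1} \frac{1}{j^{3/2}} \sum_{i=1}^{d} \min\left(\frac{ij}{d}, 1\right) w_i\right) \;\geq\; \Omega\left(\sum_{i=1}^{d} w_i \sqrt{\frac{i}{d}}\right) \;\geq\; \Omega\left(\frac{\sqrt{d}}{\alpha}\right)$$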

And we can evaluate the constant using the Integration Rule: it evaluates to . This implies and in particular this implies . So the quality of the best cut-sparsifier for is at least .
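The growth rate of the leftmost sum in the chain above can also be checked numerically. The sketch below assumes that the sum in question is $\sum_{j=1}^{d-1} j^{-3/2}\min(j,\sqrt{d})$ and compares it against $d^{1/4}$; the helper name `lhs` and the sampled values of `d` are hypothetical choices made for illustration.

```python
import math

def lhs(d):
    """sum_{j=1}^{d-1} min(j, sqrt(d)) / j^(3/2)."""
    root = math.sqrt(d)
    return sum(min(j, root) / j ** 1.5 for j in range(1, d))

# Ratio of the sum to d^(1/4) for a few dimensions; a bounded ratio is
# consistent with the sum growing like d^(1/4).
dims = (2 ** 8, 2 ** 12, 2 ** 16)
scaled = [lhs(d) / d ** 0.25 for d in dims]

for d, s in zip(dims, scaled):
    print(f"d={d:6d}  sum={lhs(d):8.3f}  sum/d^(1/4)={s:.3f}")
```

The two halves of the sum (terms with $j \leq \sqrt{d}$ and terms with $j > \sqrt{d}$) each contribute on the order of $d^{1/4}$, which is why the ratio stays bounded.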

We note that this is the first super-constant lower bound on the quality of cut-sparsifiers. Recent work gives a super-constant lower bound on the quality of flow-sparsifiers in an infinite family of expander-like graphs. However, for this family there are constant-quality cut-sparsifiers. In fact, lower bounds for cut-sparsifiers imply lower bounds for flow-sparsifiers, so we are able to improve the lower bound of in the previous work for flow-sparsifiers by an exponential factor to , and this is the first lower bound that is tight to within a polynomial factor of the current best upper bound of .

This bound is not as good as the lower bound we obtained earlier in the restricted case in which the cut-sparsifier is generated as a convex combination of -extension graphs . As we will demonstrate, there are actually cut-sparsifiers that achieve quality for , and so in general restricting to convex combinations of -extensions is sub-optimal, and we leave open the possibility that the ideas in this improved bound may result in better constructions of cut (or flow)-sparsifiers that are able to beat the current best upper bound on the integrality gap of the -extension linear program.

## 4 Noise Sensitive Cut-Sparsifiers

In Appendix A, we give a brief introduction to the harmonic analysis of Boolean functions, along with formal statements that we will use in the proof of our main theorem in this section.

### 4.1 A Candidate Cut-Sparsifier

Here we give a cut-sparsifier which will achieve quality for the graph given in Section 3, which is asymptotically better than the best cut-sparsifier that can be generated from contractions.

As we noted, we can assume that the weight assigned between a pair of terminals in , is only a function of the Hamming distance from to . In , the minimum cut separating any singleton terminal from is just the cut that deletes the edge . So the capacity of this cut is . We want a good cut-sparsifier to approximately preserve this cut, so the total capacity incident to any terminal in will also be - i.e. .

We distribute this capacity among the other terminals as follows: We sample , and allocate an infinitesimal fraction of the total weight to the edge . Equivalently, the capacity of the edge connecting and is just . We choose . This choice of corresponds to flipping each bit in with probability when generating from . We prove that the graph has cuts at most the corresponding minimum-cut in .

This cut-sparsifier has cuts at most the corresponding minimum-cut in . In fact, a stronger statement is true: can be routed as a flow in with congestion . Consider the following explicit routing scheme for : Route the total flow in out of to the node in . Now we need to route these flows through the Hypercube in a way that does not incur too much congestion on any edge. Our routing scheme for routing the edge from to in from to will be symmetric with respect to the edges in the Hypercube: choose a random permutation of the bits , and given , fix each bit in the order defined by . So consider . If , and the flow is currently at the node , then flip the bit of , and continue for , .

Each permutation defines a routing scheme, and averaging over all permutations results in a routing scheme that routes in .
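The bit-fixing rule described above can be sketched as follows. Hypercube nodes are encoded as integers whose binary digits are the coordinates; this encoding, the function name `bit_fixing_path`, and the sample endpoints are assumptions of the sketch rather than notation from the paper.

```python
import random

def bit_fixing_path(s, t, perm):
    """Return the hypercube path from s to t that visits the coordinates in
    the order given by perm and flips each coordinate where the current
    node still disagrees with t."""
    path = [s]
    cur = s
    for bit in perm:
        if ((cur ^ t) >> bit) & 1:   # coordinate `bit` still differs from t
            cur ^= 1 << bit          # flip it
            path.append(cur)
    return path

d = 5                                # illustrative dimension
rng = random.Random(0)               # fixed seed so the run is repeatable
perm = list(range(d))
rng.shuffle(perm)                    # a random order on the coordinates

s, t = 0b00110, 0b11010              # hypothetical source and target nodes
path = bit_fixing_path(s, t, perm)
print([format(v, f"0{d}b") for v in path])
```

Every path produced this way has length equal to the Hamming distance between its endpoints, which is the fact used when the path length is computed in the congestion analysis.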

###### Claim 4.

This routing scheme is symmetric with respect to the automorphisms and of defined above.

###### Corollary 2.

The congestion on any edge in the Hypercube incurred by this routing scheme is the same.
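For a small dimension the uniformity of the load can be verified exhaustively. The sketch below routes a demand between every ordered pair of hypercube nodes, averages over all orderings of the coordinates, and records the load on each hypercube edge using exact rational arithmetic. The demand function `weight`, which depends only on the Hamming distance, is a hypothetical stand-in for the capacities of the sparsifier, and the terminal edges are ignored.

```python
from collections import defaultdict
from fractions import Fraction
from itertools import permutations

def bit_fixing_path(s, t, perm):
    """Path from s to t that fixes differing coordinates in the order perm."""
    path, cur = [s], s
    for bit in perm:
        if ((cur ^ t) >> bit) & 1:
            cur ^= 1 << bit
            path.append(cur)
    return path

def edge_loads(d, weight):
    """Load on each undirected hypercube edge when every ordered pair (s, t)
    routes weight(hamming(s, t)) units, split evenly over all d! orderings."""
    perms = list(permutations(range(d)))
    loads = defaultdict(Fraction)
    for s in range(2 ** d):
        for t in range(2 ** d):
            if s == t:
                continue
            share = Fraction(weight(bin(s ^ t).count("1")), len(perms))
            for perm in perms:
                path = bit_fixing_path(s, t, perm)
                for u, v in zip(path, path[1:]):
                    loads[frozenset((u, v))] += share
    return loads

d = 4
# Hypothetical demand: 1/h units between nodes at Hamming distance h.
loads = edge_loads(d, lambda h: Fraction(1, h))
print(len(loads), "edges carry load; distinct load values:", set(loads.values()))
```

Because all edges carry the same load, the worst-case load equals the total load divided by the number of hypercube edges, which is the reduction from worst-case to average congestion.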

###### Lemma 4.

The above routing scheme will achieve congestion at most for routing in .

###### Proof.

Since the congestion of any edge in the Hypercube under this routing scheme is the same, we can calculate the worst case congestion on any edge by calculating the average congestion. Using a symmetry argument, we can consider any fixed terminal and calculate the expected increase in average congestion when sampling a random permutation and routing all the edges out of in using . This expected value will be times the average congestion, and hence the worst-case congestion of routing in according to the above routing scheme.

As we noted above, we can define equivalently as arising from the random process of sampling , and routing an infinitesimal fraction of the total capacity out of to , and repeating until all of the capacity is allocated. We can then calculate the expected increase in average congestion (under a random permutation ) caused by routing the edges out of as the expected increase in average congestion divided by the total fraction of the capacity allocated when we choose the target from . In particular, if we allocated a fraction of the capacity, the expected increase in total congestion is just the total capacity that we route multiplied by the length of the path. Of course, the length of this path is just the number of bits in which and differ, which in expectation is by our choice of .

So in this procedure, we allocate total capacity, and the expected increase in total congestion is the total capacity routed times the expected path length . We repeat this procedure times, and so the expected increase in total congestion caused by routing the edges out of in is . If we perform this procedure for each terminal, the resulting total congestion is , and because there are edges in the Hypercube, the average congestion is which implies that the worst-case congestion on any edge in the Hypercube is also , as desired. Also, the congestion on any edge is because there is a total of capacity out of in , and this is the only flow routed on this edge, which has capacity in by construction. So the worst-case congestion on any edge in the above routing scheme is . ∎

For any , .

###### Proof.

Consider any set . Let be the minimum cut in separating from . Then the total flow routed from to in is just , and if this flow can be routed in with congestion , this implies that the total capacity crossing the cut from to is at least . And of course the total capacity crossing the cut from to is just by the definition of , which implies the corollary. ∎

So we know that the cuts in are never too much larger than the corresponding minimum cut in . To show that the quality of is , all that remains is to show that the cuts in are never too small. We conjecture that the quality of is actually , and this seems natural since the quality of just restricted to the Majority Cut and the sub-cube cuts is actually , and often the Boolean functions corresponding to these cuts serve as extremal examples in the harmonic analysis of Boolean functions. In fact, our lower bound on the quality of any cut-sparsifier for is based only on analyzing these cuts so in a sense, our lower bound is tight given the choice of cuts in that we used to derive infeasibility in the system of inequalities characterizing -quality cut-sparsifiers.

### 4.2 A Fourier Theoretic Characterization of Cuts in H

Here we give a simple formula for the size of a cut in , given the Fourier representation of the cut. So here we consider cuts to be Boolean functions of the form s.t. iff .

###### Proof.

We can again use the infinitesimal characterization for , in which we choose and allocate units of capacity from to and repeat until all units of capacity are spent.

If we instead choose uniformly at random, and then choose and allocate units of capacity from to , and repeat this procedure until all units of capacity are spent, then at each step the expected contribution to the cut is exactly because is exactly the probability that if we choose uniformly at random, and that which means that this edge contributes to the cut. We repeat this procedure times, so this implies the lemma. ∎
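The probabilistic quantity in this argument can be computed by brute force for a small cube and compared with its Fourier expression. The sketch below assumes the standard textbook convention: a cut is a function into $\{-1,+1\}$, the second point is obtained by flipping each bit independently with probability $(1-\rho)/2$, and $\mathbb{E}[f(x)f(y)] = \sum_S \rho^{|S|}\hat{f}(S)^2$. The paper's own definitions are in Appendix A and may be normalized differently; the value `rho = 0.6` and the majority function on five bits are arbitrary illustrative choices.

```python
def popcount(v):
    return bin(v).count("1")

def fourier_coefficients(f, d):
    """hat f(S) = E_x[f(x) * chi_S(x)], with S encoded as a bitmask."""
    n = 2 ** d
    return [sum(f(x) * (-1) ** popcount(x & S) for x in range(n)) / n
            for S in range(n)]

def correlation_direct(f, d, rho):
    """E[f(x) f(y)]: x uniform, y flips each bit of x w.p. (1 - rho) / 2."""
    p = (1 - rho) / 2
    n = 2 ** d
    total = 0.0
    for x in range(n):
        for y in range(n):
            h = popcount(x ^ y)
            total += f(x) * f(y) * p ** h * (1 - p) ** (d - h)
    return total / n

def correlation_fourier(f, d, rho):
    """sum_S rho^{|S|} * hat f(S)^2."""
    return sum(rho ** popcount(S) * c * c
               for S, c in enumerate(fourier_coefficients(f, d)))

d, rho = 5, 0.6                       # illustrative parameters

def majority(x):
    """+1 on the Hamming ball of weight > d/2, -1 elsewhere."""
    return 1 if popcount(x) > d / 2 else -1

direct = correlation_direct(majority, d, rho)
spectral = correlation_fourier(majority, d, rho)
# Probability that a sampled pair (x, y) lands on opposite sides of the cut.
crossing_probability = (1 - direct) / 2
print(direct, spectral, crossing_probability)
```

Under this convention the probability that a sampled edge crosses the cut is $(1 - \mathbb{E}[f(x)f(y)])/2$, so the size of a cut is determined by how the Fourier weight of the cut's indicator is distributed across levels.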

###### Proof.

Using the setting , we can compute using the above lemma:

$$h'(A) = k\,\frac{\sqrt{d}}{4}\left(1 - NS_{\rho}[f_A(x)]\right)$$

And using Parseval’s Theorem,