A note on approximate strengths of edges in a hypergraph
Let $H=(V,E)$ be an edge-weighted hypergraph of rank $r$. Kogan and Krauthgamer [7] extended Benczúr and Karger's [2] random sampling scheme for cut sparsification from graphs to hypergraphs. The sampling requires an algorithm for computing the approximate strengths of edges. In this note we extend the algorithm for graphs from [2] to hypergraphs and describe a near-linear time algorithm to compute approximate strengths of edges; we build on a sparsification result for hypergraphs from our recent work [4]. Combined with prior results we obtain faster algorithms for finding $(1+\epsilon)$-approximate mincuts when the rank of the hypergraph is small.
Benczúr and Karger, in their seminal work [2], showed that all cuts of a weighted graph $G=(V,E)$ on $n$ vertices can be approximated to within a $(1\pm\epsilon)$-factor by a sparse weighted graph $G'$ where $|E(G')| = O(n\log n/\epsilon^2)$. Moreover, $G'$ can be computed in near-linear time by a randomized algorithm. The algorithm has two steps. In the first step, it computes for each edge $e$ a number $p_e$ which is proportional to the inverse of the approximate strength of edge $e$. The second step is a simple sampling scheme where each edge $e$ is independently sampled with probability $p_e$ and, if it is chosen, the edge is assigned a capacity $w(e)/p_e$ where $w(e)$ is the original capacity/weight of $e$. It is shown in [2] that the probabilities can be computed by a deterministic algorithm in $O(m\log^3 n)$ time, where $m = |E|$, if $G$ is weighted, and in $O(m\log^2 n)$ time if $G$ is unweighted or has polynomially bounded weights. More recent work [5] has shown a general framework for cut sparsification where the sampling probability $p_e$ can also be chosen based on the approximate connectivity between the end points of $e$. In another seminal work, Batson, Spielman and Srivastava [1] showed that spectral sparsifiers, which are stronger than cut sparsifiers, with $O(n/\epsilon^2)$ edges exist, and can be computed in polynomial time. Lee and Sun recently showed that such sparsifiers can be computed in $\widetilde{O}(m/\epsilon^{O(1)})$ time, where $\widetilde{O}$ hides polylog factors [10].
In this paper, we are interested in hypergraphs. $H=(V,E)$ is a hypergraph where each edge $e \in E$ is a subset of nodes. Graphs are the special case in which each $e$ has cardinality $2$. The rank of a hypergraph is $\max_{e\in E}|e|$. Recently Kogan and Krauthgamer [7] extended Benczúr and Karger's sampling scheme and analysis to show the following: for any weighted hypergraph $H$ with rank $r$ there is a $(1\pm\epsilon)$-approximate cut-sparsifier with $O(n(r+\log n)/\epsilon^2)$ edges. They show that sampling with (approximate) strengths will yield the desired sparsifier. Finding the strengths of edges in hypergraphs can be easily done in polynomial time. In this note, we develop a near-linear time algorithm for computing approximate strengths of edges in a weighted hypergraph. We state a formal theorem and indicate some applications after we formally define the strength of an edge.
Strength of an edge:
Let $H=(V,E)$ be an unweighted hypergraph. For $S \subseteq V$, $H[S]$ denotes the vertex induced subhypergraph of $H$, whose edges are the edges of $H$ contained in $S$. For $S \subseteq V$ we denote by $\delta(S)$ the set of edges that cross $S$; formally $\delta(S) = \{e \in E : e \cap S \neq \emptyset \text{ and } e \setminus S \neq \emptyset\}$. A hypergraph is $k$-edge-connected if $|\delta(S)| \ge k$ for every non-trivial cut $(S, V\setminus S)$. The \emph{edge-connectivity} of $H$, denoted by $\lambda(H)$, is the largest $k$ such that $H$ is $k$-edge-connected. Equivalently, $\lambda(H)$ is the value of the min-cut in $H$. The \emph{strength} of an edge $e$, denoted by $k_{H,e}$, is defined as $k_{H,e} = \max_{S \supseteq e} \lambda(H[S])$; in other words, the largest connectivity of a vertex induced subhypergraph that contains $e$. We also define the cost of $e$ as the inverse of the strength; that is, $c_{H,e} = 1/k_{H,e}$. We drop the subscript $H$ if there is no confusion regarding the hypergraph.
The preceding definitions generalize easily to weighted hypergraphs. Let $w(e)$ be a non-negative integer weight for edge $e$. Then the notion of edge-connectivity generalizes via the weight of a mincut: $\lambda(H) = \min_{\emptyset \neq S \subsetneq V} \sum_{e \in \delta(S)} w(e)$. The strength $k_e$ of an edge $e$ is the maximum mincut value over all vertex induced subhypergraphs that contain $e$.
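To make the definitions concrete, here is a brute-force computation of connectivity and strength directly from the definitions. It is exponential time and only meant for tiny examples; the function names are ours, not from the note.

```python
from itertools import combinations

def cut_value(edges, weights, S):
    # total weight of hyperedges with endpoints on both sides of (S, V \ S)
    return sum(w for e, w in zip(edges, weights) if e & S and e - S)

def connectivity(vertices, edges, weights):
    # min cut value over all non-trivial cuts (exponential; tiny inputs only)
    vs = sorted(vertices)
    best = float("inf")
    for i in range(1, 2 ** (len(vs) - 1)):   # vs[-1] always stays outside S
        S = {v for j, v in enumerate(vs) if (i >> j) & 1}
        best = min(best, cut_value(edges, weights, S))
    return best

def strength(vertices, edges, weights, e):
    # k_e = max over X containing e of lambda(H[X]);
    # H[X] keeps only edges fully contained in X
    rest = sorted(set(vertices) - set(e))
    best = 0
    for t in range(len(rest) + 1):
        for extra in combinations(rest, t):
            X = set(e) | set(extra)
            inside = [(f, w) for f, w in zip(edges, weights) if f <= X]
            lam = connectivity(X, [f for f, _ in inside], [w for _, w in inside])
            best = max(best, lam)
    return best
```

On the small example in the test below, every edge has strength $2$, realized by the induced subhypergraph on all four vertices.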
For a hypergraph $H=(V,E)$ we use $n$ to denote the number of vertices, $m$ to denote the number of edges and $p = \sum_{e\in E}|e|$ to denote the total degree. We observe that $p$ is the natural representation size of $H$ when the rank of the hypergraph is not bounded.
Our main technical result is the following.
Theorem 1. Let $H=(V,E)$ be an edge-weighted hypergraph on $n$ vertices. There is an efficient algorithm that computes for each edge $e$ an approximate strength $\hat{k}_e$ such that the following properties are satisfied:
lower bound property: $\hat{k}_e \le k_e$ for every $e \in E$, and
$\gamma$-cost property: $\sum_{e\in E} w(e)/\hat{k}_e \le \gamma n$, where $\gamma = O(r)$.
For unweighted hypergraphs the running time of the algorithm is $O(p\log p\log^2 n)$, and for weighted hypergraphs the running time is $O(p\log p\log^2 n)$ as well, after the reduction described in Section 4.
A function $\hat{k}$ that satisfies the theorem is called a $\gamma$-approximate strength. One natural approach is to convert the hypergraph to a graph and hope that strengths approximately carry over; for example, replacing each hyperedge with a clique, or with a star spanning the vertices of the edge. One might hope that the minimum strength among the replacement edges approximates $k_e$ to within a factor depending only on the rank. Unfortunately, this fails even when the rank is only $3$. Consider the following hypergraph $H$ with vertices $u, w, v_1, \ldots, v_n$. There is an edge $e = \{u, w\}$, and an edge $e_i = \{u, w, v_i\}$ for all $i \in [n]$. The strength of $e$ in $H$ is $1$. Let $G$ be the graph where each hyperedge in $H$ is replaced with a star centered at $u$ that spans its vertices. The strength of $e$ in $G$ is $n+1$, since $G[\{u,w\}]$ contains $n+1$ parallel copies of the edge $uw$. This bound also holds if each hyperedge is replaced by a clique.
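The gap can be checked mechanically with a brute-force strength computation (exponential time; the construction below instantiates the example above with $n = 3$, under our reading of the example).

```python
from itertools import combinations

def connectivity(vertices, edges):
    # min cut value over all non-trivial cuts (unweighted, brute force)
    vs = sorted(vertices)
    best = float("inf")
    for i in range(1, 2 ** (len(vs) - 1)):
        S = {v for j, v in enumerate(vs) if (i >> j) & 1}
        best = min(best, sum(1 for e in edges if e & S and e - S))
    return best

def strength(vertices, edges, e):
    # max over X containing e of lambda(H[X])
    rest = sorted(set(vertices) - set(e))
    best = 0
    for t in range(len(rest) + 1):
        for extra in combinations(rest, t):
            X = set(e) | set(extra)
            best = max(best, connectivity(X, [f for f in edges if f <= X]))
    return best

# hypergraph H: edge e = {u, w} plus e_i = {u, w, v_i} for i = 1..3
VH = {"u", "w", "v1", "v2", "v3"}
EH = [{"u", "w"}] + [{"u", "w", v} for v in ("v1", "v2", "v3")]
# star replacement G: every hyperedge becomes a star centered at u
EG = [{"u", "w"}]
for v in ("v1", "v2", "v3"):
    EG += [{"u", "w"}, {"u", v}]
```

The assertions below confirm the factor-$(n+1)$ overestimate in the star graph.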
Our proof of the preceding theorem closely follows the corresponding theorem for graphs from [2]. A key technical tool for graphs is the deterministic sparsification algorithm for edge-connectivity due to Nagamochi and Ibaraki [12]. Here we rely on a generalization to hypergraphs from our recent work [4].
A weighted hypergraph $H'$ is a $(1\pm\epsilon)$-cut approximation of a weighted hypergraph $H$ if, for every $S \subseteq V$, the cut value of $S$ in $H'$ is within a $(1\pm\epsilon)$ factor of the cut value of $S$ in $H$. Below we state formally the sampling theorem from [7].
Theorem 2 ([7])
Let $H$ be a rank $r$ weighted hypergraph where edge $e$ has weight $w(e)$ and strength $k_e$. Let $H'$ be a hypergraph obtained by independently sampling each edge $e$ in $H$ with probability $p_e = \min\{1, \rho/k_e\}$, where $\rho = \Theta((r+\log n)/\epsilon^2)$, and giving $e$ weight $w(e)/p_e$ if sampled. With probability at least $1 - 1/\mathrm{poly}(n)$, the hypergraph $H'$ has $O(n(r+\log n)/\epsilon^2)$ edges and is a $(1\pm\epsilon)$-cut-approximation of $H$.
The expected weight of each edge in $H'$ is its weight in $H$. It is not difficult to prove that a $\gamma$-approximate strength also suffices to obtain a $(1\pm\epsilon)$-cut-approximation. That is, if we replace $k_e$ with a $\gamma$-approximate strength $\hat{k}_e$, the sampling algorithm will still output a $(1\pm\epsilon)$-cut-approximation of $H$. Indeed, the lower bound property shows that each edge is sampled with a probability at least as high as before, so the probability of obtaining a $(1\pm\epsilon)$-sparsifier does not decrease. However, the number of edges in the sparsifier increases: the $\gamma$-cost property shows the sampled hypergraph will have $O(\gamma n(r+\log n)/\epsilon^2)$ edges.
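A minimal sketch of the sampling step (our function names; `rho` stands for the $\Theta((r+\log n)/\epsilon^2)$ oversampling parameter, which the caller must supply):

```python
import random

def sample_sparsifier(edges, weights, strengths, rho, rng):
    # keep edge e with probability p_e = min(1, rho / k_e);
    # a kept edge gets weight w(e) / p_e, so its expected weight is w(e)
    out = []
    for e, w, k in zip(edges, weights, strengths):
        p = min(1.0, rho / k)
        if rng.random() < p:
            out.append((e, w / p))
    return out
```

With approximate strengths $\hat{k}_e \le k_e$ supplied as `strengths`, every $p_e$ only increases, which is why the lower bound property preserves correctness at the price of more sampled edges.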
From Theorem 1, we can find an $O(r)$-approximate strength function in $O(p\log p\log^2 n)$ time.
Corollary 1. A $(1\pm\epsilon)$-cut approximation of $H$ with $O(rn(r+\log n)/\epsilon^2)$ edges can be found in $O(p\log p\log^2 n)$ time with high probability.
The number of edges in the $(1\pm\epsilon)$-cut approximation is worse than Theorem 2 by a factor of $O(r)$. It is an open problem whether this extra factor can be removed.
As is the case for graphs, cut sparsification allows for faster approximation algorithms for cut problems. We mention two below.
The best running time known currently to compute a (global) mincut in a hypergraph is $O(np + n^2\log n)$ [6, 11]; no faster algorithm is known even for a $(1+\epsilon)$-approximation. Via Corollary 1 we can first sparsify the hypergraph and then apply the $O(np + n^2\log n)$ time algorithm to the sparsified hypergraph. This gives a randomized algorithm that outputs a $(1+\epsilon)$-approximate mincut in a rank $r$ hypergraph in $\widetilde{O}(p + n^2 r^{O(1)}/\epsilon^2)$ time and works with high probability. For small $r$ the running time is $\widetilde{O}(p + n^2/\epsilon^2)$ for a $(1+\epsilon)$-approximation.
The standard technique to compute an $s$-$t$ mincut in a hypergraph is to compute an $s$-$t$ maximum flow in an associated capacitated digraph with $n+2m$ vertices and $2p+m$ edges [8]. An $s$-$t$ maximum flow in such a graph can be found in $\widetilde{O}(p\sqrt{n+m})$ time [9]. Via the sparsification of Corollary 1, we can obtain a randomized algorithm to find a $(1+\epsilon)$-approximate $s$-$t$ mincut in a rank $r$ hypergraph in $\widetilde{O}(p + n^{1.5}\,r^{O(1)}/\epsilon^3)$ time. For small $r$ the running time is $\widetilde{O}(p + n^{1.5}/\epsilon^3)$ for a $(1+\epsilon)$-approximation.
A \emph{$k$-strong component} of $H$ is an inclusion-wise maximal set of vertices $S$ such that $H[S]$ is $k$-edge-connected. An edge $e$ is \emph{$k$-strong} if $k_e \ge k$; otherwise $e$ is \emph{$k$-weak}. $c(H)$ is the number of connected components of a hypergraph $H$. The \emph{size} of $H$ is $p = \sum_{e\in E}|e|$. The \emph{degree} of a vertex $v$, $\deg(v)$, is the number of edges incident to $v$. For a hypergraph $H'$, $n_{H'}$ denotes the number of vertices in $H'$, $p_{H'}$ denotes the size of $H'$ and $r_{H'}$ denotes the rank of $H'$.
Contracting an edge $e$ in a hypergraph $H$ results in a new hypergraph where all vertices in $e$ are identified into a single new vertex $v_e$. Any edge $f \subseteq e$ is removed. Any edge $f$ that properly intersects $e$ is adjusted by replacing the vertices of $f \cap e$ by the new vertex $v_e$. We denote the resulting hypergraph by $H/e$.
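A direct implementation of contraction on a set-of-vertices edge representation (a sketch; the merged vertex is represented by a frozenset of the contracted vertices so that it stays hashable):

```python
def contract(edges, e):
    """Contract hyperedge e: merge its vertices into one new vertex v_e."""
    ve = frozenset(e)
    out = []
    for f in edges:
        if f <= set(e):            # f is contained in e and disappears
            continue
        if f & set(e):             # f properly intersects e: reroute to v_e
            out.append((f - set(e)) | {ve})
        else:
            out.append(set(f))
    return out
```

For example, contracting $\{1,2,3\}$ removes the edges $\{1,2,3\}$ and $\{1,2\}$, and rewrites $\{3,4\}$ to contain the merged vertex.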
Deleting an edge does not increase the strength of any remaining edge. Contracting an edge does not decrease the strength of any remaining edge.
Lemma 1. Let $H$ be a hypergraph. If an edge $e$ crosses a cut of value $k$, then $k_e \le k$.
Suppose $(S, V\setminus S)$ is a cut such that $|\delta(S)| = k$ and $e \in \delta(S)$. Consider any $X$ that contains $e$ as a subset and let $S' = S \cap X$. Since $e$ crosses $(S', X\setminus S')$, this is a non-trivial cut of $H[X]$. It is easy to see that $\delta_{H[X]}(S') \subseteq \delta_H(S)$ and hence $\lambda(H[X]) \le k$. Therefore, $k_e \le k$.
Lemma 2 (Extends Lemmas 4.5 and 4.6 of [2])
Let $e, f$ be distinct edges in a hypergraph $H$.
If $e$ is a $k$-strong edge and $f$ is a $k$-weak edge, then $k'_e \ge k$, where $k'_e$ is the strength of $e$ in $H - f$.
Suppose $k_e < k$ (that is, $e$ is a $k$-weak edge) and $f$ is a $k$-strong edge. Let $H' = H/f$ be obtained by contracting $f$. If $e'$ is the corresponding edge of $e$ in $H'$ then $k_{H',e'} \le k_{H,e}$.
In short, contracting a $k$-strong edge does not increase the strength of a $k$-weak edge, and deleting a $k$-weak edge does not decrease the strength of a $k$-strong edge.
We prove the two claims separately.
Deleting a $k$-weak edge:
Since $e$ is $k$-strong in $H$ there is $X$ such that $e \subseteq X$ and $\lambda(H[X]) \ge k$. If $f$ were an edge of $H[X]$, $f$ would be a $k$-strong edge. Since $f$ is $k$-weak, $f$ does not belong to $H[X]$. Thus $H[X] = (H - f)[X]$, which implies that $k'_e \ge k$.
Contracting a $k$-strong edge:
Let $H' = H/f$ where $f$ is a $k$-strong edge in $H$. Let $v_f$ be the vertex in $H'$ obtained by contracting $f$. Let $X'$ be the set that certifies the strength of $e'$, that is, $e' \subseteq X'$ and $\lambda(H'[X']) = k_{e'}$. If $v_f \notin X'$ then it is easy to see that $H'[X'] = H[X']$ and $e' = e$, which would imply that $k_e \ge k_{e'}$. Thus, we can assume that $v_f \in X'$. Let $X$ be the set of vertices in $H$ obtained by uncontracting $v_f$, that is, $X = (X'\setminus\{v_f\}) \cup f$. Let $Y$ be a set that certifies the strength of $f$, namely $f \subseteq Y$ and $\lambda(H[Y]) \ge k$, and let $Z = X \cup Y$. We claim that $\lambda(H[Z]) \ge \min\{k, \lambda(H'[X'])\}$. If this is the case we would have $k_e \ge \lambda(H[Z]) \ge \min\{k, k_{e'}\}$, since $e \subseteq Z$. We now prove the claim. Consider any cut $(S, Z\setminus S)$ in $H[Z]$. If the cut separates $Y$, then its value is at least $\lambda(H[Y]) \ge k$. Otherwise, $Y \subseteq S$ or $Y \subseteq Z\setminus S$. By symmetry, we only consider $Y \subseteq Z\setminus S$. Then $S \subseteq X'\setminus\{v_f\}$ because $S \cap f = \emptyset$. Hence $(S, X'\setminus S)$ is also a cut in $H'[X']$, and every edge of $H'[X']$ crossing it corresponds to an edge of $H[Z]$ crossing $(S, Z\setminus S)$. This shows that $|\delta_{H[Z]}(S)| \ge |\delta_{H'[X']}(S)| \ge \lambda(H'[X'])$.
Because $k_e \ge \min\{k, k_{e'}\}$ and $k_e < k$, we arrive at $k_{e'} \le k_e$.
Theorem 3 (Extends Lemma 4.10 of [2])
Consider a connected unweighted hypergraph $H$. The weighted hypergraph $H_c$ obtained by giving each edge $e$ the weight $c_e = 1/k_e$ has minimum cut value exactly $1$.
Consider a cut $(S, V\setminus S)$ of value $|\delta(S)|$ in $H$. For each edge $e \in \delta(S)$, $k_e \le |\delta(S)|$ by Lemma 1. Therefore $e$ has weight $c_e \ge 1/|\delta(S)|$ in $H_c$. It follows that the cut value of $S$ in $H_c$ is at least $|\delta(S)| \cdot \tfrac{1}{|\delta(S)|} = 1$. Thus, the mincut of $H_c$ is at least $1$.
Let $(S, V\setminus S)$ be a mincut in $H$, whose value is $\lambda(H)$. It is easy to see that for each $e \in \delta(S)$, $k_e = \lambda(H)$: we have $k_e \ge \lambda(H)$ since $H$ itself is $\lambda(H)$-edge-connected, and $k_e \le \lambda(H)$ by Lemma 1. Thus the value of the cut $S$ in $H_c$ is exactly $1$.
Lemma 3 (Extends Lemma 4.11 of [2])
For an unweighted hypergraph $H$ with at least one vertex, $\sum_{e\in E} c_e \le n - c(H)$.
Let $t = n - c(H)$. We prove the lemma by induction on $t$.
For the base case, if $t = 0$, then every connected component of $H$ is a single vertex; $H$ has no edges and therefore the sum is $0$.
Otherwise, $t \ge 1$. Let $V_1$ be a connected component of $H$ with at least two vertices. There exists a cut of cost $1$ in $H_c$ restricted to $V_1$ by Theorem 3.
Let $F$ be the set of edges of that cut. $H - F$ has at least one more connected component than $H$. Since deleting edges cannot increase strength, $c_{H,e} \le c_{H-F,e}$ for each $e \in E\setminus F$; hence $\sum_{e\in E\setminus F} c_{H,e} \le \sum_{e\in E\setminus F} c_{H-F,e}$. By the inductive hypothesis, $\sum_{e\in E\setminus F} c_{H-F,e} \le n - c(H-F) \le n - c(H) - 1$. The edges in $E$ are exactly $F \cup (E\setminus F)$. By the same argument as in the preceding theorem, $\sum_{e\in F} c_{H,e} = 1$. Therefore, $\sum_{e\in E} c_{H,e} \le 1 + (n - c(H) - 1) = n - c(H)$.
Corollary 2. The number of $k$-weak edges in an unweighted hypergraph on $n$ vertices is at most $k(n-1)$.
Lemma 4. The $k$-strong components are pairwise disjoint.
Consider distinct $k$-strong components $S_1$ and $S_2$. Assume $S_1 \cap S_2 \neq \emptyset$; then $H[S_1 \cup S_2]$ is $k$-edge-connected, using the triangle inequality for local connectivity. This shows $S_1 = S_2 = S_1 \cup S_2$ by the maximality of $S_1$ and $S_2$, a contradiction.
3 Estimating strengths in unweighted hypergraphs
In this section, we consider unweighted hypergraphs and describe a near-linear time algorithm to estimate the strengths of all edges as stated in Theorem 1. Let $H$ be the given hypergraph. The high-level idea is simple. We assume that there is a fast algorithm WeakEdges$(H, k)$ that returns a set of edges $F$ such that $F$ contains all the $k$-weak edges in $H$; the important aspect here is that the output may contain some edges which are not $k$-weak; however, the algorithm should not output too many such edges (this will be quantified later).
The estimation algorithm Estimation is defined in Figure 1. The algorithm repeatedly calls WeakEdges with doubling values of $k$ while removing the edges found in previous iterations: with $H_1 = H$, iteration $i$ computes $F_i = \text{WeakEdges}(H_i, 2^i)$, sets $\hat{k}_e = 2^{i-1}$ for each $e \in F_i$, and lets $H_{i+1} = H_i - F_i$.
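The loop just described can be sketched as follows, with `weak_edges` as a black box (we assume it returns a nonempty set while edges remain, as the implementation in Section 3.1 guarantees):

```python
def estimation(edges, weak_edges):
    """Skeleton of the Estimation loop.

    weak_edges(remaining, k) must return a superset of the k-weak edges
    of the hypergraph formed by `remaining` (edges must be hashable)."""
    remaining = list(edges)
    khat = {}
    i = 1
    while remaining:
        F = weak_edges(remaining, 2 ** i)
        for e in F:
            khat[e] = 2 ** (i - 1)   # a lower bound on the true strength
            remaining.remove(e)
        i += 1
    return khat
```

In the test below, `weak_edges` is mocked with exact strengths; the returned estimates satisfy the lower bound property.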
Lemma 5. Let $H$ be an unweighted hypergraph. Then,
For each $e \in F_i$, $k_{H,e} \ge 2^{i-1}$. That is, the strength of all edges deleted in iteration $i$ is at least $2^{i-1}$.
For each $e \in E$, $\hat{k}_e \le k_e$.
We have $k_{H_{i+1},e} \le k_{H_i,e}$ for all $e$ and $i$ because deleting edges cannot increase strength. Let $D_i$ denote the set of edges $e$ in $H$ with $k_{H,e} < 2^i$.
We prove that $D_i \subseteq F_1 \cup \cdots \cup F_i$ for all $i$ by induction on $i$. If $i = 1$, then $H_1 = H$ and $F_1$ contains all $2$-weak edges of $H$, so $D_1 \subseteq F_1$. Now we assume $i \ge 2$. At the end of iteration $i-1$, by induction, we have that $D_{i-1} \subseteq F_1 \cup \cdots \cup F_{i-1}$. In iteration $i$, $F_i$ contains all $2^i$-weak edges in the hypergraph $H_i$. We have $k_{H_i,e} \le k_{H,e} < 2^i$ for any edge $e \in D_i$ still present in $H_i$. Thus, any edge of $D_i$ not deleted in earlier iterations lies in $F_i$. This proves the claim.
Since every edge $e \in F_i$ survived iteration $i-1$, it does not lie in $D_{i-1}$; it follows from the previous claim that $k_{H,e} \ge 2^{i-1}$ for all $e \in F_i$. Since the $F_i$ form a partition of $E$, we have $\hat{k}_e \le k_e$ for all $e \in E$.
Note that in principle WeakEdges could output all the edges of the hypergraph for any $k$. This would result in a high cost for the resulting strength estimate. Thus, we need some additional properties on the output of the procedure WeakEdges. Let $H$ be a hypergraph. A set of edges $F$ is called \emph{$\gamma$-light} (with respect to $k$) if $|F| \le \gamma\,k\,(c(H-F) - c(H))$. Intuitively, on average, we remove at most $\gamma k$ edges in $F$ to increase the number of components of $H$ by one.
Lemma 6. If WeakEdges$(H, k)$ outputs a $\gamma$-light set of edges for every $k$, then the output of the algorithm satisfies the $2\gamma$-cost property. That is, $\sum_{e\in E} 1/\hat{k}_e \le 2\gamma n$.
From the description of Estimation, and using the fact that the edge sets $F_i$ partition $E$, we have $\sum_{e\in E} 1/\hat{k}_e = \sum_i |F_i|/2^{i-1}$.
$F_i$ is the output of WeakEdges$(H_i, 2^i)$. From the lightness property we assumed, $|F_i| \le \gamma\,2^i\,(c(H_i - F_i) - c(H_i)) = \gamma\,2^i\,(c(H_{i+1}) - c(H_i))$. Combining this with the preceding equality and telescoping over $i$ gives the claimed bound.
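Written out in one chain, the bound just derived telescopes as:

```latex
\sum_{e\in E}\frac{1}{\hat{k}_e}
  \;=\; \sum_i \frac{|F_i|}{2^{i-1}}
  \;\le\; \sum_i \frac{\gamma\,2^{i}\bigl(c(H_{i+1})-c(H_i)\bigr)}{2^{i-1}}
  \;=\; 2\gamma\sum_i \bigl(c(H_{i+1})-c(H_i)\bigr)
  \;\le\; 2\gamma\bigl(n-c(H)\bigr)\;\le\; 2\gamma n.
```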
3.1 Implementing WeakEdges
We now describe an implementation of WeakEdges that outputs an $O(r)$-light set.
Let $H$ be a hypergraph. An edge $e$ is \emph{$k$-crisp} with respect to $H$ if it crosses a cut of value less than $k$. In other words, there is a cut $(S, V\setminus S)$ such that $|\delta(S)| < k$ and $e \in \delta(S)$. Note that any $k$-crisp edge is $k$-weak by Lemma 1. A set of edges $F$ is a \emph{$k$-partition} if $F$ contains all the $k$-crisp edges in $H$. A $k$-partition may contain non-$k$-crisp edges.
We will assume access to a subroutine Partition$(H, k)$ that, given $H$ and an integer $k$, finds an $O(1)$-light $k$-partition of $H$. We will show how to implement Partition later. See Figure 2 for the implementation of WeakEdges.
Theorem 4. WeakEdges$(H, k)$ returns an $O(r)$-light set $F$ such that $F$ contains all the $k$-weak edges of $H$, using $O(\log n)$ calls to Partition.
First, we assume $H$ has no $k$-strong component with more than one vertex and that $H$ is connected. Then all edges are $k$-weak, and the number of $k$-weak edges in $H$ is at most $k(n-1)$ by Corollary 2. It also implies that $p \le rk(n-1)$. By Markov's inequality at least half the vertices have degree less than $2rk$. For any vertex with degree less than $2rk$, all edges incident to it are $2rk$-crisp. Thus, after the first iteration, all such vertices become isolated, since $F_1 = \text{Partition}(H, 2rk)$ contains all $2rk$-crisp edges. If $H$ is not connected then we can apply this same argument to each connected component and deduce that at least half of the vertices in each component will be isolated after the first iteration. Therefore, after $O(\log n)$ iterations, all vertices become isolated. Hence WeakEdges returns all the edges of $H$.
Now consider the general case when $H$ may have non-trivial $k$-strong components. We can apply the same argument as above to the hypergraph $H^*$ obtained by contracting each $k$-strong component into a single vertex. $H^*$ is well defined because the $k$-strong components are disjoint by Lemma 4.
Let the edges removed in the $j$th iteration be $F_j$, and let the hypergraph before the edge removal be $G_j$; so $G_{j+1} = G_j - F_j$. Recall that each call to Partition returns an $O(1)$-light set with respect to $2rk$. Hence we know $|F_j| = O(rk)\,(c(G_{j+1}) - c(G_j))$. Summing over $j$, $|F| = \sum_j |F_j| = O(rk)\,(c(G - F) - c(G))$.
This shows $F$ is $O(r)$-light.
It remains to implement Partition$(H, k)$, which returns an $O(1)$-light $k$-partition. To do this, we introduce $k$-sparse certificates. Let $H = (V, E)$ be a hypergraph. A subset of edges $E' \subseteq E$, defining $H' = (V, E')$, is a \emph{$k$-sparse certificate} of $H$ if $|\delta_{H'}(S)| \ge \min\{k, |\delta_H(S)|\}$ for all $S \subseteq V$.
Theorem 5. There is an algorithm that, given a hypergraph $H$ and an integer $k$, finds a $k$-sparse certificate $E'$ of $H$ in $O(p)$ time such that $|E'| \le k(n-1)$.
See Appendix A.
A $k$-sparse certificate is certainly a $k$-partition. However, $E'$ may contain too many edges to be $O(1)$-light. Thus, we would like to find a smaller subset of edges. Note that every $k$-crisp edge must be in every $k$-sparse certificate, and hence no edge in $E \setminus E'$ can be $k$-crisp. Hence we will contract the edges in $E \setminus E'$, and find a $k$-sparse certificate in the hypergraph after the contraction. We repeat the process until eventually we reach an $O(1)$-light set. See Figure 3 for the formal description of the algorithm.
Theorem 6. Partition$(H, k)$ outputs an $O(1)$-light $k$-partition in $O(p\log n)$ time.
If the algorithm returns all the edges of the graph in the first step, then it is easy to see that the output is an $O(1)$-light $k$-partition, since the algorithm explicitly checks for the lightness condition.
Otherwise let $E'$ be the output of Certificate$(H, k)$. As we argued earlier, $E \setminus E'$ contains no $k$-crisp edges and hence contracting these edges is safe. Moreover, all the original $k$-crisp edges remain $k$-crisp after the contraction. Since the algorithm recurses on the new hypergraph, this establishes the correctness of the output.
We now argue about termination and the running time by showing that the number of vertices halves in each recursive call. Assume $G$ contains $n$ vertices; the algorithm finds a $k$-sparse certificate $E'$ and contracts all edges not in the certificate. The resulting hypergraph $G'$ has $n'$ vertices and $m'$ edges. We have $m' \le |E'| \le k(n-1)$ by Theorem 5. If $n' \le n/2$, then the number of vertices halved. Otherwise $n' > n/2$; then the number of edges satisfies $m' \le k(n-1) \le 2k(n'-1)$, and the algorithm terminates in the next recursive call.
The running time of the algorithm for a size $p$ hypergraph with $n$ vertices is $T(p, n)$. $T$ satisfies the recurrence $T(p, n) \le T(p, n/2) + O(p)$, which solves to $T(p, n) = O(p\log n)$.
Putting things together, Estimation finds the desired approximate strength function.
Theorem 7. Let $H$ be an unweighted hypergraph. The output of Estimation$(H)$ is an $O(r)$-approximate strength function of $H$. Estimation can be implemented in $O(p\log p\log^2 n)$ time.
Combining Lemma 5, Lemma 6 and Theorem 4, we get that the output of Estimation is an $O(r)$-approximate strength function. The maximum strength in the hypergraph is at most $m \le p$, so all edges will be removed by the $O(\log p)$th iteration of the while loop. In each iteration, there is one call to WeakEdges. Each call of WeakEdges takes $O(p\log^2 n)$ time by combining Theorem 6 and Theorem 4. The step outside the while loop takes linear time, since $1$ is a valid initial lower bound on the strength. Hence, overall, the running time is $O(p\log p\log^2 n)$.
4 Estimating strengths in weighted hypergraphs
Consider a weighted hypergraph $H$ with an associated weight function $w: E \to \mathbb{Z}_{\ge 0}$; that is, we assume all weights are non-negative integers. For proving correctness, we consider an unweighted hypergraph $H^*$ that simulates $H$. Let $H^*$ contain $w(e)$ copies of edge $e$ for every edge $e$ in $H$; one can see that the strength of each of the copies of $e$ in $H^*$ is the same as the strength of $e$ in $H$. Thus, it suffices to compute strengths of edges in $H^*$, and we can apply the correctness proofs from the previous sections to $H^*$. In the remainder of the section we will only be concerned with the running time, since we do not wish to explicitly create $H^*$. We say $H$ is the implicit representation of $H^*$.
We can implement Certificate so that it finds an implicit representation of a $k$-sparse certificate of $H^*$ with at most $k(n-1)$ edges, counted with multiplicity, in $O(p)$ time, where $p$ is the size of the implicit representation; see Appendix A.
The remaining operations in Partition and WeakEdges consist only of adding edges, deleting edges and contracting edges. These operations take time depending only on the size of the implicit representation. Therefore the running time bounds in Theorem 6 and Theorem 4 still hold for weighted hypergraphs.
Theorem 8. Suppose we are given a lower bound $\bar{k}$ on the strength of every edge of $H$. If the total weight of all edges is at most $\bar{k}N$, then Estimation can be implemented in $O(p\log^2 n\log N)$ time.
Because $\bar{k}$ is a lower bound on the strength, we can set the initial estimate to $\bar{k}$ in the first step. The maximum strength in the hypergraph is at most the total weight $\bar{k}N$, so all edges will be removed by the $O(\log N)$th iteration of the while loop. Each iteration calls WeakEdges once; hence the running time is $O(p\log^2 n\log N)$.
Assume we have disjoint intervals $[a_1, b_1], \ldots, [a_t, b_t]$ such that for every $e \in E$, $k_e \in [a_j, b_j]$ for some $j$. In addition, assume $b_j < a_{j+1}$ and $b_j \le \mathrm{poly}(p)\,a_j$ for all $j$. We can essentially apply the estimation algorithm to the edges with strength inside each interval separately. Let $E_j$ be the set of edges whose strength lies in interval $[a_j, b_j]$. Indeed, let $G_j$ be the hypergraph obtained from $H$ by contracting all edges in $E_{j'}$ where $j' > j$, and deleting all edges in $E_{j'}$ where $j' < j$. For an edge $e \in E_j$, let $e'$ be its corresponding edge in $G_j$. From Lemma 2, $k_{G_j,e'} \in [a_j, b_j]$. The total weight of $G_j$ is at most $b_j(n-1)$. We can run Estimation on $G_j$ with lower bound $a_j$ to estimate the strengths of the edges in $E_j$, since the ratio between the lower bound and the total weight is $\mathrm{poly}(p)$. Let $p_j$ be the size of $G_j$, and $n_j$ be the number of vertices in $G_j$; the running time for $G_j$ is $O(p_j\log^2 n_j\log p)$ by Theorem 8. The total running time of Estimation over all $G_j$ is $O(p\log^2 n\log p)$. Constructing $G_j$ from $G_{j+1}$ takes $O(p_j + p_{j+1})$ time: contract all edges in $E_{j+1}$ and then add all the edges in $E_j$. Therefore we can construct all $G_j$ in $O(p)$ time.
It remains to find the intervals $[a_j, b_j]$. For each edge $e$, we first find a value $q_e$ such that $q_e \le k_e \le p\,q_e$. The maximal intervals of $\bigcup_{e\in E} [q_e, p\,q_e]$ are the desired intervals. We now describe the procedure to find the values $q_e$ for all $e \in E$.
The \emph{star approximation graph} $G'$ of $H$ is a weighted graph obtained by replacing each hyperedge $e$ in $H$ with a star $S_e$, where the center of the star is an arbitrary vertex in $e$, and the star spans each vertex in $e$. Every edge in $S_e$ has weight $w(e)$, the weight of $e$.
It is important that $G'$ is a multigraph: parallel edges are distinguished by which hyperedge they came from. We define a correspondence between the edges in $G'$ and the edges in $H$ by a function $\varphi$: for an edge $f$ in $G'$, $\varphi(f) = e$ if $f \in S_e$. Let $T$ be a maximum weight spanning tree in $G'$. For $e \in E$, define $T_e$ to be the minimal subtree of $T$ that contains all vertices in $e$. Note that all the leaves of $T_e$ are vertices from $e$.
For any two vertices $u$ and $v$, we define $q(u, v)$ to be the weight of the minimum weight edge on the unique $u$-$v$ path in $T$. For each edge $e$ we let $q_e = \min_{u,v\in e} q(u, v)$. We will show that $q$ satisfies the property that $q_e \le k_e \le p\,q_e$.
Let $X = \bigcup_{f \in E(T_e)} \varphi(f)$, the union of the hyperedges corresponding to the edges of $T_e$. Certainly, $k_e \ge \lambda(H[X])$, because all vertices of $e$ are contained in $X$. Moreover, $\lambda(H[X]) \ge q_e$, because every cut in $H[X]$ has to cross some $\varphi(f)$ where $f \in E(T_e)$, and the weight of $\varphi(f)$ is at least $q_e$: every edge of $T_e$ lies on the tree path between two vertices of $e$, so its weight is at least $q_e$. Hence $k_e \ge q_e$.
We claim that if we remove all edges in $G'$ with weight at most $q_e$, then it will disconnect some vertices of $e$. If the claim is true, then $e$ crosses a cut in which every crossing hyperedge has weight at most $q_e$; such a cut has value at most $m\,q_e \le p\,q_e$, and by Lemma 1, $k_e \le p\,q_e$. Assume that the claim is not true. Then, in the graph $G'$ we can remove all edges with weight at most $q_e$ and the vertices in $e$ will still be connected. We can assume without loss of generality that the maximum weight spanning tree $T$ in $G'$ is computed using the greedy Kruskal's algorithm, considering edges in decreasing weight order. This implies that $T_e$ will contain only edges with weight strictly greater than $q_e$. This contradicts the definition of $q_e$.
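The computation of the values $q_e$ can be sketched as follows (with quadratic-time bottleneck queries instead of the data structure of [3]; function names are ours):

```python
from itertools import combinations

def star_graph(edges, weights):
    # replace hyperedge e by a star centered at an arbitrary vertex of e
    star = []
    for i, (e, w) in enumerate(zip(edges, weights)):
        vs = sorted(e)
        for v in vs[1:]:
            star.append((w, vs[0], v, i))   # i remembers the hyperedge
    return star

def max_spanning_tree(vertices, star):
    # Kruskal on edges in decreasing weight order, with union-find
    parent = {v: v for v in vertices}
    def find(v):
        while parent[v] != v:
            parent[v] = parent[parent[v]]
            v = parent[v]
        return v
    tree = []
    for w, u, v, _ in sorted(star, reverse=True):
        ru, rv = find(u), find(v)
        if ru != rv:
            parent[ru] = rv
            tree.append((w, u, v))
    return tree

def bottleneck(tree, u, v):
    # min weight on the unique u-v tree path (DFS; small examples only)
    adj = {}
    for w, a, b in tree:
        adj.setdefault(a, []).append((b, w))
        adj.setdefault(b, []).append((a, w))
    def dfs(x, seen, mn):
        if x == v:
            return mn
        seen.add(x)
        for y, w in adj.get(x, []):
            if y not in seen:
                r = dfs(y, seen, min(mn, w))
                if r is not None:
                    return r
        return None
    return dfs(u, set(), float("inf"))

def q_values(vertices, edges, weights):
    tree = max_spanning_tree(vertices, star_graph(edges, weights))
    return [min(bottleneck(tree, u, v) for u, v in combinations(sorted(e), 2))
            for e in edges]
```

For each returned value the guarantee $q_e \le k_e \le p\,q_e$ is the one proved above.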
$G'$ can be constructed in $O(p)$ time. The maximum weight spanning tree can be found in $O(p\log p)$ time. We can construct a data structure on $T$ in $O(n)$ time such that, for any $u, v$, it returns $q(u, v)$ in $O(\alpha(n))$ amortized time [3]. To compute $q_e$, we fix some vertex $u^* \in e$, and compute $q_e = \min_{v\in e} q(u^*, v)$ using the data structure in $O(|e|\,\alpha(n))$ time; this equals $\min_{u,v\in e} q(u,v)$ because bottleneck weights satisfy $q(u,v) \ge \min\{q(u,u^*), q(u^*,v)\}$. Computing $q_e$ for all $e \in E$ takes $O(p\,\alpha(n))$ time. The total running time is $O(p\log p)$. We conclude the following lemma.
Lemma 7. Given a weighted hypergraph $H$, we can find a value $q_e$ for each edge $e \in E$, such that $q_e \le k_e \le p\,q_e$, in $O(p\log p)$ time.
The preceding lemma gives us the desired intervals. Using Theorem 8, we obtain the desired theorem.
Theorem 9. Given a rank $r$ weighted hypergraph $H$ with weight function $w$, one can find an $O(r)$-approximate strength function of $H$ in $O(p\log p\log^2 n)$ time.
[1] Joshua Batson, Daniel A. Spielman, and Nikhil Srivastava. Twice-Ramanujan sparsifiers. SIAM Journal on Computing, 41(6):1704–1721, 2012.
[2] András A. Benczúr and David R. Karger. Randomized approximation schemes for cuts and flows in capacitated graphs. SIAM J. Comput., 44(2):290–319, 2015. Preliminary versions appeared in STOC ’96 and SODA ’98.
[3] Bernard Chazelle. Computing on a free tree via complexity-preserving mappings. Algorithmica, 2(1):337–361, 1987.
[4] Chandra Chekuri and Chao Xu. Computing minimum cuts in hypergraphs. In Proceedings of the Twenty-Eighth Annual ACM-SIAM Symposium on Discrete Algorithms, pages 1085–1100, 2017.
[5] Wai Shing Fung, Ramesh Hariharan, Nicholas J. A. Harvey, and Debmalya Panigrahi. A general framework for graph sparsification. In Proceedings of the Forty-Third Annual ACM Symposium on Theory of Computing, STOC ’11, pages 71–80, New York, NY, USA, 2011. ACM.
[6] Regina Klimmek and Frank Wagner. A simple hypergraph min cut algorithm. Technical Report B 96-02, Bericht FU Berlin Fachbereich Mathematik und Informatik, 1996. Available at http://edocs.fu-berlin.de/docs/servlets/MCRFileNodeServlet/FUDOCS_derivate_000000000297/1996_02.pdf.
[7] Dmitry Kogan and Robert Krauthgamer. Sketching cuts in graphs and hypergraphs. In Proceedings of the 2015 Conference on Innovations in Theoretical Computer Science, ITCS ’15, pages 367–376, New York, NY, USA, 2015. ACM.
[8] E. L. Lawler. Cutsets and partitions of hypergraphs. Networks, 3(3):275–285, 1973.
[9] Yin Tat Lee and Aaron Sidford. Path finding methods for linear programming: Solving linear programs in $\tilde{O}(\sqrt{\mathrm{rank}})$ iterations and faster algorithms for maximum flow. In Foundations of Computer Science (FOCS), 2014 IEEE 55th Annual Symposium on, pages 424–433. IEEE, 2014.
[10] Yin Tat Lee and He Sun. An SDP-based algorithm for linear-sized spectral sparsification. ArXiv e-prints, February 2017.
[11] Wai-Kei Mak and D. F. Wong. A fast hypergraph min-cut algorithm for circuit partitioning. Integration, the VLSI Journal, 30(1):1–11, 2000.
[12] Hiroshi Nagamochi and Toshihide Ibaraki. A linear-time algorithm for finding a sparse $k$-connected spanning subgraph of a $k$-connected graph. Algorithmica, 7(1-6):583–596, 1992.
Appendix A: $k$-sparse certificate
We show how to find a $k$-sparse certificate for weighted and unweighted hypergraphs.
We refer to [4] for terminology. Given a hypergraph $H$, consider an MA-ordering of the vertices $v_1, \ldots, v_n$. Let $e_1, \ldots, e_m$ be the induced head ordering of the edges.
For the unweighted version, let $E_v$ be the first $k$ backward edges of $v$ in the head ordering, or all the backward edges of $v$ if there are fewer than $k$ of them. Given $H$ and $k$, $E'$ is defined such that $E' = \bigcup_{v\in V} E_v$. It is easy to see $E'$ can be constructed in linear time once the MA-ordering has been computed. That $E'$ is a $k$-sparse certificate follows directly from [4]. It is also clear that it contains at most $k(n-1)$ edges, since $|E_v| \le k$ for every $v$ and the first vertex of the ordering has no backward edges.
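The MA-ordering construction is not reproduced in code here. As a sanity-checkable (but slower, $O(kp)$-time) alternative, the union of $k$ successively computed maximal spanning hyperforests is also a $k$-sparse certificate with at most $k(n-1)$ edges; the graph argument extends, since a forest with no edge crossing a given cut would have been extended by any remaining crossing edge.

```python
def sparse_certificate(vertices, edges, k):
    """k successive maximal spanning hyperforests (alternative certificate).
    Each pass greedily keeps edges that merge distinct components."""
    kept, rest = [], list(edges)
    for _ in range(k):
        parent = {v: v for v in vertices}
        def find(v):
            while parent[v] != v:
                parent[v] = parent[parent[v]]
                v = parent[v]
            return v
        leftover = []
        for e in rest:
            roots = {find(v) for v in e}
            if len(roots) > 1:          # e merges components: keep it
                r0 = roots.pop()
                for r in roots:
                    parent[r] = r0
                kept.append(e)
            else:
                leftover.append(e)
        rest = leftover
    return kept
```

For example, with three parallel copies of $\{1,2\}$ and $k = 2$, exactly two copies are kept, matching $\min(k, |\delta(S)|)$ for the cut isolating vertex $1$.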
The algorithm is similar for the weighted version. For a vertex $v$, assume its backward edges are $f_1, \ldots, f_t$ with weights $w(f_1), \ldots, w(f_t)$, where $f_i$ comes before $f_j$ in the head order of the edges whenever $i < j$. Let $j^*$ be the smallest index such that $\sum_{i \le j^*} w(f_i) \ge k$ (take $j^* = t$ if the total weight is smaller). Define $w'(f_i) = w(f_i)$ for $i < j^*$, $w'(f_{j^*}) = \min\{w(f_{j^*}),\, k - \sum_{i < j^*} w(f_i)\}$, and $w'(f) = 0$ for all other edges. Define $E_v = \{f_1, \ldots, f_{j^*}\}$. The weighted hypergraph on $\bigcup_v E_v$ with the weight function $w'$ is a $k$-sparse certificate of $H$ in the implicit sense of Section 4. With careful bookkeeping, it can be computed in $O(p)$ time.
The running time is dominated by finding the MA-ordering, which takes $O(p)$ time for unweighted hypergraphs and $O(p + n\log n)$ time in general.