Approximating the Smallest Spanning Subgraph for 2-Edge-Connectivity in Directed Graphs
(A preliminary version of some of the results of this work was presented at ESA 2015.)
Let be a strongly connected directed graph. We consider the following three problems, where we wish to compute the smallest strongly connected spanning subgraph of that maintains respectively: the -edge-connected blocks of (2EC-B); the -edge-connected components of (2EC-C); both the -edge-connected blocks and the -edge-connected components of (2EC-B-C). All three problems are NP-hard, and thus we are interested in efficient approximation algorithms. For 2EC-C we can obtain a -approximation by combining previously known results. For 2EC-B and 2EC-B-C, we present new -approximation algorithms that run in linear time. We also propose various heuristics to improve the size of the computed subgraphs in practice, and conduct a thorough experimental study to assess their merits in practical scenarios.
Let $G = (V, E)$ be a directed graph (digraph), with $m$ edges and $n$ vertices. An edge of $G$ is a strong bridge if its removal increases the number of strongly connected components of $G$. A digraph $G$ is 2-edge-connected if it has no strong bridges. The 2-edge-connected components of $G$ are its maximal 2-edge-connected subgraphs. Let $v$ and $w$ be two distinct vertices: $v$ and $w$ are 2-edge-connected, denoted by $v \leftrightarrow_{2e} w$, if there are two edge-disjoint directed paths from $v$ to $w$ and two edge-disjoint directed paths from $w$ to $v$. (Note that a path from $v$ to $w$ and a path from $w$ to $v$ need not be edge-disjoint.) A 2-edge-connected block of $G$ is a maximal subset $B \subseteq V$ such that $u \leftrightarrow_{2e} v$ for all $u, v \in B$. Unlike in undirected graphs, in digraphs the 2-edge-connected blocks can differ from the 2-edge-connected components, i.e., two vertices may be 2-edge-connected but lie in different 2-edge-connected components. See Figure 1.
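To make the definition of a strong bridge concrete, the following is a minimal brute-force sketch, not one of the algorithms of this paper. It assumes that the input digraph is strongly connected, that vertices are numbered from 0, and that edges are given as a list of pairs; the function names are hypothetical. Under the strong connectivity assumption, an edge is a strong bridge exactly when its removal destroys strong connectivity.

```python
def reachable(edges, start, skip=None):
    """Vertices reachable from start, ignoring the edge at index skip."""
    out = {}
    for i, (u, v) in enumerate(edges):
        if i != skip:
            out.setdefault(u, []).append(v)
    seen, stack = {start}, [start]
    while stack:
        x = stack.pop()
        for y in out.get(x, []):
            if y not in seen:
                seen.add(y)
                stack.append(y)
    return seen


def is_strongly_connected(n, edges, skip=None):
    """Check strong connectivity on vertices 0..n-1 without edge index skip."""
    if n == 0:
        return True
    reverse = [(v, u) for (u, v) in edges]
    # Vertex 0 must reach every vertex and be reached by every vertex.
    return (len(reachable(edges, 0, skip)) == n
            and len(reachable(reverse, 0, skip)) == n)


def strong_bridges(n, edges):
    """Brute force: one strong-connectivity test per edge.

    Assumes the input digraph is strongly connected.
    """
    return [e for i, e in enumerate(edges)
            if not is_strongly_connected(n, edges, skip=i)]


def is_two_edge_connected(n, edges):
    """A strongly connected digraph is 2-edge-connected iff it has no strong bridges."""
    return is_strongly_connected(n, edges) and not strong_bridges(n, edges)
```

For instance, every edge of a directed triangle is a strong bridge, whereas a triangle with edges in both directions has none. This sketch performs one graph search per edge; it is meant only to illustrate the definition.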
A spanning subgraph of has the same vertices as and contains a subset of the edges of . Computing a smallest spanning subgraph (i.e., one with minimum number of edges) that maintains the same edge or vertex connectivity properties of the original graph is a fundamental problem in network design, with many practical applications . In this paper we consider the problem of finding the smallest spanning subgraph of that maintains certain -edge-connectivity requirements in addition to strong connectivity. Specifically, we distinguish three problems that we refer to as 2EC-B, 2EC-C and 2EC-B-C. In particular, we wish to compute the smallest strongly connected spanning subgraph of a digraph that maintains the following properties:
the pairwise -edge-connectivity of , i.e., the -edge-connected blocks of (2EC-B);
the -edge-connected components of (2EC-C);
both the -edge-connected blocks and the -edge-connected components of (2EC-B-C).
Since all those problems are NP-hard , we are interested in designing efficient approximation algorithms.
While for 2EC-C one can obtain a -approximation using known results, for the other two problems no efficient approximation algorithms were previously known. Here we present a linear-time algorithm for 2EC-B that achieves an approximation ratio of 4. We then extend this algorithm so that it also approximates the smallest 2EC-B-C within a factor of 4. This algorithm runs in linear time if the 2-edge-connected components of are known; otherwise it requires the computation of these components, which can be done in time . Moreover, we give efficient implementations of our algorithms that run very fast in practice. We then consider various heuristics that reduce the size of the computed subgraph in practice. Some of these heuristics require time in the worst case, so we also consider various techniques that achieve significant speedups.
1.1 Related work
Finding a smallest $k$-edge-connected (resp. $k$-vertex-connected) spanning subgraph of a given $k$-edge-connected (resp. $k$-vertex-connected) graph is NP-hard for $k \ge 2$ for undirected graphs, and for $k \ge 1$ for digraphs . More precisely, if the input graph consists of a single 2-edge-connected block then the problem asks for the smallest 2-edge-connected subgraph, whereas if the input graph consists of singleton 2-edge-connected blocks then the problem coincides with the smallest strongly connected spanning subgraph. Problems of this type, together with more general variants of approximating minimum-cost subgraphs that satisfy certain connectivity requirements, have received a lot of attention, and several important results have been obtained; see, e.g., the survey .
Currently, the best approximation ratio for computing the smallest strongly connected spanning subgraph (SCSS) is , achieved by Vetta with a polynomial-time algorithm . Although Vetta did not analyze the running time of his algorithm exactly, the algorithm needs to solve a maximum matching problem for a relaxation of the problem. A faster linear-time algorithm that achieves a -approximation was given by Zhao et al. . For the smallest $k$-edge-connected spanning subgraph (kECSS), Laekhanukit et al.  gave a randomized -approximation algorithm. Regarding hardness of approximation, Gabow et al.  showed that there exists an absolute constant such that for any integer , approximating the smallest kECSS on directed multigraphs to within a factor in polynomial time implies . Jaberi  considered various optimization problems related to 2EC-B and proposed corresponding approximation algorithms. The approximation ratio in Jaberi’s algorithms, however, is linear in the number of strong bridges, and hence in the worst case.
1.2 Our results
In this paper we provide both theoretical and experimental contributions to the 2EC-B, 2EC-C and 2EC-B-C problems. A -approximation for 2EC-C can be obtained by carefully combining the 2ECSS randomized algorithm of Laekhanukit et al.  and the SCSS algorithm of Vetta . A faster and deterministic 2-approximation algorithm for 2EC-C can be obtained by combining techniques based on edge-disjoint spanning trees [4, 19] with the SCSS algorithm of Zhao et al. . We remark that the other two problems considered here, 2EC-B and 2EC-B-C, seem harder to approximate. The only known result is the sparse certificate for 2-edge-connected blocks of . In this context, a sparse certificate of a strongly connected digraph is a spanning subgraph of with edges. Such a sparse spanning subgraph implies a linear-time -approximation algorithm for 2EC-B. Unfortunately, no good bound for the approximation constant was previously known, and indeed achieving a small constant seemed to be non-trivial. In this paper, we make substantial progress in this direction by presenting new 4-approximation algorithms for 2EC-B and 2EC-B-C that run in linear time (the algorithm for 2EC-B-C runs in linear time once the 2-edge-connected components of are available; if not, they can be computed in time ).
From the practical viewpoint, we provide efficient implementations of our algorithms that are very fast in practice. We further propose and implement several heuristics that improve the size (i.e., the number of edges) of the computed spanning subgraphs in practice. Some of our algorithms require time in the worst case, so we also present several techniques to achieve significant speedups in their running times. With all these implementations, we conduct a thorough experimental study and report its main findings. We believe that this is crucial to assess the merits of all the algorithms considered in practical scenarios.
The remainder of this paper is organized as follows. We introduce some preliminary definitions and graph-theoretical terminology in Section 2. Then, in Section 3 we describe our basic approaches and provide a -approximation algorithm for 2EC-C and -approximation algorithms for 2EC-B and 2EC-B-C. Our empirical study is presented in Section 4. Finally, in Section 5 we discuss some open problems and directions for future work.
In this section, we introduce some basic terminology that will be useful throughout the paper.
Flow graphs, dominators, and independent spanning trees.
A flow graph is a digraph such that every vertex is reachable from a distinguished start vertex. Let be a strongly connected digraph. For any vertex , we denote by the corresponding flow graph with start vertex ; all vertices in are reachable from since is strongly connected. The dominator relation in is defined as follows: A vertex is a dominator of a vertex ( dominates ) if every path from to contains ; is a proper dominator of if dominates and . The dominator relation is reflexive and transitive. Its transitive reduction is a rooted tree, the dominator tree : dominates if and only if is an ancestor of in . If , , the parent of in , is the immediate dominator of : it is the unique proper dominator of that is dominated by all proper dominators of . The dominator tree of a flow graph can be computed in linear time, see, e.g., [1, 2]. An edge is a bridge in if all paths from to include . (Throughout, we consistently use the term bridge to refer to a bridge of a flow graph and the term strong bridge to refer to a strong bridge in the original digraph .) Italiano et al.  showed that the strong bridges of can be computed from the bridges of the flow graphs and , where is an arbitrary start vertex and is the digraph that results from after reversing edge directions.
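The dominator and bridge definitions above can be illustrated with a small sketch. It uses the classical iterative set-intersection computation of dominators and a brute-force bridge test, not the linear-time algorithms cited in the text. It assumes that every vertex is reachable from the start vertex `s`; all function names are hypothetical.

```python
def reachable_from(s, edges, skip=None):
    """Vertices reachable from s when the edge at index skip is ignored."""
    out = {}
    for i, (u, v) in enumerate(edges):
        if i != skip:
            out.setdefault(u, []).append(v)
    seen, stack = {s}, [s]
    while stack:
        x = stack.pop()
        for y in out.get(x, []):
            if y not in seen:
                seen.add(y)
                stack.append(y)
    return seen


def dominators(vertices, edges, s):
    """Map each vertex v to the set of its dominators (including v itself).

    Iterates dom(v) = {v} union (intersection of dom(p) over predecessors p)
    until nothing changes. Assumes every vertex is reachable from s.
    """
    preds = {v: [] for v in vertices}
    for u, v in edges:
        preds[v].append(u)
    dom = {v: set(vertices) for v in vertices}
    dom[s] = {s}
    changed = True
    while changed:
        changed = False
        for v in vertices:
            if v == s:
                continue
            new = set(vertices)
            for p in preds[v]:
                new &= dom[p]
            new.add(v)
            if new != dom[v]:
                dom[v] = new
                changed = True
    return dom


def immediate_dominators(dom, s):
    """The proper dominators of v form a chain; the deepest one is idom(v)."""
    return {v: max(ds - {v}, key=lambda d: len(dom[d]))
            for v, ds in dom.items() if v != s}


def flow_graph_bridges(edges, s):
    """Edge (u, v) is a bridge if v becomes unreachable from s without it."""
    return [e for i, e in enumerate(edges)
            if e[1] not in reachable_from(s, edges, skip=i)]
```

The parent pointers returned by `immediate_dominators` describe the dominator tree. Both routines are far from linear time and serve only to make the definitions executable on small inputs.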
A spanning tree of a flow graph is a tree with root that contains a path from to for all vertices . Two spanning trees and rooted at are edge-disjoint if they have no edge in common. A flow graph has two such spanning trees if and only if it has no bridges . The two spanning trees are maximally edge-disjoint if the only edges they have in common are the bridges of . Two (maximally) edge-disjoint spanning trees can be computed in linear time by an algorithm of Tarjan , using the disjoint set union data structure of Gabow and Tarjan . Two spanning trees and rooted at are independent if for all vertices , the paths from to in and share only the dominators of . Every flow graph has two such spanning trees, which are computable in linear time [10, 11] and are also maximally edge-disjoint.
The condensed graph is the digraph obtained from by contracting each -edge-connected component of into a single supervertex. Note that is a multigraph since the contractions can create loops and parallel edges; see Figure 2. For any vertex of , we denote by the supervertex of that contains . Every edge of is associated with the corresponding original edge of . Given a condensed graph , we can obtain the expanded graph by reversing the contractions; each supervertex is replaced by the subgraph induced by the original vertices with , and each edge of is replaced with the corresponding original edge .
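The contraction and expansion steps can be sketched as follows. This is a minimal illustration under the assumption that the component label of every vertex is already known and is passed in as a dictionary `comp`; the function names and the data layout are hypothetical. Each condensed edge carries its associated original edge, so expansion only needs to read it back.

```python
from collections import Counter


def condense(edges, comp):
    """Contract each component into a single supervertex.

    edges: list of (u, v) pairs of the original digraph.
    comp:  dict mapping every vertex to the label of its component
           (assumed to be computed elsewhere).
    Returns (comp[u], comp[v], (u, v)) triples, one per original edge,
    so loops and parallel edges of the multigraph are kept.
    """
    return [(comp[u], comp[v], (u, v)) for (u, v) in edges]


def expand(condensed_edges):
    """Undo the contraction by returning the associated original edges."""
    return [orig for (_a, _b, orig) in condensed_edges]


def loops_and_multiplicities(condensed_edges):
    """Separate the loops and count parallel edges between supervertices."""
    loops = [e for e in condensed_edges if e[0] == e[1]]
    mult = Counter((a, b) for (a, b, _orig) in condensed_edges if a != b)
    return loops, mult
```

Edges whose endpoints lie in the same component become loops, and several edges between the same two components become parallel edges, which is why the condensed graph is a multigraph.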
3 Approximation algorithms and heuristics
We start by describing our main approaches for solving problem 2EC-B. Let be the input directed graph. The first two algorithms process one edge of the current subgraph of at a time, and test if it is safe to remove . Initially , and the order in which the edges are processed is arbitrary. The third algorithm starts with the empty graph , and adds the edges of spanning trees of certain subgraphs of until the resulting digraph is strongly connected and has the same -edge-connected blocks as .
- Two Edge-Disjoint Paths Test.
We test if contains two edge-disjoint paths from to . If this is the case, then we remove edge . This test takes time per edge, so the total running time is . We refer to this algorithm as Test2EDP-B. Note that Test2EDP-B computes a minimal -approximate solution for the 2ECSS problem , which is not necessarily minimal for the 2EC-B problem.
- -Edge-Connected Blocks Test.
If is not a strong bridge in , we test if has the same -edge-connected blocks as . If this is the case then we remove edge . We refer to this algorithm as Test2ECB-B. Since the -edge-connected blocks of a graph can be computed in linear time , Test2ECB-B runs in time. Test2ECB-B computes a minimal solution for 2EC-B and achieves an approximation ratio of (see Section 3.3.1).
- Independent Spanning Trees.
We can compute a sparse certificate for -edge-connected blocks as in , based on a linear-time construction of two independent spanning trees of a flow graph [10, 11]. We refer to this algorithm as IST-B original. We will show later that a suitably modified construction, which we refer to as IST-B, yields a linear-time -approximation algorithm.
The first two approaches Test2EDP-B and Test2ECB-B can be combined into a hybrid algorithm (Hybrid-B), as follows:
if the tested edge connects vertices in the same -edge-connected block (i.e., ), then apply Test2EDP-B; otherwise, apply Test2ECB-B.
One can show that Hybrid-B returns the same sparse subgraph as Test2ECB-B.
Let be an edge of . Algorithm Test2EDP-B deletes only if Test2ECB-B does as well. Moreover, if and belong to the same -edge-connected block of , then algorithms Test2EDP-B and Test2ECB-B are equivalent for , i.e., edge is deleted by Test2ECB-B if and only if it is deleted by Test2EDP-B.
To prove the first part of the lemma, suppose that is deleted by Test2EDP-B. We show that the -edge-connected blocks of are not affected by this deletion. Consider any pair of -edge-connected vertices and that was affected by the deletion of , that is, the number of edge-disjoint paths from to was reduced. Let be a minimum - cut in , i.e., , , and . Then we also have and . Since has at least two edge-disjoint paths from to , Menger’s theorem implies that there are at least two edges directed from to . Thus, Menger’s theorem implies has at least two edge-disjoint paths from to .
We now prove the second part of the lemma. Suppose that and lie in the same -edge-connected block of , and edge is deleted by algorithm Test2ECB-B. This implies that has two edge-disjoint paths from to , so algorithm Test2EDP-B would also delete . ∎
3.1 Providing a sparse certificate as input
As we mentioned above, algorithm IST-B computes in linear time a sparse certificate for the 2-edge-connected blocks of an input digraph , i.e., a spanning subgraph of with edges that has the same 2-edge-connected blocks as . In order to speed up our slower heuristics, Test2EDP-B, Test2ECB-B and Hybrid-B, we can apply them to the sparse certificate instead of the original digraph, thus reducing their running time from to . Moreover, given that IST-B achieves a 4-approximation (Theorem 3.5), it follows that Test2EDP-B, Test2ECB-B and Hybrid-B produce a 4-approximation for 2EC-B in time. Therefore, we applied this idea in all our implementations. See Table 1 in Section 4. We also note that for the tested inputs, the quality of the computed solutions was not affected significantly by the fact that we applied the heuristics to the sparse certificate computed by IST-B instead of the original digraph. Indeed, on average, the number of edges in the computed subgraph was reduced by for Test2EDP-B and increased by less than for Hybrid-B. The speedup gained, on the other hand, was by a factor slightly less than for Test2EDP-B and by a factor slightly larger than for Hybrid-B.
3.2 Maintaining the -edge-connected components
Although none of the above algorithms maintains the 2-edge-connected components of the original graph, we can still apply them to get an approximation for 2EC-B-C, as follows. First, we compute the 2-edge-connected components of and solve the 2ECSS problem independently for each such component. Then, we can apply any of the algorithms for 2EC-B (Test2EDP-B, Test2ECB-B, Hybrid-B or IST-B) to the edges that connect different components. To speed them up, we apply them to the condensed graph of . Let be the subgraph of computed by any of the above heuristics, and let be the expanded graph of , where we replace each supervertex of with the corresponding 2-edge-connected sparse subgraph computed before. We refer to the corresponding algorithms obtained this way as Test2EDP-BC, Test2ECB-BC, Hybrid-BC and IST-BC. The next lemma shows that is indeed a valid solution to the 2EC-B-C problem.
Digraph is strongly connected and has the same -edge-connected components and blocks as .
Digraph is strongly connected because the algorithms do not remove strong bridges. It is also clear that and have the same -edge-connected components. So it remains to consider the -edge-connected blocks. Let and be two arbitrary vertices of . We show that and are -edge-connected in if and only if they are -edge-connected in . The “only if” direction follows from the fact that is a subgraph of . We now prove the “if” direction. Suppose and are -edge-connected in . If and are located in the same -edge-connected component then obviously they are -edge-connected in . Suppose now that and are located in different components, so . By construction, for any cut in such that and there are at least two edges directed from to and at least two edges directed from to . This property is maintained by all algorithms, so it also holds in . Then, for any cut in the expanded graph such that and there are at least two edges directed from to and at least two edges directed from to . So and are -edge-connected in by Menger’s theorem. ∎
As a special case of applying Test2EDP-B to , we can immediately remove loops and parallel edges if has more than two edges directed from to . To obtain faster implementations, we solve the 2ECSS problems in linear-time using edge-disjoint spanning trees [4, 19]. Let be a -edge-connected component of . We select an arbitrary vertex as a root and compute two edge-disjoint spanning trees in the flow graph and two edge-disjoint spanning trees in the reverse flow graph . The edges of these spanning trees give a -approximate solution for 2ECSS on . Moreover, as in 2EC-B, we can apply algorithms Test2EDP-BC, Test2ECB-BC and Hybrid-BC on the sparse subgraph computed by IST-BC. Then, these algorithms produce a -approximation for 2EC-B-C in time. Furthermore, for these -time algorithms, we can improve the approximate solution for 2ECSS on each -edge-connected component of , by applying the two edge-disjoint paths test on the edges of . We incorporate all these ideas in all our implementations.
We can also use the condensed graph in order to obtain an efficient approximation algorithm for 2EC-C. To that end, we can apply the algorithm of Laekhanukit et al.  and get a -approximation of the 2ECSS problem independently for each 2-edge-connected component of . Then, since we only need to preserve the strong connectivity of , we can run the algorithm of Vetta  on a digraph that results from after removing all loops and parallel edges. This computes a spanning subgraph of that is a -approximation for SCSS in . The corresponding expanded graph , where we substitute each supervertex of with the approximate smallest 2ECSS, gives a -approximation for 2EC-C. A faster and deterministic 2-approximation algorithm for 2EC-C can be obtained as follows. For the 2ECSS problems we use the edge-disjoint spanning trees -approximation algorithm described above. Then, we solve SCSS on by applying the linear-time algorithm of Zhao et al. . This yields a 2-approximation algorithm for 2EC-C that runs in linear time once the 2-edge-connected components of are available (if not, they can be computed in time ). We refer to this algorithm as ZNI-C.
There is a polynomial-time algorithm for 2EC-C that achieves an approximation ratio of . Moreover, if the -edge-connected components of are available, then we can compute a -approximate 2EC-C in linear time.
3.3 Independent Spanning Trees
Here we present our new algorithm IST-B and prove that it gives a linear-time -approximation for 2EC-B and 2EC-B-C. Since IST-B is a modified version of the sparse certificate for the -edge-connected blocks of a digraph  (IST-B original), let us review IST-B original first.
Let be an arbitrarily chosen start vertex of the strongly connected digraph . The canonical decomposition of the dominator tree is the forest of rooted trees that results from after the deletion of all the bridges of . Let denote the tree containing vertex in this decomposition. We refer to the subtree roots in the canonical decomposition as marked vertices. For each marked vertex we define the auxiliary graph of as follows.
The vertex set of consists of all the vertices in , referred to as ordinary vertices, and a set of auxiliary vertices, which are obtained by contracting vertices in , as follows.
Let be a vertex in . We say that is a boundary vertex in if has a marked child in . Let be a marked child of a boundary vertex : all the vertices that are descendants of in are contracted into .
All vertices in that are not descendants of are contracted into ( if any such vertex exists).
During those contractions, parallel edges are eliminated. We call an edge in a shortcut edge. Such an edge has an auxiliary vertex as an endpoint. We associate each shortcut edge with a corresponding original edge , i.e., was contracted into or was contracted into (or both). If has bridges then all the auxiliary graphs have at most vertices and edges in total and can be computed in time. As shown in , two ordinary vertices of an auxiliary graph are 2-edge-connected in if and only if they are 2-edge-connected in . Thus the 2-edge-connected blocks of are a refinement of the vertex sets in the trees of the canonical decomposition. The sparse certificate of  is constructed in three phases. We maintain a list (multiset) of the edges to be added in ; initially . The same edge may be inserted into multiple times, but the total number of insertions will be . So the edges of can be obtained from after we remove duplicates, e.g., by using radix sort. Also, during the construction, the algorithm may choose a shortcut edge or a reverse edge to be inserted into . In this case we insert the associated original edge instead.
- Phase 1.
We insert into the edges of two independent spanning trees, and of .
- Phase 2.
For each auxiliary graph of , that we refer to as the first-level auxiliary graphs, we compute two independent spanning trees and for the corresponding reverse flow graph with start vertex . We insert into the edges of these two spanning trees. We note that induces a strongly connected spanning subgraph of at the end of this phase.
- Phase 3.
Finally, in the third phase we process the second-level auxiliary graphs, which are the auxiliary graphs of for all first-level auxiliary graphs . Let be a bridge of , and let be the corresponding second-level auxiliary graph. For every strongly connected component of , we choose an arbitrary vertex and compute a spanning tree of and a spanning tree of , and insert their edges into ; see Figure 4.
The above construction inserts edges into , and therefore achieves a constant approximation ratio for 2EC-B. It is not straightforward, however, to give a good bound for this constant, since the spanning trees that are used in this construction contain auxiliary vertices that are created by applying two levels of the canonical decomposition. In the next section we analyze a modified version of the sparse certificate construction, and show that it achieves a -approximation for 2EC-B. Then we show that we also achieve a -approximation for 2EC-B-C by applying this sparse certificate on the condensed graph .
3.3.1 The new algorithm IST-B
The main idea behind IST-B is to limit the number of edges added to the sparse certificate because of auxiliary vertices. In particular, we show that in Phase 2 of the construction it suffices to add at most one new edge for each first-level auxiliary vertex, while in Phase 3 at most additional edges are necessary for all second-level auxiliary vertices, where is the number of bridges in .
We will use the following lemma about the strong bridges in auxiliary graphs, which implies that for any second-level auxiliary vertex that was not an auxiliary vertex in the first level, subgraph contains the unique edge leaving in .
Lemma 3.4. Let be a strong bridge of a first-level auxiliary graph that is not a bridge in . Then is a bridge in the flow graph .
Consider the dominator tree of the flow graph . Let be the tree that results from after the deletion of the auxiliary vertices. Then we have . Moreover, for each auxiliary vertex , is the unique edge entering in , which is a bridge in . Also, is the unique edge leaving in which too is a bridge in . By  we have that a strong bridge of must appear as a bridge of or as the reverse of a bridge in , so the lemma follows. ∎
First we will describe our modified construction and apply a charging scheme for the edges added to that are adjacent to auxiliary vertices. Then, we use this scheme to prove that the modified algorithm achieves the desired -approximation. Phase 1 remains the same and we explain the necessary modifications for Phases 2 and 3.
- Modified Phase 2.
Let be a first-level auxiliary graph. In the sparse certificate we include two independent spanning trees, and , of the reverse flow graph with start vertex . In our new construction, each auxiliary vertex in will contribute at most one new edge in . Suppose first that , which exists if . The only edge entering in is which is the reverse edge of the bridge of . So does not add a new edge in , since all the bridges of were added in the first phase of the construction. Next we consider an auxiliary vertex . In there is a unique edge leaving , where . This edge is the reverse of the bridge of . Suppose that has no children in and . Deleting and its two entering edges in both spanning trees does not affect the existence of two edge-disjoint paths from to in , for any ordinary vertex . However, the resulting graph at the end may not be strongly connected. To fix this, it suffices to include in the reverse of an edge entering from only one spanning tree. Finally, suppose that has children, say in . Then is the unique child of in , and the reverse of the edge of is already included in by Phase 1. Therefore, in all cases, we can charge to at most one new edge.
- Modified Phase 3.
Let be a second-level auxiliary graph of . Let be the strong bridge entering in , and let be a strongly connected component in . In our sparse certificate we include the edges of a strongly connected subgraph of , so we have spanning trees and of and , respectively, rooted at an arbitrary ordinary vertex . Let be an auxiliary vertex of . We distinguish two cases:
If is a first-level auxiliary vertex in then it has a unique entering edge which is a bridge in already included in .
If is ordinary in but a second-level auxiliary vertex in then it has a unique leaving edge , which, by Lemma 3.4, is a bridge in and already contains a corresponding original edge.
Consider the first case. If is a leaf in then we can delete the edge entering in . Otherwise, is the unique child of in , and the corresponding edge entering in has already been inserted in . The symmetric arguments hold if is ordinary in .
This analysis implies that we can associate each second-level auxiliary vertex with one edge in each of and that is either not needed in or has already been inserted. If all such auxiliary vertices are associated with distinct edges then they do not contribute any new edges in . Suppose now that there are two second-level auxiliary vertices and that are associated with a common edge . This can happen only if one of these vertices, say , is a first-level auxiliary vertex, and is ordinary in . Then has a unique entering edge in , which means that is a strong bridge, and thus already in . Also and . In this case, we can treat and as a single auxiliary vertex that results from the contraction of , which contributes at most two new edges in . Since is a first-level auxiliary vertex, this can happen at most times in all second-level auxiliary graphs, so a bound of such edges follows.
Using the above construction we can now prove that our modified version of the sparse certificate achieves an approximation ratio of .
Theorem 3.5. There is a linear-time approximation algorithm for the 2EC-B problem that achieves an approximation ratio of 4. Moreover, if the 2-edge-connected components of the input digraph are known in advance, we can compute a 4-approximation for the 2EC-B-C problem in linear time.
Let denote (as above) the number of bridges in the flow graph . Note that . We consider the three phases of the construction of separately and account for the new edges that are added in each phase. Consider the two independent spanning trees and of that are computed in the first phase. If an edge is a bridge in then it is the unique edge entering in . Thus these two independent spanning trees add into exactly edges.
Now we consider the Modified Phase 2. Let be a first-level auxiliary graph. Let and be, respectively, the number of ordinary and auxiliary vertices in . In the sparse certificate we include two independent spanning trees, and , of the reverse flow graph with start vertex . As already explained in the analysis of this phase, each auxiliary vertex in may contribute at most one new edge in . Since and do not contribute any new edges, the total number of edges added for is at most . Hence, the total number of edges added during the second phase is at most , where the sum is taken over all marked vertices . Observe that and , so we have . We note that, as in the original construction, is strongly connected at the end of this phase. Moreover, in this phase we include in the strong bridges of that are not bridges in .
It remains to account for the edges added during the third phase. Here we consider the strongly connected components for each auxiliary graph of after removing the strong bridge entering in . By the argument in the description of the Modified Phase 3, the second-level auxiliary vertices contribute at most new edges in total.
We note that the -edge-connected blocks of are formed by the ordinary vertices in each strongly connected component computed for the second-level auxiliary graphs. Consider such a strongly connected component . Let be the number of ordinary vertices in . If then we do not include any edges for . So suppose that . Excluding the at most additional edges, the auxiliary vertices in do not contribute any new edges. So the number of edges added by is bounded by . Then, the third phase adds edges in total, where and the sum is taken over all strongly connected components with .
Overall, the number of edges added in is at most . Next, we observe that these vertices must have indegree and outdegree at least equal to in any solution to the 2EC-B problem. The remaining vertices must have indegree and outdegree at least equal to one, since the spanning subgraph must be strongly connected. Therefore, the smallest 2EC-B has at least edges. The approximation ratio of follows.
Now consider the 2EC-B-C problem, where we apply our new sparse certificate on the condensed graph . Let be the number of edges computed by our algorithm for all -edge-connected components, where we apply the edge-disjoint spanning trees construction. Let be the total number of edges in an optimal solution. Then . Suppose that the condensed graph has vertices. By the previous analysis, we have that the sparse certificate of has less than edges, where is the number of vertices in nontrivial blocks in the condensed graph. So, our algorithm computes a sparse certificate for with less than edges. The smallest 2EC-B-C has at least , so the approximation ratio of follows. ∎
Next we note that the above proof implies that the Test2ECB algorithms also achieve a -approximation even when they are run on the original digraphs instead of the sparse certificates.
Algorithm Test2ECB-B (resp., Test2ECB-BC) applied on the original input (resp., condensed) graph gives a -approximate solution for 2EC-B (resp., 2EC-B-C).
We consider first algorithm Test2ECB-B for the 2EC-B problem. Let be a strongly connected digraph with vertices, and let be the number of vertices in nontrivial blocks (i.e., -edge-connected blocks of size at least ). Let be the spanning subgraph of produced by running Test2ECB-B on . It suffices to argue that contains less than edges. Suppose that we run IST-B on . Let be the resulting subgraph of . Then, is also a solution to 2EC-B for , and by the proof of Theorem 3.5 it has at most edges. But since is a minimal solution to 2EC-B for , we must have .
For the 2EC-B-C problem, assume that the edge-disjoint spanning trees construction produces edges. Then , where is the number of edges in an optimal solution. Let be the condensed graph of , and let be the number of its vertices. Let be the spanning subgraph of produced by running Test2ECB-BC on . By the proof of Theorem 3.5 and the same argument as for the 2EC-B problem, we have that contains at most edges, where is the total number of vertices in nontrivial blocks of . So the corresponding expanded graph has at most edges. Since the smallest 2EC-B-C solution has at least edges, the 4-approximation follows. ∎
3.4 Implementation details
Here we provide some implementation details for our algorithms. In order to obtain a more efficient implementation of the IST algorithms that achieves a better quality ratio in practice, we try to reuse as many edges as possible when we build the spanning trees in the three phases of the algorithm. In the third phase of the construction we need to solve the smallest SCSS problem for each subgraph induced by a strongly connected component in the second-level auxiliary graphs after the deletion of a strong bridge. To that end, we apply a modified version of the linear-time 5/3-approximation algorithm of Zhao et al. This algorithm computes a SCSS of a strongly connected digraph by performing a depth-first search (DFS) traversal. During the DFS traversal, any cycle that is detected is contracted into a single vertex. We modify this approach so that we can avoid inserting new edges into the sparse certificate as follows. Since we only care about the ordinary vertices in , we can construct a subgraph of that contains edges already added in . We compute the strongly connected components of this subgraph and contract them. Then we apply the algorithm of Zhao et al. on the contracted graph of . Furthermore, during the DFS traversal we give priority to edges already added in .
We can apply a similar idea in the second phase of the construction as well. The algorithm of  for computing two independent spanning trees of a flow graph uses the edges of a DFS spanning tree, together with at most other edges. Hence, we can modify the DFS traversal so that we give priority to edges already added in .
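The edge-priority idea used in both phases can be illustrated with a minimal sketch. It is not the authors' C++ implementation: the function name `dfs_tree_preferring`, the adjacency-dictionary representation and the `preferred` edge set (standing for the edges already added to the sparse certificate) are assumptions made for this example. The sketch only shows how a DFS can be biased so that its tree reuses existing certificate edges whenever possible.

```python
def dfs_tree_preferring(adj, root, preferred):
    """Return the tree edges of a DFS from `root` that explores edges in
    `preferred` (hypothetical: edges already in the certificate) first."""
    def ordered(v):
        # Stable sort: edges in `preferred` get key False and come first.
        return iter(sorted(adj.get(v, []), key=lambda w: (v, w) not in preferred))

    visited = {root}
    tree = []
    stack = [(root, ordered(root))]  # explicit stack avoids recursion limits
    while stack:
        v, neighbours = stack[-1]
        for w in neighbours:
            if w not in visited:
                visited.add(w)
                tree.append((v, w))
                stack.append((w, ordered(w)))
                break
        else:
            stack.pop()  # all out-edges of v have been examined
    return tree


# Toy digraph: s -> a, s -> b, a -> b, b -> a.
adj = {"s": ["a", "b"], "a": ["b"], "b": ["a"]}
already_added = {("s", "b"), ("b", "a")}

biased_tree = dfs_tree_preferring(adj, "s", already_added)
plain_tree = dfs_tree_preferring(adj, "s", set())
new_edges = [e for e in biased_tree if e not in already_added]
```

On this toy input the biased traversal builds its tree entirely from edges in `already_added`, so `new_edges` is empty, whereas the unbiased traversal picks two edges outside that set.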
3.5 Heuristics applied on auxiliary graphs
To speed up algorithms from the Test2EDP and Hybrid families, we applied them to the first-level and second-level auxiliary graphs. Since auxiliary graphs are supposed to be smaller than the original graph, one could expect to obtain some performance gain at the price of a slightly worse approximation. However, this performance gain cannot be taken completely for granted, as auxiliary vertices and shortcut edges may be repeated in several auxiliary graphs. Our experiments indicated that applying this heuristic to second-level auxiliary graphs yields better results than the ones obtained on first-level auxiliary graphs. We refer to those variants as
Test2EDP-B-Aux and Hybrid-B-Aux,
Test2EDP-BC-Aux and Hybrid-BC-Aux,
depending on the algorithm (Test2EDP or Hybrid) and problem (2EC-B or 2EC-B-C) considered.
3.6 Trivial edges
For the algorithms of the Test2EDP and Hybrid families we use an additional speed-up heuristic in order to avoid testing edges that trivially belong to the computed solution. We say that an edge is a trivial edge of the current graph if it satisfies one of the following conditions:
its tail belongs to a 2-edge-connected block of size at least two (nontrivial block) and has outdegree at most two, or its head belongs to a 2-edge-connected block of size at least two (nontrivial block) and has indegree at most two;
its tail belongs to a 2-edge-connected block of size one (trivial block) and has outdegree one, or its head belongs to a 2-edge-connected block of size one (trivial block) and has indegree one.
Clearly, the removal of a trivial edge will result in a digraph that either has different 2-edge-connected blocks or is not strongly connected. Therefore these edges should remain in . As we show later in our experiments, such a simple test can yield significant performance gains.
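The trivial-edge test can be sketched as follows. This is an illustration under stated assumptions rather than the code used in the experiments: the function name `trivial_edges`, the edge-list input and the `block_of` map (vertex to block identifier) are hypothetical, the digraph is assumed to have no parallel edges, and the outdegree condition is read as applying to the tail of the edge and the indegree condition to its head.

```python
from collections import Counter


def trivial_edges(edges, block_of):
    """Return the edges that satisfy one of the two trivial-edge conditions.

    edges    -- list of (tail, head) pairs of the current digraph
    block_of -- hypothetical map: vertex -> id of its 2-edge-connected block
    """
    outdeg = Counter(x for x, _ in edges)
    indeg = Counter(y for _, y in edges)
    block_size = Counter(block_of.values())

    def degree_is_tight(v, degree):
        # Nontrivial block: degree at most two; trivial block: degree one.
        if block_size[block_of[v]] >= 2:
            return degree <= 2
        return degree == 1

    return [
        (x, y)
        for x, y in edges
        if degree_is_tight(x, outdeg[x]) or degree_is_tight(y, indeg[y])
    ]


# Complete digraph on {a, b, c} (one nontrivial block) plus a vertex d
# reachable only through c -> d and leaving only through d -> a.
edges = [("a", "b"), ("a", "c"), ("b", "a"), ("b", "c"),
         ("c", "a"), ("c", "b"), ("c", "d"), ("d", "a")]
block_of = {"a": 0, "b": 0, "c": 0, "d": 1}

forced = trivial_edges(edges, block_of)
candidates = [e for e in edges if e not in forced]
```

In this example only the edge from `c` to `a` is left in `candidates`, so a Test2EDP-style or Hybrid-style algorithm would need to test one edge instead of eight.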
4 Experimental analysis
We implemented the algorithms previously described: for 2EC-B, for 2EC-B-C, and one for 2EC-C, as summarized in Table 1. All implementations were written in C++ and compiled with g++ v.4.4.7 with flag -O3. We performed our experiments on a GNU/Linux machine, with Red Hat Enterprise Server v6.6: a PowerEdge T420 server 64-bit NUMA with two Intel Xeon E5-2430 v2 processors and 16GB of RAM RDIMM memory. Each processor has 6 cores sharing a 15MB L3 cache, and each core has a 2MB private L2 cache and 2.50GHz speed. In our experiments we did not use any parallelization, and each algorithm ran on a single core. We report CPU times measured with the getrusage function. All our running times were averaged over ten different runs.
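The timing methodology can be sketched as below. The original implementations are in C++; this sketch uses Python's `resource` module, which wraps the same `getrusage` call on Unix-like systems. The helper names `cpu_seconds` and `average_cpu_time` are hypothetical, and summing user and system time is an assumption, since the text does not say which fields of the `getrusage` result were reported.

```python
import resource


def cpu_seconds():
    """CPU time consumed so far by this process (user + system), in seconds."""
    usage = resource.getrusage(resource.RUSAGE_SELF)
    return usage.ru_utime + usage.ru_stime  # assumption: both components count


def average_cpu_time(run, runs=10):
    """Average CPU time of calling `run()` over `runs` repetitions."""
    total = 0.0
    for _ in range(runs):
        start = cpu_seconds()
        run()
        total += cpu_seconds() - start
    return total / runs


# Placeholder workload standing in for one algorithm execution.
elapsed = average_cpu_time(lambda: sum(range(100_000)), runs=10)
```

Measuring CPU time rather than wall-clock time keeps the averages less sensitive to other processes running on the same machine.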
For the experimental evaluation we use the datasets shown in Table 2. We measure the quality of the solution computed by algorithm on problem by a quality ratio defined as , where is the average vertex indegree of the spanning subgraph computed by and is a lower bound on the average vertex indegree of the optimal solution for . Specifically, for 2EC-B and 2EC-B-C we define , where is the total number of vertices of the input digraph and is the number of vertices that belong to nontrivial 2-edge-connected blocks. (This follows from the fact that in the sparse subgraph the vertices in nontrivial blocks must have indegree at least two, while the remaining vertices must have indegree at least one, since we seek a strongly connected spanning subgraph.) We set a similar lower bound for 2EC-C, with the only difference that is the number of vertices that belong to nontrivial 2-edge-connected components. Note that the quality ratio is an upper bound on the actual approximation ratio for the specific input. The smaller the values of (i.e., the closer to 1), the better is the approximation obtained by algorithm for problem .
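The quality ratio can be made concrete with a small sketch. The symbols are assumptions, because the formulas are not legible in this copy of the text: `n` denotes the number of vertices of the input digraph and `k` the number of vertices in nontrivial blocks (or components, for 2EC-C). With indegree at least two for those `k` vertices and at least one for the other `n - k`, any solution has at least `2k + (n - k) = n + k` edges, so the average indegree is at least `(n + k) / n`. The function names are hypothetical.

```python
def indegree_lower_bound(n, k):
    """Lower bound on the average indegree of any feasible solution:
    k vertices need indegree >= 2, the other n - k need indegree >= 1."""
    return (2 * k + (n - k)) / n  # equals (n + k) / n


def quality_ratio(num_subgraph_edges, n, k):
    """Average indegree of the computed subgraph over the lower bound."""
    average_indegree = num_subgraph_edges / n
    return average_indegree / indegree_lower_bound(n, k)


# Four vertices, three of them in a nontrivial block: at least 7 edges needed.
ratio_with_8_edges = quality_ratio(8, n=4, k=3)   # 8/7, slightly above 1
ratio_with_7_edges = quality_ratio(7, n=4, k=3)   # matches the lower bound
```

Because the denominator is only a lower bound on the optimum, a ratio of 1 certifies optimality, while a larger value may overstate how far the computed subgraph is from optimal.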
4.1 Experimental results
We now report the results of our experiments with all the algorithms considered for problems 2EC-B, 2EC-B-C and 2EC-C. As previously mentioned, for the sake of efficiency, all variants of Test2EDP, Test2ECB and Hybrid were run on the sparse certificate computed by either IST-B or IST-BC (depending on the problem at hand) instead of the original digraph.
We group the experimental results into two categories: results on the 2EC-B problem and results on both 2EC-C and 2EC-B-C problems. In all cases we are interested in the quality ratio of the computed solutions and the corresponding running times. Moreover, in order to better highlight the different behaviour of our algorithms, we present for each algorithm both the quality ratio for each individual input and also give an overall view in terms of box-and-whisker diagrams. Specifically, we report the following experimental results:
For the 2EC-C and 2EC-B-C problems:
their running times are given in Table 6, while the corresponding plotted values are shown in Figure 7 (bottom). We note that the running times include the time to compute the 2-edge-connected components of the input digraph. To that end, we use the algorithm from , which is fast in practice despite the fact that its worst-case running time is .
4.2 Evaluation of the experimental results
There are two peculiarities related to road networks that emerge immediately from the analysis of our experimental data. First, all algorithms achieve consistently better approximations for road networks than for most of the other graphs in our data set. Second, for the 2EC-B problem the Hybrid algorithms (Hybrid-B and Hybrid-B-Aux) seem to achieve substantial speedups on road networks; for the 2EC-B-C problem, this is even true for Test2ECB-BC. The first phenomenon can be explained by taking into account the macroscopic structure of road networks, which is rather different from other networks. Indeed, road networks are very close to being “undirected”: i.e., whenever there is an edge , there is also the reverse edge (except for one-way roads). Roughly speaking, road networks mainly consist of the union of 2-edge-connected components, joined together by strong bridges, and their 2-edge-connected blocks coincide with their 2-edge-connected components. In this setting, a sparse strongly connected subgraph of the condensed graph will preserve both blocks and components. The second phenomenon is mainly due to the trivial edge heuristic described in Section 3.6.
Apart from the peculiarities of road networks, ZNI-C behaves as expected for 2EC-C through its linear-time -approximation algorithm. Note that for both problems 2EC-B and 2EC-B-C, all algorithms achieve quality ratio significantly smaller than our theoretical bound of . Regarding running times, we observe that the 2EC-B-C algorithms are faster than the 2EC-B algorithms, sometimes significantly, as they take advantage of the condensed graph that seems to admit small size in real-world applications. In addition, our experiments highlight interesting tradeoffs between practical performance and quality of the obtained solutions. Indeed, the fastest (IST-B and IST-B original for problem 2EC-B; IST-BC for 2EC-B-C) and the slowest algorithms (Test2ECB-B and Hybrid-B for 2EC-B; Test2ECB-BC and Hybrid-BC for 2EC-B-C) tend to produce respectively the worst and the best approximations. Note that IST-B improves the quality of the solution of IST-B original at the price of slightly higher running times, while Hybrid-B (resp., Hybrid-BC) produces the same solutions as Test2ECB-B (resp., Test2ECB-BC) with rather impressive speedups. Running an algorithm on the second-level auxiliary graphs seems to produce substantial performance benefits at the price of a slightly worse approximation (Test2EDP-B-Aux, Hybrid-B-Aux, Test2EDP-BC-Aux and Hybrid-BC-Aux versus Test2EDP-B, Hybrid-B, Test2EDP-BC and Hybrid-BC). Overall, in our experiments Test2EDP-B-Aux and Test2EDP-BC-Aux seem to provide good quality solutions for the problems considered without being penalized too much by a substantial performance degradation.
5 Concluding remarks
We do not know if the approximation ratio of that we provided for the algorithms of the IST and Hybrid families is tight. Figure 8(a) shows a digraph such that a sparse certificate constructible by algorithm IST-B has edges. This digraph has a single nontrivial 2-edge-connected block consisting of the vertices , which also form a 2-edge-connected component. An optimal solution for 2EC-B on this instance, shown in Figure 8(b), has edges, where each vertex has indegree and outdegree equal to two, while the other four vertices have indegree and outdegree equal to one. Figure 8(c) shows a minimal solution with edges, where again each vertex has indegree and outdegree equal to two but vertex has indegree equal to and vertex has outdegree equal to ; removing any edge of this minimal solution either destroys the strong connectivity of the subgraph or partitions the nontrivial block. So, for this instance IST-B achieves a -approximation, while Hybrid-B achieves a -approximation. The three phases of the sparse certificate construction by IST-B are given in Figures 9, 10 and 11.
We also note that the example of Figure 8 is not a worst-case instance for Hybrid-B. If the input digraph is 2-edge-connected then we seek a smallest 2-edge-connected spanning subgraph, and Lemma 3.1 implies that Hybrid-B produces the same output as Test2EDP-B. So, in this case Hybrid-B achieves an approximation ratio of , which is known to be tight . In light of our experimental results, it seems possible that the Hybrid algorithms always achieve a -approximation, but we have no proof.
We close with a few more open questions and possible directions for future work. First, we can consider the case of vertex-connectivity, where we can define the corresponding problems of computing the smallest strongly connected spanning subgraph that maintains the 2-vertex-connected blocks, or the 2-vertex-connected components, or both. A sparse certificate for the 2-vertex-connected blocks is given in , so it would be interesting to study if based on this construction we can achieve a similar approximation ratio for the 2-vertex-connectivity case. Furthermore, the concept of 2-edge-connected blocks may well be generalized to -edge disjoint paths, for . Keeping in mind that the underlying graph should remain strongly connected, it is natural to ask if computing the smallest such spanning subgraph achieves a better approximation ratio for . Such a phenomenon occurs in approximating the smallest spanning -edge connected subgraph [3, 5].
[1] S. Alstrup, D. Harel, P. W. Lauridsen, and M. Thorup. Dominators in linear time. SIAM Journal on Computing, 28(6):2117–2132, 1999.
[2] A. L. Buchsbaum, L. Georgiadis, H. Kaplan, A. Rogers, R. E. Tarjan, and J. R. Westbrook. Linear-time algorithms for dominators and other path-evaluation problems. SIAM Journal on Computing, 38(4):1533–1573, 2008.
[3] J. Cheriyan and R. Thurimella. Approximating minimum-size k-connected spanning subgraphs via matching. SIAM Journal on Computing, 30(2):528–560, 2000.
[4] J. Edmonds. Edge-disjoint branchings. Combinatorial Algorithms, pages 91–96, 1972.
[5] H. N. Gabow, M. X. Goemans, E. Tardos, and D. P. Williamson. Approximating the smallest k-edge connected spanning subgraph by LP-rounding. Networks, 53(4):345–357, 2009.
[6] H. N. Gabow and R. E. Tarjan. A linear-time algorithm for a special case of disjoint set union. Journal of Computer and System Sciences, 30(2):209–221, 1985.
[7] M. R. Garey and D. S. Johnson. Computers and Intractability: A Guide to the Theory of NP-Completeness. W. H. Freeman & Co., New York, NY, USA, 1979.
[8] L. Georgiadis, G. F. Italiano, L. Laura, and N. Parotsidis. 2-edge connectivity in directed graphs. In SODA 2015, pages 1988–2005, 2015.
[9] L. Georgiadis, G. F. Italiano, L. Laura, and N. Parotsidis. 2-vertex connectivity in directed graphs. In ICALP 2015, pages 605–616, 2015.
[10] L. Georgiadis and R. E. Tarjan. Dominator tree verification and vertex-disjoint paths. In SODA 2005, pages 433–442, 2005.
[11] L. Georgiadis and R. E. Tarjan. Dominator tree certification and independent spanning trees. CoRR, abs/1210.8303, 2012.
[12] M. Henzinger, S. Krinninger, and V. Loitzenbauer. Finding 2-edge and 2-vertex strongly connected components in quadratic time. In ICALP 2015, pages 713–724, 2015.
[13] G. F. Italiano, L. Laura, and F. Santaroni. Finding strong bridges and strong articulation points in linear time. Theoretical Computer Science, 447:74–84, 2012.
[14] R. Jaberi. Computing the 2-blocks of directed graphs. RAIRO-Theor. Inf. Appl., 49(2):93–119, 2015.
[15] G. Kortsarz and Z. Nutov. Approximating minimum cost connectivity problems. Approximation Algorithms and Metaheuristics, 2007.
[16] B. Laekhanukit, S. O. Gharan, and M. Singh. A rounding by sampling approach to the minimum size k-arc connected subgraph problem. In ICALP 2012, pages 606–616, 2012.
[17] W. Di Luigi, L. Georgiadis, G. F. Italiano, L. Laura, and N. Parotsidis. 2-connectivity in directed graphs: An experimental study. In ALENEX 2015, pages 173–187, 2015.
[18] H. Nagamochi and T. Ibaraki. Algorithmic Aspects of Graph Connectivity. Cambridge University Press, 2008. 1st edition.
[19] R. E. Tarjan. Edge-disjoint spanning trees and depth-first search. Acta Informatica, 6(2):171–185, 1976.
[20] A. Vetta. Approximating the minimum strongly connected subgraph via a matching lower bound. In SODA 2001, pages 417–426, 2001.
[21] L. Zhao, H. Nagamochi, and T. Ibaraki. A linear time 5/3-approximation for the minimum strongly-connected spanning subgraph problem. Information Processing Letters, 86(2):63–70, 2003.