iBGP and Constrained Connectivity

We initiate the theoretical study of the problem of minimizing the size of an iBGP overlay in an Autonomous System (AS) in the Internet subject to a natural notion of correctness derived from the standard "hot-potato" routing rules. For both natural versions of the problem (where we measure the size of an overlay by either the number of edges or the maximum degree) we prove that it is NP-hard to approximate to a factor better than $\Omega(\log n)$ and provide approximation algorithms with ratio $\tilde{O}(\sqrt{n})$. This algorithm is based on a natural LP relaxation and a randomized rounding technique inspired by the recent work on approximating directed spanners by Bhattacharyya et al. [SODA 2009], Dinitz and Krauthgamer [STOC 2011], and Berman et al. [ICALP 2011]. In addition to this theoretical algorithm, we give a slightly worse $\tilde{O}(n^{2/3})$-approximation based on primal-dual techniques that has the virtue of being both fast (in theory and in practice) and good in practice, which we show via simulations on the actual topologies of five large Autonomous Systems.

The main technique we use is a reduction to a new connectivity-based network design problem that we call Constrained Connectivity. In this problem we are given a graph $G = (V, E)$, and for every pair of vertices $(u,v)$ we are given a set $S(u,v) \subseteq V$ called the safe set of the pair. The goal is to find the smallest subgraph $H$ of $G$ in which every pair of vertices $(u,v)$ is connected by a path contained in $S(u,v)$. We show that the iBGP problem can be reduced to the special case of Constrained Connectivity where $G = K_n$ and safe sets are defined geometrically based on the IGP distances in the AS. Indeed, our algorithmic upper bounds generalize to Constrained Connectivity on $K_n$, and our $\Omega(\log n)$-lower bound for the special case of iBGP implies hardness for the general case. Furthermore, we believe that Constrained Connectivity is an interesting problem in its own right, so we provide stronger hardness results ($2^{\log^{1-\epsilon} n}$-hardness of approximation based on reductions from Label Cover) and integrality gaps ($\tilde{\Omega}(n^{1/3})$ based on random instances of Unique Games) for the general case. On the positive side, we show that Constrained Connectivity turns out to be much simpler for some interesting special cases other than iBGP: when safe sets are symmetric and hierarchical, we give a polynomial-time algorithm that computes an optimal solution.

1 Introduction

The Internet consists of a number of interconnected subnetworks called Autonomous Systems (ASes). As described in [1], the way that routes to a given destination are chosen by routers within an AS can be viewed as follows. Routers have a ranking of routes based on economic considerations of the AS. Without loss of generality, in what follows we assume that all routes are equally ranked. Thus routers must use some tie-breaking scheme in order to choose a route from amongst the equally ranked routes. Tie-breaking is based on traffic-engineering considerations: in particular, the goal is to get packets out of the AS as quickly as possible (so-called hot-potato routing).

An AS attempts to achieve hot-potato routing using iBGP, the version of the interdomain routing protocol BGP [17] used by routers within a subnetwork to announce routes to each other that have been learned from outside the subnetwork. An iBGP configuration is defined by a signaling graph, which is supposed to enforce hot-potato routing. Unfortunately, while iBGP has many nice properties that make it useful in practice, constructing a good signaling graph turns out to be a computationally difficult problem. For example, it is not clear a priori that it is even possible to check in polynomial time that a signaling graph is correct, i.e. it is not obvious that the problem is even in NP! In this paper we study the problem of constructing small and correct signaling graphs, as well as a natural extension to a more general problem that we call Constrained Connectivity.

1.1 iBGP

At a high level, iBGP works as follows. The routers that initially know of a route are called border routers. (These initial routes are those learned by the border routers from routers outside the AS.) The border router that initially knows of a route is said to be the egress router of that route. Each border router knows of at most one route, so an initial set of routes defines a set of egress routers with a one-to-one correspondence between the routes and the egress routers. The AS has an underlying physical network with edge weights (e.g., IGP distances or OSPF weights), and the distance between two routers is defined to be the length of the shortest path (according to these edge weights) between them. Given a set of routes, a router will rank highest the one whose egress router is closest according to this definition of distance. The signaling graph $H$ is an overlay network whose nodes represent routers and whose edges represent the fact that the two routers at its endpoints use iBGP to inform one another of their current chosen route. The endpoints of an edge in $H$ are called iBGP neighbors, and a path in $H$ is called a signaling path. Note that iBGP neighbors are not necessarily neighbors in the underlying physical network, since $H$ is an overlay and can include any possible edge.

Finally, iBGP can be thought of as working as follows: in an asynchronous fashion, each router considers all the latest routes it has heard about from its iBGP neighbors, chooses the one with the closest egress router, and tells its iBGP neighbors about the route it has chosen. This continues until no router learns of a route whose egress router is closer than that of its currently chosen route. When this process ends, the route chosen by router $x$ is denoted by $R(x)$, and we let $P(x)$ be the shortest path from $x$ to the egress router of $R(x)$. When a packet arrives at $x$, it sends the packet to the next router $y$ on $P(x)$; $y$ in turn sends the packet to the next router on $P(y)$, and so on. Thus if $P(y)$ is not the subpath of $P(x)$ starting at $y$ then the packet will not get routed as expected.
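To make these dynamics concrete, here is a minimal Python sketch of the process (the code and all names are ours, not from the paper). It assumes routers $0, \dots, n-1$, a distance matrix, an overlay given as a set of edges, and a set of egress routers; it iterates the hot-potato choice rule until a fixpoint is reached.

# Sketch (our own code) of the iBGP route-selection process above.
# dist is the IGP distance matrix, overlay is a set of frozenset
# signaling edges, and egress is the set of routers that initially
# know a route. We iterate synchronously; since selections only ever
# improve, this reaches the same fixpoint as the asynchronous process.

def ibgp_fixpoint(n, dist, overlay, egress):
    # chosen[v] = egress router of the route v currently selects, if any
    chosen = {v: (v if v in egress else None) for v in range(n)}
    changed = True
    while changed:
        changed = False
        for v in range(n):
            if v in egress:
                continue  # an egress router keeps its own route
            for u in range(n):
                if frozenset((u, v)) in overlay and chosen[u] is not None:
                    e = chosen[u]  # v hears u's currently chosen route
                    if chosen[v] is None or dist[v][e] < dist[v][chosen[v]]:
                        chosen[v] = e  # hot-potato rule: closer egress wins
                        changed = True
    return chosen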

A signaling graph $H$ has the complete visibility property for a set $X$ of egress routers if each router $x$ hears about (and hence chooses as $R(x)$) the route whose egress router is closest to $x$ from amongst all routers in $X$. It is easy to see that $H$ will achieve hot-potato routing for $X$ if and only if it has the complete visibility property for $X$. So we say that a signaling graph is correct if it has the complete visibility property for every possible set $X$ of egress routers.

Clearly if $H$ is the complete graph then $H$ is correct. Because of this, the default configuration of iBGP, and the original standard, was to maintain a complete graph, also called a full mesh [17]. However, the complete graph is not practical, and so network managers have adopted various configuration techniques to reduce the size of the signaling graph [2, 18]. Unfortunately these methods do not guarantee correct signaling graphs [1, 11]. Thus our goal is to determine correct signaling graphs with fewer edges than the complete graph. Slightly more formally, two natural objectives are to minimize the number of edges in the signaling graph or to minimize the maximum number of iBGP neighbors of any router, while guaranteeing correctness. We define iBGP-Sum to be the problem of finding a correct signaling graph with the fewest edges, and similarly we define iBGP-Degree to be the problem of finding a correct signaling graph with the minimum possible maximum degree.

1.2 Constrained Connectivity

All we know a priori about the complexity of iBGP-Sum and iBGP-Degree is that they are in $\Sigma_2^p$ (the second existential level of the polynomial hierarchy), since the statement of correctness is that "there exists a small graph such that for all possible sets of egress routers each router hears about the route with the closest egress router". In particular, it is not obvious that these problems are in NP, i.e. that there is a short certificate that a signaling graph is correct. However, it turns out that these problems are actually in NP (see Section 2.1), and the proof of this fact naturally gives rise to a more general network design problem that we call Constrained Connectivity. In this problem we are given a graph $G = (V, E)$ and for each pair of nodes $\{u,v\}$ we are given a set $S(u,v) \subseteq V$. Each such $S(u,v)$ is called a safe set, and it is assumed that $u, v \in S(u,v)$. We say that a subgraph $H$ of $G$ is safely connected if for each pair of nodes $\{u,v\}$ there is a path in $H$ from $u$ to $v$ in which every node is contained in $S(u,v)$ (a checking sketch follows the list below). As with iBGP, we are interested in two optimization versions of this problem:

  1. Constrained Connectivity-Sum: compute a safely connected subgraph with the minimum number of edges, and

  2. Constrained Connectivity-Degree: compute a safely connected subgraph that minimizes the maximum degree over all nodes.
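Both versions share the same feasibility condition, which is easy to check directly. The following Python sketch (our own illustration; the names are hypothetical) tests whether a given subgraph is safely connected by searching for a path inside each pair's safe set.

from collections import deque

# A minimal sketch (our code) of checking safe connectivity: edges is
# a set of frozenset pairs, and safe[(u, v)] is the safe set S(u, v).

def is_safely_connected(edges, safe):
    for (u, v), S in safe.items():
        seen, queue = {u}, deque([u])
        while queue and v not in seen:
            a = queue.popleft()
            for b in S:  # only traverse within the pair's safe set
                if b not in seen and frozenset((a, b)) in edges:
                    seen.add(b)
                    queue.append(b)
        if v not in seen:
            return False
    return True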

It turns out (see Theorem 2.1) that the iBGP problems can be viewed as Constrained Connectivity problems with $G = K_n$ and safe sets defined in a particular geometric way. While the motivation for studying Constrained Connectivity comes from iBGP, we believe that it is an interesting problem in its own right. It is an extremely natural and general network design problem that, somewhat surprisingly, seems not to have been considered before. While we only provide negative results for the general problem (hardness of approximation and integrality gaps), a better understanding of Constrained Connectivity might lead to a better understanding of other network design problems, both explicitly via reductions and implicitly through techniques. For example, many of the techniques used in this paper come from the recent literature on directed spanners [4, 8, 3], and given these similarities it is not unreasonable to think that insight into Constrained Connectivity might provide insight into directed spanners.

For a more direct example, there is a natural security application of Constrained Connectivity. Suppose we have $n$ players who wish to communicate with each other but do not all trust one another with the messages they send. That is, when player $u$ wishes to send a message to player $v$ there is a subset $S(u,v)$ of players that $u$ trusts to see the messages that it sends to $v$. Of course, if for every pair of players there were a direct communication channel between the two players, then there would be no problem. But suppose there is a cost to protect communication channels from eavesdropping or other such attacks. Then a natural goal would be to have a network of fewer than $\binom{n}{2}$ protected communication channels that still allows a route from each $u$ to each $v$ completely contained within $S(u,v)$. This is exactly a Constrained Connectivity-Sum problem.

1.3 Summary of Main Results

In Section 3 we give a polynomial-time $\tilde{O}(\sqrt{n})$-approximation for the iBGP problems, by giving the same approximation for the more general problem of Constrained Connectivity on $K_n$.

Theorem 3.4.  There is an $\tilde{O}(\sqrt{n})$-approximation algorithm for the Constrained Connectivity problems on $K_n$.

Corollary.  There is an $\tilde{O}(\sqrt{n})$-approximation algorithm for iBGP-Sum and iBGP-Degree.

To go along with these theoretical upper bounds, we design a different (but related) algorithm for Constrained Connectivity-Sum on $K_n$ that has a worse theoretical guarantee (an $\tilde{O}(n^{2/3})$-approximation) but is faster in both theory and practice, and we show by simulation on five real AS topologies (Telstra, Sprint, NTT, TINET, and Level 3) that in practice it provides an extremely good approximation. Details of these simulations are in Section 3.3.

To complement these upper bounds, in Section 4 we show that the iBGP problem is hard to approximate, even with the extra power afforded us by the geometry of the safe sets:

Theorems 4.4 and 4.5.  It is NP-hard to approximate iBGP-Sum or iBGP-Degree to a factor better than $\Omega(\log n)$.

We then study the fully general Constrained Connectivity problems, and in Section 5 we show that they are hard to approximate:

Theorem 5.2.  The Constrained Connectivity-Sum and Constrained Connectivity-Degree problems do not admit a $2^{\log^{1-\epsilon} n}$-approximation algorithm for any constant $\epsilon > 0$ unless $NP \subseteq DTIME(n^{\mathrm{polylog}(n)})$.

This is basically the same inapproximability factor as for Label Cover, and in fact our reduction is from a minimization version of Label Cover known as Min-Rep. Moreover, we show that the natural LP relaxation has a polynomial integrality gap of $\tilde{\Omega}(n^{1/3})$.

Finally, in Section 6 we consider some other special cases of Constrained Connectivity that turn out to be easier. In particular, we say that a collection of safe sets is symmetric if $S(u,v) = S(v,u)$ for all pairs $\{u,v\}$, and that it is hierarchical if for all pairs $\{u,v\}$ and nodes $w$, if $w \in S(u,v)$ then $S(u,w) \subseteq S(u,v)$ and $S(w,v) \subseteq S(u,v)$. It turns out that all of our hardness results and integrality gaps also hold for symmetric instances, but adding the hierarchical property makes things easier:

Theorem 6.6.  Constrained Connectivity-Sum with symmetric and hierarchical safe sets can be solved optimally in polynomial time.

1.4 Related Work

Issues involving eBGP, the version of BGP that routers in different ASes use to announce routes to one another, have recently received significant attention from the theoretical computer science community, especially stability and game-theoretic issues (e.g., [10, 14, 9]). However, not nearly as much work has been done on problems related to iBGP, which distributes routes internally in an AS. There has been some work on the problem of guaranteeing hot-potato routing in an AS with a route reflector architecture [2]. These earlier papers did not consider the issue of finding small signaling graphs that achieve the hot-potato goal. Instead they either provided sufficient conditions for correctness relating the underlying physical network to the route reflector configuration [11], or they showed that by allowing some specific extra routes to be announced (rather than just the one chosen route) they could guarantee a version of hot-potato routing [1]. The first people to consider the problem of designing small iBGP overlays subject to achieving hot-potato correctness were Vutukuru et al. [19], who used graph partitioning schemes to give such configurations. But while they proved that their algorithm gave correct configurations, they only gave simulated evidence that the configurations it produced were small. Buob et al. [6] considered the problem of designing small correct solutions and gave a mathematical programming formulation, but then simply solved the integer program using super-polynomial-time algorithms.

2 Preliminaries

2.1 Relationship between iBGP and Constrained Connectivity

We will now show that the iBGP problems are just special cases of Constrained Connectivity-Sum and Constrained Connectivity-Degree. This will be a natural consequence of the proof that iBGP-Sum and iBGP-Degree are in NP.

To see this we will need the following definitions. We will assume that there are no ties, i.e. that all distances are distinct. For two routers $x$ and $y$, let $F(x,y) = \{z : d(x,z) > d(x,y)\}$ be the set of routers that are farther from $x$ than $y$ is. Let $S(x,y) = \{w : d(w,y) < d(w,z) \text{ for all } z \in F(x,y)\}$ be the set of routers that are closer to $y$ than to any router not in the ball around $x$ of radius $d(x,y)$. We will refer to $S(x,y)$ as the set of "safe" routers for the pair $(x,y)$. A path from $x$ to $y$ in a signaling graph is said to be a safe signaling path if it is contained in $S(x,y)$. It turns out that these safe sets characterize correct signaling graphs:

Theorem 2.1.

An iBGP signaling graph $H$ is correct if and only if for every pair of routers $(x,y)$ there is a signaling path from $x$ to $y$ that uses only routers in $S(x,y)$.

Proof.

We first show that if every pair has a safe signaling path then every router hears about the route with the closest egress router, no matter what the set of egress routers is. This is simple: let $x$ be a router, and let $y$ be its closest egress router. Let $R$ be the route whose egress router is $y$. By assumption there is a signaling path from $x$ to $y$ that uses only routers in $S(x,y)$. By definition, every one of these routers is closer to $y$ than to any router farther from $x$ than $y$ is. Since $y$ is the closest egress router to $x$, every other egress router is in $F(x,y)$, so for all of the routers in $S(x,y)$, $y$ will be the closest egress router. A simple induction along the path then shows that the routers in a safe signaling path will each choose $R$ and hence tell their iBGP neighbor in the path about $R$. That is, $x$ hears about $R$.

For the other direction we need to show that if a signaling graph is correct then every pair has a safe signaling path. For contradiction, suppose that there is no safe signaling path from $x$ to $y$. Let $X$, the set of egress routers, be $F(x,y) \cup \{y\}$. Let $R$ be the route whose egress router is $y$. Since every router in $F(x,y)$ is farther from $x$ than $y$ is, for this set of egress routers $y$ is closer to $x$ than any other egress. By correctness we know that $x$ does hear about $R$. Let $P$ be the (or at least a) signaling path between $x$ and $y$ through which $x$ hears about $R$. Since there are no safe signaling paths from $x$ to $y$, we know that there exists some router $w$ on $P$ such that $w \notin S(x,y)$. This means that there is some $z \in F(x,y)$ such that $d(w,z) < d(w,y)$. Since we assumed correctness, we know that $w$ heard about the route whose egress router is closest to $w$, and that egress is not $y$ (since in particular $z$ is closer). So $w$ will not tell its iBGP neighbors about $R$, which is a contradiction since $w$ is on the signaling path through which $x$ heard about $R$. Thus a safe signaling path must exist. ∎

Note that this condition is easy to check in polynomial time, so we have shown membership in NP. This characterization also shows that the problems iBGP-Sum and iBGP-Degree are Constrained Connectivity problems where the underlying graph is $K_n$ and the safe sets are defined by the geometric properties above. While the proof of this is relatively simple, we believe that it is an important contribution of this paper, as it allows us to characterize the behavior of a protocol (iBGP) using only the static information of the signaling graph and the network distances.
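As an illustration, the following sketch (our own code, assuming the geometric definitions above and distinct pairwise distances) computes the safe sets from a distance matrix; combined with is_safely_connected from Section 1.2 it gives exactly the polynomial-time correctness check just described.

# Sketch (ours) of the geometric safe sets of Section 2.1. dist is a
# symmetric distance matrix with all pairwise distances distinct.

def safe_set(dist, x, y):
    n = len(dist)
    # F(x, y): routers strictly farther from x than y is
    far = [z for z in range(n) if dist[x][z] > dist[x][y]]
    # S(x, y): routers closer to y than to every router in F(x, y)
    return {w for w in range(n)
            if all(dist[w][y] < dist[w][z] for z in far)}

def ibgp_safe_sets(dist):
    # one safe set per ordered pair; feed the result, together with a
    # candidate overlay, to is_safely_connected to test correctness
    n = len(dist)
    return {(x, y): safe_set(dist, x, y)
            for x in range(n) for y in range(n) if x != y}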

2.2 Linear Programming Relaxations

There are two obvious linear programming relaxations of the Constrained Connectivity problems (and thus of the iBGP problems): the flow LP and the cut LP. For every pair $(x,y)$ let $\mathcal{P}(x,y)$ be the collection of $x$–$y$ paths that are contained in $S(x,y)$. The flow LP has a variable $c_e$ for every edge $e$ (called the capacity of edge $e$) and a variable $f_P$ for every path $P \in \mathcal{P}(x,y)$ for every pair $(x,y)$ (called the flow assigned to path $P$). The flow LP simply requires that at least one unit of flow is sent between every pair while obeying the capacity constraints:

$$\min \sum_e c_e \quad \text{s.t.} \quad \sum_{P \in \mathcal{P}(x,y)} f_P \ge 1 \;\; \forall (x,y), \qquad \sum_{P \in \mathcal{P}(x,y) : e \in P} f_P \le c_e \;\; \forall e, \forall (x,y), \qquad f, c \ge 0.$$

This is obviously a valid relaxation of Constrained Connectivity-Sum: given a valid solution to Constrained Connectivity-Sum, let $P_{xy}$ denote the required safe path for every pair $(x,y)$. Set $c_e$ to $1$ for every edge $e$ in some path $P_{xy}$, set $f_{P_{xy}}$ to $1$ for every pair $(x,y)$, and set all other variables to $0$. This is clearly a valid solution to the linear program with the exact same value. To change the LP for Constrained Connectivity-Degree we can just introduce a new variable $\lambda$, change the objective function to $\min \lambda$, and add the extra constraints $\sum_{e \ni v} c_e \le \lambda$ for all nodes $v$. And while this LP can be exponential in size (since there is a variable for every path), it is easy to design a compact representation that has only $O(n^4)$ variables and constraints. This compact representation has variables $f^{xy}_{uv}$ instead of path variables, where $f^{xy}_{uv}$ represents the amount of flow from $u$ to $v$ along the edge $\{u,v\}$ for the demand $(x,y)$. Then we can write the usual flow conservation and capacity constraints for every demand independently, restricted to $S(x,y)$. Indeed, this compact representation is one of the main reasons to prefer the flow LP over the cut LP.
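For concreteness, here is a hedged sketch of the compact flow LP for the Sum version using the PuLP modeling library (the library choice, names, and indexing conventions are ours, not the paper's).

import pulp
from itertools import permutations, combinations

# Sketch (ours) of the compact flow LP. safe[(x, y)] is the safe set
# of the demand (x, y); we route one unit of flow from x to y inside
# safe[(x, y)], directing each undirected edge {u, v} as two arcs.

def build_flow_lp(n, safe):
    prob = pulp.LpProblem("constrained_connectivity_sum", pulp.LpMinimize)
    cap = {frozenset(e): pulp.LpVariable(f"c_{min(e)}_{max(e)}", lowBound=0)
           for e in combinations(range(n), 2)}
    prob += pulp.lpSum(cap.values())  # objective: total fractional edges
    for (x, y), S in safe.items():
        f = {(u, v): pulp.LpVariable(f"f_{x}_{y}_{u}_{v}", lowBound=0)
             for u, v in permutations(S, 2)}
        for v in S:  # flow conservation: x sends one unit, y receives it
            balance = (pulp.lpSum(f[(v, w)] for w in S if w != v)
                       - pulp.lpSum(f[(w, v)] for w in S if w != v))
            prob += balance == (1 if v == x else (-1 if v == y else 0))
        for u, v in combinations(S, 2):  # per-demand capacity constraints
            prob += f[(u, v)] + f[(v, u)] <= cap[frozenset((u, v))]
    return prob, cap

Calling prob.solve() then yields the fractional capacities used by the rounding algorithm of Section 3.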

The cut LP is basically equivalent to the flow LP, except that instead of requiring flow to be sent, it requires every cut to be large enough. Given a pair $(x,y)$, let $\mathcal{C}(x,y)$ be the collection of safe-set cuts that separate $x$ and $y$: sets $S \subseteq S(x,y)$ with $x \in S$ and $y \notin S$. Furthermore, given such a set $S$, let $\delta_{S(x,y)}(S)$ be the set of safe edges that cross $S$, i.e. edges with both endpoints in $S(x,y)$ and exactly one endpoint in $S$. The cut LP has a variable $c_e$ for every edge $e$ (equivalent to the capacity $c_e$ in the flow LP), and is quite simple:

$$\min \sum_e c_e \quad \text{s.t.} \quad \sum_{e \in \delta_{S(x,y)}(S)} c_e \ge 1 \;\; \forall (x,y), \; \forall S \in \mathcal{C}(x,y), \qquad c \ge 0.$$

This LP simply minimizes the sum of the edge variables subject to the constraint that for every safe-set cut between two nodes there must be at least one safe edge crossing it. While the flow LP and the cut LP are not technically duals of each other (since the capacities are variables), it is easy to see from the max-flow/min-cut theorem that they describe the same polytope (with respect to the capacity variables). Thus integrality gaps for one automatically hold for the other, as do approximations achieved by LP rounding.

3 Algorithms for iBGP and Constrained Connectivity on $K_n$

3.1 An $\tilde{O}(\sqrt{n})$-approximation

In this section we show that there is an $\tilde{O}(\sqrt{n})$-approximation algorithm for both Constrained Connectivity problems as long as the underlying graph is the complete graph $K_n$. This algorithm is inspired by the recent progress on directed spanners by Bhattacharyya et al. [4], Dinitz and Krauthgamer [8], and Berman et al. [3]. In particular, we use the same two-component framework that they do: a randomized rounding of the LP and a separate random tree-sampling step. The randomized rounding we do is simple independent rounding with inflated probabilities. The next lemma implies that this works well when the safe sets are small.

Lemma 3.1.

Let $H$ be obtained by adding every edge $\{u,v\}$ with $u,v \in S(x,y)$ to $H$ independently with probability at least $\min\{1, \; 3\ln n \cdot |S(x,y)| \cdot x_{uv}\}$, where $x_{uv}$ is the capacity of $\{u,v\}$ in a feasible LP solution. Then with probability at least $1 - n^{-2}$, $H$ will have a path between $x$ and $y$ contained in $S(x,y)$.

Proof.

Let $(A,B)$ be a partition of $S(x,y)$ so that $x \in A$ and $y \in B$, i.e. $(A,B)$ is an $x$–$y$ cut of $S(x,y)$. Note that there are only $2^{|S(x,y)|-2}$ such cuts, and by standard arguments if at least one edge from every such cut is chosen to be in $H$ then $H$ contains an $x$–$y$ path in $S(x,y)$. Since in any feasible LP solution at least one unit of flow is sent from $x$ to $y$ within $S(x,y)$, every cut has capacity at least $1$. Let $E'$ be the set of edges that cross the cut $(A,B)$. If $x_e \ge 1/(3\ln n \cdot |S(x,y)|)$ for some $e \in E'$ then $e$ is selected with probability $1$, and thus the cut is spanned. Otherwise, the probability that no edge from $E'$ is chosen is at most $\prod_{e \in E'}(1 - 3\ln n \cdot |S(x,y)| \cdot x_e) \le \exp(-3\ln n \cdot |S(x,y)| \sum_{e \in E'} x_e) \le n^{-3|S(x,y)|}$. Thus by a simple union bound the probability that we fail on any cut is at most $2^{|S(x,y)|} \cdot n^{-3|S(x,y)|} \le n^{-2}$. ∎

Another important part of our algorithm is random sampling that is independent of the LP. We will use two different types of sampling: star sampling for the Sum version and edge sampling for the Degree version. First we consider star sampling, in which we independently sample each node with probability $p$, and every sampled node becomes the center of a star that spans the vertex set.

Lemma 3.2.

All pairs with safe sets of size at least $k$ will be satisfied by random star sampling with high probability if $p \ge (3\ln n)/k$.

Proof.

Consider some pair $(x,y)$ with $|S(x,y)| \ge k$. If some node (say $w$) from $S(x,y)$ is sampled then the pair is satisfied, since the creation of a star at $w$ would create the path $x, w, y$, which is safe because $x$, $w$, and $y$ are all in $S(x,y)$. The probability that no node from $S(x,y)$ is sampled is $(1-p)^{|S(x,y)|} \le (1-p)^k \le e^{-pk} \le n^{-3}$.

Since there are fewer than $n^2$ pairs, we can take a union bound over all pairs with $|S(x,y)| \ge k$, giving us that all such pairs are satisfied with probability at least $1 - 1/n$. ∎

For edge sampling, we essentially consider the Erdős–Rényi graph $G(n,p)$, i.e. we just sample every edge independently with probability $p$. We will actually consider the union of $\Theta(\log n)$ such independent graphs, where $p = (c\ln n)/k$ for some small constant $c$. Let $H$ be this random graph.

Lemma 3.3.

With probability at least $1 - 1/n$, all pairs with safe sets of size at least $k$ will be connected by a safe path in $H$.

Proof.

Let $(x,y)$ be a pair with $|S(x,y)| \ge k$. Obviously $(x,y)$ is satisfied if the subgraph induced on $S(x,y)$ is connected. It is known [5] that there is some small constant $c$ so that $G(m, (c\ln m)/m)$ is connected with probability at least $1/2$, and our choice of $p$ is at least this large for every $m \ge k$. Since $H$ is the union of $\Theta(\log n)$ independent instantiations of $G(n,p)$, the probability that the subgraph of $H$ induced on $S(x,y)$ is not connected is at most $(1/2)^{\Theta(\log n)} \le n^{-3}$. We can now take a union bound over all such pairs, giving us that the probability that there is some unsatisfied pair with $|S(x,y)| \ge k$ is at most $1/n$. ∎

We will now combine the randomized rounding of the LP and the random sampling into a single approximation algorithm. Our algorithm is divided into two phases: first, we solve the LP and randomly include every edge $e$ with probability $\min\{1, \; 3\sqrt{n}\ln n \cdot x_e\}$. By Lemma 3.1 this takes care of all safe sets of size at most $\sqrt{n}$. Second, if the objective is to minimize the number of edges we do star sampling with probability $p = (3\ln n)/\sqrt{n}$, and if the objective is to minimize the maximum degree we do edge sampling using the construction of Lemma 3.3 with $k = \sqrt{n}$. It is easy to see that this algorithm with high probability results in a valid solution that is an $\tilde{O}(\sqrt{n})$-approximation.
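The following Python sketch (our code; the constants are illustrative choices consistent with the lemmas above, not tuned values from the paper) implements the two phases for both objectives.

import math, random
from itertools import combinations

# Sketch (ours) of the two-phase algorithm on K_n. x[e] is the
# fractional capacity of edge e in an optimal LP solution.

def round_and_sample(n, x, minimize_degree=False):
    H = set()
    root = math.isqrt(n) + 1          # threshold ~ sqrt(n)
    boost = 3 * root * math.log(n)
    # Phase 1: independent rounding with inflated probabilities
    # (handles all safe sets of size at most ~sqrt(n), Lemma 3.1).
    for e in combinations(range(n), 2):
        if random.random() < min(1.0, boost * x.get(frozenset(e), 0.0)):
            H.add(frozenset(e))
    p = 3 * math.log(n) / root
    if minimize_degree:
        # Phase 2 (Degree): union of Theta(log n) sparse random graphs
        for _ in range(round(2 * math.log(n)) + 1):
            for e in combinations(range(n), 2):
                if random.random() < p:
                    H.add(frozenset(e))
    else:
        # Phase 2 (Sum): each sampled node becomes the center of a star
        for v in range(n):
            if random.random() < p:
                H.update(frozenset((v, u)) for u in range(n) if u != v)
    return H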

Theorem 3.4.

This algorithm is an $\tilde{O}(\sqrt{n})$-approximation to both Constrained Connectivity-Sum and Constrained Connectivity-Degree on $K_n$.

Proof.

We first argue that the algorithm does indeed give a valid solution to the problem. Let $(x,y)$ be an arbitrary pair. If $|S(x,y)| \le \sqrt{n}$, then Lemma 3.1 implies that the first phase of the algorithm results in a safe path. If $|S(x,y)| > \sqrt{n}$, then Lemma 3.2 or Lemma 3.3 implies that the second phase of the algorithm results in a safe path. So every pair has a safe path, and thus the solution is valid.

We now show that the cost of this solution is at most $\tilde{O}(\sqrt{n}) \cdot OPT$. We first consider the objective function of minimizing the number of edges. In the LP rounding step we only increase capacities by a factor of at most $3\sqrt{n}\ln n$, so since the LP is a relaxation of the problem we know that the expected cost of the rounding is at most $3\sqrt{n}\ln n \cdot OPT$. For phase 2, in expectation we choose $3\sqrt{n}\ln n$ stars, for a total of at most $3n^{3/2}\ln n$ edges. But since there is a demand for every pair, any valid solution must be connected and so $OPT \ge n-1$; thus phase 2 has total cost at most $O(\sqrt{n}\ln n) \cdot OPT$.

If instead our objective function is to minimize the maximum degree, then since phase 1 only increases capacities by a factor of $O(\sqrt{n}\ln n)$ we know that after phase 1 the maximum degree is at most $\tilde{O}(\sqrt{n}) \cdot OPT$ (by a Chernoff bound, with high probability every vertex has degree at most $O(\log n)$ times its inflated fractional degree in the LP). In phase 2, a simple Chernoff bound implies that with high probability every node gets $\tilde{O}(\sqrt{n})$ new edges, and since the maximum degree of any valid solution is at least $1$, the node with maximum degree still has degree at most $\tilde{O}(\sqrt{n}) \cdot OPT$. ∎

3.2 Primal-Dual Algorithm

We also have a primal-dual algorithm that gives a slightly worse result for the Constrained Connectivity-Sum problem. While this algorithm and its analysis are slightly more complicated and only work for the Sum version, by not solving the linear program we get a faster algorithm. In particular, the best known algorithms for solving a general linear program with $N$ variables take time $O(N^{2.5})$, so since there are $\Theta(n^4)$ variables in the compact version of the flow LP this takes time $O(n^{10})$. The primal-dual algorithm, on the other hand, is significantly faster: a naïve analysis shows that it takes time $\tilde{O}(n^6)$.

In this algorithm we use the cut LP rather than the flow LP (in fact, the algorithm is very similar to the primal-dual algorithm for Steiner Forest, which uses a similar cut LP but does not have to deal with safe sets). Since this is a primal-dual algorithm, instead of solving and rounding the cut LP we consider its dual, which has a variable $y_{(x,y),S}$ for every pair $(x,y)$ and every cut $S \in \mathcal{C}(x,y)$. Recall that an edge $e$ is in $\delta_{S(x,y)}(S)$ if both endpoints of $e$ are in $S(x,y)$ and exactly one of them is in $S$.

$$\max \sum_{(x,y)} \sum_{S \in \mathcal{C}(x,y)} y_{(x,y),S} \quad \text{s.t.} \quad \sum_{((x,y),S) : e \in \delta_{S(x,y)}(S)} y_{(x,y),S} \le 1 \;\; \forall e, \qquad y \ge 0.$$

Unfortunately we will not be able to use a pure primal-dual approximation, but will have to trade off with a random sampling scheme as in the rounding algorithm. So instead of the full primal, we will only have constraints for pairs $(x,y)$ with $|S(x,y)| \le k$, for some parameter $k$ that we will set later. Thus in the dual we will only have variables $y_{(x,y),S}$ for pairs with $|S(x,y)| \le k$. This clearly preserves the property that the primal is a valid relaxation of the actual problem. Let $OPT$ denote the value of an optimal solution.

Our primal-dual algorithm, like most primal-dual algorithms, maintains a set of active dual variables that it increases until some dual constraint becomes tight. Once that happens we buy an edge (i.e. set its variable $c_e$ to $1$ in the primal), change the set of active dual variables, and repeat. We do this until we have a feasible primal.

Initially our primal solution $H$ is empty and the active dual variables are $y_{(x,y),\{x\}}$ and $y_{(x,y),\{y\}}$ for every pair $(x,y)$ with $|S(x,y)| \le k$; i.e. every node $x$ has an active dual variable for every other node $y$ that it has a demand with, corresponding to the cut in $S(x,y)$ that is the singleton $\{x\}$. We raise these variables uniformly until some constraint (say the one for edge $\{u,v\}$) becomes tight. At this point we add $\{u,v\}$ to our current primal solution $H$. We now change the active dual variables by "merging" moats that cross $\{u,v\}$. In particular, there are some active variables $y_{(x,y),S}$ where $\{u,v\} \in \delta_{S(x,y)}(S)$ (which implies that $u, v \in S(x,y)$ as well). Let $H(x,y)$ denote the subgraph of $H$ induced on $S(x,y)$. Without loss of generality we can assume that $u \in S$ and $v \notin S$. Let $C$ be the connected component of $H(x,y)$ containing $v$. We now make $y_{(x,y),S}$ inactive, and make $y_{(x,y),S \cup C}$ active. We do this for all such active variables, and then repeat this process (incrementing all active dual variables until some dual constraint becomes tight, adding that edge to $H$, and then merging moats that cross it) until all pairs with $|S(x,y)| \le k$ have a safe path in $H$.
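The following simplified Python sketch (our own rendering of the algorithm; for clarity it recomputes the moats from scratch in every iteration rather than maintaining them incrementally, which is slower but equivalent) may help make the moat-merging dynamics concrete.

# Sketch (ours) of the moat-growing primal-dual algorithm. safe maps
# each demand (x, y) with |S(x, y)| <= k to its safe set (a Python set).

def primal_dual(safe):
    H = set()      # edges bought so far
    load = {}      # accumulated dual load on each candidate edge

    def component(v, S):
        # connected component of v in H restricted to the safe set S
        seen, stack = {v}, [v]
        while stack:
            u = stack.pop()
            for w in S:
                if w not in seen and frozenset((u, w)) in H:
                    seen.add(w)
                    stack.append(w)
        return seen

    def unsatisfied():
        return [(p, S) for p, S in safe.items()
                if p[1] not in component(p[0], S)]

    pending = unsatisfied()
    while pending:
        # each live demand has two active moats, one around each endpoint
        rate = {}
        for (x, y), S in pending:
            for end in (x, y):
                moat = component(end, S)
                for u in moat:
                    for v in S - moat:
                        e = frozenset((u, v))
                        rate[e] = rate.get(e, 0) + 1
        # raise all active duals uniformly until some edge constraint
        # (total load = 1) becomes tight, then buy that edge
        tight = min(rate, key=lambda e: (1.0 - load.get(e, 0.0)) / rate[e])
        t = (1.0 - load.get(tight, 0.0)) / rate[tight]
        for e, r in rate.items():
            load[e] = load.get(e, 0.0) + t * r
        H.add(tight)
        pending = unsatisfied()
    return H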

Lemma 3.5.

This algorithm always maintains a feasible dual solution and an active set that does not contribute to any tight constraint.

Proof.

We will show this by induction, where the inductive hypothesis is that the dual solution is feasible and that no dual variable that contributes to a tight constraint is active. Initially all dual variables are $0$, so the solution is obviously feasible and no constraints are tight. Now suppose this is true after we add some edge $e$. We need to show that it is also true after we add the next edge $e'$. By induction the dual solution after we added $e$ is feasible and none of the active dual variables contribute to any tight constraints. Thus raising the active dual variables until some constraint becomes tight maintains dual feasibility.

To prove that no active variables contribute to a tight constraint, note that the only new tight constraint is the one corresponding to $e'$. The only variables contributing to that constraint are of the form $y_{(x,y),S}$ where $e' \in \delta_{S(x,y)}(S)$. But our algorithm made all of these variables inactive, and only added new active variables for sets that contain both endpoints of $e'$ and thus do not contribute to the newly tight constraint. Furthermore, these sets are formed by the union of the old set and the connected component of $H(x,y)$ containing the other endpoint, so no newly active variable contributes to a constraint that became tight previously (since those constraints correspond to edges in $H$). ∎

Theorem 3.6.

The primal-dual algorithm returns a graph $H$ with at most $(k^2/4) \cdot OPT$ edges in which every pair $(x,y)$ with $|S(x,y)| \le k$ has a safe path.

Proof.

After every iteration of the algorithm all of the tight constraints correspond to edges added to $H$, which together with Lemma 3.5 implies that the algorithm never gets stuck. Thus it will run until every pair with $|S(x,y)| \le k$ has a safe path. It just remains to show that the total number of edges returned is at most $(k^2/4) \cdot OPT$. To see this, note that every edge in $H$ corresponds to a tight constraint in the feasible dual solution we constructed, so if $e \in H$ then $\sum_{((x,y),S) : e \in \delta_{S(x,y)}(S)} y_{(x,y),S} = 1$. Thus we have that

$$|H| = \sum_{e \in H} \sum_{((x,y),S) : e \in \delta_{S(x,y)}(S)} y_{(x,y),S} \le \sum_{(x,y)} \sum_{S \in \mathcal{C}(x,y)} \frac{k^2}{4}\, y_{(x,y),S} \le \frac{k^2}{4} \cdot OPT,$$

where the last inequality is by weak duality, and the next-to-last inequality is because each dual variable $y_{(x,y),S}$ contributes to at most $|\delta_{S(x,y)}(S)| \le k^2/4$ of these tight constraints (since $|S(x,y)| \le k$). ∎

Lemma 3.7.

The primal-dual algorithm takes at most $\tilde{O}(n^6)$ time.

Proof.

The primal-dual algorithm adds at least one new edge per iteration, so there can be at most $\binom{n}{2}$ iterations. In each iteration we have to figure out the current value of every dual constraint and the number of active variables contributing to each constraint, which together imply what the next tight constraint is and how much to raise the variables. We then need to raise the active variables by that amount and merge moats. Note that for every demand there are at most two active moats, so the total number of active variables is at most $2\binom{n}{2}$. Thus each iteration can be done in time $\tilde{O}(n^4)$, where the dominant term is the time taken to calculate the value of each of the $O(n^2)$ dual constraints. So the total time is $\tilde{O}(n^6)$, where the extra polylogarithmic terms are due to data structure overhead. ∎

Now we can trade this off with the random sampling solution for large safe sets to get an actual approximation algorithm:

Theorem 3.8.

There is an $\tilde{O}(n^{2/3})$-approximation algorithm for the Constrained Connectivity-Sum problem on $K_n$ that runs in time $\tilde{O}(n^6)$.

Proof.

Our algorithm first runs the primal-dual algorithm with $k = n^{1/3}$. By Theorem 3.6, this returns a graph with at most $(n^{2/3}/4) \cdot OPT$ edges in which there is a safe path for every pair $(x,y)$ with $|S(x,y)| \le n^{1/3}$. We then use the random star sampling of Lemma 3.2 with $k = n^{1/3}$, and thus $p = (3\ln n)/n^{1/3}$. By Lemma 3.2 this satisfies the rest of the demands (the pairs with $|S(x,y)| > n^{1/3}$) with high probability, and the number of edges added is with high probability at most $O(n^{5/3}\ln n) \le O(n^{2/3}\ln n) \cdot OPT$ (using again that $OPT \ge n-1$), as desired.

The time bound follows from Lemma 3.7 together with the trivial fact that star sampling can be done in $O(n^2)$ time. ∎
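A hypothetical driver combining the two phases, using the primal_dual sketch from above, might look as follows (the threshold and constants mirror the analysis but are our illustrative choices).

import math, random

# Sketch (ours) of the Theorem 3.8 algorithm: primal-dual on the small
# safe sets (threshold k = n^{1/3}), then star sampling for the rest.

def constrained_connectivity_sum(n, safe):
    k = max(1, round(n ** (1 / 3)))
    small = {p: S for p, S in safe.items() if len(S) <= k}
    H = primal_dual(small)
    p_star = min(1.0, 3 * math.log(n) / k)
    for v in range(n):  # star sampling, as in Lemma 3.2
        if random.random() < p_star:
            H.update(frozenset((v, u)) for u in range(n) if u != v)
    return H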

3.3 Simulations

In this section we discuss some of the results of simulations using our algorithms. While we believe that the main contribution of this work is theoretical, it is interesting that the algorithms are fast enough to be practical and give solutions that are in practice far superior to the worst-case bounds.

We implemented both the LP rounding algorithm and the primal-dual algorithm for the iBGP-Sum problem. However, the rounding algorithm turned out to be impractical, mainly due to memory constraints. Recall that in the compact version of the flow LP there is a flow variable $f^{xy}_{uv}$ for every pair $(x,y)$ and every edge $\{u,v\}$, denoting the amount of flow from $u$ to $v$ along the edge $\{u,v\}$ for the demand $(x,y)$; there are also the associated capacity constraints. So on even a modest-size AS topology, say one with $50$ nodes, the linear program has over six million variables and constraints. Running on a commodity desktop, the memory used by CPLEX merely to create and store this LP results in an extremely large running time, even without attempting to solve it. Our primal-dual algorithm, on the other hand, only needs to keep track of the active dual variables and the current values of the dual constraints. So we can actually run this algorithm on reasonably sized graphs.

One change that we make from the theoretical algorithm is the tradeoff with random sampling. In the theoretical analysis we are only able to get a nontrivial approximation bound by using the primal-dual algorithm to handle small safe sets and random sampling to handle large safe sets, but experimentation revealed that the simpler algorithm of using the primal-dual technique to handle all safe sets was sufficient.

AS     Name      Number of PoPs   Number of links
1221   Telstra         44                88
1239   Sprint          52               168
2914   NTT             70               222
3257   TINET           41               174
3356   Level 3         63               570

Table 1: ISP Topologies Used

To test out this algorithm we ran it on five real-world ISP topologies with link weights given by the Rocketfuel project [16]. Our implementation is still relatively slow, so we consider Point-of-Presence level topologies rather than router-level topologies. We feel that this is not unrealistic, though, since in practice the routers at a given PoP would probably just use a single router at that PoP as a route reflector [15, Section 3.1]. The topologies we used are summarized in Table 1.

We compare the number of iBGP sessions used by a full mesh to the number of edges in the overlay produced by the primal-dual algorithm. We assume (conservatively) that all the nodes in the topology are external BGP routers. Our results are shown in Table 2 and in Figure 1. These results show that the primal-dual algorithm gives graphs that are much smaller than the default full mesh. Of course, we do not model additional requirements such as fault tolerance and stability, but the massive gap suggests that even if adding extra requirements results in doubling or tripling the size of the overlay, we will still see a large benefit over the full mesh. Moreover, these results show that the upper bound on the approximation ratio that we proved in Section 3.2 is extremely pessimistic. On these actual topologies the primal-dual algorithm gives overlays with only slightly more than $n$ edges (the worst case is for Level 3, in which the primal-dual algorithm gives an overlay with about $2.75n$ edges). Since $n-1$ is an obvious lower bound (the overlays clearly must be connected), this means that in practice our algorithm gives roughly a $3$-approximation or better.

AS     Full mesh   Primal-Dual   Fraction of full mesh
1221       946          44              4.65%
1239      1326          83              6.26%
2914      2415         109              4.5%
3257       820          75              9.15%
3356      1953         173              8.86%

Table 2: Primal-Dual vs. full-mesh
Figure 1: Primal-Dual vs. full-mesh

4 Complexity of iBGP-Sum and iBGP-Degree

In this section we will show that the iBGP problems are $\Omega(\log n)$-hard to approximate by a reduction from Hitting Set (or equivalently from Set Cover). This is a much weaker hardness than the $2^{\log^{1-\epsilon} n}$ hardness that we prove for the general Constrained Connectivity problems in Section 5, but the iBGP problems are much more restrictive. We note that this hardness is easy to prove for Constrained Connectivity on $K_n$; the main difficulty is constructing a metric so that the geometrically defined safe sets of iBGP have the structure that we want.

We begin by giving a useful gadget that encodes a Hitting Set instance as an instance of an iBGP problem in which all we care about is minimizing the degree of a particular vertex. We will then show how a simple combination of these gadgets can be used to prove that iBGP-Degree is hard to approximate, and how more complicated modifications to the gadget can be used to prove that iBGP-Sum is hard to approximate.

Suppose we are given an instance of Hitting Set with elements $1, \dots, n$ (note that we are overloading these as both integers and elements) and sets $T_1, \dots, T_m$. Our gadget will contain a node $r$ whose degree we want to minimize, a node $s_i$ for each element $i$, and a node $h_j$ for each set $T_j$ in the instance. We will also have four extra "dummy" nodes: $x$, $y$, $z$, and $u$. The following table specifies some of the distances between points; all other distances are the shortest-path distances given these. Let $M$ be some large value, and let $\epsilon > 0$ be some extremely small value.

[Distance table over $x$, $z$, $y$, $u$, and the $s_i$ and $h_j$ nodes; the entries are expressed in terms of $M$ and $\epsilon$, with two rows conditional on whether $i \in T_j$. The precise entries have not survived in this version.]

It is easy to check that this is indeed a metric space. Informally, we want to claim that any solution to the iBGP problems on this instance must have edges from $r$ to nodes such that the associated elements form a hitting set. Here $x$, $z$, and $u$ are nodes that force the safe sets into the form we want, and $y$ is used to guarantee the existence of a small solution.

Lemma 4.1.

Let $H$ be any feasible solution to the above iBGP instance. For every vertex $h_j$ there is either an edge $\{r, h_j\}$ or an edge $\{r, s_i\}$ where $i \in T_j$.

Proof.

We will prove this by analyzing the safe set $S(r, h_j)$. If we can show that $S(r, h_j) \subseteq \{r, h_j\} \cup \{s_i : i \in T_j\}$ then we will be finished, since the first edge of any safe signaling path between $r$ and $h_j$ must then go from $r$ to either $h_j$ or to some $s_i$ with $i \in T_j$. The distances are chosen so that the vertices outside the ball around $r$ of radius $d(r, h_j)$ are $x$, $z$, $u$, and the $s_i$ with $i \notin T_j$, while the vertices inside the ball are $r$, $y$, all of the $h$ nodes, and the $s_i$ with $i \in T_j$.

Obviously $r$ and $h_j$ are in $S(r, h_j)$ by definition. Let $s_i$ be a vertex with $i \in T_j$. It is easy to verify from the distance table that $s_i$ is closer to $h_j$ than to any vertex outside of the ball, so $s_i \in S(r, h_j)$ as required. On the other hand, a vertex $s_i$ with $i \notin T_j$ lies outside the ball, any $h_{j'}$ with $j' \neq j$ is closer to some vertex outside the ball than to $h_j$, and the same holds for $y$. Thus $S(r, h_j) = \{r, h_j\} \cup \{s_i : i \in T_j\}$, so $H$ must include an edge from $r$ to either $h_j$ or to an $s_i$ with $i \in T_j$. ∎

We now want to use this gadget to prove logarithmic hardness for iBGP-Sum. We will use the basic gadget but will duplicate $r$. So there will be $n$ copies of $r$, which we will call $r_1, \dots, r_n$, and their distances are defined by adding a distinct tiny perturbation to each copy's distances, with all other distances defined to be the shortest path. Note that all we did was modify the gadget to "break ties" between the $r_a$'s. Also note that the shortest path between $r_a$ and $r_b$ is through $y$, for a total distance of $d(r_a, y) + d(y, r_b)$. As before, let $OPT$ denote the smallest hitting set.

Lemma 4.2.

Any feasible iBGP-Sum solution has at least $n \cdot |OPT|$ edges.

Proof.

It is easy to see that Lemma 4.1 still holds for each copy, i.e. that $S(r_a, h_j) = \{r_a, h_j\} \cup \{s_i : i \in T_j\}$. Intuitively this is because all other copies are outside of the relevant ball and all distances from $r_a$ to the gadget are the same as before except for the tiny perturbation. This implies that the number of $s$ and $h$ nodes adjacent to each $r_a$ in any feasible solution must be at least $|OPT|$, since if there were fewer such adjacent nodes it would imply the existence of a smaller hitting set (any $h_j$ adjacent to $r_a$ could just be replaced by an arbitrary element of $T_j$ at the same cost as using the set itself). Thus the total number of edges must be at least $n \cdot |OPT|$. ∎

Lemma 4.3.

There is a feasible iBGP-Sum solution with at most $O((n+m)^2) + n(|OPT| + 1)$ edges.

Proof.

The solution is simple: create a clique on all of the nodes other than the $r_a$'s (which obviously has at most $O((n+m)^2)$ edges), include an edge from every $r_a$ to $y$ (another $n$ edges), and include an edge from every $r_a$ to every $s_i$ with $i \in OPT$ (another $n|OPT|$ edges). Obviously there are the right number of edges in this solution, so it remains to prove that it is feasible. To show this we partition the pairs into five types and show that every pair of every type is satisfied. The types are 1) $(r_a, h_j)$, 2) $(r_a, s_i)$, 3) $(r_a, r_b)$, 4) $(r_a, w)$ (where $w$ is any other node in the gadget not included in a previous type), and 5) pairs not involving any $r_a$. This is clearly an exhaustive partitioning, so we can just demonstrate that each type is satisfied in turn.

For the first type we already showed that $S(r_a, h_j)$ includes all $s_i$ with $i \in T_j$. Since $OPT$ is a valid hitting set, $r_a$ must be adjacent to one such $s_i$, which in turn is adjacent to $h_j$, forming a valid safe path. For the second type it is easy to check that $y \in S(r_a, s_i)$, so the path $(r_a, y, s_i)$ in our solution is a valid safe path. For the third type, because of the tie-breaking we introduced, $y \in S(r_a, r_b)$, and so the path $(r_a, y, r_b)$ in our solution is a valid safe path. The fourth type is even simpler, since $w$ must be a dummy node and the shortest path from $r_a$ to any of these is through $y$; so $y \in S(r_a, w)$ and $(r_a, y, w)$ is a valid safe path. Finally, for the last type the two endpoints are adjacent in our solution (they are in the clique), and an edge between two nodes is always a safe path, since both endpoints belong to their own safe set. ∎

Theorem 4.4.

It is NP-hard to approximate iBGP-Sum to a factor better than $\Omega(\log N)$, where $N$ is the number of vertices in the metric.

Proof.

It is known that there is some constant $c$ for which it is NP-hard to distinguish Hitting Set instances with a hitting set of size at most $\gamma$ from instances in which all hitting sets have size at least $c\gamma\ln n$. In the first case we know from Lemma 4.3 that there is a valid iBGP-Sum solution of size at most $O((n+m)^2) + n(\gamma+1)$. In the second case we know from Lemma 4.2 that any valid iBGP-Sum solution must have size at least $c\gamma n\ln n$. For appropriate parameters this gives a gap of $\Omega(\log n)$. The number of vertices $N$ in the iBGP-Sum instance is polynomial in the size of the Hitting Set instance, so $\log N = O(\log n)$, and thus we get $\Omega(\log N)$ hardness of approximation. ∎

It is also fairly simple to modify the basic gadget to prove the same logarithmic hardness for iBGP-Degree. We do this by duplicating everything other than $r$, instead of duplicating $r$. This will force $r$ to have the largest degree.

Theorem 4.5.

It is NP-hard to approximate iBGP-Degree to a factor better than $\Omega(\log N)$, where $N$ is the number of vertices in the metric.

Proof.

We will use multiple copies of the above gadget. Let $t$ be some large integer that we will define later. We create $t$ copies of the gadget but identify all of the $r$ vertices, so there is still a unique $r$ but for every other node $v$ in the original gadget there are now copies $v^1, \dots, v^t$. The distance between two nodes in the same copy is exactly as in the original gadget, and the distance between two nodes in different copies (say $v^i$ and $w^j$) is the distance implied by forcing them to go through $r$ (i.e. $d(v^i, r) + d(r, w^j)$). Call this metric $\mathcal{D}$. Every vertex in copy $i$ is closer to the rest of copy $i$ than to any vertex in copy $j \neq i$, so Lemma 4.1 holds for every copy. Thus if the smallest hitting set is $OPT$, the degree of $r$ in any feasible solution to iBGP-Degree on $\mathcal{D}$ must be at least $t \cdot |OPT|$.

Conversely, we claim that there is a feasible solution to iBGP-Degree in which every vertex other than $r$ has small degree. Consider the solution in which $r$ is adjacent to $y^i$ and to $s^i_j$ for all copies $i$ and all $j \in OPT$, and all nodes (other than $r$) in copy $i$ are adjacent to all other nodes (other than $r$) in copy $i$, for all $i$. By the above analysis we know that this solution satisfies the safe sets of the pairs $(r, h^i_j)$ (via the safe path $(r, s^i_\ell, h^i_j)$ where $\ell$ is an element of $OPT \cap T_j$). It also obviously satisfies pairs of nodes other than $r$ in the same copy, since there is an edge directly between them. It remains to show that the other pairs involving $r$ are satisfied and that pairs involving two different copies are satisfied.

For the first of these we will show that $y^i$ is in all safe sets of the form $S(r, v^i)$ where $v^i$ is not an $h$ node. This is easy to verify exhaustively, so the path $(r, y^i, v^i)$ satisfies these pairs. For pairs in two different copies, note that $r$ and the $y$ nodes of both copies are in every safe set of the form $S(v^i, w^j)$ with $i \neq j$, since all vertices outside the relevant ball are in different copies and the shortest path from $v^i$ to any node in a different copy must go through $r$. Thus the path $(v^i, y^i, r, y^j, w^j)$ in our solution satisfies these pairs as well.

Now by setting $t$ appropriately we are finished. Each copy has $O(n+m)$ nodes, so in the feasible solution we have constructed the degree of any node other than $r$ is at most $O(n+m)$. If we set $t$ to some value larger than this, say $t = (n+m)^2$, we know that the maximum degree is attained at $r$. It is known that it is NP-hard to distinguish between Hitting Set instances with hitting sets of size at most $\gamma$ and those in which every hitting set has size at least $c\gamma\ln n$, for some constant $c$. Suppose that we are in the first case, where there is a hitting set of size at most $\gamma$. Then we constructed a feasible solution to the iBGP-Degree problem with maximum degree at most $t(\gamma + 1)$. In the second case, where every hitting set has size at least $c\gamma\ln n$, we showed that the degree of $r$ (and thus the maximum degree) must be at least $ct\gamma\ln n$. This gives a gap of $\Omega(\log n)$. Since the number of vertices in the iBGP-Degree instance is polynomial in $n$, $m$, and $t$, this implies $\Omega(\log N)$-hardness. ∎

5 Constrained Connectivity

In this section we consider the hardness of the Constrained Connectivity problems and the integrality gaps of the natural LP relaxations.

5.1 Hardness

We now show that the Constrained Connectivity-Sum and Constrained Connectivity-Degree problems are both hard to approximate to a factor better than $2^{\log^{1-\epsilon} n}$ for any constant $\epsilon > 0$. We do this via a reduction from Min-Rep, a problem that is known to be impossible to approximate to a factor better than $2^{\log^{1-\epsilon} n}$ unless $NP \subseteq DTIME(n^{\mathrm{polylog}(n)})$ [13]. An instance of Min-Rep is a bipartite graph $G = (A, B, E)$ in which $A$ is partitioned into groups $A_1, \dots, A_{\alpha}$ and $B$ is partitioned into groups $B_1, \dots, B_{\beta}$. There is a super-edge between $A_i$ and $B_j$ if there is an edge $\{a,b\} \in E$ such that $a \in A_i$ and $b \in B_j$. The goal is to find a minimum set of vertices $S \subseteq A \cup B$ such that for all super-edges $(A_i, B_j)$ there is some edge $\{a,b\} \in E$ with $a \in S \cap A_i$ and $b \in S \cap B_j$. Vertices from a group that are in $S$ are called the representatives of the group. It is easy to prove by a reduction from Label Cover that Min-Rep is hard to approximate to a factor better than $2^{\log^{1-\epsilon} n}$, and in particular it is hard to distinguish the case when $\alpha + \beta$ vertices are enough (one from each part in the partition for each side of the graph) from the case when many more vertices are necessary [13].

Given an instance of Min-Rep, we want to convert it into an instance of Constrained Connectivity-Sum. We will create a graph with five types of vertices: nodes $x_i^q$ for $i \in [\alpha]$ and $q \in [t]$; the vertices of $A$; the vertices of $B$; nodes $y_j^q$ for $j \in [\beta]$ and $q \in [t]$; and a node $z$. Here the $x_i^q$ nodes represent copies of the groups of $A$ and the $y_j^q$ nodes represent copies of the groups of $B$, where $t$ is some parameter that we will define later. $z$ is a dummy node that we will use to connect pairs that are not crucial to the analysis. Given this vertex set, there will be four types of edges: $\{x_i^q, a\}$ for all $a \in A_i$, $i \in [\alpha]$, and $q \in [t]$; $\{a,b\}$ for all edges $\{a,b\}$ in the original Min-Rep instance; $\{y_j^q, b\}$ for all $b \in B_j$, $j \in [\beta]$, and $q \in [t]$; and $\{z, v\}$ for all vertices $v$.

Figure 2: Basic hardness construction.

This construction is shown in Figure 2, except that in the actual construction there are $t$ copies of each node in the top and bottom layers and there is a node $z$ that is adjacent to all other nodes. In Figure 2 the middle two layers are identical to the original Min-Rep problem, and the large ellipses represent the groups. In the figure we have simply added a new vertex for each group, while in the construction there are $t$ such new vertices per group as well as the vertex $z$.

Now that we have described the constrained connectivity graph, we need to define the safe sets. There are two types of safe sets: if in the original instance there is a super-edge between $A_i$ and $B_j$, then $S(x_i^q, y_j^{q'}) = \{x_i^q, y_j^{q'}\} \cup A_i \cup B_j$ for all $q, q' \in [t]$. All other safe sets consist of the two endpoints and $z$. Let $K$ denote the number of super-edges in the Min-Rep instance, and let $N$ denote the number of vertices in the construction.

The following theorem shows that this reduction works. The intuition behind it is that a safe path between an $x_i^q$ node and a $y_j^{q'}$ node corresponds to using the intermediate nodes in the path as the representatives of the groups corresponding to the $x$ and $y$ nodes, so minimizing the number of representatives is like minimizing the number of edges incident on the $x$ and $y$ nodes.

Theorem 5.1.

The original Min-Rep instance has a solution of size at most $s$ if and only if there is a solution to the reduced Constrained Connectivity problem of size at most $(N-1) + ts + K$.

Proof.

We first prove the only if direction by showing that if there is a Min-Rep solution of size $s$ then there is a Constrained Connectivity solution of size $(N-1) + ts + K$. Let $S$ be the set of vertices in a Min-Rep solution of size $s$. Our constrained connectivity solution includes all edges of type $\{z, v\}$, i.e. we include a star centered at $z$. For each $i$ and $q$ we also include all edges of the form $\{x_i^q, a\}$ where $a \in S \cap A_i$, and for each $j$ and $q$ all edges of the form $\{y_j^q, b\}$ where $b \in S \cap B_j$. Finally, for each super-edge in the Min-Rep instance we include the edge between a pair from $S$ that satisfies it (if there is more than one such pair we choose one arbitrarily). The star clearly has $N-1$ edges, there are at most $ts$ edges from $x$ and $y$ nodes to nodes in $S$, and there are clearly $K$ edges of the third type, so the total number of edges in our solution is at most $(N-1) + ts + K$ as required. To prove that it is a valid solution, we first note that all pairs except those of the form $(x_i^q, y_j^{q'})$ where $(A_i, B_j)$ is a super-edge are satisfied via the star centered at $z$. For pairs $x_i^q$ and $y_j^{q'}$ with an associated super-edge, since $S$ is a valid solution there must be some $a \in S \cap A_i$ and $b \in S \cap B_j$ that have an edge between them, and the above solution would include that edge as well as the edges from $x_i^q$ to $a$ and from $y_j^{q'}$ to $b$, thus forming a safe path of length $3$.

For the if direction we need to show that if there is a Constrained Connectivity solution of size then there is a Min-Rep solution of size at most . Let be a constrained connectivity solution with edges. Since for all vertices , of those edges must be a star centered at , so only edges are between other vertices. Obviously there need to be at least