On Pairwise Spanners

Footnote 1: Partially supported by the ERC Starting Grant NEWNET 279352. This work was partially done while the third author was visiting IDSIA.
Given an undirected -node unweighted graph , a spanner with stretch function is a subgraph such that, if two nodes are at distance in , then they are at distance at most in . Spanners are very well studied in the literature. The typical goal is to construct the sparsest possible spanner for a given stretch function.
In this paper we study pairwise spanners, where we are required to approximate the - distance only for pairs in a given set . Such -spanners were studied before [Coppersmith, Elkin ’05] only in the special case that is the identity function, i.e. distances between relevant pairs must be preserved exactly (a.k.a. pairwise preservers).
Here we present pairwise spanners which are at the same time sparser than the best known preservers (on the same ) and than the best known spanners (with the same ). In more detail, for arbitrary , we show that there exists a -spanner of size with . Alternatively, for any , there exists a -spanner of size with . We also consider the relevant special case that there is a critical set of nodes , and we wish to approximate either the distances between nodes in or from nodes in to any other node. We show that there exists an -spanner of size with , and an -spanner of size with . All the mentioned pairwise spanners can be constructed in polynomial time.
Let be an undirected unweighted graph. A subgraph of is a spanner with stretch function if, given any two nodes at distance in , the distance between the same two nodes in is at most . An spanner is a spanner with stretch functions . ( and are the multiplicative stretch and additive stretch of the spanner, respectively). If the spanner is called multiplicative, and if the spanner is called purely-additive. Spanners are very well studied in the literature (see Section 1.2). The typical goal is to achieve the sparsest possible spanner for a given stretch function [4, 5, 11, 12, 13, 15, 17, 19, 20, 22].
In this paper we address the natural problem of finding (even sparser) spanners in the case that we want to approximately preserve distances only among a given subset of pairs. More formally a pairwise spanner on pairs , or -spanner for short, with stretch function is a subgraph such that, for any , . In particular, a classical (all-pairs) spanner is a -spanner. Pairwise spanners capture scenarios where we only (or mostly) care about some distances in the graph.
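The definition of a pairwise spanner can be checked mechanically on small instances. The following is a minimal sketch, assuming graphs are given as adjacency dictionaries and the stretch function is passed as a Python callable; the names `bfs_dist` and `is_pairwise_spanner` are illustrative and do not come from the paper.

```python
from collections import deque


def bfs_dist(adj, source):
    """Return hop distances from source in an unweighted graph."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj.get(u, ()):
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


def is_pairwise_spanner(g_adj, h_adj, pairs, stretch):
    """Check that subgraph h respects the stretch function on every pair."""
    for s, t in pairs:
        d_g = bfs_dist(g_adj, s).get(t)
        if d_g is None:
            # Assumption of this sketch: pairs disconnected in g impose
            # no constraint.
            continue
        d_h = bfs_dist(h_adj, s).get(t)
        if d_h is None or d_h > stretch(d_g):
            return False
    return True


# 4-cycle 0-1-2-3-0; the subgraph h drops the edge {0, 3}.
g = {0: [1, 3], 1: [0, 2], 2: [1, 3], 3: [2, 0]}
h = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
```

On this example, `h` preserves the pair (0, 2) exactly, while the pair (0, 3) is only preserved up to an additive error of 2.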
To the best of our knowledge, pairwise spanners were studied before only in the special case that is the identity function, i.e. distances between relevant pairs have to be preserved exactly. Coppersmith and Elkin  call such spanners pairwise (distance) preservers, and show that one can compute pairwise preservers of size (i.e., number of edges) .
The authors left it as an open problem to study the approximate variants of these preservers, i.e. what we call pairwise spanners here. This paper takes the first step in answering this question. We show that (for suitable ) it is possible to achieve -spanners which are at the same time sparser than the preservers in  (on the same set ) and than the sparsest known classical spanners (with the same stretch function).
1.1 Our Results and Techniques
In this paper we present some polynomial-time algorithms to construct -spanners for unweighted graphs. Our spanners are either purely-additive (i.e. ) or near-additive (i.e. for an arbitrarily small ). For arbitrary , we achieve the following main results (see Section 5).
Theorem 1.1 (near-additive pairwise). For any and any , there is a polynomial time algorithm to compute a -spanner of size .
Theorem 1.2. For any integer and any , there is a polynomial time algorithm to compute a -spanner of size
We also consider the relevant special case that all the pairs involve at least one node from a critical set . More precisely, we distinguish two types of such pairwise spanners: in subsetwise spanners (see Section 3) we wish to approximate distances between nodes in , i.e. ; in sourcewise spanners (see Section 4) we wish to approximate distances from nodes in , i.e. . We obtain the following improved results for the mentioned cases.
Theorem 1.3 (subsetwise). For any , there is a polynomial time algorithm to compute a -spanner of size .
Theorem 1.4 (sourcewise). For any and any integer , there is a polynomial time algorithm to compute a -spanner of size .
In particular, by choosing , we obtain a sourcewise spanner of size , and a pairwise spanner of size .
All our spanners rely on a path-buying strategy which was first exploited in the spanner by Baswana et al. The high-level idea is as follows. There is an initial clustering phase, where we compute a suitable clustering of the nodes, and an associated subset of edges which are added to the spanner. Then there is a path-buying phase, where we consider an appropriate sequence of paths, and decide whether or not to add each path to the spanner under construction (footnote 2: in the spanner from Theorem 1.1 there is also a final step where we add a multiplicative -spanner). In particular, each path has a cost which is given by the number of edges of the path not already contained in the spanner, and a value which measures how much the path helps to satisfy the considered set of constraints on pairwise distances. If the value is sufficiently larger than the cost, we add the considered path to the spanner, otherwise we do not.
In more detail, all our pairwise spanners exploit the same clustering phase. We compute a partition of a subset of the nodes, and call unclustered the remaining nodes . The initial value of the spanner is , where contains all the edges of but possibly a subset of the inter-cluster edges (with endpoints in two different clusters). The common clustering phase is described in Section 2.
During the path-buying phase we add to the spanner some extra inter-cluster edges. Here we need to finely tune the sequence of paths that we consider, and also the definition of value of a path. In our subsetwise and sourcewise spanners the value of a path reflects the number of pairs , where is the endpoint of some pair and is a cluster, such that adding to the current spanner decreases the distance between and (the closest node in) . In the remaining pairwise spanners, we use a similar notion of value, but considering the distance between pairs of clusters .
The sequence of paths used in our subsetwise spanner and near-additive pairwise spanner is simply given by the shortest paths among the relevant pairs. This naturally generalizes the set of paths considered in . However, for the sourcewise spanner and the purely-additive pairwise spanner we need to consider a carefully constructed sequence of paths, which includes slightly suboptimal paths. In more detail, we start with the set of shortest paths between the relevant pairs. Then, for each such path , if the cost of is sufficiently smaller than its value, we include in the spanner. Otherwise, we replace with a slightly longer path between the same endpoints which is much cheaper, and iterate the process on . After a small number of iterations, the considered path becomes cheap enough and hence we include it in the spanner. This (non-trivial) iterative construction of candidate paths during the path-buying phase is probably the main algorithmic contribution of this paper.
1.2 Related Work
Graph spanners were introduced by Peleg and Schaffer  in 1989. Spanners have been extensively studied since then, and there are numerous applications involving spanners, such as algorithms for approximate shortest paths [1, 7, 12], labeling schemes [16, 14], approximate distance oracles [21, 6, 3], routing [2, 9, 10], and network design .
There are several algorithms for computing multiplicative and additive spanners in weighted and unweighted graphs. In unweighted graphs, for any integer , Halperin and Zwick  gave a linear time algorithm to compute a multiplicative -spanner of size , where is the number of vertices. Note that for one obtains a spanner with multiplicative stretch and with size : we will use this type of spanner in Theorem 1.1. Analogous results are also known for weighted graphs [5, 20, 19].
The first purely-additive spanner (for unweighted graphs) is due to Dor et al. . They describe a spanner of size . This was subsequently improved to . Note that our subsetwise spanner from Theorem 1.3 generalizes this result: in particular, it has the same stretch function and is sparser whenever . Baswana et al.  describe a -spanner of size . Whenever for some constant , we achieve an asymptotically sparser pairwise spanner with constant additive stretch (depending on ). The same holds for our sourcewise spanner if .
The result in  shows an elegant trade-off between the size of the spanner and its multiplicative stretch. No such trade-off is known for purely-additive spanners. In particular, the spanner in  is the sparsest known purely-additive spanner. Theorems 1.2 and 1.4 show a non-trivial trade-off between the size and additive stretch of pairwise spanners.
There have also been several results on near-additive spanners [13, 12, 22]. For example, there is a -spanner of size for any . Our pairwise spanner from Theorem 1.1 has the same stretch function, and is sparser for .
A clustering of a graph is a collection of pairwise disjoint subsets of nodes . Note that we do not require to span all the nodes : we call unclustered the nodes .
We will crucially exploit the following construction of a clustering and of an associated cluster subgraph .
Lemma 2.1. There is a polynomial time algorithm which, given and a graph , computes a clustering with at most clusters and a subgraph of size such that:
1. (missing-edge property) If an edge is absent in , then and belong to two different clusters.
2. (cluster-diameter property) The distance in between any two vertices of the same cluster is at most .
Let be the set of nodes which are not yet clustered (initially, all the nodes are unclustered). As long as there exists a vertex with at least neighbors in , let contain exactly arbitrary neighbors of in . Add to , set and add to all the edges of with both endpoints in . When no node satisfies the mentioned property, we stop creating new clusters and add to all the edges incident to the final set of unclustered nodes .
By construction, clusters are pairwise disjoint. Each time we create a new cluster, the size of decreases by at least , hence there cannot be more than clusters. Any two nodes in the same cluster have some common neighbor in , hence Property 2 is satisfied. By construction, all the edges incident to unclustered nodes plus the intra-cluster edges (with both endpoints in the same cluster) belong to , which implies Property 1.
It remains to bound the number of edges of . Each time we create a new cluster, the number of edges of grows by at most : this gives edges altogether. When we stop creating clusters, each (clustered or unclustered) node has at most neighbors in : consequently the number of edges incident to unclustered nodes that we add at the end of the procedure is at most . ∎
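The clustering procedure can be sketched in a few lines. This is only an illustration under stated assumptions: the parameter `threshold` is a hypothetical name for the neighbor count that triggers a new cluster (its exact value is not recoverable from this copy), and the sketch keeps the edges from the chosen vertex to its new cluster so that any two cluster members share a neighbor, as the proof requires.

```python
def cluster_graph(adj, threshold):
    """Greedy clustering sketch.

    adj: dict mapping each node to the set of its neighbours.
    threshold: hypothetical name for the number of unclustered
        neighbours that triggers a new cluster.
    Returns (clusters, kept_edges), with edges stored as frozensets.
    """
    if threshold < 1:
        raise ValueError("threshold must be at least 1")
    unclustered = set(adj)
    clusters = []
    kept = set()
    while True:
        # Look for a vertex with enough unclustered neighbours.
        center = next(
            (v for v in adj if len(adj[v] & unclustered) >= threshold),
            None,
        )
        if center is None:
            break
        members = set(sorted(adj[center] & unclustered)[:threshold])
        clusters.append(members)
        unclustered -= members
        for u in members:
            # Assumption of this sketch: center-to-member edges are kept.
            kept.add(frozenset((center, u)))
            # Intra-cluster edges are always kept.
            for w in adj[u] & members:
                kept.add(frozenset((u, w)))
    # Finally keep every edge incident to a node that stayed unclustered.
    for u in unclustered:
        for w in adj[u]:
            kept.add(frozenset((u, w)))
    return clusters, kept
```

Every edge dropped by this sketch has its endpoints in two different clusters, which is the missing-edge property; the sketch assumes node labels are sortable so that the choice of cluster members is deterministic.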
The following technical lemma turns out to be useful in the remaining sections.
Lemma 2.2. Let and be constructed with the procedure from Lemma 2.1 w.r.t. a given graph . If the shortest path in between any two nodes contains edges that are absent in , then there are at least clusters of having at least one vertex on .
We prove the lemma by counting pairs , where is an edge of absent in and is one of the endpoints of : let be the set of such pairs. Since contains edges that are absent in there are exactly pairs in (each edge belongs to two pairs: and ). We say that a cluster owns a pair if . By the missing-edge property, each edge of absent in has both endpoints clustered, hence each pair of is owned by some cluster.
Let us assume that there are clusters of having at least one vertex on . By the cluster-diameter property any cluster contains at most vertices on , since otherwise would not be a shortest path between and . However, if a cluster contains exactly vertices on , those have to be consecutive vertices of , since is a shortest path and we know by the cluster-diameter property that there is a path of length at most 2 between every pair in . By the missing-edge property both edges and are present in , and consequently owns at most two pairs of . Clearly if a cluster contains at most vertices on , then it owns at most pairs of . Therefore each cluster owns at most pairs of : since has pairs we have . ∎
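The constants of the counting argument are not legible in this copy. One reading consistent with the proof is summarized below; the symbols $t$ (number of absent edges on the path), $x$ (number of clusters meeting the path) and $F$ (the set of pairs) are names assumed here for illustration, and the bound of four pairs per cluster comes from a cluster with at most two vertices on the path, each incident to at most two path edges.

```latex
\[
  |F| = 2t,
  \qquad
  \text{each cluster owns at most } 4 \text{ pairs of } F
  \;\Longrightarrow\;
  4x \ge 2t
  \;\Longrightarrow\;
  x \ge \frac{t}{2}.
\]
```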
3 Subsetwise Spanners
In this section we present our algorithm to compute a subsetwise spanner, and prove Theorem 1.3.
Our algorithm consists of two main phases: a clustering phase and a path-buying phase. In the clustering phase we invoke Lemma 2.1 and obtain a cluster subgraph of of size , together with a set of at most clusters. The value of will be defined later.
In the path-buying phase we proceed as follows. Initially set and let , denote the set of shortest paths between all pairs of vertices in . We let denote the endpoints of . Next, we iterate over the paths for . To determine which paths are affordable, we define the functions and :
let be the number of edges of that are absent in
let be the number of pairs , where and is a cluster, such that contains at least one vertex of and the distance between and in the graph is strictly greater than the distance between and in , i.e., .
Our path-buying strategy is as follows. If
then we buy the path , that is we set (in words, is given by plus the edges of not in ). Otherwise (i.e., ), we do not buy and set . The subsetwise spanner is given by .
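The path-buying phase can be sketched as follows. This is a hedged illustration, not the authors' implementation: the buying inequality is not legible in this copy, so the sketch uses a hypothetical parameter `factor` and buys a path when `cost <= factor * value`. It also assumes that the second distance in the definition of the value is measured along the path itself. All function names are illustrative.

```python
from collections import deque
from itertools import combinations


def bfs(adj, source):
    """Return (dist, parent) dictionaries for a BFS from source."""
    dist, parent = {source: 0}, {source: None}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj.get(u, ()):
            if v not in dist:
                dist[v], parent[v] = dist[u] + 1, u
                queue.append(v)
    return dist, parent


def shortest_path(adj, s, t):
    """Return one shortest s-t path as a list of nodes, or None."""
    dist, parent = bfs(adj, s)
    if t not in dist:
        return None
    path = [t]
    while path[-1] != s:
        path.append(parent[path[-1]])
    return path[::-1]


def to_adj(edges):
    """Build an adjacency dictionary from a set of frozenset edges."""
    adj = {}
    for e in edges:
        u, v = tuple(e)
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def buy_paths(g_adj, start_edges, clusters, subset, factor=2):
    """Path-buying sketch; factor is a hypothetical threshold."""
    bought = set(start_edges)
    for s, t in combinations(sorted(subset), 2):
        path = shortest_path(g_adj, s, t)
        if path is None:
            continue
        path_edges = [frozenset(e) for e in zip(path, path[1:])]
        cost = sum(e not in bought for e in path_edges)
        h_adj = to_adj(bought)
        value = 0
        for x, order in ((s, path), (t, path[::-1])):
            dist_h, _ = bfs(h_adj, x)
            for c in clusters:
                on_path = [i for i, v in enumerate(order) if v in c]
                if not on_path:
                    continue
                d_path = on_path[0]  # distance from x to c along the path
                d_h = min((dist_h[v] for v in c if v in dist_h),
                          default=float("inf"))
                if d_h > d_path:
                    value += 1
        if cost <= factor * value:
            bought.update(path_edges)
    return bought
```

The sketch recomputes distances from scratch for every path, which keeps it short but is far from efficient; it is meant only to make the definitions of cost and value concrete.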
The next two lemmas bound the stretch and the size of the constructed spanner , respectively.
For any , .
Clearly the claim holds if our algorithm bought the path , hence we assume . Let , that is there are exactly edges of which are not present in the graph . By Lemma 2.2 there are at least clusters having at least one vertex on . If there is no cluster among them such that and , then all these clusters would contribute to (either with or with or both) which leads to a contradiction, because .
Thus there is a cluster having a vertex of such that and . This implies:
where the first inequality is because is a subgraph of , the second inequality holds since any two vertices of are at distance at most two in (by the cluster-diameter property) and the last inequality follows from the assumption that contains a vertex of . ∎
For such that the graph contains edges.
The clustering phase produces a graph with edges. Let be the set of paths bought in the path-buying phase. The total number of edges that appear in and do not appear in is equal to , which is upper bounded by . Observe that after the first contribution of a pair to the above sum, the distance between and is at most , hence each pair can contribute to the sum at most times. Therefore the total number of edges added in the second phase of our algorithm is upper bounded by . ∎
4 Sourcewise Spanners
In this section we present our algorithm to compute a sourcewise spanner from sources , and prove Theorem 1.4.
Our algorithm again consists of two phases, where the first is a clustering phase and the second is a path-buying phase. The clustering phase is as in the algorithm from the previous section, for a proper value of to be defined later. Let and be the resulting clustering and cluster subgraph.
At the start of the second phase we set and define as the set of shortest paths between any two vertices of such that at least one of them belongs to . Let us assume that the path is a shortest path between and . Next, we iterate over paths for . For a given we are going to define paths , where , maintaining the following invariants:
(i) is a path between and in of length at most ,
(ii) any cluster contains at most three vertices of ,
(iii) , where is the number of edges of absent in , and .
Our algorithm will buy exactly one path for , which will ensure (by Invariant (i)) that in , the distance between and is at most .
We set . Observe that for , Invariant (i) is trivially satisfied, Invariant (ii) is satisfied by the cluster-diameter property (otherwise would not be a shortest path), and Invariant (iii) is satisfied because there are at most clusters in and consequently by Lemma 2.2 .
Say we have constructed , where . Let us define the function as the number of clusters such that contains a vertex of and the distance between and in is strictly greater than the distance between and in , i.e., . Now we check the condition
If that is the case, then we buy the path . That is, is set to . We ignore the remaining values of and proceed with the next value of . Else we construct as follows:
Let be the longest suffix of containing exactly edges that are absent in . Observe that the first node of is clustered: by the maximality of , the edge of preceding is absent in , and hence both the endpoints of (one of which is the first node of ) are clustered by the missing-edge property of . Consequently at least vertices of are clustered, as contains edges absent in and the endpoints of these edges are clustered.
By Invariant (ii) there are at least clusters in having at least one vertex of . Since we did not buy , there exists a cluster containing a vertex of such that the distance between and in is at most the distance between and in . We construct the path by taking a shortest path in from to the closest node , then we add a path of length at most two between and (which exists in hence in by the cluster-diameter property), and finally add the suffix of starting at (see Fig. 1).
Let us show that maintains the invariants. Note that by construction, Invariant (i) is satisfied, since the length of is at most the length of plus . Then, as long as there is a cluster containing at least four vertices on , we let , be the vertices of closest to and respectively. Note that there are at least three edges on between and , hence we can replace the subpath of by adding the at most two edges of guaranteed by the cluster-diameter property. Consequently, Invariant (ii) is satisfied. Moreover, by the choice of , Invariant (iii) is also satisfied. This finishes the construction of .
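The repair step that restores Invariant (ii) can be illustrated by the following minimal sketch. It assumes the cluster subgraph is given as an adjacency dictionary of sets and that, by the cluster-diameter property, two vertices of the same cluster are either adjacent in it or share a neighbor; the function name `shortcut_clusters` is hypothetical.

```python
def shortcut_clusters(path, clusters, gc_adj):
    """Shortcut the path while some cluster has >= 4 vertices on it.

    path: list of nodes; clusters: list of sets of nodes;
    gc_adj: adjacency dictionary (node -> set) of the cluster subgraph.
    """
    path = list(path)
    changed = True
    while changed:
        changed = False
        for c in clusters:
            idx = [i for i, v in enumerate(path) if v in c]
            if len(idx) < 4:
                continue
            first, last = idx[0], idx[-1]
            x, y = path[first], path[last]
            if y in gc_adj.get(x, set()):
                bridge = []  # x and y are adjacent: one edge suffices
            else:
                common = gc_adj.get(x, set()) & gc_adj.get(y, set())
                if not common:
                    # Cluster-diameter property violated; skip this cluster.
                    continue
                bridge = [min(common)]  # two edges through a shared neighbour
            # At least three path edges are replaced by at most two.
            path = path[:first + 1] + bridge + path[last:]
            changed = True
            break
    return path
```

Each replacement strictly shortens the path, so the loop terminates; the result may revisit a vertex, which does not affect the length bound.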
Observe that by Invariant (iii) we have : since has only integral values, it has to be that , which ensures that we buy a path for some .
Finally, as our spanner we take .
For any pair , .
From the above discussion, we buy at least one path for some . By Invariant (i), the length of the latter path is at most the length of the shortest path between and plus . ∎
For such that , the subgraph contains edges.
To bound the size of we recall that in the first phase we have inserted edges. Let be the index of a path bought for a given . We claim that any cluster contributes to of at most bought paths. This holds because when for a supported path is bought the distance between and is at most greater than the distance between and in : otherwise one could shorten by more than , obtaining a contradiction with Invariant (i). Therefore the total number of edges added during the second phase is upper bounded by , since each cluster supports at most bought paths. The claim follows. ∎
5 Pairwise Spanners
In this section we present our pairwise spanners for arbitrary . We start with a near-additive spanner (see Section 5.1) and then present a purely-additive spanner (see Section 5.2). In both cases we let denote the set of pairs, .
5.1 A Near-Additive Pairwise Spanner
Our algorithm to construct the near-additive -spanner from Theorem 1.1 consists of three phases. First, we use Lemma 2.1 with the value of to be determined later, obtaining a cluster subgraph of of size together with a set of at most clusters.
At the start of the second phase we set and consider the set of paths , where is a shortest path between and in . Next we iterate over the paths for . By we denote the number of edges of absent in , and by we denote the number of pairs of clusters , such that both and contain at least one vertex of and . For a given if
then we buy , that is we set . Otherwise we set .
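The value used in this phase counts pairs of clusters rather than pairs of a node and a cluster. A minimal sketch of that count follows, assuming that the distance between two clusters is the smallest distance between any of their nodes and that the comparison is against the distance along the path; the names `cluster_distance` and `pair_value` are illustrative.

```python
from collections import deque
from itertools import combinations


def cluster_distance(adj, c1, c2):
    """Smallest hop distance between a node of c1 and a node of c2."""
    dist = {v: 0 for v in c1}
    queue = deque(c1)  # multi-source BFS started from all of c1
    while queue:
        u = queue.popleft()
        if u in c2:
            return dist[u]
        for v in adj.get(u, ()):
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return float("inf")


def pair_value(h_adj, path, clusters):
    """Count cluster pairs on the path that the path brings closer."""
    positions = []
    for c in clusters:
        idx = [i for i, v in enumerate(path) if v in c]
        if idx:
            positions.append((c, idx))
    value = 0
    for (c1, i1), (c2, i2) in combinations(positions, 2):
        # Distance between the two clusters along the path.
        d_path = min(abs(a - b) for a in i1 for b in i2)
        if cluster_distance(h_adj, c1, c2) > d_path:
            value += 1
    return value
```

Since clusters are pairwise disjoint, the multi-source search returns as soon as it reaches the second cluster, and breadth-first order guarantees that this first hit is at minimum distance.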
In the third phase we add to the multiplicative spanner of size given in : this way we obtain the desired spanner .
In the following two lemmas we bound the stretch and size of , respectively.
For each , .
Clearly we can assume that the path was not bought in the second phase, since otherwise the claim trivially holds. Therefore
Let and let be the set of all indices such that is clustered. Observe that if , then by the missing-edge property the whole path is present in , and hence the claim holds. Therefore denote , where and . Let be two indices, such that , (for some ), and the value of is maximized. Note that such a pair of indices always exists, since we can take .
Let . Observe that any cluster contains at most vertices of , since otherwise by the cluster-diameter property would not be a shortest - path. Therefore there are at least clusters having at least one vertex in the set , and at least clusters having at least one vertex in the set . However, each of the at least pairs of clusters in contributes to since the difference between indices in the corresponding set is at least . Therefore and hence .
The latter bound on is sufficient to prove the claim. In fact, consider the path between and in obtained by concatenating the following paths (as illustrated in Fig. 2):
A shortest path in from to . Note that in the prefix of between and there are clustered nodes and hence at most edges absent in (by the missing-edge property). Since contains the -spanner added in the third phase, each missing edge can be replaced by a path of length . Consequently, there is a path from to of length at most in .
A shortest path in from to . Let be the clusters containing and respectively. We know that in there is a path from to of length at most , which can be extended to a path between and in by adding at most edges (by the cluster-diameter property).
A shortest path in from to . Observe that in the suffix of between and there are at most edges absent in by the same argument as above. Hence, thanks to the -spanner added in the third phase, there is a path from to of length at most in .
The resulting path is of length at most
where the last inequality follows from together with .
For such that the size of is .
The clustering phase gives edges, which matches the desired bound on . Let be the set of paths bought in the path-buying phase. Observe that if a pair of clusters contributes to of a bought path , then when is bought we have , since otherwise the subpath of between and could be shortened (by the cluster-diameter property). It follows that each pair of clusters contributes at most times to , and hence
The total number of edges added in the second phase is therefore upper bounded by
Finally, in the last phase we insert only edges when adding the -spanner. ∎
5.2 A Purely-Additive Pairwise Spanner
Our algorithm consists of the usual clustering phase (for an appropriate parameter ) followed by a path-buying phase that we next describe.
Let and be the clustering and the associated cluster graph. At the beginning of the path-buying phase, we set and consider the set , where is a shortest path between and in . Next we iterate over the paths for . For a given we are going to define paths , where , maintaining the following invariants:
(i) is a path between and in of length at most ,
(ii) any cluster contains at most three vertices of ,
(iii) , where is the number of edges of absent in , and .
Our algorithm will buy exactly one path , which will ensure by Invariant (i) that in the distance between and is at most . By let us denote the number of pairs of clusters , such that both and contain at least one vertex of and .
We set . Observe that for Invariant (i) is trivially satisfied, Invariant (ii) is satisfied by the cluster-diameter property (otherwise would not be a shortest path), and Invariant (iii) is satisfied because there are at most clusters in and consequently by Lemma 2.2, .
Say we have constructed , where . If
then we buy the path , i.e. as we take the union of and , ignore the remaining values of and proceed with the next value of . Otherwise (i.e., ), we construct a path as follows:
Let and let be the set of all indices such that is clustered. Observe that if , then by the missing-edge property the whole path is present in , and hence it is of zero cost, which contradicts the assumption . Therefore denote , where and . Let be two indices, such that , (for some ), and the value of is maximized. Note that such a pair of indices always exists, since we can take .
Let . By Invariant (ii) there are at least clusters having at least one vertex in the set , and at least clusters having at least one vertex in the set . However, each of the at least pairs of clusters in contributes to since the difference between indices in the corresponding set is at least . Therefore
We construct the path by appending the following three paths , , and :
As we take the prefix of from to . Note that this prefix contains clustered nodes and hence at most edges absent in (by the missing-edge property of ).
Let be the clusters containing