# Distributed Construction of Purely Additive Spanners

An extended abstract of this work will be presented in DISC 2016.

Keren Censor-Hillel¹, Telikepalli Kavitha², Ami Paz¹, Amir Yehudayoff³

¹ Department of Computer Science, Technion, Israel. {ckeren,amipaz}@cs.technion.ac.il. Supported by ISF individual research grant 1696/14. Part of this work was done while Ami Paz was visiting TIFR, Mumbai.
² Tata Institute of Fundamental Research, India. kavitha@tcs.tifr.res.in.
³ Department of Mathematics, Technion, Israel. amir.yehudayoff@gmail.com.

###### Abstract

This paper studies the complexity of distributed construction of purely additive spanners in the CONGEST model. We describe algorithms for building such spanners in several cases. Because of the need to simultaneously make decisions at far apart locations, the algorithms use additional mechanisms compared to their sequential counterparts.

We complement our algorithms with a lower bound on the number of rounds required for computing pairwise spanners. The standard reductions from set-disjointness and equality seem unsuitable for this task because no specific edge needs to be removed from the graph. Instead, to obtain our lower bound, we define a new communication complexity problem that reduces to computing a sparse spanner, and prove a lower bound on its communication complexity using information theory. This technique significantly extends the current toolbox used for obtaining lower bounds for the CONGEST model, and we believe it may find additional applications.

## 1 Introduction

A graph spanner is a sparse subgraph that guarantees some bound on how much the original distances are stretched. Graph spanners, introduced in 1989 [PS89, PU89a], are fundamental graph structures which are central for many applications, such as synchronizing distributed networks [PU89a], information dissemination [CHKM12], compact routing schemes [Che13a, PU89b, TZ01], and more.

Due to the importance of spanners, the trade-offs between their possible sparsity and stretch have been the focus of a huge amount of literature. Moreover, finding time-efficient constructions of spanners with optimal guarantees has been a major goal for the distributed computing community, with ingenious algorithms given in many studies (see, e.g., [EP04, Elk05, EZ06, DGPV09, BKMP10, BS07, DG08, DGP07, DGPV08, DMP05, Pet10]). One particular type of spanners are purely additive spanners, in which the distances are promised to be stretched by no more than an additive term. However, distributed constructions of such spanners have been scarce, with the only known construction being a -additive spanner construction with edges in rounds in a network of size and diameter [LP13] (also follows from [HW12]).

The absence of distributed constructions of purely additive spanners is explicitly brought to light by Pettie [Pet10], and implicitly mentioned in [DGP07].

This paper remedies this state of affairs, by providing a study of the complexity of constructing sparse purely additive spanners in the synchronous CONGEST model [Pel00], in which each of nodes can send an -bit message to each of its neighbors in every round. Our contribution is twofold: first, we provide efficient constructions of several spanners with different guarantees, and second, we present new lower bounds for the number of rounds required for such constructions, using tools that are not standard in this context.

### 1.1 The Challenge

A subgraph of an undirected unweighted graph is called a purely additive spanner with stretch if for every pair , we have , where is the - distance in and is the - distance in . The goal in spanner problems is to construct a subgraph that is as sparse as possible with as small as possible, i.e., we seek a sparse subgraph of which approximates all distances with a small stretch.

The problem of computing sparse spanners with small stretch is well-studied and we know how to construct sparse purely additive spanners for . These have sizes  [ACIM99],  [Che13b], and  [BKMP10], respectively. In a very recent breakthrough, it was shown that there is no purely additive spanner of size at most  [AB16a].

In a bid to get sparser subgraphs than all-pairs spanners with the same stretch, the following relaxation of pairwise spanners has attracted recent interest. Here we are given : these are our “relevant pairs” and we seek a sparse subgraph which approximates distances between all pairs in with a small stretch. That is, for every pair , the graph should satisfy and for pairs outside , the value could be arbitrarily large. Such a subgraph is called a -pairwise spanner. We use to denote the number of nodes appearing in , i.e. .
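The pairwise guarantee can be checked directly on small instances. The following is a minimal sketch, not part of the paper's algorithms: it assumes graphs are given as adjacency dictionaries, and the names `bfs_distances`, `is_pairwise_additive_spanner`, `pairs` and `beta` are hypothetical labels for the relevant pairs and the additive stretch.

```python
from collections import deque


def bfs_distances(adj, source):
    """Hop distances from source in an unweighted graph {node: set(neighbors)}."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for w in adj[u]:
            if w not in dist:
                dist[w] = dist[u] + 1
                queue.append(w)
    return dist


def is_pairwise_additive_spanner(g_adj, h_adj, pairs, beta):
    """Check dist_H(u, v) <= dist_G(u, v) + beta for every (u, v) in pairs."""
    for u, v in pairs:
        dg = bfs_distances(g_adj, u).get(v)
        dh = bfs_distances(h_adj, u).get(v)
        if dg is None:
            continue  # u and v are disconnected in G: nothing to preserve
        if dh is None or dh > dg + beta:
            return False
    return True


# A 4-cycle a-b-c-d-a; the subgraph H drops the edge (a, d).
G = {"a": {"b", "d"}, "b": {"a", "c"}, "c": {"b", "d"}, "d": {"a", "c"}}
H = {"a": {"b"}, "b": {"a", "c"}, "c": {"b", "d"}, "d": {"c"}}
print(is_pairwise_additive_spanner(G, H, [("a", "d")], beta=2))  # True: 3 <= 1 + 2
print(is_pairwise_additive_spanner(G, H, [("a", "d")], beta=1))  # False: 3 > 1 + 1
```

Passing all pairs of nodes as `pairs` gives the all-pairs check; pairs outside the list are deliberately ignored, matching the relaxation described above.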

The problem of constructing sparse pairwise spanners was first studied by Coppersmith and Elkin [CE06] who showed sparse subgraphs where distances for pairs in were exactly preserved; these subgraphs were called pairwise preservers. A natural case for is when , where is a set of source nodes; here we seek a sparse subgraph that well-approximates - distances for all . Such pairwise spanners are called sourcewise spanners. Another natural setting is when and such pairwise spanners are called subsetwise spanners.

Purely additive spanners are usually built in three steps: first, building clusters which contain all high-degree nodes and adding all the edges of the unclustered nodes; second, building BFS trees which -approximate all the paths with many missing edges; and third, adding more edges to approximate the other paths.

While our constructions follow the general outline of known sequential constructions of pairwise additive spanners [Kav15, KV13], their techniques cannot be directly implemented in a distributed setting. In the sequential setting, the clustering phase is implemented by repeatedly choosing a high-degree node and adding some of its edges to the spanner; these neighbors are marked and ignored in the rest of the phase. In the distributed setting, going over high-degree nodes one by one, creating clusters and updating the degrees is too costly. Instead, we choose the cluster centers at random, as done by Thorup and Zwick [TZ05], Baswana and Sen [BS07], and Chechik [Che13b] (see also Aingworth et al. [ACIM99] for an earlier use of randomization for a dominating set problem).

Sources for BFS trees are carefully chosen in the sequential setting by approximately solving a set-cover problem, in order to cover all paths with many missing edges. Once again, this cannot be directly implemented in the distributed setting, as the knowledge of all paths cannot be quickly gathered in one location, so we choose the BFS sources randomly [Che13b]. In both the clustering and BFS phases, the number of edges increases by a multiplicative factor, for .

The main challenge left is to choose additional edges to add to the spanner in order to approximate the remaining paths well. To this end, we make heavy use of the parallel-BFS technique of Holzer and Wattenhofer [HW12], which allows constructing BFS trees rooted at different nodes in rounds. We use this technique to count edges in a path, to count missing edges in it, and to choose which edges to add to the spanner. Yet, interestingly, we are unable to match the guarantee on the number of edges of more sophisticated algorithms [BKMP10, Kav15, Woo10]. Some of these algorithms use the value of a path, which is roughly the number of pairs of clusters that get closer if the path is added to the spanner. We are not able to measure this quantity efficiently in the distributed setting, and this is one of the reasons we are unable to introduce a -all-pairs spanner matching the sequential constructions.

### 1.2 Our Contribution

We provide various spanner constructions in the CONGEST model, as summarized in Tables 1 and 2.

The distributed spanner construction algorithms we present have three main properties: stretch, number of edges, and running time. All three properties hold w.h.p. (with high probability). That is, the algorithm stops in the desired time, with the desired number of edges and the spanner produced has the desired stretch with probability , where is a constant of choice. However, we can trade the properties and guarantee two of the three to always hold: if the spanner is too dense or the stretch is too large, we can repeat the algorithm; if the running time exceeds some threshold, we can stop the execution and output the whole graph to get stretch, or output an empty graph to get the desired number of edges. The edges of the constructed spanner can be counted over a BFS tree within rounds. In sourcewise, subsetwise and pairwise spanners, the stretch is measured by running BFS from the relevant nodes (nodes of or nodes appearing in ) for rounds in and again in ; in all-pairs spanners, the stretch is measured by measuring the stretch of the underlying sourcewise or subsetwise spanner.

We complement our algorithms with some lower bounds for the CONGEST model. We show that any algorithm that constructs an additive -pairwise spanner with edges on pairs must have at least rounds, as long as . For example, a CONGEST construction of a -pairwise spanner must take rounds. We also prove lower bounds for -pairwise spanners (i.e., for which ). We show that any algorithm that constructs an -pairwise spanner with edges on pairs must have at least rounds, as long as , where the constant in the notation depends on .

We believe the difficulty in obtaining this lower bound arises from the fact that standard reductions from set-disjointness and equality are unsuitable for this task. At a high level, in most standard reductions the problem boils down to deciding the existence of an edge (which can represent, e.g., the intersecting element between the inputs); when constructing spanners, no specific edge needs to be added to the spanner or omitted from it, so the solution is allowed a considerable amount of slack that is not affected by any particular edge alone.

Instead, to obtain our lower bound, we define a new communication complexity problem that reduces to computing a sparse spanner, and prove a lower bound on its communication complexity using information theory. In this new problem, which we call PART-COMP, Alice has a set of size , and Bob has to output a set of size so that . We show that any protocol that solves PART-COMP must convey bits of information about the set . This technique significantly extends the current toolbox used for obtaining lower bounds for the CONGEST model. As such, we believe it may find additional applications, especially in obtaining lower bounds for computing in this model.

We conclude this section with a further discussion of related work. Section 2 contains the definition of the model and some basic routines. In Section 3 we present distributed algorithms for computing the various types of spanners discussed above. In Section 4 we present our new lower bounds, and we conclude with a short discussion in Section 5.

### 1.3 Related Work

Sparse spanners with a small multiplicative stretch are well-understood: Althöfer et al. [ADD93] in 1993 showed that any weighted graph on vertices has a spanner of size with multiplicative stretch , for every integer . Since then, several works [BS07, DHZ00, EP04, Elk05, Knu14, Pet09, RTZ05, RZ11, TZ06] have considered the problem of efficiently constructing sparse spanners with small stretch and have used spanners in the applications of computing approximate distances and approximate shortest paths efficiently.

For unweighted graphs, one seeks spanners where the stretch is purely additive and as mentioned earlier, an almost tight bound of is known for how sparse a purely additive spanner can be. Bollobás et al. [BCE05] were the first to study a variant of pairwise preservers called distance preservers, where the set of relevant pairs is , for a given parameter . Coppersmith and Elkin [CE06] showed pairwise preservers of size and for any . For , the bound of for pairwise preservers has very recently been improved to by Bodwin and Williams [BW16].

The problem of designing sparse pairwise spanners was first considered by Cygan et al. [CGK13] who showed a tradeoff between the additive stretch and size of the spanner. The current sparsest pairwise spanner with purely additive stretch has size and additive stretch 6 [Kav15]. Woodruff [Woo10] and Abboud and Bodwin [AB16b, AB16a] showed lower bounds for additive spanners and pairwise spanners. Parter [Par14] showed sparse multiplicative sourcewise spanners and a lower bound of on the size of a sourcewise spanner with additive stretch , for any integer .

Distributed construction of sparse spanners with multiplicative stretch was addressed in several studies [BKMP10, BS07, DG08, DGP07, DGPV08, DMP05, Pet10]. Constructions of -spanners were addressed in [BKMP10, DGPV09, Pet10]. Towards the goal of obtaining purely additive spanners, for which , Elkin and Peleg [EP04] introduced nearly-additive spanners, for which . Additional distributed constructions of nearly-additive spanners are given in [Elk05, EZ06, DGPV09, Pet10]. Finally, somewhat related, are constructions of various spanners in the streaming model, and in dynamic settings, both centralized and distributed [BKS12, Bas08, BS08, Elk07a, Elk07b].

In his seminal paper, Pettie [Pet10] presents lower bounds for the number of rounds needed by distributed algorithms in order to construct several families of spanners. Specifically, it is shown that computing an all-pairs additive -spanner with size in expectation, for a constant , requires rounds of communication. Because this is an indistinguishability-based lower bound, it holds even for the less restricted LOCAL model, where message lengths can be unbounded.

The lower bound is obtained by showing an -node graph with diameter where, roughly speaking, removing wrong edges induces a stretch that is too large, and identifying these wrong edges takes rounds. This gives a lower bound of rounds. Examining the construction in detail, it is not hard to show it works for other types of spanners as well: even for a single pair of nodes, or a set of size , at least rounds are necessary in order to avoid removing wrong edges.

## 2 Preliminaries

##### The Model:

The distributed model we assume is the well-known CONGEST model [Pel00]. Such a system consists of a set of computational units, which exchange messages according to an undirected communication graph , , where nodes represent the computational units and edges the communication links. Each node has a unique identifier which can be encoded using bits. The diameter of is denoted by .

When the computation starts, each node knows its own identifier and the identifiers of its neighbors; when there is a set of nodes or a set of node-pairs involved in the computation, it also knows if it belongs to , or all the pairs in it belongs to. The computation proceeds in rounds, where in each round each node sends an -bit message to each of its neighbors, receives a message from each neighbor, and performs a computation. We use the number of rounds as our complexity measure, while ignoring the local computation time; however, in our algorithms all local computations take polynomial time. When the computation ends, each node knows which of its neighbors is also its neighbor in the new graph generated. We do not assume that the global structure of is known to any of the nodes.

##### Clustering and BFS:

The first building block in all of our algorithms is clustering. A cluster around a cluster center is a subset of , the set of neighbors of in (which does not include itself). A node belonging to a cluster is clustered, while the other nodes of are unclustered. We use to denote the set of cluster centers and to denote the set of clusters.

In the clustering phase of our algorithms we divide some of the nodes into clusters. We create a new graph containing all the edges connecting a clustered node to its cluster center, and all the edges incident on unclustered nodes.

Another building block is BFS trees. A BFS tree in a graph , rooted at a node , consists of shortest paths from to all other nodes in . The process of creating a BFS tree, known as BFS search, is well-known in the sequential setting. In the distributed setting, a single BFS tree can be easily constructed by a technique called flooding (see, e.g. [Pel00, §3]), and a celebrated result of Holzer and Wattenhofer [HW12] asserts that multiple BFS trees, rooted at a set of nodes, can be constructed in rounds. Here, denotes the diameter of the graph, i.e. the maximal distance between two nodes. We use this technique to add BFS trees to the spanner we construct, and to measure distances in the original graph.
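For intuition, single-source flooding can be simulated sequentially, one synchronous round at a time. This is only an illustrative sketch under the assumption of an adjacency-dictionary graph; the function name `flooding_bfs` is hypothetical, and the multi-source scheduling of [HW12] is not modeled here.

```python
def flooding_bfs(adj, root):
    """Simulate synchronous flooding from root.

    Returns (parent, dist, rounds), where rounds counts the rounds in which
    at least one node forwarded the flood to its neighbors.
    """
    parent = {root: None}
    dist = {root: 0}
    frontier = [root]
    rounds = 0
    while frontier:
        rounds += 1  # every node in the current frontier sends in this round
        next_frontier = []
        for u in frontier:
            for w in sorted(adj[u]):
                if w not in parent:  # first message received fixes the parent
                    parent[w] = u
                    dist[w] = dist[u] + 1
                    next_frontier.append(w)
        frontier = next_frontier
    return parent, dist, rounds


# A path 0-1-2-3 flooded from node 0.
path_graph = {0: {1}, 1: {0, 2}, 2: {1, 3}, 3: {2}}
parent, dist, rounds = flooding_bfs(path_graph, 0)
print(dist)    # {0: 0, 1: 1, 2: 2, 3: 3}
print(parent)  # {0: None, 1: 0, 2: 1, 3: 2}
```

The parent pointers are what allow a tree to be marked and added to the spanner, while the distances are what later phases use to measure paths in the original graph.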

## 3 Building Spanners

In this section we present distributed algorithms for building several types of additive spanners. For each spanner, we first describe a template for constructing it independently of a computational model and analyze its stretch and number of edges. Then, we provide a distributed implementation of the algorithm in the CONGEST model and analyze its running time.

In a nutshell, our algorithms have three steps: first, each node tosses a coin to decide if it will serve as a cluster center; second, each cluster center tosses another coin to decide if it will serve as a root of a BFS tree; third, add to the current graph edges that are part of certain short paths. The parameters of the coins and the meaning of “short” are carefully chosen, depending on the input to the problem and the desired stretch.

Proving that the algorithms perform well is about analyzing the probability of failure. This analysis uses the graph structure as well as standard concentration bounds. In all of our algorithms, is a constant that can be chosen according to the desired exponent of in the failure probability.

### 3.1 A (+2)-Sourcewise Spanner

Our first algorithm constructs a -sourcewise spanner. Given a set , the algorithm returns a subgraph of satisfying for all , with guarantees as given in the following theorem.

###### Theorem 1.

Given a graph on nodes and a set of sources , a -sourcewise spanner with edges can be constructed in rounds in the CONGEST model w.h.p.

This is only a factor more than the number of edges given by the best sequential algorithm known for this type of spanners [KV13]. Lemmas 2 and 3 analyze the size and stretch of Algorithm  given below. The number of rounds of its distributed implementation is analyzed in Lemma 4, giving Theorem 1.

#### 3.1.1 Algorithm 2S

Input: a graph ; a set of source nodes
Output: a subgraph
Initialization: , , and

##### Clustering

Pick each node as a cluster center w.p. , and denote the set of selected nodes by . For each node , choose a neighbor of which is a cluster center, if such a neighbor exists, add the edge to , and add to ; if none of the neighbors of is a cluster center, add to all the edges belongs to.
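A sequential sketch of this clustering step is given below. It is an illustration under stated assumptions rather than the paper's implementation: the sampling probability is passed in as a hypothetical parameter `p`, the tie-breaking rule (smallest neighboring center) is an arbitrary choice, and the names `cluster`, `center_of` and `h_edges` are introduced here only for the example.

```python
import random


def cluster(adj, p, rng):
    """Sample centers independently w.p. p and build the clustering edges.

    Returns (centers, center_of, h_edges): the sampled centers, a map from
    each clustered node to its chosen center, and the edges added to H.
    """
    centers = {v for v in sorted(adj) if rng.random() < p}
    center_of = {}
    h_edges = set()
    for v in sorted(adj):
        candidates = sorted(adj[v] & centers)
        if candidates:
            c = candidates[0]  # any neighboring center would do
            center_of[v] = c
            h_edges.add(frozenset((v, c)))
        else:
            # v is unclustered: keep every edge incident on v
            for w in adj[v]:
                h_edges.add(frozenset((v, w)))
    return centers, center_of, h_edges


# Triangle 0-1-2 with a pendant node 3 attached to node 2.
graph = {0: {1, 2}, 1: {0, 2}, 2: {0, 1, 3}, 3: {2}}
centers, center_of, h_edges = cluster(graph, p=0.5, rng=random.Random(0))
print(sorted(centers), center_of, len(h_edges))
```

With `p = 0` no node is clustered and all edges are kept, while with `p = 1` every non-isolated node contributes exactly one edge; the analysis balances these two extremes through the choice of the sampling probability.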

##### BFS

Pick each cluster center as a root of a BFS tree w.p. , and add to a BFS tree rooted at each chosen root.

##### Path Buying

For each source-cluster pair : build a temporary set of paths, containing a single, arbitrary shortest path from to each ; omit from the set all paths with more than missing edges (i.e. edges in but not in ); if any paths are left, add to the shortest among them.
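The path buying rule can be sketched sequentially as follows. This is a hedged illustration: the missing-edge threshold is a hypothetical parameter `t`, clusters are assumed to be given as a dictionary from a cluster label to its member set, and missing edges are counted against a snapshot of the edge set as it stands before any path is bought, which is one possible reading of the rule.

```python
from collections import deque


def bfs_parents(adj, s):
    """Parent pointers of one BFS tree rooted at s."""
    parent = {s: None}
    queue = deque([s])
    while queue:
        u = queue.popleft()
        for w in sorted(adj[u]):
            if w not in parent:
                parent[w] = u
                queue.append(w)
    return parent


def path_edges(parent, u):
    """Edges of the tree path from u up to the root."""
    edges = []
    while parent[u] is not None:
        edges.append(frozenset((u, parent[u])))
        u = parent[u]
    return edges


def buy_paths(adj, h_edges, sources, clusters, t):
    """Return h_edges plus, per (source, cluster), the shortest cheap path."""
    bought = set()
    for s in sources:
        parent = bfs_parents(adj, s)
        for members in clusters.values():
            best = None
            for u in sorted(members):
                if u not in parent:
                    continue  # u is unreachable from s
                edges = path_edges(parent, u)
                missing = sum(1 for e in edges if e not in h_edges)
                if missing <= t and (best is None or len(edges) < len(best)):
                    best = edges
            if best is not None:
                bought.update(best)
    return h_edges | bought


# Path 0-1-2-3, H initially holds only edge (0, 1); one cluster {2, 3}.
line = {0: {1}, 1: {0, 2}, 2: {1, 3}, 3: {2}}
h0 = {frozenset((0, 1))}
h1 = buy_paths(line, h0, sources=[0], clusters={"c": {2, 3}}, t=1)
print(sorted(tuple(sorted(e)) for e in h1))  # [(0, 1), (1, 2)]
```

In the example, the path to node 3 has two missing edges and is discarded, while the path to node 2 has one missing edge and is bought.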

#### 3.1.2 Analysis of Algorithm 2S

We now study the properties of the spanner created by the algorithm; in the next section, we describe the implementation of the different phases in the CONGEST model and analyze the running time of the algorithm.

###### Lemma 2.

Given a graph with and a set , Algorithm  outputs a graph , , with edges w.p. at least .

###### Proof.

The algorithm starts with , and adds to it only edges from . We analyze the number of edges added in each phase.

In the first part of the clustering phase, each node adds to at most one edge, connecting it to a single cluster center, for a total of edges. Then, the probability that a node of degree at least is left unclustered is at most which is . A union bound implies that all nodes of degree at least are clustered w.p. , and thus the total number of edges added to by unclustered nodes in the second part of the clustering phase is w.p. .

A node becomes a root in the BFS phase if it is chosen as a cluster center and then as a root, which happens with probability . Letting denote the set of trees gives , and a Chernoff bound implies that . As we have , and the BFS phase adds at most trees, which are edges.

Finally, each of the nodes is chosen as a cluster center with probability , so . A Chernoff bound implies ; as , we have . For each pair in , at most edges are added in the path buying phase, for a total of edges.

Substituting gives a total of edges, as claimed. ∎

###### Lemma 3.

Given a graph with and a set , the graph constructed by Algorithm  satisfies for each pair w.p. at least .

###### Proof.

Consider a shortest path between and in .

If has more than missing edges in after the clustering phase, then it traverses more than clusters, as otherwise there is a shorter path between and in . The probability that none of the centers of these clusters is chosen as a root in the BFS phase is at most Let be a cluster that traverses, and let be a node in . Adding a BFS tree rooted at ensures that and similarly . By the triangle inequality

$$\delta_H(s,v) \le \delta_H(s,c_i) + \delta_H(c_i,v) \le \delta_G(s,u) + \delta_G(u,v) + 2$$

which equals since is on . This completes the proof for with many missing edges.

Consider the complementary case, where has at most missing edges in after the clustering phase. If traverses no clusters, then it is contained in , and . Otherwise, if belongs to some cluster , then there is a node (possibly itself) such that the shortest path between and is added to the graph in the path buying phase. The nodes and both belong to the same cluster, so and the triangle inequality implies , as claimed. Finally, consider the case where traverses at least one cluster and is unclustered; let be the clustered node closest to on . The sub-path from to is contained in , so and by the previous analysis ; the triangle inequality implies

$$\delta_H(s,v) \le \delta_H(s,u) + \delta_H(u,v) \le \delta_G(s,u) + 2 + \delta_G(u,v)$$

and since is on , we have and the claim follows. ∎

#### 3.1.3 Implementing Algorithm 2S in the CONGEST Model

We now discuss the implementation of Algorithm  in the CONGEST model.

###### Lemma 4.

Algorithm  can be implemented in rounds in the CONGEST model, w.p. at least .

###### Proof.

We present distributed implementations for each of the phases in Algorithm , and analyze their running time.

##### Preprocessing

In order to run the algorithm properly, we need each node to know the parameter , which in turn depends on and . These parameters are not given in advance to all graph nodes, but they can be gathered along a BFS tree rooted at a predetermined node, e.g. the node with minimal identifier, and then spread to all the nodes over the same tree. This is done in rounds.
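Gathering a global count along a tree is a standard convergecast. The sketch below assumes, for illustration only, that the quantities to aggregate are simple sums over the nodes (such as the number of nodes, or the number of nodes marked as sources); the function name `convergecast_sum` and the dictionary-based tree encoding are hypothetical.

```python
def convergecast_sum(parent, value):
    """Sum value[v] over a rooted tree given by parent pointers.

    Mimics a convergecast: every node adds the totals reported by its
    children to its own value and reports the result to its parent.
    """
    children = {v: [] for v in parent}
    root = None
    for v, p in parent.items():
        if p is None:
            root = v
        else:
            children[p].append(v)
    # Top-down order; processing it in reverse visits children before parents.
    order = [root]
    i = 0
    while i < len(order):
        order.extend(children[order[i]])
        i += 1
    subtotal = {v: value[v] for v in parent}
    for v in reversed(order):
        if parent[v] is not None:
            subtotal[parent[v]] += subtotal[v]
    return subtotal[root]


tree = {0: None, 1: 0, 2: 0, 3: 1}
print(convergecast_sum(tree, {v: 1 for v in tree}))       # 4 nodes in total
print(convergecast_sum(tree, {0: 0, 1: 1, 2: 0, 3: 1}))   # 2 marked nodes
```

Once the root holds the totals, it can broadcast them back down the same tree, so the number of rounds is proportional to the depth of the tree.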

##### Clustering

The clustering phase is implemented as follows: first, each node becomes a cluster center w.p. and sends a message to all its neighbors; then, each node that gets at least one message joins a cluster of one of its neighbors, by sending a message to that neighbor and adding their connecting edge to the graph; finally, nodes that are not neighbors of any cluster center send a message to all their neighbors and add all their incident edges to the graph. The round complexity of this phase is constant.

##### BFS

Each cluster center becomes a root of a BFS tree w.p. , which is done without communication. Then, all BFS roots run BFS searches in parallel. The number of BFS trees is w.p. , as seen in the proof of Lemma 2, and this number of BFS searches can be run in parallel in rounds, using an algorithm of Holzer and Wattenhofer [HW12, §6.1]. Their algorithm outputs the distances along the BFS trees, whereas we wish to mark the BFS tree edges and add them to the graph; this requires a simple change to the algorithm, which does not affect its correctness or asymptotic running time.

##### Path Buying

This phase starts with measuring all the distances between pairs of nodes in , and the number of missing edges in each shortest path measured. To find all distances from a node to all other nodes, we run a BFS search from ; moreover, we augment each BFS procedure with a counter that counts the missing edges in each path from the root to a node on the BFS tree. Running BFS searches from all the nodes of is done in rounds, as before, and adding a counter does not change the time complexity. When a node receives a message of a BFS initiated by some , it learns its distance from and the number of missing edges on one shortest path from to , which lies within the BFS tree; we refer to this path as the shortest path from to .
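The counter augmentation can be sketched as a BFS in which each node inherits its parent's count and adds one when the tree edge is absent from the current spanner. This is a sequential illustration under assumed names (`bfs_with_missing_count`, `h_edges`), not the distributed procedure itself.

```python
from collections import deque


def bfs_with_missing_count(adj, h_edges, root):
    """BFS from root in G that also counts missing edges on each tree path.

    h_edges is a set of frozenset({u, w}) edges currently in the spanner.
    Returns (dist, missing, parent), all keyed by node.
    """
    dist = {root: 0}
    missing = {root: 0}
    parent = {root: None}
    queue = deque([root])
    while queue:
        u = queue.popleft()
        for w in sorted(adj[u]):
            if w not in dist:
                dist[w] = dist[u] + 1
                parent[w] = u
                # the counter travels with the BFS message along the tree edge
                absent = frozenset((u, w)) not in h_edges
                missing[w] = missing[u] + (1 if absent else 0)
                queue.append(w)
    return dist, missing, parent


# 4-cycle 0-1-2-3-0 where H contains only edges (0, 1) and (1, 2).
cycle = {0: {1, 3}, 1: {0, 2}, 2: {1, 3}, 3: {0, 2}}
h = {frozenset((0, 1)), frozenset((1, 2))}
dist, missing, parent = bfs_with_missing_count(cycle, h, 0)
print(dist)     # {0: 0, 1: 1, 3: 1, 2: 2}
print(missing)  # {0: 0, 1: 0, 3: 1, 2: 0}
```

The count refers to the single tree path recorded by the parent pointers, which matches the convention above of calling that path the shortest path from the root to the node.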

After all the BFS searches complete, each clustered node sends to its cluster center the distance from each to , and the number of missing edges on the corresponding path. This sub-phase takes rounds to complete.

Each cluster center now knows, for each , the length of the shortest path from to each , and the number of missing edges in each such path; it then locally chooses the shortest among all paths with at most missing edges. Finally, for each chosen path, sends a message to containing the identifier of . All BFS searches are now executed backwards, by sending all the messages in the opposite direction and order; when runs backwards the BFS search initiated by , it marks the message to its parent with a “buy” bit, which is passed up the tree and makes each of its receivers add the appropriate edge to the graph. This sub-phase requires rounds as well.

In total, the running time of the algorithm is , w.p. at least , which completes the proof for the case . In the case , we can replace the algorithm by a simpler algorithm that returns the union of BFS trees rooted at all nodes of . This creates a graph that exactly preserves all distances among pairs , and takes rounds to complete. The number of edges in the created spanner is , and the assumption implies , as desired. ∎

### 3.2 A (+4)-All-Pairs Spanner

Recall that a subgraph of is a -all-pairs spanner if for all pairs . We present an algorithm, based on Algorithm , which builds a -all-pairs spanner and has the properties guaranteed by the following theorem.

###### Theorem 5.

Given a graph on nodes, a -all-pairs spanner with edges can be constructed in rounds in the CONGEST model w.h.p.

The main idea is that cluster centers are now sources for a -sourcewise spanner, which, as we show, promises a -stretch to all pairs. Lemmas 6 and 7 analyze the size and the stretch of Algorithm  below. Lemma 8 analyzes the running time of its distributed implementation, completing the proof of Theorem 5.

#### 3.2.1 Algorithm 4AP

Input:
Output: a subgraph
Initialization: , , and

##### Clustering

Run clustering as in Algorithm .

Run the BFS and path buying phases from Algorithm , with cluster centers as sources, i.e. .

#### 3.2.2 Analysis of Algorithm 4AP

###### Lemma 6.

Given a graph with , Algorithm  outputs a graph , , with edges w.p. at least .

###### Proof.

The lemma follows from the proof of Lemma 2: in Algorithm , is the set of all cluster centers, whose number is , and by the proof w.h.p. Substituting and in Lemma 2, we get that the graph created by Algorithm  contains edges w.p. at least . ∎

###### Lemma 7.

Given a graph with , Algorithm  outputs a graph satisfying for each pair of vertices w.p. at least .

###### Proof.

Let be an arbitrary pair of nodes, and fix a shortest path in between them.

If is not incident on any clustered node, then all its nodes are unclustered and all its edges are present in . Otherwise, let be the first clustered node on , when traversing it from to , and let be the cluster containing . The sub-path of from to exists in , as all nodes on this sub-path except for are unclustered; the distance from to satisfies , as the stretch of in is at most 2 by Lemma 3, w.p. at least . The triangle inequality completes the proof:

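$$
\begin{aligned}
\delta_H(u,v) &\le \delta_H(u,x) + \delta_H(x,c_i) + \delta_H(c_i,v) \\
&\le \delta_G(u,x) + 1 + \delta_G(c_i,v) + 2 \\
&\le \delta_G(u,x) + \delta_G(c_i,x) + \delta_G(x,v) + 3 \\
&= \delta_G(u,v) + 4.
\end{aligned}
$$

∎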

#### 3.2.3 Implementing Algorithm 4AP

Running Algorithm  is done by executing Algorithm  with a specific set ; thus, their running times are identical, as stated in the next lemma.

###### Lemma 8.

Algorithm  can be implemented in the CONGEST model in rounds w.p. at least .

###### Proof.

The lemma follows from the proofs of Lemmas 2 and 4: in Algorithm , is the set of cluster centers, whose number is , and by the proof of Lemma 2, w.h.p. Substituting and in Lemma 4, we get that the algorithm completes in rounds, with the claimed probability. ∎

### 3.3 A (+2)-Pairwise Spanner

Recall that a -pairwise spanner, for a set of pairs , is a subgraph of satisfying for all pairs . Recall that denotes the number of nodes appearing in , i.e. .

We present a distributed algorithm, Algorithm , which returns a -pairwise spanner with the properties described in the following theorem.

###### Theorem 9.

Given a graph on nodes and a set of pairs of nodes in , a -pairwise spanner with edges can be constructed in rounds in the CONGEST model w.h.p.

If , achieving the desired spanner is simple: for each appearing in , add to a BFS from . The number of edges is , the stretch is for all pairs in , and the running time is , as desired. Otherwise, Lemmas 10 and 11 prove the claimed size and stretch of Algorithm  below. Lemma 12 proves the running time of its distributed implementation, giving Theorem 9.

#### 3.3.1 Algorithm 2P

Input: ,
Output: a subgraph
Initialization: , , and

##### Clustering and BFS

Run clustering and add BFS trees from selected cluster centers, as in Algorithm .

##### Path Buying

For each pair , if the shortest path between and in has at most missing edges in , add it to .

#### 3.3.2 Analysis of Algorithm 2P

###### Lemma 10.

Given a graph with and a set , Algorithm  outputs a graph , , with edges, w.p. at least .

###### Proof.

The lemma follows from the proof of Lemma 2: the clustering and BFS phases add edges to the graph w.p. at least , as long as . The first inequality comes from the comment after the statement of Theorem 9, the fact that and the choice of , and the second inequality is immediate.

In the path buying phase, at most edges are added for each pair in , for a total of edges. Substituting , we get a total of edges in . ∎

###### Lemma 11.

Given a graph with , Algorithm  outputs a graph satisfying for each pair of vertices , w.p. at least .

###### Proof.

Let be an arbitrary pair of nodes, and fix a shortest path in between them.

If has at most missing edges in before the path buying phase, it is added to , and . Otherwise, has more than missing edges before the BFS phase, so it traverses at least clusters. As in the proof of Lemma 3, at least one of the corresponding cluster centers is chosen as a root of a BFS tree w.p. at least , and , as claimed. ∎

#### 3.3.3 Implementing Algorithm 2P

###### Lemma 12.

Algorithm  can be implemented in rounds in the CONGEST model w.p. at least .

###### Proof.

We can implement the clustering and path buying phases in rounds with success probability , as seen in the proof of Lemma 4. In order to count missing edges in paths, we run a BFS search in from each node appearing in . Then, the BFS search is run backwards, and is used to add the “cheap” paths: for a pair in , if the BFS from arrives at traversing at most missing edges, then sends back a “buy” message up the tree, and the path is added. We may end up adding two shortest paths for a pair , but this does not affect the asymptotic number of edges or the time complexity. This phase is implemented in rounds, by running the BFS searches in parallel. ∎

### 3.4 A (+4)-Pairwise Spanner

We present an algorithm for constructing a -pairwise spanner, with the parameters described by the following theorem.

###### Theorem 13.

Given a graph on nodes and a set of pairs, a -pairwise spanner with edges can be constructed in rounds in the CONGEST model w.h.p.

If , the -pairwise spanner from Theorem 9 is sparser than the one promised by Theorem 13, and can be constructed in the same running time. Otherwise, Lemmas 14 and 15 show the claimed size and stretch of Algorithm  below, which together with Lemma 16, which analyzes the running time of its distributed implementation, proves Theorem 13.

#### 3.4.1 Algorithm 4P

Input: a graph ; a set of pairs
Output: a subgraph
Initialization: , , and

##### Clustering and BFS

Run clustering and add BFS trees from selected cluster centers, as in Algorithm .

##### Prefix-Suffix Buying

For each pair , let be a shortest path from to . Add to the first missing edges and the last missing edges in .
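Selecting the first and last missing edges of a path is a small list operation. The sketch below assumes the path is given as a list of nodes and uses a hypothetical parameter `k` for the number of missing edges taken from each end; the function name `prefix_suffix_edges` is likewise introduced only for this example.

```python
def prefix_suffix_edges(path, h_edges, k):
    """Return the first k and the last k edges of path that are missing from H.

    path is a list of nodes along a shortest path; h_edges is a set of
    frozenset({u, w}) edges. If at most 2k edges are missing, all are returned.
    """
    edges = [frozenset((a, b)) for a, b in zip(path, path[1:])]
    missing = [e for e in edges if e not in h_edges]
    if len(missing) <= 2 * k:
        return set(missing)
    # Explicit index for the suffix, since missing[-0:] would be the whole list.
    return set(missing[:k]) | set(missing[len(missing) - k:])


route = [0, 1, 2, 3, 4, 5]
present = {frozenset((2, 3))}
bought = prefix_suffix_edges(route, present, k=1)
print(sorted(tuple(sorted(e)) for e in bought))  # [(0, 1), (4, 5)]
```

When a path has at most `2k` missing edges, the whole path ends up in the spanner after this step, which is the easy case in the stretch analysis.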

##### Choosing Cluster Centers

Construct a set of cluster centers by adding to it each cluster center independently w.p. .

##### Path Buying

For each pair : fix a set of paths containing a single shortest path from to each ; omit all paths with more than missing edges in ; if any paths are left, add to the shortest among them.

#### 3.4.2 Analysis of Algorithm 4P

###### Lemma 14.

Given a graph with and a set , Algorithm  outputs a graph , , with edges w.p. at least .

###### Proof.

The clustering and BFS phases add edges to the graph w.p. at least , as seen in the proof of Lemma 2, as long as . The first inequality comes from the discussion below the statement of Theorem 13, and the second is immediate.

In the prefix-suffix buying phase, at most edges are bought for each pair in , for a total of edges.

Finally, in the path buying phase we add to at most paths, with missing edges in each. Each node is chosen to be a cluster center w.p. , and then to enter w.p. , so . A Chernoff bound implies ; the last equality holds under the assumption , as discussed below the statement of Theorem 9. Hence, the number of edges added in the path buying step is edges. In total, has edges w.p. at least . ∎

###### Lemma 15.

Given a graph with , Algorithm  outputs a graph satisfying for each pair of vertices w.p. at least .

###### Proof.

Let be an arbitrary pair of nodes, and let be an arbitrary shortest path from to . If has at most missing edges in after the clustering phase, it is added to in the prefix-suffix buying phase and .

Otherwise, the prefix of with missing edges is incident on at least clusters. Each cluster center is added to independently w.p.  so the expected number of clusters in which are also incident on the prefix is , and a Chernoff bound implies that the probability that less than of the centers of these clusters are chosen to is at most . The same argument shows that the suffix of is incident on a cluster in , and a union bound implies that all prefixes and suffixes are incident on clusters in w.p. at least .

Let be a center of a cluster in which is incident on the prefix of , and a center of a cluster incident on the suffix of . Let and be nodes in and respectively, and let be a path between and in .

If the number of edges of missing in after the clustering phase is more than , then is incident on at least clusters. In this case, a cluster incident on is a source of a BFS tree w.p. at least , as seen in the proof of Lemma 3. Let be such a cluster, then after adding the BFS trees it holds that , which implies .

If has less than missing edges then a path between and some