Deterministic Digital Clustering of Wireless Ad Hoc Networks*

*This work was supported by the Polish National Science Centre grant DEC2012/07/B/ST6/01534.
Abstract
We consider deterministic distributed communication in wireless ad hoc networks of identical weak devices under the SINR model without predefined infrastructure. Most algorithmic results in this model rely on various additional features or capabilities, e.g., randomization, access to geographic coordinates, power control, carrier sensing with various precision of measurements, and/or interference cancellation. We study a pure scenario, when no such properties are available.
The key difficulty in the considered pure scenario stems from the fact that it is not possible to distinguish successful deliveries of messages sent by close neighbors from those sent by nodes located on transmission boundaries. Many previous results avoided this problem by appropriate model assumptions, which simplified algorithmic solutions.
As a general tool, we develop a deterministic distributed clustering algorithm, which splits nodes of a multihop network into clusters such that: (i) each cluster is included in a ball of constant diameter; (ii) each ball of diameter contains nodes from clusters. Our solution relies on a new type of combinatorial structure (selectors), which might be of independent interest.
Using the clustering, we develop a deterministic distributed local broadcast algorithm accomplishing this task in rounds, where is the density of the network. To the best of our knowledge, this is the first solution in the pure scenario which is only away from the universal lower bound , valid also for scenarios with randomization and other features. Therefore, none of these features substantially helps in performing the local broadcast task.
Using clustering, we also build a deterministic global broadcast algorithm that terminates within rounds, where is the diameter of the network. This result is complemented by a lower bound , where is the path-loss parameter of the environment. This lower bound, in view of previous work, shows that randomization or knowledge of one's own location substantially helps (by a factor polynomial in ) in global broadcast. Therefore, unlike in the case of local broadcast, some additional model features may help in global broadcast.
1 Introduction
We study distributed algorithms in ad hoc wireless networks in the SINR model with uniform transmission power. We consider the ad hoc setting where both the capabilities and the knowledge of nodes are limited – nodes know only the basic parameters of the SINR model and upper bounds on the degree and the size of the network, such that the actual maximal degree is and the size is . Such a setting appears in networks without an infrastructure of base stations, access points, etc., reflecting e.g. scenarios where large sets of sensors are distributed in an area of a rescue operation, environment monitoring, or prospective Internet of Things applications. Among others, we study basic communication problems such as global and local broadcast, as well as primitives used as tools to coordinate computation and communication in networks (e.g., leader election and wakeup). Most of these problems were studied in the model of graph-based radio networks over the years. The algorithmic study of the more realistic SINR model started much later. This might be caused by the specific dynamics of interference and signal attenuation, which seem to be more difficult to model and analyze than the graph properties appearing in the radio network model (see e.g. [22]).
As for the problems studied in this paper, several distributed algorithms have been presented in recent years. However, all these solutions were either randomized or relied on the assumption that the nodes of a network know their own coordinates in a given metric space. In contrast, the aim of this paper is to check how efficient solutions can be without randomization, availability of locations, power control, carrier sensing, interference cancellation, or other additional model features, in the context of local and global communication problems.
1.1 The Network Model.
We consider a wireless network consisting of nodes located in the 2-dimensional Euclidean space. (Results of this paper can be generalized to so-called bounded-growth metric spaces with the same asymptotic complexity bounds.) We model transmissions in the network with the SINR (Signal-to-Interference-and-Noise Ratio) constraints. The model is determined by fixed parameters: path loss α, threshold β, ambient noise 𝒩, and transmission power P. The value of SINR(u, v, T) for given nodes u, v and a set T of concurrently transmitting nodes is defined as
SINR(u, v, T) = P · d(u, v)^{-α} / (𝒩 + Σ_{w ∈ T∖{u}} P · d(w, v)^{-α}),    (1)
where d(u, v) denotes the distance between u and v. A node v successfully receives a message from u iff u ∈ T and SINR(u, v, T) ≥ β, where T is the set of nodes transmitting at the same time. The transmission range is the maximal distance at which a node can be heard provided there are no other transmitters in the network. Without loss of generality we assume that the transmission range is equal to 1, which implies that the relationship P = β𝒩 holds. This assumption does not affect the generality or the asymptotic complexity of the presented results.
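The reception condition above can be sketched in code. The following is a minimal illustration, not the paper's algorithm; the concrete parameter values (P, alpha, beta, noise) are arbitrary assumptions chosen for the example.

```python
import math

def sinr(u, v, transmitters, P=1.0, alpha=3.0, noise=0.1):
    """SINR at receiver v for sender u, given the list of nodes
    transmitting concurrently (u included), with uniform power P."""
    signal = P / math.dist(u, v) ** alpha
    interference = sum(P / math.dist(w, v) ** alpha
                       for w in transmitters if w != u)
    return signal / (noise + interference)

def receives(u, v, transmitters, beta=1.0, **kw):
    """v successfully receives u's message iff u transmits and SINR >= beta."""
    return u in transmitters and sinr(u, v, transmitters, **kw) >= beta
```

For instance, a lone transmitter at distance 0.5 is heard, while a second transmitter located close to the receiver can suppress the reception.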
Communication graph In order to describe the topology of a network as a graph, we fix a connectivity parameter ε ∈ (0, 1). The communication graph of a given network consists of all nodes, with edges between pairs of nodes that are within distance at most 1 − ε of each other. Observe that a node can receive a message from a node that is not its neighbor in the graph, provided the interference from other nodes is small enough. The communication graph, defined as above, is a standard notion in the analysis of ad hoc multihop communication in the SINR model, cf. [8, 19].
Synchronization and content of messages We assume that algorithms work synchronously in rounds. In a single round, a node can transmit a message, receive a message from some node in the network, and perform local computation. The size of a single message is limited to O(log N) bits.
Knowledge of nodes Each node has a unique identifier from the set {1, …, N}, where N is the upper bound on the size of the network. Moreover, nodes know N, the SINR parameters α, β, 𝒩, P, and linear upper bounds on the diameter and the degree of the communication graph.
Communication problems The global broadcast problem is to deliver a message from the designated source node to all the nodes in the network, possibly through relay nodes, as not all nodes are within transmission range of the source in multihop networks. At the beginning of an execution of a global broadcast algorithm, only the source node is active (i.e., only the source can participate in the algorithm’s execution). A node starts participating in an execution of the algorithm only after receiving its first message from another node. (This is the so-called non-spontaneous wakeup model, popular in the literature.) The problem of local broadcast is local in its nature – the goal is to make each node successfully transmit its own message to its neighbors in the communication graph. Here, all nodes start participating in an algorithm’s execution in the same round.
1.2 Related Work
In recent years the SINR model has been extensively studied, both from the perspective of its structural properties [1, 15, 22] and of algorithm design [12, 3, 8, 13, 17, 27, 23, 16]. The first work on local broadcast in the SINR model, by Goussevskaia et al. [12], presented a randomized algorithm. After that, the problem was studied in various settings. Halldorsson and Mitra presented an algorithm in a model with feedback [14]. Recently, for the same setting, Barenboim and Peleg presented a solution working in time [3]. For the scenario when the degree is not known, Yu et al. [27] improved on the solution of Goussevskaia et al. to . However, no deterministic algorithm for local broadcast was known in the scenario where nodes do not know their coordinates.
For the global broadcast problem a few deterministic solutions are known, and all of them use information about the locations of nodes. Broadcast can be accomplished deterministically in time in such a setting [20]. The main randomized results on broadcast in ad hoc settings include the papers of Daum et al. [8] and Jurdziński et al. [19]. Solutions with complexities and , respectively, are presented there, where is a parameter depending on the geometry of the network. Recently, Halldorsson et al. [13] proposed an algorithm which can be faster assuming that nodes are equipped with some extra capabilities (e.g., detection of whether a received message was sent by a close neighbor) and the interference function on the local (one-hop) level is defined as in the classic radio network model. In the harsh scenario, where connectivity of a network might rely on so-called weak links, deterministic global broadcast requires rounds [21].
In the related multihop radio network model, the broadcast problem is well studied. The complexity of randomized broadcast is of order [2, 7, 24]. A series of papers on deterministic broadcast gives the upper bound of [4, 7, 9], with the lower bound of [24]. In terms of and , the best bounds are and [6]. For geometric unit-disk-graph radio networks, the model closest to ours, the complexity of broadcast is [10, 11], even when nodes know their coordinates.
1.3 Our Contribution
As a general tool, we develop a clustering algorithm which splits the nodes of a multihop network into clusters such that: (i) each cluster is included in a ball of constant diameter; (ii) each ball of diameter contains nodes from clusters. Using the clustering algorithm, we develop a local broadcast algorithm in this setting for ad hoc wireless networks, which accomplishes the task in rounds, where is the density of the network. To the best of our knowledge, this is the first solution in the considered pure ad hoc scenario which is only away from the trivial universal lower bound , valid also for scenarios with randomization and other features.
Using clustering, we also build a deterministic global broadcast algorithm that terminates within rounds, where is the diameter of the network graph. This result is complemented by a lower bound in networks of degree , where is the path-loss parameter of the environment. This lower bound, in view of previous work, shows that randomization or knowledge of one's own location substantially helps (by a factor polynomial in ) in global broadcast. Previous results on global broadcast in related models achieved time , and thus were independent of the network's density . They relied, however, on either randomization, access to coordinates of nodes in the Euclidean space, or carrier sensing (a form of measurement of the strength of received signals). Without these capabilities, as we show, a polynomial dependence on the density is unavoidable.
Therefore, our results show that additional model features may help in global communication problems, but not much in local problems such as local broadcast, for which carefully designed algorithms are (almost) as powerful as additional model features. Using the designed algorithmic techniques, we also provide efficient solutions to the wakeup problem and the global leader election problem.
1.4 High Level Description of Our Technique
The key challenge in designing efficient algorithms in the ad hoc wireless scenario is the interference from dense areas of a network. Note that if randomization were available, nodes could adjust their transmission probabilities and/or signal strength so that the expected interference from each ball of radius is bounded by a constant.
Another problem stems from the fact that it is impossible to distinguish received messages sent by close neighbors from those sent by nodes on the boundary of the transmission range. This issue could be managed if nodes had access to their geographic coordinates or were equipped with carrier sensing capabilities (with which the distance to a neighbor can be estimated from a measurement of the strength of the received signal). In the pure ad hoc model considered in this paper, none of those tools is available.
Our algorithmic solutions rely on the notion of clustering, i.e., a partition of a set of nodes into clusters such that: (i) each cluster is included in a ball of diameter ; (ii) each ball of diameter contains nodes from clusters; and (iii) each node knows its cluster ID. First, we develop tools for (partially) clustered networks. We start with a sparsification algorithm which, given a set of clustered nodes , gradually decreases the largest number of nodes in a cluster and eventually ends with a set such that (and at least one!) nodes from each cluster of belong to . Using the sparsification algorithm, we develop a tool for imperfect labeling of clusters, which assigns temporary IDs (tempIDs) in the range to nodes such that nodes in each cluster have the same tempID. Moreover, an efficient radius reduction algorithm is presented, which transforms a clustering into a clustering.
Two communication primitives are essential for an efficient implementation of our tools, namely, the Sparse Network Schedule (SNS) and the Close Neighbors Schedule (CNS). The Sparse Network Schedule is a communication protocol of length which guarantees that, given an arbitrary set of nodes with constant density, each element of performs local broadcast (i.e., there is a round in which the message transmitted by is received within distance ). The Close Neighbors Schedule is a communication protocol of length which guarantees that, given an arbitrary set of nodes of density with a clustering for , each sufficiently close pair of elements from each sufficiently dense cluster can hear each other during the schedule.
Given the tools for clustered sets of nodes, we build a global broadcast algorithm which works in phases. In the i-th phase we ensure that all nodes awakened in the (i−1)-st phase perform local broadcast. (A node is awakened in a phase if it receives the broadcast message for the first time in that phase.) In this way, the set of nodes awakened in the first i phases contains all nodes at distance at most i from the source in the communication graph. Moreover, we ensure that all nodes awakened in a phase are clustered. (We start with the cluster formed by the nodes within distance of the source , awakened in a round in which is the unique transmitter.) A phase consists of three stages. In Stage 1, an imperfect labeling of each cluster is done. In Stage 2, the Sparse Network Schedule is executed times. (Here, we assume that is the maximal number of nodes in a ball of radius ; this number differs from the degree of the communication graph by a constant multiplicative factor.) A node with label j participates only in the j-th execution of SNS. In this way, all nodes transmit successfully over distance . All nodes awakened in Stage 2 inherit their cluster ID from the nodes which awakened them. In this way, we have a clustering of all nodes awakened in Stage 2. In Stage 3, a clustering of the awakened nodes is formed using an efficient algorithm that reduces the radius of the clustering.
Our algorithm for local broadcast builds a clustering of the whole network, assigns tempIDs to nodes in clusters with the use of the imperfect labeling algorithm, and eventually performs local broadcast by applying the Sparse Network Schedule for each prospective tempID separately. (Recall that, in the local broadcast problem, all nodes are awakened simultaneously, at the very beginning of a protocol.) Thus, the key challenge here is to build a clustering starting from an unclustered network. First, we use our sparsification technique to build a sparse set of leaders and a schedule such that each node of the network is within hops of some leader with respect to the schedule . Then, starting from clusters containing the neighbors of leaders, we gradually build a clustering of the whole network.
For an efficient implementation of our solutions, we build a new type of selector, called witnessed (cluster aware) strong selectors. These selectors algorithmically implement an implicit collision detection, which filters out most long-distance connections in an execution of the Close Neighbors Schedule. Therefore, the properties of the new selectors might be applicable to the design of efficient (deterministic) solutions for other communication problems.
1.5 Organization of the paper
Section 2 contains basic notions and definitions. Combinatorial tools are introduced and applied to build SINR communication primitives in Section 3. Section 4 describes the sparsification technique which leads to the clustering algorithm. In Section 5, we describe solutions of global/local broadcast as well as other communication problems. Finally, a lower bound for global broadcast is provided.
2 Preliminaries
Geometric definitions Let B(p, r) denote the ball of radius r around point p on the plane, that is, B(p, r) = {x ∈ ℝ² : d(p, x) ≤ r}. We identify B(p, r) with the set of nodes of the network that are located inside it. A unit ball is a ball with radius . Let denote the maximal number of points which can be located in a ball of radius such that each pair of points from the set is in distance larger than or equal to .
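As a small illustration of the density notion, the following sketch computes it under two assumptions of this example: a unit ball has radius 1/2, and it suffices to consider balls centered at the input points themselves (which changes the count by at most a constant factor).

```python
import math

def density(points, r=0.5):
    """Largest number of points inside any ball of radius r centered at a
    point of the set. An O(n^2) illustration; centering balls at input
    points only approximates the true maximum over all centers."""
    return max(sum(math.dist(p, q) <= r for q in points) for p in points)
```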
Clustered and unclustered sets of nodes Assume that each node from a given set is associated with a pair of numbers , where is its unique ID and is the cluster of , . A set of pairs associated with nodes is called a clustered set of nodes. The set is partitioned into clusters, where cluster denotes the set .
An unclustered set of nodes is just a subset of , which might be also considered as a clustered set, where each node’s cluster ID is equal to , i.e., for each .
Geometric clusters Consider a clustered set of nodes such that each node is located in the Euclidean plane (i.e., it has assigned coordinates determining its location). The clustering of is an clustering for if the following conditions are satisfied:

For each cluster , all of nodes from the cluster are located in the ball , where is an element of called the center of .

For each pair of clusters , the centers of are located at distance at least , i.e., .
The density of an unclustered network/set denotes the largest number of nodes in a unit ball. For a clustered network/set , the density is equal to the largest number of elements of a cluster. Below, we characterize dependencies between the degree of the communication graph and the density of the network.
Fact 1.
Let be equal to the largest degree in the communication graph . Then,
1. There exist constants such that the density of each unclustered network is in the range .
2. There exist constants such that the density of each clustered network for is in the range .
As the density of a network and the degree of its communication graph are linearly dependent, we will use / to denote both the density and the degree of the communication graph. (This overuse of notation has no impact on the asymptotic upper bounds on the complexity of the algorithms designed in this paper, since the dependence on density/degree will be at most linear.) For a clustered set of density , a cluster is dense when it contains at least elements. Similarly, for an unclustered set of density , a unit ball is dense when it contains at least elements.
In the following, we define the notion of a close pair, essential for our network sparsification technique. Let be the number satisfying the equation . Thus, according to the definition of the function and the definition of a dense cluster/ball, (, resp.) is the upper bound on the smallest distance between nodes of a dense cluster in an clustered (unclustered, resp.) network.
Definition 1.
Nodes form a close pair in an clustered network of density if the following conditions hold for some and cluster :

,

,

and for each from the cluster ;

for any such that and .
The nodes form a close pair in an unclustered network of density iff the above conditions (a)–(d) hold provided for each node of a network and .
A node is a close neighbor of a node iff is a close pair.
The intuition behind the requirements of the definition of a close pair is as follows. In the clustered case, only pairs inside the same cluster are considered to be close pairs, by (a). Requirement (b) states that and can form a close pair only if is at most the upper bound on the smallest distance between the closest nodes of a dense cluster/ball. Item (c) ensures that is the closest node to and is the closest node to . Finally, (d) states that and can be a close pair only if the distances between nodes in their close neighbourhood are not much smaller than . The goal of this requirement is to count as a close pair only when is of the order of the smallest distance between nodes in some neighbourhood. The polynomial attenuation of signal strength with the power in the SINR model ensures that, for a close pair , a successful reception of a message from by depends only on the behaviour of some constant number of the nodes closest to and . We formalize this intuition later in this work (Lemmas 5 and 6). In the following lemma we observe the presence of close pairs in dense areas of a network.
Lemma 1.
Assume that the density of a set of nodes is . Then,

If is unclustered, then there is a close pair in each ball such that is dense.

If is clustered, then there is a close pair in each dense cluster.
Imperfect labeling Assume that a clustering of a set of nodes with density is given. Then, imperfect labeling of is a labeling of all elements of such that label for each and, for each cluster and each label , the number of nodes from the cluster with label is at most , where is a fixed constant. That is, the labeling is “imperfect” in the sense that, instead of unique labels, we guarantee that each label is assigned to at most nodes in a cluster.
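The imperfect-labeling property is easy to verify mechanically. The following sketch checks it for given assignments (dictionaries mapping node IDs to cluster IDs and labels, both hypothetical data structures of this example; c is the constant from the definition):

```python
from collections import Counter

def is_imperfect_labeling(cluster_of, label_of, c):
    """Within every cluster, each label may be used by at most c nodes."""
    counts = Counter((cluster_of[v], label_of[v]) for v in cluster_of)
    return all(mult <= c for mult in counts.values())
```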
3 Combinatorial Tools for SINR communication
In this section, we introduce combinatorial structures applied in our algorithms and basic communication primitives using these structures.
3.1 Combinatorial tools
In this section we apply the idea (known e.g. from deterministic distributed algorithms in radio networks) of using families of sets with specific combinatorial properties as communication protocols, in such a way that the nodes from the i-th set of the family are the transmitters in the i-th round. Below, we give the definitions necessary to apply this approach, recall some results, and build our new combinatorial structures.
A transmission schedule for unclustered (clustered, resp.) sets is defined by (and identified with) a sequence of subsets of (, resp.), where the th set determines nodes transmitting in the th round of the schedule. That is, a node with ID (and , resp.) transmits in round if and only if (, resp.).
A set selects from when . A family of sets over is called a strongly selective family (or ssf) if for each subset such that and each there is a set that selects from . It is well known that there exist ssfs of size for each [5].
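The strong selection property can be checked by brute force on small instances. The sketch below only illustrates the definition; it is not an efficient construction.

```python
from itertools import combinations

def is_ssf(family, n, k):
    """Check that `family` (a list of subsets of {0,...,n-1}) is strongly
    selective for sets of size at most k: for every such set S and every
    z in S, some member F satisfies F ∩ S = {z}."""
    for size in range(1, k + 1):
        for S in map(set, combinations(range(n), size)):
            for z in S:
                if not any(F & S == {z} for F in family):
                    return False
    return True
```

For example, the family of all singletons is trivially strongly selective, while a single two-element set is not.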
We introduce a combinatorial structure generalizing ssf in two ways. Firstly, it will take clustering into account, assuming that some clusters might be “in conflict”. Secondly, selections of elements from a given set will be “witnessed” by all nodes outside of . As we show later, this structure helps to determine close pairs of nodes efficiently in a SINR network. We start from a basic variant called witnessed strong selector (wss), which does not take clustering into account. (A restricted variant of wss has been recently presented in [21].) Then, we generalize the structure to clustered sets. We call this variant witnessed cluster aware strong selector (wcss).
Witnessed strong selector. The definition of witnessed strong selector extends the notion of ssf by requiring that for each element of a given set of size and each , is selected from by such a set from the selector that as well.
A sequence of sets over satisfies the witnessed strong selection property for a set if for each and there is a set such that and . One may think of as a “witness” of the selection of in such a case. A sequence is a witnessed strong selector (or wss) of size if, for every subset of size , the family satisfies the witnessed strong selection property for .
Note that any wss is also, by definition, an ssf. Additionally, wss guarantees that each element outside of a given set of size has to be a “witness” of selection of every element from .
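The witnessed property strengthens plain strong selection: the selecting set must additionally contain the designated outside witness. A brute-force check of the definition (illustrative only, for small instances):

```python
from itertools import combinations

def is_wss(family, n, k):
    """Witnessed strong selection: for every S of size at most k, every
    v in S and every witness w outside S, some F in the family selects v
    from S (F ∩ S = {v}) while also containing w."""
    universe = set(range(n))
    for size in range(1, k + 1):
        for S in map(set, combinations(universe, size)):
            for v in S:
                for w in universe - S:
                    if not any(F & S == {v} and w in F for F in family):
                        return False
    return True
```

The family of all two-element subsets of {0, 1, 2} is a witnessed strong selector for k = 2, whereas a family of singletons (a valid ssf) never contains a witness.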
Below we state an upper bound on the optimal size of a wss. It is presented without a proof, since its generalization is given in Lemma 3 and accompanied by a proof which implies Lemma 2 (Lemma 2 is an instance of Lemma 3 for the number of clusters ).
Lemma 2.
For each positive integers and , there exists an wss of size .
Now, we generalize the notion of wss to the situation that witnessed strong selection property is analyzed for each cluster separately, assuming that each cluster might be in conflict with other clusters. The conflict between the clusters and means that a selection of an element from by a set from the selector is possible only in the case that the considered set does not contain any element from the cluster .
witnessed cluster aware strong selector. We say that a set is free of cluster if for all we have . A set is free of the set of clusters if it is free of each cluster . Let be a set of nodes from the cluster and be a set of clusters in conflict with the cluster . Then, a sequence of subsets of satisfies witnessed cluster aware selection property (wcss property) for with respect to if for each and each from cluster (i.e., ) there is a set such that , and (i.e., is free of clusters from ). In other words, wcss property requires that for each and each from , is selected by some , is a witness of a selection of by (i.e., ), and is free of the clusters from .
A sequence of subsets of is a witnessed cluster aware strong selector (or wcss) if for any set of clusters of size , any cluster and any set of size , satisfies the wcss property for with respect to .
Lemma 3.
For each natural , and , there exists a wcss of size .
Proof.
We use the probabilistic method. Let be a sequence of sets built in the following way. Each set is chosen randomly as follows. First, the set of “allowed” clusters is determined by adding each cluster ID from to with probability . Then, the set is determined by independently adding to it each element of each allowed cluster with probability . Let be the set of tuples such that , , , , , , . The size of is
For a fixed tuple , let be the conjunction of three independent events:

: and .

: ,

: ,
Then, occurs with probability
The sequence is a wcss when the event occurs for each element of for some index . The probability that, for all indices , does not occur for the tuple is equal to
Thus, the probability that there exists a tuple for which does not occur for all is smaller or equal to
By choosing large enough, the above probability becomes strictly smaller than 1, and therefore the probability of obtaining a wcss is positive. ∎
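The random construction used in the proof can be sketched as follows. The inclusion probabilities are placeholders (the proof tunes them to the selector parameters), and `clusters` is a hypothetical mapping from cluster IDs to their node sets.

```python
import random

def random_wcss_set(clusters, p_cluster=0.5, p_node=0.5, rng=random):
    """Draw one set of the sequence: first pick the 'allowed' clusters,
    each independently with probability p_cluster, then include each node
    of an allowed cluster independently with probability p_node."""
    allowed = {c for c in clusters if rng.random() < p_cluster}
    return {v for c in allowed for v in clusters[c] if rng.random() < p_node}
```

Repeating this draw sufficiently many times yields, with positive probability, a sequence satisfying the wcss property for every admissible tuple, which is exactly the probabilistic-method argument above.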
3.2 Basic communication under SINR interference model
Using introduced selectors, we provide some basic communication primitives on which we build our sparsification and clustering algorithms. As the proofs of stated properties are fairly standard, we present them in Appendix.
First, consider networks with constant density. Below, we state that fast communication is possible in such a case.
Lemma 4.
(Sparse Network Lemma) Let be a fixed constant and be the parameter defining the communication graph. There exists a schedule of length such that each node transmits a message which can be received (at each point) in distance from in an execution of , provided there are at most nodes in each unit ball.
The schedule defined in Lemma 4 will be informally called the Sparse Network Schedule, or SNS for short. The following lemmas imply that, for a successful transmission of a message over a link connecting a close pair, it is sufficient that some constant-size set of close neighbors is not transmitting at the same time (provided the round is free of some set of clusters of constant size ). This fact will allow us to apply wss (and wcss) for efficient communication in the SINR model.
Lemma 5.
There exists a constant (which depends merely on the SINR parameters) which satisfies the following property. Let be a close pair of nodes in an unclustered set . Then, there exists a set such that , and receives a message transmitted from provided is sending a message and no other element from is sending a message.
Lemma 6.
Let be an clustered set for a fixed . Then, there exist constants (depending only on and the SINR parameters) satisfying the following condition. For each cluster and each close pair of nodes from , there exist of size and a set of clusters of size such that receives a message transmitted by in a round satisfying the conditions:

no node from (except for ) transmits a message in the round,

the round is free of clusters from .
3.3 Proximity graphs
The idea behind the sparsification algorithm extensively applied in our solutions is to repeat the following procedure several times: identify a graph with edges connecting all close pairs and “sparsify” this graph (switch off some nodes) appropriately. To this aim, we introduce the notion of a proximity graph. For a given (clustered) set of nodes , a proximity graph of is any graph on this set such that:

vertices of each close pair are connected by an edge,

the degree of each node of the graph is bounded by a fixed constant,

for each edge of .
Using well-known strongly selective families (ssf) and Lemma 5, one can build a schedule of length such that each close pair of an unclustered network exchanges messages during an execution of . However, this property is not sufficient for a fast construction of a proximity graph, since nodes may also receive messages from distant neighbors during an execution of , which might result in large degrees. Moreover, after an execution of each node knows only the messages it received, but it is not aware of which of its own messages were received by other nodes. Finally, a direct application of ssf to a clustered network/set requires an additional increase in time complexity. In order to build proximity graphs efficiently, we use the algorithm ProximityGraphConstruction (Alg. 1). This algorithm relies on the properties of witnessed (cluster aware) strong selectors.
Our construction builds on the following observations. Firstly, if can hear in a round in which is transmitting as well, then is certainly not a close pair (otherwise, would generate interference preventing the reception of the message from by ). Secondly, given a close pair , can hear in a round in which transmits, provided:
Given an wss/wcss for constants from Lemmas 5 and 6, one can build a proximity graph in rounds using the following distributed algorithm at a node (see pseudocode in Alg. 1 and an illustration on Fig. 1):

Exchange Phase: Execute .

Filtering Phase:

Determine the set of all nodes such that has received a message from during and has not received any other message in rounds in which is transmitting (according to ). Remark: in the clustered case, ignore messages from other clusters.

If , then remove all elements from .


Confirmation Phase

Send information about the content of to other nodes in consecutive repetitions of .

Choose as the set of neighbours of in the final graph. (That is, exchange messages with its neighbours during an execution .)

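The Filtering Phase above can be expressed compactly. In the sketch below (hypothetical data structures, not the paper's pseudocode), `heard[t]` is the node from which v received a message in round t (or None), `schedule[t]` is the set of transmitters of round t, and c is the constant bound from Lemma 6.

```python
def filter_candidates(v, heard, schedule, c):
    """Keep u as a candidate close neighbor of v iff v heard u at least
    once and heard nobody else in any round in which u was transmitting;
    purge everything if more than c candidates survive."""
    heard_from = {u for u in heard.values() if u is not None and u != v}
    cand = {u for u in heard_from
            if all(heard[t] in (u, None)
                   for t in range(len(schedule)) if u in schedule[t])}
    return cand if len(cand) <= c else set()
```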
Lemma 7.
(Close Neighbors Lemma) ProximityGraphConstruction executed on a (clustered) set builds a proximity graph of constant degree in rounds. Moreover, ProximityGraphConstruction builds a schedule of size such that and exchange messages during for each edge of .
Proof.
First, we show that each close pair in is connected by an edge in , i.e., and after an execution of ProximityGraphConstruction on .
By Lemma 5 we know that, if is the only transmitter among the nodes closest to (including ), then receives the message. Since the schedule is a wcss, we know that there is a round satisfying this condition during an execution of . More precisely, we choose and from Lemma 6. Thus, and after the Exchange Phase (see Fig. 1(a)).
Since is a close pair, it is impossible that receives a message from in a round in which transmits a message (since and ). Thus, (, resp.) belongs to (, resp.) after Filtering Phase.
A node can also purge the candidate set during Filtering Phase, if the set of candidates contains more than nodes. However, as is wcss, if is a close pair then the sizes of and ar at most . Indeed, let denote the set of nodes closest to (including ). Then, for any node which is not among closest nodes to , there is a round in in which transmits uniquely in and also transmits by the witnessed strong selection property of . Thus, is eliminated from in Filtering Phase, as well as any other node not in . So, the set of candidates for is a subset of , thus its size is at most (see Fig. 1(b)).
It is clear that the degree of resulting graph is at most . The round complexity of the procedure is at most . ∎
4 Network sparsification and its applications
In this section we describe a sparsification algorithm which decreases the density of a network by removing some nodes from dense areas and assigning them to “parents” which are not removed. Simultaneously, the algorithm builds a schedule in which the removed nodes exchange messages with their parents. Using sparsification as a tool, we develop further tools for (non-)sparsified networks. Finally, a clustering algorithm is given which partitions a network (of awake nodes) into clusters in time independent of the diameter of the network.
4.1 Sparsification
A sparsification algorithm for a constant is a distributed ad hoc algorithm which, executed on a set of nodes of density , determines a set such that:

density of is at most ,

each knows whether and each has assigned such that and parent exchange messages during an execution of on .
Moreover, if is a clustered set, cluster=cluster for each with assigned .
Alg. 2 contains a pseudocode of our sparsification algorithm. The algorithm builds a proximity graph of a network several times. Each time a proximity graph is determined, an independent set of is computed such that:

for each dense cluster , at least one element of is in (clustered case) or

for each dense unit ball , there is an element in located close to the center of .
Then, some elements of are linked with their neighbours in by the parent/child relation and removed from the set of nodes attending consecutive executions of ProximityGraphConstruction. In this way, the density of the set of nodes attending the algorithm gradually decreases. Details of the implementation of the algorithm (including differences between the clustered and unclustered variants) and an analysis of its efficiency are presented below.
For a clustered network, IndependentSet is chosen as the set of local minima in a proximity graph, that is, the nodes whose identifiers are smaller than the identifiers of all their neighbors in the graph.
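The local-minima rule can be sketched centrally as follows; `adj` is a hypothetical adjacency map of the proximity graph, keyed by node ID. Since two adjacent nodes cannot both be smaller than each other, the resulting set is independent.

```python
def local_minima(adj):
    """A node joins the independent set iff its ID is smaller than the IDs
    of all its neighbors -- checkable locally after one message exchange
    with the neighbors in the proximity graph."""
    return {u for u, nbrs in adj.items() if all(u < v for v in nbrs)}
```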
Lemma 8.
Alg. 2 is a sparsification algorithm for clustered networks; it works in time .
Proof.
First, let us discuss an implementation of the algorithm in a distributed ad hoc network. A proximity graph is built by Alg. 1 in rounds. Moreover, after an execution of Alg. 1, all nodes know their transmission patterns in the schedule of length such that each pair of neighbors in exchanges messages during . Therefore, each node knows its neighbors in as well. In order to execute the remaining steps in the for-loop, nodes determine locally whether they belong to NewChl. Then, the elements of NewChl choose their parents and send messages to them using the schedule . Simultaneously, each node updates its local children variable if it receives messages from its new child(ren). (Note that the directed graph connecting children with their parents is acyclic.)
Lemma 7 implies that the graph built by ProximityGraphConstruction connects by an edge each close pair. Thus, by Lemma 1.2, contains at least one edge connecting elements of , for each dense cluster . Thus, contains at least one element of and, since edges connect only nodes of the same cluster, at least one element of determines its parent inside . As a result, at least two elements of are removed from Active in each execution of ProximityGraphConstruction.
These properties guarantee that each repetition of the main forloop decreases the number of elements of Active in each dense cluster. Therefore, each cluster contains at most elements of Active after repetitions of the loop.
Let , , be the intersections of Active, Prnts and GlobChl with a cluster . Above, we have shown that , while the algorithm returns . Thus, in order to prove the lemma, it is sufficient to show that . For each cluster , each element of from this cluster has a nonempty subset of the cluster assigned as its children, and the sets of children of distinct nodes are disjoint. Thus, the number of elements of in the cluster is at most as large as the number of elements of in that cluster: . Summing up, , , . As , , and are disjoint, the finally returned set contains at most elements from the considered cluster . ∎
For unclustered networks, IndependentSet is determined by a simulation of the distributed Maximal Independent Set (MIS) algorithm for the LOCAL model [26], which, for graphs of constant degree, works in time . As ProximityGraphConstruction builds an time schedule in which each pair of neighbors (of the built proximity graph) exchanges messages, we can simulate each step of the algorithm from [26] in rounds. In contrast to clustered networks, we are unable to guarantee that a single execution of Sparsification reduces the density of an unclustered network. This is due to the fact that a node from a dense unit ball might become the parent of a node outside of this ball; as a result, the number of elements in the considered unit ball is not reduced during an execution of Sparsification. Therefore, we have to deal with the unclustered case more carefully. Let and let SparsificationU be an algorithm which executes Sparsification for , where , is the set returned by Sparsification, and is the set returned as the final result.
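The wrapper SparsificationU can be sketched as a simple loop. In this sketch the names are hypothetical: `sparsify` stands for one execution of Sparsification, returning the surviving set together with the schedule built for it.

```python
def sparsification_u(nodes, repetitions, sparsify):
    """Repeat Sparsification, feeding each surviving set into the next
    execution; collect the intermediate sets and the schedules in which
    removed nodes talk to their parents."""
    sets, schedules = [set(nodes)], []
    for _ in range(repetitions):
        active, sched = sparsify(sets[-1])
        sets.append(active)
        schedules.append(sched)
    return sets, schedules
```

The returned sequences correspond to the sets and schedules in the statement of Lemma 9 below.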
Lemma 9.
An execution of SparsificationU on an unclustered set of density returns a sequence of sets and schedules such that the density of is at most , and each node exchanges messages with its parent during . Moreover, SparsificationU works in time .
Proof.
The analysis of the sparsification algorithm for clustered networks relies on the fact that the parent/child of each node belongs to the same cluster as . Thanks to that, each parent removed from Active (and added to Prnts) can be associated with at least one child in the same cluster, and this child will not belong to the sparsified set returned by the algorithm. In an unclustered network, it might be the case that nodes from a dense unit ball become parents of other nodes which do not belong to . That is why the reasoning from Lemma 8 does not apply here.
We define an auxiliary notion of saturation. For a unit ball , the saturation of with respect to the set of nodes is the number of elements of in . As long as contains at least nodes during an execution of ProximityGraphConstruction, there is a close pair in (Lemma 1) and therefore its elements are connected by an edge in . Thus, or is not in the computed MIS (line 5); w.l.o.g. assume that is not in the MIS. Then is dominated by a node from the MIS, and therefore it becomes a member of GlobChl and is switched off. This in turn decreases the saturation of . Thus, an execution of Sparsification results either in decreasing the number of nodes in to or in reducing the saturation of by at least . As might be covered by unit balls, there are at most nodes in and the saturation can be reduced at most times. Hence, repetitions of Sparsification eventually lead to the reduction of the number of elements of to . ∎
4.2 Complete Sparsification
A complete sparsification algorithm is a distributed ad hoc algorithm/schedule which, executed on an clustered set of nodes of density determines sets such that (each node knows whether ), and, for each :

the density of is at most for constant and ,

each has assigned such that and parent exchange messages during and cluster=cluster (in the clustered case).
Below, we describe a complete sparsification algorithm. In Lemma 10, the properties of this algorithm (following directly from Lemma 8) are summarized.
Lemma 10.
Algorithm 4 is a complete sparsification algorithm for clustered networks which works in time .
4.3 Imperfect labelings of clusters
Using a clustering of a set with density , it is possible to efficiently build an imperfect labeling of , where is a constant which depends merely on and the SINR parameters.
Lemma 11.
Assume that an clustering of a set of density is given. Then, it is possible to build an imperfect labeling of in rounds, where depends merely on and the SINR parameters.
Proof.
Let and be the schedules and the sets obtained as the result of an execution of CompleteSparsification. CompleteSparsification splits each cluster into trees, defined by the child-parent relation built during an execution of CompleteSparsification. (Indeed, the elements of are the roots of the trees.) The schedule (and its reverse ) of length allows for bottom-up (top-down) communication inside these trees. Using and , one can implement a tree-labeling algorithm as follows. First, each node learns the size of its subtree in a bottom-up communication. Then, the root starts the top-down phase, where each node assigns the smallest label in a given range to itself and assigns appropriate subranges to its children. More precisely, the root starts from the range , where is the size of the tree. Given the interval , each node assigns as its own ID and splits among its subtrees. ∎
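The two passes of the tree-labeling argument can be sketched centrally as follows; `children` is a hypothetical map from a node to the list of its children in one of the trees built by CompleteSparsification.

```python
def label_tree(children, root):
    """Bottom-up pass: compute subtree sizes. Top-down pass: each node
    takes the smallest label of its interval and hands disjoint
    consecutive subintervals to its children."""
    size = {}

    def subtree_size(u):
        size[u] = 1 + sum(subtree_size(c) for c in children.get(u, []))
        return size[u]

    subtree_size(root)
    label = {}

    def assign(u, lo):
        label[u] = lo          # smallest label of the node's range
        nxt = lo + 1
        for c in children.get(u, []):
            assign(c, nxt)     # child gets the next subrange of size[c] labels
            nxt += size[c]

    assign(root, 1)
    return label
```

The labels are pairwise distinct and fill the range from 1 to the size of the tree, which is exactly what the labeling of a cluster requires.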
4.4 Reduction of radius of clusters
In this section we show how to reduce the radius of a clustering. Given an clustering of a set of density , our goal is to build an clustering of . The idea is to repeat the following steps several times. First, is sparsified so that nodes remain from each nonempty cluster. Then, a maximal independent set (MIS) of this sparse set is determined on the graph with edges connecting nodes which exchange messages during an execution of Sparse Network Schedule (see Lemma 4). The elements of the MIS become the centers of new clusters in the new clustering and they execute Sparse Network Schedule. A node (not in the MIS) which receives a message from during this execution of SNS becomes an element of and it is removed from further consideration. As we show below, a clustering of can be obtained in this way efficiently by Alg. 5.
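The MIS computed in this step is obtained by the distributed simulation discussed earlier; as a centralized stand-in, the classical sequential greedy rule (names hypothetical, `adj` an adjacency map) yields the same kind of set:

```python
def greedy_mis(adj):
    """Sequential greedy maximal independent set: scan nodes in ID order
    and add a node unless one of its neighbors is already in the set."""
    mis = set()
    for u in sorted(adj):
        if not (adj[u] & mis):
            mis.add(u)
    return mis
```

By construction the set is independent, and every omitted node has a neighbor in the set, i.e., every node is dominated by a cluster center.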
Lemma 12.
Assume that a clustering of a set of density is given for a fixed constant . Then, Algorithm 5 builds clustering of in rounds.
Proof.
By Lemma 8, the set has density and it contains at least one element from each nonempty cluster of . Each pair of nodes from in distance exchanges messages during step 4, therefore the graph contains the communication graph of . As computed in step 6 is a MIS of in the graph and contains at least one element from each nonempty cluster, there is an element of in the close neighborhood of each cluster. More precisely, for a dense cluster with nodes located inside , there is an element of the computed maximal independent set (MIS) in . Indeed, by Lemma 10, there is from in . Thus, , where is the center of the cluster . Either is in the computed MIS or is in distance from an element of the MIS. In the former case, . Then, steps 7–9 assign all nodes in distance (and some in distance ) from elements of to new clusters included in unit balls (defined by the elements of ). Thus, the nodes from each “old” cluster are assigned to new clusters after repetitions of the main for-loop. ∎
4.5 Clustering algorithm
In this section we provide an algorithm which, given an unclustered set , builds a clustering of . The algorithm consists of two main parts. In the first part, the sequence of sets is built using SparsificationU, for and . By Lemma 9, the density of is at most . Moreover, a sequence of schedules is built such that each exchanges messages with its parent. In the second part, we start from a clustering of , obtained by assigning each node to a separate cluster. Then, given an clustering of for , we get an clustering of by executing with messages equal to the cluster IDs of the transmitting nodes. The elements of choose the clusters of their parents. Using RadiusReduction (Lemma 12), the obtained clustering is transformed into an clustering. A pseudocode is presented in Alg. 6.
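In the second part, a node removed by SparsificationU adopts the cluster of its parent; since parent links are acyclic, resolving them amounts to following each chain up to a node whose cluster is already fixed. A minimal sketch (names hypothetical):

```python
def inherit_clusters(parents, base_cluster):
    """Each removed node takes the cluster ID of its parent; parent links
    form acyclic chains ending in nodes whose clusters are already fixed
    (the surviving, clustered nodes)."""
    cluster = dict(base_cluster)

    def resolve(u):
        if u not in cluster:
            cluster[u] = resolve(parents[u])
        return cluster[u]

    for u in parents:
        resolve(u)
    return cluster
```

In the actual distributed execution this propagation happens implicitly: parents announce their cluster IDs in the schedules built by SparsificationU, one level per schedule.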
Theorem 1.
The algorithm Clustering builds clustering of an unclustered set of density in time .