Rapid self-organised initiation of ad hoc sensor networks close above the percolation threshold
Abstract
This work shows potentials for rapid self-organisation of sensor networks where nodes collaborate to relay messages to a common data collecting unit (sink node). The study problem is, in the sense of graph theory, to find a shortest path tree spanning a weighted graph. This is a well-studied problem where for example Dijkstra’s algorithm provides a solution for non-negative edge weights. The present contribution shows by simulation examples that simple modifications of known distributed approaches can provide significant improvements in performance. Phase transition phenomena, which are known to take place in networks close to percolation thresholds, may explain these observations. An initial method, which here serves as reference, assumes the sink node starts organisation of the network (tree) by transmitting a control message advertising its availability to its neighbours. These neighbours then advertise their current cost estimate for routing a message to the sink. A node which in this way receives a message implying an improved route to the sink advertises its new finding and remembers which neighbouring node the message came from. This activity proceeds until there are no more improvements to advertise to neighbours. The result is a tree network for cost-effective transmission of messages to the sink (root). This distributed approach has potential for simple improvements which are of interest when minimisation of storage and communication of network information is a concern. Fast organisation of the network takes place when the number of connections for each node (degree) is close above its critical value for global network percolation and at the same time there is a threshold for the nodes to decide to advertise network route updates.
1 Transmission within sensor networks
This contribution shows simulated examples of simple and rapid organisation of large ad hoc sensor networks [1]. The work may have direct interest for design and application of sensor networks. The present practical examples may also impact theoretical development [2] and have general interest for understanding network organisation outside the scope of sensor and computer networks.
The present work is about minimisation of transmission power for sending messages between nodes in a sensor network. This is directly relevant for underwater sensor networks based on acoustic communication, where energy consumption for transmission dominates total energy consumption and hence battery lifetime [3].
Rapid network organisation with a minimum of control traffic and low transmission power is a general protection measure for ad hoc sensor networks. The amount of control traffic and the level of transmission power directly affect the probability that an outside observer discovers and maps a sensor network. Minimisation of storage of network information in nodes (k-local information) and low processing complexity are also general protection measures, reducing opportunities for malicious attacks.
2 The contribution of this work
Assume a connected weighted graph in the sense of graph theory. It is a well-studied problem to find the shortest path tree spanning such a graph [4, 5, 6, 7, 8]. Vertices are below called ’nodes’, edges are called ’links’ and weights are called ’link cost’. This wording is due to the present focus on sensor networks. The present simulations demonstrate potentials for simple, fast, silent (few messages between nodes) and self-organised build-up of a spanning tree for efficient data relaying from any node (sensor) to a common sink node (root). A point is also to minimise storage of network information in single nodes. Each node has (or can generate) an estimate of a cost (a real number) for transmitting a message back to a node from which it directly receives a message (for example a function of the ratio between received and transmitted energy, assuming the node can measure the received power and knows the initial output power).
The present simulations are for 4000 nodes with random positions within a flat (2-D) square area of . The nodes can transmit messages to each other within a restricted range (assuming electromagnetic or acoustic communication). Each node initially stores (for formal reasons) a cost estimate equal to infinity for relaying a message to the sink node. The sink node at first advertises to its neighbours a cost estimate equal to zero for sending a message to the sink (i.e. to itself). This gives cost estimates for the neighbours, which they further advertise to their neighbours. In general, a node which receives a cost estimate from a neighbour (which might be the sink) adds the cost of transmitting directly back to that neighbour to the received estimate. This gives a candidate cost estimate for sending a message to the sink. If the candidate represents an improved path to the sink (i.e. it is lower than the stored estimate), the node replaces its stored estimate with the candidate. It then also sets a pointer to the neighbour that sent the estimate. In addition it advertises its latest improved cost estimate to its neighbours. This procedure proceeds until convergence, and the set of such pointers gives a shortest path tree spanning all nodes.
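The procedure of this paragraph can be sketched as follows. This is only an illustration under stated assumptions: links are symmetric and held in a plain dictionary, a node whose advertisement is already pending simply sends its latest value, and the name `build_sink_tree` as well as the four-node example with its link costs are hypothetical and not taken from the present simulations.

```python
import math
from collections import deque

def build_sink_tree(links, sink):
    """links maps node -> {neighbour: link cost}; returns cost, pointer, messages."""
    cost = {node: math.inf for node in links}      # initial estimate: infinity
    pointer = {node: None for node in links}       # next hop towards the sink
    messages = {node: 0 for node in links}         # advertisements sent per node
    cost[sink] = 0.0
    pending = deque([sink])                        # nodes waiting to advertise
    queued = {sink}
    while pending:
        sender = pending.popleft()
        queued.discard(sender)
        messages[sender] += 1                      # one broadcast to all neighbours
        for receiver, link_cost in links[sender].items():
            candidate = cost[sender] + link_cost   # received estimate + link cost
            if candidate < cost[receiver]:         # improved path to the sink
                cost[receiver] = candidate
                pointer[receiver] = sender
                if receiver not in queued:         # schedule a new advertisement
                    pending.append(receiver)
                    queued.add(receiver)
    return cost, pointer, messages

# Hypothetical example network (link costs chosen for illustration only).
example_links = {
    "s": {"a": 1.0, "b": 4.0},
    "a": {"s": 1.0, "b": 1.0, "c": 5.0},
    "b": {"s": 4.0, "a": 1.0, "c": 1.0},
    "c": {"a": 5.0, "b": 1.0},
}
cost, pointer, messages = build_sink_tree(example_links, "s")
print(cost, pointer, messages)
```

In this small example node c ends up pointing to b rather than to a, because the two short hops via b are cheaper than the direct but expensive link to a.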
The above method to find a covering minimum path length tree is well-known. However, this work tests the following modifications to this approach:

Restrict the set of edges (neighbours) in the original network (subtract) to the nearest neighbours so that the network is close to losing its global connectivity. This means, in other words, that nodes ignore their most distant neighbours.

Restrict advertisement of improved cost estimates (so that nodes report to their neighbours only “significant” improvements of their cost estimates for the path to the sink).
These restrictions can give a surprisingly fast network layout where each node typically needs to transmit only 2-4 messages to its neighbours during build-up.
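The first restriction can be illustrated with a short sketch. It assumes that each node keeps only its cheapest links and that an advertisement reaches exactly those nodes which still listen to the sender. The function names and the five-node example are hypothetical. The check `all_reachable` indicates whether the sink's advertisement can still spread to every node, i.e. whether the restricted network is still globally connected.

```python
def restrict_to_nearest(links, k):
    """For every node, keep only the k cheapest links it listens to."""
    restricted = {}
    for node, neighbours in links.items():
        cheapest = sorted(neighbours.items(), key=lambda item: item[1])[:k]
        restricted[node] = dict(cheapest)
    return restricted

def all_reachable(listens_to, sink):
    """True if advertisements starting at the sink can reach every node."""
    reached = {sink}
    frontier = [sink]
    while frontier:
        sender = frontier.pop()
        for node, heard in listens_to.items():
            if sender in heard and node not in reached:
                reached.add(node)
                frontier.append(node)
    return len(reached) == len(listens_to)

# Hypothetical example network (link costs chosen for illustration only).
example_links = {
    "s": {"a": 1.0, "b": 4.0},
    "a": {"s": 1.0, "b": 1.0, "c": 5.0},
    "b": {"s": 4.0, "a": 1.0, "c": 1.0},
    "c": {"a": 5.0, "b": 1.0, "d": 0.5},
    "d": {"c": 0.5},
}
one_neighbour = restrict_to_nearest(example_links, 1)
two_neighbours = restrict_to_nearest(example_links, 2)
print(all_reachable(one_neighbour, "s"), all_reachable(two_neighbours, "s"))
```

Note that the restricted link set is in general not symmetric: a node may listen to a neighbour which does not listen back.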
The actual cost for transmitting data here increases with distance faster than linearly. This favours relaying data through many small hops as compared to few large (spatially long) hops. Hence reducing the actual connections to the ’nearest’ neighbours (in the sense of transmission cost) allows for reduction of radio transmission range and of the probability of communication interference (data packet loss).
Several authors have discussed connectivity in wireless ad hoc networks as a function of the number of node neighbours [9, 10, 11]. At least eight neighbours for each node are enough to assure connectivity within a set of nodes with random distribution on a flat (two-dimensional) area.
Authors have applied ideas on the existence of a critical transmission power and also percolation theory to investigate connectivity in ad hoc sensor networks [12, 13, 14, 15, 16, 17]. Flaxman et al. [13] considered a setting of unreliable nodes and found a critical radius that guarantees multi-hop communication links where nodes have a random distribution within a square area. Muthukrishnan and Pandurangan [18] similarly found critical values for transmission range to ensure connectivity, and path length estimates, using random graph analysis. An intuition behind the present work is to see if network formation can benefit from (critical) phenomena, such as change in correlation lengths or ’temporal alignments’ or order, taking place in networks close to their percolation threshold. Tuning of the number of neighbours and of restrictions for sending control messages affects correlation lengths emerging in systems close to their phase transitions (see for example Refs. [19, 20, 21]).
A reason to pay attention to the present approach is its simplicity. Alternatives can for example be to use dominating (sub)sets of nodes to reduce control traffic and data packet collisions in ad hoc sensor networks [22, 23, 24, 25]. A selected subset of nodes here functions as a traffic backbone. The present work assumes no such structure or dependencies in the network, which would add complexity. The present method for relaying data to a sink node also requires no network-wide (global) identity for nodes (it assumes for example no IP addresses).
3 The sink direction protocol
3.1 Finding minimum cost routes to a sink
This section addresses locally based organisation of cost-optimal multi-hop routing from a set of sensor nodes to a common sink or receiver. The approach in this section serves as a reference below and it is similar to the common search for optimal paths on road maps (cf. Dijkstra’s algorithm [6]). Eq. 2 here defines the cost for transmitting a signal between nodes. It exemplifies a general class of cost functions increasing with distance between transmitter and receiver. Each node in the established tree network has a pointer telling to which neighbour to direct a message for further routing to the sink (root). Actual cost functions are additive so that the total cost of a route is the sum of the costs of each step along the route.
A simple method to create a tree network for data routing is that any node receiving information about a decreased value for the cost of sending a message to the sink transmits to its neighbours a message telling its identity and its new cost value. A node which receives such a message updates its pointer and cost estimate if the message represents a better (more cost-effective) way to the sink. Each node $i$, in other words, keeps a variable $c_i$ (route cost) whose value is an estimate of the total cost for transmitting a message to the sink $s$. This variable is, for formal reasons, infinite (i.e. $c_i = \infty$) for each node which has not received a message. When a node $i$ receives a message from a neighbour $j$ giving an improved cost estimate, i.e. if
$$c_j + c_{ij} < c_i \qquad (1)$$
where $c_{ij}$ is the cost of transmitting directly from $i$ back to $j$, then it sets its pointer to the edge leading to $j$, which then is part of a finite cost route from $i$ to the sink $s$. Section 4 gives a modified version of this simple sink direction method which possesses improved convergence.
Figure 1 shows an example from simulation of initiation of an energy optimal tree network where transmission cost per packet, , scales with range as proposed by Stojanovic [3] for underwater acoustic systems:
(2) 
and are in this example both for simplicity set to 100 m.
The simulation is for 4000 nodes with a random distribution within a square area of . The transmission range is 300 m. The average number of neighbours (node degree) is in this case 70 for nodes more than 300 m from the border. The nodes can send data packets to their neighbourhood at given time steps (for example each second). Note that the routes tend to consist of small steps (due to the definition of link cost).
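The stated node count, transmission range and average degree can be checked against each other with a short calculation. For uniformly random positions, a node farther than one transmission range from the border has on average (N - 1)·π·r²/A neighbours, where A is the area of the square. The sketch below back-computes the side length implied by the stated values; the function names are hypothetical, and the 4000 m side used in the second call is an assumption made only for this consistency check.

```python
import math

def expected_degree(n_nodes, side, radius):
    """Expected neighbour count for a node at least `radius` from the border."""
    return (n_nodes - 1) * math.pi * radius ** 2 / side ** 2

def side_for_degree(n_nodes, radius, degree):
    """Side length of the square that would give the stated mean degree."""
    return radius * math.sqrt((n_nodes - 1) * math.pi / degree)

# Values from the text: 4000 nodes, range 300 m, mean interior degree about 70.
implied_side = side_for_degree(4000, 300.0, 70.0)        # in metres
# Assumed side of 4000 m (hypothetical, used only to check consistency).
degree_at_4km = expected_degree(4000, 4000.0, 300.0)
print(round(implied_side), round(degree_at_4km, 1))
```

With these numbers the implied side length comes out close to 4 km, which is consistent with an average degree of about 70 for interior nodes.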
The author made the software for the present simulations (which produced data for Figure 1) by direct programming in Ada 2005 (GNU Ada under Linux). The size of the actual program illustrates the low complexity of the present approach (about 300 Ada program statements in total). A simulation took less than one minute on a common laptop with a 64-bit CPU (the quickest simulations, below called the “rapid method”, took less than 4 seconds). This simulation included generation of random node positions and link connections as illustrated by Figure 1.
4 Modification of the sink direction protocol
4.1 Using link cost for temporal control
Assume a network as in Section 3.1 above. Let “agents” start to walk along each edge from a sink node with a constant velocity. Each time an agent arrives at a node, it triggers other agents to proceed with the walk at the same velocity along adjacent paths. The first arrival at a node in this case gives the shortest path to the sink. This is an intuitive distributed implementation of Dijkstra’s breadth-first search algorithm [6]. The “agent walk” can be defined to be self-avoiding since a walk that repeats the same path cannot be an optimal path.
Synchronised clocks at the nodes give the opportunity to implement this idea, looking at link cost as “road length”. Assume a number of nodes as above. The sink node, as above, initiates tree network formation by sending a message telling its link cost (equal to zero). Assume the nodes have synchronised clocks giving the time $t$ elapsed since the sink sent its message starting network organisation. Any node $i$ (as above) keeps its cost estimate $c_i$ updated according to estimates received. However, it delays advertising its cost estimate until the condition $v t \ge c_i$ is fulfilled (where $v$ is the assumed “velocity” as above, or simply cost per time unit). Assume here that the probability for simultaneous advertisements is zero.
Time synchronisation here functions as an alternative to central synchronisation by the sink node as described in [8]. Each node in this case sends only one message during the network (tree) formation. However, access to a common time parameter normally requires (for example radio) receivers or clocks and communication to synchronise them. The nodes could keep track of time since the start of a signal from the sink. This requires estimates of signal travel times, for example in water (for underwater sensor networks based on acoustic communication). Such time control for sending messages also requires conservative waiting times (i.e. a “velocity” small enough) to assure that a node does not receive a better cost estimate after it has already advertised its cost estimate.
Time-synchronised search as described above seems to fall in the category of centrally designed systems. The alternative non-synchronised and distributed approach below may therefore have more general interest, for example for signalling within biological systems.
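A minimal sketch of the timed variant follows, assuming ideal conditions: perfectly synchronised clocks, negligible signal travel time and positive link costs. Under these assumptions a node that advertises at the moment the elapsed time times the velocity reaches its cost estimate behaves like a node being settled in Dijkstra's algorithm, so a priority queue ordered by cost reproduces the order of advertisements. The function name, the `velocity` parameter value and the example network are hypothetical.

```python
import heapq
import math

def timed_sink_tree(links, sink, velocity=1.0):
    """Each node advertises exactly once, at time (its final cost) / velocity."""
    cost = {node: math.inf for node in links}
    pointer = {node: None for node in links}
    send_time = {}                          # node -> time of its single message
    cost[sink] = 0.0
    heap = [(0.0, 0, sink)]                 # (cost, tie-breaker, node)
    counter = 1
    while heap:
        node_cost, _, node = heapq.heappop(heap)
        if node in send_time:               # this node has already advertised
            continue
        send_time[node] = node_cost / velocity
        for neighbour, link_cost in links[node].items():
            candidate = node_cost + link_cost
            if candidate < cost[neighbour]:
                cost[neighbour] = candidate
                pointer[neighbour] = node
                heapq.heappush(heap, (candidate, counter, neighbour))
                counter += 1
    return cost, pointer, send_time

# Hypothetical example network (link costs chosen for illustration only).
example_links = {
    "s": {"a": 1.0, "b": 4.0},
    "a": {"s": 1.0, "b": 1.0, "c": 5.0},
    "b": {"s": 4.0, "a": 1.0, "c": 1.0},
    "c": {"a": 5.0, "b": 1.0},
}
cost, pointer, send_time = timed_sink_tree(example_links, "s", velocity=2.0)
print(cost, send_time)
```

A smaller velocity stretches all sending times, which corresponds to the conservative waiting times mentioned above.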
4.2 Application of one-step neighbour lists
Assume a set of nodes acts as in Section 3.1 above. Each node in addition initially collects the identity of the neighbours of its neighbours and their links with associated cost (neighbour lists). Consider, in this case, a node which receives a message directly from a neighbour telling that neighbour’s present estimate of the cost to transport a data packet to the sink node. If the condition of Eq. 1 holds, the receiving node can (as in Section 3.1 above) update its cost estimate and set its pointer to point to the sender. It could then advertise the value of its new cost estimate. However, the receiving node can use its information about its neighbourhood to check whether it can expect a better cost estimate later: this is the case if there is a multi-hop route from the sender to the receiver via the receiver’s neighbours which gives an even better (total) cost estimate for routing data to the sink. The receiving node will in this case wait to transmit its latest cost estimate until it receives a better cost estimate from one of its neighbours. This reduces the number of messages transmitted during network build-up (as compared to the approach of Section 3.1). Note, however, that creation of neighbour lists for each node itself requires messages, which reduces the net gain with respect to minimising the number of messages.
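The waiting test can be sketched as below. As a simplifying assumption the sketch only looks for two-hop detours (sender to a common neighbour, then to the receiver), which is what one-step neighbour lists allow a node to evaluate directly; longer multi-hop detours are not considered. The function name and the example network are hypothetical.

```python
def should_wait(node, sender, links):
    """True if a two-hop route from sender to node is cheaper than the direct link."""
    direct = links[node][sender]
    for relay, cost_to_relay in links[node].items():
        if relay == sender:
            continue
        relay_to_sender = links[relay].get(sender)   # from the relay's neighbour list
        if relay_to_sender is not None and cost_to_relay + relay_to_sender < direct:
            return True                              # a better estimate is expected
    return False

# Hypothetical example network (link costs chosen for illustration only).
example_links = {
    "s": {"a": 1.0, "b": 4.0},
    "a": {"s": 1.0, "b": 1.0, "c": 5.0},
    "b": {"s": 4.0, "a": 1.0, "c": 1.0},
    "c": {"a": 5.0, "b": 1.0},
}
# Node c hears a directly (cost 5) but can expect a cheaper route via b (1 + 1).
print(should_wait("c", "a", example_links), should_wait("b", "a", example_links))
```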
4.3 Relaxation of condition to inform neighbours on new cost estimates
The sink direction protocol above makes nodes create a pointer system converging towards a minimum path tree for transmitting single messages from a node to the sink. This work shows by example that simple modifications can significantly reduce the number of messages during this process. The modifications are:

The following condition (test) replaces Eq. 1:
(3) where the constant defines a threshold for a node to tell neighbours about improved cost estimate. The results below are from simulations with .

The nodes only listen to their nearest neighbouring nodes (in terms of link cost). The results below are from simulations with .
Eqs. 1 and 3 define the decision of a node to transmit to its neighbours its newest (best) cost estimate. The present simulations employ a random time delay from when the node recognises validity of this condition (defined by Eqs. 1 and 3) until the transmission actually takes place. This delay time has a uniform distribution in the range time steps (seconds). Random delay times for sending data packets are a common technique to avoid packet collisions. A node may receive messages from the neighbours during the delay time.
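The combination of a threshold and random sending delays can be sketched as an event-driven simulation. Several points here are assumptions, since the exact form of Eq. 3 and the delay range are not reproduced in this sketch: the threshold is taken to be a multiplicative factor (a received estimate is accepted and re-advertised only if the factor times the candidate cost is below the stored cost), and `max_delay`, `seed`, the function name and the example network are hypothetical.

```python
import heapq
import math
import random

def build_tree_with_threshold(links, sink, threshold=1.0, max_delay=10.0, seed=0):
    """Event-driven sketch: improvements are advertised after a random delay."""
    rng = random.Random(seed)
    cost = {node: math.inf for node in links}
    pointer = {node: None for node in links}
    messages = {node: 0 for node in links}
    cost[sink] = 0.0
    events = [(0.0, 0, sink)]               # (send time, tie-breaker, sender)
    counter = 1
    scheduled = {sink}                      # nodes with a pending transmission
    while events:
        now, _, sender = heapq.heappop(events)
        scheduled.discard(sender)
        messages[sender] += 1               # sender advertises its current cost
        for receiver, link_cost in links[sender].items():
            candidate = cost[sender] + link_cost
            # Assumed multiplicative threshold; threshold = 1 gives the plain test.
            if threshold * candidate < cost[receiver]:
                cost[receiver] = candidate
                pointer[receiver] = sender
                if receiver not in scheduled:
                    delay = rng.uniform(0.0, max_delay)
                    heapq.heappush(events, (now + delay, counter, receiver))
                    counter += 1
                    scheduled.add(receiver)
    return cost, pointer, messages

# Hypothetical example network (link costs chosen for illustration only).
example_links = {
    "s": {"a": 1.0, "b": 4.0},
    "a": {"s": 1.0, "b": 1.0, "c": 5.0},
    "b": {"s": 4.0, "a": 1.0, "c": 1.0},
    "c": {"a": 5.0, "b": 1.0},
}
cost_exact, _, _ = build_tree_with_threshold(example_links, "s", threshold=1.0)
cost_fast, pointer_fast, messages_fast = build_tree_with_threshold(
    example_links, "s", threshold=1.5, seed=3)
print(cost_exact, cost_fast, messages_fast)
```

With a threshold of 1 the sketch converges to the minimum cost tree regardless of the delays. With a larger threshold every node is still reached, but the resulting costs are only guaranteed to be at least the optimal ones.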
Figure 2 shows time series of total number of messages (from all 4000 nodes) produced via the present (simulated) approach.
The upper graph (green, with label ”local*, ”) shows results from a naive local procedure where each node reports any improved cost estimate (given by the condition set by Eq. 1). The graph next below the upper graph (with label ”local, ”) shows cost estimates where the number of neighbours is restricted to and (same as for upper graph). The red graph shows results from a simulation where nodes use neighbourhood information as described in Section 4.2.
Figure 3 shows the distribution (histogram) of the number of messages sent by each node to generate a tree network to relay data to the sink for respectively and . Note that for the case and 8 neighbours () each node typically needs to transmit only three to four messages to generate the whole tree network. Hence there is little room for further improvements of performance defined as number of calculations and transmissions of signals (however, the resulting tree network is not fully cost optimal, as noted below). The number of primitive calculations scales linearly with the number of nodes. Note that the non-local method includes transmissions to obtain neighbour lists from the neighbours. The numerical results above do not include this type of initial traffic.
The simulations with resulted in a tree network which is not cost optimal for transport of data to the sink (Figure 4 shows the distribution of cost).
Figure 5 shows the distribution of the number of nodes on the route (downstream) from a node to the sink for two simulations with respectively and where is number of neighbours and is as above. The figure shows that the most rapid method () actually gives fewer nodes on the route down to the sink as compared to the more comprehensive search method (). One might guess that the opposite would take place, since restricting the neighbourhood to the few nearest neighbours would make more small hops.
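The number of nodes on the route down to the sink can be obtained by following the pointers of the finished tree. The sketch below assumes the pointers are held in a dictionary mapping each node to its next hop; the function name and the small example tree are hypothetical.

```python
from collections import Counter

def hops_to_sink(pointer, sink):
    """Number of hops from each node to the sink (None if no route exists)."""
    hops = {}
    for node in pointer:
        count, current = 0, node
        while current != sink:
            current = pointer.get(current)   # follow the pointer one step
            if current is None:              # broken chain: node has no route
                count = None
                break
            count += 1
        hops[node] = count
    return hops

# Hypothetical pointer tree: c -> b -> a -> s.
example_pointer = {"s": None, "a": "s", "b": "a", "c": "b"}
hops = hops_to_sink(example_pointer, "s")
histogram = Counter(hops.values())           # distribution of route lengths
print(hops, histogram)
```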
Figure 6 shows average values for the 4000 nodes in the present simulations. It shows the (average) number of messages from a node and the cost of the path from the node to the sink. These (node average) values are for the threshold in the range 1 to 1.5 and network degree . The (average) number of messages per node here increases from about 2 for and to about 30 for and . Such a significant change in the number of messages suggests a “regime shift” in the network signalling without a similar change in cost.
5 Discussion
The present sink direction protocol has similarities to heat conduction (elliptic problems). The application of Eq. 1 resembles solving a heat equation where the cost estimate at a node is the temperature which relaxes (adapts) to the neighbourhood. The application of Eq. 3 similarly resembles an energy distribution with a quantum (barrier) for small scale energy movements. One may speculate whether the Ising model can be used to make fast and favourable self-organised long-range correlations in sensor networks and minimise network-related control traffic.
Figure 7 illustrates how long-range correlation can emerge in a situation of a threshold and network degree close to the threshold for global network percolation. For these values of and , there is a larger probability that the process will more directly (with fewer updates) give the shortest/stable path as compared to otherwise (i.e. for and larger than necessary for global network percolation). This tendency will increase for subsequent link updates as the tree approaches the node far from .
This work assumes, for simplicity, no packet loss. Packet loss may affect convergence, but it seems intuitively not to affect the main conclusions of this work. Lost messages can cause delay and redundant transmissions.
The sink direction protocol generates pointers forming a tree network leading to a data collection unit (sink node). Data may in this situation be lost if a node does not perform its task to pass on data packets. Figure 5 shows the distribution of number of upstream nodes within the (simulated) network of Figure 1. Both of these figures indicate that data from large parts of the network may not reach the sink node if another node close to it malfunctions. However, several nodes can normally detect, only by listening, whether a node within their neighbourhood does not perform its data packet relay function. Hence one of the neighbouring nodes may take over its function to relay data. Data streams (use of the network) may constantly be used to optimise the network (reorganisation of pointers) if data packets contain information about (one-step) transmission cost (given by for example Eq. 2). This also gives opportunities for fine-tuning of transport ways. Note that this behaviour is similar to biological systems.
The present protocol may allow nodes to be passive in data relaying for example by simply not advertising their cost estimates. These nodes can still take part in data collection as (leaf) nodes in the tree network (but not initially relay data).
References

 [1] I. F. Akyildiz, W. Su, Y. Sankarasubramaniam, E. Cayirci, A survey on sensor networks, IEEE Communications Magazine (2002) 102–114. URL http://citeseer.ist.psu.edu/akyildiz02survey.html
 [2] S. N. Dorogovtsev, J. F. F. Mendes, Evolution of Networks: From Biological Nets to the Internet and WWW, Oxford University Press, 2003.
 [3] M. Stojanovic, On the relationship between capacity and distance in an underwater acoustic communication channel, in: WUWNet ’06: Proceedings of the 1st ACM international workshop on Underwater networks, ACM, New York, NY, USA, 2006, pp. 41–47.
 [4] B. Y. Wu, K.-M. Chao, Spanning Trees and Optimization Problems, CRC Press, 2004.
 [5] D. Eppstein, Spanning trees and spanners, in: Handbook of Computational Geometry, Elsevier, 1999, pp. 425–461.
 [6] E. W. Dijkstra, A note on two problems in connexion with graphs, Numerische Mathematik 1 (1959) 269–271.
 [7] M. Elkin, Computing almost shortest paths, ACM Trans. Algorithms 1 (2) (2005) 283–323.
 [8] B. Awerbuch, Randomized distributed shortest paths algorithms, in: STOC ’89: Proceedings of the twenty-first annual ACM symposium on Theory of computing, ACM, New York, NY, USA, 1989, pp. 490–500.
 [9] L. Kleinrock, J. Silvester, Optimum transmission radii in packet radio networks or why six is a magic number, in: National Telecommunications Conference, IEEE, Birmingham, Alabama, 1978, pp. 4.3.1–4.3.5.
 [10] H. Takagi, L. Kleinrock, Optimal transmission ranges for randomly distributed packet radio terminals, IEEE Transactions on Communications (1984) 246–257.
 [11] F. Xue, P. R. Kumar, The number of neighbors needed for connectivity of wireless networks, Wirel. Netw. 10 (2) (2004) 169–181.
 [12] S. Meguerdichian, F. Koushanfar, M. Potkonjak, M. B. Srivastava, Coverage problems in wireless ad-hoc sensor networks, in: IEEE INFOCOM, 2001, pp. 1380–1387.
 [13] A. Flaxman, A. Frieze, E. Upfal, Efficient communication in an ad-hoc network, J. Algorithms 52 (1) (2004) 1–7.
 [14] L. Ding, Z.-H. Guan, Modeling wireless sensor networks using random graph theory, Physica A: Statistical Mechanics and its Applications 387 (12) (2008) 3008–3016.
 [15] B. Krishnamachari, S. B. Wicker, R. Béjar, Phase transition phenomena in wireless ad-hoc networks, in: Global Telecommunications Conference, 2001. GLOBECOM ’01. IEEE, Vol. 5, 2001, pp. 2921–2925.
 [16] P. Gupta, P. R. Kumar, Critical power for asymptotic connectivity, in: Proceedings of the 37th IEEE Conference on Decision and Control, Tampa, Florida, 1998, pp. 547–566.
 [17] M. Sánchez, P. Manzoni, Z. J. Haas, Determination of critical transmission range in ad-hoc networks, in: Multiaccess, Mobility and Teletraffic for Wireless Communications (MMT’99), 1999, pp. 6–8.
 [18] S. Muthukrishnan, G. Pandurangan, The bin-covering technique for thresholding random geometric graph properties, in: SODA, SIAM, 2005, pp. 989–998.
 [19] H. E. Stanley, Introduction to phase transitions and critical phenomena, Oxford, Clarendon Press, 1971.
 [20] N. Goldenfeld, Lectures on Phase Transitions and the Renormalization Group, Perseus Publishing, 1992.
 [21] L. P. Kadanoff, Statistical Physics: Statics, Dynamics and Renormalization, World Scientific Pub, 2000.
 [22] Y. Wu, Y. Li, Construction algorithms for k-connected m-dominating sets in wireless sensor networks, in: MobiHoc ’08: Proceedings of the 9th ACM international symposium on Mobile ad hoc networking and computing, ACM, New York, NY, USA, 2008, pp. 83–90.
 [23] W. Shang, F. Yao, P. Wan, X. Hu, On minimum connected k dominating set problem in unit disc graphs, Journal of Combinatorial Optimization 16 (2) (2008) 99–106.
 [24] M. T. Thai, N. Zhang, R. Tiwari, X. Xu, On approximation algorithms of k-connected m-dominating sets in disk graphs, Theor. Comput. Sci. 385 (1-3) (2007) 49–59.
 [25] Y. Li, M. T. Thai, F. Wang, C.-W. Yi, P.-J. Wan, D.-Z. Du, On greedy construction of connected dominating sets in wireless networks: Research articles, Wirel. Commun. Mob. Comput. 5 (8) (2005) 927–932.